Tara Calishain

I dream in data flows

Concept Lensing

Being uneducated (I rock a GED) makes me constantly aware that my knowledge is extremely limited. This has shaped much of my search thinking. When I approach a certain kind of search problem I assume there is some kind of higher education classification system I’m completely unfamiliar with and some kind of professional vocabulary I do not know. At the same time I DO know that using specialized vocabulary or classification systems in my query will get me better search results.

So I’m back to that wonderful chewy question: how do I ask for what I don’t know? And the followup: how do I ask for it in the most meaningful, revealing way? Thanks to the Stract API and ChatGPT’s API I was able to make a tool that tests what I call “concept lensing” as applied to general topic searches that are difficult to perform on general Web search engines, especially when you don’t have the background knowledge to construct detailed queries.

For example, say you’re interested in public health, but that’s a HUGE topic. You need specific lenses to examine it through if you want to break Web search results down into meaningful, useful groups. You may have a medical background with little understanding of public policy, or vice versa; either way, there’s a knowledge gap when you try to search. Perhaps you come across an article called “Air pollution high at US public schools with kids from marginalized groups.” You want to research the topic of public health through concepts relevant to that article, but you don’t have the understanding to easily pull the concepts out and put them in a search.

But ChatGPT, as a reference bot stuffed with knowledge, can easily extract relevant concepts from an article. And you’re not asking it to apply any context or generate any new information, just to intelligently summarize text. Once the concepts have been extracted from a block of text, each one is applied as a contextual lens to a Stract API query that includes the original topic (in this case public health) and the concept lens. After all concepts are searched, the results are combined, filtered for duplicates, and presented.
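Here's a rough sketch of the search-and-merge half of that process, just to make the flow concrete. It is not the tool's actual code. The search call is passed in as a plain function so nothing here pretends to know the Stract API's real interface; `fake_search`, the quoted-topic query format, the result shape (a dict with a `url` key), and deduplicating on the URL are all my assumptions for illustration.

```python
from typing import Callable, Dict, Iterable, List

# Assumed result shape: each hit is a dict with at least a "url" key.
Result = Dict[str, str]
SearchFn = Callable[[str], List[Result]]


def concept_lens_search(topic: str, concepts: Iterable[str], search: SearchFn) -> List[Result]:
    """Run one query per concept lens and merge the hits, skipping repeated URLs."""
    seen = set()
    merged: List[Result] = []
    for concept in concepts:
        # Assumed query format: the topic as a quoted phrase plus the concept.
        query = f'"{topic}" {concept}'
        for hit in search(query):
            key = hit["url"].rstrip("/")  # crude normalization for duplicate checks
            if key in seen:
                continue
            seen.add(key)
            # Remember which lens surfaced this result first.
            merged.append({**hit, "lens": concept})
    return merged


def fake_search(query: str) -> List[Result]:
    """Stand-in for a real search call: one shared hit plus one per-query hit."""
    slug = query.split()[-1]
    return [
        {"url": "https://example.org/overview", "title": "Overview"},
        {"url": f"https://example.org/{slug}", "title": slug},
    ]


hits = concept_lens_search("public health", ["poverty", "schools"], fake_search)
print(len(hits))  # 3: the shared overview page is only kept once
```

Swapping `fake_search` for a function that really calls a search API is the only change needed to try this against live results. Tagging each hit with its lens keeps the process transparent, since you can always see which concept brought a result in.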

So text from an article called “Air pollution high at US public schools with kids from marginalized groups”

Turns into a set of concepts:

pollutants, students of color, poverty, schools, NASA, Health and Air Quality, GeoHealth, particulate matter, nitrogen dioxide, respiratory conditions

Which, when added to the general topic of “public health” and filtered for duplicates, turns into 168 topic-focused results via Stract.
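To show how that concept list might become individual queries, here's a minimal sketch. It assumes the concepts come back as one comma-separated string and that each query is the quoted topic followed by the concept; both are my guesses, and `parse_concepts` and `lens_queries` are hypothetical helper names, not part of any API.

```python
from typing import List


def parse_concepts(raw: str) -> List[str]:
    """Split a comma-separated concept string, dropping blanks and repeated concepts."""
    seen = set()
    concepts: List[str] = []
    for part in raw.split(","):
        concept = part.strip()
        key = concept.lower()  # compare case-insensitively
        if concept and key not in seen:
            seen.add(key)
            concepts.append(concept)
    return concepts


def lens_queries(topic: str, concepts: List[str]) -> List[str]:
    """Pair the general topic with each concept lens (assumed query format)."""
    return [f'"{topic}" {concept}' for concept in concepts]


raw = (
    "pollutants, students of color, poverty, schools, NASA, "
    "Health and Air Quality, GeoHealth, particulate matter, "
    "nitrogen dioxide, respiratory conditions"
)
queries = lens_queries("public health", parse_concepts(raw))
print(len(queries))  # 10
print(queries[0])    # "public health" pollutants
```

Ten concepts means ten separate searches, which is why the combined results need a duplicate filter before they're presented.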

With this tool, you are using AI to help you build meaningful general search queries, but in a way that’s completely transparent and doesn’t give the AI much room to make stuff up.

I guess the next step will be to see if further analyzing the Stract results for additional concept data is useful.