Human attention as context for Internet search is immensely powerful. Having an understanding of WHEN a topic was of particular interest allows you to create date-bounded searches that provide more information-rich results and less junk. Which is why it drives me absolutely bonkers that we have a gold mine of human attention records in the form of Wikipedia page view logs and nobody gives a shit but me!
(If you’re out there and you DO give a shit, hang in there. I love ya.)
Just having an AI go out, hoover everything up, and try to make sense of it doesn’t work. Building search spaces using human attention as context, then asking the AI to apply itself to that space and only that space… Well, then you have something I’m calling Calidossi (Calishat dossier).
Calidossi uses Wikipedia page-view information to identify high-interest days for a topic, then uses a news API to gather news stories in a one-day radius around those dates. The stories are filtered, organized by date, sliced into datasets, and fed to a ChatGPT API call which has instructions to a) restrict its summary to the information at hand and b) ignore all search results irrelevant to the stated topic. The AI-generated summary is shown in conjunction with its sources so readers can review the articles themselves, gather information for additional searching, whatever.
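To make the first two steps concrete, here's a minimal sketch of how picking high-interest days and building one-day-radius search windows might look. This is not Calidossi's actual code. The URL pattern is the public Wikimedia Pageviews REST API; the function names, the "mean plus k standard deviations" spike rule, and the window merging are all assumptions made for illustration. If you fetch the URL for real, Wikimedia asks that you send a descriptive User-Agent header.

```python
from datetime import date, datetime, timedelta
from statistics import mean, pstdev
from urllib.parse import quote

PAGEVIEWS_BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(article, start, end, project="en.wikipedia"):
    """Build a Pageviews API URL for daily, human-only views of one article."""
    title = quote(article.replace(" ", "_"), safe="")
    return (f"{PAGEVIEWS_BASE}/{project}/all-access/user/{title}/daily/"
            f"{start:%Y%m%d}/{end:%Y%m%d}")


def parse_items(items):
    """Turn the API's `items` list into {date: views}.

    Each item carries a YYYYMMDDHH `timestamp` and an integer `views`.
    """
    return {
        datetime.strptime(item["timestamp"][:8], "%Y%m%d").date(): item["views"]
        for item in items
    }


def high_interest_days(views, k=2.0):
    """Return days whose views exceed mean + k * std (an assumed spike rule)."""
    counts = list(views.values())
    if len(counts) < 2:
        return []
    threshold = mean(counts) + k * pstdev(counts)
    return sorted(day for day, count in views.items() if count > threshold)


def search_windows(days, radius=1):
    """Build (start, end) ranges of +/- `radius` days around each peak day.

    Overlapping or adjacent ranges are merged (an assumption) so the news
    search does not fetch the same dates twice.
    """
    windows = []
    for day in sorted(days):
        start = day - timedelta(days=radius)
        end = day + timedelta(days=radius)
        if windows and start <= windows[-1][1] + timedelta(days=1):
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows
```

Each window then becomes the date bounds for a news API query, and whatever comes back gets filtered and grouped by date before the summarization step.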
It’s really nice for just checking up on a topic. Maybe I’m interested in learning about recent advancements in psilocybin, for example. If I know of some relevant publications I might have some luck there, but there’s no telling what I’d find if I tried searching the general Web. With Calidossi, on the other hand, I can use the human attention records of Wikipedia to target specific dates for news searching, netting me a set of targeted, information-rich articles that make for terrific summaries.
I like these results a lot, but there’s no reason to stop at just news results. Using Mojeek and a version of date mode, I can find Web pages as well as news articles specific to time spans. Or maybe I could allow the user to specify custom domain sets using the Mojeek API’s fi feature?
There’s so much infosewage online now that it’s getting harder to query your way around it. Shifting to a strategy of tightly-defined search spaces predicated on meaningful, measurable things like human interest, as I’m doing here, is another way to get better results.