5 Comments
Rob:

Being rather ancient (a caveman perhaps?), I can remember the days of Dialog where you could interrogate hundreds of databases pretty much using the same search syntax.

That was very handy, obviously (although the timed costs for searching were quite stressful).

Pascal Martinolli:

Excellent post! At Université de Montréal we made a BINGO to list all the questions to ask yourself about the features of genAI for document retrieval (in French): https://boite-outils.bib.umontreal.ca/ld.php?content_id=37726345

Aaron Tay:

Very nice. This is somewhat similar to my older post on questions to ask when testing AI academic search engines: https://aarontay.substack.com/p/testing-ai-academic-search-engines-what

A lot of the issues in your table are genuinely new problems unique to the 2023+ generation of "AI-powered tools" — the explosion of retrieval techniques beyond just lexical search + learning-to-rank means users now have zero sense of how their query input is actually used. I'd call this retrieval opacity, which covers both query manipulation and the methodology black box.
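To make "retrieval opacity" concrete, here is a toy, self-contained sketch (all names and steps are hypothetical illustrations, not any real product's pipeline) of how the query a user types can quietly stop being the query the system actually runs:

```python
# Toy illustration of "retrieval opacity": the user's query is silently
# rewritten before ranking, and nothing in the UI reveals this happened.
# Everything here is a hypothetical stand-in for real pipeline components.

SYNONYMS = {"llm": ["large language model"], "retrieval": ["search"]}

def rewrite_query(raw: str) -> str:
    """Silently expand the query with synonyms (stands in for an LLM rewrite step)."""
    terms = raw.lower().split()
    expanded = terms + [s for t in terms for s in SYNONYMS.get(t, [])]
    return " ".join(expanded)

def lexical_score(query: str, doc: str) -> float:
    """Crude term-overlap score (stands in for BM25 or a neural ranker)."""
    q, d = set(query.split()), set(doc.lower().split())
    return len(q & d) / (len(q) or 1)

def search(raw_query: str, docs: list[str]) -> list[str]:
    effective_query = rewrite_query(raw_query)   # the user never sees this version
    ranked = sorted(docs, key=lambda d: lexical_score(effective_query, d), reverse=True)
    return ranked[:3]

docs = [
    "Survey of large language model search tools",
    "Lexical retrieval with learning to rank",
    "Gardening tips for spring",
]
# The user typed "llm retrieval", but the ranking was driven by the rewritten query.
print(search("llm retrieval", docs))
```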

And then there's all the generative synthesis and agentic stuff on top of that.

That said, some of these problems arguably pre-date the current wave. Not knowing what's in the index, results throwing up retracted works that aren't properly labelled — these stem from the shift toward inclusive rather than exclusive indexes.

Google Scholar, Lens, Dimensions, OpenAlex (successor to MAG), Semantic Scholar — they all had these Corpus/index opacity issues before "gen AI" entered the picture.

Of course, it's a perfect storm now because the 2023+ generation of AI-powered search typically builds on these big, inclusive, often open indexes as their base — compounding the opacity problem.

Alan:

I found the suggestions for ways forward for the interfaces really helpful - thanks. I think there is an additional challenge from the shifting nature of the models being used to power the various blank search boxes. If the capabilities and "preferences" of the model are changing (and the capabilities surely must be, to warrant adding the newer models), then the user faces a new learning task whenever the model is updated. I guess we need the systems to hold some elements unchanged, at least most of the time? Keeping some of the filter-type affordances visually present seems one way to address this, but it feels like a "librarianly" solution to trying to wrest back some control.

Aaron Tay:

I think the "blank box" issue has two parts: (1) false positives - you run a certain type of query X thinking it can do X, and (2) false negatives - you never run a certain type of query Y thinking it can't do it. I think (1) is a far more serious issue, especially if it fails smoothly yet quietly. For (1), the AI/LLM can parse the query and refuse to run it if it notices you asked to filter by a metadata field Z that isn't in the index. How reliably this can be done is a big question.
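As a rough illustration of what that check could look like, here is a minimal sketch of the "refuse loudly instead of failing quietly" idea; the field names and the regex-based parser are hypothetical stand-ins (a real system would use an LLM to extract the requested filters):

```python
# Minimal sketch of refusing a query when it asks for a metadata filter the
# index does not support, rather than silently dropping the filter.
# SUPPORTED_FIELDS and the "field:value" syntax are hypothetical examples.
import re

SUPPORTED_FIELDS = {"year", "author"}          # fields this index actually has

def extract_requested_filters(query: str) -> set[str]:
    """Very naive stand-in for LLM query parsing: look for 'field:' patterns."""
    return {m.lower() for m in re.findall(r"(\w+):", query)}

def run_query(query: str) -> str:
    requested = extract_requested_filters(query)
    unsupported = requested - SUPPORTED_FIELDS
    if unsupported:
        # Fail loudly (the false-positive case) instead of ignoring the filter.
        return f"Cannot run: this index has no filter(s) for {sorted(unsupported)}."
    return f"Running search with filters {sorted(requested)}..."

print(run_query("author:smith funder:NIH transformer retrieval"))
# -> Cannot run: this index has no filter(s) for ['funder'].
```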