Discussion about this post

Francesca Frati

I read your post with great interest and agree that these are important questions and considerations. It may be that at some point we will need to rethink whether Boolean continues to be the best approach to searching, although, to be fair, the purpose of Boolean was never to rank results.

PubMed’s query expansion and ranking do seem to work better than the other examples you cited, but they are still problematic: in the health sciences we routinely teach students how to bypass automatic mapping and advise caution when using Best Match ranking. This article discusses the ways in which PubMed’s Best Match can introduce bias into search results: https://pmc.ncbi.nlm.nih.gov/articles/PMC8830327/. We are also seeing an erosion of the quality of indexing since the adoption of automatic indexing in PubMed and other databases. This has a huge impact on the ability of both human searchers and “AI”-driven search to retrieve relevant results, regardless of relevance ranking.

I can’t speak to other disciplines, but in the health sciences, librarians do routinely engage with what happens after retrieval. In the clinical context, limiting search results to the best type of evidence for the question being asked (for example, limiting to randomized controlled trials to answer therapy questions, or limiting to systematic reviews) is an effective way to quickly identify relevant results despite large retrievals. Biomedical databases have built-in “clinical queries” limits that are useful in this regard. On the back end, these limits are search queries, validated for sensitivity and specificity, that are applied to the search query. In the context of knowledge synthesis, we routinely teach researchers how to develop and refine eligibility criteria to facilitate screening, and we use sample sets of articles known to meet the inclusion criteria to adjust and improve the search, among other techniques. When conducting scoping reviews, iterative searching to improve the relevance of search results is built into the methodology. In the context of knowledge synthesis we also routinely encounter “search resistant concepts” (though I was not familiar with this term, so thank you). BM25 retrieval over Boolean does seem worth exploring in this context; however, it would need to allow for transparency and reproducibility, which does not seem to be possible at this time.
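To illustrate what validating a search against a known set can look like, here is a minimal Python sketch. It assumes a small, fully labelled collection of record IDs (all IDs below are made up), and the function name `evaluate_search` is hypothetical. It is not how any database validates its clinical queries limits, only the standard sensitivity, specificity and precision arithmetic applied to a toy example.

```python
def evaluate_search(retrieved, relevant, all_records):
    """Return (sensitivity, specificity, precision) for one search.

    retrieved:   IDs the search returned
    relevant:    IDs known to meet the inclusion criteria (the sample set)
    all_records: every ID in the labelled test collection
    """
    retrieved, relevant, all_records = set(retrieved), set(relevant), set(all_records)
    true_pos = len(retrieved & relevant)                # relevant and found
    false_pos = len(retrieved - relevant)               # found but not relevant
    false_neg = len(relevant - retrieved)               # relevant but missed
    true_neg = len(all_records - retrieved - relevant)  # correctly left out

    sensitivity = true_pos / (true_pos + false_neg) if relevant else 0.0
    specificity = true_neg / (true_neg + false_pos) if (true_neg + false_pos) else 0.0
    precision = true_pos / len(retrieved) if retrieved else 0.0
    return sensitivity, specificity, precision


# Hypothetical test collection of 10 records, 4 of them known to be relevant.
all_records = range(1, 11)
relevant = {1, 2, 3, 4}
retrieved = {1, 2, 3, 7}

sensitivity, specificity, precision = evaluate_search(retrieved, relevant, all_records)
missed = relevant - retrieved  # records to inspect when revising the search
print(sensitivity, specificity, precision, missed)
```

In practice the useful output is often the `missed` set: looking at why a known relevant article was not retrieved is what drives the next revision of the search.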

Perhaps most importantly, in both contexts, we focus much attention on helping searchers ask clearly formulated questions. In my 20+ years of experience, lack of clarity in the question is a more important factor in determining whether search results are relevant than issues with search structure or database relevancy ranking.

The ability to ask answerable questions and search iteratively is what makes human searchers more effective than AI. In my limited experience with LLMs and AI-driven search, this is where they fail.

The more I engage with the topic the more I come back to the idea that LLMs are like any other technological advance: we first think they will topple all that came before, and then we slowly realize they are a tool like any other, that they have their uses, but that the old technology also still has its uses. I suspect that BM25 versus Boolean is one of these things: we will still want to be able to design transparent, predictable and reproducible searches (I would not want my medical care to be based on a small and random set of relevant results!), but in some contexts, it will be useful to be able to quickly identify some relevant results. Google Scholar already serves that purpose, so why not other tools that maybe do it better?
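The difference between the two approaches can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the three documents are invented, tokenization is a plain whitespace split, and `boolean_and` and `bm25_rank` are hypothetical helper names. The scoring follows the commonly cited Okapi BM25 formula with typical defaults (`k1=1.2`, `b=0.75`); no particular database is known to use exactly this.

```python
import math
from collections import Counter


def boolean_and(docs, terms):
    """Boolean AND: return the unranked set of documents containing every term."""
    return {doc_id for doc_id, text in docs.items()
            if all(term in text.split() for term in terms)}


def bm25_rank(docs, terms, k1=1.2, b=0.75):
    """Score every document with Okapi BM25 and return (id, score), best first."""
    tokens = {doc_id: text.split() for doc_id, text in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(words) for words in tokens.values()) / n_docs
    scores = {}
    for doc_id, words in tokens.items():
        counts = Counter(words)
        score = 0.0
        for term in terms:
            # Document frequency: how many documents contain the term.
            df = sum(1 for other in tokens.values() if term in other)
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
            tf = counts[term]
            length_norm = 1 - b + b * len(words) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented example records.
docs = {
    "d1": "statin therapy randomized controlled trial",
    "d2": "statin therapy cohort study",
    "d3": "exercise therapy randomized trial",
}
query = ["statin", "randomized"]

print(boolean_and(docs, query))  # only documents matching both terms
print(bm25_rank(docs, query))    # every document, ordered by score
```

The Boolean search returns a fixed set that is easy to document and re-run, while BM25 also surfaces partial matches and orders them, which is the trade-off between predictability and quickly finding some relevant results.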

PS: There is an increasing body of evidence showing that PICO-based searching is problematic, and you are correct to question whether LLMs are capable of recognising that, in most instances, outcomes should be screened for and not included in the search.

Elyse G.

Hello Aaron,

My fellow librarians and I are having a hard time figuring out whether Web of Science Research Assistant still works exclusively with keyword searches using Boolean operators, or if they've introduced some form of semantic search. In a section of their website (which is already 8 months old!), there is a brief mention of semantic search: “Retrieving Articles: We start by retrieving articles that exhibit the highest degree of semantic similarity to the user’s query and complement them by adding articles with the utmost relevance through a keyword search.” without further explanation (https://webofscience.zendesk.com/hc/en-us/articles/31437630410129-Web-of-Science-Research-Assistant#h_01JGKZG2P1XN9FWQGW98CKWB18).

As for the Smart Search feature, it indicates that you can limit results to those from semantic search or keyword search. Have you tested the WoS Search Assistant again recently? Do you think it’s still based solely on keyword search and Boolean operators, as the tool seems to indicate when it provides a response (for example, in the “How are these results generated?” section)? If there is indeed a semantic search component, do you have more information about it?
