Where to search for research journal literature - some common errors I see on choice of sources (I)
As academic librarians helping early-stage researchers (Masters and PhD students), we are often asked to provide guidance on the literature review process in one-shot classes. One thing we tend to focus on during such sessions is the keyword search technique, though many of us also cover alternatives to keyword searching, like citation searching or starting off with review articles.
It seems to me, though, that there are limits to how much we can help with keyword searching in one-shot classes, since the audience will all be working on varied topics (most commonly they may not even have a good idea of what they are looking for), and as much as we can give general advice on the use of keywords, at the end of the day the user has to do a lot of their own practice via iterated searching (unless this is an area where the librarian has prior working experience).
One thing I have been thinking about increasingly, though, is talking about WHERE to search.
Compared to twenty or even ten years ago, the number of academic databases, academic search engines and other search tools available has grown tremendously, even if you focus only on those that work via keyword searching.
Below are just some...

Just some places one can start when searching literature
In my time working on academic discovery, even I find it difficult to keep them all straight. How should one classify and think about them?
In this blog post, I will set out my thoughts on these tools, providing my mental model of how I classify them into five main categories and why I recommend particular tools based on what the user values in a literature review.
I will also cover some common errors I notice users make, where an error is defined as a case where the user's expressed preferences about what they want to accomplish with their literature review contradict the sources, tools or techniques they use.
In part II of this blog post, I will go deeper into the weeds, discussing each of the five categories and how some sources do not fit easily into them.
Note: I focus mostly on journal-article type literature reviews; if the user is looking for news, datasets, business resources or even books, that is mostly out of scope for this article. This blog post is also addressed to librarians and advanced researchers who really want to understand search sources, not average users...
The major types of search sources
In today’s scholarly environment, there is a bewildering array of academic databases and academic search engines to search for literature via keywords. They come from traditional players like scholarly publishers and scholarly aggregators, and increasingly from third-party aggregators, both commercial (e.g. Ebsco) and non-profit (e.g. Cambia's Lens.org). The following is my attempt to categorize them:
Publisher Platforms (e.g. Elsevier’s ScienceDirect, Wiley Online Library)
Aggregator subject databases, both full text and abstracting & indexing (e.g. ProQuest’s ABI/Inform Complete, EconLit)
Cross disciplinary Citation Indexes (e.g. Web of Science, Scopus)
Academic Social Networks (e.g. ResearchGate, Mendeley Search)
Academic Search engines (e.g. Google Scholar, Semantic Scholar)
Each category is distinguished by a mix of the position it holds in the publishing workflow (related to the type of producer), the type of technology used and, most importantly, the properties users care about.
These categories are not all encompassing, nor are they completely distinct categories, but they provide a basic framework for understanding.
The "academic search engine" category has expanded substantially and is the most technologically diverse. Google Scholar is the most well-known of them, but they differ substantially (compare Lens.org, Semantic Scholar, etc.), and some could also fit into the cross-disciplinary citation index category. Similarly, global aggregators of open content like CORE and BASE, as well as "web scale discovery services" that are near universal on academic library homepages (ProQuest's Summon and Primo, Ebsco Discovery Service, OCLC's WorldCat Discovery), do not typically serve as citation indexes and sit uneasily in these categories. We will discuss these nuances in part II.
Questions to ask
Being a generalist, I had the opportunity to work with users of varying disciplines and levels from the domains of Business and Management to Information Systems to Economics and Social Sciences and more.
These users range from more conservative postgraduates who only want results from peer-reviewed "top journals" (sometimes they even have a specific list, like the FT50 journals) to experienced faculty who are confident in their ability to sieve out poor-quality papers and instead focus on functionality to monitor and catch new papers expressing the latest research ideas from their peers.
Then there is a subset of users working on formal systematic reviews who need powerful search functionality, as opposed to users doing a simple narrative review, who search with very simple keywords and just want one or two relevant papers.
Business and management domains in particular have seen an uptick in the number of "reviews" produced, guided by a number of guidance/opinion pieces on how to do them in specific subfields like marketing. Unfortunately, most of them seem to reinvent the wheel or do not conform to the traditional Cochrane or Campbell Collaboration type reviews. See the Business Librarians involved in evidence synthesis Google group for more.
Because of the variety of needs and demands in literature reviews, the following are some questions I tend to ask, or encourage searchers to think about, before deciding on which tools to use.
Question 1: Are you interested only in journal articles from “top” or “prestigious” journals, or are you interested in finding relevant articles regardless of the journal they appear in?
Question 2: Are you interested only in the final published versions of journal articles, or also in finding preprints, working papers, submitted versions, accepted manuscripts and other grey literature?
Question 3: When a new article of interest to you is released, how important is it for you to become aware of it immediately?
Question 4: How important is it for your search to be able to pick up all or as many relevant articles as possible? Or is it less important if you miss out on some relevant papers?
Below are some recommendations based on how these questions are answered.

Some common errors I see made in choice of tool or source.
I find that when I ask users such questions, they sometimes express preferences that contradict what they use.
One of the most common errors I see is people searching, or creating saved keyword alerts, on a single favoured publisher platform like Elsevier’s ScienceDirect or Wiley Online Library (or, for a more niche example, the AOM (Academy of Management) journals), and that is all they do...

Publisher Platform : Wiley Online
This happens, I suspect, because inexperienced researchers start off at their favorite journal's webpage, which is usually accessed via a publisher platform or portal.
While it makes sense to search and do a saved keyword alert for their favourite journal, the mistake comes when they do it over the whole platform and assume that is enough to cover their discipline.
Unfortunately, as you know, it's extremely rare for one discipline or area of study to be covered by just one publisher, even one of the big five (Elsevier, Springer Nature, Wiley, Sage, Taylor & Francis). Some exceptions include ACM for computer science, IEEE, or research areas so niche that one publisher holds the lion's share of the literature. Searching or creating a search alert in just one publisher platform risks missing a ton of relevant papers.
The next type of error isn't as serious but affects productivity and efficiency.
Questions 1 and 2 relate to the coverage or size of the index. While it seems bigger is better, particularly if you have an infinite amount of time to devote to searching, bigger indexes also incur costs: all things being equal, more skill is needed to surface relevant results (i.e. to keep precision high).
That is why, if you are just looking for prefiltered results that focus on "top journals", and you are absolutely uninterested in anything that isn't a final published version, or in grey literature at all, using something like Google Scholar is pointless and counterproductive, since it will surface a lot of results you are not interested in.
You are better off just using Scopus or Web of Science, which cover exactly what you want, i.e. peer-reviewed "prestigious journals" and final published versions (or near-final versions) that have passed peer review.
Scopus and Web of Science do include "Articles-in-Press (AiP)" or "early access online" versions for selected journals (roughly 8,000 titles for Scopus as of now). These are accepted manuscripts, or even near-final formatted PDFs, that may not yet be assigned a volume or issue and are not formally published. That said, I suspect that if you really care about such versions, going direct to the publisher site would be even better.
You also won't see preprints or earlier versions of published or not-yet-published papers.
Back in Jan 2021, Scopus released a post titled "Preprints are now in Scopus!" This is a somewhat misleading title: while Scopus did start to index items from preprint servers such as bioRxiv and SSRN, these are incorporated only into author profiles and will never appear in the actual Scopus search results.

Scopus results with Articles-in-Press results
As of Feb 2023, Clarivate has announced the launch of a "Preprint Citation Index". Unlike the Scopus move, this is a full-blown index, and results show in the main search if that index is explicitly selected. Citations from preprints do not affect the JIF etc. Some current weaknesses: it covers only five preprint servers, excluding some popular ones like SSRN. More details
Alternatively, you might even be better off with a subject index like PsycInfo or EconLit if your topic is squarely within one discipline (though these tend to include grey literature like dissertations, as well as monographs) and you don't want to be distracted by irrelevant results from other disciplines.

Psycinfo via Ebscohost platform
Because these are subject specific, they have domain-specific features like subject-specific filters and thesauri to help create search strategies that maximize precision and/or recall.
For example, where else but in a psychology-specific database can you find filters for the age or gender of the subjects, and the ability to filter to tests & measures?

Some psychology specific filters from Psycinfo on Ebsco
Not sure which database to start with? Try Search where you will find most: Comparing the disciplinary coverage of 56 bibliographic databases, which estimates not just the size but the relative coverage of each discipline. For example, it correctly notes that while CINAHL isn't even close to being the biggest database by size, it has the largest relative coverage of nursing.
Question 3 addresses users who are worried about being "scooped" and/or want to know immediately when a paper using "their" research idea appears.
I notice some users tell me they value this a lot, yet their tool of choice for searching or saved keyword alerts is Scopus or Web of Science.
This is a horrible choice, since these are by far the slowest to index new publications. They are clearly slower than publisher portals, and known to be slower than Google Scholar at indexing newly published articles once they appear on the publisher portal. (See also a more recent study comparing Google Scholar, Web of Science and Scopus on this, and how it affects the picking up of citations.)
On average, fully indexed article data will appear in Scopus within 2-3 weeks of publication on the publisher’s website, and some publishers can be up to 4-6 weeks behind.
In fact, if your worry is being scooped, noticing when a paper finally passes peer review and is published in a journal is far too late, given the long lag times for a paper to be published.
In a late 2022 Scholarly Kitchen article, Publishing Fast and Slow: A Review of Publishing Speed in the Last Decade, there is a great analysis of journals’ turnaround time (TAT). The overall TAT is broken down into two components: peer review TAT, measured from submission to acceptance, and production TAT, measured from acceptance to publication.

Review TAT (turnaround time, 2019-2020) by discipline
Looking at review TAT alone, it can be over 300 days for fields like accounting (just slightly under a year), and between 250 and 300 days for fields like marketing, business and international management, economics and finance. This is an eternity, really...
To avoid this issue, you will need to venture into the world of monitoring preprints, if your discipline is into that (increasingly true for many areas, thanks to the rise of preprints).
Here disciplinary expectations differ: in disciplines like high energy physics and computer science, it is expected that preprints be made available at the earliest opportunity (arXiv was the first ever preprint server); in other disciplines, "preprints" may be put up just before submission to journals, or even just before acceptance; and in rare areas there is no such culture at all.
If this is your concern, you should consider searching in such preprint servers. Depending on the preprint server, saved keyword alerts may or may not be possible. A good workaround is to use Google Scholar, since its coverage (including speed of indexing) of most preprint servers like arXiv, RePEc, SSRN and bioRxiv is excellent.
Some other major considerations
Source data indexed
Another major feature to consider about search sources is whether the source indexes only metadata (typically at least title, abstract and keywords/subjects, but possibly also authors, references, affiliations, funding etc.) or whether it also includes full text.
Two search sources, even if they cover the same set of journal articles (say the last 20 years of the same set of journals), will require very different styles of searching if one matches the query against full text as well as the usual metadata, as opposed to another that can match only title, abstract and keywords/subjects.
For one, the same search query will return a lot more hits in the source that matches full text. Compared to a source that can match full text such as Google Scholar, one that covers only metadata such as Scopus will return far fewer results, and it may be worth extending the search with more synonyms using the OR operator to avoid missing relevant results.
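To make this concrete, here is a minimal sketch of what such synonym expansion might look like in Scopus's advanced search syntax, using its TITLE-ABS-KEY field code (the topic and terms are purely illustrative, not a recommended search strategy):

```
TITLE-ABS-KEY ( ( "remote work" OR telework* OR "working from home" )
                AND ( productivity OR performance ) )
```

Because Scopus matches only titles, abstracts and keywords, each extra OR synonym meaningfully widens the net; the same terms typed into a full-text source like Google Scholar would already return far more hits, many mentioning the terms only in passing.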
On the other hand, when searching sources that can match full text, one might try strategies that restrain the search to prevent the results from exploding, for example by using controlled vocabulary (if available) or by more liberal use of phrase searching.
Search and browse features
The other issue is that, increasingly, many search sources, particularly those not from traditional publishers or aggregators, have poor support for strict Boolean and other standard search functionality.
This is particularly bad if you are doing systematic review type work, where you aim to maximize recall in searches, i.e. retrieve as many relevant results as possible. Of course, the easiest way to maximize recall is to run extremely broad searches, but that means spending a lot of time going through results. So there is a tradeoff.
So how do you address this tradeoff? This is why you need powerful, flexible search functionality, so that you can express nuanced, granular searches.
This includes functionality like
Strict Boolean support (option to turn off stemming/lemmatization) including exact phrase search, proximity operators etc
Truncation/wildcard support
Field specific searches (e.g. title only, abstract only, keyword only)
Parenthesis support to control order of operations
Reasonable search query length
Controlled vocabulary (e.g. MeSH) and more
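Put together, such functionality lets you express a precise, reproducible search. As a sketch, a systematic-review style query (the topic and terms are illustrative only) might look like this in Web of Science syntax, where TS is the topic field tag, TI the title field tag, and NEAR/3 requires terms within three words of each other:

```
TS=( ( adolescen* OR teenager* ) NEAR/3 ( "social media" OR "social network* site*" ) )
AND TS=( depress* OR anxiety )
NOT TI=( review )
```

A source that silently stems your terms, ignores parentheses or caps query length simply cannot run a search like this as intended, which is why support for these features matters so much for evidence synthesis work.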
See the following for details
Which academic search systems are suitable for systematic reviews or meta-analyses? Evaluating retrieval qualities of Google Scholar, PubMed, and 26 other resources (evaluates search and browse features particularly their support of Boolean)
Why Nested Boolean search statements may not work as well as they did
Reliability/accuracy of indexed data
Lastly, there are differences in the reliability of the indexed data.
Some sources may have less accurate results: they may miss articles, or hold erroneous data about them (including citation-matching errors and deduplication issues). For example, Google Scholar relies on crawlers to crawl publisher websites and parse PDFs and HTML to index journal articles. While this enables it to index new papers far faster than traditional citation indexes like Scopus, which have some amount of human curation, the automation also means crawlers may miss certain papers or, more commonly, produce metadata errors.
In fact, in the early days of Google Scholar, there was a whole genre of research papers just documenting such errors. While these errors have diminished, it is clear Google Scholar is still not as accurate as Scopus or Web of Science, and it's common to see researchers complaining about errors in their GS profiles.
While Scopus/Web of Science and publisher platforms do use automation and do have errors, they tend to do more human checks (they have far larger human teams than Google Scholar, which I believe has only a dozen or so people), and the smaller scope of their work makes checking easier.
In fact, with Scopus or Web of Science, it is easy to report errors in records and have them fixed. With Google Scholar, my experience is that it is pointless to report errors in individual records (they don't fix records one by one but focus on systematic errors), though if you are a source that Google Scholar draws from (e.g. a publisher or institutional repository), you might get a response.
Conclusion
With all these considerations, it is fair to say that Google Scholar and Scopus are almost polar opposites in many ways.

In part II of the series, I will go in depth into the diverse types of academic databases and search engines, and the pros and cons of each category.

