Discussion about this post

Chad Ratashak

Great article. I’ve seen a related phenomenon with Google AI Overview and Perplexity hallucinations seemingly confirming the existence of known-fake cases. https://open.substack.com/pub/midwestfrontierai/p/doppelganger-hallucinations-test?r=36ivo&utm_medium=ios

Benjamin Armintor

This is a very useful write-up, thank you!

I want to push back a little on your assessment of the impact of generative AI in the cited Bluesky post. I dug into the 44 works Google Scholar references. There is one related 2019 citation of another of the authors' works, but of the 41 I could follow up on, all were actually from 2024 (technically one is dated 2023-12-31) or later, almost all in 2025. The article GS dates to 2021 almost certainly has bad metadata; the underlying OJS Dublin Core metadata indicates it was submitted in 2025. (Relatedly, GS reproduces in-text metadata uncritically and also doesn't reindex: one paper removed the fake citation during revisions on arXiv.)

They also really are all different citations: there are 9 references to nonexistent Routledge monographs with different subtitles and years of publication, and 4 purported Oxford Research Encyclopedia of Education entries, complete with different (and fabricated) DOIs. All in all, there are at least 30 distinct citations spanning over a dozen different journals and academic presses. It's very difficult to conclude that these are being copied in lazy-but-good faith, with the possible exception of 3 identical references to the same nonexistent article in Education and Information Technologies (a real journal! just a nonexistent article).

I also want to note that among the citing papers are an IEEE Xplore piece and an edited volume from a Routledge imprint (CRC Press); the latter especially is a problem for using publication quality as a proxy for authenticity.

