Librarians & fake news - Bayes, metaknowledge & epistemic humility
The recent rise in interest in fake news has given us librarians a reason to once again trumpet loudly the value of what we do in teaching information or media literacy.
Librarians were quick to establish our turf by calling out articles that mention information literacy without mentioning librarians.
This #fakenews article is what has lots of #librarians saying "this is what we we're trying to tell you all along" https://t.co/NOgX2bZT4b
— steven bell (@blendedlib) December 12, 2016
Besides the expected library sources, pieces began to appear in mainstream outlets such as Salon, U.S. News & World Report and, most recently, PBS, praising the role librarians can play in fighting the rise of fake news. Many librarians were ecstatic: finally, our moment in the sun had come!
So how do librarians fight fake news? A running joke among some librarians is that the librarian's standard solution to all the world's ills is to build a LibGuide.
And indeed, librarians and libraries such as Cornell University Library, Indiana University East Library and the University of Virginia Library have created guides on fake news or adapted existing material into them.
As I write this, there are at least 1,651 LibGuide pages that mention "fake news".

While I salute the efforts of librarians to create guides, I fear the actual impact is more like the sentiment below, which I saw expressed on Twitter*.
"Before I read this thing my friend posted on Facebook, let me open up that helpful LibGuide in another tab." <--No Student Ever
— Lane Wilkinson (@lnwlk) January 24, 2017
* Modified 21/2/2017 with actual Tweet made by Wilkinson
All this talk about helping or teaching our users to deal with fake news made me curious. What roles can librarians play in this? Is it a matter of teaching the CRAP test or, worse, some black-and-white view that only .org/.gov sites or peer-reviewed articles are reliable (do you automatically trust information on .gov sites under the Trump administration?)? Or is teaching information literacy through various "threshold concepts" the solution? Is the line between fake news sites and merely biased news always clear and distinct?
While the answers to these questions are probably not going to be easy, below are three articles I've read that made me think more deeply on the topic.
1. Boyd’s interesting yet scary argument
danah boyd is a well-known researcher at Microsoft Research who has been influential in helping us understand how young people relate to technology.
Recently she wrote a very provocative piece Did Media Literacy Backfire? that made me think.
I covered the whole argument over at Medium, but in a nutshell her argument seems to be that certain topics are extremely complicated, and it takes a real expert with years of experience and expertise to pick apart the opposing arguments, particularly on topics where many people, for various reasons, spend years honing their arguments (think anti-evolution arguments). As such, it would be better for people to just accept the consensus judgement of experts.
She argues that media literacy may backfire if we train people to believe that they should, and are capable of, evaluating all arguments and statements themselves; we train them to doubt and to make up their own minds.
I would add that ACRL's new Framework for Information Literacy for Higher Education, which states that "Authority Is Constructed and Contextual", even gives individuals license to decide they shouldn't automatically trust mainstream sources.
She writes "If the media is reporting on something, and you don’t trust the media, then it is your responsibility to question their authority, to doubt the information you are being given."
Add the natural tendencies of people to privilege evidence that supports their original beliefs and media literacy backfires.
"People believe in information that confirms their priors. In fact, if you present them with data that contradicts their beliefs, they will double down on their beliefs rather than integrate the new knowledge into their understanding."
As such she implies that in many matters it’s better for people not to try to figure out the truth themselves but to just trust the experts.
You can read my full coverage of and response to this scary argument here, but it's an interesting question to think about: in our rush to teach students to think for themselves and to evaluate information, do we teach them the humility to say they don't know enough to decide either way? Do we teach the concept of "epistemic learned helplessness"?
2. Wilkinson's new research agenda for information literacy - Bayesian inference
You may be wondering what boyd means above by "People believe in information that confirms their priors...". This is where the idea of Bayesian inference comes in.
Lane Wilkinson is a "philosophically-inclined instruction librarian" at the University of Tennessee at Chattanooga. Currently acting as Director of Instruction for the library, he is also known for his heavy criticism of ACRL's new Framework for Information Literacy for Higher Education.
He recently wrote an interesting piece that brings the idea of Bayesian inference into information literacy.
It's a fascinating idea, and while I have come across Bayesian models of reasoning in other contexts, like many librarians the idea of its intersection with information literacy passed me by.
One reason I suspect for this is that Bayesian thinking is not easy to grasp for many (including me). While many of us memorised the formula for Bayes' theorem in school, an intuitive understanding of it eludes many. IMHO some of the best explanations can be found here.
Lane sets up the traditional explanation using cancer detection reliability as an analogy. But I will skip this and go directly to the implications (in simplified form).
As Lane explains the idea here is that people have "priors", their belief in whether a certain fact is true or not. New information that they learn will shift their beliefs with the magnitude of change depending on how reliable they think the source is.
Say that, before reading anything, a person's belief that Obama was born in the US is 50%; call this P(A) = 50%. This is their "prior" probability. In this case they are undecided.
Now say they read a news story that gives evidence he was indeed born in the US; call that article B.
Say they perceive the source as quite reliable; in other words, articles from that source are usually correct.
This implies two things, firstly
P(B|A), the probability that B occurs (the article exists) given that A is true (i.e. Obama was born in the US), is high - assume 80%
AND
P(B|not A) is low, i.e. if Obama was not born in the US, an article claiming he was is unlikely to appear - assume 10%
Throw in Bayes' formula,

P(A|B) = P(B|A) × P(A) / [ P(B|A) × P(A) + P(B|not A) × P(not A) ]

and you find that P(A|B), the posterior probability that they should believe A given (i.e. conditional on) the existence of news article B, rises to 89%. So reading the news article means they should increase their belief that Obama was born in the US to 89%.
If, on the other hand, they think the source is not so reliable, say a liberal source (if they are conservatives), then they might feel P(B|A) is only 60% and P(B|not A) is 50%; plug these into Bayes' theorem and P(A|B) rises to only 55%.
In more extreme cases, where they think the source is more likely to publish the article when the claim is false than when it is true, i.e. P(B|A) < P(B|not A), reading the article actually makes the reader more doubtful than before!
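These updates are simple enough to check numerically. Below is a minimal Python sketch of the three scenarios just described; the probabilities are the illustrative figures from the text, not real survey data.

```python
def posterior(prior, p_b_given_a, p_b_given_not_a):
    """Bayes' theorem:
    P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|not A)P(not A)]"""
    numerator = p_b_given_a * prior
    evidence = numerator + p_b_given_not_a * (1 - prior)
    return numerator / evidence

# Undecided reader (prior 50%) who trusts the source:
print(round(posterior(0.50, 0.80, 0.10), 2))  # 0.89 - belief rises to ~89%

# Same reader, but the source is seen as unreliable or partisan:
print(round(posterior(0.50, 0.60, 0.50), 2))  # 0.55 - belief barely moves

# Source believed more likely to publish the claim when it is false:
print(round(posterior(0.50, 0.10, 0.30), 2))  # 0.25 - belief actually drops
```

The key driver is the likelihood ratio P(B|A) / P(B|not A): above 1 belief rises, at exactly 1 nothing changes, and below 1 the article acts as counter-evidence.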
Hopefully I got that all right! If you understood all that, then you understand what Boyd was saying earlier.
"People believe in information that confirms their priors. In fact, if you present them with data that contradicts their beliefs, they will double down on their beliefs rather than integrate the new knowledge into their understanding."
In other words, people do not trust sources that disagree with what they already believe. By Bayes' theorem, this means evidence from such sources will either not change their belief much or, in extreme cases, even drive their belief in the opposite direction.
Lane clarifies that he doesn't intend for librarians to teach students Bayesian inference (it can get pretty complicated, as it involves philosophical issues in epistemology after all), but rather that information literacy can be studied under the lens of Bayesian inference. He lists quite a few intriguing questions to study; for example, he asks "How do we adjust when our trusted, reliable sources publish something false? For example, when peer-reviewed journals retract articles." and "What are the information-seeking behaviors of students researching something they have a strong opinion on?"
3. Improving crowdsourcing by weighting metaknowledge
Lane talks about bringing "theories of cognitive science, psychology, information science, economics, philosophy, law, decision theory, and so on into library studies".
It's an interesting thought. I have been reading quite a bit on cognitive biases and decision theory in the past few years, and while, like Lane, I don't expect librarians to teach this in information literacy classes, it does seem to be a related domain that can inform information literacy. For a taste of this, I recently read about an interesting idea: improving the wisdom of crowds by using metaknowledge.
You can find the argument in Nature ("A solution to the single-question crowd wisdom problem") or, if you prefer more layman-friendly articles, at Aeon or in the coverage by MIT News.
Essentially, the problem with the wisdom of crowds is that it averages across everyone equally. The idea is that if we can identify the "experts" in the crowd, we can improve the reliability of our results by weighting their opinions more heavily.
How do we identify such experts if we are not experts ourselves? The key insight is that experts not only have more content knowledge, they also have better metaknowledge. Here's how you exploit this.
From the Aeon article,
"When you take a survey, ask people for two numbers: their own best guess of the answer (the ‘response’) and also their assessment of how many people they think will agree with them (the ‘prediction’). The response represents their knowledge, the prediction their metaknowledge. After you have collected everyone’s responses, you can compare their metaknowledge predictions to the group’s averaged knowledge. That provides a concrete measure: people who provided the most accurate predictions – who displayed the most self-awareness and most accurate perception of others – are the ones to trust"
Confused? Here's a concrete example from the article. A group of MIT and Princeton undergraduates were asked: "Is Philadelphia the capital of Pennsylvania?"
The correct answer is "No"; the capital is in fact Harrisburg. Most people who make this mistake would say "Yes" and predict that most people, say 90%, would also say "Yes", since they aren't aware that their answer is wrong.
People who know the answer is "No" will mostly also know that "Yes" is a common error, and when asked to predict what percentage will agree with them, they will guess a lower figure, say only 30% (or equivalently, that 70% will say "Yes"). In other words, they have better metaknowledge: they not only know the fact, they also know that others know less.
When you look at the final result, the "No" group will most probably have the more accurate prediction of the overall Yes/No split, since the "Yes" group thinks "Yes" is the obvious answer most people will go for.
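To make the mechanics concrete, here is a small Python sketch of the "surprisingly popular" rule from the Nature paper, applied to the Philadelphia example. The vote counts and prediction figures are made up for illustration, and this is a simplification of the paper's method, not the authors' code.

```python
def surprisingly_popular(votes, predicted_yes_shares):
    """Pick the answer that is more popular than the crowd predicted.

    votes: each respondent's answer, "Yes" or "No"
    predicted_yes_shares: each respondent's prediction of the
        fraction of the group that will answer "Yes"
    """
    actual_yes = votes.count("Yes") / len(votes)
    predicted_yes = sum(predicted_yes_shares) / len(predicted_yes_shares)
    # "Yes" wins only if more people chose it than the crowd predicted;
    # otherwise "No" exceeded its predicted share and is surprisingly popular.
    return "Yes" if actual_yes > predicted_yes else "No"

# Philadelphia example: 70% wrongly answer "Yes". "Yes" voters predict
# 90% will say Yes; the better-informed "No" voters predict only 70%.
votes = ["Yes"] * 7 + ["No"] * 3
predictions = [0.9] * 7 + [0.7] * 3
print(surprisingly_popular(votes, predictions))  # "No" - the correct answer
```

The intuition: the ill-informed "Yes" voters drag the predicted Yes share up to 84%, but only 70% actually vote Yes, so "No" beats its predicted share and wins despite being the minority answer.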
I highly recommend you read the Aeon article and the Nature article. They go deeper into how metaknowledge can be leveraged, the experiments the authors ran to verify the technique's effectiveness, and the ways variants of the technique can act as a lie detector and/or "truth serum" when asking questions respondents have an incentive to answer dishonestly, e.g. whether they have committed plagiarism or made up data.
I find this article fascinating because it provides a partial answer to the question: how do we reliably know who is an expert?
Conclusion
As I have admitted before, information literacy, particularly for freshmen, hasn't been a big interest of mine. Part of it is that, for me, it often reduces to teaching Boolean operators (something I believe is becoming less and less necessary), showing undergraduates how to push buttons in databases, reassuring freshmen worried that a misplaced dot will cost them marks, or pushing mechanical rules like the CRAP test.
Most probably I'm doing it wrong, but I still enjoy thinking about and discussing deep epistemological questions like "How do we know who is an expert?" and "When should we express humility and rely on experts?"
I understand, of course, that given the constraints of time and the type of audience, a deep discussion isn't always appropriate, though I see hints of deeper engagement in the new ACRL information literacy framework's focus on threshold concepts.

