Is enhancing language models with human knowledge a good idea?

Angelie Kraft, Eloïse Soulier

AI researchers are trying to make large language models (LLMs) more factually accurate by enhancing them with knowledge (e.g., to avoid "hallucinations"). But can we assume that this knowledge is objective and value-neutral? We will discuss the social nature of knowledge and knowledge production and reflect on what this means for LLMs.

Large language models (LLMs), like ChatGPT or Bard, are very good at producing text, but they often get facts wrong (they "hallucinate"). To address this, researchers are augmenting them with knowledge drawn from large knowledge bases. The combination of AI and these knowledge bases is promoted as a way to make AI models more reliable and trustworthy. However, there are some important questions to consider. Is it legitimate to call the content of these databases "knowledge"? Or is the term being used to make the technology sound more credible than it really is? And if these knowledge bases do contain real knowledge, can we say that it represents the world fully and objectively, free of bias and personal perspective?

We will explore these questions by examining how these data collections are created and used, and by comparing them to traditional philosophical views of knowledge. Using basic ideas from social epistemology, we will discuss the social nature of knowledge and reflect on what it means for these technologies. Who produces this knowledge, and what role do their situation in the world, their experiences, and their values play in this process? Lastly, we will discuss perspectives on the kind of knowledge and information we want to put into AI systems to ensure they are accurate, safe, and fair.