Large language models trained on questionable stuff online will produce more of the same. Retrieval augmented generation is one way to get closer to truth.

Even people not in tech seemed to have heard of Sam Altman’s ouster from OpenAI on Friday. I was with two friends the next day (one works in construction and the other in marketing) and both were talking about it. Generative AI (genAI) seems to have finally gone mainstream. What it hasn’t done, however, is escape the gravitational pull of BS, as Alan Blackwell has stressed.

No, I don’t mean that AI is vacuous, long on hype, and short on substance. AI is already delivering for many enterprises across a host of industries. Even genAI, a small subset of the overall AI market, is a game-changer for software development and beyond. And yet Blackwell is correct: “AI literally produces bullshit.” It makes up stuff that sounds good based on its training data. Even so, if we can “box it in,” as MIT professor of AI Rodney Brooks puts it, genAI has the potential to make a big difference in our lives.

‘ChatGPT is a bullshit generator’

Truth is not fundamental to how large language models function. LLMs are “deep learning algorithms that can recognize, summarize, translate, predict, and generate content using very large data sets.” Note that “truth” and “knowledge” have no place in that definition. LLMs aren’t designed to tell you the truth. As detailed in an OpenAI forum, “Large language models are probabilistic in nature and operate by generating likely outputs based on patterns they have observed in the training data. In the case of mathematical and physical problems, there may be only one correct answer, and the likelihood of generating that answer may be very low.” That’s a nice way of saying you might not want to rely on ChatGPT to do basic multiplication problems for you, but it could be great at crafting an answer on the history of algebra.
In fact, channeling Geoff Hinton, Blackwell says, “One of the greatest risks is not that chatbots will become super intelligent, but that they will generate text that is super persuasive without being intelligent.” It’s like “fake news” on steroids. As Blackwell says, “We’ve automated bullshit.” This isn’t surprising, given that the primary sources for the LLMs underlying ChatGPT and other genAI systems are Twitter, Facebook, Reddit, and “other huge archives of bullshit.” However, “there is no algorithm in ChatGPT to check which parts are true,” such that the “output is literally bullshit,” says Blackwell. What to do?

‘You have to box things in carefully’

The key to getting some semblance of useful knowledge out of LLMs, according to Brooks, is “boxing in.” He says, “You have to box [LLMs] in carefully so that the craziness doesn’t come out, and the making stuff up doesn’t come out.” But how does one box an LLM in? One critical way is through retrieval augmented generation (RAG). I love how Zachary Proser characterizes it: “RAG is like holding up a cue card containing the critical points for your LLM to see.” It’s a way to augment an LLM with proprietary data, giving the LLM more context and knowledge to improve its responses.

RAG depends on vectors, a foundational element in a variety of AI use cases. A vector embedding is just a long list of numbers that describes features of a data object, such as a song, an image, a video, or a poem, stored in a vector database. Embeddings capture the semantic meaning of objects in relation to other objects: similar objects are grouped together in the vector space, and the closer two objects are, the more similar they are. (For example, “rugby” and “football” will be closer to each other than “football” and “basketball.”) You can then query for related entities that are similar based on their characteristics, without relying on synonyms or keyword matching.
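To make the vector idea concrete, here is a toy sketch in Python. The numbers are invented for illustration; a real system would get its vectors from a trained embedding model, with hundreds or thousands of dimensions rather than four.

```python
import numpy as np

# Toy 4-dimensional "embeddings" -- the values are made up for this example.
# A real embedding model would produce much longer vectors learned from data.
embeddings = {
    "rugby":      np.array([0.9, 0.8, 0.1, 0.0]),
    "football":   np.array([0.8, 0.9, 0.2, 0.1]),
    "basketball": np.array([0.2, 0.3, 0.9, 0.8]),
}

def cosine_similarity(a, b):
    """Similarity of two vectors: close to 1.0 means nearly the same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# "rugby" and "football" sit closer together in this space than
# "football" and "basketball", matching the intuition in the text.
print(cosine_similarity(embeddings["rugby"], embeddings["football"]))
print(cosine_similarity(embeddings["football"], embeddings["basketball"]))
```

Cosine similarity is one common distance measure for comparing embeddings; vector databases typically offer it alongside Euclidean and dot-product metrics.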
As Proser concludes, “Since the LLM now has access to the most pertinent and grounding facts from your vector database, it can provide an accurate answer for your user. RAG reduces the likelihood of hallucination.” Suddenly, your LLM is much more likely to give you a true response, not merely a response that sounds true.

This is the sort of “boxing in” that can make LLMs actually useful and not hype. Otherwise, it’s just automated bullshit.
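The “cue card” mechanics can be sketched end to end in a few lines of Python. Everything here is invented for illustration: the stand-in embedder is a crude bag-of-characters vector rather than a real embedding model, the documents are hypothetical, and a production system would call an actual LLM with the augmented prompt.

```python
import numpy as np

def embed(text):
    # Stand-in embedder: a normalized bag-of-characters vector.
    # Real RAG systems use a trained embedding model; this only
    # illustrates the retrieve-then-augment mechanics.
    vec = np.zeros(26)
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1
    return vec / (np.linalg.norm(vec) or 1.0)

# A tiny "vector database": hypothetical proprietary facts stored
# alongside their embeddings.
documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am to 5pm, Monday through Friday.",
    "The enterprise plan includes single sign-on and audit logs.",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query, k=1):
    """Return the k documents whose embeddings are closest to the query's."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: -float(np.dot(q, pair[1])))
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query):
    """Augment the user's question with retrieved facts -- the 'cue card'."""
    context = "\n".join(retrieve(query))
    return f"Answer using only these facts:\n{context}\n\nQuestion: {query}"

# The augmented prompt would then be sent to the LLM.
print(build_prompt("What are your support hours?"))
```

The key design point is that the LLM never has to “remember” your proprietary facts: they are looked up at query time and placed directly in the prompt, which is what narrows the model’s room to make things up.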