In LLMs, a hallucination is any discrepancy between the expected output and the output the model actually produces.
As with any new technology, bleeding-edge companies are rushing to see where they can put LLMs to use. On the surface, that's commendable. Organizations don't want to be caught flat-footed when these generational platform opportunities come along.
But with new capabilities come new risks, and LLMs are no different. Chief among them is something called a "hallucination," where an LLM like ChatGPT states incorrect information as fact. That could be as simple as asking who the current president is and getting "Abraham Lincoln" back. That's funny for a hobbyist, but dangerous for an enterprise company relying on the LLM's accuracy.
We'll go over a technique for addressing hallucinations called Retrieval-Augmented Generation, or RAG, which helps reduce that risk in modern LLMs.
It’s no silver bullet, but if you’re considering building LLM solutions, and worried about safety, it should be a core requirement.
Hallucinations usually stem from inconsistencies in the training data, or from pushing the LLM beyond its limits.
If an LLM is trained on incorrect data, those errors get amplified when it answers questions. It has no way of knowing the data is incorrect, because it just gets very good at reproducing its training data. As far as it's concerned, that training data is its knowledge of the world.
LLMs also make things up when they don't have up-to-date knowledge. Asking ChatGPT who the president-elect is in December of this year might get you an inaccurate answer, depending both on when the training data was collected and on the election results.
There are two main types of hallucinations that LLMs produce: errors where the model mishandles correct information it was given, and outright fabrications.
The first type happens when the LLM has all the right data but still comes to the wrong answer.
If you give it the text of a document and ask it to summarize that document, it's not guaranteed to do so perfectly every time. It had the right context, but didn't carry the content over correctly.
Another great example is math. LLMs can’t do math - they’re predicting sequences, not performing calculations. So when you ask an LLM to do complicated math, it’s likely it won’t arrive at the correct answer.
These errors can be dangerous as well. If an LLM summarizes a policy for a customer, you have to be able to rely on that summary being accurate.
The second type, outright fabrication, is what organizations usually worry about. You ask ChatGPT who the first person to walk on the moon was, and it responds with Vladimir Putin.
But fabrications can happen in an enterprise setting, too. Imagine asking your company's LLM to describe the refund policy, and it makes up an answer that's nowhere in the terms and conditions. Without giving the LLM access to those documents, it's pulling from the only information it has: its training data. That's not reliable for an organization.
A technique called Retrieval-Augmented Generation, or RAG, is a promising way to mitigate hallucinations.
The concept is simple - give the LLM access to knowledge beyond its training data. When a question comes in, the system encodes it and compares it against an external store called a vector database, making it much easier to find the documents most likely to contain the answer.
Storing data in its raw form can be problematic for search. Searching for text data is easy enough if you know the exact phrase, like searching “2024 Toyota Rav4”.
But what if you want to ask a more nuanced question? Like "What's the best trim option for a 2024 Toyota Rav4 when I'm budget-conscious but also concerned about having the best features?" You might get lucky and that exact question is the title of an article. But odds are, it isn't. And traditional keyword search matches on the words themselves rather than on meaning, so you won't get great results.
A vector database makes it much easier to search the way humans search. It stores a mathematical representation of each document, called a vector embedding, computed by an embedding model.
From there, it’s straightforward to get this working for your LLM.
There are mathematical measures, like cosine similarity, that help find similar vectors. Source data is converted into vectors with hundreds or thousands of dimensions, which makes it possible to find things that are similar - not just exact matches.
We can ask a question and get back, say, the five documents whose content is closest to the question.
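Here's a rough sketch of that similarity search in Python. The three-dimensional vectors are a toy stand-in for real embeddings, which have hundreds or thousands of dimensions and come from an embedding model rather than being written by hand:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: 1.0 means the vectors point the same way, ~0 means unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k(question_vec: np.ndarray, doc_vecs: list, k: int = 5) -> list:
    """Return the indices of the k document vectors closest to the question vector."""
    scores = [cosine_similarity(question_vec, v) for v in doc_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

# Toy document embeddings (in practice, produced by an embedding model).
doc_vecs = [
    np.array([0.9, 0.1, 0.0]),   # "2024 RAV4 trim comparison"
    np.array([0.2, 0.8, 0.1]),   # "History of ancient Rome"
    np.array([0.85, 0.2, 0.1]),  # "Budget-friendly SUV buying guide"
]
question_vec = np.array([0.88, 0.15, 0.05])  # "best RAV4 trim on a budget"

print(top_k(question_vec, doc_vecs, k=2))  # -> [0, 2]: the two car-related documents
```

A real vector database does the same thing at scale, with indexing structures that avoid comparing the question against every document one by one.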
So, a vector database becomes our "knowledge base," loaded with original data from multiple sources. Then, when a user asks a question, the system:

1. Converts the question into a vector embedding
2. Finds the documents in the knowledge base closest to that embedding
3. Passes those documents to the LLM as context for its answer
4. Returns the answer along with the source documents it was based on
That last part is crucial.
Just like Wikipedia with citations, we can have more confidence in our answer if the source is included. We can tell if the answer is hallucinated or real.
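Put together, a minimal version of that loop might look like the sketch below. The `embed`, `search`, and `call_llm` functions are placeholders for whichever embedding model, vector database, and LLM API you use; the point is the shape of the flow, not a particular vendor:

```python
def answer_with_sources(question, knowledge_base, embed, search, call_llm, k=5):
    """Retrieve the k most relevant documents, then answer strictly from them."""
    # 1. Convert the question into a vector embedding.
    question_vec = embed(question)

    # 2. Find the closest documents in the vector database.
    #    (Assumes each retrieved document exposes its text as `doc.text`.)
    documents = search(knowledge_base, question_vec, k=k)

    # 3. Build a prompt that includes the retrieved text as numbered sources.
    context = "\n\n".join(f"[{i + 1}] {doc.text}" for i, doc in enumerate(documents))
    prompt = (
        "Answer the question using only the sources below. "
        "Cite the source number for every claim. "
        "If the sources don't contain the answer, say you don't know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )

    # 4. Return the LLM's answer along with the documents it was based on,
    #    so the caller can show citations to the user.
    answer = call_llm(prompt)
    return answer, documents
```

Because the retrieved documents come back alongside the answer, the application can display them as citations - the same safeguard described above.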
If we give an LLM access to data outside its training set, we can unlock entirely new capabilities while reducing the chance for error.
Any data scientist will tell you that data quality forms the bedrock of any successful AI project. RAG is no different. It doesn't matter how well-designed your system or vector database is - garbage in equals garbage out.
What does data quality actually mean in the context of RAG? At a minimum, the documents in your knowledge base should be accurate, up to date, and relevant to the questions your users will actually ask.
That last point is important. Giving your LLM all the data you have won’t make its performance better. Imagine sifting through articles about ancient Rome when you’re trying to locate the best way to change the oil on a mid-2000s Mercedes. At best, you’re increasing retrieval time. At worst, you’re confusing the LLM and degrading its answers.
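One practical way to keep the knowledge base focused is to tag documents with simple metadata and only index (or only retrieve) what's relevant to the use case. The tags below are hypothetical, and most vector databases also support filtering on this kind of metadata at query time:

```python
# Hypothetical document records, each tagged with a topic.
documents = [
    {"text": "How to change the oil on a mid-2000s Mercedes...", "topic": "maintenance"},
    {"text": "A history of ancient Rome's aqueducts...", "topic": "history"},
    {"text": "2024 Toyota RAV4 trim levels compared...", "topic": "buying"},
]

# For an automotive assistant, only the automotive topics belong in the index.
RELEVANT_TOPICS = {"maintenance", "buying"}
to_index = [doc for doc in documents if doc["topic"] in RELEVANT_TOPICS]
```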
Similarly, consider the audience. If you're building a system for teenagers but all of your knowledge base documents are scientific papers, your similarity matching will suffer, because the tone and structure of the questions won't match the documents. This isn't an unsolvable problem, but it does require more care in prompting and setup.
LLMs are unpredictable. The same prompt in GPT-3.5 might behave differently (or worse) in GPT-4 or GPT-4o.
The best hedge against this is a bulletproof, finely-tuned prompt - one that spells out the task explicitly, tells the model when and how to use the retrieved documents, requires it to cite its sources, and gives it permission to say it doesn't know rather than guess.
Prompt adjustment is an often-overlooked aspect of RAG. Simply telling an LLM that it has access to an external database isn’t useful if the LLM doesn’t have context on what the database is used for, when to access it, what the data looks like, etc.
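As a sketch, a system prompt for a RAG assistant might look something like this. The company name and exact wording are illustrative, not a drop-in template:

```python
# Illustrative system prompt for a RAG assistant; tune the wording, tone,
# and company details for your own data and audience.
SYSTEM_PROMPT = """You are a support assistant for Acme Insurance.

You will be given a customer question and a set of numbered excerpts
retrieved from the official policy documents.

Rules:
- Answer using only the information in the excerpts.
- Cite the excerpt number for every claim you make.
- If the excerpts do not contain the answer, say:
  "I don't have enough information to answer that."
- Do not rely on outside knowledge, even if you think you know the answer.
"""
```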
We've covered the beginning and middle of the RAG pipeline - setting up for success with quality data, and structuring the retrieval correctly with focused prompts.
But once we retrieve the information, we need to make sure it’s correct and useful.
There are a few ways to do this. One of the most effective is an evaluation step: before an answer reaches the user, check it against the documents it was retrieved from. This is an evolving practice, but still an underrated way to improve the reliability of your system.
It can be as simple or as complex as needed - from running the answer through a GPT-powered grading step, to checking it against the original database and keeping track of hallucinations over time.
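A minimal version of that grading step might look like the sketch below, where a second LLM call checks whether the answer is actually supported by the retrieved sources. `call_llm` is again a placeholder for whichever model API you use:

```python
def grade_answer(question, answer, sources, call_llm):
    """Ask a grader model whether the answer is fully supported by the sources."""
    numbered = "\n\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sources))
    grading_prompt = (
        "You are grading an AI assistant's answer for hallucinations.\n\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        f"Sources:\n{numbered}\n\n"
        "Reply SUPPORTED if every claim in the answer is backed by the sources, "
        "otherwise reply UNSUPPORTED."
    )
    verdict = call_llm(grading_prompt)
    return verdict.strip().upper().startswith("SUPPORTED")

# Answers that fail the check can be blocked, rewritten, or flagged for review,
# and the failures logged to track how often the system hallucinates.
```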
This is just the tip of the iceberg. Introducing RAG into your product’s architecture means new opportunities, but also new problems. Prompt injection, data integrity and security, system performance, cost…all points to consider.
If you're not sure where to start, we've built enterprise-level solutions for Fortune 500 clients. Our team is ready to understand your specific use case and build a custom system around it.