Generative AI

Generative-AI-powered chatbot with multi-level information access

Year

2023

Services

Generative AI Development

Technologies

GPT, LangChain, Chainlit

A demo RAG chatbot

Extracting information from unstructured data is a common use case for generative AI. While this may seem straightforward at first glance, it poses several challenges related to data security, accuracy, and overall compliance. To showcase how these challenges can be addressed, we developed a demo chatbot with multi-level information access that demonstrates strategies for searching for information within intricate reports.

Challenges

Large internal database

Corporate knowledge bases often consist of large volumes of unstructured data: files, reports, notes, etc. Most methods of searching through these documents are ineffective because they require a lot of manual work.

Multi-level access

Depending on a company’s internal policies, different departments, or even specific roles, may be authorized to access only certain documents. When implementing generative AI to help users search the knowledge base, it is crucial to determine which pieces of information each user is authorized to access and to build the chatbot with multi-level information access.
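One way to picture multi-level access is as a filter applied to document chunks before any of them reach the model. The following is a minimal sketch under stated assumptions, not the demo's actual implementation: the level names, the `Chunk` type, and the `visible_chunks` helper are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical clearance levels; a real deployment would map these
# to its own departments and roles.
ACCESS_LEVELS = {"public": 0, "internal": 1, "confidential": 2}


@dataclass(frozen=True)
class Chunk:
    text: str          # a passage from the report
    source: str        # where the passage comes from
    access_level: str  # minimum clearance needed to read it


def visible_chunks(chunks, user_level):
    """Return only the chunks the user's clearance level allows."""
    limit = ACCESS_LEVELS[user_level]
    return [c for c in chunks if ACCESS_LEVELS[c.access_level] <= limit]


# Illustrative placeholder data, not taken from the demo.
chunks = [
    Chunk("Public summary of the report.", "report, section 1", "public"),
    Chunk("Internal process notes.", "report, section 2", "internal"),
    Chunk("Confidential appendix.", "report, section 3", "confidential"),
]

for level in ACCESS_LEVELS:
    print(level, [c.source for c in visible_chunks(chunks, level)])
```

Filtering before retrieval, rather than after generation, means restricted text never enters the prompt, so the model cannot leak it in an answer.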

LLM’s hallucinations

In general, large language models generate answers that are syntactically correct. If they don’t know the right answer, they return an answer that seems probable. In other words, they hallucinate. However, if we want to rely on the model’s responses, it needs to be trustworthy.

Trust

Hallucinations are a well-known problem of LLMs, and many users are aware of it. Reducing them was therefore not enough: without a way to verify a response, users would never be sure whether it was correct or merely probable.

Solutions

Generative AI + RAG | Simplified architecture

Retrieval Augmented Generation (RAG)

RAG is an AI framework for retrieving information from external sources of knowledge to ground LLMs. Thanks to that, the model is able to provide users with accurate, up-to-date information that comes from the designated data sources (e.g., the reports), and there is no need for additional training.
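The retrieve-then-ground flow can be sketched in a few lines. The demo lists Chroma and GPT-4 among its technologies; this sketch deliberately swaps in a toy word-overlap (bag-of-words cosine) score instead of a vector store and stops at building the prompt, so it runs without any external service. The function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        passages, key=lambda p: cosine(q, Counter(tokenize(p))), reverse=True
    )
    return ranked[:k]


def build_prompt(question, passages):
    """Ground the question in the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Illustrative placeholder passages.
passages = [
    "The warranty period is two years.",
    "The office is closed on Sundays.",
    "Shipping takes five days.",
]
question = "How long is the warranty period?"
print(build_prompt(question, retrieve(question, passages, k=1)))
```

In a real system the scoring step would be replaced by embedding similarity in a vector database, and the resulting prompt would be sent to the LLM.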

Model returning answer with the source of the information

If the answer was not found in the database, the model was specifically instructed to say so and not to make the answer up. Additionally, when the model was able to provide an answer, it added a footnote indicating the information source. That way, users were assured that the answer was not a model “hallucination.”
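The two behaviours described above, refusing when nothing is found and citing the source otherwise, could be expressed roughly as follows. This is a sketch under assumptions: the prompt wording, the `[n]` citation convention, and the `add_footnote` helper are hypothetical and not taken from the demo.

```python
import re

# Hypothetical fixed refusal message the model is told to use.
NOT_FOUND = "I could not find this information in the provided documents."

# Hypothetical system prompt; the demo's actual wording is not published here.
SYSTEM_PROMPT = (
    "Answer only from the numbered context passages. "
    f"If the answer is not in them, reply exactly: {NOT_FOUND} "
    "Otherwise, cite the passage number in brackets, e.g. [1]."
)


def add_footnote(answer, sources):
    """Append a source footnote for each [n] marker cited in the answer."""
    if answer.strip() == NOT_FOUND:
        return answer  # nothing to cite when the model declined to answer
    cited = sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
    notes = [f"[{n}] {sources[n - 1]}" for n in cited if 1 <= n <= len(sources)]
    if not notes:
        return answer
    return answer + "\n\nSources:\n" + "\n".join(notes)


# Illustrative placeholder answer and source.
print(add_footnote("The warranty is two years [1].", ["report, page 4"]))
```

Resolving citation markers in code, instead of trusting the model to write out the source itself, keeps the footnote tied to passages that were actually retrieved.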

Proof of Concept to prove the technical feasibility of the chosen approach

For the sake of this demo, integrating an extensive knowledge base into the model would be overkill. As our main goal was to validate the technical feasibility of the designed solution, we decided to use a sample document, RAG, and a simple chatbot interface. Additionally, we added a multi-level access feature that manages users’ access to specific parts of the report.

Technology we used

GPT-4

LangChain

Chroma

Chainlit

Unstructured.io

Gains

The PoC proved the technical feasibility of the designed solution. When implemented, it can:

Be easily adopted thanks to a familiar interface

One of the objectives of incorporating new tools is to make work more efficient. To become a true alternative to previously used solutions, they need to be intuitive, and their onboarding time should be as short as possible. A familiar chatbot interface means any employee can use the tool from day one.

Speed up finding information in unstructured data

Instead of searching for the right report and then looking for the information in that report, users tell the Gen AI-powered chatbot what they need to know, and the model returns an answer with a reference to the original document, saving hours of research work.

Help organizations tackle compliance issues

Assigning different levels of authorization to different roles or departments makes it easy to manage access to confidential information and comply with corporate policies.