Generative AI

Generative-AI-powered chatbot with multi-level information access

A demo RAG chatbot

Extracting information from unstructured data is a common use case for generative AI. While this may seem straightforward at first glance, it poses several challenges related to data security, accuracy, and compliance. To show how these challenges can be addressed, we developed a demo chatbot with multi-level information access that demonstrates strategies for searching for information within complex reports.

01

Challenges

Large internal database

Corporate knowledge bases often consist of large volumes of unstructured data: files, reports, notes, etc. Most methods of searching through these documents are ineffective because they require a lot of manual work.

Multi-level access

Depending on a company’s internal policies, different departments, or even specific roles, may be authorized to access only certain documents. When implementing generative AI to help users search the knowledge base, it is crucial to determine which pieces of information each user is authorized to access and to build a chatbot that enforces this multi-level information access.
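
The idea can be sketched in a few lines of Python. This is a minimal illustration, not the demo's actual code: the access levels, the `Chunk` type, the `visible_chunks` helper, and the sample report content are all made up for the example. The point it shows is that filtering happens before retrieval, so text a user is not cleared for never reaches the model.

```python
from dataclasses import dataclass

# Hypothetical access levels; a real deployment would derive these from company policy.
ACCESS_LEVELS = {"public": 0, "internal": 1, "confidential": 2}


@dataclass(frozen=True)
class Chunk:
    text: str
    source: str
    level: str  # minimum access level required to read this chunk


def visible_chunks(chunks, user_level):
    """Return only the chunks the user is cleared to read."""
    clearance = ACCESS_LEVELS[user_level]
    return [c for c in chunks if ACCESS_LEVELS[c.level] <= clearance]


# Invented sample content, used only to exercise the filter.
REPORT = [
    Chunk("The company operates in several countries.", "report, section 1", "public"),
    Chunk("Quarterly revenue figures by region.", "report, section 2", "internal"),
    Chunk("Planned acquisition budget.", "report, section 3", "confidential"),
]

if __name__ == "__main__":
    for chunk in visible_chunks(REPORT, "internal"):
        print(chunk.source, "->", chunk.text)
```

Applying the filter to the candidate chunks, rather than to the finished answer, keeps restricted text out of the prompt entirely.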

LLM hallucinations

In general, large language models generate answers that are syntactically correct. If they don’t know the right answer, they return one that merely seems probable. In other words, they hallucinate. If we want to rely on a model’s responses, those responses need to be trustworthy.

Trust

Hallucinations are a well-known problem of LLMs, and many people are aware of them. Reducing them was therefore not enough: without a way to verify a response, users would never be sure whether it was correct or just probable.

02

Solutions

Retrieval Augmented Generation (RAG)

RAG is an AI framework for retrieving information from external sources of knowledge to ground LLMs. Thanks to that, the model is able to provide users with accurate, up-to-date information that comes from the designated data sources (e.g., the reports), and there is no need for additional training.
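
To make the retrieve-then-ground flow concrete, here is a minimal sketch in plain Python. It is an illustration under simplifying assumptions, not the demo's implementation: word overlap stands in for the embedding-based similarity search a vector store would perform, no model is called, and the function names (`retrieve`, `build_prompt`) and sample passages are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, passages, k=2):
    """Return up to k passages that share the most words with the question.

    Word overlap is a toy stand-in for vector similarity search.
    """
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(p)), p) for p in passages]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]


def build_prompt(question, context):
    """Place the retrieved passages in the prompt so the model answers from them."""
    joined = "\n".join(f"- {p}" for p in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}"
    )


# Invented sample passages, used only for illustration.
PASSAGES = [
    "The warranty period for all devices is 24 months.",
    "Support tickets are answered within two business days.",
    "The office is closed on public holidays.",
]

if __name__ == "__main__":
    question = "How long is the warranty period?"
    print(build_prompt(question, retrieve(question, PASSAGES, k=1)))
```

Because the knowledge lives in the passages rather than in the model's weights, updating the documents updates the answers without any retraining.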

Model returning answer with the source of the information

The model was specifically instructed to say so when the answer was not found in the database, rather than make one up. When it could provide an answer, it added a footnote indicating the source of the information. That way, users were assured that the answer was not a “hallucination.”
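
A sketch of how this behavior could be wired up is shown below. It is an assumption-laden illustration, not the prompt or code used in the demo: the wording of `NOT_FOUND` and `SYSTEM_PROMPT`, the `answer_with_source` helper, and the `ask_model` callable (a placeholder for whatever function sends a system and user message to the LLM) are all hypothetical.

```python
# Hypothetical fallback wording; the demo's actual instruction text is not shown in this case study.
NOT_FOUND = "I could not find this information in the available documents."

SYSTEM_PROMPT = (
    "Answer only from the provided context. "
    f"If the context does not contain the answer, reply exactly: {NOT_FOUND}"
)


def answer_with_source(question, hits, ask_model):
    """Answer a question and append a footnote listing the sources used.

    hits: list of (text, source) pairs returned by retrieval.
    ask_model: placeholder callable(system_prompt, user_prompt) -> str.
    """
    if not hits:
        # Nothing was retrieved, so skip the model call instead of letting it guess.
        return NOT_FOUND
    context = "\n".join(text for text, _ in hits)
    reply = ask_model(SYSTEM_PROMPT, f"Context:\n{context}\n\nQuestion: {question}").strip()
    if reply == NOT_FOUND:
        return NOT_FOUND
    sources = sorted({source for _, source in hits})
    footnote = "\n".join(f"[{i}] {s}" for i, s in enumerate(sources, start=1))
    return f"{reply}\n\nSources:\n{footnote}"


if __name__ == "__main__":
    # A stand-in model that always returns the same sentence.
    def fake_model(system_prompt, user_prompt):
        return "The warranty period is 24 months."

    hits = [("The warranty period for all devices is 24 months.", "manual, section 3")]
    print(answer_with_source("How long is the warranty?", hits, fake_model))
```

Returning the fallback before calling the model when retrieval comes back empty removes one opportunity for a made-up answer, and the footnote lets users check the original document themselves.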

Proof of Concept to prove the technical feasibility of the chosen approach

For the sake of this demo, integrating an extensive knowledge base into the model would be overkill. As our main goal was to validate the technical feasibility of the designed solution, we decided to use a sample document, RAG, and a simple chatbot interface. Additionally, we added a multi-level access feature that manages users’ access to specific parts of the report.

Technology we used

GPT-4
LangChain
Chroma
Chainlit
Unstructured.io

03

Gains

The PoC proved the technical feasibility of the designed solution. When implemented, it can:

Be easily adopted thanks to a familiar interface

One of the objectives of incorporating new tools is to make work more efficient. To become a true alternative to previously used solutions, they need to be intuitive, and their onboarding time should be as short as possible. A familiar chatbot interface means any employee can use the tool from day one.

Speed up finding information in unstructured data

Instead of searching for the right report and then looking for the information in that report, users tell the Gen AI-powered chatbot what they need to know, and the model returns an answer with a reference to the original document, saving hours of research work.

Help organizations tackle compliance issues

Assigning different levels of authorization to different roles or departments makes it easy to manage access to confidential information and comply with corporate policies.
