A leading metal trading company in Latin America faced significant challenges analyzing its sales meeting minutes. The company aimed to derive valuable insights from highly unstructured data to drive business decisions. As their trusted LLM partner, we gathered the requirements, designed the architecture, and implemented a comprehensive solution.
The project objective was to provide a solution for analyzing unstructured meeting minutes from sales meetings to gain insights into sales volumes, complaints, and salespeople performance.
The sales meeting notes were written in Spanish and were highly unstructured. Each salesperson had a unique style and described different events and issues, making it difficult for others involved in the process (e.g., managers) to extract consistent information.
In order to improve sales processes and grow the organization, it was crucial for the business users to have a 360-degree view of sales operations, including details on salespeople, product performance, sales volumes, and complaints. Such a view would allow for cross-analysis between the accounts and trend-capturing. However, it required structuring data coming from different sources, following different patterns, and written in different languages.
While instant answers were not required, it was important to avoid user frustration and present summaries based on the gathered data in seconds rather than minutes.
Many ambiguities related to people’s names and company-specific acronyms made it difficult to analyze data and make data-driven decisions. It was crucial to ensure that the LLM understood all these ambiguities and that the entities mentioned in queries corresponded to the correct entities.
The end-to-end flow (processing the query, retrieving information, generating the response with the LLM, sending it to the API, and presenting it to the user) is complex, so optimizing its performance was crucial.
We built a robust data ingestion pipeline using GPT to extract predefined entities from the raw text. These entities included people, prices, products, places, and companies. The goal was to identify and label these entities properly for further processing.
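As an illustration of the labeling step, the sketch below validates a model reply before it moves further down the pipeline. It assumes, hypothetically, that the model is prompted to return a JSON list of objects with `text` and `label` fields; the label names and the `parse_entities` helper are illustrative and not taken from the project.

```python
import json

# The five predefined entity types named above (label spellings are assumed).
ALLOWED_LABELS = {"person", "price", "product", "place", "company"}


def parse_entities(model_output: str) -> list[dict]:
    """Parse the model's JSON reply and keep only well-formed, known entities."""
    try:
        items = json.loads(model_output)
    except json.JSONDecodeError:
        return []  # Malformed reply: nothing usable.
    if not isinstance(items, list):
        return []
    entities = []
    for item in items:
        if not isinstance(item, dict):
            continue
        text, label = item.get("text"), item.get("label")
        # Drop entries with empty text or a label outside the predefined set.
        if (isinstance(text, str) and text.strip()
                and isinstance(label, str) and label in ALLOWED_LABELS):
            entities.append({"text": text.strip(), "label": label})
    return entities
```

Discarding anything outside the predefined set keeps unexpected labels from reaching the linking stage.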
The next step was developing an Entity Linking/Disambiguation pipeline. This involved cross-checking recognized entities against a knowledge base to ensure accuracy. The knowledge base comprised entity names, categories, definitions, and sample usage to determine the correct entities. Our approach included several low-cost filtering steps, such as:
Human supervision was employed in cases of ambiguity, where no entity scored high, or multiple entities had similar scores. This ensured proper entity marking and provided feedback for further refinement.
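The routing rule described above (send a mention to a human when no candidate scores high or when the top candidates score too close together) can be sketched as follows. This is a minimal illustration under stated assumptions: the string-similarity scorer from Python's `difflib`, the category filter, the thresholds, and all names here are hypothetical stand-ins, not the project's actual scoring or filtering steps.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class KnowledgeBaseEntry:
    entity_id: str
    name: str
    category: str


def similarity(mention: str, candidate: str) -> float:
    """Case-insensitive string similarity in [0, 1] (a stand-in scorer)."""
    return SequenceMatcher(None, mention.lower(), candidate.lower()).ratio()


def link_entity(mention, category, knowledge_base, min_score=0.75, min_margin=0.10):
    """Return (entity_id, status), where status is 'linked' or 'needs_review'."""
    # Assumed low-cost filter: compare only against entries of the same category.
    candidates = [e for e in knowledge_base if e.category == category]
    scored = sorted(
        ((similarity(mention, e.name), e) for e in candidates),
        key=lambda pair: pair[0],
        reverse=True,
    )
    # No candidate scored high enough: route to a human.
    if not scored or scored[0][0] < min_score:
        return None, "needs_review"
    # The two best candidates are too close to call: route to a human.
    if len(scored) > 1 and scored[0][0] - scored[1][0] < min_margin:
        return None, "needs_review"
    return scored[0][1].entity_id, "linked"
```

With a knowledge base containing both "Juan Pérez" and "José Pérez", a mention such as "J. Pérez" scores the same against both and is flagged for review rather than linked to an arbitrary one.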
Finally, we enabled users to ask natural language questions, such as “How many meetings did José attend in the past month?” These questions were translated into MongoDB queries, allowing efficient retrieval of structured data from the preprocessed notes. The answer was then presented to the end user in tabular or text format, according to their preferences.
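To make the translation step concrete, here is a sketch of the kind of MongoDB filter the example question might map to once "José" has been resolved to a knowledge-base entity. The field names (`attendees.entity_id`, `meeting_date`), the entity ID, and the reading of "past month" as the last 30 days are all assumptions for illustration; the actual document schema is not described in the source.

```python
from datetime import datetime, timedelta, timezone


def meetings_attended_filter(person_id: str, now: datetime, days: int = 30) -> dict:
    """Build a MongoDB filter for meetings a person attended in the last `days` days."""
    return {
        # Match on the linked entity ID rather than the raw name in the notes.
        "attendees.entity_id": person_id,
        # Half-open time window: [now - days, now).
        "meeting_date": {"$gte": now - timedelta(days=days), "$lt": now},
    }


query = meetings_attended_filter("person_jose", datetime.now(timezone.utc))
# With pymongo, the count could then be obtained with
# collection.count_documents(query) on the (hypothetical) meetings collection.
```

Filtering on a linked entity ID instead of the name as typed is what lets the disambiguation work pay off at query time.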
The implemented solution gave our client a powerful tool to analyze their sales meeting minutes effectively. Key benefits included:
Users could quickly derive insights from unstructured meeting notes and analyze patterns and trends across clients in terms of prices, sales volumes, and similar metrics, enabling better decision-making.
Business users could view detailed information on sales operations, product performance, and complaints, improve offers, and detect the main issues or claims raised by the company's customers.
Summaries were generated in seconds, so business users could get the information they needed on the spot (e.g., during a sales call).
Entities were identified and linked more accurately, reducing ambiguities.