Evaluating The Performance Of LLMs On SQL And NoSQL Database Using LangSmith
DOI:
https://doi.org/10.64252/5h2yej74
Keywords:
LLMs, RAG, database, SQL, LangSmith, OpenAI, Meta AI, Google AI, Anthropic, accuracy, correctness, error rate and latency.
Abstract
The integration of Large Language Models (LLMs) with organizational databases enables powerful Retrieval-Augmented Generation (RAG) systems for advanced data analysis and informed decision making. Existing solutions primarily demonstrate RAG implementations over PDFs or SQL databases and often lack a comprehensive evaluation of LLM performance. In contrast, this research aims to identify the most effective RAG system for both SQL and NoSQL databases by evaluating the performance of twelve leading LLMs from OpenAI, Meta AI, Google AI, and Anthropic. The evaluation uses LangSmith to assess performance across key metrics such as accuracy, correctness, error rate, P50 latency, and P99 latency, and ultimately proposes the best-suited LLM for RAG-based database applications.
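To make the latency and error metrics named in the abstract concrete, the following is a minimal sketch of how accuracy, error rate, P50 latency, and P99 latency could be computed from a list of per-query run records. It is not the paper's evaluation code and does not call the LangSmith API: the `RunRecord` fields, the `summarize` helper, and the nearest-rank percentile definition are all assumptions chosen for illustration, and LangSmith may compute its percentiles differently.

```python
import math
from dataclasses import dataclass


@dataclass
class RunRecord:
    """Hypothetical record of one evaluated query (field names are assumptions)."""
    latency_s: float   # wall-clock time for the model to answer, in seconds
    errored: bool      # True if the run raised an error or produced no usable answer
    correct: bool      # True if the answer matched the reference answer


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(runs):
    """Aggregate per-run records into the summary metrics listed in the abstract."""
    n = len(runs)
    latencies = [r.latency_s for r in runs]
    return {
        "accuracy": sum(r.correct for r in runs) / n,
        "error_rate": sum(r.errored for r in runs) / n,
        "p50_latency_s": percentile(latencies, 50),
        "p99_latency_s": percentile(latencies, 99),
    }


if __name__ == "__main__":
    # Ten synthetic runs with latencies of 1..10 seconds (illustrative data only).
    demo_runs = [
        RunRecord(latency_s=float(i), errored=(i == 10), correct=(i <= 8))
        for i in range(1, 11)
    ]
    print(summarize(demo_runs))
```

P50 is the median latency a typical query experiences, while P99 captures the slow tail, which is why the two are reported separately when comparing models.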