On Vector Databases and Gen AI. Q&A with Frank Liu

Q1. From a database perspective, what are the main challenges that Generative AI (GenAI) poses?

The biggest challenge that Gen AI poses is scale. We have so many sources of data – data used to train the models, organization-specific proprietary data that we would like to feed into the models, and outputs generated by the models as well. Developers need a way to store, index, and manage this wealth of data – that’s where vector databases fit in.

Q2. Do you have any recommendations on choosing the right Large Language Model for a given business?

If you’re seeking a general-purpose language model, you can choose one from any of the top vendors, such as OpenAI, Anthropic (Claude), Google (Gemini, formerly Bard), and many more. However, if you’re looking for a more customizable solution or one tailored to a specific industry, consider an open-source model such as Llama 2, Mistral, or Falcon. In particular, fine-tuning an open-source model on your own data can be very effective.

Q3. Can you tell us the main benefits of applying Retrieval-Augmented Generation (RAG) on Large Language Models (LLMs)?

Retrieval augmented generation, or RAG, is a technique to enhance the performance of large language models or generative AI applications. It involves providing additional data to the system beyond what it was originally trained on. To achieve this, you can use a vector database to search your internal corpus of documents, images, and videos; the most relevant results are then sent along with the original prompt as context to the language model. This approach allows the system to generate more accurate and relevant responses.
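
To make the retrieval step concrete, here is a minimal sketch in Python. It is not how a production vector database works internally: the embed function is a toy bag-of-words stand-in for a real embedding model, the search is a brute-force scan rather than an index, and all names and sample documents are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents most similar to the query (brute-force scan)."""
    query_vec = embed(query)
    ranked = sorted(
        corpus,
        key=lambda doc: cosine_similarity(query_vec, embed(doc)),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical internal corpus, for illustration only.
corpus = [
    "Vector databases index embeddings for similarity search.",
    "The cafeteria menu changes every Monday.",
    "Embeddings map documents to vectors so similar items are close together.",
]
query = "how do vector databases search embeddings"
top_docs = retrieve(query, corpus, top_k=2)
for doc in top_docs:
    print(doc)
```

In a real system, the embeddings would come from a trained model and the scan would be replaced by the vector database's own index and search call.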

Q4. There is an increasing interest in vector databases (also known as vector storage). Why?

It is partly because of the connection to retrieval augmented generation we covered earlier. More broadly, I think people are also starting to see that vector databases are great for applications outside of generative AI, such as product recommendation, fraud and anomaly detection, molecular search, and more. A big part of this is also because this technology has become democratized and is more broadly accessible to individual developers and large non-tech-native organizations alike.
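
As one illustration of a non-generative use, anomaly detection can be framed as a nearest-neighbor question: a vector that sits far from everything seen before is suspicious. The sketch below assumes Euclidean distance, a hand-picked threshold, and made-up two-dimensional vectors; it shows the general idea rather than any particular product's approach.

```python
import math


def euclidean(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def anomaly_score(candidate: list[float], known_normal: list[list[float]]) -> float:
    """Distance from the candidate to its nearest neighbor among known-normal vectors."""
    return min(euclidean(candidate, vec) for vec in known_normal)


# Hypothetical embeddings of past, known-good transactions.
known_normal = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
THRESHOLD = 2.0  # assumed cutoff; in practice tuned on real data

typical = [0.5, 0.5]
outlier = [10.0, 10.0]

print(anomaly_score(typical, known_normal) > THRESHOLD)  # typical point is not flagged
print(anomaly_score(outlier, known_normal) > THRESHOLD)  # far-away point is flagged
```

A vector database plays the role of the min over known_normal here, answering the nearest-neighbor query without scanning every stored vector.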

Q5. What is the role of vector databases in Generative AI applications?

When it comes to Generative AI, you can leverage a vector database to store and search for more relevant material to feed your Generative AI model. That way, your Generative AI apps can generate more accurate and relevant answers to your questions. 

Q6. How do vector databases and RAG fit together?

Usually, you’d use a vector database to retrieve the most relevant documents first and then send them, along with the original prompt, to your large language model. That whole process together is called RAG.

Q7. Can you give us an example?

Let’s use a large financial services company as an example. They have access to many analyst reports, S-1 filings, and other financial data from public companies. Sometimes, they may need to ask specific questions about a company’s financials, such as its Q3 revenue in 2024. In such cases, they can use a vector database and embeddings to retrieve the documents most relevant to that company’s Q3 earnings. They can then send these documents and the original query to a large language model for processing.

This process is known as RAG (Retrieval Augmented Generation). Adding the retrieved documents to the original prompt allows the large language model to generate more accurate and relevant answers without specific training on that data. RAG is a powerful way to augment your generative AI capabilities and improve the accuracy of your large language models.
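
The prompt-augmentation step described above can be sketched as follows. This assumes the relevant documents have already been retrieved; the company name, document excerpts, prompt wording, and echo_llm stub are all hypothetical placeholders, and a real application would call an actual model API where the stub is.

```python
from typing import Callable


def build_rag_prompt(question: str, retrieved_docs: list[str]) -> str:
    """Combine retrieved documents and the original question into one prompt."""
    context = "\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(retrieved_docs, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def answer_with_rag(
    question: str, retrieved_docs: list[str], llm: Callable[[str], str]
) -> str:
    """Send the augmented prompt to whichever language model is supplied."""
    return llm(build_rag_prompt(question, retrieved_docs))


def echo_llm(prompt: str) -> str:
    """Stub standing in for a real LLM call; it only reports the prompt size."""
    return f"(stub) received a prompt of {len(prompt)} characters"


# Placeholder text; in practice these come from a vector database search.
retrieved_docs = [
    "Placeholder excerpt from ExampleCorp's Q3 earnings release.",
    "Placeholder excerpt from an analyst report on ExampleCorp.",
]
question = "What was ExampleCorp's Q3 revenue?"

prompt = build_rag_prompt(question, retrieved_docs)
reply = answer_with_rag(question, retrieved_docs, echo_llm)
print(prompt)
print(reply)
```

Passing the model in as a plain callable keeps the sketch independent of any vendor: the same prompt assembly works whether the model is hosted or open-source.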

Q8. Regarding vector databases, what are your practical tips for machine learning engineers? 

It’s important to understand the database side of vector search before delving into the challenges of building a scalable vector search system that can handle thousands or tens of thousands of queries per second. This understanding will help you appreciate the issues vector databases aim to address. It will also give you a better understanding of why having a purpose-built vector database with many essential database features is necessary. 

To get the full lowdown on this topic, head over to

https://zilliz.com/learn/what-is-vector-database

…………………………………………………

Frank Liu | Head of AI & ML

Frank Liu is the Head of AI & ML at Zilliz, with over eight years of industry experience in machine learning and hardware engineering. Before joining Zilliz, Frank co-founded an IoT startup based in Shanghai and worked as an ML Software Engineer at Yahoo in San Francisco. He presents at major industry events like the Open Source Summit and writes tech content for leading publications such as Towards Data Science and DZone. His passion for ML extends beyond the workplace; in his free time, he trains ML models and experiments with unique architectures. Frank holds MS and BS degrees in Electrical Engineering from Stanford University.

Sponsored by Zilliz
