On Vector Databases. Q&A with Akmal Chaudhri

Q1. You’re planning to write a new book exploring vector databases—what is it about this particular moment in data technology that makes you want to dive deep into this subject now?

We’re at a genuine inflection point. For decades, databases were primarily about structured records – rows, columns, keys and transactions. What’s changed in the last two years is that unstructured data and machine-generated meaning have moved to the centre of application design, largely driven by generative AI. Vector databases sit exactly at that intersection.

What makes this moment compelling is that vector search is no longer experimental – it’s now foundational. It’s powering retrieval-augmented generation, multimodal search, recommendation systems and intelligent assistants across industries. At the same time, the ecosystem is evolving quickly, which creates both excitement and confusion.

This feels like the right moment to step back and provide clarity: not vendor hype, not theoretical ML research, but a practical, comparative view of how these systems actually work in real applications.

Q2. Looking back at your career working with databases and data technologies, what lessons or insights do you think will be most relevant as you approach writing about vector databases? How has your perspective on teaching technical concepts evolved?

One of the biggest lessons I’ve learned is that technology only becomes meaningful when people can connect it to a problem they already understand. Early in my career, I focused heavily on features and performance benchmarks. Over time, I’ve realized that what matters more is context: why a technology exists, what it replaces, what trade-offs it makes and what kinds of mistakes people commonly make when adopting it.

My approach to teaching has shifted from “here’s how it works” to “here’s why you’d use it, when it breaks down and how it fits into a broader system.” With vector databases, in particular, I think it’s crucial to connect the dots between embeddings, search, ranking, memory and real user-facing applications rather than treating vectors as an isolated technical novelty.

Q3. Vector databases are appearing everywhere right now, from AI applications to search systems. From your vantage point, what do you see as the most misunderstood or under-appreciated aspects of this technology that you’d want to clarify?

One major misunderstanding is the idea that vector databases are simply “faster search engines for AI.” In reality, they introduce a fundamentally different retrieval model based on semantic similarity rather than exact matching. That shift has consequences – both powerful and risky.

Another under-appreciated aspect is that embedding quality often matters more than the database itself. People spend a lot of time debating which vector database to use, but much less time thinking about how embeddings are generated, normalized, updated and evaluated.
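To illustrate why normalization deserves attention, here is a minimal sketch in plain Python. The two-dimensional vectors and the document names are invented for illustration and do not come from any real embedding model. It shows that ranking by raw dot product can favour a vector simply because it is long, while ranking by cosine similarity (dot product on L2-normalized vectors) favours the one that points in the same direction as the query.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def dot(a, b):
    """Raw dot product; sensitive to vector magnitude."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity: dot product of the L2-normalized vectors."""
    return dot(l2_normalize(a), l2_normalize(b))

# Hypothetical toy "embeddings" (assumption: 2 dimensions for readability).
query = [1.0, 0.0]
docs = {
    "same_direction": [0.9, 0.1],   # nearly parallel to the query, small norm
    "large_magnitude": [5.0, 5.0],  # 45 degrees off, but a much larger norm
}

def rank_by(score_fn):
    """Return document names ordered from best to worst under score_fn."""
    return sorted(docs, key=lambda name: score_fn(query, docs[name]), reverse=True)

print("dot product ranking:", rank_by(dot))
print("cosine ranking:     ", rank_by(cosine))
```

The two rankings disagree on the same data, which is the kind of behaviour that is easy to miss when the discussion focuses only on the choice of database.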

Finally, many teams underestimate how important hybrid search is – combining vectors with filters, metadata and traditional ranking. Pure vector search is rarely enough for production systems on its own.
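One way to picture hybrid search is the small sketch below. It is an illustrative toy, not the approach of any particular product: the documents, the `lang` metadata field, the keyword-overlap score and the `alpha` weight are all assumptions made for this example. Production systems would typically use an approximate index and a proper lexical ranker instead of the brute-force loop and word overlap shown here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical corpus: toy 2-d vectors plus metadata and raw text.
DOCS = [
    {"id": "a", "lang": "en", "text": "vector database indexing", "vec": [0.9, 0.1]},
    {"id": "b", "lang": "de", "text": "vector database indexing", "vec": [1.0, 0.0]},
    {"id": "c", "lang": "en", "text": "cooking pasta at home", "vec": [0.1, 0.9]},
    {"id": "d", "lang": "en", "text": "database transactions", "vec": [0.7, 0.3]},
]

def keyword_score(query_terms, text):
    """Fraction of query terms that appear in the text (a crude lexical score)."""
    if not query_terms:
        return 0.0
    words = set(text.lower().split())
    return len(words & set(query_terms)) / len(query_terms)

def hybrid_search(query_vec, query_terms, lang, alpha=0.7, k=2):
    """Filter on metadata, then rank by a weighted mix of vector and keyword scores."""
    # Step 1: hard metadata filter.
    candidates = [d for d in DOCS if d["lang"] == lang]
    # Step 2: blend semantic and lexical relevance (alpha is an assumed weight).
    scored = [
        (alpha * cosine(query_vec, d["vec"])
         + (1 - alpha) * keyword_score(query_terms, d["text"]), d["id"])
        for d in candidates
    ]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

print(hybrid_search([1.0, 0.0], ["vector", "database"], lang="en"))
```

Document "b" has the vector closest to the query, yet it never appears in the results because the metadata filter excludes it. That is the kind of constraint pure vector search cannot express on its own.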

Q4. You’re considering a comparative approach—looking at multiple vector databases rather than focusing on just one. What do you think readers and practitioners can learn from seeing different implementations and architectural choices side-by-side that they might miss otherwise?

When you only work with one system, it’s easy to confuse implementation details with fundamental truths. A comparative approach exposes what’s essential and what’s simply a design choice.

For example, some systems are fully managed cloud services, others are open-source and embedded; some are built for massive distributed scale, others for local experimentation. Seeing these side by side helps readers understand not just how to use a tool, but why that tool exists in the first place.

It also encourages better architectural thinking. Instead of asking “Which database is best?”, practitioners start asking “Which constraints matter most for my problem – latency, cost, scale, control or developer ergonomics?”

Q5. The field is moving incredibly fast—you mentioned the tension between “seven weeks” and “seven days” as a timeframe. How do you think about writing technical content that remains valuable even as specific tools and features evolve? What makes technical writing resilient in rapidly changing domains?

Specific APIs will change. Product names will change. Even entire platforms will rise and fall. What doesn’t change nearly as fast are core ideas: how approximate nearest neighbor search works, how embeddings represent meaning, why indexing strategies matter and how retrieval pipelines fit into real systems.

Resilient technical writing focuses on:

  • Mental models, not just commands.
  • Trade-offs, not just features.
  • Patterns, not just products.

If a reader understands the underlying concepts clearly, they can adapt to new tools with confidence. My ideal goal is to write something that’s still useful even if a reader picks it up several years from now and the ecosystem has shifted again.

I’m inclined to call the book “Seven Vector Databases in Seven Days”, because technology moves very quickly today and many developers have little time to skill up.

Q6. Anything else you wish to add?

What excites me most about this project is that vector databases represent more than just a new category of infrastructure – they’re a sign that databases are becoming cognitive systems, not just storage engines. They are beginning to store meaning, not just data.

That’s a profound shift. And it deserves the same kind of careful, practical exploration that earlier generations gave to relational, NoSQL and distributed databases.

Right now, I am finishing up a book with the working title “The SingleStore Cookbook”, where I’ve focused on hands-on examples of using the technology for ML and AI. That experience has been invaluable and will directly inform my next publishing project, which will focus specifically on vector databases. My goal remains the same: to make these technologies approachable, practical and understandable, so developers can implement real-world applications quickly and confidently.

……………………………………….

Akmal Chaudhri

Technical leader and evangelist with extensive experience across databases, AI and developer enablement. Specialised in technical writing, education and community strategy. Proven ability to translate complex technology into clear, engaging narratives that inspire learning and adoption. Regular international speaker, published author and contributor to thought leadership in data systems, AI and developer education.
