As developers, it's in our nature to get excited about an emerging technical space, especially one that adds a new layer to the infrastructure stack. But that enthusiasm can sour when the new wave of tooling turns out not to be drastically different from the status quo. Is that what is happening to vector databases?
Towards the end of 2023, (standalone) vector databases dominated tech headlines. They rode the wave of next-generation LLMs like GPT-3.5 and Claude, and they addressed one of those models' biggest limitations: an LLM can only incorporate a finite amount of context before its accuracy drops off. With a vector database, developers can implement retrieval-augmented generation (a.k.a. RAG), where only the necessary context is retrieved to supplement the LLM's prompt, maximizing accuracy. In short, developers can use RAG to draw on a massive knowledge base without exceeding the LLM's recommended context window.
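To make that flow concrete, here is a minimal sketch of the RAG pattern described above. The `embed`, `vector_store`, and `llm` objects are hypothetical stand-ins for whatever embedding model, vector store, and LLM an application actually uses:

```python
# Minimal sketch of the RAG flow. `embed`, `vector_store`, and `llm` are
# hypothetical stand-ins for the application's actual components.

def answer_with_rag(question: str, vector_store, embed, llm, k: int = 5) -> str:
    # 1. Vectorize the question with the same model used to index the documents.
    query_vector = embed(question)

    # 2. Retrieve only the k most similar chunks instead of the whole knowledge base.
    chunks = vector_store.search(query_vector, top_k=k)

    # 3. Supplement the prompt with the retrieved context, staying within the context window.
    context = "\n\n".join(chunk.text for chunk in chunks)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return llm(prompt)
```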
Accordingly, vector databases were positioned to be the next "it" category of data storage, following previous waves that targeted search (e.g. Elastic), unstructured data (e.g. MongoDB), and analytical data (e.g. Timescale, ClickHouse). Today, however, it's unclear whether vector databases will remain a standalone category. A vector database is more of a conduit for language models: the models do the heavy lifting, handling vectorization (i.e. creating embeddings) and re-ranking, and the database depends on them to carry out vector search. This raises the question: are vector databases actually an independent category, or is vector search just a need-to-have feature of today's databases?
Let’s discuss.
The Vector Database Heavyweights
One (naive) way to look at vector databases is financial. Multiple vector databases raised incredible amounts of money in 2023 and early 2024. Pinecone raised a $100M Series B, nearing unicorn status. Its younger, open source competitor, Chroma, raised a whopping $18M seed round. Other competitors, like Weaviate and Zilliz, raised similar rounds. Clearly, investors saw enough positive signals to enter the fray and make a bet on a future decacorn.
Another angle is adoption. Pinecone counts heavyweights like Microsoft, HubSpot, Handshake, and ClickUp as customers. Milvus lists behemoths like PayPal, Nvidia, Bosch, and Intuit. Weaviate reports that it has been downloaded over 13M times. Chroma has scored nearly 18K GitHub stars, and counts IBM, Salesforce, and Anduril as customers. It's hard to dispute that vector databases have attained some serious developer love.
But money doesn't imply success, and logos don't imply heavy use in production. In other words, it's possible that the 2023 hype around vector databases has sustained their image by piquing the curiosity of developers and investors. That might sound like a scathing accusation (and it is), but the numbers cast doubt on dedicated vector databases.
The Popular Alternative: Existing Systems
While some companies adopted the new vector database projects, others used existing databases that added vector search functionality. For example, Postgres got pg_vector (shepherded by Supabase), Timescale came out with pgai and pgvectorscale, MongoDB added vector capabilities to its Atlas database service, and Redis made vector search one of its core features. Postgres, MongoDB, and Redis were already household names, ranked amongst the top seven most adopted databases.
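As a rough illustration of what this looks like in practice, here is a sketch of vector search inside plain Postgres via pg_vector. The connection string, table, and column names are made up, and it assumes the pgvector extension and the psycopg driver are available:

```python
# Illustrative only: a pg_vector query from Python. Names are hypothetical;
# assumes Postgres with the pgvector extension installed and psycopg (v3).
import psycopg

with psycopg.connect("dbname=app") as conn, conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS documents (
            id bigserial PRIMARY KEY,
            body text,
            embedding vector(1536)  -- dimension depends on the embedding model
        );
    """)

    # In practice this comes from the same embedding model used to embed `body`.
    query_embedding = [0.0] * 1536
    vector_literal = "[" + ",".join(map(str, query_embedding)) + "]"

    # `<=>` is pgvector's cosine-distance operator; smaller distance = more similar.
    cur.execute(
        "SELECT body FROM documents ORDER BY embedding <=> %s::vector LIMIT 5;",
        (vector_literal,),
    )
    top_matches = [row[0] for row in cur.fetchall()]
```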
All-purpose databases picking up a "plugin-like" extension to cover an emerging need is historically common. For example, after OLAP databases like ClickHouse and DuckDB took flight, Hydra emerged, uniting Postgres with DuckDB to create a somewhat funky OLTP/OLAP hybrid. Even earlier, when semi-structured data grew popular, heavyweight solutions like MongoDB had to contend with simpler alternatives like Postgres's evolved support for JSON.
But there's a difference when it comes to vector search. In the other examples, the "plugins" were perceived as the hacky, DIY approach: middle-of-the-road solutions. For vector search, however, the consolidated approach appears to be the norm. In Retool's State of AI 2024 report, MongoDB and Postgres (via pg_vector) had double the market share of Chroma and Pinecone.
Why? An obvious reason is that AI companies might have previously used the same database, and continuing to use it for vector storage is cleaner (and familiar). Developers don't have to learn new query languages or manage new infrastructure. Additionally, these pre-existing databases have massive communities and massive teams behind them; they are heavily audited and therefore less prone to errors, something that has plagued solutions like Pinecone since launch. But perhaps most acutely, solutions like pg_vector are competitive with databases like Pinecone on performance.
Embeddings and reranking
A common argument for standalone solutions is that they can solve problems that generalized solutions are too inflexible to address. In the case of vector search, there are two levers that could potentially differentiate a dedicated database: the accuracy of embeddings and the speed of reranking.
Embeddings are the core of vector search. An embedding is the vectorized representation of data. Data must be vectorized by the same model that's used to embed the query text; the resulting vectors can then establish similarity (typically via cosine similarity). And while many applications create embeddings with models from the familiar LLM providers (e.g. OpenAI), some opt for specialized solutions designed purely for embeddings, such as Voyage AI. By the same token, vector databases could compete by creating their own embedding models that outperform off-the-shelf counterparts.
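As a sketch of what that looks like in code, assuming the OpenAI Python SDK and its `text-embedding-3-small` model (any embedding model, including one from Voyage AI, could be swapped in):

```python
# Sketch: embed two pieces of text with the same model, then compare them
# with cosine similarity. Assumes the OpenAI Python SDK and an API key in the
# environment; the texts are illustrative.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # 1.0 means the vectors point in the same direction; ~0 means unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

doc = embed("Postgres added vector search via the pg_vector extension.")
query = embed("Which databases support vector search?")
print(cosine_similarity(doc, query))
```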
But embeddings will never be a perfect measure of similarity. The solution to this problem is reranking. Embeddings can feature thousands of dimensions, but they're still compressed representations of the data, so cosine similarity is only a crude measure of relevance; using a language model to process the query text alongside each candidate is much more accurate. Reranking is the process of taking the crude list of similar context returned by vector search and re-ordering it with a dedicated reranking model. The problem? Reranking is slow.
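Here is a minimal sketch of that rerank step, assuming the sentence-transformers library and one of its public cross-encoder models (any reranking model could play this role):

```python
# Sketch of the rerank step: vector search returns a crude candidate list, and
# a slower reranking model re-scores each candidate against the query.
# Assumes sentence-transformers and a public cross-encoder checkpoint.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, candidates: list[str], top_k: int = 3) -> list[str]:
    # Score every (query, candidate) pair with the model. This is the slow part,
    # which is why it only runs on the short list returned by vector search.
    scores = reranker.predict([(query, c) for c in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in ranked[:top_k]]
```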
Even so, standalone vector databases aren't the only solutions capable of creating proprietary embedding and reranking models. In fact, the opposite is true: MongoDB recently acquired Voyage AI in an effort to bring state-of-the-art embedding and reranking models into MongoDB's database. Because embedding and reranking are pre- and post-processing steps around the actual search and retrieval, one could conceivably integrate them with any solution, not just specialized databases.
The age-old question: feature or product?
So, is vector search a feature or a product? For databases, the answer typically boils down to a few questions:
- Can this functionality be reasonably implemented in existing databases?
- If so, is the performance acceptable for most use cases? For databases, performance is usually a function of how data is organized at the memory level to optimize for writes or reads. By extension, does this functionality mandate a special memory layout?
- If so, can the common use cases get by without any extreme optimizations?
If the answer is yes to all three of these questions, we're talking about a feature. If the answer is no to any of them, then there are grounds for a new product category.
For example, let’s apply this reasoning to OLAP databases like ClickHouse. Aggregating metrics from traditional OLTP table data (e.g. Postgres) is technically possible, but it’s expensive and slow. To fix this, OLAP databases inverted storage at the memory level, prioritizing columnar reads over row reads. This spun OLAP into its own category because it failed the first rule (and the subsequent ones).
When we apply the same logic to vector databases, we reach a different verdict on the first two rules:
- Most databases can feasibly integrate another indexing / search algorithm. Given that vector search relies on a fairly straightforward similarity function (cosine similarity), it's not crazy to add to most databases. We've already seen evidence of this with Postgres, MySQL, MongoDB, Oracle, MariaDB, and more.
- Because vectors are highly multi-dimensional, there are limited optimizations that can be done on disk. In fact, most Hierarchical Navigable Small World (HNSW) implementations, the optimized approximate-search indexes typically used for vector spaces, require the graph to be accessible in RAM, which says little about how the data is stored on disk (see the sketch after this list). Additionally, because reranking is a much slower process than vector search, optimizing retrieval is a lesser sub-problem.
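To illustrate the second point, here is a sketch using hnswlib, one common open source HNSW implementation. The dimensions and parameters are arbitrary placeholders; the key detail is that the navigable graph is built and queried entirely in memory, independent of any on-disk layout:

```python
# Sketch: an HNSW index built with hnswlib lives entirely in RAM.
# Vectors and parameters here are arbitrary placeholders.
import hnswlib
import numpy as np

dim = 384
vectors = np.random.rand(10_000, dim).astype(np.float32)  # stand-in embeddings

index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=len(vectors), M=16, ef_construction=200)
index.add_items(vectors, np.arange(len(vectors)))  # the graph is held in memory

query = np.random.rand(1, dim).astype(np.float32)
labels, distances = index.knn_query(query, k=5)  # approximate nearest neighbors
```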
Based on this, vector search starts to sound like a feature.
But the conversation gets more nuanced when we look at the third consideration. Technically speaking, a vector database could also ship with in-house embedding and reranking models. Could those features alone make it competitive? Creating proprietary models is hard, and a highly specialized vector search company might be able to gain an edge against generalized solutions with sprawling priorities. MongoDB's acquisition of Voyage AI, however, suggests that at least one incumbent thinks not.
That said, Chroma is the new Redis
This article, thus far, does not shed a positive light on Chroma, given that it's a dedicated vector database with no custom embedding or reranking special powers. But Chroma has serious customers and solid adoption. Why?
Benchmarks. Comparing Chroma and pg_vector shows that Chroma is the significantly faster option: it can hit query speeds as fast as 40-80ms, making it the leading standalone vector database. But that's not because Chroma has a magically better vector search algorithm or reranking model. It's because Chroma is an in-memory database, like Redis, and it carries roughly the same benefits as Redis (same hazards, too).
So perhaps Chroma will follow a Redis-like trajectory. Like Redis, Chroma is an in-memory database that provides fast access to data via a simple SDK. It's a great addition to a Postgres or even MongoDB stack that needs accessible vector search at blazing-fast speeds. But, to be clear, this isn't proof that vector search is a category creator; it's a matter of Chroma positioning itself as a strong accessory database atop a more generalized, robust solution.
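For a sense of that simplicity, here is a minimal sketch with Chroma's Python SDK. The collection name and documents are illustrative, and by default Chroma embeds documents with a built-in model, so no separate embedding step is needed here:

```python
# Sketch of Chroma's in-memory, Redis-like simplicity via its Python SDK.
# Collection name and documents are illustrative.
import chromadb

client = chromadb.Client()  # in-process, in-memory client
collection = client.create_collection("docs")

collection.add(
    ids=["1", "2"],
    documents=[
        "pg_vector adds vector search to Postgres.",
        "Chroma is an in-memory vector database with a simple SDK.",
    ],
)

results = collection.query(query_texts=["fast in-memory vector search"], n_results=1)
print(results["documents"])
```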
Verdict: Did we over-hype it?
On balance: yeah, I think so.
Said differently: given the cost of adopting and operationalizing new infrastructure, it doesn’t seem like the juice is worth the squeeze. At least not yet. The traction of pg_vector alone makes it hard to imagine that a standalone vector database will stand the test of time (and this is coming from someone optimistic enough to join a database company when it had $1M in revenue).
But this industry moves fast, and it wouldn't be so far-fetched for a vector database startup to navigate the market more adeptly than larger incumbent companies and open source projects. For what it's worth, I hope one does; we need more, not less, competition in data infrastructure.
BTW—we are sponsoring a webinar with O'Reilly on RAG and security! If this article was interesting to you, and you're interested in learning about how sensitive data should be navigated with RAG, please join Susan Shu Chang, Greg Sarjeant, and other incredible developers on March 4th. Register here.