The Hype About Vector Databases
Investors can’t seem to get enough of them.
Generative AI is promising a second industrial revolution. The first raised productivity 10x over 100 years; GenAI wants to do the same in 5-10 🔥
Vector databases are essential building blocks for GenAI applications. These applications create, store, and search data generated by their LLMs and other AI models. Vector databases store that data as vectors that capture how items relate to one another, whether the source is video, reels, text, conversations, audio, or documents.
Everybody now seems to want a piece of this. Pinecone, a vector database company, beamed to $750 million in value in just 4 years based on what it means for commercial AI applications.
Others are in hot pursuit: Weaviate, from the sleepy Netherlands, zoomed to a $50 million Series B, Qdrant landed a $7.5 million deal, and Chroma, an open source project with just 1.2k GitHub stars, secured $18 million in funding.
No buzzwords. No futuristic stuff. Just a database! But is the hype surrounding this database justified, or is it merely a byproduct of the AI-driven boom?
👉What led to the birth of vector databases?
The world of databases has undergone a series of transformations. Initially, SQL, or relational databases, dominated the scene, accommodating structured data within tidy tables.
Yet, as Web 2.0 companies emerged and their data demands grew, the NoSQL revolution followed, providing greater adaptability and the capacity to handle extensive data sets.
However, a significant portion of today's data, approximately 80%, exists in an unstructured format: social media posts, images, audio files, and video data that don't neatly fit into a relational database. As we navigate the AI-driven era, a new player has stepped onto the stage to manage this complexity: vector databases.
👉So why are investors so interested in vector databases?
The answer lies in the importance of vectors for the success of generative AI.
Without vector embeddings, there would be no GPT-4.
No ChatGPT.
No Bard.
Vector databases are the key to unlocking the full potential of generative AI. Let’s dive a little deeper into understanding the correlation between the two!
👉What exactly is a vector database?
📶 Imagine you’re trying to find information online. You type a simple query into the search bar, hoping that it will return the results you need. You sift through pages of irrelevant data, trying to find what you're really looking for. It can be a frustrating experience, and it's all due to the limitations of traditional search engines.
This is where vector databases come in. They store text, images, and video documents in their vector representation, which allows queries to return highly relevant results that are semantically related to what you're searching for.
A vector database is a distinct approach to storing information that uses vectors as its fundamental structure. Unlike traditional databases that arrange data in tables, vector databases organize data as high-dimensional vectors. These vectors, known as vector embeddings, place each item as a point in a mathematical space.
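To make the idea concrete, here is a minimal sketch of what "store vectors, then search by similarity" could look like. The class name TinyVectorStore, the document IDs, and the 2-dimensional vectors are all made up for illustration. It compares the query against every stored vector, which is fine for a toy; production vector databases typically rely on approximate nearest-neighbor indexes to stay fast at scale.

```python
import math


class TinyVectorStore:
    """Hypothetical in-memory store: keeps vectors by ID and finds the closest ones."""

    def __init__(self):
        self._items = {}  # item_id -> vector (list of floats)

    def add(self, item_id, vector):
        self._items[item_id] = list(vector)

    def query(self, vector, k=3):
        """Return the IDs of the k stored vectors most similar to `vector`."""
        scored = [
            (self._cosine(vector, stored), item_id)
            for item_id, stored in self._items.items()
        ]
        # Highest cosine similarity first.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [item_id for _, item_id in scored[:k]]

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)


# Made-up vectors: a real system would get these from an embedding model.
store = TinyVectorStore()
store.add("doc-cats", [0.9, 0.1])
store.add("doc-dogs", [0.8, 0.3])
store.add("doc-tax", [0.1, 0.95])

nearest = store.query([1.0, 0.2], k=2)
print(nearest)
```

The two pet documents come back ahead of the tax document because their vectors point in nearly the same direction as the query.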
Vector Embedding
In simple terms, vector embeddings are numerical representations of subjects or words. Having more dimensions allows for a richer understanding of the data's meaning.
👉Role of Vector Embeddings in AI
By plotting data points in mathematical space, computers can grasp the relationships between them and determine their level of correlation. This enables an AI model to understand queries in a contextual way, similar to how a human would.
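One common way to measure how closely two points in that space are related is cosine similarity, which looks at the angle between two vectors. The sketch below uses hand-written 3-dimensional vectors purely for illustration; real embeddings come from a trained model and usually have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up "embeddings": the values are invented, not produced by any model.
toy_embeddings = {
    "cold": [0.9, 0.1, 0.0],
    "icy": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.9, 0.1],
}

related = cosine_similarity(toy_embeddings["cold"], toy_embeddings["icy"])
unrelated = cosine_similarity(toy_embeddings["cold"], toy_embeddings["banana"])

print(f"cold vs icy:    {related:.2f}")
print(f"cold vs banana: {unrelated:.2f}")
```

A score near 1 means the two items sit close together in the space, while a score near 0 means they have little in common.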
Without understanding the semantics or context, an AI might give answers that are logically accurate but miss the intended meaning.
📶 For example, consider the phrase "She's as cold as ice." Without understanding the figurative meaning, an AI might interpret it literally, describing someone's temperature rather than their personality or demeanor.
👉Revolutionary Synergy: Generative AI + Vectors
As AI applications rely heavily on vector embeddings, vector databases prove particularly suitable. No surprise, they are now often referred to as "AI databases."
Recommendation Systems: With efficient data storage and retrieval, vector databases, along with large language models and memory, enable AI systems to learn user preferences. This information can be queried automatically to deliver personalized recommendations that match individual interests.
Search Engines: Vector databases assist users by analyzing query context and retrieving closely correlated keywords. This ensures better understanding and more accurate search results.
Image and Video Analysis: Leveraging video and image embedding models, AI can be fine-tuned to identify items similar to a query in images. This game-changing capability is actively implemented in numerous online shopping apps and websites.
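As a rough illustration of the recommendation use case, the sketch below averages the vectors of items a user liked into a single "taste" vector, then ranks unseen items by how close they sit to it. The catalog, item names, and 2-dimensional vectors are invented for this example, and Euclidean distance is just one reasonable choice of metric; real systems differ in how they build and compare these vectors.

```python
import math

# Hypothetical catalog: item name -> made-up 2-dimensional embedding.
catalog = {
    "sci-fi-1": [0.9, 0.1],
    "sci-fi-2": [0.8, 0.2],
    "sci-fi-3": [0.85, 0.15],
    "romance-1": [0.1, 0.9],
    "romance-2": [0.2, 0.8],
}


def recommend(liked, k=2):
    """Rank items the user has not liked yet by distance to their average liked vector."""
    liked_vectors = [catalog[name] for name in liked]
    dims = len(liked_vectors[0])
    # Component-wise mean of the liked vectors acts as the user's taste profile.
    profile = [
        sum(vec[i] for vec in liked_vectors) / len(liked_vectors)
        for i in range(dims)
    ]
    unseen = [name for name in catalog if name not in liked]
    # Smaller Euclidean distance means a closer match to the profile.
    unseen.sort(key=lambda name: math.dist(profile, catalog[name]))
    return unseen[:k]


picks = recommend(["sci-fi-1", "sci-fi-2"], k=2)
print(picks)
```

The remaining sci-fi title ranks first because its vector sits right next to the user's taste profile, with the romance titles trailing behind.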
As unstructured data like images, videos, and text continues to expand, the need for powerful analytics tools like vector databases will only grow.
👉Competition is heating up
With the exponential growth in popularity of vector databases, traditional options like Postgres and Redis continue to compete by introducing their own vector search functionalities. Established companies like Oracle and IBM are adapting by integrating AI-related services into their offerings.
Oracle provides a variety of AI algorithms with a focus on fast in-database learning, while IBM has rebranded its DB2 as the "AI database".
The AI and vector database market is on fire, with over $350 million invested in less than 10 months. This interest extends beyond investors. While AI applications have long dominated the headlines, the spotlight is now shining on the infrastructure software that powers these applications.
In the current landscape, distinguishing between hype and genuine innovation can be challenging. Jeff Delaney, a Google Developer Expert and creator of the Fireship YouTube channel, experienced this firsthand when he launched an impromptu vector database project called Rektor.
With no revenue, business plan, or code to showcase initially, the company's valuation skyrocketed to an impressive $420 million in a remarkably short period of time.
What’s your take on this? Are vector databases finally receiving the attention they deserve, or are these investments just “new money for another dimension of AI”?
Techverse is our personal itch to discover exciting new tech products. Share these stories with your friends or colleagues here on Twitter or WhatsApp.
Cheers,
Team TechVerse