The Power of Curiosity in Data Science
A Principal Applied AI Engineer at Redis describes himself as “curious,” one of the attributes hiring managers most often look for in a data scientist. In this blog post, we explore AI-powered search and vector databases, and how they can help us make sense of unstructured data.
The Basics: Understanding Vector Embeddings
Vector embeddings are commonly used to represent unstructured data such as images, video, text, and audio. They are usually high-dimensional and are created by passing unstructured data through a machine learning model. With off-the-shelf models from libraries like Hugging Face Transformers, creating vector embeddings now takes just a few lines of code.
Vector embeddings can be used for similarity search, which is the process of finding similar pieces of unstructured data in a dataset.
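To make the idea of similarity search concrete, here is a minimal sketch in plain Python. It assumes the embeddings already exist; the three-dimensional vectors and the item names (cat_photo, dog_photo, invoice_pdf) are made up for illustration, and the brute-force scan is just the simplest possible approach, not how a production system would do it.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k_similar(query, dataset, k=2):
    """Brute-force search: score every item and return the k best (id, score) pairs."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in dataset.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real ones typically have hundreds of dimensions.
dataset = {
    "cat_photo": [0.9, 0.1, 0.0],
    "dog_photo": [0.8, 0.3, 0.1],
    "invoice_pdf": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

results = top_k_similar(query, dataset, k=2)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

With these toy vectors, the two photo embeddings score far higher than the invoice, because they point in nearly the same direction as the query.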
How Redis is Powering Vector Search: A Use Case Overview
Redis is a popular in-memory data structure store, increasingly adopted for its ability to handle complex data types and process data in real time at scale. One area where Redis has been particularly successful is vector search, which allows users to search for similar items based on their vector representations.
Implementing Vector Search
Vector search has a wide range of use cases, from visual search for products to natural language search for document retrieval, and even anomaly detection and threat detection. One of the key challenges in implementing vector search is creating vector embeddings from unstructured data and then creating an index to search over. The process of creating vector embeddings is simple with libraries like Hugging Face, but taking it to production requires a vector database to hold these vectors and serve them to a platform.
Introducing Redis as a Vector Database
Redis has added a vector field type to the RediSearch module; modules are a pluggable way to add functionality to Redis. With RediSearch, you can now perform vector search along with other types of search, such as full-text search and tag-based search. RediSearch offers two vector indexing methods, flat (brute-force) and hierarchical navigable small world (HNSW), supports K-nearest-neighbors (KNN) queries, and provides distance metrics such as Euclidean distance and cosine similarity.
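As a rough sketch of what defining a vector index and issuing a KNN query can look like, the snippet below only builds the command arguments for FT.CREATE and FT.SEARCH; it does not connect to a server. The index name idx:products, the key prefix product:, the field name embedding, and the dimension 384 are all assumptions chosen for illustration, and the little-endian float32 packing is assumed to match the byte layout the server expects on common hardware.

```python
import struct


def pack_float32(vector):
    """Serialize a vector as little-endian float32 bytes (assumed layout for a FLOAT32 field)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def build_create_index_args(index_name, prefix, field, dim, algorithm="HNSW", metric="COSINE"):
    """Arguments for an FT.CREATE command with one vector field over hash keys."""
    attributes = ["TYPE", "FLOAT32", "DIM", str(dim), "DISTANCE_METRIC", metric]
    return [
        "FT.CREATE", index_name, "ON", "HASH", "PREFIX", "1", prefix,
        "SCHEMA", field, "VECTOR", algorithm, str(len(attributes)), *attributes,
    ]


def build_knn_query_args(index_name, field, query_vector, k):
    """Arguments for an FT.SEARCH command returning the k nearest neighbors of query_vector."""
    query = f"*=>[KNN {k} @{field} $vec AS score]"
    return [
        "FT.SEARCH", index_name, query,
        "PARAMS", "2", "vec", pack_float32(query_vector),
        "SORTBY", "score", "DIALECT", "2",
    ]


# Hypothetical names and sizes, for illustration only.
create_args = build_create_index_args("idx:products", "product:", "embedding", dim=384)
knn_args = build_knn_query_args("idx:products", "embedding", [0.1, 0.2, 0.3], k=3)

print(" ".join(create_args))
print(knn_args[2])
```

With a client such as redis-py, argument lists like these could be sent with execute_command(*create_args); swapping "HNSW" for "FLAT" selects brute-force indexing, and "L2" selects Euclidean distance instead of cosine.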
Visual Search with Image-based Embeddings
Vector Search allows for image-based searches, where users can upload an image and retrieve visually similar results. This feature is not limited to just one brand or product, as the system can identify and retrieve similar products from different brands. Additionally, users can apply filters and experiment with different queries to see how the system works.
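The combination of visual similarity and filters can be sketched in a few lines. This is an in-memory illustration, not the system described above: the catalog, the product names, the prices, and the three-dimensional embeddings are invented, and in practice the embedding of the uploaded image would come from an image model.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def filtered_visual_search(query_vec, catalog, k=2, category=None, max_price=None):
    """Apply metadata filters first, then rank the survivors by distance to the query."""
    candidates = [
        product for product in catalog
        if (category is None or product["category"] == category)
        and (max_price is None or product["price"] <= max_price)
    ]
    candidates.sort(key=lambda product: euclidean_distance(query_vec, product["embedding"]))
    return [product["name"] for product in candidates[:k]]


# Hypothetical catalog with toy image embeddings.
catalog = [
    {"name": "red sneaker (brand A)", "category": "shoes", "price": 80, "embedding": [0.9, 0.1, 0.2]},
    {"name": "red sneaker (brand B)", "category": "shoes", "price": 120, "embedding": [0.85, 0.15, 0.25]},
    {"name": "blue boot (brand A)", "category": "shoes", "price": 95, "embedding": [0.1, 0.8, 0.3]},
    {"name": "red handbag (brand C)", "category": "bags", "price": 60, "embedding": [0.8, 0.2, 0.6]},
]
uploaded_image_vec = [0.88, 0.12, 0.22]

unfiltered = filtered_visual_search(uploaded_image_vec, catalog, k=2)
filtered = filtered_visual_search(uploaded_image_vec, catalog, k=2, category="shoes", max_price=100)
print(unfiltered)
print(filtered)
```

Without filters, the two nearest items are the similar sneakers from two different brands; adding a price cap removes the more expensive one and the next-closest shoe takes its place. In RediSearch query syntax, a comparable filtered query can be written by putting a filter expression before the KNN clause, for example (@category:{shoes})=>[KNN 2 @embedding $vec AS score], where the field names are again assumptions.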
Document Retrieval using Natural Language
Vector Search can also be used for document retrieval based on users' natural language queries. The system can retrieve relevant papers from archives based on the similarity score between the query and the abstracts of the papers. This feature enables users to search for documents even if their search query is not perfect or well-formed.
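The retrieval flow (embed the abstracts, embed the query, rank by similarity score) can be sketched as follows. Note the big assumption: the embed function here is a toy term-count vector standing in for a real language model, so it only matches shared words and does not capture meaning the way learned embeddings do. The paper titles and abstracts are invented.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(texts):
    """Map each distinct token in the corpus to a vector position."""
    vocab = {}
    for text in texts:
        for token in tokenize(text):
            vocab.setdefault(token, len(vocab))
    return vocab


def embed(text, vocab):
    """Toy stand-in for an embedding model: a unit-length term-count vector."""
    vec = [0.0] * len(vocab)
    for token in tokenize(text):
        if token in vocab:
            vec[vocab[token]] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def search_abstracts(query, abstracts, vocab, k=2):
    """Rank abstracts by cosine similarity to the query (dot product of unit vectors)."""
    query_vec = embed(query, vocab)
    scored = []
    for title, abstract in abstracts.items():
        score = sum(x * y for x, y in zip(query_vec, embed(abstract, vocab)))
        scored.append((title, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical abstracts, for illustration only.
abstracts = {
    "Paper A": "approximate nearest neighbor search in high dimensional vector spaces",
    "Paper B": "transformer language models for text classification",
    "Paper C": "in-memory caching strategies for web applications",
}
vocab = build_vocabulary(abstracts.values())
results = search_abstracts("how to search nearest neighbor vector", abstracts, vocab)
for title, score in results:
    print(f"{title}: {score:.3f}")
```

Even though the query is not a well-formed sentence, the abstract that shares the most content with it ranks first. A real model would go further and match queries that use different words for the same idea.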
Rediscovering Search with Redis and Relevance AI
Redis and Relevance AI have partnered to create a platform that enables non-data scientists to benefit from the Vector Search capability. This platform provides a GUI with reporting functionality and observable dashboards, allowing users to gain insights into their data without any coding knowledge. Additionally, the platform can be integrated with other systems to enable updates and materialization scheduling.
Conclusion
Vector Search has opened up new possibilities for search capabilities, enabling users to perform more powerful searches and retrieve more relevant results. With Redis and Relevance AI’s partnership, even non-data scientists can benefit from this technology, making it accessible to a wider audience. The potential applications for Vector Search are vast, from document retrieval to medical diagnosis prediction, and we can’t wait to see how it will continue to evolve and impact different domains.