Focus on search: Set your Generative AI projects up for success
When designing Retrieval Augmented Generation (RAG) AI solutions, search is paramount. A robust and accurate search mechanism ensures that the most relevant information from your knowledge base is retrieved, enhancing the quality and effectiveness of your AI application. So, how can you get the most out of your search capabilities in Azure AI Search? Let's explore this topic further.
Importance of Search in Generative AI Solutions
Imagine your generative AI app is a knowledgeable assistant responding to user queries. For this assistant to provide insightful answers, it needs access to the right information.
This is where search comes in. If your search mechanism is poorly designed, your AI might retrieve irrelevant or incomplete data, leading to inaccurate and unhelpful responses. Therefore, investing time in fine-tuning your search process is crucial for a successful RAG solution.
Azure AI Search: What It Is and What It Does
Azure AI Search is a cloud-based search service that allows you to quickly and easily add a robust search experience to your applications.
Some of the key features of Azure AI Search that promote good search quality and performance include:
Semantic Ranker. This feature uses AI to reorder search results based on how semantically relevant they are to the query, so results are ranked on the meaning of the query rather than simply on keyword matching. The semantic ranker is now generally available (GA) and can be enabled on the Basic tier and above, with a free monthly usage allowance before charges apply.
Scoring Profiles. These allow you to control how documents are ranked based on factors such as freshness, location, or price. For example, you can create a scoring profile that boosts recently updated documents or documents close to the user's location. Scoring profiles work in combination with both keyword search and vector search (a minimal definition sketch follows this list).
AI Enrichment. This allows you to extract text and structure from content that can't otherwise be indexed for full-text search, such as images and scanned documents. You can use AI enrichment to make your search results more comprehensive and relevant.
Vector Search. This feature allows you to search for documents that are semantically similar to a query. You can use vector search to find documents that are relevant to a query, even if they don't contain the exact keywords. Vector search is now generally available.
Hybrid Search. This feature combines keyword search and vector search in a single query, giving you the best of both worlds: keyword search finds documents that contain specific terms, while vector search finds documents that are semantically similar to the query.
Integrations. Azure AI Search integrates with other Azure services, such as Azure Blob Storage, Azure SQL Database, Azure Cosmos DB, and Azure OpenAI. This makes it easy to use Azure AI Search to search your data, no matter where it is stored.
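To make scoring profiles concrete, here is a minimal definition sketch using the azure-search-documents Python SDK. It assumes a DateTimeOffset field named lastUpdated in your index; the profile name, boost value, and duration are illustrative, and the model classes should be checked against the SDK version you use.

```python
# Minimal sketch: a freshness-based scoring profile that boosts recently
# updated documents. Assumes a DateTimeOffset field named "lastUpdated".
from datetime import timedelta

from azure.search.documents.indexes.models import (
    FreshnessScoringFunction,
    FreshnessScoringParameters,
    ScoringProfile,
)

freshness_boost = FreshnessScoringFunction(
    field_name="lastUpdated",
    boost=2.0,                                # up to 2x the base relevance score
    parameters=FreshnessScoringParameters(boosting_duration=timedelta(days=30)),
    interpolation="quadratic",                # decay the boost as documents age
)

recency_profile = ScoringProfile(name="boost-recent", functions=[freshness_boost])
# Attach recency_profile to SearchIndex.scoring_profiles when creating or
# updating the index, then reference it at query time via scoring_profile.
```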
Four Stages of Building a Search Pipeline
Building an effective search pipeline involves a multi-faceted approach, encompassing several key stages:
Data Ingestion. Before you can search your data, you need to ingest it into Azure AI Search. Data can come from a variety of sources such as Azure Blob storage, Azure SQL Database, Azure Cosmos DB, and OneLake. Indexers, a feature of Azure AI Search, automate this process, extracting searchable content from your data sources and transforming it into JSON documents for indexing. Indexers can also perform change and deletion detection to ensure your search index stays updated.
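As a rough illustration of the ingestion step, the following sketch wires an Azure Blob Storage container to an existing index with an indexer, using the azure-search-documents Python SDK. The service endpoint, keys, container, data source, and index names are placeholders.

```python
# Minimal sketch: connect a blob container to an existing index via an indexer.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexerClient
from azure.search.documents.indexes.models import (
    SearchIndexer,
    SearchIndexerDataContainer,
    SearchIndexerDataSourceConnection,
)

indexer_client = SearchIndexerClient(
    endpoint="https://<your-service>.search.windows.net",
    credential=AzureKeyCredential("<admin-key>"),
)

# Point the data source at the blob container holding your documents.
data_source = SearchIndexerDataSourceConnection(
    name="docs-blob-ds",
    type="azureblob",
    connection_string="<storage-connection-string>",
    container=SearchIndexerDataContainer(name="docs"),
)
indexer_client.create_or_update_data_source_connection(data_source)

# The indexer crawls the data source, writes JSON documents into the target
# index, and can track changes and deletions on subsequent runs.
indexer = SearchIndexer(
    name="docs-indexer",
    data_source_name="docs-blob-ds",
    target_index_name="docs-index",   # assumes this index already exists
)
indexer_client.create_or_update_indexer(indexer)
indexer_client.run_indexer("docs-indexer")
```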
Chunking Strategy. Large documents often need to be divided into smaller, manageable units called chunks. This process, known as chunking, is crucial for optimizing RAG responses and performance. Chunking allows multiple retrieved documents to be passed to an LLM within its context window, provides a mechanism for ranking the most relevant passages, and enables vector search, where each embedding model has a limit on how much content can be embedded into a single vector.
The Document Layout skill available in Azure AI Search offers a structure-aware chunking approach, breaking content into headings and semantically coherent chunks like paragraphs and sentences. This skill utilizes the layout model in Document Intelligence to identify the document structure and represent it in JSON using Markdown syntax. Remember, choosing a suitable chunk size and overlap, and preserving sentence boundaries for semantic coherence, is key for effective chunking.
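The sketch below illustrates the basic mechanics in plain Python: fixed-size chunks that preserve sentence boundaries and carry a small overlap between neighbours. In practice the Text Split or Document Layout skills, or a chunking library, would do this work, and the sizes shown are purely illustrative.

```python
# Minimal sketch of fixed-size chunking with sentence boundaries and overlap.
import re

def chunk_text(text: str, max_chars: int = 2000, overlap_sentences: int = 1) -> list[str]:
    # Naive sentence segmentation; swap in a proper tokenizer for production.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, length = [], [], 0
    for sentence in sentences:
        if current and length + len(sentence) > max_chars:
            chunks.append(" ".join(current))
            # Carry the last sentence(s) into the next chunk to preserve context.
            current = current[-overlap_sentences:]
            length = sum(len(s) for s in current)
        current.append(sentence)
        length += len(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```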
Types of Indexes and Search Strategies. Azure AI Search offers three main types of indexes and corresponding search strategies:
Keyword Search: This traditional approach breaks content into terms, creating inverted indexes for quick retrieval. Keyword search excels at finding exact matches but might struggle with semantic understanding. It uses the BM25 probabilistic model for scoring, which determines relevance based on the frequency of search terms in a document relative to their frequency across the entire corpus, adjusted for document length.
Vector Search: This modern technique uses embeddings, mathematical representations of text that capture semantic meaning, to find documents similar in meaning to a query. Approximate Nearest Neighbor (ANN) algorithms, such as Hierarchical Navigable Small World (HNSW), are commonly used to efficiently find similar vectors in a large dataset. An alternative, exhaustive k-nearest neighbors (KNN), performs a brute-force comparison against every vector to find the true nearest neighbors, ensuring the highest level of accuracy but potentially requiring more computational resources.
Hybrid Search: This powerful approach combines the strengths of both keyword and vector search, offering comprehensive results. Azure AI Search uses Reciprocal Rank Fusion (RRF) to merge results from both methods and produce a unified, highly relevant result set.
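For intuition, here is a small sketch of how Reciprocal Rank Fusion combines two ranked lists. Azure AI Search performs this fusion internally; the constant k = 60 is the commonly used default from the original RRF formulation.

```python
# Minimal sketch of Reciprocal Rank Fusion (RRF): each result list contributes
# 1 / (k + rank) for every document it returns, and documents are re-ranked by
# their summed score.
def rrf_fuse(result_lists: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: fuse a keyword ranking with a vector ranking.
fused = rrf_fuse([["doc2", "doc7", "doc1"], ["doc7", "doc3", "doc2"]])
print(fused)  # documents appearing high in both lists rise to the top
```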
Querying/Prompting. Crafting effective queries or prompts is the final step in retrieving the most relevant information. Experimenting with different query formulations, utilizing search operators, and leveraging features like filters and facets can significantly impact the accuracy and relevance of your results. For example, you can use the search, filter, and vectorQueries parameters in a hybrid search query to refine results based on both keyword and vector criteria.
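A minimal hybrid query along these lines, using the azure-search-documents Python SDK, is sketched below. The index name, field names, semantic configuration name, and the embed helper (standing in for a call to your embedding model) are all assumptions.

```python
# Minimal sketch: keyword text, an OData filter, and a vector query in one
# request, re-ranked with the semantic ranker.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

search_client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",
    index_name="docs-index",
    credential=AzureKeyCredential("<query-key>"),
)

query = "How do I rotate storage account keys?"
query_vector = embed(query)  # 'embed' is an assumed helper wrapping your embedding model

results = search_client.search(
    search_text=query,                                  # keyword (BM25) leg
    filter="category eq 'security'",                    # filter on a filterable field
    vector_queries=[
        VectorizedQuery(vector=query_vector, k_nearest_neighbors=50, fields="contentVector")
    ],                                                  # vector leg, fused via RRF
    query_type="semantic",                              # apply the semantic ranker
    semantic_configuration_name="default",
    top=5,
)
for doc in results:
    print(doc["title"])
```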
Methods for Bulk Testing Search Results
Bulk testing of search results is essential to ensure the accuracy and relevance of your search engine. You can use a variety of tools and techniques for bulk testing, including:
Azure Search Performance Testing Tool: This open-source tool provides a framework for benchmarking the performance of your Azure AI Search service, including both query and data ingestion workloads.
REST Clients: Utilize REST clients, such as Visual Studio Code with the REST extension, to create and execute a range of queries for testing various search scenarios.
Search Explorer: The built-in Search Explorer in the Azure portal allows you to interactively test queries, refine scoring profiles, and analyze search results.
Azure AI Prompt Flow: With prompt flow, you can evaluate and compare variations of prompts, gather user feedback, and measure different metrics, including groundedness, relevance, and retrieval score.
By using these methods, you can thoroughly evaluate the effectiveness of your search implementation and make necessary adjustments for optimal performance.
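As a simple starting point for scripted bulk testing, the sketch below runs a small set of labelled queries against an index and reports a rough recall@5. The test cases, index, and field names are illustrative, and real evaluations usually track richer metrics such as recall@k, MRR, or groundedness.

```python
# Minimal sketch: run known queries and check whether the expected document
# appears in the top results.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

search_client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",
    index_name="docs-index",
    credential=AzureKeyCredential("<query-key>"),
)

test_cases = [
    {"query": "reset my password", "expected_id": "kb-104"},
    {"query": "configure private endpoints", "expected_id": "kb-221"},
]

hits = 0
for case in test_cases:
    results = search_client.search(search_text=case["query"], top=5, select=["id"])
    retrieved_ids = [doc["id"] for doc in results]
    found = case["expected_id"] in retrieved_ids
    hits += found
    print(f"{case['query']!r}: {'HIT' if found else 'MISS'} -> {retrieved_ids}")

print(f"recall@5 = {hits / len(test_cases):.2f}")
```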
Improving the Quality and Query Performance of Search Results
Beyond the core features, there are several best practices for enhancing the quality and performance of your search results:
Optimize Your Index. Maintaining a lean and efficient index is crucial for performance. Periodically review your index size and schema for opportunities to reduce content. Simplify the schema by limiting field attributes to those each field actually needs, and consider alternatives to complex types such as collections, or flatten field hierarchies.
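For example, a lean schema might look like the following sketch, where only the attributes each field genuinely needs are enabled. Field and index names are illustrative.

```python
# Minimal sketch: a lean index schema with attributes limited per field.
from azure.search.documents.indexes.models import (
    SearchIndex,
    SearchableField,
    SimpleField,
)

fields = [
    SimpleField(name="id", type="Edm.String", key=True),
    SearchableField(name="content", type="Edm.String"),            # full-text search only
    SimpleField(name="category", type="Edm.String", filterable=True),
    SimpleField(name="lastUpdated", type="Edm.DateTimeOffset",
                filterable=True, sortable=True),
]
index = SearchIndex(name="docs-index", fields=fields)
```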
Optimize Your Queries. Craft your queries carefully to avoid unnecessary complexity. Limit the number of searchable fields, reduce the amount of data returned, avoid partial term searches, and keep filters simple. Use functions such as search.in instead of long chains of OR filter criteria, and break down complex regular expressions for better performance.
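A trimmed query along these lines might look like the following sketch; it assumes a SearchClient constructed as in the earlier examples, and the field names and filter values are illustrative.

```python
# Minimal sketch: search only the fields that matter, return only what the app
# needs, and use search.in rather than long OR filter chains.
# search_client: SearchClient constructed as in the hybrid query sketch above.
results = search_client.search(
    search_text="storage key rotation",
    search_fields=["title", "content"],                       # limit fields scanned
    select=["id", "title"],                                    # return only displayed fields
    filter="search.in(category, 'security,identity', ',')",   # compact alternative to ORs
    top=10,
)
```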
Utilize Hybrid Search and Semantic Ranking. Leverage the power of hybrid search by combining keyword and vector search. Enable semantic ranking to reorder results based on semantic relevance, ensuring the most relevant content appears at the top. Hybrid search plus semantic re-ranking offers significant advantages, including higher answer recall, broader query coverage, and increased precision.
Refine Your Chunking Strategy. Experiment with different chunk sizes, overlaps, and boundary strategies to find the optimal configuration for your data and embedding model. The goal is to create chunks that are both semantically coherent and comprehensive, improving search relevance and recall. You can now specify token chunking in the Text Split skill, allowing you to chunk by token length and set the tokenizer and any tokens that shouldn't be split.
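As a baseline, the following sketch configures the Text Split skill for fixed-size, character-based chunks with overlap using the azure-search-documents Python SDK. The token-based unit and tokenizer options mentioned above are newer additions whose exact SDK parameter names may differ, so they are left out here; sizes and skillset wiring are illustrative.

```python
# Minimal sketch: Text Split skill producing fixed-size chunks with overlap,
# for use inside a skillset attached to an indexer.
from azure.search.documents.indexes.models import (
    InputFieldMappingEntry,
    OutputFieldMappingEntry,
    SplitSkill,
)

split_skill = SplitSkill(
    text_split_mode="pages",            # fixed-size chunks rather than sentences
    maximum_page_length=2000,           # characters per chunk
    page_overlap_length=200,            # characters shared between adjacent chunks
    inputs=[InputFieldMappingEntry(name="text", source="/document/content")],
    outputs=[OutputFieldMappingEntry(name="textItems", target_name="chunks")],
)
```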
Utilize Advanced Features. Explore advanced features like query rewriting in the semantic ranker to generate more relevant results. Experiment with rescoring options for compressed vectors to find a balance between index size and retrieval quality. Use the weight property on vectorQueries to fine-tune the influence of individual vector queries in multi-query requests.
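To illustrate per-vector weighting, here is a sketch of a REST request body for a hybrid query with two weighted vector queries. The embeddings and field names are placeholders, and the API version that supports weight should be verified against the current REST reference.

```python
# Minimal sketch: a hybrid query body where title matches contribute twice as
# much as content matches during fusion.
import json

title_vector = [0.0] * 1536    # placeholder: query embedding for the title field
content_vector = [0.0] * 1536  # placeholder: query embedding for the content field

body = {
    "search": "storage key rotation",
    "vectorQueries": [
        {"kind": "vector", "vector": title_vector, "fields": "titleVector",
         "k": 50, "weight": 2.0},
        {"kind": "vector", "vector": content_vector, "fields": "contentVector",
         "k": 50, "weight": 1.0},
    ],
    "top": 5,
}
# POST this payload to {endpoint}/indexes/{index}/docs/search?api-version=<recent version>
payload = json.dumps(body)
```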
By carefully implementing these strategies, you can create a highly efficient and relevant search experience for your RAG AI solutions, leading to more accurate, informative, and valuable results for your users. Remember, a well-tuned search mechanism is the cornerstone of a successful RAG AI application.
Further information:
Azure AI Search: RAG for better results, larger scale, faster answers
Advanced RAG with LlamaIndex, Azure AI Search and Azure AI Foundry
Hybrid query - Azure AI Search | Microsoft Learn
Feature descriptions - Azure AI Search | Microsoft Learn
What's new in Azure AI Search | Microsoft Learn
Chunk and vectorize by document layout - Azure AI Search | Microsoft Learn
Add scoring profiles - Azure AI Search | Microsoft Learn
Raising the bar for RAG excellence: introducing generative query rewriting and new ranking model