Vertex AI Vector Search with Firestore
Vertex AI Vector Search lets you index and retrieve documents by vector similarity. The documents themselves are stored in Firestore, and their document IDs are indexed in the vector search index provided by Vertex AI. This setup is suitable for production use cases.
Installation
The vector search functionality ships with the Vertex AI plugin. Install the plugin package, then import from its vectorsearch entry point:
npm install @genkit-ai/vertexai
Configuration
- Create a vector search index in Vertex AI. For details, see Create your Vector Search Index.
- Create a Firestore database and a collection within it to store the documents that will be indexed. For details, see the Firestore documentation on creating databases.
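Both steps produce identifiers (project, location, collection name, index and endpoint IDs) that the plugin configuration needs. One way to collect them is from environment variables, as in the minimal sketch below. The `VectorSearchSettings` interface, the `loadVectorSearchSettings` helper, and the variable names are hypothetical and are not part of Genkit.

```typescript
// Hypothetical settings object. The environment variable names are
// assumptions chosen to match the placeholder constants used on this page.
interface VectorSearchSettings {
  projectId: string;
  location: string;
  firestoreCollection: string;
  indexId: string;
  indexEndpointId: string;
  deployedIndexId: string;
  publicDomainName: string;
}

type Env = Record<string, string | undefined>;

// Return the trimmed value of a required variable, or throw a clear error.
function requireVar(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required setting: ${name}`);
  }
  return value;
}

// Collect every identifier the vector search configuration refers to.
function loadVectorSearchSettings(env: Env): VectorSearchSettings {
  return {
    projectId: requireVar(env, 'PROJECT_ID'),
    location: requireVar(env, 'LOCATION'),
    firestoreCollection: requireVar(env, 'FIRESTORE_COLLECTION'),
    indexId: requireVar(env, 'VECTOR_SEARCH_INDEX_ID'),
    indexEndpointId: requireVar(env, 'VECTOR_SEARCH_INDEX_ENDPOINT_ID'),
    deployedIndexId: requireVar(env, 'VECTOR_SEARCH_DEPLOYED_INDEX_ID'),
    publicDomainName: requireVar(env, 'VECTOR_SEARCH_PUBLIC_DOMAIN_NAME'),
  };
}
```

In a Node.js application you would typically call `loadVectorSearchSettings(process.env)` once at startup, so that a missing value fails fast instead of surfacing as a failed request later.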
To use Vertex AI Vector Search with Firestore, initialize the plugin and define a retriever with an embedder. You can use the provided Firestore helper functions, or supply a custom indexer and retriever for indexing and retrieving documents from Firestore:
import { initializeApp } from 'firebase-admin/app';
import { getFirestore } from 'firebase-admin/firestore';
import { Document, genkit, z } from 'genkit';
import { textEmbedding004, vertexAI } from '@genkit-ai/vertexai';
import {
  getFirestoreDocumentIndexer,
  getFirestoreDocumentRetriever,
  vertexAIVectorSearch,
  vertexAiIndexerRef,
  vertexAiRetrieverRef,
  type DocumentIndexer,
  type DocumentRetriever,
} from '@genkit-ai/vertexai/vectorsearch';

// Initialize Firebase app
initializeApp({ projectId: PROJECT_ID });
const db = getFirestore();

// Use our helper functions here, or define your own document retriever and document indexer
const firestoreDocumentRetriever: DocumentRetriever = getFirestoreDocumentRetriever(db, FIRESTORE_COLLECTION);
const firestoreDocumentIndexer: DocumentIndexer = getFirestoreDocumentIndexer(db, FIRESTORE_COLLECTION);

// Configure Genkit with Vertex AI plugin
const ai = genkit({
  plugins: [
    vertexAI({
      projectId: PROJECT_ID,
      location: LOCATION,
      googleAuth: {
        scopes: ['https://www.googleapis.com/auth/cloud-platform'],
      },
    }),
    vertexAIVectorSearch({
      projectId: PROJECT_ID,
      location: LOCATION,
      vectorSearchOptions: [
        {
          publicDomainName: VECTOR_SEARCH_PUBLIC_DOMAIN_NAME,
          indexEndpointId: VECTOR_SEARCH_INDEX_ENDPOINT_ID,
          indexId: VECTOR_SEARCH_INDEX_ID,
          deployedIndexId: VECTOR_SEARCH_DEPLOYED_INDEX_ID,
          documentRetriever: firestoreDocumentRetriever,
          documentIndexer: firestoreDocumentIndexer,
          embedder: textEmbedding004,
        },
      ],
    }),
  ],
});

Configuration Options
- projectId (string): GCP Project ID
- location (string): GCP Project location
- indexId (string): Vector search index id
- indexEndpointId (string): ID of the vector search index endpoint that serves the vector search index. See the Vertex AI documentation on index endpoints for details.
- deployedIndexId (string): ID of the deployed index on that index endpoint. See the Vertex AI documentation on deploying an index to an index endpoint for details.
- publicDomainName (string): Public Domain Name of the vector search index endpoint.
- embedder: The embedding model to use, such as textEmbedding004. Must be a configured embedder in your Genkit project.
- documentIndexer (DocumentIndexer): Document indexer used to insert data with unique IDs in Firestore. You can supply a custom document indexer if your requirements differ.
- documentRetriever (DocumentRetriever): Document retriever used to retrieve data with the corresponding IDs from Firestore. You can supply a custom document retriever if your requirements differ.
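To illustrate what a custom document indexer and document retriever have to do, here is a minimal sketch that uses an in-memory Map in place of Firestore. The `SketchDocument` and `SketchNeighbor` shapes are simplified, locally defined stand-ins (assumptions, not Genkit's actual types), and both function names are hypothetical. The point is the flow of IDs: the indexer stores documents and returns their IDs, and the retriever maps the IDs returned by the vector index back to documents.

```typescript
// Simplified stand-ins for Genkit's document and neighbor shapes. These are
// assumptions defined locally so the sketch runs without any dependencies.
interface SketchDocument {
  text: string;
  metadata?: Record<string, unknown>;
}

interface SketchNeighbor {
  datapointId: string; // ID reported by the vector index for a match
  distance?: number;   // score reported by the vector index, if any
}

// A Map plays the role of the Firestore collection.
const store = new Map<string, SketchDocument>();
let nextId = 0;

// Custom indexer: persist each document and return the generated IDs,
// which are the values that end up in the vector index.
async function inMemoryIndexer(docs: SketchDocument[]): Promise<string[]> {
  return docs.map((doc) => {
    const id = `doc-${nextId++}`;
    store.set(id, doc);
    return id;
  });
}

// Custom retriever: look up each neighbor's ID and attach its distance to
// the document metadata. IDs with no stored document are skipped.
async function inMemoryRetriever(
  neighbors: SketchNeighbor[],
): Promise<SketchDocument[]> {
  const found: SketchDocument[] = [];
  for (const neighbor of neighbors) {
    const doc = store.get(neighbor.datapointId);
    if (doc) {
      found.push({
        ...doc,
        metadata: { ...doc.metadata, distance: neighbor.distance },
      });
    }
  }
  return found;
}
```

A real implementation would replace the Map with reads and writes against your own storage, and would need to match the DocumentIndexer and DocumentRetriever types exported by the plugin.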
Indexing Documents
To populate the index with data, implement your own indexing logic using Genkit's Document format. Genkit also provides a sample indexing function:
async ({ datapoints }) => {
  const documents: Document[] = datapoints.map((dp) => {
    const metadata = {
      restricts: structuredClone(dp.restricts),
      numericRestricts: structuredClone(dp.numericRestricts),
    };
    return Document.fromText(dp.text, metadata);
  });
  await ai.index({
    indexer: vertexAiIndexerRef({
      indexId: VECTOR_SEARCH_INDEX_ID,
      displayName: 'firestore_index',
    }),
    documents,
  });
  return { result: 'success' };
};

Retrieving Documents
Use ai.retrieve with the retriever you defined:
async ({ query, k, restricts, numericRestricts }) => {
  const startTime = performance.now();
  const metadata = {
    restricts: structuredClone(restricts),
    numericRestricts: structuredClone(numericRestricts),
  };
  const queryDocument = Document.fromText(query, metadata);
  const res = await ai.retrieve({
    retriever: vertexAiRetrieverRef({
      indexId: VECTOR_SEARCH_INDEX_ID,
      displayName: 'firestore_index',
    }),
    query: queryDocument,
    options: { k },
  });
  const endTime = performance.now();
  return {
    result: res
      .map((doc) => ({
        text: doc.content[0].text!,
        metadata: JSON.stringify(doc.metadata),
        distance: doc.metadata?.distance,
      }))
      .sort((a, b) => b.distance - a.distance),
    length: res.length,
    time: endTime - startTime,
  };
};
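The sample sorts results with `b.distance - a.distance`, which evaluates to NaN whenever a document carries no distance in its metadata. The sketch below shows one possible way to make the ordering explicit and to keep unscored results at the end. The `RetrievedResult` type and the `rankByDistance` helper are hypothetical. Whether larger or smaller values mean a closer match depends on the distance measure configured on your index, so the order is left as a parameter.

```typescript
// Hypothetical result shape, mirroring the fields built in the sample above.
interface RetrievedResult {
  text: string;
  distance?: number;
}

type Order = 'ascending' | 'descending';

// True when the result carries a usable numeric distance.
function hasDistance(result: RetrievedResult): boolean {
  return typeof result.distance === 'number' && !Number.isNaN(result.distance);
}

// Sort results by distance in the requested order. Results without a
// distance keep their relative order and are appended after the scored ones.
function rankByDistance(
  results: RetrievedResult[],
  order: Order,
): RetrievedResult[] {
  const sign = order === 'ascending' ? 1 : -1;
  const scored = results.filter(hasDistance);
  const unscored = results.filter((r) => !hasDistance(r));
  scored.sort(
    (a, b) => sign * ((a.distance as number) - (b.distance as number)),
  );
  return [...scored, ...unscored];
}
```

Passing `'descending'` reproduces the ordering of the sample for results that do have a distance. The input array is not modified, because `filter` returns new arrays before sorting.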