Vertex AI plugin

The Vertex AI plugin provides access to Google Cloud’s enterprise-grade AI platform, offering advanced features beyond basic model access. Use this for enterprise applications that need grounding, Vector Search, Model Garden, or evaluation capabilities.

Accessing Google GenAI Models via Vertex AI

All languages support accessing Google’s generative AI models (Gemini, Imagen, etc.) through Vertex AI with enterprise authentication and features.

The unified Google GenAI plugin provides access to models via Vertex AI using the vertexAI initializer.

Basic Model Access

Installation

npm i --save @genkit-ai/google-genai

Configuration

import { genkit } from 'genkit';
import { vertexAI } from '@genkit-ai/google-genai';

const ai = genkit({
  plugins: [
    vertexAI({ location: 'us-central1' }), // Regional endpoint
    // vertexAI({ location: 'global' }),      // Global endpoint
  ],
});

Authentication Methods:

Application Default Credentials (ADC): The standard method for most Vertex AI use cases, especially in production. It uses the credentials from the environment (e.g., service account on GCP, user credentials from gcloud auth application-default login locally). This method requires a Google Cloud Project with billing enabled and the Vertex AI API enabled.
Vertex AI Express Mode: A streamlined way to try many Vertex AI features with just an API key, without setting up billing or a full project configuration. It is well suited to quick experimentation and has generous free-tier quotas. Learn more about Express Mode.

// Using Vertex AI Express Mode (easy to start, some limitations).
// Get an API key from the Vertex AI Studio Express Mode setup.
const ai = genkit({
  plugins: [vertexAI({ apiKey: process.env.VERTEX_EXPRESS_API_KEY })],
});

Note: When using Express Mode, you do not provide projectId and location in the plugin config.
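The choice between the two authentication methods can be made once at startup. The following is a minimal sketch, not part of the plugin: resolveVertexOptions is a hypothetical helper, and VERTEX_LOCATION is an assumed environment variable name that the plugin does not read on its own. It returns Express Mode options when an API key is present and falls back to a regional ADC configuration otherwise.

```typescript
// Hypothetical helper: choose vertexAI() options from environment variables.
// VERTEX_LOCATION is an assumed variable name used only for this sketch.
type VertexOptions = { apiKey: string } | { location: string };

function resolveVertexOptions(
  env: Record<string, string | undefined>,
): VertexOptions {
  const apiKey = env.VERTEX_EXPRESS_API_KEY;
  if (apiKey) {
    // Express Mode: API key only, no projectId or location.
    return { apiKey };
  }
  // ADC: credentials come from the environment; only the region is set here.
  return { location: env.VERTEX_LOCATION || 'us-central1' };
}
```

In an application this would be passed straight to the initializer, for example vertexAI(resolveVertexOptions(process.env)).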

Available Models

The following Gemini models are registered for use with the Vertex AI plugin:

Gemini 3+ Series:

gemini-3.5-flash
gemini-3.1-pro-preview
gemini-3.1-flash-lite-preview
gemini-3-flash-preview
gemini-3.1-flash-image

Gemini 2.5 Series:

gemini-2.5-pro
gemini-2.5-flash
gemini-2.5-flash-lite

Basic Usage

import { genkit } from 'genkit';
import { vertexAI } from '@genkit-ai/google-genai';

const ai = genkit({
  plugins: [vertexAI({ location: 'us-central1' })],
});

const response = await ai.generate({
  model: vertexAI.model('gemini-3.1-pro-preview'),
  prompt: 'Explain Gemini Enterprise Agent Platform in simple terms.',
});

console.log(response.text);

Model Configuration (PayGo)

You can specify the payGo option to use Flex or Priority routing on Vertex AI.

const response = await ai.generate({
  model: vertexAI.model('gemini-flash-lite-latest'),
  prompt: 'Explain Gemini Enterprise Agent Platform in simple terms.',
  config: {
    payGo: 'priority', // Can be 'priority', 'priority-only', 'flex', or 'flex-only'
  },
});

Multimodal Input

Gemini models can process multimodal inputs, including images and video. For video, use videoMetadata to set start and end offsets or a frame sampling rate.

const response = await ai.generate({
  model: vertexAI.model('gemini-flash-latest'),
  prompt: [
    { text: 'transcribe this video' },
    {
      media: {
        url: 'gs://cloud-samples-data/video/animals.mp4',
        contentType: 'video/mp4',
      },
      metadata: {
        videoMetadata: {
          fps: 0.5,
          startOffset: '3.5s',
          endOffset: '10.2s',
        },
      },
    },
  ],
});
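When clips are selected programmatically, it can help to build the video part from numeric values and validate the range first. The sketch below assumes the offset format shown above (seconds with an "s" suffix) and an MP4 source; videoPart and VideoClip are hypothetical names introduced for illustration.

```typescript
// Hypothetical helper: build a video media part with clipping metadata.
interface VideoClip {
  fps?: number; // sampling rate, in frames per second
  startSeconds?: number; // clip start, in seconds
  endSeconds?: number; // clip end, in seconds
}

function videoPart(url: string, clip: VideoClip = {}) {
  const { fps, startSeconds, endSeconds } = clip;
  if (
    startSeconds !== undefined &&
    endSeconds !== undefined &&
    endSeconds <= startSeconds
  ) {
    throw new Error('endSeconds must be greater than startSeconds');
  }

  const videoMetadata: {
    fps?: number;
    startOffset?: string;
    endOffset?: string;
  } = {};
  if (fps !== undefined) videoMetadata.fps = fps;
  // Offsets are written as seconds with an "s" suffix, e.g. "3.5s".
  if (startSeconds !== undefined) videoMetadata.startOffset = `${startSeconds}s`;
  if (endSeconds !== undefined) videoMetadata.endOffset = `${endSeconds}s`;

  return {
    // Assumption: the source is an MP4 file.
    media: { url, contentType: 'video/mp4' },
    metadata: { videoMetadata },
  };
}
```

The returned object can be placed in the prompt array next to a text part.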

Text Embedding

const embeddings = await ai.embed({
  embedder: vertexAI.embedder('text-embedding-005'),
  content: 'Embed this text.',
});
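Embedding vectors are usually compared with cosine similarity. The helper below is a minimal sketch that works on plain number arrays; it assumes you have already extracted each vector from the embed result as a number[], and cosineSimilarity is an illustrative name rather than a plugin API.

```typescript
// Cosine similarity between two embedding vectors of equal length.
// Returns a value in [-1, 1]; 0 is returned if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Values close to 1 indicate vectors pointing in nearly the same direction, which for text embeddings typically means similar content.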

Image Generation (Imagen)

Available Models:

virtual-try-on-001

// The virtual-try-on model requires two specific media inputs: the person and the product.
const response = await ai.generate({
  model: vertexAI.model('virtual-try-on-001'),
  prompt: [
    {
      media: { url: `data:image/png;base64,${personImageBase64}`, contentType: 'image/png' },
      metadata: { type: 'personImage' },
    },
    {
      media: { url: `data:image/png;base64,${productImageBase64}`, contentType: 'image/png' },
      metadata: { type: 'productImage' },
    },
  ],
});

const generatedImage = response.media;
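Because the model expects the person image and the product image in specific roles, a small builder can keep the two parts from being mixed up. This is a sketch under the assumption that both images share one content type (PNG by default); tryOnPrompt and TryOnPart are hypothetical names.

```typescript
// Hypothetical helper: build the two media parts for a virtual try-on request.
type TryOnPart = {
  media: { url: string; contentType: string };
  metadata: { type: 'personImage' | 'productImage' };
};

function tryOnPrompt(
  personBase64: string,
  productBase64: string,
  contentType = 'image/png', // assumption: both images use the same type
): TryOnPart[] {
  const toPart = (
    base64: string,
    type: TryOnPart['metadata']['type'],
  ): TryOnPart => ({
    // Images are passed inline as data URLs.
    media: { url: `data:${contentType};base64,${base64}`, contentType },
    metadata: { type },
  });
  // Order follows the example above: person first, then product.
  return [
    toPart(personBase64, 'personImage'),
    toPart(productBase64, 'productImage'),
  ];
}
```

The result can be passed as the prompt value of the generate call.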

Video Generation (Veo)

Generate videos from text prompts or manipulate existing images to create dynamic video content.

Available Models:

veo-3.1-generate-preview
veo-3.1-fast-generate-preview
veo-3.1-lite-generate-preview
veo-3.0-generate-001
veo-3.0-fast-generate-001
veo-2.0-generate-001

Usage (Text-to-Video):

let { operation } = await ai.generate({
  model: vertexAI.model('veo-3.1-lite-generate-preview'),
  prompt: 'A majestic dragon soaring over a mystical forest at dawn.',
  config: {
    aspectRatio: '16:9',
    durationSeconds: 8,
    resolution: '1080p',
    personGeneration: 'allow_adult',
  },
});

if (!operation) throw new Error('No operation returned');

while (!operation.done) {
  // Wait before each status check so a finished operation is used right away.
  await new Promise((resolve) => setTimeout(resolve, 5000));
  operation = await ai.checkOperation(operation);
}

const video = operation.output?.message?.content.find((p) => !!p.media);
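For long-running generations it can be useful to cap how long the polling runs. The helper below is a generic sketch rather than a plugin API: pollUntilDone is a hypothetical name, the check function is supplied by the caller, and the error field on the operation is an assumed shape that may not match the real operation type.

```typescript
// Assumed minimal shape of a long-running operation for this sketch.
interface PollableOperation {
  done?: boolean;
  error?: { message: string }; // assumption: failures are reported here
}

// Polls until the operation is done, fails, or maxChecks is exhausted.
async function pollUntilDone<T extends PollableOperation>(
  operation: T,
  check: (op: T) => Promise<T>,
  intervalMs = 5000,
  maxChecks = 60,
): Promise<T> {
  let current = operation;
  let checks = 0;
  for (;;) {
    if (current.error) {
      throw new Error(`Operation failed: ${current.error.message}`);
    }
    if (current.done) return current;
    if (checks >= maxChecks) {
      throw new Error(`Operation not done after ${maxChecks} checks`);
    }
    // Wait, then ask the caller-supplied function for a fresh status.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    current = await check(current);
    checks++;
  }
}
```

With the plugin, a call might look like pollUntilDone(operation, (op) => ai.checkOperation(op)).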

Video Extension:

You can extend an existing Veo-generated video by providing it as input to another generation request:

let { operation } = await ai.generate({
  model: vertexAI.model('veo-3.1-generate-preview'),
  prompt: [
    { text: 'Track the butterfly into the garden as it lands on a flower.' },
    {
      media: {
        contentType: 'video/mp4',
        url: previousVeoVideo.media.url,
      },
    },
  ],
  config: {
    aspectRatio: '16:9', // Must match the original video
  },
});

Music Generation (Lyria)

Generate high-quality music and audio clips.

Available Models:

lyria-3-pro-preview
lyria-3-clip-preview
lyria-002 (Legacy)

Usage:

const response = await ai.generate({
  model: vertexAI.model('lyria-3-pro-preview'),
  prompt: 'A cheerful acoustic folk song with guitar and harmonica.',
});

const audioMedia = response.media;

Thinking Config

Thinking Level (Gemini 3.0+)

const response = await ai.generate({
  model: vertexAI.model('gemini-3.1-pro-preview'),
  prompt: 'what is heavier, one kilo of steel or one kilo of feathers',
  config: {
    thinkingConfig: {
      thinkingLevel: 'HIGH', // Or 'LOW' or 'MEDIUM'
      includeThoughts: true,
    },
  },
});

Thinking Budget (Gemini 2.5)

const { message } = await ai.generate({
  model: vertexAI.model('gemini-2.5-pro'),
  prompt: 'what is heavier, one kilo of steel or one kilo of feathers',
  config: {
    thinkingConfig: {
      thinkingBudget: 1024,
      includeThoughts: true,
    },
  },
});
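Since the two model generations take different thinking options, code that switches between models may want to derive the option from the model name. The following is a rough sketch: thinkingConfigFor is a hypothetical helper, the token budgets per level are arbitrary assumptions to tune, and aliases such as gemini-pro-latest are rejected because their generation cannot be inferred from the name.

```typescript
type ThinkingLevel = 'LOW' | 'MEDIUM' | 'HIGH';
type ThinkingConfig =
  | { thinkingLevel: ThinkingLevel; includeThoughts: boolean }
  | { thinkingBudget: number; includeThoughts: boolean };

// Assumed token budgets for Gemini 2.5; these numbers are illustrative only.
const BUDGET_BY_LEVEL: Record<ThinkingLevel, number> = {
  LOW: 512,
  MEDIUM: 1024,
  HIGH: 4096,
};

function thinkingConfigFor(
  modelName: string,
  level: ThinkingLevel,
  includeThoughts = false,
): ThinkingConfig {
  // Gemini 2.5 models take a numeric thinking budget.
  if (modelName.indexOf('gemini-2.5') === 0) {
    return { thinkingBudget: BUDGET_BY_LEVEL[level], includeThoughts };
  }
  // Gemini 3.0 and later take a named thinking level.
  const match = /^gemini-(\d+)/.exec(modelName);
  if (match && Number(match[1]) >= 3) {
    return { thinkingLevel: level, includeThoughts };
  }
  throw new Error(`Cannot infer thinking config for model "${modelName}"`);
}
```

The result would go under config.thinkingConfig in the generate call.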

Grounding (Vertex AI Search & Google Search)

Enable Google Search or Vertex AI Search data stores to provide answers grounded in verifiable sources.

// Google Search Grounding
const searchResponse = await ai.generate({
  model: vertexAI.model('gemini-flash-latest'),
  prompt: 'What are the top tech news stories this week?',
  config: {
    tools: [{ googleSearch: {} }],
  },
});

// Vertex AI Search Grounding
const vertexResponse = await ai.generate({
  model: vertexAI.model('gemini-flash-latest'),
  prompt: 'Summarize our company policies.',
  config: {
    vertexRetrieval: {
      datastore: {
        projectId: 'your-project-id',
        location: 'us-central1',
        dataStoreId: 'your-data-store-id',
      },
      disableAttribution: false,
    },
  },
});

Enterprise Features (JavaScript Only)

Model Garden Integration

Access third-party models through Vertex AI Model Garden:

Anthropic (Claude) Models

Available Models:

claude-opus-4-7
claude-sonnet-4-6
claude-opus-4-6
claude-haiku-4-5@20251001
claude-sonnet-4-5@20250929
claude-sonnet-4@20250514
claude-opus-4-5@20251101
claude-opus-4-1@20250805
claude-opus-4@20250514

import { genkit } from 'genkit';
import { vertexModelGarden } from '@genkit-ai/vertexai/modelgarden';

const ai = genkit({
  plugins: [vertexModelGarden({ location: 'us-central1' })],
});

const response = await ai.generate({
  model: vertexModelGarden.model('claude-sonnet-4-6'),
  prompt: 'What should I do when I visit Melbourne?',
});

Advanced Configuration (Optional):

You can provide configuration options to tailor the model’s behavior, such as enabling extended thinking features.

const response = await ai.generate({
  model: vertexModelGarden.model('claude-sonnet-4-6'),
  prompt: 'What should I do when I visit Melbourne?',
  config: {
    thinking: {
      enabled: true,
      budgetTokens: 2048,
    },
    output_config: {
      effort: 'high', // Can be 'low', 'medium', 'high', or 'xhigh'
    },
  },
});

For the full list of available Claude models see: Available Claude models

Llama (Meta) Models

Available Models:

meta/llama-4-maverick-17b-128e-instruct-maas
meta/llama-4-scout-17b-16e-instruct-maas
meta/llama-3.3-70b-instruct-maas

const ai = genkit({
  plugins: [vertexModelGarden({ location: 'us-central1' })],
});

const response = await ai.generate({
  model: vertexModelGarden.model(
    'meta/llama-4-maverick-17b-128e-instruct-maas',
  ),
  prompt: 'Write a function that adds two numbers together',
});

For the full list of available Llama models see: Fully-managed Llama models

Mistral AI Models

Available Models:

mistral-medium-3
mistral-ocr-2505
mistral-small-2503
codestral-2

const ai = genkit({
  plugins: [vertexModelGarden({ location: 'us-central1' })],
});

const response = await ai.generate({
  model: vertexModelGarden.model('mistral-medium-3'),
  prompt: 'Write a function that adds two numbers together',
  config: {
    temperature: 0.7,
    maxOutputTokens: 1024,
    topP: 0.9,
    topK: 40,
  },
});

For the full list of available Mistral AI models see: Mistral AI models

Evaluation Metrics

Use the Vertex AI Rapid Evaluation API for model evaluation:

import { genkit } from 'genkit';
import {
  vertexAIEvaluation,
  VertexAIEvaluationMetricType,
} from '@genkit-ai/vertexai/evaluation';

const ai = genkit({
  plugins: [
    vertexAIEvaluation({
      location: 'us-central1',
      metrics: [
        VertexAIEvaluationMetricType.SAFETY,
        {
          type: VertexAIEvaluationMetricType.ROUGE,
          metricSpec: {
            rougeType: 'rougeLsum',
          },
        },
      ],
    }),
  ],
});

Available metrics:

BLEU: Translation quality
ROUGE: Summarization quality
Fluency: Text fluency
Safety: Content safety
Groundedness: Factual accuracy
Summarization Quality/Helpfulness/Verbosity: Summary evaluation

Run evaluations:

genkit eval:run
genkit eval:flow -e vertexai/safety

Vector Search

Use Vertex AI Vector Search for enterprise-grade vector operations:

Setup

Create a Vector Search index in the Google Cloud Console
Configure dimensions based on your embedding model:
- gemini-embedding-001 / gemini-embedding-2-preview: 3072 dimensions by default; you can set output_dimensionality on embed calls (for example 768, 1536, or 3072, the sizes Google suggests). Size the index to match the vector length you actually use.
- text-embedding-005: 768 dimensions
- text-multilingual-embedding-002: 768 dimensions
- multimodalEmbedding001: 128, 256, 512, or 1408 dimensions
Deploy the index to a standard endpoint
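A dimension mismatch between the embedder and the index is easy to introduce, so it can be worth checking before deployment. The sketch below encodes only the sizes listed above and treats anything else as invalid, which may be stricter than the models actually allow; EMBEDDER_DIMENSIONS and assertIndexDimensions are hypothetical names.

```typescript
// Sizes taken from the setup list above. For the gemini-embedding models
// these are the example sizes given there, not necessarily every valid size.
const EMBEDDER_DIMENSIONS: Record<string, number[]> = {
  'gemini-embedding-001': [768, 1536, 3072],
  'gemini-embedding-2-preview': [768, 1536, 3072],
  'text-embedding-005': [768],
  'text-multilingual-embedding-002': [768],
  multimodalEmbedding001: [128, 256, 512, 1408],
};

// Throws if the index size is not one of the sizes listed for the embedder.
function assertIndexDimensions(embedder: string, indexDimensions: number): void {
  if (!Object.prototype.hasOwnProperty.call(EMBEDDER_DIMENSIONS, embedder)) {
    throw new Error(`Unknown embedder "${embedder}"`);
  }
  const allowed = EMBEDDER_DIMENSIONS[embedder];
  if (allowed.indexOf(indexDimensions) === -1) {
    throw new Error(
      `Index has ${indexDimensions} dimensions, but ${embedder} ` +
        `is listed with: ${allowed.join(', ')}`,
    );
  }
}
```

Calling assertIndexDimensions('text-embedding-005', 768) passes silently, while a 3072-dimension index paired with the same embedder throws.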

Configuration

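import { genkit } from 'genkit';
import { vertexAI } from '@genkit-ai/google-genai';
import {
  vertexAIVectorSearch,
  getFirestoreDocumentIndexer,
  getFirestoreDocumentRetriever,
} from '@genkit-ai/vertexai/vectorsearch';
import { initializeApp } from 'firebase-admin/app';
import { getFirestore } from 'firebase-admin/firestore';

// Firestore holds the document contents that the indexer and retriever use.
// 'your-collection-name' is a placeholder for your own Firestore collection.
initializeApp({ projectId: 'your-project-id' });
const db = getFirestore();
const firestoreDocumentRetriever = getFirestoreDocumentRetriever(db, 'your-collection-name');
const firestoreDocumentIndexer = getFirestoreDocumentIndexer(db, 'your-collection-name');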

const ai = genkit({
  plugins: [
    vertexAIVectorSearch({
      projectId: 'your-project-id',
      location: 'us-central1',
      vectorSearchOptions: [
        {
          indexId: 'your-index-id',
          indexEndpointId: 'your-endpoint-id',
          deployedIndexId: 'your-deployed-index-id',
          publicDomainName: 'your-domain-name',
          documentRetriever: firestoreDocumentRetriever,
          documentIndexer: firestoreDocumentIndexer,
          embedder: vertexAI.embedder('gemini-embedding-001'),
        },
      ],
    }),
  ],
});

Usage

import {
  vertexAiIndexerRef,
  vertexAiRetrieverRef,
} from '@genkit-ai/vertexai/vectorsearch';

// Index documents
await ai.index({
  indexer: vertexAiIndexerRef({
    indexId: 'your-index-id',
  }),
  documents,
});

// Retrieve similar documents
const results = await ai.retrieve({
  retriever: vertexAiRetrieverRef({
    indexId: 'your-index-id',
  }),
  query: queryDocument,
});

Next Steps

Learn about generating content to understand how to use these models effectively
Explore evaluation to leverage Vertex AI’s evaluation metrics
See RAG to implement retrieval-augmented generation with Vector Search
Check out creating flows to build structured AI workflows
For simple API key access, see the Google AI plugin