This is the full developer documentation for Genkit # Genkit | Open-source AI development framework by Google > An open-source framework for building AI-powered apps, built and used in production by Google Genkit is an open-source framework for building full-stack AI-powered applications, built and used in production by Google. It offers a unified interface for integrating AI models from many model providers, so you can use the best models for your needs. Rapidly build and deploy production-ready chatbots, automations, and recommendation systems using streamlined APIs for multimodal content, structured outputs, tool calling, and agentic workflows. Get started with just a few lines of code: * Gemini ```ts import { genkit } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [googleAI()] }); const { text } = await ai.generate({ model: googleAI.model('gemini-2.5-flash'), prompt: 'Why is Firebase awesome?' }); ``` * Imagen ```ts import { genkit } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [googleAI()] }); const { media } = await ai.generate({ model: googleAI.model('imagen-3.0-generate-002'), prompt: 'a banana riding a bicycle', }); ``` * OpenAI ```javascript import { genkit } from 'genkit'; import { openAI } from '@genkit-ai/compat-oai/openai'; const ai = genkit({ plugins: [openAI()] }); const { text } = await ai.generate({ model: openAI.model('gpt-4o'), prompt: 'Why is Firebase awesome?' }); ``` * Anthropic ```ts import { genkit } from 'genkit'; import { anthropic, claude35Sonnet } from 'genkitx-anthropic'; const ai = genkit({ plugins: [anthropic()] }); const { text } = await ai.generate({ model: claude35Sonnet, prompt: 'Why is Firebase awesome?' }); ``` * xAI ```ts import { genkit } from 'genkit'; import { xAI } from '@genkit-ai/compat-oai/xai'; const ai = genkit({ plugins: [xAI()] }); const { text } = await ai.generate({ model: xAI.model('grok-3-mini'), prompt: 'Why is Firebase awesome?', }); ``` * DeepSeek ```ts import { genkit } from 'genkit'; import { deepSeek } from '@genkit-ai/compat-oai/deepseek'; const ai = genkit({ plugins: [deepSeek()] }); const { text } = await ai.generate({ model: deepSeek.model('deepseek-chat'), prompt: 'Why is Firebase awesome?', }); ``` * Ollama ```ts import { genkit } from 'genkit'; import { ollama } from 'genkitx-ollama'; const ai = genkit({ plugins: [ollama()] }); const { text } = await ai.generate({ model: ollama.model('gemma3:latest'), prompt: 'Why is Firebase awesome?', }); ``` ## Explore & build with Genkit [Section titled “Explore & build with Genkit”](#explore--build-with-genkit) Play with AI sample apps, with visualizations of the Genkit code that powers them, at no cost to you. [Explore Genkit by Example](https://examples.genkit.dev) Create your own AI-powered feature in minutes with our “Get started” guide. 
[Get started](/docs/get-started) ## Key capabilities [Section titled “Key capabilities”](#key-capabilities) | | | | ----------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | **Broad AI model support** | Use a unified interface to integrate with hundreds of models from providers like [Google](/docs/plugins/google-genai), [OpenAI](/docs/plugins/openai), [Anthropic](https://github.com/BloomLabsInc/genkit-plugins/blob/main/plugins/anthropic/README.md), [Ollama](/docs/plugins/ollama), and more. Explore, compare, and use the best models for your needs. | | **Simplified AI development** | Use streamlined APIs to build AI features with [structured output](/docs/models#structured-output), [agentic tool calling](/docs/tool-calling), [context-aware generation](/docs/rag), [multi-modal input/output](/docs/models#multimodal), and more. Genkit handles the complexity of AI development, so you can build and iterate faster. | | **Web and mobile ready** | Integrate seamlessly with frameworks and platforms including Next.js, React, Angular, iOS, Android, using purpose-built [client SDKs](/docs/firebase) and helpers. | | **Cross-language support** | Build with the language that best fits your project. Genkit provides SDKs for JavaScript/TypeScript, Go, and Python with consistent APIs and capabilities across all supported languages. | | **Deploy anywhere** | Deploy AI logic to any environment that supports your chosen programming language, such as [Cloud Functions for Firebase](/docs/firebase), [Google Cloud Run](/docs/cloud-run), or [third-party platforms](/docs/deploy-node), with or without Google services. | | **Developer tools** | Accelerate AI development with a purpose-built, local [CLI and Developer UI](/docs/devtools). Test prompts and flows against individual inputs or datasets, compare outputs from different models, debug with detailed execution traces, and use immediate visual feedback to iterate rapidly on prompts. | | **Production monitoring** | Ship AI features with confidence using comprehensive production monitoring. Track model performance, and request volumes, latency, and error rates in a [purpose-built dashboard](/docs/observability/getting-started). Identify issues quickly with detailed observability metrics, and ensure your AI features meet quality and performance targets in real-world usage. | ## How does it work? [Section titled “How does it work?”](#how-does-it-work) Genkit simplifies AI integration with an open-source SDK and unified APIs that work across various model providers and programming languages. It abstracts away complexity so you can focus on delivering great user experiences. Some key features offered by Genkit include: * [Text and image generation](/docs/models) * [Type-safe, structured data generation](/docs/models#structured-output) * [Tool calling](/docs/tool-calling) * [Prompt templating](/docs/dotprompt) * [Persisted chat interfaces](/docs/chat) * [AI workflows](/docs/flows) * [AI-powered data retrieval (RAG)](/docs/rag) Genkit is designed for server-side deployment in multiple language environments, and also provides seamless client-side integration through dedicated helpers and [client SDKs](/docs/firebase). 
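To make these capabilities concrete, here is a minimal sketch of type-safe structured output, one of the features listed above. It assumes the Google AI plugin is configured with an API key, and the recipe schema is purely illustrative:

```ts
import { genkit, z } from 'genkit';
import { googleAI } from '@genkit-ai/googleai';

const ai = genkit({ plugins: [googleAI()] });

// Illustrative schema; Genkit constrains and validates the model's JSON
// output against it, so `output` is typed rather than free-form text.
const RecipeSchema = z.object({
  title: z.string(),
  ingredients: z.array(z.string()),
  steps: z.array(z.string()),
});

const { output } = await ai.generate({
  model: googleAI.model('gemini-2.5-flash'),
  prompt: 'Suggest a simple banana bread recipe.',
  output: { schema: RecipeSchema },
});

console.log(output?.title);
```

The same `ai.generate()` API also supports tool calling and multimodal prompts, and `ai.generateStream()` provides streaming, so this pattern extends to the other capabilities listed above.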
## Implementation path [Section titled “Implementation path”](#implementation-path) | | | | | - | --------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | | Choose your language and model provider | Select the Genkit SDK for your preferred language (JavaScript/TypeScript, Go, or Python). Choose a model provider like [Google Gemini](/docs/plugins/google-genai) or Anthropic, and get an API key. Some providers, like [Vertex AI](/docs/plugins/vertex-ai), may rely on a different means of authentication. | | | Install the SDK and initialize | Install the Genkit SDK, model-provider package of your choice, and the Genkit CLI. Import the Genkit and provider packages and initialize Genkit with the provider API key. | | | Write and test AI features | Use the Genkit SDK to build AI features for your use case, from basic text generation to complex multi-step workflows and agents. Use the CLI and Developer UI to help you rapidly test and iterate. | | | Deploy and monitor | Deploy your AI features to Firebase, Google Cloud Run, or any environment that supports your chosen programming language. Integrate them into your app, and monitor them in production in the Firebase console. | ## Connect with us [Section titled “Connect with us”](#connect-with-us) * [**Join us on Discord**](https://discord.gg/qXt5zzQKpc) – Get help, share ideas, and chat with other developers. * [**Contribute on GitHub**](https://github.com/firebase/genkit/issues) – Report bugs, suggest features, or explore the source code. # Use Genkit in an Angular app > Learn how to use Genkit flows in Angular applications This page shows how you can use Genkit flows in Angular apps. ## Before you begin [Section titled “Before you begin”](#before-you-begin) You should be familiar with Genkit’s concept of [flows](/docs/flows), and how to write them. ## Create an Angular project [Section titled “Create an Angular project”](#create-an-angular-project) This guide will use an Angular app with [SSR with server routing](https://angular.dev/guide/hybrid-rendering). You can create a new project with server-side routing with the [Angular CLI](https://angular.dev/installation#install-angular-cli): ```bash ng new --ssr --server-routing ``` You can also add server-side routing to an existing project with the `ng add` command: ```bash ng add @angular/ssr --server-routing ``` ## Install Genkit dependencies [Section titled “Install Genkit dependencies”](#install-genkit-dependencies) Install the Genkit dependencies into your Angular app: 1. Install the core Genkit library: ```bash npm install genkit ``` 2. Install at least one model plugin. * Gemini (Google AI) ```bash npm install @genkit-ai/googleai ``` * Gemini (Vertex AI) ```bash npm install @genkit-ai/vertexai ``` 3. Install the Genkit Express library: ```bash npm install @genkit-ai/express ``` 4. Install the Genkit CLI globally. The tsx tool is also recommended as a development dependency, as it makes testing your code more convenient. Both of these dependencies are optional, however. ```bash npm install -g genkit-cli npm install --save-dev tsx ``` ## Define Genkit flows [Section titled “Define Genkit flows”](#define-genkit-flows) Create a new directory in your Angular project to contain your Genkit flows. 
Create `src/genkit/` and add your flow definitions there: For example, create `src/genkit/menuSuggestionFlow.ts`: * Gemini (Google AI) ```ts import { googleAI } from '@genkit-ai/googleai'; import { genkit, z } from 'genkit'; const ai = genkit({ plugins: [googleAI()], }); export const menuSuggestionFlow = ai.defineFlow( { name: 'menuSuggestionFlow', inputSchema: z.object({ theme: z.string() }), outputSchema: z.object({ menuItem: z.string() }), streamSchema: z.string(), }, async ({ theme }, { sendChunk }) => { const { stream, response } = ai.generateStream({ model: googleAI.model('gemini-2.5-flash'), prompt: `Invent a menu item for a ${theme} themed restaurant.`, }); for await (const chunk of stream) { sendChunk(chunk.text); } const { text } = await response; return { menuItem: text }; } ); ``` * Gemini (Vertex AI) ```ts import { vertexAI } from '@genkit-ai/vertexai'; import { genkit, z } from 'genkit'; const ai = genkit({ plugins: [vertexAI()], }); export const menuSuggestionFlow = ai.defineFlow( { name: 'menuSuggestionFlow', inputSchema: z.object({ theme: z.string() }), outputSchema: z.object({ menuItem: z.string() }), streamSchema: z.string(), }, async ({ theme }, { sendChunk }) => { const { stream, response } = ai.generateStream({ model: vertexAI.model('gemini-2.5-flash'), prompt: `Invent a menu item for a ${theme} themed restaurant.`, }); for await (const chunk of stream) { sendChunk(chunk.text); } const { text } = await response; return { menuItem: text }; } ); ``` ## Add server routes [Section titled “Add server routes”](#add-server-routes) Add the following imports to `src/server.ts`: ```ts import { expressHandler } from '@genkit-ai/express'; import { menuSuggestionFlow } from './genkit/menuSuggestionFlow'; ``` Add the following line following your `app` variable initialization: ```ts app.use(express.json()); ``` Then, add a route to serve your flow: ```ts app.post('/api/menuSuggestion', expressHandler(menuSuggestionFlow)); ``` ## Call your flows from the frontend [Section titled “Call your flows from the frontend”](#call-your-flows-from-the-frontend) In your frontend code, you can now call your flows using the Genkit client library. You can use both non-streaming and streaming approaches: ### Non-streaming Flow Calls [Section titled “Non-streaming Flow Calls”](#non-streaming-flow-calls) Replace the contents of `src/app/app.component.ts` with the following: ```ts import { Component, resource, signal } from '@angular/core'; import { FormsModule } from '@angular/forms'; import { runFlow } from 'genkit/beta/client'; @Component({ selector: 'app-root', imports: [FormsModule], templateUrl: './app.component.html', }) export class AppComponent { menuInput = ''; theme = signal(''); menuResource = resource({ request: () => this.theme(), loader: ({ request }) => runFlow({ url: 'http://localhost:4200/api/menuSuggestion', input: { theme: request } }), }); } ``` Make corresponding updates to `src/app/app.component.html`: ```html
<main>
  <h3>Generate a custom menu item</h3>
  <!--
    Note: this markup is a reconstructed sketch; the original tags were lost.
    An input bound to menuInput and a button that sets the theme signal
    trigger the menuResource loader defined in the component.
  -->
  <input type="text" [(ngModel)]="menuInput" placeholder="Menu item theme" />
  <button (click)="theme.set(menuInput)">Generate</button>

  @if (menuResource.isLoading()) {
    <p>Loading...</p>
  } @else if (menuResource.value()) {
    <h4>Generated Menu Item:</h4>
    <p>{{ menuResource.value().menuItem }}</p>
  }
</main>
``` ### Streaming Flow Calls [Section titled “Streaming Flow Calls”](#streaming-flow-calls) For streaming responses, you can extend your component: ```ts import { Component, resource, signal } from '@angular/core'; import { FormsModule } from '@angular/forms'; import { runFlow, streamFlow } from 'genkit/beta/client'; @Component({ selector: 'app-root', imports: [FormsModule], templateUrl: './app.component.html', }) export class AppComponent { menuInput = ''; theme = signal(''); streamedText = signal(''); isStreaming = signal(false); menuResource = resource({ request: () => this.theme(), loader: ({ request }) => runFlow({ url: 'http://localhost:4200/api/menuSuggestion', input: { theme: request } }), }); async streamMenuItem() { const theme = this.menuInput; if (!theme) return; this.isStreaming.set(true); this.streamedText.set(''); try { const result = streamFlow({ url: 'http://localhost:4200/api/menuSuggestion', input: { theme }, }); // Process the stream chunks as they arrive for await (const chunk of result.stream) { this.streamedText.update(prev => prev + chunk); } // Get the final complete response const finalOutput = await result.output; console.log('Final output:', finalOutput); } catch (error) { console.error('Error streaming menu item:', error); } finally { this.isStreaming.set(false); } } } ``` And update the template to include streaming: ```html
<main>
  <h3>Generate a custom menu item</h3>
  <!--
    Note: this markup is a reconstructed sketch; the original tags were lost.
    One button triggers the non-streaming resource, the other calls the
    streaming method defined in the component.
  -->
  <input type="text" [(ngModel)]="menuInput" placeholder="Menu item theme" />
  <button (click)="theme.set(menuInput)">Generate</button>
  <button (click)="streamMenuItem()">Stream</button>

  @if (streamedText()) {
    <h4>Streaming Output:</h4>
    <p>{{ streamedText() }}</p>
  }
  @if (menuResource.isLoading()) {
    <p>Loading...</p>
  } @else if (menuResource.value()) {
    <h4>Generated Menu Item:</h4>
    <p>{{ menuResource.value().menuItem }}</p>
  }
  @if (isStreaming()) {
    <p>Streaming...</p>
  }
</main>
``` ## Authentication (Optional) [Section titled “Authentication (Optional)”](#authentication-optional) If you need to add authentication to your API routes, you can pass headers with your requests: ```ts menuResource = resource({ request: () => this.theme(), loader: ({ request }) => runFlow({ url: 'http://localhost:4200/api/menuSuggestion', headers: { Authorization: 'Bearer your-token-here', }, input: { theme: request } }), }); ``` ## Test your app locally [Section titled “Test your app locally”](#test-your-app-locally) If you want to run your app locally, you need to make credentials for the model API service you chose available. * Gemini (Google AI) 1. [Generate an API key](https://aistudio.google.com/app/apikey) for the Gemini API using Google AI Studio. 2. Set the `GEMINI_API_KEY` environment variable to your key: ```bash export GEMINI_API_KEY= ``` * Gemini (Vertex AI) 1. In the Cloud console, [Enable the Vertex AI API](https://console.cloud.google.com/apis/library/aiplatform.googleapis.com?project=_) for your project. 2. Set some environment variables and use the [`gcloud`](https://cloud.google.com/sdk/gcloud) tool to set up application default credentials: ```bash export GCLOUD_PROJECT= export GCLOUD_LOCATION=us-central1 gcloud auth application-default login ``` Then, run your app locally as normal: ```bash ng serve ``` All of Genkit’s development tools continue to work as normal. For example, to load your flows in the developer UI: ```bash genkit start -- npx tsx --watch src/genkit/menuSuggestionFlow.ts ``` ## Deploy your app [Section titled “Deploy your app”](#deploy-your-app) When you deploy your app, you will need to make sure the credentials for any external services you use (such as your chosen model API service) are available to the deployed app. See the following pages for information specific to your chosen deployment platform: * [Cloud Functions for Firebase](/docs/firebase) * [Cloud Run](/docs/cloud-run) * [Other Node.js platforms](/docs/deploy-node) ## Next steps [Section titled “Next steps”](#next-steps) * [Explore Genkit in a deployed Angular app](https://developers.google.com/solutions/learn/agentic-barista): Walk through a reference implementation of multiple Genkit flows powering an Angular app, and then jump into the code in Firebase Studio. # API Stability Channels > This document explains the API stability channels in Genkit, including stable and beta versions, and how to use them. As of version 1.0, Genkit is considered **Generally Available (GA)** and ready for production use. Genkit follows [semantic versioning](https://semver.org/) with breaking changes to the stable API happening only on major version releases. To gather feedback on potential new APIs and bring new features out quickly, Genkit offers a **Beta** entrypoint that includes APIs that have not yet been declared stable. The beta channel may include breaking changes on *minor* version releases. ## Using the Stable Channel [Section titled “Using the Stable Channel”](#using-the-stable-channel) To use the stable channel of Genkit, import from the standard `"genkit"` entrypoint: ```ts import { genkit, z } from "genkit"; const ai = genkit({plugins: [...]}); console.log(ai.apiStability); // "stable" ``` When you are using the stable channel, we recommend using the standard `^X.Y.Z` dependency string in your `package.json`. This is the default that is used when you run `npm install genkit`. 
## Using the Beta Channel

[Section titled “Using the Beta Channel”](#using-the-beta-channel)

To use the beta channel of Genkit, import from the `"genkit/beta"` entrypoint:

```ts
import { genkit, z } from "genkit/beta";
const ai = genkit({plugins: [...]});
console.log(ai.apiStability); // "beta"
// now beta features are available
```

When you are using the beta channel, we recommend using the `~X.Y.Z` dependency string in your `package.json`. The `~` will allow new patch versions but will not automatically upgrade to new minor versions, which may have breaking changes for beta features. You can modify your existing dependency string by changing `^` to `~` if you begin using beta features of Genkit.

### Current Features in Beta

[Section titled “Current Features in Beta”](#current-features-in-beta)

* **[Chat/Sessions](/docs/chat):** a first-class conversational `ai.chat()` feature along with persistent sessions that store both conversation history and an arbitrary state object.
* **[Interrupts](/docs/interrupts):** special tools that can pause generation for human-in-the-loop feedback, out-of-band processing, and more.

# Authorization and integrity

> This document explains how to manage authorization and integrity in Genkit applications, covering Firebase and non-Firebase HTTP authorization.

When building any public-facing application, it’s extremely important to protect the data stored in your system. When it comes to LLMs, extra diligence is necessary to ensure that the model only accesses data it should, tool calls are properly scoped to the user invoking the LLM, and the flow is invoked only by verified client applications.

Genkit provides mechanisms for managing authorization policies and contexts. Flows running on Firebase can use an auth policy callback (or helper); alternatively, Firebase also provides auth context to the flow, where it can perform its own checks. For non-Functions flows, auth can be managed and set through middleware.

## Authorize within a Flow

[Section titled “Authorize within a Flow”](#authorize-within-a-flow)

Flows can check authorization in two ways: the request binding (e.g. `onCallGenkit` for Cloud Functions for Firebase, or `express`) can enforce authorization itself, or it can pass auth information into the flow, where the flow performs its own checks.

```ts
import { genkit, z, UserFacingError } from 'genkit';

const ai = genkit({ ... });

export const selfSummaryFlow = ai.defineFlow(
  {
    name: 'selfSummaryFlow',
    inputSchema: z.object({ uid: z.string() }),
    outputSchema: z.object({ profileSummary: z.string() }),
  },
  async (input, { context }) => {
    if (!context.auth) {
      throw new UserFacingError('UNAUTHENTICATED', 'Unauthenticated');
    }
    if (input.uid !== context.auth.uid) {
      throw new UserFacingError('PERMISSION_DENIED', 'You may only summarize your own profile data.');
    }
    // Flow logic here...
    return { profileSummary: "User profile summary would go here" };
  });
```

It is up to the request binding to populate `context.auth` in this case. For example, `onCallGenkit` automatically populates `context.auth` (Firebase Authentication), `context.app` (Firebase App Check), and `context.instanceIdToken` (Firebase Cloud Messaging). When calling a flow directly, you can add your own auth context manually.

```ts
// Error: Authorization required.
await selfSummaryFlow({ uid: 'abc-def' });

// Error: You may only summarize your own profile data.
await selfSummaryFlow.run(
  { uid: 'abc-def' },
  {
    context: { auth: { uid: 'hij-klm' } },
  },
);

// Success
await selfSummaryFlow(
  { uid: 'abc-def' },
  {
    context: { auth: { uid: 'abc-def' } },
  },
);
```

When running with the Genkit Development UI, you can pass the Auth object by entering JSON in the “Auth JSON” tab: `{"uid": "abc-def"}`.

You can also retrieve the auth context for the flow at any time within the flow by calling `ai.currentContext()`, including in functions invoked by the flow:

```ts
import { genkit, z } from 'genkit';

const ai = genkit({ ... });

async function readDatabase(uid: string) {
  const auth = ai.currentContext()?.auth;
  // Note: the shape of context.auth depends on the provider. onCallGenkit puts
  // claims information in auth.token
  if (auth?.token?.admin) {
    // Do something special if the user is an admin
  } else {
    // Otherwise, use the `uid` variable to retrieve the relevant document
  }
}

export const selfSummaryFlow = ai.defineFlow(
  {
    name: 'selfSummaryFlow',
    inputSchema: z.object({ uid: z.string() }),
    outputSchema: z.object({ profileSummary: z.string() }),
    authPolicy: ...
  },
  async (input) => {
    await readDatabase(input.uid);
    return { profileSummary: "User profile summary would go here" };
  }
);
```

When testing flows with Genkit dev tools, you can specify this auth object in the UI, or on the command line with the `--context` flag:

```bash
genkit flow:run selfSummaryFlow '{"uid": "abc-def"}' --context '{"auth": {"email_verified": true}}'
```

## Authorize using Cloud Functions for Firebase

[Section titled “Authorize using Cloud Functions for Firebase”](#authorize-using-cloud-functions-for-firebase)

The Cloud Functions for Firebase SDKs support Genkit, including integration with Firebase Auth / Google Cloud Identity Platform, as well as built-in Firebase App Check support.

### User authentication

[Section titled “User authentication”](#user-authentication)

The `onCallGenkit()` wrapper provided by the Firebase Functions library has built-in support for the Cloud Functions for Firebase [client SDKs](https://firebase.google.com/docs/functions/callable?gen=2nd#call_the_function). When you use these SDKs, the Firebase Auth header is automatically included as long as your app client is also using the [Firebase Auth SDK](https://firebase.google.com/docs/auth). You can use Firebase Auth to protect your flows defined with `onCallGenkit()`:

```ts
import { genkit, z } from 'genkit';
import { onCallGenkit } from 'firebase-functions/https';

const ai = genkit({ ... });

const selfSummaryFlow = ai.defineFlow({
  name: 'selfSummaryFlow',
  inputSchema: z.object({ userQuery: z.string() }),
  outputSchema: z.object({ profileSummary: z.string() }),
}, async ({ userQuery }) => {
  // Flow logic here...
  return { profileSummary: "User profile summary based on query would go here" };
});

export const selfSummary = onCallGenkit({
  authPolicy: (auth) => auth?.token?.['email_verified'] && auth?.token?.['admin'],
}, selfSummaryFlow);
```

When you use `onCallGenkit`, `context.auth` is returned as an object with a `uid` for the user ID, and a `token` that is a [DecodedIdToken](https://firebase.google.com/docs/reference/admin/node/firebase-admin.auth.decodedidtoken). You can retrieve this object at any time using `ai.currentContext()` as noted earlier.
When running this flow during development, you would pass the user object in the same way:

```bash
genkit flow:run selfSummaryFlow '{"uid": "abc-def"}' --context '{"auth": {"admin": true}}'
```

Whenever you expose a Cloud Function to the wider internet, it is vitally important that you use some sort of authorization mechanism to protect your data and the data of your customers. With that said, there are times when you need to deploy a Cloud Function with no code-based authorization checks (for example, your Function is not world-callable but instead is protected by [Cloud IAM](https://cloud.google.com/functions/docs/concepts/iam)). Cloud Functions for Firebase lets you do this using the `invoker` property, which controls IAM access. The special value `'private'` leaves the function with the default IAM setting, which means that only callers with the [Cloud Run Invoker role](https://cloud.google.com/run/docs/reference/iam/roles) can execute the function. You can instead provide the email address of a user or service account that should be granted permission to call this exact function.

```ts
import { onCallGenkit } from 'firebase-functions/https';

const selfSummaryFlow = ai.defineFlow(
  {
    name: 'selfSummaryFlow',
    inputSchema: z.object({ userQuery: z.string() }),
    outputSchema: z.object({ profileSummary: z.string() }),
  },
  async ({ userQuery }) => {
    // Flow logic here...
    return { profileSummary: "User profile summary based on query would go here" };
  },
);

export const selfSummary = onCallGenkit(
  {
    invoker: 'private',
  },
  selfSummaryFlow,
);
```

#### Client integrity

[Section titled “Client integrity”](#client-integrity)

Authentication on its own goes a long way to protect your app. But it’s also important to ensure that only your client apps are calling your functions. The Firebase plugin for Genkit includes first-class support for [Firebase App Check](https://firebase.google.com/docs/app-check). Enable App Check enforcement by adding the following configuration options to your `onCallGenkit()`:

```ts
import { onCallGenkit } from 'firebase-functions/https';

const selfSummaryFlow = ai.defineFlow({
  name: 'selfSummaryFlow',
  inputSchema: z.object({ userQuery: z.string() }),
  outputSchema: z.object({ profileSummary: z.string() }),
}, async ({ userQuery }) => {
  // Flow logic here...
  return { profileSummary: "User profile summary based on query would go here" };
});

export const selfSummary = onCallGenkit({
  // These two fields enable App Check. The consumeAppCheckToken option is for
  // replay protection, and requires additional client configuration. See the
  // App Check docs.
  enforceAppCheck: true,
  consumeAppCheckToken: true,
  authPolicy: ...,
}, selfSummaryFlow);
```

## Non-Firebase HTTP authorization

[Section titled “Non-Firebase HTTP authorization”](#non-firebase-http-authorization)

When deploying flows to a server context outside of Cloud Functions for Firebase, you’ll want to have a way to set up your own authorization checks alongside the built-in flows. Use a `ContextProvider` to populate context values such as `auth`, and to provide a declarative policy or a policy callback. The Genkit SDK provides `ContextProvider`s such as `apiKey`, and plugins may expose them as well. For example, the `@genkit-ai/firebase/context` plugin exposes a context provider for verifying Firebase Auth credentials and populating them into context.

With code like the following, which might appear in a variety of applications:

```ts
// Express app with a simple API key
import { genkit, z } from 'genkit';

const ai = genkit({ ...
});

export const selfSummaryFlow = ai.defineFlow(
  {
    name: 'selfSummaryFlow',
    inputSchema: z.object({ uid: z.string() }),
    outputSchema: z.object({ profileSummary: z.string() }),
  },
  async (input) => {
    // Flow logic here...
    return { profileSummary: "User profile summary would go here" };
  }
);
```

You could secure a simple “flow server” Express app by writing:

```ts
import { apiKey } from 'genkit/context';
import { startFlowServer, withContextProvider } from '@genkit-ai/express';

startFlowServer({
  flows: [withContextProvider(selfSummaryFlow, apiKey(process.env.REQUIRED_API_KEY))],
});
```

Or you could build a custom Express application using the same tools:

```ts
import { apiKey } from "genkit/context";
import * as express from "express";
import { expressHandler } from "@genkit-ai/express";

const app = express();

// Capture but don't validate the API key (or its absence)
app.post('/summary', expressHandler(selfSummaryFlow, { contextProvider: apiKey() }));

app.listen(process.env.PORT, () => {
  console.log(`Listening on port ${process.env.PORT}`);
});
```

`ContextProvider`s abstract out the web framework, so these tools work in other frameworks like Next.js as well. Here is an example of a Firebase app built on Next.js.

```ts
import { appRoute } from '@genkit-ai/express';
import { firebaseContext } from '@genkit-ai/firebase';

export const POST = appRoute(selfSummaryFlow, {
  contextProvider: firebaseContext,
});
```

For more information about using Express, see the [Cloud Run](/docs/cloud-run) instructions.

# Creating persistent chat sessions

> Learn how to create persistent chat sessions in Genkit, including session basics, stateful sessions, multi-thread sessions, and session persistence.

Beta

This feature of Genkit is in **Beta**, which means it is not yet part of Genkit’s stable API. APIs of beta features may change in minor version releases.

Many of your users will have interacted with large language models for the first time through chatbots. Although LLMs are capable of much more than simulating conversations, it remains a familiar and useful style of interaction. Even when your users will not be interacting directly with the model in this way, the conversational style of prompting is a powerful way to influence the output generated by an AI model.

To support this style of interaction, Genkit provides a set of interfaces and abstractions that make it easier for you to build chat-based LLM applications.

## Before you begin

[Section titled “Before you begin”](#before-you-begin)

Before reading this page, you should be familiar with the content covered on the [Generating content with AI models](/docs/models) page.

If you want to run the code examples on this page, first complete the steps in the [Getting started](/docs/get-started) guide. All of the examples assume that you have already installed Genkit as a dependency in your project.

Note that the chat API is currently in beta and must be used from the `genkit/beta` package.

## Chat session basics

[Section titled “Chat session basics”](#chat-session-basics)

[Genkit by Example: Simple Chatbot ](https://examples.genkit.dev/chatbot-simple?utm_source=genkit.dev\&utm_content=contextlink)View a live example of a simple chatbot built with Genkit.
Here is a minimal, console-based chatbot application:

```ts
import { genkit } from 'genkit/beta';
import { googleAI } from '@genkit-ai/googleai';

import { createInterface } from 'node:readline/promises';

const ai = genkit({
  plugins: [googleAI()],
  model: googleAI.model('gemini-2.5-flash'),
});

async function main() {
  const chat = ai.chat();
  console.log("You're chatting with Gemini. Ctrl-C to quit.\n");
  const readline = createInterface(process.stdin, process.stdout);
  while (true) {
    const userInput = await readline.question('> ');
    const { text } = await chat.send(userInput);
    console.log(text);
  }
}

main();
```

A chat session with this program looks something like the following example:

```plaintext
You're chatting with Gemini. Ctrl-C to quit.

> hi
Hi there! How can I help you today?

> my name is pavel
Nice to meet you, Pavel! What can I do for you today?

> what's my name?
Your name is Pavel! I remembered it from our previous interaction.

Is there anything else I can help you with?
```

As you can see from this brief interaction, when you send a message to a chat session, the model can make use of the conversation so far in its responses. This is possible because Genkit does a few things behind the scenes:

* Retrieves the chat history, if any exists, from storage (more on persistence and storage later)
* Sends the request to the model, as with `generate()`, but automatically includes the chat history
* Saves the model response into the chat history

### Model configuration

[Section titled “Model configuration”](#model-configuration)

The `chat()` method accepts most of the same configuration options as `generate()`. To pass configuration options to the model:

```ts
const chat = ai.chat({
  model: googleAI.model('gemini-2.5-flash'),
  system:
    "You're a pirate first mate. Address the user as Captain and assist " +
    'them however you can.',
  config: {
    temperature: 1.3,
  },
});
```

## Stateful chat sessions

[Section titled “Stateful chat sessions”](#stateful-chat-sessions)

In addition to persisting a chat session’s message history, you can also persist any arbitrary JavaScript object. Doing so can let you manage state in a more structured way than relying only on information in the message history.

To include state in a session, you need to instantiate a session explicitly:

```ts
interface MyState {
  userName: string;
}

const session = ai.createSession<MyState>({
  initialState: {
    userName: 'Pavel',
  },
});
```

You can then start a chat within the session:

```ts
const chat = session.chat();
```

To modify the session state based on how the chat unfolds, define [tools](/docs/tool-calling) and include them with your requests:

```ts
const changeUserName = ai.defineTool(
  {
    name: 'changeUserName',
    description: 'can be used to change user name',
    inputSchema: z.object({
      newUserName: z.string(),
    }),
  },
  async (input) => {
    await ai.currentSession().updateState({
      userName: input.newUserName,
    });
    return `changed username to ${input.newUserName}`;
  },
);
```

```ts
const chat = session.chat({
  model: googleAI.model('gemini-2.5-flash'),
  tools: [changeUserName],
});
await chat.send('change user name to Kevin');
```

## Multi-thread sessions

[Section titled “Multi-thread sessions”](#multi-thread-sessions)

A single session can contain multiple chat threads. Each thread has its own message history, but they share a single session state.
```ts
const lawyerChat = session.chat('lawyerThread', {
  system: 'talk like a lawyer',
});
const pirateChat = session.chat('pirateThread', {
  system: 'talk like a pirate',
});
```

## Session persistence (EXPERIMENTAL)

[Section titled “Session persistence (EXPERIMENTAL)”](#session-persistence-experimental)

When you initialize a new chat or session, it’s configured by default to store the session in memory only. This is adequate when the session needs to persist only for the duration of a single invocation of your program, as in the sample chatbot from the beginning of this page. However, when integrating LLM chat into an application, you will usually deploy your content generation logic as stateless web API endpoints. For persistent chats to work under this setup, you will need to implement some kind of session storage that can persist state across invocations of your endpoints.

To add persistence to a chat session, you need to implement Genkit’s `SessionStore` interface. Here is an example implementation that saves session state to individual JSON files:

```ts
import { readFile, writeFile } from 'node:fs/promises';
import { SessionData, SessionStore } from 'genkit/beta';

class JsonSessionStore<S = any> implements SessionStore<S> {
  async get(sessionId: string): Promise<SessionData<S> | undefined> {
    try {
      const s = await readFile(`${sessionId}.json`, { encoding: 'utf8' });
      const data = JSON.parse(s);
      return data;
    } catch {
      return undefined;
    }
  }

  async save(sessionId: string, sessionData: SessionData<S>): Promise<void> {
    const s = JSON.stringify(sessionData);
    await writeFile(`${sessionId}.json`, s, { encoding: 'utf8' });
  }
}
```

This implementation is probably not adequate for practical deployments, but it illustrates that a session storage implementation only needs to accomplish two tasks:

* Get a session object from storage using its session ID
* Save a given session object, indexed by its session ID

Once you’ve implemented the interface for your storage backend, pass an instance of your implementation to the session constructors:

```ts
// To create a new session:
const session = ai.createSession({
  store: new JsonSessionStore(),
});

// Save session.id so you can restore the session the next time the
// user makes a request.
```

```ts
// If the user has a session ID saved, load the session instead of creating
// a new one:
const session = await ai.loadSession(sessionId, {
  store: new JsonSessionStore(),
});
```

# Accessing flows from the client

> Learn how to access Genkit flows from client-side applications.

There are two primary ways to access Genkit flows from client-side applications:

* Using the Genkit client library
* Cloud Functions for Firebase callable function client SDK

## Using the Genkit client library

[Section titled “Using the Genkit client library”](#using-the-genkit-client-library)

You can call your deployed flows using the Genkit client library. This library provides functions for both non-streaming and streaming flow calls. See “Call your flows from the client” in [Deploy flows to any Node.js platform](/docs/deploy-node) for more details.

### Non-streaming Flow Calls

[Section titled “Non-streaming Flow Calls”](#non-streaming-flow-calls)

For a non-streaming response, use the `runFlow` function. This is suitable for flows that return a single, complete output.
```typescript
import { runFlow } from 'genkit/beta/client';

async function callHelloFlow() {
  try {
    const result = await runFlow({
      url: 'http://127.0.0.1:3400/helloFlow', // Replace with your deployed flow's URL
      input: { name: 'Genkit User' },
    });
    console.log('Non-streaming result:', result.greeting);
  } catch (error) {
    console.error('Error calling helloFlow:', error);
  }
}

callHelloFlow();
```

### Streaming Flow Calls

[Section titled “Streaming Flow Calls”](#streaming-flow-calls)

For flows that are designed to stream responses (e.g., for real-time updates or long-running operations), use the `streamFlow` function.

```typescript
import { streamFlow } from 'genkit/beta/client';

async function streamHelloFlow() {
  try {
    const result = streamFlow({
      url: 'http://127.0.0.1:3400/helloFlow', // Replace with your deployed flow's URL
      input: { name: 'Streaming User' },
    });

    // Process the stream chunks as they arrive
    for await (const chunk of result.stream) {
      console.log('Stream chunk:', chunk);
    }

    // Get the final complete response
    const finalOutput = await result.output;
    console.log('Final streaming output:', finalOutput.greeting);
  } catch (error) {
    console.error('Error streaming helloFlow:', error);
  }
}

streamHelloFlow();
```

### Authentication (Optional)

[Section titled “Authentication (Optional)”](#authentication-optional)

If your deployed flow requires authentication, you can pass headers with your requests:

```typescript
const result = await runFlow({
  url: 'http://127.0.0.1:3400/helloFlow', // Replace with your deployed flow's URL
  headers: {
    Authorization: 'Bearer your-token-here', // Replace with your actual token
  },
  input: { name: 'Authenticated User' },
});
```

## When deploying to Cloud Functions for Firebase

[Section titled “When deploying to Cloud Functions for Firebase”](#when-deploying-to-cloud-functions-for-firebase)

When deploying to [Cloud Functions for Firebase](/docs/firebase), use the Firebase callable functions client library. Detailed documentation can be found in the Cloud Functions for Firebase docs. Here’s a sample for the web:

```typescript
// Get the callable by passing an initialized functions SDK.
const getForecast = httpsCallable(functions, "getForecast");

// Call the function with the `.stream()` method to start streaming.
const { stream, data } = await getForecast.stream({
  locations: favoriteLocations,
});

// The `stream` async iterable returned by `.stream()`
// will yield a new value every time the callable
// function calls `sendChunk()`.
for await (const forecastDataChunk of stream) {
  // update the UI every time a new chunk is received
  // from the callable function
  updateUi(forecastDataChunk);
}

// The `data` promise resolves when the callable
// function completes.
const allWeatherForecasts = await data;
finalizeUi(allWeatherForecasts);
```

[source](https://github.com/firebase/functions-samples/blob/c4fde45b65fab584715e786ce3264a6932d996ec/Node/quickstarts/callable-functions-streaming/website/index.html#L58-L78)

# Deploy flows using Cloud Run

> This document explains how to deploy Genkit flows as HTTPS endpoints using Google Cloud Run, covering project setup, deployment preparation, and authorization.

You can deploy Genkit flows as HTTPS endpoints using Cloud Run. Cloud Run has several deployment options, including container-based deployment; this page explains how to deploy your flows directly from code.

## Before you begin

[Section titled “Before you begin”](#before-you-begin)

* Install the [Google Cloud CLI](https://cloud.google.com/sdk/docs/install).
* You should be familiar with Genkit’s concept of [flows](/docs/flows), and how to write them. This page assumes that you already have flows that you want to deploy.
* It would be helpful, but not required, if you’ve already used Google Cloud and Cloud Run before.

## 1. Set up a Google Cloud project

[Section titled “1. Set up a Google Cloud project”](#1-set-up-a-google-cloud-project)

If you don’t already have a Google Cloud project set up, follow these steps:

1. Create a new Google Cloud project using the [Cloud console](https://console.cloud.google.com) or choose an existing one.

2. Link the project to a billing account, which is required for Cloud Run.

3. Configure the Google Cloud CLI to use your project:

   ```bash
   gcloud init
   ```

## 2. Prepare your Node project for deployment

[Section titled “2. Prepare your Node project for deployment”](#2-prepare-your-node-project-for-deployment)

For your flows to be deployable, you will need to make some small changes to your project code:

### Add start and build scripts to package.json

[Section titled “Add start and build scripts to package.json”](#add-start-and-build-scripts-to-packagejson)

When deploying a Node.js project to Cloud Run, the deployment tools expect your project to have a `start` script and, optionally, a `build` script. For a typical TypeScript project, the following scripts are usually adequate:

```json
"scripts": {
  "start": "node lib/index.js",
  "build": "tsc"
},
```

### Add code to configure and start the flow server

[Section titled “Add code to configure and start the flow server”](#add-code-to-configure-and-start-the-flow-server)

In the file that’s run by your `start` script, add a call to `startFlowServer`. This method will start an Express server set up to serve your flows as web endpoints. When you make the call, specify the flows you want to serve:

```ts
import { startFlowServer } from '@genkit-ai/express';

startFlowServer({
  flows: [menuSuggestionFlow],
});
```

There are also some optional parameters you can specify:

* `port`: the network port to listen on. If unspecified, the server listens on the port defined in the PORT environment variable, and if PORT is not set, defaults to 3400.
* `cors`: the flow server’s [CORS policy](https://www.npmjs.com/package/cors#configuration-options). If you will be accessing these endpoints from a web application, you likely need to specify this.
* `pathPrefix`: an optional path prefix to add before your flow endpoints.
* `jsonParserOptions`: options to pass to Express’s [JSON body parser](https://www.npmjs.com/package/body-parser#bodyparserjsonoptions)

### Optional: Define an authorization policy

[Section titled “Optional: Define an authorization policy”](#optional-define-an-authorization-policy)

All deployed flows should require some form of authorization; otherwise, your potentially-expensive generative AI flows would be invocable by anyone.

When you deploy your flows with Cloud Run, you have two options for authorization:

* **Cloud IAM-based authorization**: Use Google Cloud’s native access management facilities to gate access to your endpoints. For information on providing these credentials, see [Authentication](https://cloud.google.com/run/docs/authenticating/overview) in the Cloud Run docs.
* **Authorization policy defined in code**: Use the authorization policy feature of the Genkit express plugin to verify authorization info using custom code. This is often, but not necessarily, token-based authorization.
If you want to define an authorization policy in code, use the `authPolicy` parameter in the flow definition: ```ts // middleware for handling auth tokens in headers. const authMiddleware = async (req, resp, next) => { // parse auth headers and convert to auth object. (req as RequestWithAuth).auth = { user: await verifyAuthToken(req.header('authorization')), }; next(); }; app.post( '/simpleFlow', authMiddleware, expressHandler(simpleFlow, { authPolicy: ({ auth }) => { if (!auth.user) { throw new Error('not authorized'); } }, }), ); ``` The `auth` parameter of the authorization policy comes from the `auth` property of the request object. You typically set this property using Express middleware. See [Authorization and integrity](/docs/auth#non-firebase-http-authorization). Refer to [express plugin documentation](https://js.api.genkit.dev/modules/_genkit-ai_express.html) for more details. ### Make API credentials available to deployed flows [Section titled “Make API credentials available to deployed flows”](#make-api-credentials-available-to-deployed-flows) Once deployed, your flows need some way to authenticate with any remote services they rely on. Most flows will at a minimum need credentials for accessing the model API service they use. For this example, do one of the following, depending on the model provider you chose: * Gemini (Google AI) 1. [Generate an API key](https://aistudio.google.com/app/apikey) for the Gemini API using Google AI Studio. 2. Make the API key available in the Cloud Run environment: 1. In the Cloud console, enable the [Secret Manager API](https://console.cloud.google.com/apis/library/secretmanager.googleapis.com?project=_). 2. On the [Secret Manager](https://console.cloud.google.com/security/secret-manager?project=_) page, create a new secret containing your API key. 3. After you create the secret, on the same page, grant your default compute service account access to the secret with the **Secret Manager Secret Accessor** role. (You can look up the name of the default compute service account on the IAM page.) In a later step, when you deploy your service, you will need to reference the name of this secret. * Gemini (Vertex AI) 1. In the Cloud console, [Enable the Vertex AI API](https://console.cloud.google.com/apis/library/aiplatform.googleapis.com?project=_) for your project. 2. On the [IAM](https://console.cloud.google.com/iam-admin/iam?project=_) page, ensure that the **Default compute service account** is granted the **Vertex AI User** role. The only secret you need to set up for this tutorial is for the model provider, but in general, you must do something similar for each service your flow uses. ## 3. Deploy flows to Cloud Run [Section titled “3. Deploy flows to Cloud Run”](#3-deploy-flows-to-cloud-run) After you’ve prepared your project for deployment, you can deploy it using the `gcloud` tool. * Gemini (Google AI) ```bash gcloud run deploy --update-secrets=GEMINI_API_KEY=:latest ``` * Gemini (Vertex AI) ```bash gcloud run deploy ``` The deployment tool will prompt you for any information it requires. When asked if you want to allow unauthenticated invocations: * Answer `Y` if you’re not using IAM and have instead defined an authorization policy in code. * Answer `N` to configure your service to require IAM credentials. ## Optional: Try the deployed flow [Section titled “Optional: Try the deployed flow”](#optional-try-the-deployed-flow) After deployment finishes, the tool will print the service URL. 
You can test it with `curl`: ```bash curl -X POST https:///menuSuggestionFlow \ -H "Authorization: Bearer $(gcloud auth print-identity-token)" \ -H "Content-Type: application/json" -d '{"data": "banana"}' ``` # Passing information through context > Learn how Genkit's context object propagates generation and execution information throughout your application, making it available to flows, tools, and prompts. [Genkit by Example: Action Context ](https://examples.genkit.dev/action-context?utm_source=genkit.dev\&utm_content=contextlink)See how action context can guide and secure workflows in a live demo. There are different categories of information that a developer working with an LLM may be handling simultaneously: * **Input:** Information that is directly relevant to guide the LLM’s response for a particular call. An example of this is the text that needs to be summarized. * **Generation Context:** Information that is relevant to the LLM, but isn’t specific to the call. An example of this is the current time or a user’s name. * **Execution Context:** Information that is important to the code surrounding the LLM call but not to the LLM itself. An example of this is a user’s current auth token. Genkit provides a consistent `context` object that can propagate generation and execution context throughout the process. This context is made available to all actions including [flows](/docs/flows), [tools](/docs/tool-calling), and [prompts](/docs/dotprompt). Context is automatically propagated to all actions called within the scope of execution: Context passed to a flow is made available to prompts executed within the flow. Context passed to the `generate()` method is available to tools called within the generation loop. ## Why is context important? [Section titled “Why is context important?”](#why-is-context-important) As a best practice, you should provide the minimum amount of information to the LLM that it needs to complete a task. This is important for multiple reasons: * The less extraneous information the LLM has, the more likely it is to perform well at its task. * If an LLM needs to pass around information like user or account IDs to tools, it can potentially be tricked into leaking information. Context gives you a side channel of information that can be used by any of your code but doesn’t necessarily have to be sent to the LLM. As an example, it can allow you to restrict tool queries to the current user’s available scope. ## Context structure [Section titled “Context structure”](#context-structure) Context must be an object, but its properties are yours to decide. In some situations Genkit automatically populates context. For example, when using [persistent sessions](/docs/chat) the `state` property is automatically added to context. One of the most common uses of context is to store information about the current user. We recommend adding auth context in the following format: ```js { auth: { uid: "...", // the user's unique identifier token: {...}, // the decoded claims of a user's id token rawToken: "...", // the user's raw encoded id token // ...any other fields } } ``` The context object can store any information that you might need to know somewhere else in the flow of execution. 
## Use context in an action [Section titled “Use context in an action”](#use-context-in-an-action) To use context within an action, you can access the context helper that is automatically supplied to your function definition: * Flow ```ts const summarizeHistory = ai.defineFlow({ name: 'summarizeMessages', inputSchema: z.object({friendUid: z.string()}), outputSchema: z.string() }, async ({friendUid}, {context}) => { if (!context.auth?.uid) throw new Error("Must supply auth context."); const messages = await listMessagesBetween(friendUid, context.auth.uid); const {text} = await ai.generate({ prompt: `Summarize the content of these messages: ${JSON.stringify(messages)}`, }); return text; }); ``` * Tool ```ts const searchNotes = ai.defineTool({ name: 'searchNotes', description: "search the current user's notes for info", inputSchema: z.object({query: z.string()}), outputSchema: z.array(NoteSchema) }, async ({query}, {context}) => { if (!context.auth?.uid) throw new Error("Must be called by a signed-in user."); return searchUserNotes(context.auth.uid, query); }); ``` * Prompt file When using [Dotprompt templates](/docs/dotprompt), context is made available with the `@` variable prefix. For example, a context object of `{auth: {name: 'Michael'}}` could be accessed in the prompt template like so. ```dotprompt --- input: schema: pirateStyle?: boolean --- {{#if pirateStyle}}Avast, {{@auth.name}}, how be ye today?{{else}}Hello, {{@auth.name}}, how are you today?{{/if}} ``` ## Provide context at runtime [Section titled “Provide context at runtime”](#provide-context-at-runtime) To provide context to an action, you pass the context object as an option when calling the action. * Flows ```ts const summarizeHistory = ai.defineFlow(/* ... */); const summary = await summarizeHistory(friend.uid, { context: { auth: currentUser }, }); ``` * Generation ```ts const { text } = await ai.generate({ prompt: "Find references to ocelots in my notes.", // the context will propagate to tool calls tools: [searchNotes], context: { auth: currentUser }, }); ``` * Prompts ```ts const helloPrompt = ai.prompt("sayHello"); helloPrompt({ pirateStyle: true }, { context: { auth: currentUser } }); ``` ## Context propagation and overrides [Section titled “Context propagation and overrides”](#context-propagation-and-overrides) By default, when you provide context it is automatically propagated to all actions called as a result of your original call. If your flow calls other flows, or your generation calls tools, the same context is provided. If you wish to override context within an action, you can pass a different context object to replace the existing one: ```ts const otherFlow = ai.defineFlow(/* ... */); const myFlow = ai.defineFlow( { // ... }, (input, { context }) => { // override the existing context completely otherFlow( { /*...*/ }, { context: { newContext: true } }, ); // or selectively override otherFlow( { /*...*/ }, { context: { ...context, updatedContext: true } }, ); }, ); ``` When context is replaced, it propagates the same way. In this example, any actions that `otherFlow` called during its execution would inherit the overridden context. # Deploy flows to any Node.js platform > Learn how to deploy Genkit flows to any Node.js platform that can serve an Express.js application, including project setup, configuration, and client-side access. 
Genkit has built-in integrations that help you deploy your flows to Cloud Functions for Firebase and Google Cloud Run, but you can also deploy your flows to any platform that can serve an Express.js app, whether it’s a cloud service or self-hosted. This page, as an example, walks you through the process of deploying the default sample flow.

## Before you begin

[Section titled “Before you begin”](#before-you-begin)

* Node.js 20+: Confirm that your environment is using Node.js version 20 or higher (`node --version`).
* You should be familiar with Genkit’s concept of [flows](/docs/flows).

## 1. Set up your project

[Section titled “1. Set up your project”](#1-set-up-your-project)

1. **Create a directory for the project:**

   ```bash
   export GENKIT_PROJECT_HOME=~/tmp/genkit-express-project
   mkdir -p $GENKIT_PROJECT_HOME
   cd $GENKIT_PROJECT_HOME
   mkdir src
   ```

2. **Initialize a Node.js project:**

   ```bash
   npm init -y
   ```

3. **Install Genkit and necessary dependencies:**

   ```bash
   npm install --save genkit @genkit-ai/googleai @genkit-ai/express
   npm install --save-dev typescript tsx
   npm install -g genkit-cli
   ```

## 2. Configure your Genkit app

[Section titled “2. Configure your Genkit app”](#2-configure-your-genkit-app)

1. **Set up a sample flow and server:** In `src/index.ts`, define a sample flow and configure the flow server:

   ```typescript
   import { genkit, z } from 'genkit';
   import { googleAI } from '@genkit-ai/googleai';
   import { startFlowServer } from '@genkit-ai/express';

   const ai = genkit({
     plugins: [googleAI()],
     model: googleAI.model('gemini-2.5-flash'),
   });

   const helloFlow = ai.defineFlow(
     {
       name: 'helloFlow',
       inputSchema: z.object({ name: z.string() }),
       outputSchema: z.object({ greeting: z.string() }),
     },
     async (input) => {
       const { text } = await ai.generate(`Say hello to ${input.name}`);
       return { greeting: text };
     },
   );

   startFlowServer({
     flows: [helloFlow],
   });
   ```

   There are also some optional parameters for `startFlowServer` you can specify:

   * `port`: the network port to listen on. If unspecified, the server listens on the port defined in the PORT environment variable, and if PORT is not set, defaults to 3400.
   * `cors`: the flow server’s [CORS policy](https://www.npmjs.com/package/cors#configuration-options). If you will be accessing these endpoints from a web application, you likely need to specify this.
   * `pathPrefix`: an optional path prefix to add before your flow endpoints.
   * `jsonParserOptions`: options to pass to Express’s [JSON body parser](https://www.npmjs.com/package/body-parser#bodyparserjsonoptions)

2. **Set up model provider credentials:** Configure the required environment variables for your model provider. This guide uses the Gemini API from Google AI Studio as an example.

   [Get an API key from Google AI Studio](https://makersuite.google.com/app/apikey)

   After you’ve created an API key, set the `GEMINI_API_KEY` environment variable to your key with the following command:

   ```bash
   export GEMINI_API_KEY=<your API key>
   ```

   Different providers for deployment will have different ways of securing your API key in their environment. For security, ensure that your API key is not publicly exposed.

## 3. Prepare your Node.js project for deployment

[Section titled “3. Prepare your Node.js project for deployment”](#3-prepare-your-nodejs-project-for-deployment)

### Add start and build scripts to `package.json`

[Section titled “Add start and build scripts to package.json”](#add-start-and-build-scripts-to-packagejson)

To deploy a Node.js project, define `start` and `build` scripts in `package.json`.
For a TypeScript project, these scripts will look like this: ```json "scripts": { "start": "node --watch lib/index.js", "build": "tsc" }, ``` ### Build and test locally [Section titled “Build and test locally”](#build-and-test-locally) Run the build command, then start the server and test it locally to confirm it works as expected. ```bash npm run build npm start ``` In another terminal window, test the endpoint: ```bash curl -X POST "http://127.0.0.1:3400/helloFlow" \ -H "Content-Type: application/json" \ -d '{"data": {"name": "Genkit"}}' ``` ## Optional: Start the Developer UI [Section titled “Optional: Start the Developer UI”](#optional-start-the-developer-ui) You can use the Developer UI to test flows interactively during development: ```bash genkit start -- npm run start ``` Navigate to `http://localhost:4000/flows` to test your flows in the UI. ## 4. Deploy the project [Section titled “4. Deploy the project”](#4-deploy-the-project) Once your project is configured and tested locally, you can deploy to any Node.js-compatible platform. Deployment steps vary by provider, but generally, you configure the following settings: | Setting | Value | | ------------------------- | ---------------------------------------------------------------- | | **Runtime** | Node.js 20 or newer | | **Build command** | `npm run build` | | **Start command** | `npm start` | | **Environment variables** | Set `GEMINI_API_KEY=` and other necessary secrets. | The `start` command (`npm start`) should point to your compiled entry point, typically `lib/index.js`. Be sure to add all necessary environment variables for your deployment platform. After deploying, you can use the provided service URL to invoke your flow as an HTTPS endpoint. ## Call your flows from the client [Section titled “Call your flows from the client”](#call-your-flows-from-the-client) In your client-side code (e.g., a web application, mobile app, or another service), you can call your deployed flows using the Genkit client library. This library provides functions for both non-streaming and streaming flow calls. First, install the Genkit library: ```bash npm install genkit ``` Then, you can use `runFlow` for non-streaming calls and `streamFlow` for streaming calls. ### Non-streaming Flow Calls [Section titled “Non-streaming Flow Calls”](#non-streaming-flow-calls) For a non-streaming response, use the `runFlow` function. This is suitable for flows that return a single, complete output. ```typescript import { runFlow } from 'genkit/beta/client'; async function callHelloFlow() { try { const result = await runFlow({ url: 'http://127.0.0.1:3400/helloFlow', // Replace with your deployed flow's URL input: { name: 'Genkit User' }, }); console.log('Non-streaming result:', result.greeting); } catch (error) { console.error('Error calling helloFlow:', error); } } callHelloFlow(); ``` ### Streaming Flow Calls [Section titled “Streaming Flow Calls”](#streaming-flow-calls) For flows that are designed to stream responses (e.g., for real-time updates or long-running operations), use the `streamFlow` function. 
```typescript import { streamFlow } from 'genkit/beta/client'; async function streamHelloFlow() { try { const result = streamFlow({ url: 'http://127.0.0.1:3400/helloFlow', // Replace with your deployed flow's URL input: { name: 'Streaming User' }, }); // Process the stream chunks as they arrive for await (const chunk of result.stream) { console.log('Stream chunk:', chunk); } // Get the final complete response const finalOutput = await result.output; console.log('Final streaming output:', finalOutput.greeting); } catch (error) { console.error('Error streaming helloFlow:', error); } } streamHelloFlow(); ``` ### Authentication (Optional) [Section titled “Authentication (Optional)”](#authentication-optional) If your deployed flow requires authentication, you can pass headers with your requests: ```typescript const result = await runFlow({ url: 'http://127.0.0.1:3400/helloFlow', // Replace with your deployed flow's URL headers: { Authorization: 'Bearer your-token-here', // Replace with your actual token }, input: { name: 'Authenticated User' }, }); ``` # Genkit Developer Tools > Explore Genkit's developer tools, including the Node.js CLI for command-line operations and the local web-based Developer UI for interactive testing and development. Genkit provides two key developer tools: * A Node.js CLI for command-line operations * An optional local web app, called the Developer UI, that interfaces with your Genkit configuration for interactive testing and development ### Command Line Interface (CLI) [Section titled “Command Line Interface (CLI)”](#command-line-interface-cli) Install the CLI globally using: ```bash npm install -g genkit-cli ``` The CLI supports various commands to facilitate working with Genkit projects: * `genkit start -- <command to run your code>`: Start the developer UI and connect it to a running code process. * `genkit flow:run <flowName>`: Run a specified flow. Your runtime must already be running in a separate terminal with the `GENKIT_ENV=dev` environment variable set. * `genkit eval:flow <flowName>`: Evaluate a specific flow. Your runtime must already be running in a separate terminal with the `GENKIT_ENV=dev` environment variable set. For a full list of commands, use: ```bash genkit --help ``` ### Genkit Developer UI [Section titled “Genkit Developer UI”](#genkit-developer-ui) The Genkit Developer UI is a local web app that lets you interactively work with models, flows, prompts, and other elements in your Genkit project. The Developer UI is able to identify what Genkit components you have defined in your code by attaching to a running code process. To start the UI, run the following command: ```bash genkit start -- <command to run your code> ``` The `<command to run your code>` will vary based on your project’s setup and the file you want to execute. Here are some examples: ```bash # Running a typical development server genkit start -- npm run dev # Running a TypeScript file directly genkit start -- npx tsx --watch src/index.ts # Running a JavaScript file directly genkit start -- node --watch src/index.js ``` Including the `--watch` option will enable the Developer UI to notice and reflect saved changes to your code without needing to restart it. After running the command, you will get an output like the following: ```bash Telemetry API running on http://localhost:4033 Genkit Developer UI: http://localhost:4000 ``` Open the local host address for the Genkit Developer UI in your browser to view it. You can also open it in the VS Code simple browser to view it alongside your code.
Alternatively, you can add the `-o` option to the start command to automatically open the Developer UI in your default browser tab. ```plaintext genkit start -o -- <command to run your code> ``` ![Genkit Developer UI](/_astro/genkit_dev_ui_home.CelZYnmn_Z1TMqFE.webp) The Developer UI has action runners for `flow`, `prompt`, `model`, `tool`, `retriever`, `indexer`, `embedder` and `evaluator` based on the components you have defined in your code. Here’s a quick gif tour with cats. ![Genkit Developer UI Overview](/genkit_developer_ui_overview.gif) ### Analytics [Section titled “Analytics”](#analytics) The Genkit CLI and Developer UI use cookies and similar technologies from Google to deliver and enhance the quality of its services and to analyze usage. [Learn more](https://policies.google.com/technologies/cookies). To opt out of analytics, you can run the following command: ```bash genkit config set analyticsOptOut true ``` You can view the current setting by running: ```bash genkit config get analyticsOptOut ``` # Managing prompts with Dotprompt > This document explains how to manage prompts using Dotprompt, a Genkit library and file format designed to streamline prompt engineering and iteration. Prompt engineering is the primary way that you, as an app developer, influence the output of generative AI models. For example, when using LLMs, you can craft prompts that influence the tone, format, length, and other characteristics of the models’ responses. The way you write these prompts will depend on the model you’re using; a prompt written for one model might not perform well when used with another model. Similarly, the model parameters you set (temperature, top-k, and so on) will also affect output differently depending on the model. Getting all three of these factors—the model, the model parameters, and the prompt—working together to produce the output you want is rarely a trivial process and often involves substantial iteration and experimentation. Genkit provides a library and file format called Dotprompt that aims to make this iteration faster and more convenient. [Dotprompt](https://github.com/google/dotprompt) is designed around the premise that **prompts are code**. You define your prompts along with the models and model parameters they’re intended for separately from your application code. Then, you (or perhaps someone not even involved with writing application code) can rapidly iterate on the prompts and model parameters using the Genkit Developer UI. Once your prompts are working the way you want, you can import them into your application and run them using Genkit. Your prompt definitions each go in a file with a `.prompt` extension. Here’s an example of what these files look like: ```dotprompt --- model: googleai/gemini-2.5-flash config: temperature: 0.9 input: schema: location: string style?: string name?: string default: location: a restaurant --- You are the world's most welcoming AI assistant and are currently working at {{location}}. Greet a guest{{#if name}} named {{name}}{{/if}}{{#if style}} in the style of {{style}}{{/if}}. ``` The portion in the triple-dashes is YAML front matter, similar to the front matter format used by GitHub Markdown and Jekyll; the rest of the file is the prompt, which can optionally use [Handlebars](https://handlebarsjs.com/guide/) templates. The following sections will go into more detail about each of the parts that make up a `.prompt` file and how to use them.
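To give a sense of the full workflow before diving into the details, here is a minimal sketch of how a prompt file like the one above could be loaded and run from application code. The file name `welcome.prompt` and the input values are illustrative; loading and running prompts is covered in depth later on this page.

```ts
import { genkit } from 'genkit';
import { googleAI } from '@genkit-ai/googleai';

const ai = genkit({ plugins: [googleAI()] });

// Load prompts/welcome.prompt by its file name, without the extension.
const welcomePrompt = ai.prompt('welcome');

// Call the prompt like a function, passing values for its input schema.
const { text } = await welcomePrompt({
  location: 'a beachside taco stand',
  style: 'a pirate',
});
console.log(text);
```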
## Before you begin [Section titled “Before you begin”](#before-you-begin) Before reading this page, you should be familiar with the content covered on the [Generating content with AI models](/docs/models) page. If you want to run the code examples on this page, first complete the steps in the [Get started](/docs/get-started) guide. All of the examples assume that you have already installed Genkit as a dependency in your project. ## Creating prompt files [Section titled “Creating prompt files”](#creating-prompt-files) Although Dotprompt provides several [different ways](#defining-prompts-in-code) to create and load prompts, it’s optimized for projects that organize their prompts as `.prompt` files within a single directory (or subdirectories thereof). This section shows you how to create and load prompts using this recommended setup. ### Creating a prompt directory [Section titled “Creating a prompt directory”](#creating-a-prompt-directory) The Dotprompt library expects to find your prompts in a directory at your project root and automatically loads any prompts it finds there. By default, this directory is named `prompts`. For example, using the default directory name, your project structure might look something like this: ```plaintext your-project/ ├── lib/ ├── node_modules/ ├── prompts/ │ └── hello.prompt ├── src/ ├── package-lock.json ├── package.json └── tsconfig.json ``` If you want to use a different directory, you can specify it when you configure Genkit: ```ts const ai = genkit({ promptDir: './llm_prompts', // (Other settings...) }); ``` ### Creating a prompt file [Section titled “Creating a prompt file”](#creating-a-prompt-file) There are two ways to create a `.prompt` file: using a text editor, or with the developer UI. #### Using a text editor [Section titled “Using a text editor”](#using-a-text-editor) If you want to create a prompt file using a text editor, create a text file with the `.prompt` extension in your prompts directory: for example, `prompts/hello.prompt`. Here is a minimal example of a prompt file: ```dotprompt --- model: vertexai/gemini-2.5-flash --- You are the world's most welcoming AI assistant. Greet the user and offer your assistance. ``` The portion in the dashes is YAML front matter, similar to the front matter format used by GitHub markdown and Jekyll; the rest of the file is the prompt, which can optionally use Handlebars templates. The front matter section is optional, but most prompt files will at least contain metadata specifying a model. The remainder of this page shows you how to go beyond this, and make use of Dotprompt’s features in your prompt files. #### Using the developer UI [Section titled “Using the developer UI”](#using-the-developer-ui) You can also create a prompt file using the model runner in the developer UI. Start with application code that imports the Genkit library and configures it to use the model plugin you’re interested in. For example: ```ts import { genkit } from 'genkit'; // Import the model plugins you want to use. import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ // Initialize and configure the model plugins. plugins: [ googleAI({ apiKey: 'your-api-key', // Or (preferred): export GEMINI_API_KEY=... }), ], }); ``` It’s okay if the file contains other code, but the above is all that’s required. Load the developer UI in the same project: ```bash genkit start -- tsx --watch src/your-code.ts ``` In the Models section, choose the model you want to use from the list of models provided by the plugin. 
![Genkit Developer UI Model Runner](/_astro/developer_ui_model_runner.cHO4a-_l_Z1Vv7kN.webp) Then, experiment with the prompt and configuration until you get results you’re happy with. When you’re ready, press the Export button and save the file to your prompts directory. ## Running prompts [Section titled “Running prompts”](#running-prompts) After you’ve created prompt files, you can run them from your application code, or using the tooling provided by Genkit. Regardless of how you want to run your prompts, first start with application code that imports the Genkit library and the model plugins you’re interested in. For example: ```ts import { genkit } from 'genkit'; // Import the model plugins you want to use. import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ // Initialize and configure the model plugins. plugins: [ googleAI({ apiKey: 'your-api-key', // Or (preferred): export GEMINI_API_KEY=... }), ], }); ``` It’s okay if the file contains other code, but the above is all that’s required. If you’re storing your prompts in a directory other than the default, be sure to specify it when you configure Genkit. ### Run prompts from code [Section titled “Run prompts from code”](#run-prompts-from-code) To use a prompt, first load it using the `prompt('file_name')` method: ```ts const helloPrompt = ai.prompt('hello'); ``` Once loaded, you can call the prompt like a function: ```ts const response = await helloPrompt(); // Alternatively, use destructuring assignments to get only the properties // you're interested in: const { text } = await helloPrompt(); ``` Or you can also run the prompt in streaming mode: ```ts const { response, stream } = helloPrompt.stream(); for await (const chunk of stream) { console.log(chunk.text); } // optional final (aggregated) response console.log((await response).text); ``` A callable prompt takes two optional parameters: the input to the prompt (see the section below on [specifying input schemas](#input-and-output-schemas)), and a configuration object, similar to that of the `generate()` method. For example: ```ts const response2 = await helloPrompt( // Prompt input: { name: 'Ted' }, // Generation options: { config: { temperature: 0.4, }, }, ); ``` Similarly for streaming: ```ts const { stream } = helloPrompt.stream(input, options); ``` Any parameters you pass to the prompt call will override the same parameters specified in the prompt file. See [Generate content with AI models](/docs/models) for descriptions of the available options. ### Using the developer UI [Section titled “Using the developer UI”](#using-the-developer-ui-1) As you’re refining your app’s prompts, you can run them in the Genkit developer UI to quickly iterate on prompts and model configurations, independently from your application code. Load the developer UI from your project directory: ```bash genkit start -- tsx --watch src/your-code.ts ``` ![Genkit Developer UI Model Runner](/_astro/prompts-in-developer-ui.LmFDtByL_ZBrbGw.webp) Once you’ve loaded prompts into the developer UI, you can run them with different input values, and experiment with how changes to the prompt wording or the configuration parameters affect the model output. When you’re happy with the result, you can click the **Export prompt** button to save the modified prompt back into your project directory. 
## Model configuration [Section titled “Model configuration”](#model-configuration) In the front matter block of your prompt files, you can optionally specify model configuration values for your prompt: ```dotprompt --- model: googleai/gemini-2.5-flash config: temperature: 1.4 topK: 50 topP: 0.4 maxOutputTokens: 400 stopSequences: - "<end>" - "<fin>" --- ``` These values map directly to the `config` parameter accepted by the callable prompt: ```ts const response3 = await helloPrompt( {}, { config: { temperature: 1.4, topK: 50, topP: 0.4, maxOutputTokens: 400, stopSequences: ['<end>', '<fin>'], }, }, ); ``` See [Generate content with AI models](/docs/models) for descriptions of the available options. ## Input and output schemas [Section titled “Input and output schemas”](#input-and-output-schemas) You can specify input and output schemas for your prompt by defining them in the front matter section: ```dotprompt --- model: googleai/gemini-2.5-flash input: schema: theme?: string default: theme: "pirate" output: schema: dishname: string description: string calories: integer allergens(array): string --- Invent a menu item for a {{theme}} themed restaurant. ``` These schemas are used in much the same way as those passed to a `generate()` request or a flow definition. For example, the prompt defined above produces structured output: ```ts const menuPrompt = ai.prompt('menu'); const { output } = await menuPrompt({ theme: 'medieval' }); const dishName = output['dishname']; const description = output['description']; ``` You have several options for defining schemas in a `.prompt` file: Dotprompt’s own schema definition format, Picoschema; standard JSON Schema; or references to schemas defined in your application code. The following sections describe each of these options in more detail. ### Picoschema [Section titled “Picoschema”](#picoschema) The schemas in the example above are defined in a format called Picoschema. Picoschema is a compact, YAML-optimized schema definition format that makes it easy to define the most important attributes of a schema for LLM usage. Here’s a longer example of a schema, which specifies the information an app might store about an article: ```yaml schema: title: string # string, number, and boolean types are defined like this subtitle?: string # optional fields are marked with a `?` draft?: boolean, true when in draft state status?(enum, approval status): [PENDING, APPROVED] date: string, the date of publication e.g. '2024-04-09' # descriptions follow a comma tags(array, relevant tags for article): string # arrays are denoted via parentheses authors(array): name: string email?: string metadata?(object): # objects are also denoted via parentheses updatedAt?: string, ISO timestamp of last update approvedBy?: integer, id of approver extra?: any, arbitrary extra data (*): string, wildcard field ``` The above schema is equivalent to the following TypeScript interface: ```ts interface Article { title: string; subtitle?: string | null; /** true when in draft state */ draft?: boolean | null; /** approval status */ status?: 'PENDING' | 'APPROVED' | null; /** the date of publication e.g.
'2024-04-09' */ date: string; /** relevant tags for article */ tags: string[]; authors: { name: string; email?: string | null; }[]; metadata?: { /** ISO timestamp of last update */ updatedAt?: string | null; /** id of approver */ approvedBy?: number | null; } | null; /** arbitrary extra data */ extra?: any; /** wildcard field */ } ``` Picoschema supports scalar types `string`, `integer`, `number`, `boolean`, and `any`. Objects, arrays, and enums are denoted by a parenthetical after the field name. Objects defined by Picoschema have all properties required unless denoted optional by `?`, and do not allow additional properties. When a property is marked as optional, it is also made nullable to provide more leniency for LLMs to return null instead of omitting a field. In an object definition, the special key `(*)` can be used to declare a “wildcard” field definition. This will match any additional properties not supplied by an explicit key. ### JSON Schema [Section titled “JSON Schema”](#json-schema) Picoschema does not support many of the capabilities of full JSON schema. If you require more robust schemas, you may supply a JSON Schema instead: ```yaml output: schema: type: object properties: field1: type: number minimum: 20 ``` ### Zod schemas defined in code [Section titled “Zod schemas defined in code”](#zod-schemas-defined-in-code) In addition to directly defining schemas in the `.prompt` file, you can reference a schema registered with `defineSchema()` by name. If you’re using TypeScript, this approach will let you take advantage of the language’s static type checking features when you work with prompts. To register a schema: ```ts import { z } from 'genkit'; const MenuItemSchema = ai.defineSchema( 'MenuItemSchema', z.object({ dishname: z.string(), description: z.string(), calories: z.coerce.number(), allergens: z.array(z.string()), }), ); ``` Within your prompt, provide the name of the registered schema: ```dotprompt --- model: googleai/gemini-2.5-flash-latest output: schema: MenuItemSchema --- ``` The Dotprompt library will automatically resolve the name to the underlying registered Zod schema. You can then utilize the schema to strongly type the output of a Dotprompt: ```ts const menuPrompt = ai.prompt< z.ZodTypeAny, // Input schema typeof MenuItemSchema, // Output schema z.ZodTypeAny // Custom options schema >('menu'); const { output } = await menuPrompt({ theme: 'medieval' }); // Now data is strongly typed as MenuItemSchema: const dishName = output?.dishname; const description = output?.description; ``` ## Prompt templates [Section titled “Prompt templates”](#prompt-templates) The portion of a `.prompt` file that follows the front matter (if present) is the prompt itself, which will be passed to the model. While this prompt could be a simple text string, very often you will want to incorporate user input into the prompt. To do so, you can specify your prompt using the [Handlebars](https://handlebarsjs.com/guide/) templating language. Prompt templates can include placeholders that refer to the values defined by your prompt’s input schema. You already saw this in action in the section on input and output schemas: ```dotprompt --- model: googleai/gemini-2.5-flash input: schema: theme?: string default: theme: "pirate" output: schema: dishname: string description: string calories: integer allergens(array): string --- Invent a menu item for a {{theme}} themed restaurant. 
``` In this example, the Handlebars expression, `{{theme}}`, resolves to the value of the input’s `theme` property when you run the prompt. To pass input to the prompt, call the prompt as in the following example: ```ts const menuPrompt = ai.prompt('menu'); const { output } = await menuPrompt({ theme: 'medieval' }); ``` Note that because the input schema declared the `theme` property to be optional and provided a default, you could have omitted the property, and the prompt would have resolved using the default value. Handlebars templates also support some limited logical constructs. For example, as an alternative to providing a default, you could define the prompt using Handlebars’s `#if` helper: ```dotprompt --- model: googleai/gemini-2.5-flash input: schema: theme?: string --- Invent a menu item for a {{#if theme}}{{theme}} themed{{/if}} restaurant. ``` In this example, the prompt renders as “Invent a menu item for a restaurant” when the `theme` property is unspecified. See the [Handlebars documentation](https://handlebarsjs.com/guide/builtin-helpers.html) for information on all of the built-in logical helpers. In addition to properties defined by your input schema, your templates can also refer to values automatically defined by Genkit. The next few sections describe these automatically-defined values and how you can use them. ### Multi-message prompts [Section titled “Multi-message prompts”](#multi-message-prompts) By default, Dotprompt constructs a single message with a “user” role. However, some prompts are best expressed as a combination of multiple messages, such as a system prompt. The `{{role}}` helper provides a simple way to construct multi-message prompts: ```dotprompt --- model: vertexai/gemini-2.5-flash input: schema: userQuestion: string --- {{role "system"}} You are a helpful AI assistant that really loves to talk about food. Try to work food items into all of your conversations. {{role "user"}} {{userQuestion}} ``` Note that your final prompt must contain at least one `user` role. ### Multi-modal prompts [Section titled “Multi-modal prompts”](#multi-modal-prompts) For models that support multimodal input, such as images alongside text, you can use the `{{media}}` helper: ```dotprompt --- model: vertexai/gemini-2.5-flash input: schema: photoUrl: string --- Describe this image in a detailed paragraph: {{media url=photoUrl}} ``` The URL can be `https:` or base64-encoded `data:` URIs for “inline” image usage. In code, this would be: ```ts const multimodalPrompt = ai.prompt('multimodal'); const { text } = await multimodalPrompt({ photoUrl: 'https://example.com/photo.jpg', }); ``` See also [Multimodal input](/docs/models#multimodal-input), on the Models page, for an example of constructing a `data:` URL. ### Partials [Section titled “Partials”](#partials) Partials are reusable templates that can be included inside any prompt. Partials can be especially helpful for related prompts that share common behavior. When loading a prompt directory, any file prefixed with an underscore (`_`) is considered a partial. So a file `_personality.prompt` might contain: ```dotprompt You should speak like a {{#if style}}{{style}}{{else}}helpful assistant.{{/if}}. ``` This can then be included in other prompts: ```dotprompt --- model: googleai/gemini-2.5-flash input: schema: name: string style?: string --- {{role "system"}} {{>personality style=style}} {{role "user"}} Give the user a friendly greeting. User's Name: {{name}} ``` Partials are inserted using the `{{>NAME_OF_PARTIAL args...}}` syntax. 
If no arguments are provided to the partial, it executes with the same context as the parent prompt. Partials accept either named arguments, as above, or a single positional argument representing the context. This can be helpful for tasks such as rendering members of a list. **\_destination.prompt** ```dotprompt - {{name}} ({{country}}) ``` **chooseDestination.prompt** ```dotprompt --- model: googleai/gemini-2.5-flash input: schema: destinations(array): name: string country: string --- Help the user decide between these vacation destinations: {{#each destinations}} {{>destination this}} {{/each}} ``` #### Defining partials in code [Section titled “Defining partials in code”](#defining-partials-in-code) You can also define partials in code using `definePartial`: ```ts ai.definePartial('personality', 'Talk like a {{#if style}}{{style}}{{else}}helpful assistant{{/if}}.'); ``` Code-defined partials are available in all prompts. ### Defining Custom Helpers [Section titled “Defining Custom Helpers”](#defining-custom-helpers) You can define custom helpers to process and manage data inside of a prompt. Helpers are registered globally using `defineHelper`: ```ts ai.defineHelper('shout', (text: string) => text.toUpperCase()); ``` Once a helper is defined, you can use it in any prompt: ```dotprompt --- model: googleai/gemini-2.5-flash input: schema: name: string --- HELLO, {{shout name}}!!! ``` ## Prompt variants [Section titled “Prompt variants”](#prompt-variants) Because prompt files are just text, you can (and should!) commit them to your version control system, allowing you to compare changes over time easily. Often, tweaked versions of prompts can only be fully tested in a production environment side-by-side with existing versions. Dotprompt supports this through its variants feature. To create a variant, create a `[name].[variant].prompt` file. For instance, if you were using Gemini 2.0 Flash in your prompt but wanted to see if Gemini 2.5 Pro would perform better, you might create two files: * `my_prompt.prompt`: the “baseline” prompt * `my_prompt.gemini25pro.prompt`: a variant named `gemini25pro` To use a prompt variant, specify the variant option when loading: ```ts const myPrompt = ai.prompt('my_prompt', { variant: 'gemini25pro' }); ``` The name of the variant is included in the metadata of generation traces, so you can compare and contrast actual performance between variants in the Genkit trace inspector. ## Defining prompts in code [Section titled “Defining prompts in code”](#defining-prompts-in-code) All of the examples discussed so far have assumed that your prompts are defined in individual `.prompt` files in a single directory (or subdirectories thereof), accessible to your app at runtime. Dotprompt is designed around this setup, and its authors consider it to be the best developer experience overall. However, if you have use cases that are not well supported by this setup, you can also define prompts in code using the `definePrompt()` function. The first parameter to this function is analogous to the front matter block of a `.prompt` file; the second parameter can either be a Handlebars template string, as in a prompt file, or a function that returns a `GenerateRequest`: ```ts const myPrompt = ai.definePrompt({ name: 'myPrompt', model: 'googleai/gemini-2.5-flash', input: { schema: z.object({ name: z.string(), }), }, prompt: 'Hello, {{name}}. 
How are you today?', }); ``` ```ts const myPrompt = ai.definePrompt({ name: 'myPrompt', model: 'googleai/gemini-2.5-flash', input: { schema: z.object({ name: z.string(), }), }, messages: async (input) => { return [ { role: 'user', content: [{ text: `Hello, ${input.name}. How are you today?` }], }, ]; }, }); ``` # No new actions at runtime error > Learn why defining new actions at runtime is not allowed in Genkit and how to correctly define them. Defining new actions at runtime is not allowed. ✅ DO: ```ts const prompt = ai.definePrompt({...}) const myFlow = ai.defineFlow({...}, async (input) => { await prompt(...); }) ``` ❌ DON’T: ```ts const myFlow = ai.defineFlow({...}, async (input) => { const prompt = ai.definePrompt({...}) await prompt(...); }) ``` # Error Types > Learn about Genkit's specialized error types, GenkitError and UserFacingError, and how they are used to differentiate between internal and user-facing issues. Genkit knows about two specialized types: `GenkitError` and `UserFacingError`. `GenkitError` is intended for use by Genkit itself or Genkit plugins. `UserFacingError` is intended for [`ContextProviders`](/docs/deploy-node) and your code. The separation between these two error types helps you better understand where your error is coming from. Genkit plugins for web hosting (e.g. [`@genkit-ai/express`](https://js.api.genkit.dev/modules/_genkit-ai_express.html) or [`@genkit-ai/next`](https://js.api.genkit.dev/modules/_genkit-ai_next.html)) SHOULD capture all other Error types and instead report them as an internal error in the response. This adds a layer of security to your application by ensuring that internal details of your application do not leak to attackers. # Evaluation > Learn about Genkit's evaluation capabilities, including inference-based and raw evaluation, dataset creation, and how to use the Developer UI and CLI for testing and analysis. Evaluation is a form of testing that helps you validate your LLM’s responses and ensure they meet your quality bar. Genkit supports third-party evaluation tools through plugins, paired with powerful observability features that provide insight into the runtime state of your LLM-powered applications. Genkit tooling helps you automatically extract data including inputs, outputs, and information from intermediate steps to evaluate the end-to-end quality of LLM responses as well as understand the performance of your system’s building blocks. ### Types of evaluation [Section titled “Types of evaluation”](#types-of-evaluation) Genkit supports two types of evaluation: * **Inference-based evaluation**: This type of evaluation runs against a collection of pre-determined inputs, assessing the corresponding outputs for quality. This is the most common evaluation type, suitable for most use cases. This approach tests a system’s actual output for each evaluation run. You can perform the quality assessment manually, by visually inspecting the results. Alternatively, you can automate the assessment by using an evaluation metric. * **Raw evaluation**: This type of evaluation directly assesses the quality of inputs without any inference. This approach typically is used with automated evaluation using metrics. All required fields for evaluation (e.g., `input`, `context`, `output` and `reference`) must be present in the input dataset. This is useful when you have data coming from an external source (e.g., collected from your production traces) and you want to have an objective measurement of the quality of the collected data. 
For more information, see the [Advanced use](#advanced-use) section of this page. This section explains how to perform inference-based evaluation using Genkit. ## Quick start [Section titled “Quick start”](#quick-start) ### Setup [Section titled “Setup”](#setup) 1. Use an existing Genkit app or create a new one by following our [Get started](/docs/get-started) guide. 2. Add the following code to define a simple RAG application to evaluate. For this guide, we use a dummy retriever that always returns the same documents. ```js import { genkit, z, Document } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; // Initialize Genkit export const ai = genkit({ plugins: [googleAI()] }); // Dummy retriever that always returns the same docs export const dummyRetriever = ai.defineRetriever( { name: 'dummyRetriever', }, async (i) => { const facts = ["Dog is man's best friend", 'Dogs have evolved and were domesticated from wolves']; // Just return facts as documents. return { documents: facts.map((t) => Document.fromText(t)) }; }, ); // A simple question-answering flow export const qaFlow = ai.defineFlow( { name: 'qaFlow', inputSchema: z.object({ query: z.string() }), outputSchema: z.object({ answer: z.string() }), }, async ({ query }) => { const factDocs = await ai.retrieve({ retriever: dummyRetriever, query, }); const { text } = await ai.generate({ model: googleAI.model('gemini-2.5-flash'), prompt: `Answer this question with the given context ${query}`, docs: factDocs, }); return { answer: text }; }, ); ``` 3. (Optional) Add evaluation metrics to your application to use while evaluating. This guide uses the `MALICIOUSNESS` metric from the `genkitEval` plugin. ```js import { genkitEval, GenkitMetric } from '@genkit-ai/evaluator'; import { googleAI } from '@genkit-ai/googleai'; export const ai = genkit({ plugins: [ ...// Add this plugin to your Genkit initialization block genkitEval({ judge: googleAI.model('gemini-2.5-flash'), metrics: [GenkitMetric.MALICIOUSNESS], }), ], }); ``` **Note:** The configuration above requires installation of the [`@genkit-ai/evaluator`](https://www.npmjs.com/package/@genkit-ai/evaluator) package. ```bash npm install @genkit-ai/evaluator ``` 4. Start your Genkit application. ```bash genkit start -- <command to run your app> ``` ### Create a dataset [Section titled “Create a dataset”](#create-a-dataset) Create a dataset to define the examples we want to use for evaluating our flow. 1. Go to the Dev UI at `http://localhost:4000` and click the **Datasets** button to open the Datasets page. 2. Click on the **Create Dataset** button to open the create dataset dialog. a. Provide a `datasetId` for your new dataset. This guide uses `myFactsQaDataset`. b. Select `Flow` dataset type. c. Leave the validation target field empty and click **Save**. 3. Your new dataset page appears, showing an empty dataset. Add examples to it by following these steps: a. Click the **Add example** button to open the example editor panel. b. Only the `input` field is required. Enter `{"query": "Who is man's best friend?"}` in the `input` field, and click **Save** to add the example to your dataset. c. Repeat steps (a) and (b) a couple more times to add more examples. This guide adds the following example inputs to the dataset: ```plaintext {"query": "Can I give milk to my cats?"} {"query": "From which animals did dogs evolve?"} ``` By the end of this step, your dataset should have 3 examples in it, with the values mentioned above.
### Run evaluation and view results [Section titled “Run evaluation and view results”](#run-evaluation-and-view-results) To start evaluating the flow, click the **Run new evaluation** button on your dataset page. You can also start a new evaluation from the *Evaluations* tab. 1. Select the `Flow` radio button to evaluate a flow. 2. Select `qaFlow` as the target flow to evaluate. 3. Select `myFactsQaDataset` as the target dataset to use for evaluation. 4. (Optional) If you have installed an evaluator metric using Genkit plugins, you can see these metrics on this page. Select the metrics that you want to use with this evaluation run. This is entirely optional: Omitting this step will still return the results in the evaluation run, but without any associated metrics. 5. Finally, click **Run evaluation** to start evaluation. Depending on the flow you’re testing, this may take a while. Once the evaluation is complete, a success message appears with a link to view the results. Click on the link to go to the *Evaluation details* page. You can see the details of your evaluation on this page, including original input, extracted context and metrics (if any). ## Core concepts [Section titled “Core concepts”](#core-concepts) ### Terminology [Section titled “Terminology”](#terminology) * **Evaluation**: An evaluation is a process that assesses system performance. In Genkit, such a system is usually a Genkit primitive, such as a flow or a model. An evaluation can be automated or manual (human evaluation). * **Bulk inference**: Inference is the act of running an input on a flow or model to get the corresponding output. Bulk inference involves performing inference on multiple inputs simultaneously. * **Metric**: An evaluation metric is a criterion on which an inference is scored. Examples include accuracy, faithfulness, maliciousness, whether the output is in English, etc. * **Dataset**: A dataset is a collection of examples to use for inference-based evaluation. A dataset typically consists of `input` and optional `reference` fields. The `reference` field does not affect the inference step of evaluation, but it is passed verbatim to any evaluation metrics. In Genkit, you can create a dataset through the Dev UI. There are two types of datasets in Genkit: *Flow* datasets and *Model* datasets. ### Schema validation [Section titled “Schema validation”](#schema-validation) Depending on the type, datasets have schema validation support in the Dev UI: * Flow datasets support validation of the `input` and `reference` fields of the dataset against a flow in the Genkit application. Schema validation is optional and is only enforced if a schema is specified on the target flow. * Model datasets have implicit schema, supporting both `string` and `GenerateRequest` input types. String validation provides a convenient way to evaluate simple text prompts, while `GenerateRequest` provides complete control for advanced use cases (e.g. providing model parameters, message history, tools, etc.). You can find the full schema for `GenerateRequest` in our [API reference docs](https://js.api.genkit.dev/interfaces/genkit._.GenerateRequest.html). Note: Schema validation is a helper tool for editing examples, but it is possible to save an example with invalid schema. These examples may fail when running an evaluation.
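For illustration, a *Model* dataset example using the `GenerateRequest` input type might look like the following sketch; the authoritative field definitions are in the `GenerateRequest` API reference linked above, and the values here are only examples.

```json
{
  "messages": [
    {
      "role": "user",
      "content": [{ "text": "Who is man's best friend?" }]
    }
  ],
  "config": { "temperature": 0.2 }
}
```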
## Supported evaluators [Section titled “Supported evaluators”](#supported-evaluators) ### Genkit evaluators [Section titled “Genkit evaluators”](#genkit-evaluators) Genkit includes a small number of native evaluators, inspired by [RAGAS](https://docs.ragas.io/en/stable/), to help you get started: * Faithfulness — Measures the factual consistency of the generated answer against the given context * Answer Relevancy — Assesses how pertinent the generated answer is to the given prompt * Maliciousness — Measures whether the generated output intends to deceive, harm, or exploit ### Evaluator plugins [Section titled “Evaluator plugins”](#evaluator-plugins) Genkit supports additional evaluators through plugins, like the Vertex Rapid Evaluators, which you can access via the [VertexAI Plugin](/docs/plugins/vertex-ai#evaluators). ## Advanced use [Section titled “Advanced use”](#advanced-use) ### Evaluation comparison [Section titled “Evaluation comparison”](#evaluation-comparison) The Developer UI offers visual tools for side-by-side comparison of multiple evaluation runs. This feature allows you to analyze variations across different executions within a unified interface, making it easier to assess changes in output quality. Additionally, you can highlight outputs based on the performance of specific metrics, indicating improvements or regressions. When comparing evaluations, one run is designated as the *Baseline*. All other evaluations are compared against this baseline to determine whether their performance has improved or regressed. ![Evaluation comparison with metric highlighting](/_astro/evals_compare_light.BfzHwe1o_LOp20.webp) ![Evaluation comparison with metric highlighting](/_astro/evals_compare_dark.DfqBTUeH_Z2q4UrA.webp) #### Prerequisites [Section titled “Prerequisites”](#prerequisites) To use the evaluation comparison feature, the following conditions must be met: * Evaluations must originate from a dataset source. Evaluations from file sources are not comparable. * All evaluations being compared must be from the same dataset. * For metric highlighting, all evaluations must use at least one common metric that produces a `number` or `boolean` score. #### Comparing evaluations [Section titled “Comparing evaluations”](#comparing-evaluations) 1. Ensure you have at least two evaluation runs performed on the same dataset. For instructions, refer to the [Run evaluation section](#run-evaluation-and-view-results). 2. In the Developer UI, navigate to the **Datasets** page. 3. Select the relevant dataset and open its **Evaluations** tab. You should see all evaluation runs associated with that dataset. 4. Choose one evaluation to serve as the baseline for comparison. 5. On the evaluation results page, click the **+ Comparison** button. If this button is disabled, it means no other comparable evaluations are available for this dataset. 6. A new column will appear with a dropdown menu. Select another evaluation from this menu to load its results alongside the baseline. You can now view the outputs side-by-side to visually inspect differences in quality. This feature supports comparing up to three evaluations simultaneously. ##### Metric highlighting (Optional) [Section titled “Metric highlighting (Optional)”](#metric-highlighting-optional) If your evaluations include metrics, you can enable metric highlighting to color-code the results. This feature helps you quickly identify changes in performance: improvements are colored green, while regressions are red. 
Note that highlighting is only supported for numeric and boolean metrics, and the selected metric must be present in all evaluations being compared. To enable metric highlighting: 1. After initiating a comparison, a **Choose a metric to compare** menu will become available. 2. Select a metric from the dropdown. By default, lower scores (for numeric metrics) and `false` values (for boolean metrics) are considered improvements and highlighted in green. You can reverse this logic by ticking the checkbox in the menu. The comparison columns will now be color-coded according to the selected metric and configuration, providing an at-a-glance overview of performance changes. ### Evaluation using the CLI [Section titled “Evaluation using the CLI”](#evaluation-using-the-cli) Genkit CLI provides a rich API for performing evaluation. This is especially useful in environments where the Dev UI is not available (e.g. in a CI/CD workflow). Genkit CLI provides 3 main evaluation commands: `eval:flow`, `eval:extractData`, and `eval:run`. #### `eval:flow` command [Section titled “eval:flow command”](#evalflow-command) The `eval:flow` command runs inference-based evaluation on an input dataset. This dataset may be provided either as a JSON file or by referencing an existing dataset in your Genkit runtime. ```bash # Referencing an existing dataset genkit eval:flow qaFlow --input myFactsQaDataset # or, using a dataset from a file genkit eval:flow qaFlow --input testInputs.json ``` Note: Make sure that you start your genkit app before running these CLI commands. ```bash genkit start -- ``` Here, `testInputs.json` should be an array of objects containing an `input` field and an optional `reference` field, like below: ```json [ { "input": { "query": "What is the French word for Cheese?" } }, { "input": { "query": "What green vegetable looks like cauliflower?" }, "reference": "Broccoli" } ] ``` If your flow requires auth, you may specify it using the `--context` argument: ```bash genkit eval:flow qaFlow --input testInputs.json --context '{"auth": {"email_verified": true}}' ``` By default, the `eval:flow` and `eval:run` commands use all available metrics for evaluation. To run on a subset of the configured evaluators, use the `--evaluators` flag and provide a comma-separated list of evaluators by name: ```bash genkit eval:flow qaFlow --input testInputs.json --evaluators=genkitEval/maliciousness,genkitEval/answer_relevancy ``` You can view the results of your evaluation run in the Dev UI at `localhost:4000/evaluate`. #### `eval:extractData` and `eval:run` commands [Section titled “eval:extractData and eval:run commands”](#evalextractdata-and-evalrun-commands) To support *raw evaluation*, Genkit provides tools to extract data from traces and run evaluation metrics on extracted data. This is useful, for example, if you are using a different framework for evaluation or if you are collecting inferences from a different environment to test locally for output quality. You can batch run your Genkit flow and add a unique label to the run which then can be used to extract an *evaluation dataset*. A raw evaluation dataset is a collection of inputs for evaluation metrics, *without* running any prior inference. 
Run your flow over your test inputs: ```bash genkit flow:batchRun qaFlow testInputs.json --label firstRunSimple ``` Extract the evaluation data: ```bash genkit eval:extractData qaFlow --label firstRunSimple --output factsEvalDataset.json ``` The exported data has a format different from the dataset format presented earlier. This is because this data is intended to be used with evaluation metrics directly, without any inference step. Here is the syntax of the extracted data: ```json Array<{ "testCaseId": string, "input": any, "output": any, "context": any[], "traceIds": string[], }>; ``` The data extractor automatically locates retrievers and adds the produced docs to the context array. You can run evaluation metrics on this extracted dataset using the `eval:run` command. ```bash genkit eval:run factsEvalDataset.json ``` By default, `eval:run` runs against all configured evaluators, and as with `eval:flow`, results for `eval:run` appear in the evaluation page of the Developer UI, located at `localhost:4000/evaluate`. ### Batching evaluations [Section titled “Batching evaluations”](#batching-evaluations) Note: This feature is only available in the Node.js SDK. You can speed up evaluations by processing the inputs in batches using the CLI and Dev UI. When batching is enabled, the input data is grouped into batches of size `batchSize`. The data points in a batch are all run in parallel to provide significant performance improvements, especially when dealing with large datasets and/or complex evaluators. By default (when the flag is omitted), batching is disabled. The `batchSize` option has been integrated into the `eval:flow` and `eval:run` CLI commands. When a `batchSize` greater than 1 is provided, the evaluator will process the dataset in chunks of the specified size. This feature only affects the evaluator logic and not inference (when using `eval:flow`). Here are some examples of enabling batching with the CLI: ```bash genkit eval:flow myFlow --input yourDataset.json --evaluators=custom/myEval --batchSize 10 ``` Or, with `eval:run`: ```bash genkit eval:run yourDataset.json --evaluators=custom/myEval --batchSize 10 ``` Batching is also available in the Dev UI for Genkit (JS) applications. You can set batch size when running a new evaluation to enable parallelization. ### Custom extractors [Section titled “Custom extractors”](#custom-extractors) Genkit provides reasonable default logic for extracting the necessary fields (`input`, `output` and `context`) while doing an evaluation. However, you may find that you need more control over the extraction logic for these fields. Genkit supports custom extractors to achieve this. You can provide custom extractors to be used in `eval:extractData` and `eval:flow` commands. First, as a preparatory step, introduce an auxiliary step in our `qaFlow` example: ```js export const qaFlow = ai.defineFlow( { name: 'qaFlow', inputSchema: z.object({ query: z.string() }), outputSchema: z.object({ answer: z.string() }), }, async ({ query }) => { const factDocs = await ai.retrieve({ retriever: dummyRetriever, query, }); const factDocsModified = await ai.run('factModified', async () => { // Let us use only facts that are considered silly. This is a // hypothetical step for demo purposes; you may perform any // arbitrary task inside a step and reference it in custom // extractors.
// // Assume you have a method that checks if a fact is silly return factDocs.filter((d) => isSillyFact(d.text)); }); const { text } = await ai.generate({ model: googleAI.model('gemini-2.5-flash'), prompt: `Answer this question with the given context ${query}`, docs: factDocsModified, }); return { answer: text }; }, ); ``` Next, configure a custom extractor to use the output of the `factModified` step when evaluating this flow. If you don’t already have a tools-config file to configure custom extractors, add one named `genkit-tools.conf.js` to your project root. ```bash cd /path/to/your/genkit/app touch genkit-tools.conf.js ``` In the tools config file, add the following code: ```js module.exports = { evaluators: [ { actionRef: '/flow/qaFlow', extractors: { context: { outputOf: 'factModified' }, }, }, ], }; ``` This config overrides the default extractors of Genkit’s tooling, specifically changing what is considered as `context` when evaluating this flow. Running evaluation again reveals that context is now populated as the output of the step `factModified`. ```bash genkit eval:flow qaFlow --input testInputs.json ``` Evaluation extractors are specified as follows: * The `evaluators` field accepts an array of EvaluatorConfig objects, which are scoped by `flowName`. * `extractors` is an object that specifies the extractor overrides. The current supported keys in `extractors` are `[input, output, context]`. The acceptable value types are: * `string` - this should be a step name, specified as a string. The output of this step is extracted for this key. * `{ inputOf: string }` or `{ outputOf: string }` - These objects represent specific channels (input or output) of a step. For example, `{ inputOf: 'foo-step' }` would extract the input of step `foo-step` for this key. * `(trace) => string;` - For further flexibility, you can provide a function that accepts a Genkit trace and returns an `any`-type value, and specify the extraction logic inside this function. Refer to `genkit/genkit-tools/common/src/types/trace.ts` for the exact TraceData schema. **Note:** The extracted data for all these extractors is the type corresponding to the extractor. For example, if you use context: `{ outputOf: 'foo-step' }`, and `foo-step` returns an array of objects, the extracted context is also an array of objects. ### Synthesizing test data using an LLM [Section titled “Synthesizing test data using an LLM”](#synthesizing-test-data-using-an-llm) Here is an example flow that uses a PDF file to generate potential user questions.
```ts import { genkit, z } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; import { chunk } from 'llm-chunk'; // npm install llm-chunk import path from 'path'; import { readFile } from 'fs/promises'; import pdf from 'pdf-parse'; // npm install pdf-parse const ai = genkit({ plugins: [googleAI()] }); const chunkingConfig = { minLength: 1000, // minimum number of characters per chunk maxLength: 2000, // maximum number of characters per chunk splitter: 'sentence', // paragraph | sentence overlap: 100, // number of overlapping characters delimiters: '', // regex for base split method } as any; async function extractText(filePath: string) { const pdfFile = path.resolve(filePath); const dataBuffer = await readFile(pdfFile); const data = await pdf(dataBuffer); return data.text; } export const synthesizeQuestions = ai.defineFlow( { name: 'synthesizeQuestions', inputSchema: z.object({ filePath: z.string().describe('PDF file path') }), outputSchema: z.object({ questions: z.array( z.object({ query: z.string(), }), ), }), }, async ({ filePath }) => { filePath = path.resolve(filePath); // `extractText` loads the PDF and extracts its contents as text. const pdfTxt = await ai.run('extract-text', () => extractText(filePath)); const chunks = await ai.run('chunk-it', async () => chunk(pdfTxt, chunkingConfig)); const questions = []; for (var i = 0; i < chunks.length; i++) { const { text } = await ai.generate({ model: googleAI.model('gemini-2.5-flash'), prompt: { text: `Generate one question about the following text: ${chunks[i]}`, }, }); questions.push({ query: text }); } return { questions }; }, ); ``` You can then use this command to export the data into a file and use it for evaluation. ```bash genkit flow:run synthesizeQuestions '{"filePath": "my_input.pdf"}' --output synthesizedQuestions.json ``` # Connect with us > Learn how to connect with the Genkit community and provide feedback on your experience, including Discord and GitHub resources. We’d love to hear about your experience with Genkit. Here’s how you can get in touch with us: **Join the community:** Stay updated, ask questions, and share your work with other Genkit users on the [Genkit Discord server](https://discord.gg/qXt5zzQKpc). **Provide feedback:** Report issues with Genkit or the docs, or suggest new features using our [GitHub issue tracker](https://github.com/firebase/genkit/issues). **We’re interested in learning things like:** * Was it straightforward to set up and make your first `generate` call? If not, how could we make it better? * Were you able to build what you wanted? If not, what could we do to help? * Is there any specific feature, documentation, or resource that’s missing? * Is there anything that’s working particularly well for you? Anything that isn’t? * Anything else that you’d like to share with us about your experience! # Deploy flows using Cloud Functions for Firebase > Learn how to deploy Genkit flows as callable functions using Cloud Functions for Firebase, including setup, authorization, and client-side access. Cloud Functions for Firebase has an `onCallGenkit` method that lets you quickly create a [callable function](https://firebase.google.com/docs/functions/callable?gen=2nd) with a Genkit action (e.g. a Flow). These functions can be called using `genkit/beta/client` or the [Functions client SDK](https://firebase.google.com/docs/functions/callable?gen=2nd#call_the_function), which automatically adds auth info.
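As a preview of the client side, a web app could invoke such a deployed callable function with the Functions client SDK along the lines of the following sketch. The exported function name `generatePoem` matches the example flow used later on this page, and the Firebase config object stands in for your own web app's configuration.

```ts
import { initializeApp } from 'firebase/app';
import { getFunctions, httpsCallable } from 'firebase/functions';

// Initialize the Firebase web SDK with your app's configuration.
const app = initializeApp({ /* your web app's Firebase config */ });
const functions = getFunctions(app);

// Call the deployed onCallGenkit function by its exported name.
const generatePoem = httpsCallable(functions, 'generatePoem');
const result = await generatePoem({ subject: 'the sea' });
console.log(result.data);
```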
## Before you begin [Section titled “Before you begin”](#before-you-begin) * You should be familiar with Genkit’s concept of [flows](/docs/flows), and how to write them. The instructions on this page assume that you already have some flows defined, which you want to deploy. * It would be helpful, but not required, if you’ve already used Cloud Functions for Firebase before. ## 1. Set up a Firebase project [Section titled “1. Set up a Firebase project”](#1-set-up-a-firebase-project) If you don’t already have a Firebase project with TypeScript Cloud Functions set up, follow these steps: 1. Create a new Firebase project using the [Firebase console](https://console.firebase.google.com/) or choose an existing one. 2. Upgrade the project to the Blaze plan, which is required to deploy Cloud Functions. 3. Install the [Firebase CLI](https://firebase.google.com/docs/cli). 4. Log in with the Firebase CLI: ```bash firebase login firebase login --reauth # alternative, if necessary firebase login --no-localhost # if running in a remote shell ``` 5. Create a new project directory: ```bash export PROJECT_ROOT=~/tmp/genkit-firebase-project1 mkdir -p $PROJECT_ROOT ``` 6. Initialize a Firebase project in the directory: ```bash cd $PROJECT_ROOT firebase init genkit ``` The rest of this page assumes that you’ve decided to write your functions in TypeScript, but you can also deploy your Genkit flows if you’re using JavaScript. ## 2. Wrap the Flow in onCallGenkit [Section titled “2. Wrap the Flow in onCallGenkit”](#2-wrap-the-flow-in-oncallgenkit) After you’ve set up a Firebase project with Cloud Functions, you can copy or write flow definitions in the project’s `functions/src` directory, and export them in `index.ts`. For your flows to be deployable, you need to wrap them in `onCallGenkit`. This method has all the features of the normal `onCall`. It automatically supports both streaming and JSON responses. Suppose you have the following flow: ```ts const generatePoemFlow = ai.defineFlow( { name: 'generatePoem', inputSchema: z.object({ subject: z.string() }), outputSchema: z.object({ poem: z.string() }), }, async ({ subject }) => { const { text } = await ai.generate(`Compose a poem about ${subject}.`); return { poem: text }; }, ); ``` You can expose this flow as a callable function using `onCallGenkit`: ```ts import { onCallGenkit } from 'firebase-functions/https'; export const generatePoem = onCallGenkit(generatePoemFlow); ``` ### Define an authorization policy [Section titled “Define an authorization policy”](#define-an-authorization-policy) All deployed flows, whether deployed to Firebase or not, should have an authorization policy; without one, anyone can invoke your potentially-expensive generative AI flows. To define an authorization policy, use the `authPolicy` parameter of `onCallGenkit`: ```ts export const generatePoem = onCallGenkit( { authPolicy: (auth) => auth?.token?.email_verified, }, generatePoemFlow, ); ``` This sample uses a manual function as its auth policy. In addition, the https library exports the `signedIn()` and `hasClaim()` helpers.
Here is the same code using one of those helpers: ```ts import { hasClaim } from 'firebase-functions/https'; export const generatePoem = onCallGenkit( { authPolicy: hasClaim('email_verified'), }, generatePoemFlow, ); ``` ### Make API credentials available to deployed flows [Section titled “Make API credentials available to deployed flows”](#make-api-credentials-available-to-deployed-flows) Once deployed, your flows need some way to authenticate with any remote services they rely on. Most flows need, at a minimum, credentials for accessing the model API service they use. For this example, do one of the following, depending on the model provider you chose: * Gemini (Google AI) 1. Make sure Google AI is [available in your region](https://ai.google.dev/available_regions). 2. [Generate an API key](https://aistudio.google.com/app/apikey) for the Gemini API using Google AI Studio. 3. Store your API key in Cloud Secret Manager: ```bash firebase functions:secrets:set GEMINI_API_KEY ``` This step is important to prevent accidentally leaking your API key, which grants access to a potentially metered service. See [Store and access sensitive configuration information](https://firebase.google.com/docs/functions/config-env?gen=2nd#secret-manager) for more information on managing secrets. 4. Edit `src/index.ts` and add the following after the existing imports: ```ts import { defineSecret } from "firebase-functions/params"; const googleAIapiKey = defineSecret("GEMINI_API_KEY"); ``` Then, in the flow definition, declare that the cloud function needs access to this secret value: ```ts export const generatePoem = onCallGenkit( { secrets: [googleAIapiKey], }, generatePoemFlow ); ``` Now, when you deploy this function, your API key is stored in Cloud Secret Manager, and available from the Cloud Functions environment. * Gemini (Vertex AI) 1. In the Cloud console, [Enable the Vertex AI API](https://console.cloud.google.com/apis/library/aiplatform.googleapis.com?project=_) for your Firebase project. 2. On the [IAM](https://console.cloud.google.com/iam-admin/iam?project=_) page, ensure that the **Default compute service account** is granted the **Vertex AI User** role. The only secret you need to set up for this tutorial is for the model provider, but in general, you must do something similar for each service your flow uses. ### Add App Check enforcement [Section titled “Add App Check enforcement”](#add-app-check-enforcement) [Firebase App Check](https://firebase.google.com/docs/app-check) uses a built-in attestation mechanism to verify that your API is only being called by your application. `onCallGenkit` supports App Check enforcement declaratively. ```ts export const generatePoem = onCallGenkit( { enforceAppCheck: true, // Optional. Makes App Check tokens only usable once. This adds extra security // at the expense of slowing down your app to generate a token for every API // call consumeAppCheckToken: true, }, generatePoemFlow, ); ``` ### Set a CORS policy [Section titled “Set a CORS policy”](#set-a-cors-policy) Callable functions default to allowing any domain to call your function. If you want to customize the domains that can do this, use the `cors` option. With proper authentication (especially App Check), CORS is often unnecessary. 
```ts export const generatePoem = onCallGenkit( { cors: 'mydomain.com', }, generatePoemFlow, ); ``` ### Complete example [Section titled “Complete example”](#complete-example) After you’ve made all of the changes described earlier, your deployable flow looks something like the following example: ```ts import { genkit, z } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; import { onCallGenkit, hasClaim } from 'firebase-functions/https'; import { defineSecret } from 'firebase-functions/params'; const apiKey = defineSecret('GEMINI_API_KEY'); const ai = genkit({ plugins: [googleAI()], model: googleAI.model('gemini-2.5-flash'), }); const generatePoemFlow = ai.defineFlow( { name: 'generatePoem', inputSchema: z.object({ subject: z.string() }), outputSchema: z.object({ poem: z.string() }), }, async ({ subject }) => { const { text } = await ai.generate(`Compose a poem about ${subject}.`); return { poem: text }; }, ); export const generatePoem = onCallGenkit( { secrets: [apiKey], authPolicy: hasClaim('email_verified'), enforceAppCheck: true, }, generatePoemFlow, ); ``` ## 3. Deploy flows to Firebase [Section titled “3. Deploy flows to Firebase”](#3-deploy-flows-to-firebase) After you’ve defined flows using `onCallGenkit`, you can deploy them the same way you would deploy other Cloud Functions: ```bash cd $PROJECT_ROOT firebase deploy --only functions ``` You’ve now deployed the flow as a Cloud Function! But you can’t access your deployed endpoint with `curl` or similar, because of the flow’s authorization policy. The next section explains how to securely access the flow. ## Optional: Try the deployed flow [Section titled “Optional: Try the deployed flow”](#optional-try-the-deployed-flow) To try out your flow endpoint, you can deploy the following minimal example web app: 1. In the [Project settings](https://console.firebase.google.com/project/_/settings/general) section of the Firebase console, add a new web app, selecting the option to also set up Hosting. 2. In the [Authentication](https://console.firebase.google.com/project/_/authentication/providers) section of the Firebase console, enable the **Google** provider, used in this example. 3. In your project directory, set up Firebase Hosting, where you will deploy the sample app: ```bash cd $PROJECT_ROOT firebase init hosting ``` Accept the defaults for all of the prompts. 4. Replace `public/index.html` with the following: ```html Genkit demo ``` 5. Deploy the web app and Cloud Function: ```bash cd $PROJECT_ROOT firebase deploy ``` Open the web app by visiting the URL printed by the `deploy` command. The app requires you to sign in with a Google account, after which you can initiate endpoint requests. ## Optional: Run flows in the developer UI [Section titled “Optional: Run flows in the developer UI”](#optional-run-flows-in-the-developer-ui) You can run flows defined using `onCallGenkit` in the developer UI, exactly the same way as you run flows defined using `defineFlow`, so there’s no need to switch between the two during development and deployment. ```bash cd $PROJECT_ROOT/functions genkit start -- npx tsx --watch src/index.ts ``` or ```bash cd $PROJECT_ROOT/functions npm run genkit:start ``` You can now navigate to the URL printed by the `genkit start` command to access the developer UI.
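The `npm run genkit:start` variant shown above assumes that `functions/package.json` contains a script along these lines (a sketch; adjust the entry point to match your project):

```json
{
  "scripts": {
    "genkit:start": "genkit start -- npx tsx --watch src/index.ts"
  }
}
```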
## Optional: Developing using Firebase Local Emulator Suite [Section titled “Optional: Developing using Firebase Local Emulator Suite”](#optional-developing-using-firebase-local-emulator-suite) Firebase offers a [suite of emulators for local development](https://firebase.google.com/docs/emulator-suite), which you can use with Genkit. To use the Genkit Dev UI with the Firebase Emulator Suite, start the Firebase emulators as follows: ```bash genkit start -- firebase emulators:start --inspect-functions ``` This command runs your code in the emulator, and runs the Genkit framework in development mode. This launches and exposes the Genkit reflection API (but not the Dev UI). # Defining AI workflows > Learn how to define and manage AI workflows in Genkit using flows, which provide type safety, integration with the developer UI, and simplified deployment. The core of your app’s AI features are generative model requests, but it’s rare that you can simply take user input, pass it to the model, and display the model output back to the user. Usually, there are pre- and post-processing steps that must accompany the model call. For example: * Retrieving contextual information to send with the model call * Retrieving the history of the user’s current session, for example in a chat app * Using one model to reformat the user input in a way that’s suitable to pass to another model * Evaluating the “safety” of a model’s output before presenting it to the user * Combining the output of several models Every step of this workflow must work together for any AI-related task to succeed. In Genkit, you represent this tightly-linked logic using a construction called a flow. Flows are written just like functions, using ordinary TypeScript code, but they add additional capabilities intended to ease the development of AI features: * **Type safety**: Input and output schemas defined using Zod, which provides both static and runtime type checking * **Integration with developer UI**: Debug flows independently of your application code using the developer UI. In the developer UI, you can run flows and view traces for each step of the flow. * **Simplified deployment**: Deploy flows directly as web API endpoints, using Cloud Functions for Firebase or any platform that can host a web app. Unlike similar features in other frameworks, Genkit’s flows are lightweight and unobtrusive, and don’t force your app to conform to any specific abstraction. All of the flow’s logic is written in standard TypeScript, and code inside a flow doesn’t need to be flow-aware. ## Defining and calling flows [Section titled “Defining and calling flows”](#defining-and-calling-flows) In its simplest form, a flow just wraps a function. The following example wraps a function that calls `generate()`: ```typescript export const menuSuggestionFlow = ai.defineFlow( { name: 'menuSuggestionFlow', inputSchema: z.object({ theme: z.string() }), outputSchema: z.object({ menuItem: z.string() }), }, async ({ theme }) => { const { text } = await ai.generate({ model: googleAI.model('gemini-2.5-flash'), prompt: `Invent a menu item for a ${theme} themed restaurant.`, }); return { menuItem: text }; }, ); ``` Just by wrapping your `generate()` calls like this, you add some functionality: doing so lets you run the flow from the Genkit CLI and from the developer UI, and is a requirement for several of Genkit’s features, including deployment and observability (later sections discuss these topics). 
### Input and output schemas [Section titled “Input and output schemas”](#input-and-output-schemas) One of the most important advantages Genkit flows have over directly calling a model API is type safety of both inputs and outputs. When defining flows, you can define schemas for them using Zod, in much the same way as you define the output schema of a `generate()` call; however, unlike with `generate()`, you can also specify an input schema. While it’s not mandatory to wrap your input and output schemas in `z.object()`, it’s considered best practice for these reasons: * **Better developer experience**: Wrapping schemas in objects provides a better experience in the Developer UI by giving you labeled input fields. * **Future-proof API design**: Object-based schemas allow for easy extensibility in the future. You can add new fields to your input or output schemas without breaking existing clients, which is a core principle of robust API design. All examples in this documentation use object-based schemas to follow these best practices. Here’s a refinement of the last example, which defines a flow that takes a theme as input and returns a structured object as output: ```typescript import { z } from 'genkit'; const MenuItemSchema = z.object({ dishname: z.string(), description: z.string(), }); export const menuSuggestionFlowWithSchema = ai.defineFlow( { name: 'menuSuggestionFlow', inputSchema: z.object({ theme: z.string() }), outputSchema: MenuItemSchema, }, async ({ theme }) => { const { output } = await ai.generate({ model: googleAI.model('gemini-2.5-flash'), prompt: `Invent a menu item for a ${theme} themed restaurant.`, output: { schema: MenuItemSchema }, }); if (output == null) { throw new Error("Response doesn't satisfy schema."); } return output; }, ); ``` Note that the schema of a flow does not necessarily have to line up with the schema of the `generate()` calls within the flow (in fact, a flow might not even contain `generate()` calls). Here’s a variation of the example that passes a schema to `generate()`, but uses the structured output to format a simple string, which the flow returns. ```typescript export const menuSuggestionFlowMarkdown = ai.defineFlow( { name: 'menuSuggestionFlow', inputSchema: z.object({ theme: z.string() }), outputSchema: z.object({ formattedMenuItem: z.string() }), }, async ({ theme }) => { const { output } = await ai.generate({ model: googleAI.model('gemini-2.5-flash'), prompt: `Invent a menu item for a ${theme} themed restaurant.`, output: { schema: MenuItemSchema }, }); if (output == null) { throw new Error("Response doesn't satisfy schema."); } return { formattedMenuItem: `**${output.dishname}**: ${output.description}` }; }, ); ``` ### Calling flows [Section titled “Calling flows”](#calling-flows) Once you’ve defined a flow, you can call it from your Node.js code: ```typescript const { menuItem } = await menuSuggestionFlow({ theme: 'bistro' }); ``` The argument to the flow must conform to the input schema, if you defined one. If you defined an output schema, the flow response will conform to it. For example, if you set the output schema to `MenuItemSchema`, the flow output will contain its properties: ```typescript const { dishname, description } = await menuSuggestionFlowWithSchema({ theme: 'bistro' }); ``` ## Streaming flows [Section titled “Streaming flows”](#streaming-flows) Flows support streaming using an interface similar to `generate()`’s streaming interface.
Streaming is useful when your flow generates a large amount of output, because you can present the output to the user as it’s being generated, which improves the perceived responsiveness of your app. As a familiar example, chat-based LLM interfaces often stream their responses to the user as they are generated. Here’s an example of a flow that supports streaming: ```typescript export const menuSuggestionStreamingFlow = ai.defineFlow( { name: 'menuSuggestionFlow', inputSchema: z.object({ theme: z.string() }), streamSchema: z.string(), outputSchema: z.object({ theme: z.string(), menuItem: z.string() }), }, async ({ theme }, { sendChunk }) => { const { stream, response } = ai.generateStream({ model: googleAI.model('gemini-2.5-flash'), prompt: `Invent a menu item for a ${theme} themed restaurant.`, }); for await (const chunk of stream) { // Here, you could process the chunk in some way before sending it to // the output stream via sendChunk(). In this example, we output // the text of the chunk, unmodified. sendChunk(chunk.text); } const { text: menuItem } = await response; return { theme, menuItem, }; }, ); ``` * The `streamSchema` option specifies the type of values your flow streams. This does not necessarily need to be the same type as the `outputSchema`, which is the type of the flow’s complete output. * The second parameter to your flow definition is called `sideChannel`. It provides features such as request context and the `sendChunk` callback. The `sendChunk` callback takes a single parameter, of the type specified by `streamSchema`. Whenever data becomes available within your flow, send the data to the output stream by calling this function. In the above example, the values streamed by the flow are directly coupled to the values streamed by the `generate()` call inside the flow. Although this is often the case, it doesn’t have to be: you can output values to the stream using the callback as often as is useful for your flow. ### Calling streaming flows [Section titled “Calling streaming flows”](#calling-streaming-flows) Streaming flows are also callable, but they immediately return a response object rather than a promise: ```typescript const response = menuSuggestionStreamingFlow.stream({ theme: 'Danube' }); ``` The response object has a stream property, which you can use to iterate over the streaming output of the flow as it’s generated: ```typescript for await (const chunk of response.stream) { console.log('chunk', chunk); } ``` You can also get the complete output of the flow, as you can with a non-streaming flow: ```typescript const output = await response.output; ``` Note that the streaming output of a flow might not be the same type as the complete output; the streaming output conforms to `streamSchema`, whereas the complete output conforms to `outputSchema`. ## Running flows from the command line [Section titled “Running flows from the command line”](#running-flows-from-the-command-line) You can run flows from the command line using the Genkit CLI tool: ```bash genkit flow:run menuSuggestionFlow '{"theme": "French"}' ``` For streaming flows, you can print the streaming output to the console by adding the `-s` flag: ```bash genkit flow:run menuSuggestionFlow '{"theme": "French"}' -s ``` Running a flow from the command line is useful for testing a flow, or for running flows that perform tasks needed on an ad hoc basis—for example, to run a flow that ingests a document into your vector database. 
## Debugging flows [Section titled “Debugging flows”](#debugging-flows) One of the advantages of encapsulating AI logic within a flow is that you can test and debug the flow independently from your app using the Genkit developer UI. To start the developer UI, run the following commands from your project directory: ```bash genkit start -- tsx --watch src/your-code.ts ``` From the **Run** tab of the developer UI, you can run any of the flows defined in your project: ![Genkit DevUI flows](/_astro/devui-flows.CU7lon_X_Z1bEbxA.webp) After you’ve run a flow, you can inspect a trace of the flow invocation by either clicking **View trace** or looking on the **Inspect** tab. In the trace viewer, you can see details about the execution of the entire flow, as well as details for each of the individual steps within the flow. For example, consider the following flow, which contains several generation requests: ```typescript const PrixFixeMenuSchema = z.object({ starter: z.string(), soup: z.string(), main: z.string(), dessert: z.string(), }); export const complexMenuSuggestionFlow = ai.defineFlow( { name: 'complexMenuSuggestionFlow', inputSchema: z.object({ theme: z.string() }), outputSchema: PrixFixeMenuSchema, }, async ({ theme }): Promise<z.infer<typeof PrixFixeMenuSchema>> => { const chat = ai.chat({ model: googleAI.model('gemini-2.5-flash') }); await chat.send('What makes a good prix fixe menu?'); await chat.send( 'What are some ingredients, seasonings, and cooking techniques that ' + `would work for a ${theme} themed menu?`, ); const { output } = await chat.send({ prompt: `Based on our discussion, invent a prix fixe menu for a ${theme} ` + 'themed restaurant.', output: { schema: PrixFixeMenuSchema, }, }); if (!output) { throw new Error('No data generated.'); } return output; }, ); ``` When you run this flow, the trace viewer shows you details about each generation request including its output: ![Genkit DevUI flows](/_astro/devui-inspect.DMsKRir5_2mbUjn.webp) ### Flow steps [Section titled “Flow steps”](#flow-steps) In the last example, you saw that each `generate()` call showed up as a separate step in the trace viewer. Each of Genkit’s fundamental actions shows up as a separate step of a flow: * `generate()` * `Chat.send()` * `embed()` * `index()` * `retrieve()` If you want to include code other than the above in your traces, you can do so by wrapping the code in a `run()` call. You might do this for calls to third-party libraries that are not Genkit-aware, or for any critical section of code. For example, here’s a flow with two steps: the first step retrieves a menu using some unspecified method, and the second step includes the menu as context for a `generate()` call. ```ts export const menuQuestionFlow = ai.defineFlow( { name: 'menuQuestionFlow', inputSchema: z.object({ question: z.string() }), outputSchema: z.object({ answer: z.string() }), }, async ({ question }): Promise<{ answer: string }> => { const menu = await ai.run('retrieve-daily-menu', async (): Promise<string> => { // Retrieve today's menu. (This could be a database access or simply // fetching the menu from your website.) // ...
return menu; }); const { text } = await ai.generate({ model: googleAI.model('gemini-2.5-flash'), system: "Help the user answer questions about today's menu.", prompt: question, docs: [{ content: [{ text: menu }] }], }); return { answer: text }; }, ); ``` Because the retrieval step is wrapped in a `run()` call, it’s included as a step in the trace viewer: ![Genkit DevUI flows](/_astro/devui-runstep.BapAMTA1_Z134JbI.webp) ## Deploying flows [Section titled “Deploying flows”](#deploying-flows) You can deploy your flows directly as web API endpoints, ready for you to call from your app clients. Deployment is discussed in detail on several other pages, but this section gives brief overviews of your deployment options. ### Cloud Functions for Firebase [Section titled “Cloud Functions for Firebase”](#cloud-functions-for-firebase) To deploy flows with Cloud Functions for Firebase, use the `onCallGenkit` feature of `firebase-functions/https`. `onCallGenkit` wraps your flow in a callable function. You may set an auth policy and configure App Check. ```typescript import { hasClaim, onCallGenkit } from 'firebase-functions/https'; import { defineSecret } from 'firebase-functions/params'; const apiKey = defineSecret('GOOGLE_AI_API_KEY'); const menuSuggestionFlow = ai.defineFlow( { name: 'menuSuggestionFlow', inputSchema: z.object({ theme: z.string() }), outputSchema: z.object({ menuItem: z.string() }), }, async ({ theme }) => { // ... return { menuItem: "Generated menu item would go here" }; }, ); export const menuSuggestion = onCallGenkit( { secrets: [apiKey], authPolicy: hasClaim('email_verified'), }, menuSuggestionFlow, ); ``` For more information, see the following pages: * [Deploy with Firebase](/docs/firebase) * [Authorization and integrity](/docs/auth#authorize-using-cloud-functions-for-firebase) * [Firebase plugin](/docs/plugins/firebase) ### Express.js [Section titled “Express.js”](#expressjs) To deploy flows using any Node.js hosting platform, such as Cloud Run, define your flows using `defineFlow()` and then call `startFlowServer()`: ```typescript import { startFlowServer } from '@genkit-ai/express'; export const menuSuggestionFlow = ai.defineFlow( { name: 'menuSuggestionFlow', inputSchema: z.object({ theme: z.string() }), outputSchema: z.object({ result: z.string() }), }, async ({ theme }) => { // ... }, ); startFlowServer({ flows: [menuSuggestionFlow], }); ``` By default, `startFlowServer` will serve all the flows defined in your codebase as HTTP endpoints (for example, `http://localhost:3400/menuSuggestionFlow`). You can call a flow with a POST request as follows: ```bash curl -X POST "http://localhost:3400/menuSuggestionFlow" \ -H "Content-Type: application/json" -d '{"data": {"theme": "banana"}}' ``` If needed, you can customize the flows server to serve a specific list of flows, as shown below. You can also specify a custom port (it will use the PORT environment variable if set) or specify CORS settings. ```typescript export const flowA = ai.defineFlow( { name: 'flowA', inputSchema: z.object({ subject: z.string() }), outputSchema: z.object({ response: z.string() }), }, async ({ subject }) => { // ... return { response: "Generated response would go here" }; } ); export const flowB = ai.defineFlow( { name: 'flowB', inputSchema: z.object({ subject: z.string() }), outputSchema: z.object({ response: z.string() }), }, async ({ subject }) => { // ... 
return { response: "Generated response would go here" }; } ); startFlowServer({ flows: [flowB], port: 4567, cors: { origin: '*', }, }); ``` For information on deploying to specific platforms, see [Deploy with Cloud Run](/docs/cloud-run) and [Deploy flows to any Node.js platform](/docs/deploy-node). # Get started with Genkit JS > This guide shows you how to get started with Genkit in a Node.js application, including project setup, installing packages, configuring API keys, creating your first flow, and testing in the Developer UI. This guide shows you how to get started with Genkit in a Node.js app and test it in the Developer UI. ## Prerequisites [Section titled “Prerequisites”](#prerequisites) Before you begin, make sure your environment meets these requirements: * Node.js v20 or later * npm This guide assumes you’re already familiar with building Node.js applications. ## Set up your project [Section titled “Set up your project”](#set-up-your-project) Create a new Node.js project and configure TypeScript: ```sh mkdir my-genkit-app cd my-genkit-app npm init -y # Set up your source directory mkdir src touch src/index.ts # Install and configure TypeScript npm install -D typescript tsx npx tsc --init ``` This sets up your project structure and a TypeScript entry point at `src/index.ts`. ## Install Genkit packages [Section titled “Install Genkit packages”](#install-genkit-packages) First, install the Genkit CLI globally. This gives you access to local developer tools, including the Developer UI: ```bash npm install -g genkit-cli ``` Then, add the following packages to your project: ```bash npm install genkit @genkit-ai/googleai ``` * `genkit` provides Genkit core capabilities. * `@genkit-ai/googleai` provides access to the Google AI Gemini models. ## Configure your model API key [Section titled “Configure your model API key”](#configure-your-model-api-key) Genkit can work with multiple model providers. This guide uses the **Gemini API**, which offers a generous free tier and doesn’t require a credit card to get started. To use it, you’ll need an API key from Google AI Studio: [Get a Gemini API Key](https://aistudio.google.com/apikey) Once you have a key, set the `GEMINI_API_KEY` environment variable: ```sh export GEMINI_API_KEY= ``` Note Genkit also supports models from Vertex AI, Anthropic, OpenAI, Cohere, Ollama, and more. See [supported models](/docs/models#models-supported-by-genkit) for details. ## Create your first flow [Section titled “Create your first flow”](#create-your-first-flow) A flow is a special Genkit function with built-in observability. type safety, and tooling integration. 
Update `src/index.ts` with the following: ```ts import { googleAI } from '@genkit-ai/googleai'; import { genkit, z } from 'genkit'; // Initialize Genkit with the Google AI plugin const ai = genkit({ plugins: [googleAI()], model: googleAI.model('gemini-2.5-flash', { temperature: 0.8 }), }); // Define input schema const RecipeInputSchema = z.object({ ingredient: z.string().describe('Main ingredient or cuisine type'), dietaryRestrictions: z.string().optional().describe('Any dietary restrictions'), }); // Define output schema const RecipeSchema = z.object({ title: z.string(), description: z.string(), prepTime: z.string(), cookTime: z.string(), servings: z.number(), ingredients: z.array(z.string()), instructions: z.array(z.string()), tips: z.array(z.string()).optional(), }); // Define a recipe generator flow export const recipeGeneratorFlow = ai.defineFlow( { name: 'recipeGeneratorFlow', inputSchema: RecipeInputSchema, outputSchema: RecipeSchema, }, async (input) => { // Create a prompt based on the input const prompt = `Create a recipe with the following requirements: Main ingredient: ${input.ingredient} Dietary restrictions: ${input.dietaryRestrictions || 'none'}`; // Generate structured recipe data using the same schema const { output } = await ai.generate({ prompt, output: { schema: RecipeSchema }, }); if (!output) throw new Error('Failed to generate recipe'); return output; } ); // Run the flow async function main() { const recipe = await recipeGeneratorFlow({ ingredient: 'avocado', dietaryRestrictions: 'vegetarian' }); console.log(recipe); } main().catch(console.error); ``` This code sample: * Defines reusable input and output schemas with [Zod](https://zod.dev/) * Configures the `gemini-2.5-flash` model with temperature settings * Defines a Genkit flow to generate a structured recipe based on your input * Runs the flow with a sample input and prints the result ##### Why use flows? [Section titled “Why use flows?”](#why-use-flows) * Type-safe inputs and outputs * Integrates with the Developer UI * Easy deployment as APIs * Built-in tracing and observability ## Test in the Developer UI [Section titled “Test in the Developer UI”](#test-in-the-developer-ui) The **Developer UI** is a local tool for testing and inspecting Genkit components, like flows, with a visual interface. ### Start the Developer UI [Section titled “Start the Developer UI”](#start-the-developer-ui) Run the following command from your project root: ```bash genkit start -- npx tsx --watch src/index.ts ``` This starts your app and launches the Developer UI at `http://localhost:4000` by default. Note The command after `--` should run the file that defines or imports your Genkit components. You can use `tsx`, `node`, or other commands based on your setup. Learn more in [developer tools](/docs/devtools). ##### Optional: Add an npm script [Section titled “Optional: Add an npm script”](#optional-add-an-npm-script) To make starting the Developer UI easier, add the following to your `package.json` scripts: ```json "scripts": { "genkit:ui": "genkit start -- npx tsx --watch src/index.ts" } ``` Then run it with: ```sh npm run genkit:ui ``` ### Run and inspect the flow [Section titled “Run and inspect the flow”](#run-and-inspect-the-flow) In the Developer UI: 1. Select the `recipeGeneratorFlow` from the list of flows 2. Enter sample input: ```json { "ingredient": "avocado", "dietaryRestrictions": "vegetarian" } ``` 3. 
Click **Run**. You’ll see the generated recipe as structured output, along with a visual trace of the AI generation process for debugging and optimization. ## Next steps [Section titled “Next steps”](#next-steps) Now that you’ve created and tested your first flow, explore more features to build powerful AI-driven applications: * [Developer tools](/docs/devtools): Set up your local workflow with the Genkit CLI and Dev UI. * [Generating content](/docs/models): Use Genkit’s unified generation API to work with multimodal and structured output across supported models. * [Defining flows](/docs/flows): Learn about streaming flows, schema customization, deployment options, and more. * [Prompt management](/docs/dotprompt): Define flexible prompt templates using `.prompt` files or code. * [App integration](https://developers.google.com/solutions/learn/agentic-barista): See a full-stack Genkit app example built with flows and the Gemini API. # Pause generation using interrupts > Learn how to use interrupts in Genkit to pause and resume LLM generation, enabling human-in-the-loop interactions, asynchronous processing, and controlled task completion. Beta This feature of Genkit is in **Beta,** which means it is not yet part of Genkit’s stable API. APIs of beta features may change in minor version releases. *Interrupts* are a special kind of [tool](/docs/tool-calling) that can pause the LLM generation-and-tool-calling loop to return control back to you. When you’re ready, you can then *resume* generation by sending *replies* that the LLM processes for further generation. The most common uses for interrupts fall into a few categories: * **Human-in-the-Loop:** Enabling the user of an interactive AI to clarify needed information or confirm the LLM’s action before it is completed, providing a measure of safety and confidence. * **Async Processing:** Starting an asynchronous task that can only be completed out-of-band, such as sending an approval notification to a human reviewer or kicking off a long-running background process. * **Exit from an Autonomous Task:** Providing the model a way to mark a task as complete, in a workflow that might iterate through a long series of tool calls. [Genkit by Example: Human-in-the-Loop ](https://examples.genkit.dev/chatbot-hitl?utm_source=genkit.dev&utm_content=contextlink)See a live demo of how interrupts can allow the LLM to ask structured questions of the user. ## Before you begin [Section titled “Before you begin”](#before-you-begin) All of the examples documented here assume that you have already set up a project with Genkit dependencies installed. If you want to run the code examples on this page, first complete the steps in the [Get started](/docs/get-started) guide. Before diving too deeply, you should also be familiar with the following concepts: * [Generating content](/docs/models) with AI models. * Genkit’s system for [defining input and output schemas](/docs/flows). * General methods of [tool-calling](/docs/tool-calling). ## Overview of interrupts [Section titled “Overview of interrupts”](#overview-of-interrupts) At a high level, this is what an interrupt looks like when interacting with an LLM: 1. The calling application prompts the LLM with a request. The prompt includes a list of tools, including at least one for an interrupt that the LLM can use to generate a response. 2. The LLM generates either a complete response or a tool call request in a specific format.
To the LLM, an interrupt call looks like any other tool call. 3. If the LLM calls an interrupt tool, the Genkit library automatically pauses generation rather than immediately passing responses back to the model for additional processing. 4. The developer checks whether an interrupt call is made, and performs whatever task is needed to collect the information for the interrupt response. 5. The developer resumes generation by passing an interrupt response to the model. This action triggers a return to Step 2. ## Define manual-response interrupts [Section titled “Define manual-response interrupts”](#define-manual-response-interrupts) The most common kind of interrupt allows the LLM to request clarification from the user, for example by asking a multiple-choice question. For this use case, use the Genkit instance’s `defineInterrupt()` method: ```ts import { genkit, z } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [googleAI()], model: googleAI.model('gemini-2.5-flash'), }); const askQuestion = ai.defineInterrupt({ name: 'askQuestion', description: 'use this to ask the user a clarifying question', inputSchema: z.object({ choices: z.array(z.string()).describe('the choices to display to the user'), allowOther: z.boolean().optional().describe('when true, allow write-ins'), }), outputSchema: z.string(), }); ``` Note that the `outputSchema` of an interrupt corresponds to the response data you will provide as opposed to something that will be automatically populated by a tool function. ### Use interrupts [Section titled “Use interrupts”](#use-interrupts) Interrupts are passed into the `tools` array when generating content, just like other types of tools. You can pass both normal tools and interrupts to the same `generate` call: * Generate ```ts const response = await ai.generate({ prompt: "Ask me a movie trivia question.", tools: [askQuestion], }); ``` * definePrompt ```ts const triviaPrompt = ai.definePrompt({ name: "triviaPrompt", tools: [askQuestion], input: { schema: z.object({ subject: z.string() }), }, prompt: "Ask me a trivia question about {{subject}}.", }); const response = await triviaPrompt({ subject: "computer history" }); ``` * Prompt file ```dotprompt --- tools: [askQuestion] input: schema: partyType: string --- {{role "system"}} Use the askQuestion tool if you need to clarify something. {{role "user"}} Help me plan a {{partyType}} party next week. ``` Then you can execute the prompt in your code as follows: ```ts // assuming prompt file is named partyPlanner.prompt const partyPlanner = ai.prompt("partyPlanner"); const response = await partyPlanner({ partyType: "birthday" }); ``` * Chat ```ts const chat = ai.chat({ system: "Use the askQuestion tool if you need to clarify something.", tools: [askQuestion], }); const response = await chat.send("make a plan for my birthday party"); ``` Genkit immediately returns a response on receipt of an interrupt tool call. ### Respond to interrupts [Section titled “Respond to interrupts”](#respond-to-interrupts) If you’ve passed one or more interrupts to your generate call, you need to check the response for interrupts so that you can handle them: ```ts // you can check the 'finishReason' of the response response.finishReason === 'interrupted'; // or you can check to see if any interrupt requests are on the response response.interrupts.length > 0; ``` Responding to an interrupt is done using the `resume` option on a subsequent `generate` call, making sure to pass in the existing history.
Each tool has a `.respond()` method on it to help construct the response. Once resumed, the model re-enters the generation loop, including tool execution, until either it completes or another interrupt is triggered: ```ts let response = await ai.generate({ tools: [askQuestion], system: 'ask clarifying questions until you have a complete solution', prompt: 'help me plan a backyard BBQ', }); while (response.interrupts.length) { const answers = []; // multiple interrupts can be called at once, so we handle them all for (const question of response.interrupts) { answers.push( // use the `respond` method on our tool to populate answers askQuestion.respond( question, // send the tool request input to the user to respond await askUser(question.toolRequest.input), ), ); } response = await ai.generate({ tools: [askQuestion], messages: response.messages, resume: { respond: answers, }, }); } // no more interrupts, we can see the final response console.log(response.text); ``` ## Tools with restartable interrupts [Section titled “Tools with restartable interrupts”](#tools-with-restartable-interrupts) Another common pattern for interrupts is the need to *confirm* an action that the LLM suggests before actually performing it. For example, a payments app might want the user to confirm certain kinds of transfers. For this use case, you can use the standard `defineTool` method to add custom logic around when to trigger an interrupt, and what to do when an interrupt is *restarted* with additional metadata. ### Define a restartable tool [Section titled “Define a restartable tool”](#define-a-restartable-tool) Every tool has access to two special helpers in the second argument of its implementation definition: * `interrupt`: when called, this method throws a special kind of exception that is caught to pause the generation loop. You can provide additional metadata as an object. * `resumed`: when a request from an interrupted generation is restarted using the `{resume: {restart: ...}}` option (see below), this helper contains the metadata provided when restarting. If you were building a payments app, for example, you might want to confirm with the user before making a transfer exceeding a certain amount: ```ts const transferMoney = ai.defineTool({ name: 'transferMoney', description: 'Transfers money between accounts.', inputSchema: z.object({ toAccountId: z.string().describe('the account id of the transfer destination'), amount: z.number().describe('the amount in integer cents (100 = $1.00)'), }), outputSchema: z.object({ status: z.string().describe('the outcome of the transfer'), message: z.string().optional(), }) }, async (input, {context, interrupt, resumed}) => { // if the user rejected the transaction if (resumed?.status === "REJECTED") { return {status: 'REJECTED', message: 'The user rejected the transaction.'}; } // trigger an interrupt to confirm if amount > $100 if (resumed?.status !== "APPROVED" && input.amount > 10000) { interrupt({ message: "Please confirm sending an amount > $100.", }); } // complete the transaction if not interrupted return doTransfer(input); }); ``` In this example, on first execution (when `resumed` is undefined), the tool checks to see if the amount exceeds $100, and triggers an interrupt if so. On second execution, it looks for a status in the new metadata provided and performs the transfer or returns a rejection response, depending on whether it is approved or rejected.
### Restart tools after interruption [Section titled “Restart tools after interruption”](#restart-tools-after-interruption) Interrupt tools give you full control over: 1. When an initial tool request should trigger an interrupt. 2. When and whether to resume the generation loop. 3. What additional information to provide to the tool when resuming. In the example shown in the previous section, the application might ask the user to confirm the interrupted request to make sure the transfer amount is okay: ```ts let response = await ai.generate({ tools: [transferMoney], prompt: "Transfer $1000 to account ABC123", }); while (response.interrupts.length) { const confirmations = []; // multiple interrupts can be called at once, so we handle them all for (const interrupt of response.interrupts) { confirmations.push( // use the 'restart' method on our tool to provide `resumed` metadata transferMoney.restart( interrupt, // send the tool request input to the user to respond. assume that this // returns `{status: "APPROVED"}` or `{status: "REJECTED"}` await requestConfirmation(interrupt.toolRequest.input), ) ); } response = await ai.generate({ tools: [transferMoney], messages: response.messages, resume: { restart: confirmations, }, }); } // no more interrupts, we can see the final response console.log(response.text); ``` # Observe local metrics > Learn about Genkit's local observability features, including tracing, metrics collection, and logging, powered by OpenTelemetry and integrated with the Genkit Developer UI. Genkit provides a robust set of built-in observability features, including tracing and metrics collection powered by [OpenTelemetry](https://opentelemetry.io/). For local observability, such as during the development phase, the Genkit Developer UI provides detailed trace viewing and debugging capabilities. For production observability, we provide Genkit Monitoring in the Firebase console via the Firebase plugin. Alternatively, you can export your OpenTelemetry data to the observability tooling of your choice. ## Tracing & Metrics [Section titled “Tracing & Metrics”](#tracing--metrics) Genkit automatically collects traces and metrics without requiring explicit configuration, allowing you to observe and debug your Genkit code’s behavior in the Developer UI. Genkit stores these traces, enabling you to analyze your Genkit flows step-by-step with detailed input/output logging and statistics. In production, Genkit can export traces and metrics to Firebase Genkit Monitoring for further analysis. ## Log and export events [Section titled “Log and export events”](#log-and-export-events) Genkit provides a centralized logging system that you can configure using the logging module. One advantage of using the Genkit-provided logger is that it automatically exports logs to Genkit Monitoring when the Firebase Telemetry plugin is enabled. ```typescript import { logger } from 'genkit/logging'; // Set the desired log level logger.setLogLevel('debug'); ``` ## Production Observability [Section titled “Production Observability”](#production-observability) The [Genkit Monitoring](https://console.firebase.google.com/project/_/genai_monitoring) dashboard helps you understand the overall health of your Genkit features. It is also useful for debugging stability and content issues that may indicate problems with your LLM prompts and/or Genkit Flows. See the [Getting Started](/docs/observability/getting-started) guide for more details.
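As a minimal sketch of the production export mentioned above (assuming you’ve installed the `@genkit-ai/firebase` plugin), enabling Genkit Monitoring is a single call alongside your Genkit initialization; see the Getting Started guide for configuration options:

```ts
import { genkit } from 'genkit';
import { googleAI } from '@genkit-ai/googleai';
import { enableFirebaseTelemetry } from '@genkit-ai/firebase';

// Export traces, metrics, and logs to Genkit Monitoring in the Firebase console.
enableFirebaseTelemetry();

const ai = genkit({ plugins: [googleAI()] });
```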
# Migrate from 0.5 to 0.9 > This guide outlines the significant changes and steps to migrate your Genkit applications from version 0.5 to 0.9, covering CLI updates, package changes, and API modifications. Genkit 0.9 introduces a number of breaking changes alongside feature enhancements that improve overall functionality. If you have been developing applications with Genkit 0.5, you will need to update your application code when you upgrade to the latest version. This guide outlines the most significant changes and offers steps to migrate your existing applications smoothly. ## Quickstart guide [Section titled “Quickstart guide”](#quickstart-guide) The following steps will help you migrate from Genkit 0.5 to Genkit 0.9 quickly. Read more information about these changes in the detailed [Changelog](#changelog) below. ### 1. Install the new CLI [Section titled “1. Install the new CLI”](#1-install-the-new-cli) * Uninstall the old CLI ```bash npm uninstall -g genkit && npm uninstall genkit ``` * Install the new CLI globally ```bash npm install -g genkit-cli ``` ### 2. Update your dependencies [Section titled “2. Update your dependencies”](#2-update-your-dependencies) * Remove individual Genkit core packages ```bash npm uninstall @genkit-ai/ai @genkit-ai/core @genkit-ai/dotprompt @genkit-ai/flow ``` * Install the new consolidated `genkit` package ```bash npm install genkit ``` * Upgrade all plugin versions (example below) ```plaintext npm upgrade @genkit-ai/firebase ``` ### 3. Change your imports [Section titled “3. Change your imports”](#3-change-your-imports) * Remove imports for individual Genkit core packages ```js import { … } from '@genkit-ai/ai'; import { … } from '@genkit-ai/core'; import { … } from '@genkit-ai/flow'; ``` * Remove zod imports ```js import * as z from 'zod'; ``` * Import `genkit` and `zod` from `genkit` ```js import { z, genkit } from 'genkit'; ``` ### 4. Update your code [Section titled “4. Update your code”](#4-update-your-code) #### Remove the configureGenkit blocks [Section titled “Remove the configureGenkit blocks”](#remove-the-configuregenkit-blocks) Configuration for Genkit is now done per instance. Telemetry and logging is configured globally and separately from the Genkit instance. * Replace `configureGenkit` with `ai = genkit({...})` blocks. Keep only the plugin configuration. ```js import { genkit } from 'genkit'; const ai = genkit({ plugins: [...]}); ``` * Configure telemetry using enableFirebaseTelemetry or enableGoogleCloudTelemetry For Firebase: ```js import { enableFirebaseTelemetry } from '@genkit-ai/firebase'; enableFirebaseTelemetry({...}); ``` For Google Cloud: ```js import { enableGoogleCloudTelemetry } from '@genkit-ai/google-cloud'; enableGoogleCloudTelemetry({...}); ``` * Set your logging level independently ```js import { logger } from 'genkit/logging'; logger.setLogLevel('debug'); ``` See the [Monitoring and Logging](/docs/observability/getting-started) documentation for more details on how to configure telemetry and logging. See the [Get Started](/docs/get-started) documentation for more details on how to configure a Genkit instance. #### Migrate Genkit actions to be called from the `genkit` instance [Section titled “Migrate Genkit actions to be called from the genkit instance”](#migrate-genkit-actions-to-be-called-from-the-genkit-instance) Actions (flows, tools, retrievers, indexers, etc.) are defined per instance. Read the [Changelog](#changelog) for all of the features you will need to change, but here is an example of some common ones. 
```js import { genkit } from 'genkit'; import { onFlow } from '@genkit-ai/firebase/functions'; const ai = genkit({ plugins: [...]}); // Flows and tools are defined on the specific genkit instance // and are directly callable. const sampleFlow = ai.defineFlow(...); const sampleTool = ai.defineTool(...); async function callMyFlow() { // Previously, text output could accessed via .text() // Now it is either .output() or .text return await sampleFlow().output(); } // onFlow now takes the Genkit instance as first argument // This registers the flow as a callable firebase function onFlow(ai, ...); const flows = [ sampleFlow, ... ]; // Start the flow server to make the registered flows callable over HTTP ai.startFlowServer({flows}); ``` ### 5. Run it [Section titled “5. Run it”](#5-run-it) ```bash # run the DevUI and your js code genkit start -- # run a defined flow genkit flow:run ``` ## Changelog [Section titled “Changelog”](#changelog) ### 1. CLI Changes [Section titled “1. CLI Changes”](#1-cli-changes) The command-line interface (CLI) has undergone significant updates in Genkit 0.9. The command to start Genkit has changed, and the CLI has been separated into its own standalone package, which you now need to install separately. To install the CLI: ```bash npm install -g genkit-cli ``` Some changes have been made to the `genkit start` command: Starts your Genkit application code + Dev UI together: ```bash genkit start -- [start command] genkit start -- tsx src/index.ts genkit start -- go run main.go ``` Watch mode is supported as well: ```bash genkit start -- tsx --watch src/index.ts ``` Starts ONLY your application code in Genkit dev mode: ```bash genkit start --noui -- genkit start --noui -- tsx src/index.ts ``` Starts the Dev UI ONLY: ```bash genkit start ``` Previously, the `genkit start` command would start the Dev UI and your application code together. If you have any CI/CD pipelines relying on this command, you may need to update the pipeline. The Dev UI will interact directly with the flow server to figure out which flows are registered and allow you to invoke them directly with sample inputs. ### 2. Simplified packages and imports [Section titled “2. Simplified packages and imports”](#2-simplified-packages-and-imports) Previously, the Genkit libraries were separated into several modules, which you needed to install and import individually. These modules have now been consolidated into a single import. In addition, the Zod module is now re-exported by Genkit. **Old:** ```bash npm install @genkit-ai/core @genkit-ai/ai @genkit-ai/flow @genkit-ai/dotprompt ``` **New:** ```bash npm install genkit ``` **Old:** ```js import { … } from '@genkit-ai/ai'; import { … } from '@genkit-ai/core'; import { … } from '@genkit-ai/flow'; import * as z from 'zod'; ``` **New:** ```js import { genkit, z } from 'genkit'; ``` Genkit plugins still must be installed and imported individually. ### 3. Configuring Genkit [Section titled “3. Configuring Genkit”](#3-configuring-genkit) Previously, initializing Genkit was done once globally by calling the `configureGenkit` function. Genkit resources (flows, tools, prompts, etc.) would all automatically be wired with this global configuration. Genkit 0.9 introduces `Genkit` instances, each of which encapsulates a configuration. See the following examples: **Old:** ```js import { configureGenkit } from '@genkit-ai/core'; configureGenkit({ telemetry: { instrumentation: ..., logger: ... 
} }); ``` **New:** ```js import { genkit } from 'genkit'; import { logger } from 'genkit/logging'; import { enableFirebaseTelemetry } from '@genkit-ai/firebase'; logger.setLogLevel('debug'); enableFirebaseTelemetry({...}); const ai = genkit({ ... }); ``` Let’s break it down: * `configureGenkit()` has been replaced with `genkit()`, and it returns a configured `Genkit` instance rather than setting up configurations globally. * The Genkit initialization function is now in the `genkit` package. * Logging and telemetry are still configured globally using their own explicit methods. These configurations apply uniformly across all `Genkit` instances. ### 4. Defining flows and starting the flow server explicitly [Section titled “4. Defining flows and starting the flow server explicitly”](#4-defining-flows-and-starting-the-flow-server-explicitly) Now that you have a configured `Genkit` instance, you will need to define your flows. All core developer-facing API methods like `defineFlow`, `defineTool`, and `onFlow` are now invoked through this instance. This is distinct from the previous way, where flows and tools were registered globally. **Old:** ```js import { defineFlow, defineTool, onFlow } from '@genkit-ai/core'; defineFlow(...); defineTool(...); onFlow(...); ``` **New:** ```js // Define tools and flows const sampleFlow = ai.defineFlow(...); const sampleTool = ai.defineTool(...); // onFlow now takes the Genkit instance as first argument // This registers the flow as a callable firebase function onFlow(ai, ...); const flows = [ sampleFlow, ... ]; // Start the flow server to make the registered flows callable over HTTP ai.startFlowServer({flows}); ``` As of now, all flows that you want to make available need to be explicitly registered in the `flows` array above. ### 5. Tools and Prompts must be statically defined [Section titled “5. Tools and Prompts must be statically defined”](#5-tools-and-prompts-must-be-statically-defined) In earlier versions of Genkit, you could dynamically define tools and prompts at runtime, directly from within a flow. In Genkit 0.9, this behavior is no longer allowed. Instead, you need to define all actions and flows outside of the flow’s execution (i.e. statically). This change enforces a stricter separation of action definitions from execution. If any of your code is defined dynamically, they need to be refactored. Otherwise, an error will be thrown at runtime when the flow is executed. **❌ DON’T:** ```js const flow = defineFlow({...}, async (input) => { const tool = defineTool({...}); await tool(...); }); ``` **✅ DO:** ```js const tool = ai.defineTool({...}); const flow = ai.defineFlow({...}, async (input) => { await tool(...); }); ``` ### 6. New API for Streaming Flows [Section titled “6. New API for Streaming Flows”](#6-new-api-for-streaming-flows) In Genkit 0.9, we have simplified the syntax for defining a streaming flow and invoking it. First, `defineFlow` and `defineStreamingFlow` have been separated. If you have a flow that is meant to be streamed, you will have to update your code to define it via `defineStreamingFlow`. Second, instead of calling separate `stream()` and `response()` functions, both stream and response are now values returned directly from the flow. This change simplifies flow streaming. 
**Old:** ```js import { defineFlow, streamFlow } from '@genkit-ai/flow'; const myStreamingFlow = defineFlow(...); const { stream, output } = await streamFlow(myStreamingFlow, ...); for await (const chunk of stream()) { console.log(chunk); } console.log(await output()); ``` **New:** ```js const myStreamingFlow = ai.defineStreamingFlow(...); const { stream, response } = await myStreamingFlow(...); for await (const chunk of stream) { console.log(chunk); } console.log(await response); ``` ### 7. GenerateResponse class methods replaced with getter properties [Section titled “7. GenerateResponse class methods replaced with getter properties”](#7-generateresponse-class-methods-replaced-with-getter-properties) Previously, you used to access the structured output or text of the response using class methods, like `output()` or `text()`. In Genkit 0.9, those methods have been replaced by getter properties. This simplifies working with responses. **Old:** ```js const response = await generate({ prompt: 'hi' }); console.log(response.text()); ``` **New:** ```js const response = await ai.generate('hi'); console.log(response.text); ``` The same applies to `output`: **Old:** ```js console.log(response.output()); ``` **New:** ```js console.log(response.output); ``` ### 8. Candidate Generation Eliminated [Section titled “8. Candidate Generation Eliminated”](#8-candidate-generation-eliminated) Genkit 0.9 simplifies response handling by removing the `candidates` attribute. Previously, responses could contain multiple candidates, which you needed to handle explicitly. Now, only the first candidate is returned directly in a flat response. Any code that accesses the candidates directly will not work anymore. **Old:** ```js const response = await generate({ messages: [ { role: 'user', content: ...} ] }); console.log(response.candidates); // previously you could access candidates directly ``` **New:** ```js const response = await ai.generate({ messages: [ { role: 'user', content: ...} ] }); console.log(response.message); // single candidate is returned directly in a flat response ``` ### 9. Generate API - Multi-Turn enhancements [Section titled “9. Generate API - Multi-Turn enhancements”](#9-generate-api---multi-turn-enhancements) For multi-turn conversations, the old `toHistory()` method has been replaced by `messages`, further simplifying how conversation history is handled. **Old:** ```js const history = response.toHistory(); ``` **New:** ```js const response = await ai.generate({ messages: [ { role: 'user', content: ...} ] }); const history = response.messages; ``` ### 10. Streamlined Chat API [Section titled “10. Streamlined Chat API”](#10-streamlined-chat-api) In Genkit 0.9, the Chat API has been redesigned for easier session management and interaction. 
Here’s how you can leverage it for both synchronous and streaming chat experiences: ```js import { genkit } from 'genkit'; import { gemini15Flash, googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [googleAI()], model: gemini15Flash, }); const store = firestoreSessionStore(); const session = ai.createSession({ store }); const chat = await session.chat({ system: 'talk like a pirate' }); let response = await chat.send('hi, my name is Pavel'); console.log(response.text); // "hi Pavel, I'm llm" // continue the conversation response = await chat.send("what's my name"); console.log(response.text); // "Pavel" // can stream const { response: streamResponse, stream } = await chat.sendStream('bye'); for await (const chunk of stream) { console.log(chunk.text); } console.log((await streamResponse).text); // can load session from the store const prevSession = await ai.loadSession(session.id, { store }); const prevChat = await prevSession.chat(); await prevChat.send('bye'); ``` # Migrate from 0.9 to 1.0 > This guide outlines the significant changes and steps to migrate your Genkit applications from version 0.9 to 1.0, covering API channel changes, package updates, and flow modifications. Genkit 1.0 introduces many feature enhancements that improve overall functionality; it also has some breaking changes. If you have been developing applications with Genkit 0.9, you need to update your application code when you upgrade to the latest version of Genkit. This guide outlines the most significant changes, and explains how to migrate your existing applications smoothly. ## Beta APIs [Section titled “Beta APIs”](#beta-apis) We’re introducing an unstable, Beta API channel, and leaving session, chat and Genkit client APIs in beta as we continue to refine them. More specifically, the following functions are currently in the `beta` namespace: * `ai.chat` * `ai.createSession` * `ai.loadSession` * `ai.currentSession` * `ai.defineFormat` * `ai.defineInterrupt` Note: When using the APIs as part of the Beta API, you may experience breaking changes outside of SemVer. Breaking changes may occur on minor releases. **Old:** ```ts import { genkit } from 'genkit'; const ai = genkit({...}) const session = ai.createSession({ ... }) ``` **New:** ```ts import { genkit } from 'genkit/beta'; const ai = genkit({...}) const session = ai.createSession({ ... }) ``` **Old:** ```ts import { runFlow, streamFlow } from 'genkit/client'; ``` **New:** ```ts import { runFlow, streamFlow } from 'genkit/beta/client'; ``` ## Introducing new `@genkit-ai/express` package [Section titled “Introducing new @genkit-ai/express package”](#introducing-new-genkit-aiexpress-package) This new package contains utilities to make it easier to build an Express.js server with Genkit. You can find more details about this on [this page](https://js.api.genkit.dev/modules/_genkit-ai_express.html). `startFlowServer` has moved from part of the genkit object to this new `@genkit-ai/express` package; to use startFlowServer, you must update your imports. **Old:** ```ts const ai = genkit({ ... }); ai.startFlowServer({ flows: [myFlow1, myFlow2], }); ``` **New:** ```ts import { startFlowServer } from '@genkit-ai/express'; startFlowServer({ flows: [myFlow1, myFlow2], }); ``` ## Changes to Flows [Section titled “Changes to Flows”](#changes-to-flows) There are several changes to flows in 1.0: * `ai.defineStreamingFlow` has been consolidated into `ai.defineFlow`, * `onFlow` has been replaced by `onCallGenkit`, * `run` has moved to `ai.run`, * There are changes to working with auth.
The `run` function for custom trace blocks has moved to the `genkit` object; use `ai.run` to invoke it instead.

**Old:**

```ts
ai.defineFlow({ name: 'banana' }, async (input) => {
  const step = await run('myCode', async () => {
    return 'something';
  });
});
```

**New:**

```ts
ai.defineFlow({ name: 'banana' }, async (input) => {
  const step = await ai.run('myCode', async () => {
    return 'something';
  });
});
```

`ai.defineStreamingFlow` has been removed; use `ai.defineFlow` instead. Also, `streamingCallback` has moved to a field inside the second argument of the flow function and is now called `sendChunk`.

**Old:**

```ts
const flow = ai.defineStreamingFlow({ name: 'banana' }, async (input, streamingCallback) => {
  streamingCallback({ chunk: 1 });
});

const { stream } = await flow();
for await (const chunk of stream) {
  // ...
}
```

**New:**

```ts
const flow = ai.defineFlow({ name: 'banana' }, async (input, { context, sendChunk }) => {
  sendChunk({ chunk: 1 });
});

const { stream, output } = flow.stream(input);
for await (const chunk of stream) {
  // ...
}
```

FlowAuth is now called context. You can access auth as a field inside context:

**Old:**

```ts
ai.defineFlow({ name: 'banana' }, async (input) => {
  const auth = getFlowAuth();
  // ...
});
```

**New:**

```ts
ai.defineFlow({ name: 'banana' }, async (input, { context }) => {
  const auth = context.auth;
});
```

`onFlow` has moved to the `firebase-functions/https` package and has been renamed to `onCallGenkit`. The following snippet shows an example of how to use it.

**Old:**

```ts
import { onFlow } from '@genkit-ai/firebase/functions';

export const generatePoem = onFlow(
  ai,
  {
    name: 'jokeTeller',
    inputSchema: z.object({ type: z.string().nullable() }),
    outputSchema: z.object({ joke: z.string() }),
    streamSchema: z.string(),
  },
  async ({ type }, streamingCallback) => {
    const { stream, response } = await ai.generateStream(`Tell me a longish ${type ?? 'dad'} joke.`);
    for await (const chunk of stream) {
      streamingCallback(chunk.text);
    }
    return { joke: (await response).text };
  },
);
```

**New:**

```ts
import { onCallGenkit } from 'firebase-functions/https';
import { defineSecret } from 'firebase-functions/params';
import { genkit, z } from 'genkit';
import { gemini15Flash, googleAI } from '@genkit-ai/googleai';

const apiKey = defineSecret('GEMINI_API_KEY');

const ai = genkit({
  plugins: [googleAI()],
  model: gemini15Flash,
});

export const jokeTeller = ai.defineFlow(
  {
    name: 'jokeTeller',
    inputSchema: z.object({ type: z.string().nullable() }),
    outputSchema: z.object({ joke: z.string() }),
    streamSchema: z.string(),
  },
  async ({ type }, { sendChunk }) => {
    const { stream, response } = ai.generateStream(`Tell me a longish ${type ?? 'dad'} joke.`);
    for await (const chunk of stream) {
      sendChunk(chunk.text);
    }
    return { joke: (await response).text };
  },
);

export const tellJoke = onCallGenkit({ secrets: [apiKey] }, jokeTeller);
```

Auth policies have been removed from `defineFlow`. Handling of auth policies is now server-dependent.

**Old:**

```ts
export const simpleFlow = ai.defineFlow(
  {
    name: 'simpleFlow',
    authPolicy: (auth, input) => {
      // auth policy
    },
  },
  async (input) => {
    // Flow logic here...
  },
);
```

The following snippet shows an example of handling auth in Express.
**New:**

```ts
import express from 'express';
import { UserFacingError } from 'genkit';
import { ContextProvider, RequestData } from 'genkit/context';
import { expressHandler, startFlowServer, withContextProvider } from '@genkit-ai/express';

const context: ContextProvider = (req: RequestData) => {
  return {
    auth: parseAuthToken(req.headers['authorization']),
  };
};

export const simpleFlow = ai.defineFlow(
  {
    name: 'simpleFlow',
  },
  async (input, { context }) => {
    if (!context.auth) {
      throw new UserFacingError("UNAUTHORIZED", "Authorization required.");
    }
    if (input.uid !== context.auth.uid) {
      throw new UserFacingError("UNAUTHORIZED", "You may only summarize your own profile data.");
    }
    // Flow logic here...
  }
);

const app = express();
app.use(express.json());
app.post(
  '/simpleFlow',
  expressHandler(simpleFlow, { context })
);
app.listen(8080);

// or

startFlowServer({
  flows: [withContextProvider(simpleFlow, context)],
  port: 8080,
});
```

For more details, refer to the [auth documentation](/docs/auth).

The following snippet shows an example of handling auth in Cloud Functions for Firebase:

```ts
import { genkit } from 'genkit';
import { onCallGenkit } from 'firebase-functions/https';

const ai = genkit({ ... });

const simpleFlow = ai.defineFlow({
  name: 'simpleFlow',
}, async (input) => {
  // Flow logic here...
});

export const selfSummary = onCallGenkit({
  authPolicy: (auth, data) => auth?.token?.['email_verified'] && auth?.token?.['admin'],
}, simpleFlow);
```

## Prompts

[Section titled “Prompts”](#prompts)

We’ve made several changes and improvements to prompts.

You can define separate templates for prompt and system messages:

```ts
const hello = ai.definePrompt({
  name: 'hello',
  system: 'talk like a pirate.',
  prompt: 'hello {{ name }}',
  input: {
    schema: z.object({
      name: z.string(),
    }),
  },
});

const { text } = await hello({ name: 'Genkit' });
```

Alternatively, you can define multi-message prompts in the messages field:

```ts
const hello = ai.definePrompt({
  name: 'hello',
  messages: '{{ role "system" }} talk like a pirate. {{ role "user" }} hello {{ name }}',
  input: {
    schema: z.object({
      name: z.string(),
    }),
  },
});
```

Instead of a prompt template, you can use a function:

```ts
ai.definePrompt({
  name: 'hello',
  prompt: async (input, { context }) => {
    return `hello ${input.name}`;
  },
  input: {
    schema: z.object({
      name: z.string(),
    }),
  },
});
```

You can access the context (including auth information) from within the prompt:

```ts
const hello = ai.definePrompt({
  name: 'hello',
  messages: 'hello {{ @auth.email }}',
});
```

## Streaming functions do not require an `await`

[Section titled “Streaming functions do not require an await”](#streaming-functions-do-not-require-an-await)

**Old:**

```ts
const { stream, response } = await ai.generateStream(`hi`);
const { stream, output } = await myflow.stream(`hi`);
```

**New:**

```ts
const { stream, response } = ai.generateStream(`hi`);
const { stream, output } = myflow.stream(`hi`);
```

## Embed has a new return type

[Section titled “Embed has a new return type”](#embed-has-a-new-return-type)

We’ve added support for multimodal embeddings. Instead of returning just a single embedding vector, Embed returns an array of embedding objects, each containing an embedding vector and metadata.
**Old:**

```ts
const response = await ai.embed({ embedder, content, options }); // returns number[]
```

**New:**

```ts
const response = await ai.embed({ embedder, content, options }); // returns Embedding[]
const firstEmbeddingVector = response[0].embedding; // is number[]
```

# Generating content with AI models

> Learn how to generate content with AI models using Genkit's unified interface, covering basic usage, configuration, structured output, streaming, and multimodal input/output.

**TL;DR** (an LLM-friendly summary of this page):

Genkit provides a unified interface to interact with various generative AI models (LLMs, image generation).

**Core Function:** `ai.generate()`

**Basic Usage:**

```typescript
import { googleAI } from '@genkit-ai/googleai';
import { genkit } from 'genkit';

const ai = genkit({
  plugins: [googleAI()],
  model: googleAI.model('gemini-2.5-flash'), // Default model
});

// Generate with default model
const response1 = await ai.generate('prompt text');
console.log(response1.text);

// Generate with specific model reference
const response2 = await ai.generate({
  model: googleAI.model('gemini-2.5-flash'),
  prompt: 'prompt text',
});
console.log(response2.text);

// Generate with model string ID
const response3 = await ai.generate({
  model: 'googleai/gemini-2.5-flash',
  prompt: 'prompt text',
});
console.log(response3.text);
```

**Configuration:**

* **System Prompt:** `system: "Instruction for the model"`
* **Model Parameters:** `config: { maxOutputTokens: 512, temperature: 1.0, topP: 0.95, topK: 40, stopSequences: ["\n"] }`

**Structured Output (using Zod):**

```typescript
import { z } from 'genkit';

const MenuItemSchema = z.object({
  name: z.string().describe('The name of the menu item.'),
  description: z.string().describe('A description of the menu item.'),
  calories: z.number().describe('The estimated number of calories.'),
  allergens: z.array(z.string()).describe('Any known allergens in the menu item.'),
});

const response = await ai.generate({
  prompt: 'Suggest a menu item.',
  output: { schema: MenuItemSchema },
});

const menuItem = response.output; // Typed output, might be null if validation fails
if (menuItem) {
  console.log(menuItem.name);
}
```

**Streaming:**

```typescript
const { stream, response } = ai.generateStream({
  prompt: 'Tell a story.',
  // Can also include output schema for streaming structured data
  // output: { schema: z.array(MenuItemSchema) },
});

// Stream text chunks
for await (const chunk of stream) {
  console.log(chunk.text); // For structured: chunk.output (accumulated)
}

// Get final complete response
const finalResponse = await response;
console.log(finalResponse.text); // For structured: finalResponse.output
```

**Multimodal Input:**

```typescript
import { readFile } from 'node:fs/promises';

// From URL
const response1 = await ai.generate({
  prompt: [{ media: { url: 'https://.../image.jpg' } }, { text: 'Describe this image.' }],
});

// From local file (data URL)
const data = await readFile('image.jpg');
const response2 = await ai.generate({
  prompt: [{ media: { url: `data:image/jpeg;base64,${data.toString('base64')}` } }, { text: 'Describe this image.'
}], }); ``` **Media Generation (e.g., Images):** ```typescript import { vertexAI } from '@genkit-ai/vertexai'; // Example image model import { parseDataUrl } from 'data-urls'; import { writeFile } from 'node:fs/promises'; const response = await ai.generate({ model: vertexAI.model('imagen-3.0-fast-generate-001'), prompt: 'Image description', output: { format: 'media' }, // Request media output }); const imagePart = response.output; if (imagePart?.media?.url) { // URL is typically a data: URL const parsed = parseDataUrl(imagePart.media.url); if (parsed) { await writeFile('output.png', parsed.body); } } ``` **Supported Model Plugins (Examples):** * Vertex AI (`@genkit-ai/vertexai`): Gemini, Imagen, Claude on Vertex * Google AI (`@genkit-ai/googleai`): Gemini * OpenAI (`@genkit-ai/compat-oai/openai`): GPT, Dall-E, Whisper on OpenAI * xAI (`@genkit-ai/compat-oai/xai`): Grok on xAI * DeepSeek (`@genkit-ai/compat-oai/deepseek`): DeepSeek Chat, Reasoner on DeepSeek * Ollama (`@genkit-ai/ollama`): Llama 3, Gemma 2, etc. (self-hosted) * Community: Anthropic, Azure OpenAI, Cohere, Mistral, Groq **Key Concepts:** * **Flexibility:** Easily swap models (`model` parameter). * **Zod:** For defining and validating structured output schemas. * **Streaming:** For real-time output using `generateStream`. * **Multimodality:** Handle text, image, video, audio inputs (model-dependent). * **Media Generation:** Create images, etc. (model-dependent). At the heart of generative AI are AI *models*. Currently, the two most prominent examples of generative models are large language models (LLMs) and image generation models. These models take input, called a *prompt* (most commonly text, an image, or a combination of both), and from it produce as output text, an image, or even audio or video. The output of these models can be surprisingly convincing: LLMs generate text that appears as though it could have been written by a human being, and image generation models can produce images that are very close to real photographs or artwork created by humans. In addition, LLMs have proven capable of tasks beyond simple text generation: * Writing computer programs * Planning subtasks that are required to complete a larger task * Organizing unorganized data * Understanding and extracting information data from a corpus of text * Following and performing automated activities based on a text description of the activity There are many models available to you, from several different providers. Each model has its own strengths and weaknesses and one model might excel at one task but perform less well at others. Apps making use of generative AI can often benefit from using multiple different models depending on the task at hand. As an app developer, you typically don’t interact with generative AI models directly, but rather through services available as web APIs. Although these services often have similar functionality, they all provide them through different and incompatible APIs. If you want to make use of multiple model services, you have to use each of their proprietary SDKs, potentially incompatible with each other. And if you want to upgrade from one model to the newest and most capable one, you might have to build that integration all over again. Genkit addresses this challenge by providing a single interface that abstracts away the details of accessing potentially any generative AI model service, with several pre-built implementations already available. 
Building your AI-powered app around Genkit simplifies the process of making your first generative AI call and makes it equally easy to combine multiple models or swap one model for another as new models emerge.

### Before you begin

[Section titled “Before you begin”](#before-you-begin)

If you want to run the code examples on this page, first complete the steps in the [Getting started](/docs/get-started) guide. All of the examples assume that you have already installed Genkit as a dependency in your project.

### Models supported by Genkit

[Section titled “Models supported by Genkit”](#models-supported-by-genkit)

Genkit is designed to be flexible enough to use potentially any generative AI model service. Its core libraries define the common interface for working with models, and model plugins define the implementation details for working with a specific model and its API.

The Genkit team maintains plugins for working with models provided by Vertex AI, Google Generative AI, OpenAI, xAI, DeepSeek, and Ollama:

* Gemini family of LLMs, through the [Google Cloud Vertex AI plugin](/docs/plugins/vertex-ai)
* Gemini family of LLMs, through the [Google AI plugin](/docs/plugins/google-genai)
* Imagen 2 and Imagen 3 image generation models, through Google Cloud Vertex AI
* Anthropic’s Claude 3 family of LLMs, through Google Cloud Vertex AI’s model garden
* Gemma 2, Llama 3, and many more open models, through the [Ollama plugin](/docs/plugins/ollama) (you must host the Ollama server yourself)
* GPT, Dall-E, and Whisper families of models, through the [OpenAI plugin](/docs/plugins/openai)
* Grok family of models, through the [xAI plugin](/docs/plugins/xai)
* DeepSeek Chat and DeepSeek Reasoner models, through the [DeepSeek plugin](/docs/plugins/deepseek)

In addition, there are several community-supported plugins that provide interfaces to more models and providers:

* Claude 3 family of LLMs, through the [Anthropic plugin](https://thefireco.github.io/genkit-plugins/docs/plugins/genkitx-anthropic)
* GPT family of LLMs, through the [Azure OpenAI plugin](https://thefireco.github.io/genkit-plugins/docs/plugins/genkitx-azure-openai)
* Command R family of LLMs, through the [Cohere plugin](https://thefireco.github.io/genkit-plugins/docs/plugins/genkitx-cohere)
* Mistral family of LLMs, through the [Mistral plugin](https://thefireco.github.io/genkit-plugins/docs/plugins/genkitx-mistral)
* Gemma 2, Llama 3, and many more open models hosted on Groq, through the [Groq plugin](https://thefireco.github.io/genkit-plugins/docs/plugins/genkitx-groq)

You can discover more by searching for [packages tagged with `genkit-model` on npmjs.org](https://www.npmjs.com/search?q=keywords%3Agenkit-model).

### Loading and configuring model plugins

[Section titled “Loading and configuring model plugins”](#loading-and-configuring-model-plugins)

Before you can use Genkit to start generating content, you need to load and configure a model plugin. If you’re coming from the Getting Started guide, you’ve already done this. Otherwise, see the [Getting Started](/docs/get-started) guide or the individual plugin’s documentation and follow the steps there before continuing.

### The generate() method

[Section titled “The generate() method”](#the-generate-method)

In Genkit, the primary interface through which you interact with generative AI models is the `generate()` method.
The simplest `generate()` call specifies the model you want to use and a text prompt:

```ts
import { googleAI } from '@genkit-ai/googleai';
import { genkit } from 'genkit';

const ai = genkit({
  plugins: [googleAI()],
  // Optional. Specify a default model.
  model: googleAI.model('gemini-2.5-flash'),
});

async function run() {
  const response = await ai.generate('Invent a menu item for a restaurant with a pirate theme.');
  console.log(response.text);
}

run();
```

When you run this brief example, it will print out some debugging information followed by the output of the `generate()` call, which will usually be Markdown text, as in the following example:

```md
## The Blackheart's Bounty

**A hearty stew of slow-cooked beef, spiced with rum and molasses, served in a hollowed-out cannonball with a side of crusty bread and a dollop of tangy pineapple salsa.**

**Description:** This dish is a tribute to the hearty meals enjoyed by pirates on the high seas. The beef is tender and flavorful, infused with the warm spices of rum and molasses. The pineapple salsa adds a touch of sweetness and acidity, balancing the richness of the stew. The cannonball serving vessel adds a fun and thematic touch, making this dish a perfect choice for any pirate-themed adventure.
```

Run the script again and you’ll get a different output.

The preceding code sample sent the generation request to the default model, which you specified when you configured the Genkit instance. You can also specify a model for a single `generate()` call:

```ts
import { googleAI } from '@genkit-ai/googleai';

const response = await ai.generate({
  model: googleAI.model('gemini-2.5-flash'),
  prompt: 'Invent a menu item for a restaurant with a pirate theme.',
});
```

This example uses a model reference function provided by the model plugin. Model references carry static type information about the model and its options, which can be useful for code completion in the IDE and at compile time. Many plugins use this pattern, but not all, so in cases where they don’t, refer to the plugin documentation for their preferred way to create model references.

Sometimes you may see code samples where model references are imported as constants:

```ts
import { googleAI, gemini20Flash } from '@genkit-ai/googleai';

const ai = genkit({
  plugins: [googleAI()],
  model: gemini20Flash,
});
```

Some plugins still use this pattern. For plugins that have switched to the new syntax, those constants are still there and continue to work, but constants for future models might not be added.

Another option is to specify the model using a string identifier. This approach works for all plugins regardless of how they handle typed model references; however, you won’t have the help of static type checking:

```ts
const response = await ai.generate({
  model: 'googleai/gemini-2.5-flash-001',
  prompt: 'Invent a menu item for a restaurant with a pirate theme.',
});
```

A model string identifier looks like `providerid/modelid`, where the provider ID (in this case, `googleai`) identifies the plugin, and the model ID is a plugin-specific string identifier for a specific version of a model.

Some model plugins, such as the Ollama plugin, provide access to potentially dozens of different models and therefore do not export individual model references. In these cases, you can only specify a model to `generate()` using its string identifier.
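For example, with the Ollama plugin configured you might reference a locally hosted model by its string ID. This is only a sketch; the exact model ID depends on which models your Ollama server actually hosts:

```ts
const response = await ai.generate({
  // 'ollama' is the provider ID; 'gemma' is a hypothetical model ID that must
  // match a model pulled on your Ollama server.
  model: 'ollama/gemma',
  prompt: 'Invent a menu item for a restaurant with a pirate theme.',
});
```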
These examples also illustrate an important point: when you use `generate()` to make generative AI model calls, changing the model you want to use is simply a matter of passing a different value to the model parameter. By using `generate()` instead of the native model SDKs, you give yourself the flexibility to more easily use several different models in your app and change models in the future. So far you have only seen examples of the simplest `generate()` calls. However, `generate()` also provides an interface for more advanced interactions with generative models, which you will see in the sections that follow. ### System prompts [Section titled “System prompts”](#system-prompts) Some models support providing a *system prompt*, which gives the model instructions as to how you want it to respond to messages from the user. You can use the system prompt to specify a persona you want the model to adopt, the tone of its responses, the format of its responses, and so on. If the model you’re using supports system prompts, you can provide one with the `system` parameter: ```ts const response = await ai.generate({ prompt: 'What is your quest?', system: "You are a knight from Monty Python's Flying Circus.", }); ``` ### Multi-turn conversations with messages [Section titled “Multi-turn conversations with messages”](#multi-turn-conversations-with-messages) For multi-turn conversations, you can use the `messages` parameter instead of `prompt` to provide a conversation history. This is particularly useful when you need to maintain context across multiple interactions with the model. The `messages` parameter accepts an array of message objects, where each message has a `role` (one of `'system'`, `'user'`, `'model'`, or `'tool'`) and `content`: ```ts const response = await ai.generate({ messages: [ { role: 'user', content: 'Hello, can you help me plan a trip?' }, { role: 'model', content: 'Of course! I\'d be happy to help you plan a trip. Where are you thinking of going?' }, { role: 'user', content: 'I want to visit Japan for two weeks in spring.' } ], }); ``` You can also combine `messages` with other parameters like `system` prompts: ```ts const response = await ai.generate({ system: 'You are a helpful travel assistant.', messages: [ { role: 'user', content: 'What should I pack for Japan in spring?' } ], }); ``` **When to use `messages` vs. Chat API:** * Use the `messages` parameter for simple multi-turn conversations where you manually manage the conversation history * For persistent chat sessions with automatic history management, use the [Chat API](/docs/chat) instead ### Model parameters [Section titled “Model parameters”](#model-parameters) The `generate()` function takes a `config` parameter, through which you can specify optional settings that control how the model generates content: ```ts const response = await ai.generate({ prompt: 'Invent a menu item for a restaurant with a pirate theme.', config: { maxOutputTokens: 512, stopSequences: ['\n'], temperature: 1.0, topP: 0.95, topK: 40, }, }); ``` The exact parameters that are supported depend on the individual model and model API. However, the parameters in the previous example are common to almost every model. The following is an explanation of these parameters: #### Parameters that control output length [Section titled “Parameters that control output length”](#parameters-that-control-output-length) **maxOutputTokens** LLMs operate on units called *tokens*. A token usually, but does not necessarily, map to a specific sequence of characters. 
When you pass a prompt to a model, one of the first steps it takes is to *tokenize* your prompt string into a sequence of tokens. Then, the LLM generates a sequence of tokens from the tokenized input. Finally, the sequence of tokens gets converted back into text, which is your output. The maximum output tokens parameter simply sets a limit on how many tokens to generate using the LLM. Every model potentially uses a different tokenizer, but a good rule of thumb is to consider a single English word to be made of 2 to 4 tokens. As stated earlier, some tokens might not map to character sequences. One such example is that there is often a token that indicates the end of the sequence: when an LLM generates this token, it stops generating more. Therefore, it’s possible and often the case that an LLM generates fewer tokens than the maximum because it generated the “stop” token. **stopSequences** You can use this parameter to set the tokens or token sequences that, when generated, indicate the end of LLM output. The correct values to use here generally depend on how the model was trained, and are usually set by the model plugin. However, if you have prompted the model to generate another stop sequence, you might specify it here. Note that you are specifying character sequences, and not tokens per se. In most cases, you will specify a character sequence that the model’s tokenizer maps to a single token. #### Parameters that control “creativity” [Section titled “Parameters that control “creativity””](#parameters-that-control-creativity) The *temperature*, *top-p*, and *top-k* parameters together control how “creative” you want the model to be. Below are very brief explanations of what these parameters mean, but the more important point to take away is this: these parameters are used to adjust the character of an LLM’s output. The optimal values for them depend on your goals and preferences, and are likely to be found only through experimentation. **temperature** LLMs are fundamentally token-predicting machines. For a given sequence of tokens (such as the prompt) an LLM predicts, for each token in its vocabulary, the likelihood that the token comes next in the sequence. The temperature is a scaling factor by which these predictions are divided before being normalized to a probability between 0 and 1. Low temperature values—between 0.0 and 1.0—amplify the difference in likelihoods between tokens, with the result that the model will be even less likely to produce a token it already evaluated to be unlikely. This is often perceived as output that is less creative. Although 0.0 is technically not a valid value, many models treat it as indicating that the model should behave deterministically, and to only consider the single most likely token. High temperature values—those greater than 1.0—compress the differences in likelihoods between tokens, with the result that the model becomes more likely to produce tokens it had previously evaluated to be unlikely. This is often perceived as output that is more creative. Some model APIs impose a maximum temperature, often 2.0. **topP** *Top-p* is a value between 0.0 and 1.0 that controls the number of possible tokens you want the model to consider, by specifying the cumulative probability of the tokens. For example, a value of 1.0 means to consider every possible token (but still take into account the probability of each token). 
A value of 0.4 means to only consider the most likely tokens, whose probabilities add up to 0.4, and to exclude the remaining tokens from consideration. **topK** *Top-k* is an integer value that also controls the number of possible tokens you want the model to consider, but this time by explicitly specifying the maximum number of tokens. Specifying a value of 1 means that the model should behave deterministically. #### Experiment with model parameters [Section titled “Experiment with model parameters”](#experiment-with-model-parameters) You can experiment with the effect of these parameters on the output generated by different model and prompt combinations by using the Developer UI. Start the developer UI with the `genkit start` command and it will automatically load all of the models defined by the plugins configured in your project. You can quickly try different prompts and configuration values without having to repeatedly make these changes in code. ### Structured output [Section titled “Structured output”](#structured-output) [Genkit by Example: Structured Output ](https://examples.genkit.dev/structured-output?utm_source=genkit.dev\&utm_content=contextlink)View a live example of using structured output to generate a D\&D character sheet. When using generative AI as a component in your application, you often want output in a format other than plain text. Even if you’re just generating content to display to the user, you can benefit from structured output simply for the purpose of presenting it more attractively to the user. But for more advanced applications of generative AI, such as programmatic use of the model’s output, or feeding the output of one model into another, structured output is a must. In Genkit, you can request structured output from a model by specifying a schema when you call `generate()`: ```ts import { z } from 'genkit'; ``` ```ts const MenuItemSchema = z.object({ name: z.string().describe('The name of the menu item.'), description: z.string().describe('A description of the menu item.'), calories: z.number().describe('The estimated number of calories.'), allergens: z.array(z.string()).describe('Any known allergens in the menu item.'), }); const response = await ai.generate({ prompt: 'Suggest a menu item for a pirate-themed restaurant.', output: { schema: MenuItemSchema }, }); ``` Model output schemas are specified using the [Zod](https://zod.dev/) library. In addition to a schema definition language, Zod also provides runtime type checking, which bridges the gap between static TypeScript types and the unpredictable output of generative AI models. Zod lets you write code that can rely on the fact that a successful generate call will always return output that conforms to your TypeScript types. When you specify a schema in `generate()`, Genkit does several things behind the scenes: * Augments the prompt with additional guidance about the desired output format. This also has the side effect of specifying to the model what content exactly you want to generate (for example, not only suggest a menu item but also generate a description, a list of allergens, and so on). * Parses the model output into a JavaScript object. * Verifies that the output conforms with the schema. 
To get structured output from a successful generate call, use the response object’s `output` property:

```ts
const menuItem = response.output; // Typed as z.infer<typeof MenuItemSchema>
console.log(menuItem?.name);
```

#### Handling errors

[Section titled “Handling errors”](#handling-errors)

Note in the prior example that the `output` property can be `null`. This can happen when the model fails to generate output that conforms to the schema. The best strategy for dealing with such errors will depend on your exact use case, but here are some general hints:

* **Try a different model**. For structured output to succeed, the model must be capable of generating output in JSON. The most powerful LLMs, like Gemini and Claude, are versatile enough to do this; however, smaller models, such as some of the local models you would use with Ollama, might not be able to generate structured output reliably unless they have been specifically trained to do so.

* **Make use of Zod’s coercion abilities**: You can specify in your schemas that Zod should try to coerce non-conforming types into the type specified by the schema. If your schema includes primitive types other than strings, using Zod coercion can reduce the number of `generate()` failures you experience. The following version of `MenuItemSchema` uses type coercion to automatically correct situations where the model generates calorie information as a string instead of a number:

  ```ts
  const MenuItemSchema = z.object({
    name: z.string().describe('The name of the menu item.'),
    description: z.string().describe('A description of the menu item.'),
    calories: z.coerce.number().describe('The estimated number of calories.'),
    allergens: z.array(z.string()).describe('Any known allergens in the menu item.'),
  });
  ```

* **Retry the generate() call**. If the model you’ve chosen only rarely fails to generate conformant output, you can treat the error as you would treat a network error, and simply retry the request using some kind of incremental back-off strategy.

### Streaming

[Section titled “Streaming”](#streaming)

When generating large amounts of text, you can improve the experience for your users by presenting the output as it’s generated—streaming the output. A familiar example of streaming in action can be seen in most LLM chat apps: users can read the model’s response to their message as it’s being generated, which improves the perceived responsiveness of the application and enhances the illusion of chatting with an intelligent counterpart.

In Genkit, you can stream output using the `generateStream()` method.
Its syntax is similar to the `generate()` method: ```ts const { stream, response } = ai.generateStream({ prompt: 'Tell me a story about a boy and his dog.', }); ``` The response object has a `stream` property, which you can use to iterate over the streaming output of the request as it’s generated: ```ts for await (const chunk of stream) { console.log(chunk.text); } ``` You can also get the complete output of the request, as you can with a non-streaming request: ```ts const finalResponse = await response; console.log(finalResponse.text); ``` Streaming also works with structured output: ```ts const { stream, response } = ai.generateStream({ prompt: 'Suggest three pirate-themed menu items.', output: { schema: z.array(MenuItemSchema) }, }); for await (const chunk of stream) { console.log(chunk.output); } const finalResponse = await response; console.log(finalResponse.output); ``` Streaming structured output works a little differently from streaming text: the `output` property of a response chunk is an object constructed from the accumulation of the chunks that have been produced so far, rather than an object representing a single chunk (which might not be valid on its own). **Every chunk of structured output in a sense supersedes the chunk that came before it**. For example, here’s what the first five outputs from the prior example might look like: ```js null; { starters: [{}]; } { starters: [{ name: "Captain's Treasure Chest", description: 'A' }]; } { starters: [ { name: "Captain's Treasure Chest", description: 'A mix of spiced nuts, olives, and marinated cheese served in a treasure chest.', calories: 350, }, ]; } { starters: [ { name: "Captain's Treasure Chest", description: 'A mix of spiced nuts, olives, and marinated cheese served in a treasure chest.', calories: 350, allergens: [Array], }, { name: 'Shipwreck Salad', description: 'Fresh' }, ]; } ``` ### Multimodal input [Section titled “Multimodal input”](#multimodal-input) [Genkit by Example: Image Analysis ](https://examples.genkit.dev/image-analysis?utm_source=genkit.dev\&utm_content=contextlink)See a live demo of how Genkit can enable image analysis using multimodal input. The examples you’ve seen so far have used text strings as model prompts. While this remains the most common way to prompt generative AI models, many models can also accept other media as prompts. Media prompts are most often used in conjunction with text prompts that instruct the model to perform some operation on the media, such as to caption an image or transcribe an audio recording. The ability to accept media input and the types of media you can use are completely dependent on the model and its API. For example, the Gemini 1.5 series of models can accept images, video, and audio as prompts. To provide a media prompt to a model that supports it, instead of passing a simple text prompt to `generate`, pass an array consisting of a media part and a text part: ```ts const response = await ai.generate({ prompt: [{ media: { url: 'https://.../image.jpg' } }, { text: 'What is in this image?' }], }); ``` In the above example, you specified an image using a publicly-accessible HTTPS URL. You can also pass media data directly by encoding it as a data URL. For example: ```ts import { readFile } from 'node:fs/promises'; ``` ```ts const data = await readFile('image.jpg'); const response = await ai.generate({ prompt: [{ media: { url: `data:image/jpeg;base64,${data.toString('base64')}` } }, { text: 'What is in this image?' 
}], }); ``` All models that support media input support both data URLs and HTTPS URLs. Some model plugins add support for other media sources. For example, the Vertex AI plugin also lets you use Cloud Storage (`gs://`) URLs. ### Generating Media [Section titled “Generating Media”](#generating-media) While most examples in this guide focus on generating text with LLMs, Genkit also supports generating other types of media, including **images** and **audio**. Thanks to its unified `generate()` interface, working with media models is just as straightforward as generating text. Note Genkit returns generated media as a **data URL**, a widely supported format for handling binary media in both browsers and Node.js environments. #### Image Generation [Section titled “Image Generation”](#image-generation) To generate an image using a model like Imagen from Vertex AI, follow these steps: 1. **Install a data URL parser.** Genkit outputs media as data URLs, so you’ll need to decode them before saving to disk. This example uses [`data-urls`](https://www.npmjs.com/package/data-urls): ```bash npm install data-urls npm install --save-dev @types/data-urls ``` 2. **Generate the image and save it to a file:** ```ts import { vertexAI } from '@genkit-ai/vertexai'; import { parseDataUrl } from 'data-urls'; import { writeFile } from 'node:fs/promises'; const response = await ai.generate({ model: vertexAI.model('imagen-3.0-fast-generate-001'), prompt: 'An illustration of a dog wearing a space suit, photorealistic', output: { format: 'media' }, }); const imagePart = response.output; if (imagePart?.media?.url) { const parsed = parseDataUrl(imagePart.media.url); if (parsed) { await writeFile('dog.png', parsed.body); } } ``` This will generate an image and save it as a PNG file named `dog.png`. #### Audio Generation [Section titled “Audio Generation”](#audio-generation) You can also use Genkit to generate audio with a text-to-speech (TTS) models. This is especially useful for voice features, narration, or accessibility support. Here’s how to convert text into speech and save it as an audio file: ```ts import { googleAI } from '@genkit-ai/googleai'; import { writeFile } from 'node:fs/promises'; import { Buffer } from 'node:buffer'; const response = await ai.generate({ model: googleAI.model('gemini-2.5-flash-preview-tts'), // Gemini-specific configuration for audio generation // Available configuration options will depend on model and provider config: { responseModalities: ['AUDIO'], speechConfig: { voiceConfig: { prebuiltVoiceConfig: { voiceName: 'Algenib' }, }, }, }, prompt: 'Say that Genkit is an amazing AI framework', }); // Handle the audio data (returned as a data URL) if (response.media?.url) { // Extract base64 data from the data URL const audioBuffer = Buffer.from( response.media.url.substring(response.media.url.indexOf(',') + 1), 'base64' ); // Save to a file await writeFile('output.wav', audioBuffer); } ``` This code generates speech using the Gemini TTS model and saves the result to a file named `output.wav`. ### Next steps [Section titled “Next steps”](#next-steps) #### Learn more about Genkit [Section titled “Learn more about Genkit”](#learn-more-about-genkit) * As an app developer, the primary way you influence the output of generative AI models is through prompting. Read [Prompt management](/docs/dotprompt) to learn how Genkit helps you develop effective prompts and manage them in your codebase. 
* Although `generate()` is the nucleus of every generative AI powered application, real-world applications usually require additional work before and after invoking a generative AI model. To reflect this, Genkit introduces the concept of *flows*, which are defined like functions but add additional features such as observability and simplified deployment. To learn more, see [Defining workflows](/docs/flows). #### Advanced LLM use [Section titled “Advanced LLM use”](#advanced-llm-use) * Many of your users will have interacted with large language models for the first time through chatbots. Although LLMs are capable of much more than simulating conversations, it remains a familiar and useful style of interaction. Even when your users will not be interacting directly with the model in this way, the conversational style of prompting is a powerful way to influence the output generated by an AI model. Read [Multi-turn chats](/docs/chat) to learn how to use Genkit as part of an LLM chat implementation. * One way to enhance the capabilities of LLMs is to prompt them with a list of ways they can request more information from you, or request you to perform some action. This is known as *tool calling* or *function calling*. Models that are trained to support this capability can respond to a prompt with a specially-formatted response, which indicates to the calling application that it should perform some action and send the result back to the LLM along with the original prompt. Genkit has library functions that automate both the prompt generation and the call-response loop elements of a tool calling implementation. See [Tool calling](/docs/tool-calling) to learn more. * Retrieval-augmented generation (RAG) is a technique used to introduce domain-specific information into a model’s output. This is accomplished by inserting relevant information into a prompt before passing it on to the language model. A complete RAG implementation requires you to bring several technologies together: text embedding generation models, vector databases, and large language models. See [Retrieval-augmented generation (RAG)](/docs/rag) to learn how Genkit simplifies the process of coordinating these various elements. #### Testing model output [Section titled “Testing model output”](#testing-model-output) As a software engineer, you’re used to deterministic systems where the same input always produces the same output. However, with AI models being probabilistic, the output can vary based on subtle nuances in the input, the model’s training data, and even randomness deliberately introduced by parameters like temperature. Genkit’s evaluators are structured ways to assess the quality of your LLM’s responses, using a variety of strategies. Read more on the [Evaluation](/docs/evaluation) page. # Building multi-agent systems > Learn how to build multi-agent systems in Genkit by delegating tasks to specialized agents, addressing challenges of complex agentic workflows. Beta This feature of Genkit is in **Beta,** which means it is not yet part of Genkit’s stable API. APIs of beta features may change in minor version releases. A powerful application of large language models are LLM-powered agents. An agent is a system that can carry out complex tasks by planning how to break tasks into smaller ones, and (with the help of [tool calling](/docs/tool-calling)) execute tasks that interact with external resources such as databases or even physical devices. 
Here are some excerpts from a very simple customer service agent built using a single prompt and several tools: ```typescript const menuLookupTool = ai.defineTool( { name: 'menuLookupTool', description: 'use this tool to look up the menu for a given date', inputSchema: z.object({ date: z.string().describe('the date to look up the menu for'), }), outputSchema: z.string().describe('the menu for a given date'), }, async (input) => { // Retrieve the menu from a database, website, etc. // ... }, ); const reservationTool = ai.defineTool( { name: 'reservationTool', description: 'use this tool to try to book a reservation', inputSchema: z.object({ partySize: z.coerce.number().describe('the number of guests'), date: z.string().describe('the date to book for'), }), outputSchema: z .string() .describe( "true if the reservation was successfully booked and false if there's" + ' no table available for the requested time', ), }, async (input) => { // Access your database to try to make the reservation. // ... }, ); ``` ```typescript const chat = ai.chat({ model: googleAI.model('gemini-2.5-flash'), system: "You are an AI customer service agent for Pavel's Cafe. Use the tools " + 'available to you to help the customer. If you cannot help the ' + 'customer with the available tools, politely explain so.', tools: [menuLookupTool, reservationTool], }); ``` A simple architecture like the one shown above can be sufficient when your agent only has a few capabilities. However, even for the limited example above, you can see that there are some capabilities that customers would likely expect: for example, listing the customer’s current reservations, canceling a reservation, and so on. As you build more and more tools to implement these additional capabilities, you start to run into some problems: * The more tools you add, the more you stretch the model’s ability to consistently and correctly employ the right tool for the job. * Some tasks might best be served through a more focused back and forth between the user and the agent, rather than by a single tool call. * Some tasks might benefit from a specialized prompt. For example, if your agent is responding to an unhappy customer, you might want its tone to be more business-like, whereas the agent that greets the customer initially can have a more friendly and lighthearted tone. One approach you can use to deal with these issues that arise when building complex agents is to create many specialized agents and use a general purpose agent to delegate tasks to them. Genkit supports this architecture by allowing you to specify prompts as tools. Each prompt represents a single specialized agent, with its own set of tools available to it, and those agents are in turn available as tools to your single orchestration agent, which is the primary interface with the user. 
Here’s what an expanded version of the previous example might look like as a multi-agent system: ```typescript // Define a prompt that represents a specialist agent const reservationAgent = ai.definePrompt({ name: 'reservationAgent', description: 'Reservation Agent can help manage guest reservations', tools: [reservationTool, reservationCancelationTool, reservationListTool], system: 'Help guests make and manage reservations', }); // Or load agents from .prompt files const menuInfoAgent = ai.prompt('menuInfoAgent'); const complaintAgent = ai.prompt('complaintAgent'); // The triage agent is the agent that users interact with initially const triageAgent = ai.definePrompt({ name: 'triageAgent', description: 'Triage Agent', tools: [reservationAgent, menuInfoAgent, complaintAgent], system: `You are an AI customer service agent for Pavel's Cafe. Greet the user and ask them how you can help. If appropriate, transfer to an agent that can better handle the request. If you cannot help the customer with the available tools, politely explain so.`, }); ``` ```typescript // Start a chat session, initially with the triage agent const chat = ai.chat(triageAgent); ``` # Use Genkit in a Next.js app > Learn how to integrate Genkit flows into your Next.js applications using the official Genkit Next.js plugin, covering project setup, flow definition, API routes, and client-side calls. This page shows how you can use Genkit flows in your Next.js applications using the official Genkit Next.js plugin. For complete API reference documentation, see the [Genkit Next.js Plugin API Reference](https://js.api.genkit.dev/modules/_genkit-ai_next.html). ## Before you begin [Section titled “Before you begin”](#before-you-begin) You should be familiar with Genkit’s concept of [flows](/docs/flows), and how to write them. ## Create a Next.js project [Section titled “Create a Next.js project”](#create-a-nextjs-project) If you don’t already have a Next.js project that you want to add generative AI features to, you can create one for the purpose of following along with this page: ```bash npx create-next-app@latest --src-dir ``` The `--src-dir` flag creates a `src/` directory to keep your project organized by separating source code from configuration files. ## Install Genkit dependencies [Section titled “Install Genkit dependencies”](#install-genkit-dependencies) Install the Genkit dependencies into your Next.js app: 1. Install the core Genkit library and the Next.js plugin: ```bash npm install genkit @genkit-ai/next ``` 2. Install at least one model plugin. * Gemini (Google AI) ```bash npm install @genkit-ai/googleai ``` * Gemini (Vertex AI) ```bash npm install @genkit-ai/vertexai ``` 3. Install the Genkit CLI globally. The tsx tool is also recommended as a development dependency, as it makes testing your code more convenient. Both of these dependencies are optional, however. ```bash npm install -g genkit-cli npm install --save-dev tsx ``` ## Define Genkit flows [Section titled “Define Genkit flows”](#define-genkit-flows) Create a new directory in your Next.js project to contain your Genkit flows. 
Create `src/genkit/` and add your flow definitions there: For example, create `src/genkit/menuSuggestionFlow.ts`: * Gemini (Google AI) ```ts import { googleAI } from '@genkit-ai/googleai'; import { genkit, z } from 'genkit'; const ai = genkit({ plugins: [googleAI()], }); export const menuSuggestionFlow = ai.defineFlow( { name: 'menuSuggestionFlow', inputSchema: z.object({ theme: z.string() }), outputSchema: z.object({ menuItem: z.string() }), streamSchema: z.string(), }, async ({ theme }, { sendChunk }) => { const { stream, response } = ai.generateStream({ model: googleAI.model('gemini-2.5-flash'), prompt: `Invent a menu item for a ${theme} themed restaurant.`, }); for await (const chunk of stream) { sendChunk(chunk.text); } const { text } = await response; return { menuItem: text }; } ); ``` * Gemini (Vertex AI) ```ts import { vertexAI } from '@genkit-ai/vertexai'; import { genkit, z } from 'genkit'; const ai = genkit({ plugins: [vertexAI()], }); export const menuSuggestionFlow = ai.defineFlow( { name: 'menuSuggestionFlow', inputSchema: z.object({ theme: z.string() }), outputSchema: z.object({ menuItem: z.string() }), streamSchema: z.string(), }, async ({ theme }, { sendChunk }) => { const { stream, response } = ai.generateStream({ model: vertexAI.model('gemini-2.5-flash'), prompt: `Invent a menu item for a ${theme} themed restaurant.`, }); for await (const chunk of stream) { sendChunk(chunk.text); } const { text } = await response; return { menuItem: text }; } ); ``` ## Create API routes [Section titled “Create API routes”](#create-api-routes) Now, create API routes that expose your flows using the Genkit Next.js plugin. For each flow, create a corresponding route file: Create `src/app/api/menuSuggestion/route.ts`: ```ts import { menuSuggestionFlow } from '@/genkit/menuSuggestionFlow'; import { appRoute } from '@genkit-ai/next'; export const POST = appRoute(menuSuggestionFlow); ``` ## Call your flows from the frontend [Section titled “Call your flows from the frontend”](#call-your-flows-from-the-frontend) In your frontend code, you can now call your flows using the Genkit Next.js client: ```tsx 'use client'; import { useState } from 'react'; import { runFlow, streamFlow } from '@genkit-ai/next/client'; import { menuSuggestionFlow } from '@/genkit/menuSuggestionFlow'; export default function Home() { const [menuItem, setMenuItem] = useState(''); const [isLoading, setIsLoading] = useState(false); const [streamedText, setStreamedText] = useState(''); async function getMenuItem(formData: FormData) { const theme = formData.get('theme')?.toString() ?? ''; setIsLoading(true); try { // Regular (non-streaming) approach const result = await runFlow({ url: '/api/menuSuggestion', input: { theme }, }); setMenuItem(result.menuItem); } catch (error) { console.error('Error generating menu item:', error); } finally { setIsLoading(false); } } async function streamMenuItem(formData: FormData) { const theme = formData.get('theme')?.toString() ?? ''; setIsLoading(true); setStreamedText(''); try { // Streaming approach const result = streamFlow({ url: '/api/menuSuggestion', input: { theme }, }); // Process the stream chunks as they arrive for await (const chunk of result.stream) { setStreamedText((prev) => prev + chunk); } // Get the final complete response const finalOutput = await result.output; setMenuItem(finalOutput.menuItem); } catch (error) { console.error('Error streaming menu item:', error); } finally { setIsLoading(false); } } return (



    <main>
      {/* Markup here is illustrative; adapt it to your own UI. */}
      <form action={getMenuItem}>
        <label>
          Restaurant theme: <input type="text" name="theme" />
        </label>
        <button type="submit" disabled={isLoading}>
          Generate
        </button>
        <button type="submit" formAction={streamMenuItem} disabled={isLoading}>
          Generate (streaming)
        </button>
      </form>

      {streamedText && (
        <section>
          <h2>Streaming Output:</h2>
          <p>{streamedText}</p>
        </section>
      )}

      {menuItem && (
        <section>
          <h2>Final Output:</h2>
          <p>{menuItem}</p>
        </section>
      )}
    </main>
  );
}
```

## Authentication (Optional)

[Section titled “Authentication (Optional)”](#authentication-optional)

If you need to add authentication to your API routes, you can pass headers with your requests:

```tsx
const result = await runFlow({
  url: '/api/menuSuggestion',
  headers: {
    Authorization: 'Bearer your-token-here',
  },
  input: { theme },
});
```

## Test your app locally

[Section titled “Test your app locally”](#test-your-app-locally)

If you want to run your app locally, you need to make the credentials for your chosen model API service available.

* Gemini (Google AI)

  1. [Generate an API key](https://aistudio.google.com/app/apikey) for the Gemini API using Google AI Studio.

  2. Set the `GEMINI_API_KEY` environment variable to your key:

     ```bash
     export GEMINI_API_KEY=
     ```

* Gemini (Vertex AI)

  1. In the Cloud console, [Enable the Vertex AI API](https://console.cloud.google.com/apis/library/aiplatform.googleapis.com?project=_) for your project.

  2. Configure the [`gcloud`](https://cloud.google.com/sdk/gcloud) tool to set up application default credentials:

     ```bash
     gcloud config set project PROJECT_ID
     gcloud services enable aiplatform.googleapis.com
     ```

Then, run your app locally as normal:

```bash
npm run dev
```

For Genkit development tools, you can still use:

```bash
genkit start -- npx tsx --watch src/genkit/menuSuggestionFlow.ts
```

## Deploy your app

[Section titled “Deploy your app”](#deploy-your-app)

When you deploy your app, you will need to make sure the credentials for any external services you use (such as your chosen model API service) are available to the deployed app. See the following pages for information specific to your chosen deployment platform:

* [Cloud Functions for Firebase](/docs/firebase)
* [Cloud Run](/docs/cloud-run)
* [Other Node.js platforms](/docs/deploy-node)

# Advanced Configuration

> This guide covers advanced configuration options for Genkit's Firebase telemetry plugin, including fine-tuning telemetry collection, export settings, and disabling specific data types.

This guide focuses on advanced configuration options for deployed features using the Firebase telemetry plugin. Detailed descriptions of each configuration option can be found in our [JS API reference documentation](https://js.api.genkit.dev/interfaces/_genkit-ai_google-cloud.GcpTelemetryConfigOptions.html).

This documentation will describe how to fine-tune which telemetry is collected, how often, and from what environments.

## Default Configuration

[Section titled “Default Configuration”](#default-configuration)

The Firebase telemetry plugin provides default options, out of the box, to get you up and running quickly. These are the provided defaults:

```typescript
{
  autoInstrumentation: true,
  autoInstrumentationConfig: {
    '@opentelemetry/instrumentation-dns': { enabled: false },
  },
  disableMetrics: false,
  disableTraces: false,
  disableLoggingInputAndOutput: false,
  forceDevExport: false,
  // 5 minutes
  metricExportIntervalMillis: 300_000,
  // 5 minutes
  metricExportTimeoutMillis: 300_000,
  // See https://js.api.genkit.dev/interfaces/_genkit-ai_google-cloud.GcpTelemetryConfigOptions.html#sampler
  sampler: AlwaysOnSampler()
}
```

## Export local telemetry

[Section titled “Export local telemetry”](#export-local-telemetry)

To export telemetry when running locally, set the `forceDevExport` option to `true`.
```typescript import { enableFirebaseTelemetry } from '@genkit-ai/firebase'; enableFirebaseTelemetry({ forceDevExport: true }); ``` During development and testing, you can decrease latency by adjusting the export interval and timeout. Note: Shipping to production with a frequent export interval may increase the cost for exported telemetry. ```typescript import { enableFirebaseTelemetry } from '@genkit-ai/firebase'; enableFirebaseTelemetry({ forceDevExport: true, metricExportIntervalMillis: 10_000, // 10 seconds metricExportTimeoutMillis: 10_000, // 10 seconds }); ``` ## Adjust auto instrumentation [Section titled “Adjust auto instrumentation”](#adjust-auto-instrumentation) The Firebase telemetry plugin will automatically collect traces and metrics for popular frameworks using OpenTelemetry [zero-code instrumentation](https://opentelemetry.io/docs/zero-code/js/). A full list of available instrumentations can be found in the [auto-instrumentations-node](https://github.com/open-telemetry/opentelemetry-js-contrib/blob/main/metapackages/auto-instrumentations-node/README.md#supported-instrumentations) documentation. To selectively disable or enable instrumentations that are eligible for auto instrumentation, update the `autoInstrumentationConfig` field: ```typescript import { enableFirebaseTelemetry } from '@genkit-ai/firebase'; enableFirebaseTelemetry({ autoInstrumentationConfig: { '@opentelemetry/instrumentation-fs': { enabled: false }, '@opentelemetry/instrumentation-dns': { enabled: false }, '@opentelemetry/instrumentation-net': { enabled: false }, }, }); ``` ## Disable telemetry [Section titled “Disable telemetry”](#disable-telemetry) Genkit Monitoring leverages a combination of logging, tracing, and metrics to capture a holistic view of your Genkit interactions, however, you can also disable each of these elements independently if needed. ### Disable input and output logging [Section titled “Disable input and output logging”](#disable-input-and-output-logging) By default, the Firebase telemetry plugin will capture inputs and outputs for each Genkit feature or step. To help you control how customer data is stored, you can disable the logging of input and output by adding the following to your configuration: ```typescript import { enableFirebaseTelemetry } from '@genkit-ai/firebase'; enableFirebaseTelemetry({ disableLoggingInputAndOutput: true, }); ``` With this option set, input and output attributes will be redacted in the Genkit Monitoring trace viewer and will be missing from Google Cloud logging. ### Disable metrics [Section titled “Disable metrics”](#disable-metrics) To disable metrics collection, add the following to your configuration: ```typescript import { enableFirebaseTelemetry } from '@genkit-ai/firebase'; enableFirebaseTelemetry({ disableMetrics: true, }); ``` With this option set, you will no longer see stability metrics in the Genkit Monitoring dashboard and will be missing from Google Cloud Metrics. ### Disable traces [Section titled “Disable traces”](#disable-traces) To disable trace collection, add the following to your configuration: ```typescript import { enableFirebaseTelemetry } from '@genkit-ai/firebase'; enableFirebaseTelemetry({ disableTraces: true, }); ``` With this option set, you will no longer see traces in the Genkit Monitoring feature page, have access to the trace viewer, or see traces present in Google Cloud Tracing. 
# Authentication and authorization > Learn how to authenticate and authorize the Firebase telemetry plugin for Genkit, covering API enablement, user authentication, and deployment to Google Cloud or other environments. The Firebase telemetry plugin requires a Google Cloud or Firebase project ID and application credentials. If you don’t have a Google Cloud project and account, you can set one up in the [Firebase Console](https://console.firebase.google.com/) or in the [Google Cloud Console](https://cloud.google.com). All Firebase project IDs are Google Cloud project IDs. ## Enable APIs [Section titled “Enable APIs”](#enable-apis) Prior to adding the plugin, make sure the following APIs are enabled for your project: * [Cloud Logging API](https://console.cloud.google.com/apis/library/logging.googleapis.com) * [Cloud Trace API](https://console.cloud.google.com/apis/library/cloudtrace.googleapis.com) * [Cloud Monitoring API](https://console.cloud.google.com/apis/library/monitoring.googleapis.com) These APIs should be listed in the [API dashboard](https://console.cloud.google.com/apis/dashboard) for your project. Click to learn more about how to [enable and disable APIs](https://support.google.com/googleapi/answer/6158841). ## User Authentication [Section titled “User Authentication”](#user-authentication) To export telemetry from your local development environment to Genkit Monitoring, you will need to authenticate yourself with Google Cloud. The easiest way to authenticate as yourself is using the gcloud CLI, which will automatically make your credentials available to the framework through [Application Default Credentials (ADC)](https://cloud.google.com/docs/authentication/application-default-credentials). If you don’t have the gcloud CLI installed, first follow the [installation instructions](https://cloud.google.com/sdk/docs/install#installation_instructions). 1. Authenticate using the `gcloud` CLI: ```bash gcloud auth application-default login ``` 2. Set your project ID ```bash gcloud config set project PROJECT_ID ``` ## Deploy to Google Cloud [Section titled “Deploy to Google Cloud”](#deploy-to-google-cloud) If deploying your code to a Google Cloud or Firebase environment (Cloud Functions, Cloud Run, App Hosting, etc), the project ID and credentials will be discovered automatically with [Application Default Credentials](https://cloud.google.com/docs/authentication/provide-credentials-adc). You will need to apply the following roles to the service account that is running your code (i.e. ‘attached service account’) using the [IAM Console](https://console.cloud.google.com/iam-admin/iam): * `roles/monitoring.metricWriter` * `roles/cloudtrace.agent` * `roles/logging.logWriter` Not sure which service account is the right one? See the [Find or create your service account](#find-or-create-your-service-account) section. ## Deploy outside of Google Cloud (with ADC) [Section titled “Deploy outside of Google Cloud (with ADC)”](#deploy-outside-of-google-cloud-with-adc) If possible, use [Application Default Credentials](https://cloud.google.com/docs/authentication/provide-credentials-adc) to make credentials available to the plugin. Typically this involves generating a service account key and deploying those credentials to your production environment. 1. Follow the instructions to set up a [service account key](https://cloud.google.com/iam/docs/keys-create-delete#creating). 2. 
Ensure the service account has the following roles: * `roles/monitoring.metricWriter` * `roles/cloudtrace.agent` * `roles/logging.logWriter` 3. Deploy the credential file to production (**do not** check it into source control) 4. Set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of the credential file: ```bash export GOOGLE_APPLICATION_CREDENTIALS="path/to/your/key/file" ``` Not sure which service account is the right one? See the [Find or create your service account](#find-or-create-your-service-account) section. ## Deploy outside of Google Cloud (without ADC) [Section titled “Deploy outside of Google Cloud (without ADC)”](#deploy-outside-of-google-cloud-without-adc) In some serverless environments, you may not be able to deploy a credential file. 1. Follow the instructions to set up a [service account key](https://cloud.google.com/iam/docs/keys-create-delete#creating). 2. Ensure the service account has the following roles: * `roles/monitoring.metricWriter` * `roles/cloudtrace.agent` * `roles/logging.logWriter` 3. Download the credential file. 4. Assign the contents of the credential file to the `GCLOUD_SERVICE_ACCOUNT_CREDS` environment variable as follows: ```bash GCLOUD_SERVICE_ACCOUNT_CREDS='{ "type": "service_account", "project_id": "your-project-id", "private_key_id": "your-private-key-id", "private_key": "your-private-key", "client_email": "your-client-email", "client_id": "your-client-id", "auth_uri": "https://accounts.google.com/o/oauth2/auth", "token_uri": "https://accounts.google.com/o/oauth2/token", "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs", "client_x509_cert_url": "your-cert-url" }' ``` Not sure which service account is the right one? See the [Find or create your service account](#find-or-create-your-service-account) section. ## Find or create your service account [Section titled “Find or create your service account”](#find-or-create-your-service-account) To find the appropriate service account: 1. Navigate to the [service accounts page](https://console.cloud.google.com/iam-admin/serviceaccounts) in the Google Cloud Console 2. Select your project 3. Find the appropriate service account. Common default service accounts are as follows: * Firebase functions & Cloud Run `PROJECT ID-compute@developer.gserviceaccount.com` * App Engine `PROJECT ID@appspot.gserviceaccount.com` * App Hosting `firebase-app-hosting-compute@PROJECT ID.iam.gserviceaccount.com` If you are deploying outside of the Google ecosystem or don’t want to use a default service account, you can [create a service account](https://cloud.google.com/iam/docs/service-accounts-create#creating) in the Google Cloud console. # Get started with Genkit Monitoring > This quickstart guide explains how to set up Genkit Monitoring for your deployed Genkit features to collect and view real-time telemetry data, including metrics, traces, and production trace exports for evaluations. This quickstart guide describes how to set up Genkit Monitoring for your deployed Genkit features, so that you can collect and view real-time telemetry data. With Genkit Monitoring, you get visibility into how your Genkit features are performing in production. Key capabilities of Genkit Monitoring include: * Viewing quantitative metrics like Genkit feature latency, errors, and token usage. * Inspecting traces to see your Genkit feature’s steps, inputs, and outputs, to help with debugging and quality improvement. * Exporting production traces to run evals within Genkit.
Setting up Genkit Monitoring requires completing tasks in both your codebase and on the Google Cloud Console. ## Before you begin [Section titled “Before you begin”](#before-you-begin) 1. If you haven’t already, create a Firebase project. In the [Firebase console](https://console.firebase.google.com), click **Add a project**, then follow the on-screen instructions. You can create a new project or add Firebase services to an already-existing Google Cloud project. 2. Ensure your project is on the [Blaze pricing plan](https://firebase.google.com/pricing). Genkit Monitoring relies on telemetry data written to Google Cloud Logging, Metrics, and Trace, which are paid services. View the [Google Cloud Observability pricing](https://cloud.google.com/stackdriver/pricing) page for pricing details and to learn about free-of-charge tier limits. 3. Write a Genkit feature by following the [Get Started Guide](/docs/get-started), and prepare your code for deployment by using one of the following guides: a. [Deploy flows using Cloud Functions for Firebase](/docs/firebase) b. [Deploy flows using Cloud Run](/docs/cloud-run) c. [Deploy flows to any Node.js platform](/docs/deploy-node) ## Step 1. Add the Firebase plugin [Section titled “Step 1. Add the Firebase plugin”](#step-1-add-the-firebase-plugin) Install the `@genkit-ai/firebase` plugin in your project: ```bash npm install @genkit-ai/firebase ``` ### Environment-based configuration [Section titled “Environment-based configuration”](#environment-based-configuration) If you intend to use the default configuration for Firebase Genkit Monitoring, you can enable telemetry by setting the `ENABLE_FIREBASE_MONITORING` environment variable in your deployment environment. ```bash export ENABLE_FIREBASE_MONITORING=true ``` Note This will use default configuration values. To override configuration options, use “Programmatic configuration”. ### Programmatic configuration [Section titled “Programmatic configuration”](#programmatic-configuration) You can also enable Firebase Genkit Monitoring in code. This is useful if you want to tweak any configuration settings like the metric export interval or to set up your local environment to export telemetry data. Import `enableFirebaseTelemetry` into your Genkit configuration file (the file where `genkit(...)` is initalized), and call it: ```typescript import { enableFirebaseTelemetry } from '@genkit-ai/firebase'; enableFirebaseTelemetry(); ``` ## Step 2. Enable the required APIs [Section titled “Step 2. Enable the required APIs”](#step-2-enable-the-required-apis) Make sure that the following APIs are enabled for your Google Cloud project: * [Cloud Logging API](https://console.cloud.google.com/apis/library/logging.googleapis.com) * [Cloud Trace API](https://console.cloud.google.com/apis/library/cloudtrace.googleapis.com) * [Cloud Monitoring API](https://console.cloud.google.com/apis/library/monitoring.googleapis.com) These APIs should be listed in the [API dashboard](https://console.cloud.google.com/apis/dashboard) for your project. ## Step 3. Set up permissions [Section titled “Step 3. Set up permissions”](#step-3-set-up-permissions) The Firebase plugin needs to use a *service account* to authenticate with Google Cloud Logging, Metrics, and Trace services. Grant the following roles to whichever service account is configured to run your code within the [Google Cloud IAM Console](https://console.cloud.google.com/iam-admin/iam). For Cloud Functions for Firebase and Cloud Run, that’s typically the default compute service account. 
* **Monitoring Metric Writer** (`roles/monitoring.metricWriter`) * **Cloud Trace Agent** (`roles/cloudtrace.agent`) * **Logs Writer** (`roles/logging.logWriter`) ## Step 4. (Optional) Test your configuration locally [Section titled “Step 4. (Optional) Test your configuration locally”](#step-4-optional-test-your-configuration-locally) Before deploying, you can run your Genkit code locally to confirm that telemetry data is being collected, and is viewable in the Genkit Monitoring dashboard. 1. In your Genkit code, set `forceDevExport` to `true` to send telemetry from your local environment. 2. Use your service account to authenticate and test your configuration. Tip In order to impersonate the service account, you will need to have the `roles/iam.serviceAccountTokenCreator` [IAM role](https://console.cloud.google.com/iam-admin/iam) applied to your user account. With the [Google Cloud CLI tool](https://cloud.google.com/sdk/docs/install?authuser=0), authenticate using the service account: ```bash gcloud auth application-default login --impersonate-service-account SERVICE_ACCT_EMAIL ``` 3. Run and invoke your Genkit feature, and then view metrics on the [Genkit Monitoring dashboard](https://console.firebase.google.com/project/_/genai_monitoring). Allow for up to 5 minutes to collect the first metric. You can reduce this delay by setting `metricExportIntervalMillis` in the telemetry configuration. 4. If metrics are not appearing in the Genkit Monitoring dashboard, view the [Troubleshooting](/docs/observability/troubleshooting) guide for steps to debug. ## Step 5. Re-build and deploy code [Section titled “Step 5. Re-build and deploy code”](#step-5-re-build-and-deploy-code) Re-build, deploy, and invoke your Genkit feature to start collecting data. After Genkit Monitoring receives your metrics, you can view them by visiting the [Genkit Monitoring dashboard](https://console.firebase.google.com/project/_/genai_monitoring) Note It may take up to 5 minutes to collect the first metric (based on the default `metricExportIntervalMillis` setting in the telemetry configuration). # Telemetry Collection > This document details the metrics, trace attributes, and logs collected by the Firebase telemetry plugin for Genkit, along with information on latency, quotas, and cost. The Firebase telemetry plugin exports a combination of metrics, traces, and logs to Google Cloud Observability. This document details which metrics, trace attributes, and logs will be collected and what you can expect in terms of latency, quotas, and cost. ## Telemetry delay [Section titled “Telemetry delay”](#telemetry-delay) There may be a slight delay before telemetry from a given invocation is available in Firebase. This is dependent on your export interval (5 minutes by default). ## Quotas and limits [Section titled “Quotas and limits”](#quotas-and-limits) There are several quotas that are important to keep in mind: * [Cloud Trace Quotas](http://cloud.google.com/trace/docs/quotas) * [Cloud Logging Quotas](http://cloud.google.com/logging/quotas) * [Cloud Monitoring Quotas](http://cloud.google.com/monitoring/quotas) ## Cost [Section titled “Cost”](#cost) Cloud Logging, Cloud Trace, and Cloud Monitoring have generous free-of-charge tiers. 
Specific pricing can be found at the following links: * [Cloud Logging Pricing](http://cloud.google.com/stackdriver/pricing#google-cloud-observability-pricing) * [Cloud Trace Pricing](https://cloud.google.com/trace#pricing) * [Cloud Monitoring Pricing](https://cloud.google.com/stackdriver/pricing#monitoring-pricing-summary) ## Metrics [Section titled “Metrics”](#metrics) The Firebase telemetry plugin collects a number of different metrics to support the various Genkit action types detailed in the following sections. ### Feature metrics [Section titled “Feature metrics”](#feature-metrics) Features are the top-level entry-point to your Genkit code. In most cases, this will be a flow. Otherwise, this will be the top-most span in a trace. | Name | Type | Description | | ----------------------- | --------- | ----------------------- | | genkit/feature/requests | Counter | Number of requests | | genkit/feature/latency | Histogram | Execution latency in ms | Each feature metric contains the following dimensions: | Name | Description | | ------------- | -------------------------------------------------------------------------------- | | name | The name of the feature. In most cases, this is the top-level Genkit flow | | status | ’success’ or ‘failure’ depending on whether or not the feature request succeeded | | error | Only set when `status=failure`. Contains the error type that caused the failure | | source | The Genkit source language. Eg. ‘ts’ | | sourceVersion | The Genkit framework version | ### Action metrics [Section titled “Action metrics”](#action-metrics) Actions represent a generic step of execution within Genkit. Each of these steps will have the following metrics tracked: | Name | Type | Description | | ---------------------- | --------- | --------------------------------------------- | | genkit/action/requests | Counter | Number of times this action has been executed | | genkit/action/latency | Histogram | Execution latency in ms | Each action metric contains the following dimensions: | Name | Description | | ------------- | ---------------------------------------------------------------------------------------------------- | | name | The name of the action | | featureName | The name of the parent feature being executed | | path | The path of execution from the feature root to this action. eg. ‘/myFeature/parentAction/thisAction’ | | status | ’success’ or ‘failure’ depending on whether or not the action succeeded | | error | Only set when `status=failure`. Contains the error type that caused the failure | | source | The Genkit source language. Eg. ‘ts’ | | sourceVersion | The Genkit framework version | ### Generate metrics [Section titled “Generate metrics”](#generate-metrics) These are special action metrics relating to actions that interact with a model. In addition to requests and latency, input and output are also tracked, with model specific dimensions that make debugging and configuration tuning easier. 
| Name | Type | Description | | ------------------------------------ | --------- | ------------------------------------------ | | genkit/ai/generate/requests | Counter | Number of times this model has been called | | genkit/ai/generate/latency | Histogram | Execution latency in ms | | genkit/ai/generate/input/tokens | Counter | Input tokens | | genkit/ai/generate/output/tokens | Counter | Output tokens | | genkit/ai/generate/input/characters | Counter | Input characters | | genkit/ai/generate/output/characters | Counter | Output characters | | genkit/ai/generate/input/images | Counter | Input images | | genkit/ai/generate/output/images | Counter | Output images | | genkit/ai/generate/input/audio | Counter | Input audio files | | genkit/ai/generate/output/audio | Counter | Output audio files | Each generate metric contains the following dimensions: | Name | Description | | ------------- | ---------------------------------------------------------------------------------------------------- | | modelName | The name of the model | | featureName | The name of the parent feature being executed | | path | The path of execution from the feature root to this action. eg. ‘/myFeature/parentAction/thisAction’ | | latencyMs | The response time taken by the model | | status | ’success’ or ‘failure’ depending on whether or not the feature request succeeded | | error | Only set when `status=failure`. Contains the error type that caused the failure | | source | The Genkit source language. Eg. ‘ts’ | | sourceVersion | The Genkit framework version | ## Traces [Section titled “Traces”](#traces) All Genkit actions are automatically instrumented to provide detailed traces for your AI features. Locally, traces are visible in the Developer UI. For deployed apps enable Genkit Monitoring to get the same level of visibility. The following sections describe what trace attributes you can expect based on the Genkit action type for a particular span in the trace. ### Root Spans [Section titled “Root Spans”](#root-spans) Root spans have special attributes to help disambiguate the state attributes for the whole trace versus an individual span. | Attribute name | Description | | ---------------- | ----------------------------------------------------------------------------------------------------------------------- | | genkit/feature | The name of the parent feature being executed | | genkit/isRoot | Marked true if this span is the root span | | genkit/rootState | The state of the overall execution as `success` or `error`. This does not indicate that this step failed in particular. | ### Flow [Section titled “Flow”](#flow) | Attribute name | Description | | ----------------------- | ---------------------------------------------------------------------------------------------------------- | | genkit/input | The input to the flow. This will always be `` because of trace attribute size limits. | | genkit/metadata/subtype | The type of Genkit action. For flows it will be `flow`. | | genkit/name | The name of this Genkit action. In this case the name of the flow | | genkit/output | The output generated in the flow. This will always be `` because of trace attribute size limits. | | genkit/path | The fully qualified execution path that lead to this step in the trace, including type information. | | genkit/state | The state of this span’s execution as `success` or `error`. | | genkit/type | The type of Genkit primitive that corresponds to this span. For flows, this will be `action`. 
| ### Util [Section titled “Util”](#util) | Attribute name | Description | | -------------- | ---------------------------------------------------------------------------------------------------------- | | genkit/input | The input to the util. This will always be `` because of trace attribute size limits. | | genkit/name | The name of this Genkit action. In this case the name of the flow | | genkit/output | The output generated in the util. This will always be `` because of trace attribute size limits. | | genkit/path | The fully qualified execution path that lead to this step in the trace, including type information. | | genkit/state | The state of this span’s execution as `success` or `error`. | | genkit/type | The type of Genkit primitive that corresponds to this span. For flows, this will be `util`. | ### Model [Section titled “Model”](#model) | Attribute name | Description | | ----------------------- | ----------------------------------------------------------------------------------------------------------- | | genkit/input | The input to the model. This will always be `` because of trace attribute size limits. | | genkit/metadata/subtype | The type of Genkit action. For models it will be `model`. | | genkit/model | The name of the model. | | genkit/name | The name of this Genkit action. In this case the name of the model. | | genkit/output | The output generated by the model. This will always be `` because of trace attribute size limits. | | genkit/path | The fully qualified execution path that lead to this step in the trace, including type information. | | genkit/state | The state of this span’s execution as `success` or `error`. | | genkit/type | The type of Genkit primitive that corresponds to this span. For flows, this will be `action`. | ### Tool [Section titled “Tool”](#tool) | Attribute name | Description | | ----------------------- | ----------------------------------------------------------------------------------------------------------- | | genkit/input | The input to the model. This will always be `` because of trace attribute size limits. | | genkit/metadata/subtype | The type of Genkit action. For tools it will be `tool`. | | genkit/name | The name of this Genkit action. In this case the name of the model. | | genkit/output | The output generated by the model. This will always be `` because of trace attribute size limits. | | genkit/path | The fully qualified execution path that lead to this step in the trace, including type information. | | genkit/state | The state of this span’s execution as `success` or `error`. | | genkit/type | The type of Genkit primitive that corresponds to this span. For flows, this will be `action`. | ## Logs [Section titled “Logs”](#logs) For deployed apps with Genkit Monitoring, logs are used to capture input, output, and configuration metadata that provides rich detail about each step in your AI feature. 
All logs will include the following shared metadata fields: | Field name | Description | | ----------------- | --------------------------------------------------------------------------------------------------------------------------------- | | insertId | Unique id for the log entry | | jsonPayload | Container for variable information that is unique to each log type | | labels | `{module: genkit}` | | logName | `projects/weather-gen-test-next/logs/genkit_log` | | receivedTimestamp | Time the log was received by Cloud | | resource | Information about the source of the log including deployment information region, and projectId | | severity | The log level written. See Cloud’s [LogSeverity](https://cloud.google.com/logging/docs/reference/v2/rest/v2/LogEntry#logseverity) | | spanId | Identifier for the span that created this log | | timestamp | Time that the client logged a message | | trace | Identifier for the trace of the format `projects//traces/` | | traceSampled | Boolean representing whether the trace was sampled. Logs are not sampled. | Each log type will have a different json payload described in each section. ### Input [Section titled “Input”](#input) JSON payload: | Field name | Description | | ---------- | -------------------------------------------------------------------------------------------- | | message | `[genkit] Input[, ]` including `(message X of N)` for multi-part messages | | metadata | Additional context including the input message sent to the action | Metadata: | Field name | Description | | ---------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | | content | The input message content sent to this Genkit action | | featureName | The name of the Genkit flow, action, tool, util, or helper. | | messageIndex \* | Index indicating the order of messages for inputs that contain multiple messages. For single messages, this will always be 0. | | model \* | Model name. | | path | The execution path that generated this log of the format `step1 > step2 > step3` | | partIndex \* | Index indicating the order of parts within a message for multi-part messages. This is typical when combining text and images in a single input. | | qualifiedPath | The execution path that generated this log, including type information of the format: `/{flow1,t:flow}/{generate,t:util}/{modelProvider/model,t:action,s:model` | | totalMessages \* | The total number of messages for this input. For single messages, this will always be 1. | | totalParts \* | Total number of parts for this message. For single-part messages, this will always be 1. | (\*) Starred items are only present on Input logs for model interactions. ### Output [Section titled “Output”](#output) JSON payload: | Field name | Description | | ---------- | --------------------------------------------------------------------------------------------- | | message | `[genkit] Output[, ]` including `(message X of N)` for multi-part messages | | metadata | Additional context including the input message sent to the action | Metadata: | Field name | Description | | ------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | | candidateIndex \* (deprecated) | Index indicating the order of candidates for outputs that contain multiple candidates. For logs with single candidates, this will always be 0. 
| | content | The output message generated by the Genkit action | | featureName | The name of the Genkit flow, action, tool, util, or helper. | | messageIndex \* | Index indicating the order of messages for inputs that contain multiple messages. For single messages, this will always be 0. | | model \* | Model name. | | path | The execution path that generated this log of the format \`step1 > step2 > step3 | | partIndex \* | Index indicating the order of parts within a message for multi-part messages. This is typical when combining text and images in a single output. | | qualifiedPath | The execution path that generated this log, including type information of the format: `/{flow1,t:flow}/{generate,t:util}/{modelProvider/model,t:action,s:model` | | totalCandidates \* (deprecated) | Total number of candidates generated as output. For single-candidate messages, this will always be 1. | | totalParts \* | Total number of parts for this message. For single-part messages, this will always be 1. | (\*) Starred items are only present on Output logs for model interactions. ### Config [Section titled “Config”](#config) JSON payload: | Field name | Description | | ---------- | ----------------------------------------------------------------- | | message | `[genkit] Config[, ]` | | metadata | Additional context including the input message sent to the action | Metadata: | Field name | Description | | ------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | | featureName | The name of the Genkit flow, action, tool, util, or helper. | | model | Model name. | | path | The execution path that generated this log of the format \`step1 > step2 > step3 | | qualifiedPath | The execution path that generated this log, including type information of the format: `/{flow1,t:flow}/{generate,t:util}/{modelProvider/model,t:action,s:model` | | source | The Genkit library language used. This will always be set to ‘ts’ as it is the only supported language. | | sourceVersion | The Genkit library version. | | temperature | Model temperature used. | ### Paths [Section titled “Paths”](#paths) JSON payload: | Field name | Description | | ---------- | ----------------------------------------------------------------- | | message | `[genkit] Paths[, ]` | | metadata | Additional context including the input message sent to the action | Metadata: | Field name | Description | | ---------- | ---------------------------------------------------------------- | | flowName | The name of the Genkit flow, action, tool, util, or helper. | | paths | An array containing all execution paths for the collected spans. | # Genkit Monitoring - Troubleshooting > This guide provides solutions to common issues encountered when using Genkit Monitoring, including problems with traces, metrics, and telemetry export. The following sections detail solutions to common issues that developers run into when using Genkit Monitoring. ## I can’t see traces or metrics in Genkit Monitoring [Section titled “I can’t see traces or metrics in Genkit Monitoring”](#i-cant-see-traces-or-metrics-in-genkit-monitoring) 1. 
Ensure that the following APIs are enabled for your underlying Google Cloud project: * [Cloud Logging API](https://console.cloud.google.com/apis/library/logging.googleapis.com) * [Cloud Trace API](https://console.cloud.google.com/apis/library/cloudtrace.googleapis.com) * [Cloud Monitoring API](https://console.cloud.google.com/apis/library/monitoring.googleapis.com) 2. Ensure that the following roles are applied to the service account that is running your code (or the service account that has been configured as part of the plugin options) in [Cloud IAM](https://console.cloud.google.com/iam-admin/iam): * **Monitoring Metric Writer** (`roles/monitoring.metricWriter`) * **Cloud Trace Agent** (`roles/cloudtrace.agent`) * **Logs Writer** (`roles/logging.logWriter`) 3. Inspect the application logs for errors writing to Cloud Logging, Cloud Trace, and Cloud Monitoring. On Google Cloud infrastructure such as Firebase Functions and Cloud Run, even when telemetry is misconfigured, logs to `stdout/stderr` are automatically ingested by the Cloud Logging Agent, allowing you to diagnose issues in the [Cloud Logging Console](https://console.cloud.google.com/logs). 4. Debug locally: Enable dev export: ```typescript enableFirebaseTelemetry({ forceDevExport: true, }); ``` To test with your personal user credentials, use the [gcloud CLI](https://cloud.google.com/sdk/docs/install) to authenticate with Google Cloud by running `gcloud auth application-default login`. Doing so can help diagnose enabled or disabled APIs, but it does not test the permissions of the service account that runs your deployed code. Alternatively, impersonating the service account lets you test production-like access. You must have the `roles/iam.serviceAccountTokenCreator` IAM role applied to your user account in order to impersonate service accounts: ```bash gcloud auth application-default login --impersonate-service-account ``` See the [ADC](https://cloud.google.com/docs/authentication/set-up-adc-local-dev-environment) documentation for more information. ## Request count does not match traces count [Section titled “Request count does not match traces count”](#request-count-does-not-match-traces-count) At low volumes (<1 query per second), you may notice that your metric counts, like requests or failed paths, do not match the number of traces shown in the traces table. Below are three common reasons for this happening. ### Metric and trace export intervals can be different [Section titled “Metric and trace export intervals can be different”](#metric-and-trace-export-intervals-can-be-different) In some cases, the dashboard shows traces that have been exported but metrics that have not, or vice versa. You can reduce the likelihood of this happening by adjusting the metric export interval to be more frequent. By default, metrics are exported every 5 minutes. The minimum allowable export interval is 5 seconds. Note Exporting metrics more frequently can result in increased costs. ```typescript enableFirebaseTelemetry({ // Override the export interval to 3 minutes metricExportIntervalMillis: 180_000, // Override the export timeout to 3 minutes metricExportTimeoutMillis: 180_000, }); ``` ### Intermittent network issues [Section titled “Intermittent network issues”](#intermittent-network-issues) Occasionally you may have transient network issues that result in a failure to upload telemetry data. These failures are logged to Google Cloud Logging.
To see the specific failure reason, look for a log that starts with — > Unable to send telemetry to Google Cloud: Error: Send TimeSeries failed: ### Telemetry upload reliability in Firebase Functions or Cloud Run [Section titled “Telemetry upload reliability in Firebase Functions or Cloud Run”](#telemetry-upload-reliability-in-firebase-functions-or-cloud-run) When your Genkit codei is hosted in Google Cloud Run or Cloud Functions for Firebase, telemetry-data upload may be less reliable as the container switches to the “idle” [lifecycle state](https://cloud.google.com/blog/topics/developers-practitioners/lifecycle-container-cloud-run). If higher reliability is important to you, consider changing [CPU allocation](https://cloud.google.com/run/docs/configuring/cpu-allocation) to **Instance-based billing** (previously called **CPU always allocated**) in the Google Cloud Console. Note The **Instance-based billing** setting impacts pricing. Check [Cloud Run pricing](https://cloud.google.com/run/pricing) before enabling this setting. To switch to instance-based billing, run ```bash gcloud run services update YOUR-SERVICE --no-cpu-throttling ``` # Writing Genkit plugins > Learn how to extend Genkit's capabilities by writing custom plugins, covering plugin creation, options, building models, and publishing to NPM. Genkit’s capabilities are designed to be extended by plugins. Genkit plugins are configurable modules that can provide models, retrievers, indexers, trace stores, and more. You’ve already seen plugins in action just by using Genkit: ```ts import { genkit } from 'genkit'; import { vertexAI } from '@genkit-ai/vertexai'; const ai = genkit({ plugins: [vertexAI({ projectId: 'my-project' })], }); ``` The Vertex AI plugin takes configuration (such as the user’s Google Cloud project ID) and registers a variety of new models, embedders, and more with the Genkit registry. The registry powers Genkit’s local UI for running and inspecting models, prompts, and more as well as serves as a lookup service for named actions at runtime. ## Creating a Plugin [Section titled “Creating a Plugin”](#creating-a-plugin) To create a plugin you’ll generally want to create a new NPM package: ```bash mkdir genkitx-my-plugin cd genkitx-my-plugin npm init -y npm install genkit npm install --save-dev typescript npx tsc --init ``` Then, define and export your plugin from your main entry point using the `genkitPlugin` helper: ```ts import { Genkit, z, modelActionMetadata } from 'genkit'; import { GenkitPlugin, genkitPlugin } from 'genkit/plugin'; import { ActionMetadata, ActionType } from 'genkit/registry'; interface MyPluginOptions { // add any plugin configuration here } export function myPlugin(options?: MyPluginOptions): GenkitPlugin { return genkitPlugin( 'myPlugin', // Initializer function (required): Registers actions defined upfront. async (ai: Genkit) => { // Example: Define a model that's always available ai.defineModel({ name: 'myPlugin/always-available-model', ... }); ai.defineEmbedder(/* ... */); // ... other upfront definitions }, // Dynamic Action Resolver (optional): Defines actions on-demand. async (ai: Genkit, actionType: ActionType, actionName: string) => { // Called when an action (e.g., 'myPlugin/some-dynamic-model') is // requested but not found in the registry. if (actionType === 'model' && actionName === 'some-dynamic-model') { ai.defineModel({ name: `myPlugin/${actionName}`, ... }); } // ... handle other dynamic actions }, // List Actions function (optional): Lists all potential actions. 
async (): Promise => { // Returns metadata for all actions the plugin *could* provide, // even if not yet defined dynamically. Used by Dev UI, etc. // Example: Fetch available models from an API const availableModels = await fetchMyModelsFromApi(); return availableModels.map(model => modelActionMetadata({ type: 'model', name: `myPlugin/${model.id}`, // ... other metadata })); } ); } ``` The `genkitPlugin` function accepts up to three arguments: 1. **Plugin Name (string, required):** A unique identifier for your plugin (e.g., `'myPlugin'`). 2. **Initializer Function (`async (ai: Genkit) => void`, required):** This function runs when Genkit starts. Use it to register actions (models, embedders, etc.) that should always be available using `ai.defineModel()`, `ai.defineEmbedder()`, etc. 3. **Dynamic Action Resolver (`async (ai: Genkit, actionType: ActionType, actionName: string) => void`, optional):** This function is called when Genkit tries to access an action (by type and name) that hasn’t been registered yet. It lets you define actions dynamically, just-in-time. For example, if a user requests `model: 'myPlugin/some-model'`, and it wasn’t defined in the initializer, this function runs, giving you a chance to define it using `ai.defineModel()`. This is useful when a plugin supports many possible actions (like numerous models) and you don’t want to register them all at startup. 4. **List Actions Function (`async () => Promise`, optional):** This function should return metadata for *all* actions your plugin can potentially provide, including those that would be dynamically defined. This is primarily used by development tools like the Genkit Developer UI to populate lists of available models, embedders, etc., allowing users to discover and select them even if they haven’t been explicitly defined yet. This function is generally *not* called during normal flow execution. ### Plugin options guidance [Section titled “Plugin options guidance”](#plugin-options-guidance) In general, your plugin should take a single `options` argument that includes any plugin-wide configuration necessary to function. For any plugin option that requires a secret value, such as API keys, you should offer both an option and a default environment variable to configure it: ```ts import { GenkitError, Genkit, z } from 'genkit'; import { GenkitPlugin, genkitPlugin } from 'genkit/plugin'; interface MyPluginOptions { apiKey?: string; } export function myPlugin(options?: MyPluginOptions) { return genkitPlugin('myPlugin', async (ai: Genkit) => { if (!apiKey) throw new GenkitError({ source: 'my-plugin', status: 'INVALID_ARGUMENT', message: 'Must supply either `options.apiKey` or set `MY_PLUGIN_API_KEY` environment variable.', }); ai.defineModel(...); ai.defineEmbedder(...) // .... }); }; ``` ## Building your plugin [Section titled “Building your plugin”](#building-your-plugin) A single plugin can activate many new things within Genkit. For example, the Vertex AI plugin activates several new models as well as an embedder. ### Model plugins [Section titled “Model plugins”](#model-plugins) Genkit model plugins add one or more generative AI models to the Genkit registry. A model represents any generative model that is capable of receiving a prompt as input and generating text, media, or data as output. Generally, a model plugin will make one or more `defineModel` calls in its initialization function. A custom model generally consists of three components: 1. Metadata defining the model’s capabilities. 2. 
A configuration schema with any specific parameters supported by the model. 3. A function that implements the model accepting `GenerateRequest` and returning `GenerateResponse`. To build a model plugin, you’ll need to use the `genkit/model` package: At a high level, a model plugin might look something like this: ```ts import { genkitPlugin, GenkitPlugin } from 'genkit/plugin'; import { GenerationCommonConfigSchema } from 'genkit/model'; import { simulateSystemPrompt } from 'genkit/model/middleware'; import { Genkit, GenkitError, z } from 'genkit'; export interface MyPluginOptions { // ... } export function myPlugin(options?: MyPluginOptions): GenkitPlugin { return genkitPlugin('my-plugin', async (ai: Genkit) => { ai.defineModel({ // be sure to include your plugin as a provider prefix name: 'my-plugin/my-model', // label for your model as shown in Genkit Developer UI label: 'My Awesome Model', // optional list of supported versions of your model versions: ['my-model-001', 'my-model-001'], // model support attributes supports: { multiturn: true, // true if your model supports conversations media: true, // true if your model supports multimodal input tools: true, // true if your model supports tool/function calling systemRole: true, // true if your model supports the system role output: ['text', 'media', 'json'], // types of output your model supports }, // Zod schema for your model's custom configuration configSchema: GenerationCommonConfigSchema.extend({ safetySettings: z.object({...}), }), // list of middleware for your model to use use: [simulateSystemPrompt()] }, async request => { const myModelRequest = toMyModelRequest(request); const myModelResponse = await myModelApi(myModelRequest); return toGenerateResponse(myModelResponse); }); }); }; ``` #### Transforming Requests and Responses [Section titled “Transforming Requests and Responses”](#transforming-requests-and-responses) The primary work of a Genkit model plugin is transforming the `GenerateRequest` from Genkit’s common format into a format that is recognized and supported by your model’s API, and then transforming the response from your model into the `GenerateResponseData` format used by Genkit. Sometimes, this may require massaging or manipulating data to work around model limitations. For example, if your model does not natively support a `system` message, you may need to transform a prompt’s system message into a user/model message pair. #### Action References (Models, Embedders, etc.) [Section titled “Action References (Models, Embedders, etc.)”](#action-references-models-embedders-etc) While actions like models and embedders can always be referenced by their string name (e.g., `'myPlugin/my-model'`) after being defined (either upfront or dynamically), providing strongly-typed references offers better developer experience through improved type checking and IDE autocompletion. The recommended pattern is to attach helper methods directly to your exported plugin function. These methods use reference builders like `modelRef` and `embedderRef` from Genkit core. 
First, define the type for your plugin function including the helper methods: ```ts import { GenkitPlugin } from 'genkit/plugin'; import { ModelReference, EmbedderReference, modelRef, embedderRef, z } from 'genkit'; // Define your model's specific config schema if it has one const MyModelConfigSchema = z.object({ customParam: z.string().optional(), }); // Define the type for your plugin function export type MyPlugin = { // The main plugin function signature (options?: MyPluginOptions): GenkitPlugin; // Helper method for creating model references model( name: string, // e.g., 'some-model-name' config?: z.infer, ): ModelReference; // Helper method for creating embedder references embedder( name: string, // e.g., 'my-embedder' config?: Record, // Or a specific config schema ): EmbedderReference; // ... add helpers for other action types if needed }; ``` Then, implement the plugin function and attach the helper methods before exporting: ```ts // (Previous imports and MyPluginOptions interface definition) import { modelRef, embedderRef } from 'genkit/model'; // Ensure modelRef/embedderRef are imported function myPluginFn(options?: MyPluginOptions): GenkitPlugin { return genkitPlugin( 'myPlugin', async (ai: Genkit) => { // Initializer... }, async (ai, actionType, actionName) => { // Dynamic resolver... // Example: Define model if requested dynamically if (actionType === 'model') { ai.defineModel( { name: `myPlugin/${actionName}`, // ... other model definition properties configSchema: MyModelConfigSchema, // Use the defined schema }, async (request) => { /* ... model implementation ... */ }, ); } // Handle other dynamic actions... }, async () => { // List actions... }, ); } // Create the final export conforming to the MyPlugin type export const myPlugin = myPluginFn as MyPlugin; // Implement the helper methods myPlugin.model = ( name: string, config?: z.infer, ): ModelReference => { return modelRef({ name: `myPlugin/${name}`, // Automatically prefixes the name configSchema: MyModelConfigSchema, config, }); }; myPlugin.embedder = (name: string, config?: Record): EmbedderReference => { return embedderRef({ name: `myPlugin/${name}`, config, }); }; ``` Now, users can import your plugin and use the helper methods for type-safe action references: ```ts import { genkit } from 'genkit'; import { myPlugin } from 'genkitx-my-plugin'; // Assuming your package name const ai = genkit({ plugins: [ myPlugin({ /* options */ }), ], }); async function run() { const { text } = await ai.generate({ // Use the helper for a type-safe model reference model: myPlugin.model('some-model-name', { customParam: 'value' }), prompt: 'Tell me a story.', }); console.log(text); const embeddings = await ai.embed({ // Use the helper for a type-safe embedder reference embedder: myPlugin.embedder('my-embedder'), content: 'Embed this text.', }); console.log(embeddings); } run(); ``` This approach keeps the plugin definition clean while providing a convenient and type-safe way for users to reference the actions provided by your plugin. It works seamlessly with both statically and dynamically defined actions, as the references only contain metadata, not the implementation itself. ## Publishing a plugin [Section titled “Publishing a plugin”](#publishing-a-plugin) Genkit plugins can be published as normal NPM packages. 
To increase discoverability and maximize consistency, your package should be named `genkitx-{name}` to indicate it is a Genkit plugin, and you should include as many of the following `keywords` in your `package.json` as are relevant to your plugin: * `genkit-plugin`: always include this keyword in your package to indicate it is a Genkit plugin. * `genkit-model`: include this keyword if your package defines any models. * `genkit-retriever`: include this keyword if your package defines any retrievers. * `genkit-indexer`: include this keyword if your package defines any indexers. * `genkit-embedder`: include this keyword if your package defines any embedders. * `genkit-telemetry`: include this keyword if your package defines a telemetry provider. * `genkit-deploy`: include this keyword if your package includes helpers to deploy Genkit apps to cloud providers. * `genkit-flow`: include this keyword if your package enhances Genkit flows. A plugin that provides a retriever, embedder, and model might have a `package.json` that looks like: ```js { "name": "genkitx-my-plugin", "keywords": ["genkit-plugin", "genkit-retriever", "genkit-embedder", "genkit-model"], // ... dependencies etc. } ``` # Writing a Genkit Evaluator > Learn how to write custom Genkit evaluators for heuristic and LLM-based assessments, including defining prompts, scoring functions, and evaluator actions. You can extend Genkit to support custom evaluation, using either an LLM as a judge, or by programmatic (heuristic) evaluation. ## Evaluator definition [Section titled “Evaluator definition”](#evaluator-definition) Evaluators are functions that assess an LLM’s response. There are two main approaches to automated evaluation: heuristic evaluation and LLM-based evaluation. In the heuristic approach, you define a deterministic function. By contrast, in an LLM-based assessment, the content is fed back to an LLM, and the LLM is asked to score the output according to criteria set in a prompt. The `ai.defineEvaluator` method, which you use to define an evaluator action in Genkit, supports either approach. This document explores a couple of examples of how to use this method for heuristic and LLM-based evaluations. ### LLM-based Evaluators [Section titled “LLM-based Evaluators”](#llm-based-evaluators) An LLM-based evaluator leverages an LLM to evaluate the `input`, `context`, and `output` of your generative AI feature. LLM-based evaluators in Genkit are made up of 3 components: * A prompt * A scoring function * An evaluator action #### Define the prompt [Section titled “Define the prompt”](#define-the-prompt) For this example, the evaluator leverages an LLM to determine whether a food (the `output`) is delicious or not. First, provide context to the LLM, then describe what you want it to do, and finally, give it a few examples to base its response on. Genkit’s `definePrompt` utility provides an easy way to define prompts with input and output validation. The following code is an example of setting up an evaluation prompt with `definePrompt`. ```ts import { Genkit, z } from "genkit"; const DELICIOUSNESS_VALUES = ['yes', 'no', 'maybe'] as const; const DeliciousnessDetectionResponseSchema = z.object({ reason: z.string(), verdict: z.enum(DELICIOUSNESS_VALUES), }); function getDeliciousnessPrompt(ai: Genkit) { return ai.definePrompt({ name: 'deliciousnessPrompt', input: { schema: z.object({ responseToTest: z.string(), }), }, output: { schema: DeliciousnessDetectionResponseSchema, }, prompt: `You are a food critic.
Assess whether the provided output sounds delicious, giving only "yes" (delicious), "no" (not delicious), or "maybe" (undecided) as the verdict. Examples: Output: Chicken parm sandwich Response: { "reason": "A classic and beloved dish.", "verdict": "yes" } Output: Boston Logan Airport tarmac Response: { "reason": "Not edible.", "verdict": "no" } Output: A juicy piece of gossip Response: { "reason": "Metaphorically 'tasty' but not food.", "verdict": "maybe" } New Output: {{ responseToTest }} Response: ` }); } ``` #### Define the scoring function [Section titled “Define the scoring function”](#define-the-scoring-function) Define a function that takes an example that includes `output` as required by the prompt, and scores the result. Genkit testcases include `input` as a required field, with `output` and `context` as optional fields. It is the responsibility of the evaluator to validate that all fields required for evaluation are present. ```ts import { ModelArgument } from 'genkit'; import { BaseEvalDataPoint, Score } from 'genkit/evaluator'; /** * Score an individual test case for delciousness. */ export async function deliciousnessScore( ai: Genkit, judgeLlm: ModelArgument, dataPoint: BaseEvalDataPoint, judgeConfig?: CustomModelOptions, ): Promise { const d = dataPoint; // Validate the input has required fields if (!d.output) { throw new Error('Output is required for Deliciousness detection'); } // Hydrate the prompt and generate an evaluation result const deliciousnessPrompt = getDeliciousnessPrompt(ai); const response = await deliciousnessPrompt( { responseToTest: d.output as string, }, { model: judgeLlm, config: judgeConfig, }, ); // Parse the output const parsedResponse = response.output; if (!parsedResponse) { throw new Error(`Unable to parse evaluator response: ${response.text}`); } // Return a scored response return { score: parsedResponse.verdict, details: { reasoning: parsedResponse.reason }, }; } ``` #### Define the evaluator action [Section titled “Define the evaluator action”](#define-the-evaluator-action) The final step is to write a function that defines the `EvaluatorAction`. ```ts import { EvaluatorAction } from 'genkit/evaluator'; /** * Create the Deliciousness evaluator action. */ export function createDeliciousnessEvaluator( ai: Genkit, judge: ModelArgument, judgeConfig?: z.infer, ): EvaluatorAction { return ai.defineEvaluator( { name: `myCustomEvals/deliciousnessEvaluator`, displayName: 'Deliciousness', definition: 'Determines if output is considered delicous.', isBilled: true, }, async (datapoint: BaseEvalDataPoint) => { const score = await deliciousnessScore(ai, judge, datapoint, judgeConfig); return { testCaseId: datapoint.testCaseId, evaluation: score, }; }, ); } ``` The `defineEvaluator` method is similar to other Genkit constructors like `defineFlow` and `defineRetriever`. This method requires an `EvaluatorFn` to be provided as a callback. The `EvaluatorFn` method accepts a `BaseEvalDataPoint` object, which corresponds to a single entry in a dataset under evaluation, along with an optional custom-options parameter if specified. The function processes the datapoint and returns an `EvalResponse` object. The Zod Schemas for `BaseEvalDataPoint` and `EvalResponse` are as follows. 
##### `BaseEvalDataPoint` [Section titled “BaseEvalDataPoint”](#baseevaldatapoint) ```ts export const BaseEvalDataPoint = z.object({ testCaseId: z.string(), input: z.unknown(), output: z.unknown().optional(), context: z.array(z.unknown()).optional(), reference: z.unknown().optional(), traceIds: z.array(z.string()).optional(), }); export const EvalResponse = z.object({ sampleIndex: z.number().optional(), testCaseId: z.string(), traceId: z.string().optional(), spanId: z.string().optional(), evaluation: z.union([ScoreSchema, z.array(ScoreSchema)]), }); ``` ##### `ScoreSchema` [Section titled “ScoreSchema”](#scoreschema) ```ts const ScoreSchema = z.object({ id: z.string().describe('Optional ID to differentiate multiple scores').optional(), score: z.union([z.number(), z.string(), z.boolean()]).optional(), error: z.string().optional(), details: z .object({ reasoning: z.string().optional(), }) .passthrough() .optional(), }); ``` The `defineEvaluator` configuration object lets the user provide a name, a user-readable display name, and a definition for the evaluator. The display name and definition are displayed along with evaluation results in the Dev UI. It also has an optional `isBilled` field that marks whether this evaluator can result in billing (e.g., it uses a billed LLM or API). If an evaluator is billed, the UI prompts the user for a confirmation in the CLI before allowing them to run an evaluation. This step helps guard against unintended expenses. ### Heuristic Evaluators [Section titled “Heuristic Evaluators”](#heuristic-evaluators) A heuristic evaluator can be any function used to evaluate the `input`, `context`, or `output` of your generative AI feature. Heuristic evaluators in Genkit are made up of 2 components: * A scoring function * An evaluator action #### Define the scoring function [Section titled “Define the scoring function”](#define-the-scoring-function-1) As with the LLM-based evaluator, define the scoring function. In this case, the scoring function does not need a judge LLM. ```ts import { BaseEvalDataPoint, Score } from 'genkit/evaluator'; const US_PHONE_REGEX = /[\+]?[(]?[0-9]{3}[)]?[-\s\.]?[0-9]{3}[-\s\.]?[0-9]{4}/i; /** * Scores whether a datapoint output contains a US phone number. */ export async function usPhoneRegexScore(dataPoint: BaseEvalDataPoint): Promise<Score> { const d = dataPoint; if (!d.output || typeof d.output !== 'string') { throw new Error('String output is required for regex matching'); } const matches = US_PHONE_REGEX.test(d.output as string); const reasoning = matches ? `Output matched US_PHONE_REGEX` : `Output did not match US_PHONE_REGEX`; return { score: matches, details: { reasoning }, }; } ``` #### Define the evaluator action [Section titled “Define the evaluator action”](#define-the-evaluator-action-1) ```ts import { Genkit } from 'genkit'; import { BaseEvalDataPoint, EvaluatorAction } from 'genkit/evaluator'; /** * Configures a regex evaluator to match a US phone number.
*/ export function createUSPhoneRegexEvaluator(ai: Genkit): EvaluatorAction { return ai.defineEvaluator( { name: `myCustomEvals/usPhoneRegexEvaluator`, displayName: 'Regex Match for US PHONE NUMBER', definition: 'Uses Regex to check if output matches a US phone number', isBilled: false, }, async (datapoint: BaseEvalDataPoint) => { const score = await usPhoneRegexScore(datapoint); return { testCaseId: datapoint.testCaseId, evaluation: score, }; }, ); } ``` ## Putting it together [Section titled “Putting it together”](#putting-it-together) ### Plugin definition [Section titled “Plugin definition”](#plugin-definition) Plugins are registered with the framework by installing them at the time of initializing Genkit. To define a new plugin, use the `genkitPlugin` helper method to instantiate all Genkit actions within the plugin context. This code sample shows two evaluators: the LLM-based deliciousness evaluator, and the regex-based US phone number evaluator. Instantiating these evaluators within the plugin context registers them with the plugin. ```ts import { GenkitPlugin, genkitPlugin } from 'genkit/plugin'; export function myCustomEvals(options: { judge: ModelArgument; judgeConfig?: ModelCustomOptions; }): GenkitPlugin { // Define the new plugin return genkitPlugin('myCustomEvals', async (ai: Genkit) => { const { judge, judgeConfig } = options; // The plugin instatiates our custom evaluators within the context // of the `ai` object, making them available // throughout our Genkit application. createDeliciousnessEvaluator(ai, judge, judgeConfig); createUSPhoneRegexEvaluator(ai); }); } export default myCustomEvals; ``` ### Configure Genkit [Section titled “Configure Genkit”](#configure-genkit) Add the `myCustomEvals` plugin to your Genkit configuration. For evaluation with Gemini, disable safety settings so that the evaluator can accept, detect, and score potentially harmful content. ```ts import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [ vertexAI(), ... myCustomEvals({ judge: googleAI.model("gemini-2.5-flash"), }), ], ... }); ``` ## Using your custom evaluators [Section titled “Using your custom evaluators”](#using-your-custom-evaluators) Once you instantiate your custom evaluators within the Genkit app context (either through a plugin or directly), they are ready to be used. The following example illustrates how to try out the deliciousness evaluator with a few sample inputs and outputs. 1. Create a json file `deliciousness_dataset.json` with the following content: ```json [ { "testCaseId": "delicous_mango", "input": "What is a super delicious fruit", "output": "A perfectly ripe mango – sweet, juicy, and with a hint of tropical sunshine." }, { "testCaseId": "disgusting_soggy_cereal", "input": "What is something that is tasty when fresh but less tasty after some time?", "output": "Stale, flavorless cereal that's been sitting in the box too long." } ] ``` 2. Use the Genkit CLI to run the evaluator against these test cases. ```bash # Start your genkit runtime genkit start -- genkit eval:run deliciousness_dataset.json --evaluators=myCustomEvals/deliciousnessEvaluator ``` 3. Navigate to `localhost:4000/evaluate` to view your results in the Genkit UI. It is important to note that confidence in custom evaluators increases as you benchmark them with standard datasets or approaches. Iterate on the results of such benchmarks to improve your evaluators’ performance until it reaches the targeted level of quality. 
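As noted above, you can also register custom evaluators directly, without wrapping them in a plugin, by calling the factory functions from this guide against your `ai` instance. The following is a minimal sketch, assuming `createDeliciousnessEvaluator` and `createUSPhoneRegexEvaluator` from this guide are exported from a local `./evaluators` module (the module path and judge model are illustrative):

```ts
import { genkit } from 'genkit';
import { googleAI } from '@genkit-ai/googleai';
// Hypothetical local module exporting the factory functions defined in this guide.
import { createDeliciousnessEvaluator, createUSPhoneRegexEvaluator } from './evaluators';

const ai = genkit({ plugins: [googleAI()] });

// Register the evaluators directly with the Genkit instance, without a plugin.
createDeliciousnessEvaluator(ai, googleAI.model('gemini-2.5-flash'));
createUSPhoneRegexEvaluator(ai);
```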
# Astra DB plugin > This document describes the Astra DB plugin for Genkit, providing a retriever and indexer for Astra DB, including installation, prerequisites, configuration, and usage. This plugin provides an [Astra DB](https://docs.datastax.com/en/astra-db-serverless/index.html) retriever and indexer for Genkit. ## Installation [Section titled “Installation”](#installation) ```bash npm install genkitx-astra-db ``` ## Prerequisites [Section titled “Prerequisites”](#prerequisites) You will need a DataStax account in which to run an Astra DB database. You can [sign up for a free DataStax account here](https://astra.datastax.com/signup). Once you have an account, create a Serverless Vector database. After the database has been provisioned, create a collection. Ensure that you choose the same number of dimensions as the embedding provider you are going to use. You will then need the database’s API Endpoint, an Application Token, and the name of the collection in order to configure the plugin. ## Configuration [Section titled “Configuration”](#configuration) To use the Astra DB plugin, specify it when you initialize Genkit: ```typescript import { genkit } from "genkit"; import { googleAI } from "@genkit-ai/googleai"; import { astraDB } from "genkitx-astra-db"; const ai = genkit({ plugins: [ astraDB([ { clientParams: { applicationToken: "your_application_token", apiEndpoint: "your_astra_db_endpoint", keyspace: "default_keyspace", }, collectionName: "your_collection_name", embedder: googleAI.embedder('gemini-embedding-001'), }, ]), ], }); ``` ### Client Parameters [Section titled “Client Parameters”](#client-parameters) You will need an Application Token and API Endpoint from Astra DB. You can either provide them through the `clientParams` object or by setting the environment variables `ASTRA_DB_APPLICATION_TOKEN` and `ASTRA_DB_API_ENDPOINT`. If you are using the default keyspace, you do not need to pass it in the configuration. ### Configuration Options [Section titled “Configuration Options”](#configuration-options) The Astra DB plugin accepts the following configuration options: * `collectionName`: (required) The name of the collection in your Astra DB database * `embedder`: (required) The embedding model to use, like Google’s `googleAI.embedder('gemini-embedding-001')`. Ensure that you have set up your collection with the correct number of dimensions for the embedder that you are using * `clientParams`: (optional) Astra DB connection configuration with the following properties: * `applicationToken`: Your Astra DB application token * `apiEndpoint`: Your Astra DB API endpoint * `keyspace`: (optional) Your Astra DB keyspace, defaults to “default\_keyspace” ### Astra DB Vectorize [Section titled “Astra DB Vectorize”](#astra-db-vectorize) You do not need to provide an `embedder`, because you can use [Astra DB Vectorize](https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html) to generate your vectors. Ensure that you have [set up your collection with an embedding provider](https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html#external-embedding-provider-integrations). 
You can then skip the `embedder` option: ```typescript import { genkit } from "genkit"; import { astraDB } from "genkitx-astra-db"; const ai = genkit({ plugins: [ astraDB([ { clientParams: { applicationToken: "your_application_token", apiEndpoint: "your_astra_db_endpoint", keyspace: "default_keyspace", }, collectionName: "your_collection_name", }, ]), ], }); ``` ## Usage [Section titled “Usage”](#usage) Import the indexer and retriever references like so: ```typescript import { astraDBIndexerRef, astraDBRetrieverRef } from "genkitx-astra-db"; ``` Then get a reference using the `collectionName` and an optional `displayName`, and pass the relevant references to the Genkit functions `index()` or `retrieve()`. ### Indexing [Section titled “Indexing”](#indexing) Use the indexer reference with `ai.index()`: ```typescript export const astraDBIndexer = astraDBIndexerRef({ collectionName: "your_collection_name", }); await ai.index({ indexer: astraDBIndexer, documents, }); ``` ### Retrieval [Section titled “Retrieval”](#retrieval) Use the retriever reference with `ai.retrieve()`: ```typescript export const astraDBRetriever = astraDBRetrieverRef({ collectionName: "your_collection_name", }); await ai.retrieve({ retriever: astraDBRetriever, query, }); ``` #### Retrieval Options [Section titled “Retrieval Options”](#retrieval-options) You can pass options to `retrieve()` that will affect the retriever. The available options are: * `k`: The number of documents to return from the retriever. The default is 5. * `filter`: A `Filter` as defined by the [Astra DB library](https://docs.datastax.com/en/astra-api-docs/_attachments/typescript-client/types/Filter.html). See below for how to use a filter. #### Advanced Retrieval [Section titled “Advanced Retrieval”](#advanced-retrieval) If you want to perform a vector search with additional filtering (hybrid search), you can pass a schema type to `astraDBRetrieverRef`. For example: ```typescript type Schema = { _id: string; text: string; score: number; }; export const astraDBRetriever = astraDBRetrieverRef<Schema>({ collectionName: "your_collection_name", }); await ai.retrieve({ retriever: astraDBRetriever, query, options: { filter: { score: { $gt: 75 }, }, }, }); ``` You can find the [operators that you can use in filters in the Astra DB documentation](https://docs.datastax.com/en/astra-db-serverless/api-reference/overview.html#operators). If you don’t provide a schema type, you can still filter but you won’t get type-checking on the filtering options. ## Further Information [Section titled “Further Information”](#further-information) For more on using indexers and retrievers with Genkit, check out the documentation on [Retrieval-Augmented Generation with Genkit](/docs/rag). ## Learn More [Section titled “Learn More”](#learn-more) For more information, feedback, or to report issues, visit the [Astra DB plugin GitHub repository](https://github.com/datastax/genkitx-astra-db/tree/main). # Auth0 AI plugin > This document describes the Auth0 AI plugin for Genkit, which provides features for building secure AI-powered applications using Auth0, Okta FGA, and Genkit. The Auth0 AI plugin (`@auth0/ai-genkit`) is an SDK for building secure AI-powered applications using [Auth0](https://www.auth0.ai/), [Okta FGA](https://docs.fga.dev/) and Genkit. ## Features [Section titled “Features”](#features) * **Authorization for RAG**: Securely filter documents using Okta FGA as a [retriever](https://js.langchain.com/docs/concepts/retrievers/) for RAG applications. 
This smart retriever performs efficient batch access control checks, ensuring users only see documents they have permission to access. * **Tool Authorization with FGA**: Protect AI tool execution with fine-grained authorization policies through Okta FGA integration, controlling which users can invoke specific tools based on custom authorization rules. * **Client Initiated Backchannel Authentication (CIBA)**: Implement secure, out-of-band user authorization for sensitive AI operations using the [CIBA standard](https://openid.net/specs/openid-client-initiated-backchannel-authentication-core-1_0.html), enabling user confirmation without disrupting the main interaction flow. * **Federated API Access**: Seamlessly connect to third-party services by leveraging Auth0’s Tokens For APIs feature, allowing AI tools to access users’ connected services (like Google, Microsoft, etc.) with proper authorization. * **Device Authorization Flow**: Support headless and input-constrained environments with the [Device Authorization Flow](https://auth0.com/docs/get-started/authentication-and-authorization-flow/device-authorization-flow), enabling secure user authentication without direct input capabilities. ## Installation [Section titled “Installation”](#installation) Caution `@auth0/ai-genkit` is currently **under heavy development**. We strictly follow [Semantic Versioning (SemVer)](https://semver.org/), meaning all **breaking changes will only occur in major versions**. However, please note that during this early phase, **major versions may be released frequently** as the API evolves. We recommend locking versions when using this in production. ```bash npm install @auth0/ai @auth0/ai-genkit ``` ## Initialization [Section titled “Initialization”](#initialization) Initialize the SDK with your Auth0 credentials: ```javascript import { Auth0AI, setAIContext } from "@auth0/ai-genkit"; import { genkit } from "genkit/beta"; import { googleAI } from "@genkit-ai/googleai"; // Initialize Genkit const ai = genkit({ plugins: [googleAI()], model: googleAI.model('gemini-2.5-flash'), }); // Initialize Auth0AI const auth0AI = new Auth0AI({ // Alternatively, you can use the `AUTH0_DOMAIN`, `AUTH0_CLIENT_ID`, and `AUTH0_CLIENT_SECRET` // environment variables. auth0: { domain: "YOUR_AUTH0_DOMAIN", clientId: "YOUR_AUTH0_CLIENT_ID", clientSecret: "YOUR_AUTH0_CLIENT_SECRET", }, // store: new MemoryStore(), // Optional: Use a custom store genkit: ai }); ``` ## Calling APIs [Section titled “Calling APIs”](#calling-apis) The “Tokens for API” feature of Auth0 allows you to exchange refresh tokens for access tokens for third-party APIs. This is useful when you want to use a federated connection (like Google, Facebook, etc.) to authenticate users and then use the access token to call the API on behalf of the user. 
First initialize the Federated Connection Authorizer as follows: ```javascript const withGoogleAccess = auth0AI.withTokenForConnection({ // An optional function to specify where to retrieve the token // This is the default: refreshToken: async (params) => { return context.refreshToken; }, // The connection name: connection: "google-oauth2", // The scopes to request: scopes: ["https://www.googleapis.com/auth/calendar.freebusy"], }); ``` Then use `withGoogleAccess` to wrap the tool and use `getAccessTokenForConnection` from the SDK to get the access token (the example below posts to the Google Calendar freeBusy endpoint): ```javascript import { getAccessTokenForConnection } from "@auth0/ai-genkit"; import { FederatedConnectionError } from "@auth0/ai/interrupts"; import { addHours } from "date-fns"; import { z } from "zod"; export const checkCalendarTool = ai.defineTool( ...withGoogleAccess({ name: "check_user_calendar", description: "Check user availability on a given date time on their calendar", inputSchema: z.object({ date: z.coerce.date(), }), outputSchema: z.object({ available: z.boolean(), }), }, async ({ date }) => { const accessToken = getAccessTokenForConnection(); const body = JSON.stringify({ timeMin: date, timeMax: addHours(date, 1), timeZone: "UTC", items: [{ id: "primary" }], }); const url = "https://www.googleapis.com/calendar/v3/freeBusy"; const response = await fetch(url, { method: "POST", headers: { Authorization: `Bearer ${accessToken}`, "Content-Type": "application/json", }, body, }); if (!response.ok) { if (response.status === 401) { throw new FederatedConnectionError( `Authorization required to access the Federated Connection` ); } throw new Error( `Invalid response from Google Calendar API: ${ response.status } - ${await response.text()}` ); } const busyResp = await response.json(); return { available: busyResp.calendars.primary.busy.length === 0 }; } )); ``` ## CIBA: Client-Initiated Backchannel Authentication [Section titled “CIBA: Client-Initiated Backchannel Authentication”](#ciba-client-initiated-backchannel-authentication) CIBA (Client-Initiated Backchannel Authentication) enables secure, user-in-the-loop authentication for sensitive operations. This flow allows you to request user authorization asynchronously and resume execution once authorization is granted. ```javascript const buyStockAuthorizer = auth0AI.withAsyncUserConfirmation({ // A callback to retrieve the userID from tool context. 
userID: (_params, config) => { return config.configurable?.user_id; }, // The message the user will see on the notification bindingMessage: async ({ qty , ticker }) => { return `Confirm the purchase of ${qty} ${ticker}`; }, // The scopes and audience to request audience: process.env["AUDIENCE"], scopes: ["stock:trade"] }); ``` Then wrap the tool as follows: ```javascript import { z } from "zod"; import { getCIBACredentials } from "@auth0/ai-genkit"; export const buyTool = ai.defineTool( ...buyStockAuthorizer({ name: "buy_stock", description: "Execute a stock purchase given stock ticker and quantity", inputSchema: z.object({ tradeID: z .string() .uuid() .describe("The unique identifier for the trade provided by the user"), userID: z .string() .describe("The user ID of the user who created the conditional trade"), ticker: z.string().describe("The stock ticker to trade"), qty: z .number() .int() .positive() .describe("The quantity of shares to trade"), }), outputSchema: z.string(), }, async ({ ticker, qty }) => { const { accessToken } = getCIBACredentials(); fetch("http://yourapi.com/buy", { method: "POST", headers: { "Content-Type": "application/json", Authorization: `Bearer ${accessToken}`, }, body: JSON.stringify({ ticker, qty }), }); return `Purchased ${qty} shares of ${ticker}`; }) ); ``` ### CIBA with RAR (Rich Authorization Requests) [Section titled “CIBA with RAR (Rich Authorization Requests)”](#ciba-with-rar-rich-authorization-requests) Auth0 supports RAR (Rich Authorization Requests) for CIBA. This allows you to provide additional authorization parameters to be displayed during the user confirmation request. When defining the tool authorizer, you can specify the `authorizationDetails` parameter to include detailed information about the authorization being requested: ```javascript const buyStockAuthorizer = auth0AI.withAsyncUserConfirmation({ // A callback to retrieve the userID from tool context. userID: (_params, config) => { return config.configurable?.user_id; }, // The message the user will see on the notification bindingMessage: async ({ qty , ticker }) => { return `Confirm the purchase of ${qty} ${ticker}`; }, authorizationDetails: async ({ qty, ticker }) => { return [{ type: "trade_authorization", qty, ticker, action: "buy" }]; }, // The scopes and audience to request audience: process.env["AUDIENCE"], scopes: ["stock:trade"] }); ``` To use RAR with CIBA, you need to [set up authorization details](https://auth0.com/docs/get-started/apis/configure-rich-authorization-requests) in your Auth0 tenant. This includes defining the authorization request parameters and their types. Additionally, the [Guardian SDK](https://auth0.com/docs/secure/multi-factor-authentication/auth0-guardian) is required to handle these authorization details in your authorizer app. For more information on setting up RAR with CIBA, refer to: * [Configure Rich Authorization Requests (RAR)](https://auth0.com/docs/get-started/apis/configure-rich-authorization-requests) * [User Authorization with CIBA](https://auth0.com/docs/get-started/authentication-and-authorization-flow/client-initiated-backchannel-authentication-flow/user-authorization-with-ciba) ## Device Flow Authorizer [Section titled “Device Flow Authorizer”](#device-flow-authorizer) The Device Flow Authorizer enables secure, user-in-the-loop authentication for devices or tools that cannot directly authenticate users. It uses the OAuth 2.0 Device Authorization Grant to request user authorization and resume execution once authorization is granted. 
```javascript import { auth0 } from "./auth0"; export const deviceFlowAuthorizer = auth0AI.withDeviceAuthorizationFlow({ // The scopes and audience to request scopes: ["read:data", "write:data"], audience: "https://api.example.com", }); ``` Then wrap the tool as follows: ```javascript import { z } from "zod"; import { getDeviceAuthorizerCredentials } from "@auth0/ai-genkit"; export const fetchData = ai.defineTool( ...deviceFlowAuthorizer({ name: "fetch_data", description: "Fetch data from a secure API", inputSchema: z.object({ resourceID: z.string().describe("The ID of the resource to fetch"), }), outputSchema: z.any(), }, async ({ resourceID }) => { const credentials = getDeviceAuthorizerCredentials(); const response = await fetch(`https://api.example.com/resource/${resourceID}`, { headers: { Authorization: `Bearer ${credentials.accessToken}`, }, }); if (!response.ok) { throw new Error(`Failed to fetch resource: ${response.statusText}`); } return await response.json(); }) ); ``` ## FGA [Section titled “FGA”](#fga) ```javascript import { Auth0AI } from "@auth0/ai-genkit"; const fgaAI = new Auth0AI.FGA({ apiScheme, apiHost, storeId, credentials: { method: CredentialsMethod.ClientCredentials, config: { apiTokenIssuer, clientId, clientSecret, }, }, }); // Alternatively you can use env variables: `FGA_API_SCHEME`, `FGA_API_HOST`, `FGA_STORE_ID`, `FGA_API_TOKEN_ISSUER`, `FGA_CLIENT_ID` and `FGA_CLIENT_SECRET` ``` Then initialize the tool wrapper: ```javascript const authorizedTool = fgaAI.withFGA( { buildQuery: async ({ userID, doc }) => ({ user: userID, object: doc, relation: "read", }), }, myAITool ); // Or create a wrapper to apply to tools later const authorizer = fgaAI.withFGA({ buildQuery: async ({ userID, doc }) => ({ user: userID, object: doc, relation: "read", }), }); const authorizedTool = authorizer(myAITool); ``` Note The parameters given to the `buildQuery` function are the same as those provided to the tool’s `execute` function. ## RAG with FGA [Section titled “RAG with FGA”](#rag-with-fga) Auth0 AI can leverage OpenFGA to authorize RAG applications. The `FGARetriever` can be used to filter documents based on access control checks defined in Okta FGA. This retriever performs batch checks on retrieved documents, returning only the ones that pass the specified access criteria. Create a Retriever instance using the `FGARetriever.create` method: ```javascript // From examples/langchain/retrievers-with-fga import { FGARetriever } from "@auth0/ai-genkit/RAG"; import { MemoryStore, RetrievalChain } from "./helpers/memory-store"; import { readDocuments } from "./helpers/read-documents"; async function main() { // UserID const user = "user1"; const documents = await readDocuments(); // 1. Call helper function to load LangChain MemoryStore const vectorStore = await MemoryStore.fromDocuments(documents); // 2. Call helper function to create a LangChain retrieval chain. const retrievalChain = await RetrievalChain.create({ // 3. Decorate the retriever with the FGARetriever to check permissions. retriever: FGARetriever.create({ retriever: vectorStore.asRetriever(), buildQuery: (doc) => ({ user: `user:${user}`, object: `doc:${doc.metadata.id}`, relation: "viewer", }), }), }); // 4. Execute the query const { answer } = await retrievalChain.query({ query: "Show me forecast for ZEKO?", }); console.log(answer); } main().catch(console.error); ``` ## Handling Interrupts [Section titled “Handling Interrupts”](#handling-interrupts) Auth0 AI uses interrupts extensively and never blocks a graph. 
Whenever an authorizer requires user interaction, the graph throws a `ToolInterruptError` with data that allows the client to resume the flow. Handle the interrupts as follows: ```javascript import { AuthorizationPendingInterrupt } from '@auth0/ai/interrupts'; const tools = [ myProtectedTool ]; const response = await ai.generate({ tools, prompt: "Transfer $1000 to account ABC123", }); const interrupt = response.interrupts[0]; if (interrupt && AuthorizationPendingInterrupt.is(interrupt.metadata)) { // do something const tool = tools.find(t => t.name === interrupt.metadata.toolCall.toolName); tool.restart( interrupt, // resume data if needed ); } ``` Note Since Auth0 AI persists state on the backend, you typically don’t need to reattach the interrupt’s information when resuming. ## Learn More [Section titled “Learn More”](#learn-more) For more information, feedback, or to report issues, visit the [Auth0 AI for Genkit GitHub repository](https://github.com/auth0-lab/auth0-ai-js/tree/main/packages/ai-genkit). # Chroma plugin > This document describes the Chroma plugin for Genkit, which provides indexer and retriever implementations for the Chroma vector database in client/server mode. The Chroma plugin provides indexer and retriever implementations that use the [Chroma](https://docs.trychroma.com/) vector database in client/server mode. ## Installation [Section titled “Installation”](#installation) ```bash npm install genkitx-chromadb ``` ## Configuration [Section titled “Configuration”](#configuration) To use this plugin, specify it when you initialize Genkit: ```ts import { genkit } from 'genkit'; import { chroma } from 'genkitx-chromadb'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [ chroma([ { collectionName: 'bob_collection', embedder: googleAI.embedder('gemini-embedding-001'), }, ]), ], }); ``` You must specify a Chroma collection and the embedding model you want to use. In addition, there are two optional parameters: * `clientParams`: If you’re not running your Chroma server on the same machine as your Genkit flow, if you need to specify auth options, or if you’re otherwise not running a default Chroma server configuration, you can specify a Chroma [`ChromaClientParams` object](https://docs.trychroma.com/js_reference/Client) to pass to the Chroma client: ```ts clientParams: { path: "http://192.168.10.42:8000", } ``` * `embedderOptions`: Use this parameter to pass options to the embedder: ```ts embedderOptions: { taskType: 'RETRIEVAL_DOCUMENT' }, ``` ## Usage [Section titled “Usage”](#usage) Import retriever and indexer references like so: ```ts import { chromaRetrieverRef } from 'genkitx-chromadb'; import { chromaIndexerRef } from 'genkitx-chromadb'; ``` Then, use the references with `ai.retrieve()` and `ai.index()`: ```ts // To use the index you configured when you loaded the plugin: let docs = await ai.retrieve({ retriever: chromaRetrieverRef, query }); // To specify an index: export const bobFactsRetriever = chromaRetrieverRef({ collectionName: 'bob-facts', }); docs = await ai.retrieve({ retriever: bobFactsRetriever, query }); ``` ```ts // To use the index you configured when you loaded the plugin: await ai.index({ indexer: chromaIndexerRef, documents }); // To specify an index: export const bobFactsIndexer = chromaIndexerRef({ collectionName: 'bob-facts', }); await ai.index({ indexer: bobFactsIndexer, documents }); ``` See the [Retrieval-augmented generation](/docs/rag) page for a general discussion on indexers and retrievers. 
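To tie retrieval into generation, you can pass the documents returned by `ai.retrieve()` to `ai.generate()` as context. The following is a minimal sketch, not part of the Chroma plugin's API, assuming the `ai` instance and `bobFactsRetriever` reference defined above and a Gemini model from the `googleAI` plugin:

```ts
import { z } from 'genkit';
import { googleAI } from '@genkit-ai/googleai';

// Assumes `ai` and `bobFactsRetriever` are defined as in the snippets above.
export const bobFactsFlow = ai.defineFlow(
  {
    name: 'bobFactsFlow',
    inputSchema: z.object({ question: z.string() }),
    outputSchema: z.object({ answer: z.string() }),
  },
  async ({ question }) => {
    // Retrieve relevant documents from the Chroma collection.
    const docs = await ai.retrieve({ retriever: bobFactsRetriever, query: question });

    // Pass the retrieved documents to the model as grounding context.
    const { text } = await ai.generate({
      model: googleAI.model('gemini-2.5-flash'),
      prompt: `Using only the provided context, answer this question: ${question}`,
      docs,
    });

    return { answer: text };
  },
);
```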
# Cloud SQL for PostgreSQL plugin > This document describes the Cloud SQL for PostgreSQL plugin for Genkit, providing indexer and retriever implementations that use PostgreSQL with the pgvector extension for vector similarity search. The Cloud SQL for PostgreSQL plugin provides indexer and retriever implementations that use PostgreSQL with the pgvector extension for vector similarity search. ## Installation [Section titled “Installation”](#installation) ```posix-terminal npm i --save genkitx-cloud-sql-pg ``` ## Configuration [Section titled “Configuration”](#configuration) To use this plugin, first create a `PostgresEngine` instance: ```ts import { PostgresEngine } from 'genkitx-cloud-sql-pg'; // Create PostgresEngine instance const engine = await PostgresEngine.fromInstance('my-project', 'us-central1', 'my-instance', 'my-database'); // Create the vector store table await engine.initVectorstoreTable('my-documents', 768); // Or create a custom vector store table await engine.initVectorstoreTable('my-documents', 768, { schemaName: 'public', contentColumn: 'content', embeddingColumn: 'embedding', idColumn: 'custom_id', // Custom ID column name metadataColumns: [ { name: 'source', dataType: 'TEXT' }, { name: 'category', dataType: 'TEXT' } ], metadataJsonColumn: 'metadata', storeMetadata: true, overwriteExisting: true }); ``` Then, specify the plugin when you initialize Genkit: ```ts import { genkit } from 'genkit'; import { postgres } from 'genkitx-cloud-sql-pg'; import { vertexAI } from '@genkit-ai/vertexai'; const ai = genkit({ plugins: [ postgres([ { tableName: 'my-documents', engine: engine, embedder: vertexAI.embedder('gemini-embedding-001'), // Use additional fields to connect to a custom vector store table // schemaName: 'public', // contentColumn: 'custom_content', // embeddingColumn: 'custom_embedding', // idColumn: 'custom_id', // Match the ID column from table creation // metadataColumns: ['source', 'category'], // metadataJsonColumn: 'my_json_metadata', }, ]), ], }); // To use the table you configured when you loaded the plugin: await ai.index({ indexer: postgresIndexerRef, documents: [ { content: [{ text: "The product features include..." }], metadata: { source: "website", category: "product-docs", custom_id: "doc-123" // This will be used as the document ID } } ] }); // To retrieve from the configured table: const query = "What are the key features of the product?"; let docs = await ai.retrieve({ retriever: postgresRetrieverRef, query, options: { k: 5, filter: { category: 'product-docs', source: 'website' } } }); ``` ## Usage [Section titled “Usage”](#usage) Import retriever and indexer references like so: ```ts import { postgresRetrieverRef, postgresIndexerRef } from 'genkitx-cloud-sql-pg'; ``` ### Index Documents [Section titled “Index Documents”](#index-documents) You can create reusable references for your indexers: ```ts export const myDocumentsIndexer = postgresIndexerRef({ tableName: 'my-custom-documents', idColumn: 'custom_id', metadataColumns: ['source', 'category'] }); ``` Then use them to index documents: ```ts // Index with custom ID from metadata const docWithCustomId = new Document({ content: [{ text: 'Document with custom ID' }], metadata: { source: 'test', category: 'docs', custom_id: 'custom-123' } }); await ai.index({ indexer: myDocumentsIndexer, documents: [docWithCustomId] }); // Index with custom batch size await ai.index({ indexer: myDocumentsIndexer, documents: [ { content: [{ text: "The product features include..." 
}], metadata: { source: "website", category: "product-docs", custom_id: "doc-456" } } ], options: { batchSize: 10 } }); ``` ### Indexing Options [Section titled “Indexing Options”](#indexing-options) The indexer supports: * `batchSize`: Number of documents to process at once * Custom ID and metadata handling through table configuration ### Retrieve Documents [Section titled “Retrieve Documents”](#retrieve-documents) You can create reusable references for your retrievers: ```ts export const myDocumentsRetriever = postgresRetrieverRef({ tableName: 'my-documents', idColumn: 'custom_id', metadataColumns: ['source', 'category'] }); ``` Then use them to retrieve documents: ```ts // Basic retrieval const query = "What are the key features of the product?"; let docs = await ai.retrieve({ retriever: myDocumentsRetriever, query, options: { k: 5, // Number of documents to return (default: 4, max: 1000) filter: "source = 'website'" // Optional SQL WHERE clause } }); // Access retrieved documents and their metadata console.log(docs.documents[0].content); // Document content console.log(docs.documents[0].metadata.source); // Metadata fields console.log(docs.documents[0].metadata.category); ``` #### Retriever Options [Section titled “Retriever Options”](#retriever-options) The retriever supports the following options: * `k`: Number of documents to return (default: 4, max: 1000) * `filter`: SQL WHERE clause to filter results (e.g., `"category = 'docs' AND source = 'website'"`) #### Distance Strategies [Section titled “Distance Strategies”](#distance-strategies) The retriever supports different distance strategies for vector similarity search: ```ts import { DistanceStrategy } from 'genkitx-cloud-sql-pg'; // Configure retriever with specific distance strategy const myDocumentsRetriever = postgresRetrieverRef({ tableName: 'my-documents', distanceStrategy: DistanceStrategy.COSINE_DISTANCE // or EUCLIDEAN_DISTANCE }); ``` Available strategies: * COSINE\_DISTANCE: Cosine similarity (default) * EUCLIDEAN\_DISTANCE: Euclidean distance * DOT\_PRODUCT: Dot product similarity #### Metadata Handling [Section titled “Metadata Handling”](#metadata-handling) The retriever preserves all metadata fields when returning documents. You can access both individual metadata columns and the JSON metadata column: ```ts // Example 1: Search for product documentation const productQuery = "How do I configure the API rate limits?"; const productDocs = await ai.retrieve({ retriever: myDocumentsRetriever, query: productQuery, options: { k: 3, filter: "category = 'api-docs' AND source = 'product-manual'" } }); // Example 2: Search for customer support articles const supportQuery = "What are the troubleshooting steps for connection issues?"; const supportDocs = await ai.retrieve({ retriever: myDocumentsRetriever, query: supportQuery, options: { k: 5, filter: "category = 'troubleshooting' AND source = 'support-kb'" } }); // Access retrieved documents and their metadata console.log(productDocs.documents[0].content); // Document content console.log(productDocs.documents[0].metadata.source); // e.g., "product-manual" console.log(productDocs.documents[0].metadata.category); // e.g., "api-docs" console.log(productDocs.documents[0].metadata.lastUpdated); // e.g., "2024-03-15" ``` See the [Retrieval-augmented generation](/docs/rag) page for a general discussion on indexers and retrievers. # OpenAI-Compatible Plugin > Learn how to configure and use the Genkit OpenAI-compatible plugin to access models through any OpenAI-compatible API. 
The `@genkit-ai/compat-oai` package provides plugins for services that are compatible with the OpenAI API specification. This includes official OpenAI services as well as other model providers and local servers that expose an OpenAI-compatible endpoint. This package contains four main exports: * `openAICompatible`: A general-purpose plugin for any OpenAI-compatible service. * [`openAI`](/docs/plugins/openai): A pre-configured plugin for OpenAI’s own services (GPT models, DALL-E, etc.). * [`xai`](/docs/plugins/xai): A pre-configured plugin for xAI (Grok) models. * [`deepSeek`](/docs/plugins/deepseek): A pre-configured plugin for DeepSeek models. ## Installation [Section titled “Installation”](#installation) ```bash npm install @genkit-ai/compat-oai ``` ## General-Purpose OpenAI-Compatible Plugin [Section titled “General-Purpose OpenAI-Compatible Plugin”](#general-purpose-openai-compatible-plugin) You can use the `openAICompatible` plugin factory to connect to any service that exposes an OpenAI-compatible API. This is useful for custom or self-hosted models, such as those served via [Ollama](https://ollama.com/). To use this plugin, import `openAICompatible` and specify it in your Genkit configuration. You must provide a unique `name` for each instance, and client options like `baseURL` and `apiKey`. ### Configuration [Section titled “Configuration”](#configuration) The `openAICompatible` plugin takes an options object with the following parameters: * `name`: (Required) A unique name for the plugin instance (e.g., `'ollama'`, `'my-custom-llm'`). * `apiKey`: The API key for the service. For local services, this can often be a placeholder string like `'ollama'`. * `baseURL`: The base URL of the OpenAI-compatible API endpoint (e.g., `'http://localhost:11434/v1'` for Ollama). * Other options from the OpenAI Node.js SDK’s `ClientOptions` can also be included, such as `timeout` or `defaultHeaders`. Here’s an example of how to configure the plugin for a local Ollama instance: ```ts import { genkit } from 'genkit'; import { openAICompatible } from '@genkit-ai/compat-oai'; export const ai = genkit({ plugins: [ openAICompatible({ name: 'localLlama', apiKey: 'ollama', // Required, but can be a placeholder for local servers baseURL: 'http://localhost:11434/v1', // Example for Ollama }), ], }); ``` ### Usage [Section titled “Usage”](#usage) Once configured, you need to define a `modelRef` to interact with your custom model. A `modelRef` is a reference that tells Genkit how to use a specific model, including its name and any supported features. The model name in the `modelRef` should be prefixed with the `name` you gave the plugin instance, followed by a `/` and the model ID from the service. ```ts import { genkit, modelRef, z } from 'genkit'; import { openAICompatible } from '@genkit-ai/compat-oai'; // In your Genkit config... const ai = genkit({ plugins: [ openAICompatible({ name: 'localLlama', apiKey: 'ollama', baseURL: 'http://localhost:11434/v1', }), ], }); // Define a reference to your model export const myLocalModel = modelRef({ name: 'localLlama/llama3', // You can specify model-specific configuration here if needed. // For many custom models, Genkit's default capabilities are sufficient. 
}); // Use the model in a flow export const localLlamaFlow = ai.defineFlow( { name: 'localLlamaFlow', inputSchema: z.object({ subject: z.string() }), outputSchema: z.object({ joke: z.string() }), }, async ({ subject }) => { const llmResponse = await ai.generate({ model: myLocalModel, prompt: `Tell me a joke about ${subject}.`, }); return { joke: llmResponse.text }; } ); ``` In this example, `'localLlama/llama3'` tells Genkit to use the `llama3` model provided by the `localLlama` plugin instance. ### Passing Model Configuration [Section titled “Passing Model Configuration”](#passing-model-configuration) You can pass configuration options to the model in the `generate` call. The available options depend on the specific model you are using. Common options include `temperature`, `maxOutputTokens`, etc. These are passed through to the underlying service. ```ts const llmResponse = await ai.generate({ model: myLocalModel, prompt: 'Tell me a joke about a llama.', config: { temperature: 0.9, }, }); ``` # DeepSeek Plugin > Learn how to configure and use Genkit DeepSeek plugin to access DeepSeek models. The `@genkit-ai/compat-oai` package includes a pre-configured plugin for [DeepSeek](https://www.deepseek.com/) models. Note The DeepSeek plugin is built on top of the `openAICompatible` plugin. It is pre-configured for DeepSeek’s API endpoints, so you don’t need to provide a `baseURL`. ## Installation [Section titled “Installation”](#installation) ```bash npm install @genkit-ai/compat-oai ``` ## Configuration [Section titled “Configuration”](#configuration) To use this plugin, import `deepSeek` and specify it when you initialize Genkit. ```ts import { genkit } from 'genkit'; import { deepSeek } from '@genkit-ai/compat-oai/deepseek'; export const ai = genkit({ plugins: [deepSeek()], }); ``` You must provide an API key from DeepSeek. You can get an API key from your [DeepSeek account settings](https://platform.deepseek.com/). Configure the plugin to use your API key by doing one of the following: * Set the `DEEPSEEK_API_KEY` environment variable to your API key. * Specify the API key when you initialize the plugin: ```ts deepSeek({ apiKey: yourKey }); ``` As always, avoid embedding API keys directly in your code. ## Usage [Section titled “Usage”](#usage) Use the `deepSeek.model()` helper to reference a DeepSeek model. ```ts import { genkit, z } from 'genkit'; import { deepSeek } from '@genkit-ai/compat-oai/deepseek'; const ai = genkit({ plugins: [deepSeek({ apiKey: process.env.DEEPSEEK_API_KEY })], }); export const deepseekFlow = ai.defineFlow( { name: 'deepseekFlow', inputSchema: z.object({ subject: z.string() }), outputSchema: z.object({ information: z.string() }), }, async ({ subject }) => { // Reference a model const deepseekChat = deepSeek.model('deepseek-chat'); // Use it in a generate call const llmResponse = await ai.generate({ model: deepseekChat, prompt: `Tell me something about ${subject}.`, }); return { information: llmResponse.text }; }, ); ``` You can also pass model-specific configuration: ```ts const llmResponse = await ai.generate({ model: deepSeek.model('deepseek-chat'), prompt: 'Tell me something about deep learning.', config: { temperature: 0.8, maxTokens: 1024, }, }); ``` ## Advanced usage [Section titled “Advanced usage”](#advanced-usage) ### Passthrough configuration [Section titled “Passthrough configuration”](#passthrough-configuration) You can pass configuration options that are not defined in the plugin’s custom config schema. 
This permits you to access new models and features without having to update your Genkit version. ```ts import { genkit } from 'genkit'; import { deepSeek } from '@genkit-ai/compat-oai/deepSeek'; const ai = genkit({ plugins: [deepSeek()], }); const llmResponse = await ai.generate({ prompt: `Tell me a cool story`, model: deepSeek.model('deepseek-new'), // hypothetical new model config: { new_feature_parameter: ... // hypothetical config needed for new model }, }); ``` Genkit passes this configuration as-is to the DeepSeek API giving you access to the new model features. Note that the field name and types are not validated by Genkit and should match the DeepSeek API specification to work. # Express plugin > The Genkit Express plugin provides utilities for conveniently exposing Genkit flows and actions via an Express HTTP server as REST APIs. The Genkit Express plugin provides utilities for conveniently exposing Genkit flows and actions via an Express HTTP server as REST APIs. This allows you to integrate your Genkit applications with existing Express-based backends or deploy them to any platform that can serve an Express.js app. ## Installation [Section titled “Installation”](#installation) To use this plugin, install it in your project: ```bash npm i @genkit-ai/express ``` ## Usage [Section titled “Usage”](#usage) You can expose your Genkit flows and actions as REST API endpoints using the `expressHandler` function. First, define your Genkit flow: ```typescript import { genkit, z } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; import { expressHandler } from '@genkit-ai/express'; import express from 'express'; const ai = genkit({ plugins: [googleAI()], model: googleAI.model('gemini-2.5-flash'), }); const simpleFlow = ai.defineFlow( { name: 'simpleFlow', inputSchema: z.object({ input: z.string() }), outputSchema: z.object({ output: z.string() }), }, async ({ input }, { sendChunk }) => { const { text } = await ai.generate({ prompt: input, onChunk: (c) => sendChunk(c.text), }); return { output: text }; }, ); const app = express(); app.use(express.json()); app.post('/simpleFlow', expressHandler(simpleFlow)); app.listen(8080, () => { console.log('Express server listening on port 8080'); }); ``` ### Accessing flows from the client [Section titled “Accessing flows from the client”](#accessing-flows-from-the-client) Flows and actions exposed using the `expressHandler` function can be accessed using the `genkit/beta/client` library: ```typescript import { runFlow, streamFlow } from 'genkit/beta/client'; // Example: Running a flow const result = await runFlow({ url: `http://localhost:8080/simpleFlow`, input: { input: 'say hello' }, }); console.log(result); // { output: "hello" } // Example: Streaming a flow const streamResult = streamFlow({ url: `http://localhost:8080/simpleFlow`, input: { input: 'say hello' }, }); for await (const chunk of streamResult.stream) { console.log(chunk); } console.log(await streamResult.output); ``` ## Authentication [Section titled “Authentication”](#authentication) You can handle authentication for your Express endpoints using context providers with `expressHandler`. This allows you to implement custom authorization logic based on incoming request data. 
```typescript import { UserFacingError } from 'genkit'; import { ContextProvider, RequestData } from 'genkit/context'; import { expressHandler } from '@genkit-ai/express'; import express from 'express'; // Define a custom context provider for authentication const authContext: ContextProvider = (req: RequestData) => { if (req.headers['authorization'] !== 'open sesame') { throw new UserFacingError('PERMISSION_DENIED', 'not authorized'); } return { auth: { user: 'Ali Baba', }, }; }; // Example middleware for authentication (optional, can be integrated directly into context provider) const authMiddleware = (req: express.Request, res: express.Response, next: express.NextFunction) => { if (req.headers['authorization'] !== 'open sesame') { return res.status(403).send('Unauthorized'); } next(); }; const app = express(); app.use(express.json()); // Expose the flow with authentication middleware and context provider app.post( '/simpleFlow', authMiddleware, // Optional: Express middleware for early auth checks expressHandler(simpleFlow, { context: authContext }) ); app.listen(8080, () => { console.log('Express server with auth listening on port 8080'); }); ``` When using authentication policies, you can pass headers with the client library: ```typescript import { runFlow } from 'genkit/beta/client'; // set auth headers (when using auth policies) const result = await runFlow({ url: `http://localhost:8080/simpleFlow`, headers: { Authorization: 'open sesame', }, input: { input: 'say hello' }, }); console.log(result); // { output: "hello" } ``` ### Using `withContextProvider` [Section titled “Using withContextProvider”](#using-withcontextprovider) For more advanced authentication scenarios, you can use `withContextProvider` to wrap your flows with a `ContextProvider`. This allows you to inject custom context, such as authentication details, into your flows. Here’s an example of a custom context provider that checks for a specific header: ```typescript import { ContextProvider, RequestData, UserFacingError } from 'genkit/context'; import { startFlowServer, withContextProvider } from '@genkit-ai/express'; import { genkit, z } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [googleAI()], model: googleAI.model('gemini-2.5-flash'), }); // Define a custom context type interface CustomAuthContext { auth?: { user: string; role: string; }; } // Define a custom context provider const customContextProvider: ContextProvider = async (req: RequestData) => { const customHeader = req.headers['x-custom-auth']; if (customHeader === 'my-secret-token') { return { auth: { user: 'authorized-user', role: 'admin', }, }; } throw new UserFacingError('UNAUTHENTICATED', 'Invalid or missing custom authentication header.'); }; export const protectedFlow = ai.defineFlow( { name: 'protectedFlow', inputSchema: z.object({ input: z.string() }), outputSchema: z.object({ output: z.string() }), }, async ({ input }, { context }) => { // Access context.auth populated by the CustomContextProvider if (!context.auth || context.auth.role !== 'admin') { throw new Error('Unauthorized access: Admin role required.'); } return { output: `Hello, ${context.auth.user}! Your role is ${context.auth.role}. 
You said: ${input}` }; } ); // Secure the flow with the custom context provider startFlowServer({ flows: [withContextProvider(protectedFlow, customContextProvider)], }); ``` To call this secured flow from the client, include the custom header: ```typescript import { runFlow } from 'genkit/beta/client'; const result = await runFlow({ url: `http://localhost:8080/protectedFlow`, headers: { 'X-Custom-Auth': 'my-secret-token', // Replace with your actual custom token }, input: { input: 'sensitive data' }, }); console.log(result); ``` #### `apiKey` Context Provider [Section titled “apiKey Context Provider”](#apikey-context-provider) The `apiKey` context provider is a built-in `ContextProvider` that allows you to perform API key-based access checks. It can be used with `withContextProvider` to secure your flows. To use it, you provide the expected API key. The `apiKey` provider will then check the `Authorization` header of incoming requests against the provided key. If it matches, it populates `context.auth` with `{ apiKey: 'api-key' }`. Caution **Warning:** This type of API key authentication should only be used with trusted clients (e.g., server-to-server communication or internal applications). Since the API key is sent directly in the `Authorization` header, it can be easily intercepted if used in untrusted environments like client-side web applications. For public-facing applications, consider more robust authentication mechanisms like OAuth 2.0 or Firebase Authentication. ```typescript import { apiKey } from 'genkit/context'; import { startFlowServer, withContextProvider } from '@genkit-ai/express'; import { genkit, z } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [googleAI()], model: googleAI.model('gemini-2.5-flash'), }); export const securedFlow = ai.defineFlow( { name: 'securedFlow', inputSchema: z.object({ sensitiveData: z.string() }), outputSchema: z.object({ output: z.string() }), }, async ({ sensitiveData }, { context }) => { return { output: 'this is protected by API Key check' }; } ); // Secure the flow with an API key from environment variables startFlowServer({ flows: [withContextProvider(securedFlow, apiKey(process.env.MY_API_KEY))], }); ``` To call this secured flow from the client, include the API key in the `Authorization` header: ```typescript import { runFlow } from 'genkit/beta/client'; const result = await runFlow({ url: `http://localhost:8080/securedFlow`, headers: { Authorization: `${process.env.MY_API_KEY}`, // Replace with your actual API key }, input: { sensitiveData: 'sensitive data' }, }); console.log(result); ``` ## Using `startFlowServer` [Section titled “Using startFlowServer”](#using-startflowserver) You can also use `startFlowServer` to quickly expose multiple flows and actions: ```typescript import { startFlowServer } from '@genkit-ai/express'; import { genkit, z } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [googleAI()], model: googleAI.model('gemini-2.5-flash'), }); export const menuSuggestionFlow = ai.defineFlow( { name: 'menuSuggestionFlow', inputSchema: z.object({ theme: z.string() }), outputSchema: z.object({ menu: z.string() }), }, async ({ theme }) => { // ... 
your flow logic here return { menu: `Suggested menu for ${theme}` }; } ); startFlowServer({ flows: [menuSuggestionFlow], }); ``` You can also configure the server with options like `port` and `cors`: ```typescript startFlowServer({ flows: [menuSuggestionFlow], port: 4567, cors: { origin: '*', }, }); ``` `startFlowServer` options: ```ts export interface FlowServerOptions { /** List of flows to expose via the flow server. */ flows: (Flow | FlowWithContextProvider)[]; /** Port to run the server on. Defaults to env.PORT or 3400. */ port?: number; /** CORS options for the server. */ cors?: CorsOptions; /** HTTP method path prefix for the exposed flows. */ pathPrefix?: string; /** JSON body parser options. */ jsonParserOptions?: bodyParser.OptionsJson; } ``` # Firebase plugin > This document describes the Firebase plugin for Genkit, providing integrations with Firebase services like Firestore for vector search and telemetry export to Google Cloud Operations. The Firebase plugin provides integrations with Firebase services, so you can build intelligent and scalable AI applications. Key features include: * **Firestore Vector Store**: Use Firestore for indexing and retrieval with vector embeddings. * **Telemetry**: Export telemetry to [Google’s Cloud operations suite](https://cloud.google.com/products/operations) that powers the Genkit Monitoring console. ## Installation [Section titled “Installation”](#installation) Install the Firebase plugin with npm: ```bash npm install @genkit-ai/firebase ``` ## Prerequisites [Section titled “Prerequisites”](#prerequisites) ### Firebase Project Setup [Section titled “Firebase Project Setup”](#firebase-project-setup) 1. All Firebase products require a Firebase project. You can create a new project or enable Firebase in an existing Google Cloud project using the [Firebase console](https://console.firebase.google.com/). 2. If deploying flows with Cloud functions, [upgrade your Firebase project](https://console.firebase.google.com/project/_/overview?purchaseBillingPlan=metered) to the Blaze plan. 3. If you want to run code locally that exports telemetry, you need the [Google Cloud CLI](https://cloud.google.com/sdk/docs/install) tool installed. ### Firebase Admin SDK Initialization [Section titled “Firebase Admin SDK Initialization”](#firebase-admin-sdk-initialization) You must initialize the Firebase Admin SDK in your application. This is not handled automatically by the plugin. ```js import { initializeApp } from 'firebase-admin/app'; initializeApp({ projectId: 'your-project-id', }); ``` The plugin requires you to specify your Firebase project ID. You can specify your Firebase project ID in either of the following ways: * Set `projectId` in the `initializeApp()` configuration object as shown in the snippet above. * Set the `GCLOUD_PROJECT` environment variable. If you’re running your flow from a Google Cloud environment (Cloud Functions, Cloud Run, and so on), `GCLOUD_PROJECT` is automatically set to the project ID of the environment. If you set `GCLOUD_PROJECT`, you can omit the configuration parameter in `initializeApp()`. ### Credentials [Section titled “Credentials”](#credentials) To provide Firebase credentials, you also need to set up Google Cloud Application Default Credentials. To specify your credentials: * If you’re running your flow from a Google Cloud environment (Cloud Functions, Cloud Run, and so on), this is set automatically. * For other environments: 1. Generate service account credentials for your Firebase project and download the JSON key file. 
You can do so on the [Service account](https://console.firebase.google.com/project/_/settings/serviceaccounts/adminsdk) page of the Firebase console. 2. Set the environment variable `GOOGLE_APPLICATION_CREDENTIALS` to the file path of the JSON file that contains your service account key, or you can set the environment variable `GCLOUD_SERVICE_ACCOUNT_CREDS` to the content of the JSON file. ## Features and usage [Section titled “Features and usage”](#features-and-usage) ### Telemetry [Section titled “Telemetry”](#telemetry) The Firebase plugin provides a telemetry implementation for sending metrics, traces, and logs to Genkit Monitoring. To get started, visit the [Getting started guide](/docs/observability/getting-started) for installation and configuration instructions. See the [Authentication and authorization guide](/docs/observability/authentication) to authenticate with Google Cloud. See the [Advanced configuration guide](/docs/observability/advanced-configuration) for configuration options. See the [Telemetry collection](/docs/observability/telemetry-collection) guide for details on which Genkit metrics, traces, and logs are collected. ### Cloud Firestore vector search [Section titled “Cloud Firestore vector search”](#cloud-firestore-vector-search) You can use Cloud Firestore as a vector store for RAG indexing and retrieval. This section contains information specific to the `firebase` plugin and Cloud Firestore’s vector search feature. See the [Retrieval-augmented generation](/docs/rag) page for a more detailed discussion on implementing RAG using Genkit. #### Using `GCLOUD_SERVICE_ACCOUNT_CREDS` and Firestore [Section titled “Using GCLOUD\_SERVICE\_ACCOUNT\_CREDS and Firestore”](#using-gcloud_service_account_creds-and-firestore) If you are using service account credentials by passing credentials directly via `GCLOUD_SERVICE_ACCOUNT_CREDS` and are also using Firestore as a vector store, you need to pass credentials directly to the Firestore instance during initialization; otherwise, depending on plugin initialization order, the singleton may be initialized with Application Default Credentials. ```js import { initializeApp } from 'firebase-admin/app'; import { getFirestore } from 'firebase-admin/firestore'; const app = initializeApp(); let firestore = getFirestore(app); if (process.env.GCLOUD_SERVICE_ACCOUNT_CREDS) { const serviceAccountCreds = JSON.parse(process.env.GCLOUD_SERVICE_ACCOUNT_CREDS); const authOptions = { credentials: serviceAccountCreds }; firestore.settings(authOptions); } ``` #### Define a Firestore retriever [Section titled “Define a Firestore retriever”](#define-a-firestore-retriever) Use `defineFirestoreRetriever()` to create a retriever for Firestore vector-based queries. 
```js import { defineFirestoreRetriever } from '@genkit-ai/firebase'; import { initializeApp } from 'firebase-admin/app'; import { getFirestore } from 'firebase-admin/firestore'; const app = initializeApp(); const firestore = getFirestore(app); const retriever = defineFirestoreRetriever(ai, { name: 'exampleRetriever', firestore, collection: 'documents', contentField: 'text', // Field containing document content vectorField: 'embedding', // Field containing vector embeddings embedder: yourEmbedderInstance, // Embedder to generate embeddings distanceMeasure: 'COSINE', // Default is 'COSINE'; other options: 'EUCLIDEAN', 'DOT_PRODUCT' }); ``` #### Retrieve documents [Section titled “Retrieve documents”](#retrieve-documents) To retrieve documents using the defined retriever, pass the retriever instance and query options to `ai.retrieve`. ```js const docs = await ai.retrieve({ retriever, query: 'search query', options: { limit: 5, // Optional: Return up to 5 documents where: { category: 'example' }, // Optional: Filter by field-value pairs collection: 'alternativeCollection', // Optional: Override default collection }, }); ``` #### Available Retrieval Options [Section titled “Available Retrieval Options”](#available-retrieval-options) The following options can be passed to the `options` field in `ai.retrieve`: * **`limit`**: *(number)* Specify the maximum number of documents to retrieve. Default is `10`. * **`where`**: *(Record<string, any>)* Add additional filters based on Firestore fields. Example: ```js where: { category: 'news', status: 'published' } ``` * **`collection`**: *(string)* Override the default collection specified in the retriever configuration. This is useful for querying subcollections or dynamically switching between collections. #### Populate Firestore with Embeddings [Section titled “Populate Firestore with Embeddings”](#populate-firestore-with-embeddings) To populate your Firestore collection, use an embedding generator along with the Admin SDK. For example, the menu ingestion script from the [Retrieval-augmented generation](/docs/rag) page could be adapted for Firestore in the following way: ```js import { genkit } from 'genkit'; import { vertexAI } from "@genkit-ai/vertexai"; import { applicationDefault, initializeApp } from "firebase-admin/app"; import { FieldValue, getFirestore } from "firebase-admin/firestore"; import { chunk } from "llm-chunk"; import pdf from "pdf-parse"; import { readFile } from "fs/promises"; import path from "path"; // Change these values to match your Firestore config/schema const indexConfig = { collection: "menuInfo", contentField: "text", vectorField: "embedding", embedder: vertexAI.embedder('gemini-embedding-001'), }; const ai = genkit({ plugins: [vertexAI({ location: "us-central1" })], }); const app = initializeApp({ credential: applicationDefault() }); const firestore = getFirestore(app); export async function indexMenu(filePath: string) { filePath = path.resolve(filePath); // Read the PDF. const pdfTxt = await extractTextFromPdf(filePath); // Divide the PDF text into segments. const chunks = await chunk(pdfTxt); // Add chunks to the index. 
await indexToFirestore(chunks); } async function indexToFirestore(data: string[]) { for (const text of data) { const embedding = (await ai.embed({ embedder: indexConfig.embedder, content: text, }))[0].embedding; await firestore.collection(indexConfig.collection).add({ [indexConfig.vectorField]: FieldValue.vector(embedding), [indexConfig.contentField]: text, }); } } async function extractTextFromPdf(filePath: string) { const pdfFile = path.resolve(filePath); const dataBuffer = await readFile(pdfFile); const data = await pdf(dataBuffer); return data.text; } ``` Firestore depends on indexes to provide fast and efficient querying on collections. (Note that “index” here refers to database indexes, and not Genkit’s indexer and retriever abstractions.) The prior example requires the `embedding` field to be indexed to work. To create the index: * Run the `gcloud` command described in the [Create a single-field vector index](https://firebase.google.com/docs/firestore/vector-search?authuser=0#create_and_manage_vector_indexes) section of the Firestore docs. The command looks like the following: ```bash gcloud alpha firestore indexes composite create --project=your-project-id \ --collection-group=yourCollectionName --query-scope=COLLECTION \ --field-config=vector-config='{"dimension":"768","flat": "{}"}',field-path=yourEmbeddingField ``` However, the correct indexing configuration depends on the queries you make and the embedding model you’re using. * Alternatively, call `ai.retrieve()` and Firestore will throw an error with the correct command to create the index. #### Learn more [Section titled “Learn more”](#learn-more) * See the [Retrieval-augmented generation](/docs/rag) page for a general discussion on indexers and retrievers in Genkit. * See [Search with vector embeddings](https://firebase.google.com/docs/firestore/vector-search) in the Cloud Firestore docs for more on the vector search feature. ### Deploy flows as Cloud Functions [Section titled “Deploy flows as Cloud Functions”](#deploy-flows-as-cloud-functions) To deploy a flow with Cloud Functions, use the Firebase Functions library’s built-in support for genkit. The `onCallGenkit` method lets you to create a [callable function](https://firebase.google.com/docs/functions/callable?gen=2nd) from a flow. It automatically supports streaming and JSON requests. You can use the [Cloud Functions client SDKs](https://firebase.google.com/docs/functions/callable?gen=2nd#call_the_function) to call them. ```js import { onCallGenkit } from 'firebase-functions/https'; import { defineSecret } from 'firebase-functions/params'; export const exampleFlow = ai.defineFlow( { name: 'exampleFlow', }, async (prompt) => { // Flow logic goes here. return response; }, ); // WARNING: This has no authentication or app check protections. // See genkit.dev/docs/auth for more information. export const example = onCallGenkit({ secrets: [apiKey] }, exampleFlow); ``` Deploy your flow using the Firebase CLI: ```bash firebase deploy --only functions ``` # Google Generative AI plugin > This document describes the Google Generative AI plugin for Genkit, providing interfaces to Google's Gemini models, including text-to-speech, video generation, and context caching. The Google Generative AI plugin provides interfaces to Google’s Gemini models through the [Gemini API](https://ai.google.dev/docs/gemini_api_overview). 
## Installation [Section titled “Installation”](#installation) ```bash npm install @genkit-ai/googleai ``` ## Configuration [Section titled “Configuration”](#configuration) To use this plugin, specify it when you initialize Genkit: ```ts import { genkit } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [googleAI()], }); ``` The plugin requires an API key for the Gemini API, which you can get from [Google AI Studio](https://aistudio.google.com/app/apikey). Configure the plugin to use your API key by doing one of the following: * Set the `GEMINI_API_KEY` environment variable to your API key. * Specify the API key when you initialize the plugin: ```ts googleAI({ apiKey: yourKey }); ``` However, don’t embed your API key directly in code! Use this feature only in conjunction with a service like Cloud Secret Manager or similar. ## Usage [Section titled “Usage”](#usage) The recommended way to reference models is through the helper function provided by the plugin: ```ts import { googleAI } from '@genkit-ai/googleai'; // Referencing models const model = googleAI.model('gemini-2.5-flash'); const modelPro = googleAI.model('gemini-2.5-flash-lite'); // Referencing embedders const embedder = googleAI.embedder('gemini-embedding-001'); ``` You can use these references to specify which model `generate()` uses: ```ts const ai = genkit({ plugins: [googleAI()], model: googleAI.model('gemini-2.5-flash'), // Set default model }); const llmResponse = await ai.generate('Tell me a joke.'); ``` or use embedders (ex. `gemini-embedding-001`) with `embed` or retrievers: ```ts const ai = genkit({ plugins: [googleAI()], }); const embeddings = await ai.embed({ embedder: googleAI.embedder('gemini-embedding-001'), content: input, }); ``` ## Gemini Files API [Section titled “Gemini Files API”](#gemini-files-api) You can use files uploaded to the Gemini Files API with Genkit: ```ts import { GoogleAIFileManager } from '@google/generative-ai/server'; import { genkit } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [googleAI()], }); const fileManager = new GoogleAIFileManager(process.env.GEMINI_API_KEY); const uploadResult = await fileManager.uploadFile('path/to/file.jpg', { mimeType: 'image/jpeg', displayName: 'Your Image', }); const response = await ai.generate({ model: googleAI.model('gemini-2.5-flash'), prompt: [ { text: 'Describe this image:' }, { media: { contentType: uploadResult.file.mimeType, url: uploadResult.file.uri, }, }, ], }); ``` ## Fine-tuned models [Section titled “Fine-tuned models”](#fine-tuned-models) You can use models fine-tuned with the Google Gemini API. Follow the instructions from the [Gemini API](https://ai.google.dev/gemini-api/docs/model-tuning/tutorial?lang=python) or fine-tune a model using [AI Studio](https://aistudio.corp.google.com/app/tune). The tuning process uses a base model—for example, Gemini 2.0 Flash—and your provided examples to create a new tuned model. Remember the base model you used, and copy the new model’s ID. When calling the tuned model in Genkit, use the base model as the `model` parameter, and pass the tuned model’s ID as part of the `config` block. 
For example, if you used Gemini 2.0 Flash as the base model, and got the model ID `tunedModels/my-example-model-apbm8oqbvuv2`, you can call it with: ```ts const ai = genkit({ plugins: [googleAI()], }); const llmResponse = await ai.generate({ prompt: `Suggest an item for the menu of a fish-themed restaurant`, model: googleAI.model('tunedModels/my-example-model-apbm8oqbvuv2'), }); ``` ## Text-to-Speech (TTS) Models [Section titled “Text-to-Speech (TTS) Models”](#text-to-speech-tts-models) The Google Generative AI plugin provides access to text-to-speech capabilities through the Gemini TTS models. These models can convert text into natural-sounding speech for various applications such as voice assistants, accessibility features, or interactive content. ### Basic Usage [Section titled “Basic Usage”](#basic-usage) To generate audio using the Gemini TTS model: ```ts import { genkit } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; import { writeFile } from 'node:fs/promises'; import wav from 'wav'; // npm install wav && npm install -D @types/wav const ai = genkit({ plugins: [googleAI()], }); const { media } = await ai.generate({ model: googleAI.model('gemini-2.5-flash-preview-tts'), config: { responseModalities: ['AUDIO'], speechConfig: { voiceConfig: { prebuiltVoiceConfig: { voiceName: 'Algenib' }, }, }, }, prompt: 'Say that Genkit is an amazing Gen AI library', }); if (!media) { throw new Error('no media returned'); } const audioBuffer = Buffer.from( media.url.substring(media.url.indexOf(',') + 1), 'base64' ); await writeFile('output.wav', await toWav(audioBuffer)); async function toWav( pcmData: Buffer, channels = 1, rate = 24000, sampleWidth = 2 ): Promise<string> { return new Promise((resolve, reject) => { // This code depends on the `wav` npm library. const writer = new wav.Writer({ channels, sampleRate: rate, bitDepth: sampleWidth * 8, }); let bufs = [] as any[]; writer.on('error', reject); writer.on('data', function (d) { bufs.push(d); }); writer.on('end', function () { resolve(Buffer.concat(bufs).toString('base64')); }); writer.write(pcmData); writer.end(); }); } ``` ### Multi-speaker Audio Generation [Section titled “Multi-speaker Audio Generation”](#multi-speaker-audio-generation) You can generate audio with multiple speakers, each with their own voice: ```ts const response = await ai.generate({ model: googleAI.model('gemini-2.5-flash-preview-tts'), config: { responseModalities: ['AUDIO'], speechConfig: { multiSpeakerVoiceConfig: { speakerVoiceConfigs: [ { speaker: 'Speaker1', voiceConfig: { prebuiltVoiceConfig: { voiceName: 'Algenib' }, }, }, { speaker: 'Speaker2', voiceConfig: { prebuiltVoiceConfig: { voiceName: 'Achernar' }, }, }, ], }, }, }, prompt: `Here's the dialog: Speaker1: "Genkit is an amazing Gen AI library!" Speaker2: "I thought it was a framework."`, }); ``` When using multi-speaker configuration, the model automatically detects speaker labels in the text (like “Speaker1:” and “Speaker2:”) and applies the corresponding voice to each speaker’s lines. ### Configuration Options [Section titled “Configuration Options”](#configuration-options) The Gemini TTS models support various configuration options: #### Voice Selection [Section titled “Voice Selection”](#voice-selection) You can choose from different pre-built voices with unique characteristics: ```ts speechConfig: { voiceConfig: { prebuiltVoiceConfig: { voiceName: 'Algenib' // Other options: 'Achernar', 'Ankaa', etc.
}, }, } ``` #### Speech Emphasis [Section titled “Speech Emphasis”](#speech-emphasis) You can use markdown-style formatting in your prompt to add emphasis: * Bold text (`**like this**`) for stronger emphasis * Italic text (`*like this*`) for moderate emphasis Example: ```ts prompt: 'Genkit is an **amazing** Gen AI *library*!' ``` #### Advanced Speech Parameters [Section titled “Advanced Speech Parameters”](#advanced-speech-parameters) For more control over the generated speech: ```ts speechConfig: { voiceConfig: { prebuiltVoiceConfig: { voiceName: 'Algenib', speakingRate: 1.0, // Range: 0.25 to 4.0, default is 1.0 pitch: 0.0, // Range: -20.0 to 20.0, default is 0.0 volumeGainDb: 0.0, // Range: -96.0 to 16.0, default is 0.0 }, }, } ``` * `speakingRate`: Controls the speed of speech (higher values = faster speech) * `pitch`: Adjusts the pitch of the voice (higher values = higher pitch) * `volumeGainDb`: Controls the volume (higher values = louder) For more detailed information about the Gemini TTS models and their configuration options, see the [Google AI Speech Generation documentation](https://ai.google.dev/gemini-api/docs/speech-generation). ## Video Generation (Veo) Models [Section titled “Video Generation (Veo) Models”](#video-generation-veo-models) The Google Generative AI plugin provides access to video generation capabilities through the Veo models. These models can generate videos from text prompts or manipulate existing images to create dynamic video content. ### Basic Usage: Text-to-Video Generation [Section titled “Basic Usage: Text-to-Video Generation”](#basic-usage-text-to-video-generation) To generate a video from a text prompt using the Veo model: ```ts import { googleAI } from '@genkit-ai/googleai'; import * as fs from 'fs'; import { Readable } from 'stream'; import { MediaPart } from 'genkit'; import { genkit } from 'genkit'; const ai = genkit({ plugins: [googleAI()], }); ai.defineFlow('text-to-video-veo', async () => { let { operation } = await ai.generate({ model: googleAI.model('veo-2.0-generate-001'), prompt: 'A majestic dragon soaring over a mystical forest at dawn.', config: { durationSeconds: 5, aspectRatio: '16:9', }, }); if (!operation) { throw new Error('Expected the model to return an operation'); } // Wait until the operation completes. while (!operation.done) { operation = await ai.checkOperation(operation); // Sleep for 5 seconds before checking again. await new Promise((resolve) => setTimeout(resolve, 5000)); } if (operation.error) { throw new Error('failed to generate video: ' + operation.error.message); } const video = operation.output?.message?.content.find((p) => !!p.media); if (!video) { throw new Error('Failed to find the generated video'); } await downloadVideo(video, 'output.mp4'); }); async function downloadVideo(video: MediaPart, path: string) { const fetch = (await import('node-fetch')).default; // Add API key before fetching the video. const videoDownloadResponse = await fetch( `${video.media!.url}&key=${process.env.GEMINI_API_KEY}` ); if ( !videoDownloadResponse || videoDownloadResponse.status !== 200 || !videoDownloadResponse.body ) { throw new Error('Failed to fetch video'); } Readable.from(videoDownloadResponse.body).pipe(fs.createWriteStream(path)); } ``` Veo 3 uses the exact same API, just make sure you only use supported config options (see below). 
To use the Veo 3 model, reference `veo-3.0-generate-preview`: ```ts let { operation } = await ai.generate({ model: googleAI.model('veo-3.0-generate-preview'), prompt: 'A cinematic shot of an old car driving down a deserted road at sunset.', }); ``` ### Video Generation from Photo Reference [Section titled “Video Generation from Photo Reference”](#video-generation-from-photo-reference) To use a photo as a reference for the video with the Veo model (for example, to make a static photo move), provide an image as part of the prompt. ```ts const startingImage = fs.readFileSync('photo.jpg', { encoding: 'base64' }); let { operation } = await ai.generate({ model: googleAI.model('veo-2.0-generate-001'), prompt: [ { text: 'make the subject in the photo move', }, { media: { contentType: 'image/jpeg', url: `data:image/jpeg;base64,${startingImage}`, }, }, ], config: { durationSeconds: 5, aspectRatio: '9:16', personGeneration: 'allow_adult', }, }); ``` ### Configuration Options [Section titled “Configuration Options”](#configuration-options-1) The Veo models support various configuration options. #### Veo Model Parameters [Section titled “Veo Model Parameters”](#veo-model-parameters) The most commonly used options are listed below; see the Veo model documentation for the full list. * `negativePrompt`: Text string that describes anything you want to discourage the model from generating * `aspectRatio`: Changes the aspect ratio of the generated video. * `"16:9"`: Supported in Veo 3 and Veo 2. * `"9:16"`: Supported in Veo 2 only (defaults to “16:9”). * `personGeneration`: Allow the model to generate videos of people. The following values are supported: * **Text-to-video generation**: * `"allow_all"`: Generate videos that include adults and children. Currently the only available `personGeneration` value for Veo 3. * `"dont_allow"`: Veo 2 only. Don’t allow the inclusion of people or faces. * `"allow_adult"`: Veo 2 only. Generate videos that include adults, but not children. * **Image-to-video generation**: Veo 2 only * `"dont_allow"`: Don’t allow the inclusion of people or faces. * `"allow_adult"`: Generate videos that include adults, but not children. * `numberOfVideos`: Number of output videos requested * `1`: Supported in Veo 3 and Veo 2 * `2`: Supported in Veo 2 only. * `durationSeconds`: Veo 2 only. Length of each output video in seconds, between 5 and 8. Not configurable for Veo 3; the default setting is 8 seconds. * `enhancePrompt`: Veo 2 only. Enable or disable the prompt rewriter. Enabled by default. Not configurable for Veo 3; the prompt enhancer is always on. ## Context Caching [Section titled “Context Caching”](#context-caching) The Google Generative AI plugin supports **context caching**, which allows models to reuse previously cached content to optimize performance and reduce latency for repetitive tasks. This feature is especially useful for conversational flows or scenarios where the model references a large body of text consistently across multiple requests. ### How to Use Context Caching [Section titled “How to Use Context Caching”](#how-to-use-context-caching) To enable context caching, ensure your model supports it. For example, `gemini-2.5-flash` and `gemini-1.5-pro` are models that support context caching. You can define a caching mechanism in your application like this: ```ts const ai = genkit({ plugins: [googleAI()], }); const llmResponse = await ai.generate({ messages: [ { role: 'user', content: [{ text: 'Here is the relevant text from War and Peace.'
}], }, { role: 'model', content: [ { text: `Based on War and Peace, here is some analysis of Pierre Bezukhov's character.`, }, ], metadata: { cache: { ttlSeconds: 300, // Cache this message for 5 minutes }, }, }, ], model: googleAI.model('gemini-2.5-flash-001'), prompt: `Describe Pierre's transformation throughout the novel`, }); ``` In this setup: * **`messages`**: Allows you to pass conversation history. * **`metadata.cache.ttlSeconds`**: Specifies the time-to-live (TTL) for caching a specific response. ### Example: Leveraging Large Texts with Context [Section titled “Example: Leveraging Large Texts with Context”](#example-leveraging-large-texts-with-context) For applications referencing long documents, such as *War and Peace* or *Lord of the Rings*, you can structure your queries to reuse cached contexts: ```ts const fs = require('fs/promises'); const textContent = await fs.readFile('path/to/war_and_peace.txt', 'utf-8'); const llmResponse = await ai.generate({ messages: [ { role: 'user', content: [{ text: textContent }], // Include the large text as context }, { role: 'model', content: [ { text: 'This analysis is based on the provided text from War and Peace.', }, ], metadata: { cache: { ttlSeconds: 300, // Cache the response to avoid reloading the full text }, }, }, ], model: googleAI.model('gemini-2.5-flash-001'), prompt: 'Analyze the relationship between Pierre and Natasha.', }); ``` ### Caching other modes of content [Section titled “Caching other modes of content”](#caching-other-modes-of-content) The Gemini models are multi-modal, and other modes of content are allowed to be cached as well. For example, to cache a long piece of video content, you must first upload using the file manager from the Google AI SDK: ```ts import { GoogleAIFileManager } from '@google/generative-ai/server'; ``` ```ts const fileManager = new GoogleAIFileManager(process.env.GEMINI_API_KEY); // Upload video to Google AI using the Gemini Files API const uploadResult = await fileManager.uploadFile(videoFilePath, { mimeType: 'video/mp4', // Adjust according to the video format displayName: 'Uploaded Video for Analysis', }); const fileUri = uploadResult.file.uri; ``` Now you may configure the cache in your calls to `ai.generate`: ```ts const analyzeVideoResponse = await ai.generate({ messages: [ { role: 'user', content: [ { media: { url: fileUri, // Use the uploaded file URL contentType: 'video/mp4', }, }, ], }, { role: 'model', content: [ { text: 'This video seems to contain several key moments. I will analyze it now and prepare to answer your questions.', }, ], // Everything up to (including) this message will be cached. metadata: { cache: true, }, }, ], model: googleAI.model('gemini-2.5-flash-001'), prompt: query, }); ``` ### Supported Models for Context Caching [Section titled “Supported Models for Context Caching”](#supported-models-for-context-caching) Only specific models, such as `gemini-2.5-flash` and `gemini-1.5-pro`, support context caching. If an unsupported model is used, an error will be raised, indicating that caching cannot be applied. ### Further Reading [Section titled “Further Reading”](#further-reading) See more information regarding context caching on Google AI in their [documentation](https://ai.google.dev/gemini-api/docs/caching?lang=node). # LanceDB plugin > This document describes the LanceDB plugin for Genkit, which provides indexer and retriever implementations for LanceDB, an open-source vector database for AI applications. 
The LanceDB plugin provides indexer and retriever implementations that use [LanceDB](https://lancedb.com/), an open-source vector database for AI applications. ## Installation [Section titled “Installation”](#installation) ```bash npm install genkitx-lancedb ``` ## Configuration [Section titled “Configuration”](#configuration) To use this plugin, specify it when you initialize Genkit: ```ts import { genkit } from 'genkit'; import { lancedb } from 'genkitx-lancedb'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [ // Google AI provides the gemini-embedding-001 embedder googleAI(), // LanceDB requires an embedder to translate from text to vector lancedb([ { dbUri: '.db', // optional lancedb uri, default to .db tableName: 'table', // optional table name, default to table embedder: googleAI.embedder('gemini-embedding-001'), }, ]), ], }); ``` You must specify an embedder to use with LanceDB. You can also optionally configure: * `dbUri`: The URI for the LanceDB database (defaults to `.db`) * `tableName`: The name of the table to use (defaults to `table`) ## Usage [Section titled “Usage”](#usage) Import retriever and indexer references like so: ```ts import { lancedbRetrieverRef, lancedbIndexerRef, WriteMode } from 'genkitx-lancedb'; ``` ### Retrieval [Section titled “Retrieval”](#retrieval) Use the retriever reference with `ai.retrieve()`: ```ts // To use the default configuration: let docs = await ai.retrieve({ retriever: lancedbRetrieverRef, query }); // To specify custom options: export const menuRetriever = lancedbRetrieverRef({ tableName: "table", // Use the same table name as the indexer displayName: "Menu", // Use a custom display name }); docs = await ai.retrieve({ retriever: menuRetriever, query, options: { k: 3, // Limit to 3 results }, }); ``` ### Indexing [Section titled “Indexing”](#indexing) Use the indexer reference with `ai.index()`: ```ts // To use the default configuration: await ai.index({ indexer: lancedbIndexerRef, documents }); // To specify custom options: export const menuPdfIndexer = lancedbIndexerRef({ // Using all defaults for dbUri, tableName, and embedder }); await ai.index({ indexer: menuPdfIndexer, documents, options: { writeMode: WriteMode.Overwrite, } }); ``` ## Example: Creating a RAG Flow [Section titled “Example: Creating a RAG Flow”](#example-creating-a-rag-flow) Here’s a complete example of creating a RAG (Retrieval-Augmented Generation) flow with LanceDB: ```ts import { lancedbIndexerRef, lancedb, lancedbRetrieverRef, WriteMode } from 'genkitx-lancedb'; import { googleAI } from '@genkit-ai/googleai'; import { z, genkit } from 'genkit'; import { Document } from 'genkit/retriever'; import { chunk } from 'llm-chunk'; import { readFile } from 'fs/promises'; import path from 'path'; import pdf from 'pdf-parse/lib/pdf-parse'; const ai = genkit({ plugins: [ googleAI(), lancedb([ { dbUri: '.db', tableName: 'table', embedder: googleAI.embedder('gemini-embedding-001'), }, ]), ], }); // Define indexer export const menuPdfIndexer = lancedbIndexerRef({ // Using all defaults }); const chunkingConfig = { minLength: 1000, maxLength: 2000, splitter: 'sentence', overlap: 100, delimiters: '', }; async function extractTextFromPdf(filePath: string) { const pdfFile = path.resolve(filePath); const dataBuffer = await readFile(pdfFile); const data = await pdf(dataBuffer); return data.text; } // Define indexing flow export const indexMenu = ai.defineFlow( { name: 'indexMenu', inputSchema: z.object({ filePath: z.string().describe('PDF file path') }), 
outputSchema: z.object({ success: z.boolean(), documentsIndexed: z.number(), error: z.string().optional(), }), }, async ({ filePath }) => { try { filePath = path.resolve(filePath); // Read the pdf const pdfTxt = await ai.run('extract-text', () => extractTextFromPdf(filePath) ); // Divide the pdf text into segments const chunks = await ai.run('chunk-it', async () => chunk(pdfTxt, chunkingConfig) ); // Convert chunks of text into documents to store in the index const documents = chunks.map((text) => { return Document.fromText(text, { filePath }); }); // Add documents to the index await ai.index({ indexer: menuPdfIndexer, documents, options: { writeMode: WriteMode.Overwrite, } }); return { success: true, documentsIndexed: documents.length, }; } catch (err) { // For unexpected errors that throw exceptions return { success: false, documentsIndexed: 0, error: err instanceof Error ? err.message : String(err) }; } } ); // Define retriever export const menuRetriever = lancedbRetrieverRef({ tableName: "table", // Use the same table name as the indexer displayName: "Menu", // Use a custom display name }); // Define retrieval flow export const menuQAFlow = ai.defineFlow( { name: "Menu", inputSchema: z.object({ query: z.string() }), outputSchema: z.object({ answer: z.string() }) }, async ({ query }) => { // Retrieve relevant documents const docs = await ai.retrieve({ retriever: menuRetriever, query, options: { k: 3, }, }); // Generate response using retrieved documents const { text } = await ai.generate({ model: googleAI.model('gemini-2.5-flash'), prompt: ` You are acting as a helpful AI assistant that can answer questions about the food available on the menu at Genkit Grub Pub. Use only the context provided to answer the question. If you don't know, do not make up an answer. Do not add or change items on the menu. Question: ${query} `, docs, }); return { answer: text }; } ); ``` See the [Retrieval-augmented generation](/docs/rag) page for a general discussion on indexers and retrievers. ## Learn More [Section titled “Learn More”](#learn-more) For more information, feedback, or to report issues, visit the [LanceDB plugin GitHub repository](https://github.com/lancedb/genkitx-lancedb). # Model Context Protocol (MCP) plugin > The Genkit MCP plugin provides integration between Genkit and the Model Context Protocol (MCP). The Genkit MCP plugin provides integration between Genkit and the [Model Context Protocol](https://modelcontextprotocol.io) (MCP). MCP is an open standard allowing developers to build “servers” which provide tools, resources, and prompts to clients. Genkit MCP allows Genkit developers to: * Consume MCP tools, prompts, and resources as a client using `createMcpHost` or `createMcpClient`. * Provide Genkit tools and prompts as an MCP server using `createMcpServer`. ## Installation [Section titled “Installation”](#installation) To get started, you’ll need Genkit and the MCP plugin: ```bash npm i genkit @genkit-ai/mcp ``` ## MCP Host [Section titled “MCP Host”](#mcp-host) To connect to one or more MCP servers, you use the `createMcpHost` function. This function returns a `GenkitMcpHost` instance that manages connections to the configured MCP servers. ```ts import { googleAI } from '@genkit-ai/googleai'; import { createMcpHost } from '@genkit-ai/mcp'; import { genkit } from 'genkit'; const mcpHost = createMcpHost({ name: 'myMcpClients', // A name for the host plugin itself mcpServers: { // Each key (e.g., 'fs', 'git') becomes a namespace for the server's tools. 
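      // For example, tools from the filesystem server below are exposed to the model
      // under namespaced names like 'fs/read_file', and the memory server's as 'memory/...'.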
fs: { command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', process.cwd()], }, memory: { command: 'npx', args: ['-y', '@modelcontextprotocol/server-memory'], }, }, }); const ai = genkit({ plugins: [googleAI()], }); (async () => { // Provide MCP tools to the model of your choice. const { text } = await ai.generate({ model: googleAI.model('gemini-2.5-flash'), prompt: `Analyze all files in ${process.cwd()}.`, tools: await mcpHost.getActiveTools(ai), resources: await mcpHost.getActiveResources(ai), }); console.log(text); await mcpHost.close(); })(); ``` The `createMcpHost` function initializes a `GenkitMcpHost` instance, which handles the lifecycle and communication with the defined MCP servers. ### `createMcpHost()` Options [Section titled “createMcpHost() Options”](#createmcphost-options) ```ts export interface McpHostOptions { /** * An optional client name for this MCP host. This name is advertised to MCP Servers * as the connecting client name. Defaults to 'genkit-mcp'. */ name?: string; /** * An optional version for this MCP host. Primarily for * logging and identification within Genkit. * Defaults to '1.0.0'. */ version?: string; /** * A record for configuring multiple MCP servers. Each server connection is * controlled by a `GenkitMcpClient` instance managed by `GenkitMcpHost`. * The key in the record is used as the identifier for the MCP server. */ mcpServers?: Record<string, McpServerConfig>; /** * If true, tool responses from the MCP server will be returned in their raw * MCP format. Otherwise (default), they are processed and potentially * simplified for better compatibility with Genkit's typical data structures. */ rawToolResponses?: boolean; /** * When provided, each connected MCP server will be sent the roots specified here. * Overridden by any specific roots sent in the `mcpServers` config for a given server. */ roots?: Root[]; } /** * Configuration for an individual MCP server. The interface should be familiar * and compatible with existing tool configurations e.g. Cursor or Claude * Desktop. * * In addition to stdio servers, remote servers are supported via URL and * custom/arbitrary transports are supported as well. */ export type McpServerConfig = ( | McpStdioServerConfig | McpStreamableHttpConfig | McpTransportServerConfig ) & McpServerControls; export type McpStdioServerConfig = StdioServerParameters; export type McpStreamableHttpConfig = { url: string; } & Omit<StreamableHTTPClientTransportOptions, 'sessionId'>; export type McpTransportServerConfig = { transport: Transport; }; export interface McpServerControls { /** * when true, the server will be stopped and its registered components will * not appear in lists/plugins/etc */ disabled?: boolean; /** MCP roots configuration. See: https://modelcontextprotocol.io/docs/concepts/roots */ roots?: Root[]; } // from '@modelcontextprotocol/sdk/client/stdio.js' export type StdioServerParameters = { /** * The executable to run to start the server. */ command: string; /** * Command line arguments to pass to the executable. */ args?: string[]; /** * The environment to use when spawning the process. * * If not specified, the result of getDefaultEnvironment() will be used. */ env?: Record<string, string>; /** * How to handle stderr of the child process. This matches the semantics of Node's `child_process.spawn`. * * The default is "inherit", meaning messages to stderr will be printed to the parent process's stderr. */ stderr?: IOType | Stream | number; /** * The working directory to use when spawning the process. * * If not specified, the current working directory will be inherited.
*/ cwd?: string; }; // from '@modelcontextprotocol/sdk/client/streamableHttp.js' export type StreamableHTTPClientTransportOptions = { /** * An OAuth client provider to use for authentication. * * When an `authProvider` is specified and the connection is started: * 1. The connection is attempted with any existing access token from the `authProvider`. * 2. If the access token has expired, the `authProvider` is used to refresh the token. * 3. If token refresh fails or no access token exists, and auth is required, `OAuthClientProvider.redirectToAuthorization` is called, and an `UnauthorizedError` will be thrown from `connect`/`start`. * * After the user has finished authorizing via their user agent, and is redirected back to the MCP client application, call `StreamableHTTPClientTransport.finishAuth` with the authorization code before retrying the connection. * * If an `authProvider` is not provided, and auth is required, an `UnauthorizedError` will be thrown. * * `UnauthorizedError` might also be thrown when sending any message over the transport, indicating that the session has expired, and needs to be re-authed and reconnected. */ authProvider?: OAuthClientProvider; /** * Customizes HTTP requests to the server. */ requestInit?: RequestInit; /** * Custom fetch implementation used for all network requests. */ fetch?: FetchLike; /** * Options to configure the reconnection behavior. */ reconnectionOptions?: StreamableHTTPReconnectionOptions; /** * Session ID for the connection. This is used to identify the session on the server. * When not provided and connecting to a server that supports session IDs, the server will generate a new session ID. */ sessionId?: string; }; ``` ## MCP Client (Single Server) [Section titled “MCP Client (Single Server)”](#mcp-client-single-server) For scenarios where you only need to connect to a single MCP server, or prefer to manage client instances individually, you can use `createMcpClient`. ```ts import { googleAI } from '@genkit-ai/googleai'; import { createMcpClient } from '@genkit-ai/mcp'; import { genkit } from 'genkit'; const myFsClient = createMcpClient({ name: 'myFileSystemClient', // A unique name for this client instance mcpServer: { command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', process.cwd()], }, // rawToolResponses: true, // Optional: get raw MCP responses }); // In your Genkit configuration: const ai = genkit({ plugins: [googleAI()], }); (async () => { await myFsClient.ready(); // Retrieve tools from this specific client const fsTools = await myFsClient.getActiveTools(ai); const { text } = await ai.generate({ model: googleAI.model('gemini-2.5-flash'), // Replace with your model prompt: 'List files in ' + process.cwd(), tools: fsTools, }); console.log(text); await myFsClient.disable(); })(); ``` ### `createMcpClient()` Options [Section titled “createMcpClient() Options”](#createmcpclient-options) The `createMcpClient` function takes an `McpClientOptions` object: * **`name`**: (required, string) A unique name for this client instance. This name will be used as the namespace for its tools and prompts. * **`version`**: (optional, string) Version for this client instance. Defaults to “1.0.0”. * Additionally, it supports all options from `McpServerConfig` (e.g., `disabled`, `rawToolResponses`, and transport configurations), as detailed in the `createMcpHost` options section. 
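For instance, a client for a single remote MCP server over streamable HTTP might look like the following sketch (the server URL here is a placeholder; any transport configuration from `McpServerConfig` can be used):

```ts
import { createMcpClient } from '@genkit-ai/mcp';

// Placeholder URL -- point this at your own MCP server.
const myRemoteClient = createMcpClient({
  name: 'myRemoteClient', // Namespace for this client's tools and prompts
  mcpServer: {
    url: 'https://example.com/mcp', // Streamable HTTP transport
  },
});

await myRemoteClient.ready();
```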
### Using MCP Actions (Tools, Prompts) [Section titled “Using MCP Actions (Tools, Prompts)”](#using-mcp-actions-tools-prompts) Both `GenkitMcpHost` (via `getActiveTools()`) and `GenkitMcpClient` (via `getActiveTools()`) discover available tools from their connected and enabled MCP server(s). These tools are standard Genkit `ToolAction` instances and can be provided to Genkit models. MCP prompts can be fetched using `McpHost.getPrompt(serverName, promptName)` or `mcpClient.getPrompt(promptName)`. These return an `ExecutablePrompt`. All MCP actions (tools, prompts, resources) are namespaced. * For `createMcpHost`, the namespace is the key you provide for that server in the `mcpServers` configuration (e.g., `localFs/read_file`). * For `createMcpClient`, the namespace is the `name` you provide in its options (e.g., `myFileSystemClient/list_resources`). ### Tool Responses [Section titled “Tool Responses”](#tool-responses) MCP tools return a `content` array as opposed to a structured response like most Genkit tools. The Genkit MCP plugin attempts to parse and coerce returned content: 1. If the content is text and valid JSON, it is parsed and returned as a JSON object. 2. If the content is text but not valid JSON, the raw text is returned. 3. If the content contains a single non-text part (e.g., an image), that part is returned directly. 4. If the content contains multiple or mixed parts (e.g., text and an image), the full content response array is returned. ## MCP Server [Section titled “MCP Server”](#mcp-server) You can also expose all of the tools and prompts from a Genkit instance as an MCP server using the `createMcpServer` function. ```ts import { googleAI } from '@genkit-ai/googleai'; import { createMcpServer } from '@genkit-ai/mcp'; import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'; import { genkit, z } from 'genkit/beta'; const ai = genkit({ plugins: [googleAI()], }); ai.defineTool( { name: 'add', description: 'add two numbers together', inputSchema: z.object({ a: z.number(), b: z.number() }), outputSchema: z.number(), }, async ({ a, b }) => { return a + b; } ); ai.definePrompt( { name: 'happy', description: 'everybody together now', input: { schema: z.object({ action: z.string().default('clap your hands').optional(), }), }, }, `If you're happy and you know it, {{action}}.` ); ai.defineResource( { name: 'my resouces', uri: 'my://resource', }, async () => { return { content: [ { text: 'my resource', }, ], }; } ); ai.defineResource( { name: 'file', template: 'file://{path}', }, async ({ uri }) => { return { content: [ { text: `file contents for ${uri}`, }, ], }; } ); // Use createMcpServer const server = createMcpServer(ai, { name: 'example_server', version: '0.0.1', }); // Setup (async) then starts with stdio transport by default server.setup().then(async () => { await server.start(); const transport = new StdioServerTransport(); await server!.server?.connect(transport); }); ``` The `createMcpServer` function returns a `GenkitMcpServer` instance. The `start()` method on this instance will start an MCP server (using the stdio transport by default) that exposes all registered Genkit tools and prompts. To start the server with a different MCP transport, you can pass the transport instance to the `start()` method (e.g., `server.start(customMcpTransport)`). ### `createMcpServer()` Options [Section titled “createMcpServer() Options”](#createmcpserver-options) * **`name`**: (required, string) The name you want to give your server for MCP inspection. 
* **`version`**: (optional, string) The version your server will advertise to clients. Defaults to “1.0.0”. ### Known Limitations [Section titled “Known Limitations”](#known-limitations) * MCP prompts are only able to take string parameters, so inputs to schemas must be objects with only string property values. * MCP prompts only support `user` and `model` messages. `system` messages are not supported. * MCP prompts only support a single “type” within a message so you can’t mix media and text in the same message. ### Testing your MCP server [Section titled “Testing your MCP server”](#testing-your-mcp-server) You can test your MCP server using the official inspector. For example, if your server code compiled into `dist/index.js`, you could run: ```plaintext npx @modelcontextprotocol/inspector dist/index.js ``` Once you start the inspector, you can list prompts and actions and test them out manually. # Neo4j plugin > This document describes the Neo4j plugin for Genkit, which provides indexer and retriever implementations that use the Neo4j graph database for vector search capabilities. The Neo4j plugin provides indexer and retriever implementations that use the [Neo4j](https://neo4j.com/) graph database for vector search capabilities. ## Installation [Section titled “Installation”](#installation) ```bash npm install genkitx-neo4j ``` ## Configuration [Section titled “Configuration”](#configuration) To use this plugin, specify it when you initialize Genkit: ```ts import { genkit } from 'genkit'; import { neo4j } from 'genkitx-neo4j'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [ neo4j([ { indexId: 'bob-facts', embedder: googleAI.embedder('gemini-embedding-001'), }, ]), ], }); ``` You must specify a Neo4j index ID and the embedding model you want to use. ### Connection Configuration [Section titled “Connection Configuration”](#connection-configuration) There are two ways to configure the Neo4j connection: 1. Using environment variables: ```plaintext NEO4J_URI=bolt://localhost:7687 # Neo4j's binary protocol NEO4J_USERNAME=neo4j NEO4J_PASSWORD=password NEO4J_DATABASE=neo4j # Optional: specify database name ``` 2. Using the `clientParams` option: ```ts neo4j([ { indexId: 'bob-facts', embedder: googleAI.embedder('gemini-embedding-001'), clientParams: { url: 'bolt://localhost:7687', // Neo4j's binary protocol username: 'neo4j', password: 'password', database: 'neo4j', // Optional }, }, ]), ``` > Note: The `bolt://` protocol is Neo4j’s proprietary binary protocol designed for efficient client-server communication. 
### Configuration Options [Section titled “Configuration Options”](#configuration-options) The Neo4j plugin accepts the following configuration options: * `indexId`: (required) The name of the index to use in Neo4j * `embedder`: (required) The embedding model to use * `clientParams`: (optional) Neo4j connection configuration ## Usage [Section titled “Usage”](#usage) Import retriever and indexer references like so: ```ts import { neo4jRetrieverRef } from 'genkitx-neo4j'; import { neo4jIndexerRef } from 'genkitx-neo4j'; ``` ### Retrieval [Section titled “Retrieval”](#retrieval) Use the retriever reference with `ai.retrieve()`: ```ts // To use the index you configured when you loaded the plugin: let docs = await ai.retrieve({ retriever: neo4jRetrieverRef, query, // Optional: limit number of results (max 1000) k: 5 }); // To specify an index: export const bobFactsRetriever = neo4jRetrieverRef({ indexId: 'bob-facts', // Optional: custom display name displayName: 'Bob Facts Database' }); docs = await ai.retrieve({ retriever: bobFactsRetriever, query, k: 10 }); ``` ### Indexing [Section titled “Indexing”](#indexing) Use the indexer reference with `ai.index()`: ```ts // To use the index you configured when you loaded the plugin: await ai.index({ indexer: neo4jIndexerRef, documents }); // To specify an index: export const bobFactsIndexer = neo4jIndexerRef({ indexId: 'bob-facts', // Optional: custom display name displayName: 'Bob Facts Database' }); await ai.index({ indexer: bobFactsIndexer, documents }); ``` See the [Retrieval-augmented generation](/docs/rag) page for a general discussion on indexers and retrievers. ## Learn More [Section titled “Learn More”](#learn-more) For more information, feedback, or to report issues, visit the [Neo4j plugin GitHub repository](https://github.com/neo4j-partners/genkitx-neo4j/blob/main/README.md). # Ollama plugin > This document describes the Ollama plugin for Genkit, which provides interfaces to local LLMs supported by Ollama, including installation, configuration, and usage for models and embedders. The Ollama plugin provides interfaces to any of the local LLMs supported by [Ollama](https://ollama.com/). ## Installation [Section titled “Installation”](#installation) ```bash npm install genkitx-ollama ``` ## Configuration [Section titled “Configuration”](#configuration) This plugin requires that you first install and run the Ollama server. You can follow the instructions on: [Download Ollama](https://ollama.com/download). You can use the Ollama CLI to download the model you are interested in. For example: ```bash ollama pull gemma ``` To use this plugin, specify it when you initialize Genkit: ```ts import { genkit } from 'genkit'; import { ollama } from 'genkitx-ollama'; const ai = genkit({ plugins: [ ollama({ models: [ { name: 'gemma', type: 'generate', // type: 'chat' | 'generate' | undefined }, ], serverAddress: 'http://127.0.0.1:11434', // default local address }), ], }); ``` ### Authentication [Section titled “Authentication”](#authentication) If you would like to access remote deployments of Ollama that require custom headers (static, such as API keys, or dynamic, such as auth headers), you can specify those in the Ollama config plugin: Static headers: ```ts ollama({ models: [{ name: 'gemma'}], requestHeaders: { 'api-key': 'API Key goes here' }, serverAddress: 'https://my-deployment', }), ``` You can also dynamically set headers per request. 
Here’s an example of how to set an ID token using the Google Auth library: ```ts import { GoogleAuth } from 'google-auth-library'; import { ollama } from 'genkitx-ollama'; import { genkit } from 'genkit'; const ollamaCommon = { models: [{ name: 'gemma:2b' }] }; const ollamaDev = { ...ollamaCommon, serverAddress: 'http://127.0.0.1:11434', }; const ollamaProd = { ...ollamaCommon, serverAddress: 'https://my-deployment', requestHeaders: async (params) => { const headers = await fetchWithAuthHeader(params.serverAddress); return { Authorization: headers['Authorization'] }; }, }; const ai = genkit({ plugins: [ollama(isDevEnv() ? ollamaDev : ollamaProd)], }); // Function to lazily load GoogleAuth client let auth: GoogleAuth; function getAuthClient() { if (!auth) { auth = new GoogleAuth(); } return auth; } // Function to fetch headers, reusing tokens when possible async function fetchWithAuthHeader(url: string) { const client = await getIdTokenClient(url); const headers = await client.getRequestHeaders(url); // Auto-manages token refresh return headers; } async function getIdTokenClient(url: string) { const auth = getAuthClient(); const client = await auth.getIdTokenClient(url); return client; } ``` ## Usage [Section titled “Usage”](#usage) This plugin doesn’t statically export model references. Specify one of the models you configured using a string identifier: ```ts const llmResponse = await ai.generate({ model: 'ollama/gemma', prompt: 'Tell me a joke.', }); ``` ## Embedders [Section titled “Embedders”](#embedders) The Ollama plugin supports embeddings, which can be used for similarity searches and other NLP tasks. ```ts const ai = genkit({ plugins: [ ollama({ serverAddress: 'http://localhost:11434', embedders: [{ name: 'nomic-embed-text', dimensions: 768 }], }), ], }); async function getEmbeddings() { const embeddings = ( await ai.embed({ embedder: 'ollama/nomic-embed-text', content: 'Some text to embed!', }) )[0].embedding; return embeddings; } getEmbeddings().then((e) => console.log(e)); ``` # OpenAI Plugin > Learn how to configure and use Genkit OpenAI plugin to access various models and embedders from OpenAI. The `@genkit-ai/compat-oai` package includes a pre-configured plugin for official [OpenAI models](https://platform.openai.com/docs/models). Note The OpenAI plugin is built on top of the `openAICompatible` plugin. It is pre-configured for OpenAI’s API endpoints. ## Installation [Section titled “Installation”](#installation) ```bash npm install @genkit-ai/compat-oai ``` ## Configuration [Section titled “Configuration”](#configuration) To use this plugin, import `openAI` and specify it when you initialize Genkit: ```ts import { genkit } from 'genkit'; import { openAI } from '@genkit-ai/compat-oai/openai'; export const ai = genkit({ plugins: [openAI()], }); ``` The plugin requires an API key for the OpenAI API. You can get one from the [OpenAI Platform](https://platform.openai.com/api-keys). Configure the plugin to use your API key by doing one of the following: * Set the `OPENAI_API_KEY` environment variable to your API key. * Specify the API key when you initialize the plugin: ```ts openAI({ apiKey: yourKey }); ``` However, don’t embed your API key directly in code! Use this feature only in conjunction with a service like Google Cloud Secret Manager or similar. ## Usage [Section titled “Usage”](#usage) The plugin provides helpers to reference supported models and embedders. 
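For example, you can create references once with the `openAI.model()` and `openAI.embedder()` helpers and reuse them across calls (a minimal sketch using model names that appear in the sections below):

```ts
import { openAI } from '@genkit-ai/compat-oai/openai';

// Model references
const gpt4o = openAI.model('gpt-4o');
const whisper = openAI.model('whisper-1');

// Embedder references
const adaEmbedder = openAI.embedder('text-embedding-ada-002');
```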
### Chat Models [Section titled “Chat Models”](#chat-models) You can reference chat models like `gpt-4o` and `gpt-4-turbo` using the `openAI.model()` helper. ```ts import { genkit, z } from 'genkit'; import { openAI } from '@genkit-ai/compat-oai/openai'; const ai = genkit({ plugins: [openAI()], }); export const jokeFlow = ai.defineFlow( { name: 'jokeFlow', inputSchema: z.object({ subject: z.string() }), outputSchema: z.object({ joke: z.string() }), }, async ({ subject }) => { const llmResponse = await ai.generate({ prompt: `tell me a joke about ${subject}`, model: openAI.model('gpt-4o'), }); return { joke: llmResponse.text }; }, ); ``` You can also pass model-specific configuration: ```ts const llmResponse = await ai.generate({ prompt: `tell me a joke about ${subject}`, model: openAI.model('gpt-4o'), config: { temperature: 0.7, }, }); ``` ### Image Generation Models [Section titled “Image Generation Models”](#image-generation-models) The plugin supports image generation models like DALL-E 3. ```ts import { genkit } from 'genkit'; import { openAI } from '@genkit-ai/compat-oai/openai'; const ai = genkit({ plugins: [openAI()], }); // Reference an image generation model const dalle3 = openAI.model('dall-e-3'); // Use it to generate an image const imageResponse = await ai.generate({ model: dalle3, prompt: 'A photorealistic image of a cat programming a computer.', config: { size: '1024x1024', style: 'vivid', }, }); const imageUrl = imageResponse.media()?.url; ``` ### Text Embedding Models [Section titled “Text Embedding Models”](#text-embedding-models) You can use text embedding models to create vector embeddings from text. ```ts import { genkit, z } from 'genkit'; import { openAI } from '@genkit-ai/compat-oai/openai'; const ai = genkit({ plugins: [openAI()], }); export const embedFlow = ai.defineFlow( { name: 'embedFlow', inputSchema: z.object({ text: z.string() }), outputSchema: z.object({ embedding: z.string() }), }, async ({ text }) => { const embedding = await ai.embed({ embedder: openAI.embedder('text-embedding-ada-002'), content: text, }); return { embedding: JSON.stringify(embedding) }; }, ); ``` ### Audio Transcription and Speech Models [Section titled “Audio Transcription and Speech Models”](#audio-transcription-and-speech-models) The OpenAI plugin also supports audio models for transcription (speech-to-text) and speech generation (text-to-speech). #### Transcription (Speech-to-Text) [Section titled “Transcription (Speech-to-Text)”](#transcription-speech-to-text) Use models like `whisper-1` to transcribe audio files. ```ts import { genkit } from 'genkit'; import { openAI } from '@genkit-ai/compat-oai/openai'; import * as fs from 'fs'; const ai = genkit({ plugins: [openAI()], }); const whisper = openAI.model('whisper-1'); const audioFile = fs.readFileSync('path/to/your/audio.mp3'); const transcription = await ai.generate({ model: whisper, prompt: [ { media: { contentType: 'audio/mp3', url: `data:audio/mp3;base64,${audioFile.toString('base64')}`, }, }, ], }); console.log(transcription.text()); ``` #### Speech Generation (Text-to-Speech) [Section titled “Speech Generation (Text-to-Speech)”](#speech-generation-text-to-speech) Use models like `tts-1` to generate speech from text. ```ts import { genkit } from 'genkit'; import { openAI } from '@genkit-ai/compat-oai/openai'; import * as fs from 'fs'; const ai = genkit({ plugins: [openAI()], }); const tts = openAI.model('tts-1'); const speechResponse = await ai.generate({ model: tts, prompt: 'Hello, world! 
This is a test of text-to-speech.', config: { voice: 'alloy', }, }); const audioData = speechResponse.media(); if (audioData) { fs.writeFileSync('output.mp3', Buffer.from(audioData.url.split(',')[1], 'base64')); } ``` ## Advanced usage [Section titled “Advanced usage”](#advanced-usage) ### Passthrough configuration [Section titled “Passthrough configuration”](#passthrough-configuration) You can pass configuration options that are not defined in the plugin’s custom configuration schema. This permits you to access new models and features without having to update your Genkit version. ```ts import { genkit } from 'genkit'; import { openAI } from '@genkit-ai/compat-oai/openai'; const ai = genkit({ plugins: [openAI()], }); const llmResponse = await ai.generate({ prompt: `Tell me a cool story`, model: openAI.model('gpt-4-new'), // hypothetical new model config: { seed: 123, new_feature_parameter: ... // hypothetical config needed for new model }, }); ``` Genkit passes this config as-is to the OpenAI API, giving you access to the new model’s features. Note that field names and types are not validated by Genkit and must match the OpenAI API specification to work. ### Web-search built-in tool [Section titled “Web-search built-in tool”](#web-search-built-in-tool) Some OpenAI models support web search. You can enable it in the `config` block: ```ts import { genkit } from 'genkit'; import { openAI } from '@genkit-ai/compat-oai/openai'; const ai = genkit({ plugins: [openAI()], }); const llmResponse = await ai.generate({ prompt: `What was a positive news story from today?`, model: openAI.model('gpt-4o-search-preview'), config: { web_search_options: {}, }, }); ``` # pgvector retriever template > Learn how to use PostgreSQL and pgvector as a retriever implementation in Genkit. You can use PostgreSQL and `pgvector` as your retriever implementation. Use the following examples as a starting point and modify them to work with your database schema. ```ts import { genkit, z, Document } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; import { toSql } from 'pgvector'; import postgres from 'postgres'; const ai = genkit({ plugins: [googleAI()], }); const sql = postgres({ ssl: false, database: 'recaps' }); const QueryOptions = z.object({ show: z.string(), k: z.number().optional(), }); const sqlRetriever = ai.defineRetriever( { name: 'pgvector-myTable', configSchema: QueryOptions, }, async (input, options) => { const embedding = ( await ai.embed({ embedder: googleAI.embedder('gemini-embedding-001'), content: input, }) )[0].embedding; const results = await sql` SELECT episode_id, season_number, chunk as content FROM embeddings WHERE show_id = ${options.show} ORDER BY embedding <#> ${toSql(embedding)} LIMIT ${options.k ?? 3} `; return { documents: results.map((row) => { const { content, ...metadata } = row; return Document.fromText(content, metadata); }), }; }, ); ``` And here’s how to use the retriever in a flow: ```ts // Simple flow to use the sqlRetriever export const askQuestionsOnGoT = ai.defineFlow( { name: 'askQuestionsOnGoT', inputSchema: z.object({ question: z.string() }), outputSchema: z.object({ answer: z.string() }), }, async ({ question }) => { const docs = await ai.retrieve({ retriever: sqlRetriever, query: question, options: { show: 'Game of Thrones', }, }); console.log(docs); // Continue with using retrieved docs // in RAG prompts. //...
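    // For example (a sketch, not part of the original template), the retrieved docs
    // could be passed straight to a generate call via the `docs` option:
    //
    //   const { text } = await ai.generate({
    //     model: googleAI.model('gemini-2.5-flash'),
    //     prompt: `Answer this question about Game of Thrones: ${question}`,
    //     docs,
    //   });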
// Return an answer (placeholder for actual implementation) return { answer: "Answer would be generated here based on retrieved documents" }; }, ); ``` # Pinecone plugin > This document describes the Pinecone plugin for Genkit, which provides indexer and retriever implementations for the Pinecone cloud vector database. The Pinecone plugin provides indexer and retriever implementations that use the [Pinecone](https://www.pinecone.io/) cloud vector database. ## Installation [Section titled “Installation”](#installation) ```bash npm install genkitx-pinecone ``` ## Configuration [Section titled “Configuration”](#configuration) To use this plugin, specify it when you initialize Genkit: ```ts import { genkit } from 'genkit'; import { pinecone } from 'genkitx-pinecone'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [ pinecone([ { indexId: 'bob-facts', embedder: googleAI.embedder('gemini-embedding-001'), }, ]), ], }); ``` You must specify a Pinecone index ID and the embedding model you want to use. In addition, you must configure Genkit with your Pinecone API key. There are two ways to do this: * Set the `PINECONE_API_KEY` environment variable. * Specify it in the `clientParams` optional parameter: ```ts clientParams: { apiKey: ..., } ``` The value of this parameter is a `PineconeConfiguration` object, which gets passed to the Pinecone client; you can use it to pass any parameter the client supports. ## Usage [Section titled “Usage”](#usage) Import retriever and indexer references like so: ```ts import { pineconeRetrieverRef } from 'genkitx-pinecone'; import { pineconeIndexerRef } from 'genkitx-pinecone'; ``` Then, use these references with `ai.retrieve()` and `ai.index()`: ```ts // To use the index you configured when you loaded the plugin: let docs = await ai.retrieve({ retriever: pineconeRetrieverRef, query }); // To specify an index: export const bobFactsRetriever = pineconeRetrieverRef({ indexId: 'bob-facts', }); docs = await ai.retrieve({ retriever: bobFactsRetriever, query }); ``` ```ts // To use the index you configured when you loaded the plugin: await ai.index({ indexer: pineconeIndexerRef, documents }); // To specify an index: export const bobFactsIndexer = pineconeIndexerRef({ indexId: 'bob-facts', }); await ai.index({ indexer: bobFactsIndexer, documents }); ``` See the [Retrieval-augmented generation](/docs/rag) page for a general discussion on indexers and retrievers. # Third-party plugins by Firebase and partners > This page lists third-party plugins for Genkit that are built and maintained by Firebase or our partners. This page lists third-party plugins for Genkit that are built and maintained by Firebase or our partners. Neo4j The Neo4j plugin integrates the [Neo4j graph databases](https://neo4j.com/product/neo4j-graph-database/) into the Genkit framework. You can leverage knowledge graphs and semantic search for building advanced AI applications, particularly for RAG scenarios where contextual information from a knowledge graph is crucial. [View plugin info](https://github.com/neo4j-partners/genkitx-neo4j) DataStax Astra DB The Astra DB plugin integrates [Astra DB](https://www.datastax.com/products/datastax-astra) into the Genkit framework as an indexer and retriever. You can efficiently embed, index, and retrieve data within your Genkit applications. 
[View plugin info](https://github.com/datastax/genkitx-astra-db) Pinecone The Pinecone plugin provides indexer and retriever implementations that use the [Pinecone](https://www.pinecone.io/product/) cloud vector database. [View plugin info](/docs/plugins/pinecone) ChromaDB The Chroma plugin provides indexer and retriever implementations that use the [Chroma vector database](https://docs.trychroma.com/) in client/server mode. [View plugin info](/docs/plugins/chroma) Auth0 The Auth0 plugin enables you to build secure AI-powered applications using [Auth0](https://www.auth0.ai/) and [Okta FGA](https://docs.fga.dev/). [View plugin info](https://github.com/auth0-lab/auth0-ai-js/tree/main/packages/ai-genkit) Ollama The Ollama plugin provides interfaces to any of the local LLMs supported by [Ollama](https://ollama.com/). [View plugin info](/docs/plugins/ollama) pgvector The pgvector template is an example PostgreSQL and pgvector retriever implementation. You can use the provided examples as a starting point and modify them to work with your database schema. [View template](/docs/plugins/pgvector) Cloud SQL for PostgreSQL The Cloud SQL for PostgreSQL plugin provides indexer and retriever implementations that use the [Cloud SQL for PostgreSQL](https://cloud.google.com/sql/docs/postgres) cloud vector database. [View plugin info](/docs/plugins/cloud-sql-pg) MCP Toolbox For Databases The MCP Toolbox plugin integrates the [MCP Toolbox tools](https://github.com/googleapis/genai-toolbox) into the Genkit framework. It enables you to develop tools more easily, quickly, and securely by handling complexities such as connection pooling, authentication, and more. [View template](/docs/plugins/toolbox) # MCP Toolbox for Databases > This document introduces the MCP Toolbox for Databases, an open-source MCP server designed for enterprise-grade database tool development, and explains its integration with Genkit applications. [MCP Toolbox for Databases](https://github.com/googleapis/genai-toolbox) is an open source MCP server for databases. It was designed with enterprise-grade quality and production use in mind. It enables you to develop tools more easily, quickly, and securely by handling complexities such as connection pooling, authentication, and more. Toolbox tools can be seamlessly integrated with Genkit applications. For more information on [getting started](https://googleapis.github.io/genai-toolbox/getting-started/) or [configuring](https://googleapis.github.io/genai-toolbox/getting-started/configure/) Toolbox, see the [documentation](https://googleapis.github.io/genai-toolbox/getting-started/introduction/). ![architecture](/_astro/mcp_db_toolbox.BLzcU50-_Z1COGKY.webp) ### Configure and deploy [Section titled “Configure and deploy”](#configure-and-deploy) Toolbox is an open source server that you deploy and manage yourself. For instructions on deploying and configuring, see the official Toolbox documentation: * [Installing the Server](https://googleapis.github.io/genai-toolbox/getting-started/introduction/#installing-the-server) * [Configuring Toolbox](https://googleapis.github.io/genai-toolbox/getting-started/configure/) ### Install client SDK [Section titled “Install client SDK”](#install-client-sdk) Genkit relies on the `@toolbox-sdk/core` node package to use Toolbox.
Install the package before getting started: ```shell npm install @toolbox-sdk/core ``` ### Loading Toolbox Tools [Section titled “Loading Toolbox Tools”](#loading-toolbox-tools) Once your Toolbox server is configured and running, you can load tools from it using the Toolbox client SDK: ```javascript import { ToolboxClient } from '@toolbox-sdk/core'; import { genkit, z } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [googleAI()], model: googleAI.model('gemini-1.5-pro'), }); // Replace with your Toolbox Server URL const URL = 'https://127.0.0.1:5000'; const client = new ToolboxClient(URL); const toolboxTools = await client.loadToolset('toolsetName'); const getGenkitTool = (toolboxTool) => ai.defineTool({ name: toolboxTool.getName(), description: toolboxTool.getDescription(), inputSchema: toolboxTool.getParams(), }, toolboxTool, ); const tools = toolboxTools.map(getGenkitTool); await ai.generate({ prompt: 'Ask some question.', tools: tools, }); ``` ### Advanced Toolbox Features [Section titled “Advanced Toolbox Features”](#advanced-toolbox-features) Toolbox has a variety of features that make developing Gen AI tools for databases easier. For more information, read about the following features: * [Authenticated Parameters](https://googleapis.github.io/genai-toolbox/resources/tools/#authenticated-parameters): bind tool inputs to values from OIDC tokens automatically, making it easy to run sensitive queries without potentially leaking data * [Authorized Invocations:](https://googleapis.github.io/genai-toolbox/resources/tools/#authorized-invocations) restrict access to use a tool based on the user’s Auth token * [OpenTelemetry](https://googleapis.github.io/genai-toolbox/how-to/export_telemetry/): get metrics and tracing from Toolbox with OpenTelemetry # Vertex AI plugin > This document describes the Vertex AI plugin for Genkit, providing interfaces to Google's generative AI models, evaluation metrics, Vector Search, and text-to-speech capabilities.
The Vertex AI plugin provides interfaces to several AI services: * [Google generative AI models](https://cloud.google.com/vertex-ai/generative-ai/docs/): * Gemini text generation * Imagen2 and Imagen3 image generation * Text embedding generation * Multimodal embedding generation * A subset of evaluation metrics through the Vertex AI [Rapid Evaluation API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/evaluation): * [BLEU](https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations/evaluateInstances#bleuinput) * [ROUGE](https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations/evaluateInstances#rougeinput) * [Fluency](https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations/evaluateInstances#fluencyinput) * [Safety](https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations/evaluateInstances#safetyinput) * [Groundedness](https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations/evaluateInstances#groundednessinput) * [Summarization Quality](https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations/evaluateInstances#summarizationqualityinput) * [Summarization Helpfulness](https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations/evaluateInstances#summarizationhelpfulnessinput) * [Summarization Verbosity](https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations/evaluateInstances#summarizationverbosityinput) * [Vector Search](https://cloud.google.com/vertex-ai/docs/vector-search/overview) ## Installation [Section titled “Installation”](#installation) ```bash npm install @genkit-ai/vertexai ``` If you want to locally run flows that use this plugin, you also need the [Google Cloud CLI tool](https://cloud.google.com/sdk/docs/install) installed. ## Configuration [Section titled “Configuration”](#configuration) To use this plugin, specify it when you initialize Genkit: ```ts import { genkit } from 'genkit'; import { vertexAI } from '@genkit-ai/vertexai'; const ai = genkit({ plugins: [vertexAI({ location: 'us-central1' })], }); ``` The plugin requires you to specify your Google Cloud project ID, the [region](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations) to which you want to make Vertex API requests, and your Google Cloud project credentials. * You can specify your Google Cloud project ID either by setting `projectId` in the `vertexAI()` configuration or by setting the `GCLOUD_PROJECT` environment variable. If you’re running your flow from a Google Cloud environment (Cloud Functions, Cloud Run, and so on), `GCLOUD_PROJECT` is automatically set to the project ID of the environment. * You can specify the API location either by setting `location` in the `vertexAI()` configuration or by setting the `GCLOUD_LOCATION` environment variable. * To provide API credentials, you need to set up Google Cloud Application Default Credentials. 1. To specify your credentials: * If you’re running your flow from a Google Cloud environment (Cloud Functions, Cloud Run, and so on), this is set automatically. * On your local dev environment, do this by running: ```bash gcloud auth application-default login --project YOUR_PROJECT_ID ``` * For other environments, see the [Application Default Credentials](https://cloud.google.com/docs/authentication/provide-credentials-adc) docs. 2. In addition, make sure the account is granted the Vertex AI User IAM role (`roles/aiplatform.user`).
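For example, assuming you authenticate as a user account (the project ID and email below are placeholders you would replace), the role could be granted with a command along these lines:

```bash
# Grant the Vertex AI User role to the account that will call Vertex AI
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="user:YOUR_ACCOUNT_EMAIL" \
  --role="roles/aiplatform.user"
```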
See the Vertex AI [access control](https://cloud.google.com/vertex-ai/generative-ai/docs/access-control) docs. ## Usage [Section titled “Usage”](#usage) ### Generative AI Models [Section titled “Generative AI Models”](#generative-ai-models) The Vertex AI plugin allows you to use various Gemini, Imagen, and other Vertex AI models: ```ts import { vertexAI } from '@genkit-ai/vertexai'; const ai = genkit({ plugins: [vertexAI({ location: 'us-central1' })], }); const llmResponse = await ai.generate({ model: vertexAI.model('gemini-2.5-flash'), prompt: 'What should I do when I visit Melbourne?', }); ``` This plugin also supports grounding Gemini text responses using [Google Search](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/ground-gemini#web-ground-gemini) or [your own data](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/ground-gemini#private-ground-gemini). Important: Vertex AI charges a fee for grounding requests in addition to the cost of making LLM requests. See the [Vertex AI pricing](https://cloud.google.com/vertex-ai/generative-ai/pricing) page and be sure you understand grounding request pricing before you use this feature. Example: ```ts const ai = genkit({ plugins: [vertexAI({ location: 'us-central1' })], }); await ai.generate({ model: vertexAI.model('gemini-2.5-flash'), prompt: '...', config: { googleSearchRetrieval: { disableAttribution: true, }, vertexRetrieval: { datastore: { projectId: 'your-cloud-project', location: 'us-central1', collection: 'your-collection', }, disableAttribution: true, } } }) ``` You can also use Vertex AI text embedding models for generating embeddings: ```ts const ai = genkit({ plugins: [vertexAI({ location: 'us-central1' })], }); const embeddings = await ai.embed({ embedder: vertexAI.embedder('gemini-embedding-001'), content: 'How many widgets do you have in stock?', }); ``` For Chroma DB or other vector databases, you can specify the embedder like this: ```ts const ai = genkit({ plugins: [ chroma([ { embedder: vertexAI.embedder('gemini-embedding-001'), collectionName: 'my-collection', }, ]), ], }); ``` This plugin can also handle multimodal embeddings: ```ts import { vertexAI } from '@genkit-ai/vertexai'; const ai = genkit({ plugins: [vertexAI({ location: 'us-central1' })], }); const embeddings = await ai.embed({ embedder: vertexAI.embedder('multimodal-embedding-001'), content: { content: [ { media: { url: 'gs://cloud-samples-data/generative-ai/video/pixel8.mp4', contentType: 'video/mp4', }, }, ], }, }); ``` The Imagen3 model allows generating images from a user prompt: ```ts import { vertexAI } from '@genkit-ai/vertexai'; const ai = genkit({ plugins: [vertexAI({ location: 'us-central1' })], }); const response = await ai.generate({ model: vertexAI.model('imagen-3.0-generate-002'), output: { format: 'media' }, prompt: 'a banana riding a bicycle', }); return response.media; ``` and even advanced editing of existing images: ```ts const ai = genkit({ plugins: [vertexAI({ location: 'us-central1' })], }); const baseImg = fs.readFileSync('base.png', { encoding: 'base64' }); const maskImg = fs.readFileSync('mask.png', { encoding: 'base64' }); const response = await ai.generate({ model: vertexAI.model('imagen-3.0-generate-002'), output: { format: 'media' }, prompt: [ { media: { url: `data:image/png;base64,${baseImg}` } }, { media: { url: `data:image/png;base64,${maskImg}` }, metadata: { type: 'mask' }, }, { text: 'replace the background with foo bar baz' }, ], config: { editConfig: { editMode: 'outpainting', }, }, }); return 
response.media; ``` Refer to [Imagen model documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/imagen-api#edit_images_2) for more detailed options. #### Anthropic Claude 3 on Vertex AI Model Garden [Section titled “Anthropic Claude 3 on Vertex AI Model Garden”](#anthropic-claude-3-on-vertex-ai-model-garden) If you have access to Claude 3 models ([haiku](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-haiku), [sonnet](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-sonnet), or [opus](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-opus)) in Vertex AI Model Garden, you can use them with Genkit. Here’s a sample configuration for enabling Vertex AI Model Garden models: ```ts import { genkit } from 'genkit'; import { vertexAIModelGarden } from '@genkit-ai/vertexai/modelgarden'; const ai = genkit({ plugins: [ vertexAIModelGarden({ location: 'us-central1', models: ['claude-3-haiku', 'claude-3-sonnet', 'claude-3-opus'], }), ], }); ``` Then use them as regular models: ```ts const llmResponse = await ai.generate({ model: 'claude-3-sonnet', prompt: 'What should I do when I visit Melbourne?', }); ``` #### Llama 3.1 405b on Vertex AI Model Garden [Section titled “Llama 3.1 405b on Vertex AI Model Garden”](#llama-31-405b-on-vertex-ai-model-garden) First, you’ll need to enable the [Llama 3.1 API Service](https://console.cloud.google.com/vertex-ai/publishers/meta/model-garden/llama3-405b-instruct-maas) in Vertex AI Model Garden. Here’s a sample configuration for Llama 3.1 405b in the Vertex AI plugin: ```ts import { genkit } from 'genkit'; import { vertexAIModelGarden } from '@genkit-ai/vertexai/modelgarden'; const ai = genkit({ plugins: [ vertexAIModelGarden({ location: 'us-central1', models: ['llama3-405b-instruct-maas'], }), ], }); ``` Then use it as a regular model: ```ts const llmResponse = await ai.generate({ model: 'llama3-405b-instruct-maas', prompt: 'Write a function that adds two numbers together', }); ``` #### Mistral Models on Vertex AI Model Garden [Section titled “Mistral Models on Vertex AI Model Garden”](#mistral-models-on-vertex-ai-model-garden) If you have access to Mistral models ([Mistral Large](https://console.cloud.google.com/vertex-ai/publishers/mistralai/model-garden/mistral-large), [Mistral Nemo](https://console.cloud.google.com/vertex-ai/publishers/mistralai/model-garden/mistral-nemo), or [Codestral](https://console.cloud.google.com/vertex-ai/publishers/mistralai/model-garden/codestral)) in Vertex AI Model Garden, you can use them with Genkit.
Here’s a sample configuration for enabling Vertex AI Model Garden models: ```ts import { genkit } from 'genkit'; import { vertexAIModelGarden } from '@genkit-ai/vertexai/modelgarden'; const ai = genkit({ plugins: [ vertexAIModelGarden({ location: 'us-central1', models: ['mistral-large', 'mistral-nemo', 'codestral'], }), ], }); ``` Then use them as regular models: ```ts const llmResponse = await ai.generate({ model: 'mistral-large', prompt: 'Write a function that adds two numbers together', config: { version: 'mistral-large-2411', // Optional: specify model version temperature: 0.7, // Optional: control randomness (0-1) maxOutputTokens: 1024, // Optional: limit response length topP: 0.9, // Optional: nucleus sampling parameter stopSequences: ['###'], // Optional: stop generation at sequences }, }); ``` The models support: * `mistral-large`: Latest Mistral large model with function calling capabilities * `mistral-nemo`: Optimized for efficiency and speed * `codestral`: Specialized for code generation tasks Each model supports streaming responses and function calling: ```ts const response = await ai.generateStream({ model: 'mistral-large', prompt: 'What should I cook tonight?', tools: ['recipe-finder'], config: { version: 'mistral-large-2411', temperature: 1, }, }); for await (const chunk of response.stream) { console.log(chunk.text); } ``` ### Evaluators [Section titled “Evaluators”](#evaluators) To use the evaluators from Vertex AI Rapid Evaluation, add the `vertexAIEvaluation` plugin to your Genkit configuration and specify the metrics you want to use: ```ts import { genkit } from 'genkit'; import { vertexAIEvaluation, VertexAIEvaluationMetricType } from '@genkit-ai/vertexai/evaluation'; const ai = genkit({ plugins: [ vertexAIEvaluation({ location: 'us-central1', metrics: [ VertexAIEvaluationMetricType.SAFETY, { type: VertexAIEvaluationMetricType.ROUGE, metricSpec: { rougeType: 'rougeLsum', }, }, ], }), ], }); ``` The configuration above adds evaluators for the `Safety` and `ROUGE` metrics. The example shows two approaches: the `Safety` metric uses the default specification, whereas the `ROUGE` metric provides a customized specification that sets the ROUGE type to `rougeLsum`. Both evaluators can be run using the `genkit eval:run` command with a compatible dataset: that is, a dataset with `output` and `reference` fields. The `Safety` evaluator can also be run using the `genkit eval:flow -e vertexai/safety` command since it only requires an `output`. ### Indexers and retrievers [Section titled “Indexers and retrievers”](#indexers-and-retrievers) The Genkit Vertex AI plugin includes indexer and retriever implementations backed by the Vertex AI Vector Search service. (See the [Retrieval-augmented generation](/docs/rag) page to learn how indexers and retrievers are used in a RAG implementation.) The Vertex AI Vector Search service is a document index that works alongside the document store of your choice: the document store contains the content of documents, and the Vertex AI Vector Search index contains, for each document, its vector embedding and a reference to the document in the document store. After your documents are indexed by the Vertex AI Vector Search service, it can respond to search queries, producing lists of indexes into your document store. The indexer and retriever implementations provided by the Vertex AI plugin use either Cloud Firestore or BigQuery as the document store. The plugin also includes interfaces you can implement to support other document stores.
Important: Pricing for Vector Search consists of both a charge for every gigabyte of data you ingest and an hourly charge for the VMs that host your deployed indexes. See [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing#vectorsearch). This is likely to be most cost-effective when you are serving high volumes of traffic. Be sure to understand the billing implications the service will have on your project before using it. To use Vertex AI Vector Search: 1. Choose an embedding model. This model is responsible for creating vector embeddings from text or media. Advanced users might use an embedding model optimized for their particular data sets, but for most users, Vertex AI’s `gemini-embedding-001` model is a good choice for English text, the `text-multilingual-embedding-002` model is good for multilingual text, and the `multimodalEmbedding001` model is good for mixed text, images, and video. 2. In the [Vector Search](https://console.cloud.google.com/vertex-ai/matching-engine/indexes) section of the Google Cloud console, create a new index. The most important settings are: * **Dimensions:** Specify the dimensionality of the vectors produced by your chosen embedding model. The `gemini-embedding-001` and `text-multilingual-embedding-002` models produce vectors of 768 dimensions. The `multimodalEmbedding001` model can produce vectors of 128, 256, 512, or 1408 dimensions for text and image, and will produce vectors of 1408 dimensions for video. * **Update method:** Select streaming updates. After you create the index, deploy it to a standard (public) endpoint. 3. Get a document indexer and retriever for the document store you want to use: **Cloud Firestore** ```ts import { getFirestoreDocumentIndexer, getFirestoreDocumentRetriever } from '@genkit-ai/vertexai/vectorsearch'; import { initializeApp } from 'firebase-admin/app'; import { getFirestore } from 'firebase-admin/firestore'; initializeApp({ projectId: PROJECT_ID }); const db = getFirestore(); const firestoreDocumentRetriever = getFirestoreDocumentRetriever(db, FIRESTORE_COLLECTION); const firestoreDocumentIndexer = getFirestoreDocumentIndexer(db, FIRESTORE_COLLECTION); ``` **BigQuery** ```ts import { getBigQueryDocumentIndexer, getBigQueryDocumentRetriever } from '@genkit-ai/vertexai/vectorsearch'; import { BigQuery } from '@google-cloud/bigquery'; const bq = new BigQuery({ projectId: PROJECT_ID }); const bigQueryDocumentRetriever = getBigQueryDocumentRetriever(bq, BIGQUERY_TABLE, BIGQUERY_DATASET); const bigQueryDocumentIndexer = getBigQueryDocumentIndexer(bq, BIGQUERY_TABLE, BIGQUERY_DATASET); ``` **Other** To support other document stores, you can provide your own implementations of `DocumentRetriever` and `DocumentIndexer`: ```ts const myDocumentRetriever = async (neighbors) => { // Return the documents referenced by `neighbors`. // ... }; const myDocumentIndexer = async (documents) => { // Add `documents` to storage. // ... }; ``` For an example, see [Sample Vertex AI Plugin Retriever and Indexer with Local File](https://github.com/firebase/genkit/tree/main/js/testapps/vertexai-vector-search-custom). 4. 
Add a `vectorSearchOptions` block to your `vertexAI` plugin configuration: ```ts import { genkit } from 'genkit'; import { vertexAIVectorSearch } from '@genkit-ai/vertexai/vectorsearch'; const ai = genkit({ plugins: [ vertexAIVectorSearch({ projectId: PROJECT_ID, location: LOCATION, vectorSearchOptions: [ { indexId: VECTOR_SEARCH_INDEX_ID, indexEndpointId: VECTOR_SEARCH_INDEX_ENDPOINT_ID, deployedIndexId: VECTOR_SEARCH_DEPLOYED_INDEX_ID, publicDomainName: VECTOR_SEARCH_PUBLIC_DOMAIN_NAME, documentRetriever: firestoreDocumentRetriever, documentIndexer: firestoreDocumentIndexer, embedder: vertexAI.embedder('gemini-embedding-001'), }, ], }), ], }); ``` Provide the embedder you chose in the first step and the document indexer and retriever you created in the previous step. To configure the plugin to use the Vector Search index you created earlier, you need to provide several values, which you can find in the Vector Search section of the Google Cloud console: * `indexId`: listed on the [Indexes](https://console.cloud.google.com/vertex-ai/matching-engine/indexes) tab * `indexEndpointId`: listed on the [Index Endpoints](https://console.cloud.google.com/vertex-ai/matching-engine/index-endpoints) tab * `deployedIndexId` and `publicDomainName`: listed on the “Deployed index info” page, which you can open by clicking the name of the deployed index on either of the tabs mentioned earlier 5. Now that everything is configured, you can use the indexer and retriever in your Genkit application: ```ts import { vertexAiIndexerRef, vertexAiRetrieverRef } from '@genkit-ai/vertexai/vectorsearch'; // ... inside your flow function: await ai.index({ indexer: vertexAiIndexerRef({ indexId: VECTOR_SEARCH_INDEX_ID, }), documents, }); const res = await ai.retrieve({ retriever: vertexAiRetrieverRef({ indexId: VECTOR_SEARCH_INDEX_ID, }), query: queryDocument, }); ``` See the code samples for: * [Vertex Vector Search + BigQuery](https://github.com/firebase/genkit/tree/main/js/testapps/vertexai-vector-search-bigquery) * [Vertex Vector Search + Firestore](https://github.com/firebase/genkit/tree/main/js/testapps/vertexai-vector-search-firestore) * [Vertex Vector Search + a custom DB](https://github.com/firebase/genkit/tree/main/js/testapps/vertexai-vector-search-custom) ## Text-to-Speech (TTS) Models [Section titled “Text-to-Speech (TTS) Models”](#text-to-speech-tts-models) The Vertex AI plugin provides access to text-to-speech capabilities through Gemini TTS models. These models can convert text into natural-sounding speech for various applications. 
### Basic Usage [Section titled “Basic Usage”](#basic-usage) To generate audio using a Vertex AI TTS model: ```ts import { genkit } from 'genkit'; import { vertexAI } from '@genkit-ai/vertexai'; import { writeFile } from 'node:fs/promises'; const ai = genkit({ plugins: [vertexAI({ location: 'us-central1' })], }); const response = await ai.generate({ model: vertexAI.model('gemini-2.5-flash-preview-tts'), config: { responseModalities: ['AUDIO'], speechConfig: { voiceConfig: { prebuiltVoiceConfig: { voiceName: 'Algenib' }, }, }, }, prompt: 'Say that Genkit is an amazing Gen AI library', }); // Handle the audio data (returned as a data URL) if (response.media?.url) { // Extract base64 data from the data URL const audioBuffer = Buffer.from( response.media.url.substring(response.media.url.indexOf(',') + 1), 'base64' ); // Save to a file await writeFile('output.wav', audioBuffer); } ``` ### Multi-speaker Audio Generation [Section titled “Multi-speaker Audio Generation”](#multi-speaker-audio-generation) You can generate audio with multiple speakers, each with their own voice: ```ts const response = await ai.generate({ model: vertexAI.model('gemini-2.5-flash-preview-tts'), config: { responseModalities: ['AUDIO'], speechConfig: { multiSpeakerVoiceConfig: { speakerVoiceConfigs: [ { speaker: 'Speaker1', voiceConfig: { prebuiltVoiceConfig: { voiceName: 'Algenib' }, }, }, { speaker: 'Speaker2', voiceConfig: { prebuiltVoiceConfig: { voiceName: 'Achernar' }, }, }, ], }, }, }, prompt: `Here's the dialog: Speaker1: "Genkit is an amazing Gen AI library!" Speaker2: "I thought it was a framework."`, }); ``` ### Configuration Options [Section titled “Configuration Options”](#configuration-options) The Vertex AI TTS models support a wide range of configuration options: #### Voice Selection [Section titled “Voice Selection”](#voice-selection) Vertex AI offers multiple pre-built voices with different characteristics: ```ts speechConfig: { voiceConfig: { prebuiltVoiceConfig: { voiceName: 'Algenib' // Other options include: 'Achernar', 'Ankaa', etc. }, }, } ``` #### Speech Emphasis and Prosody Control [Section titled “Speech Emphasis and Prosody Control”](#speech-emphasis-and-prosody-control) You can use markdown-style formatting in your prompt to add emphasis: * Bold text (`**like this**`) for stronger emphasis * Italic text (`*like this*`) for moderate emphasis Example: ```ts prompt: 'Genkit is an **amazing** Gen AI *library*!' ``` #### Advanced Speech Parameters [Section titled “Advanced Speech Parameters”](#advanced-speech-parameters) For more precise control over the generated speech: ```ts speechConfig: { voiceConfig: { prebuiltVoiceConfig: { voiceName: 'Algenib', speakingRate: 1.0, // Range: 0.25 to 4.0, default is 1.0 pitch: 0.0, // Range: -20.0 to 20.0, default is 0.0 volumeGainDb: 0.0, // Range: -96.0 to 16.0, default is 0.0 }, }, } ``` * `speakingRate`: Controls the speed of speech (higher values = faster speech) * `pitch`: Adjusts the pitch of the voice (higher values = higher pitch) * `volumeGainDb`: Controls the volume (higher values = louder) #### SSML Support [Section titled “SSML Support”](#ssml-support) For advanced speech synthesis control, you can use Speech Synthesis Markup Language (SSML) in your prompts: ```ts prompt: `<speak>Here is a pause. <break time="1s"/> <prosody rate="slow" pitch="+2st">This text is spoken slowly and with a higher pitch.</prosody> <say-as interpret-as="cardinal">12345</say-as></speak>` ``` Note: When using SSML, you must wrap your entire prompt in `<speak>` tags.
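Putting the pieces together, here is a minimal sketch of a complete SSML request; it reuses the `ai` instance and voice configuration from the Basic Usage example above, and the SSML markup itself is only illustrative:

```ts
const ssmlResponse = await ai.generate({
  model: vertexAI.model('gemini-2.5-flash-preview-tts'),
  config: {
    responseModalities: ['AUDIO'],
    speechConfig: {
      voiceConfig: {
        prebuiltVoiceConfig: { voiceName: 'Algenib' },
      },
    },
  },
  // The entire prompt is wrapped in <speak> tags, as required for SSML.
  prompt: `<speak>Welcome to Genkit. <break time="500ms"/> Enjoy building with AI.</speak>`,
});
```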
For more detailed information about the Vertex AI TTS models and their configuration options, see the [Vertex AI Speech Generation documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/generate-speech). ## Context Caching [Section titled “Context Caching”](#context-caching) The Vertex AI Genkit plugin supports **Context Caching**, which allows models to reuse previously cached content to optimize token usage when dealing with large pieces of content. This feature is especially useful for conversational flows or scenarios where the model references a large piece of content consistently across multiple requests. ### How to Use Context Caching [Section titled “How to Use Context Caching”](#how-to-use-context-caching) To enable context caching, ensure your model supports it. For example, `gemini-2.5-flash` and `gemini-2.0-pro` are models that support context caching, and you will have to specify version number `001`. You can define a caching mechanism in your application like this: ```ts const ai = genkit({ plugins: [vertexAI({ location: 'us-central1' })], }); const llmResponse = await ai.generate({ messages: [ { role: 'user', content: [{ text: 'Here is the relevant text from War and Peace.' }], }, { role: 'model', content: [ { text: "Based on War and Peace, here is some analysis of Pierre Bezukhov's character.", }, ], metadata: { cache: { ttlSeconds: 300, // Cache this message for 5 minutes }, }, }, ], model: vertexAI.model('gemini-2.5-flash'), prompt: "Describe Pierre's transformation throughout the novel.", }); ``` In this setup: * **`messages`**: Allows you to pass conversation history. * **`metadata.cache.ttlSeconds`**: Specifies the time-to-live (TTL) for caching a specific response. ### Example: Leveraging Large Texts with Context [Section titled “Example: Leveraging Large Texts with Context”](#example-leveraging-large-texts-with-context) For applications referencing long documents, such as *War and Peace* or *Lord of the Rings*, you can structure your queries to reuse cached contexts: ```ts const textContent = await fs.readFile('path/to/war_and_peace.txt', 'utf-8'); const llmResponse = await ai.generate({ messages: [ { role: 'user', content: [{ text: textContent }], // Include the large text as context }, { role: 'model', content: [ { text: 'This analysis is based on the provided text from War and Peace.', }, ], metadata: { cache: { ttlSeconds: 300, // Cache the response to avoid reloading the full text }, }, }, ], model: vertexAI.model('gemini-2.5-flash'), prompt: 'Analyze the relationship between Pierre and Natasha.', }); ``` ### Benefits of Context Caching [Section titled “Benefits of Context Caching”](#benefits-of-context-caching) 1. **Improved Performance**: Reduces the need for repeated processing of large inputs. 2. **Cost Efficiency**: Decreases API usage for redundant data, optimizing token consumption. 3. **Better Latency**: Speeds up response times for repeated or related queries. ### Supported Models for Context Caching [Section titled “Supported Models for Context Caching”](#supported-models-for-context-caching) Only specific models, such as `gemini-2.5-flash` and `gemini-2.0-pro`, support context caching, and currently only on version numbers `001`. If an unsupported model is used, an error will be raised, indicating that caching cannot be applied. 
### Further Reading [Section titled “Further Reading”](#further-reading) For more information about context caching on Vertex AI, see the [documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/context-cache/context-cache-overview). # xAI Plugin > Learn how to configure and use the Genkit xAI plugin to access xAI (Grok) models. The `@genkit-ai/compat-oai` package includes a pre-configured plugin for [xAI (Grok)](https://x.ai/) models. The `xAI` plugin provides access to the `grok` family of models, including `grok-image` for image generation. Note The xAI plugin is built on top of the `openAICompatible` plugin. It is pre-configured for xAI’s API endpoints. ## Installation [Section titled “Installation”](#installation) ```bash npm install @genkit-ai/compat-oai ``` ## Configuration [Section titled “Configuration”](#configuration) To use this plugin, import `xAI` and specify it when you initialize Genkit: ```ts import { genkit } from 'genkit'; import { xAI } from '@genkit-ai/compat-oai/xai'; export const ai = genkit({ plugins: [xAI()], }); ``` You must provide an API key from xAI. You can get an API key from your [xAI account settings](https://console.x.ai/). Configure the plugin to use your API key by doing one of the following: * Set the `XAI_API_KEY` environment variable to your API key. * Specify the API key when you initialize the plugin: ```ts xAI({ apiKey: yourKey }); ``` As always, avoid embedding API keys directly in your code. ## Usage [Section titled “Usage”](#usage) Use the `xAI.model()` helper to reference a Grok model. ```ts import { genkit, z } from 'genkit'; import { xAI } from '@genkit-ai/compat-oai/xai'; const ai = genkit({ plugins: [xAI({ apiKey: process.env.XAI_API_KEY })], }); export const grokFlow = ai.defineFlow( { name: 'grokFlow', inputSchema: z.object({ subject: z.string() }), outputSchema: z.object({ fact: z.string() }), }, async ({ subject }) => { const llmResponse = await ai.generate({ model: xAI.model('grok-3-mini'), prompt: `tell me a fun fact about ${subject}`, }); return { fact: llmResponse.text }; }, ); ``` ## Advanced usage [Section titled “Advanced usage”](#advanced-usage) ### Passthrough configuration [Section titled “Passthrough configuration”](#passthrough-configuration) You can pass configuration options that are not defined in the plugin’s custom configuration schema. This permits you to access new models and features without having to update your Genkit version. ```ts import { genkit } from 'genkit'; import { xAI } from '@genkit-ai/compat-oai/xai'; const ai = genkit({ plugins: [xAI()], }); const llmResponse = await ai.generate({ prompt: `Tell me a cool story`, model: xAI.model('grok-new'), // hypothetical new model config: { new_feature_parameter: ... // hypothetical config needed for new model }, }); ``` Genkit passes this configuration as-is to the xAI API, giving you access to the new model features. Note that field names and types are not validated by Genkit and should match the xAI API specification to work. # Retrieval-augmented generation (RAG) > Learn how Genkit simplifies retrieval-augmented generation (RAG) by providing abstractions and plugins for indexers, embedders, and retrievers to incorporate external data into LLM responses. Genkit provides abstractions that help you build retrieval-augmented generation (RAG) flows, as well as plugins that provide integrations with related tools. ## What is RAG? 
[Section titled “What is RAG?”](#what-is-rag) Retrieval-augmented generation is a technique used to incorporate external sources of information into an LLM’s responses. It’s important to be able to do so because, while LLMs are typically trained on a broad body of material, practical use of LLMs often requires specific domain knowledge (for example, you might want to use an LLM to answer customers’ questions about your company’s products). One solution is to fine-tune the model using more specific data. However, this can be expensive both in terms of compute cost and in terms of the effort needed to prepare adequate training data. In contrast, RAG works by incorporating external data sources into a prompt at the time it’s passed to the model. For example, you could imagine the prompt, “What is Bart’s relationship to Lisa?” might be expanded (“augmented”) by prepending some relevant information, resulting in the prompt, “Homer and Marge’s children are named Bart, Lisa, and Maggie. What is Bart’s relationship to Lisa?” This approach has several advantages: * It can be more cost-effective because you don’t have to retrain the model. * You can continuously update your data source and the LLM can immediately make use of the updated information. * You now have the potential to cite references in your LLM’s responses. On the other hand, using RAG naturally means longer prompts, and some LLM API services charge for each input token you send. Ultimately, you must evaluate the cost tradeoffs for your applications. RAG is a very broad area and there are many different techniques used to achieve the best-quality RAG. The core Genkit framework offers three main abstractions to help you do RAG: * Indexers: add documents to an “index”. * Embedders: transform documents into a vector representation. * Retrievers: retrieve documents from an “index”, given a query. These definitions are broad on purpose because Genkit is unopinionated about what an “index” is or how exactly documents are retrieved from it. Genkit only provides a `Document` format and everything else is defined by the retriever or indexer implementation provider. ### Indexers [Section titled “Indexers”](#indexers) The index is responsible for keeping track of your documents in such a way that you can quickly retrieve relevant documents given a specific query. This is most often accomplished using a vector database, which indexes your documents using multidimensional vectors called embeddings. A text embedding (opaquely) represents the concepts expressed by a passage of text; these are generated using special-purpose ML models. By indexing text using its embedding, a vector database is able to cluster conceptually related text and retrieve documents related to a novel string of text (the query). Before you can retrieve documents for the purpose of generation, you need to ingest them into your document index. A typical ingestion flow does the following: 1. Split up large documents into smaller documents so that only relevant portions are used to augment your prompts – “chunking”. This is necessary because many LLMs have a limited context window, making it impractical to include entire documents with a prompt. Genkit doesn’t provide built-in chunking libraries; however, there are open source libraries available that are compatible with Genkit. 2. Generate embeddings for each chunk.
Depending on the database you’re using, you might explicitly do this with an embedding generation model, or you might use the embedding generator provided by the database. 3. Add the text chunk and its embedding to the database. You might run your ingestion flow infrequently or only once if you are working with a stable source of data. On the other hand, if you are working with data that frequently changes, you might continuously run the ingestion flow (for example, in a Cloud Firestore trigger, whenever a document is updated). ### Embedders [Section titled “Embedders”](#embedders) An embedder is a function that takes content (text, images, audio, etc.) and creates a numeric vector that encodes the semantic meaning of the original content. As mentioned above, embedders are leveraged as part of the process of indexing; however, they can also be used independently to create embeddings without an index. ### Retrievers [Section titled “Retrievers”](#retrievers) A retriever is a concept that encapsulates logic related to any kind of document retrieval. The most popular retrieval cases typically include retrieval from vector stores; however, in Genkit a retriever can be any function that returns data. To create a retriever, you can use one of the provided implementations or create your own. ## Supported indexers, retrievers, and embedders [Section titled “Supported indexers, retrievers, and embedders”](#supported-indexers-retrievers-and-embedders) Genkit provides indexer and retriever support through its plugin system. The following plugins are officially supported: * [Astra DB](/docs/plugins/astra-db) - DataStax Astra DB vector database * [Chroma DB](/docs/plugins/chroma) vector database * [Cloud Firestore vector store](/docs/plugins/firebase) * [Cloud SQL for PostgreSQL](/docs/plugins/cloud-sql-pg) with pgvector extension * [LanceDB](/docs/plugins/lancedb) open-source vector database * [Neo4j](/docs/plugins/neo4j) graph database with vector search * [Pinecone](/docs/plugins/pinecone) cloud vector database * [Vertex AI Vector Search](/docs/plugins/vertex-ai) In addition, Genkit supports the following vector stores through predefined code templates, which you can customize for your database configuration and schema: * PostgreSQL with [`pgvector`](/docs/templates/pgvector) ## Defining a RAG Flow [Section titled “Defining a RAG Flow”](#defining-a-rag-flow) The following examples show how you could ingest a collection of restaurant menu PDF documents into a vector database and retrieve them for use in a flow that determines what food items are available.
### Install dependencies for processing PDFs [Section titled “Install dependencies for processing PDFs”](#install-dependencies-for-processing-pdfs) ```bash npm install llm-chunk pdf-parse @genkit-ai/dev-local-vectorstore npm install --save-dev @types/pdf-parse ``` ### Add a local vector store to your configuration [Section titled “Add a local vector store to your configuration”](#add-a-local-vector-store-to-your-configuration) ```ts import { devLocalIndexerRef, devLocalVectorstore } from '@genkit-ai/dev-local-vectorstore'; import { googleAI } from '@genkit-ai/googleai'; import { z, genkit } from 'genkit'; const ai = genkit({ plugins: [ // googleAI provides the gemini-embedding-001 embedder googleAI(), // the local vector store requires an embedder to translate from text to vector devLocalVectorstore([ { indexName: 'menuQA', embedder: googleAI.embedder('gemini-embedding-001'), }, ]), ], }); ``` ### Define an Indexer [Section titled “Define an Indexer”](#define-an-indexer) The following example shows how to create an indexer to ingest a collection of PDF documents and store them in a local vector database. It uses the local file-based vector similarity retriever that Genkit provides out-of-the-box for simple testing and prototyping (*do not use in production*) #### Create the indexer [Section titled “Create the indexer”](#create-the-indexer) ```ts export const menuPdfIndexer = devLocalIndexerRef('menuQA'); ``` #### Create chunking config [Section titled “Create chunking config”](#create-chunking-config) This example uses the `llm-chunk` library which provides a simple text splitter to break up documents into segments that can be vectorized. The following definition configures the chunking function to guarantee a document segment of between 1000 and 2000 characters, broken at the end of a sentence, with an overlap between chunks of 100 characters. ```ts const chunkingConfig = { minLength: 1000, maxLength: 2000, splitter: 'sentence', overlap: 100, delimiters: '', } as any; ``` More chunking options for this library can be found in the [llm-chunk documentation](https://www.npmjs.com/package/llm-chunk). #### Define your indexer flow [Section titled “Define your indexer flow”](#define-your-indexer-flow) ```ts import { Document } from 'genkit/retriever'; import { chunk } from 'llm-chunk'; import { readFile } from 'fs/promises'; import path from 'path'; import pdf from 'pdf-parse'; async function extractTextFromPdf(filePath: string) { const pdfFile = path.resolve(filePath); const dataBuffer = await readFile(pdfFile); const data = await pdf(dataBuffer); return data.text; } export const indexMenu = ai.defineFlow( { name: 'indexMenu', inputSchema: z.object({ filePath: z.string().describe('PDF file path') }), outputSchema: z.object({ success: z.boolean(), documentsIndexed: z.number(), error: z.string().optional(), }), }, async ({ filePath }) => { try { filePath = path.resolve(filePath); // Read the pdf const pdfTxt = await ai.run('extract-text', () => extractTextFromPdf(filePath)); // Divide the pdf text into segments const chunks = await ai.run('chunk-it', async () => chunk(pdfTxt, chunkingConfig)); // Convert chunks of text into documents to store in the index. 
const documents = chunks.map((text) => { return Document.fromText(text, { filePath }); }); // Add documents to the index await ai.index({ indexer: menuPdfIndexer, documents, }); return { success: true, documentsIndexed: documents.length, }; } catch (err) { // For unexpected errors that throw exceptions return { success: false, documentsIndexed: 0, error: err instanceof Error ? err.message : String(err) }; } }, ); ``` #### Run the indexer flow [Section titled “Run the indexer flow”](#run-the-indexer-flow) ```bash genkit flow:run indexMenu '{"filePath": "menu.pdf"}' ``` After running the `indexMenu` flow, the vector database will be seeded with documents and ready to be used in Genkit flows with retrieval steps. ### Define a flow with retrieval [Section titled “Define a flow with retrieval”](#define-a-flow-with-retrieval) The following example shows how you might use a retriever in a RAG flow. Like the indexer example, this example uses Genkit’s file-based vector retriever, which you should not use in production. ```ts import { devLocalRetrieverRef } from '@genkit-ai/dev-local-vectorstore'; import { googleAI } from '@genkit-ai/googleai'; // Define the retriever reference export const menuRetriever = devLocalRetrieverRef('menuQA'); export const menuQAFlow = ai.defineFlow( { name: 'menuQA', inputSchema: z.object({ query: z.string() }), outputSchema: z.object({ answer: z.string() }) }, async ({ query }) => { // retrieve relevant documents const docs = await ai.retrieve({ retriever: menuRetriever, query, options: { k: 3 }, }); // generate a response const { text } = await ai.generate({ model: googleAI.model('gemini-2.5-flash'), prompt: ` You are acting as a helpful AI assistant that can answer questions about the food available on the menu at Genkit Grub Pub. Use only the context provided to answer the question. If you don't know, do not make up an answer. Do not add or change items on the menu. Question: ${query}`, docs, }); return { answer: text }; }, ); ``` #### Run the retriever flow [Section titled “Run the retriever flow”](#run-the-retriever-flow) ```bash genkit flow:run menuQA '{"query": "Recommend a dessert from the menu while avoiding dairy and nuts"}' ``` The output for this command should contain a response from the model, grounded in the indexed `menu.pdf` file. ## Write your own indexers and retrievers [Section titled “Write your own indexers and retrievers”](#write-your-own-indexers-and-retrievers) It’s also possible to create your own retriever. This is useful if your documents are managed in a document store that is not supported in Genkit (eg: MySQL, Google Drive, etc.). The Genkit SDK provides flexible methods that let you provide custom code for fetching documents. You can also define custom retrievers that build on top of existing retrievers in Genkit and apply advanced RAG techniques (such as reranking or prompt extensions) on top. 
### Simple Retrievers [Section titled “Simple Retrievers”](#simple-retrievers) Simple retrievers let you easily convert existing code into retrievers: ```ts import { z } from 'genkit'; import { searchEmails } from './db'; ai.defineSimpleRetriever( { name: 'myDatabase', configSchema: z .object({ limit: z.number().optional(), }) .optional(), // we'll extract "message" from the returned email item content: 'message', // and several keys to use as metadata metadata: ['from', 'to', 'subject'], }, async (query, config) => { const result = await searchEmails(query.text, { limit: config.limit }); return result.data.emails; }, ); ``` ### Custom Retrievers [Section titled “Custom Retrievers”](#custom-retrievers) ```ts import { CommonRetrieverOptionsSchema } from 'genkit/retriever'; import { z } from 'genkit'; export const menuRetriever = devLocalRetrieverRef('menuQA'); const advancedMenuRetrieverOptionsSchema = CommonRetrieverOptionsSchema.extend({ preRerankK: z.number().max(1000), }); const advancedMenuRetriever = ai.defineRetriever( { name: `custom/advancedMenuRetriever`, configSchema: advancedMenuRetrieverOptionsSchema, }, async (input, options) => { const extendedPrompt = await extendPrompt(input); const docs = await ai.retrieve({ retriever: menuRetriever, query: extendedPrompt, options: { k: options.preRerankK || 10 }, }); const rerankedDocs = await rerank(docs); return rerankedDocs.slice(0, options.k || 3); }, ); ``` (`extendPrompt` and `rerank` are functions you would have to implement yourself; they are not provided by the framework.) And then you can just swap out your retriever: ```ts const docs = await ai.retrieve({ retriever: advancedMenuRetriever, query: input, options: { preRerankK: 7, k: 3 }, }); ``` ### Rerankers and Two-Stage Retrieval [Section titled “Rerankers and Two-Stage Retrieval”](#rerankers-and-two-stage-retrieval) A reranking model — also known as a cross-encoder — is a type of model that, given a query and document, will output a similarity score. We use this score to reorder the documents by relevance to our query. Reranker APIs take a list of documents (for example, the output of a retriever) and reorder the documents based on their relevance to the query. This step can be useful for fine-tuning the results and ensuring that the most pertinent information is used in the prompt provided to a generative model. #### Reranker Example [Section titled “Reranker Example”](#reranker-example) A reranker in Genkit is defined with a syntax similar to retrievers and indexers. Here is an example using a reranker in Genkit. This flow reranks a set of documents based on their relevance to the provided query using a predefined Vertex AI reranker. ```ts const FAKE_DOCUMENT_CONTENT = [ 'pythagorean theorem', 'e=mc^2', 'pi', 'dinosaurs', 'quantum mechanics', 'pizza', 'harry potter', ]; export const rerankFlow = ai.defineFlow( { name: 'rerankFlow', inputSchema: z.object({ query: z.string() }), outputSchema: z.array( z.object({ text: z.string(), score: z.number(), }), ), }, async ({ query }) => { const documents = FAKE_DOCUMENT_CONTENT.map((text) => ({ content: text })); const rerankedDocuments = await ai.rerank({ reranker: 'vertexai/semantic-ranker-512', query: { content: query }, documents, }); return rerankedDocuments.map((doc) => ({ text: doc.content, score: doc.metadata.score, })); }, ); ``` This reranker uses the Vertex AI Genkit plugin with `semantic-ranker-512` to score and rank documents. The higher the score, the more relevant the document is to the query.
#### Custom Rerankers [Section titled “Custom Rerankers”](#custom-rerankers) You can also define custom rerankers to suit your specific use case. This is helpful when you need to rerank documents using your own custom logic or a custom model. Here’s a simple example of defining a custom reranker: ```ts export const customReranker = ai.defineReranker( { name: 'custom/reranker', configSchema: z.object({ k: z.number().optional(), }), }, async (query, documents, options) => { // Your custom reranking logic here const rerankedDocs = documents.map((doc) => { const score = Math.random(); // Assign random scores for demonstration return { ...doc, metadata: { ...doc.metadata, score }, }; }); return rerankedDocs.sort((a, b) => b.metadata.score - a.metadata.score).slice(0, options.k || 3); }, ); ``` Once defined, this custom reranker can be used just like any other reranker in your RAG flows, giving you flexibility to implement advanced reranking strategies. # pgvector retriever template > This document provides a template for using PostgreSQL and pgvector as a retriever implementation in Genkit, with examples for configuration and usage in flows. You can use PostgreSQL and `pgvector` as your retriever implementation. Use the following example as a starting point and modify it to work with your database schema. ```ts import { genkit, z, Document } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; import { toSql } from 'pgvector'; import postgres from 'postgres'; const ai = genkit({ plugins: [googleAI()], }); const sql = postgres({ ssl: false, database: 'recaps' }); const QueryOptions = z.object({ show: z.string(), k: z.number().optional(), }); const sqlRetriever = ai.defineRetriever( { name: 'pgvector-myTable', configSchema: QueryOptions, }, async (input, options) => { const embedding = ( await ai.embed({ embedder: googleAI.embedder('gemini-embedding-001'), content: input, }) )[0].embedding; const results = await sql` SELECT episode_id, season_number, chunk as content FROM embeddings WHERE show_id = ${options.show} ORDER BY embedding <#> ${toSql(embedding)} LIMIT ${options.k ?? 3} `; return { documents: results.map((row) => { const { content, ...metadata } = row; return Document.fromText(content, metadata); }), }; }, ); ``` And here’s how to use the retriever in a flow: ```ts // Simple flow to use the sqlRetriever export const askQuestionsOnGoT = ai.defineFlow( { name: 'askQuestionsOnGoT', inputSchema: z.object({ question: z.string() }), outputSchema: z.object({ answer: z.string() }), }, async ({ question }) => { const docs = await ai.retrieve({ retriever: sqlRetriever, query: question, options: { show: 'Game of Thrones', }, }); console.log(docs); // Continue with using retrieved docs // in RAG prompts. // For example: const { text } = await ai.generate({ prompt: `Answer this question using the provided context: ${question}`, docs, }); return { answer: text }; }, ); ``` # Tool calling > Learn how to enable LLMs to interact with external applications and data using Genkit's tool calling feature, covering tool definition, usage, and advanced scenarios. *Tool calling*, also known as *function calling*, is a structured way to give LLMs the ability to make requests back to the application that called it. You define the tools you want to make available to the model, and the model will make tool requests to your app as necessary to fulfill the prompts you give it. 
The use cases of tool calling generally fall into a few themes: **Giving an LLM access to information it wasn’t trained with** * Frequently changing information, such as a stock price or the current weather. * Information specific to your app domain, such as product information or user profiles. Note the overlap with [retrieval augmented generation](/docs/rag) (RAG), which is also a way to let an LLM integrate factual information into its generations. RAG is a heavier solution that is best suited when you have a large amount of information, or when the information that’s most relevant to a prompt is ambiguous. On the other hand, if retrieving the information the LLM needs is a simple function call or database lookup, tool calling is more appropriate. **Introducing a degree of determinism into an LLM workflow** * Performing calculations that the LLM cannot reliably complete itself. * Forcing an LLM to generate verbatim text under certain circumstances, such as when responding to a question about an app’s terms of service. **Performing an action when initiated by an LLM** * Turning lights on and off in an LLM-powered home assistant * Making table reservations in an LLM-powered restaurant agent ## Before you begin [Section titled “Before you begin”](#before-you-begin) If you want to run the code examples on this page, first complete the steps in the [Getting started](/docs/get-started) guide. All of the examples assume that you have already set up a project with Genkit dependencies installed. This page discusses one of the advanced features of Genkit’s model abstraction, so before you dive too deeply, you should be familiar with the content on the [Generating content with AI models](/docs/models) page. You should also be familiar with Genkit’s system for defining input and output schemas, which is discussed on the [Flows](/docs/flows) page. ## Overview of tool calling [Section titled “Overview of tool calling”](#overview-of-tool-calling) [Genkit by Example: Tool Calling ](https://examples.genkit.dev/tool-calling?utm_source=genkit.dev\&utm_content=contextlink)See how Genkit can enable rich UI for tool calling in a live demo. At a high level, this is what a typical tool-calling interaction with an LLM looks like: 1. The calling application prompts the LLM with a request and also includes in the prompt a list of tools the LLM can use to generate a response. 2. The LLM either generates a complete response or generates a tool call request in a specific format. 3. If the caller receives a complete response, the request is fulfilled and the interaction ends; but if the caller receives a tool call, it performs whatever logic is appropriate and sends a new request to the LLM containing the original prompt or some variation of it as well as the result of the tool call. 4. The LLM handles the new prompt as in Step 2. For this to work, several requirements must be met: * The model must be trained to make tool requests when one is needed to complete a prompt. Most of the larger models provided through web APIs, such as Gemini and Claude, can do this, but smaller and more specialized models often cannot. Genkit will throw an error if you try to provide tools to a model that doesn’t support it. * The calling application must provide tool definitions to the model in the format it expects. * The calling application must prompt the model to generate tool calling requests in the format the application expects.
## Tool calling with Genkit [Section titled “Tool calling with Genkit”](#tool-calling-with-genkit) Genkit provides a single interface for tool calling with models that support it. Each model plugin ensures that the last two of the above criteria are met, and the Genkit instance’s `generate()` function automatically carries out the tool calling loop described earlier. ### Model support [Section titled “Model support”](#model-support) Tool calling support depends on the model, the model API, and the Genkit plugin. Consult the relevant documentation to determine if tool calling is likely to be supported. In addition: * Genkit will throw an error if you try to provide tools to a model that doesn’t support it. * If the plugin exports model references, the `info.supports.tools` property will indicate if it supports tool calling. ### Defining tools [Section titled “Defining tools”](#defining-tools) Use the Genkit instance’s `defineTool()` function to write tool definitions: ```ts import { genkit, z } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [googleAI()], model: googleAI.model('gemini-2.5-flash'), }); const getWeather = ai.defineTool( { name: 'getWeather', description: 'Gets the current weather in a given location', inputSchema: z.object({ location: z.string().describe('The location to get the current weather for'), }), outputSchema: z.string(), }, async (input) => { // Here, we would typically make an API call or database query. For this // example, we just return a fixed value. return `The current weather in ${input.location} is 63°F and sunny.`; }, ); ``` The syntax here looks just like the `defineFlow()` syntax; however, `name`, `description`, and `inputSchema` parameters are required. When writing a tool definition, take special care with the wording and descriptiveness of these parameters. They are vital for the LLM to make effective use of the available tools. ### Using tools [Section titled “Using tools”](#using-tools) Include defined tools in your prompts to generate content. * Generate ```ts const response = await ai.generate({ prompt: "What is the weather in Baltimore?", tools: [getWeather], }); ``` * definePrompt ```ts const weatherPrompt = ai.definePrompt( { name: "weatherPrompt", tools: [getWeather], }, "What is the weather in {{location}}?" ); const response = await weatherPrompt({ location: "Baltimore" }); ``` * Prompt file ```dotprompt --- tools: [getWeather] input: schema: location: string --- What is the weather in {{location}}? ``` Then you can execute the prompt in your code as follows: ```ts // assuming prompt file is named weatherPrompt.prompt const weatherPrompt = ai.prompt("weatherPrompt"); const response = await weatherPrompt({ location: "Baltimore" }); ``` * Chat ```ts const chat = ai.chat({ system: "Answer questions using the tools you have.", tools: [getWeather], }); const response = await chat.send("What is the weather in Baltimore?"); // Or, specify tools that are message-specific const response = await chat.send({ prompt: "What is the weather in Baltimore?", tools: [getWeather], }); ``` ### Streaming and Tool Calling [Section titled “Streaming and Tool Calling”](#streaming-and-tool-calling) When combining tool calling with streaming responses, you will receive `toolRequest` and `toolResponse` content parts in the chunks of the stream.
For example, the following code: ```ts const { stream } = ai.generateStream({ prompt: "What is the weather in Baltimore?", tools: [getWeather], }); for await (const chunk of stream) { console.log(chunk); } ``` Might produce a sequence of chunks similar to: ```ts {index: 0, role: "model", content: [{text: "Okay, I'll check the weather"}]} {index: 0, role: "model", content: [{text: "for Baltimore."}]} // toolRequests will be emitted as a single chunk by most models {index: 0, role: "model", content: [{toolRequest: {name: "getWeather", input: {location: "Baltimore"}}}]} // when streaming multiple messages, Genkit increments the index and indicates the new role {index: 1, role: "tool", content: [{toolResponse: {name: "getWeather", output: "Temperature: 68 degrees\nStatus: Cloudy."}}]} {index: 2, role: "model", content: [{text: "The weather in Baltimore is 68 degrees and cloudy."}]} ``` You can use these chunks to dynamically construct the full generated message sequence. ### Limiting Tool Call Iterations with `maxTurns` [Section titled “Limiting Tool Call Iterations with maxTurns”](#limiting-tool-call-iterations-with-maxturns) When working with tools that might trigger multiple sequential calls, you can control resource usage and prevent runaway execution using the `maxTurns` parameter. This sets a hard limit on how many back-and-forth interactions the model can have with your tools in a single generation cycle. **Why use maxTurns?** * **Cost Control**: Prevents unexpected API usage charges from excessive tool calls * **Performance**: Ensures responses complete within reasonable timeframes * **Safety**: Guards against infinite loops in complex tool interactions * **Predictability**: Makes your application behavior more deterministic The default value is 5 turns, which works well for most scenarios. Each “turn” represents one complete cycle where the model can make tool calls and receive responses. 
**Example: Web Research Agent** Consider a research agent that might need to search multiple times to find comprehensive information: ```ts const webSearch = ai.defineTool( { name: 'webSearch', description: 'Search the web for current information', inputSchema: z.object({ query: z.string().describe('Search query'), }), outputSchema: z.string(), }, async (input) => { // Simulate web search API call return `Search results for "${input.query}": [relevant information here]`; }, ); const response = await ai.generate({ prompt: 'Research the latest developments in quantum computing, including recent breakthroughs, key companies, and future applications.', tools: [webSearch], maxTurns: 8, // Allow up to 8 research iterations }); ``` **Example: Financial Calculator** Here’s a more complex scenario where an agent might need multiple calculation steps: ```ts const calculator = ai.defineTool( { name: 'calculator', description: 'Perform mathematical calculations', inputSchema: z.object({ expression: z.string().describe('Mathematical expression to evaluate'), }), outputSchema: z.number(), }, async (input) => { // Safe evaluation of mathematical expressions return eval(input.expression); // In production, use a safe math parser }, ); const stockAnalyzer = ai.defineTool( { name: 'stockAnalyzer', description: 'Get current stock price and basic metrics', inputSchema: z.object({ symbol: z.string().describe('Stock symbol (e.g., AAPL)'), }), outputSchema: z.object({ price: z.number(), change: z.number(), volume: z.number(), }), }, async (input) => { // Simulate stock API call return { price: 150.25, change: 2.50, volume: 45000000 }; }, ); ``` * Generate ```typescript const response = await ai.generate({ prompt: 'Calculate the total value of my portfolio: 100 shares of AAPL, 50 shares of GOOGL, and 200 shares of MSFT. Also calculate what percentage each holding represents.', tools: [calculator, stockAnalyzer], maxTurns: 12, // Multiple stock lookups + calculations needed }); ``` * definePrompt ```typescript const portfolioAnalysisPrompt = ai.definePrompt( { name: "portfolioAnalysis", tools: [calculator, stockAnalyzer], maxTurns: 12, }, "Calculate the total value of my portfolio: {{holdings}}. Also calculate what percentage each holding represents." ); const response = await portfolioAnalysisPrompt({ holdings: "100 shares of AAPL, 50 shares of GOOGL, and 200 shares of MSFT" }); ``` * Prompt file ```dotprompt --- tools: [calculator, stockAnalyzer] maxTurns: 12 input: schema: holdings: string --- Calculate the total value of my portfolio: {{holdings}}. Also calculate what percentage each holding represents. ``` Then execute the prompt: ```typescript const portfolioAnalysisPrompt = ai.prompt("portfolioAnalysis"); const response = await portfolioAnalysisPrompt({ holdings: "100 shares of AAPL, 50 shares of GOOGL, and 200 shares of MSFT" }); ``` * Chat ```typescript const chat = ai.chat({ system: "You are a financial analysis assistant. Use the available tools to provide accurate calculations and current market data.", tools: [calculator, stockAnalyzer], maxTurns: 12, }); const response = await chat.send("Calculate the total value of my portfolio: 100 shares of AAPL, 50 shares of GOOGL, and 200 shares of MSFT. Also calculate what percentage each holding represents."); ``` **What happens when maxTurns is reached?** When the limit is hit, Genkit stops the tool-calling loop and returns the model’s current response, even if it was in the middle of using tools. 
The model will typically provide a partial answer or explain that it couldn’t complete all the requested operations. ### Dynamically defining tools at runtime [Section titled “Dynamically defining tools at runtime”](#dynamically-defining-tools-at-runtime) As with most things in Genkit, tools are normally predefined during your app’s initialization. This is necessary so that you can interact with your tools from the Genkit Dev UI, and it is the recommended approach for most cases. However, there are scenarios where a tool must be defined dynamically, per user request. You can define tools dynamically using the `ai.dynamicTool` function. It is very similar to the `ai.defineTool` method; however, dynamic tools are not tracked by the Genkit runtime, so they cannot be interacted with from the Genkit Dev UI and must be passed to the `ai.generate` call by reference (for regular tools you can also use a string tool name). ```ts import { genkit, z } from 'genkit'; import { googleAI } from '@genkit-ai/googleai'; const ai = genkit({ plugins: [googleAI()], model: googleAI.model('gemini-2.5-flash'), }); ai.defineFlow('weatherFlow', async () => { const getWeather = ai.dynamicTool( { name: 'getWeather', description: 'Gets the current weather in a given location', inputSchema: z.object({ location: z.string().describe('The location to get the current weather for'), }), outputSchema: z.string(), }, async (input) => { return `The current weather in ${input.location} is 63°F and sunny.`; }, ); const { text } = await ai.generate({ prompt: 'What is the weather in Baltimore?', tools: [getWeather], }); return text; }); ``` When defining dynamic tools, you can specify input and output schemas either with Zod, as shown in the previous example, or by passing in manually constructed JSON Schema. ```ts const getWeather = ai.dynamicTool( { name: 'getWeather', description: 'Gets the current weather in a given location', inputJsonSchema: myInputJsonSchema, outputJsonSchema: myOutputJsonSchema, }, async (input) => { /* ... */ }, ); ``` Dynamic tools don’t require an implementation function. If you don’t pass in a function, the tool will behave like an [interrupt](/docs/interrupts), and you can handle the tool call manually: ```ts const getWeather = ai.dynamicTool({ name: 'getWeather', description: 'Gets the current weather in a given location', inputJsonSchema: myInputJsonSchema, outputJsonSchema: myOutputJsonSchema, }); ``` ### Pause the tool loop by using interrupts [Section titled “Pause the tool loop by using interrupts”](#pause-the-tool-loop-by-using-interrupts) By default, Genkit repeatedly calls the LLM until every tool call has been resolved. You can conditionally pause execution in situations where you want to, for example: * Ask the user a question or display UI. * Confirm a potentially risky action with the user. * Request out-of-band approval for an action. **Interrupts** are special tools that can halt the loop and return control to your code so that you can handle more advanced scenarios. Visit the [interrupts guide](/docs/interrupts) to learn how to use them. ### Explicitly handling tool calls [Section titled “Explicitly handling tool calls”](#explicitly-handling-tool-calls) If you want full control over this tool-calling loop, for example to apply more complicated logic, set the `returnToolRequests` parameter to `true`. Now it’s your responsibility to ensure all of the tool requests are fulfilled: ```ts const getWeather = ai.defineTool( { // ... tool definition ... }, async ({ location }) => { // ... tool implementation ...
}, ); const generateOptions: GenerateOptions = { prompt: "What's the weather like in Baltimore?", tools: [getWeather], returnToolRequests: true, }; let llmResponse; while (true) { llmResponse = await ai.generate(generateOptions); const toolRequests = llmResponse.toolRequests; if (toolRequests.length < 1) { break; } const toolResponses: ToolResponsePart[] = await Promise.all( toolRequests.map(async (part) => { switch (part.toolRequest.name) { case 'getWeather': return { toolResponse: { name: part.toolRequest.name, ref: part.toolRequest.ref, output: await getWeather(part.toolRequest.input), }, }; default: throw Error('Tool not found'); } }), ); generateOptions.messages = llmResponse.messages; generateOptions.prompt = toolResponses; } ``` # Chat with a PDF file > Learn how to build a conversational application that allows users to extract information from PDF documents using natural language. This tutorial demonstrates how to build a conversational application that allows users to extract information from PDF documents using natural language. 1. [Set up your project](#1-set-up-your-project) 2. [Import the required dependencies](#2-import-the-required-dependencies) 3. [Configure Genkit and the default model](#3-configure-genkit-and-the-default-model) 4. [Load and parse the PDF file](#4-load-and-parse-the-pdf) 5. [Set up the prompt](#5-set-up-the-prompt) 6. [Implement the UI](#6-implement-the-ui) 7. [Implement the chat loop](#7-implement-the-chat-loop) 8. [Run the app](#8-run-the-app) ## Prerequisites [Section titled “Prerequisites”](#prerequisites) Before starting work, you should have these prerequisites set up: * [Node.js v20+](https://nodejs.org/en/download) * [npm](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm) ## Implementation Steps [Section titled “Implementation Steps”](#implementation-steps) After setting up your dependencies, you can build the project. ### 1. Set up your project [Section titled “1. Set up your project”](#1-set-up-your-project) 1. Create a directory structure and a file to hold your source code. ```bash mkdir -p chat-with-a-pdf/src && \ cd chat-with-a-pdf && \ touch src/index.ts ``` 2. Initialize a new TypeScript project. ```bash npm init -y ``` 3. Install the pdf-parse module. ```bash npm install pdf-parse && npm install --save-dev @types/pdf-parse ``` 4. Install the following Genkit dependencies to use Genkit in your project: ```bash npm install genkit @genkit-ai/googleai ``` * `genkit` provides Genkit core capabilities. * `@genkit-ai/googleai` provides access to the Google AI Gemini models. 5. Get and configure your model API key To use the Gemini API, which this tutorial uses, you must first configure an API key. If you don’t already have one, [create a key](https://makersuite.google.com/app/apikey) in Google AI Studio. The Gemini API provides a generous free-of-charge tier and does not require a credit card to get started. After creating your API key, set the `GEMINI_API_KEY` environment variable to your key with the following command: ```bash export GEMINI_API_KEY= ``` Note Although this tutorial uses the Gemini API from AI Studio, Genkit supports a wide variety of model providers, including: * [Gemini from Vertex AI.](/docs/plugins/vertex-ai#generative-ai-models) * Anthropic’s Claude 3 models and Llama 3.1 through the [Vertex AI Model Garden](/docs/plugins/vertex-ai#anthropic-claude-3-on-vertex-ai-model-garden), as well as community plugins. * Open source models through [Ollama](/docs/plugins/ollama).
* [Community-supported providers](/docs/models#models-supported-by-genkit) such as OpenAI and Cohere. ### 2. Import the required dependencies [Section titled “2. Import the required dependencies”](#2-import-the-required-dependencies) In the `index.ts` file that you created, add the following lines to import the dependencies required for this project: ```typescript import { googleAI } from '@genkit-ai/googleai'; import { genkit } from 'genkit/beta'; // chat is a beta feature import pdf from 'pdf-parse'; import fs from 'fs'; import { createInterface } from 'node:readline/promises'; ``` * The first line imports the `googleAI` plugin from the `@genkit-ai/googleai` package, enabling access to Google’s Gemini models. * The second line imports the `genkit` beta entry point; the chat feature used in this tutorial is a beta feature. * The next two lines import the `pdf-parse` library for parsing PDF files and the `fs` module for file system operations. * The final line imports the `createInterface` function from the `node:readline/promises` module, which is used to create a command-line interface for user interaction. ### 3. Configure Genkit and the default model [Section titled “3. Configure Genkit and the default model”](#3-configure-genkit-and-the-default-model) Add the following lines to configure Genkit and set Gemini 2.5 Flash as the default model. ```typescript const ai = genkit({ plugins: [googleAI()], model: googleAI.model('gemini-2.5-flash'), }); ``` You can then add a skeleton for the code and error-handling. ```typescript (async () => { try { // Step 1: get command line arguments // Step 2: load PDF file // Step 3: construct prompt // Step 4: start chat // Step 5: chat loop } catch (error) { console.error('Error parsing PDF or interacting with Genkit:', error); } })(); // <-- don't forget the trailing parentheses to call the function! ``` ### 4. Load and parse the PDF [Section titled “4. Load and parse the PDF”](#4-load-and-parse-the-pdf) 1. Add code to read the PDF filename that was passed in from the command line. ```typescript // Step 1: get command line arguments const filename = process.argv[2]; if (!filename) { console.error('Please provide a filename as a command line argument.'); process.exit(1); } ``` 2. Add code to load the contents of the PDF file. ```typescript // Step 2: load PDF file let dataBuffer = fs.readFileSync(filename); const { text } = await pdf(dataBuffer); ``` ### 5. Set up the prompt [Section titled “5. Set up the prompt”](#5-set-up-the-prompt) Add code to set up the prompt: ```typescript // Step 3: construct prompt const prefix = process.argv[3] || "Sample prompt: Answer the user's questions about the contents of this PDF file."; const prompt = ` ${prefix} Context: ${text} `; ``` * The first `const` declaration defines a default prompt if the user doesn’t pass in one of their own from the command line. * The second `const` declaration interpolates the prompt prefix and the full text of the PDF file into the prompt for the model. ### 6. Implement the UI [Section titled “6. Implement the UI”](#6-implement-the-ui) Add the following code to start the chat and implement the UI: ```typescript // Step 4: start chat const chat = ai.chat({ system: prompt }); const readline = createInterface(process.stdin, process.stdout); console.log("You're chatting with Gemini. Ctrl-C to quit.\n"); ``` The first `const` declaration starts the chat with the model by calling the `chat` method, passing the prompt (which includes the full text of the PDF file). The rest of the code creates a command-line interface for reading user input, then displays a message to the user. ### 7. Implement the chat loop [Section titled “7.
Implement the chat loop”](#7-implement-the-chat-loop) Under Step 5, add code to receive user input and send that input to the model using `chat.send`. This part of the app loops until the user presses *CTRL + C*. ```typescript // Step 5: chat loop while (true) { const userInput = await readline.question('> '); const { text } = await chat.send(userInput); console.log(text); } ``` ### 8. Run the app [Section titled “8. Run the app”](#8-run-the-app) To run the app, open the terminal in the root folder of your project, then run the following command: ```bash npx tsx src/index.ts path/to/some.pdf ``` You can then start chatting with the PDF file. # Summarize YouTube videos > Learn how to build a conversational application that allows users to summarize YouTube videos and chat about their contents using natural language. This tutorial demonstrates how to build a conversational application that allows users to summarize YouTube videos and chat about their contents using natural language. 1. [Set up your project](#1-set-up-your-project) 2. [Import the required dependencies](#2-import-the-required-dependencies) 3. [Configure Genkit and the default model](#3-configure-genkit-and-the-default-model) 4. [Get the video URL from the command line](#4-parse-the-command-line-and-get-the-video-url) 5. [Set up the prompt](#5-set-up-the-prompt) 6. [Generate the response](#6-generate-the-response) 7. [Run the app](#7-run-the-app) ## Prerequisites [Section titled “Prerequisites”](#prerequisites) Before starting work, you should have these prerequisites set up: * [Node.js v20+](https://nodejs.org/en/download) * [npm](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm) ## Implementation Steps [Section titled “Implementation Steps”](#implementation-steps) After setting up your dependencies, you can build the project. ### 1. Set up your project [Section titled “1. Set up your project”](#1-set-up-your-project) 1. Create a directory structure and a file to hold your source code. ```bash mkdir -p summarize-a-video/src && \ cd summarize-a-video && \ touch src/index.ts ``` 2. Initialize a new TypeScript project. ```bash npm init -y ``` 3. Install the following Genkit dependencies to use Genkit in your project: ```bash npm install genkit @genkit-ai/googleai ``` * `genkit` provides Genkit core capabilities. * `@genkit-ai/googleai` provides access to the Google AI Gemini models. 4. Get and configure your model API key To use the Gemini API, which this tutorial uses, you must first configure an API key. If you don’t already have one, [create a key](https://makersuite.google.com/app/apikey) in Google AI Studio. The Gemini API provides a generous free-of-charge tier and does not require a credit card to get started. After creating your API key, set the `GEMINI_API_KEY` environment variable to your key with the following command: ```bash export GEMINI_API_KEY= ``` Note Although this tutorial uses the Gemini API from AI Studio, Genkit supports a wide variety of model providers, including: * [Gemini from Vertex AI.](/docs/plugins/vertex-ai#generative-ai-models) * Anthropic’s Claude 3 models and Llama 3.1 through the [Vertex AI Model Garden](/docs/plugins/vertex-ai#anthropic-claude-3-on-vertex-ai-model-garden), as well as community plugins. * Open source models through [Ollama](/docs/plugins/ollama). * [Community-supported providers](/docs/models#models-supported-by-genkit) such as OpenAI and Cohere. ### 2. Import the required dependencies [Section titled “2.
Import the required dependencies”](#2-import-the-required-dependencies) In the `index.ts` file that you created, add the following lines to import the dependencies required for this project: ```typescript import { googleAI } from '@genkit-ai/googleai'; import { genkit } from 'genkit'; ``` * The first line imports the `googleAI` plugin from the `@genkit-ai/googleai` package, enabling access to Google’s Gemini models. * The second line imports the core `genkit` library. ### 3. Configure Genkit and the default model [Section titled “3. Configure Genkit and the default model”](#3-configure-genkit-and-the-default-model) Add the following lines to configure Genkit and set Gemini 2.5 Flash as the default model. ```typescript const ai = genkit({ plugins: [googleAI()], model: googleAI.model('gemini-2.5-flash'), }); ``` You can then add a skeleton for the code and error-handling. ```typescript (async () => { try { // Step 1: get command line arguments // Step 2: construct prompt // Step 3: process video } catch (error) { console.error('Error processing video:', error); } })(); // <-- don't forget the trailing parentheses to call the function! ``` ### 4. Parse the command line and get the video URL [Section titled “4. Parse the command line and get the video URL”](#4-parse-the-command-line-and-get-the-video-url) Add code to read the URL of the video that was passed in from the command line. ```typescript // Step 1: get command line arguments const videoURL = process.argv[2]; if (!videoURL) { console.error('Please provide a video URL as a command line argument.'); process.exit(1); } ``` ### 5. Set up the prompt [Section titled “5. Set up the prompt”](#5-set-up-the-prompt) Add code to set up the prompt: ```typescript // Step 2: construct prompt const prompt = process.argv[3] || 'Please summarize the following video:'; ``` * This `const` declaration defines a default prompt if the user doesn’t pass in one of their own from the command line. ### 6. Generate the response [Section titled “6. Generate the response”](#6-generate-the-response) Add the following code to pass a multimodal prompt to the model: ```typescript // Step 3: process video const { text } = await ai.generate({ prompt: [{ text: prompt }, { media: { url: videoURL, contentType: 'video/mp4' } }], }); console.log(text); ``` This code snippet calls the `ai.generate` method to send a multimodal prompt to the model. The prompt consists of two parts: * `{ text: prompt }`: This is the text prompt that you defined earlier. * `{ media: { url: videoURL, contentType: "video/mp4" } }`: This is the URL of the video that you provided as a command-line argument. The `contentType` is set to `video/mp4` to indicate that the URL points to an MP4 video file. The `ai.generate` method returns an object containing the generated text, which is then logged to the console. ### 7. Run the app [Section titled “7. Run the app”](#7-run-the-app) To run the app, open the terminal in the root folder of your project, then run the following command: ```bash npx tsx src/index.ts https://www.youtube.com/watch\?v\=YUgXJkNqH9Q ``` After a moment, a summary of the video you provided appears. You can pass in other prompts as well. For example: ```bash npx tsx src/index.ts https://www.youtube.com/watch\?v\=YUgXJkNqH9Q "Transcribe this video" ``` Note If you get an error message saying “no matches found”, you might need to wrap the video URL in quotes. # Genkit with Cloud Run > Learn how to deploy Genkit Go flows as web services using Cloud Run. You can deploy Genkit flows as web services using Cloud Run.
This page, as an example, walks you through the process of deploying the default sample flow. 1. Install the [Google Cloud CLI](https://cloud.google.com/sdk/docs/install) if you haven’t already. 2. Create a new Google Cloud project using the [Cloud console](https://console.cloud.google.com) or choose an existing one. The project must be linked to a billing account. After you create or choose a project, configure the Google Cloud CLI to use it: ```bash gcloud auth login gcloud init ``` 3. Create a directory for the Genkit sample project: ```bash mkdir -p ~/tmp/genkit-cloud-project cd ~/tmp/genkit-cloud-project ``` If you’re going to use an IDE, open it to this directory. 4. Initialize a Go module in your project directory: ```bash go mod init example/cloudrun go get github.com/firebase/genkit/go ``` 5. Create a sample app using Genkit: ```go package main import ( "context" "fmt" "log" "net/http" "os" "github.com/firebase/genkit/go/ai" "github.com/firebase/genkit/go/genkit" "github.com/firebase/genkit/go/plugins/googlegenai" "github.com/firebase/genkit/go/plugins/server" ) func main() { ctx := context.Background() // Initialize Genkit with the Google AI plugin and Gemini 2.5 Flash. // Alternatively, use &googlegenai.VertexAI{} and "vertexai/gemini-2.5-flash" // to use Vertex AI as the provider instead. g, err := genkit.Init(ctx, genkit.WithPlugins(&googlegenai.GoogleAI{}), genkit.WithDefaultModel("googleai/gemini-2.5-flash"), ) if err != nil { log.Fatalf("failed to initialize Genkit: %v", err) } flow := genkit.DefineFlow(g, "jokesFlow", func(ctx context.Context, topic string) (string, error) { resp, err := genkit.Generate(ctx, g, ai.WithPrompt(`Tell a short joke about %s. Be creative!`, topic), ) if err != nil { return "", fmt.Errorf("failed to generate joke: %w", err) } return resp.Text(), nil }) mux := http.NewServeMux() mux.HandleFunc("POST /jokesFlow", genkit.Handler(flow)) log.Fatal(server.Start(ctx, "0.0.0.0:"+os.Getenv("PORT"), mux)) } ``` 6. Make API credentials available to your deployed function. Choose which credentials you need based on your choice in the sample above: Gemini (Google AI) 1. Make sure Google AI is [available in your region](https://ai.google.dev/available_regions). 2. [Generate an API key](https://aistudio.google.com/app/apikey) for the Gemini API using Google AI Studio. 3. Make the API key available in the Cloud Run environment: 1. In the Cloud console, enable the [Secret Manager API](https://console.cloud.google.com/apis/library/secretmanager.googleapis.com?project=_). 2. On the [Secret Manager](https://console.cloud.google.com/security/secret-manager?project=_) page, create a new secret containing your API key. 3. After you create the secret, on the same page, grant your default compute service account access to the secret with the **Secret Manager Secret Accessor** role. (You can look up the name of the default compute service account on the IAM page.) In a later step, when you deploy your service, you will need to reference the name of this secret. Gemini (Vertex AI) 1. In the Cloud console, [Enable the Vertex AI API](https://console.cloud.google.com/apis/library/aiplatform.googleapis.com?project=_) for your project. 2. On the [IAM](https://console.cloud.google.com/iam-admin/iam?project=_) page, ensure that the **Default compute service account** is granted the **Vertex AI User** role. The only secret you need to set up for this tutorial is for the model provider, but in general, you must do something similar for each service your flow uses. 7.
**Optional**: Try your flow in the developer UI: 1. Set up your local environment for the model provider you chose: Gemini (Google AI) ```bash export GEMINI_API_KEY= ``` Gemini (Vertex AI) ```bash export GOOGLE_CLOUD_PROJECT= export GOOGLE_CLOUD_LOCATION=us-central1 gcloud auth application-default login ``` 2. Start the UI: ```bash genkit start -- go run . ``` 3. In the developer UI (`http://localhost:4000/`), run the flow: 1. Click **jokesFlow**. 2. On the **Input JSON** tab, provide a subject for the model: ```json "bananas" ``` 3. Click **Run**. 8. If everything’s working as expected so far, you can build and deploy the flow: Gemini (Google AI) ```bash gcloud run deploy --port 3400 \ --update-secrets=GEMINI_API_KEY=:latest ``` Gemini (Vertex AI) ```bash gcloud run deploy --port 3400 \ --set-env-vars GOOGLE_CLOUD_PROJECT= \ --set-env-vars GOOGLE_CLOUD_LOCATION=us-central1 ``` (`GOOGLE_CLOUD_LOCATION` configures the Vertex API region you want to use.) Choose `N` when asked if you want to allow unauthenticated invocations. Answering `N` will configure your service to require IAM credentials. See [Authentication](https://cloud.google.com/run/docs/authenticating/overview) in the Cloud Run docs for information on providing these credentials. After deployment finishes, the tool will print the service URL. You can test it with `curl`: ```bash curl -X POST https:///jokesFlow \ -H "Authorization: Bearer $(gcloud auth print-identity-token)" \ -H "Content-Type: application/json" -d '{"data": "bananas"}' ``` # Deploy flows to any app hosting platform > Learn how to deploy Genkit Go flows as web services using any service that can host a Go binary. You can deploy Genkit flows as web services using any service that can host a Go binary. This page, as an example, walks you through the general process of deploying the default sample flow, and points out where you must take provider-specific actions. 1. Create a directory for the Genkit sample project: ```bash mkdir -p ~/tmp/genkit-cloud-project cd ~/tmp/genkit-cloud-project ``` If you’re going to use an IDE, open it to this directory. 2. Initialize a Go module in your project directory: ```bash go mod init example/cloudrun go get github.com/firebase/genkit/go ``` 3. Create a sample app using Genkit: ```go package main import ( "context" "fmt" "log" "net/http" "os" "github.com/firebase/genkit/go/ai" "github.com/firebase/genkit/go/genkit" "github.com/firebase/genkit/go/plugins/googlegenai" "github.com/firebase/genkit/go/plugins/server" ) func main() { ctx := context.Background() // Initialize Genkit with the Google AI plugin and Gemini 2.5 Flash. // Alternatively, use &googlegenai.VertexAI{} and "vertexai/gemini-2.5-flash" // to use Vertex AI as the provider instead. g, err := genkit.Init(ctx, genkit.WithPlugins(&googlegenai.GoogleAI{}), genkit.WithDefaultModel("googleai/gemini-2.5-flash"), ) if err != nil { log.Fatalf("failed to initialize Genkit: %v", err) } flow := genkit.DefineFlow(g, "jokesFlow", func(ctx context.Context, topic string) (string, error) { resp, err := genkit.Generate(ctx, g, ai.WithPrompt(`Tell a short joke about %s. Be creative!`, topic), ) if err != nil { return "", fmt.Errorf("failed to generate joke: %w", err) } return resp.Text(), nil }) mux := http.NewServeMux() mux.HandleFunc("POST /jokesFlow", genkit.Handler(flow)) log.Fatal(server.Start(ctx, "127.0.0.1:"+os.Getenv("PORT"), mux)) } ``` 4. Implement some form of authentication and authorization to gate access to the flows you plan to deploy.
Because most generative AI services are metered, you most likely do not want to allow open access to any endpoints that call them. Some hosting services provide an authentication layer as a frontend to apps deployed on them, which you can use for this purpose. 5. Make API credentials available to your deployed function. Do one of the following, depending on the model provider you chose: Gemini (Google AI) 1. Make sure Google AI is [available in your region](https://ai.google.dev/available_regions). 2. [Generate an API key](https://aistudio.google.com/app/apikey) for the Gemini API using Google AI Studio. 3. Make the API key available in the deployed environment. Most app hosts provide some system for securely handling secrets such as API keys. Often, these secrets are available to your app in the form of environment variables. If you can assign your API key to the `GEMINI_API_KEY` variable, Genkit will use it automatically. Otherwise, you need to modify the `googlegenai.GoogleAI` plugin struct to explicitly set the key. (But don’t embed the key directly in code! Use the secret management facilities provided by your hosting provider.) Gemini (Vertex AI) 1. In the Cloud console, [Enable the Vertex AI API](https://console.cloud.google.com/apis/library/aiplatform.googleapis.com?project=_) for your project. 2. On the [IAM](https://console.cloud.google.com/iam-admin/iam?project=_) page, create a service account for accessing the Vertex AI API if you don’t already have one. Grant the account the **Vertex AI User** role. 3. [Set up Application Default Credentials](https://cloud.google.com/docs/authentication/provide-credentials-adc#on-prem) in your hosting environment. 4. Configure the plugin with your Google Cloud project ID and the Vertex AI API location you want to use. You can do so either by setting the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` environment variables in your hosting environment, or in your `googlegenai.VertexAI{}` constructor. The only secret you need to set up for this tutorial is for the model provider, but in general, you must do something similar for each service your flow uses. 6. **Optional**: Try your flow in the developer UI: 1. Set up your local environment for the model provider you chose: Gemini (Google AI) ```bash export GEMINI_API_KEY= ``` Gemini (Vertex AI) ```bash export GOOGLE_CLOUD_PROJECT= export GOOGLE_CLOUD_LOCATION=us-central1 gcloud auth application-default login ``` 2. Start the UI: ```bash genkit start -- go run . ``` 3. In the developer UI (`http://localhost:4000/`), run the flow: 4. Click **jokesFlow**. 5. On the **Input JSON** tab, provide a subject for the model: ```json "bananas" ``` 6. Click **Run**. 7. If everything’s working as expected so far, you can build and deploy the flow using your provider’s tools. # Managing prompts with Dotprompt > Learn how to use Dotprompt to manage prompts, models, and parameters for generative AI models in Genkit Go. Prompt engineering is the primary way that you, as an app developer, influence the output of generative AI models. For example, when using LLMs, you can craft prompts that influence the tone, format, length, and other characteristics of the models’ responses. The way you write these prompts will depend on the model you’re using; a prompt written for one model might not perform well when used with another model. Similarly, the model parameters you set (temperature, top-k, and so on) will also affect output differently depending on the model.
Getting all three of these factors—the model, the model parameters, and the prompt—working together to produce the output you want is rarely a trivial process and often involves substantial iteration and experimentation. Genkit provides a library and file format called Dotprompt that aims to make this iteration faster and more convenient. [Dotprompt](https://github.com/google/dotprompt) is designed around the premise that **prompts are code**. You define your prompts along with the models and model parameters they’re intended for separately from your application code. Then, you (or, perhaps someone not even involved with writing application code) can rapidly iterate on the prompts and model parameters using the Genkit Developer UI. Once your prompts are working the way you want, you can import them into your application and run them using Genkit. Your prompt definitions each go in a file with a `.prompt` extension. Here’s an example of what these files look like: ```dotprompt --- model: googleai/gemini-1.5-flash config: temperature: 0.9 input: schema: location: string style?: string name?: string default: location: a restaurant --- You are the world's most welcoming AI assistant and are currently working at {{location}}. Greet a guest{{#if name}} named {{name}}{{/if}}{{#if style}} in the style of {{style}}{{/if}}. ``` The portion in the triple-dashes is YAML front matter, similar to the front matter format used by GitHub Markdown and Jekyll; the rest of the file is the prompt, which can optionally use [Handlebars](https://handlebarsjs.com/guide/) templates. The following sections will go into more detail about each of the parts that make up a `.prompt` file and how to use them. ## Before you begin [Section titled “Before you begin”](#before-you-begin) Before reading this page, you should be familiar with the content covered on the [Generating content with AI models](/go/docs/models) page. If you want to run the code examples on this page, first complete the steps in the [Get started](/go/docs/get-started-go) guide. All of the examples assume that you have already installed Genkit as a dependency in your project. ## Creating prompt files [Section titled “Creating prompt files”](#creating-prompt-files) Although Dotprompt provides several [different ways](#defining-prompts-in-code) to create and load prompts, it’s optimized for projects that organize their prompts as `.prompt` files within a single directory (or subdirectories thereof). This section shows you how to create and load prompts using this recommended setup. ### Creating a prompt directory [Section titled “Creating a prompt directory”](#creating-a-prompt-directory) The Dotprompt library expects to find your prompts in a directory at your project root and automatically loads any prompts it finds there. By default, this directory is named `prompts`. For example, using the default directory name, your project structure might look something like this: ```text your-project/ ├── prompts/ │ └── hello.prompt ├── main.go ├── go.mod └── go.sum ``` If you want to use a different directory, you can specify it when you configure Genkit: ```go g, err := genkit.Init(context.Background(), genkit.WithPromptDir("./llm_prompts")) ``` ### Creating a prompt file [Section titled “Creating a prompt file”](#creating-a-prompt-file) There are two ways to create a `.prompt` file: using a text editor, or with the developer UI.
#### Using a text editor [Section titled “Using a text editor”](#using-a-text-editor) If you want to create a prompt file using a text editor, create a text file with the `.prompt` extension in your prompts directory: for example, `prompts/hello.prompt`. Here is a minimal example of a prompt file: ```dotprompt --- model: vertexai/gemini-1.5-flash --- You are the world's most welcoming AI assistant. Greet the user and offer your assistance. ``` The portion in the dashes is YAML front matter, similar to the front matter format used by GitHub Markdown and Jekyll; the rest of the file is the prompt, which can optionally use Handlebars templates. The front matter section is optional, but most prompt files will at least contain metadata specifying a model. The remainder of this page shows you how to go beyond this, and make use of Dotprompt’s features in your prompt files. #### Using the developer UI [Section titled “Using the developer UI”](#using-the-developer-ui) You can also create a prompt file using the model runner in the developer UI. Start with application code that imports the Genkit library and configures it to use the model plugin you’re interested in. For example: ```go package main import ( "context" "log" "github.com/firebase/genkit/go/ai" "github.com/firebase/genkit/go/genkit" "github.com/firebase/genkit/go/plugins/googlegenai" ) func main() { g, err := genkit.Init(context.Background(), genkit.WithPlugins(&googlegenai.GoogleAI{})) if err != nil { log.Fatal(err) } // Blocks end of program execution to use the developer UI. select {} } ``` Load the developer UI in the same project: ```bash genkit start -- go run . ``` In the **Model** section, choose the model you want to use from the list of models provided by the plugin. ![Genkit developer UI model runner](/_astro/developer_ui_model_runner.cHO4a-_l_Z1Vv7kN.webp) Then, experiment with the prompt and configuration until you get results you’re happy with. When you’re ready, press the Export button and save the file to your prompts directory. ## Running prompts [Section titled “Running prompts”](#running-prompts) After you’ve created prompt files, you can run them from your application code, or using the tooling provided by Genkit. Regardless of how you want to run your prompts, first start with application code that imports the Genkit library and the model plugins you’re interested in. For example: ```go package main import ( "context" "log" "github.com/firebase/genkit/go/ai" "github.com/firebase/genkit/go/genkit" "github.com/firebase/genkit/go/plugins/googlegenai" ) func main() { g, err := genkit.Init(context.Background(), genkit.WithPlugins(&googlegenai.GoogleAI{})) if err != nil { log.Fatal(err) } // Blocks end of program execution to use the developer UI. select {} } ``` If you’re storing your prompts in a directory other than the default, be sure to specify it when you configure Genkit. 
### Run prompts from code [Section titled “Run prompts from code”](#run-prompts-from-code) To use a prompt, first load it using the `genkit.LookupPrompt()` function: ```go helloPrompt := genkit.LookupPrompt(g, "hello") ``` An executable prompt has similar options to that of `genkit.Generate()` and many of them are overridable at execution time, including things like input (see the section about [specifying input schemas](#input-and-output-schemas)), configuration, and more: ```go resp, err := helloPrompt.Execute(context.Background(), ai.WithModelName("googleai/gemini-2.5-flash"), ai.WithInput(map[string]any{"name": "John"}), ai.WithConfig(&googlegenai.GeminiConfig{Temperature: 0.5}) ) ``` Any parameters you pass to the prompt call will override the same parameters specified in the prompt file. See [Generate content with AI models](/go/docs/models) for descriptions of the available options. ### Using the developer UI [Section titled “Using the developer UI”](#using-the-developer-ui-1) As you’re refining your app’s prompts, you can run them in the Genkit developer UI to quickly iterate on prompts and model configurations, independently from your application code. Load the developer UI from your project directory: ```bash genkit start -- go run . ``` ![Genkit developer UI prompt runner](/_astro/prompts-in-developer-ui.LmFDtByL_ZBrbGw.webp) Once you’ve loaded prompts into the developer UI, you can run them with different input values, and experiment with how changes to the prompt wording or the configuration parameters affect the model output. When you’re happy with the result, you can click the **Export prompt** button to save the modified prompt back into your project directory. ## Model configuration [Section titled “Model configuration”](#model-configuration) In the front matter block of your prompt files, you can optionally specify model configuration values for your prompt: ```yaml --- model: googleai/gemini-2.5-flash config: temperature: 1.4 topK: 50 topP: 0.4 maxOutputTokens: 400 stopSequences: - '' - '' --- ``` These values map directly to the `WithConfig()` option accepted by the executable prompt: ```go resp, err := helloPrompt.Execute(context.Background(), ai.WithConfig(&googlegenai.GeminiConfig{ Temperature: 1.4, TopK: 50, TopP: 0.4, MaxOutputTokens: 400, StopSequences: []string{"", ""}, })) ``` See [Generate content with AI models](/go/docs/models) for descriptions of the available options. ## Input and output schemas [Section titled “Input and output schemas”](#input-and-output-schemas) You can specify input and output schemas for your prompt by defining them in the front matter section. These schemas are used in much the same way as those passed to a `genkit.Generate()` request or a flow definition: ```dotprompt --- model: googleai/gemini-2.5-flash input: schema: theme?: string default: theme: "pirate" output: schema: dishname: string description: string calories: integer allergens(array): string --- Invent a menu item for a {{theme}} themed restaurant. 
``` The following code shows how you might execute this prompt and work with its structured output: ```go package main import ( "context" "log" "github.com/firebase/genkit/go/ai" "github.com/firebase/genkit/go/genkit" "github.com/firebase/genkit/go/plugins/googlegenai" ) func main() { ctx := context.Background() g, err := genkit.Init(ctx, genkit.WithPlugins(&googlegenai.GoogleAI{})) if err != nil { log.Fatal(err) } menuPrompt := genkit.LookupPrompt(g, "menu") if menuPrompt == nil { log.Fatal("no prompt named 'menu' found") } resp, err := menuPrompt.Execute(ctx, ai.WithInput(map[string]any{"theme": "medieval"}), ) if err != nil { log.Fatal(err) } var output map[string]any if err := resp.Output(&output); err != nil { log.Fatal(err) } log.Println(output["dishname"]) log.Println(output["description"]) // Blocks end of program execution to use the developer UI. select {} } ``` You have several options for defining schemas in a `.prompt` file: Dotprompt’s own schema definition format, Picoschema; standard JSON Schema; or as references to schemas defined in your application code. The following sections describe each of these options in more detail. ### Picoschema [Section titled “Picoschema”](#picoschema) The schemas in the example above are defined in a format called Picoschema. Picoschema is a compact, YAML-optimized schema definition format that simplifies defining the most important attributes of a schema for LLM usage. Here’s a longer example of a schema, which specifies the information an app might store about an article: ```yaml schema: title: string # string, number, and boolean types are defined like this subtitle?: string # optional fields are marked with a `?` draft?: boolean, true when in draft state status?(enum, approval status): [PENDING, APPROVED] date: string, the date of publication e.g. '2024-04-09' # descriptions follow a comma tags(array, relevant tags for article): string # arrays are denoted via parentheses authors(array): name: string email?: string metadata?(object): # objects are also denoted via parentheses updatedAt?: string, ISO timestamp of last update approvedBy?: integer, id of approver extra?: any, arbitrary extra data (*): string, wildcard field ``` The above schema is equivalent to the following Go type: ```go type Article struct { Title string `json:"title"` Subtitle string `json:"subtitle,omitempty" jsonschema:"required=false"` Draft bool `json:"draft,omitempty"` // True when in draft state Status string `json:"status,omitempty" jsonschema:"enum=PENDING,enum=APPROVED"` // Approval status Date string `json:"date"` // The date of publication e.g. '2025-04-07' Tags []string `json:"tags"` // Relevant tags for article Authors []struct { Name string `json:"name"` Email string `json:"email,omitempty"` } `json:"authors"` Metadata struct { UpdatedAt string `json:"updatedAt,omitempty"` // ISO timestamp of last update ApprovedBy int `json:"approvedBy,omitempty"` // ID of approver } `json:"metadata,omitempty"` Extra any `json:"extra"` // Arbitrary extra data } ``` Picoschema supports scalar types `string`, `integer`, `number`, `boolean`, and `any`. Objects, arrays, and enums are denoted by a parenthetical after the field name. Objects defined by Picoschema have all properties required unless denoted optional by `?`, and don’t allow additional properties. When a property is marked as optional, it is also made nullable to provide more leniency for LLMs to return null instead of omitting a field. In an object definition, the special key `(*)` can be used to declare a “wildcard” field definition.
This will match any additional properties not supplied by an explicit key. ### JSON Schema [Section titled “JSON Schema”](#json-schema) Picoschema does not support many of the capabilities of full JSON schema. If you require more robust schemas, you may supply a JSON Schema instead: ```yaml output: schema: type: object properties: field1: type: number minimum: 20 ``` ## Prompt templates [Section titled “Prompt templates”](#prompt-templates) The portion of a `.prompt` file that follows the front matter (if present) is the prompt itself, which will be passed to the model. While this prompt could be a simple text string, very often you will want to incorporate user input into the prompt. To do so, you can specify your prompt using the [Handlebars](https://handlebarsjs.com/guide/) templating language. Prompt templates can include placeholders that refer to the values defined by your prompt’s input schema. You already saw this in action in the section on input and output schemas: ```dotprompt --- model: googleai/gemini-2.5-flash input: schema: theme?: string default: theme: "pirate" output: schema: dishname: string description: string calories: integer allergens(array): string --- Invent a menu item for a {{theme}} themed restaurant. ``` In this example, the Handlebars expression, `{{theme}}`, resolves to the value of the input’s `theme` property when you run the prompt. To pass input to the prompt, call the prompt as in the following example: ```go menuPrompt := genkit.LookupPrompt(g, "menu") resp, err := menuPrompt.Execute(context.Background(), ai.WithInput(map[string]any{"theme": "medieval"}), ) ``` Note that because the input schema declared the `theme` property to be optional and provided a default, you could have omitted the property, and the prompt would have resolved using the default value. Handlebars templates also support some limited logical constructs. For example, as an alternative to providing a default, you could define the prompt using the Handlebars `#if` helper: ```dotprompt --- model: googleai/gemini-2.5-flash input: schema: theme?: string --- Invent a menu item for a {{#if theme}}{{theme}} themed{{/if}} restaurant. ``` In this example, the prompt renders as “Invent a menu item for a restaurant” when the `theme` property is unspecified. See the [Handlebars documentation](https://handlebarsjs.com/guide/builtin-helpers.html) for information on all of the built-in logical helpers. In addition to properties defined by your input schema, your templates can also refer to values automatically defined by Genkit. The next few sections describe these automatically-defined values and how you can use them. ### Multi-message prompts [Section titled “Multi-message prompts”](#multi-message-prompts) By default, Dotprompt constructs a single message with a “user” role. However, some prompts, such as a system prompt, are best expressed as combinations of multiple messages. The `{{role}}` helper provides a straightforward way to construct multi-message prompts: ```dotprompt --- model: vertexai/gemini-2.5-flash input: schema: userQuestion: string --- {{role "system"}} You are a helpful AI assistant that really loves to talk about food. Try to work food items into all of your conversations.
{{role "user"}} {{userQuestion}} ``` ### Multi-modal prompts [Section titled “Multi-modal prompts”](#multi-modal-prompts) For models that support multimodal input, such as images alongside text, you can use the `{{media}}` helper: ```dotprompt --- model: vertexai/gemini-2.5-flash input: schema: photoUrl: string --- Describe this image in a detailed paragraph: {{media url=photoUrl}} ``` The URL can be an `https:` URL or a base64-encoded `data:` URI for “inline” image usage. In code, this would be: ```go multimodalPrompt := genkit.LookupPrompt(g, "multimodal") resp, err := multimodalPrompt.Execute(context.Background(), ai.WithInput(map[string]any{"photoUrl": "https://example.com/photo.jpg"}), ) ``` See also [Multimodal input](/go/docs/models#multimodal-input), on the [Generating content with AI models](/go/docs/models) page, for an example of constructing a `data:` URL. ### Partials [Section titled “Partials”](#partials) Partials are reusable templates that can be included inside any prompt. Partials can be especially helpful for related prompts that share common behavior. When loading a prompt directory, any file prefixed with an underscore (`_`) is considered a partial. So a file `_personality.prompt` might contain: ```dotprompt You should speak like a {{#if style}}{{style}}{{else}}helpful assistant{{/if}}. ``` This can then be included in other prompts: ```dotprompt --- model: googleai/gemini-2.5-flash input: schema: name: string style?: string --- {{ role "system" }} {{>personality style=style}} {{ role "user" }} Give the user a friendly greeting. User's Name: {{name}} ``` Partials are inserted using the `{{>NAME_OF_PARTIAL args...}}` syntax. If no arguments are provided to the partial, it executes with the same context as the parent prompt. Partials accept named arguments or a single positional argument representing the context. This can be helpful for tasks such as rendering members of a list. **\_destination.prompt** ```dotprompt - {{name}} ({{country}}) ``` **chooseDestination.prompt** ```dotprompt --- model: googleai/gemini-2.5-flash input: schema: destinations(array): name: string country: string --- Help the user decide between these vacation destinations: {{#each destinations}} {{>destination this}} {{/each}} ``` #### Defining partials in code [Section titled “Defining partials in code”](#defining-partials-in-code) You can also define partials in code using `genkit.DefinePartial()`: ```go genkit.DefinePartial(g, "personality", "Talk like a {{#if style}}{{style}}{{else}}helpful assistant{{/if}}.") ``` Code-defined partials are available in all prompts. ### Defining Custom Helpers [Section titled “Defining Custom Helpers”](#defining-custom-helpers) You can define custom helpers to process and manage data inside a prompt. Helpers are registered globally using `genkit.DefineHelper()`: ```go genkit.DefineHelper(g, "shout", func(input string) string { return strings.ToUpper(input) }) ``` Once a helper is defined, you can use it in any prompt: ```dotprompt --- model: googleai/gemini-2.5-flash input: schema: name: string --- HELLO, {{shout name}}!!! ``` ## Prompt variants [Section titled “Prompt variants”](#prompt-variants) Because prompt files are just text, you can (and should!) commit them to your version control system, simplifying the process of comparing changes over time. Often, tweaked versions of prompts can only be fully tested in a production environment side-by-side with existing versions.
Dotprompt supports this through its variants feature. To create a variant, create a `[name].[variant].prompt` file. For example, if you were using Gemini 2.0 Flash in your prompt but wanted to see if Gemini 2.5 Pro would perform better, you might create two files: * `myPrompt.prompt`: the “baseline” prompt * `myPrompt.gemini25pro.prompt`: a variant named `gemini25pro` To use a prompt variant, specify the variant option when loading: ```go myPrompt := genkit.LookupPrompt(g, "myPrompt.gemini25pro") ``` The name of the variant is included in the metadata of generation traces, so you can compare and contrast actual performance between variants in the Genkit trace inspector. ## Defining prompts in code [Section titled “Defining prompts in code”](#defining-prompts-in-code) All of the examples discussed so far have assumed that your prompts are defined in individual `.prompt` files in a single directory (or subdirectories thereof), accessible to your app at runtime. Dotprompt is designed around this setup, and its authors consider it to be the best developer experience overall. However, if you have use cases that are not well supported by this setup, you can also define prompts in code using the `genkit.DefinePrompt()` function: ```go type GeoQuery struct { CountryCount int `json:"countryCount"` } type CountryList struct { Countries []string `json:"countries"` } geographyPrompt, err := genkit.DefinePrompt( g, "GeographyPrompt", ai.WithSystem("You are a geography teacher. Respond only when the user asks about geography."), ai.WithPrompt("Give me the {{countryCount}} biggest countries in the world by inhabitants."), ai.WithConfig(&googlegenai.GeminiConfig{Temperature: 0.5}), ai.WithInputType(GeoQuery{CountryCount: 10}), // Defaults to 10. ai.WithOutputType(CountryList{}), ) if err != nil { log.Fatal(err) } resp, err := geographyPrompt.Execute(context.Background(), ai.WithInput(GeoQuery{CountryCount: 15})) if err != nil { log.Fatal(err) } var list CountryList if err := resp.Output(&list); err != nil { log.Fatal(err) } log.Printf("Countries: %s", list.Countries) ``` Prompts may also be rendered into `GenerateActionOptions`, which may then be processed and passed into `genkit.GenerateWithRequest()`: ```go actionOpts, err := geographyPrompt.Render(ctx, ai.WithInput(GeoQuery{CountryCount: 15})) if err != nil { log.Fatal(err) } // Do something with the value... actionOpts.Config = &googlegenai.GeminiConfig{Temperature: 0.8} resp, err := genkit.GenerateWithRequest(ctx, g, actionOpts, nil, nil) // No middleware or streaming ``` Note that all prompt options carry over to `GenerateActionOptions` with the exception of `WithMiddleware()`, which must be passed separately if using `Prompt.Render()` instead of `Prompt.Execute()`. # Evaluation > Learn how to evaluate Genkit Go flows and models using built-in and third-party tools. Evaluation is a form of testing that helps you validate your LLM’s responses and ensure they meet your quality bar. Genkit supports third-party evaluation tools through plugins, paired with powerful observability features that provide insight into the runtime state of your LLM-powered applications. Genkit tooling helps you automatically extract data including inputs, outputs, and information from intermediate steps to evaluate the end-to-end quality of LLM responses as well as understand the performance of your system’s building blocks.
### Types of evaluation [Section titled “Types of evaluation”](#types-of-evaluation) Genkit supports two types of evaluation: * **Inference-based evaluation**: This type of evaluation runs against a collection of pre-determined inputs, assessing the corresponding outputs for quality. This is the most common evaluation type, suitable for most use cases. This approach tests a system’s actual output for each evaluation run. You can perform the quality assessment manually, by visually inspecting the results. Alternatively, you can automate the assessment by using an evaluation metric. * **Raw evaluation**: This type of evaluation directly assesses the quality of inputs without any inference. This approach typically is used with automated evaluation using metrics. All required fields for evaluation (e.g., `input`, `context`, `output` and `reference`) must be present in the input dataset. This is useful when you have data coming from an external source (e.g., collected from your production traces) and you want to have an objective measurement of the quality of the collected data. For more information, see the [Advanced use](#advanced-use) section of this page. This section explains how to perform inference-based evaluation using Genkit. ## Quick start [Section titled “Quick start”](#quick-start) Perform these steps to get started quickly with Genkit. ### Setup [Section titled “Setup”](#setup) 1. Use an existing Genkit app or create a new one by following our [Get started](/go/docs/get-started-go) guide. 2. Add the following code to define a simple RAG application to evaluate. For this guide, we use a dummy retriever that always returns the same documents. ```go package main import ( "context" "fmt" "log" "github.com/firebase/genkit/go/ai" "github.com/firebase/genkit/go/genkit" "github.com/firebase/genkit/go/plugins/googlegenai" ) func main() { ctx := context.Background() // Initialize Genkit g, err := genkit.Init(ctx, genkit.WithPlugins(&googlegenai.GoogleAI{}), genkit.WithDefaultModel("googleai/gemini-2.5-flash"), ) if err != nil { log.Fatalf("Genkit initialization error: %v", err) } // Dummy retriever that always returns the same facts dummyRetrieverFunc := func(ctx context.Context, req *ai.RetrieverRequest) (*ai.RetrieverResponse, error) { facts := []string{ "Dog is man's best friend", "Dogs have evolved and were domesticated from wolves", } // Just return facts as documents. var docs []*ai.Document for _, fact := range facts { docs = append(docs, ai.DocumentFromText(fact, nil)) } return &ai.RetrieverResponse{Documents: docs}, nil } factsRetriever := genkit.DefineRetriever(g, "local", "dogFacts", dummyRetrieverFunc) m := googlegenai.GoogleAIModel(g, "gemini-2.5-flash") if m == nil { log.Fatal("failed to find model") } // A simple question-answering flow genkit.DefineFlow(g, "qaFlow", func(ctx context.Context, query string) (string, error) { factDocs, err := ai.Retrieve(ctx, factsRetriever, ai.WithTextDocs(query)) if err != nil { return "", fmt.Errorf("retrieval failed: %w", err) } llmResponse, err := genkit.Generate(ctx, g, ai.WithModelName("googleai/gemini-2.5-flash"), ai.WithPrompt("Answer this question with the given context: %s", query), ai.WithDocs(factDocs.Documents...) ) if err != nil { return "", fmt.Errorf("generation failed: %w", err) } return llmResponse.Text(), nil }) } ``` 3. You can optionally add evaluation metrics to your application to use while evaluating. This guide uses the `EvaluatorRegex` metric from the `evaluators` package. 
```go import ( "github.com/firebase/genkit/go/plugins/evaluators" ) func main() { // ... metrics := []evaluators.MetricConfig{ { MetricType: evaluators.EvaluatorRegex, }, } // Initialize Genkit g, err := genkit.Init(ctx, genkit.WithPlugins( &googlegenai.GoogleAI{}, &evaluators.GenkitEval{Metrics: metrics}, // Add this plugin ), genkit.WithDefaultModel("googleai/gemini-2.5-flash"), ) } ``` **Note:** Ensure that the `evaluators` package is installed in your Go project: ```bash go get github.com/firebase/genkit/go/plugins/evaluators ``` 4. Start your Genkit application. ```bash genkit start -- go run main.go ``` ### Create a dataset [Section titled “Create a dataset”](#create-a-dataset) Create a dataset to define the examples we want to use for evaluating our flow. 1. Go to the Dev UI at `http://localhost:4000` and click the **Datasets** button to open the Datasets page. 2. Click the **Create Dataset** button to open the create dataset dialog. a. Provide a `datasetId` for your new dataset. This guide uses `myFactsQaDataset`. b. Select `Flow` dataset type. c. Leave the validation target field empty and click **Save**. 3. Your new dataset page appears, showing an empty dataset. Add examples to it by following these steps: a. Click the **Add example** button to open the example editor panel. b. Only the `Input` field is required. Enter `"Who is man's best friend?"` in the `Input` field, and click **Save** to add the example to your dataset. If you have configured the `EvaluatorRegex` metric and would like to try it out, you need to specify a Reference string that contains the pattern to match the output against. For the preceding input, set the `Reference output` text to `"(?i)dog"`, which is a case-insensitive regular-expression pattern to match the word “dog” in the flow output. c. Repeat steps (a) and (b) a couple more times to add more examples. This guide adds the following example inputs to the dataset: ```text "Can I give milk to my cats?" "From which animals did dogs evolve?" ``` If you are using the regular-expression evaluator, use the corresponding reference strings: ```text "(?i)don't know" "(?i)wolf|wolves" ``` Note that this is a contrived example and the regular-expression evaluator may not be the right choice to evaluate the responses from `qaFlow`. However, this guide can be applied to any Genkit Go evaluator of your choice. By the end of this step, your dataset should have 3 examples in it, with the values mentioned above. ### Run evaluation and view results [Section titled “Run evaluation and view results”](#run-evaluation-and-view-results) To start evaluating the flow, click the **Run new evaluation** button on your dataset page. You can also start a new evaluation from the *Evaluations* tab. 1. Select the `Flow` radio button to evaluate a flow. 2. Select `qaFlow` as the target flow to evaluate. 3. Select `myFactsQaDataset` as the target dataset to use for evaluation. 4. If you have installed an evaluator metric using Genkit plugins, you can see these metrics on this page. Select the metrics that you want to use with this evaluation run. This is entirely optional: omitting this step will still return the results in the evaluation run, but without any associated metrics. If you have not provided any reference values and are using the `EvaluatorRegex` metric, your evaluation will fail since this metric needs a reference to be set. 5. Click **Run evaluation** to start evaluation. Depending on the flow you’re testing, this may take a while. 
Once the evaluation is complete, a success message appears with a link to view the results. Click the link to go to the *Evaluation details* page. You can see the details of your evaluation on this page, including original input, extracted context and metrics (if any). ## Core concepts [Section titled “Core concepts”](#core-concepts) ### Terminology [Section titled “Terminology”](#terminology) Knowing the following terms can help ensure that you correctly understand the information provided on this page: * **Evaluation**: An evaluation is a process that assesses system performance. In Genkit, such a system is usually a Genkit primitive, such as a flow or a model. An evaluation can be automated or manual (human evaluation). * **Bulk inference** Inference is the act of running an input on a flow or model to get the corresponding output. Bulk inference involves performing inference on multiple inputs simultaneously. * **Metric** An evaluation metric is a criterion on which an inference is scored. Examples include accuracy, faithfulness, maliciousness, whether the output is in English, etc. * **Dataset** A dataset is a collection of examples to use for inference-based evaluation. A dataset typically consists of `Input` and optional `Reference` fields. The `Reference` field does not affect the inference step of evaluation but it is passed verbatim to any evaluation metrics. In Genkit, you can create a dataset through the Dev UI. There are two types of datasets in Genkit: *Flow* datasets and *Model* datasets. ## Supported evaluators [Section titled “Supported evaluators”](#supported-evaluators) Genkit supports several evaluators, some built-in, and others provided externally. ### Genkit evaluators [Section titled “Genkit evaluators”](#genkit-evaluators) Genkit includes a small number of built-in evaluators, ported from the [JS evaluators plugin](https://js.api.genkit.dev/enums/_genkit-ai_evaluator.GenkitMetric.html), to help you get started: * EvaluatorDeepEqual — Checks if the generated output is deep-equal to the reference output provided. * EvaluatorRegex — Checks if the generated output matches the regular expression provided in the reference field. * EvaluatorJsonata — Checks if the generated output matches the [JSONATA](https://jsonata.org/) expression provided in the reference field. ## Advanced use [Section titled “Advanced use”](#advanced-use) Along with its basic functionality, Genkit also provides advanced support for certain evaluation use cases. ### Evaluation using the CLI [Section titled “Evaluation using the CLI”](#evaluation-using-the-cli) Genkit CLI provides a rich API for performing evaluation. This is especially useful in environments where the Dev UI is not available (e.g. in a CI/CD workflow). Genkit CLI provides 3 main evaluation commands: `eval:flow`, `eval:extractData`, and `eval:run`. #### Evaluation `eval:flow` command [Section titled “Evaluation eval:flow command”](#evaluation-evalflow-command) The `eval:flow` command runs inference-based evaluation on an input dataset. This dataset may be provided either as a JSON file or by referencing an existing dataset in your Genkit runtime. ```bash # Referencing an existing dataset genkit eval:flow qaFlow --input myFactsQaDataset # or, using a dataset from a file genkit eval:flow qaFlow --input testInputs.json ``` **Note:** Make sure that you start your genkit app before running these CLI commands. 
```bash genkit start -- go run main.go ``` Here, `testInputs.json` should be an array of objects containing an `input` field and an optional `reference` field, like below: ```json [ { "input": "What is the French word for Cheese?" }, { "input": "What green vegetable looks like cauliflower?", "reference": "Broccoli" } ] ``` If your flow requires auth, you may specify it using the `--context` argument: ```bash genkit eval:flow qaFlow --input testInputs.json --context '{"auth": {"email_verified": true}}' ``` By default, the `eval:flow` and `eval:run` commands use all available metrics for evaluation. To run on a subset of the configured evaluators, use the `--evaluators` flag and provide a comma-separated list of evaluators by name: ```bash genkit eval:flow qaFlow --input testInputs.json --evaluators=genkitEval/regex,genkitEval/jsonata ``` You can view the results of your evaluation run in the Dev UI at `localhost:4000/evaluate`. #### `eval:extractData` and `eval:run` commands [Section titled “eval:extractData and eval:run commands”](#evalextractdata-and-evalrun-commands) To support *raw evaluation*, Genkit provides tools to extract data from traces and run evaluation metrics on extracted data. This is useful, for example, if you are using a different framework for evaluation or if you are collecting inferences from a different environment to test locally for output quality. You can batch run your Genkit flow and extract an *evaluation dataset* from the resultant traces. A raw evaluation dataset is a collection of inputs for evaluation metrics, *without* running any prior inference. Run your flow over your test inputs: ```bash genkit flow:batchRun qaFlow testInputs.json ``` Extract the evaluation data: ```bash genkit eval:extractData qaFlow --maxRows 2 --output factsEvalDataset.json ``` The exported data has a format different from the dataset format presented earlier. This is because this data is intended to be used with evaluation metrics directly, without any inference step. Here is the syntax of the extracted data. ```json Array<{ "testCaseId": string, "input": any, "output": any, "context": any[], "traceIds": string[], }>; ``` The data extractor automatically locates retrievers and adds the produced docs to the context array. You can run evaluation metrics on this extracted dataset using the `eval:run` command. ```bash genkit eval:run factsEvalDataset.json ``` By default, `eval:run` runs against all configured evaluators, and as with `eval:flow`, results for `eval:run` appear in the evaluation page of Developer UI, located at `localhost:4000/evaluate`. # Defining AI workflows > Learn how to define and use Genkit flows in Go to structure your AI logic. The core of your app’s AI features is generative model requests, but it’s rare that you can simply take user input, pass it to the model, and display the model output back to the user. Usually, there are pre- and post-processing steps that must accompany the model call. For example: * Retrieving contextual information to send with the model call. * Retrieving the history of the user’s current session, for example in a chat app. * Using one model to reformat the user input in a way that’s suitable to pass to another model. * Evaluating the “safety” of a model’s output before presenting it to the user. * Combining the output of several models. Every step of this workflow must work together for any AI-related task to succeed. In Genkit, you represent this tightly-linked logic using a construction called a flow. 
Flows are written just like functions, using ordinary Go code, but they add additional capabilities intended to ease the development of AI features: * **Type safety**: Input and output schemas, which provide both static and runtime type checking. * **Integration with developer UI**: Debug flows independently of your application code using the developer UI. In the developer UI, you can run flows and view traces for each step of the flow. * **Simplified deployment**: Deploy flows directly as web API endpoints, using any platform that can host a web app. Genkit’s flows are lightweight and unobtrusive, and don’t force your app to conform to any specific abstraction. All of the flow’s logic is written in standard Go, and code inside a flow doesn’t need to be flow-aware. ## Defining and calling flows [Section titled “Defining and calling flows”](#defining-and-calling-flows) In its simplest form, a flow just wraps a function. The following example wraps a function that calls `genkit.Generate()`: ```go menuSuggestionFlow := genkit.DefineFlow(g, "menuSuggestionFlow", func(ctx context.Context, theme string) (string, error) { resp, err := genkit.Generate(ctx, g, ai.WithPrompt("Invent a menu item for a %s themed restaurant.", theme), ) if err != nil { return "", err } return resp.Text(), nil }) ``` Just by wrapping your `genkit.Generate()` calls like this, you add some functionality: doing so lets you run the flow from the Genkit CLI and from the developer UI, and it is a requirement for several of Genkit’s features, including deployment and observability (later sections discuss these topics). ### Input and output schemas [Section titled “Input and output schemas”](#input-and-output-schemas) One of the most important advantages Genkit flows have over directly calling a model API is type safety of both inputs and outputs. When defining flows, you can define schemas, in much the same way as you define the output schema of a `genkit.Generate()` call; however, unlike with `genkit.Generate()`, you can also specify an input schema. Here’s a refinement of the last example, which defines a flow that takes a string as input and outputs an object: ```go type MenuItem struct { Name string `json:"name"` Description string `json:"description"` } menuSuggestionFlow := genkit.DefineFlow(g, "menuSuggestionFlow", func(ctx context.Context, theme string) (MenuItem, error) { item, _, err := genkit.GenerateData[MenuItem](ctx, g, ai.WithPrompt("Invent a menu item for a %s themed restaurant.", theme), ) return item, err }) ``` Note that the schema of a flow does not necessarily have to line up with the schema of the `genkit.Generate()` calls within the flow (in fact, a flow might not even contain `genkit.Generate()` calls). Here’s a variation of the example that calls `genkit.GenerateData()`, but uses the structured output to format a simple string, which the flow returns. Note how we pass `MenuItem` as a type parameter; this is the equivalent of passing the `WithOutputType()` option and getting a value of that type in response. 
```go type MenuItem struct { Name string `json:"name"` Description string `json:"description"` } menuSuggestionMarkdownFlow := genkit.DefineFlow(g, "menuSuggestionMarkdownFlow", func(ctx context.Context, theme string) (string, error) { item, _, err := genkit.GenerateData[MenuItem](ctx, g, ai.WithPrompt("Invent a menu item for a %s themed restaurant.", theme), ) if err != nil { return "", err } return fmt.Sprintf("**%s**: %s", item.Name, item.Description), nil }) ``` ### Calling flows [Section titled “Calling flows”](#calling-flows) Once you’ve defined a flow, you can call it from your Go code: ```go item, err := menuSuggestionFlow.Run(context.Background(), "bistro") ``` The argument to the flow must conform to the input schema. If you defined an output schema, the flow response will conform to it. For example, if you set the output schema to `MenuItem`, the flow output will contain its properties: ```go item, err := menuSuggestionFlow.Run(context.Background(), "bistro") if err != nil { log.Fatal(err) } log.Println(item.Name) log.Println(item.Description) ``` ## Streaming flows [Section titled “Streaming flows”](#streaming-flows) Flows support streaming using an interface similar to `genkit.Generate()`’s streaming interface. Streaming is useful when your flow generates a large amount of output, because you can present the output to the user as it’s being generated, which improves the perceived responsiveness of your app. As a familiar example, chat-based LLM interfaces often stream their responses to the user as they are generated. Here’s an example of a flow that supports streaming: ```go type Menu struct { Theme string `json:"theme"` Items []MenuItem `json:"items"` } type MenuItem struct { Name string `json:"name"` Description string `json:"description"` } menuSuggestionFlow := genkit.DefineStreamingFlow(g, "menuSuggestionFlow", func(ctx context.Context, theme string, callback core.StreamCallback[string]) (Menu, error) { item, _, err := genkit.GenerateData[MenuItem](ctx, g, ai.WithPrompt("Invent a menu item for a %s themed restaurant.", theme), ai.WithStreaming(func(ctx context.Context, chunk *ai.ModelResponseChunk) error { // Here, you could process the chunk in some way before sending it to // the output stream using StreamCallback. In this example, we output // the text of the chunk, unmodified. return callback(ctx, chunk.Text()) }), ) if err != nil { return Menu{}, err } return Menu{ Theme: theme, Items: []MenuItem{item}, }, nil }) ``` The `string` type in `StreamCallback[string]` specifies the type of values your flow streams. This does not necessarily need to be the same type as the return type, which is the type of the flow’s complete output (`Menu` in this example). In this example, the values streamed by the flow are directly coupled to the values streamed by the `genkit.Generate()` call inside the flow. Although this is often the case, it doesn’t have to be: you can output values to the stream using the callback as often as is useful for your flow. 
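For example, a flow might stream status messages of its own in addition to (or instead of) model chunks. The following is a minimal, hypothetical sketch that reuses the `Menu` and `MenuItem` types from the previous example; the `progressFlow` name and the status strings are made up for illustration:

```go
progressFlow := genkit.DefineStreamingFlow(g, "progressFlow",
	func(ctx context.Context, theme string, callback core.StreamCallback[string]) (Menu, error) {
		// Stream a status message that is not tied to any model chunk.
		if err := callback(ctx, "Looking up ideas for a "+theme+" menu..."); err != nil {
			return Menu{}, err
		}

		item, _, err := genkit.GenerateData[MenuItem](ctx, g,
			ai.WithPrompt("Invent a menu item for a %s themed restaurant.", theme),
		)
		if err != nil {
			return Menu{}, err
		}

		// Stream one more status message after generation has finished.
		if err := callback(ctx, "Done! Suggested "+item.Name); err != nil {
			return Menu{}, err
		}

		return Menu{Theme: theme, Items: []MenuItem{item}}, nil
	})
```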
### Calling streaming flows [Section titled “Calling streaming flows”](#calling-streaming-flows) Streaming flows can be run like non-streaming flows with `menuSuggestionFlow.Run(ctx, "bistro")` or they can be streamed: ```go streamCh, err := menuSuggestionFlow.Stream(context.Background(), "bistro") if err != nil { log.Fatal(err) } for result := range streamCh { if result.Err != nil { log.Fatalf("Stream error: %v", result.Err) } if result.Done { log.Printf("Menu with %s theme:\n", result.Output.Theme) for _, item := range result.Output.Items { log.Printf(" - %s: %s\n", item.Name, item.Description) } } else { log.Println("Stream chunk:", result.Stream) } } ``` ## Running flows from the command line [Section titled “Running flows from the command line”](#running-flows-from-the-command-line) You can run flows from the command line using the Genkit CLI tool: ```bash genkit flow:run menuSuggestionFlow '"French"' ``` For streaming flows, you can print the streaming output to the console by adding the `-s` flag: ```bash genkit flow:run menuSuggestionFlow '"French"' -s ``` Running a flow from the command line is useful for testing a flow, or for running flows that perform tasks needed on an ad hoc basis—for example, to run a flow that ingests a document into your vector database. ## Debugging flows [Section titled “Debugging flows”](#debugging-flows) One of the advantages of encapsulating AI logic within a flow is that you can test and debug the flow independently from your app using the Genkit developer UI. The developer UI relies on the Go app continuing to run, even if the logic has completed. If you are just getting started and Genkit is not part of a broader app, add `select {}` as the last line of `main()` to prevent the app from shutting down so that you can inspect it in the UI. To start the developer UI, run the following command from your project directory: ```bash genkit start -- go run . ``` From the **Run** tab of the developer UI, you can run any of the flows defined in your project: ![Screenshot of the Flow runner](/_astro/devui-flows.CU7lon_X_Z1bEbxA.webp) After you’ve run a flow, you can inspect a trace of the flow invocation by either clicking **View trace** or looking at the **Inspect** tab. ## Deploying flows [Section titled “Deploying flows”](#deploying-flows) You can deploy your flows directly as web API endpoints, ready for you to call from your app clients. Deployment is discussed in detail on several other pages, but this section gives brief overviews of your deployment options. 
### `net/http` Server [Section titled “net/http Server”](#nethttp-server) To deploy a flow using any Go hosting platform, such as Cloud Run, define your flow using `genkit.DefineFlow()` and start a `net/http` server with the provided flow handler using `genkit.Handler()`: ```go package main import ( "context" "log" "net/http" "github.com/firebase/genkit/go/ai" "github.com/firebase/genkit/go/genkit" "github.com/firebase/genkit/go/plugins/googlegenai" "github.com/firebase/genkit/go/plugins/server" ) type MenuItem struct { Name string `json:"name"` Description string `json:"description"` } func main() { ctx := context.Background() g, err := genkit.Init(ctx, genkit.WithPlugins(&googlegenai.GoogleAI{})) if err != nil { log.Fatal(err) } menuSuggestionFlow := genkit.DefineFlow(g, "menuSuggestionFlow", func(ctx context.Context, theme string) (MenuItem, error) { item, _, err := genkit.GenerateData[MenuItem](ctx, g, ai.WithPrompt("Invent a menu item for a %s themed restaurant.", theme), ) return item, err }) mux := http.NewServeMux() mux.HandleFunc("POST /menuSuggestionFlow", genkit.Handler(menuSuggestionFlow)) log.Fatal(server.Start(ctx, "127.0.0.1:3400", mux)) } ``` `server.Start()` is an optional helper function that starts the server and manages its lifecycle, including capturing interrupt signals to ease local development, but you may use your own method. To serve all the flows defined in your codebase, you can use `genkit.ListFlows()`: ```go mux := http.NewServeMux() for _, flow := range genkit.ListFlows(g) { mux.HandleFunc("POST /"+flow.Name(), genkit.Handler(flow)) } log.Fatal(server.Start(ctx, "127.0.0.1:3400", mux)) ``` You can call a flow endpoint with a POST request as follows: ```bash curl -X POST "http://localhost:3400/menuSuggestionFlow" \ -H "Content-Type: application/json" -d '{"data": "banana"}' ``` ### Other server frameworks [Section titled “Other server frameworks”](#other-server-frameworks) You can also use other server frameworks to deploy your flows. For example, you can use [Gin](https://gin-gonic.com/) with just a few lines: ```go router := gin.Default() for _, flow := range genkit.ListFlows(g) { router.POST("/"+flow.Name(), func(c *gin.Context) { genkit.Handler(flow)(c.Writer, c.Request) }) } log.Fatal(router.Run(":3400")) ``` For information on deploying to specific platforms, see [Genkit with Cloud Run](/go/docs/cloud-run). # Get started with Genkit using Go > Learn how to set up Genkit and make your first generative AI request in a Go application. This guide shows you how to get started with Genkit in a Go app. If you discover issues with the libraries or this documentation, please report them in our [GitHub repository](https://github.com/firebase/genkit/). ## Make your first request [Section titled “Make your first request”](#make-your-first-request) 1. Install Go 1.24 or later. See [Download and install](https://go.dev/doc/install) in the official Go docs. 2. Initialize a new Go project directory with the Genkit package: ```bash mkdir genkit-intro && cd genkit-intro go mod init example/genkit-intro go get github.com/firebase/genkit/go ``` 3. Create a `main.go` file with the following sample code: ```go package main import ( "context" "log" "github.com/firebase/genkit/go/ai" "github.com/firebase/genkit/go/genkit" "github.com/firebase/genkit/go/plugins/googlegenai" ) func main() { ctx := context.Background() // Initialize Genkit with the Google AI plugin and Gemini 2.5 Flash. 
g, err := genkit.Init(ctx, genkit.WithPlugins(&googlegenai.GoogleAI{}), genkit.WithDefaultModel("googleai/gemini-2.5-flash"), ) if err != nil { log.Fatalf("could not initialize Genkit: %v", err) } resp, err := genkit.Generate(ctx, g, ai.WithPrompt("What is the meaning of life?")) if err != nil { log.Fatalf("could not generate model response: %v", err) } log.Println(resp.Text()) } ``` 4. Configure your Gemini API key by setting the `GEMINI_API_KEY` environment variable: ```bash export GEMINI_API_KEY= ``` If you don’t already have one, [create a key in Google AI Studio](https://aistudio.google.com/apikey). Google AI provides a generous free-of-charge tier and does not require a credit card to get started. 5. Run the app to see the model response: ```bash go run . # Example output (may vary): # There is no single universally agreed-upon meaning of life; it's a deeply # personal question. Many find meaning through connection, growth, # contribution, happiness, or discovering their own purpose. ``` ## Next steps [Section titled “Next steps”](#next-steps) Now that you’re set up to make model requests with Genkit, learn how to use more Genkit capabilities to build your AI-powered apps and workflows. To get started with additional Genkit capabilities, see the following guides: * [Developer tools](/docs/devtools): Learn how to set up and use Genkit’s CLI and developer UI to help you locally test and debug your app. * [Generating content](/go/docs/models): Learn how to use Genkit’s unified generation API to generate text and structured data from any supported model. * [Creating flows](/go/docs/flows): Learn how to use special Genkit functions, called flows, that provide end-to-end observability for workflows and rich debugging from Genkit tooling. * [Managing prompts](/go/docs/dotprompt): Learn how Genkit helps you manage your prompts and configuration together as code. # Generating content with AI models > Learn how to use Genkit's unified API in Go to generate content with various AI models like LLMs and image generators. At the heart of generative AI are AI *models*. The two most prominent examples of generative models are large language models (LLMs) and image generation models. These models take input, called a *prompt* (most commonly text, an image, or a combination of both), and from it produce as output text, an image, or even audio or video. The output of these models can be surprisingly convincing: LLMs generate text that appears as though it could have been written by a human being, and image generation models can produce images that are very close to real photographs or artwork created by humans. In addition, LLMs have proven capable of tasks beyond simple text generation: * Writing computer programs. * Planning subtasks that are required to complete a larger task. * Organizing unorganized data. * Understanding and extracting information from a corpus of text. * Following and performing automated activities based on a text description of the activity. There are many models available to you, from several different providers. Each model has its own strengths and weaknesses, and one model might excel at one task but perform less well at others. Apps making use of generative AI can often benefit from using multiple different models depending on the task at hand. As an app developer, you typically don’t interact with generative AI models directly, but rather through services available as web APIs. 
Although these services often have similar functionality, they all provide it through different and incompatible APIs. If you want to make use of multiple model services, you have to use each of their proprietary SDKs, which are potentially incompatible with each other. And if you want to upgrade from one model to the newest and most capable one, you might have to build that integration all over again. Genkit addresses this challenge by providing a single interface that abstracts away the details of accessing potentially any generative AI model service, with several prebuilt implementations already available. Building your AI-powered app around Genkit simplifies the process of making your first generative AI call and makes it equally straightforward to combine multiple models or swap one model for another as new models emerge. ### Before you begin [Section titled “Before you begin”](#before-you-begin) If you want to run the code examples on this page, first complete the steps in the [Get started](/go/docs/get-started-go) guide. All of the examples assume that you have already installed Genkit as a dependency in your project. ### Models supported by Genkit [Section titled “Models supported by Genkit”](#models-supported-by-genkit) Genkit is designed to be flexible enough to use potentially any generative AI model service. Its core libraries define the common interface for working with models, and model plugins define the implementation details for working with a specific model and its API. The Genkit team maintains plugins for working with models provided by Vertex AI, Google Generative AI, and Ollama: * Gemini family of LLMs, through the [Google GenAI plugin](/go/docs/plugins/google-genai). * Gemma 3, Llama 4, and many more open models, through the [Ollama plugin](/go/docs/plugins/ollama) (you must host the Ollama server yourself). ### Loading and configuring model plugins [Section titled “Loading and configuring model plugins”](#loading-and-configuring-model-plugins) Before you can use Genkit to start generating content, you need to load and configure a model plugin. If you’re coming from the Get Started guide, you’ve already done this. Otherwise, see the [Get Started](/go/docs/get-started-go) guide or the individual plugin’s documentation and follow the steps there before continuing. ### The `genkit.Generate()` function [Section titled “The genkit.Generate() function”](#the-genkitgenerate-function) In Genkit, the primary interface through which you interact with generative AI models is the `genkit.Generate()` function. 
The simplest `genkit.Generate()` call specifies the model you want to use and a text prompt: ```go package main import ( "context" "log" "github.com/firebase/genkit/go/ai" "github.com/firebase/genkit/go/genkit" "github.com/firebase/genkit/go/plugins/googlegenai" ) func main() { ctx := context.Background() g, err := genkit.Init(ctx, genkit.WithPlugins(&googlegenai.GoogleAI{}), genkit.WithDefaultModel("googleai/gemini-2.5-flash"), ) if err != nil { log.Fatalf("could not initialize Genkit: %v", err) } resp, err := genkit.Generate(ctx, g, ai.WithPrompt("Invent a menu item for a pirate themed restaurant."), ) if err != nil { log.Fatalf("could not generate model response: %v", err) } log.Println(resp.Text()) } ``` When you run this brief example, it will print out some debugging information followed by the output of the `genkit.Generate()` call, which will usually be Markdown text as in the following example: ```md ## The Blackheart's Bounty **A hearty stew of slow-cooked beef, spiced with rum and molasses, served in a hollowed-out cannonball with a side of crusty bread and a dollop of tangy pineapple salsa.** **Description:** This dish is a tribute to the hearty meals enjoyed by pirates on the high seas. The beef is tender and flavorful, infused with the warm spices of rum and molasses. The pineapple salsa adds a touch of sweetness and acidity, balancing the richness of the stew. The cannonball serving vessel adds a fun and thematic touch, making this dish a perfect choice for any pirate-themed adventure. ``` Run the script again and you’ll get a different output. The preceding code sample sent the generation request to the default model, which you specified when you configured the Genkit instance. You can also specify a model for a single `genkit.Generate()` call: ```go resp, err := genkit.Generate(ctx, g, ai.WithModelName("googleai/gemini-2.5-pro"), ai.WithPrompt("Invent a menu item for a pirate themed restaurant."), ) ``` A model string identifier looks like `providerid/modelid`, where the provider ID (in this case, `googleai`) identifies the plugin, and the model ID is a plugin-specific string identifier for a specific version of a model. These examples also illustrate an important point: when you use `genkit.Generate()` to make generative AI model calls, changing the model you want to use is a matter of passing a different value to the model parameter. By using `genkit.Generate()` instead of the native model SDKs, you give yourself the flexibility to more easily use several different models in your app and change models in the future. So far you have only seen examples of the simplest `genkit.Generate()` calls. However, `genkit.Generate()` also provides an interface for more advanced interactions with generative models, which you will see in the sections that follow. ### System prompts [Section titled “System prompts”](#system-prompts) Some models support providing a *system prompt*, which gives the model instructions as to how you want it to respond to messages from the user. You can use the system prompt to specify characteristics such as a persona you want the model to adopt, the tone of its responses, and the format of its responses. 
If the model you’re using supports system prompts, you can provide one with the `ai.WithSystem()` option: ```go resp, err := genkit.Generate(ctx, g, ai.WithSystem("You are a food industry marketing consultant."), ai.WithPrompt("Invent a menu item for a pirate themed restaurant."), ) ``` For models that don’t support system prompts, `ai.WithSystem()` simulates one by modifying the request so that it appears *like* a system prompt. ### Model parameters [Section titled “Model parameters”](#model-parameters) The `genkit.Generate()` function takes an `ai.WithConfig()` option, through which you can specify optional settings that control how the model generates content: ```go resp, err := genkit.Generate(ctx, g, ai.WithModelName("googleai/gemini-2.5-flash"), ai.WithPrompt("Invent a menu item for a pirate themed restaurant."), ai.WithConfig(&googlegenai.GeminiConfig{ MaxOutputTokens: 500, StopSequences: []string{"<end>", "<fin>"}, Temperature: 0.5, TopP: 0.4, TopK: 50, }), ) ``` The exact parameters that are supported depend on the individual model and model API. However, the parameters in the previous example are common to almost every model. The following is an explanation of these parameters: #### Parameters that control output length [Section titled “Parameters that control output length”](#parameters-that-control-output-length) **MaxOutputTokens** LLMs operate on units called *tokens*. A token usually, but does not necessarily, map to a specific sequence of characters. When you pass a prompt to a model, one of the first steps it takes is to *tokenize* your prompt string into a sequence of tokens. Then, the LLM generates a sequence of tokens from the tokenized input. Finally, the sequence of tokens gets converted back into text, which is your output. The maximum output tokens parameter sets a limit on how many tokens to generate using the LLM. Every model potentially uses a different tokenizer, but a good rule of thumb is to consider a single English word to be made of 2 to 4 tokens. As stated earlier, some tokens might not map to character sequences. One such example is that there is often a token that indicates the end of the sequence: when an LLM generates this token, it stops generating more. Therefore, it’s possible and often the case that an LLM generates fewer tokens than the maximum because it generated the “stop” token. **StopSequences** You can use this parameter to set the tokens or token sequences that, when generated, indicate the end of LLM output. The correct values to use here generally depend on how the model was trained, and are usually set by the model plugin. However, if you have prompted the model to generate another stop sequence, you might specify it here. Note that you are specifying character sequences, and not tokens per se. In most cases, you will specify a character sequence that the model’s tokenizer maps to a single token. #### Parameters that control “creativity” [Section titled “Parameters that control “creativity””](#parameters-that-control-creativity) The *temperature*, *top-p*, and *top-k* parameters together control how “creative” you want the model to be. This section provides very brief explanations of what these parameters mean, but the more important point is this: these parameters are used to adjust the character of an LLM’s output. The optimal values for them depend on your goals and preferences, and are likely to be found only through experimentation. **Temperature** LLMs are fundamentally token-predicting machines. 
For a given sequence of tokens (such as the prompt), an LLM predicts, for each token in its vocabulary, the likelihood that the token comes next in the sequence. The temperature is a scaling factor by which these predictions are divided before being normalized to a probability between 0 and 1. Low temperature values—between 0.0 and 1.0—amplify the difference in likelihoods between tokens, with the result that the model will be even less likely to produce a token it already evaluated to be unlikely. This is often perceived as output that is less creative. Although 0.0 is technically not a valid value, many models treat it as indicating that the model should behave deterministically and consider only the single most likely token. High temperature values—those greater than 1.0—compress the differences in likelihoods between tokens, with the result that the model becomes more likely to produce tokens it had previously evaluated to be unlikely. This is often perceived as output that is more creative. Some model APIs impose a maximum temperature, often 2.0. **TopP** *Top-p* is a value between 0.0 and 1.0 that controls the number of possible tokens you want the model to consider, by specifying the cumulative probability of the tokens. For example, a value of 1.0 means to consider every possible token (but still take into account the probability of each token). A value of 0.4 means to only consider the most likely tokens, whose probabilities add up to 0.4, and to exclude the remaining tokens from consideration. **TopK** *Top-k* is an integer value that also controls the number of possible tokens you want the model to consider, but this time by explicitly specifying the maximum number of tokens. Specifying a value of 1 means that the model should behave deterministically. #### Experiment with model parameters [Section titled “Experiment with model parameters”](#experiment-with-model-parameters) You can experiment with the effect of these parameters on the output generated by different model and prompt combinations by using the Developer UI. Start the developer UI with the `genkit start` command, and it will automatically load all of the models defined by the plugins configured in your project. You can quickly try different prompts and configuration values without having to repeatedly make these changes in code. #### Pair model with its config [Section titled “Pair model with its config”](#pair-model-with-its-config) Given that each provider or even a specific model may have its own configuration schema or warrant certain settings, it may be error-prone to set separate options using `ai.WithModelName()` and `ai.WithConfig()`, since the latter is not strongly typed to the former. To pair a model with its config, you can create a model reference that you can pass into the generate call instead: ```go model := googlegenai.GoogleAIModelRef("gemini-2.5-flash", &googlegenai.GeminiConfig{ MaxOutputTokens: 500, StopSequences: []string{"<end>", "<fin>"}, Temperature: 0.5, TopP: 0.4, TopK: 50, }) resp, err := genkit.Generate(ctx, g, ai.WithModel(model), ai.WithPrompt("Invent a menu item for a pirate themed restaurant."), ) if err != nil { log.Fatal(err) } ``` The constructor for the model reference will enforce that the correct config type is provided, which may reduce mismatches. ### Structured output [Section titled “Structured output”](#structured-output) When using generative AI as a component in your application, you often want output in a format other than plain text. 
Even if you’re just generating content to display to the user, you can benefit from structured output simply for the purpose of presenting it more attractively. But for more advanced applications of generative AI, such as programmatic use of the model’s output, or feeding the output of one model into another, structured output is a must. In Genkit, you can request structured output from a model by specifying an output type when you call `genkit.Generate()`: ```go type MenuItem struct { Name string `json:"name"` Description string `json:"description"` Calories int `json:"calories"` Allergens []string `json:"allergens"` } resp, err := genkit.Generate(ctx, g, ai.WithPrompt("Invent a menu item for a pirate themed restaurant."), ai.WithOutputType(MenuItem{}), ) if err != nil { log.Fatal(err) // One possible error is that the response does not conform to the type. } ``` Model output types are specified as JSON schema using the [`invopop/jsonschema`](https://github.com/invopop/jsonschema) package. This provides runtime type checking, which bridges the gap between static Go types and the unpredictable output of generative AI models. This system lets you write code that can rely on the fact that a successful generate call will always return output that conforms to your Go types. When you specify an output type in `genkit.Generate()`, Genkit does several things behind the scenes: * Augments the prompt with additional guidance about the selected output format. This also has the side effect of specifying to the model what content exactly you want to generate (for example, not only suggest a menu item but also generate a description, a list of allergens, and so on). * Verifies that the output conforms to the schema. * Marshals the model output into a Go type. To get structured output from a successful generate call, call `Output()` on the model response with an empty value of the type: ```go var item MenuItem if err := resp.Output(&item); err != nil { log.Fatal(err) } log.Printf("%s (%d calories, %d allergens): %s\n", item.Name, item.Calories, len(item.Allergens), item.Description) ``` Alternatively, you can use `genkit.GenerateData()` for a more succinct call: ```go item, _, err := genkit.GenerateData[MenuItem](ctx, g, ai.WithPrompt("Invent a menu item for a pirate themed restaurant."), ) if err != nil { log.Fatal(err) } log.Printf("%s (%d calories, %d allergens): %s\n", item.Name, item.Calories, len(item.Allergens), item.Description) ``` This function requires the output type parameter but automatically sets the `ai.WithOutputType()` option and calls `ModelResponse.Output()` before returning the value. #### Handling errors [Section titled “Handling errors”](#handling-errors) Note in the prior example that the `genkit.Generate()` call can result in an error. One possible error can happen when the model fails to generate output that conforms to the schema. The best strategy for dealing with such errors will depend on your exact use case, but here are some general hints: * **Try a different model**. For structured output to succeed, the model must be capable of generating output in JSON. The most powerful LLMs like Gemini are versatile enough to do this; however, smaller models, such as some of the local models you would use with Ollama, might not be able to generate structured output reliably unless they have been specifically trained to do so. * **Simplify the schema**. LLMs may have trouble generating complex or deeply nested types. 
Try using clear names, fewer fields, or a flattened structure if you are not able to reliably generate structured data. * **Retry the `genkit.Generate()` call**. If the model you’ve chosen only rarely fails to generate conformant output, you can treat the error as you would treat a network error, and retry the request using some kind of incremental back-off strategy. ### Streaming [Section titled “Streaming”](#streaming) When generating large amounts of text, you can improve the experience for your users by presenting the output as it’s generated—streaming the output. A familiar example of streaming in action can be seen in most LLM chat apps: users can read the model’s response to their message as it’s being generated, which improves the perceived responsiveness of the application and enhances the illusion of chatting with an intelligent counterpart. In Genkit, you can stream output using the `ai.WithStreaming()` option: ```go resp, err := genkit.Generate(ctx, g, ai.WithPrompt("Suggest a complete menu for a pirate themed restaurant."), ai.WithStreaming(func(ctx context.Context, chunk *ai.ModelResponseChunk) error { // Do something with the chunk... log.Println(chunk.Text()) return nil }), ) if err != nil { log.Fatal(err) } log.Println(resp.Text()) ``` ### Multimodal input [Section titled “Multimodal input”](#multimodal-input) The examples you’ve seen so far have used text strings as model prompts. While this remains the most common way to prompt generative AI models, many models can also accept other media as prompts. Media prompts are most often used in conjunction with text prompts that instruct the model to perform some operation on the media, such as to caption an image or transcribe an audio recording. The ability to accept media input and the types of media you can use are completely dependent on the model and its API. For example, the Gemini 2.0 series of models can accept images, video, and audio as prompts. To provide a media prompt to a model that supports it, instead of passing a simple text prompt to `genkit.Generate()`, pass an array consisting of a media part and a text part. This example specifies an image using a publicly accessible HTTPS URL. ```go resp, err := genkit.Generate(ctx, g, ai.WithModelName("googleai/gemini-2.5-flash"), ai.WithMessages( ai.NewUserMessage( ai.NewMediaPart("image/jpeg", "https://example.com/photo.jpg"), ai.NewTextPart("Compose a poem about this image."), ), ), ) ``` You can also pass media data directly by encoding it as a data URL. For example: ```go image, err := os.ReadFile("photo.jpg") if err != nil { log.Fatal(err) } resp, err := genkit.Generate(ctx, g, ai.WithModelName("googleai/gemini-2.5-flash"), ai.WithMessages( ai.NewUserMessage( ai.NewMediaPart("image/jpeg", "data:image/jpeg;base64," + base64.StdEncoding.EncodeToString(image)), ai.NewTextPart("Compose a poem about this image."), ), ), ) ``` All models that support media input support both data URLs and HTTPS URLs. Some model plugins add support for other media sources. For example, the Vertex AI plugin also lets you use Cloud Storage (`gs://`) URLs. ### Next steps [Section titled “Next steps”](#next-steps) #### Learn more about Genkit [Section titled “Learn more about Genkit”](#learn-more-about-genkit) * As an app developer, the primary way you influence the output of generative AI models is through prompting. Read [Managing prompts with Dotprompt](/go/docs/dotprompt) to learn how Genkit helps you develop effective prompts and manage them in your codebase. 
* Although `genkit.Generate()` is the nucleus of every generative AI powered application, real-world applications usually require additional work before and after invoking a generative AI model. To reflect this, Genkit introduces the concept of *flows*, which are defined like functions but add additional features such as observability and simplified deployment. To learn more, see [Defining AI workflows](/go/docs/flows). #### Advanced LLM use [Section titled “Advanced LLM use”](#advanced-llm-use) There are techniques your app can use to reap even more benefit from LLMs. * One way to enhance the capabilities of LLMs is to prompt them with a list of ways they can request more information from you, or request you to perform some action. This is known as *tool calling* or *function calling*. Models that are trained to support this capability can respond to a prompt with a specially-formatted response, which indicates to the calling application that it should perform some action and send the result back to the LLM along with the original prompt. Genkit has library functions that automate both the prompt generation and the call-response loop elements of a tool calling implementation. See [Tool calling](/go/docs/tool-calling) to learn more. * Retrieval-augmented generation (RAG) is a technique used to introduce domain-specific information into a model’s output. This is accomplished by inserting relevant information into a prompt before passing it on to the language model. A complete RAG implementation requires you to bring several technologies together: text embedding generation models, vector databases, and large language models. See [Retrieval-augmented generation (RAG)](/go/docs/rag) to learn how Genkit simplifies the process of coordinating these various elements. # Monitoring > Learn about Genkit's monitoring features, including OpenTelemetry export and trace inspection in the Developer UI for Go applications. Genkit provides two complementary monitoring features: OpenTelemetry export and trace inspection using the developer UI. ## OpenTelemetry export [Section titled “OpenTelemetry export”](#opentelemetry-export) Genkit is fully instrumented with [OpenTelemetry](https://opentelemetry.io/) and provides hooks to export telemetry data. The [Google Cloud plugin](/go/docs/plugins/google-cloud) exports telemetry to Cloud’s operations suite. ## Trace store [Section titled “Trace store”](#trace-store) The trace store feature is complementary to the telemetry instrumentation. It lets you inspect your traces for your flow runs in the Genkit Developer UI. This feature is enabled whenever you run a Genkit flow in a dev environment (such as when using `genkit start` or `genkit flow:run`). # Writing Genkit plugins > Learn the fundamentals of creating Genkit plugins in Go to extend its capabilities with new models, retrievers, and more. Genkit’s capabilities are designed to be extended by plugins. Genkit plugins are configurable modules that can provide models, retrievers, trace stores, and more. 
You’ve already seen plugins in action just by using Genkit: ```go import ( "github.com/firebase/genkit/go/ai" "github.com/firebase/genkit/go/genkit" "github.com/firebase/genkit/go/plugins/googlegenai" "github.com/firebase/genkit/go/plugins/server" ) ``` ```go g, err := genkit.Init(ctx, genkit.WithPlugins( &googlegenai.GoogleAI{APIKey: ...}, &googlegenai.VertexAI{ProjectID: "my-project", Location: "us-central1"}, ), ) ``` The Vertex AI plugin takes configuration (such as the user’s Google Cloud project ID) and registers a variety of new models, embedders, and more with the Genkit registry. The registry serves as a lookup service for named actions at runtime, and powers Genkit’s local UI for running and inspecting models, prompts, and more. ## Creating a plugin [Section titled “Creating a plugin”](#creating-a-plugin) In Go, a Genkit plugin is a package that adheres to a small set of conventions. A single module can contain several plugins. ### Provider ID [Section titled “Provider ID”](#provider-id) Every plugin must have a unique identifier string that distinguishes it from other plugins. Genkit uses this identifier as a namespace for every resource your plugin defines, to prevent naming conflicts with other plugins. For example, if your plugin has an ID `yourplugin` and provides a model called `text-generator`, the full model identifier will be `yourplugin/text-generator`. Define this provider ID once for your plugin and use it consistently whenever a Genkit function requires it. ```go package yourplugin const providerID = "yourplugin" ``` ### Standard exports [Section titled “Standard exports”](#standard-exports) Every plugin should define and export the following symbols to conform to the `genkit.Plugin` interface: * A struct type that encapsulates all of the configuration options accepted by the plugin. For any plugin options that are secret values, such as API keys, you should offer both a config option and a default environment variable to configure it. This lets your plugin take advantage of the secret-management features offered by many hosting providers (such as Cloud Secret Manager, which you can use with Cloud Run). For example: ```go type MyPlugin struct { APIKey string // Other options you may allow to configure... } ``` * A `Name()` method on the struct that returns the provider ID. * An `Init()` method on the struct with a declaration like the following: ```go func (m *MyPlugin) Init(ctx context.Context, g *genkit.Genkit) error ``` In this function, perform any setup steps required by your plugin. For example: * Confirm that any required configuration values are specified and assign default values to any unspecified optional settings. * Verify that the given configuration options are valid together. * Create any shared resources required by the rest of your plugin. For example, create clients for any services your plugin accesses. To the extent possible, the resources provided by your plugin shouldn’t assume that any other plugins have been installed before this one. This method will be called automatically during `genkit.Init()` when the user passes the plugin into the `WithPlugins()` option. ## Building plugin features [Section titled “Building plugin features”](#building-plugin-features) A single plugin can activate many new things within Genkit. For example, the Vertex AI plugin activates several new models as well as an embedder. 
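Before looking at specific kinds of plugin features, it can help to see the conventions above assembled in one place. The following is a minimal, illustrative skeleton only; the `YOURPLUGIN_API_KEY` environment variable name and the error message are assumptions made for this sketch, not part of any Genkit API:

```go
package yourplugin

import (
	"context"
	"errors"
	"os"

	"github.com/firebase/genkit/go/genkit"
)

const providerID = "yourplugin"

// MyPlugin holds the configuration options accepted by the plugin.
type MyPlugin struct {
	APIKey string // Secret value; falls back to an environment variable if unset.
}

// Name returns the provider ID used to namespace the plugin's resources.
func (m *MyPlugin) Name() string { return providerID }

// Init validates the configuration and registers the plugin's resources.
func (m *MyPlugin) Init(ctx context.Context, g *genkit.Genkit) error {
	if m.APIKey == "" {
		m.APIKey = os.Getenv("YOURPLUGIN_API_KEY") // Hypothetical variable name.
	}
	if m.APIKey == "" {
		return errors.New("yourplugin: no API key provided")
	}
	// Define models, embedders, and other resources here, for example with
	// genkit.DefineModel(), as described in the sections that follow.
	return nil
}
```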
### Model plugins [Section titled “Model plugins”](#model-plugins) Genkit model plugins add one or more generative AI models to the Genkit registry. A model represents any generative model that is capable of receiving a prompt as input and generating text, media, or data as output. See [Writing a Genkit model plugin](/go/docs/plugin-authoring-models). ### Telemetry plugins [Section titled “Telemetry plugins”](#telemetry-plugins) Genkit telemetry plugins configure Genkit’s OpenTelemetry instrumentation to export traces, metrics, and logs to a particular monitoring or visualization tool. See [Writing a Genkit telemetry plugin](/go/docs/plugin-authoring-telemetry). ## Publishing a plugin [Section titled “Publishing a plugin”](#publishing-a-plugin) Genkit plugins can be published as normal Go packages. To increase discoverability, your package should have `genkit` somewhere in its name so it can be found with a simple search on [`pkg.go.dev`](https://pkg.go.dev/search?q=genkit). Any of the following are good choices: * `github.com/yourorg/genkit-plugins/servicename` * `github.com/yourorg/your-repo/genkit/servicename` # Writing a Genkit model plugin > Learn how to create a Genkit model plugin in Go to integrate new generative AI models. Genkit model plugins add one or more generative AI models to the Genkit registry. A model represents any generative model that is capable of receiving a prompt as input and generating text, media, or data as output. ## Before you begin [Section titled “Before you begin”](#before-you-begin) Read [Writing Genkit plugins](/go/docs/plugin-authoring) for information about writing any kind of Genkit plugin, including model plugins. In particular, note that every plugin must export a type that conforms to the `genkit.Plugin` interface, which includes a `Name()` and an `Init()` function. ## Model definitions [Section titled “Model definitions”](#model-definitions) Generally, a model plugin will make one or more `genkit.DefineModel()` calls in its `Init` function—once for each model the plugin is providing an interface to. A model definition consists of three components: 1. Metadata declaring the model’s capabilities. 2. A configuration type with any specific parameters supported by the model. 3. A generation function that accepts an `ai.ModelRequest` and returns an `ai.ModelResponse`, presumably using an AI model to generate the latter. At a high level, here’s what it looks like in code: ```go package myplugin import ( "context" "fmt" "github.com/firebase/genkit/go/ai" "github.com/firebase/genkit/go/genkit" ) const providerID = "myProvider" // Unique ID for your plugin provider // MyModelConfig defines the configuration options for your model. // Embed ai.GenerationCommonConfig for common options. type MyModelConfig struct { ai.GenerationCommonConfig AnotherCustomOption string `json:"anotherCustomOption,omitempty"` CustomOption int `json:"customOption,omitempty"` } // DefineMyModel registers your custom model with Genkit. func DefineMyModel(g *genkit.Genkit) { genkit.DefineModel(g, providerID, "my-model", &ai.ModelInfo{ Label: "My Model", // User-friendly label Supports: &ai.ModelSupports{ Multiturn: true, // Does the model support multi-turn chats? SystemRole: true, // Does the model support system messages? Media: false, // Can the model accept media input? Tools: false, // Does the model support function calling (tools)? 
}, Versions: []string{"my-model-001"}, // List supported versions/aliases }, // The generation function func(ctx context.Context, mr *ai.ModelRequest, cb ai.ModelStreamCallback) (*ai.ModelResponse, error) { // Verify that the request includes a configuration that conforms to your schema. var cfg MyModelConfig if mr.Config != nil { // Attempt to cast the config; handle potential type mismatch if typedCfg, ok := mr.Config.(*MyModelConfig); ok { cfg = *typedCfg } else { // Handle incorrect config type if necessary, or rely on default values // For simplicity, this example proceeds with default cfg if cast fails } } // Now 'cfg' holds the configuration, either from the request or default. // Use your custom logic to convert Genkit's ai.ModelRequest into a form // usable by the model's native API. apiRequest, err := apiRequestFromGenkitRequest(mr, cfg) // Pass config too if err != nil { return nil, fmt.Errorf("failed to create API request: %w", err) } // Send the request to the model API, using your own code or the model // API's client library. apiResponse, err := callModelAPI(ctx, apiRequest) // Pass context if needed if err != nil { return nil, fmt.Errorf("model API call failed: %w", err) } // Use your custom logic to convert the model's response to Genkit's ai.ModelResponse. response, err := genResponseFromAPIResponse(apiResponse) if err != nil { return nil, fmt.Errorf("failed to convert API response: %w", err) } return response, nil }, ) } // Placeholder for the function that converts Genkit request to your API's format func apiRequestFromGenkitRequest(mr *ai.ModelRequest, cfg MyModelConfig) (interface{}, error) { // Implementation depends on your specific model API fmt.Printf("Converting Genkit request with config: %+v\n", cfg) // ... conversion logic ... return "your-api-request-format", nil // Replace with actual request object } // Placeholder for the function that calls your model's API func callModelAPI(ctx context.Context, apiRequest interface{}) (interface{}, error) { // Implementation depends on your specific model API client library // ... API call logic ... return "your-api-response-format", nil // Replace with actual response object } // Placeholder for the function that converts your API's response to Genkit's format func genResponseFromAPIResponse(apiResponse interface{}) (*ai.ModelResponse, error) { // Implementation depends on your specific model API response format // ... conversion logic ... return &ai.ModelResponse{ Candidates: []*ai.Candidate{ { Message: &ai.Message{ Content: []*ai.Part{ai.NewTextPart("Generated response text")}, Role: ai.RoleModel, }, FinishReason: ai.FinishReasonStop, }, }, }, nil // Replace with actual response conversion } // Example Plugin implementation type MyPlugin struct{} func (p *MyPlugin) Name() string { return providerID } func (p *MyPlugin) Init(ctx context.Context, g *genkit.Genkit) error { DefineMyModel(g) // Define other models or resources here return nil } // Ensure MyPlugin implements genkit.Plugin var _ genkit.Plugin = &MyPlugin{} ``` ### Declaring model capabilities [Section titled “Declaring model capabilities”](#declaring-model-capabilities) Every model definition must contain, as part of its metadata, an `ai.ModelInfo` value that declares which features the model supports. Genkit uses this information to determine certain behaviors, such as verifying whether certain inputs are valid for the model. For example, if the model doesn’t support multi-turn interactions, then it’s an error to pass it a message history. 
Note that these declarations refer to the capabilities of the model as provided by your plugin, and do not necessarily map one-to-one to the capabilities of the underlying model and model API. For example, even if the model API doesn’t provide a specific way to define system messages, your plugin might still declare support for the system role, and implement it as special logic that inserts system messages into the user prompt. ### Defining your model’s config schema [Section titled “Defining your model’s config schema”](#defining-your-models-config-schema) To specify the generation options a model supports, define and export a configuration type. Genkit has an `ai.GenerationCommonConfig` type that contains options frequently supported by generative AI model services, which you can embed or use outright. Your generation function should verify that the request contains the correct options type. ### Transforming requests and responses [Section titled “Transforming requests and responses”](#transforming-requests-and-responses) The generation function carries out the primary work of a Genkit model plugin: transforming the `ai.ModelRequest` from Genkit’s common format into a format that is supported by your model’s API, and then transforming the response from your model into the `ai.ModelResponse` format used by Genkit. Sometimes, this may require massaging or manipulating data to work around model limitations. For example, if your model does not natively support a `system` message, you may need to transform a prompt’s system message into a user-model message pair. ## Exports [Section titled “Exports”](#exports) In addition to the resources that all plugins must export, a model plugin should also export the following: * A generation config type, as discussed [earlier](#defining-your-models-config-schema). * A `Model()` function, which returns references to your plugin’s defined models. Often, this can be: ```go func Model(g *genkit.Genkit, name string) *ai.Model { return genkit.LookupModel(g, providerID, name) } ``` * A `ModelRef` function, which creates a model reference paired with its config that can validate the type and be passed around together: ```go func ModelRef(name string, config *MyModelConfig) *ai.ModelRef { return ai.NewModelRef(name, config) } ``` * **Optional**: A `DefineModel()` function, which lets users define models that your plugin can provide, but that you do not automatically define. There are two main reasons why you might want to provide such a function: * Your plugin provides access to too many models to practically register each one. For example, the Ollama plugin can provide access to dozens of different models, with more added frequently. For this reason, it doesn’t automatically define any models, and instead requires the user to call `DefineModel()` for each model they want to use. * To give your users the ability to use newly-released models that you have not yet added to your plugin. A plugin’s `DefineModel()` function is typically a frontend to `genkit.DefineModel()` that defines a generation function, but lets the user specify the model name and model capabilities. # Writing a Genkit telemetry plugin > Learn how to create a Genkit telemetry plugin in Go to export traces, metrics, and logs using OpenTelemetry. The Genkit libraries are instrumented with [OpenTelemetry](http://opentelemetry.io) to support collecting traces, metrics, and logs. 
Genkit users can export this telemetry data to monitoring and visualization tools by installing a plugin that configures the [OpenTelemetry Go SDK](https://opentelemetry.io/docs/languages/go/getting-started/) to export to a particular OpenTelemetry-capable system. Genkit includes a plugin that configures OpenTelemetry to export data to [Google Cloud Monitoring and Cloud Logging](/go/docs/plugins/google-cloud). To support other monitoring systems, you can extend Genkit by writing a telemetry plugin, as described on this page. ## Before you begin [Section titled “Before you begin”](#before-you-begin) Read [Writing Genkit plugins](/go/docs/plugin-authoring) for information about writing any kind of Genkit plugin, including telemetry plugins. In particular, note that every plugin must export an `Init` function, which users are expected to call before using the plugin. ## Exporters and Loggers [Section titled “Exporters and Loggers”](#exporters-and-loggers) As stated earlier, the primary job of a telemetry plugin is to configure OpenTelemetry (which Genkit has already been instrumented with) to export data to a particular service. To do so, you need the following: * An implementation of OpenTelemetry’s [`SpanExporter`](https://pkg.go.dev/go.opentelemetry.io/otel/sdk/trace#SpanExporter) interface that exports data to the service of your choice. * An implementation of OpenTelemetry’s [`metric.Exporter`](https://pkg.go.dev/go.opentelemetry.io/otel/sdk/metric#Exporter) interface that exports data to the service of your choice. * Either a [`slog.Logger`](https://pkg.go.dev/log/slog#Logger) or an implementation of the [`slog.Handler`](https://pkg.go.dev/log/slog#Handler) interface, that exports logs to the service of your choice. Depending on the service you’re interested in exporting to, this might be a relatively minor effort or a large one. Because OpenTelemetry is an industry standard, many monitoring services already have libraries that implement these interfaces. For example, the `googlecloud` plugin for Genkit makes use of the [`opentelemetry-operations-go`](https://github.com/GoogleCloudPlatform/opentelemetry-operations-go) library, maintained by the Google Cloud team. Similarly, many monitoring services provide libraries that implement the standard `slog` interfaces. On the other hand, if no such libraries are available for your service, implementing the necessary interfaces can be a substantial project. Check the [OpenTelemetry registry](https://opentelemetry.io/ecosystem/registry/?component=exporter\&language=go) or the monitoring service’s docs to see if integrations are already available. If you need to build these integrations yourself, take a look at the source of the [official OpenTelemetry exporters](https://github.com/open-telemetry/opentelemetry-go/tree/main/exporters) and the page [A Guide to Writing `slog` Handlers](https://github.com/golang/example/blob/master/slog-handler-guide/README). ## Building the plugin [Section titled “Building the plugin”](#building-the-plugin) ### Dependencies [Section titled “Dependencies”](#dependencies) Every telemetry plugin needs to import the Genkit core library and several OpenTelemetry libraries: ```go // Import the Genkit core library. "github.com/firebase/genkit/go/genkit" // Import the OpenTelemetry libraries. "go.opentelemetry.io/otel" "go.opentelemetry.io/otel/sdk/metric" "go.opentelemetry.io/otel/sdk/trace" ``` If you are building a plugin around an existing OpenTelemetry or `slog` integration, you will also need to import them. 
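To give a sense of how these pieces fit together, here is a minimal sketch of a telemetry plugin’s overall shape. `YourCustomSpanExporter`, `YourCustomMetricExporter`, and `YourCustomHandler` are placeholders for the integrations discussed above, and the configuration and initialization steps are described in detail in the sections that follow:

```go
package mytelemetry

import (
	"context"
	"log/slog"
	"os"

	"github.com/firebase/genkit/go/genkit"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/sdk/metric"
	"go.opentelemetry.io/otel/sdk/trace"
)

// Telemetry holds the plugin configuration (see the Config section below).
type Telemetry struct {
	Config Config
}

// Init configures OpenTelemetry and slog to export to your service.
func (t *Telemetry) Init(ctx context.Context, g *genkit.Genkit) error {
	// Skip exporting in the dev environment unless the user forces it.
	if !t.Config.ForceExport && os.Getenv("GENKIT_ENV") == "dev" {
		return nil
	}

	// Export trace spans through your custom span exporter (placeholder type).
	genkit.RegisterSpanProcessor(g, trace.NewBatchSpanProcessor(YourCustomSpanExporter{}))

	// Export metrics through your custom metric exporter (placeholder type).
	reader := metric.NewPeriodicReader(YourCustomMetricExporter{},
		metric.WithInterval(t.Config.MetricInterval))
	otel.SetMeterProvider(metric.NewMeterProvider(metric.WithReader(reader)))

	// Route logs through your custom slog handler (placeholder type).
	slog.SetDefault(slog.New(YourCustomHandler{
		Options: &slog.HandlerOptions{Level: t.Config.LogLevel},
	}))
	return nil
}
```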
### `Config` [Section titled “Config”](#config) A telemetry plugin should, at a minimum, support the following configuration options: ```go type Config struct { // Export even in the dev environment. ForceExport bool // The interval for exporting metric data. // The default is 60 seconds. MetricInterval time.Duration // The minimum level at which logs will be written. // Defaults to [slog.LevelInfo]. LogLevel slog.Leveler } ``` The examples that follow assume you are making these options available and will provide some guidance on how to handle them. Most plugins will also include configuration settings for the service they export to (API key, project name, and so on). ### `Init()` [Section titled “Init()”](#init) The `Init()` function of a telemetry plugin should do all of the following: * Return early if Genkit is running in a development environment (such as when running with `genkit start`) and the `Config.ForceExport` option isn’t set: ```go shouldExport := cfg.ForceExport || os.Getenv("GENKIT_ENV") != "dev" if !shouldExport { return nil } ``` * Initialize your trace span exporter and register it with Genkit: ```go spanProcessor := trace.NewBatchSpanProcessor(YourCustomSpanExporter{}) genkit.RegisterSpanProcessor(g, spanProcessor) ``` * Initialize your metric exporter and register it with the OpenTelemetry library: ```go r := metric.NewPeriodicReader( YourCustomMetricExporter{}, metric.WithInterval(cfg.MetricInterval), ) mp := metric.NewMeterProvider(metric.WithReader(r)) otel.SetMeterProvider(mp) ``` Use the user-configured collection interval (`Config.MetricInterval`) when initializing the `PeriodicReader`. * Register your `slog` handler as the default logger: ```go logger := slog.New(YourCustomHandler{ Options: &slog.HandlerOptions{Level: cfg.LogLevel}, }) slog.SetDefault(logger) ``` You should configure your handler to honor the user-specified minimum log level (`Config.LogLevel`). ### PII redaction [Section titled “PII redaction”](#pii-redaction) Because most generative AI flows begin with user input of some kind, it is likely that some flow traces contain personally-identifiable information (PII). To protect your users’ information, you should redact PII from traces before you export them. If you are building your own span exporter, you can build this functionality into it. If you’re building your plugin around an existing OpenTelemetry integration, you can wrap the provided span exporter with a custom exporter that carries out this task. For example, the `googlecloud` plugin removes the `genkit:input` and `genkit:output` attributes from every span before exporting them, using a wrapper similar to the following: ```go type redactingSpanExporter struct { trace.SpanExporter } func (e *redactingSpanExporter) ExportSpans(ctx context.Context, spanData []trace.ReadOnlySpan) error { var redacted []trace.ReadOnlySpan for _, s := range spanData { redacted = append(redacted, redactedSpan{s}) } return e.SpanExporter.ExportSpans(ctx, redacted) } func (e *redactingSpanExporter) Shutdown(ctx context.Context) error { return e.SpanExporter.Shutdown(ctx) } type redactedSpan struct { trace.ReadOnlySpan } func (s redactedSpan) Attributes() []attribute.KeyValue { // Omit input and output, which may contain PII.
var ts []attribute.KeyValue for _, a := range s.ReadOnlySpan.Attributes() { if a.Key == "genkit:input" || a.Key == "genkit:output" { continue } ts = append(ts, a) } return ts } ``` ## Troubleshooting [Section titled “Troubleshooting”](#troubleshooting) If you’re having trouble getting data to show up where you expect, OpenTelemetry provides a useful [diagnostic tool](https://opentelemetry.io/docs/languages/js/getting-started/nodejs/#troubleshooting) that helps locate the source of the problem. # AlloyDB plugin > Learn how to configure and use the AlloyDB plugin as a retriever implementation in Genkit Go. The AlloyDB plugin provides the retriever implementation to search an [AlloyDB](https://cloud.google.com/alloydb/docs) database using the [pgvector](https://github.com/pgvector/pgvector) extension. ## Configuration [Section titled “Configuration”](#configuration) To use this plugin, follow these steps: 1. Import the plugin ```go import "github.com/firebase/genkit/go/plugins/alloydb" ``` 2. Create a `PostgresEngine` instance: * Using basic authentication ```go pEngine, err := alloydb.NewPostgresEngine(ctx, alloydb.WithUser("user"), alloydb.WithPassword("password"), alloydb.WithAlloyDBInstance("my-project", "us-central1", "my-cluster", "my-instance"), alloydb.WithDatabase("my-database")) ``` * Using email authentication ```go pEngine, err := alloydb.NewPostgresEngine(ctx, alloydb.WithAlloyDBInstance("my-project", "us-central1", "my-cluster", "my-instance"), alloydb.WithDatabase("my-database"), alloydb.WithIAMAccountEmail("mail@company.com")) ``` * Using a custom pool ```go pool, err := pgxpool.New(ctx, "add_your_connection_string") if err != nil { return err } pEngine, err := alloydb.NewPostgresEngine(ctx, alloydb.WithDatabase("db_test"), alloydb.WithPool(pool)) ``` 3. Create the Postgres plugin * Using the plugin’s `Init()` method ```go postgres := &alloydb.Postgres{ engine: pEngine, } if err := (postgres).Init(ctx, g); err != nil { return err } ``` * Using `genkit.Init()` ```go postgres := &alloydb.Postgres{ engine: pEngine, } g, err := genkit.Init(ctx, genkit.WithPlugins(postgres)) if err != nil { return err } ``` ## Usage [Section titled “Usage”](#usage) To add documents to an AlloyDB index, first create an index definition that specifies the features of the table: ```go cfg := &alloydb.Config{ TableName: "documents", SchemaName: "public", ContentColumn: "content", EmbeddingColumn: "embedding", MetadataColumns: []string{"source", "category"}, IDColumn: "custom_id", MetadataJSONColumn: "custom_metadata", Embedder: embedder, EmbedderOptions: nil, } doc, retriever, err := alloydb.DefineRetriever(ctx, g, postgres, cfg) if err != nil { return err } docs := []*ai.Document{{ Content: []*ai.Part{{ Kind: ai.PartText, ContentType: "text/plain", Text: "The product features include...", }}, Metadata: map[string]any{"source": "website", "category": "product-docs", "custom_id": "doc-123"}, }} if err := doc.Index(ctx, docs); err != nil { return err } ``` Similarly, to retrieve documents from an index, use the retriever method: ```go doc, retriever, err := alloydb.DefineRetriever(ctx, g, postgres, cfg) if err != nil { return err } d2 := ai.DocumentFromText("The product features include...", nil) resp, err := retriever.Retrieve(ctx, &ai.RetrieverRequest{ Query: d2, k: 5, filter: "source='website' AND category='product-docs'" }) if err != nil { return err } ``` It’s also possible to call `ai.Retrieve()` with the retriever and retriever options: ```go d2 := ai.DocumentFromText("The product features include..."
, nil) retrieverOptions := &alloydb.RetrieverOptions{ k: 5, filter: "source='website' AND category='product-docs'" } resp, err := ai.Retrieve(ctx, retriever, ai.WithDocs(d2), ai.WithConfig(retrieverOptions)) if err != nil { return err } ``` See the [Retrieval-augmented generation](/go/docs/rag) page for a general discussion on using retrievers for RAG. # Cloud SQL for PostgreSQL plugin > Learn how to configure and use the PostgreSQL plugin as a retriever implementation in Genkit Go. The PostgreSQL plugin provides a retriever implementation to search a [Cloud SQL for PostgreSQL](https://cloud.google.com/sql/docs/postgres) database using the [pgvector](https://github.com/pgvector/pgvector) extension. ## Configuration [Section titled “Configuration”](#configuration) To use this plugin, follow these steps: 1. Import the plugin ```go import "github.com/firebase/genkit/go/plugins/postgresql" ``` 2. Create a `PostgresEngine` instance: * Using basic authentication ```go pEngine, err := postgresql.NewPostgresEngine(ctx, postgresql.WithUser("user"), postgresql.WithPassword("password"), postgresql.WithCloudSQLInstance("my-project", "us-central1", "my-instance"), postgresql.WithDatabase("my-database")) ``` * Using email authentication ```go pEngine, err := postgresql.NewPostgresEngine(ctx, postgresql.WithCloudSQLInstance("my-project", "us-central1", "my-instance"), postgresql.WithDatabase("my-database"), postgresql.WithIAMAccountEmail("mail@company.com")) ``` * Using a custom pool ```go pool, err := pgxpool.New(ctx, "add_your_connection_string") if err != nil { return err } pEngine, err := postgresql.NewPostgresEngine(ctx, postgresql.WithDatabase("db_test"), postgresql.WithPool(pool)) ``` 3. Create the Postgres plugin * Using the plugin’s `Init()` method ```go postgres := &postgresql.Postgres{ engine: pEngine, } if err := (postgres).Init(ctx, g); err != nil { return err } ``` * Using `genkit.Init()` ```go postgres := &postgresql.Postgres{ engine: pEngine, } g, err := genkit.Init(ctx, genkit.WithPlugins(postgres)) if err != nil { return err } ``` ## Usage [Section titled “Usage”](#usage) To add documents to a PostgreSQL index, first create a retriever definition that specifies the features of the table: ```go cfg := &postgresql.Config{ TableName: "documents", SchemaName: "public", ContentColumn: "content", EmbeddingColumn: "embedding", MetadataColumns: []string{"source", "category"}, IDColumn: "custom_id", MetadataJSONColumn: "custom_metadata", Embedder: embedder, EmbedderOptions: nil, } doc, retriever, err := postgresql.DefineRetriever(ctx, g, postgres, cfg) if err != nil { return err } docs := []*ai.Document{{ Content: []*ai.Part{{ Kind: ai.PartText, ContentType: "text/plain", Text: "The product features include...", }}, Metadata: map[string]any{"source": "website", "category": "product-docs", "custom_id": "doc-123"}, }} if err := doc.Index(ctx, docs); err != nil { return err } ``` Similarly, to retrieve documents from an index, use the retriever method: ```go d2 := ai.DocumentFromText("The product features include...", nil) resp, err := retriever.Retrieve(ctx, &ai.RetrieverRequest{ Query: d2, k: 5, filter: "source='website' AND category='product-docs'" }) if err != nil { return err } ``` It’s also possible to call `ai.Retrieve()` with the retriever and retriever options: ```go _, retriever, err := postgresql.DefineRetriever(ctx, g, postgres, cfg) if err != nil { return err } d2 := ai.DocumentFromText("The product features include..."
, nil) retrieverOptions := &postgresql.RetrieverOptions{ k:5, filter: "source='website' AND category='product-docs'" } resp, err := ai.Retrieve(ctx, retriever,ai.WithDocs(d2), &ai.WithConfig(retrieverOptions)) if err != nil { return err } ``` See the [Retrieval-augmented generation](/go/docs/rag) page for a general discussion on using retrievers for RAG. # Firebase plugin > Learn how to configure and use the Genkit Firebase plugin for Go to integrate with Firebase services including Firestore for RAG applications. The Firebase plugin provides integration with Firebase services for Genkit applications. It enables you to use Firebase Firestore as a vector database for retrieval-augmented generation (RAG) applications by defining retrievers. ## Prerequisites [Section titled “Prerequisites”](#prerequisites) This plugin requires: * A Firebase project - Create one at the [Firebase Console](https://console.firebase.google.com/) * Firestore database enabled in your Firebase project * Firebase credentials configured for your application ### Firebase Setup [Section titled “Firebase Setup”](#firebase-setup) 1. **Create a Firebase project** at [Firebase Console](https://console.firebase.google.com/) 2. **Enable Firestore** in your project: * Go to Firestore Database in the Firebase console * Click “Create database” * Choose your security rules and location 3. **Set up authentication** using one of these methods: * For local development: `firebase login` and `firebase use ` * For production: Service account key or Application Default Credentials ## Configuration [Section titled “Configuration”](#configuration) ### Basic Configuration [Section titled “Basic Configuration”](#basic-configuration) To use this plugin, import the `firebase` package and initialize it with your project: ```go import "github.com/firebase/genkit/go/plugins/firebase" ``` ```go // Option 1: Using project ID (recommended) firebasePlugin := &firebase.Firebase{ ProjectId: "your-firebase-project-id", } g, err := genkit.Init(context.Background(), genkit.WithPlugins(firebasePlugin)) if err != nil { log.Fatal(err) } ``` ### Environment Variable Configuration [Section titled “Environment Variable Configuration”](#environment-variable-configuration) You can also configure the project ID using environment variables: ```bash export FIREBASE_PROJECT_ID=your-firebase-project-id ``` ```go // Plugin will automatically use FIREBASE_PROJECT_ID environment variable firebasePlugin := &firebase.Firebase{} g, err := genkit.Init(context.Background(), genkit.WithPlugins(firebasePlugin)) ``` ### Advanced Configuration [Section titled “Advanced Configuration”](#advanced-configuration) For advanced use cases, you can provide a pre-configured Firebase app: ```go import firebasev4 "firebase.google.com/go/v4" // Create Firebase app with custom configuration app, err := firebasev4.NewApp(ctx, &firebasev4.Config{ ProjectID: "your-project-id", // Additional Firebase configuration options }) if err != nil { log.Fatal(err) } firebasePlugin := &firebase.Firebase{ App: app, } ``` ## Usage [Section titled “Usage”](#usage) ### Defining Firestore Retrievers [Section titled “Defining Firestore Retrievers”](#defining-firestore-retrievers) The primary use case for the Firebase plugin is creating retrievers for RAG applications: ```go // Define a Firestore retriever retrieverOptions := firebase.RetrieverOptions{ Name: "my-documents", Collection: "documents", VectorField: "embedding", EmbedderName: "text-embedding-3-small", TopK: 10, } retriever, err := 
firebase.DefineRetriever(ctx, g, retrieverOptions) if err != nil { log.Fatal(err) } ``` ### Using Retrievers in RAG Workflows [Section titled “Using Retrievers in RAG Workflows”](#using-retrievers-in-rag-workflows) Once defined, you can use the retriever in your RAG workflows: ```go // Retrieve relevant documents results, err := ai.Retrieve(ctx, retriever, ai.WithDocs("What is machine learning?")) if err != nil { log.Fatal(err) } // Use retrieved documents in generation var contextDocs []string for _, doc := range results.Documents { contextDocs = append(contextDocs, doc.Content[0].Text) } context := strings.Join(contextDocs, "\n\n") resp, err := genkit.Generate(ctx, g, ai.WithPrompt(fmt.Sprintf("Context: %s\n\nQuestion: %s", context, "What is machine learning?")), ) ``` ### Complete RAG Example [Section titled “Complete RAG Example”](#complete-rag-example) Here’s a complete example showing how to set up a RAG system with Firebase: ```go package main import ( "context" "fmt" "log" "strings" "github.com/firebase/genkit/go/ai" "github.com/firebase/genkit/go/genkit" "github.com/firebase/genkit/go/plugins/firebase" "github.com/firebase/genkit/go/plugins/compat_oai/openai" ) func main() { ctx := context.Background() // Initialize plugins firebasePlugin := &firebase.Firebase{ ProjectId: "my-firebase-project", } openaiPlugin := &openai.OpenAI{ APIKey: "your-openai-api-key", } g, err := genkit.Init(ctx, genkit.WithPlugins(firebasePlugin, openaiPlugin)) if err != nil { log.Fatal(err) } // Define retriever for knowledge base retriever, err := firebase.DefineRetriever(ctx, g, firebase.RetrieverOptions{ Name: "knowledge-base", Collection: "documents", VectorField: "embedding", EmbedderName: "text-embedding-3-small", TopK: 5, }) if err != nil { log.Fatal(err) } // RAG query function query := "How does machine learning work?" 
// Step 1: Retrieve relevant documents retrievalResults, err := ai.Retrieve(ctx, retriever, ai.WithDocs(query)) if err != nil { log.Fatal(err) } // Step 2: Prepare context from retrieved documents var contextParts []string for _, doc := range retrievalResults.Documents { contextParts = append(contextParts, doc.Content[0].Text) } context := strings.Join(contextParts, "\n\n") // Step 3: Generate answer with context model := openaiPlugin.Model(g, "gpt-4o") response, err := genkit.Generate(ctx, g, ai.WithModel(model), ai.WithPrompt(fmt.Sprintf(` Based on the following context, answer the question: Context: %s Question: %s Answer:`, context, query)), ) if err != nil { log.Fatal(err) } fmt.Printf("Answer: %s\n", response.Text()) } ``` ## Firestore Data Structure [Section titled “Firestore Data Structure”](#firestore-data-structure) ### Document Storage Format [Section titled “Document Storage Format”](#document-storage-format) Your Firestore documents should follow this structure for optimal retrieval: ```json { "content": "Your document text content here...", "embedding": [0.1, -0.2, 0.3, ...], "metadata": { "title": "Document Title", "author": "Author Name", "category": "Technology", "timestamp": "2024-01-15T10:30:00Z" } } ``` ### Indexing Documents [Section titled “Indexing Documents”](#indexing-documents) To add documents to your Firestore collection with embeddings: ```go // Example of adding documents with embeddings embedder := openaiPlugin.Embedder(g, "text-embedding-3-small") documents := []struct { Content string Metadata map[string]interface{} }{ { Content: "Machine learning is a subset of artificial intelligence...", Metadata: map[string]interface{}{ "title": "Introduction to ML", "category": "Technology", }, }, // More documents... } for _, doc := range documents { // Generate embedding embeddingResp, err := ai.Embed(ctx, embedder, ai.WithDocs(doc.Content)) if err != nil { log.Fatal(err) } // Store in Firestore firestoreClient, _ := firebasePlugin.App.Firestore(ctx) _, err = firestoreClient.Collection("documents").Doc().Set(ctx, map[string]interface{}{ "content": doc.Content, "embedding": embeddingResp.Embeddings[0].Embedding, "metadata": doc.Metadata, }) if err != nil { log.Fatal(err) } } ``` ## Configuration Options [Section titled “Configuration Options”](#configuration-options) ### Firebase struct [Section titled “Firebase struct”](#firebase-struct) ```go type Firebase struct { // ProjectId is your Firebase project ID // If empty, uses FIREBASE_PROJECT_ID environment variable ProjectId string // App is a pre-configured Firebase app instance // Use either ProjectId or App, not both App *firebasev4.App } ``` ### RetrieverOptions [Section titled “RetrieverOptions”](#retrieveroptions) ```go type RetrieverOptions struct { // Name is a unique identifier for the retriever Name string // Collection is the Firestore collection name containing documents Collection string // VectorField is the field name containing the embedding vectors VectorField string // EmbedderName is the name of the embedder to use for query vectorization EmbedderName string // TopK is the number of top similar documents to retrieve TopK int // Additional filtering and configuration options } ``` ## Authentication [Section titled “Authentication”](#authentication) ### Local Development [Section titled “Local Development”](#local-development) For local development, use the Firebase CLI: ```bash # Install Firebase CLI npm install -g firebase-tools # Login and set project firebase login firebase use your-project-id ``` 
### Production Deployment [Section titled “Production Deployment”](#production-deployment) For production, use one of these authentication methods: #### Service Account Key [Section titled “Service Account Key”](#service-account-key) ```go import "google.golang.org/api/option" app, err := firebasev4.NewApp(ctx, &firebasev4.Config{ ProjectID: "your-project-id", }, option.WithCredentialsFile("path/to/serviceAccountKey.json")) ``` #### Application Default Credentials [Section titled “Application Default Credentials”](#application-default-credentials) Set the environment variable: ```bash export GOOGLE_APPLICATION_CREDENTIALS="path/to/serviceAccountKey.json" ``` Or use the metadata server on Google Cloud Platform. ## Error Handling [Section titled “Error Handling”](#error-handling) Handle Firebase-specific errors appropriately: ```go retriever, err := firebase.DefineRetriever(ctx, g, options) if err != nil { if strings.Contains(err.Error(), "plugin not found") { log.Fatal("Firebase plugin not initialized. Make sure to include it in genkit.Init()") } log.Fatalf("Failed to create retriever: %v", err) } // Handle retrieval errors results, err := ai.Retrieve(ctx, retriever, ai.WithDocs(query)) if err != nil { log.Printf("Retrieval failed: %v", err) // Implement fallback logic } ``` ## Best Practices [Section titled “Best Practices”](#best-practices) ### Performance Optimization [Section titled “Performance Optimization”](#performance-optimization) * **Batch Operations**: Use Firestore batch writes when adding multiple documents * **Index Configuration**: Set up appropriate Firestore indexes for your queries * **Caching**: Implement caching for frequently accessed documents * **Pagination**: Use pagination for large result sets ### Security [Section titled “Security”](#security) * **Firestore Rules**: Configure proper security rules for your collections * **API Keys**: Never expose Firebase configuration in client-side code * **Authentication**: Implement proper user authentication for sensitive data ### Cost Management [Section titled “Cost Management”](#cost-management) * **Document Size**: Keep documents reasonably sized to minimize read costs * **Query Optimization**: Design efficient queries to reduce operation costs * **Storage Management**: Regularly clean up unused documents and embeddings ## Integration Examples [Section titled “Integration Examples”](#integration-examples) ### With Multiple Embedders [Section titled “With Multiple Embedders”](#with-multiple-embedders) ```go // Use different embedders for different types of content technicalRetriever, err := firebase.DefineRetriever(ctx, g, firebase.RetrieverOptions{ Name: "technical-docs", Collection: "technical_documents", VectorField: "embedding", EmbedderName: "text-embedding-3-large", // More accurate for technical content TopK: 5, }) generalRetriever, err := firebase.DefineRetriever(ctx, g, firebase.RetrieverOptions{ Name: "general-knowledge", Collection: "general_documents", VectorField: "embedding", EmbedderName: "text-embedding-3-small", // Faster for general content TopK: 10, }) ``` ### With Flows [Section titled “With Flows”](#with-flows) ```go ragFlow := genkit.DefineFlow(g, "rag-qa", func(ctx context.Context, query string) (string, error) { // Retrieve context results, err := ai.Retrieve(ctx, retriever, ai.WithDocs(query)) if err != nil { return "", err } // Generate response response, err := genkit.Generate(ctx, g, ai.WithPrompt(buildPromptWithContext(query, results)), ) if err != nil { return "", err } return 
response.Text(), nil }) ``` # Google Cloud telemetry and logging plugin > Learn how to configure the Genkit Google Cloud plugin to export telemetry and logs to Cloud's operations suite for Go applications. The Google Cloud plugin exports Genkit’s telemetry and logging data to [Google Cloud’s operation suite](https://cloud.google.com/products/operations). Note: Logging is facilitated by the `slog` package in favor of the [OpenTelemetry](https://opentelemetry.io/) logging APIs. Export of logs is done via a dedicated Google Cloud exporter. ## Prerequisites [Section titled “Prerequisites”](#prerequisites) If you want to locally run flows that use this plugin, you need the [Google Cloud CLI tool](https://cloud.google.com/sdk/docs/install) installed. ## Set up a Google Cloud account [Section titled “Set up a Google Cloud account”](#set-up-a-google-cloud-account) This plugin requires a Google Cloud account ([sign up](https://cloud.google.com/gcp) if you don’t already have one) and a Google Cloud project. Prior to adding the plugin, make sure that the following APIs are enabled for your project: * [Cloud Logging API](https://console.cloud.google.com/apis/library/logging.googleapis.com) * [Cloud Trace API](https://console.cloud.google.com/apis/library/cloudtrace.googleapis.com) * [Cloud Monitoring API](https://console.cloud.google.com/apis/library/monitoring.googleapis.com) These APIs should be listed in the [API dashboard](https://console.cloud.google.com/apis/dashboard) for your project. Click [here](https://support.google.com/googleapi/answer/6158841) to learn more about enabling and disabling APIs. ## Configuration [Section titled “Configuration”](#configuration) To enable exporting to Google Cloud Tracing, Logging, and Monitoring, import the `googlecloud` package and run `Init()`. After calling `Init()`, your telemetry gets automatically exported. ```go import "github.com/firebase/genkit/go/plugins/googlecloud" ``` ```go if err := (&googlecloud.GoogleCloud{ProjectID: "your-google-cloud-project"}).Init(ctx, g); err != nil { return err } ``` You must specify the Google Cloud project to which you want to export telemetry data. There are also some optional parameters: * `ProjectID`: (Required) Your Google Cloud project ID. * `ForceExport`: Export telemetry data even when running in a dev environment (such as when using `genkit start` or `genkit flow:run`). This is a quick way to test your integration and send your first events for monitoring in Google Cloud. If you use this option, you also need to make your Cloud credentials available locally: ```bash gcloud auth application-default login ``` * `MetricInterval`: The interval (`time.Duration`) at which to export telemetry information. By default, this is 60 seconds. * `LogLevel`: The minimum severity level (`slog.Level`) of log entries to export. By default, `slog.LevelInfo`. The plugin requires your Google Cloud project credentials. If you’re running your flows from a Google Cloud environment (Cloud Run, etc), the credentials are set automatically. Running in other environments requires setting up [Application Default Credentials](https://cloud.google.com/docs/authentication/provide-credentials-adc). ## Production monitoring via Google Cloud’s operations suite [Section titled “Production monitoring via Google Cloud’s operations suite”](#production-monitoring-via-google-clouds-operations-suite) Once a flow is deployed, navigate to [Google Cloud’s operations suite](https://console.cloud.google.com/) and select your project. 
![Google Cloud Operations Suite dashboard](/_astro/cloud-ops-suite.nuxcL9JP_Z145aJ1.webp) ### Logs and traces [Section titled “Logs and traces”](#logs-and-traces) From the side menu, find ‘Logging’ and click ‘Logs explorer’. ![Logs Explorer menu item in Cloud Logging](/_astro/cloud-ops-logs-explorer-menu.BEWdhThV_ZD7rzA.webp) You will see all logs that are associated with your deployed flow, including `console.log()`. Any log which has the prefix `[genkit]` is a Genkit-internal log that contains information that may be interesting for debugging purposes. For example, Genkit logs in the format `Config[...]` contain metadata such as the temperature and topK values for specific LLM inferences. Logs in the format `Output[...]` contain LLM responses while `Input[...]` logs contain the prompts. Cloud Logging has robust ACLs that allow fine grained control over sensitive logs. > Note: Prompts and LLM responses are redacted from trace attributes in Cloud Trace. For specific log lines, it is possible to navigate to their respective traces by clicking on the extended menu ![Log line menu icon](/_astro/cloud-ops-log-menu-icon.4Dim4J5G_1v5wgo.webp) icon and selecting “View in trace details”. ![View in trace details option in log menu](/_astro/cloud-ops-view-trace-details.CBSAOREC_R8kiS.webp) This will bring up a trace preview pane providing a quick glance of the details of the trace. To get to the full details, click the “View in Trace” link at the top right of the pane. ![View in Trace link in trace preview pane](/_astro/cloud-ops-view-in-trace.BKg24a3E_1wNRAe.webp) The most prominent navigation element in Cloud Trace is the trace scatter plot. It contains all collected traces in a given time span. ![Cloud Trace scatter plot](/_astro/cloud-ops-trace-graph.CeQ28Xfh_15Q4Se.webp) Clicking on each data point will show its details below the scatter plot. ![Cloud Trace details view](/_astro/cloud-ops-trace-view.B7au5dRz_moJCv.webp) The detailed view contains the flow shape, including all steps, and important timing information. Cloud Trace has the ability to interleave all logs associated with a given trace within this view. Select the “Show expanded” option in the “Logs & events” drop down. ![Show expanded option in Logs \& events dropdown](/_astro/cloud-ops-show-expanded.CIo-3v3F_CudcV.webp) The resultant view allows detailed examination of logs in the context of the trace, including prompts and LLM responses. ![Trace details view with expanded logs](/_astro/cloud-ops-output-logs.Db6riJri_218yQN.webp) ### Metrics [Section titled “Metrics”](#metrics) Viewing all metrics that Genkit exports can be done by selecting “Logging” from the side menu and clicking on “Metrics management”. ![Metrics Management menu item in Cloud Logging](/_astro/cloud-ops-metrics-mgmt.DBcXxge4_Z15Hmgw.webp) The metrics management console contains a tabular view of all collected metrics, including those that pertain to Cloud Run and its surrounding environment. Clicking on the ‘Workload’ option will reveal a list that includes Genkit-collected metrics. Any metric with the `genkit` prefix constitutes an internal Genkit metric. ![Metrics table showing Genkit metrics](/_astro/cloud-ops-metrics-table.ByJ1DRTl_Z19rWai.webp) Genkit collects several categories of metrics, including flow-level, action-level, and generate-level metrics. Each metric has several useful dimensions facilitating robust filtering and grouping. Common dimensions include: * `flow_name` - the top-level name of the flow. 
* `flow_path` - the span and its parent span chain up to the root span. * `error_code` - in case of an error, the corresponding error code. * `error_message` - in case of an error, the corresponding error message. * `model` - the name of the model. * `temperature` - the inference temperature [value](https://ai.google.dev/docs/concepts#model-parameters). * `topK` - the inference topK [value](https://ai.google.dev/docs/concepts#model-parameters). * `topP` - the inference topP [value](https://ai.google.dev/docs/concepts#model-parameters). #### Flow-level metrics [Section titled “Flow-level metrics”](#flow-level-metrics) | Name | Dimensions | | -------------------- | --------------------------------------- | | genkit/flow/requests | flow\_name, error\_code, error\_message | | genkit/flow/latency | flow\_name | #### Action-level metrics [Section titled “Action-level metrics”](#action-level-metrics) | Name | Dimensions | | ---------------------- | --------------------------------------- | | genkit/action/requests | flow\_name, error\_code, error\_message | | genkit/action/latency | flow\_name | #### Generate-level metrics [Section titled “Generate-level metrics”](#generate-level-metrics) | Name | Dimensions | | ------------------------------------- | ----------------------------------------------------------------------- | | genkit/ai/generate | flow\_path, model, temperature, topK, topP, error\_code, error\_message | | genkit/ai/generate/input\_tokens | flow\_path, model, temperature, topK, topP | | genkit/ai/generate/output\_tokens | flow\_path, model, temperature, topK, topP | | genkit/ai/generate/input\_characters | flow\_path, model, temperature, topK, topP | | genkit/ai/generate/output\_characters | flow\_path, model, temperature, topK, topP | | genkit/ai/generate/input\_images | flow\_path, model, temperature, topK, topP | | genkit/ai/generate/output\_images | flow\_path, model, temperature, topK, topP | | genkit/ai/generate/latency | flow\_path, model, temperature, topK, topP, error\_code, error\_message | Visualizing metrics can be done through the Metrics Explorer. Using the side menu, select ‘Logging’ and click ‘Metrics explorer’ ![Metrics Explorer menu item in Cloud Logging](/_astro/cloud-ops-metrics-explorer.hSpZPXYP_MoOA.webp) Select a metrics by clicking on the “Select a metric” dropdown, selecting ‘Generic Node’, ‘Genkit’, and a metric. ![Selecting a Genkit metric in Metrics Explorer](/_astro/cloud-ops-metrics-generic-node.ZUgtOJ4W_Z16qeM.webp) The visualization of the metric will depend on its type (counter, histogram, etc). The Metrics Explorer provides robust aggregation and querying facilities to help graph metrics by their various dimensions. ![Metrics Explorer showing a Genkit metric graph](/_astro/cloud-ops-metrics-metric.B0GvHb0j_bbryX.webp) ## Telemetry Delay [Section titled “Telemetry Delay”](#telemetry-delay) There may be a slight delay before telemetry for a particular execution of a flow is displayed in Cloud’s operations suite. In most cases, this delay is under 1 minute. 
## Quotas and limits [Section titled “Quotas and limits”](#quotas-and-limits) There are several quotas that are important to keep in mind: * [Cloud Trace Quotas](http://cloud.google.com/trace/docs/quotas) * 128 bytes per attribute key * 256 bytes per attribute value * [Cloud Logging Quotas](http://cloud.google.com/logging/quotas) * 256 KB per log entry * [Cloud Monitoring Quotas](http://cloud.google.com/monitoring/quotas) ## Cost [Section titled “Cost”](#cost) Cloud Logging, Cloud Trace, and Cloud Monitoring have generous free tiers. Specific pricing can be found at the following links: * [Cloud Logging Pricing](http://cloud.google.com/stackdriver/pricing#google-cloud-observability-pricing) * [Cloud Trace Pricing](https://cloud.google.com/trace#pricing) * [Cloud Monitoring Pricing](https://cloud.google.com/stackdriver/pricing#monitoring-pricing-summary) # Google Generative AI plugin > Learn how to configure and use the Genkit Google Generative AI plugin for Go to access Gemini models via the Gemini API or Vertex AI API. The Google Generative AI plugin provides interfaces to Google’s Gemini models through either the Gemini API or the Vertex AI Gemini API. ## Configuration [Section titled “Configuration”](#configuration) The configuration depends on which provider you choose: ### Google AI [Section titled “Google AI”](#google-ai) To use this plugin, import the `googlegenai` package and pass `googlegenai.GoogleAI` to `WithPlugins()` in the Genkit initializer: ```go import "github.com/firebase/genkit/go/plugins/googlegenai" ``` ```go g, err := genkit.Init(context.Background(), genkit.WithPlugins(&googlegenai.GoogleAI{})) ``` The plugin requires an API key for the Gemini API, which you can get from [Google AI Studio](https://aistudio.google.com/app/apikey). Configure the plugin to use your API key by doing one of the following: * Set the `GEMINI_API_KEY` environment variable to your API key. * Specify the API key when you initialize the plugin: ```go genkit.WithPlugins(&googlegenai.GoogleAI{APIKey: "YOUR_API_KEY"}) ``` However, don’t embed your API key directly in code! Use this feature only in conjunction with a service like Cloud Secret Manager or similar. ### Vertex AI [Section titled “Vertex AI”](#vertex-ai) To use this plugin, import the `googlegenai` package and pass `googlegenai.VertexAI` to `WithPlugins()` in the Genkit initializer: ```go import "github.com/firebase/genkit/go/plugins/googlegenai" ``` ```go g, err := genkit.Init(context.Background(), genkit.WithPlugins(&googlegenai.VertexAI{})) ``` The plugin requires you to specify your Google Cloud project ID, the [region](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations) to which you want to make Vertex API requests, and your Google Cloud project credentials. * By default, `googlegenai.VertexAI` gets your Google Cloud project ID from the `GOOGLE_CLOUD_PROJECT` environment variable. You can also pass this value directly: ```go genkit.WithPlugins(&googlegenai.VertexAI{ProjectID: "my-project-id"}) ``` * By default, `googlegenai.VertexAI` gets the Vertex AI API location from the `GOOGLE_CLOUD_LOCATION` environment variable. You can also pass this value directly: ```go genkit.WithPlugins(&googlegenai.VertexAI{Location: "us-central1"}) ``` * To provide API credentials, you need to set up Google Cloud Application Default Credentials. 1. To specify your credentials: * If you’re running your flow from a Google Cloud environment (Cloud Functions, Cloud Run, and so on), this is set automatically.
* On your local dev environment, do this by running: ```shell gcloud auth application-default login ``` * For other environments, see the [Application Default Credentials](https://cloud.google.com/docs/authentication/provide-credentials-adc) docs. 2. In addition, make sure the account is granted the Vertex AI User IAM role (`roles/aiplatform.user`). See the Vertex AI [access control](https://cloud.google.com/vertex-ai/generative-ai/docs/access-control) docs. ## Usage [Section titled “Usage”](#usage) ### Generative models [Section titled “Generative models”](#generative-models) To get a reference to a supported model, specify its identifier to either `googlegenai.GoogleAIModel` or `googlegenai.VertexAIModel`: ```go model := googlegenai.GoogleAIModel(g, "gemini-2.5-flash") ``` Alternatively, you may create a `ModelRef` which pairs the model name with its config: ```go modelRef := googlegenai.GoogleAIModelRef("gemini-2.5-flash", &googlegenai.GeminiConfig{ Temperature: 0.5, MaxOutputTokens: 500, // Other configuration... }) ``` The following models are supported: `gemini-1.5-pro`, `gemini-1.5-flash`, `gemini-2.0-pro`, `gemini-2.5-flash`, and other experimental models. Pass a model reference to `genkit.Generate()` to call the Google API: ```go resp, err := genkit.Generate(ctx, g, ai.WithModel(modelRef), ai.WithPrompt("Tell me a joke.")) if err != nil { return err } log.Println(resp.Text()) ``` See [Generating content with AI models](/go/docs/models) for more information. ### Embedding models [Section titled “Embedding models”](#embedding-models) To get a reference to a supported embedding model, specify its identifier to either `googlegenai.GoogleAIEmbedder` or `googlegenai.VertexAIEmbedder`: ```go embeddingModel := googlegenai.GoogleAIEmbedder(g, "text-embedding-004") ``` The following models are supported: * **Google AI** `text-embedding-004` and `embedding-001` * **Vertex AI** `textembedding-gecko@003`, `textembedding-gecko@002`, `textembedding-gecko@001`, `text-embedding-004`, `textembedding-gecko-multilingual@001`, `text-multilingual-embedding-002`, and `multimodalembedding` Pass an embedder reference to `ai.Embed()` to call the Google AI API: ```go resp, err := ai.Embed(ctx, embeddingModel, ai.WithDocs(userInput)) if err != nil { return err } ``` You can retrieve documents by passing an input and a retriever to `ai.Retrieve()`: ```go resp, err := ai.Retrieve(ctx, myRetriever, ai.WithDocs(userInput)) if err != nil { return err } ``` See [Retrieval-augmented generation (RAG)](/go/docs/rag) for more information. # MCP (Model Context Protocol) plugin > Learn how to integrate MCP servers with Genkit for Go and expose Genkit tools as MCP servers. The MCP (Model Context Protocol) plugin enables integration with MCP servers and allows you to expose Genkit tools as MCP servers. You can connect to external MCP servers to use their tools and prompts, manage multiple server connections, or turn your Genkit application into an MCP server. ## Prerequisites [Section titled “Prerequisites”](#prerequisites) This plugin requires MCP servers to be available.
For testing and development, you can use: * `mcp-server-time` - Sample server exposing time operations * `@modelcontextprotocol/server-everything` - A comprehensive MCP server for testing * Custom MCP servers written in Python, TypeScript, or other languages ## Configuration [Section titled “Configuration”](#configuration) ### Single Server Connection [Section titled “Single Server Connection”](#single-server-connection) To connect to a single MCP server, import the `mcp` package and create a `GenkitMCPClient`: ```go import "github.com/firebase/genkit/go/plugins/mcp" ``` ```go ctx := context.Background() g, err := genkit.Init(ctx) if err != nil { log.Fatal(err) } client, err := mcp.NewGenkitMCPClient(mcp.MCPClientOptions{ Name: "mcp-server-time", Stdio: &mcp.StdioConfig{ Command: "uvx", Args: []string{"mcp-server-time"}, }, }) if err != nil { log.Fatal(err) } ``` ### Multiple Server Management [Section titled “Multiple Server Management”](#multiple-server-management) To manage connections to multiple MCP servers, use `GenkitMCPManager`: ```go import "github.com/firebase/genkit/go/plugins/mcp" ``` ```go manager, err := mcp.NewMCPManager(mcp.MCPManagerOptions{ Name: "my-app", MCPServers: []mcp.MCPServerConfig{ { Name: "everything-server", Config: mcp.MCPClientOptions{ Name: "everything-server", Stdio: &mcp.StdioConfig{ Command: "npx", Args: []string{"-y", "@modelcontextprotocol/server-everything"}, }, }, }, { Name: "mcp-server-time", Config: mcp.MCPClientOptions{ Name: "mcp-server-time", Stdio: &mcp.StdioConfig{ Command: "uvx", Args: []string{"mcp-server-time"}, }, }, }, }, }) if err != nil { log.Fatal(err) } ``` ### Exposing as MCP Server [Section titled “Exposing as MCP Server”](#exposing-as-mcp-server) To expose your Genkit tools as an MCP server, create an `MCPServer`: ```go import "github.com/firebase/genkit/go/plugins/mcp" ``` ```go // Define your tools first addTool := genkit.DefineTool(g, "add", "Add two numbers", func(ctx *ai.ToolContext, input struct{A, B int}) (int, error) { return input.A + input.B, nil }) // Create MCP server server := mcp.NewMCPServer(g, mcp.MCPServerOptions{ Name: "genkit-calculator", Version: "1.0.0", }) ``` ## Usage [Section titled “Usage”](#usage) ### Using Tools from MCP Servers [Section titled “Using Tools from MCP Servers”](#using-tools-from-mcp-servers) Once connected to an MCP server, you can retrieve and use its tools: ```go // Get a specific tool echoTool, err := client.GetTool(ctx, g, "echo") if err != nil { log.Fatal(err) } // Use the tool in your workflow resp, err := genkit.Generate(ctx, g, ai.WithModel(myModel), ai.WithPrompt("Use the echo tool to repeat this message"), ai.WithTools(echoTool), ) if err != nil { log.Fatal(err) } ``` ### Using Prompts from MCP Servers [Section titled “Using Prompts from MCP Servers”](#using-prompts-from-mcp-servers) Retrieve and use prompts from connected MCP servers: ```go // Get a specific prompt simplePrompt, err := client.GetPrompt(ctx, g, "simple_prompt") if err != nil { log.Fatal(err) } // Use the prompt resp, err := genkit.Generate(ctx, g, ai.WithModel(myModel), ai.WithPrompt(simplePrompt), ) ``` ### Managing Multiple Servers [Section titled “Managing Multiple Servers”](#managing-multiple-servers) With `GenkitMCPManager`, you can dynamically manage server connections: ```go // Connect to a new server at runtime err = manager.Connect("weather", mcp.MCPClientOptions{ Name: "weather-server", Stdio: &mcp.StdioConfig{ Command: "python", Args: []string{"weather_server.py"}, }, }) if err != nil { log.Fatal(err) }
// Disconnect a server completely err = manager.Disconnect("weather") if err != nil { log.Fatal(err) } // Get all tools from all active servers tools, err := manager.GetActiveTools(ctx, g) if err != nil { log.Fatal(err) } // Get a specific prompt from a specific server prompt, err := manager.GetPrompt(ctx, g, "mcp-server-time", "current_time", nil) if err != nil { log.Fatal(err) } ``` For individual client management (disable/enable without disconnecting), you would access the clients directly. The manager focuses on connection lifecycle management. ### Running as MCP Server [Section titled “Running as MCP Server”](#running-as-mcp-server) To run your Genkit application as an MCP server: ```go // Option 1: Auto-expose all defined tools server := mcp.NewMCPServer(g, mcp.MCPServerOptions{ Name: "genkit-calculator", Version: "1.0.0", }) // Option 2: Expose only specific tools server = mcp.NewMCPServer(g, mcp.MCPServerOptions{ Name: "genkit-calculator", Version: "1.0.0", Tools: []ai.Tool{addTool, multiplyTool}, }) // Start the MCP server log.Println("Starting MCP server...") if err := server.ServeStdio(ctx); err != nil { log.Fatal(err) } ``` ## Transport Options [Section titled “Transport Options”](#transport-options) ### Stdio Transport [Section titled “Stdio Transport”](#stdio-transport) You can use either Stdio or SSE ```go Stdio: &mcp.StdioConfig{ Command: "uvx", Args: []string{"mcp-server-time"}, Env: []string{"DEBUG=1"}, } ``` ```go SSE: &mcp.SSEConfig{ BaseURL: "http://localhost:3000/sse", } ``` ## Testing [Section titled “Testing”](#testing) ### Testing Your MCP Server [Section titled “Testing Your MCP Server”](#testing-your-mcp-server) To test your Genkit application as an MCP server: ```bash # Run your server go run main.go # Test with MCP Inspector in another terminal npx @modelcontextprotocol/inspector go run main.go ``` ## Configuration Options [Section titled “Configuration Options”](#configuration-options) ### MCPClientOptions [Section titled “MCPClientOptions”](#mcpclientoptions) ```go type MCPClientOptions struct { Name string // Server identifier Version string // Version number (defaults to "1.0.0") Disabled bool // Disabled flag to temporarily disable this client Stdio *StdioConfig // Stdio transport config SSE *SSEConfig // SSE transport config } ``` ### StdioConfig [Section titled “StdioConfig”](#stdioconfig) ```go type StdioConfig struct { Command string // Command to run Args []string // Command arguments Env []string // Environment variables } ``` ### MCPServerConfig [Section titled “MCPServerConfig”](#mcpserverconfig) ```go type MCPServerConfig struct { Name string // Name for this server Config MCPClientOptions // Client configuration options } ``` ### MCPManagerOptions [Section titled “MCPManagerOptions”](#mcpmanageroptions) ```go type MCPManagerOptions struct { Name string // Manager instance name Version string // Manager version (defaults to "1.0.0") MCPServers []MCPServerConfig // Array of server configurations } ``` ### MCPServerOptions [Section titled “MCPServerOptions”](#mcpserveroptions) ```go type MCPServerOptions struct { Name string // Server name Version string // Server version Tools []ai.Tool // Specific tools to expose (optional) } ``` # Ollama plugin > Learn how to configure and use the Genkit Ollama plugin for Go to interact with local LLMs like Gemma and Llama. The Ollama plugin provides interfaces to any of the local LLMs supported by [Ollama](https://ollama.com/). 
## Prerequisites [Section titled “Prerequisites”](#prerequisites) This plugin requires that you first install and run the Ollama server. You can follow the instructions on the [Download Ollama](https://ollama.com/download) page. Use the Ollama CLI to download the models you are interested in. For example: ```bash ollama pull gemma3 ``` For development, you can run Ollama on your development machine. Deployed apps usually run Ollama on a separate, GPU-accelerated machine from the one hosting the app backend that runs Genkit. ## Configuration [Section titled “Configuration”](#configuration) To use this plugin, pass `ollama.Ollama` to `WithPlugins()` in the Genkit initializer, specifying the address of your Ollama server: ```go import "github.com/firebase/genkit/go/plugins/ollama" ``` ```go g, err := genkit.Init(context.Background(), genkit.WithPlugins(&ollama.Ollama{ServerAddress: "http://127.0.0.1:11434"})) ``` ## Usage [Section titled “Usage”](#usage) To generate content, you first need to create a model definition based on the model you installed and want to use. For example, if you installed Gemma 3: ```go model := ollama.DefineModel( ollama.ModelDefinition{ Name: "gemma3", Type: "chat", // "chat" or "generate" }, &ai.ModelInfo{ Multiturn: true, SystemRole: true, Tools: false, Media: false, }, ) ``` Then, you can use the model reference to send requests to your Ollama server: ```go resp, err := genkit.Generate(ctx, g, ai.WithModel(model), ai.WithPrompt("Tell me a joke.")) if err != nil { return err } log.Println(resp.Text()) ``` See [Generating content](/go/docs/models) for more information. # OpenAI-Compatible API plugin > Learn how to use the Genkit OpenAI-Compatible plugin for Go to access OpenAI, Anthropic, and other OpenAI-compatible APIs. The OpenAI-Compatible API plugin (`compat_oai`) provides a unified interface for accessing multiple AI providers that implement OpenAI’s API specification. This includes OpenAI, Anthropic, and other compatible services. ## Overview [Section titled “Overview”](#overview) The `compat_oai` package serves as a foundation for building plugins that work with OpenAI-compatible APIs.
It includes: * **Base Implementation**: Common functionality for OpenAI-compatible APIs * **OpenAI Plugin**: Direct access to OpenAI’s models and embeddings * **Anthropic Plugin**: Access to Claude models through OpenAI-compatible endpoints * **Extensible Framework**: Build custom plugins for other compatible providers ## Prerequisites [Section titled “Prerequisites”](#prerequisites) Depending on which provider you use, you’ll need: * **OpenAI**: API key from [OpenAI API Keys page](https://platform.openai.com/api-keys) * **Anthropic**: API key from [Anthropic Console](https://console.anthropic.com/) * **Other providers**: API keys from the respective services ## OpenAI Provider [Section titled “OpenAI Provider”](#openai-provider) ### Configuration [Section titled “Configuration”](#configuration) ```go import "github.com/firebase/genkit/go/plugins/compat_oai/openai" ``` ```go g, err := genkit.Init(context.Background(), genkit.WithPlugins(&openai.OpenAI{ APIKey: "YOUR_OPENAI_API_KEY", // or set OPENAI_API_KEY env var })) ``` ### Supported Models [Section titled “Supported Models”](#supported-models) #### Latest Models [Section titled “Latest Models”](#latest-models) * **gpt-4.1** - Latest GPT-4.1 with multimodal support * **gpt-4.1-mini** - Faster, cost-effective GPT-4.1 variant * **gpt-4.1-nano** - Ultra-efficient GPT-4.1 variant * **gpt-4.5-preview** - Preview of GPT-4.5 with advanced capabilities #### Production Models [Section titled “Production Models”](#production-models) * **gpt-4o** - Advanced GPT-4 with vision and tool support * **gpt-4o-mini** - Fast and cost-effective GPT-4o variant * **gpt-4-turbo** - High-performance GPT-4 with large context window #### Reasoning Models [Section titled “Reasoning Models”](#reasoning-models) * **o3-mini** - Latest compact reasoning model * **o1** - Advanced reasoning model for complex problems * **o1-mini** - Compact reasoning model * **o1-preview** - Preview reasoning model #### Legacy Models [Section titled “Legacy Models”](#legacy-models) * **gpt-4** - Original GPT-4 model * **gpt-3.5-turbo** - Fast and efficient language model ### Embedding Models [Section titled “Embedding Models”](#embedding-models) * **text-embedding-3-large** - Most capable embedding model * **text-embedding-3-small** - Fast and efficient embedding model * **text-embedding-ada-002** - Legacy embedding model ### OpenAI Usage Example [Section titled “OpenAI Usage Example”](#openai-usage-example) ```go import ( "github.com/firebase/genkit/go/plugins/compat_oai/openai" "github.com/firebase/genkit/go/plugins/compat_oai" ) // Initialize OpenAI plugin oai := &openai.OpenAI{APIKey: "YOUR_API_KEY"} g, err := genkit.Init(ctx, genkit.WithPlugins(oai)) // Use GPT-4o for general tasks model := oai.Model(g, "gpt-4o") resp, err := genkit.Generate(ctx, g, ai.WithModel(model), ai.WithPrompt("Explain quantum computing."), ) // Use embeddings embedder := oai.Embedder(g, "text-embedding-3-large") embeds, err := ai.Embed(ctx, embedder, ai.WithDocs("Hello, world!")) ``` ## Anthropic Provider [Section titled “Anthropic Provider”](#anthropic-provider) ### Configuration [Section titled “Configuration”](#configuration-1) ```go import "github.com/firebase/genkit/go/plugins/compat_oai/anthropic" ``` ```go g, err := genkit.Init(context.Background(), genkit.WithPlugins(&anthropic.Anthropic{ Opts: []option.RequestOption{ option.WithAPIKey("YOUR_ANTHROPIC_API_KEY"), }, })) ``` ### Supported Models [Section titled “Supported Models”](#supported-models-1) * **claude-3-7-sonnet-20250219** - Latest Claude 
3.7 Sonnet with advanced capabilities * **claude-3-5-haiku-20241022** - Fast and efficient Claude 3.5 Haiku * **claude-3-5-sonnet-20240620** - Balanced Claude 3.5 Sonnet * **claude-3-opus-20240229** - Most capable Claude 3 model * **claude-3-haiku-20240307** - Fastest Claude 3 model ### Anthropic Usage Example [Section titled “Anthropic Usage Example”](#anthropic-usage-example) ```go import ( "github.com/firebase/genkit/go/plugins/compat_oai/anthropic" "github.com/openai/openai-go/option" ) // Initialize Anthropic plugin claude := &anthropic.Anthropic{ Opts: []option.RequestOption{ option.WithAPIKey("YOUR_ANTHROPIC_API_KEY"), }, } g, err := genkit.Init(ctx, genkit.WithPlugins(claude)) // Use Claude for tasks requiring reasoning model := claude.Model(g, "claude-3-7-sonnet-20250219") resp, err := genkit.Generate(ctx, g, ai.WithModel(model), ai.WithPrompt("Analyze this complex problem step by step."), ) ``` ## Using Multiple Providers [Section titled “Using Multiple Providers”](#using-multiple-providers) You can use both providers in the same application: ```go import ( "github.com/firebase/genkit/go/plugins/compat_oai/openai" "github.com/firebase/genkit/go/plugins/compat_oai/anthropic" ) oai := &openai.OpenAI{APIKey: "YOUR_OPENAI_KEY"} claude := &anthropic.Anthropic{ Opts: []option.RequestOption{ option.WithAPIKey("YOUR_ANTHROPIC_KEY"), }, } g, err := genkit.Init(ctx, genkit.WithPlugins(oai, claude)) // Use OpenAI for embeddings and tool-heavy tasks openaiModel := oai.Model(g, "gpt-4o") embedder := oai.Embedder(g, "text-embedding-3-large") // Use Anthropic for reasoning and analysis claudeModel := claude.Model(g, "claude-3-7-sonnet-20250219") ``` ## Advanced Features [Section titled “Advanced Features”](#advanced-features) ### Tool Calling [Section titled “Tool Calling”](#tool-calling) OpenAI models support tool calling: ```go // Define a tool weatherTool := genkit.DefineTool(g, "get_weather", "Get current weather", func(ctx *ai.ToolContext, input struct{City string}) (string, error) { return fmt.Sprintf("It's sunny in %s", input.City), nil }) // Use with GPT models (tools not supported on Claude via OpenAI API) model := oai.Model(g, "gpt-4o") resp, err := genkit.Generate(ctx, g, ai.WithModel(model), ai.WithPrompt("What's the weather like in San Francisco?"), ai.WithTools(weatherTool), ) ``` ### Multimodal Support [Section titled “Multimodal Support”](#multimodal-support) Both providers support vision capabilities: ```go // Works with GPT-4o and Claude models resp, err := genkit.Generate(ctx, g, ai.WithModel(model), ai.WithMessages([]*ai.Message{ ai.NewUserMessage( ai.WithTextPart("What do you see in this image?"), ai.WithMediaPart("image/jpeg", imageData), ), }), ) ``` ### Streaming [Section titled “Streaming”](#streaming) Both providers support streaming responses: ```go resp, err := genkit.Generate(ctx, g, ai.WithModel(model), ai.WithPrompt("Write a long explanation."), ai.WithStreaming(func(ctx context.Context, chunk *ai.ModelResponseChunk) error { for _, content := range chunk.Content { fmt.Print(content.Text) } return nil }), ) ``` ## Configuration [Section titled “Configuration”](#configuration-2) ### Common Configuration [Section titled “Common Configuration”](#common-configuration) Both providers support OpenAI-compatible configuration: ```go import "github.com/firebase/genkit/go/plugins/compat_oai" config := &compat_oai.OpenAIConfig{ Temperature: 0.7, MaxOutputTokens: 1000, TopP: 0.9, StopSequences: []string{"END"}, } resp, err := genkit.Generate(ctx, g, ai.WithModel(model), 
ai.WithPrompt("Your prompt here"), ai.WithConfig(config), ) ``` ### Advanced Options [Section titled “Advanced Options”](#advanced-options) ```go import "github.com/openai/openai-go/option" // Custom base URL for OpenAI-compatible services opts := []option.RequestOption{ option.WithAPIKey("YOUR_API_KEY"), option.WithBaseURL("https://your-custom-endpoint.com/v1"), option.WithOrganization("your-org-id"), option.WithHeader("Custom-Header", "value"), } ``` You can pass these options to providers that accept request options; for example, the Anthropic plugin accepts them through the `Opts` field shown in its configuration above. # pgvector retriever template > Learn how to use PostgreSQL and pgvector as a retriever implementation in Genkit Go. You can use PostgreSQL and `pgvector` as your retriever implementation. Use the following examples as a starting point and modify them to work with your database schema. We use [database/sql](https://pkg.go.dev/database/sql) to connect to the Postgres server, but you may use another client library of your choice. ```go func defineRetriever(g *genkit.Genkit, db *sql.DB, embedder ai.Embedder) ai.Retriever { f := func(ctx context.Context, req *ai.RetrieverRequest) (*ai.RetrieverResponse, error) { eres, err := ai.Embed(ctx, embedder, ai.WithDocs(req.Query)) if err != nil { return nil, err } rows, err := db.QueryContext(ctx, ` SELECT episode_id, season_number, chunk as content FROM embeddings WHERE show_id = $1 ORDER BY embedding <#> $2 LIMIT 2`, req.Options, pgv.NewVector(eres.Embeddings[0].Embedding)) if err != nil { return nil, err } defer rows.Close() res := &ai.RetrieverResponse{} for rows.Next() { var eid, sn int var content string if err := rows.Scan(&eid, &sn, &content); err != nil { return nil, err } meta := map[string]any{ "episode_id": eid, "season_number": sn, } doc := &ai.Document{ Content: []*ai.Part{ai.NewTextPart(content)}, Metadata: meta, } res.Documents = append(res.Documents, doc) } if err := rows.Err(); err != nil { return nil, err } return res, nil } return genkit.DefineRetriever(g, provider, "shows", f) } ``` And here’s how to use the retriever in a flow: ```go retriever := defineRetriever(g, db, embedder) type input struct { Question string Show string } genkit.DefineFlow(g, "askQuestion", func(ctx context.Context, in input) (string, error) { res, err := ai.Retrieve(ctx, retriever, ai.WithConfig(in.Show), ai.WithTextDocs(in.Question)) if err != nil { return "", err } for _, doc := range res.Documents { fmt.Printf("%+v %q\n", doc.Metadata, doc.Content[0].Text) } // Use documents in RAG prompts. return "", nil }) ``` # Pinecone plugin > Learn how to configure and use the Genkit Pinecone plugin for Go to integrate with the Pinecone cloud vector database. The Pinecone plugin provides retriever implementations that use the [Pinecone](https://www.pinecone.io/) cloud vector database. ## Configuration [Section titled “Configuration”](#configuration) To use this plugin, import the `pinecone` package and initialize a `pinecone.Pinecone` value by calling its `Init()` method: ```go import "github.com/firebase/genkit/go/plugins/pinecone" ``` ```go if err := (&pinecone.Pinecone{}).Init(ctx, g); err != nil { return err } ``` The plugin requires your Pinecone API key. Configure the plugin to use your API key by doing one of the following: * Set the `PINECONE_API_KEY` environment variable to your API key. * Specify the API key when you initialize the plugin: ```go if err := (&pinecone.Pinecone{APIKey: pineconeAPIKey}).Init(ctx, g); err != nil { return err } ``` However, don’t embed your API key directly in code! Use this feature only in conjunction with a service like Cloud Secret Manager or similar. ## Usage [Section titled “Usage”](#usage) Index your documents in Pinecone.
An example of indexing is provided within the Pinecone plugin as shown below. This functionality should be customized by the user according to their use case. ```go err = pinecone.Index(ctx, docChunks, ds, "") if err != nil { return err } ``` To retrieve documents from an index, first create a retriever definition: ```go menuRetriever, err := pinecone.DefineRetriever(ctx, g, pinecone.Config{ IndexID: "menu_data", // Your Pinecone index Embedder: googlegenai.GoogleAIEmbedder(g, "text-embedding-004"), // Embedding model of your choice }) if err != nil { return err } ``` Then, call the retriever’s `Retrieve()` method, passing it a text query: ```go resp, err := menuRetriever.Retrieve(ctx, &ai.RetrieverRequest{ Query: ai.DocumentFromText(userInput, nil), Options: nil, }) if err != nil { return err } menuInfo := resp.Documents ``` See the [Retrieval-augmented generation](/go/docs/rag) page for a general discussion on using retrievers for RAG. # Third-party plugins by Firebase and partners > This page lists third-party plugins for Genkit that are built and maintained by Firebase or our partners. This page lists third-party plugins for Genkit that are built and maintained by Firebase or our partners. Pinecone The Pinecone plugin provides retriever implementations that use the [Pinecone](https://www.pinecone.io/product/) cloud vector database. [View plugin info](/go/docs/plugins/pinecone) Ollama The Ollama plugin provides interfaces to any of the local LLMs supported by [Ollama](https://ollama.com) [View plugin info](/go/docs/plugins/ollama) pgvector The pgvector template is an example PostgreSQL and pgvector retriever implementation. You can use the provided examples as a starting point and modify them to work with your database schema. [View template](/go/docs/plugins/pgvector) AlloyDB for PostgreSQL The AlloyDB plugin provides retriever implementations that use the [AlloyDB for PostgreSQL](https://cloud.google.com/alloydb/docs) cloud vector database. [View plugin info](/go/docs/plugins/alloydb) Cloud SQL for PostgreSQL The Cloud SQL for PostgreSQL plugin provides retriever implementations that use the [Cloud SQL for PostgreSQL](https://cloud.google.com/sql/docs/postgres) cloud vector database. [View plugin info](/go/docs/plugins/cloud-sql-pg) # Retrieval-augmented generation (RAG) > Learn how to build Retrieval-Augmented Generation (RAG) flows in Genkit Go using indexers, embedders, and retrievers. Genkit provides abstractions that help you build retrieval-augmented generation (RAG) flows, as well as plugins that provide integrations with related tools. ## What is RAG? [Section titled “What is RAG?”](#what-is-rag) Retrieval-augmented generation is a technique used to incorporate external sources of information into an LLM’s responses. It’s important to be able to do so because, while LLMs are typically trained on a broad body of material, practical use of LLMs often requires specific domain knowledge (for example, you might want to use an LLM to answer customers’ questions about your company’s products). One solution is to fine-tune the model using more specific data. However, this can be expensive both in terms of compute cost and in terms of the effort needed to prepare adequate training data. In contrast, RAG works by incorporating external data sources into a prompt at the time it’s passed to the model. 
For example, you could imagine the prompt, “What is Bart’s relationship to Lisa?” might be expanded (“augmented”) by prepending some relevant information, resulting in the prompt, “Homer and Marge’s children are named Bart, Lisa, and Maggie. What is Bart’s relationship to Lisa?” This approach has several advantages: * It can be more cost effective because you don’t have to retrain the model. * You can continuously update your data source and the LLM can immediately make use of the updated information. * You now have the potential to cite references in your LLM’s responses. On the other hand, using RAG naturally means longer prompts, and some LLM API services charge for each input token you send. Ultimately, you must evaluate the cost tradeoffs for your applications. RAG is a very broad area and there are many different techniques used to achieve the best quality RAG. The core Genkit framework offers two main abstractions to help you do RAG: * Embedders: transforms documents into a vector representation * Retrievers: retrieve documents from an “index”, given a query. These definitions are broad on purpose because Genkit is un-opinionated about what an “index” is or how exactly documents are retrieved from it. Genkit only provides a `Document` format and everything else is defined by the retriever or indexer implementation provider. ### Embedders [Section titled “Embedders”](#embedders) An embedder is a function that takes content (text, images, audio, etc.) and creates a numeric vector that encodes the semantic meaning of the original content. As mentioned above, embedders are leveraged as part of the process of indexing. However, they can also be used independently to create embeddings without an index. ### Retrievers [Section titled “Retrievers”](#retrievers) A retriever is a concept that encapsulates logic related to any kind of document retrieval. The most popular retrieval cases typically include retrieval from vector stores. However, in Genkit a retriever can be any function that returns data. To create a retriever, you can use one of the provided implementations or create your own. ## Supported retrievers, and embedders [Section titled “Supported retrievers, and embedders”](#supported-retrievers-and-embedders) Genkit provides retriever support through its plugin system. The following plugins are officially supported: * [Pinecone](/go/docs/plugins/pinecone) cloud vector database In addition, Genkit supports the following vector stores through predefined code templates, which you can customize for your database configuration and schema: * PostgreSQL with [`pgvector`](/go/docs/plugins/pgvector) Embedding model support is provided through the following plugins: | Plugin | Models | | ----------------------------------------------------- | -------------- | | [Google Generative AI](/go/docs/plugins/google-genai) | Text embedding | ## Defining a RAG Flow [Section titled “Defining a RAG Flow”](#defining-a-rag-flow) The following examples show how you could ingest a collection of restaurant menu PDF documents into a vector database and retrieve them for use in a flow that determines what food items are available. *Note*: Although retriever functions are defined using Genkit, users are expected to add their own functionality to index the documents. 
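As a reference point for that custom indexing step, a minimal sketch of the embedding half is shown below; it reuses the `ai.Embed`, `ai.DocumentFromText`, and `googlegenai.GoogleAIEmbedder` calls that appear elsewhere in these docs, assumes the Google Generative AI plugin is configured, and uses a hypothetical `storeVector` helper as a stand-in for your own persistence code:

```go
// Embed one chunk of text so the resulting vector can be written to your
// own index. The embedder name is an example; use any configured embedder.
embedder := googlegenai.GoogleAIEmbedder(g, "text-embedding-004")

doc := ai.DocumentFromText("The Blackheart's Bounty: a hearty rum-spiced stew.", nil)
eres, err := ai.Embed(ctx, embedder, ai.WithDocs(doc))
if err != nil {
	return err
}

// storeVector is a hypothetical helper: write the vector and the chunk text
// to whatever database backs your retriever.
if err := storeVector(ctx, eres.Embeddings[0].Embedding, doc); err != nil {
	return err
}
```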
### Install dependencies [Section titled “Install dependencies”](#install-dependencies) In this example, we will use the `textsplitter` library from `langchaingo` and the `ledongthuc/pdf` PDF parsing library: ```bash go get github.com/tmc/langchaingo/textsplitter go get github.com/ledongthuc/pdf ``` #### Create chunking config [Section titled “Create chunking config”](#create-chunking-config) This example uses the `textsplitter` library which provides a simple text splitter to break up documents into segments that can be vectorized. The following definition configures the chunking function to return document segments of 200 characters, with an overlap between chunks of 20 characters. ```go splitter := textsplitter.NewRecursiveCharacter( textsplitter.WithChunkSize(200), textsplitter.WithChunkOverlap(20), ) ``` More chunking options for this library can be found in the [`langchaingo` documentation](https://pkg.go.dev/github.com/tmc/langchaingo/textsplitter#Option). #### Define your indexer flow [Section titled “Define your indexer flow”](#define-your-indexer-flow) ```go genkit.DefineFlow( g, "indexMenu", func(ctx context.Context, path string) (any, error) { // Extract plain text from the PDF. Wrap the logic in Run so it // appears as a step in your traces. pdfText, err := genkit.Run(ctx, "extract", func() (string, error) { return readPDF(path) }) if err != nil { return nil, err } // Split the text into chunks. Wrap the logic in Run so it appears as a // step in your traces. docs, err := genkit.Run(ctx, "chunk", func() ([]*ai.Document, error) { chunks, err := splitter.SplitText(pdfText) if err != nil { return nil, err } var docs []*ai.Document for _, chunk := range chunks { docs = append(docs, ai.DocumentFromText(chunk, nil)) } return docs, nil }) if err != nil { return nil, err } // Add chunks to the index using your custom index function, then return. return docs, nil }, ) ``` ```go // Helper function to extract plain text from a PDF. Excerpted from // https://github.com/ledongthuc/pdf func readPDF(path string) (string, error) { f, r, err := pdf.Open(path) if f != nil { defer f.Close() } if err != nil { return "", err } reader, err := r.GetPlainText() if err != nil { return "", err } bytes, err := io.ReadAll(reader) if err != nil { return "", err } return string(bytes), nil } ``` #### Run the indexer flow [Section titled “Run the indexer flow”](#run-the-indexer-flow) ```bash genkit flow:run indexMenu "'menu.pdf'" ``` After running the `indexMenu` flow, the vector database will be seeded with documents and ready to be used in Genkit flows with retrieval steps. ### Define a flow with retrieval [Section titled “Define a flow with retrieval”](#define-a-flow-with-retrieval) The following example shows how you might use a retriever in a RAG flow. This example uses Genkit’s file-based vector retriever, which you should not use in production. ```go ctx := context.Background() g, err := genkit.Init(ctx, genkit.WithPlugins(&googlegenai.VertexAI{})) if err != nil { log.Fatal(err) } if err = localvec.Init(); err != nil { log.Fatal(err) } model := googlegenai.VertexAIModel(g, "gemini-1.5-flash") _, menuPdfRetriever, err := localvec.DefineRetriever( g, "menuQA", localvec.Config{Embedder: googlegenai.VertexAIEmbedder(g, "text-embedding-004")}, ) if err != nil { log.Fatal(err) } genkit.DefineFlow( g, "menuQA", func(ctx context.Context, question string) (string, error) { // Retrieve text relevant to the user's question.
resp, err := ai.Retrieve(ctx, menuPdfRetriever, ai.WithTextDocs(question)) if err != nil { return "", err } // Call Generate, including the menu information in your prompt. return genkit.GenerateText(ctx, g, ai.WithModel(model), ai.WithDocs(resp.Documents), ai.WithSystem(` You are acting as a helpful AI assistant that can answer questions about the food available on the menu at Genkit Grub Pub. Use only the context provided to answer the question. If you don't know, do not make up an answer. Do not add or change items on the menu.`), ai.WithPrompt(question)) }) ``` ## Write your own retrievers [Section titled “Write your own retrievers”](#write-your-own-retrievers) It’s also possible to create your own retriever. This is useful if your documents are managed in a document store that is not supported in Genkit (e.g., MySQL, Google Drive, etc.). The Genkit SDK provides flexible methods that let you provide custom code for fetching documents. You can also define custom retrievers that build on top of existing retrievers in Genkit and apply advanced RAG techniques (such as reranking or prompt extension) on top. For example, suppose you have a custom re-ranking function you want to use. The following example defines a custom retriever that applies your function to the menu retriever defined earlier: ```go type CustomMenuRetrieverOptions struct { K int PreRerankK int } advancedMenuRetriever := genkit.DefineRetriever( g, "custom", "advancedMenuRetriever", func(ctx context.Context, req *ai.RetrieverRequest) (*ai.RetrieverResponse, error) { // Handle options passed using our custom type. opts, _ := req.Options.(CustomMenuRetrieverOptions) // Set fields to default values when either the field was undefined // or when req.Options is not a CustomMenuRetrieverOptions. if opts.K == 0 { opts.K = 3 } if opts.PreRerankK == 0 { opts.PreRerankK = 10 } // Call the retriever as in the simple case. resp, err := ai.Retrieve(ctx, menuPdfRetriever, ai.WithDocs(req.Query), ai.WithConfig(localvec.RetrieverOptions{K: opts.PreRerankK}), ) if err != nil { return nil, err } // Re-rank the returned documents using your custom function. rerankedDocs := rerank(resp.Documents) resp.Documents = rerankedDocs[:opts.K] return resp, nil }, ) ``` # Tool calling > Learn how to use tool calling (function calling) with Genkit Go to give LLMs access to external information and actions. *Tool calling*, also known as *function calling*, is a structured way to give LLMs the ability to make requests back to the calling application. You define the tools you want to make available to the model, and the model will make tool requests to your app as necessary to fulfill the prompts you give it. The use cases of tool calling generally fall into a few themes: **Giving an LLM access to information it wasn’t trained with** * Frequently changing information, such as a stock price or the current weather. * Information specific to your app domain, such as product information or user profiles. Note the overlap with [retrieval augmented generation](/go/docs/rag) (RAG), which is also a way to let an LLM integrate factual information into its generations. RAG is a heavier solution that is most suited when you have a large amount of information or the information that’s most relevant to a prompt is ambiguous. On the other hand, if a function call or database lookup is all that’s necessary for retrieving the information the LLM needs, tool calling is more appropriate.
**Introducing a degree of determinism into an LLM workflow** * Performing calculations that the LLM cannot reliably complete itself. * Forcing an LLM to generate verbatim text under certain circumstances, such as when responding to a question about an app’s terms of service. **Performing an action when initiated by an LLM** * Turning on and off lights in an LLM-powered home assistant * Reserving table reservations in an LLM-powered restaurant agent ## Before you begin [Section titled “Before you begin”](#before-you-begin) If you want to run the code examples on this page, first complete the steps in the [Get started](/go/docs/get-started-go) guide. All of the examples assume that you have already set up a project with Genkit dependencies installed. This page discusses one of the advanced features of Genkit model abstraction, so before you dive too deeply, you should be familiar with the content on the [Generating content with AI models](/go/docs/models) page. You should also be familiar with Genkit’s system for defining input and output schemas, which is discussed on the [Flows](/go/docs/flows) page. ## Overview of tool calling [Section titled “Overview of tool calling”](#overview-of-tool-calling) At a high level, this is what a typical tool-calling interaction with an LLM looks like: 1. The calling application prompts the LLM with a request and also includes in the prompt a list of tools the LLM can use to generate a response. 2. The LLM either generates a complete response or generates a tool call request in a specific format. 3. If the caller receives a complete response, the request is fulfilled and the interaction ends; but if the caller receives a tool call, it performs whatever logic is appropriate and sends a new request to the LLM containing the original prompt or some variation of it as well as the result of the tool call. 4. The LLM handles the new prompt as in Step 2. For this to work, several requirements must be met: * The model must be trained to make tool requests when it’s needed to complete a prompt. Most of the larger models provided through web APIs such as Gemini can do this, but smaller and more specialized models often cannot. Genkit will throw an error if you try to provide tools to a model that doesn’t support it. * The calling application must provide tool definitions to the model in the format it expects. * The calling application must prompt the model to generate tool calling requests in the format the application expects. ## Tool calling with Genkit [Section titled “Tool calling with Genkit”](#tool-calling-with-genkit) Genkit provides a single interface for tool calling with models that support it. Each model plugin ensures that the last two criteria mentioned in the previous section are met, and the `genkit.Generate()` function automatically carries out the tool-calling loop described earlier. ### Model support [Section titled “Model support”](#model-support) Tool calling support depends on the model, the model API, and the Genkit plugin. Consult the relevant documentation to determine if tool calling is likely to be supported. In addition: * Genkit will throw an error if you try to provide tools to a model that doesn’t support it. * If the plugin exports model references, the `ModelInfo.Supports.Tools` property will indicate if it supports tool calling. 
### Defining tools [Section titled “Defining tools”](#defining-tools) Use the `genkit.DefineTool()` function to write tool definitions: ```go package main import ( "context" "fmt" "log" "github.com/firebase/genkit/go/ai" "github.com/firebase/genkit/go/genkit" "github.com/firebase/genkit/go/plugins/googlegenai" ) // Define the input structure for the tool type WeatherInput struct { Location string `json:"location" jsonschema_description:"Location to get weather for"` } func main() { ctx := context.Background() g, err := genkit.Init(ctx, genkit.WithPlugins(&googlegenai.GoogleAI{}), genkit.WithDefaultModel("googleai/gemini-1.5-flash"), ) if err != nil { log.Fatalf("Genkit initialization failed: %v", err) } genkit.DefineTool( g, "getWeather", "Gets the current weather in a given location", func(ctx *ai.ToolContext, input WeatherInput) (string, error) { // Here, we would typically make an API call or database query. For this // example, we just return a fixed value. log.Printf("Tool 'getWeather' called for location: %s", input.Location) return fmt.Sprintf("The current weather in %s is 63°F and sunny.", input.Location), nil }) } ``` The syntax here looks just like the `genkit.DefineFlow()` syntax; however, you must also write a description. Take special care with the wording of the description, because the LLM relies on it when deciding whether to use the tool. ### Using tools [Section titled “Using tools”](#using-tools) Include defined tools in your prompts to generate content. **Using `genkit.Generate()`:** ```go resp, err := genkit.Generate(ctx, g, ai.WithPrompt("What is the weather in San Francisco?"), ai.WithTools(getWeatherTool), ) ``` **Using `genkit.DefinePrompt()`:** ```go weatherPrompt, err := genkit.DefinePrompt(g, "weatherPrompt", ai.WithPrompt("What is the weather in {{location}}?"), ai.WithTools(getWeatherTool), ) if err != nil { log.Fatal(err) } resp, err := weatherPrompt.Execute(ctx, ai.WithInput(map[string]any{"location": "San Francisco"}), ) ``` **Using a `.prompt` file:** Create a file named `prompts/weatherPrompt.prompt` (assuming default prompt directory): ```dotprompt --- system: "Answer questions using the tools you have." tools: [getWeather] input: schema: location: string --- What is the weather in {{location}}? ``` Then execute it in your Go code: ```go // Assuming prompt file named weatherPrompt.prompt exists in ./prompts dir. weatherPrompt := genkit.LookupPrompt("weatherPrompt") if weatherPrompt == nil { log.Fatal("no prompt named 'weatherPrompt' found") } resp, err := weatherPrompt.Execute(ctx, ai.WithInput(map[string]any{"location": "San Francisco"}), ) ``` Genkit will automatically handle the tool call if the LLM needs to use the `getWeather` tool to answer the prompt. ### Explicitly handling tool calls [Section titled “Explicitly handling tool calls”](#explicitly-handling-tool-calls) If you want full control over this tool-calling loop, for example to apply more complicated logic, pass `true` to the `WithReturnToolRequests()` option. Now it’s your responsibility to ensure all of the tool requests are fulfilled: ```go getWeatherTool := genkit.DefineTool( g, "getWeather", "Gets the current weather in a given location", func(ctx *ai.ToolContext, location struct { Location string `jsonschema_description:"Location to get weather for"` }) (string, error) { // Tool implementation...
return "sunny", nil }, ) resp, err := genkit.Generate(ctx, g, ai.WithPrompt("What is the weather in San Francisco?"), ai.WithTools(getWeatherTool), ai.WithReturnToolRequests(true), ) if err != nil { log.Fatal(err) } parts := []*ai.Part{} for _, req := range resp.ToolRequests() { tool := genkit.LookupTool(g, req.Name) if tool == nil { log.Fatalf("tool %q not found", req.Name) } output, err := tool.RunRaw(ctx, req.Input) if err != nil { log.Fatalf("tool %q execution failed: %v", tool.Name(), err) } parts = append(parts, ai.NewToolResponsePart(&ai.ToolResponse{ Name: req.Name, Ref: req.Ref, Output: output, })) } resp, err = genkit.Generate(ctx, g, ai.WithMessages(append(resp.History(), ai.NewMessage(ai.RoleTool, nil, parts...))...), ) if err != nil { log.Fatal(err) } ``` # Deploy with Cloud Run > Learn how to deploy your Genkit Python app to Cloud Run. You can easily deploy your Genkit app to Cloud Run. For prerequisites and basic scaffolding see [Cloud Run - Python quickstart](https://cloud.google.com/run/docs/quickstarts/build-and-deploy/deploy-python-service#before-you-begin) documentation. Once you have a simple Cloud Run app set up and ready to go, update the `requirements.txt` to add Genkit libraries. In this example we’ll be using the Google GenAI plugin. requirements.txt ```text genkit genkit-plugin-google-genai ``` Update you app code to use Genkit. ```python import os from flask import Flask from genkit.ai import Genkit from genkit.plugins.flask import genkit_flask_handler from genkit.plugins.google_genai import ( GoogleGenai, google_genai_name, ) ai = Genkit( plugins=[GoogleGenai()], model=google_genai_name('gemini-2.5-flash'), ) app = Flask(__name__) @app.post('/joke') @genkit_flask_handler(ai) @ai.flow() async def joke(name: str, ctx): return await ai.generate( on_chunk=ctx.send_chunk, prompt=f'tell a medium sized joke about {name}', ) if __name__ == "__main__": app.run(debug=True, host="0.0.0.0", port=int(os.environ.get("PORT", 8080))) ``` Then proceeed with Cloud Run [deployment](https://cloud.google.com/run/docs/quickstarts/build-and-deploy/deploy-python-service#deploy) instructions. # Deploy with Flask > Learn how to build a Flask application using Genkit for Python. Prerequisites: make sure you have everything installed from the [Get Started](/python/docs/get-started/) guide. 1. Install Genkit Flask plugin ```bash pip install git+https://github.com/firebase/genkit#subdirectory=py/plugins/flask ``` Or create a `requirements.txt` file requirements.txt ```text genkit-plugin-flask @ git+https://github.com/firebase/genkit#subdirectory=py/plugins/google-genai ``` 2. Create `main.py` file: main.py ```python from flask import Flask from genkit.ai import Genkit from genkit.plugins.flask import genkit_flask_handler from genkit.plugins.google_genai import ( GoogleGenai, google_genai_name, ) ai = Genkit( plugins=[GoogleGenai()], model=google_genai_name('gemini-2.5-flash'), ) app = Flask(__name__) @app.post('/joke') @genkit_flask_handler(ai) @ai.flow() async def joke(name: str, ctx): return await ai.generate( on_chunk=ctx.send_chunk, prompt=f'tell a medium sized joke about {name}', ) ``` 3. Run the app: ```bash flask --app main.py run ``` Or with Dev UI: ```bash genkit start -- flask --app main.py run ``` You can invoke the flow via HTTP: ```bash curl -X POST http://127.0.0.1:5000/joke -d '{"data": "banana"}' -H 'content-Type: application/json' -H 'Accept: text/event-stream' ``` or you can use [Genkit client library](https://js.api.genkit.dev/modules/genkit.beta_client.html). 
## Authorization and custom context [Section titled “Authorization and custom context”](#authorization-and-custom-context) You can do custom authorization and custom context parsing by passing a `ContextProvider` implementation. ```python from genkit.types import GenkitError # Assume parse_request_header is defined elsewhere # def parse_request_header(auth_header): # # Example implementation: Replace with your actual logic # if auth_header and auth_header.startswith('Bearer '): # token = auth_header.split(' ')[1] # # Validate token and return username, or None/raise error # if token == "valid-token": # return "testuser" # return None async def my_context_provider(request): # This function needs access to the request object from Flask # The exact way to get headers might depend on how genkit_flask_handler passes the request auth_header = request.headers.get('authorization') username = parse_request_header(auth_header) # Call the (assumed) function return {'username': username} @app.post('/say_hi') @genkit_flask_handler(ai, context_provider=my_context_provider) @ai.flow() async def say_hi(name: str, ctx): if not ctx.context.get('username'): raise GenkitError(status='UNAUTHENTICATED', message='user not provided') return await ai.generate( on_chunk=ctx.send_chunk, prompt=f'say hi to {ctx.context.get("username")}', ) ``` `parse_request_header` can be your custom authorization header parsing/validation. # Get Started with Python (alpha) > Get started with Genkit using Python (alpha). The Genkit libraries for Python are now available for preview! Because the Python libraries are currently in Alpha, you might see API and functional changes as development progresses. We recommend using it only for prototyping and exploration. If you discover issues with the libraries or this documentation please report them in our [GitHub repository](https://github.com/firebase/genkit/). This guide shows you how to get started with Genkit in a Python app. ## Requirements [Section titled “Requirements”](#requirements) * Python 3.10 or later. See [Download and install](https://www.python.org/downloads/) in the official Python docs. * Node.js 20 or later (for the Genkit CLI and UI). See the below for a brief guide on installing Node. ## Create and explore a sample project [Section titled “Create and explore a sample project”](#create-and-explore-a-sample-project) 1. Create a new project directory: ```bash mkdir genkit-intro && cd genkit-intro ``` 2. (recommended) Create a Python virtual environment: ```bash python3 -m venv . ``` (activate if necessary, depending on the environment) ```bash source bin/activate # for bash ``` 3. Install dependencies ```bash pip3 install genkit pip3 install genkit-plugin-google-genai ``` Or create a `requirements.txt` file requirements.txt ```text genkit genkit-plugin-google-genai ``` and run: ```bash pip3 install -r requirements.txt ``` 4. Configure your model API key The simplest way to get started is with Google AI Gemini API. Make sure it’s [available in your region](https://ai.google.dev/available_regions). [Generate an API key](https://aistudio.google.com/app/apikey) for the Gemini API using Google AI Studio. Then, set the `GEMINI_API_KEY` environment variable to your key: ```bash export GEMINI_API_KEY= ``` 5. 
Create `main.py` file: main.py ```python import json from pydantic import BaseModel, Field from genkit.ai import Genkit from genkit.plugins.google_genai import GoogleAI ai = Genkit( plugins=[GoogleAI()], model='googleai/gemini-2.5-flash', ) class RpgCharacter(BaseModel): name: str = Field(description='name of the character') back_story: str = Field(description='back story') abilities: list[str] = Field(description='list of abilities (3-4)') @ai.flow() async def generate_character(name: str): result = await ai.generate( prompt=f'generate an RPG character named {name}', output_schema=RpgCharacter, ) return result.output async def main() -> None: print(json.dumps(await generate_character('Goblorb'), indent=2)) ai.run_main(main()) ``` 6. Run your app. Genkit apps are just regular Python applications. Run them however you normally run your app. ```bash python3 main.py ``` 7. Inspect your app with the Genkit Dev UI See instructions for installing the Genkit CLI (which includes the Dev UI) below. To inspect your app with Genkit Dev UI run with `genkit start -- ` command. E.g.: ```bash genkit start -- python3 main.py ``` The command will print the Dev UI URL. E.g.: ```plaintext Genkit Developer UI: http://localhost:4000 ``` ## Install Genkit CLI [Section titled “Install Genkit CLI”](#install-genkit-cli) 1. If you don’t already have Node 20 or newer on your system, install it now. Recommendation: The [`nvm`](https://github.com/nvm-sh/nvm) and [`nvm-windows`](https://github.com/coreybutler/nvm-windows) tools are a convenient way to install specific versions of Node if it’s not already installed on your system. These tools install Node on a per-user basis, so you don’t need to make system-wide changes. To install `nvm`: * Linux, macOS, etc. Run the following command: ```bash curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash ``` * Windows Download and run the installer as described in the [nvm-windows docs](https://github.com/coreybutler/nvm-windows?tab=readme-ov-file#install-nvm-windows). Then, to install Node and `npm`, open a new shell and run the following command: ```bash nvm install 20 ``` 2. Install the Genkit CLI by running the following command: ```bash npm install -g genkit-cli ``` This command installs the Genkit CLI into your Node installation directory so that it can be used outside of a Node project. # Defining AI workflows with Flows > Learn how to define, call, stream, debug, and deploy AI workflows using Genkit Flows in Python. The core of your app’s AI features are generative model requests, but it’s rare that you can simply take user input, pass it to the model, and display the model output back to the user. Usually, there are pre- and post-processing steps that must accompany the model call. For example: * Retrieving contextual information to send with the model call * Retrieving the history of the user’s current session, for example in a chat app * Using one model to reformat the user input in a way that’s suitable to pass to another model * Evaluating the “safety” of a model’s output before presenting it to the user * Combining the output of several models Every step of this workflow must work together for any AI-related task to succeed. In Genkit, you represent this tightly-linked logic using a construction called a flow. 
Flows are written just like functions, using ordinary Python code, but they add additional capabilities intended to ease the development of AI features: * **Type safety**: Input and output schemas defined using [Pydantic Models](https://docs.pydantic.dev/latest/concepts/models/), providing both static and runtime type checking * **Streaming**: Flows support streaming of data, such as partial LLM responses or any custom serializable objects. * **Integration with developer UI**: Debug flows independently of your application code using the developer UI. In the developer UI, you can run flows and view traces for each step of the flow. * **Simplified deployment**: Deploy flows directly as web API endpoints, using Cloud Run or any platform that can host a web app. Unlike similar features in other frameworks, Genkit’s flows are lightweight and unobtrusive, and don’t force your app to conform to any specific abstraction. All of the flow’s logic is written in standard Python, and code inside a flow doesn’t need to be flow-aware. ## Defining and calling flows [Section titled “Defining and calling flows”](#defining-and-calling-flows) In its simplest form, a flow just wraps a function. The following example wraps a function that calls `generate()`: ```python @ai.flow() async def menu_suggestion_flow(theme: str): response = await ai.generate( prompt=f'Invent a menu item for a {theme} themed restaurant.', ) return response.text ``` Just by wrapping your `generate()` calls like this, you add some functionality: doing so lets you run the flow from the Genkit CLI and from the developer UI, and is a requirement for several of Genkit’s features, including deployment and observability (later sections discuss these topics). ### Input and output schemas [Section titled “Input and output schemas”](#input-and-output-schemas) One of the most important advantages Genkit flows have over directly calling a model API is type safety of both inputs and outputs. When defining flows, you can define schemas for them using Pydantic. Here’s a refinement of the last example, which defines a flow that takes a string as input and outputs an object: ```python from pydantic import BaseModel class MenuItemSchema(BaseModel): dishname: str description: str @ai.flow() async def menu_suggestion_flow(theme: str) -> MenuItemSchema: response = await ai.generate( prompt=f'Invent a menu item for a {theme} themed restaurant.', output_schema=MenuItemSchema, ) return response.output ``` Note that the schema of a flow does not necessarily have to line up with the schema of the `generate()` calls within the flow (in fact, a flow might not even contain `generate()` calls). Here’s a variation of the example that passes a schema to `generate()`, but uses the structured output to format a simple string, which the flow returns. ```python @ai.flow() async def menu_suggestion_flow(theme: str) -> str: # Changed return type annotation response = await ai.generate( prompt=f'Invent a menu item for a {theme} themed restaurant.', output_schema=MenuItemSchema, ) output: MenuItemSchema = response.output return f'**{output.dishname}**: {output.description}' ``` ### Calling flows [Section titled “Calling flows”](#calling-flows) Once you’ve defined a flow, you can call it from your Python code as a regular function. The argument to the flow must conform to the input schema, if you defined one. ```python response = await menu_suggestion_flow('bistro') ``` If you defined an output schema, the flow response will conform to it.
For example, if you set the output schema to `MenuItemSchema`, the flow output will contain its properties. ## Streaming flows [Section titled “Streaming flows”](#streaming-flows) Flows support streaming using an interface similar to the one offered by `generate_stream()`. Streaming is useful when your flow generates a large amount of output, because you can present the output to the user as it’s being generated, which improves the perceived responsiveness of your app. As a familiar example, chat-based LLM interfaces often stream their responses to the user as they are generated. Here’s an example of a flow that supports streaming: ```python @ai.flow() async def menu_suggestion_flow(theme: str, ctx): stream, response = ai.generate_stream( prompt=f'Invent a menu item for a {theme} themed restaurant.', ) async for chunk in stream: ctx.send_chunk(chunk.text) return { 'theme': theme, 'menu_item': (await response).text, } ``` The second parameter to your flow definition is a “side channel” context object. It provides features such as the request context and the `send_chunk` callback. The `send_chunk` callback takes a single parameter. Whenever data becomes available within your flow, send the data to the output stream by calling this function. In the above example, the values streamed by the flow are directly coupled to the values streamed by the `generate_stream()` call inside the flow. Although this is often the case, it doesn’t have to be: you can output values to the stream using the callback as often as is useful for your flow. ### Calling streaming flows [Section titled “Calling streaming flows”](#calling-streaming-flows) Streaming flows are also callable from your code. Instead of awaiting a single result, a flow’s `stream()` method immediately returns a stream and a response future. The stream is an async iterable that you can iterate over to consume the flow’s streaming output as it’s generated. ```python stream, response = menu_suggestion_flow.stream('bistro') async for chunk in stream: print(chunk) ``` You can also get the complete output of the flow, as you can with a non-streaming flow. The final response is a future that you can `await` on. ```python print(await response) ``` Note that the streaming output of a flow might not be the same type as the complete output. ## Debugging flows [Section titled “Debugging flows”](#debugging-flows) One of the advantages of encapsulating AI logic within a flow is that you can test and debug the flow independently from your app using the Genkit developer UI. To start the developer UI, run the following commands from your project directory: ```bash genkit start -- python app.py ``` Update `python app.py` to match the way you normally run your app. From the **Run** tab of the developer UI, you can run any of the flows defined in your project. After you’ve run a flow, you can inspect a trace of the flow invocation by either clicking **View trace** or looking at the **Inspect** tab. In the trace viewer, you can see details about the execution of the entire flow, as well as details for each of the individual steps within the flow. ## Deploying flows [Section titled “Deploying flows”](#deploying-flows) You can deploy your flows directly as web API endpoints, ready for you to call from your app clients. Deployment is discussed in detail on several other pages, but this section gives a brief overview of your deployment options. For information on deploying to specific platforms, see [Deploy with Cloud Run](/python/docs/cloud-run/) and [Deploy with Flask](/python/docs/flask/).
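As a quick orientation before those pages, here is a minimal sketch of how a flow like `menu_suggestion_flow` could be exposed over HTTP with the Flask handler used in the deployment guides; it assumes a configured `ai = Genkit(...)` instance and the `genkit-plugin-flask` package:

```python
from flask import Flask

from genkit.plugins.flask import genkit_flask_handler

app = Flask(__name__)

@app.post('/menu_suggestion')
@genkit_flask_handler(ai)  # assumes `ai` is the configured Genkit instance
@ai.flow()
async def menu_suggestion_flow(theme: str, ctx):
    response = await ai.generate(
        prompt=f'Invent a menu item for a {theme} themed restaurant.',
        on_chunk=ctx.send_chunk,  # stream chunks back to the HTTP client
    )
    return response.text
```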
# Tool Interrupts > Learn how to use interrupts to pause and resume LLM generation loops in Genkit Python. *Interrupts* are a special kind of [tool](/python/docs/reference/tools/) that can pause the LLM generation-and-tool-calling loop to return control back to you. When you’re ready, you can then *resume* generation by sending *replies* that the LLM processes for further generation. The most common uses for interrupts fall into a few categories: * **Human-in-the-Loop:** Enabling the user of an interactive AI to clarify needed information or confirm the LLM’s action before it is completed, providing a measure of safety and confidence. * **Async Processing:** Starting an asynchronous task that can only be completed out-of-band, such as sending an approval notification to a human reviewer or kicking off a long-running background process. * **Exit from an Autonomous Task:** Providing the model a way to mark a task as complete, in a workflow that might iterate through a long series of tool calls. ## Before you begin [Section titled “Before you begin”](#before-you-begin) All of the examples documented here assume that you have already set up a project with Genkit dependencies installed. If you want to run the code examples on this page, first complete the steps in the [Get started](/python/docs/get-started/) guide. Before diving too deeply, you should also be familiar with the following concepts: * [Generating content](/python/docs/reference/models/) with AI models. * Genkit’s system for [defining input and output schemas](/python/docs/reference/flows/). * General methods of [tool-calling](/python/docs/reference/tools/). ## Overview of interrupts [Section titled “Overview of interrupts”](#overview-of-interrupts) At a high level, this is what an interrupt looks like when interacting with an LLM: 1. The calling application prompts the LLM with a request. The prompt includes a list of tools, including at least one for an interrupt that the LLM can use to generate a response. 2. The LLM generates either a complete response or a tool call request in a specific format. To the LLM, an interrupt call looks like any other tool call. 3. If the LLM calls an interrupting tool, the Genkit library automatically pauses generation rather than immediately passing responses back to the model for additional processing. 4. The developer checks whether an interrupt call is made, and performs whatever task is required to collect the information needed for the interrupt response. 5. The developer resumes generation by passing an interrupt response to the model. This action triggers a return to Step 2. ## Define manual-response interrupts [Section titled “Define manual-response interrupts”](#define-manual-response-interrupts) The most common kind of interrupt allows the LLM to request clarification from the user, for example by asking a multiple-choice question. For this use case, use the Genkit instance’s `tool()` decorator: ```python from pydantic import BaseModel, Field class Questions(BaseModel): choices: list[str] = Field(description='the choices to display to the user') allow_other: bool = Field(description='when true, allow write-ins') @ai.tool() def ask_question(input: Questions, ctx) -> str: """Use this to ask the user a clarifying question""" ctx.interrupt() ``` Note that the output schema of an interrupt corresponds to the response data you will provide, as opposed to something that will be automatically populated by a tool function.
### Use interrupts [Section titled “Use interrupts”](#use-interrupts) Interrupts are passed into the `tools` array when generating content, just like other types of tools. You can pass both normal tools and interrupts to the same `generate` call: ```python interrupted_response = await ai.generate( prompt='Ask me a movie trivia question.', tools=['ask_question'], ) ``` Genkit immediately returns a response on receipt of an interrupt tool call. ### Respond to interrupts [Section titled “Respond to interrupts”](#respond-to-interrupts) If you’ve passed one or more interrupts to your generate call, you need to check the response for interrupts so that you can handle them: ```python # You can check the 'finish_reason' attribute of the response if interrupted_response.finish_reason == 'interrupted': print("Generation interrupted.") # Or you can check if any interrupt requests are on the response if interrupted_response.interrupts and len(interrupted_response.interrupts) > 0: print(f"Interrupts found: {len(interrupted_response.interrupts)}") ``` Responding to an interrupt is done using the `tool_responses` option on a subsequent `generate` call, making sure to pass in the existing history. There’s a `tool_response` helper function to help you construct the response. Once resumed, the model re-enters the generation loop, including tool execution, until either it completes or another interrupt is triggered: ```python from genkit.ai import tool_response # Assuming tool_response is imported response = await ai.generate( messages=interrupted_response.messages, tool_responses=[tool_response(interrupted_response.interrupts[0], 'b')], tools=['ask_question'], ) ``` # Generating content with AI models > Learn how to use Genkit to generate content with various AI models in Python, including text, structured output, streaming, and multimodal input. At the heart of generative AI are AI *models*. Currently, the two most prominent examples of generative models are large language models (LLMs) and image generation models. These models take input, called a *prompt* (most commonly text, an image, or a combination of both), and from it produce as output text, an image, or even audio or video. The output of these models can be surprisingly convincing: LLMs generate text that appears as though it could have been written by a human being, and image generation models can produce images that are very close to real photographs or artwork created by humans. In addition, LLMs have proven capable of tasks beyond simple text generation: * Writing computer programs * Planning subtasks that are required to complete a larger task * Organizing unorganized data * Understanding and extracting information data from a corpus of text * Following and performing automated activities based on a text description of the activity There are many models available to you, from several different providers. Each model has its own strengths and weaknesses and one model might excel at one task but perform less well at others. Apps making use of generative AI can often benefit from using multiple different models depending on the task at hand. As an app developer, you typically don’t interact with generative AI models directly, but rather through services available as web APIs. Although these services often have similar functionality, they all provide them through different and incompatible APIs. If you want to make use of multiple model services, you have to use each of their proprietary SDKs, potentially incompatible with each other. 
And if you want to upgrade from one model to the newest and most capable one, you might have to build that integration all over again. Genkit addresses this challenge by providing a single interface that abstracts away the details of accessing potentially any generative AI model service, with several pre-built implementations already available. Building your AI-powered app around Genkit simplifies the process of making your first generative AI call and makes it equally easy to combine multiple models or swap one model for another as new models emerge. ### Loading and configuring model plugins [Section titled “Loading and configuring model plugins”](#loading-and-configuring-model-plugins) Before you can use Genkit to start generating content, you need to load and configure a model plugin. If you’re coming from the Getting Started guide, you’ve already done this. Otherwise, see the [Get started](/python/docs/get-started/) guide or the individual plugin’s documentation and follow the steps there before continuing. ### The generate() method [Section titled “The generate() method”](#the-generate-method) In Genkit, the primary interface through which you interact with generative AI models is the `generate()` method. The simplest `generate()` call specifies the model you want to use and a text prompt: ```python import asyncio from genkit.ai import Genkit from genkit.plugins.google_genai import GoogleGenai ai = Genkit( plugins=[GoogleGenai()], model='googleai/gemini-2.5-flash', ) async def main() -> None: result = await ai.generate( prompt='Invent a menu item for a pirate themed restaurant.', ) print(result.text) ai.run_main(main()) ``` When you run this brief example, it will print out the output of the `generate()` call, which will usually be Markdown text as in the following example: ```md ## The Blackheart's Bounty **A hearty stew of slow-cooked beef, spiced with rum and molasses, served in a hollowed-out cannonball with a side of crusty bread and a dollop of tangy pineapple salsa.** **Description:** This dish is a tribute to the hearty meals enjoyed by pirates on the high seas. The beef is tender and flavorful, infused with the warm spices of rum and molasses. The pineapple salsa adds a touch of sweetness and acidity, balancing the richness of the stew. The cannonball serving vessel adds a fun and thematic touch, making this dish a perfect choice for any pirate-themed adventure. ``` Run the script again and you’ll get a different output. The preceding code sample sent the generation request to the default model, which you specified when you configured the Genkit instance. You can also specify a model for a single `generate()` call: ```python result = await ai.generate( prompt='Invent a menu item for a pirate themed restaurant.', model='googleai/gemini-2.0-pro', ) ``` A model string identifier looks like `providerid/modelid`, where the provider ID (in this case, `googleai`) identifies the plugin, and the model ID is a plugin-specific string identifier for a specific version of a model. These examples also illustrate an important point: when you use `generate()` to make generative AI model calls, changing the model you want to use is simply a matter of passing a different value to the model parameter. By using `generate()` instead of the native model SDKs, you give yourself the flexibility to more easily use several different models in your app and change models in the future. So far you have only seen examples of the simplest `generate()` calls.
However, `generate()` also provides an interface for more advanced interactions with generative models, which you will see in the sections that follow. ### System prompts [Section titled “System prompts”](#system-prompts) Some models support providing a *system prompt*, which gives the model instructions as to how you want it to respond to messages from the user. You can use the system prompt to specify a persona you want the model to adopt, the tone of its responses, the format of its responses, and so on. If the model you’re using supports system prompts, you can provide one with the `system` parameter: ```python result = await ai.generate( system='You are a food industry marketing consultant.', prompt='Invent a menu item for a pirate themed restaurant.', ) ``` ### Model parameters [Section titled “Model parameters”](#model-parameters) The `generate()` function takes a `config` parameter, through which you can specify optional settings that control how the model generates content: ```python result = await ai.generate( prompt='Invent a menu item for a pirate themed restaurant.', config={ 'max_output_tokens': 400, 'stop_sequences': ['', ''], 'temperature': 1.2, 'top_p': 0.4, 'top_k': 50, }, ) ``` The exact parameters that are supported depend on the individual model and model API. However, the parameters in the previous example are common to almost every model. The following is a brief explanation of these parameters: * `max_output_tokens`: the maximum amount of text (measured in tokens) the model is allowed to generate. * `stop_sequences`: strings that, when generated, cause the model to stop producing further output. * `temperature`: how much randomness the model introduces when choosing tokens; higher values produce more varied but less predictable output. * `top_p` and `top_k`: additional sampling controls that limit how many candidate tokens the model considers at each step. ### Structured output [Section titled “Structured output”](#structured-output) When using generative AI as a component in your application, you often want output in a format other than plain text. Even if you’re just generating content to display to the user, you can benefit from structured output simply for the purpose of presenting it more attractively to the user. But for more advanced applications of generative AI, such as programmatic use of the model’s output, or feeding the output of one model into another, structured output is a must. In Genkit, you can request structured output from a model by specifying a schema when you call `generate()`: ```python from pydantic import BaseModel class MenuItemSchema(BaseModel): name: str description: str calories: int allergens: list[str] result = await ai.generate( prompt='Invent a menu item for a pirate themed restaurant.', output_schema=MenuItemSchema, ) ``` Model output schemas are specified using [Pydantic models](https://docs.pydantic.dev/latest/concepts/models/). In addition to a schema definition language, Pydantic also provides runtime type checking, which bridges the gap between static Python types and the unpredictable output of generative AI models. Pydantic lets you write code that can rely on the fact that a successful generate call will always return output that conforms to your Python types. When you specify a schema in `generate()`, Genkit does several things behind the scenes: * Augments the prompt with additional guidance about the desired output format. This also has the side effect of specifying to the model what content exactly you want to generate (for example, not only suggest a menu item but also generate a description, a list of allergens, and so on). * Parses the model output into a Pydantic object. * Verifies that the output conforms with the schema.
To get structured output from a successful generate call, use the response object’s `output` property:

```python
output = response.output
```

### Streaming

[Section titled “Streaming”](#streaming)

When generating large amounts of text, you can improve the experience for your users by presenting the output as it’s generated—streaming the output. A familiar example of streaming in action can be seen in most LLM chat apps: users can read the model’s response to their message as it’s being generated, which improves the perceived responsiveness of the application and enhances the illusion of chatting with an intelligent counterpart.

In Genkit, you can stream output using the `generate_stream()` method. Its syntax is similar to the `generate()` method:

```python
stream, response = ai.generate_stream(
    prompt='Suggest a complete menu for a pirate themed restaurant.',
)
```

The first returned value is a stream, which you can use to iterate over the streaming output of the request as it’s generated:

```python
async for chunk in stream:
    print(chunk.text)
```

You can also get the complete output of the request, as you can with a non-streaming request:

```python
complete_text = (await response).text
```

Streaming also works with structured output:

```python
from pydantic import BaseModel


class MenuItemSchema(BaseModel):
    name: str
    description: str
    calories: int
    allergens: list[str]


class MenuSchema(BaseModel):
    starters: list[MenuItemSchema]
    mains: list[MenuItemSchema]
    desserts: list[MenuItemSchema]


stream, response = ai.generate_stream(
    prompt='Suggest a complete menu for a pirate themed restaurant.',
    output_schema=MenuSchema,
)

async for chunk in stream:
    print(chunk.output)

print((await response).output)
```

Streaming structured output works a little differently from streaming text: the `output` property of a response chunk is an object constructed from the accumulation of the chunks that have been produced so far, rather than an object representing a single chunk (which might not be valid on its own). **Every chunk of structured output in a sense supersedes the chunk that came before it.**

For example, here’s what the first five outputs from the prior example might look like:

```json
null

{
  "starters": [{}]
}

{
  "starters": [
    { "name": "Captain's Treasure Chest", "description": "A" }
  ]
}

{
  "starters": [
    {
      "name": "Captain's Treasure Chest",
      "description": "A mix of spiced nuts, olives, and marinated cheese served in a treasure chest.",
      "calories": 350
    }
  ]
}

{
  "starters": [
    {
      "name": "Captain's Treasure Chest",
      "description": "A mix of spiced nuts, olives, and marinated cheese served in a treasure chest.",
      "calories": 350,
      "allergens": []
    },
    { "name": "Shipwreck Salad", "description": "Fresh" }
  ]
}
```

### Multimodal input

[Section titled “Multimodal input”](#multimodal-input)

The examples you’ve seen so far have used text strings as model prompts. While this remains the most common way to prompt generative AI models, many models can also accept other media as prompts. Media prompts are most often used in conjunction with text prompts that instruct the model to perform some operation on the media, such as to caption an image or transcribe an audio recording.

The ability to accept media input and the types of media you can use are completely dependent on the model and its API. For example, the Gemini 1.5 series of models can accept images, video, and audio as prompts.
To provide a media prompt to a model that supports it, instead of passing a simple text prompt to `generate`, pass a list consisting of a media part and a text part:

```python
from genkit.ai import Part

result = await ai.generate(
    prompt=[
        Part(media={'url': 'https://example.com/photo.jpg'}),
        Part(text='Compose a poem about this image.'),
    ],
)
```

In the above example, you specified an image using a publicly accessible HTTPS URL. You can also pass media data directly by encoding it as a data URL. For example:

```python
import base64

from genkit.ai import Part


def read_file(path: str) -> bytes:
    """Read a file's raw bytes."""
    with open(path, 'rb') as f:
        return f.read()


image_bytes = read_file('image.jpg')
base64_encoded_image = base64.b64encode(image_bytes).decode('utf-8')  # Decode bytes to string

result = await ai.generate(
    prompt=[
        Part(media={'url': f'data:image/jpeg;base64,{base64_encoded_image}'}),
        Part(text='Compose a poem about this image.'),
    ],
)
```

All models that support media input support both data URLs and HTTPS URLs. Some model plugins add support for other media sources. For example, the Vertex AI plugin also lets you use Cloud Storage (`gs://`) URLs.

### Generating media

[Section titled “Generating media”](#generating-media)

So far, most of the examples on this page have dealt with generating text using LLMs. However, Genkit can also be used with image generation models. Using `generate()` with an image generation model is similar to using an LLM. For example, to generate an image using an Imagen model, a minimal sketch might look like the following (the model name below is an example via the Vertex AI plugin, and how you consume the returned image depends on the plugin and model you use):

```python
result = await ai.generate(
    model='vertexai/imagen-3.0-generate-002',  # Example Imagen model name
    prompt='A pirate ship made of tropical fruit, photorealistic.',
)
# The generated image is returned as a media part of the response message,
# typically as a base64-encoded data URL that you can save or display.
```

# Dev Local Vector Store Plugin

> Learn how to use the Dev Local Vector Store plugin for local development and testing in Genkit Python.

# Dev Local Vector Store

[Section titled “Dev Local Vector Store”](#dev-local-vector-store)

The Dev Local Vector Store plugin provides a local, file-based vector store for development and testing purposes. It is not intended for production use.

## Installation

[Section titled “Installation”](#installation)

```bash
pip3 install genkit-plugin-dev-local-vectorstore
```

## Configuration

[Section titled “Configuration”](#configuration)

To use this plugin, specify it when you initialize Genkit:

```python
from genkit.ai import Genkit
from genkit.plugins.dev_local_vectorstore import DevLocalVectorStore
from genkit.plugins.google_genai import VertexAI  # Assuming VertexAI is used for the embedder

ai = Genkit(
    plugins=[
        VertexAI(),  # Ensure the embedder's plugin is loaded
        DevLocalVectorStore(
            name='my_vectorstore',
            embedder='vertexai/text-embedding-004',  # Example embedder
        ),
    ],
    # Define a default model if needed
    # model='vertexai/gemini-1.5-flash',
)
```

### Configuration Options

[Section titled “Configuration Options”](#configuration-options)

* **name** (str): A unique name for this vector store instance. This is used as the `retriever` argument to `ai.retrieve`.
* **embedder** (str): The name of the embedding model to use. Must match a configured embedder in your Genkit project.
* **embedder\_options** (dict, optional): Options to pass to the embedder.

## Usage

[Section titled “Usage”](#usage)

### Indexing Documents

[Section titled “Indexing Documents”](#indexing-documents)

The Dev Local Vector Store automatically creates indexes.
To populate it with data, call the static method `.index(name, documents)`:

```python
from genkit.ai import Genkit
from genkit.plugins.dev_local_vectorstore import DevLocalVectorStore
from genkit.plugins.google_genai import VertexAI  # Assuming VertexAI is used for the embedder
from genkit.types import Document

# Assuming 'ai' is configured as shown in the Configuration section
# ai = Genkit(...)

data_list = [
    'This is the first document.',
    'This is the second document.',
    'This is the third document.',
    'This is the fourth document.',
]

genkit_docs = [Document.from_text(text=item) for item in data_list]

# Ensure the vector store name matches the one in the Genkit config
await DevLocalVectorStore.index('my_vectorstore', genkit_docs)
```

### Retrieving Documents

[Section titled “Retrieving Documents”](#retrieving-documents)

Use `ai.retrieve` and pass the store name configured in the `DevLocalVectorStore` constructor:

```python
from genkit.types import Document

# Assuming 'ai' is configured as shown in the Configuration section
# ai = Genkit(...)

docs = await ai.retrieve(
    query=Document.from_text('search query'),
    retriever='my_vectorstore',  # Matches the 'name' in the DevLocalVectorStore config
)
# print(docs)  # Process the retrieved documents
```

# Firestore Vector Store Plugin

> Learn how to use the Firestore Vector Store plugin with Genkit Python to leverage Google Cloud Firestore for RAG.

# Firestore Vector Store

[Section titled “Firestore Vector Store”](#firestore-vector-store)

The Firestore plugin provides retriever implementations that use Google Cloud Firestore as a vector store.

## Installation

[Section titled “Installation”](#installation)

```bash
pip3 install genkit-plugin-firebase
```

## Prerequisites

[Section titled “Prerequisites”](#prerequisites)

* A Firebase project with Cloud Firestore enabled.
* The `genkit` package installed.
* `gcloud` CLI for managing credentials and Firestore indexes.

## Configuration

[Section titled “Configuration”](#configuration)

To use this plugin, specify it when you initialize Genkit:

```python
from genkit.ai import Genkit
from genkit.plugins.firebase.firestore import FirestoreVectorStore
from genkit.plugins.google_genai import VertexAI  # Assuming VertexAI provides the embedder
from google.cloud import firestore

# Ensure you have authenticated with gcloud and set the project
firestore_client = firestore.Client()

ai = Genkit(
    plugins=[
        VertexAI(),  # Ensure the embedder's plugin is loaded
        FirestoreVectorStore(
            name='my_firestore_retriever',
            collection='my_collection',  # Replace with your collection name
            vector_field='embedding',
            content_field='text',
            embedder='vertexai/text-embedding-004',  # Example embedder
            firestore_client=firestore_client,
        ),
    ],
    # Define a default model if needed
    # model='vertexai/gemini-1.5-flash',
)
```

### Configuration Options

[Section titled “Configuration Options”](#configuration-options)

* **name** (str): A unique name for this retriever instance.
* **collection** (str): The name of the Firestore collection to query.
* **vector\_field** (str): The name of the field in the Firestore documents that contains the vector embedding.
* **content\_field** (str): The name of the field in the Firestore documents that contains the text content.
* **embedder** (str): The name of the embedding model to use. Must match a configured embedder in your Genkit project.
* **firestore\_client**: A `google.cloud.firestore.Client` object that will be used for all queries to the vector store.

## Usage

[Section titled “Usage”](#usage)
1. **Create a Firestore Client**:

   ```python
   from google.cloud import firestore

   # Ensure you have authenticated with gcloud and set the project
   firestore_client = firestore.Client()
   ```

2. **Define a Firestore Retriever**:

   ```python
   from genkit.ai import Genkit
   from genkit.plugins.firebase.firestore import FirestoreVectorStore
   from genkit.plugins.google_genai import VertexAI  # Assuming VertexAI provides the embedder
   from google.cloud import firestore

   # Assuming firestore_client is already created
   # firestore_client = firestore.Client()

   ai = Genkit(
       plugins=[
           VertexAI(),  # Ensure the embedder's plugin is loaded
           FirestoreVectorStore(
               name='my_firestore_retriever',
               collection='my_collection',  # Replace with your collection name
               vector_field='embedding',
               content_field='text',
               embedder='vertexai/text-embedding-004',  # Example embedder
               firestore_client=firestore_client,
           ),
       ],
       # Define a default model if needed
       # model='vertexai/gemini-1.5-flash',
   )
   ```

3. **Retrieve Documents**:

   ```python
   from genkit.ai import Document

   # Assuming 'ai' is configured as above

   async def retrieve_documents():
       # Note: ai.retrieve expects a Document object for the query
       query_doc = Document.from_text('What are the main topics?')
       return await ai.retrieve(
           query=query_doc,
           retriever='my_firestore_retriever',  # Matches the 'name' in the FirestoreVectorStore config
       )

   # Example of calling the async function
   # import asyncio
   # retrieved_docs = asyncio.run(retrieve_documents())
   # print(retrieved_docs)
   ```

## Populating the Index

[Section titled “Populating the Index”](#populating-the-index)

Before you can retrieve documents, you need to populate your Firestore collection with data and their corresponding vector embeddings. Here’s how you can do it:

1. **Prepare your Data**: Organize your data into documents. Each document should have at least two fields: a `text` field containing the content you want to retrieve, and an `embedding` field that holds the vector embedding of the content. You can add any other metadata as well.

2. **Generate Embeddings**: Use the same embedding model configured in your `FirestoreVectorStore` to generate vector embeddings for your text content. The `ai.embed()` method can be used.

3. **Upload Documents to Firestore**: Use the Firestore client to upload the documents with their embeddings to the specified collection.
Here’s an example of how to index data:

```python
from genkit.ai import Document, Genkit
from genkit.types import TextPart
from google.cloud import firestore

# Assuming 'ai' is configured with the VertexAI and FirestoreVectorStore plugins
# Assuming 'firestore_client' is an initialized firestore.Client() instance


async def index_documents(documents: list[str], collection_name: str) -> None:
    """Indexes the documents in Firestore."""
    genkit_documents = [Document(content=[TextPart(text=doc)]) for doc in documents]
    # Ensure the embedder name matches the one configured in Genkit
    embed_response = await ai.embed(embedder='vertexai/text-embedding-004', content=genkit_documents)
    embeddings = [emb.embedding for emb in embed_response.embeddings]

    for i, document_text in enumerate(documents):
        doc_id = f'doc-{i + 1}'
        embedding = embeddings[i]
        doc_ref = firestore_client.collection(collection_name).document(doc_id)
        doc_ref.set({
            'text': document_text,
            'embedding': embedding,  # Ensure this field name matches 'vector_field' in the config
            'metadata': f'metadata for doc {i + 1}',
        })
        print(f'Indexed document {doc_id}')  # Optional: print progress


# Example usage
# documents = [
#     'This is document one.',
#     'This is document two.',
#     'This is document three.',
# ]
# import asyncio
# asyncio.run(index_documents(documents, 'my_collection'))  # Replace 'my_collection' with your collection name
```

## Creating a Firestore Index

[Section titled “Creating a Firestore Index”](#creating-a-firestore-index)

To enable vector similarity search, you will need to configure the index in your Firestore database. Use the following command:

```bash
gcloud firestore indexes composite create \
  --project=<YOUR_PROJECT_ID> \
  --collection-group=<COLLECTION_NAME> \
  --query-scope=COLLECTION \
  --field-config=vector-config='{"dimension":<DIMENSION>,"flat": {}}',field-path=<FIELD_PATH>
```

* Replace `<YOUR_PROJECT_ID>` with the ID of your Firebase project.
* Replace `<COLLECTION_NAME>` with the name of your Firestore collection (e.g., `my_collection`).
* Replace `<DIMENSION>` with the correct dimension for your embedding model. Common values are:
  * `768` for `text-embedding-004` (Vertex AI)
* Replace `<FIELD_PATH>` with the name of the field containing vector embeddings (e.g., `embedding`).

# Google GenAI Plugin

> Learn how to configure and use the Google GenAI plugin for Genkit Python, providing access to Google Gemini API and Vertex AI models.

# Google Gen AI

[Section titled “Google Gen AI”](#google-gen-ai)

The `genkit-plugin-google-genai` package provides two plugins for accessing Google’s generative AI models:

1. `GoogleAI`: For accessing models via the Google Gemini API (requires an API key).
2. `VertexAI`: For accessing models via the Gemini API within Google Cloud Vertex AI (uses standard Google Cloud authentication).

## Installation

[Section titled “Installation”](#installation)

```bash
pip3 install genkit-plugin-google-genai
```

## Configuration

[Section titled “Configuration”](#configuration)

### Google Gemini API (`GoogleAI`)

[Section titled “Google Gemini API (GoogleAI)”](#google-gemini-api-googleai)

To use the Google Gemini API, you need an API key.
```python
from genkit.ai import Genkit
from genkit.plugins.google_genai import GoogleAI

ai = Genkit(
    plugins=[GoogleAI()],
    model='googleai/gemini-2.5-flash',
)
```

You will need to set the `GEMINI_API_KEY` environment variable, or you can provide the API key directly:

```python
ai = Genkit(
    plugins=[GoogleAI(api_key='...')],
)
```

### Gemini API in Vertex AI (`VertexAI`)

[Section titled “Gemini API in Vertex AI (VertexAI)”](#gemini-api-in-vertex-ai-vertexai)

To use models via Vertex AI, ensure you have authenticated with Google Cloud (e.g., via `gcloud auth application-default login`).

```python
from genkit.ai import Genkit
from genkit.plugins.google_genai import VertexAI

ai = Genkit(
    plugins=[VertexAI()],
    model='vertexai/gemini-2.5-flash',  # optional
)
```

You can specify the `location` and `project` ID, among other configuration options available in the `VertexAI` constructor:

```python
ai = Genkit(
    plugins=[VertexAI(
        location='us-east1',
        project='my-project-id',
    )],
)
```

# Ollama Plugin

> Learn how to configure and use the Ollama plugin for Genkit Python to run local LLMs and embedding models.

# Ollama

[Section titled “Ollama”](#ollama)

The `genkit-plugin-ollama` package provides integration with [Ollama](https://ollama.com/), allowing you to run various open-source large language models and embedding models locally.

## Installation

[Section titled “Installation”](#installation)

```bash
pip3 install genkit-plugin-ollama
```

You will need to download and install Ollama separately. Then use the Ollama CLI to pull the models you would like to use. For example:

```bash
ollama pull gemma2            # Example model
ollama pull nomic-embed-text  # Example embedder
```

## Configuration

[Section titled “Configuration”](#configuration)

Configure the Ollama plugin in your Genkit initialization, specifying the models and embedders you have pulled and wish to use:

```python
from genkit.ai import Genkit
from genkit.plugins.ollama import Ollama, ModelDefinition, EmbeddingModelDefinition

ai = Genkit(
    plugins=[
        Ollama(
            models=[
                ModelDefinition(name='gemma2'),  # Match the model pulled via the Ollama CLI
                # Add other models as needed
                # ModelDefinition(name='mistral'),
            ],
            embedders=[
                EmbeddingModelDefinition(
                    name='nomic-embed-text',  # Match the embedder pulled via the Ollama CLI
                    # Specify dimensions if known/required by your use case
                    # dimensions=768,  # Example dimension
                )
            ],
            # Optional: Specify the Ollama server address if not the default (http://127.0.0.1:11434)
            # address='http://custom-ollama-host:11434',
        )
    ],
)
```

Then use Ollama models and embedders by specifying the `ollama/` prefix followed by the model or embedder name defined in the configuration:

```python
from genkit.ai import Document

# Assuming 'ai' is configured as above


async def run_ollama():
    generate_response = await ai.generate(
        prompt='Tell me a short story about a space cat.',
        model='ollama/gemma2',  # Use the configured model name
    )
    print('Generated Text:', generate_response.text)

    embedding_response = await ai.embed(
        embedder='ollama/nomic-embed-text',  # Use the configured embedder name
        content=[Document.from_text('This is text to embed.')],  # Pass content as a list of Documents
    )
    print('Embedding:', embedding_response.embeddings[0].embedding)  # Access the embedding vector


# Example of running the async function
# import asyncio
# asyncio.run(run_ollama())
```

# Retrieval-Augmented Generation (RAG)

> Learn how to build Retrieval-Augmented Generation (RAG) flows in Genkit Python using embedders and retrievers.
Genkit provides abstractions that help you build retrieval-augmented generation (RAG) flows, as well as plugins that provide integrations with related tools.

## What is RAG?

[Section titled “What is RAG?”](#what-is-rag)

Retrieval-augmented generation is a technique used to incorporate external sources of information into an LLM’s responses. It’s important to be able to do so because, while LLMs are typically trained on a broad body of material, practical use of LLMs often requires specific domain knowledge (for example, you might want to use an LLM to answer customers’ questions about your company’s products).

One solution is to fine-tune the model using more specific data. However, this can be expensive both in terms of compute cost and in terms of the effort needed to prepare adequate training data.

In contrast, RAG works by incorporating external data sources into a prompt at the time it’s passed to the model. For example, you could imagine the prompt, “What is Bart’s relationship to Lisa?” might be expanded (“augmented”) by prepending some relevant information, resulting in the prompt, “Homer and Marge’s children are named Bart, Lisa, and Maggie. What is Bart’s relationship to Lisa?”

This approach has several advantages:

* It can be more cost-effective because you don’t have to retrain the model.
* You can continuously update your data source and the LLM can immediately make use of the updated information.
* You now have the potential to cite references in your LLM’s responses.

On the other hand, using RAG naturally means longer prompts, and some LLM API services charge for each input token you send. Ultimately, you must evaluate the cost tradeoffs for your applications.

RAG is a very broad area and there are many different techniques used to achieve the best quality RAG. The core Genkit framework offers two main abstractions to help you do RAG:

* **Embedders**: transform documents into a vector representation.
* **Retrievers**: retrieve documents from an “index”, given a query.

These definitions are broad on purpose because Genkit is un-opinionated about what an “index” is or how exactly documents are retrieved from it. Genkit only provides a `Document` format and everything else is defined by the retriever or indexer implementation provider.

### Embedders

[Section titled “Embedders”](#embedders)

An embedder is a function that takes content (text, images, audio, etc.) and creates a numeric vector that encodes the semantic meaning of the original content. As mentioned above, embedders are leveraged as part of the process of indexing; however, they can also be used independently to create embeddings without an index (see the short `ai.embed()` sketch just before the flow example below).

### Retrievers

[Section titled “Retrievers”](#retrievers)

A retriever is a concept that encapsulates logic related to any kind of document retrieval. The most popular retrieval cases typically include retrieval from vector stores; however, in Genkit a retriever can be any function that returns data. To create a retriever, you can use one of the provided implementations or create your own.

## Defining a RAG Flow

[Section titled “Defining a RAG Flow”](#defining-a-rag-flow)

The following examples show how you could ingest a collection of restaurant menu PDF documents into a vector database and retrieve them for use in a flow that determines what food items are available. Note that indexing is outside the scope of Genkit, and you should use the SDKs/APIs provided by the vector store you are using.

The following example shows how you might use a retriever in a RAG flow.
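First, as a brief aside, here is a minimal sketch of calling an embedder directly with `ai.embed()`, as referenced in the Embedders section above. It assumes the VertexAI plugin is configured and uses `vertexai/text-embedding-004` as an example embedder name; the call and response shape mirror the indexing example shown earlier on this page.

```python
from genkit.ai import Genkit
from genkit.plugins.google_genai import VertexAI
from genkit.types import Document

ai = Genkit(plugins=[VertexAI()])


async def embed_texts(texts: list[str]) -> list[list[float]]:
    """Embed a list of strings and return one vector per input."""
    docs = [Document.from_text(text) for text in texts]
    response = await ai.embed(
        embedder='vertexai/text-embedding-004',  # Example embedder name
        content=docs,
    )
    return [item.embedding for item in response.embeddings]
```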
This flow uses the Firestore Vector Store plugin described earlier:

```python
from genkit.ai import Genkit, Document
from genkit.plugins.google_genai import VertexAI
from genkit.plugins.firebase.firestore import FirestoreVectorStore, DistanceMeasure
from google.cloud import firestore

EMBEDDING_MODEL = 'vertexai/text-embedding-004'  # Example embedder name

# Assumes gcloud authentication; see the Firestore plugin configuration above
firestore_client = firestore.Client()

ai = Genkit(
    plugins=[
        VertexAI(),
        FirestoreVectorStore(
            name='my_firestore_retriever',
            collection='mycollection',
            vector_field='embedding',
            content_field='text',
            embedder=EMBEDDING_MODEL,
            distance_measure=DistanceMeasure.EUCLIDEAN,
            firestore_client=firestore_client,
        ),
    ],
)


@ai.flow()
async def qa_flow(query: str):
    docs = await ai.retrieve(
        query=Document.from_text(query),
        retriever='firestore/my_firestore_retriever',
    )
    response = await ai.generate(prompt=query, docs=docs)
    return response.text
```

#### Run the retriever flow

[Section titled “Run the retriever flow”](#run-the-retriever-flow)

```python
result = await qa_flow('Recommend a dessert from the menu while avoiding dairy and nuts')
print(result)
```

The output for this command should contain a response from the model, grounded in the indexed `menu.pdf` file.

## Write your own retrievers

[Section titled “Write your own retrievers”](#write-your-own-retrievers)

It’s also possible to create your own retriever. This is useful if your documents are managed in a document store that is not supported in Genkit (e.g., MySQL, Google Drive, etc.). The Genkit SDK provides flexible methods that let you provide custom code for fetching documents. You can also define custom retrievers that build on top of existing retrievers in Genkit and apply advanced RAG techniques (such as reranking or prompt extensions) on top.

```python
from genkit.types import (
    ActionRunContext,
    Document,
    RetrieverRequest,
    RetrieverResponse,
)


async def my_retriever(request: RetrieverRequest, ctx: ActionRunContext):
    """Example of a retriever.

    Args:
        request: The request to the retriever.
        ctx: The context of the retriever.
    """
    return RetrieverResponse(documents=[Document.from_text('Hello'), Document.from_text('World')])


ai.define_retriever(name='my_retriever', fn=my_retriever)
```

Then you’ll be able to use your retriever with `ai.retrieve`:

```python
docs = await ai.retrieve(
    query=Document.from_text(query),
    retriever='my_retriever',
)
```

# Tool (Function) Calling

> Learn how to use tool calling (function calling) with Genkit Python to give LLMs access to external information and actions.

*Tool calling*, also known as *function calling*, is a structured way to give LLMs the ability to make requests back to the application that called them. You define the tools you want to make available to the model, and the model will make tool requests to your app as necessary to fulfill the prompts you give it.

The use cases of tool calling generally fall into a few themes:

**Giving an LLM access to information it wasn’t trained with**

* Frequently changing information, such as a stock price or the current weather.
* Information specific to your app domain, such as product information or user profiles.

Note the overlap with retrieval augmented generation (RAG), which is also a way to let an LLM integrate factual information into its generations. RAG is a heavier solution that is most suited when you have a large amount of information or the information that’s most relevant to a prompt is ambiguous. On the other hand, if retrieving the information the LLM needs is a simple function call or database lookup, tool calling is more appropriate.
**Introducing a degree of determinism into an LLM workflow**

* Performing calculations that the LLM cannot reliably complete itself.
* Forcing an LLM to generate verbatim text under certain circumstances, such as when responding to a question about an app’s terms of service.

**Performing an action when initiated by an LLM**

* Turning lights on and off in an LLM-powered home assistant.
* Reserving tables in an LLM-powered restaurant agent.

## Before you begin

[Section titled “Before you begin”](#before-you-begin)

If you want to run the code examples on this page, first complete the steps in the [Get started](/python/docs/get-started/) guide. All of the examples assume that you have already set up a project with Genkit dependencies installed.

This page discusses one of the advanced features of Genkit model abstraction, so before you dive too deeply, you should be familiar with the content on the [Generating content with AI models](/python/docs/reference/models/) page. You should also be familiar with Genkit’s system for defining input and output schemas, which is discussed on the [Flows](/python/docs/reference/flows/) page.

## Overview of tool calling

[Section titled “Overview of tool calling”](#overview-of-tool-calling)

At a high level, this is what a typical tool-calling interaction with an LLM looks like:

1. The calling application prompts the LLM with a request and also includes in the prompt a list of tools the LLM can use to generate a response.
2. The LLM either generates a complete response or generates a tool call request in a specific format.
3. If the caller receives a complete response, the request is fulfilled and the interaction ends; but if the caller receives a tool call, it performs whatever logic is appropriate and sends a new request to the LLM containing the original prompt or some variation of it as well as the result of the tool call.
4. The LLM handles the new prompt as in Step 2.

For this to work, several requirements must be met:

* The model must be trained to make tool requests when they are needed to complete a prompt. Most of the larger models provided through web APIs, such as Gemini and Claude, can do this, but smaller and more specialized models often cannot. Genkit will throw an error if you try to provide tools to a model that doesn’t support it.
* The calling application must provide tool definitions to the model in the format it expects.
* The calling application must prompt the model to generate tool-calling requests in the format the application expects.

## Tool calling with Genkit

[Section titled “Tool calling with Genkit”](#tool-calling-with-genkit)

Genkit provides a single interface for tool calling with models that support it. Each model plugin ensures that the last two of the above criteria are met, and the Genkit instance’s `generate()` function automatically carries out the tool-calling loop described earlier.

### Model support

[Section titled “Model support”](#model-support)

Tool calling support depends on the model, the model API, and the Genkit plugin. Consult the relevant documentation to determine if tool calling is likely to be supported. In addition:

* Genkit will throw an error if you try to provide tools to a model that doesn’t support it.
* If the plugin exports model references, the `info.supports.tools` property will indicate if it supports tool calling.
### Defining tools

[Section titled “Defining tools”](#defining-tools)

Use the Genkit instance’s `tool()` decorator to write tool definitions:

```python
from pydantic import BaseModel, Field
from genkit.ai import Genkit
from genkit.plugins.google_genai import GoogleGenai

ai = Genkit(
    plugins=[GoogleGenai()],
    model='googleai/gemini-2.5-flash',
)


class WeatherInput(BaseModel):
    location: str = Field(description='The location to get the current weather for')


@ai.tool()
def get_weather(input: WeatherInput) -> str:
    """Gets the current weather in a given location"""
    # Replace with actual weather fetching logic
    return f'The current weather in {input.location} is 63°F and sunny.'
```

The syntax here looks just like the `flow()` syntax; however, a description is required. In the example above, the function’s docstring and the `Field` description supply it. When writing a tool definition, take special care with the wording and descriptiveness of these descriptions. They are vital for the LLM to make effective use of the available tools.

### Using tools

[Section titled “Using tools”](#using-tools)

Include defined tools in your prompts to generate content:

```python
result = await ai.generate(
    prompt='What is the weather in Baltimore?',
    tools=['get_weather'],
)
```

Genkit will automatically handle the tool call if the LLM needs to use the `get_weather` tool to answer the prompt.

### Pause the tool loop by using interrupts

[Section titled “Pause the tool loop by using interrupts”](#pause-the-tool-loop-by-using-interrupts)

By default, Genkit repeatedly calls the LLM until every tool call has been resolved. You can conditionally pause execution in situations where you want to, for example:

* Ask the user a question or display UI.
* Confirm a potentially risky action with the user.
* Request out-of-band approval for an action.

**Interrupts** are special tools that can halt the loop and return control to your code so that you can handle more advanced scenarios. Visit the [interrupts guide](/python/docs/reference/interrupts/) to learn how to use them.

### Explicitly handling tool calls

[Section titled “Explicitly handling tool calls”](#explicitly-handling-tool-calls)

If you want full control over this tool-calling loop, for example to apply more complicated logic, set the `return_tool_requests` parameter to `True`. Now it’s your responsibility to ensure all of the tool requests are fulfilled:

```python
llm_response = await ai.generate(
    prompt='What is the weather in Baltimore?',
    tools=['get_weather'],
    return_tool_requests=True,
)

tool_request_parts = llm_response.tool_requests

if len(tool_request_parts) == 0:
    print(llm_response.text)
else:
    for part in tool_request_parts:
        await handle_tool(part.name, part.input)
```
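The `handle_tool` function above isn’t provided by Genkit; it stands in for whatever application logic fulfills each tool request. A minimal, hypothetical sketch might simply dispatch on the tool name:

```python
async def handle_tool(name: str, tool_input: dict) -> str:
    """Hypothetical dispatcher that routes a tool request to local application logic."""
    if name == 'get_weather':
        # Replace with a real weather lookup for tool_input['location'].
        return f"The current weather in {tool_input['location']} is 63°F and sunny."
    raise ValueError(f'Unknown tool requested: {name}')
```

After fulfilling the requests, you would typically send the results back to the model in a follow-up `generate()` call so it can produce its final response, as described in the tool-calling overview above.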