Anthropic Plugin

The Anthropic plugin provides a unified interface to connect with Anthropic’s Claude models through the Anthropic API using API key authentication. The @genkit-ai/anthropic package is the official Anthropic plugin for Genkit.

The plugin supports a wide range of capabilities:

  • Language Models: Claude models for text generation, reasoning, and multimodal tasks
  • Structured Output: JSON schema-based output generation (via beta API)
  • Thinking and Reasoning: Extended thinking for Claude 4.x models
  • Multimodal: Image understanding and PDF processing
  • Tool Calling: Function calling and tool use
  • Web Search: Real-time web search through server-side tools
  • Prompt Caching: Reduce costs and latency by caching repeated prompts
  • Documents and Citations: Document-based RAG with citation support
Install the plugin:
npm i --save @genkit-ai/anthropic

import { genkit } from 'genkit';
import { anthropic } from '@genkit-ai/anthropic';

const ai = genkit({
  plugins: [
    anthropic(),
    // Or with an explicit API key:
    // anthropic({ apiKey: 'your-api-key' }),
  ],
});

The plugin requires an Anthropic API key, which you can get from the Anthropic Console. You can provide this key in several ways:

  1. Environment variables: Set ANTHROPIC_API_KEY
  2. Plugin configuration: Pass apiKey when initializing the plugin (shown above)

The plugin accepts the following configuration options:

Option      Type               Required  Description
apiKey      string             Yes*      Your Anthropic API key. Can also be set via the ANTHROPIC_API_KEY environment variable.
apiVersion  'stable' | 'beta'  No        Default API surface for all requests. Can be overridden per request (default: 'stable').

*The API key is required but can be provided via the environment variable ANTHROPIC_API_KEY instead of the config option.

const ai = genkit({
  plugins: [
    anthropic({
      apiKey: 'your-api-key',
      apiVersion: 'beta', // Use the beta API by default ('stable' or 'beta')
    }),
  ],
});

You can override properties, such as apiVersion, on a per-request basis.

const response = await ai.generate({
  model: anthropic.model('claude-opus-4-5'),
  prompt: 'Generate a creative story.',
  config: {
    apiVersion: 'beta',
    betas: ['effort-2025-11-24'], // Enable specific beta features
    output_config: {
      effort: 'medium',
    },
  },
});

Anthropic’s prompt caching feature allows you to cache large portions of your prompts (such as system prompts, documents, or images) to reduce costs and latency for repeated requests. Cached content can be reused across multiple API calls, providing significant performance and cost benefits.

Key Benefits:

  • Cost Reduction: Cached tokens are significantly cheaper than regular input tokens
  • Lower Latency: Cached prompts load faster, reducing response time
  • Efficient for Repetitive Content: Ideal for system prompts, large context documents, or few-shot examples

How It Works:

Anthropic automatically caches content based on the cache_control metadata. You can cache system prompts, user messages, images, documents, etc. Use the cacheControl() helper for type-safe cache configuration.

Enable prompt caching in a system prompt using the cacheControl() helper:

import { anthropic, cacheControl } from '@genkit-ai/anthropic';

const response = await ai.generate({
  model: anthropic.model('claude-sonnet-4-5'),
  messages: [
    {
      role: 'system',
      content: [
        {
          text: 'You are a helpful assistant with expertise in quantum physics. [Large system prompt...]'.repeat(100),
          metadata: { ...cacheControl() }, // default: ephemeral
        },
      ],
    },
    {
      role: 'user',
      content: [{ text: 'Explain quantum entanglement.' }],
    },
  ],
});

// Or with an explicit TTL:
// metadata: { ...cacheControl({ ttl: '1h' }) }

// Or using the type directly:
// import { type AnthropicCacheControl } from '@genkit-ai/anthropic';
// metadata: { cache_control: { type: 'ephemeral', ttl: '5m' } as AnthropicCacheControl }

You can monitor cache usage through the response metadata. Check the usage field in the response to see cache read and creation metrics.

console.log(response.metadata.usage);

It will look like this:

{
  "inputTokens": 3,
  "outputTokens": 217,
  "custom": {
    "cache_creation_input_tokens": 3639,
    "cache_read_input_tokens": 0,
    "ephemeral_5m_input_tokens": 3639,
    "ephemeral_1h_input_tokens": 0
  }
}
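As a sketch of how you might act on these metrics, the helper below computes the fraction of input tokens served from cache. The field names mirror the example response above; treat the exact usage shape as an assumption for your plugin version.

```typescript
// Illustrative helper: summarize cache effectiveness from the usage
// metadata shown above. Field names mirror the example response.
interface CacheUsage {
  inputTokens: number;
  custom: {
    cache_creation_input_tokens: number;
    cache_read_input_tokens: number;
  };
}

// Fraction of all input tokens that were read from cache.
function cacheHitRatio(usage: CacheUsage): number {
  const cached = usage.custom.cache_read_input_tokens;
  const total =
    usage.inputTokens + cached + usage.custom.cache_creation_input_tokens;
  return total === 0 ? 0 : cached / total;
}

// First request: everything is a cache write, so the ratio is 0.
console.log(
  cacheHitRatio({
    inputTokens: 3,
    custom: { cache_creation_input_tokens: 3639, cache_read_input_tokens: 0 },
  }),
); // 0
```

On later requests within the cache TTL, cache_read_input_tokens should dominate and the ratio should approach 1.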

You can create model references that call the Anthropic API. The models support tool calling, multimodal input, and structured output.

Claude 4.5 Series - Latest models with advanced reasoning and structured output:

  • claude-haiku-4-5 - Fast and efficient model with structured output support
  • claude-sonnet-4-5 - Balanced model with structured output support
  • claude-opus-4-5 - Most capable model with structured output support

Claude 4 Series - Advanced reasoning models:

  • claude-opus-4-1 - High-performance model with structured output support
  • claude-sonnet-4 - Balanced model for complex tasks
  • claude-opus-4 - Most capable Claude 4 model

Claude 3.5 Series:

  • claude-3-5-haiku - Fast and efficient Claude 3.5 model

Claude 3 Series:

  • claude-3-haiku - Fastest Claude 3 model

import { genkit } from 'genkit';
import { anthropic } from '@genkit-ai/anthropic';

const ai = genkit({
  plugins: [anthropic()],
});

const response = await ai.generate({
  model: anthropic.model('claude-sonnet-4-5'),
  prompt: 'Explain how neural networks learn in simple terms.',
});

console.log(response.text);

You can also pass configuration when creating a model reference:

// Create a model with the beta API version
const betaModel = anthropic.model('claude-sonnet-4-5', { apiVersion: 'beta' });

const response = await ai.generate({
  model: betaModel,
  prompt: 'Your prompt here',
});

Claude 4.5 models support structured output generation via the beta API, which guarantees that the model output will conform to a specified JSON schema.

import { z } from 'genkit';

const response = await ai.generate({
  model: anthropic.model('claude-sonnet-4-5', { apiVersion: 'beta' }),
  output: {
    schema: z.object({
      name: z.string(),
      bio: z.string(),
      age: z.number(),
    }),
    format: 'json',
    constrained: true,
  },
  prompt: 'Generate a profile for a fictional character',
});

console.log(response.output);

Output Configuration:

  • schema ZodSchema - The JSON schema that defines the expected output structure
  • format 'json' - Specifies JSON output format (required for structured output)
  • constrained boolean - When true, enforces strict adherence to the schema

The Anthropic API has specific requirements for JSON schemas used in structured output:

Required Features

  • Objects: Must have additionalProperties: false (automatically added by the plugin)
  • Arrays: Standard array items are supported
  • Enums: Fully supported (z.enum)

Limitations

  • Unions (z.union): Complex unions may be problematic. Prefer using a single object with optional fields.
  • Validation Keywords: Keywords like pattern, minLength, maxLength, minItems, and maxItems are not enforced by the API’s constrained decoding. They may be included but won’t be validated.
  • Recursion: Recursive schemas are generally not supported.
  • Complexity: Deeply nested schemas or schemas with hundreds of properties may trigger complexity limits.

Best Practices

  • Keep schemas simple and flat where possible
  • Use property descriptions (.describe()) to guide the model
  • If you need strict validation (e.g., regex), perform it in your application code after receiving the structured response
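For example, since pattern and similar keywords are not enforced by constrained decoding, you can apply them yourself once the structured response arrives. The sketch below uses a plain regex check on a hypothetical profile shape; the field names and rules are illustrative, not part of the plugin.

```typescript
// Hypothetical structured-output shape. The API guarantees the structure,
// but not format rules like `pattern`, so we validate those here.
interface Profile {
  name: string;
  email: string;
}

// Returns a list of validation errors (empty when the profile passes).
function validateProfile(profile: Profile): string[] {
  const errors: string[] = [];
  // Basic email pattern; a `pattern` keyword in the schema would be ignored
  // by the API's constrained decoding.
  if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(profile.email)) {
    errors.push(`invalid email: ${profile.email}`);
  }
  if (profile.name.trim().length === 0) {
    errors.push('name must not be empty');
  }
  return errors;
}

console.log(validateProfile({ name: 'Ada', email: 'ada@example.com' })); // []
```

Run checks like these on response.output before persisting or acting on the result.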

Claude 4.x models can expose their internal reasoning process, which improves transparency for complex tasks.

const response = await ai.generate({
  model: anthropic.model('claude-sonnet-4-5'),
  prompt: "Walk me through your reasoning for Fermat's little theorem.",
  config: {
    thinking: {
      enabled: true,
      budgetTokens: 4096, // Must be >= 1024 and less than max_tokens
    },
  },
});

console.log(response.text); // Final assistant answer
console.log(response.reasoning); // Summarized thinking steps

Thinking Configuration:

  • enabled: boolean - Enable thinking for this request
  • budgetTokens: number - Number of thinking tokens to allocate (must be >= 1024 and less than max_tokens)

When thinking is enabled, streamed responses deliver reasoning parts as they arrive so you can render the chain-of-thought incrementally.

Claude models support streaming responses using generateStream():

const { stream } = ai.generateStream({
  model: anthropic.model('claude-sonnet-4-5'),
  prompt: 'Write a long explanation about quantum computing.',
});

for await (const chunk of stream) {
  if (chunk.text) {
    process.stdout.write(chunk.text);
  }
  if (chunk.reasoning) {
    // Handle thinking/reasoning chunks
    console.log('\n[Thinking]', chunk.reasoning);
  }
}

Claude models can reason about images passed as inline data or URLs. Supported formats include JPEG, PNG, GIF, and WebP.

const response = await ai.generate({
  model: anthropic.model('claude-sonnet-4-5'),
  prompt: [
    { text: 'Describe what is in this image' },
    { media: { url: 'https://example.com/image.jpg' } },
  ],
});

Claude models can process PDF documents to extract information, summarize content, or answer questions based on the visual layout and text.

const response = await ai.generate({
  model: anthropic.model('claude-sonnet-4-5'),
  prompt: [
    { text: 'Summarize this document' },
    {
      media: {
        contentType: 'application/pdf',
        url: 'https://example.com/doc.pdf',
      },
    },
  ],
});

Claude models support function calling and tool use. Define tools using ai.defineTool() and pass them to the model:

import { z } from 'genkit';

const getWeather = ai.defineTool(
  {
    name: 'getWeather',
    description: 'Gets the current weather in a given location',
    inputSchema: z.object({
      location: z
        .string()
        .describe('The location to get the current weather for'),
    }),
    outputSchema: z.string(),
  },
  async (input) => {
    // Execute the tool logic here
    return `The current weather in ${input.location} is 63°F and sunny.`;
  },
);

const response = await ai.generate({
  model: anthropic.model('claude-sonnet-4-5'),
  prompt: "What's the weather like in San Francisco?",
  tools: [getWeather],
});

// The response will contain the tool output if the model decided to call it
console.log(response.text);

Tool Choice Configuration:

You can control tool usage with the tool_choice configuration:

const response = await ai.generate({
  model: anthropic.model('claude-sonnet-4-5'),
  prompt: 'Get the weather for San Francisco',
  tools: [getWeather],
  config: {
    tool_choice: {
      type: 'tool',
      name: 'getWeather', // Force use of a specific tool
    },
    // Or use 'auto' to let the model decide:
    // tool_choice: { type: 'auto' },
    // Or use 'any' to require at least one tool call:
    // tool_choice: { type: 'any' },
  },
});

Claude models support web search capabilities through Anthropic’s server-side tool integration. When enabled, the model can search the web to find current information and include it in responses.

Key Features:

  • Real-time Information: Access current web data beyond the model’s training cutoff
  • Automatic Search: Model decides when to search based on the query
  • Source Attribution: Results include source information for transparency

Web search is available through Anthropic’s server tools. The model will use web search when it determines that current information would improve the response:

import { anthropic } from '@genkit-ai/anthropic';

const response = await ai.generate({
  model: anthropic.model('claude-sonnet-4-5'),
  prompt: 'What are the latest developments in quantum computing this week?',
  config: {
    tools: [
      {
        type: 'web_search_20250305',
        name: 'web_search',
      },
    ],
  },
});

console.log(response.text);

Claude models support document-based RAG with citation support. Use the anthropicDocument() helper to provide documents that can be cited in responses.

import { anthropic, anthropicDocument } from '@genkit-ai/anthropic';

const response = await ai.generate({
  model: anthropic.model('claude-sonnet-4-5'),
  messages: [
    {
      role: 'user',
      content: [
        anthropicDocument({
          source: {
            type: 'text',
            data: 'The grass is green. The sky is blue.',
          },
          title: 'Nature Facts',
          citations: { enabled: true },
        }),
        { text: 'What color is the grass?' },
      ],
    },
  ],
});

// Access citations from the response
if (response.messages) {
  for (const message of response.messages) {
    for (const part of message.content) {
      if (part.metadata?.citations) {
        console.log('Citations:', part.metadata.citations);
      }
    }
  }
}

Document Sources:

The anthropicDocument() helper supports multiple source types:

  • Text: { type: 'text', data: string, mediaType?: string }
  • Base64: { type: 'base64', data: string, mediaType: string }
  • File: { type: 'file', fileId: string } (from Anthropic Files API)
  • URL: { type: 'url', url: string } (for PDFs)
  • Content: { type: 'content', content: Array<...> } (custom content blocks)

Citation Types:

Citations can reference:

  • Character locations (char_location) for text documents
  • Page numbers (page_location) for PDF documents
  • Content block indices (content_block_location) for custom content
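The sketch below shows one way to dispatch on those three citation types when rendering sources. The field names (cited_text, start_char_index, and so on) follow Anthropic's documented citation formats, but verify them against the metadata your responses actually carry.

```typescript
// Illustrative citation shapes for the three documented location types.
type Citation =
  | { type: 'char_location'; cited_text: string; start_char_index: number; end_char_index: number }
  | { type: 'page_location'; cited_text: string; start_page_number: number; end_page_number: number }
  | { type: 'content_block_location'; cited_text: string; start_block_index: number; end_block_index: number };

// Render a human-readable source reference for each citation kind.
function describeCitation(c: Citation): string {
  switch (c.type) {
    case 'char_location':
      return `chars ${c.start_char_index}-${c.end_char_index}: "${c.cited_text}"`;
    case 'page_location':
      return `pages ${c.start_page_number}-${c.end_page_number}: "${c.cited_text}"`;
    case 'content_block_location':
      return `blocks ${c.start_block_index}-${c.end_block_index}: "${c.cited_text}"`;
  }
}

console.log(
  describeCitation({
    type: 'char_location',
    cited_text: 'The grass is green.',
    start_char_index: 0,
    end_char_index: 19,
  }),
);
```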

Claude models support system messages to set the model’s behavior:

const response = await ai.generate({
  model: anthropic.model('claude-sonnet-4-5'),
  messages: [
    {
      role: 'system',
      content: [
        { text: 'You are a helpful assistant that explains concepts clearly.' },
      ],
    },
    {
      role: 'user',
      content: [{ text: 'Explain quantum computing.' }],
    },
  ],
});

Anthropic models support various configuration options:

const response = await ai.generate({
  model: anthropic.model('claude-sonnet-4-5'),
  prompt: 'Your prompt here',
  config: {
    temperature: 0.7, // Controls randomness (0.0 to 1.0)
    maxOutputTokens: 4096, // Maximum tokens to generate
    topP: 0.9, // Nucleus sampling parameter
    tool_choice: { type: 'auto' }, // Tool usage control
    metadata: {
      user_id: 'user-123', // User identifier for tracking
    },
    apiVersion: 'beta', // Override the default API version
  },
});

Configuration Options:

  • temperature number - Controls randomness (0.0 to 1.0). Higher values make output more random.
  • maxOutputTokens number - Maximum number of tokens to generate in the response.
  • topP number - Nucleus sampling parameter (0.0 to 1.0).
  • tool_choice object - Controls tool usage:
    • { type: 'auto' } - Let the model decide
    • { type: 'any' } - Require at least one tool call
    • { type: 'tool', name: string } - Force use of a specific tool
  • metadata object - Metadata to include in the request:
    • user_id string - User identifier for tracking and abuse prevention
  • apiVersion 'stable' | 'beta' - Override the default API version for this request
  • thinking object - Thinking configuration (Claude 4.x only):
    • enabled boolean - Enable thinking
    • budgetTokens number - Thinking token budget (>= 1024)

The plugin supports Genkit Plugin API v2, which allows you to use models directly without initializing the full Genkit framework:

import { anthropic } from '@genkit-ai/anthropic';

// Create a model reference directly
const claude = anthropic.model('claude-sonnet-4-5');

// Use the model directly
const response = await claude({
  messages: [
    {
      role: 'user',
      content: [{ text: 'Tell me a joke.' }],
    },
  ],
});

console.log(response);

This approach is useful for:

  • Framework developers who need raw model access
  • Testing models in isolation
  • Using Genkit models in non-Genkit applications

The beta API surface provides access to experimental features, but some server-managed tool blocks are not yet supported by this plugin. The following beta API features will cause an error if encountered:

  • web_fetch_tool_result
  • code_execution_tool_result
  • bash_code_execution_tool_result
  • text_editor_code_execution_tool_result
  • mcp_tool_result
  • mcp_tool_use
  • container_upload

Note that server_tool_use and web_search_tool_result ARE supported and work with both stable and beta APIs.
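If you forward raw beta content blocks into the plugin, one defensive option is to reject the unsupported block types up front rather than letting the request fail mid-stream. The helper below is purely illustrative and not part of the plugin API; it simply encodes the denylist from the section above.

```typescript
// The beta block types listed above that this plugin does not yet handle.
const UNSUPPORTED_BETA_BLOCKS = new Set([
  'web_fetch_tool_result',
  'code_execution_tool_result',
  'bash_code_execution_tool_result',
  'text_editor_code_execution_tool_result',
  'mcp_tool_result',
  'mcp_tool_use',
  'container_upload',
]);

// Throw early if any block would cause the plugin to error.
function assertSupportedBlocks(blocks: Array<{ type: string }>): void {
  for (const block of blocks) {
    if (UNSUPPORTED_BETA_BLOCKS.has(block.type)) {
      throw new Error(`Unsupported beta content block: ${block.type}`);
    }
  }
}

// server_tool_use and web_search_tool_result pass the check:
assertSupportedBlocks([
  { type: 'server_tool_use' },
  { type: 'web_search_tool_result' },
]);
```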

For comprehensive examples demonstrating all plugin features, see the Genkit Anthropic testapp.