Now in early access

Gemini and Claude API.
Cheaper. Right now.

Prepay for credits and get instant API access to Google's Gemini and Anthropic's Claude — fully OpenAI-compatible. Up to 20% cheaper than going direct.

Buy Credits · View Pricing

OpenAI-compatible · Instant API key delivery · No monthly fees

How it works

Up and running in minutes

Three steps from checkout to your first API call. No dashboards to configure, no quotas to request.

Step 01

Buy Credits

Choose a credit bundle — $20, $50, $100, or $500. Pay via Stripe. No subscription, no renewal surprises.

Step 02

Get Your API Key

We email you an API key immediately after payment. No waiting, no approval process, no account setup required.

Step 03

Start Building

Swap your existing base URL to https://getinference.io/v1 and use your GetInference API key. Works with any OpenAI-compatible SDK or library, with no other code changes.
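One way to keep the swap confined to configuration is to read the key and base URL from the environment. The sketch below is illustrative only: the variable names `GETINFERENCE_API_KEY` and `GETINFERENCE_BASE_URL` and the helper `client_settings` are assumptions made for this example, not names GetInference defines. Only the base URL comes from this page.

```python
import os

# Base URL from this page, with the https:// scheme used in the integration example.
DEFAULT_BASE_URL = "https://getinference.io/v1"


def client_settings(env=None):
    """Return keyword arguments for an OpenAI-compatible client.

    The environment variable names are hypothetical, chosen for this sketch.
    """
    env = os.environ if env is None else env
    api_key = env.get("GETINFERENCE_API_KEY")
    if not api_key:
        raise RuntimeError("Set GETINFERENCE_API_KEY to the key emailed after checkout")
    # Allow an override (for example a local mock), and normalise a trailing slash.
    base_url = env.get("GETINFERENCE_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return {"api_key": api_key, "base_url": base_url}
```

With the official Python SDK, the result could be passed straight through as `OpenAI(**client_settings())`, so application code never hard-codes the provider.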

Models

Frontier models, lower cost

Access Google's and Anthropic's best models through a single, consistent API. Direct-API market rates are pulled live for comparison.

Google · Gemini

Gemini 1.5 Flash

Fast, cheap, and capable. The go-to model for high-volume production workloads where speed and cost matter.

Our price · Input / 1M

$0.06

Our price · Output / 1M

$0.18


Google · Gemini

Gemini 1.5 Pro

Best for complex reasoning, long context windows up to 2M tokens, and tasks that need deeper understanding.

Our price · Input / 1M

$1.12

Our price · Output / 1M

$3.36


Anthropic · Claude

Claude 3.5 Sonnet

Anthropic's flagship model. Frontier-grade intelligence for coding, analysis, writing, and reasoning tasks.

Our price · Input / 1M

$2.40

Our price · Output / 1M

$9.60

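To turn the per-million rates into a per-request figure, a rough estimate can be sketched as follows. It assumes billing is simply token count times the listed rate, which this page does not spell out, and the function name `estimate_cost` is hypothetical. The dictionary keys are the display names from the price cards, not necessarily the model identifiers the API accepts.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the price cards above.
PRICES_PER_MILLION = {
    "Gemini 1.5 Flash": (Decimal("0.06"), Decimal("0.18")),
    "Gemini 1.5 Pro": (Decimal("1.12"), Decimal("3.36")),
    "Claude 3.5 Sonnet": (Decimal("2.40"), Decimal("9.60")),
}

MILLION = Decimal(1_000_000)


def estimate_cost(model, input_tokens, output_tokens):
    """Estimated USD cost of one request at the listed rates."""
    input_rate, output_rate = PRICES_PER_MILLION[model]
    return (input_rate * input_tokens + output_rate * output_tokens) / MILLION


# Example: 10,000 input tokens and 2,000 output tokens on Claude 3.5 Sonnet.
print(estimate_cost("Claude 3.5 Sonnet", 10_000, 2_000))
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.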

Integration

One line to switch

GetInference implements the OpenAI Responses API. Change the base URL and API key — everything else stays the same.

Uses the modern client.responses.create() API
Works with the official OpenAI Python and Node.js SDKs
Drop-in for LangChain, LlamaIndex, or any tool with a custom base URL

example.py

from openai import OpenAI

client = OpenAI(
    api_key="your-getinference-key",
    base_url="https://getinference.io/v1"
)

response = client.responses.create(
    model="gemini-flash",
    input="Explain how transformers work."
)

print(response.output_text)

// Node.js equivalent; top-level await requires an ES module
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-getinference-key",
  baseURL: "https://getinference.io/v1",
});

const response = await client.responses.create({
  model: "gemini-flash",
  input: "Explain how transformers work.",
});

console.log(response.output_text);
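For tools that have no OpenAI SDK available, the same call can be sketched as a plain HTTP request. This is a sketch under assumptions: the `/responses` path and `Bearer` authorization header follow OpenAI's public API conventions, and it is assumed rather than confirmed here that GetInference mirrors them exactly. The helper `build_responses_request` is a hypothetical name, and the request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://getinference.io/v1"


def build_responses_request(api_key, model, prompt):
    """Build (but do not send) a POST to the Responses endpoint."""
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/responses",  # assumed path, following OpenAI's convention
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_responses_request(
    "your-getinference-key", "gemini-flash", "Explain how transformers work."
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it. The response body is not parsed here because its exact shape is not documented on this page.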
          

Pricing

Simple, prepaid bundles

No subscriptions. No seats. Credits don't expire. Market rates below are fetched live from LiteLLM's public pricing index.

Per-model rates, input tokens / 1M (live market rates)

Model	GetInference	Direct API	You save
Gemini 1.5 Flash	$0.06	—	—
Gemini 1.5 Pro	$1.12	—	—
Claude 3.5 Sonnet	$2.40	—	—

Credit bundles

Bundle	Credit value	Gemini Flash (input / 1M)	Saving vs. direct
Starter	$20	$0.06	—
Builder	$50	$0.06	—
Pro	$100	$0.06	—
Scale (best value)	$500	$0.05	~17% off
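As a rough guide to what each bundle buys, the arithmetic behind the table can be sketched like this. It assumes the whole balance is spent on Gemini Flash input tokens at the listed rate and ignores output tokens. The function names are hypothetical.

```python
from decimal import Decimal

# (bundle, credit in USD, Gemini Flash input price per 1M tokens), from the table above.
BUNDLES = [
    ("Starter", Decimal("20"), Decimal("0.06")),
    ("Builder", Decimal("50"), Decimal("0.06")),
    ("Pro", Decimal("100"), Decimal("0.06")),
    ("Scale", Decimal("500"), Decimal("0.05")),
]


def input_tokens_covered(credit_usd, price_per_million):
    """Whole number of input tokens a credit balance pays for."""
    return int(credit_usd / price_per_million * 1_000_000)


def saving_percent(base_price, discounted_price):
    """Percentage discount of discounted_price relative to base_price."""
    return (base_price - discounted_price) / base_price * 100


for name, credit, price in BUNDLES:
    print(f"{name}: {input_tokens_covered(credit, price):,} input tokens")

# The Scale rate against the standard bundle rate, rounded to one decimal place.
print(round(saving_percent(Decimal("0.06"), Decimal("0.05")), 1))
```

The last line compares $0.05 against the $0.06 bundle rate, which is where a figure of roughly 17% comes from.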

Buy Credits →

Stripe-powered checkout. Credits added to your account instantly.
