Now in early access

Gemini and Claude API.
Cheaper. Right now.

Prepay for credits and get instant API access to Google's Gemini and Anthropic's Claude — fully OpenAI-compatible. Up to 20% cheaper than going direct.

OpenAI-compatible · Instant API key delivery · No monthly fees

Up and running in minutes

Three steps from checkout to your first API call. No dashboards to configure, no quotas to request.

Step 01

Buy Credits

Choose a credit bundle — $20, $50, $100, or $500. Pay via Stripe. No subscription, no renewal surprises.

Step 02

Get Your API Key

We email you an API key immediately after payment. No waiting, no approval process, no account setup required.

Step 03

Start Building

Point your existing client at https://getinference.io/v1 and swap in your new API key. Works with any OpenAI-compatible SDK or library, with no other code changes.
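One way to make the switch without editing application code is through environment variables. The sketch below assumes your app builds its client with the official OpenAI SDK's defaults, which read OPENAI_BASE_URL and OPENAI_API_KEY. The helper name is hypothetical and the key is a placeholder.

```python
import os

# Assumption: the base URL from Step 03, written with an explicit https scheme.
GETINFERENCE_BASE_URL = "https://getinference.io/v1"


def point_sdk_at_getinference(api_key: str) -> dict:
    """Set the environment variables the official OpenAI SDKs read by default."""
    settings = {
        "OPENAI_BASE_URL": GETINFERENCE_BASE_URL,
        "OPENAI_API_KEY": api_key,
    }
    os.environ.update(settings)
    return settings


# Placeholder key; use the one emailed to you after checkout.
settings = point_sdk_at_getinference("your-getinference-key")
print(settings["OPENAI_BASE_URL"])
```

With those variables set, a client created with no arguments, such as OpenAI(), should pick up both values. In practice you would likely export them in your shell or deployment config rather than set them from Python.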


Frontier models, lower cost

Access Google's and Anthropic's best models through a single, consistent API. Market rates are pulled live.

Google · Gemini

Gemini 1.5 Flash

Fast, cheap, and capable. The go-to model for high-volume production workloads where speed and cost matter.

Our price · Input / 1M: $0.06
Our price · Output / 1M: $0.18

Google · Gemini

Gemini 1.5 Pro

Best for complex reasoning, long context windows up to 2M tokens, and tasks that need deeper understanding.

Our price · Input / 1M: $1.12
Our price · Output / 1M: $3.36

Anthropic · Claude

Claude 3.5 Sonnet

Anthropic's flagship model. Frontier-grade intelligence for coding, analysis, writing, and reasoning tasks.

Our price · Input / 1M: $2.40
Our price · Output / 1M: $9.60
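To see what these per-1M rates mean for a single request, here is a rough cost estimator. It is a sketch that assumes billing is strictly linear in tokens at the listed rates. The dictionary keys are illustrative labels, not confirmed API model identifiers.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the model cards above.
# Keys are illustrative labels only (assumption), not API model names.
PRICES_PER_MILLION = {
    "gemini-1.5-flash": {"input": Decimal("0.06"), "output": Decimal("0.18")},
    "gemini-1.5-pro": {"input": Decimal("1.12"), "output": Decimal("3.36")},
    "claude-3.5-sonnet": {"input": Decimal("2.40"), "output": Decimal("9.60")},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request, assuming purely linear pricing."""
    rates = PRICES_PER_MILLION[model]
    total = rates["input"] * input_tokens + rates["output"] * output_tokens
    return total / Decimal(1_000_000)


# Example: a 2,000-token prompt with a 500-token reply.
cost = estimate_cost("claude-3.5-sonnet", 2_000, 500)
print(f"Estimated cost: ${cost}")
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift.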

One line to switch

GetInference implements the OpenAI Responses API. Change the base URL and API key — everything else stays the same.

  • Uses the modern client.responses.create() API
  • Works with the official OpenAI Python and Node.js SDKs
  • Drop-in for LangChain, LlamaIndex, or any tool with a custom base URL
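Python (example.py)

from openai import OpenAI

# Point the official SDK at GetInference instead of the default OpenAI endpoint.
client = OpenAI(
    api_key="your-getinference-key",  # the key emailed after checkout
    base_url="https://getinference.io/v1",
)

# Same Responses API call you would make against OpenAI.
response = client.responses.create(
    model="gemini-flash",
    input="Explain how transformers work.",
)

print(response.output_text)

Node.js (run as an ES module so top-level await works)

import OpenAI from "openai";

// Point the official SDK at GetInference instead of the default OpenAI endpoint.
const client = new OpenAI({
  apiKey: "your-getinference-key", // the key emailed after checkout
  baseURL: "https://getinference.io/v1",
});

// Same Responses API call you would make against OpenAI.
const response = await client.responses.create({
  model: "gemini-flash",
  input: "Explain how transformers work.",
});

console.log(response.output_text);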
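For tools that don't use an SDK, the same call can be sketched as a plain HTTP request. This page doesn't document the raw endpoint, so the /responses path and Bearer authentication below are assumptions based on how the OpenAI Responses API works. The function name is hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request

BASE_URL = "https://getinference.io/v1"


def build_responses_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a Responses-style POST request."""
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/responses",  # assumed path, mirroring OpenAI's API
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


request = build_responses_request(
    "your-getinference-key", "gemini-flash", "Explain how transformers work."
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before spending any credits.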

Simple, prepaid bundles

No subscriptions. No seats. Credits don't expire. Market rates below are fetched live from LiteLLM's public pricing index.

Per-model rates (input tokens / 1M) · Live market rates

Model              GetInference  Direct API  You save
Gemini 1.5 Flash   $0.06
Gemini 1.5 Pro     $1.12
Claude 3.5 Sonnet  $2.40

Credit bundles
Bundle              Credit value  Gemini Flash (input / 1M)  Saving vs. direct
Starter             $20           $0.06
Builder             $50           $0.06
Pro                 $100          $0.06
Scale (BEST VALUE)  $500          $0.05                      ~17% off
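As a rough guide to what a bundle buys, the arithmetic can be sketched as below. It assumes the whole balance is spent on Gemini Flash input tokens at the listed rate. The function names are illustrative. The page doesn't state the baseline for the ~17% figure, so the sketch only shows that the gap between $0.06 and $0.05 works out to about that much.

```python
from fractions import Fraction


def tokens_for_credit(credit_usd: str, price_per_million_usd: str) -> int:
    """Whole input tokens a credit balance covers at a flat per-1M rate."""
    tokens = Fraction(credit_usd) / Fraction(price_per_million_usd) * 1_000_000
    return int(tokens)  # truncate any fractional token


def discount_percent(base_rate: str, discounted_rate: str) -> float:
    """Percentage reduction from base_rate to discounted_rate."""
    base = Fraction(base_rate)
    return float((base - Fraction(discounted_rate)) / base * 100)


starter_tokens = tokens_for_credit("20", "0.06")   # Starter bundle
scale_tokens = tokens_for_credit("500", "0.05")    # Scale bundle
scale_discount = discount_percent("0.06", "0.05")  # Scale rate vs. the $0.06 rate

print(f"Starter: {starter_tokens:,} input tokens")
print(f"Scale:   {scale_tokens:,} input tokens")
print(f"Scale rate reduction: {scale_discount:.1f}%")
```

Real usage mixes input and output tokens, and output is priced higher, so treat these figures as an upper bound on volume.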
Buy Credits →

Stripe-powered checkout. Credits added to your account instantly.