Kybernis Documentation
Kybernis is the deterministic execution layer for AI systems. In simple terms: it keeps a clean ledger, prevents duplicate side-effects, and makes sure every action follows policy.
A tamper-proof log (ledger) of what happened and when.
Checks budgets, retries, and safe destinations.
The proxy + gates that stop risky actions unless approved.
Why Kybernis exists
AI systems don’t fail in reasoning — they fail at execution boundaries. Kybernis focuses on the hard production problems: duplicate charges, flaky upstreams, runaway loops, and missing auditability.
Without Kybernis:
- Per‑app retries and idempotency keys
- Ad‑hoc logging and manual incident review
- Budget checks sprinkled in code
With Kybernis:
- Centralized policy enforcement
- Deterministic run ledger + receipts
- Durable idempotency + safe retries
Architecture (three planes)
Control plane: policies, budgets, retries, ledger, and /infra/decide.
Execution plane: proxy + gates enforce policy at the side‑effect boundary.
SDK emits spans for non‑HTTP work to keep the ledger complete.
App/Frameworks → Execution Plane (proxy + gates) → External tools
↑ Control Plane (policies + ledger)
Optional: the protocol worker handles A2A streaming/execution when the gateway is enabled.
Core concepts (plain English)
Run: a single unit of work (one user request, workflow, or job).
Span: a metered step inside a run (HTTP call, tool, LLM, DB, side‑effect).
Ledger: append‑only record of spans and events for audit + replay.
Gate: human‑in‑the‑loop boundary for irreversible actions.
Idempotency key: a deterministic key that prevents duplicate side‑effects.
Receipt: ordered spans + verdicts + hashes for replay‑safe audit.
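In simple terms, an append‑only record with hashes can be pictured as a hash chain: each entry's hash covers the previous hash, so editing an old entry breaks everything after it. The sketch below is an illustration only and is not Kybernis's storage format; the `Ledger` class, its field names, and the use of SHA-256 are assumptions made for the example.

```python
import hashlib
import json


class Ledger:
    """Hypothetical in-memory append-only ledger (illustration only)."""

    def __init__(self):
        self._entries = []

    def append(self, event: dict) -> str:
        """Append an event and return its hash, chained to the previous entry."""
        prev_hash = self._entries[-1]["hash"] if self._entries else ""
        body = json.dumps(event, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + body).encode()).hexdigest()
        self._entries.append({"event": event, "prev": prev_hash, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash in order; an edited entry breaks the chain."""
        prev_hash = ""
        for entry in self._entries:
            body = json.dumps(entry["event"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + body).encode()).hexdigest()
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True

    def receipt(self) -> list:
        """Ordered list of entry hashes, standing in for a receipt."""
        return [entry["hash"] for entry in self._entries]
```

Calling `verify()` after any in-place modification of an earlier entry returns `False`, which is the property that makes replay and audit trustworthy.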
Access & authentication
Console access requires signing in with your organization. API access uses bearer tokens scoped to a tenant. In simple terms: the dashboard uses a login, and the APIs use a bearer token plus the tenant ID from Settings → API keys.
Execution flow
Every irreversible action must be preceded by a policy verdict and recorded as a span. In plain English: ask permission, do the action, write it down.
SPAN_START → /infra/decide → enforce → SPAN_END
“Decide” is the rule check. “Enforce” is the real action. Both are logged.
Verdicts:
- allow
- deny
- delay
- downgrade
- require_approval
Span statuses:
- ok
- error
- canceled
- deduped
- blocked
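The "ask permission, do the action, write it down" flow can be sketched in a few lines. This is a local simulation, not the Kybernis SDK: `decide` stands in for a call to /infra/decide, the amount rule inside it is invented for the example, the `ledger` list stands in for the run ledger, and only the allow path is executed (every other verdict is recorded as blocked).

```python
import time
import uuid

ledger = []  # stand-in for the run ledger


def decide(action: dict) -> str:
    """Stand-in for POST /infra/decide; the rule below is an assumed example."""
    return "deny" if action.get("amount", 0) > 100 else "allow"


def run_action(run_id: str, name: str, action: dict, do_it):
    """SPAN_START -> decide -> enforce -> SPAN_END for one side-effect."""
    span_id = uuid.uuid4().hex
    ledger.append({"event_type": "SPAN_START", "run_id": run_id,
                   "span_id": span_id, "name": name})
    started = time.monotonic()
    verdict = decide(action)
    result, status = None, "blocked"
    if verdict == "allow":
        try:
            result = do_it(action)  # the real side-effect happens only here
            status = "ok"
        except Exception:
            status = "error"
    ledger.append({
        "event_type": "SPAN_END", "run_id": run_id, "span_id": span_id,
        "status": status, "verdict": verdict,
        "latency_ms": int((time.monotonic() - started) * 1000),
    })
    return status, result
```

Note that the SPAN_END event is written on every path, including denied and failed actions, so the ledger stays complete.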
Environment variables
Set these so Kybernis knows where to talk, who you are, and which run to scope actions to. Use your tenant ID and token from Settings → API keys.
- API_BASE — control plane base URL.
- API_TOKEN — bearer token for server calls.
- DASH_TENANT_ID — default tenant for UI actions.
- KYB_SDK_ON — enable SDK (1/0).
- KYB_API_BASE — control plane base URL.
- KYB_API_TOKEN — bearer token.
- KYB_TENANT_ID — tenant scope (from API keys).
- KYB_RUN_ID — run scope (unique per request/workflow).
- KYB_TOOL_REGISTER — auto‑register tools.
- KYB_GATE_AUTO_APPROVE — auto‑approve gates for dev.
- KYB_TOOL_CALL_LIMIT — per‑tool call limit.
- KYB_ENFORCE_EGRESS — fail‑closed if proxy missing.
- HTTP_PROXY/HTTPS_PROXY — route HTTP through proxy.
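A small loader can catch a missing variable at startup rather than at the first tool call. The sketch below is not part of the Kybernis SDK: the `KybConfig` class and `load_config` function are hypothetical, and treating these four KYB_* variables as required is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class KybConfig:
    api_base: str
    api_token: str
    tenant_id: str
    run_id: str
    sdk_on: bool


def load_config(env=None) -> KybConfig:
    """Read the KYB_* variables listed above; fail early if one is missing."""
    env = os.environ if env is None else env
    # Assumption: these four are required for SDK calls.
    required = ["KYB_API_BASE", "KYB_API_TOKEN", "KYB_TENANT_ID", "KYB_RUN_ID"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return KybConfig(
        api_base=env["KYB_API_BASE"].rstrip("/"),
        api_token=env["KYB_API_TOKEN"],
        tenant_id=env["KYB_TENANT_ID"],
        run_id=env["KYB_RUN_ID"],
        sdk_on=env.get("KYB_SDK_ON", "0") == "1",  # documented as 1/0
    )
```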
Proxy integration (no code changes)
Route HTTP(S) side-effects through the proxy to get idempotency, retries, and budgets automatically. In simple terms: the proxy blocks duplicates and applies your limits.
If no idempotency key is provided, the proxy derives one from method + URL + body so the same request can be deduped safely.
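To make the derivation concrete, here is a minimal sketch of a key built from method + URL + body. The actual hashing scheme used by the proxy is not documented here, so SHA-256, the length prefixes, and the function name `derive_idempotency_key` are all assumptions.

```python
import hashlib


def derive_idempotency_key(method: str, url: str, body: bytes) -> str:
    """Hash method + URL + body into a stable key (assumed scheme, SHA-256)."""
    digest = hashlib.sha256()
    for part in (method.upper().encode(), url.encode(), body):
        # A length prefix keeps ("AB", "C") and ("A", "BC") from colliding.
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return digest.hexdigest()
```

Two byte-identical requests produce the same key and can be deduped; any change to the body (for example, a different order quantity) produces a different key.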
Request headers:
- X-Kyb-On — turn rules on
- X-Kyb-Idempotency-Key — stop duplicates
- X-Kyb-Retry — allow safe retries
- X-Kyb-Budget — cap spend
- X-Kyb-Run-Id — tie to a run
Response headers:
- X-Kyb-Verdict — allowed or blocked
- X-Kyb-Reason — plain reason
- X-Kyb-Policy-Id — which rule
- X-Kyb-Dedupe-Of — link to original
- X-Kyb-Attempt — retry count
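A caller might set a few of these headers on the way out and read the rest on the way back. The sketch below only builds and parses plain dictionaries. The header names come from the list above, but the value formats (for example `"1"` for X-Kyb-On and an integer for X-Kyb-Attempt) and both function names are assumptions.

```python
def kyb_request_headers(run_id: str, idempotency_key: str) -> dict:
    """Headers a caller might attach to a proxied request (values assumed)."""
    return {
        "X-Kyb-On": "1",  # assumed on/off encoding
        "X-Kyb-Run-Id": run_id,
        "X-Kyb-Idempotency-Key": idempotency_key,
    }


def summarize_response(headers: dict) -> dict:
    """Collect the X-Kyb-* response headers, matching names case-insensitively."""
    lower = {key.lower(): value for key, value in headers.items()}
    attempt = lower.get("x-kyb-attempt")
    return {
        "verdict": lower.get("x-kyb-verdict"),
        "reason": lower.get("x-kyb-reason"),
        "policy_id": lower.get("x-kyb-policy-id"),
        "dedupe_of": lower.get("x-kyb-dedupe-of"),
        # Assumption: the retry count is sent as a decimal integer.
        "attempt": int(attempt) if attempt is not None else None,
    }
```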
SDK quickstart (Python)
pip install kybernis
export KYB_SDK_ON=1
export KYB_API_BASE=https://<api>
export KYB_API_TOKEN=<token>
export KYB_TENANT_ID=<tenant>
export KYB_RUN_ID=<run>
from kybernis import KybernisClient, kyb_tool
client = KybernisClient()
@kyb_tool(client, run_id="{RUN_ID}", tool_name="charge")
def charge(amount: int):
    return {"ok": True, "cost_micros": amount * 1000}
The SDK exposes KybernisClient, kyb_tool decorators, span helpers, and side‑effect helpers.
SDK reference (Python)
- tool_register(name, kind, endpoint, labels) → tool_id
- tool_heartbeat(tool_id) → keeps tool alive
- tool_report(tool_id, status, circuit_state, failure_inc)
- emit_event(run_id, event_type, payload) → ledger event
- record_cost(run_id, cost_micros, currency)
- gate_request(...) → {gate_id, status, verdict}
- gate_request_span(...) → gate + span linkage
- gate_status(gate_id) → {status}
- gate_wait(gate_id, timeout_s) → final decision
- span_start(client, run_id, span_type, name) → span_id
- span_end(client, run_id, span_id, status, latency_ms)
- span(client, run_id, span_type, name) → context manager
Wrap a tool call with gate checks. If denied, it raises (or returns a blocked object).
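The "raise if denied" behavior can be pictured as a decorator that asks for a verdict before the wrapped function runs. This is a self-contained sketch, not the real `kyb_tool` implementation: `gated`, `GateDenied`, and the `check_gate` callback are hypothetical names, and the callback stands in for whatever gate request the SDK actually makes.

```python
import functools


class GateDenied(Exception):
    """Raised when the (hypothetical) gate check does not allow the call."""


def gated(check_gate, tool_name: str):
    """Decorator sketch: call check_gate(tool_name, args, kwargs) before the tool."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            verdict = check_gate(tool_name, args, kwargs)
            if verdict != "allow":
                # The tool body never runs unless the verdict is "allow".
                raise GateDenied(f"{tool_name}: {verdict}")
            return fn(*args, **kwargs)
        return wrapper
    return decorator
```

Usage follows the same shape as the quickstart: decorate `charge`, call it as usual, and handle `GateDenied` where a human approval or fallback is appropriate.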
- webhook(url, payload, idempotency_key)
- email(url, payload, idempotency_key)
- charge(url, payload, idempotency_key)
Quick integration: Google ADK Financial Advisor
This example shows a fast, safe Kybernis integration for the Google ADK multi‑agent Financial Advisor. Use it to prevent duplicate trades, enforce budgets, and keep a deterministic run ledger.
The upstream agent code lives at https://github.com/Wingrammer/adk-samples/tree/main/python/agents/financial-advisor.
- Data Analyst: market + SEC research via Google Search
- Trading Analyst: proposes strategies by risk profile
- Execution Agent: plans entry/exit execution
- Risk Evaluation: analyzes risks and mitigations
- Guardrails on side‑effects (orders, webhooks, notifications)
- Idempotency across retries
- Run ledger + receipts for auditability
- Policy‑gated HITL for risky actions
Start with a data analyst and a broker-facing execution agent. Keep tools simple and HTTP‑only.
git clone https://github.com/Wingrammer/adk-samples.git
cd adk-samples/python/agents/financial-advisor
uv sync
# Vertex/ADK env
export GOOGLE_GENAI_USE_VERTEXAI=true
export GOOGLE_CLOUD_PROJECT=your-project
export GOOGLE_CLOUD_LOCATION=us-central1
import requests
def broker_place_order(payload):
    # Unprotected HTTP call to the mock broker: no idempotency key, no retry policy.
    return requests.post("http://localhost:8099/orders", json=payload, timeout=10).json()
from google.adk import Agent
from google.adk.tools import google_search
from . import prompt
MODEL = "gemini-2.5-pro"
data_analyst_agent = Agent(
    model=MODEL,
    name="data_analyst_agent",
    instruction=prompt.DATA_ANALYST_PROMPT,
    output_key="market_data_analysis_output",
    tools=[google_search],
)
from google.adk import Agent
from ..mock_broker_tools import broker_place_order
from . import prompt
MODEL = "gemini-2.5-pro"
execution_analyst_agent = Agent(
    model=MODEL,
    name="execution_analyst_agent",
    instruction=prompt.EXECUTION_ANALYST_PROMPT,
    output_key="execution_plan_output",
    tools=[broker_place_order],
)
from google.adk.agents import LlmAgent
from google.adk.tools.agent_tool import AgentTool
from . import prompt
from .sub_agents.data_analyst import data_analyst_agent
from .sub_agents.execution_analyst import execution_analyst_agent
MODEL = "gemini-2.5-pro"
root_agent = LlmAgent(
    name="financial_coordinator",
    model=MODEL,
    instruction=prompt.FINANCIAL_COORDINATOR_PROMPT,
    output_key="financial_coordinator_output",
    tools=[
        AgentTool(agent=data_analyst_agent),
        AgentTool(agent=execution_analyst_agent),
    ],
)
- Start the mock broker.
- Run the ADK agent.
- Trigger the same order twice and watch duplicates happen.
go run ./cmd/mock-effects
adk run financial_advisor
# Example prompts
# "Buy 5 shares of MOCK and show my updated balance"
# "Place the same order twice"
- Install the Kybernis SDK from PyPI.
- Set KYB_API_KEY, KYB_API_BASE, KYB_TENANT_ID in .env.
- Uncomment the Kybernis wrapper and keep using root_agent.
pip install kybernis
KYB_API_KEY=your_key
KYB_API_BASE=https://api.kybernis.com
KYB_TENANT_ID=your_tenant
from kybernis import KybernisSDK
# Set KYB_API_KEY in .env.
kyb = KybernisSDK(root_agent)
The mock broker is not Kybernis. It is a separate demo service hosted by Kybernis for education. It is used to simulate production failures (timeouts, retries, duplicate calls) so the guardrails are obvious.
- Base URL: http://localhost:8099
- POST /orders (buy/sell)
- POST /orders/fill, POST /orders/cancel
- Clone the ADK samples repo and open the financial‑advisor agent.
- Install dependencies with `uv sync`.
- Set Google Cloud env vars and run `adk run financial_advisor` or `adk web`.
- GOOGLE_GENAI_USE_VERTEXAI=true
- GOOGLE_CLOUD_PROJECT
- GOOGLE_CLOUD_LOCATION
- GOOGLE_CLOUD_STORAGE_BUCKET (Agent Engine only)
Keep the official disclaimer in the UI. Kybernis only governs execution safety.
Retries multiply side‑effects. A single duplicate trade can become thousands at scale. Kybernis prevents duplicates and records every action in a deterministic ledger.
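To see why a durable idempotency key stops duplicate trades, consider a store that remembers the first result for each key and replays it on a retry. This is an in-memory illustration only; the `DedupeStore` class is hypothetical, and a real deployment would need durable storage shared across processes.

```python
class DedupeStore:
    """Hypothetical in-memory dedupe store keyed by idempotency key."""

    def __init__(self):
        self._results = {}

    def execute(self, key: str, action):
        """Run action() once per key; later calls replay the stored result."""
        if key in self._results:
            return "deduped", self._results[key]
        result = action()
        self._results[key] = result
        return "ok", result
```

If an agent retries "Buy 5 shares of MOCK" with the same key, the broker call runs once and the second attempt returns the original result with a `deduped` status.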
GitLab MCP demo (separate, working)
This is a standalone MCP demo that shows the difference between calling GitLab MCP directly versus routing the same tool through Kybernis.
Fast to test, but no centralized registry, tenant scoping, or gateway-level policy and rate limits. Each agent must know the GitLab MCP URL and handle auth on its own.
export GITLAB_MCP_URL=https://gitlab-mcp.example.com
curl -X POST $GITLAB_MCP_URL/mcp/invoke \
-H "Content-Type: application/json" \
-d '{
"tool": "gitlab.search",
"input": {"query": "rate limit"}
}'
Kybernis becomes the MCP gateway. Tools are registered once, tenant-scoped, and invoked through a single endpoint with consistent auth and rate limits.
export KYB_PROTOCOL_GATEWAY=http://localhost:8080
export GITLAB_MCP_URL=https://gitlab-mcp.example.com
curl -X POST http://localhost:8080/infra/mcp/tools/register \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $KYB_API_KEY" \
-d '{
"tenant_id": "'$KYB_TENANT_ID'",
"name": "gitlab.search",
"description": "Search GitLab issues and repos",
"endpoint": "'$GITLAB_MCP_URL'/mcp/invoke",
"input_schema": {"type":"object","properties":{"query":{"type":"string"}},"required":["query"]},
"output_schema": {"type":"object"}
}'
curl -X POST http://localhost:8080/mcp/invoke \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $KYB_API_KEY" \
-H "X-Kyb-Tenant-Id: $KYB_TENANT_ID" \
-d '{
"tool": "gitlab.search",
"input": {"query": "rate limit"}
}'
Kybernis emits span events for MCP tool invocations and will auto-generate a run id when missing, so MCP calls appear in the ledger and can be included in receipts.
API reference (infra plane)
The API is the control plane. A typical production flow: create a run → call /infra/decide → execute → emit spans → fetch receipts.
- POST /infra/decide
- GET /infra/runs
- GET /infra/runs/receipt
- POST /infra/graph/event
- POST /infra/graph/events/batch
- GET /infra/graph/events
- POST /infra/tasks/enqueue
- POST /infra/tasks/claim
- POST /infra/tasks/ack
- POST /infra/tasks/fail
- POST /infra/tasks/extend
- POST /infra/tasks/requeue
- GET /infra/tasks/stats
- POST /infra/fleet/register
- POST /infra/fleet/heartbeat
- GET /infra/fleet/workers
- GET /infra/fleet/utilization
- POST /infra/policies/budget
- POST /infra/policies/retry
- POST /infra/policies/allowlist
- POST /infra/policies/tool_downtime
Policy packs
Policies are stored centrally and applied at runtime. In simple terms: they are rules you set once and Kybernis enforces every time.
Example: “If this tool is down, wait and try again a limited number of times.”
{
"auto_resume": true,
"max_resume_per_call": 200
}
KSP v1 schemas
KSP defines SPAN_START and SPAN_END events with policy, usage, and cost metadata. In simple terms: it is the shared event format for proxy and SDK.
SPAN_START required fields: ksp_version, event_type, ts, tenant_id, run_id, span_id, span_type, name.
SPAN_END required fields: ksp_version, event_type, ts, tenant_id, run_id, span_id, status, latency_ms.
Policy metadata: policy.verdict, policy.reason, policy_id, retry settings, and budget limits.
Usage and cost metadata: prompt_tokens, completion_tokens, bytes_in/out, cpu_ms, mem_bytes_peak, cost_micros.
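The two required-field lists above can be turned into a quick validity check before an event is sent. The sketch assumes the first list belongs to SPAN_START and the second to SPAN_END (inferred from `span_type`/`name` versus `status`/`latency_ms`); the function name `missing_fields` is hypothetical, and no value formats are checked because the schema's types are not given here.

```python
COMMON = ("ksp_version", "event_type", "ts", "tenant_id", "run_id", "span_id")

REQUIRED = {
    "SPAN_START": COMMON + ("span_type", "name"),
    "SPAN_END": COMMON + ("status", "latency_ms"),
}


def missing_fields(event: dict) -> list:
    """Return the required KSP fields that are absent from the event."""
    event_type = event.get("event_type")
    if event_type not in REQUIRED:
        raise ValueError(f"unknown event_type: {event_type!r}")
    return [field for field in REQUIRED[event_type] if field not in event]
```

An empty list means the event carries every required field for its type.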
Console guide
The console is the operator view of runs, budgets, and execution health. It is protected by organization sign-in and scoped to a tenant.
Shows run status, cost, and stop state. Click a run to inspect its spans and receipts.
Live view of pending/claimed tasks. Useful for dispatchers and retry tuning.
Worker heartbeats and utilization (how busy dispatchers are).
Create tokens and copy your tenant ID in Settings → API keys.
Dispatchers (K8s / Temporal)
Kybernis tracks worker fleets via register + heartbeat. Whether you use Kubernetes jobs or Temporal workers, the dispatcher should report health and claim tasks through the infra API.
- Register workers with /infra/fleet/register.
- Send regular heartbeats via /infra/fleet/heartbeat.
- Use task endpoints to claim/ack/fail work items.
In practice this maps to the Kybernis worker services (or your own) running in K8s or Temporal, calling these endpoints to stay visible and healthy.
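A dispatcher loop built on these endpoints might look like the sketch below. The endpoint paths come from this page, but the request and response shapes (`worker_id`, `task_id`, `error`) are assumptions, and `post` is an injected callable so the example does not depend on any particular HTTP client.

```python
def register_worker(post, worker_id: str):
    """Announce the worker once at startup (payload shape is assumed)."""
    return post("/infra/fleet/register", {"worker_id": worker_id})


def run_worker_once(post, worker_id: str, handler) -> str:
    """One dispatcher iteration: heartbeat, claim a task, then ack or fail it."""
    post("/infra/fleet/heartbeat", {"worker_id": worker_id})
    task = post("/infra/tasks/claim", {"worker_id": worker_id})
    if not task:
        return "idle"  # nothing to claim this round
    try:
        handler(task)
    except Exception as exc:
        # Report the failure so the task can be retried or requeued.
        post("/infra/tasks/fail", {"task_id": task["task_id"], "error": str(exc)})
        return "failed"
    post("/infra/tasks/ack", {"task_id": task["task_id"]})
    return "acked"
```

In a Kubernetes job or a Temporal worker, `run_worker_once` would be called in a loop with a sleep between idle rounds, and `post` would wrap an authenticated HTTP client.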
FAQ
Pricing
Kybernis is managed-only. Start on the free plan, then contact us to upgrade.
- Free plan: limited runs, spans, and decisions for evaluation.
- Managed plan: custom limits, SLAs, and support.
- No self-hosted plan at this time.
Limits & constraints
- Proxy cannot see non-HTTP work unless SDK spans are emitted. In simple terms: use the SDK for non-HTTP tools.
- Some providers ignore proxy settings; use SDK for full enforcement.
- Accurate cost requires provider usage data or approximate policies.
- Fail-closed egress may block dev workflows if HTTP(S)_PROXY is missing.
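For the last point, a preflight check at process start can turn a confusing blocked request into a clear error. The sketch is an assumption about how such a check might be written, not SDK behavior: `check_egress` is a hypothetical name, and treating `KYB_ENFORCE_EGRESS=1` as "on" mirrors the 1/0 convention documented for KYB_SDK_ON.

```python
import os


def check_egress(env=None):
    """Fail fast when egress enforcement is on but no proxy is configured."""
    env = os.environ if env is None else env
    enforce = env.get("KYB_ENFORCE_EGRESS", "0") == "1"  # assumed 1/0 encoding
    proxy = env.get("HTTPS_PROXY") or env.get("HTTP_PROXY")
    if enforce and not proxy:
        raise RuntimeError(
            "KYB_ENFORCE_EGRESS is set but neither HTTPS_PROXY nor HTTP_PROXY is configured"
        )
    return proxy
```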