Kybernis Documentation

Kybernis is the deterministic execution layer for AI systems. In simple terms: it keeps a clean ledger, prevents duplicate side-effects, and makes sure every action follows policy.

Execution ledger

A tamper-proof log (ledger) of what happened and when.

Policy engine

Checks budgets, retries, and safe destinations.

Execution plane

The proxy + gates that stop risky actions unless approved.

Why Kybernis exists

In production, AI systems tend to fail at execution boundaries rather than in reasoning. Kybernis focuses on the hard production problems: duplicate charges, flaky upstreams, runaway loops, and missing auditability.

Before Kybernis

Per‑app retries and idempotency keys
Ad‑hoc logging and manual incident review
Budget checks sprinkled in code

With Kybernis

Centralized policy enforcement
Deterministic run ledger + receipts
Durable idempotency + safe retries

Architecture (three planes)

Control plane

Policies, budgets, retries, ledger, and /infra/decide.

Execution plane

Proxy + gates enforce policy at the side‑effect boundary.

Application plane

SDK emits spans for non‑HTTP work to keep the ledger complete.

App/Frameworks → Execution Plane (proxy + gates) → External tools
                          ↑
          Control Plane (policies + ledger)

Optional: the protocol worker handles A2A streaming/execution when the gateway is enabled.

Core concepts (plain English)

Run

A single unit of work (one user request, workflow, or job).

Span

A metered step inside a run (HTTP call, tool, LLM, DB, side‑effect).

Ledger

Append‑only record of spans and events for audit + replay.

Gate (HITL)

Human‑in‑the‑loop boundary for irreversible actions.

Idempotency key

A deterministic key that prevents duplicate side‑effects.

Run receipt

Ordered spans + verdicts + hashes for replay‑safe audit.

Span types include: llm, tool, http, db, vector, code, file, side_effect, sleep, decision, checkpoint.
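
To make "append-only record" and "ordered spans + verdicts + hashes" concrete, here is a minimal sketch of a hash-chained ledger. It is not the Kybernis implementation: the Ledger class, the entry_hash function, and the chaining scheme are assumptions chosen for illustration, and the real receipt format may differ.

```python
import copy
import hashlib
import json


def entry_hash(prev_hash: str, entry: dict) -> str:
    """Hash an entry together with the previous hash (illustrative chaining scheme)."""
    body = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class Ledger:
    """In-memory, append-only list of entries; a stand-in for a real ledger."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self._entries: list = []

    def append(self, entry: dict) -> str:
        # Each hash covers the previous hash, so order is part of the record.
        prev = self._entries[-1]["hash"] if self._entries else self.GENESIS
        digest = entry_hash(prev, entry)
        self._entries.append({"entry": dict(entry), "hash": digest})
        return digest

    def receipt(self) -> list:
        """Return a copy of the ordered entries with their hashes."""
        return copy.deepcopy(self._entries)


def verify(receipt: list) -> bool:
    """Recompute the chain; an edited or reordered entry breaks it."""
    prev = Ledger.GENESIS
    for item in receipt:
        if entry_hash(prev, item["entry"]) != item["hash"]:
            return False
        prev = item["hash"]
    return True
```

In this sketch, changing any field of an earlier entry makes verify return False, which is the property that lets a receipt be replayed and audited later.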

Access & authentication

Console access requires signing in with your organization. API access uses bearer tokens scoped to a tenant. In simple terms: the dashboard uses a login, and APIs use a token + tenant ID from Settings → API keys.

Execution flow

Every irreversible action must be preceded by a policy verdict and recorded as a span. In plain English: ask permission, do the action, write it down.

SPAN_START → /infra/decide → enforce → SPAN_END

“Decide” is the rule check. “Enforce” is the real action. Both are logged.

Verdict values

allow
deny
delay
downgrade
require_approval

SPAN_END status values

ok
error
canceled
deduped
blocked
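
The "ask permission, do the action, write it down" flow can be sketched in a few lines. This is an illustration only: guarded_call is a hypothetical helper, decide stands in for a call to /infra/decide, the ledger is a plain list, and treating every non-allow verdict as blocked is a simplification (delay, downgrade, and require_approval would need their own handling).

```python
import time
from typing import Any, Callable


def guarded_call(
    name: str,
    decide: Callable[[str], str],
    action: Callable[[], Any],
    ledger: list,
) -> Any:
    """Ask for a verdict, run the action only on "allow", and log both steps."""
    ledger.append({"event_type": "SPAN_START", "name": name})
    verdict = decide(name)  # stand-in for a call to /infra/decide

    started = time.monotonic()
    result = None
    if verdict == "allow":
        try:
            result = action()  # the real side-effect happens here
            status = "ok"
        except Exception:
            status = "error"
    else:
        # Simplification: every non-allow verdict is recorded as blocked.
        status = "blocked"

    ledger.append({
        "event_type": "SPAN_END",
        "name": name,
        "verdict": verdict,
        "status": status,
        "latency_ms": int((time.monotonic() - started) * 1000),
    })
    return result
```

The point of the shape is that the action is unreachable without a verdict, and a SPAN_END is written whether the action ran, failed, or was refused.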

Environment variables

Set these so Kybernis knows where to talk, who you are, and which run to scope actions to. Use your tenant ID and token from Settings → API keys.

Dashboard

API_BASE — control plane base URL.
API_TOKEN — bearer token for server calls.
DASH_TENANT_ID — default tenant for UI actions.

SDK + proxy

KYB_SDK_ON — enable SDK (1/0).
KYB_API_BASE — control plane base URL.
KYB_API_TOKEN — bearer token.
KYB_TENANT_ID — tenant scope (from API keys).
KYB_RUN_ID — run scope (unique per request/workflow).
KYB_TOOL_REGISTER — auto‑register tools.
KYB_GATE_AUTO_APPROVE — auto‑approve gates for dev.
KYB_TOOL_CALL_LIMIT — per‑tool call limit.
KYB_ENFORCE_EGRESS — fail‑closed if proxy missing.
HTTP_PROXY/HTTPS_PROXY — route HTTP through proxy.
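
As a rough illustration of what "fail-closed if proxy missing" means, the sketch below refuses to continue when enforcement is on but no proxy variable is set. The check_egress function is hypothetical, and the assumption that KYB_ENFORCE_EGRESS uses 1/0 like KYB_SDK_ON is not confirmed by this page.

```python
import os
from typing import Mapping, Optional


def check_egress(env: Optional[Mapping[str, str]] = None) -> None:
    """Raise if egress enforcement is on but no proxy is configured."""
    env = os.environ if env is None else env
    # Assumption: the flag uses "1" for on, like KYB_SDK_ON.
    enforce = env.get("KYB_ENFORCE_EGRESS", "0") == "1"
    has_proxy = bool(env.get("HTTP_PROXY") or env.get("HTTPS_PROXY"))
    if enforce and not has_proxy:
        raise RuntimeError(
            "KYB_ENFORCE_EGRESS is set but HTTP_PROXY/HTTPS_PROXY is missing"
        )
```

Calling a check like this at startup turns a silent bypass of the proxy into an immediate, visible failure.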

Proxy integration (no code changes)

Route HTTP(S) side-effects through the proxy to get idempotency, retries, and budgets automatically. In simple terms: the proxy blocks duplicates and applies your limits.

If no idempotency key is provided, the proxy derives one from method + URL + body so the same request can be deduped safely.
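
One way to derive a key from method + URL + body is to hash the three parts together. The sketch below shows the idea; the derive_idempotency_key function, the choice of SHA-256, and the length-prefixing are assumptions, not the proxy's documented algorithm.

```python
import hashlib


def derive_idempotency_key(method: str, url: str, body: bytes = b"") -> str:
    """Derive a deterministic key from method + URL + body (illustrative scheme)."""
    h = hashlib.sha256()
    for part in (method.upper().encode("utf-8"), url.encode("utf-8"), body):
        # Length-prefix each part so ("ab", "c") and ("a", "bc") never collide.
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()
```

A consequence of any scheme like this is that two intentionally identical requests produce the same key, so a caller who really wants both to execute would need to send an explicit X-Kyb-Idempotency-Key for each.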

Control headers (what you send)

X-Kyb-On — turn rules on
X-Kyb-Idempotency-Key — stop duplicates
X-Kyb-Retry — allow safe retries
X-Kyb-Budget — cap spend
X-Kyb-Run-Id — tie to a run

Verdict headers (what you get back)

X-Kyb-Verdict — allowed or blocked
X-Kyb-Reason — plain reason
X-Kyb-Policy-Id — which rule
X-Kyb-Dedupe-Of — link to original
X-Kyb-Attempt — retry count
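
On the client side, the verdict headers can be collected into one object after each proxied call. The sketch below works on any mapping of response headers; ProxyVerdict and parse_verdict_headers are hypothetical names, and the assumption that X-Kyb-Attempt carries a plain integer is not confirmed by this page.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class ProxyVerdict:
    verdict: Optional[str]
    reason: Optional[str]
    policy_id: Optional[str]
    dedupe_of: Optional[str]
    attempt: Optional[int]


def parse_verdict_headers(headers: Mapping[str, str]) -> ProxyVerdict:
    """Collect the X-Kyb-* verdict headers from a response, case-insensitively."""
    h = {k.lower(): v for k, v in headers.items()}
    attempt = h.get("x-kyb-attempt")
    return ProxyVerdict(
        verdict=h.get("x-kyb-verdict"),
        reason=h.get("x-kyb-reason"),
        policy_id=h.get("x-kyb-policy-id"),
        dedupe_of=h.get("x-kyb-dedupe-of"),
        # Assumption: the attempt header carries a plain integer.
        attempt=int(attempt) if attempt is not None and attempt.isdigit() else None,
    )
```

A caller could, for example, log dedupe_of whenever it is set to see which original request a duplicate was linked to.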

SDK quickstart (Python)

pip install kybernis

export KYB_SDK_ON=1
export KYB_API_BASE=https://<api>
export KYB_API_TOKEN=<token>
export KYB_TENANT_ID=<tenant>
export KYB_RUN_ID=<run>

from kybernis import KybernisClient, kyb_tool

client = KybernisClient()

# Replace {RUN_ID} with your run ID (for example, the value exported as KYB_RUN_ID).
@kyb_tool(client, run_id="{RUN_ID}", tool_name="charge")
def charge(amount: int):
    return {"ok": True, "cost_micros": amount * 1000}

The SDK exposes KybernisClient, kyb_tool decorators, span helpers, and side‑effect helpers.

SDK reference (Python)

KybernisClient

tool_register(name, kind, endpoint, labels) → tool_id
tool_heartbeat(tool_id) → keeps tool alive
tool_report(tool_id, status, circuit_state, failure_inc)
emit_event(run_id, event_type, payload) → ledger event
record_cost(run_id, cost_micros, currency)
gate_request(...) → {gate_id, status, verdict}
gate_request_span(...) → gate + span linkage
gate_status(gate_id) → {status}
gate_wait(gate_id, timeout_s) → final decision

Span helpers

span_start(client, run_id, span_type, name) → span_id
span_end(client, run_id, span_id, status, latency_ms)
span(client, run_id, span_type, name) → context manager

Span status values: ok, error, canceled, deduped, blocked.
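
To show what a span context manager does conceptually, here is a self-contained stand-in that does not import the SDK. It takes a plain list named events where the real helper takes a client, and the event fields it writes are a simplified assumption rather than the SDK's actual output.

```python
import time
import uuid
from contextlib import contextmanager


@contextmanager
def span(events: list, run_id: str, span_type: str, name: str):
    """Emit a SPAN_START on entry and a SPAN_END on exit, even when the body fails."""
    span_id = uuid.uuid4().hex
    started = time.monotonic()
    events.append({"event_type": "SPAN_START", "run_id": run_id,
                   "span_id": span_id, "span_type": span_type, "name": name})
    status = "ok"
    try:
        yield span_id
    except Exception:
        status = "error"
        raise  # the caller still sees the original exception
    finally:
        events.append({"event_type": "SPAN_END", "run_id": run_id,
                       "span_id": span_id, "status": status,
                       "latency_ms": int((time.monotonic() - started) * 1000)})
```

Used as `with span(events, "run-1", "db", "load_user"):`, the wrapped block always produces a matching start and end pair, which is what keeps the ledger complete for non-HTTP work.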

kyb_tool decorator

Wrap a tool call with gate checks. If denied, it raises (or returns a blocked object).

Inputs: client, run_id, tool_name, action, policy, block, timeout_s. Output: tool result.

Side‑effect helpers

webhook(url, payload, idempotency_key)
email(url, payload, idempotency_key)
charge(url, payload, idempotency_key)

Quick integration: Google ADK Financial Advisor

This example shows a fast, safe Kybernis integration for the Google ADK multi‑agent Financial Advisor. Use it to prevent duplicate trades, enforce budgets, and keep a deterministic run ledger.

The upstream agent code lives at https://github.com/Wingrammer/adk-samples/tree/main/python/agents/financial-advisor.

Agent overview (as‑is)

Data Analyst: market + SEC research via Google Search
Trading Analyst: proposes strategies by risk profile
Execution Agent: plans entry/exit execution
Risk Evaluation: analyzes risks and mitigations

Disclaimer: educational and informational only. Not financial advice.

Where Kybernis fits

Guardrails on side‑effects (orders, webhooks, notifications)
Idempotency across retries
Run ledger + receipts for auditability
Policy‑gated HITL for risky actions

This demo intentionally simulates costly and failure‑prone side‑effects to show Kybernis behavior.

Step 1: Create the Financial Advisor (ADK only)

Start with a data analyst and a broker-facing execution agent. Keep tools simple and HTTP‑only.

Clone + setup

git clone https://github.com/Wingrammer/adk-samples.git
cd adk-samples/python/agents/financial-advisor
uv sync

# Vertex/ADK env
export GOOGLE_GENAI_USE_VERTEXAI=true
export GOOGLE_CLOUD_PROJECT=your-project
export GOOGLE_CLOUD_LOCATION=us-central1

Paste into: mock_broker_tools.py

import requests

def broker_place_order(payload: dict) -> dict:
    """Send an order to the mock broker and return its JSON response."""
    return requests.post("http://localhost:8099/orders", json=payload, timeout=10).json()

Paste into: financial_advisor/sub_agents/data_analyst/agent.py

from google.adk import Agent
from google.adk.tools import google_search
from . import prompt

MODEL = "gemini-2.5-pro"
data_analyst_agent = Agent(
    model=MODEL,
    name="data_analyst_agent",
    instruction=prompt.DATA_ANALYST_PROMPT,
    output_key="market_data_analysis_output",
    tools=[google_search],
)

Paste into: financial_advisor/sub_agents/execution_analyst/agent.py

from google.adk import Agent
from ..mock_broker_tools import broker_place_order
from . import prompt

MODEL = "gemini-2.5-pro"
execution_analyst_agent = Agent(
    model=MODEL,
    name="execution_analyst_agent",
    instruction=prompt.EXECUTION_ANALYST_PROMPT,
    output_key="execution_plan_output",
    tools=[broker_place_order],
)

Paste into: financial_advisor/agent.py

from google.adk.agents import LlmAgent
from google.adk.tools.agent_tool import AgentTool
from . import prompt
from .sub_agents.data_analyst import data_analyst_agent
from .sub_agents.execution_analyst import execution_analyst_agent

MODEL = "gemini-2.5-pro"
root_agent = LlmAgent(
    name="financial_coordinator",
    model=MODEL,
    instruction=prompt.FINANCIAL_COORDINATOR_PROMPT,
    output_key="financial_coordinator_output",
    tools=[
        AgentTool(agent=data_analyst_agent),
        AgentTool(agent=execution_analyst_agent),
    ],
)

Step 2: Run it and feel the pain

Start the mock broker.
Run the ADK agent.
Trigger the same order twice and watch duplicates happen.

Run without Kybernis

go run ./cmd/mock-effects
adk run financial_advisor

# Example prompts
# "Buy 5 shares of MOCK and show my updated balance"
# "Place the same order twice"

This is the real-world pain: retries and flaky HTTP create duplicate side‑effects.

Step 3: Add Kybernis smoothly

Install the Kybernis SDK from PyPI.
Set KYB_API_KEY, KYB_API_BASE, KYB_TENANT_ID in .env.
Uncomment the Kybernis wrapper and keep using root_agent.

Install the SDK

pip install kybernis

Paste into: .env

KYB_API_KEY=your_key
KYB_API_BASE=https://api.kybernis.com
KYB_TENANT_ID=your_tenant

Paste into: financial_advisor/agent.py (replace the commented Kybernis block)

Enable Kybernis in agent.py

from kybernis import KybernisSDK
# Set KYB_API_KEY in .env.
kyb = KybernisSDK(root_agent)

Kybernis auto‑wraps ADK tools and emits spans. Duplicates stop, receipts appear.

Kybernis sequence overview (quick mental model)

System view and run lifecycle (full explanation)

Mock broker endpoints (used on purpose)

The mock broker is not Kybernis. It is a separate demo service provided by Kybernis for education, and in this walkthrough it runs locally. It simulates production failures (timeouts, retries, duplicate calls) so the guardrails are obvious.

Base URL: http://localhost:8099
POST /orders (buy/sell)
POST /orders/fill, POST /orders/cancel

ADK setup steps (as‑is, summarized)

Clone the ADK samples repo and open the financial‑advisor agent.
Install dependencies with `uv sync`.
Set Google Cloud env vars and run `adk run financial_advisor` or `adk web`.

Required envs (example names):

GOOGLE_GENAI_USE_VERTEXAI=true
GOOGLE_CLOUD_PROJECT
GOOGLE_CLOUD_LOCATION
GOOGLE_CLOUD_STORAGE_BUCKET (Agent Engine only)

Use your own Gemini/Vertex configuration or any model ADK supports.

Keep the official disclaimer in the UI. Kybernis only governs execution safety.

Why this demo matters

Retries multiply side‑effects. A single duplicate trade can become thousands at scale. Kybernis prevents duplicates and records every action in a deterministic ledger.

GitLab MCP demo (separate, working)

This is a standalone MCP demo that shows the difference between calling GitLab MCP directly versus routing the same tool through Kybernis.

Before Kybernis

Direct GitLab MCP call

Fast to test, but no centralized registry, tenant scoping, or gateway-level policy and rate limits. Each agent must know the GitLab MCP URL and handle auth on its own.

Call GitLab MCP directly

export GITLAB_MCP_URL=https://gitlab-mcp.example.com

curl -X POST $GITLAB_MCP_URL/mcp/invoke \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "gitlab.search",
    "input": {"query": "rate limit"}
  }'

With Kybernis

Gateway-routed MCP

Kybernis becomes the MCP gateway. Tools are registered once, tenant-scoped, and invoked through a single endpoint with consistent auth and rate limits.

Env (gateway + GitLab MCP URL)

export KYB_PROTOCOL_GATEWAY=http://localhost:8080
export GITLAB_MCP_URL=https://gitlab-mcp.example.com

curl -X POST http://localhost:8080/infra/mcp/tools/register \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KYB_API_KEY" \
  -d '{
    "tenant_id": "'$KYB_TENANT_ID'",
    "name": "gitlab.search",
    "description": "Search GitLab issues and repos",
    "endpoint": "'$GITLAB_MCP_URL'/mcp/invoke",
    "input_schema": {"type":"object","properties":{"query":{"type":"string"}},"required":["query"]},
    "output_schema": {"type":"object"}
  }'

Invoke via Kybernis MCP

curl -X POST http://localhost:8080/mcp/invoke \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KYB_API_KEY" \
  -H "X-Kyb-Tenant-Id: $KYB_TENANT_ID" \
  -d '{
    "tool": "gitlab.search",
    "input": {"query": "rate limit"}
  }'

The response includes X-Kyb-Run-Id if you didn’t provide one.

Important note

Kybernis emits span events for MCP tool invocations and will auto-generate a run id when missing, so MCP calls appear in the ledger and can be included in receipts.

API reference (infra plane)

The API is the control plane. A typical production flow: create a run → call /infra/decide → execute → emit spans → fetch receipts.

Decision & runs

POST /infra/decide
GET /infra/runs
GET /infra/runs/receipt

Verdict values: allow, deny, delay, downgrade, require_approval.

Ledger & events

POST /infra/graph/event
POST /infra/graph/events/batch
GET /infra/graph/events

Tasks & workers

POST /infra/tasks/enqueue
POST /infra/tasks/claim
POST /infra/tasks/ack
POST /infra/tasks/fail
POST /infra/tasks/extend
POST /infra/tasks/requeue
GET /infra/tasks/stats

Fleet & policies

POST /infra/fleet/register
POST /infra/fleet/heartbeat
GET /infra/fleet/workers
GET /infra/fleet/utilization
POST /infra/policies/budget
POST /infra/policies/retry
POST /infra/policies/allowlist
POST /infra/policies/tool_downtime

Policy packs

Policies are stored centrally and applied at runtime. In simple terms: they are rules you set once and Kybernis enforces every time.

Example: “If this tool is down, wait and try again a limited number of times.”

{
  "auto_resume": true,
  "max_resume_per_call": 200
}

KSP v1 schemas

KSP defines SPAN_START and SPAN_END events with policy, usage, and cost metadata. In simple terms: it is the shared event format for proxy and SDK.

SPAN_START

Required fields: ksp_version, event_type, ts, tenant_id, run_id, span_id, span_type, name.

SPAN_END

Required fields: ksp_version, event_type, ts, tenant_id, run_id, span_id, status, latency_ms.

Policy fields

policy.verdict, policy.reason, policy_id, retry settings, and budget limits.

Usage + cost fields

prompt_tokens, completion_tokens, bytes_in/out, cpu_ms, mem_bytes_peak, cost_micros.

Span types: llm, tool, http, db, vector, code, file, side_effect, sleep, decision, checkpoint.
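
The required-field lists above translate directly into a small validator. The sketch below checks only what this page states (required fields, span types, and status values); the validate_event function is hypothetical, and it does not check field types or the policy and usage fields.

```python
REQUIRED_FIELDS = {
    "SPAN_START": ["ksp_version", "event_type", "ts", "tenant_id",
                   "run_id", "span_id", "span_type", "name"],
    "SPAN_END": ["ksp_version", "event_type", "ts", "tenant_id",
                 "run_id", "span_id", "status", "latency_ms"],
}
SPAN_TYPES = {"llm", "tool", "http", "db", "vector", "code", "file",
              "side_effect", "sleep", "decision", "checkpoint"}
SPAN_STATUSES = {"ok", "error", "canceled", "deduped", "blocked"}


def validate_event(event: dict) -> list:
    """Return a list of problems; an empty list means the event looks well-formed."""
    event_type = event.get("event_type")
    if event_type not in REQUIRED_FIELDS:
        return [f"unknown event_type: {event_type!r}"]
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS[event_type] if name not in event]
    if event_type == "SPAN_START" and event.get("span_type", "http") not in SPAN_TYPES:
        problems.append(f"unknown span_type: {event['span_type']!r}")
    if event_type == "SPAN_END" and event.get("status", "ok") not in SPAN_STATUSES:
        problems.append(f"unknown status: {event['status']!r}")
    return problems
```

Running a check like this before emitting events catches malformed spans on the client, where they are cheaper to debug than in the ledger.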

Console guide

The console is the operator view of runs, budgets, and execution health. It is protected by organization authentication and scoped to a tenant.

Runs table

Shows run status, cost, and stop state. Click a run to inspect its spans and receipts.

Task queue stats

Live view of pending/claimed tasks. Useful for dispatchers and retry tuning.

Fleet utilization

Worker heartbeats and utilization (how busy dispatchers are).

API keys

Create tokens and copy your tenant ID in Settings → API keys.

Use Settings to configure retry and budget policies, pause/resume runs, and inspect gates.

Dispatchers (K8s / Temporal)

Kybernis tracks worker fleets via register + heartbeat. Whether you use Kubernetes jobs or Temporal workers, the dispatcher should report health and claim tasks through the infra API.

Register workers with /infra/fleet/register.
Send regular heartbeats via /infra/fleet/heartbeat.
Use task endpoints to claim/ack/fail work items.

In practice this maps to the Kybernis worker services (or your own) running in K8s or Temporal, calling these endpoints to stay visible and healthy.
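
A single iteration of a dispatcher loop might look like the sketch below. The endpoint paths come from this page, but everything else is an assumption for illustration: run_worker_once is a hypothetical helper, post is an injected callable standing in for an authenticated HTTP client, and the payload fields (worker_id, task_id, error) are placeholders, not the documented request schema.

```python
from typing import Callable, Optional

# post(path, payload) -> parsed JSON response, or None when there is nothing to return
Post = Callable[[str, dict], Optional[dict]]


def run_worker_once(post: Post, worker_id: str, handle: Callable[[dict], None]) -> str:
    """Heartbeat, claim one task, run it, then ack or fail it."""
    post("/infra/fleet/heartbeat", {"worker_id": worker_id})
    task = post("/infra/tasks/claim", {"worker_id": worker_id})
    if not task:
        return "idle"  # nothing to claim right now
    try:
        handle(task)
    except Exception as exc:
        # Report the failure so the task can be retried under policy.
        post("/infra/tasks/fail", {"task_id": task["task_id"], "error": str(exc)})
        return "failed"
    post("/infra/tasks/ack", {"task_id": task["task_id"]})
    return "acked"
```

Injecting post keeps the loop independent of the runtime, so the same function could be called from a Kubernetes job or from a Temporal activity.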

FAQ

If Kybernis doesn’t run my workloads, how is it infrastructure?

Kybernis governs execution consequences (policy + ledger) across any runtime. That makes it infra‑grade, like a control plane/service mesh for side‑effects.

How does Kybernis control non‑HTTP tool failures?

The SDK emits spans for tool/code/db/LLM work so the ledger stays complete even when HTTP is not in the path.

Why are Kybernis retries needed if agents already retry?

Kybernis retries are centralized, policy‑aware, and auditable. Agent retries are local and bypassable.

How does Kybernis prevent duplicate side‑effects?

Durable idempotency keys are enforced at the proxy; duplicate requests are deduped server‑side.
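
The dedupe behavior can be pictured as a lookup keyed by idempotency key. The sketch below is an in-memory stand-in, not the proxy's implementation: DedupeStore is a hypothetical class, a real store would be durable and safe under concurrent requests, and how failed attempts are treated is left out.

```python
from typing import Any, Callable


class DedupeStore:
    """In-memory stand-in for a durable idempotency store."""

    def __init__(self) -> None:
        self._results: dict = {}

    def run_once(self, key: str, action: Callable[[], Any]) -> tuple:
        """Run the action for a new key; replay the stored result for a repeat."""
        if key in self._results:
            return "deduped", self._results[key]
        result = action()
        self._results[key] = result
        return "ok", result
```

With this shape, a retried request that carries the same key gets the original result back and the side-effect runs only once.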

What is the minimal state to inject into prompts?

Use the run receipt summary (counts, verdicts, dedupe, retries, budget status) instead of raw logs.
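
As a sketch of what such a summary could contain, the function below reduces a list of span records to a handful of counts. The summarize_receipt function and the per-span fields it reads (status, verdict, attempt, cost_micros) are assumptions for illustration; the actual receipt summary format may differ.

```python
from collections import Counter


def summarize_receipt(spans: list) -> dict:
    """Reduce a list of span records to counts that fit in a prompt."""
    statuses = Counter(s.get("status", "unknown") for s in spans)
    verdicts = Counter(s["verdict"] for s in spans if "verdict" in s)
    return {
        "spans": len(spans),
        "by_status": dict(statuses),
        "by_verdict": dict(verdicts),
        "deduped": statuses.get("deduped", 0),
        # Attempts beyond the first count as retries.
        "retries": sum(max(s.get("attempt", 1) - 1, 0) for s in spans),
        "cost_micros": sum(s.get("cost_micros", 0) for s in spans),
    }
```

The result stays the same size no matter how long the run is, which is the reason to prefer it over raw logs in a prompt.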

How does Kybernis mitigate context truncation, and can it be fully solved?

Current mitigation in Kybernis:

Kybernis keeps a ledger of spans/events and a compact run receipt so the model doesn’t need full history to stay consistent.
Kybernis encourages use of receipts + span summaries instead of raw logs for long-running flows.
Kybernis supports externalized state (events, artifacts) so prompts stay small and references remain stable.

How to fully solve it (in practice):

You can’t fully solve context limits inside the model; you can only architect around them. The practical path is:

1) Persistent memory (ledger + artifacts) as the source of truth.
2) Retrieval + summarization to pull only relevant slices into the prompt.
3) Structured state (receipts, span summaries, tool outputs) instead of raw logs.
4) Chunked or tool-based context (model asks for details on demand).

This is exactly the path Kybernis enables, but the model still has a finite context window.

How does Kybernis address A2A/MCP limitations when some methods rely on them?

A2A/MCP are coordination protocols, not safety/enforcement systems. They don’t prevent loops, runaway spend, or silent failures. Kybernis adds a governance layer at the execution boundary: decisions before side-effects (decide), idempotency and dedupe on side-effects, budgets, retries, gate approvals, and a ledger + receipts for audit and replay safety. While some Kybernis methods use A2A/MCP for transport/coordination, the enforcement happens outside those protocols. That’s how the limitations are addressed: A2A/MCP remain the messaging substrate, and Kybernis is the enforcement plane that keeps them safe and predictable.

Is A2A/Proxy more reliable, while A2A/MCP is more flexible but riskier? What default does Kybernis recommend in the Universal SDK?

A2A + Proxy is the most reliable for enforcing side-effects because the proxy sits in the HTTP path and can hard-stop duplicates, budget overruns, and unsafe retries. A2A + MCP is more flexible for reasoning and tool access, but it is riskier without guardrails because tools can bypass network-level enforcement. Kybernis supports both. In the Universal SDK Kybernis recommends Hybrid by default: keep the proxy guardrails for HTTP side-effects while using MCP for tool governance and context access. If you must choose one, Proxy-first is the safest for production.

Pricing

Kybernis is managed-only. Start on the free plan, then contact us to upgrade.

Free plan: limited runs, spans, and decisions for evaluation.
Managed plan: custom limits, SLAs, and support.
No self-hosted plan at this time.

Limits & constraints

Proxy cannot see non-HTTP work unless SDK spans are emitted. In simple terms: use the SDK for non-HTTP tools.
Some providers ignore proxy settings; use SDK for full enforcement.
Accurate cost tracking requires provider usage data; without it, costs rely on approximation policies.
Fail-closed egress may block dev workflows if HTTP_PROXY/HTTPS_PROXY is missing.