# Sheaf Pod — Integration Guide (for your coding agent)

This document is written so an AI coding agent (Devin, Claude Code, Cursor, etc.) can wire **Sheaf Pod**
into your product with no further questions. Hand it the whole file.

## What Sheaf Pod is

Sheaf Pod is one HTTP endpoint that runs a **pod** of AI minds and returns **one reconciled answer plus a
coherence audit** (where the minds agree, and where they quietly contradict each other). You define the
minds (models + roles + context) and how they work together (the `podType`).

- **Base URL:** `https://api.discursa.ai`
- **Auth:** every request needs header `Authorization: Bearer <YOUR_KEY>` (a key that starts with `sk_pod_`).
- **Content type:** `application/json`. Responses are JSON.
- Keep your key server-side. Never embed it in client code.
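
The auth rules above can be captured in a small server-side helper. This is a minimal sketch, not an official client: the function names `auth_headers` and `headers_from_env` are hypothetical, and the prefix check only mirrors the `sk_pod_` prefix stated in this guide.

```python
import os

KEY_PREFIX = "sk_pod_"  # prefix stated in this guide


def auth_headers(key: str) -> dict[str, str]:
    """Return the headers every Sheaf Pod request needs."""
    if not key.startswith(KEY_PREFIX):
        raise ValueError(f"Sheaf Pod keys start with {KEY_PREFIX!r}")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }


def headers_from_env(var: str = "SHEAF_POD_KEY") -> dict[str, str]:
    """Read the key from the environment so it never lands in source control."""
    return auth_headers(os.environ[var])
```

`SHEAF_POD_KEY` is the same environment variable name the curl examples use.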

## Quickstart (smallest working call)

```bash
curl -X POST https://api.discursa.ai/api/v1/pod/run \
  -H "Authorization: Bearer $SHEAF_POD_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "pod": {
      "podType": "council",
      "members": [
        {"id": "optimist", "model": "claude-opus-4-8", "role": "Optimist"},
        {"id": "skeptic",  "model": "gpt-5.5",         "role": "Skeptic"}
      ]
    },
    "input": "Should we ship the feature this week?"
  }'
```
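
For a backend that is not shelling out to curl, the same call might look like this in Python using only the standard library. It is a sketch under assumptions: `build_run_request` and `run_pod` are hypothetical helper names, and the 90 second default follows this guide's advice for sync calls.

```python
import json
import urllib.request

BASE_URL = "https://api.discursa.ai"

QUICKSTART_BODY = {
    "pod": {
        "podType": "council",
        "members": [
            {"id": "optimist", "model": "claude-opus-4-8", "role": "Optimist"},
            {"id": "skeptic", "model": "gpt-5.5", "role": "Skeptic"},
        ],
    },
    "input": "Should we ship the feature this week?",
}


def build_run_request(key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST /api/v1/pod/run request."""
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/pod/run",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
    )


def run_pod(key: str, body: dict, timeout: float = 90.0) -> dict:
    """Send a sync run and return the parsed JSON response."""
    request = build_run_request(key, body)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `run_pod(os.environ["SHEAF_POD_KEY"], QUICKSTART_BODY)` would perform the real request; building the request separately keeps the payload easy to inspect and test without network access.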

## The endpoints

| Method & path | Purpose |
|---|---|
| `POST /api/v1/pod/run` | Run a pod (inline `pod`, or `podId` for a saved one). Sync, `stream`, or `async`. |
| `GET  /api/v1/pod/job/{jobId}` | Poll an async job's status + result. |
| `POST /api/v1/pod/save` | Save/update a pod definition → returns `{podId, version}`. |
| `GET  /api/v1/pod/list` | List your saved pods. |
| `GET  /api/v1/pod/get/{podId}` | Fetch one saved pod's config. |
| `GET  /api/v1/pod/usage` | Your run count, spend, and recent runs. |

All require the `Authorization: Bearer` header.

## ⏱ Latency — read this before integrating

A pod is **several AI calls in one request** (each mind + a synthesizer + the audit), so a run takes
**~10–60 seconds** depending on pod size, much longer than a single model call. This is expected. Pick a
call mode accordingly. In the table below, `/run` is short for `/api/v1/pod/run` and `/job/{jobId}` for
`/api/v1/pod/job/{jobId}`:

| Mode | How | Best for |
|---|---|---|
| **Sync** (default) | `POST /run`, wait for the JSON | Quick pods; scripts; **set your HTTP client timeout to ≥ 90s** |
| **Async** | `POST /run` with `"async": true` → `202 {jobId}`; then poll `GET /job/{jobId}`, or pass `"webhook"` to have the result POSTed to your URL | **Backend/pipeline integrations** (e.g. an offline content pipeline), where you shouldn't hold a connection open |
| **Stream** | `POST /run` with `"stream": true` → Server-Sent Events | Live UIs that show progress as each mind lands |

**For a production integration, prefer async + webhook** — fire the run, get notified when it's done. Don't
make a user wait on a synchronous call, and don't let a default 30s client timeout kill the request.

### Async example
```bash
# 1) submit
curl -X POST https://api.discursa.ai/api/v1/pod/run \
  -H "Authorization: Bearer $SHEAF_POD_KEY" -H "Content-Type: application/json" \
  -d '{ "pod": { "podType":"council", "members":[{"model":"claude-opus-4-8"},{"model":"gpt-5.5"}] },
        "input":"…", "async": true, "webhook": "https://your-app.com/sheaf-callback" }'
# → {"ok":true,"jobId":"…","status":"queued","poll":"/api/v1/pod/job/…"}

# 2) poll (or just receive the webhook POST)
curl https://api.discursa.ai/api/v1/pod/job/THE_JOB_ID -H "Authorization: Bearer $SHEAF_POD_KEY"
# → {"ok":true,"status":"done","result":{ …full run response… }}
```
Sheaf Pod sends `POST {jobId, status, result}` to your webhook URL (https URLs only). If you poll instead, an interval of about 5s is fine.
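
The polling step can be sketched as a loop with an injectable fetch function, which keeps it testable without network access. Assumptions to note: `poll_job`, `fetch_job`, and `max_wait_s` are hypothetical names, `"done"` is the only terminal status this guide shows, and a body with `ok` set to false is treated as a failure.

```python
import time
from typing import Callable


def poll_job(
    fetch_job: Callable[[str], dict],
    job_id: str,
    interval_s: float = 5.0,
    max_wait_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reports status "done", then return its result."""
    waited = 0.0
    while True:
        # fetch_job should GET /api/v1/pod/job/{job_id} and return the parsed JSON.
        job = fetch_job(job_id)
        if not job.get("ok", False):
            raise RuntimeError(job.get("error") or "job lookup failed")
        if job.get("status") == "done":
            return job["result"]
        if waited >= max_wait_s:
            raise TimeoutError(f"job {job_id} not done after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

The `max_wait_s` ceiling is a client-side safety net chosen for this sketch, not a documented server limit.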

### Streaming events (SSE)
With `"stream": true`, you receive events in this order:
`start` → `member` (one per mind, as it finishes) → `phase` → `answer` → `coherence` → `usage` → `done`.
Each `data:` line is JSON. The `done` event carries the full run response.
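
A stream consumer could parse these events roughly as follows. This sketch assumes standard Server-Sent Events framing, with the event name carried in an `event:` field and events separated by blank lines; this guide only confirms that each `data:` line is JSON, so check a real stream before relying on it. `iter_sse_events` is a hypothetical name.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, payload) pairs from an iterable of SSE text lines."""
    event_name = "message"  # SSE default when no event: field is sent
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data_lines:
                yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data_lines.append(value[1:] if value.startswith(" ") else value)
    if data_lines:
        # Flush a final event that was not followed by a blank line.
        yield event_name, json.loads("\n".join(data_lines))
```

A caller would typically update its UI on each `member` event and treat the payload of `done` as the full run response.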

## Run request — full schema

```jsonc
{
  // Provide EITHER an inline "pod" OR a saved "podId".
  "pod": {
    "podType": "council",            // REQUIRED: "council" | "pipeline" | "debate" | "jury"
    "members": [                     // REQUIRED: 1–8 minds
      {
        "model": "claude-opus-4-8",  // REQUIRED. Supported: claude-opus-4-8, gpt-5.5, claude-haiku-4-5
        "id": "skeptic",             // optional, stable handle used in output + coherence conflicts
        "role": "Risk skeptic",      // optional, who this mind is
        "context": "Focus on …",     // optional, role-specific briefing
        "temperature": 0.7,          // optional (ignored for models that don't support it)
        "maxTokens": 800,            // optional per-member output cap
        "required": false,           // optional; if true a failure fails the run, else that mind is skipped
        "apiKey": "sk-…"             // optional BYO provider key; omit to use Sheaf's hosted keys
      }
    ],
    "sharedContext": "…",            // optional, context every mind sees
    "synthesizer": {                 // optional, how the minds are reconciled
      "model": "claude-opus-4-8",    //   default: claude-opus-4-8
      "instruction": "Reconcile …",  //   default: a Sheaf reconciler prompt
      "enabled": true                //   council default true; pipeline default false (last step = answer)
    },
    "coherenceAudit": true,          // optional, default true (needs ≥2 live minds to produce a report)
    "responseFormat": "text",        // optional: "text" (default) | "json" (answer is a JSON object string)
    "retention": "logs",             // optional: "none" stores only metadata+cost, not your input/answer
    "debate": { "rounds": 2 },       // optional (debate only): rounds of rebuttal — default 2, max 4
    "jury":   { "decisionRule": "majority" } // optional (jury only): majority|unanimous|supermajority|synthesis
  },
  "podId": "…",                      // alternative to "pod": run a saved pod by id
  "input": "The task/question.",     // REQUIRED
  "context": "Per-run context…",     // optional, merged with sharedContext
  "variables": { "brand": "Acme" },  // optional, fills {{brand}} placeholders in roles/context/input
  "stream": false,                   // optional, true → Server-Sent Events
  "async": false,                    // optional, true → 202 {jobId}; poll /job/{id} or use webhook
  "webhook": "https://…"             // optional (async only), https URL POSTed {jobId,status,result} on done
}
```
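
Checking a request against this schema before sending it can save a slow round trip. The validator below is a sketch that covers only the constraints stated in the schema comments; `validate_run_request` is a hypothetical name, and the lower bound of 1 for debate rounds is an assumption (the guide only gives the default and the maximum).

```python
POD_TYPES = {"council", "pipeline", "debate", "jury"}
MODELS = {"claude-opus-4-8", "gpt-5.5", "claude-haiku-4-5"}
DECISION_RULES = {"majority", "unanimous", "supermajority", "synthesis"}


def validate_run_request(req: dict) -> list[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    errors = []
    if not req.get("input"):
        errors.append("input is required")
    if ("pod" in req) == ("podId" in req):
        errors.append("provide exactly one of pod or podId")
    if req.get("webhook") is not None:
        if not req.get("async"):
            errors.append("webhook only applies when async is true")
        if not str(req["webhook"]).startswith("https://"):
            errors.append("webhook must be an https URL")
    if "pod" in req:
        errors += _validate_pod(req["pod"])
    return errors


def _validate_pod(pod: dict) -> list[str]:
    errors = []
    if pod.get("podType") not in POD_TYPES:
        errors.append("podType must be council, pipeline, debate, or jury")
    members = pod.get("members") or []
    if not 1 <= len(members) <= 8:
        errors.append("members must contain 1 to 8 minds")
    for i, member in enumerate(members):
        if member.get("model") not in MODELS:
            errors.append(f"members[{i}].model is not a supported model")
    rounds = pod.get("debate", {}).get("rounds")
    # Assumption: at least 1 round; the guide states default 2, max 4.
    if rounds is not None and not (isinstance(rounds, int) and 1 <= rounds <= 4):
        errors.append("debate.rounds must be an integer from 1 to 4")
    rule = pod.get("jury", {}).get("decisionRule")
    if rule is not None and rule not in DECISION_RULES:
        errors.append("jury.decisionRule is not a known rule")
    return errors
```

The server remains the source of truth; this only catches obvious mistakes early.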

## Run response — full schema

```jsonc
{
  "ok": true,
  "podType": "council",
  "answer": "The reconciled answer (or a JSON string if responseFormat=json).",
  "members": [
    { "id": "skeptic", "role": "Risk skeptic", "model": "gpt-5.5", "output": "…", "skipped": false }
  ],
  "coherence": {                     // present when coherenceAudit ran
    "score": 78,                     // 0–100; higher = the minds align
    "summary": "Mostly aligned, two real tensions.",
    "agreements": ["…"],
    "conflicts": [ { "between": ["skeptic","optimist"], "issue": "…", "severity": "medium" } ]
  },
  "usage": { "inputTokens": 3241, "outputTokens": 3457, "costUsd": 0.26, "calls": 4, "latencyMs": 49586 },
  "error": null                      // a string when ok=false
}
```
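
A backend will usually reduce this response to the few fields it acts on. The sketch below does that; `summarize_run` is a hypothetical name, and the severity ordering `low < medium < high` is an assumption, since `"medium"` is the only severity value shown in this guide.

```python
# Assumed ordering; only "medium" appears in this guide.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


def summarize_run(run: dict, min_severity: str = "medium") -> dict:
    """Pull out the answer, flagged conflicts, skipped minds, and cost."""
    if not run.get("ok"):
        raise RuntimeError(run.get("error") or "run failed")
    coherence = run.get("coherence") or {}  # absent when the audit did not run
    threshold = SEVERITY_RANK[min_severity]
    flagged = [
        conflict
        for conflict in coherence.get("conflicts", [])
        # Unknown severities are flagged rather than silently dropped.
        if SEVERITY_RANK.get(conflict.get("severity"), threshold) >= threshold
    ]
    return {
        "answer": run["answer"],
        "score": coherence.get("score"),
        "flagged_conflicts": flagged,
        "skipped_members": [m.get("id") for m in run.get("members", []) if m.get("skipped")],
        "cost_usd": run.get("usage", {}).get("costUsd"),
    }
```

If `flagged_conflicts` is non-empty, a caller might route the answer to a human instead of acting on it automatically.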

## podType reference

- **`council`** — every mind answers **independently and blind**; a synthesizer reconciles them. Use for
  advice, analysis, "many expert opinions → one answer." Synthesizer on by default.
- **`pipeline`** — minds run **in order**, each transforming the previous one's output (stage 1 → 2 → 3).
  The **last stage is the answer** (synthesizer off by default). Use for workflows: draft → critique →
  revise, or brief → script → polish.
- **`debate`** — minds **see each other and argue over rounds**, then a synthesizer converges their final
  positions into one answer. Round 1 is opening statements; in each later round, every mind rebuts the
  others or revises its own position. Set `"debate": { "rounds": N }` (default 2, max 4). Use when you want positions stress-tested,
  not just collected. `members[]` returns each mind's final-round position.
- **`jury`** — each mind returns a **verdict + rationale**; a foreperson aggregates them under a decision
  rule and reports the outcome **with the vote tally**. Set `"jury": { "decisionRule": "..." }` —
  `majority` (default), `unanimous`, `supermajority` (≥⅔), or `synthesis` (decide on the merits, not by
  counting). Use for decisions with a clear call to make. `members[]` returns each juror's verdict.

**Latency note:** a `debate` makes `members × rounds` model calls before the synthesizer and audit run, so
it's the slowest podType; prefer `async` or `stream` for it. (Streaming for debate/jury emits the members +
result when the run completes, rather than each mind live as in council/pipeline.)
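
To budget latency and cost before choosing a podType, a rough call count can be estimated. This is a sketch built on assumptions: the council formula (N minds + 1 synthesizer + 1 audit) and the debate `members × rounds` figure come from this guide, while counting one aggregation call for a debate's synthesizer and a jury's foreperson, and one audit call for any pod with at least 2 minds, is a guess. `estimate_calls` is a hypothetical name.

```python
from typing import Optional

# Whether a reconciling step runs by default. council/pipeline come from the
# schema comments; debate/jury are assumptions based on the podType reference.
SYNTH_DEFAULT = {"council": True, "pipeline": False, "debate": True, "jury": True}


def estimate_calls(
    pod_type: str,
    members: int,
    rounds: int = 2,
    synthesizer: Optional[bool] = None,
    audit: bool = True,
) -> int:
    """Estimate how many model calls one run makes."""
    if pod_type not in SYNTH_DEFAULT:
        raise ValueError(f"unknown podType: {pod_type}")
    mind_calls = members * rounds if pod_type == "debate" else members
    use_synth = SYNTH_DEFAULT[pod_type] if synthesizer is None else synthesizer
    use_audit = audit and members >= 2  # the audit needs at least 2 minds
    return mind_calls + int(use_synth) + int(use_audit)
```

For a two-mind council this gives 4, which matches `usage.calls` in the sample response; treat the other podTypes as estimates and compare against the `usage.calls` you actually observe.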

## Worked example — a marketing-video pipeline (Object Nirvana)

A `pipeline` pod that turns a brief into a ready-to-produce script + shot list. Your renderer/video tools
consume the JSON output; Sheaf Pod is the reasoning layer.

```bash
curl -X POST https://api.discursa.ai/api/v1/pod/run \
  -H "Authorization: Bearer $SHEAF_POD_KEY" -H "Content-Type: application/json" \
  -d '{
    "pod": {
      "podType": "pipeline",
      "responseFormat": "json",
      "members": [
        {"id":"strategist", "model":"claude-opus-4-8", "role":"Turn the brief into a 15s video angle"},
        {"id":"scriptwriter","model":"gpt-5.5",        "role":"Write the VO + on-screen script for that angle"},
        {"id":"hookdoctor", "model":"claude-opus-4-8", "role":"Rewrite the first 3 seconds for retention"},
        {"id":"shotlist",   "model":"gpt-5.5",         "role":"Output a scene-by-scene shot list with an image/video prompt per scene; return JSON {script, hook, shots:[{scene, vo, prompt}]}"}
      ]
    },
    "input": "Brief: promote {{brand}}, a new AI note-taking app, to busy founders.",
    "variables": { "brand": "YourBrand" }
  }'
```

The final stage's JSON is in `answer`; each stage's output is in `members[]`; cost is in `usage`.
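
Because `answer` arrives as a JSON string produced by a model, it is worth validating before handing it to a renderer. The sketch below checks for the shape requested in the final stage's role (`{script, hook, shots:[{scene, vo, prompt}]}`); `parse_shot_plan` is a hypothetical name, and nothing here guarantees the model follows the requested shape.

```python
import json

REQUIRED_KEYS = {"script", "hook", "shots"}
REQUIRED_SHOT_KEYS = ("scene", "vo", "prompt")


def parse_shot_plan(run: dict) -> dict:
    """Decode run["answer"] and verify it has the requested structure."""
    plan = json.loads(run["answer"])  # raises ValueError on non-JSON text
    if not isinstance(plan, dict):
        raise ValueError("answer is not a JSON object")
    missing = REQUIRED_KEYS - plan.keys()
    if missing:
        raise ValueError(f"answer is missing keys: {sorted(missing)}")
    for index, shot in enumerate(plan["shots"]):
        for key in REQUIRED_SHOT_KEYS:
            if key not in shot:
                raise ValueError(f"shots[{index}] is missing {key!r}")
    return plan
```

A caller could retry the run, or fall back to the previous stage's output in `members[]`, when this raises `ValueError`.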

## Saving and reusing a pod

```bash
# Save (returns {ok, podId, version}); pass "podId" to update an existing one.
curl -X POST https://api.discursa.ai/api/v1/pod/save \
  -H "Authorization: Bearer $SHEAF_POD_KEY" -H "Content-Type: application/json" \
  -d '{ "name": "marketing-video", "pod": { "podType":"pipeline", "members":[ … ] } }'

# Then run it by id (no need to resend the config):
curl -X POST https://api.discursa.ai/api/v1/pod/run \
  -H "Authorization: Bearer $SHEAF_POD_KEY" -H "Content-Type: application/json" \
  -d '{ "podId": "THE_RETURNED_ID", "input": "…" }'
```
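
The save-then-run flow can be sketched as two small body builders. These are hypothetical helpers, and placing `podId` at the top level of the save body when updating is an assumption; the guide only says to pass `"podId"` to update an existing pod.

```python
from typing import Optional


def save_body(name: str, pod: dict, pod_id: Optional[str] = None) -> dict:
    """Body for POST /api/v1/pod/save."""
    body = {"name": name, "pod": pod}
    if pod_id is not None:
        # Assumption: podId sits at the top level when updating a saved pod.
        body["podId"] = pod_id
    return body


def run_saved_body(save_response: dict, input_text: str) -> dict:
    """Body for POST /api/v1/pod/run that reuses the pod just saved."""
    if not save_response.get("ok"):
        raise RuntimeError(save_response.get("error") or "save failed")
    return {"podId": save_response["podId"], "input": input_text}
```

Storing the returned `podId` and `version` alongside your own config makes it easier to tell which pod definition produced a given run.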

## Notes & gotchas

- **Models:** `claude-opus-4-8` (strongest), `gpt-5.5`, `claude-haiku-4-5` (fast/cheap). Mix providers freely.
- **Cost** comes back on every run in `usage.costUsd` (computed from token usage). A council of N minds = N
  model calls + 1 synthesizer + 1 audit. Keep pods small for latency/cost; raise `maxTokens` only if needed.
- **Coherence audit** needs ≥2 non-skipped minds; with one mind it's omitted.
- **Errors:** a failed call returns a non-2xx status or a body of `{"ok":false,"error":"…"}`. 401 = bad/missing
  key; 402 = monthly cost cap reached; 429 = rate limited (wait briefly, then retry).
- **Privacy:** set `"retention":"none"` on the pod to store only metadata + cost (not your input/answer).
- **Recommended client pattern:** call from your backend, treat `answer` as the result, log `usage.costUsd`,
  and surface `coherence.conflicts` when you need to know the minds disagreed before acting.
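
The error rules in these notes can be turned into a small classifier plus a retry schedule. This is a sketch: the names and return labels are hypothetical, and exponential backoff is a choice made here, since the guide only says to retry a 429 after a moment.

```python
from typing import Optional


def classify_response(status: int, body: Optional[dict]) -> str:
    """Map an HTTP status and parsed body to an action label."""
    if status == 401:
        return "bad_key"        # fix or rotate the key; retrying will not help
    if status == 402:
        return "cost_cap"       # monthly cost cap reached
    if status == 429:
        return "rate_limited"   # safe to retry after a delay
    if not 200 <= status < 300:
        return "failed"
    if body is None or body.get("ok") is not True:
        return "failed"         # a 2xx can still carry ok=false
    return "ok"


def retry_delays(attempts: int, base_s: float = 1.0, cap_s: float = 30.0) -> list[float]:
    """Exponential backoff delays in seconds, capped at cap_s."""
    return [min(cap_s, base_s * 2 ** i) for i in range(attempts)]
```

Only `rate_limited` is worth retrying automatically; the other failure labels need a configuration or billing change first.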

## Console

A no-code console to build/run pods and watch usage: **https://sheaf.one/pod-console.html** (paste your key).

---
For questions, a higher rate limit, BYO keys, or more pod types, email hello@sheaf.one
