# NightVeil API Reference

> You are writing code against the NightVeil API.
> Base URL, authentication, and the endpoint reference below are authoritative.
> Use `veil-*` model ids for NightVeil models.
> Open passthrough models (qwen/llama/gpt-style ids) pass through unchanged.
> Do not invent model ids; call `GET /v1/models` for the live catalog.

---

## Base URL

```
https://api.nightveil.ai
```

All endpoints are relative to this base URL.

---

## Authentication

Every request must include a bearer token in the `Authorization` header:

```
Authorization: Bearer nv_…
```

Keys are issued at nightveil.ai (console or `POST /v1/keys`).
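
As a minimal sketch, the header can be attached in JavaScript as shown here. `buildRequest` is a hypothetical helper, not part of any NightVeil SDK, and the `NIGHTVEIL_API_KEY` environment variable in the usage comment is an assumed name. Only the base URL and the `Authorization: Bearer` header come from this reference. The helper covers GET and JSON POST calls only.

```javascript
// Base URL as documented in this reference.
const BASE_URL = "https://api.nightveil.ai";

// Hypothetical helper: returns the URL and fetch() options for one API call.
// With no body it builds a GET; with a body it builds a JSON POST.
function buildRequest(apiKey, path, body) {
  const headers = { Authorization: `Bearer ${apiKey}` };
  const init = { method: body === undefined ? "GET" : "POST", headers };
  if (body !== undefined) {
    headers["Content-Type"] = "application/json";
    init.body = JSON.stringify(body);
  }
  return { url: new URL(path, BASE_URL).toString(), init };
}

// Usage, inside an async function (NIGHTVEIL_API_KEY is an assumed name):
//   const { url, init } = buildRequest(process.env.NIGHTVEIL_API_KEY, "/v1/models");
//   const res = await fetch(url, init);
```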

### Scopes

Two scopes are resolved server-side from the key:

- **inference** — chat, image, video, audio, embeddings, models catalog, balance, deposit address.
- **admin** — key management (`/v1/keys`) and usage reporting (`/v1/usage`). These endpoints return `403 FORBIDDEN` when called with an inference-only key.

---

## Endpoints

### Chat completions

OpenAI-compatible, non-streaming.

**`POST /v1/chat/completions`**

Request:

```json
{
  "model": "veil-uncensored",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is the capital of France?" }
  ],
  "max_tokens": 256,
  "temperature": 0.7
}
```

Fields:

| Field | Type | Required | Notes |
|---|---|---|---|
| `model` | string | yes | Model id, e.g. `veil-uncensored`. |
| `messages` | array | yes | `[{role, content}]`; role ∈ system, user, assistant. |
| `max_tokens` | integer | no | Maximum tokens to generate. |
| `temperature` | float | no | 0–2. Defaults to model setting. |

Response:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1718000000,
  "model": "veil-uncensored",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Paris." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 4,
    "total_tokens": 32
  }
}
```

curl example:

```bash
curl https://api.nightveil.ai/v1/chat/completions \
  -H "Authorization: Bearer nv_…" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veil-uncensored",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
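
The same call can be sketched in JavaScript. The helper names `chat` and `extractReply` and the injectable `fetchImpl` parameter are assumptions made for illustration. The endpoint, request fields, and response shape are the ones documented above.

```javascript
// Hypothetical helper: pull the assistant text and token usage out of a
// chat completion response shaped like the example above.
function extractReply(completion) {
  const choice = completion.choices && completion.choices[0];
  if (!choice) throw new Error("completion has no choices");
  return {
    text: choice.message.content,
    finishReason: choice.finish_reason,
    totalTokens: completion.usage ? completion.usage.total_tokens : undefined,
  };
}

// Sketch of the call itself. fetchImpl is injectable so the function can be
// exercised without network access; it defaults to the global fetch.
async function chat(apiKey, messages, options = {}, fetchImpl = fetch) {
  const res = await fetchImpl("https://api.nightveil.ai/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // options may override the default model or add max_tokens / temperature.
    body: JSON.stringify({ model: "veil-uncensored", ...options, messages }),
  });
  if (!res.ok) throw new Error(`chat request failed with HTTP ${res.status}`);
  return extractReply(await res.json());
}
```

A call such as `chat(apiKey, [{ role: "user", content: "Hello" }], { max_tokens: 256 })` resolves to an object with `text`, `finishReason`, and `totalTokens`.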

---

### Embeddings

**`POST /v1/embeddings`**

Request:

```json
{
  "model": "text-embed-3-small",
  "input": "The quick brown fox"
}
```

> `text-embed-3-small` is illustrative — get real embedding model ids from `GET /v1/models`.

`input` may be a string or an array of strings.

Response:

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.002, -0.009, 0.014],
      "index": 0
    }
  ],
  "model": "text-embed-3-small",
  "usage": {
    "prompt_tokens": 6,
    "total_tokens": 6
  }
}
```

curl example:

```bash
curl https://api.nightveil.ai/v1/embeddings \
  -H "Authorization: Bearer nv_…" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embed-3-small",
    "input": "The quick brown fox"
  }'
```
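
A common next step is comparing the returned vectors. This is a rough sketch that assumes the response shape shown above. `vectorsFrom` and `cosineSimilarity` are hypothetical helper names, and cosine similarity is one possible comparison, not something this API prescribes.

```javascript
// Hypothetical helper: return embedding vectors ordered by their `index`
// field, so vectors[i] corresponds to the i-th input string.
function vectorsFrom(response) {
  return [...response.data]
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
}

// Cosine similarity between two equal-length vectors.
// Returns 0 when either vector has zero length to avoid dividing by zero.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("vector lengths differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

With an array `input`, `vectorsFrom(response)` yields one vector per input string, and any pair can be passed to `cosineSimilarity`.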

---

### Image generation

**`POST /v1/image/generate`** (primary)

**`POST /v1/images/generations`** (OpenAI alias, identical behaviour)

Request:

```json
{
  "model": "veil-diffuse",
  "prompt": "A moonlit forest clearing with a wolf howling at a full moon",
  "size": "1024x1024"
}
```

Fields:

| Field | Type | Required | Notes |
|---|---|---|---|
| `model` | string | yes | Image model id, e.g. `veil-diffuse`. |
| `prompt` | string | yes | What to generate. |
| `size` | string | no | e.g. `1024x1024`, `2048x2048`. |

The response includes the generated image(s) as base64 or URL, plus `id` and `model`:

```json
{
  "id": "img-xyz789",
  "model": "veil-diffuse",
  "images": ["<base64-encoded-png>"]
}
```

The response may alternatively use the OpenAI `data` shape:

```json
{
  "id": "img-xyz789",
  "data": [{ "b64_json": "<base64>" }]
}
```

curl example:

```bash
curl https://api.nightveil.ai/v1/image/generate \
  -H "Authorization: Bearer nv_…" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veil-diffuse",
    "prompt": "A moonlit forest clearing",
    "size": "1024x1024"
  }'
```
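
Because two response shapes are possible, client code may want to normalize them. This sketch handles only the base64 payloads shown above (`images` and `data[].b64_json`). This reference does not name the field used for the URL variant, so that case is deliberately left out. `base64Images` and `firstImageBytes` are hypothetical helper names.

```javascript
// Hypothetical helper: collect base64 payloads from either documented shape.
function base64Images(body) {
  if (Array.isArray(body.images)) return body.images;
  if (Array.isArray(body.data)) {
    return body.data
      .map((item) => item.b64_json)
      .filter((value) => typeof value === "string");
  }
  return [];
}

// Decode the first image to raw bytes (a Node.js Buffer).
function firstImageBytes(body) {
  const [first] = base64Images(body);
  if (first === undefined) throw new Error("response contains no base64 image");
  return Buffer.from(first, "base64");
}
```

In Node.js the returned `Buffer` can be written straight to disk, for example with `fs.writeFileSync("out.png", firstImageBytes(body))`.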

---

### Video generation (async)

Video generation is asynchronous. The flow is:

1. **Queue** — submit the job, receive both a `queue_id` and a job `id`.
2. **Retrieve** — poll with `queue_id` until status is `COMPLETED`, then follow `download_url`.
3. **Download** — fetch the mp4 bytes using the job `id`.

#### Step 1 — Queue

**`POST /v1/video/queue`**

Request:

```json
{
  "model": "veil-video-1",
  "prompt": "A cat walking in slow motion through a field of flowers",
  "duration": 5,
  "resolution": "1280x720",
  "aspect_ratio": "16:9"
}
```

> `veil-video-1` is illustrative — get real video model ids from `GET /v1/models`.

Fields:

| Field | Type | Required | Notes |
|---|---|---|---|
| `model` | string | yes | Video model id. |
| `prompt` | string | yes | Description of the video. |
| `duration` | integer | no | Seconds (default 5). |
| `resolution` | string | no | e.g. `1280x720`. |
| `aspect_ratio` | string | no | e.g. `16:9`, `9:16`. |

Response:

```json
{
  "id": "vid-job-001",
  "queue_id": "q_abc123",
  "status": "QUEUED",
  "quote_usd": 0.12
}
```

The response returns two identifiers. Pass `queue_id` to `/v1/video/retrieve` to poll status, and use `id` in the `/v1/video/download/:id` path once the job is complete.

curl example:

```bash
curl https://api.nightveil.ai/v1/video/queue \
  -H "Authorization: Bearer nv_…" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veil-video-1",
    "prompt": "A cat walking in slow motion",
    "duration": 5
  }'
```

#### Step 2 — Retrieve

**`POST /v1/video/retrieve`**

Request:

```json
{
  "model": "veil-video-1",
  "queue_id": "q_abc123"
}
```

While rendering, the response is JSON:

```json
{ "status": "PROCESSING" }
```

When ready, the response is JSON with a download URL:

```json
{
  "status": "COMPLETED",
  "download_url": "/v1/video/download/vid-job-001"
}
```

Alternatively, the response may be the finished video itself as raw `video/mp4` bytes, in which case no download step is needed. Detect this case by checking for `Content-Type: video/mp4` before parsing the body as JSON.

Poll every 5–10 seconds until status is `COMPLETED` or content-type is `video/mp4`. Recommended timeout: 4 minutes.

curl example:

```bash
curl https://api.nightveil.ai/v1/video/retrieve \
  -H "Authorization: Bearer nv_…" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veil-video-1",
    "queue_id": "q_abc123"
  }'
```

#### Step 3 — Download

**`GET /v1/video/download/:id`**

Returns raw mp4 bytes (`Content-Type: video/mp4`).

curl example:

```bash
curl https://api.nightveil.ai/v1/video/download/vid-job-001 \
  -H "Authorization: Bearer nv_…" \
  -o output.mp4
```
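
The three steps above can be sketched as one polling loop. `generateVideo` and its `fetchImpl`, `sleep`, `intervalMs`, and `timeoutMs` options are hypothetical names. The loop treats any status other than `COMPLETED` as "keep waiting", which is an assumption, since this reference documents no failure status.

```javascript
const BASE_URL = "https://api.nightveil.ai";

// Hypothetical helper: queue a video, poll until done, return the mp4 bytes.
// fetchImpl and sleep are injectable so the loop can be tested offline.
async function generateVideo(apiKey, request, options = {}) {
  const {
    fetchImpl = fetch,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
    intervalMs = 5000, // this reference suggests polling every 5 to 10 seconds
    timeoutMs = 240000, // recommended timeout of 4 minutes
  } = options;
  const headers = {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
  const post = (path, body) =>
    fetchImpl(BASE_URL + path, { method: "POST", headers, body: JSON.stringify(body) });

  // Step 1: queue the job and keep the queue_id for polling.
  const queued = await post("/v1/video/queue", request);
  if (!queued.ok) throw new Error(`queue failed with HTTP ${queued.status}`);
  const { queue_id } = await queued.json();

  // Step 2: poll until COMPLETED or until raw video/mp4 bytes arrive inline.
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const res = await post("/v1/video/retrieve", { model: request.model, queue_id });
    if (!res.ok) throw new Error(`retrieve failed with HTTP ${res.status}`);
    const contentType = res.headers.get("content-type") || "";
    if (contentType.startsWith("video/mp4")) {
      return new Uint8Array(await res.arrayBuffer()); // done inline
    }
    const body = await res.json();
    if (body.status === "COMPLETED") {
      // Step 3: download_url is relative to the base URL.
      const file = await fetchImpl(new URL(body.download_url, BASE_URL).toString(), {
        headers: { Authorization: `Bearer ${apiKey}` },
      });
      if (!file.ok) throw new Error(`download failed with HTTP ${file.status}`);
      return new Uint8Array(await file.arrayBuffer());
    }
    await sleep(intervalMs); // assumed: any other status means still rendering
  }
  throw new Error("timed out waiting for video");
}
```

The helper resolves to a `Uint8Array` of mp4 bytes whether the video arrives inline or through the download step.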

---

### Audio — text-to-speech (sync)

**`POST /v1/audio/speech`**

Returns binary audio bytes directly (not JSON). Check `Content-Type` for the format.

Request:

```json
{
  "model": "veil-tts-1",
  "input": "Hello, world. This is NightVeil text-to-speech.",
  "voice": "nova",
  "response_format": "mp3"
}
```

> `veil-tts-1` is illustrative — get real audio model ids from `GET /v1/models`.

Fields:

| Field | Type | Required | Notes |
|---|---|---|---|
| `model` | string | yes | TTS model id. |
| `input` | string | yes | Text to speak. |
| `voice` | string | no | Voice id (model-dependent). |
| `response_format` | string | no | `mp3`, `wav`, `opus`, `aac`, `flac`. Default: `mp3`. |

Response: binary audio bytes with appropriate `Content-Type` (e.g. `audio/mpeg`).

curl example:

```bash
curl https://api.nightveil.ai/v1/audio/speech \
  -H "Authorization: Bearer nv_…" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veil-tts-1",
    "input": "Hello, world.",
    "voice": "nova"
  }' \
  -o output.mp3
```
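
Since success is binary and errors are JSON, a client has to branch before reading the body. This is a minimal sketch of that branch. `synthesize` and `fetchImpl` are hypothetical names, and `veil-tts-1` is the illustrative model id from this section, not a confirmed one.

```javascript
// Hypothetical helper: request speech and return the raw bytes plus the
// Content-Type reported by the server.
async function synthesize(apiKey, text, options = {}, fetchImpl = fetch) {
  const res = await fetchImpl("https://api.nightveil.ai/v1/audio/speech", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // options may override model, voice, or response_format.
    body: JSON.stringify({
      model: "veil-tts-1", // illustrative id; check GET /v1/models
      response_format: "mp3",
      ...options,
      input: text,
    }),
  });
  if (!res.ok) {
    // Error responses use the JSON envelope described under Errors.
    const { error } = await res.json();
    throw new Error(`${error.type}/${error.code}: ${error.message}`);
  }
  return {
    contentType: res.headers.get("content-type"),
    bytes: new Uint8Array(await res.arrayBuffer()),
  };
}
```

The returned `contentType` (for example `audio/mpeg`) tells the caller which format the bytes are in.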

#### Audio async (queue/retrieve)

For long-form audio, an async flow mirrors the video pattern and uses the same queue → retrieve request and response shapes:

- **`POST /v1/audio/queue`** — same shape as `/v1/audio/speech` → `{ id, queue_id, status:"QUEUED" }`
- **`POST /v1/audio/retrieve`** — `{ model, queue_id }` → `{ status:"PROCESSING" }` or `{ status:"COMPLETED", download_url }` or raw audio bytes

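Poll as in the video flow: repeat the retrieve call until the status is `COMPLETED`, then fetch `download_url`, or use the raw audio bytes directly if the response is binary.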

---

### Models catalog

**`GET /v1/models`**

Returns the live model catalog. Use this to discover all available model ids and their types.

Response:

```json
{
  "object": "list",
  "data": [
    { "id": "veil-uncensored", "type": "chat" },
    { "id": "veil-uncensored-24b", "type": "chat" },
    { "id": "veil-roleplay", "type": "chat" },
    { "id": "veil-diffuse", "type": "image" },
    { "id": "qwen-2.5-7b", "type": "chat" }
  ]
}
```

`type` ∈ `chat` | `image` | `video` | `audio` | `embedding`

curl example:

```bash
curl https://api.nightveil.ai/v1/models \
  -H "Authorization: Bearer nv_…"
```
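
One way to use the catalog is to validate a model id before sending a request. This sketch assumes the response shape shown above. `idsByType` and `hasModel` are hypothetical helper names.

```javascript
// Hypothetical helper: group catalog model ids by their `type` field.
function idsByType(catalog) {
  const groups = {};
  for (const { id, type } of catalog.data) {
    if (!groups[type]) groups[type] = [];
    groups[type].push(id);
  }
  return groups;
}

// Hypothetical helper: true when the catalog lists `id`, optionally
// requiring a specific type such as "chat" or "image".
function hasModel(catalog, id, type) {
  return catalog.data.some(
    (model) => model.id === id && (type === undefined || model.type === type)
  );
}
```

For the sample response above, `idsByType(catalog).image` is `["veil-diffuse"]`, and `hasModel(catalog, "veil-diffuse", "chat")` is `false`.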

---

### Balance

**`GET /v1/balance`**

Returns the current USD balance for the authenticated tenant. Requires an inference-scope key.

Response:

```json
{ "balance_usd": 37.50 }
```

curl example:

```bash
curl https://api.nightveil.ai/v1/balance \
  -H "Authorization: Bearer nv_…"
```

---

### Deposit address

**`GET /v1/deposit-address`**

Returns a crypto deposit address object for topping up the balance.

Response:

```json
{
  "address": "0xabc123…",
  "network": "base",
  "currency": "USDC"
}
```

curl example:

```bash
curl https://api.nightveil.ai/v1/deposit-address \
  -H "Authorization: Bearer nv_…"
```

---

### Admin — key management

These endpoints require an **admin-scope** key. Calling them with an inference-only key returns `403 FORBIDDEN`.

#### Create a key

**`POST /v1/keys`**

Request:

```json
{
  "label": "customer-42",
  "content_tier": "sfw"
}
```

Both fields are optional. `content_tier` ∈ `sfw` | `nsfw`.

Response:

```json
{
  "id": "key_abc",
  "key": "nv_…",
  "label": "customer-42",
  "content_tier": "sfw"
}
```

The full key string is shown only once. Store it immediately.

curl example:

```bash
curl https://api.nightveil.ai/v1/keys \
  -X POST \
  -H "Authorization: Bearer nv_…" \
  -H "Content-Type: application/json" \
  -d '{"label": "customer-42", "content_tier": "sfw"}'
```
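
A reseller would typically wrap this call in a small function. This is a rough sketch, with `createKey` and `fetchImpl` as hypothetical names. The client-side `content_tier` check mirrors the two values documented above and is not a substitute for server-side validation.

```javascript
const CONTENT_TIERS = ["sfw", "nsfw"];

// Hypothetical helper: mint a key with an admin-scope key.
async function createKey(adminKey, options = {}, fetchImpl = fetch) {
  const { label, content_tier } = options;
  if (content_tier !== undefined && !CONTENT_TIERS.includes(content_tier)) {
    throw new Error(`content_tier must be one of: ${CONTENT_TIERS.join(", ")}`);
  }
  const res = await fetchImpl("https://api.nightveil.ai/v1/keys", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${adminKey}`,
      "Content-Type": "application/json",
    },
    // JSON.stringify drops undefined fields, so omitted options are not sent.
    body: JSON.stringify({ label, content_tier }),
  });
  if (res.status === 403) throw new Error("this key lacks admin scope");
  if (!res.ok) throw new Error(`key creation failed with HTTP ${res.status}`);
  // Resolves to { id, key, label, content_tier }; `key` is shown only once.
  return res.json();
}
```

Because the full key string is returned only in this response, the caller should persist `key` before doing anything else with the result.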

#### List keys

**`GET /v1/keys`**

Response:

```json
{
  "object": "list",
  "data": [
    {
      "id": "key_abc",
      "prefix": "nv_abc…",
      "label": "customer-42",
      "content_tier": "sfw",
      "status": "active",
      "created_at": "2026-01-15T10:30:00Z"
    }
  ]
}
```

curl example:

```bash
curl https://api.nightveil.ai/v1/keys \
  -H "Authorization: Bearer nv_…"
```

#### Revoke a key

**`DELETE /v1/keys/:id`**

Response:

```json
{ "id": "key_abc", "revoked": true }
```

curl example:

```bash
curl https://api.nightveil.ai/v1/keys/key_abc \
  -X DELETE \
  -H "Authorization: Bearer nv_…"
```

#### Usage report

**`GET /v1/usage`**

Returns wholesale spend broken down by key.

Response:

```json
{
  "total_billed_usd": 12.34,
  "by_key": [
    {
      "api_key_id": "key_abc",
      "label": "customer-42",
      "requests": 1042,
      "billed_usd": 8.50
    },
    {
      "api_key_id": "key_xyz",
      "label": "customer-99",
      "requests": 330,
      "billed_usd": 3.84
    }
  ]
}
```

curl example:

```bash
curl https://api.nightveil.ai/v1/usage \
  -H "Authorization: Bearer nv_…"
```
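
The report lends itself to simple post-processing. This sketch ranks keys by spend and checks that the per-key rows add up to the total. `spendShares` and `reconciles` are hypothetical helper names. This reference does not promise that `by_key` always sums to `total_billed_usd`, although it does in the sample above, so treat a `false` result as a prompt to investigate and not as an error.

```javascript
// Hypothetical helper: per-key rows sorted by spend (highest first), each
// with its share of the total as a fraction between 0 and 1.
function spendShares(report) {
  const total = report.total_billed_usd;
  return [...report.by_key]
    .sort((a, b) => b.billed_usd - a.billed_usd)
    .map((row) => ({
      keyId: row.api_key_id,
      label: row.label,
      billedUsd: row.billed_usd,
      share: total > 0 ? row.billed_usd / total : 0,
    }));
}

// Hypothetical helper: compare in whole cents to avoid floating-point noise
// when summing dollar amounts.
function reconciles(report) {
  const cents = (usd) => Math.round(usd * 100);
  const sum = report.by_key.reduce((acc, row) => acc + cents(row.billed_usd), 0);
  return sum === cents(report.total_billed_usd);
}
```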

---

## Models

Call `GET /v1/models` for the live catalog. Headline models:

| Model id | Type | Notes |
|---|---|---|
| `veil-uncensored` | chat | Default uncensored chat model |
| `veil-uncensored-24b` | chat | Larger uncensored chat model |
| `veil-roleplay` | chat | Optimized for roleplay scenarios |
| `veil-diffuse` | image | Image generation |

Audio, video, and embedding model ids shown in the examples above (e.g. `veil-tts-1`, `veil-video-1`, `text-embed-3-small`) are illustrative. Always call `GET /v1/models` for the live catalog of ids and their `type`.

Ids of open passthrough models (qwen, llama, gpt-style ids, etc.) are passed through unchanged, and these models appear in the `GET /v1/models` catalog alongside `veil-*` models.

---

## Errors

All error responses use the same envelope:

```json
{
  "error": {
    "code": "INSUFFICIENT_BALANCE",
    "message": "Your balance is too low to complete this request.",
    "type": "billing",
    "request_id": "req_abc123"
  }
}
```

Fields:

| Field | Description |
|---|---|
| `code` | Machine-readable error code. |
| `message` | Human-readable description. |
| `type` | Error category (see below). |
| `request_id` | Unique id for support escalation. |

### Error types

| `type` | HTTP status | Notes |
|---|---|---|
| `auth` | 401 | Missing or invalid key. Code: `AUTHENTICATION_FAILED`. |
| `billing` | 402 | Insufficient balance. Code: `INSUFFICIENT_BALANCE`. |
| `policy` | 422 | Content blocked. Code: `CONTENT_BLOCKED`. |
| `validation` | 400 | Malformed request body. |
| `upstream` | 500–503 | Upstream model error. |
| `rate_limit` | 429 | Rate limited. Honor `Retry-After`. |
| `internal` | 500 | Internal server error. |
| `not_found` | 404 | Endpoint or resource not found. |
| `forbidden` | 403 | Admin endpoint called with non-admin key. Code: `FORBIDDEN`. |

### Handling errors

```javascript
const res = await fetch("https://api.nightveil.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer nv_…",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ model: "veil-uncensored", messages: [{ role: "user", content: "Hi" }] }),
});

if (!res.ok) {
  const { error } = await res.json();
  console.error(error.type, error.code, error.message);
  // Handle billing: prompt user to top up
  // Handle rate_limit: back off per Retry-After header
}
```
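
The branches hinted at in the comments above can be made explicit by mapping the `type` field to a client action. This is a sketch under assumptions: the action names are invented for illustration, and treating `upstream` and `internal` errors as retryable is a judgment call this reference does not confirm.

```javascript
// Hypothetical mapping from the documented error `type` to a client action.
const ACTION_BY_TYPE = {
  auth: "fix_credentials",
  forbidden: "fix_credentials",
  billing: "top_up",
  policy: "change_input",
  validation: "change_input",
  not_found: "change_input",
  rate_limit: "retry", // wait for Retry-After first
  upstream: "retry", // assumed transient
  internal: "retry", // assumed transient
};

// Hypothetical helper: decide what to do with a parsed error envelope.
// Unrecognized or malformed bodies map to "unknown".
function actionFor(errorBody) {
  const type = errorBody && errorBody.error && errorBody.error.type;
  const known = Object.prototype.hasOwnProperty.call(ACTION_BY_TYPE, String(type));
  return known ? ACTION_BY_TYPE[type] : "unknown";
}
```

With the envelope shown at the top of this section, `actionFor(body)` returns `"top_up"`.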

---

## Billing

NightVeil uses pay-as-you-go USD billing:

- Pre-fund your balance at nightveil.ai via **card** or **crypto** (USDC on Base).
- Each request deducts the per-token or per-second cost from your balance.
- Check your balance at any time with `GET /v1/balance`.
- Get your crypto deposit address with `GET /v1/deposit-address`.
- There are no subscriptions, no minimums, and no per-key billing — all spend runs against one wholesale balance.
- Resellers: mint a key per customer (`POST /v1/keys`) and track their spend via `GET /v1/usage`.

---

## Rate limits

- Requests are rate-limited **per IP address**.
- When rate-limited, the API returns `429` with a `Retry-After` header (seconds).
- Always honor `Retry-After` before retrying. A simple exponential back-off with jitter is recommended for production clients.
- If you need higher limits, contact nightveil.ai.
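
The retry guidance above can be sketched as a delay calculator. `retryDelayMs` and its `baseMs`, `capMs`, and `random` options are hypothetical, and the default values are arbitrary starting points. The sketch handles only the numeric (seconds) form of `Retry-After` described in this section.

```javascript
// Hypothetical helper: milliseconds to wait before retry number `attempt`
// (0-based). A numeric Retry-After value always wins; otherwise fall back
// to exponential back-off with full jitter.
function retryDelayMs(attempt, retryAfterHeader, options = {}) {
  const { baseMs = 500, capMs = 30000, random = Math.random } = options;
  if (retryAfterHeader != null && retryAfterHeader !== "") {
    const seconds = Number(retryAfterHeader);
    if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  }
  // Exponential ceiling: baseMs, 2*baseMs, 4*baseMs, ... capped at capMs.
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  // Full jitter: pick uniformly in [0, ceiling).
  return Math.floor(random() * ceiling);
}
```

In a fetch-based client the header value would come from `res.headers.get("Retry-After")`, which is `null` when the header is absent, so the function then falls back to jittered back-off.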

---

## MCP option

Prefer tools over raw HTTP? The official MCP server exposes every endpoint above as a tool usable from Claude Code, Cursor, and any other MCP-compatible client.

Add to your MCP client config:

```json
{
  "mcpServers": {
    "nightveil": {
      "command": "npx",
      "args": ["-y", "@nightveil/mcp"],
      "env": { "NIGHTVEIL_API_KEY": "nv_…" }
    }
  }
}
```

Package: [@nightveil/mcp on npm](https://www.npmjs.com/package/@nightveil/mcp)