Operations

Cost per token

Cost per token is the price paid for input and output tokens.

Quick definition

Cost per token is the price a provider charges for each input (prompt) token and each output (completion) token a model processes.

  • Category: Operations
  • Focus: cost, performance, and reliability
  • Used in: Managing costs with token budgets and caching.

What it means

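Multiplied by the number of tokens in a request, it determines that request's total cost: input tokens times the input price, plus output tokens times the output price. In operations workflows, cost per token often shapes trade-offs with performance and reliability, for example which model to use or how much context to send.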

How it works

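Providers count the tokens in each request and response and bill them at a per-token rate, commonly quoted per thousand or per million tokens, with output tokens often priced higher than input tokens. More broadly, operations covers latency, throughput, and cost. Systems often use caching, batching, and monitoring to scale reliably.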

Why it matters

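Because requests are billed by the token, operational choices such as model selection, prompt length, and maximum output length directly affect cost, and often latency and reliability as well.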

Common use cases

  • Reducing time-to-first-token with streaming.
  • Managing costs with token budgets and caching.
  • Tracking usage and errors with logs and metrics.

Example

Estimate cost for 2k input tokens and 500 output tokens.
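
The estimate can be sketched in Python as follows. The prices are made-up placeholders (3.00 USD per million input tokens and 15.00 USD per million output tokens), not any provider's actual rates, and estimate_cost is a hypothetical helper name used only for illustration.

```python
# Hypothetical prices in USD per one million tokens.
# These are placeholders; substitute your provider's published rates.
INPUT_PRICE_PER_MILLION = 3.00
OUTPUT_PRICE_PER_MILLION = 15.00


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price: float = INPUT_PRICE_PER_MILLION,
    output_price: float = OUTPUT_PRICE_PER_MILLION,
) -> float:
    """Return the estimated cost of one request in USD."""
    input_cost = input_tokens * input_price / 1_000_000
    output_cost = output_tokens * output_price / 1_000_000
    return input_cost + output_cost


# 2k input tokens and 500 output tokens, as in the example above.
cost = estimate_cost(input_tokens=2_000, output_tokens=500)
print(f"Estimated cost: ${cost:.4f}")
```

With these placeholder prices, the input side costs 0.006 USD and the output side 0.0075 USD, for a total of 0.0135 USD. Although the request has four times as many input tokens as output tokens, the output side costs more because of its higher assumed rate.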

Pitfalls and tips

Ignoring limits can cause timeouts or rate limiting. Set budgets and monitor usage to avoid surprises.
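
One way to set a budget is to track tokens spent against a fixed limit and refuse requests that would exceed it. The following is a minimal sketch under that assumption. TokenBudget and its methods are hypothetical names, not part of BoltAI or any provider's API.

```python
class TokenBudget:
    """Track token usage against a fixed limit (hypothetical helper)."""

    def __init__(self, limit: int) -> None:
        if limit < 0:
            raise ValueError("limit must be non-negative")
        self.limit = limit
        self.used = 0

    @property
    def remaining(self) -> int:
        """Tokens still available under the limit."""
        return self.limit - self.used

    def can_afford(self, tokens: int) -> bool:
        """Check whether a request of this size fits in the remaining budget."""
        return tokens <= self.remaining

    def record(self, tokens: int) -> None:
        """Record tokens spent; raise if the request would exceed the budget."""
        if not self.can_afford(tokens):
            raise RuntimeError(
                f"budget exceeded: {tokens} requested, {self.remaining} remaining"
            )
        self.used += tokens


budget = TokenBudget(limit=10_000)
budget.record(2_500)             # e.g. 2,000 input + 500 output tokens
print(budget.remaining)          # 7500
print(budget.can_afford(8_000))  # False
```

Checking can_afford before sending a request lets the caller shorten the prompt or skip the call instead of discovering the overrun afterwards.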

In BoltAI

In BoltAI, this shows up in performance, logging, or usage views.