Billing

Understanding your usage

How to read the AI usage log and what each column actually represents.

The AI Usage tab in Billing & AI shows where every cent of your AI spend has gone. This page explains what each column means and how to read the summary numbers.

The summary panel

At the top of the Usage tab you'll see a few summary rows:

  • Subscription credits remaining — what's left of this period's plan allowance, against the total it started with.
  • Purchased credits remaining — only shown if you have at least one active top-up. Sums all unexpired top-ups.
  • Total credits available — subscription + purchased.
  • Spent in last 24 hours — rolling-window cost. Useful for noticing a runaway loop.
  • Spent this billing period — total cost since the period started.
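
If you export the log and want to cross-check the panel, the summary rows can be recomputed roughly like this. This is a minimal sketch, not the app's actual code: the `UsageRow` shape, the `summarize` function and the sample figures are all hypothetical, and it assumes credits are denominated in dollars.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class UsageRow:
    """One row of the usage log (hypothetical shape, only two columns)."""
    when: datetime  # timestamp the call completed
    cost: float     # dollars deducted for this call


def summarize(rows, subscription_left, topups_left, period_start, now):
    """Recompute the summary rows from log rows and remaining balances."""
    purchased = sum(topups_left)  # all unexpired top-ups added together
    day_ago = now - timedelta(hours=24)
    return {
        "subscription_remaining": subscription_left,
        "purchased_remaining": purchased,
        "total_available": subscription_left + purchased,
        "spent_last_24h": sum(r.cost for r in rows if r.when > day_ago),
        "spent_this_period": sum(r.cost for r in rows if r.when >= period_start),
    }


# Made-up sample data, for illustration only.
now = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)
period_start = datetime(2025, 1, 1, tzinfo=timezone.utc)
rows = [
    UsageRow(now - timedelta(hours=2), 0.25),   # inside the 24-hour window
    UsageRow(now - timedelta(hours=30), 0.50),  # this period, but older than 24 hours
    UsageRow(datetime(2024, 12, 20, tzinfo=timezone.utc), 1.00),  # previous period
]
summary = summarize(rows, subscription_left=10.0, topups_left=[5.0, 2.5],
                    period_start=period_start, now=now)
print(summary)
```

With this sample data the 24-hour figure counts only the first row, while the billing-period figure counts the first two. A large gap between those two numbers on a quiet day is the kind of signal that points to a runaway loop.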

The recent usage log

Below the summary, every AI call is logged as one row with these columns:

Column    What it is
When      Timestamp the call completed.
Model     Which model handled the call (e.g. Sonnet 4, GPT-5, Gemini 2.5 Pro).
In        Input tokens billed at the regular input rate.
Out       Output tokens generated by the model.
Cache R   Input tokens served from cache (discounted, typically ~10× cheaper).
Cache W   Input tokens written to cache for the first time. On Anthropic this carries a small premium; on OpenAI and Google it's billed as regular input.
Cost      Total dollars deducted from your credit balance for this call.

For an end-to-end refresher on how those numbers turn into a charge, see AI pricing.
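
As a rough illustration of how the four token columns could combine into the Cost column, here is a minimal sketch. The `Rates` class, the `call_cost` function and the per-million-token prices are hypothetical placeholders, not real prices. The sketch assumes no markup or rounding, and takes the ~10× cache-read discount and the ~25% Anthropic cache-write premium from this page.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Hypothetical prices in dollars per million tokens (not real prices)."""
    input: float
    output: float
    cache_read_factor: float = 0.10   # Cache R billed at about 1/10 of input
    cache_write_factor: float = 1.00  # use 1.25 for an Anthropic-style premium


def call_cost(rates, tokens_in, tokens_out, cache_r, cache_w):
    """Estimate the cost of one call from its In, Out, Cache R and Cache W columns."""
    per_token_in = rates.input / 1_000_000
    per_token_out = rates.output / 1_000_000
    return (
        tokens_in * per_token_in
        + tokens_out * per_token_out
        + cache_r * per_token_in * rates.cache_read_factor
        + cache_w * per_token_in * rates.cache_write_factor
    )


# Placeholder prices: $3 per million input tokens, $15 per million output tokens.
example_rates = Rates(input=3.0, output=15.0)

# A later turn in a chat: small new input, most of the prompt read from cache.
later_turn = call_cost(example_rates, tokens_in=1_000, tokens_out=500,
                       cache_r=20_000, cache_w=0)
print(f"${later_turn:.4f}")  # prints $0.0165
```

In this example the 20,000 cached tokens add less to the bill than the 500 output tokens do, which is why a large Cache R column is usually good news.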

Common patterns

"Why is my Cache R column so big?"

Those are the system instructions and tool definitions being reused on later turns of the same chat. It's a feature, not a bug: those tokens are already stored, so they cost you about a tenth of the regular input rate. The bigger Cache R is, the more you're saving by reusing context.

"Why does the first message in a chat cost more?"

On Anthropic, the first turn pays a small Cache W premium (~25% above the input rate) to store the prefix. From the second turn on, that prefix moves to Cache R and gets the discount. OpenAI and Google don't charge extra for cache writes, so on those providers the first turn is billed as regular input.
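
To see the effect in numbers, the sketch below expresses the cost of a shared prefix in "regular input token equivalents", so no real prices are needed. The function name and the 10,000-token prefix are hypothetical. The 1.25 write multiplier and the 0.10 read multiplier are the approximate figures quoted on this page, and actual provider rates may differ.

```python
def prefix_cost_in_input_tokens(prefix_tokens, turn, write_premium):
    """Cost of the shared prefix on a given turn, in regular-input-token equivalents.

    write_premium is the Cache W multiplier: about 1.25 on Anthropic,
    1.0 where cache writes are billed as regular input.
    """
    if turn == 1:
        return prefix_tokens * write_premium  # prefix is written to cache (Cache W)
    return prefix_tokens * 0.10               # prefix is read from cache (Cache R)


prefix = 10_000  # hypothetical size of system instructions plus tool definitions

anthropic_first = prefix_cost_in_input_tokens(prefix, turn=1, write_premium=1.25)
anthropic_second = prefix_cost_in_input_tokens(prefix, turn=2, write_premium=1.25)
other_first = prefix_cost_in_input_tokens(prefix, turn=1, write_premium=1.0)

print(anthropic_first, anthropic_second, other_first)
```

Under these assumptions the premium pays for itself by the second turn: two turns cost about 1.35× the prefix with caching (1.25 + 0.10), compared with 2× if the prefix were sent as regular input both times.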

"My balance is dropping faster than I expected."

  • Check the Model column: flagship models can cost 5–10× more than smaller ones.
  • Long output is the biggest cost driver, because output tokens cost several times more per token than input tokens.
  • Consider starting a fresh chat per indicator to keep input small. The full history is resent as input on every turn, so long chats get more expensive as they grow.
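
The last point is easier to see with a quick calculation. The sketch below is a simplified model with a hypothetical function and made-up numbers: it assumes every turn adds the same number of tokens to the history and that each turn resends everything so far. It counts tokens processed as input, whether they end up in the In or the Cache R column.

```python
def total_input_tokens(turns, tokens_per_turn):
    """Input tokens processed across a chat when each turn replays all earlier turns."""
    # Turn k sends k * tokens_per_turn tokens: the new message plus all prior history.
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


one_long_chat = total_input_tokens(turns=20, tokens_per_turn=500)
four_short_chats = 4 * total_input_tokens(turns=5, tokens_per_turn=500)

print(one_long_chat)     # 105000
print(four_short_chats)  # 30000
```

Both scenarios contain 20 turns in total, but the single long chat processes 3.5× as many input tokens, because the replayed history grows with every turn.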

Refreshing the data

The list is paginated: use Show more to load older entries. The refresh icon next to the section heading reloads the log from the server, in case a call was still in flight when the page loaded.

Something missing or wrong? Email support@strategytune.com.