Billing for LLM tokens
This guide shows how to bill customers for an AI chatbot product using:

- API calls (per request)
- LLM tokens (input + output, priced differently by model)

This is a real-world billing pattern used by most AI products.

By the end, you will have:

- Multiple meters (API calls + tokens)
- A pricing plan with multiple rate types
- Dimensional pricing (model + token type)
- A customer receiving a fully broken-down invoice

The scenario

You are building an AI chatbot SaaS. Your pricing model:

- $0.01 per API request
- Token usage billed based on:
  - Model (Model A vs. Model B)
  - Token type (input vs. output)

This reflects how real AI systems are priced.

What you'll build

You will implement:

- API request → flat charge
- LLM tokens → dimensional pricing (model + type)
- Both → combined into a single invoice

Step 1: Create meters

You need two meters.

1a. API calls meter (simple)

Navigate:

1. Go to Meters
2. Click Create Meter

Configure:

- Label: API Calls
- API Name: auto-generated
- Meter Type: Count
- No dimensions required

Note: You can add dimensions (endpoint, region, etc.) for analytics or cost tracking, but they are not needed for billing here.

Click Create.

1b. LLM tokens meter (dimensional)

Navigate:

1. Go to Meters
2. Click Create Meter

Configure:

- Label: LLM Tokens
- Meter Type: Count

Add dimensions:

- model
- type (input, output)

These dimensions are required for realistic AI pricing.

Click Create.

Step 2: Create a pricing plan

Navigate:

1. Go to Pricing
2. Click Create Pricing Plan

Configure:

- Name: Chatbot Usage Plan
- Billing Period: Monthly

Click Continue.

Step 3: Add rates

You will add two rates to the same plan.

3a. API calls rate (per unit)

Click Add Rate and select Usage-Based.

Configure:

- Meter: API Calls
- Rate Model: Per Unit
- Price: 0.01

This means every API request costs $0.01.

3b. LLM tokens rate (dimensional)

Click Add Rate again and select Usage-Based.

Configure:

- Meter: LLM Tokens
- Rate Model: Dimensions
- Tier Model: Per Unit

Define the pricing matrix

You will define rates based on model and type. Example:

    Model     Type     Price per token
    model a   input    0.00001
    model a   output   0.00002
    model b   input    0.000005
    model b   output   0.00001

Each row is a separate pricing rule.

Important: You are pricing each combination independently. This is how real AI pricing works.

Click Save.

Step 4: Create a customer

Navigate:

1. Go to Customers
2. Click Create Customer

Configure:

- Customer Name: Test Customer
- Customer ID: test customer

Click Save.

Step 5: Assign the pricing plan

Open the customer. In Pricing Plans, click Assign Plan.

Configure:

- Plan: Chatbot Usage Plan
- Start Date: Now

Click Assign Plan.

Step 6: Send usage

Now you simulate real usage.

Navigate:

1. Go to Meters
2. Open LLM Tokens
3. Click Event Upload

Example event (tokens):

    [
      {
        "meterApiName": "llm tokens",
        "meterValue": 1000,
        "dimensions": {
          "model": "model a",
          "type": "input"
        }
      }
    ]

Send multiple events for different combinations:

- model a + input
- model a + output
- model b + input

API calls event

Send another event using the API Calls meter:

    [
      {
        "meterApiName": "api calls",
        "meterValue": 10
      }
    ]

Step 7: View the invoice

Navigate:

1. Go to Customers
2. Open your customer

What you'll see

Your invoice will include:

- API call charges
- Token usage charges

By default, token charges are broken down by model and token type. Example:

- Model A input tokens → $X
- Model A output tokens → $Y
- Model B input tokens → $Z
- API calls → $W
- Total → $T

Important note: You do not have to expose this level of detail to your customers. This breakdown is the default, and you can simplify how invoices are presented.

If you don't see data

Refresh the page. Amberflo does not auto-refresh.

If the invoice is still empty:

- Confirm events were sent
- Confirm the event dimensions match the pricing matrix
- Confirm the plan is assigned

What you just built

You implemented a real AI billing system with:

- Multiple meters
- Mixed pricing models
- Dimensional pricing
- A unified invoice

Why this matters

This pattern lets you:

- Price different models differently
- Charge input and output tokens separately
- Combine request-based and usage-based billing

This is how modern AI products monetize.

Next steps

- Add tiered pricing for tokens
- Introduce discounts or credits
- Connect to cost tracking for margin visibility

For internal cost attribution, see Workloads.
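
To sanity-check the invoice from Step 7, the arithmetic behind it can be sketched offline. The following is a minimal illustration in plain Python, not Amberflo SDK code: `build_invoice`, `TOKEN_RATES`, `API_CALL_RATE`, and `SAMPLE_EVENTS` are hypothetical names, the camel-cased event fields `meterApiName` and `meterValue` are an assumption about the event shape in Step 6, and the token counts for the second and third events are made up for the example. The rates are the ones entered in Step 3.

```python
from collections import defaultdict
from decimal import Decimal

# Rates from Step 3 of this guide. These are local constants for the
# sketch, not objects returned by any billing API.
API_CALL_RATE = Decimal("0.01")
TOKEN_RATES = {
    ("model a", "input"): Decimal("0.00001"),
    ("model a", "output"): Decimal("0.00002"),
    ("model b", "input"): Decimal("0.000005"),
    ("model b", "output"): Decimal("0.00001"),
}

# Events shaped like the Step 6 examples (field names are assumed;
# the 500 and 2000 token counts are illustrative).
SAMPLE_EVENTS = [
    {"meterApiName": "llm tokens", "meterValue": 1000,
     "dimensions": {"model": "model a", "type": "input"}},
    {"meterApiName": "llm tokens", "meterValue": 500,
     "dimensions": {"model": "model a", "type": "output"}},
    {"meterApiName": "llm tokens", "meterValue": 2000,
     "dimensions": {"model": "model b", "type": "input"}},
    {"meterApiName": "api calls", "meterValue": 10},
]


def build_invoice(events):
    """Aggregate usage per (model, type) and per API call, then price it."""
    usage = defaultdict(Decimal)  # Decimal() is zero
    for event in events:
        meter = event["meterApiName"]
        value = Decimal(str(event["meterValue"]))
        if meter == "api calls":
            usage[("api calls",)] += value
        elif meter == "llm tokens":
            dims = event["dimensions"]
            usage[(dims["model"], dims["type"])] += value
        else:
            raise ValueError(f"unknown meter: {meter}")

    lines = {}
    for key, quantity in usage.items():
        # A missing (model, type) pair raises KeyError, which mirrors the
        # "confirm dimensions match pricing" check in the guide.
        rate = API_CALL_RATE if key == ("api calls",) else TOKEN_RATES[key]
        lines[" ".join(key)] = quantity * rate
    lines["total"] = sum(lines.values(), Decimal("0"))
    return lines


if __name__ == "__main__":
    for label, amount in build_invoice(SAMPLE_EVENTS).items():
        print(f"{label}: ${amount:.4f}")
```

With these sample events, each token line item comes to $0.01, the ten API calls come to $0.10, and the total is $0.13. `Decimal` is used instead of `float` so that very small per-token prices do not pick up binary rounding error.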
