# Calling LLMs
Amberflo provides a built-in, fully configurable AI Gateway for accessing LLMs across providers through a single, OpenAI-compatible API. Follow the quick start guide to get set up in under 5 minutes and begin making requests through the AI Gateway immediately.

All calls made through the gateway are:

- Authenticated using a virtual key
- Governed by the associated workload
- Logged for usage, cost, and observability

There are two primary ways to call the gateway:

1. Using the Model Playground
2. Making direct API calls

## Option 1: Using the Model Playground

The Model Playground is the fastest way to test that everything is working correctly.

1. Navigate to **Model Playground** using the left-hand navigation.
2. Paste one of your virtual keys into the field labeled **Workload Virtual Key**.
3. Once the key is entered:
   - The workload field is automatically populated.
   - The model selector shows only the models allowed for that workload.
4. Select a model and use the chat interface to send a message.

This is the easiest way to:

- Validate access
- Confirm model routing
- Start generating usage and cost data

Requests made through the Playground are treated the same as API calls and will appear in the AI Spend and AI Observability dashboards.

## Option 2: Calling the gateway via API

The Amberflo gateway exposes an OpenAI-compatible API. This allows you to reuse existing OpenAI client logic or make direct HTTP calls with minimal changes.

**Gateway endpoint:** `https://app.amberflo.io/ai-gateway/proxy/v1/chat/completions`

All requests must include:

- A virtual key in the `Authorization` header
- A valid model (or model alias) allowed for the workload
- A standard OpenAI-compatible request body

### Python example

The example below uses the `requests` library and is designed to be simple and explicit.

```python
import requests
import json

gateway_url = "https://app.amberflo.io/ai-gateway/proxy/v1/chat/completions"
virtual_key = "YOUR_VIRTUAL_KEY_HERE"
model_alias = "YOUR_MODEL_ALIAS_HERE"

headers = {
    "Accept": "application/json",
    "Content-Type": "application/json",
    "Authorization": f"Bearer {virtual_key}",
}

payload = {
    "model": model_alias,
    "messages": [
        {
            "role": "user",
            "content": "How are tokens calculated?"
        }
    ],
}

response = requests.post(
    gateway_url,
    headers=headers,
    data=json.dumps(payload),
)

print(response.status_code)
print(response.json())
```

**Notes:**

- Replace `YOUR_VIRTUAL_KEY_HERE` with a virtual key tied to a workload.
- Replace `YOUR_MODEL_ALIAS_HERE` with the model alias you configured in Amberflo.
- The request format matches the OpenAI Chat Completions API.

### curl example

The following curl command is equivalent to the Python example above.

```bash
curl https://app.amberflo.io/ai-gateway/proxy/v1/chat/completions \
  -X POST \
  -H "Authorization: Bearer YOUR_VIRTUAL_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "model": "YOUR_MODEL_ALIAS_HERE",
  "messages": [
    {
      "role": "user",
      "content": "How are tokens calculated?"
    }
  ]
}
EOF
```

This is useful for:

- Quick testing from the command line
- Debugging access or configuration issues
- Verifying model routing

## What happens next

Once requests are sent through the gateway:

- Usage is automatically tracked.
- Cost is calculated using Amberflo's rating engine.
- Data is attributed to the correct workload.
- Metrics appear in the AI Spend and AI Observability dashboards.

No additional configuration is required.

## Summary

- Use the Model Playground for quick validation and experimentation.
- Use API calls for application integration.
- Always authenticate using a virtual key.
- Reference models by alias, not provider-specific names.

With this in place, your application is fully integrated with the Amberflo AI Gateway.
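For application code, it can help to wrap the request-building and sending steps above into small reusable functions. The sketch below is illustrative only: the function names `build_chat_request` and `chat` are not part of any Amberflo SDK, and it assumes the same gateway endpoint, header, and payload shape shown in the Python example.

```python
import json

import requests

# Gateway endpoint from the examples above.
GATEWAY_URL = "https://app.amberflo.io/ai-gateway/proxy/v1/chat/completions"


def build_chat_request(model_alias: str, prompt: str, virtual_key: str):
    """Build the headers and OpenAI-compatible payload for a gateway call."""
    headers = {
        "Accept": "application/json",
        "Content-Type": "application/json",
        # The virtual key goes in the Authorization header as a bearer token.
        "Authorization": f"Bearer {virtual_key}",
    }
    payload = {
        # Reference the model by its alias, not a provider-specific name.
        "model": model_alias,
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, payload


def chat(model_alias: str, prompt: str, virtual_key: str, timeout: float = 30.0) -> dict:
    """Send one chat completion through the gateway and return the parsed JSON."""
    headers, payload = build_chat_request(model_alias, prompt, virtual_key)
    response = requests.post(
        GATEWAY_URL, headers=headers, data=json.dumps(payload), timeout=timeout
    )
    # Surface auth or access errors (e.g. a bad key, or a model the
    # workload is not allowed to use) instead of returning an error body.
    response.raise_for_status()
    return response.json()
```

Keeping `build_chat_request` separate from the network call makes it easy to unit-test the payload shape without hitting the gateway.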
