Usage Metering
Meter Corrections (undo events)
Introduction

Occasionally, a source system may send incorrect or unintended meter events to Amberflo. This could be due to:

- A bug that results in over-reporting or under-reporting usage
- An event that was mistakenly triggered and needs to be retracted

How do you correct an invoice or bill based on inaccurate usage data? How do you undo (delete) incorrect meters?

Amberflo's Solution: Undo Ingested Meters

Amberflo provides a built-in capability to cancel (undo) one or more meter events after ingestion. This ensures Amberflo remains a reliable system of record by supporting:

- Idempotency
- Data deduplication
- Accurate correction of historical usage

How It Works

Amberflo exposes an easy-to-use API that allows you to:

- Cancel a single usage event
- Cancel a batch of events

When meter events are canceled, the system:

- Updates all relevant usage aggregations
- Propagates changes to downstream systems, including invoices, rewards, and payment calculations

This keeps your billing and reporting accurate, even when corrections are needed.

Examples Where Meter Events Need to Be Undone (and Corrected)

System Errors

These errors are typically the result of bugs or data issues in the source system that sent incorrect or incomplete meter events to Amberflo. Since the root cause lies within the originating system, we refer to them as system errors.

Example

Let's assume a company named Smart ML uses Amberflo for usage-based pricing. Smart ML tracks a meter called api-calls, which includes the following dimensions: region, cluster, and host_type.

What went wrong

On February 2, 2022 at 1:00 AM, a bad deployment introduced a bug. As a result, all api-calls events sent to Amberflo after that time were missing values for the cluster and host_type dimensions.

Discovery and correction

On February 15, 2022 at 4:00 PM, George, a product manager at Smart ML, noticed irregularities in the usage data. After investigating, the Smart ML on-call engineering team concluded that all api-calls meter events sent since
2/2/2022 were invalid and needed to be undone and corrected. This is a typical case where meter correction via Amberflo's undo functionality is essential to maintain data accuracy and ensure downstream billing or analytics remain trustworthy.

Unintentional or Unsuccessful Usage

Unlike system errors caused by bugs or incorrect data in the source system, unintentional or unsuccessful usage refers to events that occur naturally during regular use of a service. In these cases, a customer attempts (or begins) to use a resource, but the intended action is ultimately canceled or aborted, so the resource is not actually used.

Example

Rabbit.io is a company that enables users to build and run big data pipelines on its clusters (essentially a PaaS provider). Smart ML is a customer of Rabbit.io.

On 3/3/2022 at 3:21 AM, Smart ML submits a request to start a data pipeline job. Rabbit.io:

- Assigns machines from cluster x for the job
- Sends an event to Amberflo indicating that Smart ML has started using cluster x

However, if the job is canceled shortly afterward (e.g., due to a user action or a failure), the resource wasn't actually used, making the original meter event invalid and a candidate for correction.

    {
      "customerId": "smart-ml-123",          // the actual customer ID of Smart ML
      "meterApiName": "cpu-used",
      "meterValue": 500,                     // let's assume the job takes 500 CPUs
      "meterTimeInMillis": 1646277660000,    // 3/3/2022 at 3:21 AM UTC, in milliseconds
      "dimensions": {
        "cluster": "x",
        "tenant_type": "tech"
      }
    }

But then, 15 minutes later at 3:36 AM, cluster x goes down and the job fails. Since the data pipeline job was never successfully executed, Rabbit.io decides not to charge Smart ML for the failed usage. As a result, they want to cancel the original meter event sent to Amberflo, because the resource was ultimately not used.

Coping with System Errors

To help cope with system errors, Amberflo exposes a filtering rules API. This API allows you to define rules for filtering out and removing sets of previously ingested
meter events.

Example

In the Smart ML example, to filter out the bad api-calls events, all Smart ML needs to do is send the following rule to the filtering rules endpoint:

    https://app.amberflo.io/ingest-snapshot/custom-filtering-rules

    {
      "type": "by_property_filter_out",
      "id": "2-2-2022-api-calls-cancelation",
      "ingestionTimeRange": {
        "startTimeInSeconds": 1643763600,    // 2/2/2022 1:00 AM
        "endTimeInSeconds": 1644940800       // 2/15/2022 4:00 PM
      },
      "meterApiName": "api-calls"
    }

Advanced Filters

In some cases, you may not need to cancel all meter events within a time range, only a subset that meets specific criteria.

Example

Let's assume Smart ML experienced a bad deployment only in the region us-west-1. In this case, there's no need to undo all api-calls events between 2/2/2022 1:00 AM and 2/15/2022 4:00 PM. Instead, you want to target only the events where the region dimension is equal to us-west-1.

How to apply the filter

To define this type of filter, use the dimensionValuesMap property and specify the dimension keys and values you want to match. This ensures that only events with the specified dimension values will be selected for cancellation.

    https://app.amberflo.io/ingest-snapshot/custom-filtering-rules

    {
      "type": "by_property_filter_out",
      "id": "2-2-2022-api-calls-cancelation",
      "ingestionTimeRange": {
        "startTimeInSeconds": 1643763600,    // 2/2/2022 1:00 AM
        "endTimeInSeconds": 1644940800       // 2/15/2022 4:00 PM
      },
      "meterApiName": "api-calls",
      "dimensionValuesMap": {
        "region": ["us-west-1"]
      }
    }

For more information, please refer to our Filtering Rules API: Create or Update a Filtering Rule.

📘 Notice

When using the filtering and meter cancellation tools, please keep the following considerations in mind.

Time limitations: Meter events can only be canceled within one year of their original ingestion date. Events older than one year are not eligible for cancellation.

Impact on payments: You can cancel usage events and replace them
with corrected ones as needed. However, cancellations cannot modify invoices that are already locked (i.e., past their grace period).

What does "locked" mean? A locked invoice is considered final and paid. In these cases, any necessary adjustments should be handled via:

- A refund
- A future discount or credit

Canceling Unintentional or Unsuccessful Events

You can use the filtering mechanism described earlier to target and cancel specific events based on their unique ID.

Method 1: Using unique ID

If you have the unique ID of the event to be canceled, add it to the dimensionValuesMap property. This allows for a precise and reliable cancellation, assuming:

- You have the correct unique ID
- The same unique ID hasn't been reused for another event

Method 2: Using resource-related properties

If the unique ID is not available, but you have other identifying information, such as:

- Customer ID
- Meter name
- Dimension values (e.g., region, cluster)

then Amberflo supports an alternative filtering approach using those resource-related properties. This allows you to cancel specific events even without the exact unique ID, provided that the filtering criteria are sufficient to isolate the correct events.

First, let's define "resource-related properties". These properties depend on the meter type, and they help identify the specific resource instance that was used to serve the customer.

To cancel the events associated with that resource, send a new ingest event that includes the appropriate resource ID and related dimensions, plus an additional dimension, "aflo.cancel_previous_resource_event", with "true" as its value. This instructs Amberflo to locate and cancel any previously ingested events related to that resource.

For example, in the Rabbit.io example mentioned above, Rabbit.io needs to send the following ingest event in order to cancel the usage:

    https://app.amberflo.io/ingest

    [{
      "customerId": "smart-ml-123",          // the actual customer ID of Smart ML
      "meterApiName": "cpu-used",
      "meterValue": 0,                       // the value of the cancellation event doesn't matter
      "meterTimeInMillis": 1646278560000,    // 3/3/2022 at 3:36 AM UTC, within 9 hours of the original event
      "dimensions": {
        "cluster": "x",
        "aflo.cancel_previous_resource_event": "true"
      }
    }]

📘 Limitations

When canceling events using resource-related properties and the aflo.cancel_previous_resource_event flag, there are a few important constraints to keep in mind:

1. 9-hour time limit: You have up to 9 hours from the time a resource starts being used to cancel the associated usage event.
2. Timestamp of the cancellation event: The timestamp of the cancellation event must be no later than 9 hours after the original usage event. For best results, set the cancellation event's timestamp as close as possible to the original event, or even use the exact same timestamp as the event you are canceling.

Why do these limitations exist?

When you define a cancellation event using only resource-related properties, you're not explicitly identifying the exact meter event to be canceled (e.g., by unique ID). Because of this, Amberflo operates under the assumption that you're targeting a recent event and will:

- Search for the most recent event related to the specified resource within the last 9 hours
- Cancel that event if it exists

Important: If the same resource was used more than once in that time window, Amberflo will cancel only the most recent of those events.

Also note that the value of the cancellation event does not matter, because its only purpose is to signal the system to drop the matching usage record, not to report usage.

Canceling Unintentional or Unsuccessful Usage (Advanced)

The aflo.cancel_previous_resource_event flag, as described earlier, is used for dynamically canceling usage events. This mechanism can target events such as start using a resource, stop using a resource, or an update to the usage rate.

For max or total duration meters, Amberflo supports an
additional flag to refine cancellation behavior: "aflo.ignore_cancellation_if_no_usage".

This secondary flag is only meaningful when used together with aflo.cancel_previous_resource_event. If aflo.ignore_cancellation_if_no_usage is set to true, Amberflo will ignore the cancellation of a "stop" event if there was no usage recorded between the original start and stop events. This prevents cancellations from invalidating periods where no actual usage occurred, which is a useful safeguard for duration-based meters.

Example

Let's assume you are using a total duration meter and the following three events were ingested:

    // start usage
    {
      "customerId": "smart-ml-123",          // the actual customer ID of Smart ML
      "meterApiName": "cpu-used",
      "meterValue": 10,
      "meterTimeInMillis": 1646278560000,    // 3/3/2022 at 3:36 AM UTC
      "dimensions": {
        "cluster": "x"
      }
    },
    // stop usage
    {
      "customerId": "smart-ml-123",          // the actual customer ID of Smart ML
      "meterApiName": "cpu-used",
      "meterValue": 0,
      "meterTimeInMillis": 1646278620000,    // 3/3/2022 at 3:37 AM UTC
      "dimensions": {
        "cluster": "x"
      }
    },
    // cancellation
    {
      "customerId": "smart-ml-123",          // the actual customer ID of Smart ML
      "meterApiName": "cpu-used",
      "meterValue": 0,
      "meterTimeInMillis": 1646278680000,    // 3/3/2022 at 3:38 AM UTC
      "dimensions": {
        "cluster": "x",
        "aflo.cancel_previous_resource_event": "true"
      }
    }

As mentioned, event cancellation cancels the latest event regardless of what it indicates. So in the example above, after applying the dynamic cancellation, the stop event is removed and we are left with only the start event:

    // start usage
    {
      "customerId": "smart-ml-123",          // the actual customer ID of Smart ML
      "meterApiName": "cpu-used",
      "meterValue": 10,
      "meterTimeInMillis": 1646278560000,    // 3/3/2022 at 3:36 AM UTC
      "dimensions": {
        "cluster": "x"
      }
    }

Now, if we want the dynamic cancellation to cancel only existing usage events (as opposed to canceling any type of event), we need to add the "aflo.ignore_cancellation_if_no_usage" flag, with "true" as its value, to the third event. The sequence of events sent to Amberflo will then look like:

    // start usage
    {
      "customerId": "smart-ml-123",          // the actual customer ID of Smart ML
      "meterApiName": "cpu-used",
      "meterValue": 10,
      "meterTimeInMillis": 1646278560000,    // 3/3/2022 at 3:36 AM UTC
      "dimensions": {
        "cluster": "x"
      }
    },
    // stop usage
    {
      "customerId": "smart-ml-123",          // the actual customer ID of Smart ML
      "meterApiName": "cpu-used",
      "meterValue": 0,
      "meterTimeInMillis": 1646278620000,    // 3/3/2022 at 3:37 AM UTC
      "dimensions": {
        "cluster": "x"
      }
    },
    // cancellation
    {
      "customerId": "smart-ml-123",          // the actual customer ID of Smart ML
      "meterApiName": "cpu-used",
      "meterValue": 0,
      "meterTimeInMillis": 1646278680000,    // 3/3/2022 at 3:38 AM UTC
      "dimensions": {
        "cluster": "x",
        "aflo.cancel_previous_resource_event": "true",
        "aflo.ignore_cancellation_if_no_usage": "true"    // <== notice the additional secondary flag
      }
    }

In this case, the result of the system run will be:

    // start usage
    {
      "customerId": "smart-ml-123",          // the actual customer ID of Smart ML
      "meterApiName": "cpu-used",
      "meterValue": 10,
      "meterTimeInMillis": 1646278560000,    // 3/3/2022 at 3:36 AM UTC
      "dimensions": {
        "cluster": "x"
      }
    },
    // stop usage
    {
      "customerId": "smart-ml-123",          // the actual customer ID of Smart ML
      "meterApiName": "cpu-used",
      "meterValue": 0,
      "meterTimeInMillis": 1646278620000,    // 3/3/2022 at 3:37 AM UTC
      "dimensions": {
        "cluster": "x"
      }
    }

Both events remain because there was no usage to cancel for the third event.

📘 Notice

"aflo.ignore_cancellation_if_no_usage" is only meaningful in the context of "max" or "total duration" meters.
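Building the payloads in code

The two payload shapes in this guide (a filtering rule and a resource-based cancellation event) mix epoch seconds and epoch milliseconds, which is easy to get wrong by hand. The Python sketch below shows one way to assemble them from UTC datetimes. It is illustrative only: the helper names (build_filter_rule, build_cancellation_event) are hypothetical, the field and flag spellings are assumed to match the payloads shown in this guide, the 9-hour check simply mirrors the limitation described above, and nothing is sent to Amberflo.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Window taken from the limitations described in this guide.
CANCEL_WINDOW = timedelta(hours=9)


def epoch_seconds(dt: datetime) -> int:
    """Return whole epoch seconds for a timezone-aware datetime."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time surprises")
    return int(dt.timestamp())


def epoch_millis(dt: datetime) -> int:
    """Return epoch milliseconds: whole seconds plus the millisecond part."""
    return epoch_seconds(dt) * 1000 + dt.microsecond // 1000


def build_filter_rule(rule_id: str, meter_api_name: str, start: datetime,
                      end: datetime, dimensions: Optional[dict] = None) -> dict:
    """Assemble a filtering-rule body (field names assumed from this guide)."""
    if end <= start:
        raise ValueError("end must be after start")
    rule = {
        "type": "by_property_filter_out",
        "id": rule_id,
        "ingestionTimeRange": {
            "startTimeInSeconds": epoch_seconds(start),  # seconds, not millis
            "endTimeInSeconds": epoch_seconds(end),
        },
        "meterApiName": meter_api_name,
    }
    if dimensions:
        # Each dimension key maps to a list of values to match.
        rule["dimensionValuesMap"] = {k: list(v) for k, v in dimensions.items()}
    return rule


def build_cancellation_event(customer_id: str, meter_api_name: str,
                             original_time: datetime, cancel_time: datetime,
                             dimensions: dict) -> dict:
    """Assemble a resource-based cancellation event, enforcing the 9-hour window."""
    delta = cancel_time - original_time
    if not (timedelta(0) <= delta <= CANCEL_WINDOW):
        raise ValueError("cancellation must fall within 9 hours after the original event")
    return {
        "customerId": customer_id,
        "meterApiName": meter_api_name,
        "meterValue": 0,  # ignored for cancellation events
        "meterTimeInMillis": epoch_millis(cancel_time),  # millis, not seconds
        "dimensions": {**dimensions, "aflo.cancel_previous_resource_event": "true"},
    }


# Example values taken from the Smart ML and Rabbit.io scenarios in this guide.
rule = build_filter_rule(
    "2-2-2022-api-calls-cancelation",
    "api-calls",
    datetime(2022, 2, 2, 1, 0, tzinfo=timezone.utc),
    datetime(2022, 2, 15, 16, 0, tzinfo=timezone.utc),
    {"region": ["us-west-1"]},
)
event = build_cancellation_event(
    "smart-ml-123",
    "cpu-used",
    datetime(2022, 3, 3, 3, 21, tzinfo=timezone.utc),
    datetime(2022, 3, 3, 3, 36, tzinfo=timezone.utc),
    {"cluster": "x"},
)
```

Rejecting naive datetimes and out-of-window cancellations locally is a design choice for this sketch; it surfaces the two most likely mistakes before any request is made.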