Meter Corrections (undo events)

Introduction

Occasionally, a meter event source sends or reports incorrect meter events.

For example, a bug may cause a system to send incorrect meter events to Amberflo, or you may simply want to cancel one specific event. See the additional examples below.

How do you correct an invoice or bill based on inaccurate usage data?

How do you undo (delete) wrong meters?

Amberflo provides the ability to cancel (undo) one or more meter events as needed.

A metering system can claim to be an accurate system of record for usage and consumption data only if it provides built-in mechanisms (idempotency, data deduplication) and tooling (undoing ingested meters) to guarantee accuracy.

Amberflo provides an easy-to-use API that you can invoke to cancel a single usage event or a large set of events. The system also propagates the changes to the usage aggregations, as well as to the invoicing or rewards payment systems if those are affected by the cancellation.


Examples where meter events need to be undone (and corrected)

System Errors

These errors are data problems or bugs in the account's own system that cause it to send false or inaccurate data to Amberflo. They are called system errors because they originate in the system that sent the events to Amberflo.

Example:

Let's assume that ‘Smart-ML’ uses Amberflo for usage-based pricing. ‘Smart-ML’ sends Amberflo a meter called ‘api-calls’ which has the following dimensions: ‘region’, ‘cluster’, and ‘host-type’.

Now let’s assume that on 2/2/2022 at 1:00am there was a “bad deployment”, and because of some bug, all of the ‘api-calls’ events do not have values for the “cluster” and “host-type” dimensions.

On 2/15/2022 at 4pm, George, a PM at ‘Smart-ML’, discovers that something is wrong with the meters sent to Amberflo. After a short investigation, the ‘Smart-ML’ on-call engineer concludes that all of the ‘api-calls’ meters sent to Amberflo since 2/2/2022 are invalid.


Unintentional or unsuccessful usage

Unlike system errors, which stem from bad data or a bug in the source system, unintentional or unsuccessful usage occurs naturally in the course of normal use. In this scenario, a customer starts or intends to start using a resource, but the action the customer wanted to take is then canceled for some reason, so the resource ends up not being used.

Example:

‘Rabbit-IO’ is a PaaS company that allows users to develop and run big-data pipelines on its clusters. ‘Smart-ML’ is a customer of ‘Rabbit-IO’.

On 3/3/2022 at 3:21am, ‘Smart-ML’ sends a request to ‘Rabbit-IO’ to start a certain data-pipeline job. ‘Rabbit-IO’ assigns cluster X machines to this job and sends Amberflo an event stating that ‘Smart-ML’ started using machines on cluster X on 3/3/2022 at 3:21am.

{
    "customerId": "Smart-ML-123", // the actual customer id of Smart-ML
    "meterApiName": "cpu-used",
    "meterValue": 500, // let's assume the job takes 500 CPUs
    "meterTimeInMillis": 1646320860000, // 3/3/2022 at 3:21am, in epoch milliseconds
    "dimensions": {
        "cluster": "X",
        "tenant_type": "Tech"
    }
}

But then 15 minutes later, at 3:36am, cluster X goes down and the job fails. ‘Rabbit-IO’ doesn’t want to charge ‘Smart-ML’ for failed resource usage, so they want to cancel that usage event since it never successfully took place.


Coping with System Errors

To help cope with ‘System Errors’, Amberflo exposes a ‘filtering-rules’ API. This API allows you to define rules for filtering out and removing sets of previously ingested meter events.


Example:

If we take the ‘Smart-ML’ example, then in order to filter out bad ‘api-calls’, all ‘Smart-ML’ needs to do is:

Step 1: Send the following filtering event to Amberflo:

https://app.amberflo.io/ingest-snapshot/custom-filtering-rules :

{
    "type": "by_property_filter_out",
    "id": "2-2-2022-api-calls-cancelation",
    "ingestionTimeRange": {
        "startTimeInSeconds": 1643763600, // 2/2/2022 1am
        "endTimeInSeconds": 1644940800  // 2/15/2022 4pm
    },
    "meterApiName": "api-calls"
}

Step 2: Send the correct events to Amberflo with the original timestamps.
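
The rule body from Step 1 can also be built programmatically, which avoids hand-computing the epoch seconds. The following is a minimal sketch, not an official client: `build_filter_rule` is a hypothetical helper, and it assumes the time range is expressed in UTC (the epoch values in the example above correspond to 2/2/2022 1am and 2/15/2022 4pm UTC). Sending the body is left out because the HTTP method and authentication headers are not described in this section.

```python
import json
from datetime import datetime, timezone


def build_filter_rule(rule_id, meter_api_name, start, end):
    """Build a 'by_property_filter_out' rule body like the one in Step 1.

    `start` and `end` must be timezone-aware datetimes (hypothetical helper).
    """
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("use timezone-aware datetimes to avoid local-time surprises")
    if end <= start:
        raise ValueError("end must be later than start")
    return {
        "type": "by_property_filter_out",
        "id": rule_id,
        "ingestionTimeRange": {
            "startTimeInSeconds": int(start.timestamp()),
            "endTimeInSeconds": int(end.timestamp()),
        },
        "meterApiName": meter_api_name,
    }


rule = build_filter_rule(
    "2-2-2022-api-calls-cancelation",
    "api-calls",
    datetime(2022, 2, 2, 1, 0, tzinfo=timezone.utc),    # 2/2/2022 1am UTC
    datetime(2022, 2, 15, 16, 0, tzinfo=timezone.utc),  # 2/15/2022 4pm UTC
)
print(json.dumps(rule, indent=4))
```

The printed JSON carries the same field names and values as the Step 1 body.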


Advanced filters:

Let’s assume ‘Smart-ML’ only had a bad deployment for region ‘us-west-1’. In that case there is no need to filter out all of the ‘api-calls’ events that happened between 2/2/2022 1am and 2/15/2022 4pm, only those where the ‘region’ dimension is ‘us-west-1’. To define such a filter, populate the ‘dimensionValuesMap’ property with the dimension keys and values that you wish to filter out. For example:

https://app.amberflo.io/ingest-snapshot/custom-filtering-rules :

{
    "type": "by_property_filter_out",
    "id": "2-2-2022-api-calls-cancelation",
    "ingestionTimeRange": {
        "startTimeInSeconds": 1643763600, // 2/2/2022 1am
        "endTimeInSeconds": 1644940800  // 2/15/2022 4pm
    },
    "meterApiName": "api-calls",
    "dimensionValuesMap": {
        "region": [
            "us-west-1"
        ]
    }
}

For more information, please refer to our filtering rules API: https://docs.amberflo.io/reference/post_payments-custom-filtering-rules.
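
Before submitting a rule, it can help to preview locally which events it is likely to select. The sketch below is only an approximation for reasoning about a rule and is not how Amberflo evaluates rules internally. It makes several assumptions that this section does not confirm: the time range bounds are inclusive, every key in ‘dimensionValuesMap’ must match (AND), any listed value for a key is accepted (OR), and each local event record carries a hypothetical `ingestedAtSeconds` field.

```python
def rule_matches(rule, event):
    """Return True if `event` would be filtered out by `rule` (local approximation)."""
    if event["meterApiName"] != rule["meterApiName"]:
        return False
    window = rule["ingestionTimeRange"]
    ingested = event["ingestedAtSeconds"]  # hypothetical field on our own records
    if not window["startTimeInSeconds"] <= ingested <= window["endTimeInSeconds"]:
        return False
    wanted = rule.get("dimensionValuesMap", {})
    dimensions = event.get("dimensions", {})
    # Assumed semantics: all keys must match, any listed value is accepted.
    return all(dimensions.get(name) in values for name, values in wanted.items())


rule = {
    "type": "by_property_filter_out",
    "id": "2-2-2022-api-calls-cancelation",
    "ingestionTimeRange": {
        "startTimeInSeconds": 1643763600,
        "endTimeInSeconds": 1644940800,
    },
    "meterApiName": "api-calls",
    "dimensionValuesMap": {"region": ["us-west-1"]},
}

events = [
    {"meterApiName": "api-calls", "ingestedAtSeconds": 1643800000,
     "dimensions": {"region": "us-west-1"}},   # in range, bad region
    {"meterApiName": "api-calls", "ingestedAtSeconds": 1643800000,
     "dimensions": {"region": "us-east-1"}},   # in range, healthy region
    {"meterApiName": "api-calls", "ingestedAtSeconds": 1645000000,
     "dimensions": {"region": "us-west-1"}},   # after the time range
]
matched = [rule_matches(rule, event) for event in events]
print(matched)  # [True, False, False]
```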

📘

Notice

A few things to keep in mind regarding the filtering tool:

Time limitations

Events can only be canceled within one year of ingestion.

Impact on Payments

While canceling usage and replacing it with “corrected” usage is possible and easy, canceling events cannot affect invoices that are already locked (past their grace period).

These locked invoices should be considered to have already been paid. An account can issue a refund or give a future discount in such cases instead.


Canceling unintentional or unsuccessful events

You can use the filtering mechanism described above to filter out specific events by their unique-id: add the unique-id of the event you want to cancel to the ‘dimensionValuesMap’ property. This cancels events accurately, provided that you have the unique-id of the event and that you did not reuse the same unique-id for another event. If you do not have the exact unique-id, but at the time of cancellation you do have the “resource related properties” of the event, you can use the alternative approach described below.


First, let’s define “resource related properties”. These related properties depend on the meter type as described below:

| Meter Type | Resource related values |
|---|---|
| Sum | customer-id |
| Max | customer-id |
| Event Duration | customer-id, unique-id-dimension |
| Running Total | customer-id, unique-id-dimension |
| Unique Count | customer-id, unique-count-dimension |

As you can see, the “resource related properties” allow you to identify the resource which was utilized to serve the customer.
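
The table can be read as a lookup from meter type to the fields that identify a resource. The sketch below illustrates that reading only. The names `NEEDS_ID_DIMENSION` and `resource_key` are hypothetical, and the example assumes ‘cpu-used’ is an Event Duration meter whose unique-id dimension is ‘cluster’, which this section does not state.

```python
# Whether a meter type needs an extra identifying dimension besides the customer id,
# following the table above (the dictionary itself is a hypothetical helper).
NEEDS_ID_DIMENSION = {
    "sum": False,
    "max": False,
    "event_duration": True,
    "running_total": True,
    "unique_count": True,
}


def resource_key(meter_type, event, id_dimension=None):
    """Return the tuple of values that identifies the resource behind `event`."""
    if meter_type not in NEEDS_ID_DIMENSION:
        raise ValueError(f"unknown meter type: {meter_type}")
    key = (event["customerId"],)
    if NEEDS_ID_DIMENSION[meter_type]:
        if id_dimension is None:
            raise ValueError(f"{meter_type} meters need an identifying dimension")
        key += (event["dimensions"][id_dimension],)
    return key


event = {
    "customerId": "Smart-ML-123",
    "meterApiName": "cpu-used",
    "dimensions": {"cluster": "X", "tenant_type": "Tech"},
}
print(resource_key("sum", event))                        # ('Smart-ML-123',)
print(resource_key("event_duration", event, "cluster"))  # ('Smart-ML-123', 'X')
```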


Now, to cancel the events for a given resource, send an ingest event that includes the relevant “resource related properties” and the dimension “aflo.cancel_previous_resource_event” with “true” as its value. In the ‘Rabbit-IO’ example above, ‘Rabbit-IO’ needs to send the following ingest event in order to cancel the usage:

https://app.amberflo.io/ingest :

{
    "customerId": "Smart-ML-123", // the actual customer id of Smart-ML
    "meterApiName": "cpu-used",
    "meterValue": 0, // the value of the cancellation event doesn’t matter
    "meterTimeInMillis": 1646321760000, // 3/3/2022 at 3:36am, in epoch milliseconds
    "dimensions": {
        "cluster": "X",
        "aflo.cancel_previous_resource_event": "true"
    }
}
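
A cancellation event like the one above can be derived from the original event. The following is a minimal sketch under stated assumptions: `build_cancel_event` is a hypothetical helper, timestamps are epoch milliseconds (as the `meterTimeInMillis` field name suggests), and ‘cluster’ is taken to be the dimension that identifies the resource. The check that the cancellation time is not earlier than the original event is also an assumption; the 9-hour upper bound comes from the limitations described in this document.

```python
NINE_HOURS_MS = 9 * 60 * 60 * 1000
CANCEL_FLAG = "aflo.cancel_previous_resource_event"


def build_cancel_event(original, resource_dimensions, cancel_time_ms=None):
    """Build a cancellation event for the resource used by `original`.

    Only the dimensions named in `resource_dimensions` are copied, so unrelated
    dimensions of the original event are left out.
    """
    start = original["meterTimeInMillis"]
    # Default to the original timestamp, which keeps the two events as close as possible.
    when = start if cancel_time_ms is None else cancel_time_ms
    if not 0 <= when - start <= NINE_HOURS_MS:
        raise ValueError("cancelation timestamp must fall within 9 hours after the original event")
    dimensions = {name: original["dimensions"][name] for name in resource_dimensions}
    dimensions[CANCEL_FLAG] = "true"
    return {
        "customerId": original["customerId"],
        "meterApiName": original["meterApiName"],
        "meterValue": 0,  # the value of a cancellation event does not matter
        "meterTimeInMillis": when,
        "dimensions": dimensions,
    }


original = {
    "customerId": "Smart-ML-123",
    "meterApiName": "cpu-used",
    "meterValue": 500,
    "meterTimeInMillis": 1646320860000,
    "dimensions": {"cluster": "X", "tenant_type": "Tech"},
}
cancel = build_cancel_event(original, ["cluster"], cancel_time_ms=1646321760000)
print(cancel)
```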

📘

Limitations

A few things to keep in mind regarding this way of canceling events:

9-hour time limit

From the moment a resource starts being used, you have 9 hours to cancel the usage event.

Cancelation event timestamp

Also make sure the timestamp you assign to the cancelation event is no later than 9 hours after the original event. In fact, try to set the timestamp of the cancelation event as close to the original event as possible, or even use the exact timestamp of the event you want to cancel.

Why these limitations?

Notice that a cancelation event carries only the “resource related properties” and nothing else that identifies the event to cancel. Amberflo assumes that the event you want to cancel is very recent (within the last 9 hours) and cancels the most recent event of that specific resource, if such an event exists. For example, if the same resource was used twice in the last 9 hours, Amberflo always cancels the later of the two events. This is also why the value of the cancellation event doesn’t matter: the cancellation event simply instructs Amberflo to drop the most recent event, regardless of its value.
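
The behavior described above can be pictured with a small local simulation. This is a mental model under stated assumptions and not Amberflo's implementation: `apply_cancellations` is a hypothetical function, timestamps are epoch milliseconds, a resource is identified by customer, meter, and one dimension (here ‘cluster’), and the 9-hour window is measured between the cancellation timestamp and the candidate event's timestamp.

```python
NINE_HOURS_MS = 9 * 60 * 60 * 1000
CANCEL_FLAG = "aflo.cancel_previous_resource_event"


def apply_cancellations(events, resource_dimension):
    """Return the events that remain after each cancellation drops the most
    recent earlier event of the same resource within the last 9 hours."""
    kept = []
    for event in sorted(events, key=lambda e: e["meterTimeInMillis"]):
        dims = event["dimensions"]
        if dims.get(CANCEL_FLAG) != "true":
            kept.append(event)
            continue
        resource = (event["customerId"], event["meterApiName"], dims.get(resource_dimension))
        # Walk back from the newest kept event to find the one to cancel.
        for i in range(len(kept) - 1, -1, -1):
            candidate = kept[i]
            if event["meterTimeInMillis"] - candidate["meterTimeInMillis"] > NINE_HOURS_MS:
                break  # everything older is outside the window
            key = (candidate["customerId"], candidate["meterApiName"],
                   candidate["dimensions"].get(resource_dimension))
            if key == resource:
                del kept[i]
                break
    return kept


def usage(time_ms, value, **extra):
    """Build a small example event for cluster X (hypothetical helper)."""
    return {"customerId": "Smart-ML-123", "meterApiName": "cpu-used", "meterValue": value,
            "meterTimeInMillis": time_ms, "dimensions": {"cluster": "X", **extra}}


t0 = 1646320860000
events = [
    usage(t0, 500),                                       # first use of the resource
    usage(t0 + 3_600_000, 200),                           # second use, one hour later
    usage(t0 + 7_200_000, 0, **{CANCEL_FLAG: "true"}),    # cancellation, two hours later
]
remaining = apply_cancellations(events, "cluster")
print([e["meterValue"] for e in remaining])  # [500]
```

In this model the later of the two usage events is the one that gets dropped, regardless of the value carried by the cancellation event.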


Canceling unintentional or unsuccessful usage (advanced)

The "aflo.cancel_previous_resource_event" dynamic cancellation described above cancels events. These events indicate: start using a resource, stop using a resource, or an update to the usage rate.

For 'Max' or 'Total Duration' meters, Amberflo allows you to send a secondary flag, "aflo.ignore_cancellation_if_no_usage". This additional flag is meaningful only when it is sent together with "aflo.cancel_previous_resource_event". When set to 'true', it tells Amberflo to ignore the dynamic cancelation of "stop events".

To see what this secondary flag means, let's look at an example.
Let's assume that we have a total-duration meter and that Amberflo ingested the following 3 events:

// Start usage
{
    "customerId": "Smart-ML-123", // the actual customer id of Smart-ML
    "meterApiName": "cpu-used",
    "meterValue": 10,
    "meterTimeInMillis": 1646321760000, // 3/3/2022 at 3:36am, in epoch milliseconds
    "dimensions": {
        "cluster": "X"
    }
},
// Stop usage
{
    "customerId": "Smart-ML-123", // the actual customer id of Smart-ML
    "meterApiName": "cpu-used",
    "meterValue": 0,
    "meterTimeInMillis": 1646321820000, // 3/3/2022 at 3:37am
    "dimensions": {
        "cluster": "X"
    }
},
// Cancelation
{
    "customerId": "Smart-ML-123", // the actual customer id of Smart-ML
    "meterApiName": "cpu-used",
    "meterValue": 0,
    "meterTimeInMillis": 1646321880000, // 3/3/2022 at 3:38am
    "dimensions": {
        "cluster": "X",
        "aflo.cancel_previous_resource_event": "true"
    }
}

As mentioned, a cancelation cancels the latest event regardless of what it indicates. So in the example above, after the dynamic cancellation is applied, we are left with only the start event:

// Start usage
{
    "customerId": "Smart-ML-123", // the actual customer id of Smart-ML
    "meterApiName": "cpu-used",
    "meterValue": 10,
    "meterTimeInMillis": 1646321760000, // 3/3/2022 at 3:36am, in epoch milliseconds
    "dimensions": {
        "cluster": "X"
    }
}

Now, if we want the dynamic cancellation to cancel only events that represent actual usage (as opposed to cancelling events of any type), we need to add the "aflo.ignore_cancellation_if_no_usage" dimension, with "true" as its value, to the 3rd event. The sequence of events sent to Amberflo then looks like this:

// Start usage
{
    "customerId": "Smart-ML-123", // the actual customer id of Smart-ML
    "meterApiName": "cpu-used",
    "meterValue": 10,
    "meterTimeInMillis": 1646321760000, // 3/3/2022 at 3:36am, in epoch milliseconds
    "dimensions": {
        "cluster": "X"
    }
},
// Stop usage
{
    "customerId": "Smart-ML-123", // the actual customer id of Smart-ML
    "meterApiName": "cpu-used",
    "meterValue": 0,
    "meterTimeInMillis": 1646321820000, // 3/3/2022 at 3:37am
    "dimensions": {
        "cluster": "X"
    }
},
// Cancelation
{
    "customerId": "Smart-ML-123", // the actual customer id of Smart-ML
    "meterApiName": "cpu-used",
    "meterValue": 0,
    "meterTimeInMillis": 1646321880000, // 3/3/2022 at 3:38am
    "dimensions": {
        "cluster": "X",
        "aflo.cancel_previous_resource_event": "true",
        "aflo.ignore_cancellation_if_no_usage": "true"  // <== NOTICE: Additional Secondary Flag
    }
}

In this case, the events that remain after processing are:

// Start usage
{
    "customerId": "Smart-ML-123", // the actual customer id of Smart-ML
    "meterApiName": "cpu-used",
    "meterValue": 10,
    "meterTimeInMillis": 1646321760000, // 3/3/2022 at 3:36am, in epoch milliseconds
    "dimensions": {
        "cluster": "X"
    }
},
// Stop usage
{
    "customerId": "Smart-ML-123", // the actual customer id of Smart-ML
    "meterApiName": "cpu-used",
    "meterValue": 0,
    "meterTimeInMillis": 1646321820000, // 3/3/2022 at 3:37am
    "dimensions": {
        "cluster": "X"
    }
}

This is because there was no usage to cancel for the 3rd event.
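
The difference between the two cancellation variants can be sketched as a small decision function. This is an illustrative model and not Amberflo's implementation: `apply_cancellation` is hypothetical, the history holds the prior events of a single resource in time order, a "stop event" is assumed to be one whose value is 0 (as in the example above), and the 9-hour window is left out for brevity.

```python
CANCEL_FLAG = "aflo.cancel_previous_resource_event"
IGNORE_FLAG = "aflo.ignore_cancellation_if_no_usage"


def apply_cancellation(history, cancel):
    """Return the events of one resource that remain after `cancel` is processed.

    `history` lists the earlier events of that resource, oldest first.
    """
    dims = cancel["dimensions"]
    if dims.get(CANCEL_FLAG) != "true" or not history:
        return list(history)  # not a cancellation, or nothing to cancel
    latest = history[-1]
    # Assumption: a value of 0 marks a stop event, meaning there is no usage to cancel.
    if dims.get(IGNORE_FLAG) == "true" and latest["meterValue"] == 0:
        return list(history)
    return list(history[:-1])  # drop the most recent event


start = {"meterValue": 10, "meterTimeInMillis": 1646321760000, "dimensions": {"cluster": "X"}}
stop = {"meterValue": 0, "meterTimeInMillis": 1646321820000, "dimensions": {"cluster": "X"}}

plain_cancel = {"meterValue": 0, "meterTimeInMillis": 1646321880000,
                "dimensions": {"cluster": "X", CANCEL_FLAG: "true"}}
guarded_cancel = {"meterValue": 0, "meterTimeInMillis": 1646321880000,
                  "dimensions": {"cluster": "X", CANCEL_FLAG: "true", IGNORE_FLAG: "true"}}

after_plain = apply_cancellation([start, stop], plain_cancel)
after_guarded = apply_cancellation([start, stop], guarded_cancel)
print([e["meterValue"] for e in after_plain])    # [10]
print([e["meterValue"] for e in after_guarded])  # [10, 0]
```

Under these assumptions, the guarded cancellation still removes a start event when that is the latest event of the resource, since in that case there is usage to cancel.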

📘

NOTICE

"aflo.ignore_cancellation_if_no_usage" has a meaning only in the context of "Max" or "Total Duration" meters.