Probes (API)
Generic Probe REST API
The information below describes as much of the API as is necessary for someone already familiar with Apstra API conventions to understand how to use IBA. Formal API documentation is reserved for the API documentation itself.
We will walk through the API as it's used for the example workflow described in the introduction, demonstrating its general capability by specific example.
Create Probe
To create a probe, the operator POSTs to /api/blueprints/<blueprint_id>/probes with the following form:
{ "label": "server_tx_bytes", "description": "Server traffic imbalance", "tags": ["server", "imbalance"], "disabled": false, "processors": [ { "name": "server_tx_bytes", "outputs": { "out": "server_tx_bytes_output" }, "properties": { "counter_type": "tx_bytes", "graph_query": "node('system', name='sys').out('hosted_interfaces').node('interface', name='intf').out('link').node('link', link_type='ethernet', speed=not_none()).in_('link').node('interface', name='dst_intf').in_('hosted_interfaces').node('system', name='dst_node', role='server').ensure_different('intf', 'dst_intf')", "interface": "intf.if_name", "system_id": "sys.system_id" }, "type": "if_counter" }, { "inputs": { "in": "server_tx_bytes_output" }, "name": "std", "outputs": { "out": "std_dev_output" }, "properties": { "ddof": 0, "group_by": [] }, "type": "std_dev" }, { "inputs": { "in": "std_dev_output" }, "name": "server_imbalance", "outputs": { "out": "std_dev_output_in_range" }, "properties": { "range": { "max": 100 } }, "type": "range_check" }, { "inputs": { "in": "std_dev_output_in_range" }, "name": "server_imbalance_anomaly", "outputs": { "out": "server_traffic_imbalanced" }, "type": "anomaly" } ], "stages": [ { "name": "server_tx_bytes_output", "description": "Collect server tx_bytes", "tags": ["traffic counter"], "units": "Bps" } ] }
As seen above, the endpoint is given an input of probe metadata, a processor instance list, and output stage list.
Probe metadata is composed of the following fields:
Field | Description
label | Human-readable probe label; required.
description | Optional description of the probe.
tags | Optional list of strings with the probe tags.
disabled | Optional boolean that tells whether the probe should be disabled. Disabled probes don't provide any data and don't consume any resources. The probe is not disabled by default.
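As a minimal sketch of the metadata rules in the table above (label required, everything else optional, probe enabled by default), the helper below assembles just the metadata part of a payload. The function name `probe_metadata` is hypothetical and not part of any Apstra client library.

```python
def probe_metadata(label, description=None, tags=None, disabled=False):
    """Build the metadata fields of a probe payload (illustrative helper)."""
    if not label:
        # "label" is the only required metadata field.
        raise ValueError("label is required")
    metadata = {"label": label, "disabled": disabled}
    # Optional fields are only included when the caller supplies them.
    if description is not None:
        metadata["description"] = description
    if tags is not None:
        metadata["tags"] = list(tags)
    return metadata


meta = probe_metadata(
    "server_tx_bytes",
    description="Server traffic imbalance",
    tags=["server", "imbalance"],
)
```

The "processors" and "stages" lists would then be added to this dictionary before it is sent.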
Each processor instance contains an instance name (defined by the user), a processor type (a selection from a catalog defined by the platform and the reference design), and inputs and/or outputs. All additional fields in each processor are specific to that type of processor, are specified in the properties sub-field, and can be discovered through the introspection API at /api/blueprints/<blueprint_id>/telemetry/processors; we will go over this API later.
Following our working example, we now go through each entry in the processor list above.
In the first entry, we have a processor instance of type if_counter that we name server_tx_bytes. It takes a graph query in its graph_query property, along with two other fields named interface and system_id. These three fields together indicate that we want to collect a counter (more precisely, its first time-derivative) for every server-facing port in the system. For every match of the query specified by graph_query, we extract a system ID by taking the system_id field of the sys node in the resulting path (as specified in the system_id processor field) and an interface name by taking the if_name field of the intf node in the resulting path (as specified in the interface processor field). The combination of system ID and interface identifies an interface in the network, and its tx_bytes counter (as specified by counter_type) is put into the output of this processor. The output of this processor is of type "Number Set" (NS); stage types are discussed exhaustively later. This processor has no inputs, so we do not supply an inputs field. It has one output, labeled out (as defined by the if_counter processor type); we map that output to a stage labeled server_tx_bytes_output.
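The way the interface and system_id expressions pick fields out of each query match can be illustrated with a small sketch. The shape of `matches` (a dictionary of named nodes per match) and the helper names are assumptions made for illustration; the platform evaluates these expressions internally and its actual data structures are not shown in this document.

```python
def resolve(expression, match):
    """Resolve a 'node_name.field' expression against one query match."""
    node_name, field = expression.split(".", 1)
    return match[node_name][field]


def interface_keys(matches, properties):
    """Return one (system_id, interface) pair per query match."""
    return [
        (
            resolve(properties["system_id"], match),
            resolve(properties["interface"], match),
        )
        for match in matches
    ]


# Processor properties as they appear in the probe above.
properties = {"interface": "intf.if_name", "system_id": "sys.system_id"}

# Hypothetical matches: each one names the nodes bound by name='...' in the query.
matches = [
    {"sys": {"system_id": "leaf1"}, "intf": {"if_name": "swp1"}},
    {"sys": {"system_id": "leaf1"}, "intf": {"if_name": "swp2"}},
]

keys = interface_keys(matches, properties)
```

Each resulting pair identifies one interface whose tx_bytes counter would end up as one item in the output Number Set.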
The second processor is of type std_dev and takes as input the stage we created before, server_tx_bytes_output. See the processor-specific documentation for the meaning of the ddof field and for the full meaning of the group_by field. It suffices to say for now that in this case group_by tells us to construct a single output "Number" (N) from the input NS; that is, this processor outputs a single number: the standard deviation taken across all of the input numbers. This output is named std_dev_output.
The third processor is of type range_check and takes as input std_dev_output. It checks whether the input is out of the expected range specified by range; in this case, whether the input is ever greater than 100 (we have chosen this arbitrary value to indicate when the server-directed traffic is unbalanced). This processor has a single output that we choose to label std_dev_output_in_range. This output (as defined by the range_check processor type) is of type DS (Discrete State) and can take the values true or false, indicating whether or not the value is out of the range.
Our final processor is of type anomaly and takes as input std_dev_output_in_range. It raises an Apstra anomaly when the input is in the true state. This processor has a single output that we choose to label server_traffic_imbalanced. This output (as defined by the anomaly processor type) is of type DS (Discrete State) and can take the values true or false, indicating whether or not an anomaly is raised. We do not do any further processing with this anomaly state data in this example, but nothing prevents doing so in general.
Finally, we have a stages field. This is a list covering a subset of the output stages, with each stage identified by the name field, which refers to the stage label. This list is meant to add metadata to each output stage that cannot be inferred from the DAG itself. Currently supported fields are:
Field | Description
description | String with a stage description.
tags | List of strings that make up a set of tags for the stage.
units | String that describes the units of the stage data.
All these fields are optional.
This stage metadata is returned when fetching data from that stage via the REST API and used by the GUI in visualization.
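Because stages exist only by being named in processor inputs and outputs, a payload can be sanity-checked locally before it is sent. The sketch below is a hypothetical client-side check, not something the API provides: it collects every stage label produced by a processor and reports inputs or "stages" entries that refer to a label nobody produces. It assumes inputs map directly to stage labels, as in the example above.

```python
def stage_labels(probe):
    """Collect the labels of all stages produced by the probe's processors."""
    labels = set()
    for processor in probe.get("processors", []):
        labels.update(processor.get("outputs", {}).values())
    return labels


def check_wiring(probe):
    """Return a list of human-readable problems; empty means consistent."""
    labels = stage_labels(probe)
    problems = []
    for processor in probe.get("processors", []):
        for input_name, stage in processor.get("inputs", {}).items():
            if stage not in labels:
                problems.append(
                    f"{processor['name']}.{input_name} reads unknown stage {stage!r}"
                )
    for stage in probe.get("stages", []):
        if stage["name"] not in labels:
            problems.append(f"stage metadata refers to unknown stage {stage['name']!r}")
    return problems


# Trimmed version of the probe above (properties omitted for brevity).
probe = {
    "label": "server_tx_bytes",
    "processors": [
        {"name": "server_tx_bytes", "type": "if_counter",
         "outputs": {"out": "server_tx_bytes_output"}},
        {"name": "std", "type": "std_dev",
         "inputs": {"in": "server_tx_bytes_output"},
         "outputs": {"out": "std_dev_output"}},
    ],
    "stages": [{"name": "server_tx_bytes_output", "units": "Bps"}],
}
problems = check_wiring(probe)
```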
HTTP POST can be sent to /api/blueprints/<blueprint_id>/probes. Here, we POST the probe configuration, as exemplified in the "POST for Probe Creation" figure, to create a new probe. POSTing to this endpoint returns a UUID, like most other creation endpoints in Apstra, which can be used for further operations.
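A minimal sketch of preparing this POST with the Python standard library follows. The base URL, the token value, and the use of an "AuthToken" header are assumptions made for illustration; consult the formal API documentation for how your deployment authenticates requests. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_create_probe_request(base_url, blueprint_id, probe, token):
    """Build (but do not send) the POST request that creates a probe."""
    url = f"{base_url}/api/blueprints/{blueprint_id}/probes"
    return urllib.request.Request(
        url,
        data=json.dumps(probe).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "AuthToken": token,  # assumed header name, see lead-in
        },
        method="POST",
    )


request = build_create_probe_request(
    "https://apstra.example.com",  # hypothetical controller address
    "my-blueprint-id",             # hypothetical blueprint ID
    {"label": "server_tx_bytes", "processors": []},
    "secret-token",
)
# Sending would be: urllib.request.urlopen(request)
```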
Changed in version 2.3: To get a predictable probe ID instead of the UUID described above, specify it by adding an "id" property to the request body.
{ "id": "my_tx_bytes_probe", "label": "server_tx_bytes", "processors": [], "rest_of_the": "request_body" }
Changed in version 2.3: Previously, stage definitions were inlined into processor definitions like this:
{ "label": "test probe", "processors": [ { "name": "testproc", "outputs": {"out": "test_stage"}, "stages": [{"name": "out", "units": "pps"}] } ] }
This no longer works; the stage name should refer to the stage label instead of the internal stage name. So the example above should look like this:
{ "stages": [{"name": "test_stage", "units": "pps"}] }
Additional note: we recommend not inlining stage definitions into processor definitions; place them in a stand-alone element instead, as in the POST example above.
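For payloads written in the pre-2.3 style, the renaming described above (internal output name to stage label) can be sketched as a small conversion. The function `hoist_inline_stages` is a hypothetical helper, not part of the product; it moves each inline stage definition into the stand-alone list and rewrites its name using the processor's outputs mapping.

```python
import copy


def hoist_inline_stages(probe):
    """Return a copy of `probe` with inline stage definitions moved to 'stages'."""
    probe = copy.deepcopy(probe)
    stages = list(probe.get("stages", []))
    for processor in probe.get("processors", []):
        outputs = processor.get("outputs", {})
        for inline in processor.pop("stages", []):
            hoisted = dict(inline)
            # Inline definitions used the internal output name ("out");
            # the stand-alone list uses the stage label ("test_stage").
            hoisted["name"] = outputs[inline["name"]]
            stages.append(hoisted)
    if stages:
        probe["stages"] = stages
    return probe


old_style = {
    "label": "test probe",
    "processors": [
        {
            "name": "testproc",
            "outputs": {"out": "test_stage"},
            "stages": [{"name": "out", "units": "pps"}],
        }
    ],
}
new_style = hoist_inline_stages(old_style)
```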
HTTP DELETE can be sent to /api/blueprints/<blueprint_id>/probes/<probe_id> to delete the probe specified by its probe_id.
HTTP GET can be sent to /api/blueprints/<blueprint_id>/probes/<probe_id> to retrieve the configuration of the probe as it was POSTed. The response contains more fields than were specified at probe creation:
Field | Description
id | ID of the probe (a UUID if it was not specified at creation time).
state | Actual state of the probe; possible values are "created" for a probe being configured, "operational" for a successfully configured probe, and "error" if probe configuration has failed.
last_error | Detailed error description of the most recent error for probes in the "error" state. It has the following sub-fields:
The complete list of probe messages can be obtained by issuing an HTTP GET request to /api/blueprints/<blueprint_id>/probes/<probe_id>/messages. Messages are sorted by the 'timestamp' field, oldest first.
Additionally, HTTP GET can be sent to /api/blueprints/<blueprint_id>/probes to retrieve all the probes for blueprint <blueprint_id>.
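Since a newly created probe starts in the "created" state and later becomes "operational" or "error", a client may want to poll the single-probe GET endpoint until the state settles. The sketch below is one possible approach under stated assumptions: `fetch_probe` is a caller-supplied function (hypothetical) that performs the GET and returns the decoded JSON, and the timeout and interval values are arbitrary.

```python
import time


def wait_for_probe(fetch_probe, timeout=60.0, interval=2.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the probe is operational; raise on error or timeout."""
    deadline = clock() + timeout
    while True:
        probe = fetch_probe()
        state = probe["state"]
        if state == "operational":
            return probe
        if state == "error":
            # last_error carries the details of the most recent failure.
            raise RuntimeError(f"probe configuration failed: {probe.get('last_error')}")
        if clock() >= deadline:
            raise TimeoutError(f"probe still in state {state!r} after {timeout}s")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays.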
HTTP PATCH and PUT methods for probes are available since Apstra version 2.3.
HTTP PATCH can be sent to /api/blueprints/<blueprint_id>/probes/<probe_id> to update the probe metadata or to disable or enable the probe.
{ "label": "new server_tx_bytes", "description": "some better probe description", "tags": ["production"], "stages": [ { "name": "server_tx_bytes_output", "description": "updated stage description", "tags": ["server traffic"], "units": "bps" } ] }
This example updates probe metadata for the probe that was created with the POST request listed above. All fields here are optional; values that are not specified remain unchanged.
Every stage instance is also optional: only the specified stages are updated, and unspecified stages remain unchanged.
The tags collection is replaced entirely: if it was tags: ["a", "b"] and the PATCH payload specifies tags: ["c"], then the resulting collection is tags: ["c"] (NOT tags: ["a", "b", "c"]).
With PATCH, it is not possible to change a probe's set of processors and stages. See the PUT description below, which allows that.
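The PATCH rules described above (unspecified fields unchanged, tags replaced rather than merged, stages updated by name) can be modeled locally with a short sketch. This is an illustration of the documented semantics, not the server's implementation; in particular, appending metadata for a stage that has no entry yet, and replacing stage-level tags wholesale, are assumptions.

```python
import copy


def apply_probe_patch(probe, patch):
    """Return a copy of `probe` with `patch` applied as the text describes."""
    updated = copy.deepcopy(probe)
    for field in ("label", "description", "disabled", "tags"):
        if field in patch:
            # Specified fields overwrite the old value; tags are replaced
            # wholesale, not merged with the previous list.
            updated[field] = copy.deepcopy(patch[field])
    stages = {stage["name"]: stage for stage in updated.setdefault("stages", [])}
    for stage_patch in patch.get("stages", []):
        name = stage_patch["name"]
        if name in stages:
            stages[name].update(copy.deepcopy(stage_patch))
        else:
            # Assumption: metadata for a stage without an entry is added.
            updated["stages"].append(copy.deepcopy(stage_patch))
    return updated


before = {"label": "server_tx_bytes", "tags": ["a", "b"], "stages": []}
after = apply_probe_patch(before, {"tags": ["c"]})
# after["tags"] is ["c"]; the label is unchanged.
```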
HTTP PUT can be sent to /api/blueprints/<blueprint_id>/probes/<probe_id> to replace a probe. This is very similar to POST, with the difference being that it replaces the old configuration for probe <probe_id> with the new one specified in the payload. The payload format for this request is the same as for POST, but id is not allowed.
Inspect Probe
Stages are implicitly created by being named in the input and output of various
processors. You can inspect the various stages of a probe. The API for reading a
particular stage is
/api/blueprints/<blueprint_id>/probes/<probe_id>/stages/<stage_name>
Each stage has a type. This is a function of the generating processor and the input stage(s) to that processor. The types are: Number (N); Number Time Series (NTS); Number Set (NS); Number Set Time Series (NSTS); Text (T); Text Time Series (TTS); Text Set (TS); Text Set Time Series (TSTS); Discrete State (DS); Discrete State Time Series (DSTS); Discrete State Set (DSS); Discrete State Set Time Series (DSSTS).
An NS is exactly that: a set of numbers.
Similarly, a DSS is a set of discrete-state variables. Part of the specification of a DSS (and DSSTS) stage is the possible values the discrete-state variable can take.
A TS is a set of strings.
An NSTS is a set of time series with numbers as values. For example, a member of this set would be: (time=0 seconds, value=3), (time=3 seconds, value=5), (time=6 seconds, value=23), and so on.
A DSSTS is the same as an NSTS except that the values are discrete-state.
A TSTS is the same as an NSTS except that the values are strings.
Number (N), Discrete State (DS), and Text (T) are simply Number Sets, Discrete State Sets, and Text Sets guaranteed to be of length one.
NTS, DSTS, and TTS are the same as above, but are time series instead of single values.
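The twelve stage types vary along three axes: the kind of value, whether the stage is a set, and whether it is a time series. A lookup table makes that structure explicit. This is a sketch for reading purposes only; the helper name is hypothetical, and accepting lowercase codes follows the "ns", "n", and "ds" values seen in API responses in this document, with the remaining lowercase spellings assumed by analogy.

```python
# (value kind, is a set, is a time series) for each stage type abbreviation.
STAGE_TYPES = {
    "N": ("number", False, False),
    "NTS": ("number", False, True),
    "NS": ("number", True, False),
    "NSTS": ("number", True, True),
    "T": ("text", False, False),
    "TTS": ("text", False, True),
    "TS": ("text", True, False),
    "TSTS": ("text", True, True),
    "DS": ("discrete state", False, False),
    "DSTS": ("discrete state", False, True),
    "DSS": ("discrete state", True, False),
    "DSSTS": ("discrete state", True, True),
}


def describe_stage_type(code):
    """Break a stage type code such as 'nsts' into its three properties."""
    kind, is_set, is_time_series = STAGE_TYPES[code.upper()]
    return {"kind": kind, "set": is_set, "time_series": is_time_series}


info = describe_stage_type("ns")
```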
Let's consider the first stage, "server_tx_bytes_output". This stage contains the tx_bytes
counter for every server-facing port in the system. We can get it from the URL
/api/blueprints/<blueprint_id>/probes/<probe_id>/stages/server_tx_bytes_output
The response we get would be of the same form as the following:
{ "properties": [ "interface", "system_id" ], "type": "ns", "units": "bytes_per_second", "values": [ { "properties": { "interface": "intf1", "system_id": "spine1" }, "value": 22 }, { "properties": { "interface": "intf2", "system_id": "spine1" }, "value": 23 }, { "properties": { "interface": "intf1", "system_id": "spine3" }, "value": 24 } ] }
As we know from our running example, the "server_tx_bytes_output" stage contains the tx_bytes value for every server-facing interface in the network. Looking at the above response, we can see that this stage is of type "ns", indicating NS or Number Set. As mentioned before, data in stages is associated with context: every element in the set of a stage is associated with a group of key-value pairs. Within a stage, the keys are the same for every piece of data (or, equivalently, every item in the set). These keys are listed in the "properties" field of the stage and are generally a function of the generating processor. Each item in "values" assigns a value to each of the properties of the stage and provides a value (the "Number" in the "Number Set"). The data in this stage means that tx_bytes on intf1 of spine1 is 22, on intf2 of spine1 is 23, and on intf1 of spine3 is 24 bytes per second.
Notice that "units" is set for this stage as specified in the running example.
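To make the relationship between "properties" and "values" concrete, the sketch below turns an NS stage response into a dictionary keyed by the context of each item. The helper name `index_number_set` is hypothetical; the input is the response body shown above, already decoded from JSON.

```python
def index_number_set(stage):
    """Map each item's context (in 'properties' order) to its value."""
    keys = stage["properties"]
    return {
        tuple(item["properties"][key] for key in keys): item["value"]
        for item in stage["values"]
    }


stage = {
    "properties": ["interface", "system_id"],
    "type": "ns",
    "units": "bytes_per_second",
    "values": [
        {"properties": {"interface": "intf1", "system_id": "spine1"}, "value": 22},
        {"properties": {"interface": "intf2", "system_id": "spine1"}, "value": 23},
        {"properties": {"interface": "intf1", "system_id": "spine3"}, "value": 24},
    ],
}
by_context = index_number_set(stage)
# by_context[("intf1", "spine1")] is 22
```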
To query the second stage in our probe, send an HTTP GET to the endpoint /api/blueprints/<blueprint_id>/probes/<probe_id>/stages/std_dev_output.
{ "type": "n", "units": "", "value": 1 }
This stage is a number. It has no context, only a single value. In our example, this is the standard deviation across all the values of the previous stage.
The final stage in our probe can be queried at the endpoint /api/blueprints/<blueprint_id>/probes/<probe_id>/stages/server_traffic_imbalanced.
{ "possible_values": [ "true", "false" ], "type": "ds", "units": "", "value": false }
As shown, this stage indicates whether server traffic is imbalanced ("true") or not ("false") by indicating whether the standard deviation of tx_bytes across all server-facing ports is greater than 100. Note that the "possible_values" field describes all values that the discrete-state "value" can take.
All processors of a probe can also be queried via /api/blueprints/<blueprint_id>/probes/<probe_id>/processors/<processor_name>. By doing such a query, you can discover the configuration used to create that processor.
Query Probe Anomalies
The final processor of our example probe raises an Apstra anomaly (and sets its output to "true") when the standard deviation of tx_bytes across server-facing interfaces is greater than 100.
You can query probe anomalies via the standard anomaly API at /api/blueprints/<blueprint_id>/anomalies?type=probe.
Following is the JSON form of an anomaly that would be raised by our example probe (with ellipses for data we don't care about for this example):
{ "actual": { "value_int": 101 }, "anomaly_type": "probe", "expected": { "value_int": 100 }, "id": "...", "identity": { "anomaly_type": "probe", "probe_id": "efb2bf7f-d8cc-4a55-8e9b-9381e4dba61f", "properties": {}, "stage_id": "server_traffic_imbalanced" }, "last_modified_at": "...", "severity": "critical" }
As seen in the above example, the identity contains the probe_id and the name of the stage on which the anomaly was raised and which requires further inspection by the operator. If the stage were of a set-based type, the "properties" field of the anomaly would be filled with the properties of the specific item in the set that caused the anomaly. This brings up the important point that multiple anomalies can be raised on a single stage, as long as each is on a different item in the set. In our example, since the stage in question is of type DS, which is not set-based, the "properties" field is empty.
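Since the identity names both the probe and the stage, an operator's tooling can go straight from an anomaly to the stage that needs inspection. The sketch below builds that stage path from an anomaly record using the stage endpoint pattern described earlier; both helper names are hypothetical.

```python
def stage_path_for_anomaly(blueprint_id, anomaly):
    """Build the stage endpoint path for the stage that raised the anomaly."""
    identity = anomaly["identity"]
    return "/api/blueprints/{}/probes/{}/stages/{}".format(
        blueprint_id, identity["probe_id"], identity["stage_id"]
    )


def anomalous_item(anomaly):
    """Return the set item's properties, or None when the stage is not a set."""
    return anomaly["identity"].get("properties") or None


anomaly = {
    "anomaly_type": "probe",
    "identity": {
        "anomaly_type": "probe",
        "probe_id": "efb2bf7f-d8cc-4a55-8e9b-9381e4dba61f",
        "properties": {},
        "stage_id": "server_traffic_imbalanced",
    },
    "severity": "critical",
}
path = stage_path_for_anomaly("my-blueprint-id", anomaly)  # hypothetical blueprint ID
```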
Introspect Processors
The set of processors available to the operator is a function of the platform and the reference design. Apstra provides an API for the operator to list all available processors, learn what parameters they take, and learn what inputs they require and outputs they yield.
The API in question is found at /api/blueprints/<blueprint_id>/telemetry/processors.
It yields a list of processor descriptions. In the following example, we show the description for the std_dev processor.
{ "description": "Standard Deviation Processor.\n\n Groups as described by group_by, then calculates std deviation and\n outputs one standard deviation for each group. Output is NS.\n Input is an NS or NSTS.\n ", "inputs": { "in": { "required": true, "types": [ { "keys": [], "possible_values": null, "type": "ns" }, { "keys": [], "possible_values": null, "type": "nsts" } ] } }, "outputs": { "out": { "required": true, "types": [ { "keys": [], "possible_values": null, "type": "ns" } ] } }, "label": "Standard Deviation", "name": "std_dev", "schema": { "additionalProperties": false, "properties": { "ddof": { "default": 0, "description": "Standard deviation correction value, is used to correct divisor (N - ddof) in calculations, e.g. ddof=0 - uncorrected sample standard deviation, ddof=1 - corrected sample standard deviation.", "title": "ddof", "type": "integer" }, "enable_streaming": { "default": false, "type": "boolean" }, "group_by": { "default": [ "system_id" ], "items": { "type": "string" }, "type": "array" } }, "type": "object" } }
As seen above, there is a string-based description and the name of the processor type (as supplied to the REST API in probe configuration). The set of parameters specific to a given processor type is described in the "schema".
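The two std_dev parameters in this schema can be illustrated with a short sketch: ddof corrects the divisor to (N - ddof), as the schema description states, and group_by decides which property values define a group. This is an illustrative reimplementation, not the processor's code; the item format mirrors the "values" entries of an NS stage response, and an empty group_by places every item in one group.

```python
import math
from collections import defaultdict


def std_dev(values, ddof=0):
    """Standard deviation with divisor (N - ddof); requires N > ddof."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - ddof))


def grouped_std_dev(items, group_by, ddof=0):
    """Compute one standard deviation per distinct group_by key."""
    groups = defaultdict(list)
    for item in items:
        key = tuple(item["properties"][name] for name in group_by)
        groups[key].append(item["value"])
    return {key: std_dev(vals, ddof) for key, vals in groups.items()}


items = [
    {"properties": {"interface": "intf1", "system_id": "spine1"}, "value": 22},
    {"properties": {"interface": "intf2", "system_id": "spine1"}, "value": 23},
    {"properties": {"interface": "intf1", "system_id": "spine3"}, "value": 24},
]
overall = grouped_std_dev(items, group_by=[])               # one group, key ()
per_system = grouped_std_dev(items, group_by=["system_id"])  # one group per system
```

With group_by set to an empty list the result has a single entry, which corresponds to the single Number produced in the running example.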
Special notice must be paid to "inputs" and "outputs". Unlike the parameters in the "schema" section, these are present on every type of processor. Each processor can take zero or more input stages and must output one or more stages. Optional stages have "required" set to false. The names of the stages (relative to a particular instance of a processor) are described in these fields. We can see that the "std_dev" processor takes a single input named "in" and produces a single output named "out". This is reflected in our usage of it in the previous example.
There's one special input name: *. For example:
"inputs": { "*": { "required": true, "types": [ { "keys": [], "possible_values": null, "type": "ns" }, { "keys": [], "possible_values": [], "type": "dss" }, { "keys": [], "possible_values": null, "type": "ts" } ] } }
It means the processor accepts one or more inputs of the specified types with arbitrary names.
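A client can use these descriptions to check, before creating a probe, whether a stage type is acceptable for a given processor input. The sketch below is one way to read the "inputs" section, including the "*" wildcard; the helper name is hypothetical, and falling back to "*" only when no input of the exact name is declared is an assumption.

```python
def accepted_input_types(description, input_name):
    """Return the set of stage type codes accepted for `input_name`."""
    inputs = description.get("inputs", {})
    # Prefer an exactly named input; otherwise use the wildcard, if present.
    spec = inputs.get(input_name) or inputs.get("*")
    if spec is None:
        return set()
    return {entry["type"] for entry in spec["types"]}


std_dev_description = {
    "name": "std_dev",
    "inputs": {
        "in": {
            "required": True,
            "types": [
                {"keys": [], "possible_values": None, "type": "ns"},
                {"keys": [], "possible_values": None, "type": "nsts"},
            ],
        }
    },
}
types_for_in = accepted_input_types(std_dev_description, "in")
```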
Changed in 3.0: Previously, the inputs and outputs sections did not specify whether specific inputs or outputs were required. The format was changed from the following syntax, which is now deprecated and invalid:
"inputs": { "in": [ { "data_type": "ns", "keys": [ "system_id" ], "value_map": null, "value_type": "int64" } ... ] }
Stream Data
Any processor instance in any probe can be configured to have its output stages streamed in the "perfmon" channel of Apstra streaming output. If the property "enable_streaming" is set to "true" in the configuration for any processor, its output stages will have all their data streamed.
For Non-Time-Series-based stages, each will generate a message whenever their value changes. For Time-Series based stages, each will generate a message whenever a new entry is made into the time-series. For Set-based stages, each item in the set will generate a message according to the two prior rules.
Each message that is generated has a value, a timestamp, and a set of key-value pairs. The value is self-explanatory. The timestamp is the time at which the value changed for Non Time-series-based stages and the timestamp of the new entry for Time-series based stages. The key-value pairs correspond to the "properties" field we observed earlier in the "values" section of stages, thus providing context.
Below is the format for messages from IBA, each of which is encapsulated in a PerfMon message (and that in turn in an AosMessage). The key-value pairs of context are put into the "property" repeated field (with "name" as the key and "value" as the value), while the value is put into the "value" field. "probe_id" and "stage_name" are as they appear. The blueprint_id is put into the "origin_name" of the encapsulating AosMessage. Similarly, the timestamp is put into the generic "timestamp" field.
message ProbeProperty { required string name = 5; required string value = 6; } message ProbeMessage { repeated ProbeProperty property = 1; oneof value { int64 int64_value = 2; float float_value = 3; string string_value = 4; } required string probe_id = 5; required string stage_name = 6; }