Paragon Insights Time Series Database (TSDB)

Paragon Insights (formerly HealthBot) collects large amounts of data through its various ingest methods, and all of that data is time sensitive in some context. For this reason, Paragon Insights uses a time series database (TSDB) to store and manage the information received from network devices. This topic provides an overview of the TSDB as well as some guidance on managing it.

Historical Context

In releases earlier than HealthBot Release 3.0.0, there was one TSDB instance regardless of whether you ran Paragon Insights as a single-node or as a multi-node (Docker Compose) installation. Figure 1 shows a high-level view of what this looked like.

Figure 1: Single TSDB Instance - Releases earlier than HealthBot Release 3.0.0

This arrangement left no room for scaling or redundancy of the TSDB. Without redundancy, there is no high availability (HA): a failure left you with no way to continue operation or restore missing data. Adding more Docker Compose nodes to this topology would only provide more Paragon Insights processing capability at the expense of TSDB performance.

TSDB Improvements

To address these issues and provide TSDB high availability (HA), three new TSDB elements are introduced in HealthBot Release 3.0.0, along with clusters of Paragon Insights nodes* for the other Paragon Insights microservices:

  • Database sharding – How many servers, or nodes, are available to store TSDB data and scale Paragon Insights?

  • Database replication – How many copies of your data do you want to keep?

  • Database reads and writes – How is data written and read back from the TSDB? What happens when something goes wrong?

Note:

*Paragon Insights uses Kubernetes for clustering its Docker-based microservices across multiple physical or virtual servers (nodes). Kubernetes clusters consist of a primary node and multiple worker nodes. During the healthbot setup portion of Paragon Insights multinode installations, the installer asks for the IP addresses (or hostnames) of the Kubernetes primary node and worker nodes. You can add as many worker nodes to your setup as you need, based on the required replication factor for the TSDB databases. The number of nodes you deploy should be at least equal to the replication factor. (See the following sections for details.)

For the purposes of this discussion, we refer to the Kubernetes worker nodes as Paragon Insights nodes. The primary node is not considered in this discussion.

Database Sharding

Database sharding refers to selectively storing data on certain nodes. This method of writing data to selected nodes distributes the data among available TSDB nodes and permits greater scaling since each TSDB instance then handles only a portion of the time series data from the devices.

To achieve sharding, Paragon Insights creates one database per device group/device pair and writes the resulting database to a specific (system-determined) instance of TSDB hosted on one (or more) of the Paragon Insights nodes.

For example, suppose we have two devices, D1 and D2, and two device groups, G1 and G2. If D1 resides in groups G1 and G2, and D2 resides only in group G2, then we end up with three databases: G1:D1, G2:D1, and G2:D2. Each database is stored on its own TSDB instance on a separate Paragon Insights node as shown in Figure 2 below. When a new device is onboarded and placed within a device group, Paragon Insights chooses a TSDB instance on which to store that device/device-group data.

Figure 2: Distributed TSDB

Figure 2, above, shows 3 Paragon Insights nodes, each with a TSDB instance and other Paragon Insights services running.
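The sharding example above can be sketched in a few lines of Python. This is an illustration only: the function names, the dictionary layout, and the node names are hypothetical, and the round-robin placement is an assumption standing in for the system-determined choice that Paragon Insights actually makes.

```python
# Hypothetical sketch of the sharding example; not the product's implementation.

def shard_databases(group_members):
    """Return one database name per (device group, device) pair."""
    return [
        f"{group}:{device}"
        for group, devices in sorted(group_members.items())
        for device in sorted(devices)
    ]


def place_round_robin(databases, tsdb_nodes):
    """Assumed placement policy: spread the databases across the TSDB nodes in turn."""
    return {db: tsdb_nodes[i % len(tsdb_nodes)] for i, db in enumerate(databases)}


# Device-group membership from the example: D1 is in G1 and G2, D2 is only in G2.
group_members = {"G1": ["D1"], "G2": ["D1", "D2"]}
databases = shard_databases(group_members)
placement = place_round_robin(databases, ["node1", "node2", "node3"])

print(databases)  # ['G1:D1', 'G2:D1', 'G2:D2']
print(placement)  # {'G1:D1': 'node1', 'G2:D1': 'node2', 'G2:D2': 'node3'}
```

With three databases and three nodes, each database lands on its own TSDB instance, which matches the arrangement in Figure 2.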

Note:
  • A maximum of 1 TSDB instance is allowed on any given Paragon Insights node. Therefore, a Paragon Insights node can have 0 or 1 TSDB instances at any time.

  • A Paragon Insights node can be dedicated to running only TSDB functions. When this is done, no other Paragon Insights functions run on that node. This prevents other Paragon Insights functions from starving the TSDB instance of resources.

  • We recommend that you dedicate nodes to TSDB to provide the best performance.

  • Paragon Insights and TSDB nodes can be added to a running system using the Paragon Insights CLI.

Database Replication

As with any other database system, replication refers to storing the data in multiple instances on multiple nodes. In Paragon Insights, we establish a replication factor to determine how many copies of the database are needed.

A replication factor of 1 creates only one copy of data and therefore provides no HA. When multiple Paragon Insights nodes are available and the replication factor is set to 1, only sharding is achieved.

The replication factor determines the minimum number of Paragon Insights nodes needed. A replication factor of 3 creates three copies of data, requires at least 3 Paragon Insights nodes, and provides HA. The higher the replication factor, the stronger the HA and the higher the resource requirements in terms of Paragon Insights nodes. If you want to scale your system further, you should add Paragon Insights nodes in exact multiples of the replication factor. With a replication factor of 3, for example, that means 3, 6, 9 nodes, and so on.
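The node-count rule above amounts to a simple arithmetic check. The sketch below is a hypothetical helper, not part of Paragon Insights, and it treats the "exact multiples" recommendation as a hard requirement purely for illustration.

```python
# Hypothetical helper; the name and the strict error handling are assumptions.

def complete_replica_sets(tsdb_nodes: int, replication_factor: int) -> int:
    """Return how many complete sets of replicas the given nodes can hold."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    if tsdb_nodes < replication_factor:
        raise ValueError("need at least as many nodes as the replication factor")
    if tsdb_nodes % replication_factor != 0:
        raise ValueError("node count should be an exact multiple of the replication factor")
    return tsdb_nodes // replication_factor


print(complete_replica_sets(3, 3))  # 1
print(complete_replica_sets(6, 3))  # 2
```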

Consider an example where, based on device/device-group pairing mentioned earlier, Paragon Insights has created 20 databases. The Paragon Insights system in question has a replication factor of 2 and has 4 nodes running TSDB. Based on this, two TSDB replication groups are created; in our example they are TSDB Group 1 and TSDB Group 2. In Figure 3 below, the data from databases 1-10 is being written to TSDB instances 1 and 2 in TSDB group 1. Data from databases 11-20 is written to TSDB instances 3 and 4 in TSDB group 2. The outline around the TSDB instances represents a TSDB replication group. The size of the replication group is determined by the replication factor.

Figure 3: TSDB Databases
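The grouping in this example can be reproduced with a short sketch. The function names are hypothetical, and splitting the databases into contiguous, equally sized blocks is an assumption chosen to match the figure. The real assignment is made by Paragon Insights.

```python
# Hypothetical sketch of the replication-group example; not the product's algorithm.

def build_groups(tsdb_instances, replication_factor):
    """Split the TSDB instances into replication groups of replication_factor members."""
    if len(tsdb_instances) % replication_factor != 0:
        raise ValueError("instance count must be a multiple of the replication factor")
    return [
        tsdb_instances[i:i + replication_factor]
        for i in range(0, len(tsdb_instances), replication_factor)
    ]


def assign_databases(databases, groups):
    """Assumed policy: give each group a contiguous block of databases."""
    per_group = -(-len(databases) // len(groups))  # ceiling division
    return {db: groups[i // per_group] for i, db in enumerate(databases)}


tsdb_instances = [1, 2, 3, 4]
databases = list(range(1, 21))  # databases 1 through 20
groups = build_groups(tsdb_instances, replication_factor=2)
assignment = assign_databases(databases, groups)

print(groups)          # [[1, 2], [3, 4]]
print(assignment[10])  # [1, 2]
print(assignment[11])  # [3, 4]
```

Every database maps to a whole group rather than to a single instance, because each write for that database goes to all members of its group.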

Database Reads and Writes

As shown in Figure 2, Paragon Insights can use a distributed messaging queue. When a given TSDB instance has performance problems or errors, the queue allows writes to the database to be performed sequentially, which ensures that all data is written in the proper time sequence.

All Paragon Insights microservices use standardized database query (read) and write functions that can be used even if the underlying database system is changed at some point in the future. This allows for flexibility in growth and future changes. Other read and write features of the database system include:

  • In normal operation, database writes are sent to all TSDB instances within a TSDB group.

  • Database writes can be buffered up to 1 GB per TSDB instance so that failed writes can be retried until successful.

  • If problems persist and the buffer fills up, the oldest data is dropped in favor of new data.

  • When buffering is active, database writes are performed sequentially so that new data cannot be written until the previous write attempts are successful.

  • Database queries (reads) are sent to the TSDB instance that has reported the fewest write errors in the last 5 minutes. If all instances are performing equally, the query is sent to a random TSDB instance in the required group.
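The buffering and read-selection behavior listed above can be sketched as follows. This is a minimal model under stated assumptions, not the Paragon Insights implementation: the class and function names are hypothetical, 1 GB is taken as 1024**3 bytes, and the caller is assumed to supply the per-instance write-error counts for the last 5 minutes.

```python
import random
from collections import deque

# Hypothetical model of the buffering and read-selection rules; names are assumptions.

class WriteBuffer:
    """Bounded FIFO buffer of pending writes for one TSDB instance."""

    def __init__(self, max_bytes=1024**3):  # assumed meaning of "1 GB"
        self.max_bytes = max_bytes
        self.pending = deque()
        self.size = 0

    def enqueue(self, payload: bytes):
        """Add a write; if the buffer overflows, drop the oldest data first."""
        self.pending.append(payload)
        self.size += len(payload)
        while self.size > self.max_bytes and self.pending:
            self.size -= len(self.pending.popleft())

    def flush(self, write):
        """Retry writes in order, stopping at the first failure so that
        newer data is never written ahead of older data."""
        while self.pending:
            if not write(self.pending[0]):
                return False
            self.size -= len(self.pending.popleft())
        return True


def pick_read_instance(recent_write_errors, rng=random):
    """Pick the instance with the fewest recent write errors; break ties at random."""
    fewest = min(recent_write_errors.values())
    candidates = sorted(n for n, e in recent_write_errors.items() if e == fewest)
    return rng.choice(candidates)


buffer = WriteBuffer(max_bytes=10)  # tiny limit to make the overflow visible
for point in (b"aaaa", b"bbbb", b"cccc"):
    buffer.enqueue(point)

print(list(buffer.pending))                          # [b'bbbb', b'cccc']
print(pick_read_instance({"tsdb1": 3, "tsdb2": 0}))  # tsdb2
```

A deque keeps both ends cheap: new writes are appended on the right, while retries and overflow drops both remove from the left, which preserves time order.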

Manage TSDB Settings in the Paragon Insights GUI

You can use the Paragon Insights GUI to configure the time series database (TSDB) settings.

To configure TSDB settings:

Warning:

Selecting, deleting, or dedicating TSDB nodes must be done during a maintenance window because some services will be restarted and the Paragon Insights GUI will likely be unresponsive.

  1. Select Settings > System.

    The System Settings page appears.

  2. Click Time Series Database.

    The TSDB Settings page appears.

  3. From the TSDB Settings page, you can:

    1. Select one or more nodes (from the TSDB Nodes list) to be used as TSDB nodes.

      (The TSDB Nodes list displays the available nodes in the Paragon Insights installation that you can select as TSDB nodes. By default, Paragon Insights automatically selects one node as a TSDB node.)

    2. Set the replication factor by typing a value (or by using the arrows to specify a value) in the Replication Factor text box.

      (The replication factor determines how many copies of the database are needed. The replication factor is set to 1 by default.)

    3. Dedicate nodes as TSDB nodes by clicking the Dedicate toggle to turn it on.

      A TSDB node might have more than one microservice running. However, when you dedicate a node as a TSDB node, it runs only the TSDB microservice and stops running all other microservices.

      Note:
      • If the node is associated with a persistent volume (storage in a cluster), you cannot use that node as a dedicated TSDB node.

      • A fail-safe mechanism ensures that you cannot dedicate all Paragon Insights nodes as TSDB nodes.

    4. Ignore system errors (when you remove or replace a failed TSDB node from Paragon Insights) by clicking the Force toggle to turn it on.

      For example, when a TSDB node fails and the replication factor for that node is set to one, the TSDB data for that node is lost. In this scenario, the failed TSDB node must be removed from Paragon Insights. However, when you try to replace the failed node with a new node, the backup of the node fails with a system error because the replication factor was set to one. If you want to proceed with replacing the node, you must turn the Force toggle on.

    5. Delete a node that was previously assigned as a TSDB node by clicking X next to the name of the TSDB node. The node is removed as a TSDB node when you deploy the new configuration changes.

  4. Do one of the following:

    • Click Save to only save the configuration changes to the database without applying the changes to the TSDB nodes.

      You must commit (or roll back) the configuration changes later. For more information, see Commit or Roll Back Configuration Changes in Paragon Insights.

    • Click Save & Deploy to save configuration changes to the database and to apply the changes to the TSDB nodes.

  5. In the pop-up that appears, click OK to confirm.

    You are returned to the TSDB Settings page.

Paragon Insights CLI Configuration Options

The Paragon Insights CLI provides a means to add TSDB nodes to the system, to delete them from it, and to change the replication factor accordingly.

Add a TSDB Node to Paragon Insights

or

Manage the Replication Factor

Keep the number of TSDB nodes at a multiple of the replication factor. For example, with a replication factor of 2, run 2, 4, or 6 TSDB nodes, and so on.

Usage Notes

  • Paragon Insights pings each new node to determine whether it is reachable. A warning is shown if the ping fails.

  • The dedicate option specifies whether or not the TSDB nodes perform only TSDB functions.