Time Series Database Replication Scenarios

Paragon Insights collects large volumes of time-sensitive data through its ingest methods. It uses a time series database (TSDB) to store and manage the data it receives from network devices. For more information about TSDB and about managing TSDB settings, see Manage Time Series Database Settings.

These topics explain the scenarios that you might encounter after you configure TSDB settings from the Paragon Automation GUI.

Points to Remember

TSDB Nodes
- The TSDB Nodes list in the TSDB Settings page of the GUI displays the available nodes in the Paragon Automation installation that you can select as TSDB nodes. By default, Paragon Automation automatically selects one node as a TSDB node.
- A TSDB node might have more than one microservice running. However, when you dedicate a node as a TSDB node, it runs only the TSDB microservice and stops running all other microservices.
- If a node is associated with a persistent volume (storage in a cluster), you cannot use that node as a dedicated TSDB node.
- A fail-safe mechanism ensures that you cannot dedicate all Paragon Automation nodes as TSDB nodes.
- When a TSDB node fails, you can rebuild the damaged server or component. However, if the replication factor is set to one, the TSDB data for the node is lost.
- You can ignore system errors when you remove or replace a failed TSDB node from Paragon Automation.
- You must select, delete, or dedicate TSDB nodes during a maintenance window because some services restart and the Paragon Automation GUI is likely to be unresponsive.
Replication Factor
- Replication refers to storing data on multiple instances on multiple nodes. In Paragon Automation, we configure a replication factor to determine how many copies of the database are needed. The replication factor determines the minimum number of Paragon Automation nodes needed.
- The replication factor is set to 1 by default. A replication factor of 1 creates only one copy of data, and therefore, provides no high availability (HA). A replication factor of 3 creates three copies of data, requires at least 3 Paragon Automation nodes, and provides HA.
- The higher the replication factor, the stronger the HA and the higher the resource requirements in terms of Paragon Automation nodes.
- If you want to scale your system further, you must add Paragon Automation nodes in multiples of the replication factor.
  
  For example, when you set the replication factor to three, you can add Paragon Automation nodes in multiples of three, such as three, six, nine, and so on.
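The relationship between the replication factor and the node count can be sketched as follows. This is an illustration only, not Paragon Automation code: the function name `form_tsdb_groups` is hypothetical, and placing consecutive nodes in the same group is an assumption, because the actual grouping order is decided internally by Paragon Automation.

```python
def form_tsdb_groups(nodes, replication_factor):
    """Split a list of node names into groups of `replication_factor` nodes.

    Hypothetical helper: the consecutive-node grouping is an assumption.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    if len(nodes) % replication_factor != 0:
        # Mirrors the rule that nodes are added in multiples of the factor.
        raise ValueError("node count must be a multiple of the replication factor")
    return [
        nodes[i:i + replication_factor]
        for i in range(0, len(nodes), replication_factor)
    ]


# Six nodes with a replication factor of three yield two groups of three.
nodes = ["TSDB-1", "TSDB-2", "TSDB-3", "TSDB-4", "TSDB-5", "TSDB-6"]
groups = form_tsdb_groups(nodes, 3)
print(groups)
```

Each group holds one full set of copies, so the number of groups, not the number of nodes, is what grows when you scale out.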
Database Sharding
- Database sharding refers to selectively storing data on certain nodes. This method distributes the data among available TSDB nodes and improves scaling.
Database Reads
- Database queries (reads) are sent to the TSDB instance that has reported the fewest write errors in the last 5 minutes. If all instances are performing equally, the query is sent to a random TSDB instance in the required group.
  
  For more information, see Time Series Database (TSDB) Overview.
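The read-routing rule described above can be approximated with a short sketch. The function `pick_read_instance`, the timestamp-list data structure, and the random choice among all instances tied for the fewest errors are assumptions made for illustration; the source does not describe how Paragon Automation tracks write errors internally.

```python
import random

WINDOW_SECONDS = 5 * 60  # the 5-minute window mentioned in the text


def pick_read_instance(write_error_times, now, rng=random):
    """Return the instance with the fewest write errors in the last 5 minutes.

    write_error_times maps an instance name to a list of error timestamps
    (seconds). Ties are broken at random (hypothetical model).
    """
    recent = {
        name: sum(1 for t in times if 0 <= now - t <= WINDOW_SECONDS)
        for name, times in write_error_times.items()
    }
    fewest = min(recent.values())
    candidates = sorted(name for name, count in recent.items() if count == fewest)
    return rng.choice(candidates)


# TSDB-1 has two recent errors; the error on TSDB-2 is older than 5 minutes.
errors = {"TSDB-1": [990.0, 995.0], "TSDB-2": [100.0]}
print(pick_read_instance(errors, now=1000.0))  # TSDB-2
```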

Scenario One

Consider the following TSDB configuration:

Table 1: TSDB Replication Scenario One
Number of databases	`20`
Replication factor	`2`
TSDB nodes	`4 (TSDB-1, TSDB-2, TSDB-3, TSDB-4)` You can specify TSDB nodes from the TSDB Settings page. The replication factor is set to one by default. When the replication factor is set to one, Paragon Automation selects one TSDB node to store a copy of the data of a database. However, in this scenario, the replication factor is set to two, so Paragon Automation stores a copy of the data on two different TSDB nodes.
TSDB groups (created automatically by Paragon Automation)	`2 TSDB groups with 2 TSDB nodes each` Two TSDB groups are created automatically by Paragon Automation after you set the replication factor and add TSDB nodes.

In this scenario, when a new database is created, Paragon Automation automatically assigns it to the TSDB group that serves the fewest databases. Data written to the database (TSDB writes) is stored in the assigned TSDB group, and a copy of the data is maintained on both TSDB nodes that form the group.
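A minimal sketch of this assignment rule, applied to the values in Table 1, is shown below. The function `assign_databases`, the database names `db-1` through `db-20`, and the pairing of nodes into groups are hypothetical; only the "fewest databases" rule comes from the description above. Ties are resolved here by taking the first group, purely to keep the example deterministic.

```python
def assign_databases(database_names, groups):
    """Assign each database to the group that currently serves the fewest."""
    load = [0] * len(groups)  # databases served per group
    assignment = {}
    for name in database_names:
        # min() returns the first least-loaded group on a tie.
        index = min(range(len(groups)), key=lambda i: load[i])
        assignment[name] = index
        load[index] += 1
    return assignment, load


# Scenario One: 4 TSDB nodes, replication factor 2, 20 databases.
groups = [["TSDB-1", "TSDB-2"], ["TSDB-3", "TSDB-4"]]  # assumed pairing
databases = [f"db-{n}" for n in range(1, 21)]          # hypothetical names
assignment, load = assign_databases(databases, groups)

print(load)                        # [10, 10]
print(groups[assignment["db-1"]])  # ['TSDB-1', 'TSDB-2']
```

Under this model each group ends up serving 10 databases, and every write to a database lands on both nodes of its group.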

Scenario Two

Consider the following TSDB configuration:

Table 2: TSDB Replication Scenario Two
Number of databases	`20`
Replication factor	`2`
TSDB nodes	`8 (TSDB-1, TSDB-2... TSDB-8)` You can specify TSDB nodes from the TSDB Settings page. The replication factor is set to one by default. When the replication factor is set to one, Paragon Automation selects one TSDB node to store a copy of the data of a database. However, in this scenario, the replication factor is set to two, so Paragon Automation stores a copy of the data on two different TSDB nodes.
TSDB groups (created automatically by Paragon Automation)	`4 TSDB groups with 2 TSDB nodes each` Four TSDB groups are created automatically by Paragon Automation after you set the replication factor and add TSDB nodes.

In this scenario, where the total number of databases is 20, each new database is assigned to the TSDB group that serves the fewest databases. However, if all TSDB groups serve the same number of databases, the TSDB group is picked at random. A copy of the data is maintained on all TSDB nodes that form the TSDB group.
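The random tie-break can be modeled in a few lines. This is a sketch under assumptions: `pick_group` is a hypothetical name, and the random choice is applied among every group tied for the fewest databases. Because each new database always goes to a least-loaded group, the per-group counts never differ by more than one, so 20 databases spread over 4 groups end at 5 each regardless of which group wins a tie.

```python
import random


def pick_group(databases_per_group, rng):
    """Pick a group serving the fewest databases; break ties at random."""
    fewest = min(databases_per_group)
    candidates = [i for i, n in enumerate(databases_per_group) if n == fewest]
    return rng.choice(candidates)


# Scenario Two: 8 TSDB nodes, replication factor 2, so 4 TSDB groups.
rng = random.Random(0)  # seeded only to make the run repeatable
databases_per_group = [0, 0, 0, 0]
for _ in range(20):
    databases_per_group[pick_group(databases_per_group, rng)] += 1

print(databases_per_group)  # [5, 5, 5, 5]
```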

Scenario Three