Data Nodes and Data Storage

JSA processor appliances and All-in-One appliances can store data, but many companies need the stand-alone storage and processing capabilities of a Data Node to handle specific storage requirements and to implement data retention policies. Many companies are subject to regulations and laws that mandate keeping data records for specific periods.

Data Node Information

The following list describes information about Data Nodes:

Data Nodes add storage and processing capacity.
Data Nodes are plug-and-play and can be added to a deployment at any time.
Data Nodes integrate seamlessly with existing deployments.
Data Nodes reduce the load on processor appliances by taking over data storage processing from the processor.
Users can scale storage and processing power independently of data collection.
As of JSA 2014.7, a new data format with native data compression is used. Data is compressed in memory and is written to disk in a proprietary binary compressed format. The new data format enables better search performance and more efficient use of system resources than the previous data format, which had no native compression.

The following diagram shows an example of some uses for Data Nodes in a deployment.

Figure 1: Using Data Node Appliances to Manage Your Data Storage

Data management architecture showing JSA Console in New York for data searches, storage of Events and Flows, and multiple data centers in the UK, France, and US. Data is categorized for optimized search and retained based on policies.

The following list describes the different elements that you need to consider when you deploy Data Nodes.

Data clustering-- Data Nodes add storage capacity to a deployment, and also improve performance by distributing collected data across multiple storage volumes. When the data is searched, multiple hosts, working as a cluster, perform the search. The cluster can improve search performance without requiring you to add more event processors. Data Nodes multiply the storage for each processor.

Note:
You can connect a Data Node to only one processor at a time, but a processor can support multiple Data Nodes.
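
The attachment rule in the note above (one processor per Data Node, many Data Nodes per processor) can be pictured as a small data structure. The following Python sketch is purely illustrative: the class `ClusterTopology` and the host names are hypothetical and are not part of any JSA interface.

```python
from collections import defaultdict


class ClusterTopology:
    """Hypothetical in-memory model of Data Node attachments (not a JSA API)."""

    def __init__(self):
        self._processor_of = {}            # Data Node name -> processor name
        self._nodes_of = defaultdict(set)  # processor name -> Data Node names

    def attach(self, data_node, processor):
        """Attach a Data Node to a processor, enforcing the one-processor rule."""
        current = self._processor_of.get(data_node)
        if current is not None and current != processor:
            raise ValueError(
                f"{data_node} is already attached to {current}; "
                "a Data Node connects to only one processor at a time"
            )
        self._processor_of[data_node] = processor
        self._nodes_of[processor].add(data_node)

    def data_nodes(self, processor):
        """Return the Data Nodes attached to a processor, sorted by name."""
        return sorted(self._nodes_of.get(processor, ()))


topology = ClusterTopology()
topology.attach("datanode-1", "eventprocessor-1")
topology.attach("datanode-2", "eventprocessor-1")  # many-to-one is allowed
print(topology.data_nodes("eventprocessor-1"))
```

Attaching "datanode-1" to a second processor in this model raises a ValueError, which mirrors the restriction described in the note.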
Deployment considerations-- Keep the following information in mind as you set up Data Nodes in a deployment.
- Data Nodes are available with JSA 2014.5 and later.
- Data Nodes perform search and analytic functions similar to those of event and flow processors in a JSA deployment.
  
  The operational speed on a cluster is affected by the slowest member of a cluster. Data Node system performance improves if Data Nodes are sized similarly to the Event Processors and Flow Processors in a deployment. To facilitate similar sizing between Data Nodes and event and flow processors, Data Nodes are available on JSA core appliances.
- Data Nodes are available in three formats: software (on your own hardware), physical, and appliances. You can mix the formats in a single cluster.
Bandwidth and latency-- Ensure that you have a 1 Gbps link and less than 10 ms latency between hosts in the cluster. Searches that yield many results require more bandwidth.
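
As a quick illustration of the bandwidth and latency guidance, the following Python sketch checks measured link figures against the two thresholds stated above. The function name and the idea of passing in measurements are assumptions for this example; how you measure the link is up to you, and JSA does not provide this function.

```python
MIN_BANDWIDTH_GBPS = 1.0  # from the guidance above: a 1 Gbps link
MAX_LATENCY_MS = 10.0     # from the guidance above: less than 10 ms latency


def link_meets_guidance(bandwidth_gbps, latency_ms):
    """Return True if a measured link satisfies both thresholds.

    Hypothetical helper for planning purposes only.
    """
    return bandwidth_gbps >= MIN_BANDWIDTH_GBPS and latency_ms < MAX_LATENCY_MS


# Example measurements between two hosts in the cluster (made-up values).
print(link_meets_guidance(bandwidth_gbps=10.0, latency_ms=2.5))
print(link_meets_guidance(bandwidth_gbps=1.0, latency_ms=25.0))
```

The latency comparison is strict because the guidance calls for less than 10 ms, so a link measured at exactly 10 ms does not pass.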
Appliance compatibility-- Data Nodes are compatible with all existing JSA appliances that have an Event Processor or Flow Processor component, including All-In-One appliances.

Data Nodes support high availability (HA).
Installation of Data Nodes-- Data Nodes use standard TCP/IP networking, and do not require proprietary or specialized interconnect hardware.

Install each Data Node that you want to add to your deployment in the same way that you install any other JSA appliance. Associate each Data Node with either an event processor or a flow processor. For more information, see Configuring a managed host in the Juniper Secure Analytics Administration Guide.

You can attach multiple Data Nodes to a single Event Processor or Flow Processor in a many-to-one configuration.

When you deploy high availability (HA) pairs with Data Node appliances, install, deploy, and rebalance data with the HA appliances before you synchronize the HA pair. The combined effect of data rebalancing and the replication process that HA uses results in significant performance degradation. If HA is already set up on appliances to which Data Nodes are being introduced, disconnect HA on those appliances and reconnect it when the rebalance of the cluster is complete.
Decommissioning Data Nodes-- Use the System and License Management window to remove Data Nodes from your deployment, as with any other JSA appliance. Decommissioning does not erase data on the host, nor does it move the data to your other appliances. If you need to retain access to the data that was on the Data Nodes, you must identify a location to move that data to.
Data Rebalancing-- Adding a Data Node to a cluster distributes data to each Data Node. If it is possible, data rebalancing tries to maintain the same percentage of available space on each Data Node. New Data Nodes added to a cluster initiate more rebalancing from cluster event and flow processors to achieve efficient disk usage on the newly added Data Node appliances.

Starting with JSA 2014.5, data rebalancing is automatic and concurrent with other cluster activity, such as queries and data collection. No downtime is experienced during data rebalancing.

Data Nodes offer no performance improvement in the cluster until data rebalancing is complete. Rebalancing can cause minor performance degradation during search operations, but data collection and processing continue unaffected.
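
To make the goal of "the same percentage of available space on each Data Node" concrete, the following Python sketch computes the amount of data each host would hold if every host ended up with the same fill percentage. This is only arithmetic under simplifying assumptions (fixed capacities, no incoming data); it is not the rebalancing algorithm that JSA uses, and the function name and host names are hypothetical.

```python
def equal_free_space_targets(hosts):
    """Compute a target used-space value per host for equal fill percentage.

    hosts maps a host name to a (capacity_gb, used_gb) tuple.
    Returns a dict that maps each host name to its target used_gb.
    Illustrative only; not the JSA rebalancing algorithm.
    """
    total_capacity = sum(capacity for capacity, _ in hosts.values())
    total_used = sum(used for _, used in hosts.values())
    if total_capacity <= 0:
        raise ValueError("total capacity must be positive")
    fill_fraction = total_used / total_capacity
    return {
        name: capacity * fill_fraction
        for name, (capacity, _) in hosts.items()
    }


# An event processor that is 80% full, plus a newly added, empty Data Node.
cluster = {
    "eventprocessor-1": (1000.0, 800.0),
    "datanode-1": (1000.0, 0.0),
}
for name, target in equal_free_space_targets(cluster).items():
    print(f"{name}: target used space is about {target:.0f} GB")
```

In this example both hosts have equal capacity, so the target is an even split of the 800 GB. With unequal capacities, the larger host would hold proportionally more data.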

Note:
Encrypted data transmission between Data Nodes and Event Processors is not supported.
Management and Operations-- Data Nodes are self-managed and require no regular user intervention to maintain normal operation. JSA manages activities, such as data backups, high availability, and retention policies, for all hosts, including Data Node appliances.
Data Node failure-- If a Data Node fails, the remaining members of the cluster continue to process data.

When the failed Data Node returns to service, data rebalancing can occur to maintain proper data distribution in the cluster, and then normal processing resumes. During the downtime, data on the failed Data Node is unavailable, and I/O errors that occur appear in search results from the log and network activity viewers in the JSA user interface.

For catastrophic failures that require appliance replacement or the reinstallation of JSA, decommission Data Nodes from the deployment and replace them using standard installation steps. Copy any data that is not lost in the failure to the new Data Node before you deploy. The rebalancing algorithm accounts for data that exists on a Data Node, and shuffles only data that was collected during the failure.

For Data Nodes deployed with an HA pair, a hardware failure causes a failover, and operations continue to function normally.

SAN Overview

To increase the amount of storage space on your appliance, you can move a portion of your data to an offboard storage device. You can move your /store, /store/ariel, or /store/backup file systems.

Multiple methods are available for adding external storage, including iSCSI and NFS (Network File System). You must use iSCSI to store data that is accessible and searchable in the UI, such as the /store/ariel directory, and reserve the use of NFS for data backups only.
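
The iSCSI and NFS guidance can be summarized as a simple lookup. The Python sketch below encodes only what this section states: the three file systems that can be moved, and the rule that NFS is reserved for backups. The function name is hypothetical and is not part of JSA.

```python
# File systems that this section says can be moved to offboard storage.
MOVABLE_FILE_SYSTEMS = ("/store", "/store/ariel", "/store/backup")


def nfs_is_appropriate(file_system):
    """Return True if NFS suits the given file system, per the guidance above.

    Hypothetical planning helper: NFS is reserved for backup data, while
    searchable data such as /store/ariel requires iSCSI.
    """
    if file_system not in MOVABLE_FILE_SYSTEMS:
        raise ValueError(f"{file_system} is not one of {MOVABLE_FILE_SYSTEMS}")
    return file_system == "/store/backup"


for fs in MOVABLE_FILE_SYSTEMS:
    method = "NFS is acceptable" if nfs_is_appropriate(fs) else "use iSCSI"
    print(f"{fs}: {method}")
```

The /store file system is treated as requiring iSCSI here because it contains the searchable /store/ariel data.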

Moving the /store file system to an external device might affect JSA performance.

After migration, all data I/O to the /store file system is no longer done on the local disk. Before you move your JSA data to an external storage device, you must consider the following information:

Searches that are marked as saved are also in the /transient directory. If you experience a local disk failure, these searches are not saved.
A transient partition that exists before you move your data is likely to remain in existence after the move, and it can be mounted on an iSCSI storage mount.

For more information about offboard storage, see the Juniper Secure Analytics Configuring Offboard Storage Guide.
