Juniper OpenStack High Availability

Introduction

The Juniper Networks software-defined networking (SDN) controller has two major components: OpenStack and Contrail. High availability (HA) of the controller requires that both OpenStack and Contrail be resistant to failures. Failures can range from a single service instance failure, a node failure, or a link failure, to all nodes going down because of a power outage. The basic expectation of a highly available SDN controller is that when failures occur, already provisioned workloads continue to work as expected without any traffic drop, and the controller remains available to perform operations on the cluster. Juniper OpenStack is a distribution from Juniper Networks that combines OpenStack and Contrail into one product.

Contrail High Availability

Contrail already has high availability built into many of its components, including support for an active-active model of high availability, which is achieved by deploying each Contrail node component with the required level of redundancy.

The Contrail control node runs BGP and maintains adjacency with the vRouter module in the compute nodes. Additionally, every vRouter maintains a connection with all available control nodes.

Contrail uses Cassandra as the database. Cassandra inherently supports fault tolerance and replicates data across the nodes participating in the cluster. A highly available deployment of Contrail requires at least two control nodes, three config nodes (including analytics and webui) and three database nodes.
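
For illustration, this minimum redundancy can be expressed directly in the roledefs format of the fab testbed.py file described later in this document. The following is only a sketch; the host variables are placeholders and are not part of any shipped sample.

from fabric.api import env

#Placeholder management addresses of the three controller hosts
host1 = '<user>@<ip address>'
host2 = '<user>@<ip address>'
host3 = '<user>@<ip address>'

#Minimum redundancy for a highly available Contrail controller:
#at least two control nodes and three config, analytics, webui,
#and database nodes
env.roledefs = {
    'cfgm': [host1, host2, host3],
    'control': [host2, host3],
    'collector': [host1, host2, host3],
    'webui': [host1, host2, host3],
    'database': [host1, host2, host3],
}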

OpenStack High Availability

High availability of OpenStack is supported by deploying the OpenStack controller redundantly on multiple nodes. Previous releases of Contrail supported only a single instance of the OpenStack controller, and running multiple instances of OpenStack posed new problems that needed to be solved, including:

  • State synchronization of stateful services (e.g. MySQL) across multiple instances.
  • Load-balancing of requests across the multiple instances of services.

Supported Platforms

High availability of the Juniper OpenStack Controller has been tested on the following platforms:

  • Linux - Ubuntu 12.04 with kernel version 3.13.0-34
  • OpenStack Havana

Juniper OpenStack High Availability Architecture

A typical cloud infrastructure deployment consists of a pool of resources of compute, storage, and networking infrastructure, all managed by a cluster of controller nodes.

The following figure illustrates a high-level reference architecture of a high availability deployment using Juniper OpenStack deployed as a cluster of controller nodes.

Juniper OpenStack Objectives

The main objectives and requirements for Juniper OpenStack high availability are:

  • 99.999% availability for tenant traffic.
  • Anytime availability for cloud operations.
  • Provide VIP-based access to the API and UI services.
  • Load balance network operations across the cluster.
  • Management and orchestration elasticity.
  • Failure detection and recovery.

Limitations

The following are limitations of Juniper OpenStack high availability:

  • Only one failure is supported.
  • During failover, a REST API call may fail. The application or user must reattempt the call.
  • Although zero packet drop is the objective, in a distributed system such as Contrail, a few packets may drop during ungraceful failures.
  • Juniper OpenStack high availability has not been tested with any third-party load balancing solution other than HAProxy.

Solution Components

Juniper OpenStack's high availability active-active model provides scale-out of the infrastructure and orchestration services, and makes it easy to introduce new services in the controller and in the orchestration layer.

Virtual IP with Load Balancing

HAProxy runs on all nodes to load balance connections across the multiple instances of the services. To provide a virtual IP (VIP), Keepalived (an open source health-check framework and hot-standby protocol implementation) runs on each node and elects a master based on the VRRP protocol. The VRRP master owns the VIP; if the master node fails, the VIP moves to a new master elected by VRRP.
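
One quick way to see how HAProxy views the service instances behind the VIP is to query its standard statistics socket. The following Python sketch prints the state of each backend server; the socket path is an assumption and must match the stats socket configured in haproxy.cfg on the node.

import socket

HAPROXY_STATS_SOCKET = '/var/run/haproxy.sock'   #assumed path; check haproxy.cfg

def haproxy_backend_status(sock_path=HAPROXY_STATS_SOCKET):
    #"show stat" returns one CSV row per frontend, backend, and server
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(sock_path)
    s.sendall(b'show stat\n')
    data = b''
    while True:
        chunk = s.recv(4096)
        if not chunk:
            break
        data += chunk
    s.close()
    for line in data.decode('utf-8', 'replace').splitlines():
        if not line or line.startswith('#'):
            continue
        fields = line.split(',')
        pxname, svname, status = fields[0], fields[1], fields[17]
        #skip the aggregate FRONTEND/BACKEND rows, print per-server state
        if svname not in ('FRONTEND', 'BACKEND'):
            print('%s/%s: %s' % (pxname, svname, status))

if __name__ == '__main__':
    haproxy_backend_status()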

The following figure shows OpenStack services provisioned to work with HAProxy and Keepalived, with HAProxy in front of the OpenStack services in a deployment with multiple OpenStack nodes. The OpenStack database is deployed in clustered mode and uses Galera to replicate data across the cluster. RabbitMQ has clustering enabled as part of a multinode Contrail deployment, and the RabbitMQ configuration is further tuned to support high availability.
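
To check that the Galera cluster is healthy on a controller node, the standard wsrep_* status variables can be inspected. The following Python sketch shells out to the mysql client; the user name and password are placeholders.

import subprocess

def galera_status(user='root', password='<password>'):
    #A healthy three-node cluster reports wsrep_cluster_size = 3,
    #wsrep_cluster_status = Primary, and wsrep_ready = ON.
    query = ("SHOW STATUS WHERE Variable_name IN "
             "('wsrep_cluster_size', 'wsrep_cluster_status', 'wsrep_ready');")
    out = subprocess.check_output(
        ['mysql', '-u', user, '-p' + password, '-N', '-e', query])
    return dict(line.split('\t') for line in out.decode().splitlines())

if __name__ == '__main__':
    print(galera_status())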

Failure Handling

This section describes how various types of failures are handled, including:

  • Service failures
  • Node failures
  • Networking failures

Service Failures

When an instance of a service fails, HAProxy detects the failure and load balances subsequent requests across the other active instances of the service. The supervisord process monitors for service failures and restarts the failed instances. As long as one instance of a service is operational, the Juniper OpenStack controller continues to operate. This is true for both stateful and stateless services across Contrail and OpenStack.
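
Because supervisord exposes a standard XML-RPC interface, the state of the services it monitors can also be inspected programmatically. The following Python sketch assumes a supervisord instance listening on the default inet HTTP port on the local node; the URL must be adjusted to match the supervisord configuration in your deployment.

import xmlrpclib   #xmlrpc.client on Python 3

#Assumed endpoint; must match the [inet_http_server] section of the
#supervisord configuration on this node.
server = xmlrpclib.ServerProxy('http://127.0.0.1:9001/RPC2')

for proc in server.supervisor.getAllProcessInfo():
    #statename is RUNNING, EXITED, FATAL, and so on
    print('%s: %s' % (proc['name'], proc['statename']))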

Node Failures

The Juniper OpenStack controller supports single node failures, whether graceful (shutdown or reboot) or ungraceful (power failure). When the node that is the VIP master fails, the VIP moves to the next active node, which is elected as the new VRRP master. HAProxy on the new VIP master distributes connections across the active service instances as before, while the failed node is brought back online. Stateful services (MySQL/Galera, ZooKeeper, and so on) require a quorum to be maintained when a node fails. As long as quorum is maintained, the controller cluster continues to work without problems. Data integrity is also inherently preserved by Galera, RabbitMQ, and the other stateful components in use.

Network Failures

A connectivity break, especially in the control/data network, can partition the controller cluster into two. As long as one of the partitions retains the minimum required number of nodes, the controller cluster continues to work. Stateful services, including MySQL Galera and RabbitMQ, detect the partitioning and reorganize their clusters around the reachable nodes. Existing workloads continue to function and pass traffic, and new workloads can be provisioned. When connectivity is restored, the rejoining nodes become part of the working cluster and the system returns to its original state.

Deployment

Minimum Hardware Requirement

A minimum of three servers (physical or virtual machines) is required to deploy a highly available Juniper OpenStack controller. In active-active mode, the controller cluster uses quorum-based consistency management to guarantee transaction integrity across its distributed nodes. This translates to a requirement of deploying 2n+1 nodes to tolerate n failures.
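
The quorum arithmetic behind the 2n+1 rule is illustrated by the following short Python sketch: a cluster of N nodes keeps quorum only while a strict majority (floor(N/2) + 1) of its members remains up.

def tolerated_failures(nodes):
    #Quorum-based clusters need a strict majority of nodes alive.
    quorum = nodes // 2 + 1
    return nodes - quorum

for nodes in (3, 5, 7):
    print('%d nodes: quorum %d, tolerates %d failure(s)'
          % (nodes, nodes // 2 + 1, tolerated_failures(nodes)))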

The Juniper OpenStack controller offers a variety of deployment choices. Depending on the use case, the roles can be deployed either independently or combined on the same nodes. The type of deployment determines the sizing of the infrastructure. The numbers below represent the minimum requirements for compute, storage, and network.

Compute

  • Quad-core Intel(R) Xeon 2.5 GHz or higher
  • 32 GB or more RAM for the controller hosts (increases with the number of hypervisors being supported)
  • Minimum 1 TB of disk (SSD or HDD)

Network

A typical deployment separates control/data traffic from the management traffic.

  • Dual 10-Gigabit Ethernet, bonded (using LAG, 802.3ad), for a redundant control/data connection
  • Dual 1-Gigabit Ethernet, bonded (using LAG, 802.3ad), for a redundant management connection
  • A single 10-Gigabit and 1-Gigabit connection also works if link redundancy is not required

Virtual IP (VIP) addresses must be allocated from the networks in which the NICs above participate: an external VIP on the management network and an internal VIP on the control/data network. External-facing services are load balanced using the external VIP, while the internal VIP is used for communication among the services.

Packaging

High availability support brought in new components to the Contrail OpenStack deployment, which are packaged in a new package called contrail-openstack-ha. It primarily contains HAProxy, Keepalived, Galera, and their requisite dependencies.

Installation

Installation is supported through fabric (fab) scripts. Externally, very little changes; the main additions incorporate the multiple OpenStack roles and the VIP configuration. The testbed.py file has new sections for the external and internal VIPs. If the OpenStack and Contrail roles are co-located on the same nodes, a single set of external and internal VIPs is sufficient.

The installation also supports separating the OpenStack and Contrail roles onto physically different servers. In this case, the external and internal VIPs specified are used for the OpenStack controller, and a separate set of VIPs, contrail_external_vip and contrail_internal_vip, is used for the Contrail controller nodes. It is also possible to specify separate RabbitMQ clusters for the OpenStack and Contrail controllers.
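
For example, a deployment with separated roles could carry both pairs of VIPs in testbed.py along the following lines. This is an illustrative sketch rather than the shipped sample, and all addresses are placeholders.

from fabric.api import env

env.ha = {
    'external_vip' : '<ip address>',           #OpenStack controller, management network
    'internal_vip' : '<ip address>',           #OpenStack controller, control/data network
    'contrail_external_vip' : '<ip address>',  #Contrail controller, management network
    'contrail_internal_vip' : '<ip address>',  #Contrail controller, control/data network
}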

When multiple OpenStack roles are specified along with VIPs, the install-contrail target treats the installation as a high availability installation and additionally installs the contrail-openstack-ha package.

Similarly, setup_all treats the setup as a Contrail high availability setup and provisions the following services using the listed fab tasks.

  • Keepalived — fab setup_keepalived
  • HAProxy — fab fixup_restart_haproxy_in_openstack
  • Galera — fab setup_galera_cluster, fab fix_wsrep_cluster_address
  • Glance — fab setup_glance_images_loc
  • Keystone — fab sync_keystone_ssl_certs

In addition, all provisioning scripts are changed to use the VIPs instead of the physical IP address of the node in all OpenStack and Contrail related configuration files. The following figure shows a typical three-node deployment in which the OpenStack and Contrail roles are co-located on three servers.

Testbed File for Fab

A sample file is available at:

https://github.com/Juniper/contrail-fabric-utils/blob/R1.10/fabfile/testbeds/testbed_multibox_example.py

You can use the sample file by uncommenting and changing the high availability section to match your deployment.

The contents of a sample testbed.py file for the minimum high availability configuration are the following.

from fabric.api import env

#Management ip addresses of hosts in the cluster
host1 = '<user>@<ip address>'
host2 = '<user>@<ip address>'
host3 = '<user>@<ip address>'
host4 = '<user>@<ip address>'
host5 = '<user>@<ip address>'

#External routers if any
#for eg. 
#ext_routers = [('mx1', '<ip address>')]
ext_routers = [('mx1', '<ip address>')]

public_vn_rtgt = 20000 
public_vn_subnet = '<ip address>'

#Autonomous system number
router_asn = <asn number>

#Host from which the fab commands are triggered to install and provision
host_build = '<user>@<ip address>'

#Role definition of the hosts.
env.roledefs = {
    'all': [host1, host2, host3, host4, host5],
    'cfgm': [host1, host2, host3],
    'openstack': [host1, host2, host3],
    'control': [host2, host3],
    'compute': [host4, host5],
    'collector': [host1, host2, host3],
    'webui': [host1, host2, host3],
    'database': [host1, host2, host3],
    'build': [host_build],
}

env.hostnames = {
    'all': ['vse2100-2', 'vse2100-3', 'vse2100-4','vse2100-5','vse2100-6']
}

#Openstack admin password
env.openstack_admin_password = '<password>'

env.password = '<password>'
#Passwords of each host
env.passwords = {
    host1: '<password>',
    host2: '<password>',
    host3: '<password>',
    host4: '<password>',
    host5: '<password>',
    host_build: '<password>',
}

#For reimage purpose
env.ostypes = {
    host1: 'ubuntu',
    host2: 'ubuntu',
    host3: 'ubuntu',
    host4: 'ubuntu',
    host5: 'ubuntu',
}

#OPTIONAL BONDING CONFIGURATION
#==============================
#Interface Bonding
bond= {
    host1 : { 'name': 'bond0', 'member': ['eth1','eth2'], 'mode':'802.3ad' },
    host2 : { 'name': 'bond0', 'member': ['eth1','eth2'], 'mode':'802.3ad' },
    host3 : { 'name': 'bond0', 'member': ['eth1','eth2'], 'mode':'802.3ad' },
    host4 : { 'name': 'bond0', 'member': ['eth1','eth2'], 'mode':'802.3ad' },
    host5 : { 'name': 'bond0', 'member': ['eth1','eth2'], 'mode':'802.3ad' },
}

#OPTIONAL SEPARATION OF MANAGEMENT AND CONTROL + DATA
#====================================================
#Control Interface
control_data = {
    host1 : { 'ip': '<ip address>', 'gw' : '<ip address>', 'device':'bond0' },
    host2 : { 'ip': '<ip address>', 'gw' : '<ip address>', 'device':'bond0' },
    host3 : { 'ip': '<ip address>', 'gw' : '<ip address>', 'device':'bond0' },
    host4 : { 'ip': '<ip address>', 'gw' : '<ip address>', 'device':'bond0' },
    host5 : { 'ip': '<ip address>', 'gw' : '<ip address>', 'device':'bond0' },
}

# VIP
env.ha = {
    'internal_vip' : '<ip address>',
    'external_vip' : '<ip address>'
}

#To disable installing contrail interface rename package
env.interface_rename = False

#To enable multi-tenancy feature
#multi_tenancy = True

#To Enable parallel execution of task in multiple nodes
do_parallel = True

# To configure the encapsulation priority. Default: MPLSoGRE 
#env.encap_priority =  "'MPLSoUDP','MPLSoGRE','VXLAN'"

Note: The management interface configuration happens outside of fab, so if a bonded management interface is needed, you must create the bond and assign the management NICs to it yourself.

The management network must be a routable network.

 

Modified: 2016-06-09