Scale-out Deployment Guide

Introduction

This document outlines the requirements and instructions for setting up a cluster of StorReduce servers using Docker, running either on a StorReduce-supplied VM image or on a server with RedHat Linux 7+ (or similar).

A StorReduce cluster comprises an odd number of StorReduce servers (VMs or bare metal). Each server in the cluster is configured in exactly the same way, with one exception, and once the cluster is running all servers are equivalent.

Please contact us if you have questions

If you have any problems or questions in your deployment, please feel free to contact us!

Email support@storreduce.com

Live on-line chat: use the chat button (bottom right)

Prerequisites

This guide assumes that each member of the cluster will run on its own isolated Linux server, each with internet connectivity capable of communicating with an object store, like Amazon S3, on HTTP port 80 and/or port 443, and access to port 8080 for the StorReduce dashboard admin interface.

If the system is not running on the StorReduce-supplied VM, please ensure that it is running a RedHat 7+ compatible distribution (e.g., RHEL 7, CentOS 7+, Amazon Linux).

Decide on the Cluster Configuration

A StorReduce cluster comprises an odd number of StorReduce servers, N_SERVERS, which must be greater than or equal to 3. Each server contains a number of shards, N_SHARDS. Each shard represents a logical partition of the namespace: if a server contains a shard, that server is responsible for that part of the namespace.

When the number of servers in a cluster is increased, the cluster rebalances the shards across the cluster so that each server is responsible for approximately the same number of shards. The number of shards, N_SHARDS, is fixed when the cluster is created; it cannot be changed afterwards.

Before you create a cluster you must decide on the number of shards that the cluster will have. StorReduce recommends that in most cases the number of shards should be 12 times the number of servers in the cluster. E.g.,

N_SHARDS = 12 * N_SERVERS

For example, if N_SERVERS = 3 we recommend N_SHARDS = 36; if N_SERVERS = 9 we recommend N_SHARDS = 108.
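The sizing rule above is simple enough to check with shell arithmetic; set N_SERVERS to whatever you plan to deploy:

```shell
# Recommended shard count: 12 shards per server
N_SERVERS=3
N_SHARDS=$((12 * N_SERVERS))
echo "$N_SHARDS"   # prints 36 for a 3-server cluster
```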

Each shard is contained in a primary server. Additionally, a shard can also be replicated to one or more secondary servers so that if the primary server fails, one of the secondary servers can take over that portion of the namespace and the cluster can continue to operate. The number of shard replicas is N_REPLICAS.

A StorReduce cluster will continue to function if up to N_REPLICAS servers fail and the number of operational servers is greater than or equal to N_SERVERS divided by 2 and rounded up to the nearest integer.

The constraint on the size of N_REPLICAS is the amount of local SSD that can be allocated to each server. Every replica increases the amount of SSD required for the cluster. E.g., a cluster with N_REPLICAS = 1 requires twice as much SSD as a cluster with N_REPLICAS = 0.

Increasing N_REPLICAS also increases the amount of CPU, network bandwidth and IOPS consumed by the cluster.

Therefore, if N_SERVERS is 3 then N_REPLICAS can be 0 or 1. If N_SERVERS is greater than 3, then N_REPLICAS should in most cases be 0, 1 or 2.
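Putting the rules above together, a quick shell sketch computes the quorum size, the number of server failures the cluster can tolerate, and the SSD overhead for a hypothetical 5-server, 2-replica cluster (the numbers are illustrative; substitute your own):

```shell
N_SERVERS=5
N_REPLICAS=2

QUORUM=$(( (N_SERVERS + 1) / 2 ))      # N_SERVERS / 2, rounded up (N_SERVERS is odd)
BY_QUORUM=$(( N_SERVERS - QUORUM ))    # failures the quorum rule alone would allow
# Tolerated failures: limited by both the replica count and the quorum rule
MAX_FAILURES=$(( N_REPLICAS < BY_QUORUM ? N_REPLICAS : BY_QUORUM ))
SSD_MULTIPLIER=$(( 1 + N_REPLICAS ))   # SSD required, relative to N_REPLICAS = 0

echo "quorum=$QUORUM max_failures=$MAX_FAILURES ssd=${SSD_MULTIPLIER}x"
```

For this example the cluster needs 3 operational servers to keep a quorum, tolerates up to 2 failures, and needs 3x the SSD of a replica-free cluster.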

Disk Configuration

  • If you are using the StorReduce VM, please skip to this section.

As discussed in the On-Premises Installation Guide, StorReduce requires two disks to store its deduplication index. For a StorReduce cluster, both disks should be SSDs.

NOTE: For functional testing with less than 100TB of data, it is not necessary that these disks are SSDs. If you have more than 1TB of free space in the root partition, you can skip to the next section.

These disks should be formatted with the ext2 filesystem [1] and mounted at /mnt/srdb and /mnt/srkeydb.

Format the disks like this:

sudo mkfs -t ext2 /dev/sdX
sudo mkfs -t ext2 /dev/sdY

then make mount points like this:

sudo mkdir -p /mnt/srdb
sudo mkdir -p /mnt/srkeydb

To ensure the disks are mounted when the server restarts, add entries like this to /etc/fstab [2]:

/dev/sdX    /mnt/srdb       ext2    defaults,noatime  0  2
/dev/sdY    /mnt/srkeydb    ext2    defaults,noatime  0  2

Finally, mount the disks for the first time like this:

sudo mount /mnt/srdb
sudo mount /mnt/srkeydb

Firewall

  • If you are using the StorReduce VM, please skip to this section.

By default the StorReduce server uses the following TCP ports. Please configure the firewall on all servers to accept connections on these ports:

Port(s)      Reason                                           Who/What will connect
22           SSH                                              Administrators
80           S3 API over HTTP                                 Administrators & users
443          S3 API over HTTPS                                Administrators & users
8080         StorReduce administration dashboard over HTTPS   Administrators & users
8095-8098    Intra-cluster communications                     Servers in the cluster
2379-2380    Intra-cluster communications                     Servers in the cluster

Port ranges are inclusive.
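If your servers use firewalld (the default on RHEL/CentOS 7), the table above translates to commands along these lines. This is a sketch assuming firewalld is running and the default ports are in use; adapt it if you change any ports:

```shell
# Open the default StorReduce ports (assumes firewalld is active)
for port in 22 80 443 8080; do
  sudo firewall-cmd --permanent --add-port=${port}/tcp
done
sudo firewall-cmd --permanent --add-port=8095-8098/tcp   # intra-cluster
sudo firewall-cmd --permanent --add-port=2379-2380/tcp   # etcd client/peer
sudo firewall-cmd --reload                               # apply the permanent rules
```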

The default ports can be changed when the server is installed. If you do change the ports, please adjust the firewall rules to match.

Install ntpd if it is not already installed

  • If you are using the StorReduce VM, please skip to this section.

Determine if ntpd is installed.

sudo systemctl status ntpd

if you are using a systemd-based distribution, or:

sudo service ntpd status

on an older distribution.

If NTP is not installed, please install it.

Note you may need to configure ntpd to use your corporate NTP servers. You may also need to modify your firewall settings to allow NTP packets through.
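On RHEL/CentOS 7 the install and enable steps look like this, assuming the ntp package is available from your configured repositories:

```shell
sudo yum install -y ntp        # provides the ntpd daemon
sudo systemctl enable ntpd     # start ntpd on every boot
sudo systemctl start ntpd      # and start it now
```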

Install the StorReduce YUM Repository

  • If you are using the StorReduce VM, please skip to this section.

On all servers install the StorReduce YUM repository:

sudo curl --output /etc/yum.repos.d/storreduce.repo \
  https://storreduce-yum.s3-us-west-2.amazonaws.com/storreduce.repo

Deploy the StorReduce OVA Virtual Machines

If you are not using the StorReduce VM, please skip to this section.

Virtual Machines (VMs) need to first be provisioned using the StorReduce OVA file:

  1. Download the StorReduce OVA file here
  2. Import the OVA file into VMware ESXi, Workstation or Fusion.

    • We recommend that you select ‘thin provisioning’ for disks since the OVA file will set up several virtual disks. For evaluation purposes the default disk sizes can be used.

    • Ensure that you have network connectivity to the server on all ports listed in the above table.

Network Configuration

The StorReduce cluster must have a domain name (DNS name) associated with it.

The domain name can refer to each of the servers in the cluster using DNS round robin or similar, or it can refer to a load balancer which sits in front of the cluster.

You must have a DNS name or hosts file entry to refer to the StorReduce server VM. Although you can use the IP address for the VM to browse to the Web console, this will not work for clients connecting via the S3 API. (S3 API digital signatures do not work when the endpoint is specified as an IP address.)
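For testing without DNS, a hosts file entry on each client machine is enough. The name and address below are placeholders; substitute your own:

```
# /etc/hosts on each client machine (example values)
10.0.0.10    storreduce.example.com
```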

Pre-configure the StorReduce Servers

NOTE: The following instructions apply to each of the servers in the cluster.

Install storreducectl. On the StorReduce VM, run:

sudo yum clean all && sudo yum -y update storreducectl

If you are not using the StorReduce VM, run:

sudo yum clean all && sudo yum -y install storreducectl

Install Docker

  • If you are using the StorReduce VM, please skip to this section.

On some systems (e.g., CentOS 7, AMI Linux) you can run:

sudo yum install -y docker

For other systems please follow the instructions on the Docker website.

Please ensure that Docker will restart when the server is rebooted.
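On systemd-based distributions this is a single command, assuming the docker package installed a systemd unit (it does on CentOS 7 and Amazon Linux):

```shell
sudo systemctl enable docker   # start Docker automatically at boot
sudo systemctl start docker    # and start it now
```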

Deploy the First StorReduce Server

NOTE: The method for deploying the first StorReduce docker image differs from that for subsequent servers belonging to the same cluster.

  1. On the first server in the cluster, deploy the StorReduce docker image by running:

    sudo storreducectl server init
    
  2. Follow the wizard to allocate ports and other configuration settings for StorReduce operation. If you change the ports from the defaults, please update the firewall accordingly. The output of the wizard should look like this:

    Using default config
    Running with args: []string{}
    
    Enter the IP address that other StorReduce servers will use to connect to this server (--cluster_listen_interface): 172.31.0.197
    Enter the beginning of the port range that other servers will connect to this server on. The port range will cover the specifed port plus the next 3 ports (--cluster_listen_port) [8095]:
    Enter the ETCD client port that other servers will connect to this server on (--config_server_client_port) [2379]:
    Enter the ETCD peer port that other servers will connect to this server on (--config_server_peer_port) [2380]:
    Enter the S3 HTTP port that S3 clients will connect to this server on (--http_port) [80]:
    Enter the S3 HTTPS port that S3 clients will connect to this server on (--https_port) [443]:
    Enter the Admin API HTTPS port that StorReduce admins will connect to this server on (--admin_port) [8080]:
    Please enter the number of shards. This CANNOT be changed later (--dev_n_shards) [36]:
    Please enter the number of shard replicas. This CAN be changed later (--n_shard_replicas) [2]: 1
    
    StorReduce will be configured with the following settings:
    
    ### Cluster ###
    
    Shard Count (--dev_n_shards): 36
    Shard Replica Count (--n_shard_replicas): 1
    
    ### Network ###
    
    IP Address (--cluster_listen_interface): 172.31.0.197
    Cluster Port #1 (--cluster_listen_port): 8095
    Cluster Port #2: 8096
    Cluster Port #3: 8097
    Cluster Port #4: 8098
    ETCD Client Port (--config_server_client_port): 2379
    ETCD Peer Port (--config_server_peer_port): 2380
    HTTP Port (--http_port): 80
    HTTPS Port (--https_port): 443
    Admin API HTTPS Port (--admin_port): 8080
    
    ### Host Directories ###
    
    SRDB: /mnt/srdb
    SRKEYDB: /mnt/srkeydb
    Lib: /var/lib/storreduce
    Run: /var/run
    Etc: /etc/storreduce
    Log: /var/log/storreduce
    
    WARNING Please confirm you wish to initialize StorReduce
    Please type "init storreduce" and push enter to proceed: init storreduce
    
  3. Browse to the server’s admin dashboard, e.g. https://<SERVER-IP-ADDRESS>:8080, complete the full server setup, and save the settings.

  4. Wait for the server to restart, then log back in and navigate to the Cluster tab.

  5. Copy the Cluster Discovery Token that appears (referred to in subsequent steps as <CLUSTERTOKEN>).

[Screenshot: Cluster Discovery Token]

Expand the Cluster by Adding the Other Servers

  1. On all the other servers that have not yet been configured, run:

    sudo storreducectl server init <CLUSTERTOKEN>
    
  2. Follow the wizard to allocate ports and other configuration settings for StorReduce operation. If you change the ports from the defaults, please update the firewall accordingly.

Rebalance the cluster

After all servers have been added to the cluster, click the Rebalance Cluster button on the Cluster tab of the StorReduce Dashboard, or SSH to any server and run:

    storreducectl cluster rebalance -d

Wait a few moments and the cluster should be fully functional.


  1. We use the ext2 filesystem because it offers the best performance with memory-mapped databases.

  2. The noatime option reduces the number of writes to the filesystem by disabling updates to the last-accessed-time attribute on files in the filesystem.