Cloud Deduplication, On-Demand: StorReduce, an APN Technology Partner

StorReduce Case Study

This case study originally appeared on the Amazon Partner blog.

Develop. Disrupt. Repeat.

At the end of the day, our main goal is to provide our APN Partners with the services, support, and resources they need to deliver value-added services and solutions to end customers on the AWS platform. We love hearing stories about the unique products our APN Technology Partners have developed that integrate with the AWS platform, and today we’re going to tell you about one such product from APN Technology Partner StorReduce.

Our Partner SA team has worked closely with StorReduce and wanted to share the story of its success with SpectrumData, an AWS Customer and fellow APN Partner.

Who is StorReduce?

StorReduce, an APN Technology Partner, enables enterprises storing unstructured data on Amazon Simple Storage Service (Amazon S3) or Amazon Glacier on Amazon Web Services (AWS) to reduce the amount of data they store, typically by 50-97%. It also offers enterprises a new and more efficient way to migrate backup appliance data and large tape archives to AWS.

StorReduce’s deduplication software runs as an instance in the cloud or as a virtual machine in a datacenter and scales to petabytes of data. Deduplication removes redundant blocks of data before they are stored, so that only one copy of each block is kept, reducing the amount and cost of cloud storage by up to 95 percent. StorReduce provides throughput of up to 600 MB/s for both reads and writes, and adds around 10 ms of latency on retrieval. StorReduce is suitable for deduplicating most data workloads, such as backup, archive, data from mobile and wearable devices where the data is copied, and general unstructured file data.
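To make the idea of storing only one copy of each block concrete, here is a minimal Python sketch of block-level deduplication. It is not StorReduce’s implementation, which this case study does not describe: the fixed block size, the use of SHA-256 to identify blocks, and the `DedupStore` class are all assumptions made purely for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny block size so the example is easy to follow


class DedupStore:
    """Toy content-addressed store: each unique block is kept exactly once."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = {}   # SHA-256 digest -> block bytes (stored once)
        self.objects = {}  # object key -> ordered list of block digests

    def put(self, key, data):
        """Split data into fixed-size blocks and store only unseen blocks."""
        digests = []
        for start in range(0, len(data), self.block_size):
            block = data[start:start + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # no-op if block already stored
            digests.append(digest)
        self.objects[key] = digests

    def get(self, key):
        """Rebuild the original data from its list of block digests."""
        return b"".join(self.blocks[d] for d in self.objects[key])

    def stored_bytes(self):
        """Bytes physically held after deduplication."""
        return sum(len(block) for block in self.blocks.values())


store = DedupStore()
full_backup = b"AAAABBBBCCCCDDDD"
next_backup = b"AAAABBBBCCCCEEEE"  # differs from full_backup only in its last block
store.put("backup-monday", full_backup)
store.put("backup-tuesday", next_backup)

logical_bytes = len(full_backup) + len(next_backup)
physical_bytes = store.stored_bytes()
```

In this sketch the two backups share three of their four blocks, so the store holds five unique blocks instead of eight, and `get` still returns each backup byte for byte. Repeated backups of largely unchanged data are the kind of workload where this effect is strongest.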

StorReduce has an Amazon S3 interface, so that any data it deduplicates can seamlessly be used by AWS services such as Amazon Elastic MapReduce (Amazon EMR) for data mining, and Amazon CloudSearch.

See below to get an idea of how StorReduce works:

How StorReduce Works

StorReduce and AWS

StorReduce chose to work with AWS because of AWS’s extensive range of enterprise cloud services. For instance, storage services like Amazon S3 and Amazon Glacier, and the ecosystem of tools and services that integrate with them, are important for the enterprise workloads with which StorReduce works. The global AWS footprint was another important factor for StorReduce in working with AWS, along with AWS’s commitment to reduce the cost of cloud for our customers.

For the StorReduce team, AWS is a natural choice for enterprises migrating to a public or hybrid cloud environment and for high growth companies born on the cloud. StorReduce chose the Amazon S3 compatible interface because it offers a simple integration point for its customers. The Amazon S3 compatible interface allows any application that communicates with Amazon S3 to take advantage of StorReduce for deduplication without modification. This includes third party products that copy data to and from Amazon S3, as well as AWS Services like Amazon EMR and Amazon CloudSearch.

Who is SpectrumData?

SpectrumData, an APN Technology Partner, operates globally. The company is one of the world’s largest independent data management companies and the largest in the Southern Hemisphere. It is highly experienced in all aspects of data management, in particular the restoration, migration and preservation of digital assets from legacy media and redundant, outdated tape and recording technologies.

Deduplication to the Cloud - The Challenge

SpectrumData needed to migrate its clients’ petabyte-scale tape archives (tens of thousands of tapes) to Amazon S3 and Amazon Glacier storage solutions. To reduce the cost of storage and the bandwidth required to transfer the tape data to AWS, SpectrumData chose to deduplicate the data. Tape archives generally contain multiple copies of the same data sets, which can be reduced to a single copy with deduplication. This reduces the amount of data stored to between 1/2 and 1/20th of its original size.

According to Guy Holmes, Director of SpectrumData, “It is virtually impossible to migrate large tape archives to cloud using existing on-premise deduplication offerings because they do not scale. We can only put four tapes at a time through their hardware before we start to see a bottleneck forming. In order to upload large tape archives to cloud in weeks not years, we need to put hundreds of tapes at a time through the hardware 24 hours per day.”

Why StorReduce

For tape migration, StorReduce’s software can be installed on-premise for a CAPEX-free, very fast migration of an enterprise’s large tape archives and backup appliance data onto the AWS Cloud. Installing StorReduce on-premise minimizes the bandwidth required during the transfer, because the data is deduplicated before it is sent. See below:

Transferring data with StorReduce

After the transfer is completed, the on-premise StorReduce software can be removed and re-instated in the cloud:

Re-instate StorReduce in the cloud

The Benefits of Working with StorReduce and AWS

For SpectrumData, the global footprint of AWS made working with AWS a natural choice. The AWS footprint allows SpectrumData to store data in close proximity to its customers no matter where they are in the world. This improves performance by reducing latency and allows SpectrumData and its customers to comply with data sovereignty laws. Another reason the company decided to work with AWS is the pay-as-you-go pricing model embraced by AWS. SpectrumData pays for exactly the resources they use, and there’s no need to estimate capacity or to make an upfront investment.

After AWS introduced SpectrumData to StorReduce, Holmes believed that StorReduce could overcome the challenges he faced with SpectrumData’s on-premise deduplication hardware.

SpectrumData conducted a proof of concept with StorReduce, running the same tests on the same data that it had previously run with a leading global deduplication hardware vendor. Holmes confirmed, “We’re delighted with StorReduce’s performance. The software deduplicates 24/7 and is more scalable than the hardware appliances we tested. These factors help us to achieve the necessary throughput for our clients. It also showed deduplication ratios trending to over 95 percent, which is equal to the leading global deduplication offerings we have tested.”

StorReduce enables SpectrumData to migrate large tape archives to AWS far more efficiently than the hardware appliances that were tested, reducing years of work to weeks. The deduplication also reduces cloud storage costs by up to 95 percent, decreasing a potential client’s storage cost on Amazon Glacier from $250,000 per month to less than $20,000 per month.
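The cost figures above can be checked with a little arithmetic. The Python sketch below assumes, for illustration only, that the monthly storage bill scales linearly with the amount of data stored and ignores any other charges. The function name `monthly_cost_after_dedup` is hypothetical; the dollar amounts and the 95 percent figure come from the case study.

```python
def monthly_cost_after_dedup(cost_before, reduction):
    """Monthly storage cost after removing a fraction `reduction` (0..1) of the data.

    Assumes cost is directly proportional to the amount of data stored.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return cost_before * (1.0 - reduction)


cost_before = 250_000  # dollars per month before deduplication (from the case study)

# Cost if the full 95 percent reduction is achieved.
cost_at_95_percent = monthly_cost_after_dedup(cost_before, 0.95)

# Smallest reduction that brings the bill down to $20,000 per month.
breakeven_reduction = 1.0 - 20_000 / cost_before
```

Under that linear assumption, a 95 percent reduction brings the bill to about $12,500 per month, and any reduction above 92 percent is enough to get below $20,000 per month, which is consistent with the figures quoted.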

Additional benefits are:

  • StorReduce removes the tens to hundreds of thousands of dollars in CAPEX that would otherwise need to be spent on deduplication hardware.

  • With StorReduce, once the tape data has been migrated to the cloud, it is seamlessly accessible through the Amazon S3 API. Existing AWS cloud services like Amazon CloudSearch and Amazon EMR can therefore be used easily on that data. This is extremely challenging with on-premise deduplication offerings.

  • As the client’s data grows, StorReduce can quickly scale to meet their needs with no need to buy additional hardware.

Holmes concludes, “Working with StorReduce and AWS makes my business work.”

To learn more about how AWS can help with your storage and backup needs, visit our Storage and Backup details page:

Try StorReduce on AWS Marketplace now with one click to see how much you could save. To learn more about how StorReduce can migrate your tape archive or backup appliance data to the AWS Cloud, see, call +1 408 769 6118 or email