Cloud Deduplication, On-Demand: StorReduce, an APN Technology Partner

StorReduce Case Study

Who is StorReduce?

StorReduce, an APN Technology Partner, enables enterprises that store unstructured data in Amazon Simple Storage Service (Amazon S3) or Amazon Glacier on Amazon Web Services (AWS) to reduce their storage, typically by 50 to 97 percent. It also offers enterprises a new and more efficient way to migrate backup appliance data and large tape archives to AWS.

StorReduce’s deduplication software runs as an instance in the cloud or as a virtual machine in a datacenter and scales to petabytes of data. It removes redundant blocks of data before they are stored, so that only one copy of each block is kept, thereby reducing the amount and cost of cloud storage by up to 95 percent. StorReduce provides throughput of 10 GB/s (and beyond when more servers are added to a cluster) and ensures scalability and durability for your data. It is suitable for deduplicating most data workloads, such as backup, archive, data from mobile and wearable devices where the data is copied, and general unstructured file data.
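The idea of keeping only one copy of each block can be illustrated with a small sketch. This is not StorReduce's implementation: the fixed 4096-byte block size, the use of SHA-256 as the block fingerprint, and the `DedupStore` class are all assumptions chosen only to make the principle concrete.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative value, not a StorReduce parameter


class DedupStore:
    """Toy in-memory store that keeps one copy of each unique block."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = {}   # SHA-256 digest -> block bytes
        self.objects = {}  # object key -> ordered list of digests

    def put(self, key, data):
        """Split data into blocks and store only blocks not seen before."""
        digests = []
        for start in range(0, len(data), self.block_size):
            block = data[start:start + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # skip if already stored
            digests.append(digest)
        self.objects[key] = digests

    def get(self, key):
        """Rebuild the original object from its list of block digests."""
        return b"".join(self.blocks[d] for d in self.objects[key])

    def stored_bytes(self):
        """Bytes physically held after deduplication."""
        return sum(len(block) for block in self.blocks.values())


store = DedupStore()
monday = b"A" * 4096 + b"B" * 4096
tuesday = monday + b"C" * 4096  # mostly a copy of Monday's backup
store.put("monday.bak", monday)
store.put("tuesday.bak", tuesday)

logical_bytes = len(monday) + len(tuesday)
print(f"logical: {logical_bytes} bytes, stored: {store.stored_bytes()} bytes")
```

In this toy case the two backups share two of their blocks, so only three unique blocks are kept for five blocks written. Real backup and archive data sets with many repeated copies are where the larger reductions described in this case study come from.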

StorReduce has an Amazon S3 interface, so that any data it deduplicates can seamlessly be used by AWS services such as Amazon Elastic MapReduce (Amazon EMR) for data mining, and Amazon CloudSearch.

See below for an overview of how StorReduce works:

How StorReduce Works

StorReduce and AWS

StorReduce chose to work with AWS because of its extensive range of enterprise cloud services. Storage services such as Amazon S3 and Amazon Glacier, and the ecosystem of tools and services that integrate with them, are important for the enterprise workloads with which StorReduce works. The global AWS footprint was another important factor, along with AWS’s commitment to reducing the cost of cloud for our customers.

For the StorReduce team, AWS is a natural choice for enterprises migrating to a public or hybrid cloud environment and for high-growth companies born in the cloud. StorReduce chose an Amazon S3 compatible interface because it offers a simple integration point for its customers. The interface allows any application that communicates with Amazon S3 to take advantage of StorReduce for deduplication without modification. This includes third-party products that copy data to and from Amazon S3, as well as AWS services like Amazon EMR and Amazon CloudSearch.
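As a rough illustration of what "without modification" can mean in practice, an application built on the AWS SDK for Python (boto3) can usually be pointed at any S3 compatible service by changing only the endpoint URL. The endpoint `https://dedup.example.internal` and the helper names below are hypothetical placeholders, not StorReduce values, and boto3 is imported lazily so the helper can be inspected without the SDK installed.

```python
def s3_client_kwargs(endpoint_url, region_name="us-east-1"):
    """Build keyword arguments for boto3.client("s3", ...).

    endpoint_url is the address of the S3 compatible service; the
    application's bucket and object calls stay exactly the same.
    """
    return {"endpoint_url": endpoint_url, "region_name": region_name}


def make_s3_client(endpoint_url):
    """Create an S3 client that talks to the given endpoint."""
    import boto3  # requires the boto3 package and configured credentials

    return boto3.client("s3", **s3_client_kwargs(endpoint_url))


# Hypothetical endpoint, used here only as a placeholder.
kwargs = s3_client_kwargs("https://dedup.example.internal")
print(kwargs)
```

A client created this way would then be used with the ordinary S3 calls, for example `client.put_object(Bucket=..., Key=..., Body=...)`, with the application code otherwise unchanged.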

Who is Tape Ark?

Tape Ark operates globally and is one of the world’s largest independent data management companies, and the largest in the Southern Hemisphere, specialising in tape migration to the cloud. The company is highly experienced in all aspects of data management, in particular the restoration, migration and preservation of digital assets from legacy media and redundant, out-dated tape and recording technologies. Tape Ark offers a free service to migrate large scale tape archives from physical storage to the cloud, which is set to disrupt the physical tape storage industry.

Deduplication to the Cloud - The Challenge

Tape Ark needed to migrate its clients’ petabyte scale tape archives (tens of thousands of tapes) to Amazon S3 and Amazon Glacier storage solutions. To reduce the cost of storage and the bandwidth required to transfer the tape data to AWS, Tape Ark chose to deduplicate the data. Tape archives generally contain multiple copies of the same data sets, which can be reduced to a single copy with deduplication. This reduces the amount of data stored to between one half and one twentieth of its original size.

According to Guy Holmes, Founder and CEO of Tape Ark, “It is virtually impossible to migrate large tape archives to the cloud using existing on-premise deduplication offerings because they do not scale. We can only put four tapes at a time through their hardware before we start to see a bottleneck forming. In order to upload large tape archives to cloud in weeks not years, we need to put hundreds of tapes at a time through the hardware 24 hours per day.”

Why StorReduce

For tape migration, StorReduce’s software can be installed on-premise for a CAPEX-free, very fast migration of an enterprise’s large tape archives and backup appliance data onto the AWS Cloud. Installing StorReduce on-premise minimizes the bandwidth required during the transfer. See below:

Transferring data with StorReduce

After the transfer is completed, the on-premise StorReduce software can be removed and re-instated in the cloud:

Re-instate StorReduce in the cloud

The Benefits of Working with StorReduce and AWS

For Tape Ark, the global footprint of AWS made working with AWS a natural choice. The AWS footprint allows Tape Ark to store data in close proximity to its customers no matter where they are in the world. This improves performance by reducing latency and allows Tape Ark and its customers to comply with data sovereignty laws. Another reason the company decided to work with AWS is its pay-as-you-go pricing model. Tape Ark pays for exactly the resources it uses, and there’s no need to estimate capacity or to make an upfront investment.

After AWS introduced Tape Ark to StorReduce, Holmes believed that StorReduce could overcome the challenges Tape Ark faced with its on-premise deduplication hardware.

Tape Ark conducted a proof of concept with StorReduce, running the same tests on the same data that it had previously run with a leading global deduplication hardware vendor. Holmes confirmed, “We’re delighted with StorReduce’s performance. The software deduplicates 24 / 7 and is more scalable than the hardware appliances we tested. These factors help us to achieve the necessary throughput for our clients. It also showed deduplication ratios trending to over 95 percent, which is equal to the leading global deduplication offerings we have tested.”

StorReduce enables Tape Ark to migrate large tape archives to AWS far more efficiently than the hardware appliances that were tested, reducing years of work to weeks. The deduplication also reduces cloud storage costs by up to 95 percent, decreasing a potential client’s storage cost on Amazon Glacier, for example, from $120,000 per month to less than $10,000 per month.
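The arithmetic behind that example can be checked with a short calculation. It is a sketch that assumes storage cost is directly proportional to the amount of data stored and ignores any other charges; the function names are illustrative.

```python
def monthly_cost_after_dedup(cost_before, reduction_percent):
    """Monthly cost once deduplication removes reduction_percent of the data.

    Assumes cost scales linearly with the amount of data stored.
    """
    if not 0 <= reduction_percent <= 100:
        raise ValueError("reduction_percent must be between 0 and 100")
    return cost_before * (100 - reduction_percent) / 100


def reduction_needed(cost_before, cost_target):
    """Percentage reduction required to bring cost_before down to cost_target."""
    return 100 * (1 - cost_target / cost_before)


before = 120_000  # dollars per month, from the example above
after = monthly_cost_after_dedup(before, 95)
print(f"${after:,.0f} per month after a 95 percent reduction")
print(f"{reduction_needed(before, 10_000):.1f} percent needed to reach $10,000")
```

Under these assumptions, a 95 percent reduction brings $120,000 per month down to $6,000, which is consistent with the "less than $10,000" figure; any reduction above roughly 92 percent would get below that threshold.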

Additional benefits are:

  • StorReduce removes the tens to hundreds of thousands of dollars in CAPEX that would otherwise need to be spent on deduplication hardware.

  • With StorReduce, once the tape data has been migrated to the cloud, it is seamlessly accessible through the Amazon S3 API. Therefore, any existing AWS cloud services like Amazon CloudSearch and Amazon EMR can be easily used on that data. This is extremely challenging with on-premise deduplication offerings.

  • As the client’s data grows, StorReduce can quickly scale to meet their needs with no need to buy additional hardware.

Holmes concludes, “Working with StorReduce and AWS makes my business work.”

To learn more about how AWS can help with your storage and backup needs, visit our Storage and Backup details page:

To learn more about how Tape Ark can help with your tape migration needs, see, call +61 1300 660 982 or Email

Try StorReduce on AWS Marketplace now with one click to see how much you could save. To learn more about how StorReduce can migrate your tape archive or backup appliance data to the AWS Cloud, see, call +1 408 769 6118 or email