Duplicate Elimination service

Duplicate elimination is the process of merging the data associated with two or more identical objects. For objects to be identical, their data content must match exactly. By eliminating duplicates, HCP increases the amount of space available for storing additional objects.

For example, if the same document is added to several different directories, duplicate elimination ensures that the content of that document is stored in only one location in the repository rather than once for each directory. This saves the space that the additional copies of the document would otherwise consume.
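
The following is a minimal sketch, not HCP's implementation, of what exact-content duplicate elimination means in practice: identical bytes are detected (here with a SHA-256 digest plus a byte-for-byte check) and stored once, while each logical object keeps a reference to the single stored copy. The TinyDedupStore class and its methods are invented for this illustration.

  import hashlib

  class TinyDedupStore:
      """Toy content-addressed store that keeps one copy of identical content."""

      def __init__(self):
          self._blobs = {}     # SHA-256 digest -> the single stored copy of the content
          self._objects = {}   # logical path -> digest of its content

      def put(self, path: str, data: bytes) -> None:
          digest = hashlib.sha256(data).hexdigest()
          blob = self._blobs.get(digest)
          if blob is None:
              # First time this content is seen: store the one and only copy.
              self._blobs[digest] = data
          elif blob != data:
              # Guard the "must match exactly" rule against a (vanishingly
              # unlikely) digest collision.
              raise ValueError("digest collision: contents differ")
          # Either way, the logical object simply references the stored copy.
          self._objects[path] = digest

      def get(self, path: str) -> bytes:
          return self._blobs[self._objects[path]]

  store = TinyDedupStore()
  store.put("/reports/q1.pdf", b"same bytes")
  store.put("/archive/q1.pdf", b"same bytes")   # duplicate content is merged
  assert store.get("/reports/q1.pdf") == store.get("/archive/q1.pdf")
  print(f"{len(store._blobs)} stored copy backs {len(store._objects)} logical objects")

In this sketch the digest only nominates candidate duplicates; the byte-for-byte comparison is what enforces the requirement that content match exactly.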

For the purpose of duplicate elimination, HCP treats these as individual objects:

  • Parts of multipart objects
  • Chunks for erasure-coded objects
  • Chunks for erasure-coded parts of multipart objects
  • Full copies of the data for objects and parts that are subject to erasure coding before those copies are reduced to chunks

The Duplicate Elimination service does not merge:

  • Parts of in-progress multipart uploads
  • Parts of a multipart upload that have been replaced
  • Parts of an aborted multipart upload
  • Unused parts of completed multipart uploads
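
A minimal sketch of these eligibility rules, purely for illustration: the StoredItem type, its kind and upload_state fields, and the eligible_for_dedup function are invented here and are not part of any HCP interface.

  from dataclasses import dataclass

  @dataclass
  class StoredItem:
      kind: str           # "object", "part", or "chunk" in this sketch
      upload_state: str   # for parts: "active", "in_progress", "aborted", "replaced", or "unused"

  def eligible_for_dedup(item: StoredItem) -> bool:
      # Objects, parts, and erasure-coded chunks are all candidates, but parts
      # of in-progress or aborted uploads, replaced parts, and unused parts of
      # completed uploads are skipped.
      if item.kind == "part" and item.upload_state in {"in_progress", "aborted", "replaced", "unused"}:
          return False
      return True

  assert eligible_for_dedup(StoredItem(kind="chunk", upload_state="active"))
  assert not eligible_for_dedup(StoredItem(kind="part", upload_state="aborted"))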

The Duplicate Elimination service runs according to the active service schedule.

Note

The Duplicate Elimination service does not eliminate duplicate objects stored in namespaces whose service plans specify S Series storage devices as the ingest tier.