Content Verification service processing

The Content Verification service has two main functions: detecting corrupted data and discrepancies in metadata, and repairing that data and metadata.

Detecting content verification violations

To detect corrupted data, the Content Verification service regenerates the cryptographic hash values for each object. After regenerating the hash values, the Content Verification service checks that these regenerated values match the corresponding values in the primary metadata.
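The check described above can be sketched as follows. This is a minimal illustration, not HCP code: the function names, the dictionary layout of the primary metadata, and the keys hash_algorithm and hash_value are all assumptions made for the example.

```python
import hashlib


def regenerate_hash(data: bytes, algorithm: str = "sha256") -> str:
    """Recompute the cryptographic hash of the object data."""
    return hashlib.new(algorithm, data).hexdigest()


def data_is_intact(data: bytes, primary_metadata: dict) -> bool:
    """Compare the regenerated hash with the value held in primary metadata.

    The keys "hash_algorithm" and "hash_value" are hypothetical names.
    """
    algorithm = primary_metadata.get("hash_algorithm", "sha256")
    return regenerate_hash(data, algorithm) == primary_metadata["hash_value"]
```

A mismatch between the two values is what the service treats as a violation of the first type.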

The Content Verification service detects metadata discrepancies by checking that certain secondary metadata for each object matches the primary metadata for the object.

A violation occurs when either of these checks fails. Violations of the second type (metadata discrepancies) are not reported in the system log.
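As a rough sketch of the metadata check, the function below lists the fields whose secondary value differs from the primary value. The source does not say which secondary metadata is compared, so the field names in the usage note and the dictionary representation are hypothetical.

```python
def find_metadata_discrepancies(primary: dict, secondary: dict,
                                checked_fields: tuple) -> list:
    """Return the checked fields whose secondary value differs from primary.

    checked_fields stands in for the "certain secondary metadata" that the
    service compares; the actual set of fields is not specified here.
    """
    return [field for field in checked_fields
            if secondary.get(field) != primary.get(field)]
```

For example, with hypothetical fields ("owner", "retention"), a secondary record whose retention value differs from the primary record yields ["retention"], and an empty list means no violation of the second type.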

Note

When an object is stored through the CIFS or NFS protocol, its primary metadata does not initially include cryptographic hash values that are based on the object data. HCP waits several minutes to ensure that the object content is complete before calculating these values. Large objects stored through these protocols may take longer to get hash values than smaller objects do.

If the Content Verification service encounters primary metadata without hash values, it adds the regenerated values to it.
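The behavior in this note might look like the sketch below, which treats a missing hash as a value to fill in rather than as a violation. The function name, the return strings, and the hash_value key are assumptions for illustration only.

```python
import hashlib


def verify_or_backfill(data: bytes, primary_metadata: dict,
                       algorithm: str = "sha256") -> str:
    """Verify the stored hash, or add it if primary metadata has none yet.

    Returns "backfilled", "ok", or "violation" (hypothetical labels).
    """
    regenerated = hashlib.new(algorithm, data).hexdigest()
    if primary_metadata.get("hash_value") is None:
        # No hash yet (for example, an object recently stored through
        # CIFS or NFS): record the regenerated value.
        primary_metadata["hash_value"] = regenerated
        return "backfilled"
    if primary_metadata["hash_value"] == regenerated:
        return "ok"
    return "violation"
```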

Repairing content verification violations

If the Content Verification service finds a discrepancy between the cryptographic hash values it regenerates for an object and the corresponding hash values in the primary metadata, it creates a new copy of the object from an existing good copy and marks the corrupted copy for deletion.

If replication is in effect and the Content Verification service cannot find a good copy of the object in the current repository, it can repair the object by using a copy from another HCP system in the replication topology.
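The repair order described above (a good local copy first, then a copy from another system in the replication topology) can be sketched as follows. Copies are modeled as byte strings and SHA-256 is assumed as the hash algorithm; the function names and the return shape are hypothetical.

```python
import hashlib
from typing import Optional


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def first_good_copy(copies: list, expected_hash: str) -> Optional[bytes]:
    """Return the first copy whose hash matches primary metadata, if any."""
    for data in copies:
        if sha256_hex(data) == expected_hash:
            return data
    return None


def repair_copies(local_copies: list, remote_copies: list, expected_hash: str):
    """Return (source, new_copies, marked_for_deletion), or None if no good
    copy can be found locally or on a replication partner."""
    good, source = first_good_copy(local_copies, expected_hash), "local"
    if good is None:
        good, source = first_good_copy(remote_copies, expected_hash), "remote"
    if good is None:
        return None  # the violation cannot be repaired
    # Indexes of corrupted local copies, which are marked for deletion.
    corrupted = [i for i, data in enumerate(local_copies)
                 if sha256_hex(data) != expected_hash]
    # One new copy is created from the good copy for each corrupted one.
    new_copies = [bytes(good) for _ in corrupted]
    return source, new_copies, corrupted
```

A result of None corresponds to the case in which the service cannot repair the violation.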

To repair a chunk for an erasure-coded object, the Content Verification service recalculates the chunk either by using a full copy of the object data, if one exists on another system in the replication topology, or by using the chunks for the object on all the other systems in the replication topology.
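The two recalculation paths can be illustrated with a toy example. The erasure-coding scheme that HCP uses is not described here, so this sketch substitutes single XOR parity (k data chunks plus one parity chunk) purely as a stand-in; encode, recalculate_chunk, and their parameters are hypothetical names.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list:
    """Split data into k equal data chunks (zero-padded) plus one XOR parity chunk."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    chunks = [padded[i * size:(i + 1) * size] for i in range(k)]
    chunks.append(reduce(xor_bytes, chunks))
    return chunks


def recalculate_chunk(index: int, k: int, full_copy=None, other_chunks=None) -> bytes:
    """Rebuild chunk `index` from a full copy if one exists, else from the other chunks."""
    if full_copy is not None:
        # Path 1: re-encode the full object data and take the needed chunk.
        return encode(full_copy, k)[index]
    # Path 2: with single XOR parity, the missing chunk is the XOR of
    # all the other chunks.
    return reduce(xor_bytes, other_chunks)
```

In this toy scheme the second path needs every other chunk, which mirrors the requirement above that the chunks on all the other systems be used.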

If the Content Verification service finds a discrepancy between the secondary metadata it checks for an object and the corresponding primary metadata, it replaces the secondary metadata with the values from the primary metadata.
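A minimal sketch of this repair, assuming metadata is held in dictionaries and that checked_fields names the fields the service compares (both are assumptions for illustration): the primary value always wins.

```python
def repair_secondary_metadata(primary: dict, secondary: dict,
                              checked_fields: tuple) -> dict:
    """Return a copy of secondary metadata with checked fields taken from primary."""
    repaired = dict(secondary)
    for field in checked_fields:
        if field in primary:
            repaired[field] = primary[field]
        else:
            # The field is absent from primary metadata, so drop it.
            repaired.pop(field, None)
    return repaired
```

Fields outside checked_fields are left untouched, and the input dictionaries are not modified.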

Unavailable and irreparable objects

When the Content Verification service cannot repair a violation, it marks the object as either unavailable or irreparable:

  • An object is unavailable if all of these are true:
    • At least one copy of the object is unavailable because a node, logical volume, or extended storage device is unavailable.
    • None of the available copies of the object are good.
    • Either the namespace that contains the object is not being replicated, or all copies of the object data on other systems in the replication topology are either inaccessible or not good.
  • An object is irreparable if all of these are true:
    • All of the primary storage volumes, NFS volumes, and extended storage devices on which copies of the object data are stored are available.
    • None of the copies of the object data are good.
    • Either the namespace that contains the object is not being replicated, or all copies of the object data on other systems in the replication topology are either inaccessible or not good.
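The two sets of conditions above differ only in whether all of the storage holding the copies is available. The sketch below encodes that decision; the Status values, the CopyState fields, and the classify parameters are hypothetical names, and REPAIRABLE is added only to cover the cases in which a repair is still possible.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    REPAIRABLE = "repairable"
    UNAVAILABLE = "unavailable"
    IRREPARABLE = "irreparable"


@dataclass
class CopyState:
    storage_available: bool  # node, volume, or device holding the copy is up
    good: bool               # meaningful only when storage_available is True


def classify(local_copies: list, namespace_replicated: bool,
             remote_good_copy_accessible: bool) -> Status:
    available = [c for c in local_copies if c.storage_available]
    if any(c.good for c in available):
        return Status.REPAIRABLE  # a good local copy exists
    if namespace_replicated and remote_good_copy_accessible:
        return Status.REPAIRABLE  # a replication partner can supply a copy
    if len(available) < len(local_copies):
        # Some copy cannot be examined, so a good copy may still exist.
        return Status.UNAVAILABLE
    return Status.IRREPARABLE  # every copy was examined and none is good
```

For example, classify([CopyState(True, False), CopyState(False, False)], False, False) returns Status.UNAVAILABLE, because one copy sits on storage that cannot currently be checked.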