Metadata Annotation
In addition to updating object metadata directly (via COPY), Swarm supports appending metadata to existing objects without altering the original. This provides a method to extend the metadata of immutable objects, including historical versions, because each object's create date, original metadata, and version sequence remain undisturbed. Annotations also provide an additional method for finding and managing objects, such as storing S3 object-level ACLs for the Gateway to enforce.
Important
Swarm cannot be downgraded to an earlier version once this feature is used.
Benefits - Keeping metadata annotation separate from the object itself provides several advantages:
Add helpful metadata without changing the object’s create date, original metadata, and version sequence.
Retrieve objects as originally written, so applications can distinguish between what was original and what was added later.
Annotate immutable objects.
Annotate historical versions of objects, independent of the current version of the object. This is especially important when the metadata is derived from analysis performed on the data, which changes from version to version, or when capturing information about specific versions.
Note
This lightweight implementation of annotation does not rely on the annotator and the target object interacting, and the objects do not operate as a pair. For example, there is no single request that returns both objects' headers, and there is no method to merge and resolve conflicts between them.
Using the SCSP COPY operation is the recommended way to update the metadata on an object so that the metadata is directly available in listings. This COPY operation is efficient even for large objects, assuming they are erasure-coded, because only the manifest stream is copied. For S3, use the PUT with copy operation, specifying the same source and destination.
Annotation Cleanup
There are two key features of this annotation method: (1) validation that target objects exist before annotations are written, and (2) the Health Processor's automated tracking and cleanup of annotation objects after the target object is removed. An annotated target object may be removed from Swarm in one of several ways:
SCSP Delete
SCSP Write (invalidating the old version)
Lifepoint Delete
Recursive delete of a parent context (domain or bucket)
Note
The Health Processor logs a “DECORATION DELETE” AUDIT-level message when purging an annotation during garbage collection. Annotation objects "decorate" a targeted content object.
Regardless of the type of Swarm object annotated (named, alias, immutable, historical version) and the protection type (replicated or erasure-coded), metadata annotation operates largely the same way:
Swarm deletes the orphaned annotation during garbage collection if an annotation is created and the target object is later deleted.
The target object is completely unaffected if an annotation is created and later deleted.
For named objects only, Swarm replaces the annotation object with a delete marker.
Swarm deletes both recursively if deleting a domain or a bucket containing both the original object and the annotation.
When updating a versioned object, create separate annotations for any historical versions; when a version is deleted, Swarm deletes its orphaned annotation during garbage collection.
If an annotation on a versioned object is created and later deleted, the outcome depends on the position in the version chain:
Historical versions: Swarm removes the annotation.
Current versions: Swarm replaces the annotation object with a delete marker.
Creating Annotations
Metadata annotation makes use of a persisted header, Castor-System-Decorates, whose value is the ETag of the target object the annotation object is extending (decorating). If this header is present, the object is an annotation object, subject to special Health Processor management. The header is valid for all Swarm object types (immutable, alias, and named), but not for context objects (domains and buckets). Both the annotator (decorator) and the annotated target object may be versioned.
To create an annotation, create an object that points to the ETag of the target and includes the custom metadata to be added, such as GPS coordinates extracted from an existing, uploaded photo:
Extending Metadata with Post-Processed Data
Content-Length: 0
Castor-System-Decorates: 9282727ffcca3a09e0843281aafc13af
X-GPS-Meta-Longitude: 36; 16; 48.36000000000589
X-GPS-Meta-Latitude: 115; 10; 20.79299999981990
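As a minimal sketch of how an application might assemble the headers shown above, the following Python builds (but does not send) the header set for a zero-length annotation object. The function name build_annotation_headers and the 32-character hex check are assumptions made for illustration, based only on the ETag format visible in these examples; they are not part of the Swarm API.

```python
import re

# Assumption: ETags look like the 32-character hex strings in the examples.
_ETAG_RE = re.compile(r"^[0-9a-f]{32}$")


def build_annotation_headers(target_etag, metadata):
    """Return request headers for a zero-length annotation object.

    target_etag: ETag of the object being annotated (surrounding quotes tolerated).
    metadata: mapping of custom header names to the values to attach.
    """
    etag = target_etag.strip().strip('"').lower()
    if not _ETAG_RE.match(etag):
        raise ValueError(f"not a 32-character hex ETag: {target_etag!r}")

    headers = {
        "Content-Length": "0",            # the annotation carries no body here
        "Castor-System-Decorates": etag,  # marks this object as an annotation
    }
    for name, value in metadata.items():
        # Refuse metadata that would overwrite the two headers set above.
        if name.lower() in ("content-length", "castor-system-decorates"):
            raise ValueError(f"reserved header in metadata: {name}")
        headers[name] = str(value)
    return headers


headers = build_annotation_headers(
    '"9282727ffcca3a09e0843281aafc13af"',
    {
        "X-GPS-Meta-Longitude": "36; 16; 48.36000000000589",
        "X-GPS-Meta-Latitude": "115; 10; 20.79299999981990",
    },
)
for name, value in headers.items():
    print(f"{name}: {value}")
```

The resulting dictionary can be passed to whichever HTTP client the application already uses for its SCSP writes.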
Searching for Annotations
In the annotation (decorator) object’s Elasticsearch record, the Castor-System-Decorates header value is indexed under the key decorates, and the Elasticsearch configuration templates include the decorates field. Most Swarm queries return this value, if present, as part of the results.
Query argument - Use a “decorates=<uuid>” query argument in Swarm listing queries to find annotation objects for a given ETag (or earlier query result “hash”).
See Listing Operations.
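The following Python is a minimal sketch of adding the decorates argument to an existing listing query URL. The helper name add_decorates_filter is hypothetical, and the base URL is only a placeholder: a real listing query needs whatever other arguments Listing Operations describes.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_decorates_filter(listing_url, target_etag):
    """Return listing_url with a decorates=<etag> query argument appended.

    Any decorates argument already present is replaced, and quotes around
    the ETag (as they appear in an Etag response header) are removed.
    """
    etag = target_etag.strip().strip('"')
    parts = urlsplit(listing_url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "decorates"
    ]
    query.append(("decorates", etag))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder base URL; the ETag is the one returned for the target object.
url = add_decorates_filter(
    "http://swarm.example.com/?domain=swarm.example.com",
    '"681b2470307b9260fb83542903e51828"',
)
print(url)
```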
Sample Scenario for Annotations
Suppose a company needs to store surveillance videos as immutable objects (as protection from tampering) in the domain "swarm.example.com". To add a video, use the normal POST, adding the Content-Type of the video and custom metadata for the video's start and end times, camera location, and camera model:
curl -i --location-trusted -X POST --post301 \
--data-binary @20170311-972-9928817883.mp4 \
-H "Expect: 100-continue" \
-H "x-example-meta-Start-Time: 2017-03-11T12:00:01.678Z" \
-H "x-example-meta-End-Time: 2017-03-11T13:00:00.421Z" \
-H "x-example-meta-Building: Annex 2" \
-H "x-example-meta-Location: 972" \
-H "x-example-meta-CameraModel: SWDSK-850004A-US" \
-H "Content-Type: video/mp4" \
-H "Content-Disposition: inline" \
"http://swarm.example.com/"
HTTP/1.1 201 Created
Location: http://192.168.1.11:80/e970b3280d5501571c8c6fe9d6838557?domain=swarm.example.com
Location: http://192.168.1.12:80/e970b3280d5501571c8c6fe9d6838557?domain=swarm.example.com
Volume: b3381183a1cfc620d960db3eae1d086d
Volume: 604a44d1a351045553b5481391af0810
Manifest: ec
Content-UUID: e970b3280d5501571c8c6fe9d6838557
Last-Modified: Tue, 28 Mar 2017 19:19:48 GMT
Castor-System-Encoding: zfec 1.4(2, 1, 524288, 200000000)
Castor-System-Version: 1490728788.934
Etag: "681b2470307b9260fb83542903e51828"
Replica-Count: 2
Date: Tue, 28 Mar 2017 19:22:19 GMT
Server: CAStor Cluster/9.2.0
Content-Length: 46
Content-Type: text/html
Keep-Alive: timeout=14400
<html><body>New stream created</body></html>
To verify the video is successfully stored, use a HEAD command:
curl --head --location-trusted "http://swarm.example.com/e970b3280d5501571c8c6fe9d6838557"
HTTP/1.1 200 OK
Castor-System-CID: 7e7fd5d747d244726af93c726672408b
Castor-System-Cluster: swarm.example.com
Castor-System-Created: Tue, 28 Mar 2017 19:19:48 GMT
Content-Disposition: inline
Content-Type: video/mp4
Last-Modified: Tue, 28 Mar 2017 19:19:48 GMT
x-example-meta-Building: Annex 2
x-example-meta-CameraModel: SWDSK-850004A-US
x-example-meta-End-Time: 2017-03-11T13:00:00.421Z
x-example-meta-Location: 972
x-example-meta-Start-Time: 2017-03-11T12:00:01.678Z
Manifest: ec
Content-Length: 1500964975
Etag: "681b2470307b9260fb83542903e51828"
Castor-System-Domain: swarm.example.com
Volume: b3381183a1cfc620d960db3eae1d086d
Date: Tue, 28 Mar 2017 19:24:25 GMT
Server: CAStor Cluster/9.2.0
Keep-Alive: timeout=14400
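The Etag value in this response is what an annotation would later reference in its Castor-System-Decorates header. As a minimal sketch, the following Python pulls that value out of a raw response header block using only the standard library. It assumes the header takes the unquoted value, as the earlier Castor-System-Decorates example suggests; the function name etag_from_head_response is hypothetical.

```python
from email.parser import Parser  # stdlib parser for RFC 822 style header blocks


def etag_from_head_response(raw_response):
    """Return the unquoted ETag from a raw HTTP response header block."""
    # Drop the status line ("HTTP/1.1 200 OK"); the rest is name: value pairs.
    _status_line, _, header_text = raw_response.partition("\n")
    headers = Parser().parsestr(header_text, headersonly=True)
    etag = headers.get("ETag")  # header name lookup is case-insensitive
    if etag is None:
        raise KeyError("response has no ETag header")
    return etag.strip().strip('"')


# Abbreviated copy of the HEAD response shown above.
sample = """HTTP/1.1 200 OK
Content-Type: video/mp4
Manifest: ec
Etag: "681b2470307b9260fb83542903e51828"
Castor-System-Domain: swarm.example.com
"""
target_etag = etag_from_head_response(sample)
print(target_etag)
```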
The custom metadata is what makes it possible and practical to identify video of interest. Suppose an incident occurs in the Annex 2 building. Search for immutable video taken at Annex 2 during the time span to find surveillance video relevant to the investigation:
The search correctly finds a video of interest: e970b3280d5501571c8c6fe9d6838557
Adding Metadata Annotation
With the video stored securely, suppose the organization also needs to run an application to perform facial recognition on the video. An application generates data when it is run, including both information on the algorithm/settings and the detailed results. The original video object must remain read-only to serve as evidence, so the derived data and metadata must be stored with a method associating it with the original object without altering it.
The solution is to annotate the video with a decoration object (which can be named or unnamed) to associate the results with the original video.
To find any annotations that hold facial recognition results for the original object, search for objects that decorate the video, and qualify the search to look for facial recognition results:
The search correctly finds an annotation: 0cb2d9e90a3341b10bc9dba2
© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.