Metadata

This guide explains how to add and edit metadata on Pachyderm objects: clusters, projects, repositories, branches, and commits. Metadata adds context to your data, which makes it easier to discover, manage, track, and use.

Before you start

You must have the following permissions on a given object to manage its metadata:

  • Cluster: CLUSTER_EDIT_CLUSTER_METADATA (Included in the clusterAdmin role)
  • Repo: REPO_WRITE (Included in the repoOwner and repoWriter roles)
  • Branch: REPO_CREATE_BRANCH (Included in the repoOwner and repoWriter roles)
  • Project & Commit: No special permissions required.

Derived metadata

Projects, pipelines, branches, and commits automatically generate the following metadata:

  • created_at: The timestamp when the object was created.
  • created_by: The user who created the object.
  • updated_at: The timestamp when the object was last updated.
  • updated_by: The user who last updated the object.

How to add metadata to objects

To manage metadata for various Pachyderm objects, use the pachctl edit metadata command. This command supports adding, editing, deleting, and replacing metadata entries.

  1. Open a terminal window.
  2. Verify that you are using the correct context and that the desired project is active:
    pachctl config get context $(pachctl config get active-context)
    {
      "pachd_address":  "grpc://127.0.0.1:80",
      "cluster_deployment_id":  "mUf3ryBk95WZKpcp3MBwOfOJWyduFajJ",
      "project":  "video-to-frame-traces"
    }
  3. Switch the active context or the active project if necessary:
    pachctl config set active-context <contextName>
    pachctl config update context --project foo
  4. Run the following command to manage metadata for the desired object:
    pachctl edit metadata <object type> <object picker> <operation> <data>
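
Steps 3 and 4 can be combined in a small wrapper. The following is a minimal sketch, not part of pachctl: the function name add_repo_metadata and its argument order are hypothetical, and it uses only the two commands shown in the steps above.

```shell
# Hypothetical helper (not part of pachctl): make <project> the active
# project, then add one key=value pair to a repo in that project.
add_repo_metadata() {
  project="$1"
  repo="$2"
  pair="$3"
  # Step 3: point the active context at the desired project.
  pachctl config update context --project "$project" || return 1
  # Step 4: the picker names no project, so the active project is used.
  pachctl edit metadata repo "$repo" add "$pair"
}

# Example call:
#   add_repo_metadata myproject myrepo source=some-external-datasource
```

Note that the wrapper changes the active project as a side effect, so later pachctl commands in the same context also target that project.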

About pachctl edit metadata

See the CLI Command Reference for more information on the pachctl edit metadata command.

Parameters

  • object type: Type of the object (e.g., cluster, project, repo, branch, commit).
  • object picker: A selector used to pick the object. The format varies depending on the object type.
  • operation: Type of operation (add, edit, delete, replace) to perform on the metadata.
  • data: The metadata to be manipulated. This could be a key-value pair or a JSON object, depending on the operation.

Unspecified Projects

If a project is not specified in the repo or branch object picker, the current active project is assumed.

Operations

  • Add: Inserts a new key-value pair. Fails if the key already exists.
  • Edit: Modifies the value of an existing key. If the key does not exist, it will be created.
  • Delete: Removes a key-value pair from the metadata.
  • Replace: Overwrites the entire metadata with new content.
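
Because add fails when the key already exists, it can act as a guard against overwriting a value by accident. The sketch below assumes that pachctl returns a non-zero exit status when an operation fails. That assumption is not confirmed by this guide, and add_if_absent is a hypothetical helper name.

```shell
# Hypothetical helper: set a key only if it is not already present.
# Assumption: pachctl exits with a non-zero status when `add` fails.
add_if_absent() {
  object_type="$1"
  picker="$2"
  pair="$3"
  if pachctl edit metadata "$object_type" "$picker" add "$pair"; then
    echo "added $pair"
  else
    # ${pair%%=*} strips everything from the first "=" to leave the key.
    echo "not added: ${pair%%=*} may already exist on $object_type $picker" >&2
    return 1
  fi
}

# Example call:
#   add_if_absent cluster . environment=production
```

Use edit instead when overwriting an existing value is the intended behavior.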

Examples

Cluster

pachctl edit metadata cluster . add environment=production
pachctl edit metadata cluster . edit environment=development
pachctl edit metadata cluster . delete environment

Project

pachctl edit metadata project myproject add support_contact=you@example.com
pachctl edit metadata project myproject replace contact=you@example.com

Repo

pachctl edit metadata repo myproject/myrepo add source="some-external-datasource"
pachctl edit metadata repo myproject/myrepo delete source

Branch

TARGET=myproject/myrepo@dev
METADATA='"spaced key"="two\nlines"'
pachctl edit metadata branch "$TARGET" add "$METADATA"
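
The key in this example contains a space, so the variable must be expanded inside double quotes to reach pachctl as a single argument. The short demonstration below runs without pachctl. count_args is a hypothetical helper that only reports how many arguments it received, and the counts shown apply to sh and bash (zsh does not split unquoted variables by default).

```shell
METADATA='"spaced key"="two\nlines"'

# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

count_args $METADATA      # unquoted: sh and bash split on the space, prints 2
count_args "$METADATA"    # quoted: the pair arrives as one argument, prints 1
```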

Multiple commits

pachctl edit metadata commit images@master add verified_by=you@example.com \
commit edges@master add verified_by=you@example.com
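
When the same pair goes onto many commits, the repeated "commit <picker> add <data>" groups can be generated instead of typed. This is a minimal sketch in POSIX shell. add_to_commits is a hypothetical helper, and it builds the same multi-object invocation shown above.

```shell
# Hypothetical helper: add the same key=value pair to several commits
# with a single pachctl invocation.
add_to_commits() {
  pair="$1"
  shift
  [ "$#" -gt 0 ] || return 1   # nothing to do without at least one commit
  # Replace each commit picker with the group "commit <picker> add <pair>".
  # The loop list is expanded once, so changing "$@" inside the loop is safe.
  for picker in "$@"; do
    set -- "$@" commit "$picker" add "$pair"
    shift
  done
  pachctl edit metadata "$@"
}

# Example call:
#   add_to_commits verified_by=you@example.com images@master edges@master
```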

Demo

The following demo shows how to add metadata to the raw_videos_and_images repo from the beginner tutorial project.

How to read metadata

You can view an object's metadata by running the pachctl inspect <object type> command with the --raw flag, or with --raw --output=yaml for YAML output.

pachctl inspect project myproject --raw
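
After an edit, a script can check that a key is present by searching the raw output. The sketch below is a coarse text match, not a structured query. has_metadata_key is a hypothetical helper, and it assumes the key is printed as a double-quoted string in the --raw output, so it would also match the same text appearing elsewhere in the output.

```shell
# Hypothetical helper: succeed if <key> appears in the raw output for an object.
# Assumption: keys are printed as double-quoted strings in the --raw output.
has_metadata_key() {
  object_type="$1"
  picker="$2"
  key="$3"
  pachctl inspect "$object_type" "$picker" --raw | grep -qF "\"$key\""
}

# Example call:
#   has_metadata_key project myproject support_contact && echo "key found"
```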