Transform PPS

Set the name of the Docker image that your jobs use.

Spec

This is a top-level attribute of the pipeline spec.

{
    "pipeline": {...},
    "transform": {
        "image": string,
        "cmd": [ string ],
        "datum_batching": bool,
        "err_cmd": [ string ],
        "env": {
            string: string
        },

        "secrets": [ {
            "name": string,
            "mount_path": string
        },
        {
            "name": string,
            "env_var": string,
            "key": string
        } ],
        "image_pull_secrets": [ string ],
        "stdin": [ string ],
        "err_stdin": [ string ],
        "accept_return_code": [ int ],
        "debug": bool,
        "user": string,
        "working_dir": string,
        "dockerfile": string,
        "memory_volume": bool
    },
    ...
}
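To make the schema above concrete, here is a minimal sketch of a filled-in transform stanza. The image name, command path, environment variable, and secret names are all hypothetical placeholders, and the other top-level fields of the pipeline spec are omitted. The two entries under secrets illustrate the two shapes shown in the schema: one mounts a secret at a path, the other exposes a single key of a secret as an environment variable.

```json
{
  "transform": {
    "image": "example/wordcount:1.0",
    "cmd": ["python3", "/app/count.py"],
    "env": {
      "LOG_LEVEL": "info"
    },
    "secrets": [
      {
        "name": "example-tls-secret",
        "mount_path": "/etc/example-tls"
      },
      {
        "name": "example-api-secret",
        "env_var": "API_TOKEN",
        "key": "token"
      }
    ],
    "working_dir": "/app"
  }
}
```

The secrets named here are assumed to already exist in the cluster; this fragment only references them.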

Attributes

| Attribute | Description |
| --- | --- |
| cmd | Passes a command to the Docker run invocation. |
| datum_batching | Enables you to call your user code once for a batch of datums instead of once per datum. |
| stdin | Passes an array of lines to your command on stdin. |
| err_cmd | Passes a command that runs on failed datums. |
| err_stdin | Passes an array of lines to your error command on stdin. |
| env | Sets a key-value map of environment variables that Pachyderm injects into the container. |
| secrets | Passes an array of secrets that expose sensitive data to the container, either as mounted files or as environment variables. |
| image_pull_secrets | Passes an array of secrets that are mounted before the containers are created, so that they can be used to pull images from private registries. |
| accept_return_code | Passes an array of return codes that are considered acceptable when your Docker command exits. |
| debug | Enables debug logging for the pipeline. |
| user | Sets the user that your code runs as. |
| working_dir | Sets the directory that your command runs from. |
| memory_volume | Sets pachyderm-worker’s emptyDir.Medium to Memory, allowing Kubernetes to mount a memory-backed volume (tmpfs). |
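The following sketch shows how cmd, stdin, accept_return_code, err_cmd, and err_stdin might be combined. It assumes a hypothetical image (example/log-tools:1.0) and a hypothetical input repo named logs, which would be mounted at /pfs/logs; output is written to /pfs/out. Because grep exits with code 1 when it finds no matches, that code is listed as acceptable so that an empty result is not treated as a failure. Any other non-zero exit would trigger the error command.

```json
{
  "transform": {
    "image": "example/log-tools:1.0",
    "cmd": ["sh"],
    "stdin": [
      "grep -rh ERROR /pfs/logs > /pfs/out/errors.txt"
    ],
    "accept_return_code": [1],
    "err_cmd": ["sh"],
    "err_stdin": [
      "echo \"datum failed\" >&2"
    ]
  }
}
```

Passing the shell as cmd and the script lines through stdin keeps short one-off logic in the spec itself, so no custom image is needed for it. How a datum is reported after the error command runs depends on Pachyderm's handling of its exit code, which this section does not describe.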

Behavior

When to Use

You must always use the transform attribute when making a pipeline.
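As an illustration, a minimal pipeline spec built around a transform might look like the sketch below. The pipeline name, repo name, image, and script path are hypothetical, and the input stanza is an assumption drawn from the wider pipeline spec rather than from this section.

```json
{
  "pipeline": {
    "name": "example-pipeline"
  },
  "input": {
    "pfs": {
      "repo": "example-repo",
      "glob": "/*"
    }
  },
  "transform": {
    "image": "example/processor:1.0",
    "cmd": ["python3", "/app/process.py"]
  }
}
```

Here image selects the container that each job runs, and cmd is the command executed inside it.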