Egress PPS

Push the results of a pipeline to an external data store or a SQL database.

March 24, 2023


For a single-page view of all PPS options, go to the PPS series page.

Spec

"egress": {
    // Egress to an object store
    "URL": "s3://bucket/dir"
    // Egress to a database
    "sql_database": {
        "url": string,
        "file_format": {
            "type": string,
            "columns": [string]
        },
        "secret": {
            "name": string,
            "key": "PACHYDERM_SQL_PASSWORD"
        }
    }
}

Attributes

| Attribute | Description |
| --- | --- |
| URL | The URL of the object store where the pipeline’s output data should be written. |
| sql_database | An optional field that is used to specify how the pipeline should write output data to a SQL database. |
| url | The URL of the SQL database, in the format postgresql://user:password@host:port/database. |
| file_format | The file format of the output data, which can be specified as csv or tsv. This field also includes the column names that should be included in the output. |
| secret | The name and key of the Kubernetes secret that contains the password for the SQL database. |
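
As an illustration of how these attributes fit together, the following is a minimal sketch of a pipeline spec that egresses to a SQL database. The pipeline name, input repo, image, command, database URL, column names, and secret name are all hypothetical placeholders. The URL omits the password on the assumption that it is supplied through the referenced Kubernetes secret, and the lowercase spelling of the file format type follows the attribute description above.

```json
{
  "pipeline": { "name": "example-sql-egress" },
  "input": {
    "pfs": { "repo": "example-input", "glob": "/" }
  },
  "transform": {
    "image": "example/report-builder:1.0",
    "cmd": ["python3", "/app/build_report.py"]
  },
  "egress": {
    "sql_database": {
      "url": "postgresql://example_user@db.example.com:5432/example_db",
      "file_format": {
        "type": "csv",
        "columns": ["id", "total"]
      },
      "secret": {
        "name": "example-sql-secret",
        "key": "PACHYDERM_SQL_PASSWORD"
      }
    }
  }
}
```

The key is kept as PACHYDERM_SQL_PASSWORD to match the spec; only the secret name is expected to vary between deployments.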

Behavior

The egress field in a Pachyderm Pipeline Spec specifies where the pipeline writes its output data. It supports two types of targets: an object store and a SQL database.
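
For the object store case, a minimal sketch of a complete pipeline spec might look like the following. Only the shape of the egress block comes from the spec in this page; the pipeline name, input repo, image, command, and bucket path are hypothetical.

```json
{
  "pipeline": { "name": "example-s3-egress" },
  "input": {
    "pfs": { "repo": "example-input", "glob": "/*" }
  },
  "transform": {
    "image": "example/processor:1.0",
    "cmd": ["python3", "/app/process.py"]
  },
  "egress": {
    "URL": "s3://example-bucket/results"
  }
}
```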

Data is pushed after the user code finishes running but before the job is marked as successful. For more information, see Egress Data to an object store or Egress Data to a database.

The egress field is required if the pipeline needs to write output data to an external storage system.

When to Use

You should use the egress field in a Pachyderm Pipeline Spec when you need to write the output data from your pipeline to an external storage system, such as an object store or a SQL database.

Example scenarios: