
Pipeline Inputs

Learn about the concept of a pipeline input, which is the source of the data that the pipeline reads and processes.

About

In Pachyderm, a pipeline input is the source of the data that the pipeline reads and processes. The input can be a Pachyderm repository (an input repo) or an external data source, such as a file in a cloud storage service.

To define a pipeline input, specify the source of the data and how the data is organized. You do this in the input section of the pipeline specification, a YAML or JSON file that defines the configuration of the pipeline.
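
As a rough illustration, a minimal JSON pipeline specification with a single input repo might look like the sketch below. The pipeline name (edges), the repo name (images), the container image, and the command are placeholders chosen for this example. The pfs block names the input repo, and its glob pattern describes how the data in that repo is split up for processing.

```json
{
  "pipeline": {
    "name": "edges"
  },
  "input": {
    "pfs": {
      "repo": "images",
      "glob": "/*"
    }
  },
  "transform": {
    "image": "pachyderm/opencv",
    "cmd": ["python3", "/edges.py"]
  }
}
```

Here the glob pattern /* is assumed to treat each top-level file or directory in the images repo as a separate unit of work. A different pattern would group the same data differently.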

Input Types

The input section can contain one or more input sources, each specified as a separate block.