Pachyderm Worker


Pachyderm workers are Kubernetes pods that run the Docker image (your user code) specified in the pipeline specification. When you create a pipeline, Pachyderm spins up workers that run continuously in the cluster, waiting for new data to process.
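
To make the link between the pipeline specification and the worker image concrete, the sketch below builds a specification as a Python dictionary and prints it as JSON. The pipeline name, input repository, image, and command are hypothetical placeholders. The field names (`pipeline.name`, `transform.image`, `transform.cmd`, `input.pfs`, `parallelism_spec.constant`) are assumed to match the pipeline specification format, so verify them against the reference for your Pachyderm version.

```python
import json


def build_pipeline_spec(name, image, cmd, input_repo, workers=1):
    """Return a pipeline specification as a plain dictionary.

    All argument values are illustrative; the field layout is an
    assumption based on the documented pipeline specification format.
    """
    return {
        "pipeline": {"name": name},
        # The image and command that each worker pod runs (your user code).
        "transform": {"image": image, "cmd": cmd},
        # Number of worker pods to keep running for this pipeline.
        "parallelism_spec": {"constant": workers},
        # Input repository and the glob pattern that splits it into datums.
        "input": {"pfs": {"repo": input_repo, "glob": "/*"}},
    }


# Hypothetical values, for illustration only.
spec = build_pipeline_spec(
    name="edges",
    image="example/edges:1.0",
    cmd=["python3", "/edges.py"],
    input_repo="images",
    workers=2,
)
print(json.dumps(spec, indent=2))
```

In this sketch, `transform.image` is the image the workers run, and `parallelism_spec.constant` is assumed to set how many workers Pachyderm starts for the pipeline.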

Each datum goes through the following processing phases inside a Pachyderm worker pod:

Phase        Description
Downloading  The Pachyderm worker pod downloads the datum contents into Pachyderm.
Processing   The Pachyderm worker pod runs your code against the contents of the datum.
Uploading    The Pachyderm worker pod uploads the results of processing into an output repository.
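
The three phases can be sketched as a small local simulation. This is not Pachyderm's implementation: it is a toy model that uses temporary directories in place of the worker's filesystem and a local folder in place of the output repository. The names `process_datum`, `USER_CODE`, and `demo` are hypothetical and exist only for this illustration.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def process_datum(datum_files, cmd, output_repo):
    """Run one datum through the download, process, and upload phases."""
    with tempfile.TemporaryDirectory() as scratch:
        in_dir = Path(scratch) / "in"
        out_dir = Path(scratch) / "out"
        in_dir.mkdir()
        out_dir.mkdir()

        # Downloading: copy the datum's files into the worker's scratch space.
        for src in datum_files:
            shutil.copy(src, in_dir / Path(src).name)

        # Processing: run the user code, passing the input and output
        # directories as arguments. A non-zero exit code raises an error.
        subprocess.run(cmd + [str(in_dir), str(out_dir)], check=True)

        # Uploading: copy every result file into the output repository.
        output_repo.mkdir(parents=True, exist_ok=True)
        uploaded = []
        for result in sorted(out_dir.iterdir()):
            shutil.copy(result, output_repo / result.name)
            uploaded.append(result.name)
        return uploaded


# Stand-in for user code: upper-case every input file into the output directory.
USER_CODE = (
    "import sys, pathlib\n"
    "src, dst = pathlib.Path(sys.argv[1]), pathlib.Path(sys.argv[2])\n"
    "for f in src.iterdir():\n"
    "    (dst / f.name).write_text(f.read_text().upper())\n"
)


def demo():
    """Process a single one-file datum and return the uploaded names and content."""
    with tempfile.TemporaryDirectory() as root:
        root = Path(root)
        datum = root / "input.txt"
        datum.write_text("hello")
        repo = root / "output_repo"
        names = process_datum([datum], [sys.executable, "-c", USER_CODE], repo)
        return names, (repo / "input.txt").read_text()
```

Calling `demo()` runs one datum through all three phases. In a real worker the user code does not receive directories as arguments; this sketch passes them explicitly only to keep the example self-contained.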

Distributed processing internals