Reference
PachCTL

Datum Timeout PPS

Set the maximum execution time allowed for each datum.

March 24, 2023

ℹī¸

For a single-page view of all PPS options, go to the PPS series page.

Spec #


"datum_timeout": string,

Behavior #

The datum_timeout attribute in a Pachyderm pipeline is used to specify the maximum amount of time that a worker is allowed to process a single datum in the pipeline.

When a worker begins processing a datum, Pachyderm starts a timer that tracks the elapsed time since the datum was first assigned to the worker. If the worker has not finished processing the datum before the datum_timeout period has elapsed, Pachyderm will automatically mark the datum as failed and reassign it to another worker to retry. This helps to ensure that slow or problematic datums do not hold up the processing of the entire pipeline.

Other considerations:

When to Use #

You should consider using the datum_timeout attribute in your Pachyderm pipeline when you are processing large or complex datums that may take a long time to process, and you want to avoid having individual datums hold up the processing of the entire pipeline.

For example, if you are processing images or videos that are particularly large, or if your pipeline is doing complex machine learning or deep learning operations that can take a long time to run on individual datums, setting a reasonable datum_timeout can help ensure that your pipeline continues to make progress even if some datums are slow or problematic.