A
Ancestry Syntax: Learn about the concept of Ancestry Syntax, which is used to reference the history of commits and branches in a repository.

B
Branch: A pointer to a commit that moves along with new commits as they are submitted.

C
Commit: An atomic operation that snapshots and preserves the state of files/directories within a repository.
Commit Set: Learn about the concept of a commit set, which is an immutable set of all the commits that resulted from a modification to the system.
Cron: Learn about the concept of a cron.

D
DAG: Learn about DAGs, the Directed Acyclic Graphs that define the order in which pipelines are executed and how data flows between them.
Data Parallelism: Learn about the concept of data parallelism.
Datum: Learn about datums, the smallest indivisible unit of computation within a job.
Deferred Processing: Learn about the concept of deferred processing, which allows you to commit data more frequently than you process it.
Distributed Computing: Learn about the concept of distributed computing, which allows you to split your jobs across multiple workers.

F
File: A Unix filesystem object (directory or file) that stores data.

G
Glob Pattern: Learn about the concept of a glob pattern, which is a string of characters that specifies a set of filenames or paths in a file system.
Global Identifier: Learn about the concept of a global identifier, which is a unique identifier for a DAG.

H
History: The collective record of version-controlled commits for pipelines and jobs.

I
Input Repository: Learn about the concept of an input repository, which is a location where data resides that is used as input for a pipeline.

J
Job: Learn about the concept of a Job, which is a unit of work that is created by a pipeline.

N
NLP: Learn about the concept of NLP, which is a subfield of machine learning that focuses on teaching machines to understand and generate human language.

O
Output Repository: Learn about the concept of an output repo, which is a repository where the results of a pipeline's processing are stored after being transformed by the provided user code.

P
Pachyderm Worker: Learn about the concept of a Pachyderm worker.
Pipeline: Learn about the concept of a pipeline, which is a primitive responsible for reading data from a specified source, transforming it according to the pipeline specification, and writing the result to an output repo.
Pipeline Inputs: Learn about the concept of a pipeline input, which is the source of the data that the pipeline reads and processes.
Pipeline Specification: Learn about the concept of a pipeline specification, which is a declarative configuration file used to define the behavior of a pipeline.
Project: Learn about the concept of a project, which is a workspace collection of repositories and pipelines.
Provenance: The recorded data lineage that tracks the dependencies and relationships between datasets.

T
Task Parallelism: Learn about the concept of task parallelism.

U
User Code: Learn about the concept of User Code, which is custom code that users write to process their data in pipelines.
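
Two of the entries above, Glob Pattern and DAG, describe mechanisms that are easier to picture with a small example. The sketch below is illustrative only and uses Python's standard library rather than Pachyderm itself: the file paths and pipeline names are hypothetical, and `fnmatchcase` follows Python's shell-style matching rules, which may differ from the glob semantics Pachyderm applies.

```python
from fnmatch import fnmatchcase
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical file paths in an input repository (not from the glossary).
paths = ["/images/a.png", "/images/b.png", "/labels/a.json"]

# Glob pattern: a string that selects a set of paths.
# Here "/images/*" selects every path under /images.
matched = [p for p in paths if fnmatchcase(p, "/images/*")]

# DAG: map each hypothetical pipeline to the pipelines it depends on.
dag = {
    "edges": set(),         # reads only from an input repository
    "montage": {"edges"},   # consumes the output of "edges"
    "report": {"montage"},  # consumes the output of "montage"
}

# A topological order places every pipeline after its upstream dependencies.
order = list(TopologicalSorter(dag).static_order())

print(matched)
print(order)
```

Because the example graph is a single chain, only one execution order is valid; a DAG with independent branches would admit several.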