Projects are logical collections of related work (such as repos and pipelines). Each Pachyderm cluster ships with an initial project named default
. PachCTL supports all Project operations, such as adding/removing team members, resources, etc. Pachyderm Console can be used to view and access Projects. Pachyderm’s integrations with JupyterLab, Seldon, S3 Gateway, and DeterminedAI also support
projects.
Benefits of Projects #
-
Logical Organization of DAGs: Similar to a file system, you can organize your work within a Pachyderm instance.
-
Standardizable: Resources like repos can have the same name if they belong to different projects, making it easier to create and adhere to project templates in a collaborative environment. For example,
ProjectA.Repo1
andProjectB.Repo1
. -
Multi-team Enablement: With Enterprise Pachyderm, You can grant access to projects based on roles; projects are hidden from users without access by default.
Example #
In the following example there are two projects: DOGS
and CATS
. They have similarly named repositories and pipelines. With Enterprise Pachyderm, you could scope access to each project by user or user group.
graph TD A[(Project DOGS)] --> B[Picture Repo] A[(Project DOGS)] --> C[Text Repo] A[(Project DOGS)] --> D[Audio Repo] B --> X(Cleanup Pipeline A) C --> Y(Cleanup Pipeline B) D --> Z(Cleanup Pipeline C) X -- from output repo --> 1(Grouping Pipeline) Y -- from output repo --> 1(Grouping Pipeline) Z -- from output repo --> 1(Grouping Pipeline) J[(Project CATS)] --> M[Picture Repo] J[(Project CATS)] --> N[Text Repo] J[(Project CATS)] --> O[Audio Repo] M --> P(Cleanup Pipeline A) N --> Q(Cleanup Pipeline B) O --> R(Cleanup Pipeline C) P -- from output repo --> 2(Grouping Pipeline) Q -- from output repo --> 2(Grouping Pipeline) R -- from output repo --> 2(Grouping Pipeline)
Limitations #
- You currently cannot move existing repos or pipelines between different projects.