On-Prem Deploy

Before you start

Before deploying Pachyderm, complete the following steps:

  1. Install kubectl.
  2. Install Helm.
  3. Deploy Kubernetes on-premises.
  4. Deploy two Kubernetes persistent volumes for Pachyderm metadata storage.
  5. Deploy an on-premises object store using a storage provider like MinIO, EMC’s ECS, or SwiftStack to provide S3-compatible access to your data storage.
  6. Install PachCTL and PachCTL Auto-completion.
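
As a quick sanity check on the client-side prerequisites above, the following sketch reports which of the required command-line tools are not yet on your PATH. The function name missing_tools is hypothetical, and the binary names kubectl, helm, and pachctl are assumed to match the tools listed above.

```shell
#!/bin/sh
# Print the name of every listed command that is not found on PATH.
# Prints nothing when all of the listed commands are available.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Usage (assumed binary names for the three CLIs):
#   missing_tools kubectl helm pachctl
```

An empty result means all three tools were found. It does not check their versions or whether kubectl can reach your cluster.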

Kubernetes & OpenShift Version Support

  • Kubernetes: Pachyderm supports the three most recent minor release versions of Kubernetes. A Kubernetes version outside this window is End of Life (EOL) and unsupported. This policy gives Pachyderm users access to the latest Kubernetes features and bug fixes.
  • OpenShift: Pachyderm is compatible with OpenShift versions within the “Full Support” window.

Hardened Security and Dependency Considerations

If you are deploying in a hardened security environment, such as within the DoD community or other regulated sectors, consider downloading and installing Pachyderm from Iron Bank, a hardened container registry.

To pull MLDM images from Iron Bank, set the global image registry in the MLDM Helm chart values.yaml to registry1.dso.mil/, for example:

global:
  ...
  image:
    registry: registry1.dso.mil/

Additionally, note that the MLDM Helm chart relies on the Bitnami PostgreSQL image and its associated sub-chart. If the Bitnami image is unavailable, or if your available PostgreSQL image cannot be managed through the Bitnami sub-chart, you must install PostgreSQL separately. See Global Helm Chart Values for details on specifying your separate PostgreSQL instance, and Non-Bundled Database Setup for more detail on using your own PostgreSQL instance with MLDM.

If you have questions, please reach out to your Customer Support Engineer for assistance before proceeding.

How to Deploy On-Premises

1. Install via Helm

helm repo add pachyderm https://helm.pachyderm.com
helm repo update
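
Once the repository is added, it can help to see which chart versions are available before choosing the value for the <your_chart_version> placeholder used in the deploy step. The sketch below wraps the standard helm search repo command; the function name list_chart_versions is hypothetical.

```shell
#!/bin/sh
# List every published version of the pachyderm/pachyderm chart
# known to the local Helm repository cache.
list_chart_versions() {
  helm search repo pachyderm/pachyderm --versions
}

# Usage:
#   list_chart_versions
```

The output depends on the state of your local repository cache, so run helm repo update first if the list looks stale.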

2. Configure Helm Values

View and copy the full set of Helm chart values from GitHub or ArtifactHub to use as a reference when configuring your Helm values file.
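
As an alternative to copying from the web, the chart's default values can be written to a local file with helm show values. This is a minimal sketch: the function name dump_default_values and the default output file name values.default.yaml are hypothetical, and the chart version is whatever you plan to deploy.

```shell
#!/bin/sh
# Write the chart's default values to a file to use as a reference.
#   $1: chart version to inspect
#   $2: output path (defaults to values.default.yaml, a hypothetical name)
dump_default_values() {
  version="$1"
  out="${2:-values.default.yaml}"
  helm show values pachyderm/pachyderm --version "$version" > "$out"
}

# Usage:
#   dump_default_values <your_chart_version> values.default.yaml
```

Keeping the defaults in a separate file from your own values.yaml makes it easier to see which settings you have overridden.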

Add Storage classes to Helm Values

Update your Helm values file to include the storage classes you are going to use:

etcd:
  storageClass: MyStorageClass
  size: 10Gi

postgresql:
  persistence:
    storageClass: MyStorageClass
    size: 10Gi
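
Before deploying, you may want to confirm that the storage class named in your values file exists in the cluster. The following sketch assumes kubectl is already configured for the target cluster; the function name storage_class_exists is hypothetical, and MyStorageClass is the placeholder name from the example above.

```shell
#!/bin/sh
# Succeed only if the named StorageClass exists in the current cluster.
storage_class_exists() {
  kubectl get storageclass "$1" >/dev/null 2>&1
}

# Usage:
#   storage_class_exists MyStorageClass \
#     || echo "StorageClass MyStorageClass not found" >&2
```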

Size & Configure Object Store

  1. Determine the endpoint of your object store, for example, minio-server:9000.

  2. Choose a unique name for the bucket you will dedicate to Pachyderm.

  3. Create a new access key ID and secret key for Pachyderm to use when accessing the object store.

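  4. Update the Pachyderm Helm values file with the endpoint, bucket name, access key ID, and secret key. In the example below, the bucket name (pachyderm-test) and the endpoint are both encoded in storageURL.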

    pachd:
      storage:
        backend: "AMAZON"
        storageURL: "s3://pachyderm-test?endpoint=minio.default.svc.cluster.local:9000&disableSSL=true&region=dummy-region"
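
To make the structure of the storageURL value easier to see, here is a small sketch that assembles it from its parts. It only reproduces the URL layout shown in the example above (bucket, then the endpoint, disableSSL, and region query parameters). The function name build_storage_url is hypothetical, and no other query parameters are assumed.

```shell
#!/bin/sh
# Assemble an s3:// storage URL in the same layout as the example above.
#   $1: bucket name
#   $2: object store endpoint (host:port)
#   $3: region
#   $4: disableSSL flag (true or false)
build_storage_url() {
  printf 's3://%s?endpoint=%s&disableSSL=%s&region=%s\n' "$1" "$2" "$4" "$3"
}

# Usage, reproducing the example value:
#   build_storage_url pachyderm-test minio.default.svc.cluster.local:9000 dummy-region true
```

Quote the result when pasting it into values.yaml, as in the example, so that the ? and & characters are treated as part of the string.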

Configure Authentication & Authorization

To set up authentication, you must use the Enterprise version of Pachyderm and provide a valid license key.

We recommend that you create a secret containing the license key and reference it in your Helm values through the pachd.enterpriseLicenseKeySecretName attribute. Once deployed, Pachyderm stores your provided Enterprise license as the platform secret pachyderm-license in the key enterprise-license-key.
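
One way to create such a secret is with kubectl create secret generic, sketched below. The function name create_license_secret and the example secret name are hypothetical. The sketch also assumes the secret should hold the license under the key enterprise-license-key, mirroring the key name of the platform secret described above; confirm the expected key name for your chart version before relying on it.

```shell
#!/bin/sh
# Create a Kubernetes secret holding the Enterprise license key.
#   $1: secret name (hypothetical, e.g. pachyderm-enterprise-key)
#   $2: license key string
#   $3: namespace Pachyderm will be deployed into
# Assumption: the license is stored under the key "enterprise-license-key".
create_license_secret() {
  kubectl create secret generic "$1" \
    --namespace "$3" \
    --from-literal=enterprise-license-key="$2"
}

# Usage:
#   create_license_secret pachyderm-enterprise-key "<your-license-key>" default
```

The secret name you pass here is the value to set for pachd.enterpriseLicenseKeySecretName in your Helm values file.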

Note

After deploying Pachyderm, you can log in as the root user and begin granting users roles on resources such as Projects and Repos:

pachctl auth set project <project-name> <role-name> user:<username@email.com>

For more information on user permissions, see the Authorization section.

3. Deploy

helm install pachyderm -f values.yaml pachyderm/pachyderm --version <your_chart_version>

Tip

To apply changes from an updated Helm values file to your deployment, run the following command:

helm upgrade pachyderm pachyderm/pachyderm -f values.yaml
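
After an install or upgrade, you can wait for the rollout to finish before running pachctl commands. The following is a minimal sketch using kubectl rollout status. It assumes the chart creates a Deployment named pachd; the function name wait_for_pachd, the default namespace, and the 300-second timeout are illustrative choices, not values taken from the chart.

```shell
#!/bin/sh
# Block until the pachd deployment has finished rolling out, or time out.
#   $1: namespace (defaults to "default")
#   $2: timeout as a duration (defaults to 300s)
# Assumption: the Deployment created by the chart is named "pachd".
wait_for_pachd() {
  kubectl rollout status deployment/pachd \
    --namespace "${1:-default}" \
    --timeout="${2:-300s}"
}

# Usage:
#   wait_for_pachd default 300s && kubectl get pods --namespace default
```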