AWS + Pachyderm

Learn how to deploy Pachyderm to the cloud with AWS.

March 24, 2023

Before You Start #

This guide assumes that you have already tried Pachyderm locally and have the command-line tools used in the steps below installed: eksctl, kubectl, the AWS CLI, Helm, and pachctl.
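A quick way to confirm that the command-line tools this guide calls are on your PATH is a small shell check. The following is a minimal sketch; `check_tools` is a hypothetical helper name, not part of any of these tools.

```shell
#!/bin/sh
# check_tools: hypothetical helper. Prints each command that cannot be
# found on PATH and returns non-zero if any is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

With the function loaded, `check_tools eksctl kubectl aws helm pachctl` covers the commands that appear in the steps below.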

1. Create an EKS Cluster #

  1. Use the eksctl tool to deploy an EKS Cluster:
eksctl create cluster --name pachyderm-cluster --region <region> --profile <your named profile>
  2. Verify deployment:
kubectl get all

2. Create an S3 Bucket #

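  1. Run the following command. For any region other than us-east-1, also pass --create-bucket-configuration LocationConstraint=${AWS_REGION}:
aws s3api create-bucket --bucket ${BUCKET_NAME} --region ${AWS_REGION}
  2. Verify that the bucket was created: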
aws s3 ls
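S3 treats us-east-1 as its default region: bucket creation there takes no location constraint, while every other region needs one. The sketch below builds the matching command line for either case. `create_bucket_cmd` is a hypothetical helper that only prints the command so it can be reviewed before it is run.

```shell
#!/bin/sh
# create_bucket_cmd: hypothetical helper. Prints (does not run) the
# aws s3api command for the given bucket name and region.
create_bucket_cmd() {
  bucket="$1"
  region="$2"
  cmd="aws s3api create-bucket --bucket $bucket --region $region"
  # us-east-1 is the S3 default region and takes no LocationConstraint;
  # all other regions require one.
  if [ "$region" != "us-east-1" ]; then
    cmd="$cmd --create-bucket-configuration LocationConstraint=$region"
  fi
  echo "$cmd"
}
```

Calling `create_bucket_cmd "${BUCKET_NAME}" "${AWS_REGION}"` prints the full command for your region.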

3. Enable Persistent Volumes Creation #

  1. Create an IAM OIDC provider for your cluster.
  2. Install the Amazon EBS Container Storage Interface (CSI) driver on your cluster.
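  3. Create a gp3 storage class manifest file (e.g., gp3-storageclass.yaml):
    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: gp3
      annotations:
        storageclass.kubernetes.io/is-default-class: "true"
    provisioner: ebs.csi.aws.com
    parameters:
      type: gp3
      fsType: ext4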
  4. Set gp3 as your default storage class by applying the manifest:
    kubectl apply -f gp3-storageclass.yaml
  5. Verify that it has been set as your default.
    kubectl get storageclass
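In the `kubectl get storageclass` listing, a default class is shown with "(default)" after its name. As a minimal sketch, the hypothetical helper below reads that listing on standard input and prints only the names marked as default.

```shell
#!/bin/sh
# default_storageclass: hypothetical helper. Reads `kubectl get storageclass`
# output on stdin and prints the name of every class marked "(default)".
default_storageclass() {
  # Skip the header row; the marker is the second whitespace-separated field.
  awk 'NR > 1 && $2 == "(default)" { print $1 }'
}
```

Running `kubectl get storageclass | default_storageclass` should print gp3. If more than one name is printed, more than one class is annotated as the default.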

4. Create a Values.yaml #
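The values file created here is the one passed to helm install in the next section as my_pachyderm_values.yaml. The sketch below writes a starter file from the shell. Everything in it is an assumption to check against the Pachyderm Helm chart before use: the key names follow the chart's values layout as commonly documented, `write_values` is a hypothetical helper, and the credential entries are placeholders.

```shell
#!/bin/sh
# write_values: hypothetical helper that writes a starter values file.
# Assumption: key names match the Pachyderm Helm chart's values layout;
# verify them against the chart version you install.
write_values() {
  out="${1:-my_pachyderm_values.yaml}"
  cat > "$out" <<EOF
deployTarget: "AMAZON"
proxy:
  enabled: true
  service:
    type: LoadBalancer
pachd:
  storage:
    amazon:
      bucket: "${BUCKET_NAME}"
      region: "${AWS_REGION}"
      # Placeholders, not real values: replace with your own credentials.
      id: "<aws-access-key-id>"
      secret: "<aws-secret-access-key>"
EOF
}
```

With BUCKET_NAME and AWS_REGION set as in the S3 step, `write_values` creates my_pachyderm_values.yaml in the current directory.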


5. Configure Helm #

Run the following to add the Pachyderm repo to Helm and install the chart with your values file:

helm repo add pach https://helm.pachyderm.com
helm repo update
helm install pachd pach/pachyderm -f my_pachyderm_values.yaml 

6. Verify Installation #

  1. In a new terminal, run the following command to check the status of your pods:
kubectl get pods
NAME                                       READY   STATUS      RESTARTS   AGE
console-5b67678df6-s4d8c                   1/1     Running     0          2m8s
etcd-0                                     1/1     Running     0          2m8s
pachd-c5848b5c7-zwb8p                      1/1     Running     0          2m8s
pg-bouncer-7b855cb797-jqqpx                1/1     Running     0          2m8s
postgres-0                                 1/1     Running     0          2m8s
  2. Re-run this command after a few minutes if pachd is not ready.
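When the pod list is long, it can help to filter it down to the pods that are not ready yet. The following is a minimal sketch; `not_ready_pods` is a hypothetical helper that reads the pod listing on standard input and compares the two halves of the READY column.

```shell
#!/bin/sh
# not_ready_pods: hypothetical helper. Reads `kubectl get pods` output on
# stdin and prints the name of each pod whose READY column is not n/n.
not_ready_pods() {
  # Skip the header row and pods that have already run to completion.
  awk 'NR > 1 && $3 != "Completed" {
    split($2, ready, "/")
    if (ready[1] != ready[2]) print $1
  }'
}
```

Running `kubectl get pods | not_ready_pods` prints nothing once every pod reports ready.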

7. Connect to Cluster #

pachctl connect grpc://localhost:80 

If the connection commands did not work together, run each separately.

Optionally open your browser and navigate to the Console UI.


You can check your Pachyderm version and connection to pachd at any time with the following command:

pachctl version

pachctl             2.5.2  
pachd               2.5.2
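Comparing the two version lines is a quick sanity check that the client and the cluster are on the same release. As a minimal sketch, the hypothetical helper below reads the version output on standard input and succeeds only when both lines are present and equal.

```shell
#!/bin/sh
# versions_match: hypothetical helper. Reads `pachctl version` output on
# stdin; exits 0 only if pachctl and pachd both appear with equal versions.
versions_match() {
  awk '
    $1 == "pachctl" { client = $2 }
    $1 == "pachd"   { server = $2 }
    END { exit !(client != "" && client == server) }
  '
}
```

For example, `pachctl version | versions_match && echo "versions match"` prints the message only when the two versions agree.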