AWS + Pachyderm
Learn how to deploy Pachyderm to the cloud with AWS.
March 24, 2023
Before You Start #
This guide assumes that you have already tried Pachyderm locally and have all of the following installed:
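A quick pre-flight check can confirm that the command-line tools are on your PATH before you begin. The sketch below is only illustrative: the tool list (aws, eksctl, kubectl, helm, pachctl) is an assumption inferred from the commands used later in this guide, and `missing_tools` is a hypothetical helper name.

```shell
# Print the name of every listed tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Assumed tool list, based on the commands this guide runs.
missing="$(missing_tools aws eksctl kubectl helm pachctl)"
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All required tools found."
fi
```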
1. Create an EKS Cluster #
- Use the eksctl tool to deploy an EKS Cluster:
eksctl create cluster --name pachyderm-cluster --region <region> --profile <your named profile>
- Verify deployment:
kubectl get all
2. Create an S3 Bucket #
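- Run the following command. For any region other than us-east-1, also append --create-bucket-configuration LocationConstraint=${AWS_REGION}: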
aws s3api create-bucket --bucket ${BUCKET_NAME} --region ${AWS_REGION}
- Verify that the bucket appears in the listing:
aws s3 ls
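The region determines which arguments `aws s3api create-bucket` needs: us-east-1 must not receive a LocationConstraint, while other regions require one. A minimal sketch of that choice follows. `create_bucket_args` is a hypothetical helper, the default values are placeholders, and the script prints the command for review rather than running it.

```shell
# Print the arguments for `aws s3api create-bucket` for a bucket and region.
# us-east-1 is the default location and takes no LocationConstraint;
# every other region needs one.
create_bucket_args() {
  bucket="$1"
  region="$2"
  if [ "$region" = "us-east-1" ]; then
    printf '%s' "--bucket $bucket --region $region"
  else
    printf '%s' "--bucket $bucket --region $region --create-bucket-configuration LocationConstraint=$region"
  fi
}

# Placeholder values; replace with your own.
BUCKET_NAME="${BUCKET_NAME:-my-pachyderm-bucket}"
AWS_REGION="${AWS_REGION:-us-east-2}"

# Print the full command so it can be reviewed before running it.
echo "aws s3api create-bucket $(create_bucket_args "$BUCKET_NAME" "$AWS_REGION")"
```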
3. Enable Persistent Volumes Creation #
- Create an IAM OIDC provider for your cluster.
- Install the Amazon EBS Container Storage Interface (CSI) driver on your cluster.
- Create a gp3 storage class manifest file (e.g., gp3-storageclass.yaml):
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp3
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp3
  fsType: ext4
- Apply the manifest to make gp3 your default storage class:
kubectl apply -f gp3-storageclass.yaml
- Verify that it has been set as your default.
kubectl get storageclass
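In the `kubectl get storageclass` listing, the default class is marked with "(default)" after its name. If you want to script that check, one possible approach is to filter the listing, as sketched below. `is_default_storageclass` is a hypothetical helper that reads the listing from standard input.

```shell
# Succeed if the named storage class is marked "(default)" in the
# output of `kubectl get storageclass`, read from stdin.
is_default_storageclass() {
  awk -v name="$1" '
    $1 == name && $2 == "(default)" { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Usage (requires access to the cluster):
#   kubectl get storageclass | is_default_storageclass gp3 && echo "gp3 is the default"
```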
4. Create a Values.yaml #
Without an Enterprise key:
deployTarget: "AMAZON"
proxy:
  enabled: true
  service:
    type: LoadBalancer
pachd:
  storage:
    amazon:
      bucket: "bucket_name"
      # this is an example access key ID taken from https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html (AWS Credentials)
      id: "AKIAIOSFODNN7EXAMPLE"
      # this is an example secret access key taken from https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html (AWS Credentials)
      secret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
      region: "us-east-2"
  externalService:
    enabled: true
console:
  enabled: true
With an Enterprise key:
deployTarget: "AMAZON"
proxy:
  enabled: true
  service:
    type: LoadBalancer
pachd:
  storage:
    amazon:
      bucket: "bucket_name"
      # this is an example access key ID taken from https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html (AWS Credentials)
      id: "AKIAIOSFODNN7EXAMPLE"
      # this is an example secret access key taken from https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html (AWS Credentials)
      secret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
      region: "us-east-2"
  # Enterprise key
  enterpriseLicenseKey: "YOUR_ENTERPRISE_TOKEN"
console:
  enabled: true
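Rather than typing the bucket name and credentials into the file by hand, the values file can be generated from shell variables. The following is a sketch under assumptions: `write_values` is a hypothetical helper, it mirrors the variant without an Enterprise key (with the nesting shown being the assumed chart layout), and it expects BUCKET_NAME, AWS_REGION, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY to be set already.

```shell
# Write a Helm values file to the given path, filling the bucket,
# region, and credentials from variables that must already be set.
write_values() {
  out="$1"
  cat > "$out" <<EOF
deployTarget: "AMAZON"
proxy:
  enabled: true
  service:
    type: LoadBalancer
pachd:
  storage:
    amazon:
      bucket: "${BUCKET_NAME}"
      id: "${AWS_ACCESS_KEY_ID}"
      secret: "${AWS_SECRET_ACCESS_KEY}"
      region: "${AWS_REGION}"
  externalService:
    enabled: true
console:
  enabled: true
EOF
}

# Usage (the file name matches the one passed to helm install):
#   write_values my_pachyderm_values.yaml
```

The generated file contains your secret access key in plain text, so treat it like any other credential file.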
5. Configure Helm #
Run the following to add the Pachyderm repo to Helm and install Pachyderm with your values file (saved here as my_pachyderm_values.yaml):
helm repo add pach https://helm.pachyderm.com
helm repo update
helm install pachd pach/pachyderm -f my_pachyderm_values.yaml
6. Verify Installation #
- In a new terminal, run the following command to check the status of your pods:
kubectl get pods
NAME                              READY   STATUS    RESTARTS   AGE
pod/console-5b67678df6-s4d8c      1/1     Running   0          2m8s
pod/etcd-0                        1/1     Running   0          2m8s
pod/pachd-c5848b5c7-zwb8p         1/1     Running   0          2m8s
pod/pg-bouncer-7b855cb797-jqqpx   1/1     Running   0          2m8s
pod/postgres-0                    1/1     Running   0          2m8s
- Re-run this command after a few minutes if pachd is not ready.
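Instead of re-running the command by hand, the pod listing can be checked in a loop. The sketch below assumes the column layout shown above (NAME, READY, STATUS, ...); `all_pods_ready` is a hypothetical helper that reads the listing from standard input. As a simplification, it treats any pod that is not Running with all containers ready as not ready, which includes Completed pods.

```shell
# Succeed only if every pod row has READY n/n and STATUS "Running".
# Reads the output of `kubectl get pods` from stdin; the first line is the header.
all_pods_ready() {
  awk '
    NR > 1 {
      split($2, ready, "/")
      if (ready[1] != ready[2] || $3 != "Running") bad = 1
      seen = 1
    }
    END { exit ((seen && !bad) ? 0 : 1) }
  '
}

# Usage (requires access to the cluster):
#   until kubectl get pods | all_pods_ready; do sleep 10; done
```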
7. Connect to Cluster #
pachctl connect grpc://localhost:80
ℹ️ If the connection commands did not work together, run each separately.
Optionally open your browser and navigate to the Console UI.
💡 You can check your Pachyderm version and connection to pachd at any time with the following command:
pachctl version
COMPONENT VERSION
pachctl 2.5.2
pachd 2.5.2
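The two rows of that output can also be compared in a script, for example to flag a client and server that report different versions. This is a sketch that assumes the two-column layout shown above; `versions_match` is a hypothetical helper, and requiring an exact match is a simplification rather than a documented Pachyderm requirement.

```shell
# Succeed if the pachctl and pachd rows report the same version string.
# Reads the output of `pachctl version` from stdin.
versions_match() {
  awk '
    $1 == "pachctl" { client = $2 }
    $1 == "pachd"   { server = $2 }
    # Appending "" forces a string comparison rather than a numeric one.
    END { exit ((client != "" && (client "") == (server "")) ? 0 : 1) }
  '
}

# Usage (requires a working connection to pachd):
#   pachctl version | versions_match || echo "pachctl and pachd versions differ"
```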