OpenShift

OpenShift is a popular enterprise Kubernetes distribution. Pachyderm can run on OpenShift with some additional steps:

Deploy Pachyderm

  1. Make sure that privileged containers are allowed (they are not allowed by default). You can add the privileged SCC (SecurityContextConstraints) to the pachyderm service account:
oadm policy add-scc-to-user privileged system:serviceaccount:<PROJECT_NAME>:pachyderm

Alternatively, run oc edit scc privileged and add the service account to the users list manually:

users:
- system:serviceaccount:<PROJECT_NAME>:pachyderm
  1. Replace hostPath with emptyDir in your cluster manifest. The manifest is generated by the pachctl deploy ... command, or you can write it manually. To generate the manifest without deploying it, run pachctl deploy ... with the --dry-run flag.
      "spec": {
        "volumes": [
          {
            "name": "pach-disk",
            "emptyDir": {}
          }
        ],

 ... <snip>  ...

      "spec": {
        "volumes": [
          {
            "name": "etcd-storage",
            "emptyDir": {}
          }
        ],

Note that emptyDir does not persist your data: the contents of the volume are deleted when the pod is removed from its node. To persist your data, configure a persistent volume or hostPath.
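The hostPath-to-emptyDir replacement described above can be scripted rather than done by hand. The following is a minimal sketch, not part of Pachyderm: it assumes the manifest is JSON (possibly several JSON documents concatenated one after another) and that hostPath only appears as a volume source inside "volumes" lists. The function names replace_host_path and load_documents are hypothetical.

```python
import json
import sys


def replace_host_path(node):
    """Recursively replace hostPath volume sources with emptyDir, in place."""
    if isinstance(node, dict):
        # Only touch entries of a "volumes" list, as in the manifest snippets.
        for volume in node.get("volumes") or []:
            if isinstance(volume, dict) and "hostPath" in volume:
                del volume["hostPath"]
                volume["emptyDir"] = {}
        for value in node.values():
            replace_host_path(value)
    elif isinstance(node, list):
        for item in node:
            replace_host_path(item)
    return node


def load_documents(text):
    """Parse one or more JSON documents concatenated in a single string."""
    decoder = json.JSONDecoder()
    docs, pos = [], 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        doc, pos = decoder.raw_decode(text, pos)
        docs.append(doc)
    return docs


def main():
    # Read the manifest from stdin and write the modified copy to stdout.
    for doc in load_documents(sys.stdin.read()):
        json.dump(replace_host_path(doc), sys.stdout, indent=2)
        sys.stdout.write("\n")


if __name__ == "__main__":
    main()
```

If the script is saved as replace_hostpath.py (a hypothetical name) and the generated manifest as manifest.json, running python replace_hostpath.py < manifest.json > pachyderm.json would produce the file used in the deploy step. Review the result before deploying, since the same caveat about emptyDir and data persistence applies.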

  1. Deploy the Pachyderm manifest you modified.
$ oc create -f pachyderm.json

You can check the cluster status with oc get all, as you would on Kubernetes:

$ oc get all
NAME             DESIRED          CURRENT       AGE
rc/etcd          1                1             5m
rc/pachd         1                1             5m
NAME             CLUSTER-IP       EXTERNAL-IP   PORT(S)           AGE
svc/etcd         172.30.170.24    <nodes>       2379/TCP          5m
svc/pachd        172.30.194.202   <nodes>       650/TCP,651/TCP   5m
NAME             READY            STATUS        RESTARTS          AGE
po/etcd-7m5r1    1/1              Running       0                 5m
po/pachd-foq68   1/1              Running       0                 5m

Configure for a Pipeline

  1. Add the cluster-reader and edit roles to the pachyderm service account.
oadm policy add-cluster-role-to-user cluster-reader system:serviceaccount:<PROJECT_NAME>:pachyderm
oadm policy add-cluster-role-to-user edit system:serviceaccount:<PROJECT_NAME>:pachyderm
  1. Set the pachyderm service account on the pipeline's ReplicationController so that the pipeline pods run under it.
oc patch rc pipeline-edges-v1 -p 'spec:
  template:
    spec:
      serviceAccount: pachyderm
      serviceAccountName: pachyderm'

Alternatively, edit the ReplicationController manually with oc edit rc <RC_PIPELINE> -o json:

                   ...
                "dnsPolicy": "ClusterFirst",
                "serviceAccountName": "pachyderm",
                "serviceAccount": "pachyderm",
                "securityContext": {}
                   ...
  1. Replace hostPath with emptyDir, as you did in the cluster manifest.

Again, note that emptyDir does not persist your data. To persist it, configure a persistent volume or hostPath.

  1. Redeploy the pipeline pods so that they pick up the changes, by scaling the ReplicationController down to zero and back up.
$ oc scale rc pipeline-edges-v1 --replicas=0
$ oc scale rc pipeline-edges-v1 --replicas=4
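The role, patch, and scale commands in the steps above can be collected into a small helper so that they are applied consistently for each pipeline. This is only a sketch under stated assumptions: the helper names (service_account, pipeline_commands, run_all) and the project name "myproject" are hypothetical, the patch is passed to oc patch as JSON instead of the YAML form shown earlier, and the hostPath replacement step is not covered and still has to be done separately.

```python
import json
import shlex
import subprocess


def service_account(project, name="pachyderm"):
    """Return the fully qualified service account name for a project."""
    return f"system:serviceaccount:{project}:{name}"


def pipeline_commands(project, rc_name, replicas):
    """Build the oadm/oc commands from the steps above as argv lists."""
    sa = service_account(project)
    patch = {"spec": {"template": {"spec": {
        "serviceAccount": "pachyderm",
        "serviceAccountName": "pachyderm",
    }}}}
    return [
        ["oadm", "policy", "add-cluster-role-to-user", "cluster-reader", sa],
        ["oadm", "policy", "add-cluster-role-to-user", "edit", sa],
        ["oc", "patch", "rc", rc_name, "-p", json.dumps(patch)],
        # Scale to zero and back up so the pods are recreated.
        ["oc", "scale", "rc", rc_name, "--replicas=0"],
        ["oc", "scale", "rc", rc_name, f"--replicas={replicas}"],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(shlex.quote(arg) for arg in argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # "myproject" is a placeholder; the default dry run only prints commands.
    run_all(pipeline_commands("myproject", "pipeline-edges-v1", 4))
```

Keeping dry_run=True as the default makes it possible to review the generated commands before anything is changed on the cluster.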

You can then verify that the pipeline pods are running and that the job succeeded:

$ oc get pod
NAME                      READY     STATUS    RESTARTS   AGE
etcd-kbi4n                1/1       Running   0          1h
pachd-z3b7y               1/1       Running   0          1h
pipeline-edges-v1-28vdj   1/1       Running   0          12s
pipeline-edges-v1-fpa8v   1/1       Running   0          12s
pipeline-edges-v1-mshi0   1/1       Running   0          12s
pipeline-edges-v1-yx2wa   1/1       Running   0          12s

$ pachctl list-job
ID                                   OUTPUT COMMIT                          STARTED        DURATION   RESTART PROGRESS STATE
1b2c1b49-f536-484f-b0e3-07b3906572be edges/006f0aecb2b048d5b5edee0cdb766879 55 minutes ago 51 minutes 0       1 / 1    success
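The pod check above can also be automated. The sketch below assumes the output of oc get pod has been captured as text and that it has whitespace-separated columns with NAME and STATUS headers, as in the listing above. The function name pods_not_running is hypothetical.

```python
def pods_not_running(oc_get_pod_output):
    """Return the names of pods whose STATUS column is not 'Running'."""
    lines = [line for line in oc_get_pod_output.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    name_col = header.index("NAME")
    status_col = header.index("STATUS")
    failing = []
    for line in lines[1:]:
        columns = line.split()
        if columns[status_col] != "Running":
            failing.append(columns[name_col])
    return failing


if __name__ == "__main__":
    sample = """\
NAME                      READY     STATUS    RESTARTS   AGE
etcd-kbi4n                1/1       Running   0          1h
pachd-z3b7y               1/1       Running   0          1h
pipeline-edges-v1-28vdj   1/1       Running   0          12s
"""
    # An empty list means every listed pod reports the Running status.
    print(pods_not_running(sample))
```

A non-empty result lists the pods to inspect further, for example with oc describe pod.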

Problems related to OpenShift deployment are tracked in this issue: https://github.com/pachyderm/pachyderm/issues/336. If you have further questions, please ask them on Pachyderm’s Slack channel or by email at support@pachyderm.io.