Non-Bundled Database Setup

This guide explains how to move from the bundled PostgreSQL that ships with Pachyderm for demo environments to a non-bundled PostgreSQL service, which is often preferred for development and production environments. In the cloud, it’s common to use a managed Postgres service from the same provider as your Kubernetes cluster, such as AWS RDS, Google Cloud SQL, or Azure Database for PostgreSQL.

Before You Start

Before making the switch to a non-bundled PostgreSQL service, ensure you have:

  • An operational Pachyderm instance using the bundled PostgreSQL
  • Access to a Postgres service: either a managed offering (such as AWS RDS, Google Cloud SQL, or Azure Database for PostgreSQL) or an existing on-premises instance
  • A basic understanding of database operations and SQL

How to Transition to a Non-Bundled PostgreSQL Service

Set Up a PostgreSQL Service

Establish a PostgreSQL service through your chosen provider or deploy one on-premises (for example, Crunchy Postgres). You’ll need to gather the following connection information:

  • Username
  • Password
  • Database name
  • Host address
  • Port number
  • SSL mode (options include require, prefer, disable)
  • SSL CA certificate (if SSL is in use)
  • SSL secret (if SSL is in use)
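
Before touching the Helm configuration, it can help to confirm that the values you gathered actually work. The following is a minimal sketch that assembles them into a libpq connection string. The variable names and default values are hypothetical placeholders, not part of Pachyderm; substitute your own.

```shell
# Sketch: build a libpq connection string from the gathered details.
# All defaults below are placeholders (assumptions), not real endpoints.
PG_HOST="${PG_HOST:-mydb.example.com}"
PG_PORT="${PG_PORT:-5432}"
PG_DATABASE="${PG_DATABASE:-pachyderm}"
PG_USER="${PG_USER:-pachyderm}"
PG_SSLMODE="${PG_SSLMODE:-require}"

# Prints a key=value connection string that psql accepts in place of a
# database name. The password is deliberately left out of the string.
pg_conninfo() {
  printf 'host=%s port=%s dbname=%s user=%s sslmode=%s' \
    "$PG_HOST" "$PG_PORT" "$PG_DATABASE" "$PG_USER" "$PG_SSLMODE"
}

pg_conninfo
echo
```

If the psql client is installed, running psql "$(pg_conninfo)" -c 'SELECT 1' from a machine that can reach the database should prompt for the password and return a single row. A failure at this point indicates a networking, credential, or SSL problem that would also affect Pachyderm.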

Update Helm Configuration

Update your values.yaml file, specifically the global.postgresql section, to connect to your PostgreSQL service. You have two options for this:

  1. Create a Kubernetes secret in the same namespace as your Pachyderm deployment, containing the credentials and connection details for your PostgreSQL service.

  2. Directly input the credentials into the values.yaml file.

Choose the method that best aligns with your security practices and deployment workflow.
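
As an illustration of the first option, the sketch below writes a small override file that points Pachyderm at an external database and reads the password from a Kubernetes secret. The key names are assumed from the global.postgresql section of the Pachyderm Helm chart and may differ between chart versions, so check them against the values.yaml for your release. The file name, secret name, secret key, and host are hypothetical.

```shell
# Sketch: write a values override for an external PostgreSQL service.
# Key names are assumptions based on the chart's global.postgresql section;
# confirm them for your chart version. Names and hosts are placeholders.
#
# The referenced secret would be created beforehand, for example:
#   kubectl create secret generic postgres-credentials \
#     --from-literal=postgresql-password='<password>' -n <your-namespace>
VALUES_FILE="${VALUES_FILE:-external-postgres-values.yaml}"

cat > "$VALUES_FILE" <<'EOF'
postgresql:
  enabled: false   # assumed switch that turns off the bundled PostgreSQL
global:
  postgresql:
    postgresqlUsername: "pachyderm"
    postgresqlExistingSecretName: "postgres-credentials"
    postgresqlExistingSecretKey: "postgresql-password"
    postgresqlDatabase: "pachyderm"
    postgresqlHost: "mydb.example.com"
    postgresqlPort: "5432"
    postgresqlSSL: "require"
EOF

echo "Wrote $VALUES_FILE"
```

Helm accepts several -f flags, with later files taking precedence, so an override file like this can be passed after your main values file instead of editing it in place. For the second option, the two postgresqlExistingSecret lines would presumably be replaced by a single password key holding the credential directly.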

Deploy the Updated Helm Chart

Apply the new configuration by upgrading your release with the values file you edited (my_pachyderm_values.yaml in this example):

helm upgrade pachyderm pachyderm/pachyderm -f my_pachyderm_values.yaml

Verify the Deployment

Check the status of your deployment to ensure that it is running correctly:

kubectl get pods -n <your-namespace>

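If the deployment is successful, the Pachyderm pods (such as pachd) should be listed with a Running status. If you disabled the bundled PostgreSQL, its pod should no longer appear in the output, because the database now runs outside the cluster.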
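
For larger namespaces, scanning the pod list by eye is error-prone. The helper below is a small sketch, not part of Pachyderm, that filters the output of kubectl get pods and prints only the pods that are not healthy. It assumes the default column layout (NAME, READY, STATUS, ...); the function name is hypothetical.

```shell
# Sketch: list pods whose STATUS is neither Running nor Completed.
# Reads the output of `kubectl get pods --no-headers` on standard input
# and assumes STATUS is the third whitespace-separated column.
not_ready_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}
```

A typical invocation would be kubectl get pods -n <your-namespace> --no-headers | not_ready_pods. Empty output means every pod reported Running or Completed; any name printed is worth inspecting with kubectl describe pod or kubectl logs.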