Non-Bundled Database Setup
This guide outlines how to transition from the bundled PostgreSQL that ships with Pachyderm for demo environments to a non-bundled PostgreSQL service, which is often preferred for development and production environments. In the cloud, it’s common to use the managed Postgres service from the same provider as your Kubernetes cluster, such as AWS RDS, Google Cloud SQL, or Azure Database for PostgreSQL.
Before You Start
Before making the switch to a non-bundled PostgreSQL service, ensure you have:
- An operational Pachyderm instance using the bundled PostgreSQL
- Access to a Postgres service (such as AWS RDS, Google Cloud SQL, Azure Database for PostgreSQL, or an existing on-prem instance)
- A basic understanding of database operations and SQL
How to Transition to a Non-Bundled PostgreSQL Service
Set Up a PostgreSQL Service
Establish a PostgreSQL service through your chosen provider or deploy one on-premises (for example, Crunchy Postgres). You’ll need to gather the following connection information:
- Username
- Password
- Database name
- Host address
- Port number
- SSL mode (options include require, prefer, disable)
- SSL CA certificate (if SSL is in use)
- SSL secret (if SSL is in use)
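As a rough sketch of how the details above fit together, the following shell snippet collects them in variables and assembles a libpq-style connection URI. Every value shown (user, password, host, database) is a hypothetical placeholder, not something defined by Pachyderm; substitute what your provider gives you.

```shell
# Hypothetical placeholder values; replace with your provider's details.
PG_USER="pachyderm"
PG_PASSWORD="change-me"
PG_DATABASE="pachyderm"
PG_HOST="db.example.com"
PG_PORT="5432"
PG_SSLMODE="require"   # one of: require, prefer, disable

# Assemble a libpq-style connection URI from the values above.
# Note: a password containing special characters would need URL-encoding.
PG_URI="postgresql://${PG_USER}:${PG_PASSWORD}@${PG_HOST}:${PG_PORT}/${PG_DATABASE}?sslmode=${PG_SSLMODE}"
echo "$PG_URI"

# Optional connectivity check from a machine that can reach the database:
#   psql "$PG_URI" -c 'SELECT 1;'
```

Checking connectivity with psql before touching the Helm configuration can help separate network or credential problems from Pachyderm configuration problems.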
Update Helm Configuration
Update your values.yaml file, specifically the global.postgresql section, to connect to your PostgreSQL service. You have two options for this:
- Create a Kubernetes secret in the same namespace as your Pachyderm deployment, containing the credentials and connection details for your PostgreSQL service.
- Directly input the credentials into the values.yaml file.
Choose the method that best aligns with your security practices and deployment workflow.
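The two options might look roughly like the sketch below, which writes a values file and defines (but does not run) a helper for creating the secret. The namespace, secret name, secret key, and host are hypothetical, and the key names under global.postgresql, as well as the postgresql.enabled switch, are assumptions about the chart; confirm them against the output of `helm show values pachyderm/pachyderm` before use.

```shell
# Hypothetical names; adjust to your environment.
NAMESPACE="pachyderm"
SECRET_NAME="pachyderm-postgres"
SECRET_KEY="postgresql-password"

# Option 1 helper: store the password in a Kubernetes secret.
# Defined here but not executed; call it as: create_pg_secret 'your-password'
create_pg_secret() {
  kubectl create secret generic "$SECRET_NAME" \
    --namespace "$NAMESPACE" \
    --from-literal="${SECRET_KEY}=$1"
}

# Write a values file that points Pachyderm at the external database.
# All key names below are assumptions; verify them against the chart.
cat > my_pachyderm_values.yaml <<EOF
postgresql:
  enabled: false   # assumed switch that turns off the bundled PostgreSQL
global:
  postgresql:
    postgresqlHost: "db.example.com"
    postgresqlPort: "5432"
    postgresqlSSL: "require"
    postgresqlUsername: "pachyderm"
    postgresqlDatabase: "pachyderm"
    # Option 1: reference the secret created by create_pg_secret.
    postgresqlExistingSecretName: "${SECRET_NAME}"
    postgresqlExistingSecretKey: "${SECRET_KEY}"
    # Option 2: set the password inline instead of using a secret.
    # postgresqlPassword: "change-me"
EOF
```

Referencing a secret keeps the password out of the values file, which matters if that file is committed to version control.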
Deploy the Updated Helm Chart
Deploy the updated Helm chart with the new configuration:
helm upgrade pachyderm pachyderm/pachyderm -f my_pachyderm_values.yaml
Verify the Deployment
Check the status of your deployment to ensure that it is running correctly:
kubectl get pods -n <your-namespace>
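If the deployment is successful, your Pachyderm pods (such as pachd) should report a Running status. Because the database now runs outside the Pachyderm deployment, a bundled PostgreSQL pod is not expected in the output if you disabled it.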
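To turn the pod listing into a pass/fail check, a small helper along these lines could be used. It is only a sketch: the function name pods_healthy is hypothetical, and it assumes the default column layout of `kubectl get pods`, where the third column is STATUS.

```shell
# Reads `kubectl get pods --no-headers` output on stdin and succeeds only
# when there is at least one pod and every pod reports Running or Completed.
pods_healthy() {
  awk '
    $3 != "Running" && $3 != "Completed" {
      bad++
      print "not ready: " $1 " (" $3 ")"
    }
    END { exit (bad > 0 || NR == 0) }
  '
}

# Usage (requires access to the cluster):
#   kubectl get pods -n <your-namespace> --no-headers | pods_healthy \
#     && echo "all pods healthy"
```

A status of Running does not by itself prove that Pachyderm reached the external database, so it is still worth reviewing the pachd logs with `kubectl logs` if anything behaves unexpectedly.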