Pachyderm Enterprise includes s3gateway, which allows you to interact with PFS storage through an HTTP API that imitates a subset of AWS S3’s API. With this, you can reuse a number of tools and libraries built to work with object stores (e.g. minio, Boto) to interact with pachyderm.
You can only interact with the HEAD commit of non-authorization-gated PFS branches through the gateway. If you need richer access, you’ll need to work with PFS through its gRPC interface instead.
Connecting to the s3gateway¶
You can start the s3gateway locally by running
pachctl s3gateway. Then point
your browser or favorite s3 tool at
http://localhost:30600. The s3gateway
also runs in the background on your pachyderm cluster, which you can reach via
Note: A third way to access the s3gateway is using the explicit port
pachctl port-forward). However, we don’t recommend it, as
kubernetes’ port forwarder incurs overhead, and does not recover well from
These operations are supported by the gateway:
- Creating buckets: creates a repo and/or branch.
- Deleting buckets: Deletings a branch and/or repo.
- Listing buckets: Lists all branches on all repos as s3 buckets.
- Writing objects: Atomically overwrites a file on the HEAD of a branch.
- Removing objects: Atomically removes a file on the HEAD of a branch.
- Listing objects: Lists the files in the HEAD of a branch.
- Getting objects: Gets file contents on the HEAD of a branch.
For details on what’s going on under the hood and current peculiarities, see the s3gateway API.
These S3 operations are not supported by the gateway:
- Object copying. PFS does support this through gRPC though, so if you need this, you can use the gRPC API directly.
- CORS configuration
- HTML form uploads
- Legal holds
- Multipart uploads. See writing object documentation above for a workaround.
- Object locks
- Payment requests
- Public access blocks
- Retention policies
- Website configuration
Attempting any of these operations will return a standard
Additionally, there are a few general differences:
- There’s no support for authentication or ACLs.
- As per PFS rules, you can’t write to an output repo. At the moment, a 500 error will be returned if you try to do so.
If you have the option of what S3 client library to use for interfacing with the s3gateway, we recommend minio, as integration with it’s go client SDK is thoroughly tested. These minio operations are supported: