Connect your Superb AI project to Pachyderm to automatically version and save the data you have labeled in Superb AI for use in downstream machine learning workflows.
This integration ingests the data into Pachyderm on a cron schedule. Once the data is in Pachyderm, you can run data tests, train a model, or perform any other data automation, all with full end-to-end reproducibility.
Before You Start #
- You must have a Superb AI account
- You must have a Pachyderm cluster
- Download the example code and unzip it, or clone this repo:
  gh repo clone pachyderm/docs-content
  and navigate to docs-content/docs/latest/integrate/superb-ai.
How to Use the Superb AI Connector #
- Generate an Access API Key in Superb AI.
- Put the key and your user name in the secrets.json file.
- Create the Pachyderm secret:
  pachctl create secret -f secrets.json
- Create the cron pipeline that synchronizes your Sample project from Superb AI to Pachyderm. This pipeline runs every minute to check for new data (you can configure it to run more or less often in the cron spec in sample_project.yml):
  pachctl create pipeline -f sample_project.yml
- Pachyderm will automatically kick off the pipeline and import the data from your sample project.
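The steps above can be collected into a short script. This is a minimal sketch and not part of the example code: the run helper, the DRY_RUN variable, and the setup_superbai_sync function are hypothetical names introduced for illustration, and the two pachctl list calls are added only as a way to check the result. It assumes you run it from the directory that contains secrets.json and sample_project.yml. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the documented pachctl commands in a dry-run helper.
set -eu

# Hypothetical switch: 1 (default) prints commands, 0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise run it as given.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

setup_superbai_sync() {
  # Store the Superb AI credentials as a Pachyderm secret.
  run pachctl create secret -f secrets.json
  # Create the cron pipeline that pulls the Sample project.
  run pachctl create pipeline -f sample_project.yml
  # Check that the pipeline exists and that jobs are being triggered.
  run pachctl list pipeline
  run pachctl list job
}

setup_superbai_sync
```

Running the script as is shows the four commands in order, which makes it easy to review them before running it again with DRY_RUN=0 against a real cluster.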