Ingest From a Croissant URL Workflow
This workflow ingests datasets described by a Croissant URL into ApertureDB.
You can use your own existing data, or a related dataset available on sites such as Hugging Face, Kaggle, and Google Dataset Search. This is an easy way to get started with ApertureDB and to see how it can be used with real data.
Creating the workflow
Creating and deleting workflows
For general information on creating workflows in ApertureDB Cloud see Creating and Deleting Workflows.
![Configure the Ingest From Croissant workflow](/assets/images/configure_ingest_from_croissant-7086dcdaee06a33b8535e0f132d6c1ed.png)
- Enter the public URL of an ML Croissant dataset, for example MNIT-CoT-Dataset or text-2-video-human-preferences-veo3 on Hugging Face.
- Click the blue button at the bottom.
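Before pasting a URL into the form, it can help to sanity-check that it is a public HTTP(S) link and that the metadata it returns looks like Croissant. The sketch below is an illustration only, not part of the workflow: the helper names are hypothetical, and the checks assume that a Croissant JSON-LD document declares a `conformsTo` value under `mlcommons.org/croissant` and a schema.org `Dataset` type.

```python
from urllib.parse import urlparse

# Assumption: Croissant metadata declares the spec it follows in "conformsTo",
# e.g. "http://mlcommons.org/croissant/1.0".
CROISSANT_SPEC_MARKER = "mlcommons.org/croissant"

# Assumption: the top-level "@type" is a schema.org Dataset in one of these spellings.
DATASET_TYPES = {"sc:Dataset", "Dataset", "https://schema.org/Dataset"}


def is_public_http_url(url: str) -> bool:
    """Return True if the string is an http(s) URL with a host part."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def looks_like_croissant(metadata: dict) -> bool:
    """Heuristic check on already-parsed JSON-LD; not a full validation."""
    conforms = metadata.get("conformsTo", "")
    values = conforms if isinstance(conforms, list) else [conforms]
    declares_spec = any(
        isinstance(value, str) and CROISSANT_SPEC_MARKER in value
        for value in values
    )
    return declares_spec and metadata.get("@type") in DATASET_TYPES
```

To apply it, fetch the URL with any HTTP client, parse the body as JSON, and pass the resulting dictionary to `looks_like_croissant`. A `False` result only means the heuristic did not match; the workflow itself remains the authority on what it can ingest.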
See the GitHub repository for more information
For more detail about what this workflow does, its parameters, and how to run it outside of ApertureDB Cloud, see the dataset-ingestion documentation on GitHub.
Getting Croissant links for datasets
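For datasets hosted on Hugging Face, the Croissant metadata is served from the Hub API at `/api/datasets/<repo_id>/croissant`. The following is a minimal sketch of building that link from a repository ID. The function name and the `example-org/example-dataset` ID are hypothetical, and other sites expose their Croissant links differently.

```python
from urllib.parse import quote

HF_API_BASE = "https://huggingface.co/api/datasets"


def huggingface_croissant_url(repo_id: str) -> str:
    """Build the Croissant metadata URL for a Hugging Face dataset repo.

    repo_id is expected as "owner/name" (or a bare "name"); this helper is
    an illustration, not part of any ApertureDB or Hugging Face library.
    """
    parts = repo_id.strip().strip("/").split("/")
    if len(parts) > 2 or any(not part for part in parts):
        raise ValueError(f"expected 'owner/name', got {repo_id!r}")
    # Percent-encode each path segment so unusual characters stay valid.
    path = "/".join(quote(part, safe="") for part in parts)
    return f"{HF_API_BASE}/{path}/croissant"


if __name__ == "__main__":
    # Hypothetical repository ID, used only to show the URL shape.
    print(huggingface_croissant_url("example-org/example-dataset"))
```

The resulting URL is what you would paste into the workflow form, provided the dataset is public.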
See the results
On the "My Instances" page, click "Connect" for the instance you used to see an option to open the Web UI for that instance. The number of objects in the database increases as the workflow runs. Click the refresh button to update the count.
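If you prefer to track progress from a script instead of the Web UI, one option is to total the per-class counts in a schema summary. The sketch below is an assumption-laden illustration: it assumes the summary is a dictionary shaped like the result of ApertureDB's `GetSchema` query, with `entities` and `connections` sections that each hold `classes` carrying a `matched` count. Obtaining that dictionary from your instance is left out, and the function name is hypothetical.

```python
def count_objects(schema: dict) -> dict:
    """Sum 'matched' counts per section of a schema-summary dictionary.

    Assumed shape (not verified against a live instance):
    {"entities": {"classes": {"<name>": {"matched": <int>}}},
     "connections": {"classes": {"<name>": {"matched": <int>}}}}
    Missing or null sections are treated as empty.
    """
    counts = {}
    for section_name in ("entities", "connections"):
        section = schema.get(section_name) or {}
        classes = section.get("classes") or {}
        counts[section_name] = sum(
            int(info.get("matched", 0)) for info in classes.values()
        )
    counts["total"] = counts["entities"] + counts["connections"]
    return counts


if __name__ == "__main__":
    # Made-up sample data, only to show the expected input shape.
    sample = {
        "entities": {"classes": {"_Image": {"matched": 10}, "Record": {"matched": 5}}},
        "connections": {"classes": {"hasImage": {"matched": 10}}},
    }
    print(count_objects(sample))
```

Calling this on successive snapshots while the workflow runs should show the totals growing, mirroring what the refresh button does in the Web UI.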

