During this series, we will use Tenyks to build a Data-Centric pipeline to debug and fix a model trained with the NVIDIA TAO Toolkit.
Part 1. We demystify the NVIDIA ecosystem and define a Data-Centric pipeline based on a model trained with the NVIDIA TAO framework.
Part 2 (current). Using the Tenyks API, we show you how to upload a dataset and a model to the Tenyks platform.
Part 3. We identify failures in our model due to data issues, fix these failures and improve our model’s performance with the fixed dataset.
Table of Contents
- Recap from Part 1
- Getting an access token
- Uploading our dataset
- Uploading our model
- Manually uploading a dataset/model
- What’s next
1. Recap from Part 1
In the first part of this series we focused on three main things:
- We introduced a blueprint for building a Data-Centric pipeline.
- We broke down the NVIDIA TAO Toolkit and several of the moving parts of the NVIDIA ecosystem.
- We briefly introduced the Tenyks platform.
Before continuing, make sure that you have the following prerequisites:
- A model trained on the dataset we introduced in Part 1. You can either venture into training one yourself with the NVIDIA TAO Toolkit or download a pre-trained model.
- Both the annotations of your dataset and the predictions of your model are expected to be in COCO format. If you opt for the pre-trained model (already in COCO format) instead of training one yourself, you can disregard this step.
- A data storage provider with the appropriate configuration to store your data. In this post we use Azure, but you can use the provider of your choice.
- A Tenyks account.
⚠️ Tenyks' advanced features, including the API, are available for premium users only. However, to make the most out of this series, Section 5 describes how you can upload a dataset/model on the freemium version of Tenyks. Please sign up for a sandbox account.
2. Getting an access token
To obtain an access token, first generate your API keys (see Figure 1).

The API keys are used to retrieve an access token, as shown in the code below.
💡 Hint: the documentation contains examples in programming languages other than Python (e.g., Node, Go, PHP, Ruby).
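As a minimal sketch, the request below retrieves a token with Python's `requests` library. The authentication URL and the payload field names are assumptions based on a typical OAuth2 client-credentials flow; substitute the values from your generated API keys.

```python
import requests

# Hypothetical auth endpoint -- replace with the URL shown next to your API keys
AUTH_URL = "https://tenyks.auth.example.com/oauth/token"

payload = {
    "client_id": "<your_api_key_id>",          # from the API keys page
    "client_secret": "<your_api_key_secret>",  # from the API keys page
    "grant_type": "client_credentials",
}

response = requests.post(AUTH_URL, json=payload)
response.raise_for_status()
access_token = response.json()["access_token"]
```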
A successful response resembles the following (illustrative values; the exact fields depend on the auth provider):
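```json
{
  "access_token": "eyJhbGciOiJSUzI1NiIsInR5cCI6...",
  "token_type": "Bearer",
  "expires_in": 3600
}
```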
Save the value of `access_token`; we'll use it during the rest of the article.
⚠️ Be aware that the access token has an expiration time of 3,600 seconds or 60 minutes — follow the same procedure to obtain a new one, if required.
💡Hint: Our following code examples might appear verbose to the trained eye, and in fact they are: for this series, we aim to be as explicit as possible!
3. Uploading our dataset to Tenyks
⚠️️ For the code examples in Sections 3 and 4:
- We use Azure as the data storage provider throughout this post.
- Replace `credentials.value` and `azure_uri` with your own values.
- Use your `access_token` in the headers.
3.1 Images & Annotations Ingestion
We assume you have configured your data storage provider as mentioned in Section 1. If you haven't, detailed instructions on how to set up Azure can be found in the documentation.
Let's push our dataset to Tenyks! 🚀
Ingestion involves three steps, sketched in the code after this list:
- Uploading images
- Uploading annotations
- Ingesting the images and the annotations into Tenyks
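Below is a minimal sketch of all three steps. The base URL, endpoint paths, payload field names, and resource keys are assumptions for illustration, not the documented Tenyks API; replace the Azure placeholders with your own container URI and SAS token.

```python
import requests

BASE_URL = "https://api.tenyks.example.com/v1"  # hypothetical base URL
headers = {"Authorization": f"Bearer {access_token}"}  # token from Section 2

azure_uri = "https://<storage-account>.blob.core.windows.net/<container>"  # your container URI
credentials_value = "<sas-token>"  # your Azure SAS token (credentials.value)

# Step 1: upload the images by creating a dataset that points at the
# Azure location where the images live.
resp = requests.post(
    f"{BASE_URL}/datasets",
    headers=headers,
    json={
        "name": "tao_dataset",  # hypothetical dataset name
        "images_location": {
            "azure_uri": azure_uri,
            "credentials": {"type": "azure_sas", "value": credentials_value},
        },
    },
)
resp.raise_for_status()
dataset_key = resp.json()["key"]  # hypothetical response field

# Step 2: upload the COCO annotations file stored in the same container.
resp = requests.put(
    f"{BASE_URL}/datasets/{dataset_key}/annotations",
    headers=headers,
    json={
        "format": "coco",
        "azure_uri": f"{azure_uri}/annotations.json",  # hypothetical file name
        "credentials": {"type": "azure_sas", "value": credentials_value},
    },
)
resp.raise_for_status()

# Step 3: trigger ingestion so the platform indexes images and annotations.
resp = requests.put(f"{BASE_URL}/datasets/{dataset_key}/ingest", headers=headers, json={})
resp.raise_for_status()
```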
Voilà! You will now see your data on the platform (see Figure 2).

4. Uploading our model to Tenyks
4.1 Model Predictions Ingestion
Next, let's upload our model to Tenyks. This involves two steps, sketched in the code after this list:
- Creating a model
- Uploading model predictions
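Continuing the sketch from Section 3 (same hypothetical base URL, headers, and Azure placeholders), the two steps might look as follows:

```python
# Step 1: create a model entry attached to our dataset (hypothetical route).
resp = requests.post(
    f"{BASE_URL}/models",
    headers=headers,
    json={"name": "tao_model", "dataset_key": dataset_key},  # hypothetical names
)
resp.raise_for_status()
model_key = resp.json()["key"]  # hypothetical response field

# Step 2: upload the model's COCO-format predictions stored in Azure.
resp = requests.put(
    f"{BASE_URL}/models/{model_key}/predictions",
    headers=headers,
    json={
        "format": "coco",
        "azure_uri": f"{azure_uri}/predictions.json",  # hypothetical file name
        "credentials": {"type": "azure_sas", "value": credentials_value},
    },
)
resp.raise_for_status()
```

With both the dataset and the model's predictions ingested, we are ready to start debugging the model in Part 3.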