How to iterate over a large set of data with Ikomia STUDIO

Published by Ludovic Barusseau on

Let us introduce you to the Batch Processing mode of the Ikomia platform. This tutorial is the logical continuation of how to use Ikomia STUDIO, so we assume that you are already familiar with the key workflow features.

Batch processing mode

– 5 min read –

An extension to iterate over a large set of data

With Ikomia STUDIO, you can build custom image processing pipelines easily, in a no-code manner. We often use only a few images or videos while developing a new workflow for a specific application. However, the validation phase needs a lot more data to check reliability and accuracy. This is when the Batch Processing mode comes into play: it extends the STUDIO with the ability to iterate over a large set of data.

Set inputs

The first step is to set global inputs for your workflow. As you may know, data should already be loaded as a project in Ikomia. The software lets you load multiple images, videos or an entire filesystem folder. The data are then structured into a dataset and added to a project. For instance, this is what a project looks like after opening a filesystem folder containing images:

Project view

Once data are loaded, we create a new workflow and add algorithms to it. After that, we add a new global input to set the list of images (or videos) we want to process. Within the project structure, you can select data at different levels. You just need to follow the wizard for data type selection:

  • images or videos
  • datasets (contain images or videos)
  • folders (contain datasets)

Input selection

Finally, connect the global input to your root node. Global input data can be modified at any time by clicking the corresponding input button.

Note: a special icon appears in the workflow input area to notify the user that the input is a batch.
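
One way to picture the three selection levels is that each of them resolves to a flat list of images or videos. The sketch below is only a mental model: the `Dataset` and `Folder` classes and the `expand_selection` function are hypothetical names made up for this illustration, not part of the Ikomia API.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Dataset:
    """Hypothetical stand-in for a project dataset: a named list of image/video paths."""
    name: str
    items: List[str] = field(default_factory=list)


@dataclass
class Folder:
    """Hypothetical stand-in for a project folder: a named list of datasets."""
    name: str
    datasets: List[Dataset] = field(default_factory=list)


def expand_selection(selection: List[Union[str, Dataset, Folder]]) -> List[str]:
    """Flatten a mixed selection (paths, datasets, folders) into one list of paths."""
    batch: List[str] = []
    for entry in selection:
        if isinstance(entry, str):        # a single image or video
            batch.append(entry)
        elif isinstance(entry, Dataset):  # every item of the dataset
            batch.extend(entry.items)
        elif isinstance(entry, Folder):   # every item of every dataset in the folder
            for dataset in entry.datasets:
                batch.extend(dataset.items)
        else:
            raise TypeError(f"Unsupported selection entry: {entry!r}")
    return batch


cells = Dataset("cells", ["cell_01.png", "cell_02.png"])
archive = Folder("archive", [Dataset("old", ["old_01.png"])])
batch = expand_selection(["single.png", cells, archive])
```

Here `batch` contains the single image first, then the two images of `cells`, then the image found inside the `archive` folder.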

Manage outputs

At this point, our workflow is ready to iterate over all selected data. But what happens to the outputs of each node? Are they lost each time the workflow is executed on a new image?

As soon as we connect a batch input to the root node, the Workflow Creator switches to Batch Processing mode. Consequently, a new button appears on each node, disabled by default.

This button lets you enable automatic save for all outputs of a specific node. Once enabled, the STUDIO saves all output data to a predefined folder and loads them into the current project when the batch job is finished.

Outputs automatic save

Understand auto-save feature

When dealing with batch processing, automatic save of output data is mandatory. The engine behind the STUDIO implements a specific architecture to provide such a feature. Each output type comes with its own save mechanism. In other words, each output type is responsible for defining:

  • what data has to be saved.
  • in which file format.

This behavior is extensible to new data types, which is very useful for Ikomia plugins (out of scope here). Here is the list of compatible output types and their formats:

Output save formats

In addition, the STUDIO provides a convenient file organization and naming convention to ease further data analysis. By default, the root save folder is located at $HOME/Ikomia/Workflows/. You can change this default location in the global preferences of Ikomia.

In the previous section, we described how to enable auto-save for a node. In that case, all outputs of this node are saved at runtime. If you want to customize this behavior for individual outputs, just follow these few steps:

  • select the node (mouse click)
  • open the I/O tab in the information area of the Workflow Creator
  • check or uncheck auto-save property for the desired output
  • that’s it!

Auto-save property

Run the workflow

You are now able to select a batch input for your custom workflow and configure auto-save for your outputs. It’s time to run it. The STUDIO will iterate over all elements (images, videos or volumes) contained in the batch input, apply the chain of algorithms to each of them and save the selected outputs automatically.

Note: the STUDIO does not speed up Batch Processing by running it as parallel jobs. However, each algorithm in the workflow may use multi-processing techniques and GPU optimizations.
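
The run described above can be summarized as a simple sequential loop. The following is a rough sketch under strong assumptions: the workflow is reduced to a linear chain of nodes, strings stand in for images, and `Node`, `run_batch` and the `save` callback are hypothetical names for this illustration rather than Ikomia functions.

```python
from typing import Any, Callable, List, NamedTuple


class Node(NamedTuple):
    """Hypothetical workflow node: a name, a processing function and an auto-save flag."""
    name: str
    run: Callable[[Any], Any]
    auto_save: bool = False  # disabled by default, as in the Workflow Creator


def run_batch(items: List[Any], nodes: List[Node],
              save: Callable[[Any, str, Any], Any]) -> List[Any]:
    """Process items one after another and save the outputs of auto-save nodes."""
    saved = []
    for item in items:              # sequential iteration, no parallel jobs
        data = item
        for node in nodes:          # apply the chain of algorithms in order
            data = node.run(data)
            if node.auto_save:      # only outputs of selected nodes are kept
                saved.append(save(item, node.name, data))
    return saved


# Toy example: the "algorithms" are plain string operations.
nodes = [
    Node("upper", str.upper),
    Node("reverse", lambda text: text[::-1], auto_save=True),
]
results = run_batch(
    ["a.png", "b.png"],
    nodes,
    save=lambda item, node_name, output: (item, node_name, output),
)
```

In this toy run, only the output of the `reverse` node is recorded for each element, because it is the only node with auto-save enabled.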

