Demo Training Data Project: Difference between revisions

Latest revision as of 19:24, 31 January 2025

The Demo Training Data project is available for all users and can be found in the main menu under Edit projects. This project does not count towards your license.

This project is intended for people who are working in fields such as AI, remote sensing, data analysis, and urban planning.

This project showcases a method in the Tygron Platform to export a dataset for training a Mask R-CNN model.

The demo is a working project in which a number of areas are drawn, using a satellite overlay as underlay. These areas are marked with specific attributes, making it easy to export them as a training or test set using an export option in the Tygron Platform.

Project's content

The Demo Training Data project contains shapes of foliage, drawn as areas on top of a Satellite Overlay. These areas, in combination with satellite images, can be exported as a dataset for training a Mask R-CNN model. Such a model can then be applied to a different project location to detect foliage. A specific use case is detecting foliage on private property, such as gardens and private yards.

Individual foliage areas
Satellite Overlay serving as input for images

AI Training Data

In order to train a Mask R-CNN model, we need to obtain two datasets, one for training and one for testing. A dataset consists of a list of numbered files, with the number representing the data entry number. Each data entry consists of the following files:

An input image, as a PNG
A mask image, as a PNG, containing gray values, where each gray value represents a unique identifiable feature.
A label text file containing the classes of the identifiable features, where each class is identified with an integer number (larger than 0).
Optional: A text file containing the bounding-box pixel numbers; min x, min y, max x, max y, for each identifiable feature in the input image.

Input image, in this case an image exported using the Satellite Overlay
Mask image with each unique feature identified by a gray value.
Class labels
Bounding box coordinates
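
To illustrate how the mask image and the optional bounding-box file relate, the sketch below derives min x, min y, max x, max y for each gray value in a small mask. It is a minimal illustration, not the Tygron exporter: the mask is a hand-written nested list rather than a decoded PNG, gray value 0 is assumed to be background, and the function name `bounding_boxes` is hypothetical.

```python
def bounding_boxes(mask):
    """Return {gray_value: (min_x, min_y, max_x, max_y)} for each non-zero value.

    `mask` is a list of rows; each row is a list of integer gray values.
    Assumption: 0 marks background pixels that belong to no feature.
    """
    boxes = {}
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value == 0:
                continue  # skip background
            if value not in boxes:
                # First pixel of this feature: the box is a single pixel.
                boxes[value] = (x, y, x, y)
            else:
                min_x, min_y, max_x, max_y = boxes[value]
                boxes[value] = (
                    min(min_x, x),
                    min(min_y, y),
                    max(max_x, x),
                    max(max_y, y),
                )
    return boxes


# A toy mask of 4 rows by 5 columns with two features (gray values 1 and 2).
example_mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 0, 2],
]
example_boxes = bounding_boxes(example_mask)
```

Each gray value yields exactly one box, which mirrors the rule that every gray value in the mask identifies one unique feature.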

The Tygron Platform allows users to export datasets. The Demo Training Data project is set up to contain two separate sets of Areas to export: one set of Areas with the attribute TRAIN_FOLIAGE and one with the attribute TEST_FOLIAGE. Each set can be exported together with an Overlay. Since both sets are drawn based on a Satellite Overlay named Original Satellite, we will use that Overlay.

Export a dataset

This how-to explains how to generate AI Training Data using the Tygron Platform. This data can be used to train your own AI model.

How to export AI Training Data:

Select current situation in the ribbon bar
Hover over Areas, and select Export Geo Data
Under Format, select the option AI Training Data
Select the filter option, and select the following attribute TRAIN_FOLIAGE
For the Overlay option, select the Satellite Overlay named Original Satellite.
Set the image size to 200 by 200. This ensures that the number of areas per exported image does not exceed the maximum of 250.
The stride can be kept at 50%
Click on the Export Files button
Select a suitable folder for the generated dataset. For example: user/documents/foliage/train/demo_training_data
Wait until the dataset is fully generated.

Do the same for the attribute TEST_FOLIAGE and export it to a different directory, for example user/documents/foliage/test/demo_training_data
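
How image size and stride interact can be sketched as follows. This assumes that the stride is the step between neighbouring image origins, expressed as a fraction of the image size, so that 50% with 200-pixel images means a step of 100 pixels. The exact tiling used by the Tygron Platform may differ, and `tile_origins` is a hypothetical helper.

```python
def tile_origins(extent, image_size=200, stride_fraction=0.5):
    """Return the start offsets (in pixels) of image tiles along one axis.

    Assumption: tiles that would extend past `extent` are skipped.
    """
    if image_size <= 0:
        raise ValueError("image_size must be positive")
    if not 0 < stride_fraction <= 1:
        raise ValueError("stride_fraction must be in (0, 1]")
    step = max(1, int(image_size * stride_fraction))
    last_start = max(0, extent - image_size)
    return list(range(0, last_start + 1, step))


# Along a 500-pixel axis: overlapping tiles versus non-overlapping tiles.
half_stride = tile_origins(500)
full_stride = tile_origins(500, stride_fraction=1.0)
```

Under these assumptions, a feature spanning pixels 180 to 220 is cut by the border between the tiles starting at 0 and 200 when the stride is 100%, but lies fully inside the tile starting at 100 when the stride is 50%.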

Train your own model

A Tygron AI Suite is available on GitHub. This repository contains the necessary files to configure a Conda environment to train a new Mask R-CNN AI model.

Anaconda Navigator will be used to manage a Conda environment and run Jupyter Notebooks with Python.

Example notebooks provided in the repository can be used as a basis to train your own model.

How to train your own AI model for an Inference Overlay:

If you are not familiar with GitHub, you can download the zip containing all the files that you need. Otherwise, clone the git repository.
If you downloaded the zip, unzip it. Open the folder containing the downloaded files; this is the local repository.
Download and install Anaconda Navigator.
Open the Anaconda Navigator application
Select the Environments tab and click on the Import button.
Select the file "tygronai.yml" from the local repository of the tygron-ai-suite. This file will automatically set up the environment needed to open, edit and execute the Jupyter Notebooks. The import downloads all necessary dependencies and may take a while.
Once configured, select the Home tab
Open either the JupyterLab or Jupyter Notebook application. You might need to install it first by clicking on the Install button.
A browser will open with the selected application.
Browse to the folder of the tygron-ai-suite repository and select the "example_config.ipynb".
Adjust the following parameters:
```
trainDirectory = "PATH TO TRAIN FILES"
```
```
testDirectory = "PATH TO TEST FILES"
```
to the folders containing the exported datasets. See how to export AI datasets for more information.
Press the double arrow button named Restart kernel and execute all cells to run the Jupyter Notebook. See the images below of what to expect.
Eventually an ONNX file will be created.
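
Before running the notebook, it can help to confirm that the train and test directories actually contain exported images. The sketch below is a hedged sanity check rather than part of the tygron-ai-suite: `count_png_files` is a hypothetical helper, and it only assumes that the export produces PNG files somewhere below the given folder.

```python
from pathlib import Path


def count_png_files(directory):
    """Count PNG files in `directory`, including all of its sub-folders."""
    root = Path(directory)
    if not root.is_dir():
        raise FileNotFoundError(f"Dataset directory not found: {root}")
    return sum(
        1
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() == ".png"
    )
```

Calling `count_png_files(trainDirectory)` and `count_png_files(testDirectory)` in a notebook cell should return a number larger than zero for both; an error or a zero usually means the path points to the wrong folder.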

Applying your own model

Once you have trained your own AI model and exported it to ONNX, you can import this model into your project and apply it using an Inference Overlay.

However, you will first need to create a new project at a different geographical location than that of this demo project. Please follow the steps of the New Project Wizard.

Once you have a new (non-empty) project open in the editor, you can continue with the configuration of an Inference Overlay.

To do so, drag the ONNX file onto the Tygron Client Application
With the popup menu shown, select the option: Import as an Inference Overlay. This Inference Overlay should be configured almost automatically.
Next, add a new Satellite Overlay.
Next, select the Inference Overlay and for Input A, select the Satellite Overlay

ONNX file popup in the Tygron Client
Input Satellite Overlay
Inference Overlay using the imported ONNX file

Considerations

When exporting data and training your own AI model, there are a few considerations that require attention.

Export Considerations

There are several considerations when exporting AI Training Data:

The exported features (Buildings, Areas, etc.) should match the overlay that these are exported with.
A larger image size leads to fewer inference calls by the Inference Model and therefore a faster calculation. However, the number of identifiable features within any image of the train dataset may not exceed 250.
The stride fraction can be configured to create more images and to ensure that features lie fully inside some images instead of only on image borders.
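
The limit of 250 identifiable features per image can be checked on an exported mask. The sketch below is a minimal illustration that works on a mask already decoded into rows of gray values; decoding the PNG is left out, gray value 0 is assumed to be background, and both function names are hypothetical.

```python
MAX_FEATURES_PER_IMAGE = 250  # limit stated in the export considerations


def count_features(mask):
    """Count the distinct non-zero gray values in a mask given as a list of rows."""
    return len({value for row in mask for value in row if value != 0})


def within_feature_limit(mask, limit=MAX_FEATURES_PER_IMAGE):
    """Return True when the mask holds no more than `limit` features."""
    return count_features(mask) <= limit
```

If a mask fails this check, exporting with a smaller image size reduces the number of features that fall into a single image.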

Training Considerations

There are some things to consider:

The train sets can obtain data files from sub-folders. For example, if data was exported to user/documents/foliage/train/demo_training_data, the PATH_TO_TRAINING_DATA can be user/documents/foliage/train/. This makes it easier to train on data exported from multiple projects and to combine it with your own custom data.
To train a better model, increase the number of trained epochs, for example to 10:
```
config.setEpochs(10)
```
Please note that training for too many epochs may not improve the model; it can overfit the training data and become too strict.
If any dataset contains more label classes than you would like to recognize, you can adjust the autoLimitLabel configuration to:
```
config.setAutoLimitLabel(True)
```

Inference Overlay considerations

The input prequel overlay should match the data the model was trained on. For example, if the model was trained on a satellite overlay, you should not use a WMS Overlay of an infrared service as input.
The grid cell size should match the precision of the images the model was trained on.
