How to evaluate an AI model
AI can be a strong tool to enhance, reinforce, and analyze a variety of datasets for a variety of use cases. However, because an AI model is created through training, its performance and results require explicit validation. For this purpose, after training an AI model with the appropriate training and testing datasets, it should be set to analyze a third, separate "evaluation dataset" to validate its accuracy.
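Such a split can be prepared before any training takes place. Below is a minimal sketch of a three-way split using Python and scikit-learn; the dataset, names, and ratios are illustrative assumptions, not part of the platform itself.

 # Split a dataset into training, testing, and evaluation parts.
 # All names and ratios here are illustrative assumptions.
 from sklearn.model_selection import train_test_split

 def three_way_split(samples, labels, seed=42):
     # Reserve 20% of the data as the fully separate evaluation dataset.
     x_rest, x_eval, y_rest, y_eval = train_test_split(
         samples, labels, test_size=0.2, random_state=seed)
     # Split the remainder into training (75%) and testing (25%) datasets.
     x_train, x_test, y_train, y_test = train_test_split(
         x_rest, y_rest, test_size=0.25, random_state=seed)
     return (x_train, y_train), (x_test, y_test), (x_eval, y_eval)

The evaluation part is never shown to the model during training, so it can serve as the independent check described below.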
The best method for this evaluation is to compare the geographical results of the Inference Overlay running the model with an Average Overlay or Combo Overlay displaying the known dataset.
How to evaluate an AI model:
- Create a validation project with a set of Areas that can be used for validation. The structure of this dataset should be the same as that of the training and testing datasets, but it should be a fully separate dataset. For example: Areas that define where trees are located, with the attribute FOLIAGE_TYPE.
- Create a Combo Overlay, with the following configuration:
Name: Evaluation Data
Attribute Input A: The Attribute of the evaluation dataset. (Continuing the example: FOLIAGE_TYPE)
Formula: @A
- Create an Inference Overlay, with the following configuration:
Name: AI Model Overlay
ONNX File: Your ONNX file to test
Result Type: Labels
- Run the Inference Overlay, so that the model contained in the ONNX file analyzes the validation project. (A sketch for checking the ONNX file outside the platform follows this list.)
- Create a second Combo Overlay, with the following configuration:
Name: Evaluation Result
Input A: Evaluation Data
Input B: AI Model Overlay
Formula: IF( AND( EQ(A,0), EQ(B,0) ) , NO_DATA , IF( EQ(A,B) , 1 , 0 ) )
(A plain-code sketch of what this formula computes follows this list.)
- Configure the legend of the Evaluation Result Overlay to have distinct values for 1 and 0.
- The result will be an Overlay showing differences between the evaluation dataset and the AI model results.
Each place with a value "1" is a successfully detected feature.
Each place with a 0 is either a (part of a) feature that was not detected, or a (partially) false positive.
The greater the number of 1s relative to the total number of 1s and 0s, the better the model has performed.
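Before relying on the Inference Overlay results, the ONNX file itself can be sanity-checked outside the platform. This is a sketch using the onnxruntime Python library, not a platform feature; the file name, input shape, and dummy data are assumptions, and sess.get_inputs() reveals the actual expected input.

 # Load the ONNX file and run it once on dummy data to confirm it
 # produces label output. File name and input shape are assumptions.
 import numpy as np
 import onnxruntime as ort

 sess = ort.InferenceSession("model.onnx")
 inp = sess.get_inputs()[0]
 print("expected input:", inp.name, inp.shape, inp.type)

 # Assumed here: a single input of four float features per sample.
 dummy = np.zeros((1, 4), dtype=np.float32)
 outputs = sess.run(None, {inp.name: dummy})
 print("model output:", outputs[0])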
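The Evaluation Result formula can also be expressed in plain code, which may help when reasoning about the comparison. This NumPy sketch mirrors the formula cell by cell; the grids and the use of NaN as NO_DATA are illustrative, and inside the platform the Combo Overlay performs this computation itself.

 # Mirror of IF( AND( EQ(A,0), EQ(B,0) ), NO_DATA, IF( EQ(A,B), 1, 0 ) ).
 import numpy as np

 NO_DATA = np.nan  # illustrative stand-in for the platform's NO_DATA value

 def evaluation_result(a, b):
     result = np.where(a == b, 1.0, 0.0)    # 1 where the model matches the data
     result[(a == 0) & (b == 0)] = NO_DATA  # ignore cells empty in both grids
     return result

 evaluation_data = np.array([[1, 1, 0],
                             [0, 2, 0]], dtype=float)  # known values
 model_overlay = np.array([[1, 0, 0],
                           [0, 2, 1]], dtype=float)    # model labels
 print(evaluation_result(evaluation_data, model_overlay))
 # [[ 1.  0. nan]
 #  [nan  1.  0.]]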
Notes
- It is also possible to aggregate the results in an Indicator. Use the following calculation:
- Total points: SELECT_GRIDAREA_WHERE_GRID_IS_(Evaluation Result)_AND_MINGRIDVALUE_IS_0
- Successful points: SELECT_GRIDAREA_WHERE_GRID_IS_(Evaluation Result)_AND_MINGRIDVALUE_IS_1
- Accuracy of model: successful points / total points
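Under the assumption that these two queries select the grid cells whose value is at least 0 and at least 1 respectively, the same accuracy can be computed in plain code from the evaluation grid. This sketch builds on the evaluation_result() sketch above; the grid values are illustrative.

 # Count cells valued 0 or 1 (total) and cells valued 1 (successful).
 # NO_DATA cells are NaN, so they fail both comparisons and are skipped.
 import numpy as np

 def model_accuracy(result_grid):
     total = np.count_nonzero(result_grid >= 0)
     successful = np.count_nonzero(result_grid >= 1)
     return successful / total if total else float("nan")

 grid = np.array([[1, 0, np.nan],
                  [np.nan, 1, 0]])
 print(model_accuracy(grid))  # 2 of 4 counted cells match -> 0.5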