How to evaluate an AI model

AI can be a powerful tool for enhancing, reinforcing, and analyzing a wide variety of datasets and use cases. However, because an AI model is created through training, evaluating its performance and results requires explicit validation. For this purpose, after training an AI model with the appropriate training and testing datasets, it should be set to analyze a third, separate "evaluation dataset" to validate its accuracy.
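
The three-way split can be illustrated with a short Python sketch. This is a minimal, hypothetical example: the array sizes, the 70/15/15 split ratio, and all variable names are illustrative assumptions, not Tygron requirements.

  import numpy as np

  # Hypothetical samples and labels; in practice these would come from
  # your GIS data (e.g. rasterized Areas with a FOLIAGE_TYPE attribute).
  rng = np.random.default_rng(seed=42)
  samples = rng.random((1000, 8))          # 1000 samples, 8 features each
  labels = rng.integers(0, 2, size=1000)   # binary labels

  # Shuffle once, then cut into three disjoint sets. The evaluation set
  # is never shown to the model during training or model selection.
  order = rng.permutation(len(samples))
  train_idx, test_idx, eval_idx = order[:700], order[700:850], order[850:]

  train_x, train_y = samples[train_idx], labels[train_idx]
  test_x, test_y = samples[test_idx], labels[test_idx]
  eval_x, eval_y = samples[eval_idx], labels[eval_idx]

  print(len(train_x), len(test_x), len(eval_x))  # 700 150 150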

The best method for this evaluation is to compare the geographical results of the Inference Overlay running the model with an Average Overlay or Combo Overlay displaying the known dataset.

How to evaluate an AI model:
  1. Create a validation project with a set of Areas that can be used for validation. This dataset should have the same structure as the training and testing datasets, but must be fully separate from both.
    For example, Areas that define where trees are located, with the attribute FOLIAGE_TYPE.
  2. Create a Combo Overlay, with the following configuration:
    Name: Evaluation Data
    Attribute Input A: The Attribute of the evaluation dataset. (Continuing the example: FOLIAGE_TYPE)
    Formula: @A
  3. Create an Inference Overlay, with the following configuration:
    Name: AI Model Overlay
    ONNX File: Your ONNX file to test
    Result Type: Labels
  4. Recalculate the Inference Overlay, so that it runs the model contained in the ONNX file. (A way to sanity-check the ONNX file outside the platform is sketched directly after this list.)
  5. Create a second Combo Overlay, with the following configuration:
    Name: Evaluation Result
    Input A: Evaluation Data
    Input B: AI Model Overlay
    Formula: IF( AND( EQ(A,0), EQ(B,0) ) , NO_DATA , IF( EQ(A,B) , 1 , 0 ) )
    (The second sketch after this list walks through this formula on a small example grid.)
  6. Configure the legend of the Evaluation Result Overlay so that the values 1 and 0 are clearly distinguishable.
  7. The result is an Overlay showing where the evaluation dataset and the AI model's results agree and differ.
    Each place with a value of 1 is a successfully detected feature.
    Each place with a value of 0 is either (part of) a feature that was not detected, or a (partial) false positive.
    The greater the number of 1s relative to the total number of 1s and 0s, the better the model has performed.
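
Two sketches follow. First, before wiring the ONNX file into the Inference Overlay, it can help to confirm outside the platform that the file loads and produces output at all. Below is a minimal sketch using the onnxruntime Python package; the file name, input shape, and dtype are assumptions that depend on how your model was exported.

  import numpy as np
  import onnxruntime as ort

  # Hypothetical file name; use the path of the ONNX file you intend to test.
  session = ort.InferenceSession("foliage_model.onnx")

  # Inspect what the model expects as input.
  input_meta = session.get_inputs()[0]
  print(input_meta.name, input_meta.shape, input_meta.type)

  # Feed a dummy tensor of the expected shape; here we assume a
  # single-channel 256x256 float32 grid, which your model may not match.
  dummy = np.zeros((1, 1, 256, 256), dtype=np.float32)
  outputs = session.run(None, {input_meta.name: dummy})
  print(outputs[0].shape)  # the label grid produced by the model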
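
Second, the Evaluation Result formula from step 5 can be read as a per-cell comparison: cells where both grids are 0 carry no feature on either side and are masked out; the remaining cells score 1 on agreement and 0 on disagreement. Here is a numpy sketch of the same logic, using NaN as a stand-in for NO_DATA and made-up 3x3 grids.

  import numpy as np

  NO_DATA = np.nan  # stand-in for the platform's NO_DATA value

  # Hypothetical grids: a is the evaluation dataset, b the model output.
  a = np.array([[0, 0, 1],
                [1, 1, 0],
                [0, 1, 1]], dtype=float)
  b = np.array([[0, 1, 1],
                [1, 0, 0],
                [0, 1, 0]], dtype=float)

  # IF( AND( EQ(A,0), EQ(B,0) ), NO_DATA, IF( EQ(A,B), 1, 0 ) )
  evaluation = np.where((a == 0) & (b == 0), NO_DATA,
                        np.where(a == b, 1.0, 0.0))
  print(evaluation)
  # [[nan  0.  1.]
  #  [ 1.  0. nan]
  #  [nan  1.  0.]]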

Notes

  • It is also possible to aggregate the results in an Indicator. Use the following calculation (a numpy sketch of this aggregation follows below):
    • Total points: SELECT_GRIDAREA_WHERE_GRID_IS_(Evaluation Result)_AND_MINGRIDVALUE_IS_0
    • Successful points: SELECT_GRIDAREA_WHERE_GRID_IS_(Evaluation Result)_AND_MINGRIDVALUE_IS_1
    • Accuracy of model: successful points / total points
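
A numpy sketch of the same aggregation, for reference. The grid values are made up, and reading MINGRIDVALUE as a lower bound on a cell's value (so that 0 selects every scored cell and 1 only the successful ones, with NO_DATA cells matching neither) is an interpretation, not official query documentation.

  import numpy as np

  # Hypothetical Evaluation Result grid: 1 = correct, 0 = incorrect,
  # NaN = NO_DATA (cells masked out by the step-5 formula).
  evaluation = np.array([[1, 0, 1],
                         [np.nan, 1, 0],
                         [1, np.nan, 1]])

  total = np.count_nonzero(~np.isnan(evaluation))  # all scored cells
  successful = np.count_nonzero(evaluation == 1)   # correctly detected cells
  print(f"accuracy: {successful / total:.2f}")     # 5 / 7 = 0.71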