Tutorials#

This section describes the different tools available in the plugin. The video examples showcase a case study using aerial images from the NAIP program in the USA. You can download the same raster using the following command:

wget https://naipeuwest.blob.core.windows.net/naip/v002/de/2018/de_060cm_2018/38075/m_3807511_ne_18_060_20181104.tif

Encoder#

This tool encodes an image with a deep learning backbone. Projecting an image through a deep learning backbone can help bypass color, shadow or texture artefacts that otherwise make an object hard to detect. We have coded this plugin with Vision Transformers (ViTs) in mind as backbones because they have become the state of the art in computer vision since 2021. Modern deep learning backbones are pretrained in a self-supervised manner and can provide meaningful descriptors of an image without further training. Here, we rely on the timm library to download and use pre-trained models. This library is widely used in the deep learning community and regularly updated. Once the image has been fed to the model, we save the features the model outputs. Hopefully, these features represent the image in a new feature space that is more informative and discriminant, which helps with further mapping. Generally with ViTs, the resulting spatial resolution is coarser than the input resolution (e.g. for a ViT Base, 16x16 pixels become a single pixel).
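To make the loss of spatial resolution concrete, here is a minimal sketch of the arithmetic. It assumes a ViT-like backbone that cuts its input into non-overlapping square patches; `feature_grid_shape` is a hypothetical helper written for this illustration, not a function of the plugin or of timm.

```python
def feature_grid_shape(input_height: int, input_width: int, patch_size: int = 16) -> tuple[int, int]:
    """Return the (rows, cols) of the feature grid produced from one input tile.

    Assumption: non-overlapping patches, any remainder at the border is dropped.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    return input_height // patch_size, input_width // patch_size


# A 224x224 input with 16x16 patches yields a 14x14 grid of feature vectors.
rows, cols = feature_grid_shape(224, 224, patch_size=16)
print(rows, cols)  # 14 14
```

Each cell of that grid then holds one feature vector (768 values for a ViT Base) instead of the original pixel values.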

Encoding process#

GIS raster images are usually too big to be fed to a deep learning backbone. Therefore, we tile the input image and each tile is fed to the backbone. There are two major parameters you can set: size and stride. The original image will be sampled in tiles of size x size pixels with the given stride. If the stride is smaller than the size, there will be an overlap between tiles. If the stride is equal to the size, the tiles will form a perfect grid. If the stride is bigger than the size, there will be gaps between tiles and the process will probably not work properly!
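The effect of size and stride can be sketched in a few lines. This is an illustration only: `tile_origins` and `tile_grid` are hypothetical helpers, and the sketch assumes that only tiles lying entirely inside the raster are kept, which may differ from how the plugin handles borders.

```python
def tile_origins(length: int, size: int, stride: int) -> list[int]:
    """Start offsets (in pixels) of tiles of `size` pixels taken every `stride` pixels."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    # Assumption: a tile is kept only if it fits entirely within `length`.
    return list(range(0, max(length - size, 0) + 1, stride))


def tile_grid(width: int, height: int, size: int, stride: int) -> list[tuple[int, int]]:
    """(col_offset, row_offset) of every tile sampled over a width x height raster."""
    return [
        (col, row)
        for row in tile_origins(height, size, stride)
        for col in tile_origins(width, size, stride)
    ]


# stride == size: tiles form a regular grid without overlap.
print(tile_origins(1000, size=256, stride=256))  # [0, 256, 512]
# stride < size: neighbouring tiles overlap by size - stride = 128 pixels.
print(tile_origins(1000, size=256, stride=128))  # [0, 128, 256, 384, 512, 640]
```

Halving the stride roughly doubles the number of tiles along each axis, so the number of tiles to encode grows about fourfold.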

Do keep in mind that before passing through the model, the sampled tile will be resized to the input size expected by the model.

During encoding, each tile is saved on disk and the tiles are merged regularly to save space. We chose to save on disk to be easier on RAM and to leave the option to stop the process and restart it later. By default, features are saved in a temporary directory but you can change the target location.

Encoding sessions are identified by an MD5 hash, so that a given set of input parameters is unique and can be recognised. This makes it easy to start again where you left off. The parameters corresponding to a hash are saved in the parameters.csv file in the target directory.
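As an illustration of how a set of parameters can map to a stable identifier, here is a minimal sketch. The exact fields and serialisation used by the plugin are not described here, so `session_id` and the parameter names below are assumptions made for the example.

```python
import hashlib
import json


def session_id(params: dict) -> str:
    """Return an MD5 hex digest identifying a set of encoding parameters.

    Assumption: parameters are serialised as JSON with sorted keys so that
    the same values always give the same hash, whatever their order.
    """
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


# Hypothetical parameter names, for illustration only.
run_a = {"size": 224, "stride": 224, "backbone": "vit_base_patch16_dinov3.lvd1689m"}
run_b = {"backbone": "vit_base_patch16_dinov3.lvd1689m", "stride": 224, "size": 224}
run_c = {**run_a, "stride": 112}

print(session_id(run_a) == session_id(run_b))  # True: same parameters, same session
print(session_id(run_a) == session_id(run_c))  # False: a changed stride gives a new session
```

This is why restarting with exactly the same parameters resumes the earlier session, while changing any of them starts a new one.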

The backend for handling datasets and dataloader is a fork from torchgeo.

You can find a screen recording showing the process here.

The main steps are the following:

Choose a raster you want to feed to an encoder.
Select the bands you want to be used.
Select the extent you want the encoding to be applied on. If you don’t want to use the whole input raster, this can save a lot of compute time! You can draw the extent on the canvas or use another layer as reference.
Choose the sampling size and the stride. These define the grid along which your raster is sampled. By default, the stride is equal to the sampling size but you can add an overlap for smoother results.
Choose an encoder. We have pre-selected some classic and reliable foundation models but you can use other models available on Hugging Face by typing their name below (e.g. vit_base_patch16_dinov3.lvd1689m). Note that timm features a lot of models and not all of them have been tested!
Select the batch size you want to apply. This will depend on your hardware, so do not hesitate to test on a small area to make sure everything works before scaling up! Do check the advanced parameters too: several optimizations for limited hardware are available there.
Define an output directory for the produced files. If you don’t specify one, a temporary directory will be created and the output rasters will be deleted on shutdown!
Hit Run and let the encoder do its job. You can change the log verbosity in the advanced parameters if you want a more detailed view of what’s going on. Temporary output files are stored on disk, so you can cancel a run between batches and later start again where you left off by using the same parameters.

Parameters#

Input raster layer or image file path: The raster you want to feed to a deep learning encoder. This can either be a layer loaded in QGIS or the path to a file.
Selected Bands: Bands that will be fed to the encoder. Selecting none feeds all the bands as is into the encoder. If the number of bands you select differs from the number expected by the backbone you are using, the pretrained model will be adapted accordingly.
Processing extent: Defaults to the entire image. Otherwise, you can set a smaller processing extent, either by calculating from a layer, from the current map canvas or by drawing the extent on the map.
Sampling size: The input raster will be sampled in squares of this sampling size (in pixels). This size can differ from the input size of the chosen deep learning encoder, as sampled tiles will be resized before entering the encoder.
Stride: Step size between two sampled tiles. If the stride is equal to the sampling size, the raster will be sampled along a grid. If the stride is smaller than the sampling size, there will be an overlap between neighboring tiles. If the stride is larger, this will likely cause an error.
Use GPU if CUDA is available: If the plugin recognises a GPU, it will be used for computing.
Pre-selected backbones: A selection of backbones.
Enter an architecture name if you want to test another backbone: You can use other backbones available on Hugging Face. However, these may not work properly depending on their architecture. Most ViT-like backbones should work.
Batch size: How many tiles are fed into the network at once. This only takes effect if a GPU is available.
Output directory: Where resulting rasters will be saved. Each encoding session gets its own subdirectory, identified by an MD5 hash.

Advanced parameters#

Pretrained checkpoint: If you have a pretrained model available on disk, you can use this one rather than pre-trained weights available on the web.
CUDA Device ID: Enter CUDA device ID to choose on which GPU computations are done if you have several.
Remove temporary files after encoding: If selected, all temporary tiles will be removed at the end of encoding.
Merge method at the end of inference: Choose how tiles will be merged to reconstruct a full raster. For more information, see the rasterio documentation. Only the average method is custom: it averages overlapping tiles to obtain the values of the final pixels.
Frequency at which temporary files should be cleaned up: Every n batches, temporary tiles will be merged together and deleted.
Number of workers for dataloader: How many threads will be used by the dataloader to feed the tiles into the encoder. This defaults to all available workers. You can choose fewer to ease the workload on your CPU.
Schedule pauses between batches: If a number is entered, there will be a pause between each batch. This lets the inference run in the background if other computations have to be made at the same time.
Target CRS: CRS into which the resulting raster should be projected.
Target resolution: target resolution in meters.
Compress final result to uint16 and JP2 to save space: If selected, the final features raster will be converted to uint16 rather than float32 (i.e. half the size) and compressed to JP2 rather than GeoTIFF to save space.

Pass parameters as JSON file: In the output directory, you can find a JSON file summarizing the input parameters used during encoding. You can pass this JSON file here to override all previous parameters. This can be useful if you want to resume an encoding session.
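The custom average merge method listed above can be pictured with the following minimal sketch. It works on plain Python lists for readability and is not the plugin's implementation; `average_merge` and the tile layout `(row_offset, col_offset, values)` are assumptions made for the example.

```python
def average_merge(tiles, height, width):
    """Merge possibly overlapping tiles by averaging the values they contribute per pixel.

    tiles: iterable of (row_offset, col_offset, values), `values` being a 2D list.
    Pixels covered by no tile are returned as None.
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for row_off, col_off, values in tiles:
        for r, row in enumerate(values):
            for c, value in enumerate(row):
                total[row_off + r][col_off + c] += value
                count[row_off + r][col_off + c] += 1
    return [
        [total[r][c] / count[r][c] if count[r][c] else None for c in range(width)]
        for r in range(height)
    ]


# Two 1x3 tiles overlapping on the middle pixel of a 1x5 raster.
tiles = [(0, 0, [[1.0, 1.0, 1.0]]), (0, 2, [[3.0, 3.0, 3.0]])]
print(average_merge(tiles, height=1, width=5))  # [[1.0, 1.0, 2.0, 3.0, 3.0]]
```

The overlapping pixel takes the mean of both tiles, which is what smooths the seams when the stride is smaller than the sampling size.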

Dimension reduction#

The features produced by a deep learning encoder are often of high dimensionality (e.g. 768 dimensions for a ViT Base). However, it can be cumbersome to deal with all these features and this high-dimensional feature space, especially when most of them are not really informative. Therefore, it is possible to reduce the dimensions of a raster using a variety of algorithms. We chose to rely on scikit-learn to provide the algorithms. All algorithms available in the decomposition, manifold and cluster modules that share a common API can be used.

Different algorithms accept different arguments. You can provide these as a JSON string in the corresponding field.
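As a sketch of how such an argument string can be turned into keyword arguments, the hypothetical helper below accepts both strict JSON and the Python literal style used in the examples of this page (single quotes, False, None). Which syntax the plugin's field accepts is not stated here, so treat this as an illustration rather than the plugin's parser.

```python
import ast
import json


def parse_algorithm_args(text: str) -> dict:
    """Parse a user-supplied argument string into a dict of keyword arguments."""
    text = text.strip()
    if not text:
        return {}
    try:
        parsed = json.loads(text)  # strict JSON: double quotes, true/false/null
    except json.JSONDecodeError:
        parsed = ast.literal_eval(text)  # Python literals: single quotes, True/False/None
    if not isinstance(parsed, dict):
        raise ValueError("expected a mapping of argument names to values")
    return parsed


print(parse_algorithm_args('{"n_components": 5, "whiten": false}'))
print(parse_algorithm_args("{'n_components': 5, 'whiten': False}"))
```

Both calls give the same dictionary, which could then be passed to an estimator as `Estimator(**kwargs)`.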

Not all of the algorithms have been tested and some may be heavy on computing or need particular input types.

You can find a screen recording showing the process here.

The main steps are the following:

Choose a raster you want to feed to a dimension reduction algorithm.
Select the bands you want to be used.
Select the extent you want to process. If you don’t want to use the whole input raster, this can save a lot of compute time! You can draw the extent on the canvas or use another layer as reference.
Choose the target number of dimensions (components). This value is used if the algorithm has such a parameter.
Select an algorithm to be used. By default, PCA is selected as this is a common lightweight operation, but many more algorithms are available and you can find a list of them in the sidebar. Note that scikit-learn provides a lot of algorithms and not all of them are usable for any type of data.
You can pass overriding arguments in the field below. The arguments are expected to be in the same format as in the sidebar description (e.g. {'n_components': 5, 'whiten': False, 'copy': True, 'batch_size': 10} for IncrementalPCA).
Do check the advanced parameters: you can set the seed for reproducibility and save the produced model to a file so you can reuse it afterwards.
Define an output directory for the produced files. If you don’t specify one, a temporary directory will be created and the output rasters will be deleted on shutdown!
Hit Run.

Although the majority of the algorithms are lightweight, some may take some time to fit. QGIS’s plugin structure makes it impossible for the plugin to kill an algorithm while it is running. If you’re stuck, you may have to force kill QGIS!

Clustering#

Features or reduced features can be clustered (i.e. unsupervised classification) using algorithms from the scikit-learn cluster module that share a common API.
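To give an intuition of what clustering feature vectors means, here is a deliberately small pure-Python k-means sketch. It is not scikit-learn's KMeans and not the plugin's code: it only shows the two alternating steps (assign each vector to its nearest center, then move each center to the mean of its members) on toy data.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, n_iter=20, seed=0):
    """Return (labels, centers) after `n_iter` rounds of a basic k-means."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_clusters)  # start from randomly picked points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(n_clusters), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for k in range(n_clusters):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centers[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centers


# Toy "pixels" described by 2 features each, forming two obvious groups.
features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = kmeans(features, n_clusters=2)
print(labels)
```

In the plugin, every pixel of the raster plays the role of one point and its band values are the features, so the output is a raster of cluster labels.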

Different algorithms accept different arguments. You can provide these as a JSON string in the corresponding field.

Not all of the algorithms have been tested and some may be heavy on computing or need particular input types.

You can find a screen recording showing the process here.

The main steps are the following:

Choose a raster you want to feed to a clustering algorithm.
Select the bands you want to be used.
Select the extent you want to process. If you don’t want to use the whole input raster, this can save a lot of compute time! You can draw the extent on the canvas or use another layer as reference.
Choose the target number of clusters. This value is used if the algorithm has such a parameter.
Select an algorithm to be used. By default, KMeans is selected as this is a common lightweight operation, but many more algorithms are available and you can find a list of them in the sidebar. Note that scikit-learn provides a lot of algorithms and not all of them are usable for any type of data.
You can pass overriding arguments in the field below. The arguments are expected to be in the same format as in the sidebar description (e.g. {'eps': 0.5, 'min_samples': 5, 'metric': 'euclidean', 'metric_params': None, 'algorithm': 'auto', 'leaf_size': 30, 'p': None, 'n_jobs': None} for DBSCAN).
Do check the advanced parameters: you can set the seed for reproducibility and save the produced model to a file so you can reuse it afterwards.
Define an output directory for the produced files. If you don’t specify one, a temporary directory will be created and the output rasters will be deleted on shutdown!
Hit Run.

Although the majority of the algorithms are lightweight, some may take some time to fit. QGIS’s plugin structure makes it impossible for the plugin to kill an algorithm while it is running. If you’re stuck, you may have to force kill QGIS!

Similarity#

A good way to compare two points in a high-dimensional setting is through cosine similarity. This measure is equal to 1 for vectors pointing in the same direction (for instance, vectors having the same coordinates) and 0 for orthogonal vectors. Thus, the closer to 1 the cosine similarity is, the more similar two points should be.
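For reference, cosine similarity is the dot product of two vectors divided by the product of their norms. The short sketch below is a plain-Python illustration of that formula, not the plugin's code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1, 0], [0, 1]))   # 0.0 (orthogonal vectors)
print(cosine_similarity([1, 0], [-1, 0]))  # -1.0 (opposite directions)
```

Because only the direction matters, two feature vectors that differ by a positive scale factor also get a similarity of 1 (up to floating-point rounding).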

Here, in addition to an input raster, you have to provide a shapefile (or any format that can be read by geopandas) that will provide the reference point(s). You can find a template shapefile over the NAIP dataset here.

You can find a screen recording showing the process here.

The main steps are the following:

Choose a raster on which you want to compute the similarity.
Select the bands you want to be used.
Select the extent you want to process. If you don’t want to use the whole input raster, this can save a lot of compute time! You can draw the extent on the canvas or use another layer as reference.
Select a shapefile that will serve as a prompt for the similarity. If the geometry of your shapefile is not made of points, it will automatically be sampled as points. You can check the sampling rate in the options. If there are several points, the prompt used is the arithmetic mean of the points' coordinates.
Define an output directory for the produced files. If you don’t specify one, a temporary directory will be created and the output rasters will be deleted on shutdown!
Hit Run. Cosine similarity should be relatively quick to compute.
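When several reference points are given, the prompt is their arithmetic mean. The sketch below assumes that "coordinates" refers to the feature vectors sampled from the raster at each reference point; `mean_vector` is a hypothetical helper and the values are made up for the example.

```python
def mean_vector(vectors):
    """Element-wise arithmetic mean of a list of equal-length vectors."""
    if not vectors:
        raise ValueError("at least one reference vector is required")
    n = len(vectors)
    return [sum(dim) / n for dim in zip(*vectors)]


# Feature vectors read at three reference points (illustrative values).
reference_features = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0], [2.0, 3.0, 2.0]]
prompt = mean_vector(reference_features)
print(prompt)  # [2.0, 1.0, 2.0]
```

The resulting prompt vector is then the single vector against which every pixel of the raster is compared.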

Machine Learning Algorithms#

If the features you have seem informative, you can fit a Machine Learning model on them by providing ground truth points. To do so, you have to provide an input shapefile (or any format that can be read by geopandas) and the column corresponding to the ground truth values. Based on the algorithm you choose, these values will be interpreted as integers (classification) or floats (regression). All models provided by the scikit-learn ensemble (e.g. Random Forests, Gradient Boosting) and neighbors (e.g. KNN) modules that share a common API are available.

You can find a screen recording showing the process here.

You can find a template shapefile over NAIP dataset here.

The main steps are the following:

Choose a raster you want to feed to a machine learning algorithm.
Select the bands you want to be used.
Select the extent you want to process. If you don’t want to use the whole input raster, this can save a lot of compute time! You can draw the extent on the canvas or use another layer as reference.
Select a shapefile that will provide labels to the algorithm.
Select an algorithm to be used. By default, RandomForestClassifier is selected as this is a reliable and versatile algorithm, but many more algorithms are available and you can find a list of them in the sidebar. Note that scikit-learn provides a lot of algorithms and not all of them are usable for any type of data (e.g. classification vs. regression).
You can pass overriding arguments in the field below. The arguments are expected to be in the same format as in the sidebar description (e.g. {'estimator': None, 'n_estimators': 50, 'learning_rate': 1.0, 'loss': 'linear', 'random_state': None} for AdaBoostRegressor).
You can provide a shapefile for a test dataset as well.
Define the column (attribute) to be used as a ground truth value.
By default, we encourage users to perform a cross-validation. The default scheme is a random 5-fold, but if you have an attribute defining folds, you can pass it as well in the field below.
Do check the advanced parameters: you can set the seed for reproducibility and save the produced model to a file so you can reuse it afterwards.
Define an output directory for the produced files. If you don’t specify one, a temporary directory will be created and the output rasters will be deleted on shutdown!
Hit Run.

Although the majority of the algorithms are lightweight, some may take some time to fit. QGIS’s plugin structure makes it impossible for the plugin to kill an algorithm while it is running. If you’re stuck, you may have to force kill QGIS!

Training and testing#

It is good practice to train and test an ML model on separate datasets. If you do not provide a test dataset, you have the option to perform a cross-validation (see https://en.wikipedia.org/wiki/Cross-validation_(statistics)). You can then either define the column that separates the different folds of your dataset or go for an automatic split. If you choose the automatic split, we perform a random sampling and each point is randomly attributed to a fold. Be careful: this may not be the best way to validate a model in your case!
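The automatic split can be sketched as follows. This is an illustration with the standard library only: `assign_random_folds` and `cross_validation_splits` are hypothetical helpers, and the balanced fold sizes are an assumption rather than a description of the plugin's behaviour.

```python
import random


def assign_random_folds(n_points: int, n_folds: int = 5, seed: int = 0) -> list[int]:
    """Randomly attribute each point to one of `n_folds` folds of near-equal size."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    folds = [i % n_folds for i in range(n_points)]
    rng.shuffle(folds)
    return folds


def cross_validation_splits(folds):
    """Yield (fold, train_indices, test_indices), each fold being held out once."""
    for fold in sorted(set(folds)):
        train = [i for i, f in enumerate(folds) if f != fold]
        test = [i for i, f in enumerate(folds) if f == fold]
        yield fold, train, test


folds = assign_random_folds(n_points=23, n_folds=5, seed=42)
for fold, train, test in cross_validation_splits(folds):
    print(f"fold {fold}: {len(train)} training points, {len(test)} test points")
```

Passing a column that defines the folds amounts to replacing the output of `assign_random_folds` with your own fold labels.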

Requesting model or algorithm support#

As both timm and scikit-learn provide a lot of different models and algorithms, not all of them have been tested. Nevertheless, we chose these libraries because their implementation of different methods is very consistent and reliable. If you find a model or an algorithm that does not work and that you would like to see properly supported, please do file an issue on GitHub and we will try to make it work!