This is my first look at machine learning in GIS applications; I had never thought of it as a tool for cartographic tasks. In this post, however, you will see that it can help recognize known features in satellite imagery. Specifically, I’m going to detect buildings.
Detect buildings in ArcGIS Pro – TensorFlow and PyTorch
For this exercise I followed a tutorial in which the instructors detect palm trees. The tutorial is easy to follow and has many useful hints, so I encourage you to go through it! To do this task you need:
- Publisher or Administrator role in an ArcGIS organization (get a free trial)
- ArcGIS Pro
- ArcGIS Image Analyst
- Optional: a machine with a GPU (GTX 1080 Ti or better) and at least 6 GB of memory
First, I had to create training samples, which is nothing more than drawing polygons around buildings. I provided 100 samples. One really valuable hint: you have to frame not only the desired object but also its immediate background.
After exporting the sample files, I had to prepare an environment and train the model. This was the trickiest part, mostly because of the many errors coming from the Python PyTorch library (e.g. it didn’t like commas in the generated sample text files). After hours of struggle, I got the Train Deep Learning Model geoprocessing tool to generate my model.
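The comma problem can be worked around with a small clean-up script. The sketch below is only an illustration under one assumption: that the offending commas are decimal separators written by a non-English locale (for example `12,50` instead of `12.50`). The function names and the `*.txt` pattern are hypothetical, so check what your own sample files look like before running anything like this on them.

```python
import re
from pathlib import Path

# Assumption: the problematic commas are decimal separators, i.e. a comma
# that sits directly between two digits ("12,50" instead of "12.50").
DECIMAL_COMMA = re.compile(r"(?<=\d),(?=\d)")


def fix_decimal_commas(text: str) -> str:
    """Replace every comma located between two digits with a period."""
    return DECIMAL_COMMA.sub(".", text)


def fix_label_folder(folder: str, pattern: str = "*.txt") -> int:
    """Rewrite matching files in place and return how many files changed."""
    changed = 0
    for path in Path(folder).glob(pattern):
        original = path.read_text(encoding="utf-8")
        fixed = fix_decimal_commas(original)
        if fixed != original:
            path.write_text(fixed, encoding="utf-8")
            changed += 1
    return changed


if __name__ == "__main__":
    # Made-up label line, only to show the effect of the replacement.
    sample = "building 0,00 0 0,00 10,50 20,25 60,75 80,00"
    print(fix_decimal_commas(sample))
```

Commas that are not surrounded by digits are left alone, so ordinary text is not affected. Work on a copy of the exported samples, because `fix_label_folder` overwrites the files.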
The next step was to run Detect Objects Using Deep Learning with the model trained before. There are many parameters you can set, and they matter a lot. I set padding to 1, threshold to 0.05, nms_overlap to 1.6, batch_size to 1, and exclude_pad_detections to True. The result is not very precise, but I liked that every detection rectangle has an attribute named confidence holding its score. Of course, it would be perfect if every building had high confidence, but unfortunately the results are not that good with only 100 training samples.
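To get a feeling for what the threshold and nms_overlap parameters do, here is a minimal sketch of confidence filtering followed by greedy non-maximum suppression. It is the generic textbook technique, not necessarily what ArcGIS Pro does internally. The box format `(xmin, ymin, xmax, ymax)`, the function names, and the sample detections are my own assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, threshold, nms_overlap):
    """Drop low-confidence boxes, then suppress overlapping duplicates.

    detections: list of (box, confidence) pairs.
    """
    # Keep only boxes at or above the confidence threshold, best first.
    candidates = sorted(
        (d for d in detections if d[1] >= threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, conf in candidates:
        # A box survives only if it does not overlap a better box too much.
        if all(iou(box, k[0]) <= nms_overlap for k in kept):
            kept.append((box, conf))
    return kept


# Made-up detections, only for illustration.
detections = [
    ((0, 0, 10, 10), 0.90),
    ((1, 1, 11, 11), 0.60),    # overlaps the first box heavily
    ((50, 50, 60, 60), 0.30),
    ((80, 80, 90, 90), 0.02),  # below the threshold
]
result = filter_detections(detections, threshold=0.05, nms_overlap=0.5)
for box, conf in result:
    print(box, conf)
```

In this sketch the overlap ratio always lies between 0 and 1, so an overlap limit above 1 never suppresses anything, and a low threshold lets weak detections through. Both choices tend to increase the number of false positives.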
The final results from ArcGIS Pro: 90 positives, 90 false positives.
Detect buildings in FME – OpenCV
First of all, FME relies on the OpenCV library, and the training process takes place not only inside FME but also in the OpenCV environment. OpenCV is a “library of functions used during image processing, based on open code and initiated by Intel”. Notably, I found the whole process more flexible and less error-prone than in ArcGIS Pro.
I used 148 building samples in this case. OpenCV also needs negative samples, in which the objects are not present; I made 100 of them. This is the most time-consuming part and is performed outside FME.
Once all the necessary files are ready, the FME transformers step in: Raster Object Detector Sample Preparer, Raster Object Detection Model Trainer, and Raster Object Detector. To prepare the workspace I used a tutorial in which Dimitri Bagh recognizes stop signs. The article is very informative; nevertheless, you have to test for yourself which parameters of the model trainer and object detector suit your case. For example, for the model trainer, the width and height of the buildings on my satellite image are close to 11 px. What is more, I set 26 training stages and chose the LBP (Local Binary Patterns) model type rather than the slower Haar cascade classifiers.
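For readers who wonder what a Local Binary Pattern is, the sketch below computes the basic 8-neighbour LBP code for the centre pixel of a 3x3 patch. It only illustrates the idea. As far as I know, the cascade training in OpenCV uses a variant computed over blocks of pixels rather than single pixels, so do not treat this as the exact feature FME and OpenCV use. The function name and the sample patch are made up.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code for the centre of a 3x3 grey-value patch."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2],
        patch[1][2], patch[2][2], patch[2][1],
        patch[2][0], patch[1][0],
    ]
    code = 0
    for bit, value in enumerate(neighbours):
        # A neighbour at least as bright as the centre sets its bit.
        if value >= center:
            code |= 1 << bit
    return code


# Made-up grey values, only for illustration.
patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
code = lbp_code(patch)
print(code, format(code, "08b"))
```

Each pixel is reduced to simple brighter-or-darker comparisons, which is much cheaper to evaluate than the rectangle sums behind Haar features. That is one reason the LBP model type trains faster.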
The final results from OpenCV and FME: 182 positives, 39 false positives.
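To put the two results on the same scale, precision (the share of detections that are real buildings) can be computed from the counts above. The sketch assumes that "positives" means correctly detected buildings, which is my reading of the numbers. Recall cannot be computed because the total number of buildings in the image is not given.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of all detections that are correct."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no detections to evaluate")
    return true_positives / total


# Counts reported in this post: (positives, false positives).
results = {
    "ArcGIS Pro": (90, 90),
    "FME + OpenCV": (182, 39),
}

for name, (tp, fp) in results.items():
    print(f"{name}: precision = {precision(tp, fp):.2f}")
```

Under that assumption, ArcGIS Pro comes out at 0.50 and FME with OpenCV at about 0.82.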
Summary
FME with OpenCV looks better in the comparison above, but remember that I provided only 100 positive samples in ArcGIS Pro and 148 in the FME process. This may not be a fair comparison, but my intent was to compare the workflows rather than their effectiveness. Even so, FME seems to be the less error-prone choice. So, to get started with recognition more quickly, I recommend the Safe Software product.