Heliguy Blog: Drones for Climate

UK drone company Heliguy recently published a blog article about my work with drones in the Arctic, including my Microsoft/National Geographic AI for Earth grant.

Drones have been increasingly important in my work on Arctic climate change, especially in mapping melting over glacier surfaces and as a way to link ground measurements with satellite remote sensing. I have recently passed the UK CAA Permissions for Commercial Operations assessments, so please reach out with projects and collaboration ideas related to drone photography or remote sensing.

Image taken from a quadcopter while mapping the ice surface, near point 660, Greenland Ice Sheet.






AI Adventures in Azure: Ice Surface Classifiers

For this post I will introduce what I am actually trying to achieve with the AI for Earth grant and how it will help us to understand glacier and ice sheet dynamics in a warming world.

The Earth is heating up – that’s a problem for the parts of it made of ice. Over a billion people rely directly upon glacier-fed water for drinking, washing, farming or hydropower. The sea level rise resulting from the melting of glaciers and ice sheets is one of the primary existential risks we face as a species in the 21st century, threatening lives, homes, infrastructure, economies, jobs, cultures and traditions. It has been projected that $14 trillion could be wiped off the global economy annually by 2100 due to sea level rise.

The major contributing factors are thermal expansion of the oceans and melting of glaciers and ice sheets, the latter primarily controlled by the ice albedo, or reflectivity. However, our understanding of albedo for glaciers and ice sheets is still fairly basic. Our models make drastic assumptions about how glacier albedo behaves: some assign it a constant value, some assume it varies as a simple function of exposure time over the summer, and the more sophisticated models use radiative transfer, but on the assumption that ice behaves in the same way as snow (i.e. that it can be adequately represented as a collection of tiny spheres).

Our remote sensing products also struggle to resolve the complexity of the ice surface and fail to detect the albedo-reducing processes operating there – for example the accumulation of particles, the growth of algae on the ice surface, and the changing structure of the ice itself. This limits our ability to observe the ice surface changing over time and to attribute melting to specific processes, which would enable us to make better predictions of melting – and therefore sea level rise – into the future.

Aerial view of a field camp on the Greenland Ice Sheet in July 2016. The incredible complexity of this environment is clear – there are areas of bright ice, standing water, melt streams, biological aggregates known as cryoconites and areas of intense contamination with biological growth, mineral dust and soots – none of which is resolved by our current models or remote sensing but all of which affect the rate of glacier melting.

I hope to contribute to tackling this problem with AI for Earth. My idea is to use a form of machine learning known as supervised classification to map ice surfaces, first from drone images and then at the scale of entire glaciers and ice sheets using multispectral data from the European Space Agency’s Sentinel-2 satellite. The training data will come from spectral measurements made on the ice surface that match the wavelengths of the UAV and Sentinel-2 sensors. I’ll be writing the necessary code in Python and processing the imagery in the cloud using Microsoft Azure, with the aim of gaining new insights into glacier and ice sheet melting and developing an accessible API to host on the AI for Earth API hub. I have been working on this problem for a while and the code (in active development) is regularly updated in my GitHub repository. A publication is currently under review.
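As a rough sketch of what supervised classification looks like in practice – with entirely invented reflectance values, band counts and surface classes, not my real training data – a scikit-learn classifier can be trained on labelled spectra and then used to label new pixels:

```python
# Hypothetical sketch of supervised classification of ice-surface types.
# All band values, class labels and surface categories are invented here
# purely to illustrate the workflow.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 100

# Pretend training spectra: reflectance in 5 bands for three surface types
clean_ice = rng.normal([0.80, 0.75, 0.70, 0.60, 0.50], 0.05, size=(n, 5))
algal_ice = rng.normal([0.40, 0.45, 0.35, 0.30, 0.25], 0.05, size=(n, 5))
water     = rng.normal([0.10, 0.08, 0.06, 0.05, 0.04], 0.02, size=(n, 5))

X = np.vstack([clean_ice, algal_ice, water])
y = np.array([0] * n + [1] * n + [2] * n)  # 0 = clean ice, 1 = algal ice, 2 = water

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Classify a new "pixel" of bright, clean-looking ice
pixel = np.array([[0.82, 0.74, 0.71, 0.58, 0.52]])
print(clf.predict(pixel))  # predicts class 0 (clean ice)
```

In the real project the labelled spectra come from field measurements, and the classifier is applied pixel-by-pixel across UAV and Sentinel-2 imagery rather than to single hand-made samples.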

I have already posted about my Azure setup and some ways to start programming in Python on Azure virtual machines; from here on, the posts will be more about coding specifically for this project.

National Geographic Explorers Festival London

A few weeks ago I had the pleasure of presenting at the National Geographic Explorer’s Festival in London. This was an amazing opportunity to meet the inspirational Explorers and listen to them talk about AI solutions to conservation problems around the world. In the afternoon I spoke about my work on machine learning and remote sensing for monitoring glacier and ice sheet melting, and then participated in a panel discussion about the challenges of applying AI to environmental problems. The event was livestreamed and is now archived here (my part starts at 1:48).

The work I presented is supported by Microsoft and National Geographic through their AI for Earth scheme.


AI Adventures in Azure: Ways to Program in Python on the DSVM

Having introduced the setup and configuration of a new virtual machine and the ways to interact with it, I will now show some ways to use it to start programming in Python. This post assumes that the VM is allocated and that the user is accessing it using a remote desktop client.

1. Using the terminal

I am running an Ubuntu virtual machine, so the command line interface is referred to as the terminal. The language used to issue commands is (usually) bash. Since the Anaconda distribution (and its conda package manager) is already installed on the Data Science VM, it is very easy to start building environments and running Python code in the terminal. Here is an example where I’m creating a new environment called “AzurePythonEnv” that includes some popular packages:

>> conda create -n AzurePythonEnv python=3.6 numpy matplotlib scikit-learn pandas

Now this environment can be activated any time via the terminal:

>> source activate AzurePythonEnv

Now, with the environment activated, Python code can be typed directly into the terminal, or scripts can be written in a text editor (e.g. the pre-installed Atom or Vim), saved with a .py extension, and called from the terminal:

>> python /data/home/tothepoles/Desktop/script.py
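For example, a minimal script.py run this way might look like the following (the filename and values are purely illustrative):

```python
# script.py – a minimal example of the kind of script that can be run with
# "python script.py" from the activated environment (values are invented)
import numpy as np

# Summary statistics for some synthetic "albedo" measurements
albedo = np.array([0.2, 0.35, 0.5, 0.6, 0.45])
print("mean albedo:", albedo.mean())  # prints: mean albedo: 0.42
print("min albedo:", albedo.min())    # prints: min albedo: 0.2
```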


2. Using an IDE

The Data Science VM includes several IDEs that can be used for developing Python code. My preferred option at the moment is PyCharm, but Visual Studio Code is also excellent and I can envisage using it as my primary IDE later on. IDEs are available under Applications > Development in the desktop toolbar, or accessible via the command line. IDEs for other languages, including R-Studio, are also pre-installed on the Linux DSVM. Simply open the preferred IDE and start programming. In PyCharm the bottom frame in the default view can be toggled between the terminal and the Python console, so new packages can be installed into your environment, and new environments created and removed, from within the IDE, along with all the other functions associated with the command line. The basic workflow for programming in the IDE is to start a new project, link it to your chosen development environment, write scripts in the editor window and then run them (optionally in the console, so that variables and datasets remain accessible after the script has finished running).

Development in the PyCharm IDE

3. Using Jupyter Notebooks

Jupyter notebooks are applications that allow live code to be run in a web browser, with the outputs displayed interactively in the same window. They are a great way to make code accessible to other users. The code is written nearly identically to a normal Python script, except that it is divided into individual executable cells. Jupyter notebooks can be run in the cloud using Azure Notebooks, making it easy to access Azure data storage, configure custom environments, deploy scripts and present them as an accessible resource hosted in the cloud. I will be writing more about this later as I develop my own APIs on Azure. For now, the Azure Notebooks documentation is here. On the DSVM, JupyterLab and Jupyter Notebooks are preinstalled and launched simply by typing the command

>> jupyter notebook
A Jupyter notebook running in a web browser
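As a toy illustration of the cell-by-cell style (the band names and reflectance values here are invented), the following could be split across two notebook cells:

```python
# Cell 1: load some made-up surface reflectance values into a DataFrame
import pandas as pd

df = pd.DataFrame({
    "band": ["B2", "B3", "B4", "B8"],
    "reflectance": [0.81, 0.78, 0.74, 0.60],
})

# Cell 2: inspect the data – in a notebook, the result of the last expression
# in a cell is rendered directly beneath it
print(df["reflectance"].mean())  # prints: 0.7325
```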

AI Adventures in Azure: Uploading data to the VM

There are many ways to transfer data from local storage to the virtual machine. Azure provides Blob storage for unstructured data managed through the user’s storage account as well as specific storage options for files and tables. There is also the option to use Data Lakes. These are all useful for storing large datasets and integrating into processing pipelines within Azure.

However, in this post I will talk about some simpler options for transferring smaller files – for example scripts, images or small datasets – onto the VM itself, just to make the essential data available for code development. There are two main options: uploading to third-party cloud storage, or sharing folders through the remote desktop connection.

1) Upload data to a third-party cloud storage account:

This could be an Azure store, Google Drive, OneDrive, Dropbox or similar, or an FTP site. Upload from the local computer, then start up and log into the VM and download the files directly onto the VM hard drive. This is quite clunky and time-consuming compared to a direct transfer.
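When the files are reachable at a direct-download URL, the download half of this round trip can be scripted on the VM with just the Python standard library. A minimal sketch – the URL and destination path below are purely illustrative:

```python
import urllib.request

def fetch_to_vm(url: str, dest: str) -> str:
    """Download the file at `url` to `dest` on the VM's local disk."""
    urllib.request.urlretrieve(url, dest)
    return dest

# e.g. (hypothetical direct-download link and path):
# fetch_to_vm("https://example.com/shared/field_spectra.csv",
#             "/data/home/tothepoles/field_spectra.csv")
```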

2) Share files using the remote desktop connection:

In X2Go there is an option to set preferences. Clicking this brings up a menu with a tab named “shared folders”. Select the folders to share and check the “mount automatically” boxes. These folders are then available on the VM, and files can be copied and pasted between the local and remote machines.

Other, Azure-optimised data transfer and storage options will be covered in a later post!

AI Adventures in Azure: Accessing the VM via terminal or remote desktop

Accessing the Data Science Virtual Machine

Once the virtual machine is set up and started (by clicking “start” on the appropriate VM in the Azure portal) there are several ways to interface with it. The first is via the terminal (I am running Ubuntu 16.04 on both my local machine and the virtual machine). To connect to the virtual machine from the terminal, we can use secure shell, or SSH. This requires a set of keys which are used for encryption and decryption and keep the connection between the local and virtual machine secure. These keys are unique to your system, and they need to be generated. This can be done using the command line.

Generating ssh keys:

Option 1 is to use the terminal on your local machine. In Ubuntu, the following command will generate an RSA key pair (RSA is a method of encryption named after Rivest, Shamir and Adleman who first proposed it) with a length of 2048 bits:

ssh-keygen -t rsa -b 2048

Alternatively, the Azure command line interface (Azure CLI) can be used. The Azure CLI is a command-line tool that can be installed to run from the existing terminal, or run in a web browser, and is used to send commands directly to Azure in an Azure-friendly syntax. To generate ssh keys while creating a VM with the Azure CLI:

az vm create --name VMname --resource-group RGname --generate-ssh-keys

Regardless of the method used to generate them, ssh key pairs are stored by default in ~/.ssh/, and the public key can be viewed with the following bash command:

cat ~/.ssh/id_rsa.pub

The public key displayed by this command should be saved for later use. The ssh keys enable access to the VM through the command line (local terminal or Azure CLI). Alternatively, the virtual machine can be configured with a desktop that can be accessed using a remote desktop client. This requires some further VM configuration:


To set up remote desktop

The ssh keys created earlier can be used to access the VM through the terminal. Then, the terminal can be used to install a desktop GUI on the VM. I chose the lightweight GUI LXDE to run on my Ubuntu VM. To install LXDE, use the command:

sudo apt-get install lxde -y

To install the remote desktop support for LXDE:

sudo apt-get install xrdp -y

Then start XRDP running on the VM:

/etc/init.d/xrdp start

Then the VM needs to be configured to enable remote desktop connections. This can be done via the Azure portal (portal.azure.com). Log in with your Azure username and password, start the VM by clicking “start” on the dashboard, then navigate to the inbound security rules:

resource group > network security > inbound security rules > add >

A list of configuration options then appears; they should be set as follows:

Source: Any

Source port ranges: *

Destination: Any

Destination port ranges: 3389

Protocol: TCP

Action: Allow


Finally, a remote desktop client is required on the local machine. I chose the X2Go client, which is available from the Ubuntu software centre or can be installed in the terminal using apt-get. Once the remote desktop client is installed, the system is ready for remote access to the VM using a desktop GUI.

Remote Access to VM using Desktop GUI:

  1. The VM must first be started – this can be done via the Azure portal after logging in with the usual Azure credentials (username and password) and clicking “start” on the dashboard. Copy the VM IP address to the clipboard.
  2. Open X2Go Client and configure a new session:
    1. Host = VM IP address
    2. Login = Azure login name
    3. SSH port: 22
    4. Session Type = LXDE
  3. These credentials can be saved under a named session, so logging in subsequently just requires clicking on the session icon in X2Go (although the IP address for the VM is dynamic by default, so it will need updating each time).
  4. An LXDE GUI will open!


Remember that closing the remote desktop does not stop the Azure VM – the VM must be stopped by clicking “stop” on the dashboard of the Azure portal.