AI Adventures in Azure

A lot of my work at the moment requires computationally heavy geospatial analysis that stretches the processing capabilities of my laptop. I invested in a pretty powerful machine – i7-7700-series processor, 32GB RAM – and sped things up by spreading the load across cores and threads, but it can still be locked up for hours when processing very large datasets. For this reason, I have started exploring cloud computing. My platform of choice is Microsoft Azure. Being new to Azure and to cloud computing in general, I thought it would be helpful to keep notes on my learning as I get on board, and that making the notes public could be useful for others following the same path.
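For anyone wondering what "spreading the load across cores" can look like in Python, here is a minimal sketch using only the standard library. It is not the code from my actual projects: the function names (`split_into_chunks`, `process_chunk`, `process_in_parallel`) are made up for illustration, and the sum of squares is just a stand-in for a genuinely expensive per-tile calculation.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(values, n_chunks):
    """Split a list into at most n_chunks contiguous, roughly equal pieces."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def process_chunk(chunk):
    """Hypothetical stand-in for an expensive per-tile computation."""
    return sum(v * v for v in chunk)


def process_in_parallel(values, n_workers=None):
    """Process each chunk in a separate worker process and combine the results."""
    n_workers = n_workers or os.cpu_count() or 1
    chunks = split_into_chunks(values, n_workers)
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(process_chunk, chunks))
    return sum(partial_results)


if __name__ == "__main__":
    # Assumed toy data; a real workflow would pass image tiles or file paths.
    pixels = list(range(1_000_000))
    print(process_in_parallel(pixels))
```

Processes rather than threads are used here because pure-Python number crunching is held back by the interpreter lock. The catch is that each worker needs its own copy of its chunk, so memory on a single machine can still become the limit with very large datasets.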

I’ll be blogging these notes as “Adventures in Azure”. I’m predominantly a Linux user, so the notes will focus on Linux virtual machines on Azure. Almost all of my programming will be in Python. The end goal is to be proficient with machine learning applied to remote sensing image analysis in the cloud.


I’m certain I will find fugly ways to do things and I will be grateful for any suggestions for refinements!


1. Setting Up Linux Data Science Virtual Machine

I’m not going to write up notes for this as it was so easy! I created an Azure account with a Microsoft email address, then chose a virtual machine image preloaded with the essentials – Ubuntu, Anaconda (Python 2.7 and 3.5), JupyterHub, PyCharm, TensorFlow and NVIDIA drivers – amongst a range of other useful software designed specifically for data science. Microsoft call it the “Data Science Virtual Machine”; the link is here and the instructions are simple to follow. I opted for a Standard NC6 (which has 6 vCPUs and 56GB memory), as this is a significant step up in processing power from my local machine but comes at an affordable hourly rate.

Once the virtual machine is established, there is still a fair amount of configuring to do before using it for geospatial projects. The next post will contain info about ways to work with Python on the virtual machine.