Project Description
As population growth affects worldwide social and economic development, the importance of understanding a country’s population density is unquestionable. Population size and population density are fundamental in sustainable development and are taken into consideration in the formation of various aspects of infrastructure progression.
However, these figures remain enigmatic as they are very expensive and inaccessible for some countries to produce. Nonetheless, with the availability of satellite imaging, computer vision and deep learning provide a technical solution. Consequently, rich photographic data has made it increasingly possible to evaluate the global growth of humanity.
Thus, by using multiple ResNets to monitor continental population, based solely on satellite images, our research has been able to achieve population estimations with 74% accuracy through designing our own architecture, and validating our results on many European countries.
Project Links:
Paper Link: https://drive.google.com/file/d/1YUFfgFbCMhbepTR6KrSnaXnBX91o5A9K/view?usp=sharing
PowerPoint Link: https://docs.google.com/presentation/d/1bcmVPL5f3ka0w4bLGaC_hzqQjnk9LG5elXf1402Z7Lg/edit?usp=sharing
Problem:
There are not enough manual resources for countries to monitor their population growth.
But due to vast improvements in satellite technology, there are now daily photos of the entire Earth’s surface. We want to use this wealth of spatial and temporal data
Therefore, we have used open-source satellite imagery from Landsat 8 and census data from Eurostat to create population predictions for the European continent.
Our project: Using satellite images and CNNs to predict population density
Past Research:
Past research has been done using American and Ugandan population predictions
- Regional Population Estimation Using Satellite Imagery
- A Deep Learning Approach for Population Estimation from Satellite Imagery
Our modifications:
- Increased number of past data generated from our satellite to population model
- Comparisons between improved neural network models, including regression
- Large-scale predictions - more accessible than close-up high-res images
Models:
1) Classification - Trained on 1384 values
- ResNet 50 Base
- 10 classes for log population density 0-10
2) Regression - Trained on 2048 values
- ResNet 18 Base
- Output log population density as a scalar value
- Removed Average Pooling layer for Flatten layer
3) Regression w/ Augmentation
- Same as Model #2, but with Data augmentation
- Rotations, Shifts, added gaussian noise
Workflow:
First, we converted the satellite images into a resized RGB image for easier use.
- Also tried using high stride with higher quality images to downsample
- Wrote python scripts to download thousands of .tif images (dozens of gigabytes total)
We then used Keras to build both models with roughly 25 million parameters.
Then, using CNNs with input (?, 224, 224, 3), we made 3 models
- Trained with 2048 images, verified on 683 images
Applications: We used a sliding box to generate heatmaps of the populations in squares of different regions
Results:
- Regression: With data augmentation: 1.4702 MSE
- Regression: Without data augmentation: 2.0414 MSE
- Classification with 10 classes: 99.94% training accuracy, 74.14% test accuracy
Challenges Overcame:
This was my first experience building something that involved deep learning and my first large-scale project using neural networks. I had also never worked with satellite imagery before and thus had to learn about various technologies and data sources for this task.
The biggest challenge was tuning and building out our ResNet model. Initially, this research was planned to include two differing approaches toward forecasting future population growth. The first approach planned to use ConvLSTMs to predict the population in a certain year, given just five years of images, which would have included a way to directly take images as input, and calculate a population number from that.
Additionally, the second planned approach would have been to first predict the population of each of the 5 years of images inputted in, and would have used a normal time-series or LSTM model to predict the next year’s population. Additional features were planned, such as the classification the region type of certain areas, such as classifying cities as urban, farmlands as rural, and factories as industrial areas, which can be used in future predictions. Ultimately, this was not possible with our time constraints.
Accomplishments
Our three models had a high levels of accuracy. In training, our classification model correctly classified images into the correct one of its ten classes 99.94% of the time. While this is a very high number that potentially indicates overfitting, our testing accuracy is also very high. For our classification model with 10 classes, we acheived an accuracy of over 74.14%, which is quite on par for these types of models. We also achieved a low mean squared error for our regression models. Our regression model with our data augmentation performed admirably as the model finished with an mean squared error of 1.4702. While our regression model without the data augmentation was not as successful, our model still maintained a MSE of 2.0414. These results are in line with those achieved in other research projects.