Case Overview

A PE firm used NASA night-time illumination data and time series analysis to estimate changes in economic activity.

The PE firm makes investment decisions based on the growth prospects of an economy. For this, it relies on economic performance indicators published by the countries themselves. However, these reports can be subject to manipulation (e.g. in China) and to data quality issues in developing countries, where agencies face little pressure to collect accurate data.

They wanted to look at a measurement methodology that is available faster than official releases and cannot be manipulated easily.

Identify activity clusters

Data gaps in developing countries are a constraint and reduce the accuracy of economic performance forecasts.

  • Economic activity estimates predicted within a week’s time
  • Accounted for the impact of cloud cover and pollution levels in estimates
  • Accounted for moon phase in illumination observations

The PE firm used the model's estimates as an important metric in its investment decisions, allowing a more thorough analysis of each decision.

Data and measurement gaps

Economic activity data in developing economies suffers from two challenges. The first is the availability of reliable data, which has more to do with whether enough is invested in improving the data collection process and training the staff.

The second problem is the pace at which data can be collected. Developing economies, which generally have a higher population density (esp. China, India, Indonesia), make it difficult to collect data rapidly. Staff take time to collate data from far-flung locations and produce estimates, which then need to be vetted by the respective government agencies. We proposed an inexpensive, open data source, NASA night-time illumination data, which is available from their website on request, and tested whether it can be used to estimate economic activity on the ground.

A proxy for changes in the economy

In order to develop an effective economic activity estimation solution, the first thing we need is the ability to handle the large volume of data being generated by the satellites on a daily basis.  

Once this has been addressed, external influences on the illumination readings should be removed through a data cleaning process to:

a. eliminate the impact of cloud cover

b. eliminate the impact of moon phase

c. eliminate the impact of pollution and smog
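
For the moon-phase correction, a cleaning step typically needs to know how bright the moon was on each observation date. The sketch below is illustrative only and is not the pipeline used in this case: it approximates the illuminated fraction of the moon from the mean synodic month and a reference new moon (6 January 2000, 18:14 UTC). The function name `moon_illuminated_fraction` is hypothetical, and the approximation ignores the variation in the length of individual lunar cycles.

```python
import math
from datetime import datetime, timedelta, timezone

SYNODIC_MONTH_DAYS = 29.530588  # mean length of one lunar cycle, in days
# Reference new moon used as the zero point of the cycle (approximate).
REFERENCE_NEW_MOON = datetime(2000, 1, 6, 18, 14, tzinfo=timezone.utc)


def moon_illuminated_fraction(when):
    """Approximate fraction of the lunar disc that is lit (0 = new, 1 = full).

    `when` must be a timezone-aware datetime.
    """
    age_days = (when - REFERENCE_NEW_MOON).total_seconds() / 86400.0
    phase = (age_days % SYNODIC_MONTH_DAYS) / SYNODIC_MONTH_DAYS
    return (1.0 - math.cos(2.0 * math.pi * phase)) / 2.0
```

The resulting value could be attached to each nightly reading, either to discard bright-moon nights or to serve as a regressor when the lunar effect is estimated.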

Time series decomposition of the albedo data yielded a seasonal component attributed to cloud cover and a cyclical component attributed to moon phase, while pollution effects were accounted for using external air quality data. The component that remains after removing these is the proxy for economic activity.
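
The decomposition idea can be sketched as a small regression. This is a minimal illustration under stated assumptions, not the model used in this case: it assumes one illumination reading per day for a region, models the seasonal effect as a single annual sine/cosine pair, models the lunar effect as a sine/cosine pair at the mean synodic month, and treats an external air quality index as a linear regressor. The names `activity_proxy`, `design_row` and `solve` are hypothetical, and the least-squares fit is written with the standard library only so the example stays self-contained.

```python
import math

SYNODIC_MONTH_DAYS = 29.530588  # mean lunar cycle length, in days
YEAR_DAYS = 365.25              # assumed period of the seasonal effect


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def design_row(day, n_days, air_quality):
    """Regressors for one day: level, trend, annual cycle, lunar cycle, pollution."""
    yearly = 2.0 * math.pi * day / YEAR_DAYS
    lunar = 2.0 * math.pi * day / SYNODIC_MONTH_DAYS
    return [1.0, day / n_days,
            math.sin(yearly), math.cos(yearly),
            math.sin(lunar), math.cos(lunar),
            air_quality]


def activity_proxy(radiance, air_quality):
    """Remove seasonal, lunar and pollution effects from a daily series.

    Returns the series that is left (level, trend and residual), which is
    read here as the proxy for economic activity.
    """
    n = len(radiance)
    rows = [design_row(d, n, air_quality[d]) for d in range(n)]
    k = len(rows[0])
    # Ordinary least squares via the normal equations X^T X beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, radiance)) for i in range(k)]
    beta = solve(xtx, xty)
    # Subtract only the nuisance terms (columns 2 onward); keep level and trend.
    return [y - sum(b * v for b, v in zip(beta[2:], r[2:]))
            for r, y in zip(rows, radiance)]
```

In practice a numerical library would replace the hand-written solver, and the choice of harmonics and regressors would need to be validated against the actual satellite readings.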