The Covid crisis has demonstrated the need for alternative data that are available in real time and with global coverage. This paper exploits daily infrared images from satellites to track economic activity in advanced and emerging countries. We first develop a framework to read, clean and exploit satellite images. We construct an algorithm based on the laws of physics and machine learning to detect the heat produced by active cement plants. This allows us to monitor in real time whether a cement plant is operating. Using this information on more than 500 plants, we construct a satellite-based index tracking activity. This satellite index outperforms benchmark models and alternative indicators for nowcasting activity in the cement industry and in the construction sector. When exploiting the granularity of daily and plant-level data, neural networks yield significantly more accurate predictions. Overall, combining satellite images and machine learning makes it possible to track industrial activity accurately.
Assessing economic activity from space is of great interest because satellite data are released in near-real time, have global coverage with uniform quality, and are free to use. These advantages contrast with usual data sources, which are most often released with a significant lag, whose quality and reliability vary widely across countries, and which can be costly. In addition, the increasing number of satellites, their sophistication and the release of their data in the public domain have made satellite data an increasingly promising source of real-time information.
However, tracking the economy with satellites requires a signal that can be seen from outer space. To that end, we exploit the heat produced by cement plants. Manufacturing cement includes a step where raw materials are heated to about 1,450°C in large ovens called rotary kilns. Such heat can be detected using satellite images in the infrared spectrum. Focusing on the cement industry has two further advantages: cement is a widely used commodity, necessary in both advanced and developing economies, and it is generally consumed locally, as its low cost makes it unprofitable to ship across long distances. Using these satellite images is a first contribution of this paper, as the literature exploiting satellite data has so far focused on night lights (Donaldson and Storeygard, 2016) and more recently on air pollution (Bricongne et al., 2021).
We lay out a method to exploit infrared satellite images and automatically detect heat, using the laws of physics and machine learning. The idea of heat detection comes from Planck’s law, which describes the electromagnetic radiation emitted by an object as a function of its temperature. By looking at infrared satellite images over the locations of the rotary kilns (ovens) of cement plants, we apply a suite of algorithms based on Planck’s law to determine whether each kiln is operating. The left-hand side of Figure N1 shows an example of a working cement plant with several “hot” kilns (in red). The same cement plant is shown during the Covid-19 lockdown on the right-hand side of Figure N1, where no heat is detected because the plant had been completely shut down. We apply this procedure to around 500 cement plants globally to assess their activity. The satellite data are also corrected for cloudiness, using an AI algorithm for image recognition, and interpolated, using extreme gradient boosting, a machine learning algorithm. In the end, this provides a real-time satellite-based index of activity in cement plants, at daily frequency and for each plant we track. A second contribution is to set up such a procedure to read, clean and exploit satellite images, combining algorithms based on physics with machine learning.
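To make the physical intuition concrete, the following is a minimal sketch of how Planck's law can separate a hot surface from its surroundings. It is not the paper's detection algorithm: the 2.2 µm short-wave infrared wavelength, the 50 K margin and the function names are illustrative assumptions, and the surfaces are treated as ideal black bodies.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m / s
K_B = 1.380649e-23   # Boltzmann constant, J / K


def planck_radiance(wavelength_m, temperature_k):
    """Black-body spectral radiance (W sr^-1 m^-3) from Planck's law."""
    prefactor = 2.0 * H * C**2 / wavelength_m**5
    exponent = H * C / (wavelength_m * K_B * temperature_k)
    return prefactor / math.expm1(exponent)


def brightness_temperature(wavelength_m, radiance):
    """Invert Planck's law: temperature (K) of a black body with this radiance."""
    prefactor = 2.0 * H * C**2 / wavelength_m**5
    return H * C / (wavelength_m * K_B * math.log1p(prefactor / radiance))


def is_hot(pixel_radiance, background_radiance, wavelength_m, min_excess_k=50.0):
    """Flag a pixel whose brightness temperature exceeds the background.

    min_excess_k is a hypothetical margin chosen for illustration only.
    """
    t_pixel = brightness_temperature(wavelength_m, pixel_radiance)
    t_background = brightness_temperature(wavelength_m, background_radiance)
    return t_pixel - t_background > min_excess_k


SWIR = 2.2e-6  # assumed short-wave infrared wavelength, in metres

kiln = planck_radiance(SWIR, 1450.0 + 273.15)   # temperature quoted in the text
ground = planck_radiance(SWIR, 300.0)           # assumed ambient background
print("radiance ratio kiln / ground:", kiln / ground)
print("hot pixel detected:", is_hot(kiln, ground, SWIR))
```

At short infrared wavelengths the emitted radiance rises very steeply with temperature, which is why a working kiln can stand out against its surroundings even in a coarse satellite pixel.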
We then test the predictive power of our satellite-based activity index to nowcast the production of cement and broader activity in the construction sector. We start with a linear model using the satellite index and an autoregressive (AR) term to nowcast the production of cement. We find that it outperforms the usual benchmarks (random walk and autoregressive model) as well as similar linear models based on alternative construction indicators (building permits, PMI indices, and Google Trends). This first model uses the satellite index aggregated at monthly frequency and at country level. In a second step, we exploit the granularity of our daily and plant-level satellite indices. We use a MIDAS model to exploit the daily frequency, a LASSO to exploit the plant dimension, and a LASSO-MIDAS to exploit both the spatial and temporal dimensions. We find, however, that the accuracy of these models on disaggregated data is on average similar to that of the OLS model using data aggregated at monthly frequency and country level.
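The comparison between a linear nowcasting model and a random-walk benchmark can be sketched as follows. This is an illustration on synthetic data only: the simulated series, the coefficients 0.5 and 2.0, the fixed train/test split and all function names are assumptions, not the paper's data or specification.

```python
import numpy as np


def simulate(n=144, seed=0):
    """Synthetic monthly data: x is a stand-in index, y a stand-in target."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = 0.5 * y[t - 1] + 2.0 * x[t] + 0.1 * rng.normal()
    return x, y


def rmse(pred, actual):
    """Root mean squared error between two arrays."""
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(actual)) ** 2)))


def evaluate(x, y, n_test=24):
    """Fit y_t = a + b * y_{t-1} + c * x_t by OLS and score it out of sample."""
    # Columns: intercept, AR(1) term, contemporaneous index.
    X = np.column_stack([np.ones(len(y) - 1), y[:-1], x[1:]])
    target = y[1:]
    split = len(target) - n_test
    beta, *_ = np.linalg.lstsq(X[:split], target[:split], rcond=None)
    pred_ols = X[split:] @ beta
    pred_rw = X[split:, 1]  # random walk: the nowcast is last month's value
    return beta, rmse(pred_ols, target[split:]), rmse(pred_rw, target[split:])


x, y = simulate()
beta, rmse_ols, rmse_rw = evaluate(x, y)
print("OLS coefficients (intercept, AR, index):", beta)
print("RMSE, OLS with index:", rmse_ols)
print("RMSE, random walk:   ", rmse_rw)
```

A real evaluation would typically re-estimate the model recursively as each month arrives; the single fixed split here only keeps the sketch short.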
We finally use neural networks to predict the production of cement and find that they can significantly outperform the OLS model. Neural networks have a double advantage: they are highly flexible non-linear methods, and the recent machine learning literature has found them to outperform other approaches. In line with the literature, we employ a multi-layer perceptron with few hidden layers given the small sample size. Overall, neural networks strongly outperform the linear model, and thus also the benchmark models. This is another contribution: neural networks can be relevant for nowcasting in macroeconomics, complementing recent applications to nowcast GDP (Woloszko, 2020) and trade (Hopp, 2021).
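As an illustration of what a small multi-layer perceptron looks like, the sketch below trains a network with a single hidden layer by plain gradient descent. The architecture (four tanh units), the learning rate, the number of steps and the toy quadratic target are all assumptions made for the example; they are not the paper's network or data.

```python
import numpy as np


def init_params(n_in, n_hidden, rng):
    """Small random weights for a one-hidden-layer perceptron."""
    return {
        "W1": rng.normal(scale=0.5, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(n_hidden, 1)),
        "b2": np.zeros(1),
    }


def forward(params, X):
    """Return hidden activations and the network's predictions."""
    hidden = np.tanh(X @ params["W1"] + params["b1"])
    return hidden, (hidden @ params["W2"] + params["b2"]).ravel()


def mse(params, X, y):
    """Mean squared error of the network on (X, y)."""
    _, pred = forward(params, X)
    return float(np.mean((pred - y) ** 2))


def train(params, X, y, lr=0.02, steps=2000):
    """Full-batch gradient descent on the mean squared error."""
    n = len(y)
    for _ in range(steps):
        hidden, pred = forward(params, X)
        grad_out = 2.0 * (pred - y)[:, None] / n          # d loss / d prediction
        grad_hidden = (grad_out @ params["W2"].T) * (1.0 - hidden**2)
        params["W2"] -= lr * (hidden.T @ grad_out)
        params["b2"] -= lr * grad_out.sum(axis=0)
        params["W1"] -= lr * (X.T @ grad_hidden)
        params["b1"] -= lr * grad_hidden.sum(axis=0)
    return params


# Toy non-linear relationship that a straight line cannot capture.
X = np.linspace(-2.0, 2.0, 60)[:, None]
y = X.ravel() ** 2

params = init_params(n_in=1, n_hidden=4, rng=np.random.default_rng(0))
initial_loss = mse(params, X, y)
params = train(params, X, y)
final_loss = mse(params, X, y)
print("loss before training:", initial_loss)
print("loss after training: ", final_loss)
```

Keeping the network this small limits the number of parameters to estimate, which matters when only a short monthly sample is available.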
Updated on: 06/27/2023 18:21