The paper applies Natural Language Processing (NLP) techniques to the quasi-universe of newspaper articles for France, concentrating on the period 2004-2022, in order to measure attention to inflation as well as inflation perceptions by households and firms in that country. The indicator, constructed along the lines of a balance of opinions, is well correlated with actual HICP inflation. It also exhibits good forecasting properties for the European Commission survey on households’ inflation expectations, as well as for overall HICP inflation. The method used is a supervised approach that we describe step by step. It performs better on our data than the Latent Dirichlet Allocation (LDA)-based approach of Angelico et al. (2022). The indicator can be used as an early real-time indicator of future inflation developments and expectations. It also provides a new set of indicators at a time when central banks monitor inflation through new types of surveys of households and firms.
Newspapers disseminate a wealth of data that describe economic developments and may therefore shape agents’ expectations. Available every day, newspaper articles can provide a quasi-instantaneous assessment of inflationary pressures that can be useful to economic agents and central banks. They complement central banks’ increasing reliance on indicators of inflation expectations from business or household surveys, beyond the traditional indicators derived from financial markets.
A growing literature uses Natural Language Processing (NLP) techniques and Artificial Intelligence (AI) to build macroeconomic indicators. Having started with the analysis of GDP developments, researchers are now investigating new dimensions, including inflation developments.
The paper analyses more than one million articles published in France since 2004 by the written press or press/news agencies, using commercial data available from the Factiva API. Based on a set of 30 sources corresponding to major daily newspapers or weekly magazines of the national and regional press, paper and/or online, we construct indicators of perceived inflation in France in the spirit of the work of Angelico et al. (2022) for Italy.
The method is based on a selection of articles using keywords (related to the semantic field of "inflation" or "prices") as well as filtering and classification algorithms. These algorithms retain only the articles that actually deal with goods and services inflation, and discard those on other topics (e.g. literary 'prizes' or awards, which are spelled the same way as 'prices' in French) or on other types of inflation. Classification algorithms are then used again to distinguish the direction of price changes (rising, falling or stable).
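The two-stage selection described above can be illustrated with a minimal sketch. It is not the paper's implementation: the keyword list, the award cues and the function names are hypothetical, and the rule-based filter is only a stand-in for the trained filtering and classification algorithms, which are not detailed in this summary.

```python
import re

# Hypothetical keyword list for the semantic field of "inflation" / "prices".
KEYWORDS = re.compile(r"\b(inflation|prix|tarifs?)\b", re.IGNORECASE)

# Hypothetical cues signalling that "prix" means an award, not a price.
# A rule-based stand-in for the trained filter; assumption, not the paper's model.
AWARD_CUES = re.compile(r"\bprix (goncourt|nobel|renaudot|littéraire)\b", re.IGNORECASE)


def select_candidates(articles):
    """Stage 1: keep every article that mentions at least one keyword."""
    return [text for text in articles if KEYWORDS.search(text)]


def is_about_price_inflation(text):
    """Stage 2: drop award mentions, then check that a keyword still remains."""
    without_awards = AWARD_CUES.sub(" ", text)
    return bool(KEYWORDS.search(without_awards))


if __name__ == "__main__":
    sample = [
        "Les prix de l'énergie augmentent fortement.",
        "Le prix Goncourt a été attribué hier.",
        "Le match s'est terminé tard dans la soirée.",
    ]
    candidates = select_candidates(sample)
    relevant = [text for text in candidates if is_about_price_inflation(text)]
    print(len(candidates), len(relevant))
```

In this toy sample the keyword stage keeps both the energy-price article and the literary-award article, and the second stage removes the latter. This is the kind of false positive that the filtering step is meant to eliminate.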
Several techniques are available to construct text-based indicators, and one objective of the paper is to compare their performance, in particular between “supervised” and “unsupervised” methods. Supervised methods, such as the one we use, require human labelling to train the model. Unsupervised methods do not rely on human intervention for training but still need it a posteriori, for instance to select the relevant topics, as in Angelico et al. (2022), whose approach we reproduce in the paper for comparison.
We first provide a measure of the intensity of inflation coverage (i.e. the frequency of articles associated with inflation), since it may be a useful measure of the “attention” paid to inflation by media and economic actors in the sense of Korenok et al. (2022).
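As an illustration of such a frequency-based measure, the sketch below computes, for each month, the share of articles flagged as inflation-related. Normalising by the total number of articles and the monthly aggregation are assumptions made for the example, as are the function name and the input layout; the summary does not specify these choices.

```python
from collections import Counter


def attention_index(records):
    """Share of inflation-related articles per period.

    `records` is an iterable of (period, is_inflation) pairs, one per article.
    This layout is hypothetical and chosen only for the example.
    """
    total = Counter()
    hits = Counter()
    for period, is_inflation in records:
        total[period] += 1
        if is_inflation:
            hits[period] += 1
    # Periods with no inflation article get a share of 0.0.
    return {period: hits[period] / total[period] for period in sorted(total)}


if __name__ == "__main__":
    toy = [
        ("2022-01", True), ("2022-01", False),
        ("2022-02", True), ("2022-02", True),
    ]
    print(attention_index(toy))
```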
However, the direction of inflation also matters, so we create an indicator which signals the direction of prices and is constructed as a “balance of opinions” inspired by surveys among households and firms. We find that such an indicator presents interesting statistical properties. The indicator is well correlated with the EU Commission’s household survey of one-year-ahead inflation expectations (Figure A). In comparison to other indicators of inflation expectations from Consensus Forecasts or financial markets, the indicator has better forecasting properties: it is always retained by an automatic selection algorithm that also includes a variety of control variables (oil prices in euro, short-term business cycle). It is also well correlated with overall inflation (Figure B) and has good forecasting power for one-quarter-ahead inflation in a Phillips curve framework.
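A balance of opinions of this kind can be sketched as follows. The formula is the usual survey convention (share of "rising" answers minus share of "falling" answers) and is an assumption here: the exact weighting and normalisation used in the paper are not given in this summary. The label names and the function name are likewise hypothetical.

```python
def balance_of_opinions(labels):
    """Share of 'rising' articles minus share of 'falling' articles.

    `labels` holds one direction label per article in a given period:
    'rising', 'falling' or 'stable'. Returns None for an empty period.
    """
    n = len(labels)
    if n == 0:
        return None
    rising = labels.count("rising")
    falling = labels.count("falling")
    # 'stable' articles count in the denominator only.
    return (rising - falling) / n


if __name__ == "__main__":
    month = ["rising", "rising", "stable", "falling"]
    print(balance_of_opinions(month))  # (2 - 1) / 4 = 0.25
```

Under this convention the indicator lies between -1 (all articles report falling prices) and +1 (all report rising prices).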
The signal provided by the indicator is only slightly different when we exclude articles expressing the views of experts. For that purpose, we make a distinction between experts and non-experts, and within the experts’ category we also try to discriminate between policymakers and other private-sector experts since the former may have specific information about inflation developments that we may not want to capture.
We also show that the machine-learning-based approach that we implement performs significantly better for France than the unsupervised approach proposed by Angelico et al. (2022), based on Latent Dirichlet Allocation (LDA) or on bi-grams only.
Updated on: 08/25/2023 10:50