Courrier des statistiques N7 - 2022

The seventh issue and third birthday for the review since its relaunch. The ambition is always to address a wide range of the issues affecting Official Statistics. Educational in tone, it addresses statisticians, whether beginners or experts, students and teachers, as well as citizens interested in how statistics are “manufactured”.

The first two articles cover the integration of mixed-mode data collection into surveys, addressing which methods and tools to use to take advantage of this new approach to data collection. One major statistical operation is modernising: the agricultural census is now collected on a mixed-mode basis. Comprehensive administrative sources are more accessible, but are they easy to use? One example is the granular analysis of household property holdings.

Data may set the tone of this issue, but the issue still extensively covers the instruments that allow that data to be used and heard. A good command of cloud computing and of IT development techniques is put forward as a way to ensure the quality of statistical output. Statisticians must also be able to work in conjunction with other academic disciplines, such as psychometrics in the assessment of students’ abilities. Finally, the development of a classification of crimes demonstrates how useful it is to adopt a common framework to store, classify and analyse data.

Courrier des statistiques
Published on: 19/02/2024
Odile Rascol, Editor-in-Chief, INSEE
Courrier des statistiques - February 2024

Presentation of the issue

Music please Maestro!

With this seventh issue, the Courrier des statistiques celebrates its third birthday. The aim of the review is always to address a wide range of the major issues facing Official Statistics. It is educational in tone and is open to a variety of subjects, authors and points of view. It addresses both statisticians, whether beginners or experts, and citizens, though the latter may sometimes find it somewhat dry. It demonstrates our collective ability to evolve and innovate, not only with regard to methods and tools, but also with regard to institutional and legal issues. The review remains attentive to external practices, both in France and in other countries, in order to position itself within our community, to feed into our discussions and to bear witness to our work.

As is only natural, the review takes a particular interest in data and in the various ways of collecting or producing data. In this issue, it covers a major development that has taken place in recent years: the integration of mixed-mode data collection into surveys, i.e. the combination of complementary data collection methods within a single survey. Just as in the previous issue, it also aims to understand how statisticians are increasingly working to take advantage of data sources that already exist but remain under-utilised. Finally, we will see the ways in which a major statistical operation, the agricultural census, is modernising.

Data is therefore central in this issue, as it constitutes the core of a statistician’s profession. However, now more than ever, statisticians must use a wide range of instruments. As part of their repertoire, a good command of cloud computing technologies or of the latest IT developments improves their autonomy and responsibility and the range of possibilities with which they can play and orchestrate their data processing operations. The ability to work in conjunction with other academic disciplines is highlighted in this issue, using the example of psychometrics in the assessment of students’ abilities. Finally, statisticians from various organisations need to disseminate compelling and consistent data. To that end, they develop and adopt a common framework to store, classify and analyse data: this framework is the classification. Issue N7 illustrates this with the new statistical classification of crimes, which is now used by all those involved in crime statistics.

In the prelude to this issue, two articles deal with the introduction of the Internet, and particularly of mixed-mode protocols, in survey data collection. François Beck, Laura Castell, Stéphane Legleye and Amandine Schreiber carry out a wide-ranging review of this development: mixed-mode data collection and the combination of data collection methods (in person, by telephone, in paper format or online) are a response to the increasing difficulty in contacting households. However, they also complicate the entire process, in relation to both the definition of the data collection protocol and the statistical processing operations. Éric Sigaud and Benoît Werquin then detail the harmonisation of the various data collection stages that is needed by the teams responsible for designing and implementing them. This requires each phase of the data collection to be conceptualised ex ante, from the design and automatic generation of questionnaires through to their processing and data consolidation. In doing so, they provide us with a new interpretation of the active metadata management approach.

The 2020 agricultural census takes the spotlight in this issue, with an article by Hervé Le Grand. This major operation in the field of agricultural statistics is at the heart of the agricultural statistics information system: it acts as a kind of metronome that sets the rhythm for other surveys and ensures that consistent data are produced at both French national level and European level. The latest iteration includes five major innovations, which have an impact on respondents, interviewers and statisticians. For the first time, the data were collected primarily online or by telephone. At the time of writing, a sixth innovation, data visualisation, is bringing a finishing touch to this census, which is replete with developments.

A trio of pirates, Frédéric Comte, Arnaud Degorre and Romain Lesur, take us on a journey into the SSPCloud. A computer environment to assist in experimentation with new data science methods, SSPCloud is composed of a set of computer resources for creating prototypes, testing statistical processing operations and taking ownership of new work practices. With SSPCloud, statisticians become part of a FabLab-type school of thought, enabling them to take advantage of new data sources. Several of them are trained on it, and the adoption of open source solutions ensures that reuse is possible. SSPCloud is ultimately a fruitful blend of two professional worlds: that of statistics and that of IT.

Following straight on from the article on SSPCloud, Emmanuel L’Hour, Ronan Le Saout and Benoît Rouppert tell us about the concept of self-sufficient statisticians. The profession of statistician requires a good command of computer tools. The days of statisticians being able to perform their role without the use of tools and instruments have now passed. They must code in accordance with the applicable rules and customs. The programs they write must of course allow results to be obtained, but beyond the deliverables they are also evidence of the quality of the data processing operations. It must therefore be possible to reuse them for other work, or for other self-sufficient statisticians to reinterpret them.

The sixth article in this review takes us into the universe of the use and matching of comprehensive administrative sources. Mathias André and Olivier Meslin describe the project they are leading jointly: to create a new statistical database to allow the study of the property wealth of households and the redistributive profile of property tax. Committed to making use of the administrative sources available, they have experienced the obstacles encountered before accessing the data, as well as the pitfalls of matching and statistically processing files that come from different spheres and were designed for other uses. The article precisely describes the stages of the saga: it details the good practices that have enabled them to achieve a production database, which now supplements the panorama of statistical information concerning household wealth. It also highlights lessons learned from this project for statisticians wishing to perform work on administrative databases.

The work of statisticians sometimes takes a singular path, due to the fact that the thing that they wish to measure does not exist prior to the measuring operation itself. Thierry Rocher describes the solutions used by the Education Statistical Service to approach a measurement of students’ abilities. He shines a light on concepts from the field of psychometrics. He describes to us the choices made (specific procedures and modelling) to reach the point at which it is possible to produce standardised assessments of students’ abilities. In so doing, he reminds us of the extent of national and international systems that seek to define comparable and useful statistics at all levels, from the level of a teacher and the head of a school to the level of a minister.

The final measures in this issue take us in the direction of a fairly rare challenge in the life of a statistician: that of creating a statistical classification. Until recently, the Ministry of the Interior and the Ministry of Justice used different dissemination classifications, which prevented the availability of consistent granular statistics throughout the criminal chain. Benjamin Camus describes how the UN developed an international classification in 2015, distancing itself from differences in criminal legislation by choosing an approach based on the offender’s behaviour. This set the tempo and provided an opportunity to launch the project in France: an inter-ministerial working group defined a French version anchored in a detailed codification of criminal law. In December 2021, the French classification of crimes came into being. Linked with the international classification for the major categories, but including a level of detail that is more relevant in the French context, it constitutes a starting point for the concept of reconciled statistics.
