Labour cost and structure of earnings annual survey 2022 

Ecmoss 2022

17/07/2024

Statistical processing

Frequency of data collection

Annual

Data collection

There are three ways of collecting data. Historically, questionnaires have been sent to establishments by post. Large companies with several surveyed establishments can respond electronically, using a spreadsheet file. Finally, since Ecmo 2016, online collection via the INSEE business survey response portal has been offered to a sub-sample of establishments whose "employee" section of the questionnaire covers only a few employees.

Data collection period

The ESS 2022 collection took place from May to December 2023.

Collection mode

  • By post
  • By Internet

Survey unit

Local unit (of an enterprise)

Sampling method

The sample is selected using a two-stage design, stratified at each stage. Establishments are drawn first, then employees within these establishments. The sampling rate is approximately 3.5% at establishment level and 0.9% at employee level. The stratification is designed to optimise the accuracy of the main indicator (hourly wages) for the main breakdowns required by law (in particular by sector of activity, company size and region).
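The two-stage logic can be illustrated with a minimal Python sketch. It is not the actual INSEE sampling program: the field names (`sector`, `size`, `status`), the toy population and the rates used in the demonstration are all assumptions made for illustration, and simple random sampling within each stratum stands in for the optimised allocation described above.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, rate, rng):
    """Simple random sample of about `rate` in each stratum (at least one unit).

    Returns (unit, weight) pairs, where weight = stratum size / sample size.
    """
    by_stratum = defaultdict(list)
    for u in units:
        by_stratum[stratum_of(u)].append(u)
    sample = []
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        n = max(1, round(rate * len(members)))
        for u in rng.sample(members, n):
            sample.append((u, len(members) / n))
    return sample

def two_stage_sample(establishments, employees_of, est_rate, emp_rate, seed=0):
    """Stage 1: establishments by (sector, size). Stage 2: employees by status."""
    rng = random.Random(seed)
    first = stratified_sample(
        establishments, lambda e: (e["sector"], e["size"]), est_rate, rng)
    drawn = []
    for est, w_est in first:
        second = stratified_sample(
            employees_of[est["id"]], lambda p: p["status"], emp_rate, rng)
        for emp, w_emp in second:
            # Final weight = product of the two stage weights.
            drawn.append({"est": est["id"], "emp": emp["id"],
                          "weight": w_est * w_emp})
    return drawn

# Hypothetical toy population: 200 establishments with 10 employees each.
establishments = [
    {"id": i, "sector": "industry" if i % 2 else "services",
     "size": "small" if i % 4 < 2 else "large"}
    for i in range(200)
]
employees_of = {
    e["id"]: [{"id": (e["id"], j),
               "status": "managerial" if j < 2 else "non-managerial"}
              for j in range(10)]
    for e in establishments
}
sample = two_stage_sample(establishments, employees_of, est_rate=0.1, emp_rate=0.5)
total_weight = sum(r["weight"] for r in sample)
```

In this toy setting the weights of the drawn employees sum back to the population size (2,000 employees), which is the property the design weights are meant to guarantee.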

The establishments are selected from the 'all employees' database built from social declarations (DSN), matched with the Sirus register. The sample for year N is drawn from the data as at 31/12/N-1.

The establishments surveyed are asked to answer a questionnaire on their establishment and questionnaires on identified employees (from 1 to 24 depending on the case). The sample of employees is differentiated by status (managerial/non-managerial).

The Ecmoss establishment samples are part of the negative co-ordination process between the surveys undertaken at INSEE, which limits the overlap between survey samples.

Sample size

Around 18,000 establishments are surveyed, representing 165,000 employees.

Data collection documents

The ESS 2022 questionnaires are provided in the French section.

Data validation

The results are analysed and compared with other indicators disseminated by INSEE, particularly in the context of the quality report sent to Eurostat.

Data compilation

The tables sent to Eurostat always use two successive annual surveys. They are therefore based on observations surveyed in the survey year (N) as well as observations surveyed in the year before (N-1). The earnings variables observed in year N-1 are updated ("aged") to be representative of year N.


Enrichment with administrative data

At the end of the collection, the file of respondents is enriched with information from the 'all employees' database (BTS), mainly from administrative sources (DSN). In this way, the activity of the establishment, the employee's occupation, administrative data on remuneration, paid working hours, etc. are recovered. This enrichment from administrative sources is central to the survey process: it completes the survey data and makes it possible to check the consistency between administrative and survey information, and to adjust the data where necessary.


The enrichment phase also serves to identify 'out-of-scope' cases, in order to differentiate them from non-respondents. Out-of-scope establishments (mainly those that have ceased activity since the sampling frame was drawn up) and out-of-scope employees (either belonging to out-of-scope establishments or having left the establishment since the sampling frame was drawn up) are identified.

Data editing and correction of non-response

For the 'employees' questionnaires:

Dares is responsible for the adjustment of employee questionnaires. The central variables of the survey (gross salary and number of hours paid) are checked mainly using individual data from the "all employees" database (BTS). The main principles of the adjustment operations are as follows:

  • The value collected by the questionnaire is kept even in case of inconsistency with the BTS value, as long as the answers given to the different questions of the questionnaire are consistent with each other;
  • When outliers or missing values are detected or inconsistencies are found internally or with the BTS data, some variables are adjusted by deterministic imputations with the BTS variables, others are adjusted by modelling (statistical imputation). Whatever the source (questionnaire or enrichment) the earnings data are considered more reliable than the data on durations; it is therefore the durations that are modified in case of inconsistency.
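The two principles above can be sketched as a small decision rule. This is only an illustration under stated assumptions: the field names (`q_pay`, `q_hours`, `bts_pay`, `bts_hours`) and the plausibility bounds on the hourly wage are hypothetical, and the statistical (model-based) imputations mentioned above are not shown.

```python
def adjust_employee(rec, min_hourly=5.0, max_hourly=500.0):
    """Return (pay, hours, source) for one employee record.

    min_hourly / max_hourly are hypothetical bounds used here as the
    internal-consistency check between gross pay and paid hours.
    """
    def consistent(pay, hours):
        return (pay is not None and hours is not None and hours > 0
                and min_hourly <= pay / hours <= max_hourly)

    pay, hours = rec["q_pay"], rec["q_hours"]
    if consistent(pay, hours):
        # Internally consistent answers are kept, even if they differ from BTS.
        return pay, hours, "questionnaire"
    if pay is None:
        # Missing earnings: deterministic imputation from the BTS value.
        pay = rec["bts_pay"]
    # Earnings are treated as more reliable, so the duration is the one replaced.
    return pay, rec["bts_hours"], "hours imputed from BTS"

kept = adjust_employee(
    {"q_pay": 30000.0, "q_hours": 1600.0, "bts_pay": 28000.0, "bts_hours": 1820.0})
fixed = adjust_employee(
    {"q_pay": 30000.0, "q_hours": 16.0, "bts_pay": 28000.0, "bts_hours": 1820.0})
```

In the second record the reported hours imply an implausible hourly wage, so the earnings are kept and only the hours are replaced.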

After these adjustments, Dares calculates a first set of 'employee' weights corrected for total non-response by reallocating the weights of the non-responding units to the respondents belonging to the same draw stratum.
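A minimal sketch of this reallocation follows. It assumes that out-of-scope units have already been removed, and the field names (`stratum`, `weight`, `responded`) are hypothetical.

```python
from collections import defaultdict

def reweight_for_nonresponse(units):
    """Spread the weights of non-respondents over respondents of the same stratum."""
    total = defaultdict(float)      # weight of all sampled units, by stratum
    responding = defaultdict(float)  # weight of respondents only, by stratum
    for u in units:
        total[u["stratum"]] += u["weight"]
        if u["responded"]:
            responding[u["stratum"]] += u["weight"]
    adjusted = []
    for u in units:
        if u["responded"]:
            factor = total[u["stratum"]] / responding[u["stratum"]]
            adjusted.append({**u, "weight": u["weight"] * factor})
    return adjusted

units = (
    [{"stratum": "A", "weight": 10.0, "responded": r} for r in (True, True, False, False)]
    + [{"stratum": "B", "weight": 5.0, "responded": r} for r in (True, True, False)]
)
respondents = reweight_for_nonresponse(units)
```

Here the two respondents of stratum A each end up with a weight of 20 and those of stratum B with 7.5, so each stratum total (40 and 15) is preserved.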

INSEE then carries out a second set of adjustments on the employee data, to meet the constraints imposed by Eurostat. In particular, a special treatment is carried out for employees on fixed-term contracts, for whom Eurostat wants a number of paid hours to be provided.

For each survey year, this expert work yields the adjusted, uncalibrated database which, once calibrated at INSEE, becomes the annual database for national dissemination.

For the 'establishments' questionnaires:

The adjustment of the 'establishments' questionnaires particularly concerns the Ecmo format, where the establishment part is essential for responding to Eurostat. First of all, respondents are distinguished according to their level of response: they sometimes respond for the establishment and sometimes for the enterprise, when the information is not known at establishment level. The establishments table is then edited by removing establishments that are, or are considered to be, in total non-response. Establishments that do not respond to a whole block of variables or to certain so-called "key" variables, such as those relating to costs or the wage bill, are considered to be in total non-response. As with the employee questionnaires, initial weights are calculated by reallocating the weights of the non-responding units uniformly to the responding units belonging to the same sampling stratum. Finally, missing or incorrectly filled-in responses to the other questions (those that are not "key" questions and therefore do not lead to classification as total non-response) are adjusted by imputation, in particular by hot-deck.
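As an illustration of hot-deck imputation, the sketch below implements a random hot-deck within imputation classes: each missing value is replaced by the value of a randomly chosen respondent (donor) from the same class. The choice of classes, the variable names and the random variant itself are assumptions; the source does not specify which hot-deck variant is used.

```python
import random
from collections import defaultdict

def hot_deck(records, var, class_key, seed=0):
    """Random hot-deck: fill missing `var` with a donor value from the same class."""
    rng = random.Random(seed)
    donors = defaultdict(list)
    for r in records:
        if r[var] is not None:
            donors[r[class_key]].append(r[var])
    completed = []
    for r in records:
        if r[var] is None and donors[r[class_key]]:
            completed.append({**r, var: rng.choice(donors[r[class_key]]),
                              "imputed": True})
        else:
            # Observed values, and missing values without any donor, are left as is.
            completed.append({**r, "imputed": False})
    return completed

# Hypothetical establishment records; "training_cost" is a non-key variable.
records = [
    {"id": 1, "sector": "industry", "training_cost": 1200.0},
    {"id": 2, "sector": "industry", "training_cost": None},
    {"id": 3, "sector": "industry", "training_cost": 900.0},
    {"id": 4, "sector": "services", "training_cost": 300.0},
    {"id": 5, "sector": "services", "training_cost": None},
]
completed = hot_deck(records, "training_cost", "sector")
```

Because the imputed value is an observed value from a similar unit, the imputed data stay within the range of plausible responses for the class.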

Treatment of influential units

Influential units are given a specific treatment. These are individuals whose response, although not erroneous, combines with a high weight in such a way that the estimates for the domains to which they belong, while still unbiased, may become much less precise. Their influence is controlled by applying a winsorisation technique (Kokic and Bell's method), which reduces the weight of the influential individual without losing the information in their response. This makes it possible to improve accuracy.
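The effect of winsorisation on a single unit can be sketched as follows. The sketch uses a common one-sided form in which a value y above a threshold K is replaced by K + (y - K) / w, where w is the unit's weight, and then expresses the same correction as a reduced weight. The threshold is taken as given: deriving optimal thresholds, which is the core of Kokic and Bell's method, is not shown, and the numbers are hypothetical.

```python
def winsorise(y, w, threshold):
    """One-sided winsorisation of value y with weight w above `threshold`.

    Returns (winsorised value, equivalent reduced weight). The reduced weight
    applied to the original value gives the same weighted contribution as the
    original weight applied to the winsorised value.
    """
    if y <= threshold:
        return y, w
    y_star = threshold + (y - threshold) / w
    w_star = w * y_star / y
    return y_star, w_star

# Hypothetical influential unit: large value combined with a high weight.
y_star, w_star = winsorise(y=1000.0, w=50.0, threshold=200.0)
# A unit below the threshold is left unchanged.
unchanged = winsorise(y=150.0, w=50.0, threshold=200.0)
```

In this example the weighted contribution falls from 50,000 to 10,800, while the unit keeps its reported value of 1,000 with a weight of about 10.8 instead of 50.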

Calibration on margins

  • For each annual survey, the variables taken from the survey are calibrated to the margins of the total population in paid employment taken from the BTS, according to a number of criteria (social category, gender, geographical location, etc.).
  • After concatenation of the annual files for delivery to Eurostat, the whole set is calibrated a second time on the margins for the reference year of the survey.

Each of the adjustments to the margins is carried out using the Calmar procedure.
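To illustrate what calibration on margins does, the sketch below applies the raking ratio method (iterative proportional fitting) to categorical margins; raking ratio is one of the calibration methods available in Calmar. This is not the Calmar macro itself: the variables, the margins and the data are hypothetical, and numerical auxiliary variables are not handled.

```python
def rake(units, margins, max_iter=100, tol=1e-9):
    """Adjust weights so that weighted counts match the target margins.

    `margins` maps a variable name to {category: target total}. Every target
    category must be present in the sample, and all margins must share the
    same grand total.
    """
    weights = [u["weight"] for u in units]
    for _ in range(max_iter):
        max_gap = 0.0
        for var, targets in margins.items():
            totals = {c: 0.0 for c in targets}
            for u, w in zip(units, weights):
                totals[u[var]] += w
            max_gap = max(max_gap,
                          max(abs(totals[c] - targets[c]) for c in targets))
            # Rescale each weight by target / current total for its category.
            for i, u in enumerate(units):
                weights[i] *= targets[u[var]] / totals[u[var]]
        if max_gap < tol:
            break
    return weights

units = [
    {"gender": "F", "region": "North", "weight": 10.0},
    {"gender": "F", "region": "South", "weight": 10.0},
    {"gender": "M", "region": "North", "weight": 10.0},
    {"gender": "M", "region": "South", "weight": 10.0},
]
margins = {
    "gender": {"F": 25.0, "M": 15.0},
    "region": {"North": 22.0, "South": 18.0},
}
calibrated = rake(units, margins)
```

After calibration, the weighted counts by gender and by region both reproduce the target margins, while each weight stays proportional to its starting value within the constraints.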

Delivery to Eurostat

The tables sent to Eurostat always use two successive annual surveys. On the concatenated database, final adjustments are made to satisfy the constraints imposed by Eurostat. These constraints include strict bounds on several variables (for example working time and the valuation of overtime) and the absence of partial non-response (individuals with certain variables missing are deleted).