ant Jul 18, 2019 04:46 PM
Dear all,

I have some questions regarding the consistency of the data published in "Monitoring of CO2 emissions from passenger cars" ([…]/co2-cars-emission-16).
I know the data is marked as "provisional", but I'd still like to analyze it. I apologize if there is a better place to ask these questions; I'd be thankful for any pointers.

1) Several entries of the field Mh (Manufacturer name EU standard denomination) contain the word "Duplicate". Does this mean that the corresponding data row should be omitted in the analysis (e.g. when calculating average CO2 emissions)? If so, why are they published?
For example, see data row 6747 (ID 14555502). All entries are identical to the preceding row except for the fields Mh, Mp, VFN, Ewltp and Vf.

2) As in the previous example, emission data originating from apparently identical vehicle types may differ significantly. For example, the data in rows 1749 and 1750 (ID 14560585/6) differ only in the NEDC and WLTP emission figures, i.e. 149 vs 160 g/km and 173 vs 184 g/km, respectively. What may cause this discrepancy?
EEA Jul 19, 2019 10:44 AM
Dear M. 'Ant',

Thank you for contacting the European Environment Agency (EEA).

Duplicates are records for which the VIN appears at least twice in the database. They are not used in calculating manufacturer performance. These rows are published because OEMs will review them and correct the data via the error notifications. As this is a provisional dataset, it may well contain errors, which the OEMs will correct.

With kind regards,
EEA Enquiry Service
ant Jul 19, 2019 03:09 PM
Thank you for clarifying!
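
For anyone doing a similar analysis: the rule from the EEA reply (rows flagged as duplicates are left out of the calculations) could be sketched roughly as below. This is only an illustration under assumptions. The column names "ID", "Mh" and "Ewltp" are guesses at the CSV header, the sample rows are made up rather than taken from the dataset, and the simple unweighted mean is not necessarily how the EEA computes manufacturer performance.

```python
import csv
import io

# Made-up sample rows for illustration only; not taken from the EEA dataset.
# The column names "ID", "Mh" and "Ewltp" are assumptions about the CSV header.
SAMPLE_CSV = """ID,Mh,Ewltp
1,MAKER A,173
2,DUPLICATE,184
3,MAKER B,120
4,MAKER A,
"""


def is_duplicate(row):
    """True if the Mh field contains the word "duplicate" (case-insensitive)."""
    return "duplicate" in (row.get("Mh") or "").lower()


def average_emissions(rows, column="Ewltp"):
    """Unweighted mean of `column` over unflagged rows that have a value."""
    values = [
        float(row[column])
        for row in rows
        if not is_duplicate(row) and (row.get(column) or "").strip()
    ]
    return sum(values) / len(values) if values else None


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
kept = [row for row in rows if not is_duplicate(row)]
print(len(rows) - len(kept), "row(s) flagged as duplicate")
print("average Ewltp without duplicates:", average_emissions(rows))
```

To run this against the real file, replace io.StringIO(SAMPLE_CSV) with an opened file handle and adjust the column names to match the actual header.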