COD-AB Data Quality Report:

Generated on: 25 Apr 2025

Metadata

Scores

Overall Score: The average of all the scores below.
NaN%
Valid Geometry: Layers which have valid geometry. Valid geometry is defined as having no empty geometries, containing only polygons (no points or lines), containing no self-intersecting rings, using the WGS84 CRS (EPSG:4326), and having a valid bounding box.
NaN of 0
Valid Topology: Layers which have valid topology. Valid topology is defined as having no triangle polygons, sliver gaps or overlaps within a layer, with each polygon being fully contained within its parent.
NaN of 0
Equal Area: Layers which all share the same area. Layers not sharing the same area may have empty areas representing water bodies, whereas other layers have them filled in.
NaN of 0
Sq. km: Layers which have an area attribute in square kilometers whose value matches the area calculated above using NASA EASE-Grid 2.0.
NaN of 0
P-Codes: Layers which have all required P-Code columns (ADM2_PCODE), with no empty cells, only alphanumeric characters, a valid ISO-2 code at the start, no duplicate codes, all codes within a column having the same length, and codes that nest hierarchically within their parent's code.
NaN of 0
Names: Layers which have all required name columns (ADM2_EN), with no empty cells, no duplicate rows, no double / leading / trailing spaces, no columns all uppercase / lowercase, no cells lacking alphabetic characters, and all characters matching the language code.
NaN of 0
Languages: Layers which have at least 1 language column detected, use only valid language codes, feature a romanized language first, and don't have more languages than their parents.
NaN of 0
Date: Layers which have a valid date value for their source.
NaN of 0
Other: Layers which have no fields other than expected values.
NaN of 0

Checks

Valid Geometry
How many geometries are in the layer?
How many geometries are empty?
How many geometries are not polygons?
How many geometries have 3D coordinates?
How many geometries are invalid?
If any geometries are invalid, why?
What EPSG projection is used in the layer?
Does the layer have an invalid bounding box?
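Several of the geometry questions above can be answered by counting over the raw coordinates. The following is a minimal sketch, not the report's actual implementation: it assumes geometries arrive as GeoJSON-style dictionaries, and the function names `iter_positions` and `summarize_geometries` are hypothetical. It does not detect self-intersecting rings or read the layer's EPSG code; those checks would need a geometry library.

```python
from typing import Any, Dict, Iterator, List, Optional

POLYGON_TYPES = {"Polygon", "MultiPolygon"}


def iter_positions(coords: Any) -> Iterator[List[float]]:
    """Yield every position from arbitrarily nested GeoJSON coordinates."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords
    else:
        for part in coords or []:
            yield from iter_positions(part)


def summarize_geometries(geometries: List[Optional[Dict[str, Any]]]) -> Dict[str, Any]:
    """Count empty, non-polygon and 3D geometries, and test the bounding box."""
    empty = non_polygon = has_z = 0
    xs: List[float] = []
    ys: List[float] = []
    for geom in geometries:
        positions = list(iter_positions(geom.get("coordinates"))) if geom else []
        if not positions:
            empty += 1
            continue
        if geom.get("type") not in POLYGON_TYPES:
            non_polygon += 1
        if any(len(p) > 2 for p in positions):
            has_z += 1
        xs.extend(p[0] for p in positions)
        ys.extend(p[1] for p in positions)
    # Assumption: a bounding box is invalid when there are no coordinates at
    # all or when it falls outside the longitude/latitude range of EPSG:4326.
    bbox_ok = (
        bool(xs)
        and min(xs) >= -180 and max(xs) <= 180
        and min(ys) >= -90 and max(ys) <= 90
    )
    return {
        "count": len(geometries),
        "empty": empty,
        "non_polygon": non_polygon,
        "has_z": has_z,
        "invalid_bbox": not bbox_ok,
    }
```

Passing the list of geometries for one layer returns one count per question, which keeps the output close to the checklist.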
Valid Topology
Are there any sliver gaps within the layer?
How many polygons overlap each other within the same layer?
How many polygons aren't fully contained within their parent?
How many polygons list a P-Code different than the one their parent has? (match by location)
How many polygons list a P-Code different than the one their parent has? (match by attributes)
How many polygons list a name different than the one their parent has? (match by location)
How many polygons list a name different than the one their parent has? (match by attributes)
Equal Area
Are all layer bounding boxes the same?
How large is the layer rounded to square kilometers?
Sq. km
What is the total area according to the layer's attribute value?
P-Codes
How many admin levels (e.g. ADM2) are represented with P-Code columns? (e.g. ADM2_PCODE)
How many P-Codes are empty?
How many P-Codes are duplicated?
How many different P-Code lengths are there?
How many P-Codes don't start with a valid ISO-2 code?
How many P-Codes aren't alphanumeric?
How many P-Codes can't nest with their parent value?
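The P-Code questions above map onto simple string checks. The sketch below is illustrative only and assumes rows are dictionaries keyed by column name; the function `check_pcodes`, its parameters, and the choice to treat "nesting" as "the child code starts with the parent code" are assumptions, not the report's confirmed logic. The expected ISO-2 prefix is supplied by the caller.

```python
import re
from collections import Counter
from typing import Dict, List, Optional

ALPHANUMERIC = re.compile(r"[A-Za-z0-9]+")


def check_pcodes(
    rows: List[Dict[str, Optional[str]]], level: int, iso2: str
) -> Dict[str, int]:
    """Count P-Code problems in the ADM{level}_PCODE column."""
    col = f"ADM{level}_PCODE"
    parent_col = f"ADM{level - 1}_PCODE"
    codes = [(row.get(col) or "").strip() for row in rows]
    filled = [c for c in codes if c]
    counts = Counter(filled)

    # Assumption: a code nests when it begins with its parent's code.
    not_nested = 0
    if level > 0:
        for row, code in zip(rows, codes):
            parent = (row.get(parent_col) or "").strip()
            if code and not (parent and code.startswith(parent)):
                not_nested += 1

    return {
        "empty": len(codes) - len(filled),
        # Every row whose code appears more than once is counted.
        "duplicated": sum(n for n in counts.values() if n > 1),
        "lengths": len({len(c) for c in filled}),
        "bad_iso2": sum(not c.startswith(iso2) for c in filled),
        "not_alphanumeric": sum(not ALPHANUMERIC.fullmatch(c) for c in filled),
        "not_nested": not_nested,
    }
```

A layer that passes would report zero for every key except `lengths`, which should be 1.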
Names
How many Admin 0 names don't match their UNTERM or UN M49 short names?
How many admin levels (e.g. ADM2) are represented with name columns? (e.g. ADM2_EN)
How many name columns are present? (should be number of languages multiplied by number of admin levels)
How many names are empty?
How many rows have duplicated names? (considers all admin levels)
How many names have double spaces?
How many names have leading or trailing spaces?
How many names are all uppercase?
How many names are all lowercase?
How many names contain numbers?
How many names don't contain any alphabetic characters? (e.g. all numbers or punctuation)
How many names use characters outside of their defined language?
How many invalid characters are detected?
What invalid characters are detected?
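Most of the whitespace and casing questions above can be expressed with built-in string methods. This is a rough sketch under the assumption that the names of one column are available as a list of strings (or `None` for missing cells); `check_names` is a hypothetical helper. It leaves out the checks on characters outside a name's defined language and on Admin 0 names, which depend on reference data not shown here.

```python
from typing import Dict, List, Optional


def check_names(names: List[Optional[str]]) -> Dict[str, int]:
    """Count common formatting problems in a single name column."""
    # Cells that are missing or contain only whitespace count as empty.
    filled = [n for n in names if n and n.strip()]
    return {
        "empty": len(names) - len(filled),
        "double_spaces": sum("  " in n for n in filled),
        "leading_or_trailing_spaces": sum(n != n.strip() for n in filled),
        "all_uppercase": sum(n.isupper() for n in filled),
        "all_lowercase": sum(n.islower() for n in filled),
        "contains_numbers": sum(any(ch.isdigit() for ch in n) for n in filled),
        "no_alphabetic": sum(not any(ch.isalpha() for ch in n) for n in filled),
    }
```

Note that `str.isupper()` and `str.islower()` return `False` for strings with no cased characters, so a value such as "123" is counted under `no_alphabetic` rather than under either casing check.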
Languages
What languages are used in the dataset, and are they all valid codes?
Does the first listed language use a Roman (Latin) script?
Does the parent contain all the current layer's languages?
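Language detection can be sketched from column names alone, assuming name columns follow the `ADM2_EN` pattern shown earlier (admin level, underscore, language code). The helper `summarize_languages` and its `valid_codes` argument are hypothetical; the caller supplies whichever set of language codes it considers valid. The result also gives the expected number of name columns (languages multiplied by admin levels) for comparison with the number found.

```python
import re
from typing import Any, Dict, List, Set

# Matches ADM<level>_<2-3 letter suffix>; ADM2_PCODE does not match
# because its suffix is longer than three letters.
NAME_COLUMN = re.compile(r"^ADM(\d)_([A-Za-z]{2,3})$")


def summarize_languages(columns: List[str], valid_codes: Set[str]) -> Dict[str, Any]:
    """List languages in column order and flag codes not in valid_codes."""
    levels: Set[int] = set()
    languages: List[str] = []
    matched = 0
    for col in columns:
        m = NAME_COLUMN.match(col)
        if not m:
            continue
        matched += 1
        levels.add(int(m.group(1)))
        lang = m.group(2).lower()
        if lang not in languages:
            languages.append(lang)
    return {
        "levels": sorted(levels),
        "languages": languages,
        "invalid_codes": [code for code in languages if code not in valid_codes],
        "name_columns": matched,
        "expected_name_columns": len(levels) * len(languages),
    }
```

Any three-letter suffix that is not a language code would surface under `invalid_codes`, so a real check would need to exclude such columns before counting.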
Date
What is the date of the dataset's source?
When was the dataset last validated?
Does the layer have a validTo field?
Is the layer's validTo field empty?
Other
How many reference name fields are present?
How many alternative name fields are present?
How many other fields are present?