Data style guide
An introduction to charts and tables, including when to use the different types of chart, and guidance on presenting data and using the chart and table builder in the Publisher.
1. Preparing your data for the Publisher
Formatting your data
When you are preparing data for charts and tables:
- do not use abbreviations – for example, write ‘number’ rather than ‘no.’
- for data in tables (not charts), follow the guidance for writing dates and time periods – for example, write ‘April 2018 to March 2019’, not ‘2018/19’
- do not use dashes and slashes – for example, write ‘White Gypsy and Roma’, not ‘White Gypsy/Roma’
- always capitalise ethnic groups – for example, write ‘Black Other’ not ‘Black other’
- use commas to separate thousands and millions – for example, write ‘3,000’ not ‘3000’
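As an illustration, the comma and slash rules above could be applied in a script before you upload data. This is a minimal Python sketch. The helper names `format_count` and `replace_slashes` are hypothetical and are not part of the Publisher.

```python
def format_count(value: int) -> str:
    """Format a whole number with commas separating thousands."""
    return f"{value:,}"


def replace_slashes(label: str) -> str:
    """Replace each slash in a label with the word 'and'."""
    parts = [part.strip() for part in label.split("/")]
    return " and ".join(parts)


print(format_count(3000))                   # 3,000
print(replace_slashes("White Gypsy/Roma"))  # White Gypsy and Roma
```

Check the output by eye afterwards. A simple replacement like this cannot tell when a slash is a legitimate part of a label.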
Ethnic categories
Ethnic categories should follow those in the list of ethnic groups where possible.
If departments have submitted different categories, amend them to match the standardised list if you can. For example, if the original dataset uses ‘White’ and ‘BME’, rename ‘BME’ to ‘Other than White’. Do this in both the download file and in the data you are using for charts and tables.
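One way to keep the download file and the chart and table data in step is to apply the same renaming to both. The Python sketch below assumes the data is held as a list of rows with an `ethnicity` column. The `RENAMES` mapping, the `standardise` function and the values are made up for illustration.

```python
# Hypothetical mapping from a department's labels to the standardised list.
RENAMES = {"BME": "Other than White"}


def standardise(rows, column="ethnicity"):
    """Return copies of the rows with ethnicity labels renamed where a mapping exists."""
    return [{**row, column: RENAMES.get(row[column], row[column])} for row in rows]


# Made-up example values, not real statistics.
download_rows = [
    {"ethnicity": "White", "value": 71.2},
    {"ethnicity": "BME", "value": 64.8},
]
chart_rows = [{"ethnicity": "BME", "value": 64.8}]

# Apply the same mapping to both sets of data.
download_rows = standardise(download_rows)
chart_rows = standardise(chart_rows)
```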
If the dataset uses categories that do not fit with any of the existing groupings used on Ethnicity facts and figures, the chart builder will automatically select the ‘Custom’ grouping.
Use the same ethnic categories throughout the charts, tables and download file for each measure page. Within the same dimension, charts can show data for aggregated groups and tables can show data for individual groups.
Use the label ‘Unknown’ for categories where the ethnicity isn’t known or stated.
Aggregated and individual ethnic groups
If the data provided by departments includes individual ethnic categories (such as ONS 18+1), they will be grouped into their aggregated categories by the Publisher.
Do not include statistics for aggregated groups if they were not part of the department’s data file.
People with unknown ethnicity
Write in ‘The ethnic categories used in this data’ (in ‘Things you need to know’):
- the percentage of people whose ethnicity was not known
- whether people with unknown ethnicity are used in the calculations
Do not include an analysis of people with unknown ethnicity in the commentary unless there is an important reason for doing so (for example, an unusually high number of people of unknown ethnicity).
Missing data
There should not be any empty cells in charts, tables and data downloads.
Use the following symbols if data is missing:
- ! data withheld for confidentiality
- ? data not shown because small sample size makes it unreliable
- - data not collected [or otherwise unavailable]
- N/A data not applicable (for example, no meaningful numerator)
- ~0 value rounded to 0 but is above 0 before rounding
The Publisher automatically adds footnotes explaining what the symbols mean.
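Before uploading, you could check a data file for empty cells with a short script. This Python sketch assumes the data has been read into a list of rows. `MISSING_SYMBOLS` repeats the symbols listed above, and `find_empty_cells` is a hypothetical helper, not a Publisher feature.

```python
# Symbols from the list above and what they mean.
MISSING_SYMBOLS = {
    "!": "data withheld for confidentiality",
    "?": "data not shown because small sample size makes it unreliable",
    "-": "data not collected",
    "N/A": "data not applicable",
    "~0": "value rounded to 0 but is above 0 before rounding",
}


def find_empty_cells(rows):
    """Return (row, column) positions of cells that are None or blank."""
    return [
        (r, c)
        for r, row in enumerate(rows)
        for c, cell in enumerate(row)
        if cell is None or str(cell).strip() == ""
    ]


# Made-up example rows: two cells are empty and need a symbol.
rows = [["White", "12.3"], ["Unknown", ""], ["Asian", None], ["Black", "!"]]
print(find_empty_cells(rows))  # [(1, 1), (2, 1)]
```

Each position returned needs one of the symbols, chosen according to why the value is missing.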
Rounding
Depending on the quality of the data, round numbers in charts and tables to either:
- whole numbers
- 1 decimal place
As a rule of thumb, round good quality data to 1 decimal place, and poor quality data to whole numbers.
Rounding must be consistent across tables, charts and commentary in the same measure. If the download file contains unrounded numbers, or has different rounding to charts and tables, say so in ‘Things you need to know’.
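The rule of thumb above can be sketched in Python as follows. This guide does not say how to treat values that fall exactly halfway, so the sketch assumes they round up. Python's built-in `round` rounds halves to the nearest even number instead, which is why the `decimal` module is used. The function name `round_value` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_value(value: float, good_quality: bool) -> str:
    """Round to 1 decimal place for good quality data, otherwise to a whole number."""
    step = Decimal("0.1") if good_quality else Decimal("1")
    # Convert through str so the decimal sees the number as it is written.
    return str(Decimal(str(value)).quantize(step, rounding=ROUND_HALF_UP))


print(round_value(12.345, good_quality=True))   # 12.3
print(round_value(12.5, good_quality=False))    # 13
```

Applying one function to the chart, table and commentary figures helps keep rounding consistent within a measure.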
2. Building a table
All dimensions must include a table. Tables give the user an alternative way to view the data, and can be used to include extra information (like sample sizes).
Which data to include
In addition to percentages or rates, tables can also include other data, such as sample sizes, to help the user understand the data presented.
As a general rule:
- in survey data, include the sample size or total number of respondents (denominator) in tables – make sure it is clearly labelled in the header row of the table (for example, ‘All respondents’)
- in administrative data, include the value (for example, the rate or percentage) and the numerator (the number of people or occurrences being counted)
For standardised data (like age-standardised data for health measures), do not include the sample size (denominator) or numerator. Instead only show results in terms of a percentage of the group being measured.
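The general rule above can be summarised as a lookup from the type of data to the columns a table would usually include. This is a Python sketch only. The function name `table_columns` and the type labels are hypothetical, and the column names follow the download file variable names in this guide.

```python
def table_columns(data_type: str) -> list[str]:
    """Suggest table columns for survey, administrative or standardised data."""
    columns = {
        "survey": ["value", "denominator"],        # denominator is the sample size
        "administrative": ["value", "numerator"],  # numerator is the count
        "standardised": ["value"],                 # percentage only
    }
    if data_type not in columns:
        raise ValueError(f"Unknown data type: {data_type}")
    return columns[data_type]


print(table_columns("survey"))        # ['value', 'denominator']
print(table_columns("standardised"))  # ['value']
```

Treat the result as a starting point, because the size of the table also affects what to include.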
The data you include in a table will depend on:
- the type of data the measure is based on (survey or administrative data)
- the size of the table
There is no minimum or maximum number of columns and rows for a table. However, try to avoid overwhelming the user with figures in tables for dimensions shown by ethnicity and other factors.
If a time series has a lot of data points, you can show only the first and last time periods. Include all time periods in the data download file.
Preview your tables after you have built them. If they are too wide and the user has to scroll horizontally in order to view the full table, find other ways to present your data. For example, put ethnic groups in columns instead of rows, or exclude the numerator or denominator if they are not essential to understanding the data.
Columns and rows
Table rows should usually be labelled by ethnic group, and columns should be labelled by other factors (years, age groups, etc).
Only label columns by ethnic group if:
- the table is part of a ‘by area’ dimension, showing regions, local authorities or other geographic breakdowns
- the number of ethnic groups is smaller than the number of categories in the other factor, and putting ethnic groups in columns would mean less or no scrolling
Group columns in tables if they present different statistics for the same variable.
Header rows
Use the following labels in table header rows:
- ‘Ethnicity’ for ethnic groups – use this whenever you are presenting ethnic groups
- % (a percentage sign) where relevant – do not use percentage signs alongside values within the table
- ‘Region’ or ‘Local authority’ for areas and regions – or the type of geography that corresponds to the data presented
3. Building a chart
Types of chart and when to use them
There are 5 types of chart:
- bar chart
- component bar chart
- multiple bar charts
- line chart
- multiple line charts
Bar charts, grouped bar charts, component charts and multiple bar charts
Use bar charts, grouped bar charts, component charts and multiple bar charts (sometimes called ‘panel bar charts’) for cross-sectional data, or for a very short time series (2 or 3 years).
Use a bar chart if:
- the data is broken down by ethnicity only – show the values for both aggregated and individual ethnic groups if they’re available
- the data is broken down by ethnicity and a second factor, and the second factor has 3 groups or fewer (for example, men and women, or a time series with 3 years)
Use a component chart if:
- the percentages for all ethnic groups add up to 100%
- the data is broken down by ethnicity and a second factor (for example, gender) and the percentages for the second factor add up to 100%
Use a component chart or a grouped bar chart to display data divided into 2 categories, such as ‘male’ and ‘female’. This allows users to easily compare values for the 2 groups.
Use a multiple bar chart or grouped bar chart if:
- the data is broken down by ethnicity and a second factor, and the second factor has 4 groups or more
Multiple bar charts in the same dimension should have the same scale. If your multiple bar charts are showing different scales, use a grouped bar chart instead.
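The rules for bar-type charts overlap, so more than one chart type can suit the same data. The Python sketch below lists every type the rules above would allow, rather than picking one. The function name `candidate_bar_charts` and its parameters are hypothetical, and the final choice still depends on judgement.

```python
def candidate_bar_charts(second_factor_groups: int, sums_to_100: bool) -> list[str]:
    """List the bar-type charts the rules allow.

    second_factor_groups is 0 when the data is broken down by ethnicity only.
    sums_to_100 is True when the percentages shown add up to 100%.
    """
    candidates = []
    if second_factor_groups <= 3:
        candidates.append("bar chart")
    if sums_to_100 or second_factor_groups == 2:
        candidates.append("component chart")
    if second_factor_groups == 2 or second_factor_groups >= 4:
        candidates.append("grouped bar chart")
    if second_factor_groups >= 4:
        candidates.append("multiple bar chart")
    return candidates


print(candidate_bar_charts(0, sums_to_100=False))  # ['bar chart']
print(candidate_bar_charts(5, sums_to_100=False))  # ['grouped bar chart', 'multiple bar chart']
```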
Line charts and multiple line charts
Use line charts and multiple line charts for time series data.
Use a multiple line chart if a single line chart would be difficult to read because:
- there are more than 6 groups to compare
- the lines are very close together or overlapping
Multiple line charts must all have the same horizontal and vertical scales, and they should each show a single line.
The vertical axis should start from 0 if the line chart is showing a percentage.
If there are only a few lines to compare but they are very close together, you can present them in a single line chart (rather than multiple line charts) by changing the minimum value on the vertical axis. However, user research has shown that charts presented this way are harder to read.
Chart values and labels
When creating charts in the chart builder, select the ‘Standard ethnicity’ group in all fields that refer to ethnicity categories (‘Primary grouping’, ‘Secondary grouping’, ‘Panels’, etc). If the field has a text box, enter ‘Ethnicity’.
Percentages
If the values in the data are percentages:
- Import the data as numbers rather than percentages in all data files (chart, table and download files).
- In the ‘Number format’ section in the Publisher, choose ‘Percentage’ if you want the vertical axis to start at 0% and end at 100%.
- Choose ‘Other’ if you want the vertical axis to have a range other than 0% to 100% (for example, if all values in the data are between 0% and 20%). Enter the values for the vertical axis in the ‘Minimum’ and ‘Maximum’ fields. Enter ‘%’ in the ‘Suffix’ field.
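The percentage steps above can be sketched in Python. The first function strips a percentage sign so the value is stored as a number. The second records which ‘Number format’ option matches a chosen axis range. Both function names and the dictionary keys are hypothetical, and are shown only to make the choices explicit. They do not represent how the Publisher stores settings.

```python
def to_number(cell: str) -> float:
    """Convert text such as '12.5%' or '1,250' to a plain number."""
    return float(cell.strip().rstrip("%").replace(",", ""))


def axis_settings(minimum: float = 0, maximum: float = 100) -> dict:
    """Describe the number format choice for a percentage axis."""
    if minimum == 0 and maximum == 100:
        return {"number_format": "Percentage"}
    return {
        "number_format": "Other",
        "minimum": minimum,
        "maximum": maximum,
        "suffix": "%",
    }


print(to_number("12.5%"))    # 12.5
print(axis_settings(0, 20))
```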
4. Adding the download file
Download files contain all the data used in the charts and tables, as well as some additional data.
Use the following names for the variables within download files to make sure they are consistent across different measures:
- measure
- ethnicity
- time
- gender or sex – use the same term provided by the department in the source data
- age
- geography
- value
- numerator
- denominator
- sample
- unweighted sample
- upper bound
- lower bound
The names of the variables listed in the ‘Download description’ for each measure should be broadly the same as those in the download file – but say ‘confidence intervals’ rather than ‘upper bound’ and ‘lower bound’. Explain any other differences in the download description.
The variables listed depend on the content of the download file.
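To check that a download file uses the variable names above, you could compare its header row against the list. This Python sketch assumes the download file is a CSV with the variable names in its first row. `STANDARD_NAMES` repeats the list above, and `nonstandard_columns` is a hypothetical helper.

```python
import csv
import io

# Variable names from the list above ('gender' and 'sex' are alternatives).
STANDARD_NAMES = {
    "measure", "ethnicity", "time", "gender", "sex", "age", "geography",
    "value", "numerator", "denominator", "sample", "unweighted sample",
    "upper bound", "lower bound",
}


def nonstandard_columns(csv_text: str) -> list[str]:
    """Return header names that are not in the standard list."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [name for name in header if name.strip() not in STANDARD_NAMES]


# Made-up example: 'Sample size' is not one of the standard names.
example = "measure,ethnicity,time,value,Sample size\nExample measure,White,2019,71.2,500\n"
print(nonstandard_columns(example))  # ['Sample size']
```

A name that this check flags is not necessarily wrong, but any difference from the standard names needs explaining in the download description.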