Contextual Note: Detailed Analysis of Quality, Representation and Bias

We regard the source petition data used for this analysis as being of high quality. The data is collated and published by the House of Commons Petitions Committee on its website. According to the information published there, steps are taken to ensure that British and UK citizens can sign a petition only once.

More Metrics uses publicly available data published by the Office for National Statistics (ONS) at Output Area (OA) and Lower layer Super Output Area (LSOA) level for its disaggregation process. This is mainly data collected in the decennial censuses (2011 and 2021), and this source data is also regarded as being of the highest quality.

The small area estimates (at LSOA level) that More Metrics derives by disaggregation are obtained using established proprietary techniques developed over a number of years. Researchers using this data should find that it provides useful insight when used alongside other data sources (e.g. market research data, biobank data, etc.). It is, however, important to note that, because of the nature of the disaggregation process and because More Metrics does not have access to any data on the individuals who have signed petitions, we are not able to provide statistical analysis of the accuracy of our small area estimates. The suitability and usefulness of our data is therefore a question for the researcher to consider, assessing it alongside the other data sources available to them.

The data provided covers the whole of the UK at LSOA level. LSOA21 coding is used for England and Wales, and LSOA11 coding is used for Scotland and Northern Ireland. For Scotland and Northern Ireland, only 2011 census data was available for the disaggregation; for England and Wales, census data for both 2011 and 2021 was used.

The nature of petition data means that it is potentially subject to bias. Steps are taken by More Metrics to identify and, where possible, to mitigate bias.
There are three sources of bias to consider:

  • Local skew. Some petitions can have a very local skew (e.g. “save our hospital”), making them unsuitable for inclusion in our UK-wide analysis. These petitions need to be identified and dropped from the analysis at the outset.
  • National / Regional level bias. This is particularly important in the case of Covid-19 because the progression of the pandemic and the response to it varied considerably across the UK. This is most obviously the case at a national level, but it also applies regionally, given the introduction of local tiering and the high profile of some elected mayors (e.g. Andy Burnham in Manchester). All our chosen petitions are subject to this type of bias, and mitigation is applied at parliamentary constituency level to account for it.
  • Geo-demographic bias. Parliamentary petitions are signed online, so the level of access to the internet and to a suitable electronic device will affect signature rates. This is likely to introduce systematic bias into the data based on factors such as age, household income, etc. This bias cannot be mitigated at parliamentary constituency level and needs to be identified and accounted for by the researcher (if required). Options for doing this include profiling at LSOA level using other CDRC datasets to re-weight the results appropriately.

The actions taken by More Metrics at parliamentary constituency level to mitigate the effect of bias are as follows:

  • Petition counts are standardised by Region in England and by Country outside of England. Standardisation involves dividing the petition counts at parliamentary constituency level by the relevant regional (or, outside England, national) average and multiplying by the UK average.
  • Petitions with local bias are identified (after standardisation) by selecting only the subset of petitions at parliamentary constituency level that have an “unambiguous” set of (Spearman rank) correlations with every other chosen petition. The process used for this involves positioning each petition in a five-dimensional “petition space”, with the co-ordinates chosen so that the Euclidean distance between each pair of petitions is as close as possible to 1-r (where r is the correlation coefficient of the pairing). Petitions with total positioning errors above a threshold are dropped one by one until the petitions that remain all have well-defined positions in petition space relative to every other chosen petition. The position co-ordinates also provide the data needed to cluster petition topics into themes, based on a combination of nearest-neighbour distances and petition narratives.
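
The two mitigation steps above can be illustrated with a minimal sketch. This is not the More Metrics implementation, which is proprietary; it is one plausible reading of the description, using NumPy only. The function names, the use of unweighted means for the regional and UK averages, the gradient-descent fitting method, the absolute-error definition of "total positioning error" and the `threshold` value are all assumptions made for illustration.

```python
import numpy as np

def standardise(counts, region):
    """Divide each constituency's counts by its regional average and
    multiply by the UK average (unweighted means: an assumption).
    counts: (n_constituencies, n_petitions); region: one label per row.
    Assumes every regional average is non-zero."""
    counts = np.asarray(counts, dtype=float)
    region = np.asarray(region)
    uk_mean = counts.mean(axis=0)
    out = np.empty_like(counts)
    for r in np.unique(region):
        mask = region == r
        out[mask] = counts[mask] / counts[mask].mean(axis=0) * uk_mean
    return out

def spearman_matrix(x):
    """Spearman rank correlations between the columns of x.
    Ties are ranked in order of appearance, not averaged (a simplification)."""
    ranks = np.argsort(np.argsort(x, axis=0), axis=0).astype(float)
    return np.corrcoef(ranks, rowvar=False)

def embed(target, dims=5, steps=2000, lr=0.01, seed=0):
    """Place points in `dims` dimensions so that pairwise Euclidean
    distances approximate `target`, by plain gradient descent on the
    squared distance errors (a hypothetical choice of fitting method)."""
    rng = np.random.default_rng(seed)
    pos = rng.normal(scale=0.1, size=(target.shape[0], dims))
    for _ in range(steps):
        diff = pos[:, None, :] - pos[None, :, :]
        dist = np.maximum(np.linalg.norm(diff, axis=2), 1e-12)
        err = dist - target
        np.fill_diagonal(err, 0.0)  # a point has no error against itself
        pos -= lr * 2.0 * ((err / dist)[:, :, None] * diff).sum(axis=1)
    return pos

def positioning_errors(pos, target):
    """Total absolute distance error of each point against all others."""
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=2)
    return np.abs(dist - target).sum(axis=1)

def select_petitions(std_counts, threshold, dims=5):
    """Drop the worst-positioned petition one at a time until every
    remaining petition has a total positioning error <= threshold."""
    keep = list(range(std_counts.shape[1]))
    while True:
        target = 1.0 - spearman_matrix(std_counts[:, keep])
        pos = embed(target, dims)
        errors = positioning_errors(pos, target)
        worst = int(np.argmax(errors))
        if errors[worst] <= threshold or len(keep) <= 2:
            return keep, pos
        keep.pop(worst)

if __name__ == "__main__":
    # Synthetic data only: 40 constituencies, 6 petitions, 4 regions.
    rng = np.random.default_rng(1)
    counts = rng.poisson(200, size=(40, 6)).astype(float)
    region = np.repeat(["A", "B", "C", "D"], 10)
    kept, positions = select_petitions(standardise(counts, region), threshold=1.0)
    print(kept, positions.shape)
```

After standardisation, every region has the same average count per petition as the UK as a whole, which removes between-region differences while preserving variation within each region. The returned positions could then be passed to a clustering step, which is not shown here.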
