
Synthetic Household/Accommodation Population for Energy (SHAPE)

The Synthetic Household/Accommodation Population for Energy (SHAPE) enriches the SPENSER household population. SPENSER (Synthetic Population Estimation and Scenario Projection Model) uses Iterative Proportional Fitting (IPF) to create two synthetic populations: one of individuals, at Middle Layer Super Output Area (MSOA) level, and another of households, at Output Area (OA) level. Both are based on 2011 Census data. The populations of individuals and households are evolved in time using a dynamic microsimulation model and, finally, individuals are linked to households. Each SPENSER household record holds the accommodation type, tenure, household type and size, number of rooms and bedrooms, central heating availability, and the NS-SEC and ethnic group of the Household Reference Person. The original SPENSER individual population at MSOA level is not reproduced in the SHAPE data, but each household record contains a household ID that links back to the SPENSER individual data.
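To illustrate the IPF idea mentioned above, here is a minimal two-dimensional sketch. It is not the SPENSER code: the function name `ipf`, the seed table, and the marginal totals are hypothetical values chosen for illustration. The seed table is rescaled alternately to match row totals and column totals until both agree with the targets.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Fit a 2D table to row and column totals by iterative proportional fitting.

    Hypothetical helper for illustration only; assumes the row and column
    targets sum to the same grand total.
    """
    table = [row[:] for row in seed]  # work on a copy of the seed
    for _ in range(max_iter):
        # Step 1: scale each row so that it sums to its target.
        for i, target in enumerate(row_targets):
            s = sum(table[i])
            if s > 0:
                table[i] = [v * target / s for v in table[i]]
        # Step 2: scale each column so that it sums to its target.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in table)
            if s > 0:
                for row in table:
                    row[j] *= target / s
        # Columns match exactly after step 2, so only the rows need checking.
        err = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if err < tol:
            break
    return table


# Made-up example: rows could be tenure classes, columns accommodation types.
seed = [[1.0, 1.0], [1.0, 1.0]]
fitted = ipf(seed, row_targets=[30.0, 70.0], col_targets=[40.0, 60.0])
# fitted is approximately [[12, 18], [28, 42]]
```

With a uniform seed the result is simply the product of the marginals divided by the grand total. A non-uniform seed would carry its association structure into the fitted table.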

In the SHAPE population, each household in the original synthetic population has been enriched with accommodation-related information (floor area, accommodation age, and whether or not the property is connected to the gas network) taken from the Domestic Energy Performance Certificates (EPC) data. Because the geographic zones (local authorities) were often recorded under different names in the SPENSER and EPC datasets, all local authorities were recategorized using the same lookup table provided by the Office for National Statistics (see related datasets for more details) to ensure the quality of the enriched population. EPC information was then assigned to each synthetic household using Propensity Score Matching (PSM) with a kernel density algorithm.
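The matching step can be sketched roughly as follows. This is not the SHAPE implementation; it is a minimal illustration under stated assumptions. The covariates (number of rooms, a detached flag), the floor-area values, the Epanechnikov kernel, the bandwidth, and all function names are hypothetical. The description above does not say whether donors are averaged or sampled, so this sketch samples one EPC donor per household with kernel weights on the propensity-score distance.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.05, epochs=2000):
    """Plain gradient-descent logistic regression; returns [bias, w1, ...]."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            grad[0] += p - yi
            for j, xj in enumerate(xi):
                grad[j + 1] += (p - yi) * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def propensity(w, x):
    # Estimated probability that a record belongs to the synthetic population.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def epanechnikov(u):
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def kernel_match(target_scores, donor_scores, donor_values, bandwidth=0.1, seed=0):
    """Draw one donor value per target, weighted by a kernel on score distance."""
    rng = random.Random(seed)
    indices = range(len(donor_scores))
    matched = []
    for s in target_scores:
        weights = [epanechnikov((s - d) / bandwidth) for d in donor_scores]
        if sum(weights) == 0:
            # No donor inside the bandwidth: fall back to the nearest score.
            idx = min(indices, key=lambda i: abs(donor_scores[i] - s))
        else:
            idx = rng.choices(indices, weights=weights, k=1)[0]
        matched.append(donor_values[idx])
    return matched


# Hypothetical shared covariates: [number of rooms, detached flag].
synthetic = [[3, 0], [5, 1], [4, 1]]
epc = [[3, 0], [4, 0], [5, 1], [6, 1]]
epc_floor_area = [60.0, 75.0, 110.0, 140.0]  # made-up values in square metres

# Label 1 = synthetic household, 0 = EPC record, then score both groups.
w = fit_logistic(synthetic + epc, [1] * len(synthetic) + [0] * len(epc))
synthetic_scores = [propensity(w, x) for x in synthetic]
epc_scores = [propensity(w, x) for x in epc]
matched_area = kernel_match(synthetic_scores, epc_scores, epc_floor_area)
```

Each synthetic household ends up with a floor area borrowed from an EPC record with a similar propensity score. In practice the covariates would be the attributes shared by both datasets, and the bandwidth would need tuning.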

The data files are all synthetic, so they do not reveal information about real people; they are not sensitive data.

All details and code used to generate the Synthetic Household/Accommodation Population for Energy can be found in the Energy Flexibility Project GitHub repository (

Please see the data profile document for further details about the dataset.

University of Leeds

Data and Resources

Release Date
Spatial / Geographical Coverage Location
United Kingdom
Patricia Ternes, André P Neto-Bradley, Ruchi Choudhary, Nick Malleson
Contact Name
Consumer Data Research Centre
Contact Email
Public Access Level
POLYGON ((-5.7972085475922 49.623941359093, -5.7972085475922 55.943122888915, 2.2528141736984 55.943122888915, 2.2528141736984 49.623941359093))