Branscombe censuses

1 Introduction

The censuses included in this collection are the 1841, 1851, 1861, 1871, 1881, 1891 and 1901 censuses. These are the ones for which the original data are available (rather than just summaries).

The electronic versions of the censuses were assembled in the following way. First, printouts of the original census returns [1] were transcribed onto computer files [2]. Next, the raw data was annotated using an XML markup [3]. The result is a set of XML files - the data files - one per census.

For readability, viewable HTML [4] files have been generated from the data in two formats.

2 Version

The present version of these Branscombe censuses is version 2. Version 2 differs from version 1 in the inclusion of the 1901 census.

3 Structure of the data

The broad structure of each census is:

<census>
    year
    <household>+
        rooms-occupied
        in-occupation
        <abode>
        <member>+
where + means 'one or more'. In other words, a census consists of one or more households, each of which is made up of an abode (=house name) and one or more members. The year attribute is the year the census was taken. rooms-occupied and in-occupation are explained below.

Each <member> can contain the following fields:

<member>
    <name>
        <forename>+
        <surname>
    <relation>
    <condition>
    <age>
    <sex>
    <defects>
    <trade>
        standard
        work-status
        standardised-work-status
        at-home
        annotation
    <birthplace>
        standard
        county
        standard-county

<name>, <relation>, <age>, <sex> and <birthplace> are all obligatorily present (except where illegible in the original). The other fields may be absent.

Generally the censuses get more detailed over time. Earlier censuses lack some of the fields listed above. The 1841 census, for instance, does not have relation or condition, and only includes minimal information on birthplace (see §3.13).

Unoccupied houses are recorded in the original censuses and also in the electronic version. These have <abode> but no <member>s. (See also §3.4.)
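
As an illustration of this structure, the following Python sketch reads a small census fragment with the standard library's xml.etree.ElementTree. The fragment is invented (the house names, people and field values are hypothetical), and the function name summarise is an assumption made for this example; the real data files may differ in detail.

```python
import xml.etree.ElementTree as ET

# Invented fragment following the structure described above; not real census data.
SAMPLE = """
<census year="1901">
  <household rooms-occupied="3">
    <abode>Example Cottage</abode>
    <member>
      <name><forename>John</forename><forename>Henry</forename><surname>Sample</surname></name>
      <relation>Head</relation>
      <age>42</age>
      <sex>M</sex>
      <birthplace county="Devon">Branscombe</birthplace>
    </member>
  </household>
  <household>
    <abode>Empty House</abode>
  </household>
</census>
"""

def summarise(xml_text):
    """Return (year, [(abode, [full names])]) for one census document."""
    census = ET.fromstring(xml_text)
    households = []
    for household in census.findall("household"):
        abode = household.findtext("abode", default="")
        names = []
        for member in household.findall("member"):
            forenames = [f.text or "" for f in member.findall("name/forename")]
            surname = member.findtext("name/surname", default="")
            names.append(" ".join(forenames + [surname]).strip())
        households.append((abode, names))
    return census.get("year"), households

year, households = summarise(SAMPLE)
for abode, names in households:
    # A household with an abode but no members is an unoccupied house.
    print(abode, "-", ", ".join(names) if names else "(unoccupied)")
```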

3.1 General approach to the data

The original presentation of the censuses, with all its peculiarities of spelling and phraseology, is interesting in its own right. Also interesting, though, is the possibility of processing the data automatically (by program), for instance by sorting it in ways that differ from the original order, or by searching for particular values (a name, say). In order to be able to process two items (such as two names) that are equivalent, they need to have the same form. If they are different in the original census, they need to be standardised. These two requirements - fidelity to the original and standardisation - are in conflict.

One way of standardising the data is to edit it directly. But this loses the original form. Another way is to add standard forms as annotation, and then to process only the standard forms. In the present collection, names of houses and of people have been kept in their original form, except for spelling out some abbreviations. Other fields have been standardised (on the assumption, for instance, that it is not interesting that 'widower' is sometimes spelt out in full, sometimes written as 'wid', and sometimes written as 'widr').
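
To make the idea of standardisation concrete, here is a minimal Python sketch of mapping variant spellings to one standard form. The mapping table and the function name standardise_condition are hypothetical; this is not the procedure actually used to prepare the data files, only an illustration of the principle.

```python
# Hypothetical table of spellings found in a return and their standard form.
CONDITION_VARIANTS = {
    "wid": "widower",
    "widr": "widower",
    "widower": "widower",
}

def standardise_condition(raw):
    """Map a raw condition string to a standard form.

    Unknown values are returned cleaned (trimmed, lower case) but otherwise as given.
    """
    key = raw.strip().rstrip(".").lower()
    return CONDITION_VARIANTS.get(key, key)

print(standardise_condition("Widr."))
```

Once every value has passed through such a mapping, two records that mean the same thing compare as equal, which is what sorting and searching require.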

Some editing has been done on the <trade> field to standardise the spelling, spacing, and capitalisation of names of trades. For instance, 'Washer woman' and 'Washerwoman' are standardised as 'Washerwoman'. However, no attempt has been made to standardise trade information consisting of several trades (e.g. 'Lacemaker and servant') or where specific details of the trade are given (e.g. 'Farmer of 250 acres employing 5 labourers'). Given that there is only partial standardisation, the <trade> field is only partially searchable in its present form. For the 1901 census, trade information has been entered in both original and standard form (see §3.12).

Information on birth place has been separated into (a) the name of the village, town or city and (b) the name of the county. This allows these parts to be searched separately.

3.2 Rooms occupied

The rooms-occupied attribute of <household> indicates the number of rooms occupied where the number is less than five. In houses with a lot of people in them, this number indicates how crowded the rooms are.
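
One possible measure of crowding is the number of people per occupied room. The Python sketch below assumes that rooms-occupied holds a whole number and is simply absent when no figure was recorded; the function name persons_per_room is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

def persons_per_room(household):
    """People per occupied room, or None where rooms-occupied is not recorded."""
    rooms = household.get("rooms-occupied")
    if rooms is None or int(rooms) == 0:
        return None
    return len(household.findall("member")) / int(rooms)

# Invented household: five members in two rooms.
example = ET.fromstring(
    '<household rooms-occupied="2"><abode>Example Cottage</abode>'
    "<member/><member/><member/><member/><member/></household>"
)
print(persons_per_room(example))
```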

3.3 Not in occupation

The 1901 census has a column marked 'Not in occupation'. Properties that are marked as not in occupation tend to be those that are named but have no one listed as living there. In these cases, 'Not in occupation' is redundant, so it is omitted.

3.4 In occupation

In a small number of cases, the 1901 census records a property as being 'In occupation' even though no one is listed as living there. In these cases, an in-occupation="true" attribute is attached to <household>.
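
Taken together with §3.3, this gives three cases for a household, which the following Python sketch distinguishes. The function name occupancy and the wording of the labels are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

def occupancy(household):
    """Classify a <household> element by whether anyone is listed there."""
    if household.find("member") is not None:
        return "occupied"
    if household.get("in-occupation") == "true":
        return "in occupation, no one listed"
    return "unoccupied"

# Invented household, for illustration only.
example = ET.fromstring(
    '<household in-occupation="true"><abode>Example House</abode></household>'
)
print(occupancy(example))
```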

3.5 Abode

Abode is a house (or farm, pub, etc.) name. In some cases - perhaps where there is no conventional name to distinguish one house from another - it is the enumerator's [5] invention.

The enumerator on the 1871 census often wrote 'Cot' for abode, presumably meaning 'cottage'. We have put this in lower case to distinguish it from the proper name Cotte (and its variant spellings).

3.6 Name

Forenames and surnames are tagged as separate items. Some forenames are initials. Forenames which the enumerator wrote in abbreviated form (but not initials), such as 'Thos.' for 'Thomas', have been expanded to their full form.

3.7 Relation

Censuses are structured in terms of households. Households are seen as having a head and some number of dependents (possibly zero). The head is male apart from cases where the household is headed by a woman whose husband is absent or by a widow or a spinster. The relation field classifies members of the household in terms of their relation to the head. The head himself (sometimes herself) is labelled 'Head'.

3.8 Condition

'Condition' means marital status, including whether the person is a widow(er).

3.9 Age

Most ages are in years. Ages of children are sometimes given to the nearest month, or occasionally the nearest week or day. Examples are given below:

    3             = 3 years
    0.3           = 3 months
    1.3           = 1 year 3 months
    0.0.3         = 3 weeks
    0.0.0.3       = 3 days
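
The dotted notation above can be read as years.months.weeks.days, with trailing parts left out when they are not needed. A minimal Python sketch of reading it follows; the function names parse_age and age_in_days are assumptions made for this example, and the conversion to days (365-day years, 30-day months) is an approximation intended only for sorting.

```python
def parse_age(text):
    """Split a dotted age string into (years, months, weeks, days)."""
    parts = [int(p) for p in text.strip().split(".")]
    if not 1 <= len(parts) <= 4:
        raise ValueError(f"unexpected age format: {text!r}")
    parts += [0] * (4 - len(parts))  # pad the missing trailing parts with zeros
    return tuple(parts)

def age_in_days(text):
    """Approximate age in days, suitable as a sort key only."""
    years, months, weeks, days = parse_age(text)
    return years * 365 + months * 30 + weeks * 7 + days

# Youngest first.
print(sorted(["1.3", "0.3", "3", "0.0.3"], key=age_in_days))
```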

In the original censuses, age appears in one of two columns: age for males, and age for females. This implied sex information is made explicit in the sex field. See §3.10.

In the 1841 census, ages that are multiples of five occur much more often than expected. It appears that the enumerator has rounded to the nearest multiple of five (e.g. 60, 65, 70) where there was a doubt.
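
A quick way to check for this kind of rounding is to measure what share of the recorded ages are exact multiples of five; if ages were spread evenly, roughly one in five would be. The Python sketch below uses invented ages, not figures from the 1841 census, and the function name share_multiples_of_five is an assumption made for this example.

```python
def share_multiples_of_five(ages):
    """Fraction of whole-year ages that are exact multiples of five."""
    if not ages:
        raise ValueError("no ages supplied")
    return sum(1 for a in ages if a % 5 == 0) / len(ages)

# Invented ages, for illustration only.
sample_ages = [20, 25, 30, 31, 35, 40, 40, 47, 50, 60]
print(share_multiples_of_five(sample_ages))
```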

3.10 Sex

This field is not in the original. It is added to allow sorting by sex.

3.11 Defects

All of the censuses have information on what might be called 'defects' (the labelling in the originals varies). The categories used include 'blind', 'cripple', 'deaf', 'idiot', 'imbecile'.

3.12 Trade

In the current version, trade (occupation) information is only partially searchable, because terminology and spelling have not been fully standardised. Nor have the different elements within the <trade> field been marked up separately.

For the 1901 census, the original form of the trade or occupation is recorded, as well as a standardised form (allowing searching). The standard form is recorded using the attribute standard.

In addition, annotations concerning occupation that were added later (apparently by the enumerator) are recorded using the attribute annotation.

3.12.1 Employment status

The 1891 census has columns for recording people's employment status. The categories are 'employer', 'employed', 'neither' and blank. There is no indication of how many people employers employ.

In the electronic version of the 1891 census this field has been omitted. However, for the 1901 census this information is retained using the attribute work-status. Here the most common categories are 'worker', 'own account' and 'employer'. The work-status attribute keeps to the original spelling. To allow for searching on work status, a standard form is recorded in a second attribute, standardised-work-status.

3.12.2 Working at home

The 1901 census records whether a person works at home. Where the enumerator puts 'at home', an attribute at-home="true" is included in the <trade> element.

3.12.3 Example <trade> element

An example of a <trade> element, with all the above attributes, is:

    <trade standard="Shoemaker"
           work-status="own acc"
           standardised-work-status="own account"
           at-home="true"
           annotation="Boot M.">Shoe maker</trade>
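
A program reading this element would typically prefer the standard forms for searching and fall back on the original text where no standard form is recorded. The following Python sketch shows one way of doing so; the function name read_trade and the keys of the returned dictionary are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

TRADE_XML = """<trade standard="Shoemaker"
       work-status="own acc"
       standardised-work-status="own account"
       at-home="true"
       annotation="Boot M.">Shoe maker</trade>"""

def read_trade(trade):
    """Collect the original and standard forms from a <trade> element."""
    original = (trade.text or "").strip()
    return {
        "original": original,
        # Fall back on the original wording where no standard form is given.
        "search_form": trade.get("standard", original),
        "work_status": trade.get("standardised-work-status", trade.get("work-status")),
        "at_home": trade.get("at-home") == "true",
        "annotation": trade.get("annotation"),
    }

info = read_trade(ET.fromstring(TRADE_XML))
print(info["original"], "->", info["search_form"])
```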

3.13 Birthplace

Birthplace has two parts: (a) village, town, or city, (b) county.

In the data files, county is recorded as an attribute of <birthplace>. Where the census returns omitted county information, it has been added in the present electronic version.

Original spellings of placenames are used. Also, original counties are kept to (e.g. St Pancras, Middlesex).

The 1841 census has yes-no categories for recording people born outside the county in question (Devon in the present case) and born in Scotland, Ireland or 'foreign parts'. These have been recorded in the data files as

    <birthplace county="outside Devon"/>
and
    <birthplace>outside England and Wales</birthplace>
respectively.

For the 1901 census, standard and standard-county have been used to record standard forms of the settlement and county respectively, where the enumerator has used non-standard forms.
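
The different forms of <birthplace> can be handled uniformly by preferring the standard attributes where they exist. The Python sketch below assumes this reading of the attributes; the function name read_birthplace is an assumption, and the last example element (with its misspelt place name) is invented.

```python
import xml.etree.ElementTree as ET

def read_birthplace(element):
    """Return (place, county), using standard forms where they are recorded."""
    place = element.get("standard") or (element.text or "").strip() or None
    county = element.get("standard-county") or element.get("county")
    return place, county

examples = [
    '<birthplace county="Devon">Branscombe</birthplace>',
    '<birthplace county="outside Devon"/>',
    "<birthplace>outside England and Wales</birthplace>",
    # Invented 1901-style element with non-standard spellings.
    '<birthplace standard="Sidmouth" county="Devonshire" '
    'standard-county="Devon">Sidmoth</birthplace>',
]
for xml_text in examples:
    print(read_birthplace(ET.fromstring(xml_text)))
```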

4 Views

Two views of the data are provided. One view is organised by household, following the pattern of the original census returns. The order in which households appear reflects the enumerator's walk around the village as he collected the data. This view is useful where the occupancy of a particular house is of interest.

The second view is organised alphabetically by individual. This view is useful when a particular individual or family is of interest.
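
The views in the package are generated from the data files by the stylesheets under style/xsl. Purely as an illustration of the ordering used in the second view, the following Python sketch flattens a census into one row per individual and sorts by surname and then forename. The sample data is invented and the function name individuals_view is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Invented fragment; not real census data.
SAMPLE = """<census year="1901">
  <household>
    <abode>Example Cottage</abode>
    <member><name><forename>Mary</forename><surname>Sample</surname></name></member>
    <member><name><forename>Ann</forename><surname>Sample</surname></name></member>
  </household>
  <household>
    <abode>Other Farm</abode>
    <member><name><forename>William</forename><surname>Example</surname></name></member>
  </household>
</census>"""

def individuals_view(xml_text):
    """Return (surname, forenames, abode) rows in alphabetical order."""
    census = ET.fromstring(xml_text)
    rows = []
    for household in census.findall("household"):
        abode = household.findtext("abode", default="")
        for member in household.findall("member"):
            surname = member.findtext("name/surname", default="")
            forenames = " ".join(f.text or "" for f in member.findall("name/forename"))
            rows.append((surname, forenames, abode))
    return sorted(rows, key=lambda row: (row[0].lower(), row[1].lower()))

for row in individuals_view(SAMPLE):
    print(row)
```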

Although the amount of information recorded increases over the decades, the same format is used for each census. Where information is not available, the relevant columns are blank.

4.1 Information not displayed

A few pieces of information in the data files (.xml) are not displayed. There are two reasons for not displaying them. First, some information was not present in the original. This includes <sex> and the standardised forms of trade (occupation) and employment status. This is information that was added to enable automatic sorting and searching of the data.

The other reason for not displaying information is to save screen space. In particular, <defects> is not displayed because it is usually empty.

However, if one wants to see these pieces of information, one can open up the data files (see §5) with a text editor and search for, e.g., '<defects>'.
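
As an alternative to searching by hand, a short program can list the members for whom a hidden field is filled in. The Python sketch below does this for <defects>; the sample data is invented and the function name members_with_defects is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Invented fragment; not real census data.
SAMPLE = """<census year="1881">
  <household>
    <abode>Example Cottage</abode>
    <member>
      <name><forename>Ann</forename><surname>Sample</surname></name>
      <defects>deaf</defects>
    </member>
    <member>
      <name><forename>Mary</forename><surname>Sample</surname></name>
    </member>
  </household>
</census>"""

def members_with_defects(xml_text):
    """Return (surname, defects) for each member with a non-empty <defects>."""
    census = ET.fromstring(xml_text)
    found = []
    for member in census.iter("member"):
        defects = (member.findtext("defects") or "").strip()
        if defects:
            found.append((member.findtext("name/surname", default=""), defects))
    return found

print(members_with_defects(SAMPLE))
```

To run this against a real data file, the text would be read from, for example, data/census1881.xml instead of the SAMPLE string.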

5 Package

The package is structured as follows:

censuses
  households
    census1841.htm
    census1851.htm
    census1861.htm
    census1871.htm
    census1881.htm
    census1891.htm
    census1901.htm
  individuals
    census1841.htm
    census1851.htm
    census1861.htm
    census1871.htm
    census1881.htm
    census1891.htm
    census1901.htm
data
  census1841.xml
  census1851.xml
  census1861.xml
  census1871.xml
  census1881.xml
  census1891.xml
  census1901.xml
description
  source
    description.xml
  description.htm
style
  css
    census.css
  xsl
    common.xsl
    households.xsl
    individuals.xsl

For viewing the censuses the files of interest are the ones under censuses/households and censuses/individuals. These show the census information arranged by household (as in the original) and by individual respectively.

The data directory contains the actual census data. Some information is contained in these files that is not presented for viewing (because of limits on what can be fitted onto the computer screen).

The description directory contains this file.

Style information, controlling the appearance of the censuses and of this file, is contained in style/css and style/xsl.


Notes

  1. Census returns are held at the Devon Record Office in Exeter.
  2. Data entry was done by John and Dan Ponsford. Other processing and editing was done by Dan Ponsford.
  3. XML = Extensible Markup Language (see www.w3.org).
  4. HTML = Hypertext Markup Language (see www.w3.org).
  5. The enumerator is the person who collected the census data on the ground.

May 2005. Last updated December 2008.