Public Health
Network
 



Understanding Public Health Data

This page gives a basic guide to interpreting and using the most common types of public health analysis. This section also contains links to more detailed information contained in the Analytical Methods section.

More comprehensive guides on interpreting data can be found in:

Epidemiology for the Uninitiated - BMJ Publishing Group 1997
Statistics at Square One - BMJ Publishing Group 1997
Electronic Textbook StatStat 

This section covers the following topics:

Measuring Disease

Rates

Deprivation Scores

Population Sources


Measuring Disease

The most common ways of assessing the amount of disease in a population are to look at morbidity (illness) or mortality (death). There are 2 common methods used to measure morbidity: incidence and prevalence.

Incidence

The incidence of a disease is the rate at which new cases occur in a population at risk during a specific period. 

If the population is stable, the formula used is:

Incidence = Number of new cases / (Population at risk x time during which cases were ascertained)

If the population changes during the period in which new cases are measured, the incidence is calculated using "total person years at risk". This is the sum of the lengths of time for which each person in the population was at risk during the measurement period. The formula is therefore:

Incidence = Number of new cases / Total person years at risk

Example: If there were 17 cases of flu in a school of 1000 pupils over 2 years, this is an incidence of 8.5/1000 per year.
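The school example can be reproduced with a short calculation. This is a minimal sketch: the function name `incidence_rate` is illustrative, and it assumes every pupil was at risk for the full 2 years, so that the person years at risk are simply 1000 x 2.

```python
def incidence_rate(new_cases, person_years):
    """Return new cases per person year at risk."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases / person_years


# School example from the text: 17 cases, 1000 pupils, 2 years.
person_years = 1000 * 2
rate_per_1000 = incidence_rate(17, person_years) * 1000
print(f"{rate_per_1000:.1f} per 1000 per year")
```

If pupils joined or left during the period, `person_years` would instead be the sum of the time each pupil actually spent at risk.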

Prevalence

The proportion of a population who have a disease at a point in time is the prevalence of disease. It is often expressed as a percentage. The 'point in time' can be a single examination [point prevalence], but is often a longer time scale in order to give a better estimate of the numbers with the disease [period prevalence]. 

Prevalence = (cases/population) * 100

Prevalence is used for diseases that are chronic conditions, such as asthma, diabetes, cystic fibrosis.

Example: In a nursing home with 100 residents, 10 people had diabetes in a 12 month period. The prevalence of diabetes in the nursing home was therefore 10%. 
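The nursing home example works out as follows. The function name `prevalence_percent` is illustrative only.

```python
def prevalence_percent(cases, population):
    """Return the percentage of the population who have the disease."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 100


# Nursing home example from the text: 10 residents with diabetes out of 100.
print(f"{prevalence_percent(10, 100):.0f}%")
```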


Rates

It is often useful to look at the incidence and prevalence of diseases in terms of rates. This enables comparisons between different populations.

Crude Rate

A crude rate is a rate that applies to the population as a whole, and that hasn’t been adjusted to account for differences in population structures such as age and sex. It is calculated using this formula:

Crude Rate = Number of Events / Total Number of People in a Population

Example: If a town has a population of 1000, and 25 people die from cancer in one year, the mortality rate is 25/1000/year.

Pros:

  • Crude rates are useful for getting an overall picture of the amount of disease in an area

Cons:

  • Crude rates often mask differences between particular age groups or sexes. If there are differences in the crude rate between areas, it could be because one area has a higher proportion of vulnerable people.

    For example, if area A has a high proportion of elderly people compared with area B, its overall crude death rate will be higher. This may however mask the fact that the premature death rate is actually higher in area B, which had the lower crude death rate.
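The masking effect described above can be seen in a small calculation. The figures for areas A and B below are invented purely for illustration, and treating "premature death" as death under 65 is an assumption of this sketch.

```python
def rate_per_1000(events, population):
    """Return events per 1000 population."""
    return events / population * 1000


def summarise(bands):
    """Return (crude rate, under-65 rate) per 1000 for one area.

    bands maps an age band name to a (deaths, population) pair.
    """
    deaths = sum(d for d, _ in bands.values())
    population = sum(p for _, p in bands.values())
    return rate_per_1000(deaths, population), rate_per_1000(*bands["under_65"])


# Hypothetical areas: A has many more elderly residents than B.
areas = {
    "A": {"under_65": (10, 5000), "65_plus": (250, 5000)},
    "B": {"under_65": (36, 9000), "65_plus": (50, 1000)},
}

for name, bands in areas.items():
    crude, premature = summarise(bands)
    print(f"Area {name}: crude {crude:.1f}, under-65 {premature:.1f} per 1000")
```

With these figures, area A has the higher crude rate even though the under-65 death rate is twice as high in area B.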

Standardised Rates

In order to overcome the problem of a crude rate masking differences in particular age groups, rates can be standardised. There are two types of standardisation: direct and indirect.

Direct Standardisation

Directly standardised rates give an indication of the number of events that would occur in a standard population if that population had the same age-specific rates as the local area. The standard population most commonly used is the European Standard population; however, other populations such as the Avon standard population can also be used.  The rates are calculated per 100,000 and, because the rates are applied to the same population, rates across areas can be compared.

Example: 

Between 1997 and 1999, Bristol Unitary Authority had a directly standardised mortality rate for accidents of 16.7/100,000. This means that if the European Standard population experienced the same age-specific mortality rates as Bristol, its death rate from accidents would be 16.7 per 100,000.

Pros:

  • Unlike indirectly standardised rates, which compare the observed number of events to the expected number of events, (see below) they can be used to compare disease rates across areas and time.  
  • Can be used to assess the relative burden of disease in a population e.g. if there is more heart disease than cancer
  • There is a wide variety of comparable figures including data in the Compendium of Clinical Indicators available that have been standardised using the European Standard population.  

Cons:

  • Requires age specific rates that are not often available at a local level.
  • Rates may not be stable for small numbers of events (approximately <100).

More information: See Analytical Methods section for a worked example. 
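As a rough illustration of direct standardisation, the sketch below applies local age-specific rates to a standard population. The three age bands, the local counts and the standard population figures are all invented for the example; they are not the European Standard population.

```python
def directly_standardised_rate(local_events, local_pop, standard_pop, per=100_000):
    """Apply local age-specific rates to a standard population.

    Each argument is a list with one entry per age band, in the same order.
    """
    expected = sum(
        events / pop * std
        for events, pop, std in zip(local_events, local_pop, standard_pop)
    )
    return expected / sum(standard_pop) * per


# Hypothetical data for three age bands (young, middle, old).
local_events = [5, 20, 100]
local_pop = [20_000, 15_000, 5_000]
standard_pop = [50_000, 35_000, 15_000]

dsr = directly_standardised_rate(local_events, local_pop, standard_pop)
print(f"{dsr:.1f} per 100,000")
```

Because every area is weighted by the same `standard_pop`, rates calculated this way for different areas can be compared with each other.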

Indirect Standardisation

Indirect standardisation compares actual numbers of deaths to expected numbers, adjusting for age and sex.  This produces a ratio which is commonly called a standardised mortality ratio, or SMR. The expected number of deaths is calculated by applying the age- and sex-specific death rates of a larger reference population to the local population. For example, if the analysis is looking at death rates in wards, the reference population could be Avon or England and Wales.

The SMR of the reference population is always 100. A value lower than 100 means that fewer deaths than expected occurred in the local population after adjusting for differences in age and sex; a value higher than 100 means that there have been more deaths than expected.

Example:  

In 1998, Bristol’s SMR from all causes, standardised using the England and Wales average, was 89. This means that there were 89 deaths in Bristol for every 100 that would have been expected had Bristol experienced the England and Wales age- and sex-specific death rates.

Pros:

  • This method does not require local rates, only absolute number of events
  • Indirectly standardised rates are easier to interpret than directly standardised rates, as 100 is the average.

Cons:

  • Areas cannot be directly compared.  If South Gloucestershire UA had an SMR for all causes of 79 and Bristol UA an SMR of 89, it can be said that both UAs have a lower than expected number of deaths compared to the average for England and Wales. However, no assumptions can be made about the actual difference in mortality between South Gloucestershire and Bristol.
  • Unlike directly standardised rates, SMRs give no idea of the actual burden of disease.

More information: See Analytical Methods section for a worked example. 
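The SMR calculation can be sketched as below. The age bands, local population and reference rates are invented and were chosen so that the expected number of deaths comes to 100. A real calculation would normally use age and sex bands together.

```python
def smr(observed_deaths, local_pop, reference_rates):
    """Return observed deaths as a percentage of expected deaths.

    Expected deaths are the reference population's band-specific death
    rates applied to the local population in each band.
    """
    expected = sum(pop * rate for pop, rate in zip(local_pop, reference_rates))
    return observed_deaths / expected * 100


# Hypothetical local population and reference death rates for three age bands.
local_pop = [20_000, 15_000, 5_000]
reference_rates = [0.0005, 0.002, 0.012]  # deaths per person per year

print(f"SMR = {smr(89, local_pop, reference_rates):.0f}")
```

Only the total number of local deaths is needed, which is why the method works where local age-specific rates are not available.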

Confidence Intervals

Confidence intervals or limits give a range of values within which there is a degree of certainty that the true value lies, and are used to assess whether values are significantly different from that of the reference population. This range is required because some variation is likely to occur by chance.

For 95% confidence intervals, we can be 95% certain that the real value (e.g. SMR) lies between the two confidence limits.

For SMRs, if both confidence limits are on the same side of 100 then the result can be termed 'statistically significant': high if both limits are above 100, and low if both limits are below 100.

A small number of events or a small sample population will tend to produce wide confidence intervals, i.e. a large difference between the upper and the lower confidence limit. A large number of events or a large sample population will tend to produce narrow confidence intervals.

Confidence intervals should be used for directly and indirectly standardised rates, and for proportions.

Standardised rates:

Upper confidence limit = rate + 1.96 * (rate / sqrt(number of events))

Lower confidence limit = rate - 1.96 * (rate / sqrt(number of events))

Proportions:

Upper confidence limit = P + 1.96 * sqrt((P * (1 - P)) / sample population)

Lower confidence limit = P - 1.96 * sqrt((P * (1 - P)) / sample population)

Where P is the proportion of cases.

Example:  

In 1997-1999, North Somerset had a directly standardised rate for death from accidents of 12.3/100,000. The upper confidence limit is 14.8 and the lower confidence limit is 9.8, meaning that there is a 95% certainty that the true mortality rate for North Somerset lies between 9.8 and 14.8/100,000.

The England and Wales rate for the same period is 16.7/100,000 with an upper confidence limit of 16.9/100,000 and a lower confidence limit of 16.5/100,000. As the values within the 95% confidence limit of North Somerset UA (9.8-14.8) do not overlap with the England and Wales rate (16.5-16.9), we can say with 95% certainty that North Somerset UA has a significantly lower mortality rate for accidents than England and Wales.

Caution: Confidence intervals for age-standardised rates are unreliable where there are fewer than 50 cases in an area. This is because the formula commonly used is a simplified version that assumes a normal distribution.
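The simplified formulas in this section can be sketched as follows. The function names are illustrative. The number of deaths behind the North Somerset rate is not given here, so the figure of 100 events used below is an assumption chosen only to show the calculation.

```python
import math


def rate_ci(rate, events, z=1.96):
    """Approximate confidence limits for a standardised rate."""
    margin = z * rate / math.sqrt(events)
    return rate - margin, rate + margin


def proportion_ci(p, sample_population, z=1.96):
    """Approximate confidence limits for a proportion p (between 0 and 1)."""
    margin = z * math.sqrt(p * (1 - p) / sample_population)
    return p - margin, p + margin


def intervals_overlap(a, b):
    """Return True if two (lower, upper) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


# A rate of 12.3 per 100,000 based on an assumed 100 events.
print(rate_ci(12.3, 100))

# A prevalence of 10% measured in a sample of 100 people.
print(proportion_ci(0.10, 100))

# Limits quoted in the text for North Somerset and for England and Wales.
print(intervals_overlap((9.8, 14.8), (16.5, 16.9)))
```

The last line prints `False`, matching the conclusion in the example that the two intervals do not overlap.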

Fertility Rates

Fertility rates can be calculated in 2 ways; either taking a 'snap-shot' of current levels of fertility (period) across women of all ages at a particular point in time or by tracking a group or cohort of women as they pass through their reproductive period (usually taken as 15-44). In practice, most fertility rates are calculated using the 'snap-shot' method because they use data which are currently available. However cohort methods are useful in historical analyses examining issues such as completed family size. The following definitions are used to describe fertility. 

General Fertility Rate: Number of live births / Number of women aged 15-44  

Total Fertility Rate: Sum of age specific fertility rates

Gross Reproduction Rate: Average number of daughters produced by a woman during her reproductive life time (sum of age specific fertility rates for female births). 

Net Reproduction Rate:  Average number of daughters, allowing for mortality, who survive to the age of the mother during her reproductive life time (sum of age specific fertility rates for female births of mothers minus age specific mortality rates of daughters).
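The first two definitions can be sketched with invented figures. Two assumptions are made here that are not stated above: the general fertility rate is expressed per 1000 women, and the age specific rates are for 5-year age bands, so their sum is multiplied by 5 to give children per woman.

```python
def general_fertility_rate(live_births, women_15_44, per=1000):
    """Return live births per `per` women aged 15-44."""
    return live_births / women_15_44 * per


def total_fertility_rate(births_by_band, women_by_band, band_width_years=5):
    """Sum the age specific fertility rates across age bands.

    Each rate is births per woman per year in that band, so the sum is
    multiplied by the band width to cover every year of age.
    """
    asfr_sum = sum(b / w for b, w in zip(births_by_band, women_by_band))
    return band_width_years * asfr_sum


# Hypothetical data for the bands 15-19, 20-24, 25-29, 30-34, 35-39, 40-44.
births = [30, 80, 100, 60, 25, 5]
women = [1000, 1000, 1000, 1000, 1000, 1000]

print(f"GFR: {general_fertility_rate(sum(births), sum(women)):.1f} per 1000")
print(f"TFR: {total_fertility_rate(births, women):.2f} children per woman")
```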

 

Birth & Infant Death Rates

Crude Birth Rate: Number of live births / mid-year population

Infant Mortality Rate: Number of infant deaths under 1 year / total live births

Still Birth Rate: Number of intrauterine deaths after 24 weeks / total births

Perinatal Mortality: (Number of still births + deaths under 7 days) / total births

Neonatal Mortality: Number of infant deaths under 28 days / total live births

Postneonatal Mortality: Number of infant deaths aged 28 days to 1 year / total live births
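These definitions can be applied together as in the sketch below. The counts are invented, and expressing each rate per 1000 births is an assumption made for the example.

```python
def infant_death_rates(live_births, still_births, deaths_under_7d,
                       deaths_7_to_27d, deaths_28d_to_1y, per=1000):
    """Return the still birth and infant death rates defined in the text."""
    total_births = live_births + still_births
    neonatal = deaths_under_7d + deaths_7_to_27d
    return {
        "infant": (neonatal + deaths_28d_to_1y) / live_births * per,
        "still_birth": still_births / total_births * per,
        "perinatal": (still_births + deaths_under_7d) / total_births * per,
        "neonatal": neonatal / live_births * per,
        "postneonatal": deaths_28d_to_1y / live_births * per,
    }


# Hypothetical counts for one year in one area.
rates = infant_death_rates(
    live_births=1990, still_births=10,
    deaths_under_7d=6, deaths_7_to_27d=2, deaths_28d_to_1y=4,
)
for name, value in rates.items():
    print(f"{name}: {value:.2f} per 1000")
```

Note that the still birth and perinatal rates use total births (live plus still births) as the denominator, while the other rates use live births only.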


Deprivation Scores

There are three common methods used to estimate deprivation: the Jarman Score, the Townsend Score, and the DETR indicators. Most measures are "ecological"; in other words, they measure geographical populations, not individuals or social groups.

Jarman Score

The Jarman score was developed in the mid-eighties as a measure of General Practice workload and is sometimes used as a proxy for deprivation.  It has been used by the Department of Health to determine additional “deprivation” payments to GPs.  The scores were re-calculated for the 1991 census, using the same census variables as 1981.

The calculation of the Jarman score consists of three stages: data identification, weighting and aggregation.  Eight census variables are used in the calculation.  Each has a weight attached to it.

  • Percentage of people in households who are 65 or over and living alone (weighted at 6.62)

  • Percentage of the people living in households who are under 5 (weighted at 4.64)        

  • Persons in households of one person over 16 with one or more children under 16 as a percentage of all persons in households (weighted at 3.01)

  • Persons in households headed by a person in socio-economic group 11 (unskilled workers) as a percentage of all residents in households (weighted at 3.74)

  •  Economically active persons over 16 unemployed and seeking work (weighted at 3.34)

  • Persons in households with more than 1 person per room as a percentage of all residents in households (weighted at 2.88)

  • Persons aged 1 or over with a usual address one year before the census different from the present usual address as a percentage of total residents (weighted at 2.68)

  • People in households headed by a person born in the new Commonwealth or Pakistan as a percentage of all residents in households (weighted at 2.50)

The mean for England and Wales is 0.  An area with a high score has a greater demand for primary care based on the characteristics of the resident population, than an area with a low score.  Extreme scores are those above 32 (rounded to 30 by the Department of Health).
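The weighting and aggregation stages can be sketched as a weighted sum using the eight weights listed above. This page does not describe how each percentage is put on a common scale, so the sketch assumes each variable has already been transformed and standardised against England and Wales (giving a national mean of 0). The dictionary keys are invented labels, and this is an illustration rather than the official calculation.

```python
# Weights taken from the list above; the key names are illustrative labels.
JARMAN_WEIGHTS = {
    "elderly_alone": 6.62,
    "under_5": 4.64,
    "lone_parent": 3.01,
    "unskilled": 3.74,
    "unemployed": 3.34,
    "overcrowded": 2.88,
    "moved_house": 2.68,
    "born_new_commonwealth": 2.50,
}


def jarman_like_score(standardised):
    """Weight and sum eight already-standardised census variables."""
    missing = set(JARMAN_WEIGHTS) - set(standardised)
    if missing:
        raise KeyError(f"missing variables: {sorted(missing)}")
    return sum(w * standardised[name] for name, w in JARMAN_WEIGHTS.items())


# An area exactly at the national average on every variable scores 0.
average_area = dict.fromkeys(JARMAN_WEIGHTS, 0.0)
print(jarman_like_score(average_area))
```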

Example:  

South Gloucestershire has a Jarman score of -21.38, meaning that it has a lower demand for primary care than the England and Wales mean, whereas Bristol UA has a score of 17.85, higher than the England and Wales mean.

Pros:

  • Can be used for small areas
  • A diverse range of measures makes up the score

Cons:

  • Differences within wards are often masked as there can be great variation of deprivation within a particular ward.
  • The data is 10 years out of date
  • Does not indicate the proportion of people in an area that are deprived  
  • It is biased toward the urban population

More Information: See Analytical Methods section for a worked example. 

DETR 2000

The Department of Transport, Local Government and the Regions has produced several ward and local authority level deprivation scores looking at different aspects of deprivation. The 6 domains used to look at deprivation are:

  • Income
  • Employment
  • Health deprivation and disability
  • Education, skills and training
  • Housing
  • Geographical access to services


In addition, there is an overall score and rank for every ward in England called the Index of Multiple Deprivation 2000 and a supplementary score and rank on Child Poverty. The combined score is a combination of all 6 domains.

Each of the 6 domains uses a variety of data, mainly taken from social services, which are then weighted according to importance and combined to create a score. The score is then used to rank the wards, so that 1 is the most deprived and 8414 is the least deprived in England. It is the rank, and not the score, that should be used.

Example: 

Lawrence Hill ward in Bristol City Council is ranked 133 using the overall score, which means it is the 133rd most deprived ward in England out of a total of 8414 wards. However, it ranks differently when looking at the 6 domains; 108th most deprived for income, 294th most deprived in terms of health.
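Turning scores into ranks can be sketched as below. The ward names and scores are invented, and the sketch assumes that a higher score means greater deprivation, so the highest score receives rank 1. Ties are not handled specially.

```python
def rank_wards(scores):
    """Rank wards so that the highest score gets rank 1 (most deprived)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {ward: rank for rank, ward in enumerate(ordered, start=1)}


# Hypothetical overall scores for four wards.
overall_scores = {"Ward A": 12.4, "Ward B": 55.1, "Ward C": 30.8, "Ward D": 7.9}

for ward, rank in sorted(rank_wards(overall_scores).items(), key=lambda kv: kv[1]):
    print(rank, ward)
```

Running the same function on each domain's scores separately would give the domain ranks, which can differ from the overall rank as in the Lawrence Hill example.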

Pros:

  • Uses recent data which is updateable
  • Can distinguish between different aspects of deprivation
  • Gives the proportion of people that are deprived for the income and employment domains.

Cons:

  • Some of the domains, such as housing, are derived from only a few different sources.
  • The variables are weighted differently, so one source can be more important than another. The justification for the different weightings is not clear.
  • The overall score, the Index of Multiple Deprivation 2000, contains the access domain which does not mirror the other deprivation measures. Access score is more reflective of rural need than urban needs
  • Data not available below ward level.
  • District averages are applied to ward level in certain instances e.g. SMR

Townsend Material Deprivation Score

The Townsend score is derived from four census variables:

unemployment - % of economically active residents aged 16-59/64 who are unemployed;
car ownership - % of private households who do not possess a car;
owner occupation - % of households not owner occupied;
overcrowding - % of private households with more than 1 person per room.

The data are taken from the 1991 census.  The variables combine to form an overall score ranking a particular area relative to others.  The higher the score, the more deprived the area. The average is 0.

Example:  

Based on Avon standardisation, the most affluent area in Avon has a score of –9.0, the most deprived has a score of 9. 

The scores can only be used to rank areas. This is because the actual score has no value i.e. an area with a score of 4 is not twice as deprived as an area with a score of 2.
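The idea of summing standardised scores can be sketched as follows. The four areas and their percentages are invented. Each variable is simply converted to a z-score across the areas being compared and the four z-scores are added; any transformation applied to the variables before standardising is omitted, so this is a simplified illustration rather than the published method.

```python
from statistics import mean, pstdev


def townsend_like_scores(data):
    """Sum z-scores of each variable across areas.

    data maps an area name to a tuple of percentages in a fixed order:
    (unemployment, no car, not owner occupied, overcrowded).
    """
    names = list(data)
    n_vars = len(next(iter(data.values())))
    scores = dict.fromkeys(names, 0.0)
    for j in range(n_vars):
        column = [data[name][j] for name in names]
        mu, sigma = mean(column), pstdev(column)
        for name in names:
            scores[name] += (data[name][j] - mu) / sigma
    return scores


# Hypothetical percentages for four areas.
areas = {
    "Area A": (3, 15, 20, 1),
    "Area B": (6, 30, 35, 3),
    "Area C": (12, 50, 60, 6),
    "Area D": (9, 40, 45, 4),
}

for name, score in sorted(townsend_like_scores(areas).items(), key=lambda kv: kv[1]):
    print(f"{name}: {score:+.2f}")
```

Because each variable is standardised, the scores average 0 across the areas compared, and only their order is meaningful.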

Pros: 

  • It can be used to look at small areas
  • Highly correlated with measures of ill health, e.g. SMRs or limiting long-term illness
  • As it is the sum of standardised scores, it is easy to calculate

Cons:

  • The data is 10 years out of date. In particular, housing tenure has changed significantly since 1991 due to sales of council housing
  • Does not indicate the proportion of people in an area that are deprived
  • It is a better indicator of deprivation of urban areas than rural areas  

More Information: See Analytical Methods section for a worked example.

Deprivation Quintiles 

Deprivation quintiles divide areas into fifths according to some measure of deprivation, and can be used to analyse variations in health between deprived and affluent sections of the population regardless of where they live. The areas can be of varying size, e.g. Health Authorities or enumeration districts.

The national targets use quintiles for life expectancy at Health Authority level, and the teenage conceptions national target uses quintiles at ward level.  
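Assigning areas to quintiles can be sketched as below. The ten areas and their scores are invented, and labelling the least deprived fifth as quintile 1 is an assumption of this example; the opposite convention is equally possible.

```python
from collections import Counter


def assign_quintiles(scores):
    """Split areas into five equal-sized groups by deprivation score.

    Quintile 1 holds the lowest scores and quintile 5 the highest.
    """
    ordered = sorted(scores, key=scores.get)
    n = len(ordered)
    return {area: (i * 5) // n + 1 for i, area in enumerate(ordered)}


# Hypothetical deprivation scores for ten areas.
scores = {f"Area {k}": s for k, s in enumerate(
    [-6.1, -4.0, -2.5, -1.2, -0.3, 0.4, 1.8, 3.3, 5.0, 8.7], start=1)}

quintiles = assign_quintiles(scores)
print(Counter(quintiles.values()))
```

Health outcomes can then be compared between quintile groups rather than between individual areas.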


Population Sources

There are various sources of population data. The main sources that are used by Avon Health Authority are the GP register, ONS estimates, local estimates and census data.

GP register 
This includes all people registered with a GP and can be used to estimate the population of small areas.

Pros:

  • It is the most up to date source of population data, and is believed to be reasonably accurate in Avon.

Cons:

  • Particular groups are likely to be under or over-represented. 

  • The most common reasons are a delay between changing address and registering, and not changing address at all. Approximately 6% of addresses of the GP registered population are wrong. 

    Delays in registering are particularly a problem with newborn babies, and not registering at all is a particular problem with young adults.  

Note: Local Population Estimates are available that attempt to adjust for this. 

ONS estimates 
These are produced annually for the previous year and are available at district level and above. They are broken down by 5-year age-sex bands. These take into account changes in births, deaths and migration. 

Pros:

  • Widely used and recognised

  • Produced nationally using the same methodology

Cons:

  • The data is between 12 and 18 months old by the time it is released

  • It doesn't always take into account local developments such as new housing estates

  • It is not available for small areas such as wards.

More information: National Statistics

ONS projections

ONS projections look at trends in fertility, mortality and population to produce 25 year projections. ONS produces 3 different projections that reflect differing assumptions about the underlying trends.

Pros:

  • Nationally consistent

Cons:

  • The projections do not take local variations into account

  • The data is based on adjusted 1991 Census data and is therefore likely to become more inaccurate as time passes.

  • It is not available for small areas such as wards.

More information: National Statistics  

Local Estimates
Avon Health Authority produce ward level estimates that adjust the GP registered population to match the ONS district level estimates. This is done by calculating and adjusting for differences in 3 population sources:  the GP registered population, the ONS mid-year estimates, and the DETR mid-1998 ward population estimates. 

Pros:

  • This method adjusts for likely errors in the GP registered population

  • Can be used for small areas 

  • Continually updated

Cons:

  • The method relies on the stability of the GP registered population 

  • It is based on the DETR mid-1998 ward population estimates, therefore as time elapses it is likely to become more inaccurate.


More Information:
See Analytical Methods section for a worked example. 
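The general idea of adjusting GP registered ward populations to agree with an ONS district estimate can be sketched with simple pro-rata scaling. This is a deliberately simplified illustration with invented figures. It uses only two of the three sources mentioned above and should not be read as the method Avon Health Authority actually uses.

```python
def scale_to_district_total(gp_ward_counts, ons_district_total):
    """Scale ward counts so that they sum to the district estimate."""
    gp_total = sum(gp_ward_counts.values())
    if gp_total <= 0:
        raise ValueError("GP registered total must be positive")
    factor = ons_district_total / gp_total
    return {ward: count * factor for ward, count in gp_ward_counts.items()}


# Hypothetical district: GP lists total 10,500 but ONS estimates 10,000.
gp_counts = {"Ward 1": 5200, "Ward 2": 3100, "Ward 3": 2200}
adjusted = scale_to_district_total(gp_counts, 10_000)

for ward, estimate in adjusted.items():
    print(f"{ward}: {estimate:.0f}")
```

Pro-rata scaling assumes list inflation is the same in every ward, which is unlikely to hold exactly in practice.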

Census Data
This is collected every 10 years and provides a comprehensive population profile for small areas (enumeration districts). It is also used to allocate funding from central government. The 2001 census results should be available in 2002.

Pros:

  • This is the most accurate and comprehensive collection of population information 

  • It is the only comprehensive source of ethnicity data.

Cons:

  • Only collected every 10 years

  • The 1991 census is thought to have undercounted the population, particularly 16-24 year olds, due to issues surrounding the poll tax.

More information about Census 2001

 


Public Health Network, NHS Bristol, South Plaza, Bristol BS1 3NX
carl.muldoon@bristol.nhs.uk