Rare Condition Registration Statistics updated to 2022

Rare condition registration data, quality and methodology


Update frequency

These statistics will be updated annually.

Note that not all conditions will be updated in every release, nor will all conditions be updated to the same point in time. This is due to the disparate nature of the data sources; for further information, see the ‘Data Sources’ section below.


Data sources

Rare cancers

The cohort of persons with rare cancers is extracted from the National Cancer Registration Dataset.

Cancer registration is the systematic collection of data about cancer and tumour diagnoses. In England, this data collection is managed by the National Disease Registration Service (NDRS) in NHS England. Every year, the NDRS collects information on over 300,000 new cases of cancer, including patient details, as well as detailed data about the type of cancer, how advanced it is and the treatment the patient receives. The registrations are made using the ICD-O-3 coding system. Over 99% of these registrations can be coded to the ICD-10 coding system for values between C00.0 and D48.9.

Data are submitted to the NDRS from a range of healthcare providers and other services (for example, histopathology and haematology services, radiotherapy departments, screening services and general practitioners). The National Cancer Registration and Analysis Service (NCRAS), which is a part of NDRS, then uses these multiple sources to build a comprehensive picture of cancer incidence in England, as well as other detailed analysis and interpretations covering the entire cancer pathway on all patients in England. For more information, please see the Data Resource Profile for NCRAS.

The data are validated and processed to ensure that they are accurate, consistent and of a high standard. Once all the expected records for any one incidence year have been received and validated, NCRAS takes a snapshot of the dataset, which provides a single, consistent source of cancer registrations. The snapshot will vary year on year due to the dynamic nature of registration data:

  • new cancer cases can be registered, including “late” registrations made after cancer incidence has been published for that year
  • cancer records can be amended, for example the site code of a record can be modified when more accurate information becomes available
  • cancer records can be cancelled (although this is uncommon)
  • patients can exercise their right to opt out of the cancer registration datasets (a less common reason for changes to historical data)

Rare congenital anomalies

The cohort of persons with congenital anomalies is extracted from the National Congenital Anomaly and Rare Disease Registration Service (NCARDRS) dataset, as described in the cohort profile. The conditions are selected from those that may be detected as part of the Fetal Anomaly Screening Programme, details of which are available from the gov.uk website.

Congenital anomalies are defined as being present at delivery, likely originating before birth, and include structural, chromosomal and genetic conditions. Data are collected broadly in accordance with definitions and guidelines of the European Surveillance of Congenital Anomalies (EUROCAT). Congenital anomalies are coded using the International Classification of Diseases version 10 (ICD-10) with British Paediatric Association (BPA) extension, which gives supplementary one-digit extensions to ICD-10 codes to allow greater specificity of coding. For more information about data collection, definitions and coding, see the Technical details document which accompanies the NCARDRS Congenital Anomalies Statistics report.

Births of any outcome are considered for the birth prevalence statistics presented in the Congenital Anomaly Official Statistics. The prevalence calculations here, however, include only those who were born alive and are alive at the index date.

Other rare conditions

For the (non-cancer, non-congenital anomaly) rare diseases presented there are multiple discrete methods of data collection:

Rare Autoimmune Rheumatic Diseases

This group of diseases is based on the Rare Autoimmune Rheumatic Disease (RAIRD) cohort, which was collected as part of the RECORDER project using Hospital Episode Statistics data. The date of the first mention of a disease of interest is taken as the date of entry to the cohort. Patients receiving care outside secondary care (or receiving inpatient care where their rare disease is not coded) may be excluded from the prevalent cohort, meaning that the reported prevalence estimates for these conditions are likely to be underestimates. The excluded patients may have disease of a different nature from those treated in secondary care (possibly less advanced), or may be treated in secondary care without having had an episode of admitted patient care that led to coding of their rare disease.

Inherited Metabolic Conditions

NDRS collects data from Highly Specialised Services (HSS) and Inherited Metabolic Disease (IMD) treatment centres about patients who have, or are suspected to have, IMD conditions. IMDs are genetic, inherited disorders of metabolism. Because data come only from these services, patients receiving care for a disease outside the HSS may be excluded from the prevalent cohort.

There are more than 600 individual IMD conditions. Although individual metabolic conditions are rare, collectively they are a considerable cause of morbidity and mortality. The rarity and complex nature of IMD requires an integrated specialised clinical and laboratory service to provide satisfactory diagnosis and management. The treatment services aim to identify and diagnose patients who are suspected of having an IMD, to improve life expectancy and quality of life.

The period covered by data we have received from each Highly Specialised Service (HSS) varies. The table below lists each of the HSS we have received data from and the period the submissions cover.

| Organisation name | Notes | Earliest data submission | Latest data submission |
| --- | --- | --- | --- |
| Alder Hey Children's NHS Foundation Trust | Paediatrics | 26 March 2024 | 26 March 2024 |
| Birmingham Women's and Children's NHS Foundation Trust | Paediatrics | 30 July 2020 | 26 August 2022 |
| Bradford Teaching Hospitals NHS Foundation Trust | Paediatrics | 14 July 2020 | 22 February 2024 |
| Cambridge University Hospitals NHS Foundation Trust | Adults | 27 April 2020 | 08 August 2023 |
| Cambridge University Hospitals NHS Foundation Trust | Paediatrics | 20 December 2023 | 20 December 2023 |
| Great Ormond Street Hospital for Children NHS Foundation Trust | Paediatrics | 01 February 2022 | 12 May 2023 |
| Guy's and St Thomas' NHS Foundation Trust | Paediatrics | 28 February 2020 | 15 November 2023 |
| Guy's and St Thomas' NHS Foundation Trust | Adults | 01 September 2021 | 15 January 2024 |
| King's College Hospital NHS Foundation Trust | National Acute Porphyria Centre | 28 September 2021 | 09 May 2023 |
| Liverpool University Hospitals NHS Foundation Trust | National Alkaptonuria Centre | 06 May 2020 | 15 November 2023 |
| Manchester University NHS Foundation Trust | Paediatrics | 01 November 2016 | 07 October 2021 |
| Norfolk and Norwich University Hospitals NHS Foundation Trust | Paediatrics | 17 February 2020 | 22 November 2023 |
| North Bristol NHS Trust | Adults | 04 May 2020 | 13 October 2022 |
| Northern Care Alliance NHS Foundation Trust | Adults | 17 August 2021 | 31 May 2024 |
| Nottingham University Hospitals NHS Trust | Paediatrics | 04 January 2024 | 04 January 2024 |
| Oxford University Hospitals NHS Foundation Trust | Paediatrics | 02 April 2020 | 19 October 2022 |
| Oxford University Hospitals NHS Foundation Trust | Adults | 05 October 2021 | 22 March 2023 |
| Royal Free London NHS Foundation Trust | Lysosomal Storage Disorder Unit | 17 March 2020 | 09 November 2023 |
| Sheffield Children's NHS Foundation Trust | | 16 July 2021 | 23 February 2024 |
| Sheffield Teaching Hospitals NHS Foundation Trust | | 08 May 2024 | 08 May 2024 |
| The Newcastle upon Tyne Hospitals NHS Foundation Trust | Adults | 15 November 2022 | 15 November 2022 |
| The Newcastle upon Tyne Hospitals NHS Foundation Trust | Paediatrics | 26 July 2024 | 26 July 2024 |
| University College London Hospitals NHS Foundation Trust | | 18 December 2019 | 21 February 2024 |
| University Hospital of Wales | National Acute Porphyria Centre | 10 November 2021 | 20 November 2023 |
| University Hospital Southampton NHS Foundation Trust | Paediatrics | 13 March 2020 | 08 August 2023 |
| University Hospitals Birmingham NHS Foundation Trust | | 28 February 2020 | 22 December 2023 |
| University Hospitals Bristol and Weston NHS Foundation Trust | | 11 June 2021 | 11 June 2021 |

Note that the submission from Cambridge University Hospitals NHS Foundation Trust was coded only to ICD-10 level, and we have not yet been able to apply Orphacodes to the diagnoses in this submission. These data have therefore not been included in the final dataset.

The diagnoses are received as free text and are assigned a relevant Orphacode by the rare disease team. It is preferred that specific disorder or disorder subtype Orphacodes are selected. However, in some cases a group of disorders Orphacode must be used where a specific diagnosis cannot be determined from the free text received. The Orphanet Rare Disease Ontology (ORDO) is a structured vocabulary for rare diseases derived from the Orphanet database, capturing relationships between diseases, genes and other relevant features (see https://www.orpha.net/).

The data has been through a cleaning process, and the patient details were verified by linking to each patient’s details on the NHS Personal Demographics Service (PDS). This linkage occurred in June 2024.

Further information about IMD service commissioning can be found on the NHS commissioning page ‘E06. Metabolic disorders’.

Neurofibromatosis Type 2

Data are from the four highly specialised commissioned services: Cambridge University Hospitals NHS Foundation Trust, Guy's and St Thomas’ NHS Foundation Trust, Manchester University NHS Foundation Trust and Oxford University Hospitals NHS Foundation Trust. Patients receiving care for a disease outside the HSS may therefore be excluded from the prevalent cohort. Further information about the services can be found on the NHS commissioning page ‘Highly specialised services’.

Defining the conditions

Rare cancers

Rare cancers have been defined in line with the definition used by the US National Cancer Institute’s Surveillance, Epidemiology and End Results (SEER) Program.

Because the ICD-O-3 classification was implemented from 2013, some tumour site definitions differ between tumours diagnosed in 1995-2012 and those diagnosed in 2013-2020. For the affected groups, ICD-10/ICD-O-2 is used for 1995-2012 and ICD-O-3 for 2013-2020. Using the ICD-O-3 definitions from 2013 ensures that the tumour groupings follow the most up-to-date classification.

Please download the table below to view the full tumour grouping details:

Rare congenital anomalies

Congenital anomalies are selected from conditions that may be detected as part of the Fetal Anomaly Screening Programme. They are defined as being present at delivery, likely originating before birth, and include structural, chromosomal and genetic conditions. Data are collected broadly in accordance with definitions and guidelines of the European Surveillance of Congenital Anomalies (EUROCAT). Congenital anomalies are coded using the International Classification of Diseases version 10 (ICD-10) with British Paediatric Association (BPA) extension, which gives supplementary one-digit extensions to ICD-10 codes to allow greater specificity of coding.

The congenital anomaly prevalence estimates include only patients whose age is less than the length of the observation period (for example, where the index date is 31 December 2021, this would be children under 4 years of age). This is because national congenital anomaly registration has only been in place for babies born since 1 January 2018. The point prevalence estimates presented represent all individuals born between 1 January 2018 and the index date who were still alive on the index date.
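The inclusion rule above can be sketched in Python as a simple eligibility check. This is an illustration only, not the published R function: the function name, its parameters, and the choice to treat a death on the index date itself as "not alive" are assumptions made for the example.

```python
from datetime import date
from typing import Optional

# Start of national congenital anomaly registration, as stated in the text.
REGISTRATION_START = date(2018, 1, 1)


def in_prevalent_cohort(
    birth_date: date,
    death_date: Optional[date],
    index_date: date,
    born_alive: bool = True,
) -> bool:
    """Return True if a person counts towards point prevalence on index_date."""
    if not born_alive:
        return False
    # Born within the observation period (registration start to index date).
    if not (REGISTRATION_START <= birth_date <= index_date):
        return False
    # Alive on the index date: no death recorded, or death after the index date.
    # Treating a death on the index date as "not alive" is an assumption.
    return death_date is None or death_date > index_date
```

For an index date of 31 December 2021, a child born on 1 May 2019 with no recorded death is included, while a child born on 31 December 2017 is not.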


Analytical code repository

The calculation of prevalence estimates from the cohort for each condition is performed in R using a standard function. This function is publicly available on GitHub as part of the “NDRSanalysis” R package.

Other rare conditions

Inherited Metabolic Disorders

The diagnoses are received as free text and are assigned a relevant Orphacode by the rare disease team. It is preferred that specific disorder or disorder subtype Orphacodes are selected. However, in some cases a group of disorders Orphacode must be used where a specific diagnosis cannot be determined from the free text received.

The data has been through a cleaning process, and the patient details were verified by linking to each patient’s details on the NHS Personal Demographics Service (PDS). This linkage occurred in June 2024.

Date of diagnosis is not always provided. Where it is missing, the date the patient was first seen by the HSS (‘clinic date’) is taken as the diagnosis date. If there is also no ‘clinic date’, the date the patient was reported to NCARDRS is used.
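The fallback order described above could be expressed as follows. This is a minimal Python sketch rather than the code actually used; the function and parameter names are hypothetical.

```python
from datetime import date
from typing import Optional


def effective_diagnosis_date(
    diagnosis_date: Optional[date],
    clinic_date: Optional[date],
    report_date: Optional[date],
) -> Optional[date]:
    """Pick the first available date in the order given in the text:
    diagnosis date, then clinic date, then date reported to NCARDRS."""
    for candidate in (diagnosis_date, clinic_date, report_date):
        if candidate is not None:
            return candidate
    return None  # no usable date at all
```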

For patients who appear in the dataset more than once, the most recent codable diagnosis is selected as the final diagnosis, and the earliest diagnosis date (or clinic or report date, if the diagnosis date is not available) is selected as the diagnosis date. This follows the process used by the RECORDER team for establishing the RECORDER cohort.
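A rough Python sketch of this de-duplication step is given below. The record layout is hypothetical. In particular, the `submitted` field used to decide which diagnosis is "most recent", and the choice to take the earliest date across all of a patient's records (codable or not), are assumptions that the text does not spell out.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Record:
    patient_id: str
    orphacode: Optional[str]         # None when the free text could not be coded
    submitted: date                  # hypothetical field used to order records
    diagnosis_date: Optional[date]   # diagnosis, clinic or report date


def collapse(records: Iterable[Record]) -> Dict[str, Tuple[Optional[str], Optional[date]]]:
    """Return one (final Orphacode, diagnosis date) pair per patient."""
    by_patient: Dict[str, List[Record]] = {}
    for rec in records:
        by_patient.setdefault(rec.patient_id, []).append(rec)

    result: Dict[str, Tuple[Optional[str], Optional[date]]] = {}
    for pid, recs in by_patient.items():
        # Most recent codable diagnosis becomes the final diagnosis.
        codable = [r for r in recs if r.orphacode is not None]
        final_code = max(codable, key=lambda r: r.submitted).orphacode if codable else None
        # Earliest available date becomes the diagnosis date.
        dates = [r.diagnosis_date for r in recs if r.diagnosis_date is not None]
        result[pid] = (final_code, min(dates) if dates else None)
    return result
```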

The following patients are removed from the IMD dataset before it is run through the prevalence function:

  • patients without a diagnosis that can be assigned an Orphacode.
  • patients with a ‘Suspected’ diagnosis.
  • patients who could not be verified by linking to the PDS.
  • patients who are not resident in England. Residence is determined using the postcode returned by linking the patient to their details on the PDS. If a postcode is not identified on the PDS, the patient’s most recent valid postcode in the original data is used. Patients with neither a PDS postcode nor a valid postcode in the original data are removed, as it cannot be determined whether they are resident in England.
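The exclusion rules listed above can be sketched as a single filter. The Python below is illustrative only: the `Patient` fields are hypothetical, and the postcode-to-country lookup is passed in as a caller-supplied function (`is_in_england`) because the text does not describe how that lookup is performed.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Patient:
    orphacode: Optional[str]         # None if no Orphacode could be assigned
    suspected: bool                  # True for a 'Suspected' diagnosis
    pds_verified: bool               # True if linkage to the PDS succeeded
    pds_postcode: Optional[str]      # postcode returned by the PDS, if any
    source_postcode: Optional[str]   # most recent valid postcode in the original data


def keep_patient(p: Patient, is_in_england: Callable[[str], bool]) -> bool:
    """Return True if the patient stays in the dataset for prevalence estimation."""
    if p.orphacode is None or p.suspected or not p.pds_verified:
        return False
    # Prefer the PDS postcode; fall back to the postcode in the original data.
    postcode = p.pds_postcode or p.source_postcode
    if not postcode:
        return False  # residence in England cannot be determined
    return is_in_england(postcode)
```

A caller would supply `is_in_england` from whatever postcode reference data is available to them.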

Due to the small numbers of patients with certain diseases and the extensive number of diseases in the IMD data, certain diseases have been grouped using the ORDO ontology structure.

Using input from clinical leads, certain disease groups or specific diseases have been selected to be included in this year’s prevalence estimates publication.

The counts for each selected disease grouping will contain the counts of all diseases and disease groupings which are ‘children’ of the selected disease grouping. These counts may also include diseases or disease groups which are not listed in the prevalence tool.

ORDO allows diseases to have multiple ‘parent’ codes. As a result, some patients are counted in more than one disease grouping.

For example:

| Patient ID | Diagnosis |
| --- | --- |
| Patient 1 | Hyperinsulinism-hyperammonemia syndrome (35878) |

The ORDO classification places this disease on two paths (each entry is the ‘parent’ of the entry indented beneath it):

Rare inborn errors of metabolism ORPHA:68367

  • Disorder of amino acid and other organic acid metabolism ORPHA:79062
    • Disorder of urea cycle metabolism and ammonia detoxification ORPHA:79167
      • Hyperinsulinism-hyperammonemia syndrome ORPHA:35878

Rare inborn errors of metabolism ORPHA:68367

  • Other metabolic disease ORPHA:91088
    • Congenital isolated hyperinsulinism ORPHA:657
      • Diazoxide-sensitive diffuse hyperinsulinism ORPHA:165985
        • Hyperinsulinism-hyperammonemia syndrome ORPHA:35878

From these the selected diseases and disease groupings are:

  • Disorder of amino acid and other organic acid metabolism
  • Disorder of urea cycle metabolism and ammonia detoxification
  • Other metabolic disease

The disease is a ‘child’ of both Disorder of amino acid and other organic acid metabolism and Other metabolic disease, so it will be counted under both of these groups.

Disorder of urea cycle metabolism and ammonia detoxification is the ‘child’ of Disorder of amino acid and other organic acid metabolism, and so Hyperinsulinism-hyperammonemia syndrome is counted under both these disease groupings.

A patient with this diagnosis would appear in the data as:

| Patient ID | Diagnosis |
| --- | --- |
| Patient 1 | Disorder of amino acid and other organic acid metabolism |
| Patient 1 | Disorder of urea cycle metabolism and ammonia detoxification |
| Patient 1 | Other metabolic disease |

And the counts for this one patient would be:

| Diagnosis | Count | Diagnosis subtype | Count |
| --- | --- | --- | --- |
| Disorder of amino acid and other organic acid metabolism | 1 | | |
| | | Disorder of urea cycle metabolism and ammonia detoxification | 1 |
| Other metabolic disease | 1 | | |
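The worked example can be reproduced with a short Python sketch that walks up the parent links and counts the patient once under every selected grouping. This is not the published code. The parent links below are transcribed from the example classification on the assumption that each listed entry is the parent of the next, and the names `PARENTS`, `SELECTED`, `ancestors_and_self` and `count_by_grouping` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set

# Parent links (ORPHA codes) transcribed from the worked example.
PARENTS: Dict[str, List[str]] = {
    "35878": ["79167", "165985"],  # Hyperinsulinism-hyperammonemia syndrome
    "79167": ["79062"],            # Disorder of urea cycle metabolism ...
    "79062": ["68367"],            # Disorder of amino acid ... metabolism
    "165985": ["657"],             # Diazoxide-sensitive diffuse hyperinsulinism
    "657": ["91088"],              # Congenital isolated hyperinsulinism
    "91088": ["68367"],            # Other metabolic disease
}

# Groupings selected for publication in the example.
SELECTED: Set[str] = {"79062", "79167", "91088"}


def ancestors_and_self(code: str) -> Set[str]:
    """Collect a code and everything above it, following every parent link."""
    seen: Set[str] = set()
    stack = [code]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def count_by_grouping(diagnoses: Dict[str, str]) -> Counter:
    """Count each patient once under every selected grouping above their diagnosis."""
    counts: Counter = Counter()
    for code in diagnoses.values():
        counts.update(ancestors_and_self(code) & SELECTED)
    return counts
```

Calling `count_by_grouping({"Patient 1": "35878"})` gives a count of 1 for each of the three selected groupings, matching the table above. Using a set of ancestors means a patient is counted only once per grouping even when several paths lead to it.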

   

The Orphanet Rare Disease Ontology (ORDO) is a structured vocabulary for rare diseases derived from the Orphanet database, capturing relationships between diseases, genes and other relevant features.

ORDO version 4.5, downloaded from the Orphanet website, was used.



Last edited: 16 June 2025 11:56 am