Publication, Part of Rare Condition Registration Statistics
Rare Condition Registration Statistics updated to 2022
Rare condition registration data, quality and methodology
Update frequency
These statistics will be updated annually.
Note that not all conditions will be updated in every release, nor will all conditions be updated to the same point in time. This is due to the disparate nature of the data sources; for further information, see the ‘Data Sources’ section below.
Data sources
Rare cancers
The cohort of persons with rare cancers is extracted from the National Cancer Registration Dataset.
Cancer registration is the systematic collection of data about cancer and tumour diagnoses. In England, this data collection is managed by the National Disease Registration Service (NDRS) in NHS England. Every year, the NDRS collects information on over 300,000 new cases of cancer, including patient details, as well as detailed data about the type of cancer, how advanced it is and the treatment the patient receives. The registrations are made using the ICD-O-3 coding system. Over 99% of these registrations can be coded to the ICD-10 coding system for values between C00.0 and D48.9.
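As a rough illustration of the ICD-10 range quoted above, the sketch below checks whether a code falls between C00.0 and D48.9. The function name and the accepted code formats are assumptions made for this example; it is not part of any NDRS tooling.

```python
import re

# Hypothetical helper for illustration; not part of any NDRS tooling.
# Accepts codes written as "C50.9", "C509" or "C50".
_ICD10_CODE = re.compile(r"^([A-Z])(\d{2})(?:\.?(\d))?$")


def in_registration_range(code: str) -> bool:
    """Return True if an ICD-10 code lies between C00.0 and D48.9 inclusive."""
    match = _ICD10_CODE.match(code.strip().upper())
    if match is None:
        return False  # not a recognisable three- or four-character code
    letter, category = match.group(1), int(match.group(2))
    # Every fourth-character value (.0 to .9) of categories C00 to D48 falls
    # inside the range, so only the letter and category need comparing.
    return ("C", 0) <= (letter, category) <= ("D", 48)
```

Comparing the (letter, category) pair as a tuple keeps the boundary logic in one place: any C category qualifies, and D categories qualify up to and including D48.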
Data are submitted to the NDRS from a range of healthcare providers and other services (for example, histopathology and haematology services, radiotherapy departments, screening services and general practitioners). The National Cancer Registration and Analysis Service (NCRAS), which is a part of NDRS, then uses these multiple sources to build a comprehensive picture of cancer incidence in England, as well as other detailed analysis and interpretations covering the entire cancer pathway on all patients in England. For more information, please see the Data Resource Profile for NCRAS.
The data are validated and processed to ensure that they are accurate, consistent and of a high standard. Once all the expected records for any one incidence year have been received and validated, NCRAS takes a snapshot of the dataset, which provides a single, consistent source of cancer registrations. The snapshot will vary year on year due to the dynamic nature of registration data:
- new cancer cases will be registered, which can include “late” registrations made after cancer incidence has been published for that year
- cancer records can be amended, for example the site code of a record can be modified when more accurate information becomes available
- cancer records can be cancelled (although this is uncommon)
- a patient can exercise their right to opt out of the cancer registration datasets (a less common reason for changes to historical data)
Rare congenital anomalies
The cohort of persons with congenital anomalies is extracted from the National Congenital Anomalies and Rare Disease Registration Service (NCARDRS) dataset, as described in the cohort profile. The conditions are selected from those that may be detected as part of the Fetal Anomaly Screening Programme, details of which are available from the gov.uk website.
Congenital anomalies are defined as being present at delivery, likely originating before birth, and include structural, chromosomal and genetic conditions. Data are collected broadly in accordance with definitions and guidelines of the European Surveillance of Congenital Anomalies (EUROCAT). Congenital anomalies are coded using the International Classification of Diseases version 10 (ICD-10) with British Paediatric Association (BPA) extension, which gives supplementary one-digit extensions to ICD-10 codes to allow greater specificity of coding. For more information about data collection, definitions and coding, see the Technical details document which accompanies the NCARDRS Congenital Anomalies Statistics report.
Births of any outcome are considered for the birth prevalence statistics presented in the Congenital Anomaly Official Statistics. For the prevalence calculations presented here, however, only individuals who were born alive and were alive at the index date are included.
Other rare conditions
For the other rare diseases presented (those that are neither cancers nor congenital anomalies), data are collected by several distinct methods:
Rare Autoimmune Rheumatic Diseases
One group of diseases is based on the Rare Autoimmune Rheumatic Disease (RAIRD) cohort. This cohort was collected as part of the RECORDER project using Hospital Episode Statistics data, with the first mention of a disease of interest taken as the date of entry to the cohort. Patients receiving care outside secondary care, or receiving inpatient care in which their rare disease is not coded, may be excluded from the prevalent cohort, meaning that the reported prevalence estimates for these conditions are likely to be underestimates. The excluded patients may have disease of a different nature from that of patients treated in secondary care (possibly less advanced), or may be treated in secondary care without having had an episode of admitted patient care that led to coding of their rare disease.
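The "first mention" rule for cohort entry can be sketched as follows. This is a minimal illustration rather than the RECORDER project's code: the episode layout, the function name and the single placeholder ICD-10 code are all assumptions made for the example.

```python
from datetime import date

# Illustrative placeholder only; this is not the RECORDER code list.
CODES_OF_INTEREST = {"M31.3"}


def cohort_entry_dates(episodes, codes_of_interest):
    """Map each patient to the date of their earliest episode that
    mentions any code of interest in any diagnosis position."""
    entry = {}
    for patient_id, episode_date, diagnosis_codes in episodes:
        if codes_of_interest.isdisjoint(diagnosis_codes):
            continue  # no disease of interest coded in this episode
        if patient_id not in entry or episode_date < entry[patient_id]:
            entry[patient_id] = episode_date
    return entry


# Made-up episodes: (patient ID, episode date, diagnosis codes).
episodes = [
    ("P1", date(2019, 6, 1), ["J18.9"]),            # rare disease not coded
    ("P1", date(2020, 3, 15), ["J18.9", "M31.3"]),  # first mention
    ("P1", date(2021, 1, 10), ["M31.3"]),
    ("P2", date(2020, 8, 2), ["I10"]),              # never coded
]
entries = cohort_entry_dates(episodes, CODES_OF_INTEREST)
```

Patient P2 never has the disease coded in an admitted episode and so never enters the cohort, which is the mechanism behind the likely underestimate described above.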
Inherited Metabolic Conditions
NDRS collects data from Highly Specialised Services (HSS) and Inherited Metabolic Disease (IMD) treatment centres about patients who have, or are suspected to have, IMD conditions. IMDs are genetic, inherited disorders of metabolism. Patients receiving care for a disease outside the HSS may therefore be excluded from the prevalent cohort.
There are more than 600 individual IMD conditions. Although metabolic conditions are individually rare, collectively they are a considerable cause of morbidity and mortality. The rarity and complex nature of IMDs require an integrated specialised clinical and laboratory service to provide satisfactory diagnosis and management. The treatment services aim to identify and diagnose patients who are suspected of having an IMD, to improve life expectancy and quality of life.
The period covered by data we have received from each Highly Specialised Service (HSS) varies. The table below lists each of the HSS we have received data from and the period the submissions cover.

| Organisation name | Notes | Earliest data submission | Latest data submission |
|---|---|---|---|
| Alder Hey Children's NHS Foundation Trust | Paediatrics | 26 March 2024 | 26 March 2024 |
| Birmingham Women's and Children's NHS Foundation Trust | Paediatrics | 30 July 2020 | 26 August 2022 |
| Bradford Teaching Hospitals NHS Foundation Trust | Paediatrics | 14 July 2020 | 22 February 2024 |
| Cambridge University Hospitals NHS Foundation Trust | Adults | 27 April 2020 | 08 August 2023 |
| Cambridge University Hospitals NHS Foundation Trust | Paediatrics | 20 December 2023 | 20 December 2023 |
| Great Ormond Street Hospital for Children NHS Foundation Trust | Paediatrics | 01 February 2022 | 12 May 2023 |
| Guy's and St Thomas' NHS Foundation Trust | Paediatrics | 28 February 2020 | 15 November 2023 |
| Guy's and St Thomas' NHS Foundation Trust | Adults | 01 September 2021 | 15 January 2024 |
| King's College Hospital NHS Foundation Trust | National Acute Porphyria Centre | 28 September 2021 | 09 May 2023 |
| Liverpool University Hospitals NHS Foundation Trust | National Alkaptonuria Centre | 06 May 2020 | 15 November 2023 |
| Manchester University NHS Foundation Trust | Paediatrics | 01 November 2016 | 07 October 2021 |
| Norfolk and Norwich University Hospitals NHS Foundation Trust | Paediatrics | 17 February 2020 | 22 November 2023 |
| North Bristol NHS Trust | Adults | 04 May 2020 | 13 October 2022 |
| Northern Care Alliance NHS Foundation Trust | Adults | 17 August 2021 | 31 May 2024 |
| Nottingham University Hospitals NHS Trust | Paediatrics | 04 January 2024 | 04 January 2024 |
| Oxford University Hospitals NHS Foundation Trust | Paediatrics | 02 April 2020 | 19 October 2022 |
| Oxford University Hospitals NHS Foundation Trust | Adults | 05 October 2021 | 22 March 2023 |
| Royal Free London NHS Foundation Trust | Lysosomal Storage Disorder Unit | 17 March 2020 | 09 November 2023 |
| Sheffield Children's NHS Foundation Trust | | 16 July 2021 | 23 February 2024 |
| Sheffield Teaching Hospitals NHS Foundation Trust | | 08 May 2024 | 08 May 2024 |
| The Newcastle upon Tyne Hospitals NHS Foundation Trust | Adults | 15 November 2022 | 15 November 2022 |
| The Newcastle upon Tyne Hospitals NHS Foundation Trust | Paediatrics | 26 July 2024 | 26 July 2024 |
| University College London Hospitals NHS Foundation Trust | | 18 December 2019 | 21 February 2024 |
| University Hospital of Wales | National Acute Porphyria Centre | 10 November 2021 | 20 November 2023 |
| University Hospital Southampton NHS Foundation Trust | Paediatrics | 13 March 2020 | 08 August 2023 |
| University Hospitals Birmingham NHS Foundation Trust | | 28 February 2020 | 22 December 2023 |
| University Hospitals Bristol and Weston NHS Foundation Trust | | 11 June 2021 | 11 June 2021 |

Note that the submission from Cambridge University Hospitals NHS Foundation Trust was coded only to ICD-10 level, and we have not yet been able to apply Orphacodes to the diagnoses in this submission. Therefore, these data have not been included in the final data.
The diagnoses are received as free text and are assigned a relevant Orphacode by the rare disease team. Specific disorder or disorder subtype Orphacodes are preferred. However, in some cases a ‘group of disorders’ Orphacode must be used where a specific diagnosis cannot be determined from the free text received. The Orphanet Rare Disease Ontology (ORDO) is a structured vocabulary for rare diseases derived from the Orphanet database, capturing relationships between diseases, genes and other relevant features (https://www.orpha.net/).
The data have been through a cleaning process and the patient details have been verified by linking to each patient’s details on the NHS Personal Demographics Service (PDS). This linkage occurred in June 2024.
Further information about IMD service commissioning can be found here: NHS commissioning » E06. Metabolic disorders
Neurofibromatosis Type 2
Data are from the four commissioned highly specialised services: Cambridge University Hospitals NHS Foundation Trust, Guy’s and St Thomas’ NHS Foundation Trust, Manchester University NHS Foundation Trust and Oxford University Hospitals NHS Foundation Trust. Patients receiving care for a disease outside the HSS may therefore be excluded from the prevalent cohort. Further information about the services can be found here: NHS commissioning » Highly specialised services
Defining the conditions
Rare cancers
Rare cancers have been defined in line with the definition used by the US National Cancer Institute’s Surveillance, Epidemiology and End Results (SEER) Program.
Because the ICD-O-3 classification was implemented from 2013, some tumour site definitions differ between tumours diagnosed in 1995-2012 and those diagnosed in 2013-2020. For the affected groups, ICD-10/ICD-O-2 is used for 1995-2012 and ICD-O-3 for 2013-2020. Using the ICD-O-3 definitions from 2013 ensures that the tumour groupings follow the most up-to-date classification.
The full tumour grouping details are available as a downloadable table accompanying this publication.
Rare congenital anomalies
Congenital anomalies are selected from conditions that may be detected as part of the Fetal Anomaly Screening Programme and are defined as being present at delivery, likely originating before birth, and include structural, chromosomal and genetic conditions. Data are collected broadly in accordance with definitions and guidelines of the European Surveillance of Congenital Anomalies (EUROCAT). Congenital anomalies are coded using the International Classification of Diseases version 10 (ICD-10) with British Paediatric Association (BPA) extension, which gives supplementary one-digit extensions to ICD-10 codes to allow greater specificity of coding.
The congenital anomaly prevalence estimates include only patients whose age is less than the length of the observation period (for example, where the index date is 31 December 2021, this would be children under 4 years of age). This is because national congenital anomaly registration has only been in place for babies born since 1 January 2018. The point prevalence estimates presented therefore represent all individuals born between 1 January 2018 and the index date who were still alive on the index date.
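The cohort rule described above can be sketched as a simple date check. The function name and record layout are assumptions made for this example. It also assumes that records relate to live births and that a patient who died on the index date itself is not counted as alive, which the text does not specify.

```python
from datetime import date

# Start of national congenital anomaly registration, as stated in the text.
REGISTRATION_START = date(2018, 1, 1)


def in_prevalent_cohort(birth_date, death_date, index_date):
    """Return True if a live-born individual was born within the observation
    period and was still alive on the index date.

    death_date is None for individuals with no recorded death.
    """
    born_in_period = REGISTRATION_START <= birth_date <= index_date
    # Assumption: a death on the index date itself counts as not alive.
    alive_at_index = death_date is None or death_date > index_date
    return born_in_period and alive_at_index
```

With an index date of 31 December 2021, the oldest possible cohort member was born on 1 January 2018 and is therefore under 4 years of age.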
Analytical code repository
The calculation of prevalence estimates from the cohorts of each condition is performed in R using a standard function. This function is publicly available as part of the “NDRSanalysis” R package, available on the GitHub website.
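For orientation only, the sketch below shows the standard definition of point prevalence as a rate: cases alive at the index date divided by the population at that date. It is written in Python for illustration, does not reproduce the NDRSanalysis function or its interface, and the function name, parameters and "per 100,000" scale are assumptions.

```python
def point_prevalence(cases_alive_at_index, population_at_index, per=100_000):
    """Return point prevalence as cases per `per` people at the index date."""
    if population_at_index <= 0:
        raise ValueError("population_at_index must be positive")
    if cases_alive_at_index < 0:
        raise ValueError("cases_alive_at_index cannot be negative")
    # Multiply before dividing so that whole-number inputs stay exact
    # for as long as possible.
    return cases_alive_at_index * per / population_at_index
```

A published estimate would normally be accompanied by a confidence interval, which this sketch omits.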
Other rare conditions
Inherited Metabolic Disorders
The diagnoses are received as free text and are assigned a relevant Orphacode by the rare disease team. Specific disorder or disorder subtype Orphacodes are preferred. However, in some cases a ‘group of disorders’ Orphacode must be used where a specific diagnosis cannot be determined from the free text received.
The data have been through a cleaning process and the patient details have been verified by linking to each patient’s details on the NHS Personal Demographics Service (PDS). This linkage occurred in June 2024.
Date of diagnosis is not always provided. Where this is the case, the date the patient was first seen by the HSS (‘clinic date’) is taken as the diagnosis date. If the patient also does not have a ‘clinic date’, then the date the patient was reported to NCARDRS is used.
For patients who appear in the dataset more than once, the most recent codable diagnosis is selected as their final diagnosis and the earliest diagnosis date (or clinic or report date, if a diagnosis date is not available) is selected as their diagnosis date. This follows the process used by the RECORDER team for establishing the RECORDER cohort.
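The fallback order for the diagnosis date can be sketched as below. The function and parameter names are hypothetical; only the order of preference is taken from the text.

```python
from datetime import date
from typing import Optional


def effective_diagnosis_date(
    diagnosis_date: Optional[date],
    clinic_date: Optional[date],
    report_date: Optional[date],
) -> Optional[date]:
    """Return the first available date, in order of preference: the recorded
    diagnosis date, the first clinic date at the HSS, then the date the
    patient was reported to NCARDRS."""
    for candidate in (diagnosis_date, clinic_date, report_date):
        if candidate is not None:
            return candidate
    return None  # no usable date at all
```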
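A minimal sketch of this de-duplication step follows. The record layout and function name are assumptions. It further assumes that "most recent" refers to the date the record was received, and that the earliest date is taken across all of a patient's records, including those without a codable diagnosis; the text does not state either point.

```python
from datetime import date


def collapse_records(records):
    """Reduce repeated records to one (orphacode, diagnosis_date) per patient.

    Each record is a dict with keys: patient_id, orphacode (None when the
    diagnosis is not codable), record_date and diagnosis_date.
    """
    summary = {}
    for rec in records:
        s = summary.setdefault(
            rec["patient_id"],
            {"orphacode": None, "coded_on": None, "diagnosis_date": None},
        )
        # Keep the most recently received codable diagnosis.
        if rec["orphacode"] is not None and (
            s["coded_on"] is None or rec["record_date"] > s["coded_on"]
        ):
            s["orphacode"], s["coded_on"] = rec["orphacode"], rec["record_date"]
        # Keep the earliest diagnosis (or clinic/report) date.
        d = rec["diagnosis_date"]
        if d is not None and (s["diagnosis_date"] is None or d < s["diagnosis_date"]):
            s["diagnosis_date"] = d
    return {pid: (s["orphacode"], s["diagnosis_date"]) for pid, s in summary.items()}


# Made-up records for a single patient.
records = [
    {"patient_id": "P1", "orphacode": "ORPHA:657",
     "record_date": date(2021, 5, 1), "diagnosis_date": date(2019, 2, 1)},
    {"patient_id": "P1", "orphacode": None,
     "record_date": date(2023, 1, 1), "diagnosis_date": date(2018, 11, 20)},
    {"patient_id": "P1", "orphacode": "ORPHA:35878",
     "record_date": date(2022, 7, 1), "diagnosis_date": date(2020, 1, 1)},
]
collapsed = collapse_records(records)
```

In this example the 2023 record is the most recent but has no codable diagnosis, so the 2022 diagnosis is kept, paired with the earliest date seen for the patient.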
Patients removed from the IMD dataset before running through the prevalence function are:
- patients without a diagnosis that can be assigned an Orphacode.
- patients with a ‘Suspected’ diagnosis.
- patients who could not be verified by linking to the PDS.
- patients who are not resident in England – determined using the postcode returned by linking the patient to their details on the PDS. If a postcode is not identified on the PDS, then the patient’s most recent valid postcode in the original data is used. In cases where a patient does not have a PDS postcode or a valid postcode in the original data, they will be removed as it cannot be determined whether the patient is a resident in England.
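The exclusion rules above can be sketched as a single function that returns the reason a patient is removed. The field names, the tiny postcode lookup and the function name are all hypothetical, and the sketch assumes that `source_postcode` already holds the patient's most recent valid postcode from the original data.

```python
# Illustrative lookup only; a real pipeline would use a complete postcode
# directory. Postcodes missing from this lookup are treated as not English.
POSTCODE_COUNTRY = {"SW1A 1AA": "England", "CF10 3NQ": "Wales"}


def exclusion_reason(patient, postcode_country=POSTCODE_COUNTRY):
    """Return why a patient is removed from the IMD dataset, or None if kept."""
    if patient.get("orphacode") is None:
        return "no diagnosis that can be assigned an Orphacode"
    if patient.get("diagnosis_status") == "Suspected":
        return "suspected diagnosis"
    if not patient.get("pds_verified", False):
        return "could not be verified on the PDS"
    # Prefer the PDS postcode; fall back to the postcode in the original data.
    postcode = patient.get("pds_postcode") or patient.get("source_postcode")
    if postcode is None:
        return "residence cannot be determined"
    if postcode_country.get(postcode) != "England":
        return "not resident in England"
    return None
```

Returning a reason rather than a plain true or false makes it straightforward to tabulate how many patients each rule removes.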
Due to the small numbers of patients with certain diseases and the extensive number of diseases in the IMD data, certain diseases have been grouped using the ORDO ontology structure.
Using input from clinical leads, certain disease groups or specific diseases have been selected to be included in this year’s prevalence estimates publication.
The counts for each selected disease grouping will contain the counts of all diseases and disease groupings which are ‘children’ of the selected disease grouping. These counts may also include diseases or disease groups which are not listed in the prevalence tool.
The ORDO ontology allows diseases to have multiple ‘parent’ codes. As a result, some patients are counted in more than one disease grouping.
For example:

| Patient ID | Diagnosis |
|---|---|
| Patient 1 | Hyperinsulinism-hyperammonemia syndrome (35878) |

The ORDO classification of this disease is:

Rare inborn errors of metabolism ORPHA:68367
- Disorder of amino acid and other organic acid metabolism ORPHA:79062
  - Disorder of urea cycle metabolism and ammonia detoxification ORPHA:79167
    - Hyperinsulinism-hyperammonemia syndrome ORPHA:35878

Rare inborn errors of metabolism ORPHA:68367
- Other metabolic disease ORPHA:91088
  - Congenital isolated hyperinsulinism ORPHA:657
    - Diazoxide-sensitive diffuse hyperinsulinism ORPHA:165985
      - Hyperinsulinism-hyperammonemia syndrome ORPHA:35878

From these the selected diseases and disease groupings are:
- Disorder of amino acid and other organic acid metabolism
- Disorder of urea cycle metabolism and ammonia detoxification
- Other metabolic disease
The disease is a ‘child’ of both Disorder of amino acid and other organic acid metabolism and Other metabolic disease, so will be counted under both these groups.
Disorder of urea cycle metabolism and ammonia detoxification is the ‘child’ of Disorder of amino acid and other organic acid metabolism, and so Hyperinsulinism-hyperammonemia syndrome is counted under both these disease groupings.
A patient with this diagnosis would appear in the data as:

| Patient ID | Diagnosis |
|---|---|
| Patient 1 | Disorder of amino acid and other organic acid metabolism |
| Patient 1 | Disorder of urea cycle metabolism and ammonia detoxification |
| Patient 1 | Other metabolic disease |

And the counts for this one patient would be:

| Diagnosis | Count | Diagnosis subtype | Count |
|---|---|---|---|
| Disorder of amino acid and other organic acid metabolism | 1 | | |
| Disorder of urea cycle metabolism and ammonia detoxification | 1 | | |
| Other metabolic disease | 1 | | |

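The multiple-parent counting in this example can be sketched as a walk up the ontology. The parent links below are transcribed from the two classification chains shown above, on the assumption that each listed entry is the parent of the next. The function names are hypothetical and the code does not read the ORDO files themselves.

```python
from collections import Counter

# Child -> parents, transcribed from the two chains in the worked example.
PARENTS = {
    "ORPHA:35878": ["ORPHA:79167", "ORPHA:165985"],
    "ORPHA:79167": ["ORPHA:79062"],
    "ORPHA:79062": ["ORPHA:68367"],
    "ORPHA:165985": ["ORPHA:657"],
    "ORPHA:657": ["ORPHA:91088"],
    "ORPHA:91088": ["ORPHA:68367"],
    "ORPHA:68367": [],
}
# The three groupings selected in the example.
SELECTED = {"ORPHA:79062", "ORPHA:79167", "ORPHA:91088"}


def ancestors(code, parents):
    """Return every code reachable by following parent links upwards."""
    seen = set()
    stack = [code]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def selected_groups(code, parents, selected):
    """Return the selected groupings a diagnosis is counted under."""
    return sorted((ancestors(code, parents) | {code}) & selected)


patients = {"Patient 1": "ORPHA:35878"}
counts = Counter(
    group
    for code in patients.values()
    for group in selected_groups(code, PARENTS, SELECTED)
)
```

Because the single patient contributes to three groupings, the grouping counts cannot be summed to give a total number of patients.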
The Orphanet Rare Disease Ontology (ORDO) is a structured vocabulary for rare diseases derived from the Orphanet database, capturing relationships between diseases, genes and other relevant features.
ORDO version 4.5 was downloaded from the Orphanet website.
Last edited: 16 June 2025 11:56 am