The National Cancer Registration and Analysis Service (NCRAS) has developed an algorithmically generated Rapid Cancer Registration Dataset (RCRD) using the standard administrative datasets which flow rapidly into NHS England (NHSE) and are incorporated into the Cancer Analysis System (CAS) of NCRAS. The data takes the form of a series of significant events that occur to each patient as they proceed through the diagnostic and then therapeutic parts of the cancer pathway, and is available at approximately 4-5 months behind real time. The RCRD is shallower and narrower than the full NCRAS cancer registration dataset; it should be used and interpreted with reference to the caveats outlined within this document.


Main findings

This document outlines the main features of the data to be aware of when interpreting the Rapid Cancer Registration Dataset:

  • across all cancer types included, approximately 12.1% of cases are missing and 6.0% of cases are included erroneously or with an incorrect cancer type or diagnosis date (when compared with ‘Gold Standard’ registration data for 2018)

  • these figures vary strongly with cancer site. Broadly, more common cancers (particularly breast and prostate cancer) perform best and less common cancers (particularly bone and soft tissue and cancers of unknown primary) perform worst

  • non-melanoma skin cancer (ICD-10 C44) tumours are excluded from the majority of data shown (Figure 3 onwards). Carcinoma of the cervix in-situ (ICD-10 D06) is excluded from all data presented

  • there are more missing tumours in those aged over 70 compared to younger age groups

  • other factors that reduce data completeness include the patient’s route to diagnosis, mortality within 30 days of diagnosis, and the presence of multiple cancers

  • usable data is available approximately 4 to 5 months after diagnosis or other clinical activity occurs

  • data on cancer stage group at diagnosis is available for a number of common tumour types, although completeness is lower than that for the Gold Standard registration data. Where data is available it generally agrees with the Gold Standard stage group in 80 to 90% of tumours

The dataset includes Rapid Cancer Registrations from January 2018 to the most recently available data (at the date specified in the title to this document), plus additional event data for the same period.

Summary

A need to make rapidly available ‘proxy cancer registrations’ (and associated clinical activity) for the COVID-19 period has been identified to support the public health response by NHS England (NHSE) and other agencies, and service reorganisation by the NHS. These proxy registrations are called Rapid Registrations, in contrast to the more formal, detailed registration process that is used in non-clinical cancer research and the National Statistics.

The National Cancer Registration and Analysis Service (NCRAS) has developed a Rapid Cancer Registration Dataset (RCRD) using all standard administrative datasets which flow rapidly into NHSE and are incorporated into the Cancer Analysis System (CAS) of NCRAS.

This document describes the dataset structure, creation methodology, and data quality caveats (due to the rapid automated creation process without additional data curation) behind this dataset.

These data structures and methodologies are expected to evolve over the course of the public health response to COVID-19. The data is updated monthly and is referred to by the monthly CAS snapshot upon which it is based, e.g. CAS2009 refers to the CAS snapshot from September 2020. This document is considered a ‘living document’ and strictly applies only to the snapshot of CAS identified in the title.
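The snapshot naming convention can be decoded mechanically. The following is a minimal sketch, assuming that every label is 'CAS' followed by a two-digit year and a two-digit month (inferred from the CAS2009 example above). The function name snapshot_month is hypothetical and is not part of the CAS.

```python
import re
from datetime import date


def snapshot_month(label: str) -> date:
    """Return the first day of the month that a CAS snapshot label refers to.

    ASSUMPTION: labels look like 'CAS2009' or 'cas2603' (YYMM, years 2000-2099).
    """
    match = re.fullmatch(r"cas(\d{2})(\d{2})", label.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised snapshot label: {label!r}")
    year = 2000 + int(match.group(1))
    month = int(match.group(2))
    # date() raises ValueError if the month is outside 1-12.
    return date(year, month, 1)
```

Under this assumption, CAS2009 decodes to September 2020, in line with the example given in the text.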

Methodology

Proxy registration events (Rapid Registrations)

Datasets available to NHSE were surveyed to establish how many months in arrears they arrive within NCRAS and are loaded in a usable format for analysis. From these datasets a selection of event types was defined, similar to those typically used in the cancer pathway analysis carried out by NCRAS.

The data takes the form of a series of significant events that occur to each patient as they proceed through the diagnostic and then therapeutic parts of the cancer pathway. These events include chemotherapy cycles, radiotherapy episodes and major cancer surgery as well as events based on the Cancer Waiting Times (CWT) and Cancer Outcomes and Services Dataset (COSD) datasets. These event types are numbered in the range 1-23 in the dataset.

Some events hypothesised to be indicative of a cancer diagnosis were defined including ‘Diagnosis reported in COSD’ (event 51) and ‘CWT estimated diagnosis date’ (event 52). These are numbered in the range 50-57 in the dataset - see Appendix 1 for a full list.

The indicative events for diagnosis were explored as candidate Rapid Registration events. A candidate rapid registration event was judged as matching a Gold Standard Registration event if the pair met the following two conditions:

  • the diagnosis dates of the two events were 90 days or less apart
  • both registrations fell into the same broad tumour group (as defined in Appendix 3)

Using these matching criteria False Positive errors and False Negative errors are defined as:

  • False Positive Error (FPE): A rapid registration event has been created which does not match against a Gold Standard Registration in the comparison period
  • False Negative Error (FNE): There exists a Gold Standard Registration event for which no rapid registration event can be matched
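The matching criteria and the two error definitions above can be sketched as follows. This is an illustration under stated assumptions, not the NCRAS implementation: the record layout (dictionaries with patient_id, tumour_group and diagnosis_date keys) is hypothetical, and the requirement that both events belong to the same patient is assumed rather than stated in the criteria.

```python
from datetime import date

MATCH_WINDOW_DAYS = 90  # from the first matching condition


def is_match(rapid: dict, gold: dict) -> bool:
    """True if a rapid event matches a gold standard registration."""
    same_patient = rapid["patient_id"] == gold["patient_id"]  # ASSUMPTION
    same_group = rapid["tumour_group"] == gold["tumour_group"]
    gap_days = abs((rapid["diagnosis_date"] - gold["diagnosis_date"]).days)
    return same_patient and same_group and gap_days <= MATCH_WINDOW_DAYS


def classify_errors(rapid_events: list, gold_events: list):
    """Return (false_positives, false_negatives) as lists of records."""
    # FPE: rapid event with no matching gold standard registration.
    false_positives = [
        r for r in rapid_events if not any(is_match(r, g) for g in gold_events)
    ]
    # FNE: gold standard registration with no matching rapid event.
    false_negatives = [
        g for g in gold_events if not any(is_match(r, g) for r in rapid_events)
    ]
    return false_positives, false_negatives
```

The sketch returns the unmatched records rather than rates, because the denominators used for the published percentages are not specified in this section.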

Additional filtering was applied to the candidate events, and event 101 was eventually defined to minimise both false positive and false negative errors. It is recommended for use by researchers as the best candidate for a rapid cancer registration. Appendix 4 briefly describes some of the alternatives examined in the development of this event definition.

Data structures

The rapid registration dataset consists of two tables:

AT_RAPID_PATHWAY: This is an event-based dataset with a number of types of event of interest defined based on the rapidly available datasets, see Appendix 1 for event definitions and properties. These are numbered in the range 1-23 for general purpose events, 50-57 for events that are candidates for combining into a rapid registration, and 101 for the final rapid registration event.

AT_RAPID_TUMOUR: This is a tumour level dataset that holds tumour and patient level data for each of the tumours defined by a rapid registration. The structure and contents of this table are presented in Appendix 3.
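The event numbering scheme described for AT_RAPID_PATHWAY can be expressed as a small lookup. This is an illustrative sketch only: the function name and the category labels are hypothetical, and only the number ranges come from the description above.

```python
def event_category(event_type: int) -> str:
    """Map an AT_RAPID_PATHWAY event number to its broad category."""
    if 1 <= event_type <= 23:
        return "general purpose event"
    if 50 <= event_type <= 57:
        return "candidate registration event"
    if event_type == 101:
        return "rapid registration event"
    # Numbers outside the documented ranges are not interpreted here.
    return "not described"
```

For example, an analyst who only wants the recommended rapid registrations would keep the rows for which this function returns "rapid registration event".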

The rapid registration pathway and tumour tables can be linked together as shown in Figure 1, and can also be linked via NHS number to other datasets that are timely enough.

Figure 1: Linkage diagram for the Rapid Cancer Registration Dataset

Diagram showing how the rapid tumour and rapid pathway tables link together via avpid or tumour avpid, individual ID, patient ID and NHS number, and how patient ID and NHS number can link into the Cancer Analysis System and other CAS reference tables

Data Quality

How do the number of Rapid Registrations compare with Gold Standard Registrations?

To illustrate the strengths and weaknesses of the Rapid Registrations compared to the gold standard process, registrations for tumours diagnosed during 2018 are compared in Figure 2.

For most tumour groups the counts of Rapid Registrations are significantly lower than those of standard registrations. The COSD system does not attempt to record basal cell carcinoma non-melanoma skin cancers (but they are recorded by hospital pathology systems, and thereby registered), explaining the discrepancy there. There is only one group where this situation is reversed - bone and soft tissue - for which a precise morphology is required to properly record the diagnosis. These cancers are being preferentially coded to bone and soft tissue in COSD (as the COSD standard necessitates simpler site-based coding, and this is the best choice under the circumstances) and re-coded during the gold standard registration process where more sophisticated combination of site and morphological coding is possible.

Figure 2: The number of cancer registrations by registration and tumour type, England, 2018

Bar chart of tumour diagnoses in 2018 by registration source where for each tumour type, gold standard registration count is higher than rapid registration count, especially for non-melanoma skin cancer.

Figure 3 shows the age dependence of the ratio between Gold Standard and Rapid Registrations; non-melanoma skin cancer is excluded. The proportion of diagnoses is consistently high for both males and females until the age of 70 is reached, after which it declines. This is explored further in Figure 5 below.

Figure 3: The proportion of cancer registrations by gender, age and registration type, England, 2018 (all tumour types combined)

Line chart of the proportion of rapid compared to gold standard registrations by gender, where the proportion of diagnoses for both genders is consistently high until age 70, where it declines.

Comparing the matching quality of Rapid Registrations

The quality of the Rapid Registrations was judged by comparing them against the gold-standard cancer registrations in the period April 2018 to September 2018. This period was chosen as available gold standard registration data was only finalised to December 2018 and a matching period of 90 days was allowed (restricting comparison to the middle six months of the twelve-month period).
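The choice of comparison period follows from simple date arithmetic. The sketch below assumes that gold standard data covers 1 January to 31 December 2018 and trims the 90-day matching window from each end. The function name comparison_window is hypothetical.

```python
from datetime import date, timedelta


def comparison_window(first_day: date, last_day: date, match_days: int = 90):
    """Trim the matching window from both ends of the available period."""
    start = first_day + timedelta(days=match_days)
    end = last_day - timedelta(days=match_days)
    return start, end


start, end = comparison_window(date(2018, 1, 1), date(2018, 12, 31))
print(start, end)  # 2018-04-01 2018-10-02
```

The whole calendar months that fall inside this window are April to September 2018, which is the six-month period used for the comparison.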

Figure 4 shows the proportions of false positive and false negative events, by broad cancer type (excluding non-melanoma skin cancer), measured in the cas2603 snapshot (the tumour groups are defined in Appendix 3). A more detailed tabulation is available by tumour group and tumour site in Appendix 5.

In most tumour groups, there are more tumours missed by the rapid registrations process (false negatives) than there are falsely identified as tumours (false positives).

For breast and prostate, very few incorrect proxy registrations are made. Breast, colorectal, lung, oesophagogastric (O-G) and prostate cancers are also least likely to be missing from the proxy dataset, whereas for cancers of unknown primary, and bone and soft tissue tumours more than 25% of cancers are missed. Bone and soft tissue tumours are not frequently diagnosed. These tumours often require multiple pathology reports to correctly diagnose a patient and the Rapid Registrations dataset has not attempted to reconcile differences in the reported diagnoses.

Figure 4: Types of error by tumour group

Bar chart of the proportion of false negative and false positive errors by tumour group, where bone and soft tissue tumours and cancers of unknown primary have a higher proportion of errors than other tumour groups.

The proportion of false positive errors is fairly stable across all ages (Figure 5) and is less sensitive to the age of the patient than the proportion of false negative errors, which slowly declines until age 70 and then increases significantly. The age dependence of the basis of diagnosis was found to be at least partially responsible for this pattern - see Appendix 6 for details.

Figure 5: False negative and false positive errors by age band at diagnosis

Line chart of the proportion of false negative and false positive errors by age band at diagnosis. The proportion of false positive errors is stable from age 20, and the proportion of false negative errors slowly declines to age 70 when it increases

The charts in Figure 6 (below) examine these patterns by tumour group. Please note that age groups for each tumour group must have a denominator of 25 patients or more or they are suppressed for reasons of statistical power.
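The suppression rule can be sketched as a guard around the proportion calculation. This is a minimal illustration: the function name is hypothetical, and returning None for a suppressed value is an assumption about how suppression might be represented.

```python
from typing import Optional


def error_proportion(
    error_count: int, denominator: int, min_denominator: int = 25
) -> Optional[float]:
    """Return the error proportion, or None if the group is too small to show."""
    if denominator < min_denominator:
        return None  # suppressed: fewer than 25 patients in the denominator
    return error_count / denominator
```

The same pattern applies to the trust-level figures later in this section, where the stated minimum denominator is 50 patients instead of 25.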

The patterns of false negative and false positive vary significantly by tumour group. Most groups have a higher proportion of false negatives than false positives at each age.

The proportion of false positives does not exhibit a trend by age for most tumour groups; the proportion rises with increasing age in the bone and soft tissue, head and neck, and melanoma groups, and conversely falls with increasing age in the colorectal and unknown primary groups.

The proportion of false negatives rises with increasing age for all tumour groups except bone and soft tissue and endocrine. The most pronounced increases occur in the brain and central nervous system, colorectal, gynaecological, haematological, prostate, upper gastro-intestinal and unknown primary tumour groups.

The levels of both types of error are highest in tumour groups which are less likely to have solid-tissue pathology (haematological) or where survival rates are typically low. Conversely, the levels of error are lowest for tumour groups for which survival rates are typically higher.

Figure 6: False negative and false positive errors by age band at diagnosis and tumour group

A series of line graphs for each tumour type of the proportion of false negative and false positive errors by age band at diagnosis.

The variation of the false positive and false negative errors with income deprivation quintile is shown in Figure 7. While an overall trend is visible, it is likely to be the result of confounding, given the variation with tumour type shown above and the known association between the incidence of many cancer types and income deprivation.

Figure 7: False negative and false positive errors by income deprivation quintile

Bar chart of the proportion of false negative and false positive errors by income deprivation quintile where the proportion slightly declines from income quintile 1 to income quintile 5.

Figure 8 shows the variation of false negative and false positive errors with route to diagnosis. For false positives there is moderate variation with the lowest error rate being those cases identified through cancer screening or a two week wait referral. (These tumours are those that are likely to be captured in both the COSD dataset and the screening/Cancer Waiting Times datasets so the lower error rate is understandable.)

Most routes to diagnosis have a substantially higher false negative rate than the overall average. ‘Two Week Wait’ (TWW) and screening routes have a substantially lower false negative rate (and make up between them 45% of the total cohort).

Figure 8: False negative and false positive errors by route to diagnosis

Bar chart of the proportion of false negative and false positive errors by route to diagnosis where cases identified through cancer screening and two week wait referral routes have the lowest error rates compared to other diagnosis routes.

Figure 9 below shows the variation of false negative and false positive errors with whether or not the patient died within 30 days of diagnosis. The false negative error rate differs substantially between patients who died in the 30 days post-diagnosis and those who did not, meaning that patients who die within 30 days are more likely to be missing from the dataset.

Figure 9: False negative and false positive errors by 30-day mortality

Bar chart of the proportion of false negative and false positive errors by 30 day mortality where false negative error rates are higher in those that died in the 30 days post-diagnosis compared to those that did not die within 30 days.

Figure 10 below shows the variation of false negative and false positive errors with the multiple tumour status of the patient, i.e. whether or not the patient had been diagnosed with more than one type of tumour in the period January 2018 onward. The false positive error rate varies substantially between patients with multiple tumour types and those that don’t, meaning that these patients with multiple tumours are more likely to have incorrect tumour types or diagnosis dates recorded.

Figure 10: False negative and false positive errors by multiple tumour status

Bar chart of the proportion of false negative and false positive errors by multiple tumour status, where false positive error rates are higher in patients with multiple tumour types than those without.

Figure 10b below shows the variation of false negative and false positive errors with the stage at diagnosis.

Figure 10b: False negative and false positive errors by stage

Bar chart of the proportion of false negative and false positive errors by stage at diagnosis, where errors are higher in cases with no stage information compared to other stage categories.

Figure 11 below shows the variation of false negative and false positive errors with the cancer alliance of residence of the patient at the time of diagnosis. The false negative error rate varies more in absolute terms than the false positive rate and may be driven by trust level variation (see Figures 12 and 13 below).

Figure 11: False negative and false positive errors by Cancer Alliance

Line chart of the proportion of false negative and false positive errors by Cancer Alliance where false negative error rates vary more than false positive error rates.

Figures 12 and 13 below show the variation of false negative and false positive errors with the trust that diagnosed the tumour. Figure 12 shows the error proportion and figure 13 the numerator (count) of the errors. Trusts shown are limited to NHS secondary care trusts with a denominator of at least 50 patients over the assessment period. Both figures are ordered in descending order of the false negative statistic - but note that the order is not the same in each figure.

There is substantial variation in both false positive and false negative rates and counts. Some large trusts have several hundred or up to 1000 cases (over the six-month period under assessment).

Figure 12: False negative and false positive errors (proportion) by hospital trust

Line chart of the proportion of false negative and false positive errors by unidentifiable hospital trusts where there is substantial variation in both false negative and false positive rates between trusts.

Figure 13: False negative and false positive errors (count) by hospital trust

Line chart of the counts of false negative and false positive errors by unidentified hospital trusts, where there is substantial variation in both error counts between hospital trusts.

Counts of events over time

This section examines the population of events by chronological time and when they appear in successive analytical snapshots in the CAS. Figure 14 shows that most data items in the Rapid Registrations dataset are stable with respect to the snapshot month.

Specific comments about the events shown below are:

  • cancer waiting times data (events 1–4) are received based on the treatment start date; this explains why for event 2 all lines lie exactly on top of each other. Other CWT events accumulate over successive snapshots where these events occur before the first treatment start event

  • an issue with HES data that caused lower than expected completeness from 2020-04-01 was resolved in cas2102, leading to increased event counts in events 5, 6, 11, 12, 13 and 23

  • the definition of event 17 only includes tumour diagnoses prior to 2018, so lack of data in the chart below is expected

  • definitions of staging events may change between snapshots, which might explain higher or lower counts in one snapshot compared to others

  • the vital status shown in event 19 is typically only assessed each January or at the completion of registering each diagnosis year, explaining the large peaks in the graph

  • the raw data used to populate events 21, 54 and 56 is subject to ongoing deduplication, which explains lower counts in earlier time periods for later snapshots

  • between snapshots, the counts of events 101–103 (inferred diagnoses) generally increase, particularly for recent months as additional COSD data is submitted. For some earlier months, there is a small decrease in these counts because the algorithm excludes potential diagnoses where the patient already has a confirmed diagnosis in the same tumour group more than 90 days before. These exclusions can change between snapshots as gold standard registration data is processed, leading to more confirmed previous diagnoses. The effect has been measured as less than 1% of all cases in any given month
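The exclusion rule described in the last bullet above can be sketched as follows. This is an illustration only: the function name and the data layout (a list of tumour group and diagnosis date pairs per patient) are hypothetical, and the rule is taken directly from the wording of the bullet.

```python
from datetime import date


def is_excluded(candidate_date, tumour_group, confirmed, window_days=90):
    """True if a potential diagnosis should be excluded.

    confirmed: list of (tumour_group, diagnosis_date) pairs holding the
    patient's confirmed (gold standard) diagnoses. ASSUMPTION: hypothetical
    layout chosen for this sketch.
    """
    return any(
        group == tumour_group and (candidate_date - confirmed_date).days > window_days
        for group, confirmed_date in confirmed
    )
```

Because the list of confirmed diagnoses grows as gold standard registration data is processed, a candidate that was retained in one snapshot can be excluded in a later one, which is consistent with the small decreases described above.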

Figure 14: Population of data items to CAS snapshot

These are line graphs showing the count over time, comparing the completeness with the three previous snapshots. The count tends to increase in the more recent snapshots.

For eighteen data items, there are line graphs showing the count over time, comparing the completeness with the three previous snapshots. The count tends to increase in the more recent snapshots.

Estimated completeness of Rapid Registrations and secondary datasets

Detailed linked rapid cancer registration, CWT, SACT and RTDS data is available at approximately a four-month lag from real time. Linked HES and raw COSD data is available at approximately 4-5 months behind real time.

Table 2 below shows data usability and completeness for Rapid Registrations and the constituent datasets. The “latest usable” column shows the ‘hard limit’ on data that is considered fit for analytical purposes (90% completeness). Even in months prior to this limit, data is not necessarily complete, so the completeness for each month is displayed in the table. This should be taken into account in any use of the rapid registration data and the secondary datasets.

For the Rapid Tumour data, completeness is expressed as the proportion of CCGs of residence which show a cancer incidence within the normally expected range (see Table 3 below). For other datasets except CWT, completeness is computed as the number of data providers who have supplied data as a percentage of those who are expected to do so.
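The provider-based completeness measure and the 90% usability threshold can be sketched as below. This is a rough illustration: the function names are hypothetical, and treating "latest usable" as the most recent month at or above the threshold is an assumption, since the exact rule is not spelled out in this section.

```python
def provider_completeness(supplied: int, expected: int) -> float:
    """Providers who supplied data as a fraction of those expected to."""
    return supplied / expected


def latest_usable(monthly, threshold=0.90):
    """Return the most recent month whose completeness meets the threshold.

    monthly: list of (month_label, completeness) pairs in chronological order.
    ASSUMPTION: the latest month at or above the threshold counts as usable.
    """
    usable = None
    for month, completeness in monthly:
        if completeness >= threshold:
            usable = month
    return usable
```

With illustrative (not published) values such as [("2025-01", 0.96), ("2025-02", 0.93), ("2025-03", 0.85)], the sketch returns "2025-02".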

Data completeness within the Cancer Waiting Times dataset varies at patient level with event type. Figures for the Treatment Start Date and Treatment Period Start Date are given below. Completeness of other CWT events can be estimated by inspecting Figure 14 (events 1-4).

Table 2: Rapid registration and dataset usability/completeness in cas2603

| Data source | Latest usable | January 2025 | February 2025 | March 2025 | April 2025 | May 2025 | June 2025 | July 2025 | August 2025 | September 2025 | October 2025 | November 2025 | December 2025 |
| Rapid Tumours (COSD) | December 2025 | 97% | 97% | 97% | 98% | 97% | 99% | 99% | 96% | 99% | 95% | 94% | 94% |
| HES | October 2025 | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | 95% | | |
| SACT | May 2025 | 96% | 98% | 97% | 93% | 92% | | | | | | | |
| RTDS | July 2025 | Complete | Complete | 98% | 96% | 94% | 94% | 94% | | | | | |
| CWT (TSD) | December 2025 | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete |
| CWT (TPSD) | November 2025 | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | |
Note:
COSD = Cancer Outcomes and Services Dataset
TSD = Treatment Start Date
TPSD = Treatment Period Start Date

Table 3a: Number of outlier CCGs in COSD dataset in cas2603

The table below shows the number of CCGs (using the April 2020 boundaries) which have 3-sigma outlier counts per month (either high or low) compared to the expectation of the fraction of the total number of new cancer registrations in England. This can be used to judge the extent to which there is large scale missing data in COSD (and therefore in the Rapid Registrations) in any particular month.

Year and month Outlier: High Outlier: Low In expected range Total received Prop.
2024-01 1 3 131 135 0.9703704
2024-02 0 0 135 135 1.0000000
2024-03 0 1 134 135 0.9925926
2024-04 0 0 135 135 1.0000000
2024-05 0 0 135 135 1.0000000
2024-06 0 1 134 135 0.9925926
2024-07 1 2 132 135 0.9777778
2024-08 1 0 134 135 0.9925926
2024-09 1 3 131 135 0.9703704
2024-10 1 1 133 135 0.9851852
2024-11 1 3 131 135 0.9703704
2024-12 1 4 130 135 0.9629630
2025-01 1 3 131 135 0.9703704
2025-02 1 3 131 135 0.9703704
2025-03 1 3 131 135 0.9703704
2025-04 2 1 132 135 0.9777778
2025-05 1 3 131 135 0.9703704
2025-06 0 2 133 135 0.9851852
2025-07 0 1 134 135 0.9925926
2025-08 2 3 130 135 0.9629630
2025-09 0 1 134 135 0.9925926
2025-10 1 6 128 135 0.9481481
2025-11 1 7 127 135 0.9407407
2025-12 1 7 127 135 0.9407407
2026-01 9 15 111 135 0.8222222
2026-02 31 61 42 134 0.3134328
2026-03 7 0 NA 7 NA
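The 'Prop.' column in Table 3a is the number of CCGs in the expected range divided by the total received. The sketch below reproduces that calculation and adds a possible 3-sigma classification. The classification is an assumption for illustration only: it treats monthly counts as Poisson distributed around the CCG's expected share of the national total, which is not stated in this document, and both function names are hypothetical.

```python
import math


def proportion_in_range(in_range: int, total_received: int) -> float:
    """Proportion of CCGs within the expected range, as in the Prop. column."""
    return round(in_range / total_received, 7)


def classify_ccg(observed, ccg_share, national_total, n_sigma=3.0):
    """Label a CCG's monthly count as 'high', 'low' or 'in expected range'."""
    expected = ccg_share * national_total
    sigma = math.sqrt(expected)  # ASSUMPTION: Poisson variation around expected
    if observed > expected + n_sigma * sigma:
        return "high"
    if observed < expected - n_sigma * sigma:
        return "low"
    return "in expected range"


print(proportion_in_range(131, 135))  # 0.9703704, the 2024-01 row
```

The sharp fall in the proportion for the most recent months in the table (for example 42 of 134 CCGs in 2026-02) is the kind of signal this measure is intended to surface.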

Staging data in the Rapid Registrations dataset

TNM stage group 1-4

The size and extent of a cancer is commonly described using the ‘TNM’ system for “Tumour”, “Node”, and “Metastases”. This is often abbreviated to a number from 1 (typically a localised tumour with limited spread) to 4 (typically a tumour that has invaded or spread to distant organs). The stage at diagnosis is very strongly associated with patient outcomes.

In the current version of the Rapid Registrations dataset partial staging data is provided for a number of different cancer sites (ICD-10 codes can be found in the labels for Tables 5a-n). This has been benchmarked against the gold standard cancer registry data for cas2603. Table 4 shows the count and proportion of cases by TNM stage group for both the Rapid Registrations and the Gold Standard Registrations, for calendar year 2018. For example, 32% of breast cancers are TNM stage group 1 in the Rapid Registrations, compared with 37% in the Gold Standard Registrations. Compared to the Gold Standard Registrations in 2018, the Rapid Registrations under-report breast cancers diagnosed at stages 1 or 2, colorectal cancers diagnosed at stage 4, and prostate cancers diagnosed at stages 1 and 4. In all three tumour groups, more tumours are allocated to the unknown or unstageable category. Lung cancers in the RCRD most accurately match the Gold Standard Registrations and exhibit a broadly similar stage profile in both datasets.

Table 4: Summary proportions of stage at diagnosis for the Rapid Registrations and Gold Standard Registrations

Broad Cancer Group Stage Group Count (Rapid) Percentage (Rapid) Count (Gold Standard) Percentage (Gold Standard)
Bladder 1 2329 24.2% 2862 29.7%
Bladder 2 1801 18.7% 1877 19.5%
Bladder 3 567 5.9% 882 9.2%
Bladder 4 265 2.8% 653 6.8%
Bladder U 4673 48.5% 3361 34.9%
Breast 1 14307 32.2% 16566 37.3%
Breast 2 13321 30.0% 16775 37.8%
Breast 3 3282 7.4% 3712 8.4%
Breast 4 1283 2.9% 1977 4.5%
Breast U 12225 27.5% 5388 12.1%
Cervical 1 1212 46.5% 833 31.9%
Cervical 2 430 16.5% 400 15.3%
Cervical 3 173 6.6% 191 7.3%
Cervical 4 262 10.0% 235 9.0%
Cervical U 532 20.4% 950 36.4%
Colorectum 1 4908 14.9% 5481 16.7%
Colorectum 2 7045 21.5% 7716 23.5%
Colorectum 3 8296 25.3% 9308 28.3%
Colorectum 4 5179 15.8% 7457 22.7%
Colorectum U 7414 22.6% 2880 8.8%
Kidney 1 2385 29.0% 3330 40.5%
Kidney 2 448 5.4% 554 6.7%
Kidney 3 1350 16.4% 1643 20.0%
Kidney 4 706 8.6% 1577 19.2%
Kidney U 3342 40.6% 1127 13.7%
Lung 1 6190 17.1% 6611 18.3%
Lung 2 2583 7.2% 2683 7.4%
Lung 3 7321 20.3% 7606 21.1%
Lung 4 14984 41.5% 17166 47.5%
Lung U 5023 13.9% 2035 5.6%
Lymphoma 1 921 7.5% 1753 14.3%
Lymphoma 2 955 7.8% 1613 13.2%
Lymphoma 3 1211 9.9% 1986 16.2%
Lymphoma 4 2689 22.0% 4919 40.2%
Lymphoma U 6458 52.8% 1963 16.0%
Melanoma 1 6345 48.2% 8224 62.5%
Melanoma 2 2383 18.1% 2641 20.1%
Melanoma 3 482 3.7% 1038 7.9%
Melanoma 4 223 1.7% 350 2.7%
Melanoma U 3733 28.4% 913 6.9%
Oesophagus 1 298 3.6% 445 5.4%
Oesophagus 2 1489 18.1% 954 11.6%
Oesophagus 3 1763 21.4% 2119 25.7%
Oesophagus 4 2508 30.4% 3193 38.7%
Oesophagus U 2189 26.5% 1536 18.6%
Ovary 1 1189 23.6% 1417 28.1%
Ovary 2 263 5.2% 282 5.6%
Ovary 3 1329 26.3% 1617 32.0%
Ovary 4 791 15.7% 1060 21.0%
Ovary U 1474 29.2% 670 13.3%
Pancreas 1 358 4.5% 662 8.3%
Pancreas 2 629 7.8% 804 10.0%
Pancreas 3 752 9.4% 1040 13.0%
Pancreas 4 2058 25.7% 4100 51.2%
Pancreas U 4218 52.6% 1409 17.6%
Prostate 1 11695 25.3% 16196 35.0%
Prostate 2 5584 12.1% 6518 14.1%
Prostate 3 10446 22.6% 11615 25.1%
Prostate 4 5709 12.3% 8084 17.5%
Prostate U 12879 27.8% 3900 8.4%
Stomach 1 327 8.3% 340 8.6%
Stomach 2 388 9.8% 466 11.8%
Stomach 3 646 16.4% 712 18.0%
Stomach 4 1147 29.0% 1665 42.2%
Stomach U 1441 36.5% 766 19.4%
Uterus 1 4657 58.7% 5335 67.3%
Uterus 2 512 6.5% 537 6.8%
Uterus 3 734 9.3% 817 10.3%
Uterus 4 518 6.5% 552 7.0%
Uterus U 1510 19.0% 690 8.7%
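The percentage columns in Table 4 are each stage group's count divided by the total for that cancer group and registration type. A minimal sketch, using the Rapid Registration breast counts from the table (the function name is hypothetical):

```python
def stage_percentages(counts: dict) -> dict:
    """Convert stage group counts into percentages of the group total."""
    total = sum(counts.values())
    return {stage: round(100 * n / total, 1) for stage, n in counts.items()}


# Rapid Registration counts for breast cancer, taken from Table 4.
breast_rapid = {"1": 14307, "2": 13321, "3": 3282, "4": 1283, "U": 12225}
print(stage_percentages(breast_rapid))
# {'1': 32.2, '2': 30.0, '3': 7.4, '4': 2.9, 'U': 27.5}
```

The output agrees with the 'Percentage (Rapid)' values shown for breast cancer in the table.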

In Tables 5a-n below, the distribution of the stage allocations between the Rapid Registrations and the Gold Standard Registrations is examined.

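The figures indicate the proportion of agreement at the 1-digit TNM stage group level, where the stage is known in the Rapid Registrations dataset. In each table the columns give the stage group recorded in the Rapid Registrations (RCRD) and the rows give the stage group recorded in the Gold Standard Registrations (NCRD), so each column sums to approximately 100%. For stages 1-4, the Rapid Registrations stage agrees with the gold standard stage in a high proportion of tumours.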

For example, when examining the subset of Rapid Registrations breast tumours that are identified as TNM stage 1 (32%), approximately 89% of these are found to be TNM stage group 1 in the gold standard registration data, with another 11% distributed across TNM stages 2-4 and the unknown or unstageable groups.

For many but not all combinations of tumour site and stage (late stage breast cancer is one exception), roughly 85% or more of staged cases in the Rapid Registrations table have the same stage grouping as the equivalent tumour in the standard registration data. This can be seen in the tables below by inspecting the figures where the stage group for the Rapid Registrations and the Gold Standard Registrations is the same.

Where the stage is labelled as unknown or unstageable in the rapid pathway dataset it is known for at least 70% of those cases in the gold standard data.
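The stage comparison tables can be thought of as column-normalised cross-tabulations of matched tumours. The sketch below shows one way to build such a table from pairs of stage groups. It is an illustration under assumptions: the input layout (a list of RCRD stage and NCRD stage pairs for matched tumours) and the function name are hypothetical.

```python
from collections import Counter

STAGES = ["1", "2", "3", "4", "U"]


def agreement_table(pairs):
    """Build table[ncrd_stage][rcrd_stage] as a percentage of each RCRD column.

    pairs: iterable of (rcrd_stage, ncrd_stage), one pair per matched tumour.
    """
    pairs = list(pairs)
    cell_counts = Counter(pairs)
    column_totals = Counter(rcrd for rcrd, _ in pairs)
    table = {}
    for ncrd in STAGES:
        table[ncrd] = {}
        for rcrd in STAGES:
            if column_totals[rcrd] == 0:
                table[ncrd][rcrd] = None  # no tumours with this RCRD stage
            else:
                share = 100 * cell_counts[(rcrd, ncrd)] / column_totals[rcrd]
                table[ncrd][rcrd] = round(share, 1)
    return table
```

In the resulting table, the entries where the row and column stage are the same correspond to the agreement percentages discussed above.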

Tables 5a-n: Stage comparison between Rapid Registrations and Gold Standard Registrations by cancer site

Stage comparison between RCRD and NCRD
a. bladder (ICD-10 C67)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 84.4% 4.2% 7.8% 5.7% 16.3%
2 3.8% 71.5% 15.3% 6.4% 8.5%
3 2.6% 10.8% 64.0% 4.9% 5.4%
4 1.2% 5.0% 5.3% 77.0% 6.4%
U 7.9% 8.6% 7.6% 6.0% 63.4%
Stage comparison between RCRD and NCRD
b. breast (ICD-10 C50)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 88.5% 5.1% 1.8% 4.3% 25.5%
2 6.7% 88.0% 11.5% 16.3% 28.7%
3 0.6% 2.7% 78.9% 5.7% 5.0%
4 0.2% 0.9% 2.9% 67.7% 7.1%
U 4.0% 3.4% 4.9% 6.0% 33.7%
Stage comparison between RCRD and NCRD
c. colorectum (ICD-10 C18-C20)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 84.8% 2.2% 1.9% 0.8% 13.1%
2 5.7% 85.4% 5.5% 1.4% 12.0%
3 6.6% 7.3% 84.9% 4.5% 16.1%
4 0.9% 2.9% 5.7% 92.0% 26.7%
U 2.0% 2.3% 2.1% 1.4% 32.1%
Stage comparison between RCRD and NCRD
d. kidney (ICD-10 C64)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 90.9% 6.7% 3.3% 2.1% 32.1%
2 0.5% 77.0% 1.0% 0.8% 5.3%
3 1.8% 6.7% 85.9% 4.2% 11.4%
4 0.5% 3.3% 5.9% 90.7% 24.9%
U 6.3% 6.2% 4.0% 2.1% 26.3%
Stage comparison between RCRD and NCRD
e. lung (ICD-10 C33-C34)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 92.7% 6.9% 1.4% 0.6% 10.1%
2 2.8% 83.3% 1.8% 0.4% 3.1%
3 1.9% 5.1% 90.0% 1.3% 11.4%
4 1.4% 3.2% 5.5% 97.1% 40.8%
U 1.2% 1.5% 1.3% 0.6% 34.6%
Stage comparison between RCRD and NCRD
f. melanoma (ICD-10 C43)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 93.6% 2.8% 8.3% 14.3% 57.4%
2 2.3% 78.0% 10.8% 18.8% 14.5%
3 2.0% 11.6% 73.4% 14.8% 6.6%
4 0.2% 1.6% 2.5% 40.8% 5.3%
U 1.9% 6.0% 5.0% 11.2% 16.2%
Stage comparison between RCRD and NCRD
g. oesophagus (ICD-10 C15)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 79.9% 5.0% 0.5% 0.2% 5.4%
2 7.4% 49.7% 3.4% 1.0% 4.9%
3 2.3% 34.7% 68.5% 6.2% 10.6%
4 1.0% 5.2% 21.7% 83.1% 29.5%
U 9.4% 5.5% 5.9% 9.5% 49.5%
Stage comparison between RCRD and NCRD
h. ovary (ICD-10 C56-C57)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 96.1% 7.2% 0.9% 0.3% 16.4%
2 0.5% 88.6% 0.5% 0.1% 2.4%
3 0.9% 1.9% 90.9% 10.5% 21.0%
4 0.5% 0.4% 4.4% 84.1% 22.3%
U 1.9% 1.9% 3.2% 5.1% 37.9%
Stage comparison between RCRD and NCRD
i. prostate (ICD-10 C61)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 85.8% 10.5% 4.7% 1.6% 38.8%
2 6.8% 81.5% 2.7% 0.9% 6.6%
3 4.3% 4.3% 85.7% 2.9% 13.7%
4 0.8% 0.8% 3.9% 92.2% 17.6%
U 2.3% 2.8% 3.1% 2.5% 23.3%
Stage comparison between RCRD and NCRD
j. stomach (ICD-10 C16)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 67.3% 4.6% 0.8% 0.1% 6.7%
2 18.7% 64.7% 9.9% 0.8% 5.6%
3 5.8% 19.3% 68.6% 3.2% 9.6%
4 2.1% 6.7% 16.7% 93.4% 31.4%
U 6.1% 4.6% 4.0% 2.5% 46.7%
Stage comparison between RCRD and NCRD
k. uterus (ICD-10 C54-C55)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 96.7% 10.5% 5.7% 7.1% 46.2%
2 0.7% 83.0% 1.2% 2.5% 3.8%
3 0.5% 2.1% 86.8% 7.3% 7.1%
4 0.2% 1.8% 2.5% 75.1% 8.4%
U 1.9% 2.5% 3.8% 7.9% 34.5%
Stage comparison between RCRD and NCRD
l. pancreas (ICD-10 C25)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 72.6% 3.5% 0.9% 0.4% 8.7%
2 15.4% 74.1% 2.4% 0.5% 6.0%
3 4.7% 12.2% 88.2% 0.6% 6.4%
4 3.4% 5.6% 6.0% 97.1% 47.6%
U 3.9% 4.6% 2.5% 1.3% 31.3%
Stage comparison between RCRD and NCRD
m. lymphoma, staged (ICD-10 C81-C86, C88)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 89.3% 1.4% 0.7% 0.6% 13.9%
2 0.9% 92.3% 1.4% 0.5% 10.7%
3 0.8% 1.4% 88.4% 1.7% 13.1%
4 6.4% 2.7% 7.2% 91.8% 35.3%
U 2.7% 2.3% 2.3% 5.4% 27.0%
Stage comparison between RCRD and NCRD
n. cervical (ICD-10 C53)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 58.3% 3.5% 3.5% 3.8% 17.9%
2 1.2% 75.3% 2.3% 5.0% 8.3%
3 1.1% 3.3% 73.4% 1.9% 6.0%
4 0.4% 1.4% 2.9% 67.6% 7.9%
U 38.9% 16.5% 17.9% 21.8% 60.0%

“Early” vs “Late” stage

Table 6 below repeats the above tabulations, now grouping Rapid and Gold Standard cancers into “Early” (TNM stage group 1 & 2) or “Late” (TNM stage group 3 & 4) categories. For example, 62% of breast cancers are identified as “Early” stage in the Rapid Registrations dataset compared to 75% in the Gold Standard Registration data, due to the higher proportion of “Unknown” stage tumours (28% vs 12% respectively).

As with the more detailed stage data, there is a high degree of concordance between the gold standard and rapid registration stage fields if a known stage can be identified.
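The Early/Late grouping and the percentage columns of Table 6 can be illustrated with a minimal sketch. The function names and the assumption that stage groups arrive as the strings "1" to "4" are hypothetical and are not taken from the RCRD build code; the counts are the breast rows of Table 6.

```python
from typing import Dict, Optional


def stage_category(stage_group: Optional[str]) -> str:
    """Collapse a TNM stage group into Early (1 & 2), Late (3 & 4) or Unknown."""
    # Assumption: the stage group is one of the strings "1" to "4";
    # anything else (None, "U", blank) is treated as Unknown.
    if stage_group in ("1", "2"):
        return "Early"
    if stage_group in ("3", "4"):
        return "Late"
    return "Unknown"


def percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Express each category count as a percentage of the total, to 1 decimal place."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


# Breast counts copied from Table 6.
breast_rapid = {"Early": 27628, "Late": 4565, "Unknown": 12225}
breast_gold = {"Early": 33341, "Late": 5689, "Unknown": 5388}

print(percentages(breast_rapid))
print(percentages(breast_gold))
```

Run on the breast counts, this reproduces the 62.2% (Rapid) and 75.1% (Gold Standard) “Early” proportions shown in Table 6.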

Table 6: Summary proportions of “Early” vs “Late” stage for Rapid Registrations and Gold Standard Registrations

Summary proportions of Early vs Late stage for Rapid Registrations and Gold Standard Registrations
Broad Cancer Group Stage Group Count (Rapid) Percentage (Rapid) Count (Gold Standard) Percentage (Gold Standard)
Bladder Early 4130 42.9% 4739 49.2%
Bladder Late 832 8.6% 1535 15.9%
Bladder Unknown 4673 48.5% 3361 34.9%
Breast Early 27628 62.2% 33341 75.1%
Breast Late 4565 10.3% 5689 12.8%
Breast Unknown 12225 27.5% 5388 12.1%
Cervical Early 1642 62.9% 1233 47.3%
Cervical Late 435 16.7% 426 16.3%
Cervical Unknown 532 20.4% 950 36.4%
Colorectum Early 11953 36.4% 13197 40.2%
Colorectum Late 13475 41.0% 16765 51.0%
Colorectum Unknown 7414 22.6% 2880 8.8%
Kidney Early 2833 34.4% 3884 47.2%
Kidney Late 2056 25.0% 3220 39.1%
Kidney Unknown 3342 40.6% 1127 13.7%
Lung Early 8773 24.3% 9294 25.7%
Lung Late 22305 61.8% 24772 68.6%
Lung Unknown 5023 13.9% 2035 5.6%
Lymphoma Early 1876 15.3% 3366 27.5%
Lymphoma Late 3900 31.9% 6905 56.4%
Lymphoma Unknown 6458 52.8% 1963 16.0%
Melanoma Early 8728 66.3% 10865 82.5%
Melanoma Late 705 5.4% 1388 10.5%
Melanoma Unknown 3733 28.4% 913 6.9%
Oesophagus Early 1787 21.7% 1399 17.0%
Oesophagus Late 4271 51.8% 5312 64.4%
Oesophagus Unknown 2189 26.5% 1536 18.6%
Ovary Early 1452 28.8% 1699 33.7%
Ovary Late 2120 42.0% 2677 53.1%
Ovary Unknown 1474 29.2% 670 13.3%
Pancreas Early 987 12.3% 1466 18.3%
Pancreas Late 2810 35.1% 5140 64.1%
Pancreas Unknown 4218 52.6% 1409 17.6%
Prostate Early 17279 37.3% 22714 49.0%
Prostate Late 16155 34.9% 19699 42.5%
Prostate Unknown 12879 27.8% 3900 8.4%
Stomach Early 715 18.1% 806 20.4%
Stomach Late 1793 45.4% 2377 60.2%
Stomach Unknown 1441 36.5% 766 19.4%
Uterus Early 5169 65.2% 5872 74.0%
Uterus Late 1252 15.8% 1369 17.3%
Uterus Unknown 1510 19.0% 690 8.7%

Tables 7a-n below examine the distribution of the stage allocation between the Rapid Registrations and the Gold Standard Registrations, aggregated into Early and Late stage.

Tables 7a-n: “Early” vs “late” stage comparison between Rapid Registrations and Gold Standard Registrations

Early vs late stage comparison between RCRD and NCRD
a. bladder (ICD-10 C67)
Stage Category
RCRD
NCRD Early Late Unknown
Early 82.7% 19.6% 24.8%
Late 9.1% 73.3% 11.8%
Unknown 8.2% 7.1% 63.4%
Early vs late stage comparison between RCRD and NCRD
b. breast (ICD-10 C50)
Stage Category
RCRD
NCRD Early Late Unknown
Early 94.2% 15.4% 54.2%
Late 2.1% 79.4% 12.1%
Unknown 3.7% 5.2% 33.7%
Early vs late stage comparison between RCRD and NCRD
c. colorectum (ICD-10 C18-C20)
Stage Category
RCRD
NCRD Early Late Unknown
Early 88.8% 5.4% 25.0%
Late 9.1% 92.8% 42.9%
Unknown 2.1% 1.8% 32.1%
Early vs late stage comparison between RCRD and NCRD
d. kidney (ICD-10 C64)
Stage Category
RCRD
NCRD Early Late Unknown
Early 90.2% 3.8% 37.4%
Late 3.5% 92.8% 36.3%
Unknown 6.3% 3.4% 26.3%
Early vs late stage comparison between RCRD and NCRD
e. lung (ICD-10 C33-C34)
Stage Category
RCRD
NCRD Early Late Unknown
Early 94.0% 1.7% 13.2%
Late 4.8% 97.4% 52.2%
Unknown 1.3% 0.8% 34.6%
Early vs late stage comparison between RCRD and NCRD
f. melanoma (ICD-10 C43)
Stage Category
RCRD
NCRD Early Late Unknown
Early 91.8% 23.5% 71.9%
Late 5.2% 69.5% 11.9%
Unknown 3.0% 7.0% 16.2%
Early vs late stage comparison between RCRD and NCRD
g. oesophagus (ICD-10 C15)
Stage Category
RCRD
NCRD Early Late Unknown
Early 60.1% 2.3% 10.3%
Late 33.7% 89.7% 40.2%
Unknown 6.2% 8.0% 49.5%
Early vs late stage comparison between RCRD and NCRD
h. ovary (ICD-10 C56-C57)
Stage Category
RCRD
NCRD Early Late Unknown
Early 96.5% 1.0% 18.7%
Late 1.6% 95.0% 43.4%
Unknown 1.9% 3.9% 37.9%
Early vs late stage comparison between RCRD and NCRD
i. prostate (ICD-10 C61)
Stage Category
RCRD
NCRD Early Late Unknown
Early 92.4% 5.6% 45.4%
Late 5.1% 91.5% 31.3%
Unknown 2.5% 2.9% 23.3%
Early vs late stage comparison between RCRD and NCRD
j. stomach (ICD-10 C16)
Stage Category
RCRD
NCRD Early Late Unknown
Early 76.9% 4.4% 12.3%
Late 17.8% 92.5% 41.0%
Unknown 5.3% 3.1% 46.7%
Early vs late stage comparison between RCRD and NCRD
k. uterus (ICD-10 C54-C55)
Stage Category
RCRD
NCRD Early Late Unknown
Early 97.0% 8.1% 50.0%
Late 1.0% 86.4% 15.5%
Unknown 1.9% 5.5% 34.5%
Early vs late stage comparison between RCRD and NCRD
l. pancreas (ICD-10 C25)
Stage Category
RCRD
NCRD Early Late Unknown
Early 81.4% 1.6% 14.7%
Late 14.3% 96.8% 54.0%
Unknown 4.4% 1.6% 31.3%
Early vs late stage comparison between RCRD and NCRD
m. lymphoma (ICD-10 C81-C86, C88)
Stage Category
RCRD
NCRD Early Late Unknown
Early 91.9% 1.4% 24.6%
Late 5.6% 94.2% 48.4%
Unknown 2.5% 4.5% 27.0%
Early vs late stage comparison between RCRD and NCRD
n. cervical (ICD-10 C53)
Stage Category
RCRD
NCRD Early Late Unknown
Early 64.6% 7.6% 26.1%
Late 2.3% 72.2% 13.9%
Unknown 33.1% 20.2% 60.0%

Stage completeness by snapshot

Figure 16 shows the completeness of stage by tumour type for one snapshot per quarter. Stage completeness continues to increase and lags behind the incidence completeness due to staging activity happening up to several months after diagnosis.

Figure 16: Stage completeness by snapshot

A series of individual line charts for thirteen common cancers, showing the proportion of tumours staged by month, broken down by snapshot. For most tumour types, stage completeness has increased gradually over time.

Data completeness, source, and mortality comparisons

Counts of missing data

Figure 17 shows the count of tumours per month where the indicated data item is missing. The data items are: basis of diagnosis, birth date best, ethnic category, NHS number, postcode, quintile 2019, gender and trust code. Larger counts in the most recent months are to be expected.

Figure 17: Counts of missing data

A set of line charts for eight data items. Basis of diagnosis, ethnic category and gender show no counts of missing data. For postcode and quintile, larger counts are seen for more recent months. For other data items, the trends are variable over time.

Ethnicity completeness

Figure 18 shows the count of tumours per month where the indicated data item is missing. Larger counts in the most recent months are to be expected.

Figure 18: Ethnicity completeness

A set of line charts, one for each of seventeen ethnic category codes, showing the number of tumours with complete ethnicity data over time. The completeness count increases from early 2020 and then decreases in more recent months.

Tumour source

Figure 19 shows the number of tumours per month by the source of the diagnosis, i.e. which dataset was used to create them.

Figure 19: Tumour source dataset

Four line charts of the tumour count from each source of diagnosis. For COSD, CWT and HES, there is a decrease in count over March 2020, after which the count increases. For DCO, the count is stable. For all sources, the number decreases for recent months.

Mortality proportion by month

Figure 20 shows, by month, the proportion of patients who died within 30, 90 and 182 days of diagnosis in the RCRD compared to the NCRD, for all cancers included in the RCRD excluding C44 and D06.

Figure 20: Monthly mortality proportions at 30, 90 and 182 days

A line graph comparing the proportion of 30 day mortality over time between the rapid and national cancer registration data. The proportions follow similar trends and, in more recent months, are lower for the national cancer registration data.

A line graph comparing the proportion of 90 day mortality over time between the rapid and national cancer registration data.

A line graph comparing the proportion of 182 day mortality over time between the rapid and national cancer registration data. The proportion is mostly higher for the national registration data, and then decreases for more recent months.

Figure 21 shows the proportions of route to diagnosis by month, for all cancers included in the RCRD excluding C44 and D06.

Figure 21: Monthly Route to Diagnosis

A line graph showing the proportions of route to diagnosis by month, for all cancers included in the RCRD excluding C44 and D06.

Appendix 1 - List of pathway events

Table A1: AT_RAPID_PATHWAY: event list

AT_RAPID_PATHWAY: event list
EVENT_TYPE EVENT_DESC EVENT_PROPERTY_1 EVENT_PROPERTY_2 EVENT_PROPERTY_3 EVENT_DATE Linkage
1 CWT Treatment Period Start Date CWT First Treatment Flag CWT SITE_ICD10 CWT Cancer Treatment Event Type Treat period start NHSNUMBER
2 CWT Treatment Start CWT Treatment Modality CWT Cancer Treatment Event type Treatment start date NHSNUMBER
3 CWT MDT Begin CWT MDT Cancer Care Plan discussed indicator MDT date NHSNUMBER
4 CWT Faster Diagnosis Period End (null) Faster Diagnosis Period site Faster Diagnosis Period end date NHSNUMBER
5 HES Admitted Patient Care Episode Treatment speciality : Admission Method : HES ethnicity All ICD-10 codes (for episode) All OPCS-4 codes (for episode) Episode Start date - Episode end date NHSNUMBER
6 HES Admitted Patient Care Operation OPCS codes (for date) in POS order ICD-10 codes (for episode) Operation date NHSNUMBER
7 SACT Cycle Benchmark group Cycle number Treatment intent Cycle start date PATIENTID
8 RTDS Episode Radiotherapy intent ICD-10 diagnosis code Episode treatment start date PATIENTID
9 Tumour diagnosis (Provisional) Statusofregistration ICD-10 diagnosis code Stage_best Diagnosisdatebest PATIENTID
11 HES major surgery (historical) OPCS-4 code ICD-10 diagnosis code Operation date NHSNUMBER
12 HES major surgery (historical, further constraints) OPCS-4 code ICD-10 diagnosis code Further notes/constraints Operation date NHSNUMBER
14 RAWDATA major surgery (historical) OPCS-4 code ICD-10 diagnosis code Operation date PATIENTID
15 RAWDATA major surgery (historical, further constraints) OPCS-4 code ICD-10 diagnosis code Further notes/constraints Operation date PATIENTID
17 Prior tumour diagnosis Statusofregistration ICD-10 diagnosis code Stage_best Diagnosisdatebest PATIENTID
18 Tumour diagnosis (Final) Statusofregistration ICD-10 diagnosis code Stage_best Diagnosisdatebest PATIENTID
19 Patient vital status date Vitalstatus ICD-10 Underlying cause of death Death location code Vitalstatusdate PATIENTID
20 RAWDATA holistic needs assessment record HNA point of pathway : HNA offered : HNA staff role Primary diagnosis Laterality Date of HNA PATIENTID
21 RAWDATA staging Inferred best stage ICD-10 diagnosis code T/N/M components for pre-treatment/pathological/integrated stage Collected stage date PATIENTID
22 CWT First Seen Source of referral Categorisation of TWW, screening and consultant upgrade cases, where relevant Suspected cancer referral type Date first seen NHSNUMBER
23 HES diagnostic event OPCS-4 code Description BX/LD Operation date NHSNUMBER
24 RAWDATA personal care and support plan PCSP point of pathway : PCSP offered : PCSP staff role Primary diagnosis Laterality PCSP date PATIENTID
25 RAWDATA end of treatment summary Primary diagnosis Laterality eots_date PATIENTID
50 Skeleton Tumour creation E_base_record type (COSD = England, CANISC = Wales) ICD-10 diagnosis code Diagnosisdate PATIENTID
51 Diagnosis reported in COSD Number of times reported ICD-10 diagnosis code E_base_record type Diagnosisdate NHSNUMBER
52 CWT estimated diagnosis date CWT First Treatment Flag CWT recorded primary diagnosis (ICD) CWT Cancer Treatment Event Type Adjusted treat period start NHSNUMBER
53 HES inferred tumour HES cancer group ICD-10 diagnosis code Episode start date NHSNUMBER
54 COSD diagnosis submission E_base_record primary diagnoses ICD-10 diagnosis code (submission) Diagnosis date (submission) PATIENTID
55 RAWDATA biopsy record Laterality ICD-10 diagnosis code Collected date/authorised date PATIENTID
56 RAWDATA imaging record Laterality ICD-10 diagnosis code Procedure_date - diagdate Diagdate PATIENTID
57 RAWDATA HNA diagnosis Laterality Primary diagnosis (ICD-10) Diagdate PATIENTID
101 Inferred diagnosis Event_property_1 from source record (event 19, 52, 53, 54) ICD-10 diagnosis code Cancer group First recorded date PATIENTID
102 Inferred diagnosis, with adapted diagnosis dates (from 101, adapted) RCRD inferred/ derived diagnosis, with adapted diagnosis dates - using event 101 diagnoses, adapting diagnosis dates for earlier records in 90-days preceding 101 diagnosis date source_id from second record used to adapt diagnosis date ICD-10 diagnosis code Cancer group PATIENTID
103 Alternative inferred diagnosis (work in progress) RCRD inferred/ derived diagnosis based on combination of multiple data sources - using an alternative approach to 101 (approach still being refined) source_id from second record used to adapt diagnosis date ICD-10 diagnosis code Cancer group PATIENTID
*: Data dictionary: Primary cancer site for cancer faster diagnosis pathway
**: Data dictionary: Holistic needs assessment point of pathway for cancer


Appendix 2 - List of Rapid Registration fields available

Table A2: AT_RAPID_TUMOUR: field list

AT_RAPID_TUMOUR: field list
COLUMN_NAME DATA_TYPE Notes
INDIVIDUALID NUMBER(19,0) Matches AT_RAPID_PATHWAY for each event with event_type=101
PATIENTID NUMBER(19,0) Matches AT_RAPID_PATHWAY for each event with event_type=101
NHSNUMBER VARCHAR2(12 BYTE) Matches AT_RAPID_PATHWAY for each event with event_type=101
TUMOUR_AVPID NUMBER Matches AT_RAPID_PATHWAY for each event with event_type=101
DIAGNOSISDATE DATE Matches AT_RAPID_PATHWAY for each event with event_type=101
TUMOUR_SITE VARCHAR2(260 CHAR) Matches AT_RAPID_PATHWAY for each event with event_type=101 (event_property_2)
BIRTHDATEBEST DATE Taken from Encore
AGE VARCHAR2(260 CHAR) Taken from Encore
GENDER VARCHAR2(260 CHAR) Taken from Encore
POSTCODE VARCHAR2(255 BYTE) Taken from Encore
SURNAME VARCHAR2(64 BYTE) Taken from Encore
FORENAME VARCHAR2(64 BYTE) Taken from Encore
STAGE VARCHAR2(260 CHAR) Defined for selected cancer sites
ETHNICCATEGORY VARCHAR2(1 CHAR) Taken from Encore or the HESAPC dataset
FINAL_ROUTE VARCHAR2(22 BYTE) Final Route to Diagnosis using an adapted version of the standard NCRAS methodology
QUINTILE_2019 VARCHAR2(120 BYTE) Index of Multiple Deprivation quintile defined using the standard NCRAS methodology
CHRL_TOT_27_03 NUMBER(10,0) Charlson score defined using the standard NCRAS methodology
TUMOUR_MORPHOLOGY VARCHAR2(5 CHAR) Tumour morphology as recorded in the COSD system
TUMOUR_PERFORMANCESTATUS VARCHAR2(1 CHAR) Patient performance status at time of diagnosis
BASISOFDIAGNOSIS VARCHAR2(260 CHAR) The basis of diagnosis (e.g. clinical; pathological; etc.)
LSOA11 VARCHAR2(27 BYTE) 2011 census LSOA of residence at time of diagnosis
LSOA21 VARCHAR2(27 BYTE) 2021 census LSOA of residence at time of diagnosis
DIAGNOSIS_TRUST VARCHAR2(260 CHAR) Trust of diagnosis
SOURCE VARCHAR2(11 CHAR) The dataset used as the primary source for the RCRD registration
SOURCE_ID VARCHAR2(69 CHAR) The unique ID of the record used as the primary source for the RCRD registration
VITALSTATUS VARCHAR2(260 CHAR) Records whether the patient is currently alive or deceased at the time of the snapshot.
VITALSTATUSDATE DATE The date of the last known vital status for the patient
CANCER_GROUP VARCHAR2(40 BYTE) Broad cancer group derived from TUMOUR_SITE, according to groupings used for RCRD derivation and RCRD dashboard
CANCER_GROUP_DETAILED VARCHAR2(40 BYTE) Detailed cancer group derived from TUMOUR_SITE, according to groupings used for RCRD dashboard
SURGERY_FLAG NUMBER Indicator flag (0 = No; 1 = Yes) for whether the patient has an associated surgical tumour resection record
SURGERY_DATE DATE Where the SURGERY_FLAG indicates one or more associated surgical tumour resection records, this is the earliest such date
RADIOTHERAPY_FLAG NUMBER Indicator flag (0 = No; 1 = Yes) for whether the patient has an associated radiotherapy record (from RTDS)
RADIOTHERAPY_DATE DATE Where the RADIOTHERAPY_FLAG indicates one or more associated radiotherapy records, this is the earliest such date
SACT_FLAG NUMBER Indicator flag (0 = No; 1 = Yes) for whether the patient has an associated systemic anti-cancer therapy (SACT) record (from SACT dataset)
SACT_DATE DATE Where the SACT_FLAG indicates one or more associated SACT records, this is the earliest such date


Appendix 3 - Cancer groups used for matching

Table A3: Rapid Registration ICD-10 tumour inclusion list

Rapid Registration ICD-10 tumour inclusion list
ICD CANCER_GROUP SCOPE ICD CANCER_GROUP SCOPE
C00 Head & Neck DQ & CD C54 Gynae DQ & CD
C01 Head & Neck DQ & CD C55 Gynae DQ & CD
C02 Head & Neck DQ & CD C56 Gynae DQ & CD
C03 Head & Neck DQ & CD C57 Gynae DQ & CD
C04 Head & Neck DQ & CD C58 Gynae DQ & CD
C05 Head & Neck DQ & CD C59 Other DQ & CD
C06 Head & Neck DQ & CD C60 Urology DQ & CD
C07 Head & Neck DQ & CD C61 Prostate DQ & CD
C08 Head & Neck DQ & CD C62 Urology DQ & CD
C09 Head & Neck DQ & CD C63 Urology DQ & CD
C10 Head & Neck DQ & CD C64 Urology DQ & CD
C11 Head & Neck DQ & CD C65 Urology DQ & CD
C12 Head & Neck DQ & CD C66 Urology DQ & CD
C13 Head & Neck DQ & CD C67 Urology DQ & CD
C14 Head & Neck DQ & CD C68 Urology DQ & CD
C15 O-G DQ & CD C69 Brain & CNS DQ & CD
C16 O-G DQ & CD C70 Brain & CNS DQ & CD
C17 Upper GI DQ & CD C71 Brain & CNS DQ & CD
C18 Colorectal DQ & CD C72 Brain & CNS DQ & CD
C19 Colorectal DQ & CD C73 Endocrine DQ & CD
C20 Colorectal DQ & CD C74 Endocrine DQ & CD
C21 Colorectal DQ & CD C75 Endocrine DQ & CD
C22 Upper GI DQ & CD C76 Unknown Primary DQ & CD
C23 Upper GI DQ & CD C77 Unknown Primary DQ & CD
C24 Upper GI DQ & CD C78 Unknown Primary DQ & CD
C25 Upper GI DQ & CD C79 Unknown Primary DQ & CD
C26 Upper GI DQ & CD C80 Unknown Primary DQ & CD
C27 Other DQ & CD C81 Haematological DQ & CD
C28 Other DQ & CD C82 Haematological DQ & CD
C29 Other DQ & CD C83 Haematological DQ & CD
C30 Head & Neck DQ & CD C84 Haematological DQ & CD
C31 Head & Neck DQ & CD C85 Haematological DQ & CD
C32 Head & Neck DQ & CD C86 Haematological DQ & CD
C33 Lung DQ & CD C87 Haematological DQ & CD
C34 Lung DQ & CD C88 Haematological DQ & CD
C35 Other DQ & CD C89 Haematological DQ & CD
C36 Other DQ & CD C90 Haematological DQ & CD
C37 Other DQ & CD C91 Haematological DQ & CD
C38 Lung DQ & CD C92 Haematological DQ & CD
C39 Lung DQ & CD C93 Haematological DQ & CD
C40 Bone & ST DQ & CD C94 Haematological DQ & CD
C41 Bone & ST DQ & CD C95 Haematological DQ & CD
C42 Other DQ & CD C96 Haematological DQ & CD
C43 Melanoma DQ & CD C97 Unknown Primary DQ & CD
C44 NMSC - D05 Breast DQ
C45 Lung DQ & CD D06 Gynae -
C46 Bone & ST DQ & CD D09 Urology DQ
C47 Brain & CNS DQ & CD D32 Brain & CNS DQ
C48 Gynae DQ & CD D33 Brain & CNS DQ
C49 Bone & ST DQ & CD D35 Brain & CNS DQ
C50 Breast DQ & CD D41 Urology DQ
C51 Gynae DQ & CD D42 Brain & CNS DQ
C52 Gynae DQ & CD D43 Brain & CNS DQ
C53 Gynae DQ & CD D44 Brain & CNS DQ
Scope: DQ = ‘Included in this data quality document’; CD = ‘Included in cancerdata.nhs.uk/covid-19/rcrd dashboard’
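As an illustration only, the grouping in Table A3 can be expressed as a lookup from a 3-character ICD-10 code to its broad cancer group. The names `C_RANGES`, `D_CODES` and `cancer_group` are hypothetical. The sketch also ignores any additional rules in the RCRD build; for example, the change log in Appendix 9 notes that C48 assignment depends on patient gender, whereas here C48 is simply mapped to the group listed in the table.

```python
from typing import Optional

# Inclusive ranges of C-code numbers and their broad cancer group (from Table A3).
C_RANGES = [
    (0, 14, "Head & Neck"), (15, 16, "O-G"), (17, 17, "Upper GI"),
    (18, 21, "Colorectal"), (22, 26, "Upper GI"), (27, 29, "Other"),
    (30, 32, "Head & Neck"), (33, 34, "Lung"), (35, 37, "Other"),
    (38, 39, "Lung"), (40, 41, "Bone & ST"), (42, 42, "Other"),
    (43, 43, "Melanoma"), (44, 44, "NMSC"), (45, 45, "Lung"),
    (46, 46, "Bone & ST"), (47, 47, "Brain & CNS"), (48, 48, "Gynae"),
    (49, 49, "Bone & ST"), (50, 50, "Breast"), (51, 58, "Gynae"),
    (59, 59, "Other"), (60, 60, "Urology"), (61, 61, "Prostate"),
    (62, 68, "Urology"), (69, 72, "Brain & CNS"), (73, 75, "Endocrine"),
    (76, 80, "Unknown Primary"), (81, 96, "Haematological"),
    (97, 97, "Unknown Primary"),
]

# D codes listed individually in Table A3.
D_CODES = {
    "D05": "Breast", "D06": "Gynae", "D09": "Urology",
    "D32": "Brain & CNS", "D33": "Brain & CNS", "D35": "Brain & CNS",
    "D41": "Urology", "D42": "Brain & CNS", "D43": "Brain & CNS",
    "D44": "Brain & CNS",
}


def cancer_group(icd10: str) -> Optional[str]:
    """Return the Table A3 cancer group for an ICD-10 code, or None if not listed."""
    code = icd10.strip().upper()[:3]  # keep only the 3-character category
    if code in D_CODES:
        return D_CODES[code]
    if len(code) == 3 and code[0] == "C" and code[1:].isdigit():
        number = int(code[1:])
        for low, high, group in C_RANGES:
            if low <= number <= high:
                return group
    return None


print(cancer_group("C50.9"))
print(cancer_group("D41"))
```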


Appendix 4 - Alternative defining events

Several options were considered as the defining events for the Rapid Registrations. Standalone datasets, subsets of standalone datasets, and combined datasets were explored and their false negative error (FNE) and false positive error (FPE) figures quantified. A subset of these alternatives is presented below as a demonstration of the process, but the majority of this exploratory work is out of scope for this document.

Candidates for diagnosis events from the three main datasets that are rapidly available and have nominally full coverage of cancer patients are shown below (SACT and RTDS were also examined but the data is not presented). Of the three, the CWT data has the best FPE, but its FNE is substantially higher than that of the COSD dataset. HES produced the worst results on both measures. A filtering process was applied to the standalone COSD data to remove apparently new diagnoses that were actually recurrences of prior tumours. This improved the FPE at the cost of increasing the FNE. We continue to test whether this process can be further refined to improve the combined FPE and FNE figures, and monitor changes in the underlying datasets that might also give new opportunities to do so.

Table A4: Rapid Cancer Registrations: alternative defining events

Rapid Cancer Registrations: alternative defining events
Event FPE FNE
Event 52 - standalone CWT 7.6% 28.3%
Event 53 - standalone HES 13.2% 38.9%
Event 54 - standalone COSD 8.1% 15.8%
Event 101 (up to cas2106) - filtered COSD 5.2% 17.7%
Event 101 (cas2107) - filtered combined COSD/CWT 5.6% 16.4%
Event 101 (cas2108) - filtered combined COSD/CWT 5.1% 16.5%
Event 101 (cas2109) - filtered combined COSD/CWT 5.1% 16.6%
Event 101 (cas2110) - filtered combined COSD/CWT/HES 5.1% 14.7%
Event 101 (cas2111) - filtered combined COSD/CWT/HES 6.2% 13.4%
Event 101 (cas2112 to cas2202) - filtered combined COSD/CWT/HES and Death Certificates Only 5.3% 13.4%
Event 101 (cas2203 to cas2204) - filtered combined COSD/CWT/HES and Death Certificates Only 6.3% 12.2%
Event 101 (cas2205) - filtered combined COSD/CWT/HES and Death Certificates Only 6.1% 12.3%
Event 101 (cas2206) - filtered combined COSD/CWT/HES and Death Certificates Only 5.6% 12.5%
Event 101 (cas2207) - filtered combined COSD/CWT/HES and Death Certificates Only 6.0% 11.8%
Event 101 (cas2208 to cas2210) - filtered combined COSD/CWT/HES and Death Certificates Only 6.0% 11.6%
Event 101 (cas2211 to cas2304) - filtered combined COSD/CWT/HES and Death Certificates Only 6.1% 11.5%
Event 101 (cas2305) - filtered combined COSD/CWT/HES and Death Certificates Only 6.1% 11.3%
Event 101 (cas2306) - filtered combined COSD/CWT/HES and Death Certificates Only 6.1% 11.4%
Event 101 (cas2307 to cas2308) - filtered combined COSD/CWT/HES and Death Certificates Only 6.1% 11.3%
Event 101 (cas2309 to cas2311) - filtered combined COSD/CWT/HES and Death Certificates Only 6.1% 11.4%
Event 101 (cas2312 to cas2409) - filtered combined COSD/CWT/HES and Death Certificates Only 6.0% 11.5%
Event 101 (cas2410) - filtered combined COSD/CWT/HES and Death Certificates Only 5.9% 11.7%
Event 101 (cas2411) - filtered combined COSD/CWT/HES and Death Certificates Only 5.8% 12.3%
Event 101 (cas2412 to cas2501) - filtered combined COSD/CWT/HES and Death Certificates Only 6.0% 12.0%
Event 101 (cas2504 to cas2505) - filtered combined COSD/CWT/HES and Death Certificates Only 6.0% 12.1%
Event 101 (cas2505 to cas2506) - filtered combined COSD/CWT/HES and Death Certificates Only 6.0% 12.0%
Event 101 (cas2507 to cas2602) - filtered combined COSD/CWT/HES and Death Certificates Only 6.0% 12.1%


Appendix 5 - Counts and error tabulations

Figure A1 shows an example for a very small dataset of how counts and error proportions are derived. This dataset has 10 Gold Standard Registrations and 7 Rapid Registrations overall (both indicated by the dots in the figure, with time running vertically over the course of 2018 and Gold Standard vs Rapid Registrations divided horizontally). Successful linkages between Gold Standard and Rapid Registrations are indicated by blue lines. False negatives and false positives are indicated. Only tumours in the 6-month assessment period are included in the tabulations below, although these can link to tumours outside the period as shown, and many-to-one linkages are also allowed. The false negative rate is therefore 3 in 7 and the false positive rate 1 in 6 below.

Figure A1: Illustration of counts and errors tabulation

Diagram assessing gold standard registrations against the rapid registration dataset. Details of the graph as described in the text above.

Tables A5 and A6 below tabulate counts of Gold Standard and Rapid Registrations together with the numbers of false positive and false negative errors. When considering comparisons between figures the nature of the linkage and relationships displayed in the diagram above should be kept in mind.

Table A5: Counts and errors tabulation by cancer group

Counts and errors tabulation by cancer group
Cancer group Gold Standard (GS) Registrations Rapid Registrations Difference Percentage Rapid/GS FPE FNE
Brain & CNS 5795 5138 657 88.7% 711 1364
Breast 28970 27223 1747 94.0% 1489 1720
Colorectal 18981 17846 1135 94.0% 919 1699
Endocrine 1910 1484 426 77.7% 139 509
Gynae 9789 9313 476 95.1% 665 1010
Haematological 14051 12432 1619 88.5% 723 2366
Head & Neck 5288 4929 359 93.2% 400 698
Lung 21726 20106 1620 92.5% 626 2115
Melanoma 8259 7691 568 93.1% 698 1072
O-G 6622 6471 151 97.7% 366 476
Prostate 27229 25183 2046 92.5% 326 2469
Bone & Soft Tissue 1140 1082 58 94.9% 365 406
Unknown Primary 3428 2566 862 74.9% 665 1531
Upper GI 9272 8672 600 93.5% 801 1442
Urology 17025 14874 2151 87.4% 1016 2851
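The counts in Table A5 can be turned into overall error proportions with a short calculation. The sketch below assumes that the false negative proportion is taken over Gold Standard registrations and the false positive proportion over Rapid Registrations; this is an assumption about the denominators, although it is consistent with the worked example in Figure A1. The variable and function names are illustrative.

```python
# (Gold Standard registrations, Rapid Registrations, FPE count, FNE count),
# copied from Table A5.
TABLE_A5 = {
    "Brain & CNS": (5795, 5138, 711, 1364),
    "Breast": (28970, 27223, 1489, 1720),
    "Colorectal": (18981, 17846, 919, 1699),
    "Endocrine": (1910, 1484, 139, 509),
    "Gynae": (9789, 9313, 665, 1010),
    "Haematological": (14051, 12432, 723, 2366),
    "Head & Neck": (5288, 4929, 400, 698),
    "Lung": (21726, 20106, 626, 2115),
    "Melanoma": (8259, 7691, 698, 1072),
    "O-G": (6622, 6471, 366, 476),
    "Prostate": (27229, 25183, 326, 2469),
    "Bone & Soft Tissue": (1140, 1082, 365, 406),
    "Unknown Primary": (3428, 2566, 665, 1531),
    "Upper GI": (9272, 8672, 801, 1442),
    "Urology": (17025, 14874, 1016, 2851),
}


def overall_error_percentages(table):
    """Return (false negative %, false positive %) summed over all cancer groups."""
    gold = sum(row[0] for row in table.values())
    rapid = sum(row[1] for row in table.values())
    false_pos = sum(row[2] for row in table.values())
    false_neg = sum(row[3] for row in table.values())
    # Assumed denominators: Gold Standard count for FNE, Rapid count for FPE.
    return 100 * false_neg / gold, 100 * false_pos / rapid


fne_pct, fpe_pct = overall_error_percentages(TABLE_A5)
print(f"FNE {fne_pct:.1f}%, FPE {fpe_pct:.1f}%")
```

Under these assumed denominators the totals come to 12.1% false negatives and 6.0% false positives, matching the headline figures quoted in the main findings.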

Table A6: Counts and errors tabulation by cancer site

Counts and errors tabulation by cancer site
Cancer site Gold Standard (GS) Registrations Rapid Registrations Difference Percentage Rapid/GS FPE FNE
C00 111 150 -39 135.1% 65 25
C01 647 469 178 72.5% 13 61
C02 602 604 -2 100.3% 18 93
C03 234 108 126 46.2% 5 65
C04 256 239 17 93.4% 10 34
C05 211 185 26 87.7% 8 32
C06 270 287 -17 106.3% 21 50
C07 238 288 -50 121.0% 101 53
C08 82 89 -7 108.5% 15 14
C09 912 765 147 83.9% 16 59
C10 152 243 -91 159.9% 12 30
C11 113 111 2 98.2% 6 13
C12 157 99 58 63.1% 1 11
C13 142 130 12 91.5% 11 22
C14 25 66 -41 264.0% 15 13
C15 3996 4266 -270 106.8% 124 220
C16 2626 2205 421 84.0% 242 256
C17 820 669 151 81.6% 129 267
C18 12438 11760 678 94.5% 671 1239
C19 996 937 59 94.1% 43 90
C20 4900 4504 396 91.9% 115 328
C21 647 645 2 99.7% 90 42
C22 2646 2524 122 95.4% 266 454
C23 477 468 9 98.1% 29 62
C24 645 527 118 81.7% 32 84
C25 4533 4188 345 92.4% 137 499
C26 151 296 -145 196.0% 208 76
C30 162 158 4 97.5% 27 25
C31 93 65 28 69.9% 5 27
C32 881 873 8 99.1% 51 71
C33 13 11 2 84.6% 1 3
C34 20262 18740 1522 92.5% 550 1929
C37 169 94 75 55.6% 13 59
C38 73 354 -281 484.9% 46 21
C39 NA 12 NA NA% 4 NA
C40 119 107 12 89.9% 13 25
C41 117 150 -33 128.2% 78 44
C43 8259 7691 568 93.1% 698 1072
C45 1209 895 314 74.0% 12 103
C46 69 43 26 62.3% 3 25
C47 28 12 16 42.9% 5 22
C48 287 400 -113 139.4% 111 87
C49 835 782 53 93.7% 271 312
C50 25139 24355 784 96.9% 1356 1358
C51 645 594 51 92.1% 55 79
C52 95 109 -14 114.7% 16 12
C53 1321 1315 6 99.5% 56 86
C54 4084 3690 394 90.4% 106 199
C55 73 333 -260 456.2% 26 17
C56 2999 2528 471 84.3% 243 491
C57 275 320 -45 116.4% 34 38
C58 10 24 -14 240.0% 18 1
C60 304 315 -11 103.6% 50 37
C61 27229 25183 2046 92.5% 326 2469
C62 1056 1067 -11 101.0% 87 72
C63 33 31 2 93.9% 13 19
C64 4901 4369 532 89.1% 274 795
C65 419 321 98 76.6% 25 92
C66 362 260 102 71.8% 13 122
C67 4475 5056 -581 113.0% 146 681
C68 96 56 40 58.3% 6 40
C69 383 352 31 91.9% 48 64
C70 19 42 -23 221.1% 5 1
C71 2265 2114 151 93.3% 140 206
C72 83 90 -7 108.4% 35 19
C73 1734 1354 380 78.1% 80 414
C74 118 84 34 71.2% 26 57
C75 58 46 12 79.3% 33 38
C76 94 210 -116 223.4% 109 54
C77 271 126 145 46.5% 62 129
C78 593 54 539 9.1% 22 331
C79 231 133 98 57.6% 54 126
C80 2239 2043 196 91.2% 418 891
C81 894 864 30 96.6% 15 72
C82 1208 1046 162 86.6% 16 136
C83 3117 2702 415 86.7% 40 317
C84 395 230 165 58.2% 14 122
C85 1380 1023 357 74.1% 65 317
C86 NA 101 NA NA% 2 NA
C88 221 349 -128 157.9% 12 63
C90 2558 2211 347 86.4% 64 435
C91 2374 1905 469 80.2% 81 566
C92 1762 1570 192 89.1% 237 288
C93 23 187 -164 813.0% 23 1
C94 26 78 -52 300.0% 65 11
C95 51 66 -15 129.4% 10 12
C96 42 100 -58 238.1% 79 26
D05 3831 2868 963 74.9% 133 362
D09 5168 1267 3901 24.5% 245 918
D32 1481 1036 445 70.0% 87 514
D33 492 602 -110 122.4% 113 203
D35 494 550 -56 111.3% 197 148
D41 211 2132 -1921 1010.4% 157 75
D42 150 16 134 10.7% 4 38
D43 278 268 10 96.4% 54 76
D44 122 56 66 45.9% 23 73

Appendix 6 - False negative errors and basis of diagnosis

This appendix explores the reason for the overall age-dependence of the false negative error rate.

The most common methods of confirming a diagnosis (histology and cytology) account for the lowest proportion of false negatives (Figure A2). Where diagnosis comes from specific tumour markers, the Rapid Registrations are much more likely to “miss” the significant event or events. Patients diagnosed clinically (from imaging, consultation by a doctor but without a pathological sample being taken) are also more likely to be “missed” in the Rapid Registrations dataset.

Those patients for whom a diagnosis method cannot be determined (unknown) or died before they could be offered cancer treatment (death certificate), are most likely to be “missed” in the Rapid Registrations dataset. As Figure A3 indicates though, these account for a small proportion of those falsely omitted from the Rapid Registrations.

The marked reduction in the proportion of patients having their diagnosis confirmed from a pathological specimen (histology or cytology) explains the increase often observed at older ages in Figure A3, from the age of around 70, reflecting fewer patients having an invasive procedure performed on them as age increases. This is likely to be the reason behind the increasing false negative proportions by age observed overall and in most tumour groups (Figures 5 and 6).

Figure A2: The proportion of false negative Rapid Registrations by tumour group and basis of diagnosis, England, 2018

A grouped bar chart of the proportion of false negative rapid registrations by tumour group and basis of diagnosis. The proportion of error was highest for tumours from death certificate.

Figure A3: The proportion of false negative Rapid Registrations by method of diagnosis, England, 2018 (all tumour types combined)

A line chart of the proportion of false negative diagnoses by age at diagnosis, grouped by method of diagnosis. The method of diagnosis with the highest proportion of tumours is histology of primary up to age 88, and after is clinical investigation.

Appendix 7 - False positive and false negative proportion by month

Figure A4 shows the False Negative and False Positive error proportions by month for the broader matching criteria and matching periods of 90 and 30 days.

Figure A4: Monthly False Positive and False Negative proportions

A line chart of the proportion of error by month, grouped by false positive or false negative error. For both 30 and 90 day matching periods, the proportion of false negative errors is lower until July 2021, where false positive errors increase.

Appendix 8 - Sensitivity testing of matching criteria

In this section, the sensitivity of the Rapid Registrations dataset is illustrated for different matching criteria.

As expected, the stricter the criteria on the timing of events, the more errors (both false negative and false positive) are observed. Removing the match specification on tumour type (the final row of Table A7) reduces both error proportions, and demonstrates that approximately 40% of false positive tumours have a cancer diagnosis of some sort when the necessity of matching by tumour group is removed.

Table A7: Proportions of false positive and negative errors under alternative matching criteria

Tumour matching | Match within N days | False negative % | False positive %
Broader | 90 | 12.1% | 6.0%
Broader | 60 | 13.7% | 7.6%
Broader | 30 | 19.2% | 13.2%
Broader | 14 | 30.0% | 24.7%
Broader | 7 | 46.1% | 42.2%
Broader | 0 | 81.1% | 79.5%
Narrow | 90 | 20.1% | 13.9%
None | 90 | 10.6% | 4.6%
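
The sensitivity test summarised in Table A7 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the NCRAS matching code: the function name `error_proportions`, the record layout, and the toy records are all hypothetical. It assumes that false negatives are expressed as a proportion of Gold Standard tumours and false positives as a proportion of Rapid Registration tumours, and that a record counts as matched if any record for the same patient in the other dataset falls within the window (no one-to-one pairing is enforced).

```python
from datetime import date


def error_proportions(gold, rapid, window_days, match_group=True):
    """Return (false_negative, false_positive) proportions.

    gold and rapid are lists of (patient_id, tumour_group, diagnosis_date).
    Assumed denominators: Gold Standard tumours for false negatives,
    Rapid Registration tumours for false positives.
    """
    def is_match(a, b):
        same_patient = a[0] == b[0]
        # With match_group=False the tumour group is ignored ("None" matching).
        same_group = (a[1] == b[1]) or not match_group
        close_in_time = abs((a[2] - b[2]).days) <= window_days
        return same_patient and same_group and close_in_time

    # Gold Standard tumours with no rapid counterpart are false negatives.
    missed = sum(1 for g in gold if not any(is_match(g, r) for r in rapid))
    # Rapid tumours with no Gold Standard counterpart are false positives.
    spurious = sum(1 for r in rapid if not any(is_match(g, r) for g in gold))
    return missed / len(gold), spurious / len(rapid)


# Toy records, invented for illustration only.
gold = [
    ("p1", "breast", date(2018, 1, 10)),
    ("p2", "lung", date(2018, 2, 1)),
    ("p3", "prostate", date(2018, 3, 5)),
    ("p4", "colorectal", date(2018, 4, 1)),
]
rapid = [
    ("p1", "breast", date(2018, 1, 10)),  # same date and tumour group
    ("p2", "lung", date(2018, 3, 15)),    # diagnosis date 42 days later
    ("p3", "bladder", date(2018, 3, 5)),  # different tumour group
    ("p5", "lung", date(2018, 5, 1)),     # no Gold Standard counterpart
]

for window in (90, 30, 0):
    fn, fp = error_proportions(gold, rapid, window)
    print(f"window={window:>2} days  FN={fn:.0%}  FP={fp:.0%}")
```

In this toy data, tightening the window from 90 to 30 days turns the late-dated record into both a false negative and a false positive, and calling the function with `match_group=False` recovers the record with the differing tumour group, mirroring the direction of the effects in Table A7.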

Appendix 9 - Code changes to the RCRD build process

In this section, code changes introduced in each monthly snapshot are described.

Table A8: RCRD change log

AT_RAPID_PATHWAY: event list
snapshot change_id code_change
cas2603 None
cas2602 None
cas2601 None
cas2512 cas2512-1 Internal code changes to optimise build with regard to HES data
cas2511 cas2511-1 Update to HNA related events to include COSDv10 data and add de-duplication
cas2510 cas2510-1 Further updates to support assignment of C48 tumours to a cancer group depending on patient gender
cas2510 cas2510-2 Cleaning of odd values from stage field
cas2510 cas2510-3 Cleaning of dates of death prior to diagnosis
cas2509 cas2509-1 Further updates to support assignment of C48 tumours to a cancer group depending on patient gender
cas2509 cas2509-2 Update to include persons with death-only information in group of proxy tumours.
cas2508 cas2508-1 Further updates to support assignment of C48 tumours to a cancer group depending on patient gender
cas2508 cas2508-2 Minor changes to surgery lookup table to align with standard treatment reporting
cas2508 cas2508-3 Adding D48, D72, E85, M72 ICD-10 overall lookup table to align with current cancer registration practice
cas2507 cas2507-1 C53 and C57 staging values moved into STAGE field from EXPERIMENTAL_STAGE
cas2507 cas2507-2 C48 tumours now assigned to a cancer group depending on patient gender
cas2507 cas2507-3 Resective surgery lookup table better aligned with 2025 Cancer Flags output
cas2506 cas2506-1 Internal changes to deal with multiple NHSnumbers per personid
cas2506 cas2506-2 Internal changes to prepare for improvements to assigning C48 tumours
cas2506 cas2506-3 Further development of event 102 and 103 experimental events
cas2505 None
cas2504 cas2504-1 Further development of event 102 and 103 experimental events
cas2504 cas2504-2 Update to basis of diagnosis code for 2023 cases onward to make consistent with updated registration practice
cas2501 cas2501-1 Permanent fix to enact deduplication of experimental event 102
cas2412 cas2412-1 Include staging of C53 (cervical cancer) in experimental stage field
cas2412 cas2412-2 Correcting issue that prevented rapidly fatal cancers from being included from the HES data
cas2412 cas2412-3 Deduplication of experimental event 102 (hotfix)
cas2412 cas2412-4 Excluded lung screening Routes to Diagnosis prior to January 2019
cas2411 cas2411-1 Update to surgery code to use a combined table of all 3-digit ICD-10 codes, for all-stage and stage-specific procedures.
cas2411 cas2411-2 Filter OPCS4 procedure codes saved in initial HES tables, to include only those relevant to later lookups.
cas2411 cas2411-3 Added filtering to exclude Welsh only patients within the rapid_fatality section of event 101.
cas2411 cas2411-4 Two proposed new events, 102 and 103.
cas2410 cas2410-1 Refactored surgical lookup table code to be consistent with those used in treatment flag output
cas2410 cas2410-2 Added GP Practice code to tumour table
cas2409 cas2409-1 Added C33 to allowed list for lung screening
cas2409 cas2409-2 Updated NSPL postcode lookup to NSPL published May 2024
cas2409 cas2409-3 Internal refactoring of surgical lookup table to prepare for a simpler update process
cas2409 cas2409-4 Created internal experimental table showing patient GP practice at time of diagnosis
cas2408 cas2408-1 Changed criteria for including Event 54 in rapid pathway table such that there is a known nhsnumber instead of a known patient id (motivated by changes to COSD v10 data submissions)
cas2407 cas2407-1 Added STAGE_EXPERIMENTAL field
cas2407 cas2407-2 Added staging for C57 ovarian tumours (into STAGE_EXPERIMENTAL field)
cas2407 cas2407-3 Opened selection for screening cases to include C34 lung cancers
cas2406 None
cas2405 cas2405-1 Updated assignment of trusts (reversing effect of cas2305-2 change), reducing numbers of patients diagnosed at tertiary trusts and increasing numbers diagnosed in nearby trusts.
cas2405 cas2405-2 Refactored order of properties in event 5 for consistency throughout code while maintaining fix for ethnicity made in cas2404.
cas2404 cas2404-1 Fixed issue with ethnicity ‘top up’ from HES data which was incorrectly assigning ethnicity where it was present in HES but missing in COSD.
cas2404 cas2404-2 Update to allow creation of HES identified endocrine tumours based on event 11, restoring diagnoses previously identified from event 13.
cas2403 cas2403-1 Added place of death to event 19, property 3.
cas2403 cas2403-2 Merging event 13 into event 11 and event 16 into 14. This has the effect of no longer distinguishing surgery codes consistent with the CASSOP 4.5 from those specific to the RCRD build.
cas2403 cas2403-3 Add LSOA21 and age at diagnosis to AT_RAPID_TUMOUR table.
cas2402 None
cas2401 None
cas2312 cas2312-1 Update ICD-10 site lookup table to include more D-coded tumour groups.
cas2311 cas2311-1 Filter ethnicity to 1 digit only.
cas2311 cas2311-2 Updated postcode lookup table to nspl_202305.
cas2311 cas2311-3 Added filter to morphology codes to only allow those beginning with ‘8’ or ‘9’.
cas2311 cas2311-4 After review of fields removed ‘received_date’ from pathway table.
cas2311 cas2311-5 After review of fields removed event type 10 as an effective duplicate of event type 19.
cas2310 None
cas2309 cas2309-1 Allow HES and CWT records to create event-type 52 and 53 events even if there is no patientid. Screen these out so that they don’t go on to create event-type 101 events, but are now available for testing.
cas2308 None
cas2307 cas2307-1 Expose path and integrated TNM stage components in event 21.
cas2307 cas2307-2 Change offset for CWT diagnosis events to a fixed lookup table rather than re-calculating each time.
cas2307 cas2307-3 Update CWT surgery codes to reflect changes to CWT data dictionary.
cas2307 cas2307-4 Updated surgery lookup table to reflect changes implemented in cancer treatment flags output.
cas2306 cas2306-1 Move comparison of diagnosis date to date of death to earlier in the processing (and using vital status date for date of death if appropriate).
cas2305 cas2305-1 Remove duplicate patients with multiple patientid and same nhsnumbers.
cas2305 cas2305-2 Revert to prior order to prioritise creation of event 101s without prioritising those with a known trust.
cas2305 cas2305-3 Bring diagnosis trust through to AT_RAPID_TUMOUR table.
cas2305 cas2305-4 Added new basis of diagnosis codes to reflect changes to ENCR definitions for diagnoses from 2023 onwards.
cas2305 cas2305-5 Replace diagnosisdate with date of death for cases where date of death would otherwise have been within the 3 months before diagnosisdate.
cas2304 None
cas2303 None
cas2302 None
cas2301 None
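
Two of the rules in the change log above (cas2311-3 and cas2305-5) are simple enough to sketch in code. The following is an illustration only, not the RCRD build code: the function names are hypothetical, and "3 months" is approximated here as 91 days, which is an assumption rather than the definition used in the build.

```python
from datetime import date


def keep_morphology(code):
    """cas2311-3 (sketch): keep only morphology codes beginning with '8' or '9'."""
    return code is not None and code[:1] in ("8", "9")


def adjust_diagnosis_date(diagnosis_date, death_date, window_days=91):
    """cas2305-5 (sketch): use the date of death as the diagnosis date when
    death falls within the window before the recorded diagnosis date.

    window_days=91 is an assumed stand-in for "3 months".
    """
    if death_date is None:
        return diagnosis_date
    gap = (diagnosis_date - death_date).days
    # gap > 0 means the recorded diagnosis date is after the date of death.
    if 0 < gap <= window_days:
        return death_date
    return diagnosis_date


# Hypothetical values, for illustration only.
print(keep_morphology("8140"))
print(adjust_diagnosis_date(date(2023, 3, 1), date(2023, 2, 1)))
```

A diagnosis date that precedes or equals the date of death is left unchanged, as is one that falls more than the assumed window after death.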