The National Cancer Registration and Analysis Service (NCRAS) has developed an algorithmically generated Rapid Cancer Registration Dataset (RCRD) using the standard administrative datasets which flow rapidly into NHS England (NHSE) and are incorporated into the Cancer Analysis System (CAS) of NCRAS. The data takes the form of a series of significant events that occur to each patient as they proceed through the diagnostic and then therapeutic parts of the cancer pathway, and is available at approximately 4-5 months behind real time. The RCRD is shallower and narrower than the full NCRAS cancer registration dataset; it should be used and interpreted with reference to the caveats outlined within this document.


Main findings

This document outlines the main features of the data to be aware of when interpreting the Rapid Cancer Registration Dataset:

  • across all cancer types included, approximately 12.1% of cases are missing and 6.0% of cases are included erroneously or with an incorrect cancer type or diagnosis date (when compared to ‘Gold Standard’ registration data for 2018)

  • these figures vary strongly with cancer site. Broadly, more common cancers (particularly breast and prostate cancer) perform best and less common cancers (particularly bone and soft tissue and cancers of unknown primary) perform worst

  • non-melanoma skin cancer (ICD-10 C44) tumours are excluded from the majority of data shown (Figure 3 onwards). Carcinoma of the cervix in-situ (ICD-10 D06) is excluded from all data presented

  • there are more missing tumours in those aged over 70 compared to younger age groups

  • other factors that reduce data completeness include the patient’s route to diagnosis, mortality within 30 days of diagnosis, and the presence of multiple cancers

  • usable data is available approximately 4 to 5 months after diagnosis or other clinical activity occurs

  • data on cancer stage group at diagnosis is available for a number of common tumour types, although completeness is lower than that for the Gold Standard registration data. Where data is available it generally agrees with the Gold Standard stage group in 80 to 90% of tumours

The dataset includes Rapid Cancer Registrations from January 2018 to the most recently available data (at the date specified in the title to this document), plus additional event data for the same period.

Summary

A need to make rapidly available ‘proxy cancer registrations’ (and associated clinical activity) for the COVID-19 period has been identified to support the public health response by NHS England (NHSE) and other agencies, and service reorganisation by the NHS. These proxy registrations are called Rapid Registrations, in contrast to the more formal, detailed registration process that is used in non-clinical cancer research and the National Statistics.

The National Cancer Registration and Analysis Service (NCRAS) has developed a Rapid Cancer Registration Dataset (RCRD) using all standard administrative datasets which flow rapidly into NHSE and are incorporated into the Cancer Analysis System (CAS) of NCRAS.

This document describes the dataset structure, creation methodology, and data quality caveats (due to the rapid automated creation process without additional data curation) behind this dataset.

These data structures and methodologies are expected to evolve over the course of the public health response to COVID-19. The data is updated monthly and is referred to by the monthly CAS snapshot upon which it is based, e.g. CAS2009 refers to the CAS snapshot from September 2020. This document is considered a ‘living document’ and strictly applies only to the snapshot of CAS identified in the title.
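The snapshot naming convention can be decoded mechanically. The Python sketch below assumes, generalising from the single example given above (CAS2009 for September 2020), that a label is "CAS" followed by a two-digit year and a two-digit month. The function name `snapshot_month` is illustrative only and is not part of the CAS.

```python
from datetime import date


def snapshot_month(label: str) -> date:
    """Return the first day of the month encoded in a label such as 'CAS2009'.

    Assumption: labels follow the pattern CAS + YY + MM, with YY in the 2000s.
    """
    digits = label.strip().upper().removeprefix("CAS")
    if len(digits) != 4 or not digits.isdigit():
        raise ValueError(f"unexpected snapshot label: {label!r}")
    year = 2000 + int(digits[:2])
    month = int(digits[2:])
    return date(year, month, 1)  # raises ValueError for an invalid month
```

Under this assumption, the cas2510 snapshot referred to later in this document corresponds to October 2025.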

Methodology

Proxy registration events (Rapid Registrations)

Datasets available to NHSE were surveyed to establish how many months in arrears they arrive within NCRAS and are loaded in a usable format for analysis. From these datasets a selection of event types was defined, similar to those typically used for cancer pathway analysis by NCRAS.

The data takes the form of a series of significant events that occur to each patient as they proceed through the diagnostic and then therapeutic parts of the cancer pathway. These events include chemotherapy cycles, radiotherapy episodes and major cancer surgery as well as events based on the Cancer Waiting Times (CWT) and Cancer Outcomes and Services Dataset (COSD) datasets. These event types are numbered in the range 1-23 in the dataset.

Some events hypothesised to be indicative of a cancer diagnosis were defined including ‘Diagnosis reported in COSD’ (event 51) and ‘CWT estimated diagnosis date’ (event 52). These are numbered in the range 50-57 in the dataset - see Appendix 1 for a full list.

The indicative events for diagnosis were explored as candidate Rapid Registration events. A candidate rapid registration event was judged as matching a Gold Standard Registration event if it met the following two conditions:

  • the diagnosis dates of the two events differed by 90 days or less
  • both registrations fell into the same broad tumour group (as defined in Appendix 3)

Using these matching criteria False Positive errors and False Negative errors are defined as:

  • False Positive Error (FPE): A rapid registration event has been created which does not match against a Gold Standard Registration in the comparison period
  • False Negative Error (FNE): There exists a Gold Standard Registration event for which no rapid registration event can be matched
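The matching criteria and error definitions above can be illustrated with a minimal Python sketch. It is not the NCRAS implementation: the `Registration` fields and function names are hypothetical, and restricting matches to the same patient is an assumption that the text implies but does not state.

```python
from dataclasses import dataclass
from datetime import date

MATCH_WINDOW_DAYS = 90  # first matching condition


@dataclass(frozen=True)
class Registration:
    patient_id: str       # hypothetical field names
    tumour_group: str     # broad tumour group (see Appendix 3)
    diagnosis_date: date


def is_match(rapid: Registration, gold: Registration) -> bool:
    """Apply both matching conditions (same patient is an assumption)."""
    gap = abs((rapid.diagnosis_date - gold.diagnosis_date).days)
    return (rapid.patient_id == gold.patient_id
            and rapid.tumour_group == gold.tumour_group
            and gap <= MATCH_WINDOW_DAYS)


def classify_errors(rapid_regs, gold_regs):
    """Return (false positives, false negatives) as defined above."""
    false_positives = [r for r in rapid_regs
                       if not any(is_match(r, g) for g in gold_regs)]
    false_negatives = [g for g in gold_regs
                       if not any(is_match(r, g) for r in rapid_regs)]
    return false_positives, false_negatives


# Invented example records, for illustration only.
rapid = [Registration("p1", "Breast", date(2018, 5, 1)),
         Registration("p2", "Lung", date(2018, 6, 1)),
         Registration("p3", "Prostate", date(2018, 7, 1))]
gold = [Registration("p1", "Breast", date(2018, 6, 15)),      # 45 days apart: match
        Registration("p2", "Colorectal", date(2018, 6, 1)),   # tumour group differs
        Registration("p4", "Lung", date(2018, 5, 5))]         # no rapid registration
fpe, fne = classify_errors(rapid, gold)
```

In this toy example the rapid registrations for p2 and p3 are false positives, and the gold standard registrations for p2 and p4 are false negatives. A tumour recorded under the wrong group therefore contributes one error of each type.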

Additional filtering was applied to the candidate events, and event 101 was eventually defined to minimise both false positive and false negative errors; it is recommended for use by researchers as the best candidate for a rapid cancer registration. Appendix 4 briefly describes some of the alternatives considered in the development of this event definition.

Data structures

The rapid registration dataset consists of two tables:

AT_RAPID_PATHWAY: This is an event-based dataset with a number of types of event of interest defined based on the rapidly available datasets, see Appendix 1 for event definitions and properties. These are numbered in the range 1-23 for general purpose events, 50-57 for events that are candidates for combining into a rapid registration, and 101 for the final rapid registration event.

AT_RAPID_TUMOUR: This is a tumour level dataset that holds tumour and patient level data for each of the tumours defined by a rapid registration. The structure and contents of this table are presented in Appendix 3.

The rapid registration pathway and tumour tables can be linked together as shown in Figure 1, and also to other datasets that are timely enough via NHS number.

Figure 1: Linkage diagram for the Rapid Cancer Registration Dataset

Diagram showing how the rapid tumour and rapid pathway tables link together via avpid or tumour avpid, individual ID, patient ID and NHS number, and how patient ID and NHS number can link into the Cancer Analysis System and other CAS reference tables
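As an illustration of the linkage described in Figure 1, the Python sketch below joins toy versions of the two tables in an in-memory SQLite database. Only the table names come from this document. The column names (`TUMOUR_AVPID`, `AVPID`, `PATIENTID`, `TUMOUR_GROUP`, `EVENT_TYPE`, `EVENT_DATE`) and all rows are assumptions made for the example; the real column names should be taken from the appendices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified versions of the two RCRD tables.
    CREATE TABLE AT_RAPID_TUMOUR (
        TUMOUR_AVPID INTEGER PRIMARY KEY,
        PATIENTID    INTEGER,
        TUMOUR_GROUP TEXT
    );
    CREATE TABLE AT_RAPID_PATHWAY (
        AVPID      INTEGER,
        PATIENTID  INTEGER,
        EVENT_TYPE INTEGER,
        EVENT_DATE TEXT
    );
    INSERT INTO AT_RAPID_TUMOUR VALUES (1, 10, 'Breast'), (2, 11, 'Lung');
    INSERT INTO AT_RAPID_PATHWAY VALUES
        (1, 10, 101, '2020-03-02'),   -- rapid registration event
        (1, 10, 7,   '2020-04-10'),   -- a general purpose event (range 1-23)
        (2, 11, 101, '2020-05-20');
""")

# One row per pathway event, carrying the tumour-level attributes with it.
linked_events = conn.execute("""
    SELECT t.TUMOUR_AVPID, t.TUMOUR_GROUP, p.EVENT_TYPE, p.EVENT_DATE
    FROM AT_RAPID_TUMOUR AS t
    JOIN AT_RAPID_PATHWAY AS p ON p.AVPID = t.TUMOUR_AVPID
    ORDER BY t.TUMOUR_AVPID, p.EVENT_DATE
""").fetchall()
```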

Data Quality

How do the number of Rapid Registrations compare with Gold Standard Registrations?

To illustrate the strengths and weaknesses of the Rapid Registrations compared to the gold standard process, registrations for tumours diagnosed during 2018 are compared in Figure 2.

For most tumour groups the counts of Rapid Registrations are significantly lower than those of standard registrations. The COSD system does not attempt to record basal cell carcinoma non-melanoma skin cancers (but they are recorded by hospital pathology systems, and thereby registered), explaining the discrepancy there. There is only one group where this situation is reversed - bone and soft tissue - for which a precise morphology is required to properly record the diagnosis. These cancers are being preferentially coded to bone and soft tissue in COSD (as the COSD standard necessitates simpler site-based coding, and this is the best choice under the circumstances) and re-coded during the gold standard registration process where more sophisticated combination of site and morphological coding is possible.

Figure 2: The number of cancer registrations by registration and tumour type, England, 2018

Bar chart of tumour diagnoses in 2018 by registration source where for each tumour type, gold standard registration count is higher than rapid registration count, especially for non-melanoma skin cancer.

Figure 3 shows the age dependence of the ratio between Gold Standard and Rapid Registrations; non-melanoma skin cancer is excluded. The proportion of Rapid Registrations relative to Gold Standard Registrations is consistently high for both males and females until age 70, after which it declines. This is explored further in Figure 5 below.

Figure 3: The proportion of cancer registrations by gender, age and registration type, England, 2018 (all tumour types combined)

Line chart of the proportion of rapid compared to gold standard registrations by gender, where the proportion of diagnoses for both genders is consistently high until age 70, where it declines.

Comparing the matching quality of Rapid Registrations

The quality of the Rapid Registrations was judged by comparing them against the gold-standard cancer registrations in the period April 2018 to September 2018. This period was chosen as available gold standard registration data was only finalised to December 2018 and a matching period of 90 days was allowed (restricting comparison to the middle six months of the twelve-month period).

Figure 4 shows the proportions of false positive and false negative events, by broad cancer type (excluding non-melanoma skin cancer), measured in the cas2510 snapshot (the tumour groups are defined in Appendix 3). A more detailed tabulation is available by tumour group and tumour site in Appendix 5.

In most tumour groups, there are more tumours missed by the rapid registrations process (false negatives) than there are falsely identified as tumours (false positives).

For breast and prostate, very few incorrect proxy registrations are made. Breast, colorectal, lung, oesophagogastric (O-G) and prostate cancers are also least likely to be missing from the proxy dataset, whereas for cancers of unknown primary, and bone and soft tissue tumours more than 25% of cancers are missed. Bone and soft tissue tumours are not frequently diagnosed. These tumours often require multiple pathology reports to correctly diagnose a patient and the Rapid Registrations dataset has not attempted to reconcile differences in the reported diagnoses.

Figure 4: Types of error by tumour group

Bar chart of the proportion of false negative and false positive errors by tumour group, where bone and soft tissue tumours and cancers of unknown primary have a higher proportion of errors than other tumour groups.

The proportion of false positive errors is fairly stable across all ages (Figure 5); the proportion of false negative errors slowly declines until age 70 when it increases significantly. The age dependence was investigated and the age-dependence of the basis of diagnosis was found to be at least partially responsible for this - see Appendix 6 for details.

The proportion of false positive cases is less sensitive to the age of the patient.

Figure 5: False negative and false positive errors by age band at diagnosis

Line chart of the proportion of false negative and false positive errors by age band at diagnosis. The proportion of false positive errors is stable from age 20, and the proportion of false negative errors slowly declines to age 70 when it increases

The charts in Figure 6 (below) examine these patterns by tumour group. Please note that age groups for each tumour group must have a denominator of 25 patients or more or they are suppressed for reasons of statistical power.
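The suppression rule can be expressed as a small helper. This is a sketch only: the function name is hypothetical and `None` is used here to stand for a suppressed value.

```python
MIN_DENOMINATOR = 25  # threshold stated above for age groups within a tumour group


def error_proportion(errors: int, denominator: int,
                     min_denominator: int = MIN_DENOMINATOR):
    """Return errors / denominator, or None when the group is suppressed."""
    if denominator < min_denominator:
        return None  # too few patients to report a proportion
    return errors / denominator
```

For example, a group of 24 patients is suppressed, while a group with 5 errors among 25 patients is reported as 0.2.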

The patterns of false negative and false positive vary significantly by tumour group. Most groups have a higher proportion of false negatives than false positives at each age.

The proportion of false positives does not exhibit a trend by age for most tumour groups; it rises with increasing age in the bone and soft tissue, head and neck, and melanoma groups, and conversely falls with increasing age in the colorectal and unknown primary groups.

The proportion of false negatives rises with increasing age for all tumour groups except bone and soft tissue and endocrine. The most pronounced increases occur in the brain and central nervous system, colorectal, gynaecological, haematological, prostate, upper gastro-intestinal and unknown primary tumour groups.

The levels of both types of error are highest in tumour groups which are less likely to have solid-tissue pathology (haematological) or where survival rates are typically low. Conversely, the levels of error are lowest for tumour groups for which survival rates are typically higher.

Figure 6: False negative and false positive errors by age band at diagnosis and tumour group

A series of line graphs for each tumour type of the proportion of false negative and false positive errors by age band at diagnosis.

The variation of the false positive and false negative errors with income deprivation quintile is shown in Figure 7. While an overall trend is visible, this is likely to reflect confounding by the variation with tumour type shown above and by the known association of the incidence of many cancer types with income deprivation.

Figure 7: False negative and false positive errors by income deprivation quintile

Bar chart of the proportion of false negative and false positive errors by income deprivation quintile where the proportion slightly declines from income quintile 1 to income quintile 5.

Figure 8 shows the variation of false negative and false positive errors with route to diagnosis. For false positives there is moderate variation with the lowest error rate being those cases identified through cancer screening or a two week wait referral. (These tumours are those that are likely to be captured in both the COSD dataset and the screening/Cancer Waiting Times datasets so the lower error rate is understandable.)

Most routes to diagnosis have a substantially higher false negative rate than the overall average. ‘Two Week Wait’ (TWW) and screening routes have a substantially lower false negative rate (and make up between them 45% of the total cohort).

Figure 8: False negative and false positive errors by route to diagnosis

Bar chart of the proportion of  false negative and false positive errors by route to diagnosis where cases identified through cancer screening and two week wait referral routes have the lowest error rates compared to other diagnosis routes.

Figure 9 below shows the variation of false negative and false positive errors with whether or not the patient died within 30 days of diagnosis. The false negative error rate differs substantially between patients who died in the 30 days post-diagnosis and those who did not, meaning that patients who die within 30 days are more likely to be missing from the dataset.

Figure 9: False negative and false positive errors by 30-day mortality

Bar chart of the proportion of false negative and false positive errors by 30 day mortality where false negative error rates are higher in those that died in the 30 days post-diagnosis compared to those that did not die within 30 days.

Figure 10 below shows the variation of false negative and false positive errors with the multiple tumour status of the patient, i.e. whether or not the patient had been diagnosed with more than one type of tumour in the period January 2018 onward. The false positive error rate varies substantially between patients with multiple tumour types and those that don’t, meaning that these patients with multiple tumours are more likely to have incorrect tumour types or diagnosis dates recorded.

Figure 10: False negative and false positive errors by multiple tumour status

Bar chart of the proportion of false negative and false positive errors by multiple tumour status, where false positive error rates are higher in patients with multiple tumour types than those without.

Figure 10b below shows the variation of false negative and false positive errors with the stage at diagnosis.

Figure 10b: False negative and false positive errors by stage

Bar chart of the proportion of false negative and false positive errors by stage at diagnosis, where errors are higher in cases with no stage information compared to other stage categories.

Figure 11 below shows the variation of false negative and false positive errors with the cancer alliance of residence of the patient at the time of diagnosis. The false negative error rate varies more in absolute terms than the false positive rate and may be driven by trust level variation (see Figures 12 and 13 below).

Figure 11: False negative and false positive errors by Cancer Alliance

Line chart of the proportion of false negative and false positive errors by Cancer Alliance where false negative error rates vary more than false positive error rates.

Figures 12 and 13 below show the variation of false negative and false positive errors with the trust that diagnosed the tumour. Figure 12 shows the error proportion and figure 13 the numerator (count) of the errors. Trusts shown are limited to NHS secondary care trusts with a denominator of at least 50 patients over the assessment period. Both figures are ordered in descending order of the false negative statistic - but note that the order is not the same in each figure.

There is substantial variation in both false positive and false negative rates and counts. Some large trusts have several hundred or up to 1000 cases (over the six-month period under assessment).

Figure 12: False negative and false positive errors (proportion) by hospital trust

Line chart of the proportion of false negative and false positive errors by unidentifiable hospital trusts where there is substantial variation in both false negative and false positive rates between trusts.

Figure 13: False negative and false positive errors (count) by hospital trust

Line chart of the counts of false negative and false positive errors by unidentified hospital trusts, where there is substantial variation in both error counts between hospital trusts.

Counts of events over time

This section examines the population of events by chronological time and when they appear in successive analytical snapshots in the CAS. Figure 14 shows that most data items in the Rapid Registrations dataset are stable with respect to the snapshot month.

Specific comments about the events shown below are:

  • cancer waiting times data (events 1–4) are received based on the treatment start date; this explains why for event 2 all lines lie exactly on top of each other. Other CWT events accumulate over successive snapshots where these events occur before the first treatment start event

  • an issue with HES data that caused lower than expected completeness from 2020-04-01 was resolved in cas2102, leading to increased event counts in events 5, 6, 11, 12, 13 and 23

  • the definition of event 17 only includes tumour diagnoses prior to 2018, so lack of data in the chart below is expected

  • definitions of staging events may change between snapshots, which might explain higher or lower counts in one snapshot compared to others

  • the vital status shown in event 19 is typically only assessed each January or at the completion of registering each diagnosis year, explaining the large peaks in the graph

  • the raw data used to populate events 21, 54 and 56 is subject to ongoing deduplication, which explains lower counts in earlier time periods for later snapshots

  • between snapshots, counts for events 101–103 (inferred diagnoses) generally increase, particularly for recent months, as additional COSD data is submitted. For some earlier months, there is a small decrease in these counts because the algorithm excludes potential diagnoses where the patient already has a confirmed diagnosis in the same tumour group more than 90 days before. These exclusions can change between snapshots as gold standard registration data is processed, leading to more confirmed previous diagnoses. The effect has been measured as less than 1% of all cases in any given month
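The exclusion described in the final bullet above can be sketched in Python as follows. The function and argument names are hypothetical, and the rule is simplified to the single condition stated in the text.

```python
from datetime import date, timedelta

EXCLUSION_WINDOW = timedelta(days=90)


def keep_candidate(candidate_date, tumour_group, confirmed_diagnoses):
    """Return False if the patient already has a confirmed diagnosis in the
    same tumour group more than 90 days before the candidate date.

    confirmed_diagnoses: iterable of (tumour_group, diagnosis_date) pairs
    for the same patient.
    """
    cutoff = candidate_date - EXCLUSION_WINDOW
    return not any(
        group == tumour_group and diagnosed < cutoff
        for group, diagnosed in confirmed_diagnoses
    )


# Invented history: one confirmed lung diagnosis in January 2019.
history = [("Lung", date(2019, 1, 10))]
lung_kept = keep_candidate(date(2020, 6, 1), "Lung", history)      # excluded
breast_kept = keep_candidate(date(2020, 6, 1), "Breast", history)  # kept
```

Because `confirmed_diagnoses` grows as gold standard registration data is processed, a candidate that is kept in one snapshot may be excluded in a later one, which is the source of the small decreases noted above.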

Figure 14: Population of data items to CAS snapshot

These are line graphs showing the count over time, comparing the completeness with the three previous snapshots. The count tends to increase in the more recent snapshots.

For eighteen data items, there are line graphs showing the count over time, comparing the completeness with the three previous snapshots. The count tends to increase in the more recent snapshots.

Estimated completeness of Rapid Registrations and secondary datasets

Detailed linked rapid cancer registration, CWT, SACT and RTDS data is available at approximately a four-month lag from real time. Linked HES and raw COSD data is available at approximately 4-5 months behind real time.

Table 2 below shows data usability and completeness for Rapid Registrations and the constituent datasets. The “latest usable” column shows the ‘hard limit’ on data that is considered fit for analytical purposes (90% completeness). Even in months prior to this limit, data is not necessarily complete; the completeness for each month is displayed in the table. This should be taken into account in any use of the rapid registration data and the secondary datasets.

For the Rapid Tumour data, completeness is expressed as the proportion of CCGs of residence which show a cancer incidence within the normally expected range (see Table 3 below). For other datasets except CWT, completeness is computed as the number of data providers who have supplied data as a percentage of the number who are expected to do so.

Data completeness within the Cancer Waiting Times dataset varies at patient level with event type. Figures for the Treatment Start Date and Treatment Period Start Date are given below. Completeness of other CWT events can be estimated by inspecting Figure 14 (events 1-4).

Table 2: Rapid registration and dataset usability/completeness in cas2510

Rapid registration and dataset usability/completeness
Data source | Latest usable | August 2024 | September 2024 | October 2024 | November 2024 | December 2024 | January 2025 | February 2025 | March 2025 | April 2025 | May 2025 | June 2025 | July 2025
Rapid Tumours (COSD) | July 2025 | 99% | 96% | 99% | 97% | 96% | 97% | 98% | 97% | 97% | 95% | 98% | 96%
HES | January 2025 | Complete | Complete | Complete | Complete | Complete | 95% | | | | | |
SACT | February 2025 | 96% | 97% | 96% | 96% | 95% | 93% | 93% | | | | |
RTDS | March 2025 | Complete | Complete | Complete | Complete | 98% | Complete | 96% | 96% | | | |
CWT (TSD) | July 2025 | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete
CWT (TPSD) | May 2025 | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | |
Note:
COSD = Cancer Outcomes and Services Dataset
TSD = Treatment Start Date
TPSD = Treatment Period Start Date
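The provider-based completeness measure and the 90% usability limit can be sketched as below. This is an illustration under assumptions: the function names are hypothetical, and taking the "latest usable" month to be the most recent month at or above the threshold is an assumed reading of the rule, not a documented algorithm. The first three values are the SACT figures from Table 2; the final value is invented to show a month falling below the limit.

```python
USABLE_THRESHOLD = 0.90  # the 'hard limit' described above


def provider_completeness(supplied: int, expected: int) -> float:
    """Providers who supplied data as a fraction of those expected to."""
    return supplied / expected


def latest_usable(completeness_by_month):
    """Return the most recent 'YYYY-MM' month at or above the threshold.

    Assumption: months are 'YYYY-MM' strings, so the string maximum is also
    the latest month. Returns None if no month qualifies.
    """
    usable = [month for month, value in completeness_by_month.items()
              if value >= USABLE_THRESHOLD]
    return max(usable) if usable else None


sact = {
    "2024-12": 0.95,
    "2025-01": 0.93,
    "2025-02": 0.93,
    "2025-03": 0.80,  # invented value, for illustration only
}
sact_latest = latest_usable(sact)
```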

Table 3: Number of outlier CCGs in COSD dataset in cas2510

The table below shows the number of CCGs (using the April 2020 boundaries) which have 3-sigma outlier counts per month (either high or low) compared to the expectation of the fraction of the total number of new cancer registrations in England. This can be used to judge to what extent there is large scale missing data in COSD (and therefore in the Rapid Registrations) in any particular month.

Number of outlier CCGs in COSD dataset
Year and month Outlier: High Outlier: Low In expected range Total received Prop.
2024-01 1 2 132 135 0.9777778
2024-02 0 0 135 135 1.0000000
2024-03 0 1 134 135 0.9925926
2024-04 0 0 135 135 1.0000000
2024-05 0 0 135 135 1.0000000
2024-06 0 1 134 135 0.9925926
2024-07 1 2 132 135 0.9777778
2024-08 2 0 133 135 0.9851852
2024-09 2 3 130 135 0.9629630
2024-10 1 1 133 135 0.9851852
2024-11 1 3 131 135 0.9703704
2024-12 1 5 129 135 0.9555556
2025-01 1 3 131 135 0.9703704
2025-02 1 2 132 135 0.9777778
2025-03 1 3 131 135 0.9703704
2025-04 2 2 131 135 0.9703704
2025-05 2 5 128 135 0.9481481
2025-06 0 3 132 135 0.9777778
2025-07 1 4 130 135 0.9629630
2025-08 14 17 104 135 0.7703704
2025-09 33 57 45 135 0.3333333
2025-10 17 0 NA 17 NA
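A minimal Python sketch of a 3-sigma outlier check of the kind summarised in Table 3 is given below. The variance model is not stated in this document, so the sketch assumes Poisson variation around the expected count; the function names and the example inputs are hypothetical. The second function reproduces the "Prop." column from the other columns.

```python
import math


def classify_ccg(observed: int, expected_share: float, national_total: int,
                 n_sigma: float = 3.0) -> str:
    """Classify one CCG-month as 'high', 'low' or 'expected'.

    expected_share: the CCG's expected fraction of new registrations in England.
    Assumption: the count is Poisson distributed, so sigma = sqrt(expected).
    """
    expected = expected_share * national_total
    sigma = math.sqrt(expected)
    if observed > expected + n_sigma * sigma:
        return "high"
    if observed < expected - n_sigma * sigma:
        return "low"
    return "expected"


def proportion_in_range(high: int, low: int, in_range: int) -> float:
    """Share of CCGs received that fall within the expected range."""
    return in_range / (high + low + in_range)
```

For the 2024-01 row of Table 3, `proportion_in_range(1, 2, 132)` gives 132/135, which rounds to the tabulated 0.9777778.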

Staging data in the Rapid Registrations dataset

TNM stage group 1-4

The size and extent of a cancer is commonly described using the ‘TNM’ system for “Tumour”, “Node”, and “Metastases”. This is often abbreviated to a number from 1 (typically a localised tumour with limited spread) to 4 (typically a tumour that has invaded or spread to distant organs). The stage at diagnosis is very strongly associated with patient outcomes.

In the current version of the Rapid Registrations dataset partial staging data is provided for a number of different cancer sites (ICD-10 codes can be found in the labels for Tables 5a-n). This has been benchmarked against the gold standard cancer registry data for cas2510. Table 4 shows the count and proportion of cases by TNM stage group for both the Rapid Registrations and the Gold Standard Registrations, for calendar year 2018. For example, 32% of breast cancers are TNM stage group 1 in the Rapid Registrations, compared with 37% in the Gold Standard Registrations. Compared to the Gold Standard Registrations in 2018, the Rapid Registrations under-report breast cancers diagnosed at stages 1 or 2, colorectal cancers diagnosed at stage 4, and prostate cancers diagnosed at stages 1 and 4. In all three tumour groups, more tumours are allocated to the unknown or unstageable category. Lung cancers in the RCRD most closely match the Gold Standard Registrations and exhibit a broadly similar stage profile in both datasets.

Table 4: Summary proportions of stage at diagnosis for the Rapid Registrations and Gold Standard Registrations

Summary proportions of stage at diagnosis for the Rapid Registrations and Gold Standard Registrations
Broad Cancer Group Stage Group Count (Rapid) Percentage (Rapid) Count (Gold Standard) Percentage (Gold Standard)
Bladder 1 2327 24.2% 2861 29.7%
Bladder 2 1799 18.7% 1876 19.5%
Bladder 3 566 5.9% 882 9.2%
Bladder 4 265 2.8% 653 6.8%
Bladder U 4672 48.5% 3357 34.9%
Breast 1 14301 32.2% 16567 37.3%
Breast 2 13324 30.0% 16776 37.8%
Breast 3 3276 7.4% 3709 8.4%
Breast 4 1274 2.9% 1977 4.5%
Breast U 12234 27.5% 5380 12.1%
Cervical 1 1213 46.5% 834 32.0%
Cervical 2 430 16.5% 399 15.3%
Cervical 3 173 6.6% 191 7.3%
Cervical 4 262 10.0% 235 9.0%
Cervical U 532 20.4% 951 36.4%
Colorectum 1 4907 14.9% 5482 16.7%
Colorectum 2 7047 21.5% 7715 23.5%
Colorectum 3 8297 25.3% 9309 28.3%
Colorectum 4 5177 15.8% 7456 22.7%
Colorectum U 7414 22.6% 2880 8.8%
Kidney 1 2384 29.0% 3324 40.4%
Kidney 2 448 5.4% 554 6.7%
Kidney 3 1352 16.4% 1644 20.0%
Kidney 4 704 8.6% 1577 19.2%
Kidney U 3336 40.6% 1125 13.7%
Lung 1 6191 17.2% 6608 18.3%
Lung 2 2580 7.1% 2683 7.4%
Lung 3 7319 20.3% 7605 21.1%
Lung 4 14982 41.5% 17165 47.6%
Lung U 5024 13.9% 2035 5.6%
Lymphoma 1 848 7.4% 1619 14.1%
Lymphoma 2 940 8.2% 1580 13.7%
Lymphoma 3 1192 10.3% 1955 17.0%
Lymphoma 4 2524 21.9% 4719 41.0%
Lymphoma U 6015 52.2% 1646 14.3%
Melanoma 1 6346 48.2% 8225 62.5%
Melanoma 2 2382 18.1% 2641 20.1%
Melanoma 3 472 3.6% 1037 7.9%
Melanoma 4 219 1.7% 350 2.7%
Melanoma U 3747 28.5% 913 6.9%
Oesophagus 1 298 3.6% 445 5.4%
Oesophagus 2 1489 18.1% 956 11.6%
Oesophagus 3 1763 21.4% 2119 25.7%
Oesophagus 4 2508 30.4% 3193 38.7%
Oesophagus U 2190 26.6% 1535 18.6%
Ovary 1 1188 23.5% 1416 28.1%
Ovary 2 263 5.2% 282 5.6%
Ovary 3 1330 26.4% 1619 32.1%
Ovary 4 791 15.7% 1059 21.0%
Ovary U 1474 29.2% 670 13.3%
Pancreas 1 358 4.5% 661 8.2%
Pancreas 2 629 7.8% 804 10.0%
Pancreas 3 752 9.4% 1039 13.0%
Pancreas 4 2057 25.7% 4100 51.2%
Pancreas U 4217 52.6% 1409 17.6%
Prostate 1 11691 25.3% 16194 35.0%
Prostate 2 5575 12.0% 6514 14.1%
Prostate 3 10436 22.5% 11614 25.1%
Prostate 4 5698 12.3% 8082 17.5%
Prostate U 12892 27.8% 3888 8.4%
Stomach 1 327 8.3% 340 8.6%
Stomach 2 387 9.8% 466 11.8%
Stomach 3 646 16.4% 712 18.0%
Stomach 4 1147 29.1% 1665 42.2%
Stomach U 1439 36.5% 763 19.3%
Uterus 1 4656 58.7% 5335 67.3%
Uterus 2 513 6.5% 537 6.8%
Uterus 3 735 9.3% 818 10.3%
Uterus 4 518 6.5% 551 6.9%
Uterus U 1509 19.0% 690 8.7%

In Tables 5a-n below, the distribution of the stage allocations between the Rapid Registrations and the Gold Standard Registrations is examined.

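The figures indicate the proportion of agreement at the 1-digit TNM stage group level, where the stage is known in the Rapid Registrations dataset. In each table, columns give the stage group in the Rapid Registrations dataset (RCRD), rows give the stage group in the gold standard data (NCRD), and each column sums to approximately 100%. Stages 1-4 in the Rapid Registrations dataset agree with the gold standard stage variable for a high proportion of tumours.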

For example, when examining the subset of Rapid Registrations breast tumours that are identified as TNM stage 1 (32%), approximately 89% of these are found to be TNM stage group 1 in the gold standard registration data, with another 11% distributed across TNM stages 2-4 and the unknown or unstageable groups.

For many but not all (e.g., late stage breast cancer), roughly 85% or more of staged cases in the Rapid Registrations table have the same stage grouping as the equivalent tumour in the standard registration data - this can be seen in the table below by inspecting the figures where the stage metrics for the Rapid Registrations and Gold Standard Registrations are the same.

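Where the stage is labelled as unknown or unstageable in the rapid pathway dataset, a stage is often recorded in the gold standard data: across Tables 5a-n, the proportion of such cases with a known gold standard stage ranges from approximately 37% (bladder) to approximately 84% (melanoma).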
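The column percentages used in these stage comparison tables can be derived from tumour-level pairs of stage values. The Python sketch below shows one way to do this; the function name and the toy input are invented for illustration and do not reproduce any published figure.

```python
from collections import Counter, defaultdict


def gold_distribution_by_rapid_stage(pairs):
    """For each rapid stage, return the percentage split of gold standard stages.

    pairs: iterable of (rapid_stage, gold_stage) tuples, one per matched tumour.
    """
    counts = defaultdict(Counter)
    for rapid_stage, gold_stage in pairs:
        counts[rapid_stage][gold_stage] += 1
    return {
        rapid: {gold: 100 * n / sum(by_gold.values())
                for gold, n in by_gold.items()}
        for rapid, by_gold in counts.items()
    }


# Toy input: ten tumours staged 1 in the rapid data, four with unknown stage.
toy_pairs = ([("1", "1")] * 8 + [("1", "2")] + [("1", "U")]
             + [("U", "1")] * 3 + [("U", "U")])
toy_distribution = gold_distribution_by_rapid_stage(toy_pairs)
```

In the toy data, 80% of tumours staged 1 in the rapid data are also stage 1 in the gold standard data, and 25% of those with an unknown rapid stage remain unknown in the gold standard data.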

Tables 5a-n: Stage comparison between Rapid Registrations and Gold Standard Registrations by cancer site

Stage comparison between RCRD and NCRD
a. bladder (ICD-10 C67)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 84.4% 4.1% 7.8% 5.7% 16.3%
2 3.8% 71.5% 15.4% 6.4% 8.5%
3 2.6% 10.8% 64.0% 4.9% 5.4%
4 1.2% 4.9% 5.3% 77.0% 6.4%
U 7.9% 8.6% 7.6% 6.0% 63.4%
Stage comparison between RCRD and NCRD
b. breast (ICD-10 C50)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 88.6% 5.1% 1.6% 4.2% 25.5%
2 6.6% 88.1% 11.5% 16.1% 28.7%
3 0.6% 2.6% 79.1% 5.7% 5.0%
4 0.2% 0.9% 2.9% 68.2% 7.1%
U 4.0% 3.4% 4.9% 5.9% 33.7%

 

Stage comparison between RCRD and NCRD
c. colorectum (ICD-10 C18-C20)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 84.9% 2.2% 1.9% 0.8% 13.1%
2 5.7% 85.4% 5.5% 1.4% 12.0%
3 6.5% 7.4% 84.8% 4.5% 16.1%
4 0.9% 2.9% 5.7% 92.0% 26.7%
U 2.0% 2.3% 2.1% 1.4% 32.1%
Stage comparison between RCRD and NCRD
d. kidney (ICD-10 C64)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 90.9% 6.7% 3.3% 2.1% 32.0%
2 0.5% 77.0% 1.0% 0.9% 5.3%
3 1.8% 6.7% 85.9% 4.3% 11.4%
4 0.5% 3.3% 5.8% 90.6% 25.0%
U 6.3% 6.2% 4.0% 2.1% 26.3%

 

Stage comparison between RCRD and NCRD
e. lung (ICD-10 C33-C34)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 92.7% 6.9% 1.4% 0.5% 10.1%
2 2.8% 83.4% 1.8% 0.4% 3.1%
3 1.9% 5.1% 90.0% 1.3% 11.4%
4 1.3% 3.1% 5.5% 97.1% 40.8%
U 1.2% 1.5% 1.3% 0.6% 34.6%
Stage comparison between RCRD and NCRD
f. melanoma (ICD-10 C43)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 93.7% 2.7% 7.8% 13.2% 57.4%
2 2.3% 78.2% 10.4% 18.7% 14.5%
3 2.0% 11.6% 74.2% 15.1% 6.7%
4 0.2% 1.6% 2.5% 41.6% 5.3%
U 1.9% 6.0% 5.1% 11.4% 16.1%
Stage comparison between RCRD and NCRD
g. oesophagus (ICD-10 C15)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 79.9% 5.0% 0.5% 0.2% 5.4%
2 7.4% 49.8% 3.4% 1.0% 4.9%
3 2.3% 34.7% 68.5% 6.2% 10.6%
4 1.0% 5.2% 21.7% 83.1% 29.5%
U 9.4% 5.4% 5.9% 9.5% 49.5%
Stage comparison between RCRD and NCRD
h. ovary (ICD-10 C56)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 96.1% 7.2% 0.9% 0.3% 16.4%
2 0.5% 88.6% 0.5% 0.1% 2.4%
3 0.9% 1.9% 90.9% 10.6% 21.0%
4 0.5% 0.4% 4.4% 83.9% 22.3%
U 1.9% 1.9% 3.2% 5.1% 37.9%

 

Stage comparison between RCRD and NCRD
i. prostate (ICD-10 C61)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 85.9% 10.4% 4.6% 1.6% 38.8%
2 6.7% 81.7% 2.7% 0.8% 6.6%
3 4.3% 4.3% 85.8% 2.8% 13.7%
4 0.8% 0.8% 3.9% 92.3% 17.6%
U 2.3% 2.8% 3.0% 2.5% 23.3%
Stage comparison between RCRD and NCRD
j. stomach (ICD-10 C16)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 67.3% 4.7% 0.8% 0.1% 6.7%
2 18.7% 64.6% 9.9% 0.8% 5.7%
3 5.8% 19.4% 68.6% 3.2% 9.6%
4 2.1% 6.7% 16.7% 93.4% 31.5%
U 6.1% 4.7% 4.0% 2.5% 46.6%
Stage comparison between RCRD and NCRD
k. uterus (ICD-10 C54-C55)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 96.8% 10.5% 5.7% 7.1% 46.2%
2 0.7% 83.0% 1.2% 2.5% 3.8%
3 0.5% 2.1% 86.8% 7.3% 7.1%
4 0.2% 1.8% 2.4% 75.1% 8.3%
U 1.9% 2.5% 3.8% 7.9% 34.5%
Stage comparison between RCRD and NCRD
l. pancreas (ICD-10 C25)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 72.6% 3.5% 0.9% 0.4% 8.6%
2 15.4% 74.1% 2.4% 0.5% 6.0%
3 4.7% 12.2% 88.2% 0.6% 6.4%
4 3.4% 5.6% 6.0% 97.1% 47.7%
U 3.9% 4.6% 2.5% 1.3% 31.3%
Stage comparison between RCRD and NCRD
m. lymphoma (ICD-10 C81-C86)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 89.2% 1.4% 0.6% 0.6% 13.8%
2 0.9% 92.4% 1.3% 0.6% 11.2%
3 0.7% 1.3% 88.8% 1.6% 13.9%
4 6.7% 2.7% 7.2% 94.7% 35.9%
U 2.5% 2.2% 2.1% 2.6% 25.2%
Stage comparison between RCRD and NCRD
n. cervical (ICD-10 C53)
Stage Group
RCRD
NCRD 1 2 3 4 Unknown
1 58.4% 3.5% 3.5% 3.8% 17.9%
2 1.2% 75.1% 2.3% 5.0% 8.3%
3 1.1% 3.3% 73.4% 1.9% 6.0%
4 0.4% 1.4% 2.9% 67.6% 7.9%
U 38.9% 16.7% 17.9% 21.8% 60.0%

“Early” vs “Late” stage

Table 6 below repeats the above tabulations, now grouping Rapid and Gold Standard cancers into “Early” (TNM stage group 1 and 2) or “Late” (TNM stage group 3 and 4) categories. For example, 62% of breast cancers are identified as “Early” stage in the Rapid Registrations dataset compared to 75% in the Gold Standard Registration data, due to the higher proportion of “Unknown” stage tumours (28% vs 12% respectively).

As with the more detailed stage data, there is a high degree of concordance between the gold standard and rapid registration stage fields if a known stage can be identified.
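The grouping described above can be sketched in a few lines of Python. This is an illustration only, not the NCRAS build code: the function names are hypothetical, and it assumes the stage value is held as text whose first character is the stage group digit (for example "2" or "3B"), with anything else treated as “Unknown”.

```python
from collections import Counter

CATEGORIES = ("Early", "Late", "Unknown")

def stage_category(stage):
    """Map a stage value to 'Early' (groups 1-2), 'Late' (3-4) or 'Unknown'."""
    # Assumption: the first character of the stage text is the stage group.
    first = str(stage).strip()[:1] if stage is not None else ""
    if first in ("1", "2"):
        return "Early"
    if first in ("3", "4"):
        return "Late"
    return "Unknown"

def percentages_from_counts(counts):
    """Convert {category: count} into {category: percentage to 1 d.p.}."""
    total = sum(counts.get(cat, 0) for cat in CATEGORIES)
    if total == 0:
        return {cat: 0.0 for cat in CATEGORIES}
    return {cat: round(100 * counts.get(cat, 0) / total, 1) for cat in CATEGORIES}

def category_percentages(stages):
    """Group a list of stage values and return the percentage in each category."""
    return percentages_from_counts(Counter(stage_category(s) for s in stages))

# Breast (Rapid) counts as published in Table 6.
breast_rapid = percentages_from_counts({"Early": 27625, "Late": 4550, "Unknown": 12234})
```

Applied to the published breast counts, this gives the 62.2%, 10.2% and 27.5% shown in the Rapid columns of Table 6.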

Table 6: Summary proportions of “Early” vs “Late” stage for Rapid Registrations and Gold Standard Registrations

Summary proportions of Early vs Late stage for Rapid Registrations and Gold Standard Registrations
Broad Cancer Group Stage Group Count (Rapid) Percentage (Rapid) Count (Gold Standard) Percentage (Gold Standard)
Bladder Early 4126 42.8% 4737 49.2%
Bladder Late 831 8.6% 1535 15.9%
Bladder Unknown 4672 48.5% 3357 34.9%
Breast Early 27625 62.2% 33343 75.1%
Breast Late 4550 10.2% 5686 12.8%
Breast Unknown 12234 27.5% 5380 12.1%
Cervical Early 1643 63.0% 1233 47.2%
Cervical Late 435 16.7% 426 16.3%
Cervical Unknown 532 20.4% 951 36.4%
Colorectum Early 11954 36.4% 13197 40.2%
Colorectum Late 13474 41.0% 16765 51.0%
Colorectum Unknown 7414 22.6% 2880 8.8%
Kidney Early 2832 34.4% 3878 47.2%
Kidney Late 2056 25.0% 3221 39.2%
Kidney Unknown 3336 40.6% 1125 13.7%
Lung Early 8771 24.3% 9291 25.7%
Lung Late 22301 61.8% 24770 68.6%
Lung Unknown 5024 13.9% 2035 5.6%
Lymphoma Early 1788 15.5% 3199 27.8%
Lymphoma Late 3716 32.3% 6674 57.9%
Lymphoma Unknown 6015 52.2% 1646 14.3%
Melanoma Early 8728 66.3% 10866 82.5%
Melanoma Late 691 5.2% 1387 10.5%
Melanoma Unknown 3747 28.5% 913 6.9%
Oesophagus Early 1787 21.7% 1401 17.0%
Oesophagus Late 4271 51.8% 5312 64.4%
Oesophagus Unknown 2190 26.6% 1535 18.6%
Ovary Early 1451 28.8% 1698 33.7%
Ovary Late 2121 42.0% 2678 53.1%
Ovary Unknown 1474 29.2% 670 13.3%
Pancreas Early 987 12.3% 1465 18.3%
Pancreas Late 2809 35.1% 5139 64.1%
Pancreas Unknown 4217 52.6% 1409 17.6%
Prostate Early 17266 37.3% 22708 49.1%
Prostate Late 16134 34.9% 19696 42.5%
Prostate Unknown 12892 27.8% 3888 8.4%
Stomach Early 714 18.1% 806 20.4%
Stomach Late 1793 45.4% 2377 60.2%
Stomach Unknown 1439 36.5% 763 19.3%
Uterus Early 5169 65.2% 5872 74.0%
Uterus Late 1253 15.8% 1369 17.3%
Uterus Unknown 1509 19.0% 690 8.7%

In Tables 7a-n below, the distribution of the stage allocation between the Rapid Registrations and the Gold Standard Registrations is examined, aggregated into Early and Late stage.
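The percentages in each column of these tables sum to approximately 100%, which suggests they are column percentages. The sketch below shows how such a cross-tabulation could be produced. It assumes that the columns are indexed by the RCRD category and the rows by the NCRD category, as the table headers indicate; the function name and input layout are hypothetical.

```python
from collections import Counter

CATEGORIES = ("Early", "Late", "Unknown")

def column_percentages(pairs):
    """Cross-tabulate linked tumours as column percentages.

    pairs: iterable of (rcrd_category, ncrd_category), one per linked tumour.
    Returns {rcrd_category: {ncrd_category: percentage to 1 d.p.}}, so the
    values for each RCRD category sum to roughly 100.
    """
    counts = Counter(pairs)
    table = {}
    for rcrd in CATEGORIES:
        column_total = sum(counts[(rcrd, ncrd)] for ncrd in CATEGORIES)
        table[rcrd] = {
            ncrd: round(100 * counts[(rcrd, ncrd)] / column_total, 1) if column_total else 0.0
            for ncrd in CATEGORIES
        }
    return table
```

Read this way, the diagonal entries give the share of tumours with a given RCRD category that have the same category in the NCRD.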

Tables 7a-n: “Early” vs “Late” stage comparison between Rapid Registrations and Gold Standard Registrations

Early vs late stage comparison between RCRD and NCRD
a. bladder (ICD-10 C67)
Stage Category
RCRD
NCRD Early Late Unknown
Early 82.8% 19.6% 24.8%
Late 9.0% 73.3% 11.8%
Unknown 8.2% 7.1% 63.4%
Early vs late stage comparison between RCRD and NCRD
b. breast (ICD-10 C50)
Stage Category
RCRD
NCRD Early Late Unknown
Early 94.2% 15.1% 54.2%
Late 2.1% 79.7% 12.1%
Unknown 3.7% 5.2% 33.7%
Early vs late stage comparison between RCRD and NCRD
c. colorectum (ICD-10 C18-C20)
Stage Category
RCRD
NCRD Early Late Unknown
Early 88.8% 5.4% 25.0%
Late 9.1% 92.8% 42.9%
Unknown 2.1% 1.8% 32.1%
Early vs late stage comparison between RCRD and NCRD
d. kidney (ICD-10 C64)
Stage Category
RCRD
NCRD Early Late Unknown
Early 90.2% 3.8% 37.3%
Late 3.5% 92.8% 36.4%
Unknown 6.3% 3.4% 26.3%
Early vs late stage comparison between RCRD and NCRD
e. lung (ICD-10 C33-C34)
Stage Category
RCRD
NCRD Early Late Unknown
Early 94.0% 1.7% 13.2%
Late 4.7% 97.5% 52.1%
Unknown 1.3% 0.8% 34.6%
Early vs late stage comparison between RCRD and NCRD
f. melanoma (ICD-10 C43)
Stage Category
RCRD
NCRD Early Late Unknown
Early 91.9% 22.6% 71.9%
Late 5.2% 70.3% 12.0%
Unknown 3.0% 7.1% 16.1%
Early vs late stage comparison between RCRD and NCRD
g. oesophagus (ICD-10 C15)
Stage Category
RCRD
NCRD Early Late Unknown
Early 60.2% 2.3% 10.4%
Late 33.7% 89.7% 40.1%
Unknown 6.1% 8.0% 49.5%
Early vs late stage comparison between RCRD and NCRD
h. ovary (ICD-10 C56-C57)
Stage Category
RCRD
NCRD Early Late Unknown
Early 96.5% 1.0% 18.7%
Late 1.6% 95.0% 43.4%
Unknown 1.9% 3.9% 37.9%
Early vs late stage comparison between RCRD and NCRD
i. prostate (ICD-10 C61)
Stage Category
RCRD
NCRD Early Late Unknown
Early 92.4% 5.5% 45.4%
Late 5.1% 91.6% 31.3%
Unknown 2.5% 2.8% 23.3%
Early vs late stage comparison between RCRD and NCRD
j. stomach (ICD-10 C16)
Stage Category
RCRD
NCRD Early Late Unknown
Early 76.9% 4.4% 12.4%
Late 17.8% 92.5% 41.1%
Unknown 5.3% 3.1% 46.6%
Early vs late stage comparison between RCRD and NCRD
k. uterus (ICD-10 C54-C55)
Stage Category
RCRD
NCRD Early Late Unknown
Early 97.0% 8.1% 50.0%
Late 1.0% 86.4% 15.4%
Unknown 1.9% 5.5% 34.5%
Early vs late stage comparison between RCRD and NCRD
l. pancreas (ICD-10 C25)
Stage Category
RCRD
NCRD Early Late Unknown
Early 81.4% 1.6% 14.7%
Late 14.3% 96.8% 54.0%
Unknown 4.4% 1.6% 31.3%
Early vs late stage comparison between RCRD and NCRD
m. lymphoma (ICD-10 C81-C86, C88)
Stage Category
RCRD
NCRD Early Late Unknown
Early 92.1% 1.4% 25.0%
Late 5.6% 96.2% 49.9%
Unknown 2.3% 2.4% 25.2%
Early vs late stage comparison between RCRD and NCRD
n. cervical (ICD-10 C53)
Stage Category
RCRD
NCRD Early Late Unknown
Early 64.6% 7.6% 26.1%
Late 2.3% 72.2% 13.9%
Unknown 33.1% 20.2% 60.0%

Stage completeness by snapshot

Figure 16 shows the completeness of stage by tumour type for one snapshot per quarter. Stage completeness continues to increase and lags behind the incidence completeness due to staging activity happening up to several months after diagnosis.

Figure 16: Stage completeness by snapshot

A series of individual line charts for thirteen common cancers, showing the proportion of tumours staged by month, broken down by snapshot. For most tumour types, stage completeness has increased gradually over time.

Data completeness, source, and mortality comparisons

Counts of missing data

Figure 17 shows the count of tumours per month where the indicated data item is missing. The data items are: basis of diagnosis, birth date best, ethnic category, NHS number, postcode, quintile 2019, gender and trust code. Larger counts in the most recent months are to be expected.
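A monthly count of missing data items of this kind could be derived as in the sketch below. It is illustrative only: the function name is hypothetical, records are assumed to be dictionaries keyed by the AT_RAPID_TUMOUR column names listed in Appendix 2, a value of None or an empty string is assumed to mean "missing", and the example values are made-up placeholders.

```python
from collections import defaultdict
from datetime import date

def missing_counts_by_month(records, fields):
    """Count, per diagnosis month, the records where each field is missing.

    Returns {(year, month): {field: missing_count}}.
    """
    counts = defaultdict(lambda: dict.fromkeys(fields, 0))
    for rec in records:
        diagnosed = rec["DIAGNOSISDATE"]
        month_counts = counts[(diagnosed.year, diagnosed.month)]
        for field in fields:
            # Assumption: None or an empty string means the item is missing.
            if rec.get(field) in (None, ""):
                month_counts[field] += 1
    return dict(counts)

# Placeholder records, not real data.
example = [
    {"DIAGNOSISDATE": date(2024, 1, 5), "NHSNUMBER": "0000000000", "POSTCODE": None},
    {"DIAGNOSISDATE": date(2024, 1, 20), "NHSNUMBER": None, "POSTCODE": None},
    {"DIAGNOSISDATE": date(2024, 2, 2), "NHSNUMBER": "0000000000", "POSTCODE": "AB1 2CD"},
]
counts = missing_counts_by_month(example, ["NHSNUMBER", "POSTCODE"])
```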

Figure 17: Counts of missing data

A set of line charts for eight data items. Basis of diagnosis, ethnic category and gender show no counts of missing data. For postcode and quintile, larger counts are seen for more recent months. For other data items, the trends are variable over time.

Ethnicity completeness

Figure 18 shows the count of tumours per month with ethnicity recorded, by ethnic category code. Lower counts in the most recent months are to be expected.

Figure 18: Ethnicity completeness

A set of line charts, one for each of seventeen ethnic category codes, showing the number of tumours with complete ethnicity data over time. The completeness count increases from early 2020, then decreases for more recent months.

Tumour source

Figure 19 shows the number of tumours by month, broken down by the source of the diagnosis - i.e., the dataset that was used to create them.

Figure 19: Tumour source dataset

Four line charts of the tumour count from each source of diagnosis. For COSD, CWT and HES, there is a decrease in count around March 2020, followed by an increase. For DCO, the count is stable. For all sources, the number decreases for recent months.

Mortality proportion by month

Figure 20 shows, by month, the proportions of mortality within 30, 90 and 182 days in the RCRD compared to the NCRD, for all cancers included in the RCRD excluding C44 and D06.
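A proportion of this kind can be sketched as below for the tumours diagnosed in a single month. This is a minimal illustration under stated assumptions, not the method used to produce the figure: the function name is hypothetical, "within N days" is taken to include day N, and follow-up is assumed to be complete for every tumour.

```python
from datetime import date

def mortality_proportions(tumours, windows=(30, 90, 182)):
    """Proportion of tumours where the patient died within each window.

    tumours: iterable of (diagnosis_date, death_date_or_None).
    Returns {window_days: proportion}.
    """
    tumours = list(tumours)
    result = {}
    for days in windows:
        deaths = sum(
            1
            for diagnosed, died in tumours
            # Assumption: deaths on day N itself count as "within N days".
            if died is not None and 0 <= (died - diagnosed).days <= days
        )
        result[days] = deaths / len(tumours) if tumours else 0.0
    return result

# Placeholder dates, not real data: deaths at 14, 74 and 152 days, one survivor.
example = [
    (date(2020, 1, 1), date(2020, 1, 15)),
    (date(2020, 1, 1), date(2020, 3, 15)),
    (date(2020, 1, 1), date(2020, 6, 1)),
    (date(2020, 1, 1), None),
]
proportions = mortality_proportions(example)
```

In the most recent months some patients will not yet have been followed up for the full 182 days, which is one reason the assumption of complete follow-up matters when reading the latest points.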

Figure 20: Monthly mortality proportions at 30, 90 and 182 days

A line graph comparing the proportion of 30 day mortality over time between the rapid and national cancer registration data. The proportions follow similar trends and, in more recent months, are lower for the national cancer registration data. A line graph comparing the proportion of 90 day mortality over time between the rapid and national cancer registration data.

A line graph comparing the proportion of 182 day mortality over time between the rapid and national cancer registration data. The proportion is mostly higher for the national registration data, and then decreases for more recent months.

Figure 21 shows the proportions of route to diagnosis by month, for all cancers included in RCRD excl C44 and D06.

Figure 21: Monthly Route to Diagnosis

A line graph of the proportions of route to diagnosis by month, for all cancers included in RCRD excl C44 and D06.

Appendix 1 - List of pathway events

Table A1: AT_RAPID_PATHWAY: event list

AT_RAPID_PATHWAY: event list
EVENT_TYPE EVENT_DESC EVENT_PROPERTY_1 EVENT_PROPERTY_2 EVENT_PROPERTY_3 EVENT_DATE Linkage
1 CWT Treatment Period Start Date CWT First Treatment Flag CWT SITE_ICD10 CWT Cancer Treatment Event Type Treat period start NHSNUMBER
2 CWT Treatment Start CWT Treatment Modality CWT Cancer Treatment Event type Treatment start date NHSNUMBER
3 CWT MDT Begin CWT MDT Cancer Care Plan discussed indicator MDT date NHSNUMBER
4 CWT Faster Diagnosis Period End (null) Faster Diagnosis Period site Faster Diagnosis Period end date NHSNUMBER
5 HES Admitted Patient Care Episode Treatment speciality : Admission Method : HES ethnicity All ICD-10 codes (for episode) All OPCS-4 codes (for episode) Episode Start date - Episode end date NHSNUMBER
6 HES Admitted Patient Care Operation OPCS codes (for date) in POS order ICD-10 codes (for episode) Operation date NHSNUMBER
7 SACT Cycle Benchmark group Cycle number Treatment intent Cycle start date PATIENTID
8 RTDS Episode Radiotherapy intent ICD-10 diagnosis code Episode treatment start date PATIENTID
9 Tumour diagnosis (Provisional) Statusofregistration ICD-10 diagnosis code Stage_best Diagnosisdatebest PATIENTID
11 HES major surgery (historical) OPCS-4 code ICD-10 diagnosis code Operation date NHSNUMBER
12 HES major surgery (historical, further constraints) OPCS-4 code ICD-10 diagnosis code Further notes/constraints Operation date NHSNUMBER
14 RAWDATA major surgery (historical) OPCS-4 code ICD-10 diagnosis code Operation date PATIENTID
15 RAWDATA major surgery (historical, further constraints) OPCS-4 code ICD-10 diagnosis code Further notes/constraints Operation date PATIENTID
17 Prior tumour diagnosis Statusofregistration ICD-10 diagnosis code Stage_best Diagnosisdatebest PATIENTID
18 Tumour diagnosis (Final) Statusofregistration ICD-10 diagnosis code Stage_best Diagnosisdatebest PATIENTID
19 Patient vital status date Vitalstatus ICD-10 Underlying cause of death Death location code Vitalstatusdate PATIENTID
20 RAWDATA holistic needs assessment record HNA point of pathway : HNA offered : HNA staff role Primary diagnosis Laterality Date of HNA PATIENTID
21 RAWDATA staging Inferred best stage ICD-10 diagnosis code T/N/M components for pre-treatment/pathological/integrated stage Collected stage date PATIENTID
22 CWT First Seen Source of referral Categorisation of TWW, screening and consultant upgrade cases, where relevant Suspected cancer referral type Date first seen NHSNUMBER
23 HES diagnostic event OPCS-4 code Description BX/LD Operation date NHSNUMBER
24 RAWDATA personal care and support plan PCSP point of pathway : PCSP offered : PCSP staff role Primary diagnosis Laterality PCSP date PATIENTID
25 RAWDATA end of treatment summary Primary diagnosis Laterality eots_date PATIENTID
50 Skeleton Tumour creation E_base_record type (COSD = England, CANISC = Wales) ICD-10 diagnosis code Diagnosisdate PATIENTID
51 Diagnosis reported in COSD Number of times reported ICD-10 diagnosis code E_base_record type Diagnosisdate NHSNUMBER
52 CWT estimated diagnosis date CWT First Treatment Flag CWT recorded primary diagnosis (ICD) CWT Cancer Treatment Event Type Adjusted treat period start NHSNUMBER
53 HES inferred tumour HES cancer group ICD-10 diagnosis code Episode start date NHSNUMBER
54 COSD diagnosis submission E_base_record primary diagnoses ICD-10 diagnosis code (submission) Diagnosis date (submission) PATIENTID
55 RAWDATA biopsy record Laterality ICD-10 diagnosis code Collected date/authorised date PATIENTID
56 RAWDATA imaging record Laterality ICD-10 diagnosis code Procedure_date - diagdate Diagdate PATIENTID
57 RAWDATA HNA diagnosis Laterality Primary diagnosis (ICD-10) Diagdate PATIENTID
101 Inferred diagnosis Event_property_1 from source record (event 19, 52, 53, 54) ICD-10 diagnosis code Cancer group First recorded date PATIENTID
102 Inferred diagnosis, with adapted diagnosis dates (from 101, adapted) RCRD inferred/ derived diagnosis, with adapted diagnosis dates - using event 101 diagnoses, adapting diagnosis dates for earlier records in 90-days preceding 101 diagnosis date source_id from second record used to adapt diagnosis date ICD-10 diagnosis code Cancer group PATIENTID
103 Alternative inferred diagnosis (work in progress) RCRD inferred/ derived diagnosis based on combination of multiple data sources - using an alternative approach to 101 (approach still being refined) source_id from second record used to adapt diagnosis date ICD-10 diagnosis code Cancer group PATIENTID
*: Data dictionary: Primary cancer site for cancer faster diagnosis pathway

 

**: Data dictionary: Holistic needs assessment point of pathway for cancer


Appendix 2 - List of Rapid Registration fields available

Table A2: AT_RAPID_TUMOUR: field list

AT_RAPID_TUMOUR: field list
COLUMN_NAME DATA_TYPE Notes
INDIVIDUALID NUMBER(19,0) Matches AT_RAPID_PATHWAY for each event with event_type=101
PATIENTID NUMBER(19,0) Matches AT_RAPID_PATHWAY for each event with event_type=101
NHSNUMBER VARCHAR2(12 BYTE) Matches AT_RAPID_PATHWAY for each event with event_type=101
TUMOUR_AVPID NUMBER Matches AT_RAPID_PATHWAY for each event with event_type=101
DIAGNOSISDATE DATE Matches AT_RAPID_PATHWAY for each event with event_type=101
TUMOUR_SITE VARCHAR2(260 CHAR) Matches AT_RAPID_PATHWAY for each event with event_type=101 (event_property_2)
BIRTHDATEBEST DATE Taken from Encore
AGE VARCHAR2(260 CHAR) Taken from Encore
GENDER VARCHAR2(260 CHAR) Taken from Encore
POSTCODE VARCHAR2(255 BYTE) Taken from Encore
SURNAME VARCHAR2(64 BYTE) Taken from Encore
FORENAME VARCHAR2(64 BYTE) Taken from Encore
STAGE VARCHAR2(260 CHAR) Defined for selected cancer sites
ETHNICCATEGORY VARCHAR2(1 CHAR) Taken from Encore or the HESAPC dataset
FINAL_ROUTE VARCHAR2(22 BYTE) Final Route to Diagnosis using an adapted version of the standard NCRAS methodology
QUINTILE_2019 VARCHAR2(120 BYTE) Index of Multiple Deprivation quintile defined using the standard NCRAS methodology
CHRL_TOT_27_03 NUMBER(10,0) Charlson score defined using the standard NCRAS methodology
TUMOUR_MORPHOLOGY VARCHAR2(5 CHAR) Tumour morphology as recorded in the COSD system
TUMOUR_PERFORMANCESTATUS VARCHAR2(1 CHAR) Patient performance status at time of diagnosis
BASISOFDIAGNOSIS VARCHAR2(260 CHAR) The basis of diagnosis (e.g. clinical; pathological; etc.)
LSOA11 VARCHAR2(27 BYTE) 2011 census LSOA of residence at time of diagnosis
LSOA21 VARCHAR2(27 BYTE) 2021 census LSOA of residence at time of diagnosis
DIAGNOSIS_TRUST VARCHAR2(260 CHAR) Trust of diagnosis
SOURCE VARCHAR2(11 CHAR) The dataset used as the primary source for the RCRD registration
SOURCE_ID VARCHAR2(69 CHAR) The unique ID of the record used as the primary source for the RCRD registration
VITALSTATUS VARCHAR2(260 CHAR) Records whether the patient is currently alive or deceased at the time of the snapshot.
VITALSTATUSDATE DATE The date of the last known vital status for the patient
CANCER_GROUP VARCHAR2(40 BYTE) Broad cancer group derived from TUMOUR_SITE, according to groupings used for RCRD derivation and RCRD dashboard
CANCER_GROUP_DETAILED VARCHAR2(40 BYTE) Detailed cancer group derived from TUMOUR_SITE, according to groupings used for RCRD dashboard
SURGERY_FLAG NUMBER Indicator flag (0 = No; 1 = Yes) for whether the patient has an associated surgical tumour resection record
SURGERY_DATE DATE Where the SURGERY_FLAG indicates one or more associated surgical tumour resection records, this is the earliest such date
RADIOTHERAPY_FLAG NUMBER Indicator flag (0 = No; 1 = Yes) for whether the patient has an associated radiotherapy record (from RTDS)
RADIOTHERAPY_DATE DATE Where the RADIOTHERAPY_FLAG indicates one or more associated radiotherapy records, this is the earliest such date
SACT_FLAG NUMBER Indicator flag (0 = No; 1 = Yes) for whether the patient has an associated systemic anti-cancer therapy (SACT) record (from SACT dataset)
SACT_DATE DATE Where the SACT_FLAG indicates one or more associated SACT records, this is the earliest such date


Appendix 3 - Cancer groups used for matching

Table A3: Rapid Registration ICD-10 tumour inclusion list

Rapid Registration ICD-10 tumour inclusion list
ICD CANCER_GROUP SCOPE ICD CANCER_GROUP SCOPE
C00 Head & Neck DQ & CD C54 Gynae DQ & CD
C01 Head & Neck DQ & CD C55 Gynae DQ & CD
C02 Head & Neck DQ & CD C56 Gynae DQ & CD
C03 Head & Neck DQ & CD C57 Gynae DQ & CD
C04 Head & Neck DQ & CD C58 Gynae DQ & CD
C05 Head & Neck DQ & CD C59 Other DQ & CD
C06 Head & Neck DQ & CD C60 Urology DQ & CD
C07 Head & Neck DQ & CD C61 Prostate DQ & CD
C08 Head & Neck DQ & CD C62 Urology DQ & CD
C09 Head & Neck DQ & CD C63 Urology DQ & CD
C10 Head & Neck DQ & CD C64 Urology DQ & CD
C11 Head & Neck DQ & CD C65 Urology DQ & CD
C12 Head & Neck DQ & CD C66 Urology DQ & CD
C13 Head & Neck DQ & CD C67 Urology DQ & CD
C14 Head & Neck DQ & CD C68 Urology DQ & CD
C15 O-G DQ & CD C69 Brain & CNS DQ & CD
C16 O-G DQ & CD C70 Brain & CNS DQ & CD
C17 Upper GI DQ & CD C71 Brain & CNS DQ & CD
C18 Colorectal DQ & CD C72 Brain & CNS DQ & CD
C19 Colorectal DQ & CD C73 Endocrine DQ & CD
C20 Colorectal DQ & CD C74 Endocrine DQ & CD
C21 Colorectal DQ & CD C75 Endocrine DQ & CD
C22 Upper GI DQ & CD C76 Unknown Primary DQ & CD
C23 Upper GI DQ & CD C77 Unknown Primary DQ & CD
C24 Upper GI DQ & CD C78 Unknown Primary DQ & CD
C25 Upper GI DQ & CD C79 Unknown Primary DQ & CD
C26 Upper GI DQ & CD C80 Unknown Primary DQ & CD
C27 Other DQ & CD C81 Haematological DQ & CD
C28 Other DQ & CD C82 Haematological DQ & CD
C29 Other DQ & CD C83 Haematological DQ & CD
C30 Head & Neck DQ & CD C84 Haematological DQ & CD
C31 Head & Neck DQ & CD C85 Haematological DQ & CD
C32 Head & Neck DQ & CD C86 Haematological DQ & CD
C33 Lung DQ & CD C87 Haematological DQ & CD
C34 Lung DQ & CD C88 Haematological DQ & CD
C35 Other DQ & CD C89 Haematological DQ & CD
C36 Other DQ & CD C90 Haematological DQ & CD
C37 Other DQ & CD C91 Haematological DQ & CD
C38 Lung DQ & CD C92 Haematological DQ & CD
C39 Lung DQ & CD C93 Haematological DQ & CD
C40 Bone & ST DQ & CD C94 Haematological DQ & CD
C41 Bone & ST DQ & CD C95 Haematological DQ & CD
C42 Other DQ & CD C96 Haematological DQ & CD
C43 Melanoma DQ & CD C97 Unknown Primary DQ & CD
C44 NMSC - D05 Breast DQ
C45 Lung DQ & CD D06 Gynae -
C46 Bone & ST DQ & CD D09 Urology DQ
C47 Brain & CNS DQ & CD D32 Brain & CNS DQ
C48 Gynae DQ & CD D33 Brain & CNS DQ
C49 Bone & ST DQ & CD D35 Brain & CNS DQ
C50 Breast DQ & CD D41 Urology DQ
C51 Gynae DQ & CD D42 Brain & CNS DQ
C52 Gynae DQ & CD D43 Brain & CNS DQ
C53 Gynae DQ & CD D44 Brain & CNS DQ
Scope: DQ = ‘Included in this data quality document’; CD = ‘Included in cancerdata.nhs.uk/covid-19/rcrd dashboard’


Appendix 4 - Alternative defining events

Several options were considered as the defining events for the Rapid Registrations. Standalone datasets, subsets of standalone datasets, and combined datasets were explored and their FNE and FPE figures quantified. A subset of these alternatives is presented below as a demonstration of the process, but the majority of this exploratory work is out of scope for this document.

Candidates for diagnosis events from the three main datasets that are rapidly available and have nominally full coverage of cancer patients are shown below (SACT and RTDS were also examined but data is not presented). Of the three, the CWT data has the best FPE but its FNE is substantially higher than that of the COSD dataset. HES produced the worst results in both measures. A filtering process was applied to the standalone COSD data to remove apparently new diagnoses that were actually recurrences of prior tumours. This improved the FPE at the cost of increasing the FNE. We continue to test whether this process can be further refined to improve the combined FPE and FNE figures, and monitor changes in the underlying datasets that might also give new opportunities to do so.

Table A4: Rapid Cancer Registrations: alternative defining events

Rapid Cancer Registrations: alternative defining events
Event FPE FNE
Event 52 - standalone CWT 7.6% 28.3%
Event 53 - standalone HES 13.2% 38.9%
Event 54 - standalone COSD 8.1% 15.8%
Event 101 (up to cas2106) - filtered COSD 5.2% 17.7%
Event 101 (cas2107) - filtered combined COSD/CWT 5.6% 16.4%
Event 101 (cas2108) - filtered combined COSD/CWT 5.1% 16.5%
Event 101 (cas2109) - filtered combined COSD/CWT 5.1% 16.6%
Event 101 (cas2110) - filtered combined COSD/CWT/HES 5.1% 14.7%
Event 101 (cas2111) - filtered combined COSD/CWT/HES 6.2% 13.4%
Event 101 (cas2112 to cas2202) - filtered combined COSD/CWT/HES and Death Certificates Only 5.3% 13.4%
Event 101 (cas2203 to cas2204) - filtered combined COSD/CWT/HES and Death Certificates Only 6.3% 12.2%
Event 101 (cas2205) - filtered combined COSD/CWT/HES and Death Certificates Only 6.1% 12.3%
Event 101 (cas2206) - filtered combined COSD/CWT/HES and Death Certificates Only 5.6% 12.5%
Event 101 (cas2207) - filtered combined COSD/CWT/HES and Death Certificates Only 6.0% 11.8%
Event 101 (cas2208 to cas2210) - filtered combined COSD/CWT/HES and Death Certificates Only 6.0% 11.6%
Event 101 (cas2211 to cas2304) - filtered combined COSD/CWT/HES and Death Certificates Only 6.1% 11.5%
Event 101 (cas2305) - filtered combined COSD/CWT/HES and Death Certificates Only 6.1% 11.3%
Event 101 (cas2306) - filtered combined COSD/CWT/HES and Death Certificates Only 6.1% 11.4%
Event 101 (cas2307 to cas2308) - filtered combined COSD/CWT/HES and Death Certificates Only 6.1% 11.3%
Event 101 (cas2309 to cas2311) - filtered combined COSD/CWT/HES and Death Certificates Only 6.1% 11.4%
Event 101 (cas2312 to cas2409) - filtered combined COSD/CWT/HES and Death Certificates Only 6.0% 11.5%
Event 101 (cas2410) - filtered combined COSD/CWT/HES and Death Certificates Only 5.9% 11.7%
Event 101 (cas2411) - filtered combined COSD/CWT/HES and Death Certificates Only 5.8% 12.3%
Event 101 (cas2412 to cas2501) - filtered combined COSD/CWT/HES and Death Certificates Only 6.0% 12.0%
Event 101 (cas2504 to cas2505) - filtered combined COSD/CWT/HES and Death Certificates Only 6.0% 12.1%
Event 101 (cas2505 to cas2506) - filtered combined COSD/CWT/HES and Death Certificates Only 6.0% 12.0%
Event 101 (cas2507 to cas2510) - filtered combined COSD/CWT/HES and Death Certificates Only 6.0% 12.1%


Appendix 5 - Counts and error tabulations

Figure A1 shows an example for a very small dataset of how counts and error proportions are derived. This dataset has 10 Gold Standard Registrations and 7 Rapid Registrations overall (both indicated by the dots in the figure, with time running vertically over the course of 2018 and Gold Standard vs Rapid Registrations divided horizontally). Successful linkages between Gold Standard and Rapid Registrations are indicated by blue lines. False negatives and false positives are indicated. Only tumours in the 6-month assessment period are included in the tabulations below, although these can link to tumours outside the period as shown, and many-to-one linkages are also allowed. The false negative rate is therefore 3 in 7 and the false positive rate 1 in 6 below.
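The arithmetic in this example can be written out as follows. The sketch is illustrative only and the function name is hypothetical. The split of tumours inside and outside the assessment period (7 of the 10 Gold Standard and 6 of the 7 Rapid Registrations inside) is inferred from the stated denominators rather than read from the figure, and the linkage status given to tumours outside the period is arbitrary because they are not counted.

```python
from fractions import Fraction

def error_proportion(tumours):
    """Proportion of in-period tumours with no link to the other dataset.

    tumours: list of (in_assessment_period, linked) pairs for one dataset.
    A tumour counts as linked even if its partner lies outside the period.
    """
    in_period = [linked for in_window, linked in tumours if in_window]
    unlinked = sum(1 for linked in in_period if not linked)
    return Fraction(unlinked, len(in_period))

# Gold Standard: 7 in period (3 unlinked) plus 3 outside the period.
gold_standard = [(True, True)] * 4 + [(True, False)] * 3 + [(False, True)] * 3
# Rapid: 6 in period (1 unlinked) plus 1 outside the period.
rapid = [(True, True)] * 5 + [(True, False)] * 1 + [(False, True)] * 1

false_negative = error_proportion(gold_standard)  # unlinked Gold Standard tumours
false_positive = error_proportion(rapid)          # unlinked Rapid tumours
```

The false negative proportion uses the Gold Standard count as its denominator and the false positive proportion uses the Rapid count, so the two are not directly additive.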

Figure A1: Illustration of counts and errors tabulation

Diagram assessing gold standard registrations against the rapid registration dataset. Details of the graph as described in the text above.

Tables A5 and A6 below tabulate counts of Gold Standard and Rapid Registrations together with the numbers of false positive and false negative errors. When considering comparisons between figures the nature of the linkage and relationships displayed in the diagram above should be kept in mind.
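For reference, the derived columns in these tables follow from the two counts as sketched below: the difference is the Gold Standard count minus the Rapid count (so a negative value means more Rapid than Gold Standard registrations), and the percentage is the Rapid count as a share of the Gold Standard count. The function name is hypothetical, and rows with no Gold Standard count are returned as None, matching the "NA" entries.

```python
def compare_counts(gold_standard, rapid):
    """Return (difference, percentage Rapid/GS to 1 d.p.) for one table row."""
    if gold_standard is None or rapid is None:
        return None, None  # shown as NA in the tables
    difference = gold_standard - rapid
    percentage = round(100 * rapid / gold_standard, 1)
    return difference, percentage

brain_cns = compare_counts(5779, 5134)  # Brain & CNS row of Table A5
c15 = compare_counts(3996, 4266)        # C15 row of Table A6
```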

Table A5: Counts and errors tabulation by cancer group

Counts and errors tabulation by cancer group
Cancer group Gold Standard (GS) Registrations Rapid Registrations Difference Percentage Rapid/GS FPE FNE
Brain & CNS 5779 5134 645 88.8% 711 1352
Breast 28966 27218 1748 94.0% 1488 1720
Colorectal 18981 17844 1137 94.0% 917 1700
Endocrine 1908 1484 424 77.8% 139 507
Gynae 9787 9313 474 95.2% 666 1008
Haematological 14025 12397 1628 88.4% 714 2364
Head & Neck 5289 4930 359 93.2% 400 697
Lung 21716 20102 1614 92.6% 625 2109
Melanoma 8260 7686 574 93.1% 694 1073
O-G 6621 6469 152 97.7% 365 476
Prostate 27209 25176 2033 92.5% 323 2452
Bone & Soft Tissue 1139 1081 58 94.9% 367 408
Unknown Primary 3428 2565 863 74.8% 665 1532
Upper GI 9269 8659 610 93.4% 800 1448
Urology 17021 14823 2198 87.1% 1015 2896

Table A6: Counts and errors tabulation by cancer site

Counts and errors tabulation by cancer site
Cancer site Gold Standard (GS) Registrations Rapid Registrations Difference Percentage Rapid/GS FPE FNE
C00 111 150 -39 135.1% 65 25
C01 646 470 176 72.8% 13 59
C02 602 604 -2 100.3% 18 93
C03 234 108 126 46.2% 5 65
C04 256 239 17 93.4% 10 34
C05 211 185 26 87.7% 8 32
C06 270 287 -17 106.3% 21 50
C07 238 288 -50 121.0% 101 53
C08 82 89 -7 108.5% 15 14
C09 912 766 146 84.0% 16 59
C10 152 244 -92 160.5% 12 30
C11 113 111 2 98.2% 6 13
C12 157 99 58 63.1% 1 11
C13 142 130 12 91.5% 11 22
C14 26 64 -38 246.2% 15 14
C15 3996 4266 -270 106.8% 124 220
C16 2625 2203 422 83.9% 241 256
C17 818 666 152 81.4% 129 267
C18 12440 11758 682 94.5% 669 1242
C19 996 937 59 94.1% 43 90
C20 4898 4504 394 92.0% 115 326
C21 647 645 2 99.7% 90 42
C22 2646 2520 126 95.2% 265 457
C23 477 467 10 97.9% 29 62
C24 645 526 119 81.6% 32 84
C25 4532 4184 348 92.3% 137 502
C26 151 296 -145 196.0% 208 76
C30 162 158 4 97.5% 27 25
C31 93 65 28 69.9% 5 27
C32 882 873 9 99.0% 51 71
C33 13 11 2 84.6% 1 3
C34 20254 18737 1517 92.5% 549 1924
C37 167 92 75 55.1% 13 58
C38 73 355 -282 486.3% 46 21
C39 NA 12 NA NA 4 NA
C40 119 107 12 89.9% 13 25
C41 117 150 -33 128.2% 78 44
C43 8260 7686 574 93.1% 694 1073
C45 1209 895 314 74.0% 12 103
C46 68 43 25 63.2% 4 25
C47 28 12 16 42.9% 5 22
C48 287 400 -113 139.4% 111 87
C49 835 781 54 93.5% 272 314
C50 25135 24349 786 96.9% 1355 1357
C51 644 594 50 92.2% 56 79
C52 95 109 -14 114.7% 16 12
C53 1321 1315 6 99.5% 56 86
C54 4083 3691 392 90.4% 106 197
C55 73 333 -260 456.2% 26 17
C56 2999 2527 472 84.3% 243 491
C57 275 320 -45 116.4% 34 38
C58 10 24 -14 240.0% 18 1
C60 304 315 -11 103.6% 50 37
C61 27209 25176 2033 92.5% 323 2452
C62 1055 1067 -12 101.1% 88 72
C63 33 31 2 93.9% 13 19
C64 4902 4359 543 88.9% 272 802
C65 419 322 97 76.8% 25 92
C66 362 259 103 71.5% 13 123
C67 4474 5052 -578 112.9% 146 683
C68 96 56 40 58.3% 6 40
C69 382 352 30 92.1% 48 63
C70 19 41 -22 215.8% 4 1
C71 2263 2116 147 93.5% 139 206
C72 83 90 -7 108.4% 35 19
C73 1733 1354 379 78.1% 80 413
C74 117 84 33 71.8% 26 56
C75 58 46 12 79.3% 33 38
C76 94 210 -116 223.4% 109 54
C77 271 126 145 46.5% 62 129
C78 594 54 540 9.1% 22 332
C79 231 131 100 56.7% 52 126
C80 2238 2044 194 91.3% 420 891
C81 893 867 26 97.1% 15 69
C82 1208 1044 164 86.4% 16 139
C83 3114 2696 418 86.6% 39 318
C84 394 230 164 58.4% 14 122
C85 1381 1015 366 73.5% 63 318
C86 NA 101 NA NA 2 NA
C88 220 350 -130 159.1% 12 62
C90 2552 2205 347 86.4% 63 434
C91 2360 1900 460 80.5% 79 557
C92 1761 1559 202 88.5% 233 294
C93 23 186 -163 808.7% 23 1
C94 26 79 -53 303.8% 66 11
C95 51 65 -14 127.5% 10 13
C96 42 100 -58 238.1% 79 26
D05 3831 2869 962 74.9% 133 363
D09 5166 1233 3933 23.9% 245 954
D32 1474 1037 437 70.4% 88 508
D33 488 602 -114 123.4% 115 200
D35 492 547 -55 111.2% 194 146
D41 210 2129 -1919 1013.8% 157 74
D42 150 16 134 10.7% 4 38
D43 278 265 13 95.3% 56 76
D44 122 56 66 45.9% 23 73

Appendix 6 - False negative errors and basis of diagnosis

This appendix explores the reason for the overall age-dependence of the false negative error rate.

The most common methods of confirming a diagnosis (histology and cytology) account for the lowest proportion of false negatives (Figure A2). Where diagnosis comes from specific tumour markers, the Rapid Registrations are much more likely to “miss” the significant event or events. Patients diagnosed clinically (from imaging, consultation by a doctor but without a pathological sample being taken) are also more likely to be “missed” in the Rapid Registrations dataset.

Those patients for whom a diagnosis method cannot be determined (unknown), or who died before they could be offered cancer treatment (death certificate), are most likely to be “missed” in the Rapid Registrations dataset. As Figure A3 indicates, though, these account for a small proportion of those falsely omitted from the Rapid Registrations.

The marked reduction in the proportion of patients having their diagnosis confirmed from a pathological specimen (histology or cytology) explains the increase often observed at older ages in Figure A3, from the age of around 70, reflecting fewer patients having an invasive procedure performed on them as age increases. This is likely to be the reason behind the increasing false negative proportions by age observed overall and in most tumour groups (Figures 5 and 6).

Figure A2: The proportion of false negative Rapid Registrations by tumour group and basis of diagnosis, England, 2018

A grouped bar chart of the proportion of false negative rapid registrations by tumour group and basis of diagnosis. The proportion of error was highest for tumours from death certificate.

Figure A3: The proportion of false negative Rapid Registrations by method of diagnosis, England, 2018 (all tumour types combined)

A line chart of the proportion of false negative diagnoses by age at diagnosis, grouped by method of diagnosis. The method of diagnosis with the highest proportion of tumours is histology of primary up to age 88, and clinical investigation thereafter.

Appendix 7 - False positive and false negative proportion by month

Figure A4 shows the False Negative and False Positive error proportions by month for the broader matching criteria and matching periods of 90 and 30 days.

Figure A4: Monthly False Positive and False Negative proportions

A line chart of the proportion of error by month, grouped by false positive or false negative error. For both 30 and 90 day matching periods, the proportion of false negative errors is lower until July 2021, when false positive errors increase.

Appendix 8 - Sensitivity testing of matching criteria

In this section, the sensitivity of the Rapid Registrations dataset is illustrated for different matching criteria.

As expected, the stricter the criteria on the timing of events, the more errors (both false negative and false positive) are observed. Removing the match specification on tumour type (the “None” row of Table A7) reduces both error proportions, and demonstrates that approximately 40% of false positive tumours have a cancer diagnosis of some sort when the necessity of matching by tumour group is removed.

Table A7: Proportions of false positive and negative errors under alternative matching criteria

Tumour matching | Match within N days | False Negative % | False Positive %
Broader | 90 | 12.1% | 6.0%
Broader | 60 | 13.7% | 7.6%
Broader | 30 | 19.2% | 13.2%
Broader | 14 | 30.0% | 24.7%
Broader | 7 | 46.2% | 42.2%
Broader | 0 | 81.1% | 79.6%
Narrow | 90 | 20.1% | 13.9%
None | 90 | 10.6% | 4.6%
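
The effect of the matching criteria in Table A7 can be illustrated with a minimal sketch. It is not the NCRAS implementation: the `Tumour` record, the `tumour_group` and `broad_group` fields, the interpretation of “Broader” and “Narrow” as coarser and finer tumour groupings, and the choice of denominators (false negatives over Gold Standard tumours, false positives over Rapid Registration tumours) are all assumptions made for illustration, and the example records are invented.

```python
from dataclasses import dataclass
from datetime import date
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Tumour:
    """Hypothetical tumour record; field names are assumptions, not CAS columns."""
    patient_id: str
    tumour_group: str   # finer grouping, used for "narrow" matching
    broad_group: str    # coarser grouping, used for "broader" matching
    diagnosis_date: date


def is_match(a: Tumour, b: Tumour, max_days: int, tumour_matching: str) -> bool:
    """True if two records share a patient, fall within max_days, and agree on tumour type."""
    if a.patient_id != b.patient_id:
        return False
    if abs((a.diagnosis_date - b.diagnosis_date).days) > max_days:
        return False
    if tumour_matching == "narrow":
        return a.tumour_group == b.tumour_group
    if tumour_matching == "broader":
        return a.broad_group == b.broad_group
    if tumour_matching == "none":
        return True  # any cancer diagnosis for the patient counts
    raise ValueError(f"unknown tumour_matching: {tumour_matching!r}")


def error_proportions(
    gold: Sequence[Tumour],
    rapid: Sequence[Tumour],
    max_days: int,
    tumour_matching: str,
) -> Tuple[float, float]:
    """Return (false negative proportion, false positive proportion).

    Simplification: a record counts as matched if ANY record on the other
    side matches it; no one-to-one pairing is enforced.
    """
    fn = sum(
        1 for g in gold
        if not any(is_match(g, r, max_days, tumour_matching) for r in rapid)
    )
    fp = sum(
        1 for r in rapid
        if not any(is_match(g, r, max_days, tumour_matching) for g in gold)
    )
    return fn / len(gold), fp / len(rapid)


# Invented example records, for illustration only.
gold = [
    Tumour("p1", "colon", "colorectal", date(2018, 1, 10)),
    Tumour("p2", "breast", "breast", date(2018, 3, 1)),
    Tumour("p3", "lung", "lung", date(2018, 5, 20)),
]
rapid = [
    Tumour("p1", "rectum", "colorectal", date(2018, 1, 25)),   # 15 days apart
    Tumour("p2", "breast", "breast", date(2018, 4, 15)),       # 45 days apart
    Tumour("p3", "unknown primary", "unknown primary", date(2018, 5, 30)),
    Tumour("p4", "prostate", "prostate", date(2018, 6, 1)),    # no Gold Standard record
]

for matching, days in [("broader", 90), ("broader", 30), ("narrow", 90), ("none", 90)]:
    fn_prop, fp_prop = error_proportions(gold, rapid, days, matching)
    print(f"{matching:8s} {days:3d} days  FN={fn_prop:.1%}  FP={fp_prop:.1%}")
```

In this toy data, shortening the window or narrowing the tumour grouping raises both proportions, while dropping the tumour type requirement lowers them, which mirrors the direction of the changes in Table A7.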

Appendix 9 - Code changes to the RCRD build process

In this section, code changes introduced in each monthly snapshot are described.

Table A8: RCRD change log

AT_RAPID_PATHWAY: event list
snapshot change_id code_change
cas2510 cas2510-1 Further updates to support assignment of C48 tumours to a cancer group depending on patient gender
cas2510 cas2510-2 Cleaning of odd values from stage field
cas2510 cas2510-3 Cleaning of dates of death prior to diagnosis
cas2509 cas2509-1 Further updates to support assignment of C48 tumours to a cancer group depending on patient gender
cas2509 cas2509-2 Update to include persons with death-only information in group of proxy tumours.
cas2508 cas2508-1 Further updates to support assignment of C48 tumours to a cancer group depending on patient gender
cas2508 cas2508-2 Minor changes to surgery lookup table to align with standard treatment reporting
cas2508 cas2508-3 Adding D48, D72, E85, M72 ICD-10 overall lookup table to align with current cancer registration practice
cas2507 cas2507-1 C53 and C57 staging values moved into STAGE field from EXPERIMENTAL_STAGE
cas2507 cas2507-2 C48 tumours now assigned to a cancer group depending on patient gender
cas2507 cas2507-3 Resective surgery lookup table better aligned with 2025 Cancer Flags output
cas2506 cas2506-1 Internal changes to deal with multiple NHSnumbers per personid
cas2506 cas2506-2 Internal changes to prepare for improvements to assigning C48 tumours
cas2506 cas2506-3 Further development of event 102 and 103 experimental events
cas2505 None
cas2504 cas2504-1 Further development of event 102 and 103 experimental events
cas2504 cas2504-2 Update to basis of diagnosis code for 2023 cases onward to make consistent with updated registration practice
cas2501 cas2501-1 Permanent fix to enact deduplication of experimental event 102
cas2412 cas2412-1 Include staging of C53 (cervical cancer) in experimental stage field
cas2412 cas2412-2 Correcting issue that prevented rapidly fatal cancers from being included from the HES data
cas2412 cas2412-3 Deduplication of experimental event 102 (hotfix)
cas2412 cas2412-4 Excluded lung screening Routes to Diagnosis prior to January 2019
cas2411 cas2411-1 Update to surgery code to use a combined table of all 3-digit ICD-10 codes, for all-stage and stage-specific procedures.
cas2411 cas2411-2 Filter OPCS4 procedure codes saved in initial HES tables, to include only those relevant to later lookups.
cas2411 cas2411-3 Added filtering to exclude Welsh only patients within the rapid_fatality section of event 101.
cas2411 cas2411-4 Two proposed new events, 102 and 103.
cas2410 cas2410-1 Refactored surgical lookup table code to be consistent with those used in treatment flag output
cas2410 cas2410-2 Added GP Practice code to tumour table
cas2409 cas2409-1 Added C33 to allowed list for lung screening
cas2409 cas2409-2 Updated NSPL postcode lookup to NSPL published May 2024
cas2409 cas2409-3 Internal refactoring of surgical lookup table to prepare for a simpler update process
cas2409 cas2409-4 Created internal experimental table showing patient GP practice at time of diagnosis
cas2408 cas2408-1 Changed criteria for including Event 54 in rapid pathway table such that there is a known nhsnumber instead of a known patient id (motivated by changes to COSD v10 data submissions)
cas2407 cas2407-1 Added STAGE_EXPERIMENTAL field
cas2407 cas2407-2 Added staging for C57 ovarian tumours (into STAGE_EXPERIMENTAL field)
cas2407 cas2407-3 Opened selection for screening cases to include C34 lung cancers
cas2406 None
cas2405 cas2405-1 Updated assignment of trusts (reversing effect of cas2305-2 change), reducing numbers of patients diagnosed at tertiary trusts and increasing numbers diagnosed in nearby trusts.
cas2405 cas2405-2 Refactored order of properties in event 5 for consistency throughout code while maintaining fix for ethnicity made in cas2404.
cas2404 cas2404-1 Fixed issue with ethnicity ‘top up’ from HES data which was incorrectly assigning ethnicity where it was present in HES but missing in COSD.
cas2404 cas2404-2 Update to allow creation of HES identified endocrine tumours based on event 11, restoring diagnoses previously identified from event 13.
cas2403 cas2403-1 Added place of death to event 19, property 3.
cas2403 cas2403-2 Merging event 13 into event 11 and event 16 into 14. This has the effect of no longer distinguishing surgery codes consistent with the CASSOP 4.5 from those specific to the RCRD build.
cas2403 cas2403-3 Add LSOA21 and age at diagnosis to AT_RAPID_TUMOUR table.
cas2402 None
cas2401 None
cas2312 cas2312-1 Update ICD-10 site lookup table to include more D-coded tumour groups.
cas2311 cas2311-1 Filter ethnicity to 1 digit only.
cas2311 cas2311-2 Updated postcode lookup table to nspl_202305.
cas2311 cas2311-3 Added filter to morphology codes to only allow those beginning with ‘8’ or ‘9’.
cas2311 cas2311-4 After review of fields removed ‘received_date’ from pathway table.
cas2311 cas2311-5 After review of fields removed event type 10 as an effective duplicate of event type 19.
cas2310 None
cas2309 cas2309-1 Allow HES and CWT records to create event-type 52 and 53 events even if there is no patientid. Screen these out so that they don’t go on to create event-type 101 events, but are now available for testing.
cas2308 None
cas2307 cas2307-1 Expose path and integrated TNM stage components in event 21.
cas2307 cas2307-2 Change offset for CWT diagnosis events to a fixed lookup table rather than re-calculating each time.
cas2307 cas2307-3 Update CWT surgery codes to reflect changes to CWT data dictionary.
cas2307 cas2307-4 Updated surgery lookup table to reflect changes implemented in cancer treatment flags output.
cas2306 cas2306-1 Move comparison of diagnosis date to date of death to earlier in the processing (and using vital status date for date of death if appropriate).
cas2305 cas2305-1 Remove duplicate patients with multiple patientid and same nhsnumbers.
cas2305 cas2305-2 Revert to prior order to prioritise creation of event 101s without prioritising those with a known trust.
cas2305 cas2305-3 Bring diagnosis trust through to AT_RAPID_TUMOUR table.
cas2305 cas2305-4 Added new basis of diagnosis codes to reflect changes to ENCR definitions for diagnoses from 2023 onwards.
cas2305 cas2305-5 Replace diagnosisdate with date of death for cases where date of death would otherwise have been within the 3 months before diagnosisdate.
cas2304 None
cas2303 None
cas2302 None
cas2301 None
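
As an illustration of how one change log entry might translate into code, the sketch below follows the wording of cas2305-5 (replacing the diagnosis date with the date of death where death falls within the 3 months before the recorded diagnosis date). It is not the RCRD build code: the function name `adjust_diagnosis_date`, its parameters, and the use of 91 days as a stand-in for “3 months” are assumptions. What the build does when death precedes diagnosis by more than 3 months is not stated in the entry, so the sketch leaves those dates unchanged.

```python
from datetime import date
from typing import Optional


def adjust_diagnosis_date(
    diagnosis_date: date,
    date_of_death: Optional[date],
    window_days: int = 91,  # assumption: "3 months" approximated as 91 days
) -> date:
    """Return the date of death when it falls within window_days BEFORE the
    recorded diagnosis date; otherwise return the diagnosis date unchanged."""
    if date_of_death is None:
        return diagnosis_date
    days_death_before_diagnosis = (diagnosis_date - date_of_death).days
    if 0 < days_death_before_diagnosis <= window_days:
        return date_of_death
    return diagnosis_date


# Death recorded 39 days before the diagnosis date: diagnosis date is moved back.
print(adjust_diagnosis_date(date(2023, 5, 10), date(2023, 4, 1)))
# Death after diagnosis: diagnosis date is left as recorded.
print(adjust_diagnosis_date(date(2023, 5, 10), date(2023, 6, 1)))
```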