The National Cancer Registration and Analysis Service (NCRAS) has
developed an algorithmically generated Rapid Cancer Registration Dataset
(RCRD) using the standard administrative datasets which flow rapidly
into NHS England (NHSE) and are incorporated into the Cancer Analysis
System (CAS) of NCRAS. The data takes the form of a series of
significant events that occur to each patient as they proceed through
the diagnostic and then therapeutic parts of the cancer pathway, and is
available at approximately 4-5 months behind real time. The RCRD is
shallower and narrower than the full NCRAS cancer registration dataset;
it should be used and interpreted with reference to the caveats outlined
within this document.
Main findings
This document outlines the main features of the data to be aware of
when interpreting the Rapid Cancer Registration Dataset:
- across all cancer types included, approximately 12.1% of cases are
missing and 6.0% of cases are included erroneously or with an incorrect
cancer type or diagnosis date (compared with ‘Gold Standard’
registration data for 2018)
- these figures vary strongly with cancer site. Broadly, more
common cancers (particularly breast and prostate cancer) perform best
and less common cancers (particularly bone and soft tissue and cancers
of unknown primary) perform worst
- non-melanoma skin cancer (ICD-10 C44) tumours are excluded from
the majority of data shown (Figure 3 onwards). Carcinoma of the cervix
in-situ (ICD-10 D06) is excluded from all data presented
- there are more missing tumours in those aged over 70 compared to
younger age groups
- other factors that reduce data completeness include the patient’s
route to diagnosis, mortality within 30 days of diagnosis, and the
presence of multiple cancers
- usable data is available approximately 4 to 5 months after
diagnosis or other clinical activity occurs
- data on cancer stage group at diagnosis is available for a number
of common tumour types, although completeness is lower than that for the
Gold Standard registration data. Where data is available it generally
agrees with the Gold Standard stage group in 80 to 90% of
tumours
The dataset includes Rapid Cancer Registrations from January 2018 to
the most recently available data (at the date specified in the title to
this document), plus additional event data for the same period.
Summary
A need has been identified to make ‘proxy cancer registrations’ (and
associated clinical activity) rapidly available for the COVID-19 period,
to support the public health response by NHS England (NHSE) and other
agencies, and service reorganisation by the NHS. These proxy
registrations are called Rapid Registrations, in contrast to the more
formal, detailed registration process that is used in non-clinical
cancer research and National Statistics.
The National Cancer Registration and Analysis Service (NCRAS) has
developed a Rapid Cancer Registration Dataset (RCRD) using all standard
administrative datasets which flow rapidly into NHSE and are
incorporated into the Cancer Analysis System (CAS) of NCRAS.
This document describes the dataset structure, creation methodology,
and data quality caveats (due to the rapid automated creation process
without additional data curation) behind this dataset.
These data structures and methodologies are expected to evolve over
the course of the public health response to COVID-19. The data is
updated monthly and is referred to by the monthly CAS snapshot upon
which it is based, e.g. CAS2009 refers to the CAS snapshot from
September 2020. This document is considered a ‘living document’ and
strictly applies only to the snapshot of CAS identified in the
title.
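As an illustration of the snapshot naming convention described above, the following minimal sketch converts a snapshot label into the month it refers to. The function name `snapshot_month` is hypothetical, and the sketch assumes every label is the prefix "cas" (in any case) followed by a two-digit year and a two-digit month.

```python
import re
from datetime import date


def snapshot_month(label: str) -> date:
    """Return the first day of the month a CAS snapshot label refers to.

    Assumes labels look like 'CAS2009' or 'cas2603': 'cas' in any case,
    then a two-digit year (20YY) and a two-digit month.
    """
    match = re.fullmatch(r"cas(\d{2})(\d{2})", label.strip().lower())
    if match is None:
        raise ValueError(f"not a CAS snapshot label: {label!r}")
    year = 2000 + int(match.group(1))
    month = int(match.group(2))
    return date(year, month, 1)  # date() rejects months outside 1-12


print(snapshot_month("CAS2009"))  # 2020-09-01
```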
Methodology
Proxy registration events (Rapid Registrations)
Datasets available to NHSE were surveyed to establish how many months in
arrears they arrive within NCRAS and are loaded in a usable format
for analysis. From these datasets a selection of event types was
defined, similar to those typically used in cancer pathway analyses
by NCRAS.
The data takes the form of a series of significant events that occur
to each patient as they proceed through the diagnostic and then
therapeutic parts of the cancer pathway. These events include
chemotherapy cycles, radiotherapy episodes and major cancer surgery as
well as events based on the Cancer Waiting Times (CWT) and Cancer
Outcomes and Services Dataset (COSD) datasets. These event types are
numbered in the range 1-23 in the dataset.
Some events hypothesised to be indicative of a cancer diagnosis were
defined including ‘Diagnosis reported in COSD’ (event 51) and ‘CWT
estimated diagnosis date’ (event 52). These are numbered in the range
50-57 in the dataset - see Appendix 1 for a full list.
The indicative events for diagnosis were explored as candidate Rapid
Registration events. A candidate rapid registration event was judged
to match a Gold Standard Registration event if it met the following
two conditions:
- the diagnosis dates of the two events differed by 90 days or less
- both registrations fell into the same broad tumour group (as defined
in Appendix 3)
Using these matching criteria False Positive errors and False
Negative errors are defined as:
- False Positive Error (FPE): A rapid registration
event has been created which does not match against a Gold Standard
Registration in the comparison period
- False Negative Error (FNE): There exists a Gold
Standard Registration event for which no rapid registration event can be
matched
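The matching criteria and error definitions above can be sketched as follows. This is an illustrative sketch only, not the NCRAS implementation: the `Registration` class and its field names are hypothetical, the example records are invented, and matching on the same patient identifier is assumed rather than stated in the text.

```python
from dataclasses import dataclass
from datetime import date

MATCH_WINDOW_DAYS = 90  # from the matching criteria above


@dataclass(frozen=True)
class Registration:
    patient_id: str       # hypothetical identifier field
    tumour_group: str     # broad tumour group (Appendix 3)
    diagnosis_date: date


def is_match(rapid: Registration, gold: Registration) -> bool:
    """Same patient (assumed), same tumour group, dates within 90 days."""
    gap = abs((rapid.diagnosis_date - gold.diagnosis_date).days)
    return (rapid.patient_id == gold.patient_id
            and rapid.tumour_group == gold.tumour_group
            and gap <= MATCH_WINDOW_DAYS)


def error_counts(rapid, gold):
    """Return (false positives, false negatives) for two lists of records."""
    false_pos = sum(1 for r in rapid if not any(is_match(r, g) for g in gold))
    false_neg = sum(1 for g in gold if not any(is_match(r, g) for r in rapid))
    return false_pos, false_neg


# Invented example records, for illustration only.
rapid = [Registration("A", "Breast", date(2018, 5, 1)),
         Registration("B", "Lung", date(2018, 6, 1))]
gold = [Registration("A", "Breast", date(2018, 5, 20)),
        Registration("B", "Lung", date(2018, 10, 1)),
        Registration("C", "Prostate", date(2018, 7, 1))]

print(error_counts(rapid, gold))  # (1, 2)
```

Patient B appears in both lists but the dates are 122 days apart, so the pair contributes one false positive and one false negative. The sketch does not enforce one-to-one matching, which a production comparison would need to consider.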
Additional filtering was applied to the candidate events, and event 101
was eventually defined to minimise both false positive and false
negative errors. It is recommended for use by researchers as the
best candidate for a rapid cancer registration. Appendix 4 briefly
examines some of the alternatives considered in the development of this
event definition.
Data structures
The rapid registration dataset consists of two tables:
- AT_RAPID_PATHWAY: This is an event-based dataset
with a number of types of event of interest defined based on the rapidly
available datasets, see Appendix 1 for event definitions and properties.
These are numbered in the range 1-23 for general purpose events, 50-57
for events that are candidates for combining into a rapid registration,
and 101 for the final rapid registration event.
- AT_RAPID_TUMOUR: This is a tumour level dataset that
holds tumour and patient level data for each of the tumours defined by a
rapid registration. The structure and contents of this table are
presented in Appendix 3.
The rapid registration pathway and tumour tables can be linked
together as shown in Figure 1, and also to other sufficiently timely
datasets via NHS number.
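The event numbering ranges described for AT_RAPID_PATHWAY can be expressed as a small helper when filtering events. This is a sketch only: the function name and the category labels are hypothetical, and only the numeric ranges come from the text above.

```python
def event_category(event_number: int) -> str:
    """Classify an AT_RAPID_PATHWAY event number by the ranges in the text.

    The returned labels are illustrative names, not values in the dataset.
    """
    if 1 <= event_number <= 23:
        return "general"             # general purpose events
    if 50 <= event_number <= 57:
        return "candidate"           # candidate diagnosis events
    if event_number == 101:
        return "rapid_registration"  # final rapid registration event
    return "other"                   # not covered by the ranges above


print(event_category(52))   # candidate
print(event_category(101))  # rapid_registration
```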
Data Quality
How does the number of Rapid Registrations compare with Gold Standard
Registrations?
To illustrate the strengths and weaknesses of the Rapid Registrations
compared to the gold standard process, registrations for tumours
diagnosed during 2018 are compared in Figure 2.
For most tumour groups the counts of Rapid Registrations are
significantly lower than those of standard registrations. The COSD
system does not attempt to record basal cell carcinoma non-melanoma skin
cancers (although they are recorded by hospital pathology systems, and
thereby registered), which explains the discrepancy for that group.
There is only one group where this situation is reversed - bone and
soft tissue - for which a precise morphology is required to properly
record the diagnosis. These cancers are being preferentially coded to
bone and soft tissue in COSD (as the COSD standard necessitates simpler
site-based coding, and this is the best choice under the circumstances)
and re-coded during the gold standard registration process, where a more
sophisticated combination of site and morphological coding is possible.
Comparing the matching quality of Rapid Registrations
The quality of the Rapid Registrations was judged by comparing them
against the gold-standard cancer registrations in the period April 2018
to September 2018. This period was chosen as available gold standard
registration data was only finalised to December 2018 and a matching
period of 90 days was allowed (restricting comparison to the middle six
months of the twelve-month period).
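The choice of comparison period follows from simple date arithmetic, sketched below. The sketch assumes the gold standard data spans 1 January to 31 December 2018 (the text states only that it was finalised to December 2018) and keeps the calendar months that lie at least 90 days inside both ends.

```python
import calendar
from datetime import date, timedelta

MATCH_WINDOW = timedelta(days=90)

# Assumption: gold standard registrations cover the whole of 2018.
gold_start, gold_end = date(2018, 1, 1), date(2018, 12, 31)

# Diagnoses outside these dates could match a registration that falls
# outside the finalised gold standard period.
earliest = gold_start + MATCH_WINDOW
latest = gold_end - MATCH_WINDOW

# Keep only calendar months lying wholly inside [earliest, latest].
full_months = [
    m for m in range(1, 13)
    if date(2018, m, 1) >= earliest
    and date(2018, m, calendar.monthrange(2018, m)[1]) <= latest
]

print(earliest, latest, full_months)
# 2018-04-01 2018-10-02 [4, 5, 6, 7, 8, 9]
```

The six whole months that remain are April to September 2018, the period used in the comparison.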
Figure 4 shows the proportions of false positive and false negative
events, by broad cancer type (excluding non-melanoma skin cancer),
measured in the cas2603 snapshot (the tumour groups are defined in
Appendix 3). A more detailed tabulation is available by tumour group and
tumour site in Appendix 5.
In most tumour groups, there are more tumours missed by the rapid
registrations process (false negatives) than there are falsely
identified as tumours (false positives).
For breast and prostate, very few incorrect proxy registrations are
made. Breast, colorectal, lung, oesophagogastric (O-G) and prostate
cancers are also least likely to be missing from the proxy dataset,
whereas for cancers of unknown primary, and bone and soft tissue tumours
more than 25% of cancers are missed. Bone and soft tissue tumours are
not frequently diagnosed. These tumours often require multiple pathology
reports to correctly diagnose a patient and the Rapid Registrations
dataset has not attempted to reconcile differences in the reported
diagnoses.
Counts of events over time
This section examines the population of events by chronological time
and when they appear in successive analytical snapshots in the CAS.
Figure 14 shows that most data items in the Rapid Registrations dataset
are stable with respect to the snapshot month.
Specific comments about the events shown below are:
- cancer waiting times data (events 1–4) are received based on the
treatment start date; this explains why for event 2 all lines lie
exactly on top of each other. Other CWT events accumulate over
successive snapshots where these events occur before the first treatment
start event
- an issue with HES data that caused lower than expected
completeness from 2020-04-01 was resolved in cas2102, leading to
increased event counts in events 5, 6, 11, 12, 13 and 23
- the definition of event 17 only includes tumour diagnoses prior
to 2018, so lack of data in the chart below is expected
- definitions of staging events may change between snapshots, which
might explain higher or lower counts in one snapshot compared to
others
- the vital status shown in event 19 is typically only assessed
each January or at the completion of registering each diagnosis year,
explaining the large peaks in the graph
- the raw data used to populate events 21, 54 and 56 is subject to
ongoing deduplication, which explains lower counts in earlier time
periods for later snapshots
- between snapshots, counts for events 101–103 (inferred diagnoses)
generally increase, particularly for recent months as additional COSD
data is submitted. For some earlier months, there is a small decrease in
these counts because the algorithm excludes potential diagnoses where
the patient already has a confirmed diagnosis in the same tumour group
more than 90 days before. These exclusions can change between snapshots
as gold standard registration data is processed, leading to more
confirmed previous diagnoses. The effect has been measured as less than
1% of all cases in any given month
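The exclusion rule mentioned in the last comment above can be sketched as a simple date test. This is an assumption-laden illustration rather than the production algorithm: the function name is hypothetical, and `confirmed_dates` is assumed to hold only confirmed diagnoses for the same patient and the same tumour group.

```python
from datetime import date


def is_excluded(candidate_date, confirmed_dates, window_days=90):
    """True if any confirmed diagnosis precedes the candidate by more
    than `window_days` days (same patient and tumour group assumed)."""
    return any((candidate_date - confirmed).days > window_days
               for confirmed in confirmed_dates)


# Invented dates, for illustration only.
print(is_excluded(date(2021, 6, 1), [date(2020, 1, 1)]))  # True
print(is_excluded(date(2021, 6, 1), [date(2021, 5, 1)]))  # False
```

Because `confirmed_dates` grows as gold standard registration data is processed, the same candidate can be retained in one snapshot and excluded in a later one, which is consistent with the small decreases described above.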
Estimated completeness of Rapid Registrations and secondary
datasets
Detailed linked rapid cancer registration, CWT, SACT and RTDS data is
available at approximately a four-month lag from real time. Linked HES
and raw COSD data is available at approximately 4-5 months behind real
time.
Table 2 below shows data usability and completeness for Rapid
Registrations and the constituent datasets. The “latest usable” column
shows the ‘hard limit’ on data that is considered fit for analytical
purposes (90% completeness). Even in months before this limit, data is
not necessarily complete; the completeness for each month is displayed
below. This should be taken into account in any use of the rapid
registration data and the secondary datasets.
For the Rapid Tumour data, completeness is expressed as the proportion
of CCGs of residence which show a cancer incidence within the normally
expected range (see Table 3 below). For other datasets except CWT,
completeness is computed as the number of data providers who have
supplied data, as a percentage of those who are expected to do so.
Data completeness within the Cancer Waiting Times dataset varies at
patient level with event type. Figures for the Treatment Start Date and
Treatment Period Start Date are given below. Completeness of other CWT
events can be estimated by inspecting Figure 13 (events 1-4).
Table 2: Rapid registration and dataset usability/completeness in cas2603
| Data source | Latest usable | Jan 2025 | Feb 2025 | Mar 2025 | Apr 2025 | May 2025 | Jun 2025 | Jul 2025 | Aug 2025 | Sep 2025 | Oct 2025 | Nov 2025 | Dec 2025 |
| Rapid Tumours (COSD) | December 2025 | 97% | 97% | 97% | 98% | 97% | 99% | 99% | 96% | 99% | 95% | 94% | 94% |
| HES | October 2025 | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | 95% | | |
| SACT | May 2025 | 96% | 98% | 97% | 93% | 92% | | | | | | | |
| RTDS | July 2025 | Complete | Complete | 98% | 96% | 94% | 94% | 94% | | | | | |
| CWT (TSD) | December 2025 | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete |
| CWT (TPSD) | November 2025 | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | Complete | |
Note: COSD = Cancer Outcomes and Services Dataset; TSD = Treatment Start Date; TPSD = Treatment Period Start Date
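The relationship between the monthly completeness figures and the “latest usable” column can be illustrated with a short sketch. The rule used here (the last month with at least 90% completeness) is an assumption inferred from the 90% ‘hard limit’ quoted in the text; the document does not spell out the derivation. The function name is hypothetical and the figures are the SACT row of Table 2.

```python
USABLE_THRESHOLD = 90  # percent; the 'hard limit' quoted in the text


def latest_usable(completeness_by_month):
    """Return the last month whose completeness meets the threshold.

    `completeness_by_month` is a chronological list of
    (month label, percent or None); None marks a blank table cell.
    """
    latest = None
    for month, percent in completeness_by_month:
        if percent is not None and percent >= USABLE_THRESHOLD:
            latest = month
    return latest


# SACT row of Table 2; June 2025 onwards is blank in the table.
sact = [("January 2025", 96), ("February 2025", 98), ("March 2025", 97),
        ("April 2025", 93), ("May 2025", 92), ("June 2025", None)]

print(latest_usable(sact))  # May 2025
```

Cells shown as “Complete” could be passed as 100 under the same assumption.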
Table 3a: Number of outlier CCGs in COSD dataset in cas2603
The table below shows the number of CCGs (using the April 2020
boundaries) which have 3-sigma outlier counts per month (either high or
low) compared to the expectation of the fraction of the total number of
new cancer registrations in England. This can be used to judge to what
extent there is large-scale missing data in COSD (and therefore in the
Rapid Registrations) in any particular month.
| Year and month | Outlier: High | Outlier: Low | In expected range | Total received | Prop. |
| 2024-01 | 1 | 3 | 131 | 135 | 0.9703704 |
| 2024-02 | 0 | 0 | 135 | 135 | 1.0000000 |
| 2024-03 | 0 | 1 | 134 | 135 | 0.9925926 |
| 2024-04 | 0 | 0 | 135 | 135 | 1.0000000 |
| 2024-05 | 0 | 0 | 135 | 135 | 1.0000000 |
| 2024-06 | 0 | 1 | 134 | 135 | 0.9925926 |
| 2024-07 | 1 | 2 | 132 | 135 | 0.9777778 |
| 2024-08 | 1 | 0 | 134 | 135 | 0.9925926 |
| 2024-09 | 1 | 3 | 131 | 135 | 0.9703704 |
| 2024-10 | 1 | 1 | 133 | 135 | 0.9851852 |
| 2024-11 | 1 | 3 | 131 | 135 | 0.9703704 |
| 2024-12 | 1 | 4 | 130 | 135 | 0.9629630 |
| 2025-01 | 1 | 3 | 131 | 135 | 0.9703704 |
| 2025-02 | 1 | 3 | 131 | 135 | 0.9703704 |
| 2025-03 | 1 | 3 | 131 | 135 | 0.9703704 |
| 2025-04 | 2 | 1 | 132 | 135 | 0.9777778 |
| 2025-05 | 1 | 3 | 131 | 135 | 0.9703704 |
| 2025-06 | 0 | 2 | 133 | 135 | 0.9851852 |
| 2025-07 | 0 | 1 | 134 | 135 | 0.9925926 |
| 2025-08 | 2 | 3 | 130 | 135 | 0.9629630 |
| 2025-09 | 0 | 1 | 134 | 135 | 0.9925926 |
| 2025-10 | 1 | 6 | 128 | 135 | 0.9481481 |
| 2025-11 | 1 | 7 | 127 | 135 | 0.9407407 |
| 2025-12 | 1 | 7 | 127 | 135 | 0.9407407 |
| 2026-01 | 9 | 15 | 111 | 135 | 0.8222222 |
| 2026-02 | 31 | 61 | 42 | 134 | 0.3134328 |
| 2026-03 | 7 | 0 | NA | 7 | NA |
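The “Prop.” column above is consistent with the number of CCGs in the expected range divided by the total received, as the following minimal sketch shows. The function name is hypothetical; a missing in-range count (shown as NA in the table) is passed as `None`.

```python
def in_range_proportion(high, low, in_range):
    """Proportion of CCGs within the expected range for one month.

    Returns None when the in-range count is not available (NA).
    """
    if in_range is None:
        return None
    total_received = high + low + in_range
    return in_range / total_received


# Rows 2024-01 and 2026-02 of the table above.
print(round(in_range_proportion(1, 3, 131), 7))   # 0.9703704
print(round(in_range_proportion(31, 61, 42), 7))  # 0.3134328
```

The sharp fall in the most recent months reflects CCGs whose COSD submissions have not yet fully arrived, which is why those months fall outside the usable range.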
Staging data in the Rapid Registrations dataset
TNM stage group 1-4
The size and extent of a cancer is commonly described using the ‘TNM’
system for “Tumour”, “Node”, and “Metastases”. This is often abbreviated
to a number from 1 (typically a localised tumour with limited spread) to
4 (typically a tumour that has invaded or spread to distant organs). The
stage at diagnosis is very strongly associated with patient outcomes.
In the current version of the Rapid Registrations dataset partial
staging data is provided for a number of different cancer sites (ICD-10
codes can be found in the labels for Tables 5a-n). This has been
benchmarked against the gold standard cancer registry data for cas2603.
Table 4 shows the count and proportion of cases by TNM stage group for
both the Rapid Registrations and the Gold Standard Registrations, for
calendar year 2018. For example, 32% of breast cancers are TNM stage
group 1 in the Rapid Registrations, but 37% in the Gold Standard
Registrations. Compared to the Gold Standard Registrations in 2018, the
Rapid Registrations under-report breast cancers diagnosed at stages 1 or
2, colorectal cancers diagnosed at stage 4, and prostate cancers
diagnosed at stages 1 and 4. In all three tumour groups, there are more
tumours allocated to the unknown or unstageable category. Lung cancers
in the RCRD most accurately match the Gold Standard Registrations and
exhibit a broadly similar stage profile in both datasets.
Table 4: Summary proportions of stage at diagnosis for the Rapid
Registrations and Gold Standard Registrations
| Broad Cancer Group | Stage Group | Count (Rapid) | Percentage (Rapid) | Count (Gold Standard) | Percentage (Gold Standard) |
| Bladder | 1 | 2329 | 24.2% | 2862 | 29.7% |
| Bladder | 2 | 1801 | 18.7% | 1877 | 19.5% |
| Bladder | 3 | 567 | 5.9% | 882 | 9.2% |
| Bladder | 4 | 265 | 2.8% | 653 | 6.8% |
| Bladder | U | 4673 | 48.5% | 3361 | 34.9% |
| Breast | 1 | 14307 | 32.2% | 16566 | 37.3% |
| Breast | 2 | 13321 | 30.0% | 16775 | 37.8% |
| Breast | 3 | 3282 | 7.4% | 3712 | 8.4% |
| Breast | 4 | 1283 | 2.9% | 1977 | 4.5% |
| Breast | U | 12225 | 27.5% | 5388 | 12.1% |
| Cervical | 1 | 1212 | 46.5% | 833 | 31.9% |
| Cervical | 2 | 430 | 16.5% | 400 | 15.3% |
| Cervical | 3 | 173 | 6.6% | 191 | 7.3% |
| Cervical | 4 | 262 | 10.0% | 235 | 9.0% |
| Cervical | U | 532 | 20.4% | 950 | 36.4% |
| Colorectum | 1 | 4908 | 14.9% | 5481 | 16.7% |
| Colorectum | 2 | 7045 | 21.5% | 7716 | 23.5% |
| Colorectum | 3 | 8296 | 25.3% | 9308 | 28.3% |
| Colorectum | 4 | 5179 | 15.8% | 7457 | 22.7% |
| Colorectum | U | 7414 | 22.6% | 2880 | 8.8% |
| Kidney | 1 | 2385 | 29.0% | 3330 | 40.5% |
| Kidney | 2 | 448 | 5.4% | 554 | 6.7% |
| Kidney | 3 | 1350 | 16.4% | 1643 | 20.0% |
| Kidney | 4 | 706 | 8.6% | 1577 | 19.2% |
| Kidney | U | 3342 | 40.6% | 1127 | 13.7% |
| Lung | 1 | 6190 | 17.1% | 6611 | 18.3% |
| Lung | 2 | 2583 | 7.2% | 2683 | 7.4% |
| Lung | 3 | 7321 | 20.3% | 7606 | 21.1% |
| Lung | 4 | 14984 | 41.5% | 17166 | 47.5% |
| Lung | U | 5023 | 13.9% | 2035 | 5.6% |
| Lymphoma | 1 | 921 | 7.5% | 1753 | 14.3% |
| Lymphoma | 2 | 955 | 7.8% | 1613 | 13.2% |
| Lymphoma | 3 | 1211 | 9.9% | 1986 | 16.2% |
| Lymphoma | 4 | 2689 | 22.0% | 4919 | 40.2% |
| Lymphoma | U | 6458 | 52.8% | 1963 | 16.0% |
| Melanoma | 1 | 6345 | 48.2% | 8224 | 62.5% |
| Melanoma | 2 | 2383 | 18.1% | 2641 | 20.1% |
| Melanoma | 3 | 482 | 3.7% | 1038 | 7.9% |
| Melanoma | 4 | 223 | 1.7% | 350 | 2.7% |
| Melanoma | U | 3733 | 28.4% | 913 | 6.9% |
| Oesophagus | 1 | 298 | 3.6% | 445 | 5.4% |
| Oesophagus | 2 | 1489 | 18.1% | 954 | 11.6% |
| Oesophagus | 3 | 1763 | 21.4% | 2119 | 25.7% |
| Oesophagus | 4 | 2508 | 30.4% | 3193 | 38.7% |
| Oesophagus | U | 2189 | 26.5% | 1536 | 18.6% |
| Ovary | 1 | 1189 | 23.6% | 1417 | 28.1% |
| Ovary | 2 | 263 | 5.2% | 282 | 5.6% |
| Ovary | 3 | 1329 | 26.3% | 1617 | 32.0% |
| Ovary | 4 | 791 | 15.7% | 1060 | 21.0% |
| Ovary | U | 1474 | 29.2% | 670 | 13.3% |
| Pancreas | 1 | 358 | 4.5% | 662 | 8.3% |
| Pancreas | 2 | 629 | 7.8% | 804 | 10.0% |
| Pancreas | 3 | 752 | 9.4% | 1040 | 13.0% |
| Pancreas | 4 | 2058 | 25.7% | 4100 | 51.2% |
| Pancreas | U | 4218 | 52.6% | 1409 | 17.6% |
| Prostate | 1 | 11695 | 25.3% | 16196 | 35.0% |
| Prostate | 2 | 5584 | 12.1% | 6518 | 14.1% |
| Prostate | 3 | 10446 | 22.6% | 11615 | 25.1% |
| Prostate | 4 | 5709 | 12.3% | 8084 | 17.5% |
| Prostate | U | 12879 | 27.8% | 3900 | 8.4% |
| Stomach | 1 | 327 | 8.3% | 340 | 8.6% |
| Stomach | 2 | 388 | 9.8% | 466 | 11.8% |
| Stomach | 3 | 646 | 16.4% | 712 | 18.0% |
| Stomach | 4 | 1147 | 29.0% | 1665 | 42.2% |
| Stomach | U | 1441 | 36.5% | 766 | 19.4% |
| Uterus | 1 | 4657 | 58.7% | 5335 | 67.3% |
| Uterus | 2 | 512 | 6.5% | 537 | 6.8% |
| Uterus | 3 | 734 | 9.3% | 817 | 10.3% |
| Uterus | 4 | 518 | 6.5% | 552 | 7.0% |
| Uterus | U | 1510 | 19.0% | 690 | 8.7% |
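The percentage columns in Table 4 are each stage count divided by the total for that cancer group and source. The following minimal sketch reproduces the Rapid Registrations breast figures from the counts in the table; the function name is hypothetical.

```python
def stage_percentages(counts):
    """Convert stage counts for one cancer group into percentages (1 d.p.)."""
    total = sum(counts.values())
    return {stage: round(100 * n / total, 1) for stage, n in counts.items()}


# Breast, Count (Rapid) column of Table 4.
breast_rapid = {"1": 14307, "2": 13321, "3": 3282, "4": 1283, "U": 12225}

print(stage_percentages(breast_rapid))
# {'1': 32.2, '2': 30.0, '3': 7.4, '4': 2.9, 'U': 27.5}
```

Because the unknown or unstageable group is part of the denominator, a larger “U” share in the Rapid Registrations lowers every staged percentage even where the staged counts are similar.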
In Tables 5a-n below, the distribution of the stage allocations
between the Rapid Registrations and the Gold Standard Registrations is
examined.
The figures indicate the proportion of agreement at the 1-digit TNM
stage group level, where the stage is known in the Rapid Registrations
dataset. Stages 1-4 in the Rapid Registrations dataset agree with the
gold standard stage variable for a high proportion of tumours.
For example, when examining the subset of Rapid Registrations breast
tumours that are identified as TNM stage 1 (32%), approximately 89% of
these are found to be TNM stage group 1 in the gold standard
registration data, with another 11% distributed across TNM stages 2-4
and the unknown or unstageable groups.
For many but not all sites and stages (late stage breast cancer is one
exception), roughly 85% or more of staged cases in the Rapid
Registrations table have the same stage grouping as the equivalent
tumour in the standard registration data - this can be seen in the
tables below by inspecting the figures where the stage groups for the
Rapid Registrations and Gold Standard Registrations are the same.
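Where the stage is labelled as unknown or unstageable in the rapid
pathway dataset, it is often known in the gold standard data: in Tables
5a-n the share of these cases with a known gold standard stage ranges
from about 37% (bladder) to about 84% (melanoma).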
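The stage comparison tables that follow are column percentages of a cross-tabulation: for each Rapid Registrations stage group, the share of matched tumours falling in each Gold Standard stage group. The sketch below shows that calculation on invented pairs; the function name and the example data are hypothetical and carry no clinical meaning.

```python
from collections import Counter, defaultdict


def column_percentages(pairs):
    """For each rapid stage, give the % of tumours in each gold stage.

    `pairs` is an iterable of (rapid_stage, gold_stage) for matched tumours.
    """
    by_rapid = defaultdict(Counter)
    for rapid_stage, gold_stage in pairs:
        by_rapid[rapid_stage][gold_stage] += 1
    return {
        rapid_stage: {gold: 100 * n / sum(gold_counts.values())
                      for gold, n in gold_counts.items()}
        for rapid_stage, gold_counts in by_rapid.items()
    }


# Invented example: ten tumours staged 1 and four unstaged in the rapid data.
pairs = ([("1", "1")] * 8 + [("1", "2")] * 2
         + [("U", "1")] * 3 + [("U", "U")] * 1)

print(column_percentages(pairs))
# {'1': {'1': 80.0, '2': 20.0}, 'U': {'1': 75.0, 'U': 25.0}}
```

Each column sums to 100%, so the entry where the two stage groups agree is the agreement rate for that Rapid Registrations stage.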
Tables 5a-n: Stage comparison between Rapid Registrations and Gold
Standard Registrations by cancer site
Stage comparison between RCRD and NCRD
a. bladder (ICD-10 C67)
| NCRD stage | RCRD 1 | RCRD 2 | RCRD 3 | RCRD 4 | RCRD Unknown |
| 1 | 84.4% | 4.2% | 7.8% | 5.7% | 16.3% |
| 2 | 3.8% | 71.5% | 15.3% | 6.4% | 8.5% |
| 3 | 2.6% | 10.8% | 64.0% | 4.9% | 5.4% |
| 4 | 1.2% | 5.0% | 5.3% | 77.0% | 6.4% |
| U | 7.9% | 8.6% | 7.6% | 6.0% | 63.4% |
Stage comparison between RCRD and NCRD
b. breast (ICD-10 C50)
| NCRD stage | RCRD 1 | RCRD 2 | RCRD 3 | RCRD 4 | RCRD Unknown |
| 1 | 88.5% | 5.1% | 1.8% | 4.3% | 25.5% |
| 2 | 6.7% | 88.0% | 11.5% | 16.3% | 28.7% |
| 3 | 0.6% | 2.7% | 78.9% | 5.7% | 5.0% |
| 4 | 0.2% | 0.9% | 2.9% | 67.7% | 7.1% |
| U | 4.0% | 3.4% | 4.9% | 6.0% | 33.7% |
Stage comparison between RCRD and NCRD
c. colorectum (ICD-10 C18-C20)
| NCRD stage | RCRD 1 | RCRD 2 | RCRD 3 | RCRD 4 | RCRD Unknown |
| 1 | 84.8% | 2.2% | 1.9% | 0.8% | 13.1% |
| 2 | 5.7% | 85.4% | 5.5% | 1.4% | 12.0% |
| 3 | 6.6% | 7.3% | 84.9% | 4.5% | 16.1% |
| 4 | 0.9% | 2.9% | 5.7% | 92.0% | 26.7% |
| U | 2.0% | 2.3% | 2.1% | 1.4% | 32.1% |
Stage comparison between RCRD and NCRD
d. kidney (ICD-10 C64)
| NCRD stage | RCRD 1 | RCRD 2 | RCRD 3 | RCRD 4 | RCRD Unknown |
| 1 | 90.9% | 6.7% | 3.3% | 2.1% | 32.1% |
| 2 | 0.5% | 77.0% | 1.0% | 0.8% | 5.3% |
| 3 | 1.8% | 6.7% | 85.9% | 4.2% | 11.4% |
| 4 | 0.5% | 3.3% | 5.9% | 90.7% | 24.9% |
| U | 6.3% | 6.2% | 4.0% | 2.1% | 26.3% |
Stage comparison between RCRD and NCRD
e. lung (ICD-10 C33-C34)
| NCRD stage | RCRD 1 | RCRD 2 | RCRD 3 | RCRD 4 | RCRD Unknown |
| 1 | 92.7% | 6.9% | 1.4% | 0.6% | 10.1% |
| 2 | 2.8% | 83.3% | 1.8% | 0.4% | 3.1% |
| 3 | 1.9% | 5.1% | 90.0% | 1.3% | 11.4% |
| 4 | 1.4% | 3.2% | 5.5% | 97.1% | 40.8% |
| U | 1.2% | 1.5% | 1.3% | 0.6% | 34.6% |
Stage comparison between RCRD and NCRD
f. melanoma (ICD-10 C43)
| NCRD stage | RCRD 1 | RCRD 2 | RCRD 3 | RCRD 4 | RCRD Unknown |
| 1 | 93.6% | 2.8% | 8.3% | 14.3% | 57.4% |
| 2 | 2.3% | 78.0% | 10.8% | 18.8% | 14.5% |
| 3 | 2.0% | 11.6% | 73.4% | 14.8% | 6.6% |
| 4 | 0.2% | 1.6% | 2.5% | 40.8% | 5.3% |
| U | 1.9% | 6.0% | 5.0% | 11.2% | 16.2% |
Stage comparison between RCRD and NCRD
g. oesophagus (ICD-10 C15)
| NCRD stage | RCRD 1 | RCRD 2 | RCRD 3 | RCRD 4 | RCRD Unknown |
| 1 | 79.9% | 5.0% | 0.5% | 0.2% | 5.4% |
| 2 | 7.4% | 49.7% | 3.4% | 1.0% | 4.9% |
| 3 | 2.3% | 34.7% | 68.5% | 6.2% | 10.6% |
| 4 | 1.0% | 5.2% | 21.7% | 83.1% | 29.5% |
| U | 9.4% | 5.5% | 5.9% | 9.5% | 49.5% |
Stage comparison between RCRD and NCRD
h. ovary (ICD-10 C56-C57)
| NCRD stage | RCRD 1 | RCRD 2 | RCRD 3 | RCRD 4 | RCRD Unknown |
| 1 | 96.1% | 7.2% | 0.9% | 0.3% | 16.4% |
| 2 | 0.5% | 88.6% | 0.5% | 0.1% | 2.4% |
| 3 | 0.9% | 1.9% | 90.9% | 10.5% | 21.0% |
| 4 | 0.5% | 0.4% | 4.4% | 84.1% | 22.3% |
| U | 1.9% | 1.9% | 3.2% | 5.1% | 37.9% |
Stage comparison between RCRD and NCRD
i. prostate (ICD-10 C61)
| NCRD stage | RCRD 1 | RCRD 2 | RCRD 3 | RCRD 4 | RCRD Unknown |
| 1 | 85.8% | 10.5% | 4.7% | 1.6% | 38.8% |
| 2 | 6.8% | 81.5% | 2.7% | 0.9% | 6.6% |
| 3 | 4.3% | 4.3% | 85.7% | 2.9% | 13.7% |
| 4 | 0.8% | 0.8% | 3.9% | 92.2% | 17.6% |
| U | 2.3% | 2.8% | 3.1% | 2.5% | 23.3% |
Stage comparison between RCRD and NCRD
j. stomach (ICD-10 C16)
| NCRD stage | RCRD 1 | RCRD 2 | RCRD 3 | RCRD 4 | RCRD Unknown |
| 1 | 67.3% | 4.6% | 0.8% | 0.1% | 6.7% |
| 2 | 18.7% | 64.7% | 9.9% | 0.8% | 5.6% |
| 3 | 5.8% | 19.3% | 68.6% | 3.2% | 9.6% |
| 4 | 2.1% | 6.7% | 16.7% | 93.4% | 31.4% |
| U | 6.1% | 4.6% | 4.0% | 2.5% | 46.7% |
Stage comparison between RCRD and NCRD
k. uterus (ICD-10 C54-C55)
| NCRD stage | RCRD 1 | RCRD 2 | RCRD 3 | RCRD 4 | RCRD Unknown |
| 1 | 96.7% | 10.5% | 5.7% | 7.1% | 46.2% |
| 2 | 0.7% | 83.0% | 1.2% | 2.5% | 3.8% |
| 3 | 0.5% | 2.1% | 86.8% | 7.3% | 7.1% |
| 4 | 0.2% | 1.8% | 2.5% | 75.1% | 8.4% |
| U | 1.9% | 2.5% | 3.8% | 7.9% | 34.5% |
Stage comparison between RCRD and NCRD
l. pancreas (ICD-10 C25)
| NCRD stage | RCRD 1 | RCRD 2 | RCRD 3 | RCRD 4 | RCRD Unknown |
| 1 | 72.6% | 3.5% | 0.9% | 0.4% | 8.7% |
| 2 | 15.4% | 74.1% | 2.4% | 0.5% | 6.0% |
| 3 | 4.7% | 12.2% | 88.2% | 0.6% | 6.4% |
| 4 | 3.4% | 5.6% | 6.0% | 97.1% | 47.6% |
| U | 3.9% | 4.6% | 2.5% | 1.3% | 31.3% |
Stage comparison between RCRD and NCRD
m. lymphoma, staged (ICD-10 C81-C86, C88)
| NCRD stage | RCRD 1 | RCRD 2 | RCRD 3 | RCRD 4 | RCRD Unknown |
| 1 | 89.3% | 1.4% | 0.7% | 0.6% | 13.9% |
| 2 | 0.9% | 92.3% | 1.4% | 0.5% | 10.7% |
| 3 | 0.8% | 1.4% | 88.4% | 1.7% | 13.1% |
| 4 | 6.4% | 2.7% | 7.2% | 91.8% | 35.3% |
| U | 2.7% | 2.3% | 2.3% | 5.4% | 27.0% |
Stage comparison between RCRD and NCRD
n. cervical (ICD-10 C53)
| NCRD stage | RCRD 1 | RCRD 2 | RCRD 3 | RCRD 4 | RCRD Unknown |
| 1 | 58.3% | 3.5% | 3.5% | 3.8% | 17.9% |
| 2 | 1.2% | 75.3% | 2.3% | 5.0% | 8.3% |
| 3 | 1.1% | 3.3% | 73.4% | 1.9% | 6.0% |
| 4 | 0.4% | 1.4% | 2.9% | 67.6% | 7.9% |
| U | 38.9% | 16.5% | 17.9% | 21.8% | 60.0% |
“Early” vs “Late” stage
In Table 6 below we repeat the above tabulations but now group
Rapid and Gold Standard cancers into “Early” (TNM stage group 1 & 2)
or “Late” (TNM stage group 3 & 4) categories. We see that 62% of
breast cancers are identified as “Early” stage in the Rapid
Registrations dataset compared to 75% in the Gold Standard Registration
data, due to the higher proportion of “Unknown” stage tumours (28% vs 12%
respectively).
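The grouping into “Early” and “Late” is a simple re-aggregation of the stage group counts. The sketch below applies it to the Rapid Registrations breast counts from Table 4 and reproduces the corresponding counts in Table 6; the function and mapping names are hypothetical.

```python
from collections import Counter

# Stage groups 1 and 2 are "Early", 3 and 4 are "Late", U stays "Unknown".
STAGE_TO_CATEGORY = {"1": "Early", "2": "Early",
                     "3": "Late", "4": "Late",
                     "U": "Unknown"}


def group_early_late(counts):
    """Aggregate stage-group counts into Early / Late / Unknown."""
    grouped = Counter()
    for stage, n in counts.items():
        grouped[STAGE_TO_CATEGORY[stage]] += n
    return dict(grouped)


# Breast, Count (Rapid) column of Table 4.
breast_rapid = {"1": 14307, "2": 13321, "3": 3282, "4": 1283, "U": 12225}

print(group_early_late(breast_rapid))
# {'Early': 27628, 'Late': 4565, 'Unknown': 12225}
```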
As with the more detailed stage data, there is a high degree of
concordance between the gold standard and rapid registration stage
fields if a known stage can be identified.
Table 6: Summary proportions of “Early” vs “Late” stage for Rapid
Registrations and Gold Standard Registrations
| Broad Cancer Group | Stage Group | Count (Rapid) | Percentage (Rapid) | Count (Gold Standard) | Percentage (Gold Standard) |
| Bladder | Early | 4130 | 42.9% | 4739 | 49.2% |
| Bladder | Late | 832 | 8.6% | 1535 | 15.9% |
| Bladder | Unknown | 4673 | 48.5% | 3361 | 34.9% |
| Breast | Early | 27628 | 62.2% | 33341 | 75.1% |
| Breast | Late | 4565 | 10.3% | 5689 | 12.8% |
| Breast | Unknown | 12225 | 27.5% | 5388 | 12.1% |
| Cervical | Early | 1642 | 62.9% | 1233 | 47.3% |
| Cervical | Late | 435 | 16.7% | 426 | 16.3% |
| Cervical | Unknown | 532 | 20.4% | 950 | 36.4% |
| Colorectum | Early | 11953 | 36.4% | 13197 | 40.2% |
| Colorectum | Late | 13475 | 41.0% | 16765 | 51.0% |
| Colorectum | Unknown | 7414 | 22.6% | 2880 | 8.8% |
| Kidney | Early | 2833 | 34.4% | 3884 | 47.2% |
| Kidney | Late | 2056 | 25.0% | 3220 | 39.1% |
| Kidney | Unknown | 3342 | 40.6% | 1127 | 13.7% |
| Lung | Early | 8773 | 24.3% | 9294 | 25.7% |
| Lung | Late | 22305 | 61.8% | 24772 | 68.6% |
| Lung | Unknown | 5023 | 13.9% | 2035 | 5.6% |
| Lymphoma | Early | 1876 | 15.3% | 3366 | 27.5% |
| Lymphoma | Late | 3900 | 31.9% | 6905 | 56.4% |
| Lymphoma | Unknown | 6458 | 52.8% | 1963 | 16.0% |
| Melanoma | Early | 8728 | 66.3% | 10865 | 82.5% |
| Melanoma | Late | 705 | 5.4% | 1388 | 10.5% |
| Melanoma | Unknown | 3733 | 28.4% | 913 | 6.9% |
| Oesophagus | Early | 1787 | 21.7% | 1399 | 17.0% |
| Oesophagus | Late | 4271 | 51.8% | 5312 | 64.4% |
| Oesophagus | Unknown | 2189 | 26.5% | 1536 | 18.6% |
| Ovary | Early | 1452 | 28.8% | 1699 | 33.7% |
| Ovary | Late | 2120 | 42.0% | 2677 | 53.1% |
| Ovary | Unknown | 1474 | 29.2% | 670 | 13.3% |
| Pancreas | Early | 987 | 12.3% | 1466 | 18.3% |
| Pancreas | Late | 2810 | 35.1% | 5140 | 64.1% |
| Pancreas | Unknown | 4218 | 52.6% | 1409 | 17.6% |
| Prostate | Early | 17279 | 37.3% | 22714 | 49.0% |
| Prostate | Late | 16155 | 34.9% | 19699 | 42.5% |
| Prostate | Unknown | 12879 | 27.8% | 3900 | 8.4% |
| Stomach | Early | 715 | 18.1% | 806 | 20.4% |
| Stomach | Late | 1793 | 45.4% | 2377 | 60.2% |
| Stomach | Unknown | 1441 | 36.5% | 766 | 19.4% |
| Uterus | Early | 5169 | 65.2% | 5872 | 74.0% |
| Uterus | Late | 1252 | 15.8% | 1369 | 17.3% |
| Uterus | Unknown | 1510 | 19.0% | 690 | 8.7% |
In Tables 7a-n below, the distribution of the stage allocation between
the Rapid Registrations and the Gold Standard Registrations is
examined, aggregated into Early and Late stage.
Tables 7a-n: “Early” vs “late” stage comparison between Rapid
Registrations and Gold Standard Registrations
Early vs late stage comparison between RCRD and NCRD
a. bladder (ICD-10 C67)
| NCRD stage | RCRD Early | RCRD Late | RCRD Unknown |
| Early | 82.7% | 19.6% | 24.8% |
| Late | 9.1% | 73.3% | 11.8% |
| Unknown | 8.2% | 7.1% | 63.4% |
Early vs late stage comparison between RCRD and NCRD
b. breast (ICD-10 C50)
| NCRD stage | RCRD Early | RCRD Late | RCRD Unknown |
| Early | 94.2% | 15.4% | 54.2% |
| Late | 2.1% | 79.4% | 12.1% |
| Unknown | 3.7% | 5.2% | 33.7% |
Early vs late stage comparison between RCRD and NCRD
c. colorectum (ICD-10 C18-C20)
| NCRD stage | RCRD Early | RCRD Late | RCRD Unknown |
| Early | 88.8% | 5.4% | 25.0% |
| Late | 9.1% | 92.8% | 42.9% |
| Unknown | 2.1% | 1.8% | 32.1% |
Early vs late stage comparison between RCRD and NCRD
d. kidney (ICD-10 C64)
| NCRD stage | RCRD Early | RCRD Late | RCRD Unknown |
| Early | 90.2% | 3.8% | 37.4% |
| Late | 3.5% | 92.8% | 36.3% |
| Unknown | 6.3% | 3.4% | 26.3% |
Early vs late stage comparison between RCRD and NCRD
e. lung (ICD-10 C33-C34)
| NCRD stage | RCRD Early | RCRD Late | RCRD Unknown |
| Early | 94.0% | 1.7% | 13.2% |
| Late | 4.8% | 97.4% | 52.2% |
| Unknown | 1.3% | 0.8% | 34.6% |
Early vs late stage comparison between RCRD and NCRD
f. melanoma (ICD-10 C43)
| NCRD stage | RCRD Early | RCRD Late | RCRD Unknown |
| Early | 91.8% | 23.5% | 71.9% |
| Late | 5.2% | 69.5% | 11.9% |
| Unknown | 3.0% | 7.0% | 16.2% |
Early vs late stage comparison between RCRD and NCRD
g. oesophagus (ICD-10 C15)
| NCRD stage | RCRD Early | RCRD Late | RCRD Unknown |
| Early | 60.1% | 2.3% | 10.3% |
| Late | 33.7% | 89.7% | 40.2% |
| Unknown | 6.2% | 8.0% | 49.5% |
Early vs late stage comparison between RCRD and NCRD
h. ovary (ICD-10 C56-C57)
| NCRD stage | RCRD Early | RCRD Late | RCRD Unknown |
| Early | 96.5% | 1.0% | 18.7% |
| Late | 1.6% | 95.0% | 43.4% |
| Unknown | 1.9% | 3.9% | 37.9% |
Early vs late stage comparison between RCRD and NCRD
i. prostate (ICD-10 C61)
| NCRD stage | RCRD Early | RCRD Late | RCRD Unknown |
| Early | 92.4% | 5.6% | 45.4% |
| Late | 5.1% | 91.5% | 31.3% |
| Unknown | 2.5% | 2.9% | 23.3% |
Early vs late stage comparison between RCRD and NCRD: j. stomach (ICD-10 C16)
| Stage Category | RCRD |
| NCRD | Early | Late | Unknown |
| Early | 76.9% | 4.4% | 12.3% |
| Late | 17.8% | 92.5% | 41.0% |
| Unknown | 5.3% | 3.1% | 46.7% |
Early vs late stage comparison between RCRD and NCRD: k. uterus (ICD-10 C54-C55)
| Stage Category | RCRD |
| NCRD | Early | Late | Unknown |
| Early | 97.0% | 8.1% | 50.0% |
| Late | 1.0% | 86.4% | 15.5% |
| Unknown | 1.9% | 5.5% | 34.5% |
Early vs late stage comparison between RCRD and NCRD: l. pancreas (ICD-10 C25)
| Stage Category | RCRD |
| NCRD | Early | Late | Unknown |
| Early | 81.4% | 1.6% | 14.7% |
| Late | 14.3% | 96.8% | 54.0% |
| Unknown | 4.4% | 1.6% | 31.3% |
Early vs late stage comparison between RCRD and NCRD: m. lymphoma (ICD-10 C81-C86, C88)
| Stage Category | RCRD |
| NCRD | Early | Late | Unknown |
| Early | 91.9% | 1.4% | 24.6% |
| Late | 5.6% | 94.2% | 48.4% |
| Unknown | 2.5% | 4.5% | 27.0% |
Early vs late stage comparison between RCRD and NCRD: n. cervical (ICD-10 C53)
| Stage Category | RCRD |
| NCRD | Early | Late | Unknown |
| Early | 64.6% | 7.6% | 26.1% |
| Late | 2.3% | 72.2% | 13.9% |
| Unknown | 33.1% | 20.2% | 60.0% |
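The stage comparison tables above appear to be column-normalised cross-tabulations: within each table, the percentages for a given RCRD stage sum to 100% (to within rounding) across the NCRD stages. A minimal sketch of how such a table could be derived from paired stage labels is given below. The function name and the toy input are illustrative assumptions, not the NCRAS implementation.

```python
from collections import Counter

CATEGORIES = ("Early", "Late", "Unknown")


def column_percentages(pairs):
    """Cross-tabulate (rcrd_stage, ncrd_stage) pairs.

    Returns {rcrd_stage: {ncrd_stage: percent}}, where the percentages for
    each RCRD stage sum to 100 across the NCRD stages (or are all 0.0 when
    no tumour has that RCRD stage).
    """
    counts = Counter(pairs)
    table = {}
    for rcrd in CATEGORIES:
        column_total = sum(counts[(rcrd, ncrd)] for ncrd in CATEGORIES)
        table[rcrd] = {
            ncrd: (100.0 * counts[(rcrd, ncrd)] / column_total) if column_total else 0.0
            for ncrd in CATEGORIES
        }
    return table


# Toy data (invented for illustration): 10 tumours staged Early in the RCRD.
toy_pairs = [("Early", "Early")] * 8 + [("Early", "Late")] + [("Early", "Unknown")]
result = column_percentages(toy_pairs)
```

Read this way, the top-left cell of each table is the share of tumours staged Early in the RCRD that are also staged Early in the NCRD.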
Stage trends over time
Figure 15 shows the monthly variation of the incidence count by stage
at diagnosis for a number of common cancers. Allowing for variation in
the number of working days in each month (which affects the overall
number of tumours diagnosed per month) and for statistical fluctuation,
there is little evidence of any stage shift in the period displayed. The
feature around May 2018 in the prostate cancer trends can be ascribed to
the so-called ‘Turnbull-Fry effect’.
Stage completeness by snapshot
Figure 16 shows the completeness of stage by tumour type for one
snapshot per quarter. Stage completeness continues to increase and lags
behind the incidence completeness due to staging activity happening up
to several months after diagnosis.
Data completeness, source, and mortality comparisons
Counts of missing data
Figure 17 shows the count of tumours per month where the indicated
data item is missing. The data items are: basis of diagnosis, birth date
best, ethnic category, NHS number, postcode, quintile 2019, gender and
trust code. Larger counts in the most recent months are to be
expected.
Ethnicity completeness
Figure 18 shows the count of tumours per month where the indicated
data item is missing. Larger counts in the most recent months are to be
expected.
Tumour source
Figure 19 shows the number of tumours created by month, broken down by
the source of the diagnosis (that is, which dataset was used to create them).
Mortality proportion by month
Figure 20 shows, by month, the proportion of patients who died within
30, 90 and 182 days in the RCRD compared to the NCRD, for all cancers
included in the RCRD excluding C44 and D06.
Appendix 1 - List of pathway events
Table A1: AT_RAPID_PATHWAY: event list
| EVENT_TYPE | EVENT_DESC | EVENT_PROPERTY_1 | EVENT_PROPERTY_2 | EVENT_PROPERTY_3 | EVENT_DATE | Linkage |
| 1 | CWT Treatment Period Start Date | CWT First Treatment Flag | CWT SITE_ICD10 | CWT Cancer Treatment Event Type | Treat period start | NHSNUMBER |
| 2 | CWT Treatment Start | CWT Treatment Modality | CWT Cancer Treatment Event type | | Treatment start date | NHSNUMBER |
| 3 | CWT MDT Begin | CWT MDT Cancer Care Plan discussed indicator | | | MDT date | NHSNUMBER |
| 4 | CWT Faster Diagnosis Period End | (null) | Faster Diagnosis Period site | | Faster Diagnosis Period end date | NHSNUMBER |
| 5 | HES Admitted Patient Care Episode | Treatment speciality : Admission Method : HES ethnicity | All ICD-10 codes (for episode) | All OPCS-4 codes (for episode) | Episode Start date - Episode end date | NHSNUMBER |
| 6 | HES Admitted Patient Care Operation | OPCS codes (for date) in POS order | ICD-10 codes (for episode) | | Operation date | NHSNUMBER |
| 7 | SACT Cycle | Benchmark group | Cycle number | Treatment intent | Cycle start date | PATIENTID |
| 8 | RTDS Episode | Radiotherapy intent | ICD-10 diagnosis code | | Episode treatment start date | PATIENTID |
| 9 | Tumour diagnosis (Provisional) | Statusofregistration | ICD-10 diagnosis code | Stage_best | Diagnosisdatebest | PATIENTID |
| 11 | HES major surgery (historical) | OPCS-4 code | ICD-10 diagnosis code | | Operation date | NHSNUMBER |
| 12 | HES major surgery (historical, further constraints) | OPCS-4 code | ICD-10 diagnosis code | Further notes/constraints | Operation date | NHSNUMBER |
| 14 | RAWDATA major surgery (historical) | OPCS-4 code | ICD-10 diagnosis code | | Operation date | PATIENTID |
| 15 | RAWDATA major surgery (historical, further constraints) | OPCS-4 code | ICD-10 diagnosis code | Further notes/constraints | Operation date | PATIENTID |
| 17 | Prior tumour diagnosis | Statusofregistration | ICD-10 diagnosis code | Stage_best | Diagnosisdatebest | PATIENTID |
| 18 | Tumour diagnosis (Final) | Statusofregistration | ICD-10 diagnosis code | Stage_best | Diagnosisdatebest | PATIENTID |
| 19 | Patient vital status date | Vitalstatus | ICD-10 Underlying cause of death | Death location code | Vitalstatusdate | PATIENTID |
| 20 | RAWDATA holistic needs assessment record | HNA point of pathway : HNA offered : HNA staff role | Primary diagnosis | Laterality | Date of HNA | PATIENTID |
| 21 | RAWDATA staging | Inferred best stage | ICD-10 diagnosis code | T/N/M components for pre-treatment/pathological/integrated stage | Collected stage date | PATIENTID |
| 22 | CWT First Seen | Source of referral | Categorisation of TWW, screening and consultant upgrade cases, where relevant | Suspected cancer referral type | Date first seen | NHSNUMBER |
| 23 | HES diagnostic event | OPCS-4 code | Description | BX/LD | Operation date | NHSNUMBER |
| 24 | RAWDATA personal care and support plan | PCSP point of pathway : PCSP offered : PCSP staff role | Primary diagnosis | Laterality | PCSP date | PATIENTID |
| 25 | RAWDATA end of treatment summary | | Primary diagnosis | Laterality | eots_date | PATIENTID |
| 50 | Skeleton Tumour creation | E_base_record type (COSD = England, CANISC = Wales) | ICD-10 diagnosis code | | Diagnosisdate | PATIENTID |
| 51 | Diagnosis reported in COSD | Number of times reported | ICD-10 diagnosis code | E_base_record type | Diagnosisdate | NHSNUMBER |
| 52 | CWT estimated diagnosis date | CWT First Treatment Flag | CWT recorded primary diagnosis (ICD) | CWT Cancer Treatment Event Type | Adjusted treat period start | NHSNUMBER |
| 53 | HES inferred tumour | HES cancer group | ICD-10 diagnosis code | | Episode start date | NHSNUMBER |
| 54 | COSD diagnosis submission | E_base_record primary diagnoses | ICD-10 diagnosis code (submission) | | Diagnosis date (submission) | PATIENTID |
| 55 | RAWDATA biopsy record | Laterality | ICD-10 diagnosis code | | Collected date/authorised date | PATIENTID |
| 56 | RAWDATA imaging record | Laterality | ICD-10 diagnosis code | Procedure_date - diagdate | Diagdate | PATIENTID |
| 57 | RAWDATA HNA diagnosis | Laterality | Primary diagnosis (ICD-10) | | Diagdate | PATIENTID |
| 101 | Inferred diagnosis | Event_property_1 from source record (event 19, 52, 53, 54) | ICD-10 diagnosis code | Cancer group | First recorded date | PATIENTID |
| 102 | Inferred diagnosis, with adapted diagnosis dates (from 101, adapted) | RCRD inferred/derived diagnosis, with adapted diagnosis dates - using event 101 diagnoses, adapting diagnosis dates for earlier records in 90-days preceding 101 diagnosis date | source_id from second record used to adapt diagnosis date | ICD-10 diagnosis code | Cancer group | PATIENTID |
| 103 | Alternative inferred diagnosis (work in progress) | RCRD inferred/derived diagnosis based on combination of multiple data sources - using an alternative approach to 101 (approach still being refined) | source_id from second record used to adapt diagnosis date | ICD-10 diagnosis code | Cancer group | PATIENTID |
*: Data dictionary: Primary cancer site for cancer faster diagnosis pathway
**: Data dictionary: Holistic needs assessment point of pathway for cancer
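Because the pathway table holds one row per event, a patient's pathway can be read by collecting their events and ordering them by date. The sketch below illustrates that idea in Python. The `PathwayEvent` class, the `patient_key` field (standing in for whichever of NHSNUMBER or PATIENTID links the event) and all example values are hypothetical, chosen only to mirror the columns of Table A1.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PathwayEvent:
    """One row of a pathway-style event table (columns follow Table A1)."""
    patient_key: str          # hypothetical: a single resolved linkage key
    event_type: int
    event_date: date
    event_property_1: str = ""
    event_property_2: str = ""
    event_property_3: str = ""


def patient_timeline(events, patient_key):
    """Return one patient's events in date order, ties broken by event type."""
    own = [e for e in events if e.patient_key == patient_key]
    return sorted(own, key=lambda e: (e.event_date, e.event_type))


# Invented example rows; the property values are placeholders, not real codes.
events = [
    PathwayEvent("P1", 7, date(2018, 5, 2), "group-x", "1", "intent-x"),  # SACT Cycle
    PathwayEvent("P1", 101, date(2018, 3, 14), "", "C50", "Breast"),      # Inferred diagnosis
    PathwayEvent("P2", 101, date(2018, 4, 1), "", "C61", "Prostate"),     # Inferred diagnosis
    PathwayEvent("P1", 8, date(2018, 6, 20), "intent-y", "C50"),          # RTDS Episode
]
timeline = patient_timeline(events, "P1")
```

In this toy case the timeline for `P1` runs from the inferred diagnosis (event 101) through the SACT cycle (event 7) to the radiotherapy episode (event 8).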
Appendix 2 - List of Rapid Registration fields available
Table A2: AT_RAPID_TUMOUR: field list
| COLUMN_NAME | DATA_TYPE | Notes |
| INDIVIDUALID | NUMBER(19,0) | Matches AT_RAPID_PATHWAY for each event with event_type=101 |
| PATIENTID | NUMBER(19,0) | Matches AT_RAPID_PATHWAY for each event with event_type=101 |
| NHSNUMBER | VARCHAR2(12 BYTE) | Matches AT_RAPID_PATHWAY for each event with event_type=101 |
| TUMOUR_AVPID | NUMBER | Matches AT_RAPID_PATHWAY for each event with event_type=101 |
| DIAGNOSISDATE | DATE | Matches AT_RAPID_PATHWAY for each event with event_type=101 |
| TUMOUR_SITE | VARCHAR2(260 CHAR) | Matches AT_RAPID_PATHWAY for each event with event_type=101 (event_property_2) |
| BIRTHDATEBEST | DATE | Taken from Encore |
| AGE | VARCHAR2(260 CHAR) | Taken from Encore |
| GENDER | VARCHAR2(260 CHAR) | Taken from Encore |
| POSTCODE | VARCHAR2(255 BYTE) | Taken from Encore |
| SURNAME | VARCHAR2(64 BYTE) | Taken from Encore |
| FORENAME | VARCHAR2(64 BYTE) | Taken from Encore |
| STAGE | VARCHAR2(260 CHAR) | Defined for selected cancer sites |
| ETHNICCATEGORY | VARCHAR2(1 CHAR) | Taken from Encore or the HESAPC dataset |
| FINAL_ROUTE | VARCHAR2(22 BYTE) | Final Route to Diagnosis using an adapted version of the standard NCRAS methodology |
| QUINTILE_2019 | VARCHAR2(120 BYTE) | Index of Multiple Deprivation quintile defined using the standard NCRAS methodology |
| CHRL_TOT_27_03 | NUMBER(10,0) | Charlson score defined using the standard NCRAS methodology |
| TUMOUR_MORPHOLOGY | VARCHAR2(5 CHAR) | Tumour morphology as recorded in the COSD system |
| TUMOUR_PERFORMANCESTATUS | VARCHAR2(1 CHAR) | Patient performance status at time of diagnosis |
| BASISOFDIAGNOSIS | VARCHAR2(260 CHAR) | The basis of diagnosis (e.g. clinical; pathological; etc.) |
| LSOA11 | VARCHAR2(27 BYTE) | 2011 census LSOA of residence at time of diagnosis |
| LSOA21 | VARCHAR2(27 BYTE) | 2021 census LSOA of residence at time of diagnosis |
| DIAGNOSIS_TRUST | VARCHAR2(260 CHAR) | Trust of diagnosis |
| SOURCE | VARCHAR2(11 CHAR) | The dataset used as the primary source for the RCRD registration |
| SOURCE_ID | VARCHAR2(69 CHAR) | The unique ID of the record used as the primary source for the RCRD registration |
| VITALSTATUS | VARCHAR2(260 CHAR) | Records whether the patient is currently alive or deceased at the time of the snapshot. |
| VITALSTATUSDATE | DATE | The date of the last known vital status for the patient |
| CANCER_GROUP | VARCHAR2(40 BYTE) | Broad cancer group derived from TUMOUR_SITE, according to groupings used for RCRD derivation and RCRD dashboard |
| CANCER_GROUP_DETAILED | VARCHAR2(40 BYTE) | Detailed cancer group derived from TUMOUR_SITE, according to groupings used for RCRD dashboard |
| SURGERY_FLAG | NUMBER | Indicator flag (0 = No; 1 = Yes) for whether the patient has an associated surgical tumour resection record |
| SURGERY_DATE | DATE | Where the SURGERY_FLAG indicates one or more associated surgical tumour resection records, this is the earliest such date |
| RADIOTHERAPY_FLAG | NUMBER | Indicator flag (0 = No; 1 = Yes) for whether the patient has an associated radiotherapy record (from RTDS) |
| RADIOTHERAPY_DATE | DATE | Where the RADIOTHERAPY_FLAG indicates one or more associated radiotherapy records, this is the earliest such date |
| SACT_FLAG | NUMBER | Indicator flag (0 = No; 1 = Yes) for whether the patient has an associated systemic anti-cancer therapy (SACT) record (from SACT dataset) |
| SACT_DATE | DATE | Where the SACT_FLAG indicates one or more associated SACT records, this is the earliest such date |
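The treatment fields in Table A2 come in pairs: a 0/1 flag and the earliest date among the associated records. A minimal sketch of that pairing is shown below. The helper name `flag_and_earliest` and the example dates are assumptions for illustration; how records are associated with a tumour in the actual build is not described here.

```python
from datetime import date


def flag_and_earliest(record_dates):
    """Return (flag, earliest_date) for a list of treatment record dates.

    flag is 1 if at least one dated record exists, otherwise 0;
    earliest_date is None when the flag is 0.
    """
    dated = [d for d in record_dates if d is not None]
    if not dated:
        return 0, None
    return 1, min(dated)


# Invented radiotherapy episode start dates associated with one registration.
radiotherapy_flag, radiotherapy_date = flag_and_earliest(
    [date(2018, 9, 3), date(2018, 7, 16), date(2018, 8, 1)]
)
# A registration with no associated surgery records.
surgery_flag, surgery_date = flag_and_earliest([])
```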
Appendix 3 - Cancer groups used for matching
Table A3: Rapid Registration ICD-10 tumour inclusion list
| ICD | CANCER_GROUP | SCOPE | ICD | CANCER_GROUP | SCOPE |
| C00 | Head & Neck | DQ & CD | C54 | Gynae | DQ & CD |
| C01 | Head & Neck | DQ & CD | C55 | Gynae | DQ & CD |
| C02 | Head & Neck | DQ & CD | C56 | Gynae | DQ & CD |
| C03 | Head & Neck | DQ & CD | C57 | Gynae | DQ & CD |
| C04 | Head & Neck | DQ & CD | C58 | Gynae | DQ & CD |
| C05 | Head & Neck | DQ & CD | C59 | Other | DQ & CD |
| C06 | Head & Neck | DQ & CD | C60 | Urology | DQ & CD |
| C07 | Head & Neck | DQ & CD | C61 | Prostate | DQ & CD |
| C08 | Head & Neck | DQ & CD | C62 | Urology | DQ & CD |
| C09 | Head & Neck | DQ & CD | C63 | Urology | DQ & CD |
| C10 | Head & Neck | DQ & CD | C64 | Urology | DQ & CD |
| C11 | Head & Neck | DQ & CD | C65 | Urology | DQ & CD |
| C12 | Head & Neck | DQ & CD | C66 | Urology | DQ & CD |
| C13 | Head & Neck | DQ & CD | C67 | Urology | DQ & CD |
| C14 | Head & Neck | DQ & CD | C68 | Urology | DQ & CD |
| C15 | O-G | DQ & CD | C69 | Brain & CNS | DQ & CD |
| C16 | O-G | DQ & CD | C70 | Brain & CNS | DQ & CD |
| C17 | Upper GI | DQ & CD | C71 | Brain & CNS | DQ & CD |
| C18 | Colorectal | DQ & CD | C72 | Brain & CNS | DQ & CD |
| C19 | Colorectal | DQ & CD | C73 | Endocrine | DQ & CD |
| C20 | Colorectal | DQ & CD | C74 | Endocrine | DQ & CD |
| C21 | Colorectal | DQ & CD | C75 | Endocrine | DQ & CD |
| C22 | Upper GI | DQ & CD | C76 | Unknown Primary | DQ & CD |
| C23 | Upper GI | DQ & CD | C77 | Unknown Primary | DQ & CD |
| C24 | Upper GI | DQ & CD | C78 | Unknown Primary | DQ & CD |
| C25 | Upper GI | DQ & CD | C79 | Unknown Primary | DQ & CD |
| C26 | Upper GI | DQ & CD | C80 | Unknown Primary | DQ & CD |
| C27 | Other | DQ & CD | C81 | Haematological | DQ & CD |
| C28 | Other | DQ & CD | C82 | Haematological | DQ & CD |
| C29 | Other | DQ & CD | C83 | Haematological | DQ & CD |
| C30 | Head & Neck | DQ & CD | C84 | Haematological | DQ & CD |
| C31 | Head & Neck | DQ & CD | C85 | Haematological | DQ & CD |
| C32 | Head & Neck | DQ & CD | C86 | Haematological | DQ & CD |
| C33 | Lung | DQ & CD | C87 | Haematological | DQ & CD |
| C34 | Lung | DQ & CD | C88 | Haematological | DQ & CD |
| C35 | Other | DQ & CD | C89 | Haematological | DQ & CD |
| C36 | Other | DQ & CD | C90 | Haematological | DQ & CD |
| C37 | Other | DQ & CD | C91 | Haematological | DQ & CD |
| C38 | Lung | DQ & CD | C92 | Haematological | DQ & CD |
| C39 | Lung | DQ & CD | C93 | Haematological | DQ & CD |
| C40 | Bone & ST | DQ & CD | C94 | Haematological | DQ & CD |
| C41 | Bone & ST | DQ & CD | C95 | Haematological | DQ & CD |
| C42 | Other | DQ & CD | C96 | Haematological | DQ & CD |
| C43 | Melanoma | DQ & CD | C97 | Unknown Primary | DQ & CD |
| C44 | NMSC | | D05 | Breast | DQ |
| C45 | Lung | DQ & CD | D06 | Gynae | |
| C46 | Bone & ST | DQ & CD | D09 | Urology | DQ |
| C47 | Brain & CNS | DQ & CD | D32 | Brain & CNS | DQ |
| C48 | Gynae | DQ & CD | D33 | Brain & CNS | DQ |
| C49 | Bone & ST | DQ & CD | D35 | Brain & CNS | DQ |
| C50 | Breast | DQ & CD | D41 | Urology | DQ |
| C51 | Gynae | DQ & CD | D42 | Brain & CNS | DQ |
| C52 | Gynae | DQ & CD | D43 | Brain & CNS | DQ |
| C53 | Gynae | DQ & CD | D44 | Brain & CNS | DQ |
Scope: DQ = ‘Included in this data quality document’; CD = ‘Included in cancerdata.nhs.uk/covid-19/rcrd dashboard’
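Table A3 is in effect a lookup from the three-character ICD-10 code to a cancer group. A minimal sketch of applying such a lookup is shown below. It includes only a handful of rows copied from the table, and the function name and the normalisation of the input code (upper-casing, dropping the dot, keeping the first three characters) are assumptions rather than the RCRD build logic.

```python
# A few rows copied from Table A3; the full table has many more entries.
CANCER_GROUP_BY_ICD3 = {
    "C18": "Colorectal",
    "C34": "Lung",
    "C43": "Melanoma",
    "C44": "NMSC",
    "C50": "Breast",
    "C61": "Prostate",
    "D05": "Breast",
}


def cancer_group(icd10_code):
    """Map an ICD-10 code such as 'C50.9' to its cancer group, or None if unlisted."""
    icd3 = icd10_code.strip().upper().replace(".", "")[:3]
    return CANCER_GROUP_BY_ICD3.get(icd3)


breast = cancer_group("C50.9")
lung = cancer_group("c341")
unlisted = cancer_group("Z99")
```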
Appendix 4 - Alternative defining events
Several options were considered as the defining events for the
Rapid Registrations. Standalone datasets, subsets of standalone
datasets, and combined datasets were explored and their FNE and FPE
figures quantified. A subset of these alternatives is presented below
as a demonstration of the process, but the majority of this exploratory
work is out of scope for this document.
Candidates for diagnosis events from the three main datasets that are
rapidly available and have nominally full coverage of cancer patients
are shown below (SACT and RTDS were also examined but data is not
presented). Of the three, the CWT data has the best FPE but its FNE is
substantially higher than that of the COSD dataset. HES produced the worst
results in both measures. A filtering process was applied to the
standalone COSD data to remove apparently new diagnoses that were
actually recurrences of prior tumours. This improved the FPE at a cost
of increasing the FNE. We continue to test whether this process can be
further refined to improve the combined FPE and FNE figures, and monitor
changes in the underlying datasets that might also give new
opportunities to do so.
Table A4: Rapid Cancer Registrations: alternative defining events
| Event | FPE | FNE |
| Event 52 - standalone CWT | 7.6% | 28.3% |
| Event 53 - standalone HES | 13.2% | 38.9% |
| Event 54 - standalone COSD | 8.1% | 15.8% |
| Event 101 (up to cas2106) - filtered COSD | 5.2% | 17.7% |
| Event 101 (cas2107) - filtered combined COSD/CWT | 5.6% | 16.4% |
| Event 101 (cas2108) - filtered combined COSD/CWT | 5.1% | 16.5% |
| Event 101 (cas2109) - filtered combined COSD/CWT | 5.1% | 16.6% |
| Event 101 (cas2110) - filtered combined COSD/CWT/HES | 5.1% | 14.7% |
| Event 101 (cas2111) - filtered combined COSD/CWT/HES | 6.2% | 13.4% |
| Event 101 (cas2112 to cas2202) - filtered combined COSD/CWT/HES and Death Certificates Only | 5.3% | 13.4% |
| Event 101 (cas2203 to cas2204) - filtered combined COSD/CWT/HES and Death Certificates Only | 6.3% | 12.2% |
| Event 101 (cas2205) - filtered combined COSD/CWT/HES and Death Certificates Only | 6.1% | 12.3% |
| Event 101 (cas2206) - filtered combined COSD/CWT/HES and Death Certificates Only | 5.6% | 12.5% |
| Event 101 (cas2207) - filtered combined COSD/CWT/HES and Death Certificates Only | 6.0% | 11.8% |
| Event 101 (cas2208 to cas2210) - filtered combined COSD/CWT/HES and Death Certificates Only | 6.0% | 11.6% |
| Event 101 (cas2211 to cas2304) - filtered combined COSD/CWT/HES and Death Certificates Only | 6.1% | 11.5% |
| Event 101 (cas2305) - filtered combined COSD/CWT/HES and Death Certificates Only | 6.1% | 11.3% |
| Event 101 (cas2306) - filtered combined COSD/CWT/HES and Death Certificates Only | 6.1% | 11.4% |
| Event 101 (cas2307 to cas2308) - filtered combined COSD/CWT/HES and Death Certificates Only | 6.1% | 11.3% |
| Event 101 (cas2309 to cas2311) - filtered combined COSD/CWT/HES and Death Certificates Only | 6.1% | 11.4% |
| Event 101 (cas2312 to cas2409) - filtered combined COSD/CWT/HES and Death Certificates Only | 6.0% | 11.5% |
| Event 101 (cas2410) - filtered combined COSD/CWT/HES and Death Certificates Only | 5.9% | 11.7% |
| Event 101 (cas2411) - filtered combined COSD/CWT/HES and Death Certificates Only | 5.8% | 12.3% |
| Event 101 (cas2412 to cas2501) - filtered combined COSD/CWT/HES and Death Certificates Only | 6.0% | 12.0% |
| Event 101 (cas2504 to cas2505) - filtered combined COSD/CWT/HES and Death Certificates Only | 6.0% | 12.1% |
| Event 101 (cas2505 to cas2506) - filtered combined COSD/CWT/HES and Death Certificates Only | 6.0% | 12.0% |
| Event 101 (cas2507 to cas2602) - filtered combined COSD/CWT/HES and Death Certificates Only | 6.0% | 12.1% |
Appendix 5 - Counts and error tabulations
Figure A1 shows an example for a very small dataset of how counts and
error proportions are derived. This dataset has 10 Gold Standard
Registrations and 7 Rapid Registrations overall (both indicated by the
dots in the figure, with time running vertically over the course of 2018
and Gold Standard vs Rapid Registrations divided horizontally).
Successful linkages between Gold Standard and Rapid Registrations are
indicated by blue lines. False negatives and false positives are
indicated. Only tumours in the 6-month assessment period are included in
the tabulations below, although these can link to tumours outside the
period as shown, and many-to-one linkages are also allowed. In this
example, the false negative rate is therefore 3 in 7 and the false
positive rate 1 in 6.
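The arithmetic of the small example can be written out as follows. This sketch assumes that the denominators are the registrations falling inside the assessment period (7 Gold Standard and 6 Rapid, inferred from the quoted rates since the figure is not reproduced here), and that the numerators are the registrations left without a link. The function name is illustrative.

```python
from fractions import Fraction


def error_proportions(gs_in_period, rapid_in_period, gs_unlinked, rapid_unlinked):
    """Return (false_negative, false_positive) proportions as exact fractions.

    False negatives are Gold Standard registrations with no linked Rapid
    Registration; false positives are Rapid Registrations with no linked
    Gold Standard registration.
    """
    false_negative = Fraction(gs_unlinked, gs_in_period)
    false_positive = Fraction(rapid_unlinked, rapid_in_period)
    return false_negative, false_positive


# Counts inferred from the worked example in the text (an assumption).
fne, fpe = error_proportions(gs_in_period=7, rapid_in_period=6,
                             gs_unlinked=3, rapid_unlinked=1)
fne_percent = round(float(fne) * 100, 1)
fpe_percent = round(float(fpe) * 100, 1)
```

Expressed as percentages, 3 in 7 is about 42.9% and 1 in 6 is about 16.7%.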
Table A5: Counts and errors tabulation by cancer group
| Cancer group | Gold Standard (GS) Registrations | Rapid Registrations | Difference | Percentage Rapid/GS | FPE | FNE |
| Brain & CNS | 5795 | 5138 | 657 | 88.7% | 711 | 1364 |
| Breast | 28970 | 27223 | 1747 | 94.0% | 1489 | 1720 |
| Colorectal | 18981 | 17846 | 1135 | 94.0% | 919 | 1699 |
| Endocrine | 1910 | 1484 | 426 | 77.7% | 139 | 509 |
| Gynae | 9789 | 9313 | 476 | 95.1% | 665 | 1010 |
| Haematological | 14051 | 12432 | 1619 | 88.5% | 723 | 2366 |
| Head & Neck | 5288 | 4929 | 359 | 93.2% | 400 | 698 |
| Lung | 21726 | 20106 | 1620 | 92.5% | 626 | 2115 |
| Melanoma | 8259 | 7691 | 568 | 93.1% | 698 | 1072 |
| O-G | 6622 | 6471 | 151 | 97.7% | 366 | 476 |
| Prostate | 27229 | 25183 | 2046 | 92.5% | 326 | 2469 |
| Bone & Soft Tissue | 1140 | 1082 | 58 | 94.9% | 365 | 406 |
| Unknown Primary | 3428 | 2566 | 862 | 74.9% | 665 | 1531 |
| Upper GI | 9272 | 8672 | 600 | 93.5% | 801 | 1442 |
| Urology | 17025 | 14874 | 2151 | 87.4% | 1016 | 2851 |
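The Difference and Percentage Rapid/GS columns of Table A5 can be reproduced from the two registration counts: the difference is Gold Standard minus Rapid, and the percentage is Rapid divided by Gold Standard. A short check of that reading against three rows of the table follows; the function name is an illustrative assumption.

```python
def difference_and_percentage(gs_count, rapid_count):
    """Return (GS - Rapid, Rapid as a percentage of GS rounded to 1 decimal place)."""
    difference = gs_count - rapid_count
    percentage = round(100.0 * rapid_count / gs_count, 1)
    return difference, percentage


# Counts copied from Table A5.
brain_cns = difference_and_percentage(5795, 5138)
endocrine = difference_and_percentage(1910, 1484)
unknown_primary = difference_and_percentage(3428, 2566)
```

Note that the FPE and FNE columns in Tables A5 and A6 are counts of tumours rather than percentages.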
Table A6: Counts and errors tabulation by cancer site
| Cancer site | Gold Standard (GS) Registrations | Rapid Registrations | Difference | Percentage Rapid/GS | FPE | FNE |
| C00 | 111 | 150 | -39 | 135.1% | 65 | 25 |
| C01 | 647 | 469 | 178 | 72.5% | 13 | 61 |
| C02 | 602 | 604 | -2 | 100.3% | 18 | 93 |
| C03 | 234 | 108 | 126 | 46.2% | 5 | 65 |
| C04 | 256 | 239 | 17 | 93.4% | 10 | 34 |
| C05 | 211 | 185 | 26 | 87.7% | 8 | 32 |
| C06 | 270 | 287 | -17 | 106.3% | 21 | 50 |
| C07 | 238 | 288 | -50 | 121.0% | 101 | 53 |
| C08 | 82 | 89 | -7 | 108.5% | 15 | 14 |
| C09 | 912 | 765 | 147 | 83.9% | 16 | 59 |
| C10 | 152 | 243 | -91 | 159.9% | 12 | 30 |
| C11 | 113 | 111 | 2 | 98.2% | 6 | 13 |
| C12 | 157 | 99 | 58 | 63.1% | 1 | 11 |
| C13 | 142 | 130 | 12 | 91.5% | 11 | 22 |
| C14 | 25 | 66 | -41 | 264.0% | 15 | 13 |
| C15 | 3996 | 4266 | -270 | 106.8% | 124 | 220 |
| C16 | 2626 | 2205 | 421 | 84.0% | 242 | 256 |
| C17 | 820 | 669 | 151 | 81.6% | 129 | 267 |
| C18 | 12438 | 11760 | 678 | 94.5% | 671 | 1239 |
| C19 | 996 | 937 | 59 | 94.1% | 43 | 90 |
| C20 | 4900 | 4504 | 396 | 91.9% | 115 | 328 |
| C21 | 647 | 645 | 2 | 99.7% | 90 | 42 |
| C22 | 2646 | 2524 | 122 | 95.4% | 266 | 454 |
| C23 | 477 | 468 | 9 | 98.1% | 29 | 62 |
| C24 | 645 | 527 | 118 | 81.7% | 32 | 84 |
| C25 | 4533 | 4188 | 345 | 92.4% | 137 | 499 |
| C26 | 151 | 296 | -145 | 196.0% | 208 | 76 |
| C30 | 162 | 158 | 4 | 97.5% | 27 | 25 |
| C31 | 93 | 65 | 28 | 69.9% | 5 | 27 |
| C32 | 881 | 873 | 8 | 99.1% | 51 | 71 |
| C33 | 13 | 11 | 2 | 84.6% | 1 | 3 |
| C34 | 20262 | 18740 | 1522 | 92.5% | 550 | 1929 |
| C37 | 169 | 94 | 75 | 55.6% | 13 | 59 |
| C38 | 73 | 354 | -281 | 484.9% | 46 | 21 |
| C39 | NA | 12 | NA | NA | 4 | NA |
| C40 | 119 | 107 | 12 | 89.9% | 13 | 25 |
| C41 | 117 | 150 | -33 | 128.2% | 78 | 44 |
| C43 | 8259 | 7691 | 568 | 93.1% | 698 | 1072 |
| C45 | 1209 | 895 | 314 | 74.0% | 12 | 103 |
| C46 | 69 | 43 | 26 | 62.3% | 3 | 25 |
| C47 | 28 | 12 | 16 | 42.9% | 5 | 22 |
| C48 | 287 | 400 | -113 | 139.4% | 111 | 87 |
| C49 | 835 | 782 | 53 | 93.7% | 271 | 312 |
| C50 | 25139 | 24355 | 784 | 96.9% | 1356 | 1358 |
| C51 | 645 | 594 | 51 | 92.1% | 55 | 79 |
| C52 | 95 | 109 | -14 | 114.7% | 16 | 12 |
| C53 | 1321 | 1315 | 6 | 99.5% | 56 | 86 |
| C54 | 4084 | 3690 | 394 | 90.4% | 106 | 199 |
| C55 | 73 | 333 | -260 | 456.2% | 26 | 17 |
| C56 | 2999 | 2528 | 471 | 84.3% | 243 | 491 |
| C57 | 275 | 320 | -45 | 116.4% | 34 | 38 |
| C58 | 10 | 24 | -14 | 240.0% | 18 | 1 |
| C60 | 304 | 315 | -11 | 103.6% | 50 | 37 |
| C61 | 27229 | 25183 | 2046 | 92.5% | 326 | 2469 |
| C62 | 1056 | 1067 | -11 | 101.0% | 87 | 72 |
| C63 | 33 | 31 | 2 | 93.9% | 13 | 19 |
| C64 | 4901 | 4369 | 532 | 89.1% | 274 | 795 |
| C65 | 419 | 321 | 98 | 76.6% | 25 | 92 |
| C66 | 362 | 260 | 102 | 71.8% | 13 | 122 |
| C67 | 4475 | 5056 | -581 | 113.0% | 146 | 681 |
| C68 | 96 | 56 | 40 | 58.3% | 6 | 40 |
| C69 | 383 | 352 | 31 | 91.9% | 48 | 64 |
| C70 | 19 | 42 | -23 | 221.1% | 5 | 1 |
| C71 | 2265 | 2114 | 151 | 93.3% | 140 | 206 |
| C72 | 83 | 90 | -7 | 108.4% | 35 | 19 |
| C73 | 1734 | 1354 | 380 | 78.1% | 80 | 414 |
| C74 | 118 | 84 | 34 | 71.2% | 26 | 57 |
| C75 | 58 | 46 | 12 | 79.3% | 33 | 38 |
| C76 | 94 | 210 | -116 | 223.4% | 109 | 54 |
| C77 | 271 | 126 | 145 | 46.5% | 62 | 129 |
| C78 | 593 | 54 | 539 | 9.1% | 22 | 331 |
| C79 | 231 | 133 | 98 | 57.6% | 54 | 126 |
| C80 | 2239 | 2043 | 196 | 91.2% | 418 | 891 |
| C81 | 894 | 864 | 30 | 96.6% | 15 | 72 |
| C82 | 1208 | 1046 | 162 | 86.6% | 16 | 136 |
| C83 | 3117 | 2702 | 415 | 86.7% | 40 | 317 |
| C84 | 395 | 230 | 165 | 58.2% | 14 | 122 |
| C85 | 1380 | 1023 | 357 | 74.1% | 65 | 317 |
| C86 | NA | 101 | NA | NA | 2 | NA |
| C88 | 221 | 349 | -128 | 157.9% | 12 | 63 |
| C90 | 2558 | 2211 | 347 | 86.4% | 64 | 435 |
| C91 | 2374 | 1905 | 469 | 80.2% | 81 | 566 |
| C92 | 1762 | 1570 | 192 | 89.1% | 237 | 288 |
| C93 | 23 | 187 | -164 | 813.0% | 23 | 1 |
| C94 | 26 | 78 | -52 | 300.0% | 65 | 11 |
| C95 | 51 | 66 | -15 | 129.4% | 10 | 12 |
| C96 | 42 | 100 | -58 | 238.1% | 79 | 26 |
| D05 | 3831 | 2868 | 963 | 74.9% | 133 | 362 |
| D09 | 5168 | 1267 | 3901 | 24.5% | 245 | 918 |
| D32 | 1481 | 1036 | 445 | 70.0% | 87 | 514 |
| D33 | 492 | 602 | -110 | 122.4% | 113 | 203 |
| D35 | 494 | 550 | -56 | 111.3% | 197 | 148 |
| D41 | 211 | 2132 | -1921 | 1010.4% | 157 | 75 |
| D42 | 150 | 16 | 134 | 10.7% | 4 | 38 |
| D43 | 278 | 268 | 10 | 96.4% | 54 | 76 |
| D44 | 122 | 56 | 66 | 45.9% | 23 | 73 |
Appendix 6 - False negative errors and basis of diagnosis
This appendix explores the reason for the overall age-dependence of
the false negative error rate.
The most common methods of confirming a diagnosis (histology and
cytology) account for the lowest proportion of false negatives (Figure
A2). Where diagnosis comes from specific tumour markers, the Rapid
Registrations are much more likely to “miss” the significant event or
events. Patients diagnosed clinically (from imaging, consultation by a
doctor but without a pathological sample being taken) are also more
likely to be “missed” in the Rapid Registrations dataset.
Patients for whom a diagnosis method cannot be determined (unknown),
or who died before they could be offered cancer treatment (death
certificate), are most likely to be “missed” in the Rapid Registrations
dataset. As Figure A3 indicates, though, these account for a small
proportion of those falsely omitted from the Rapid Registrations.
The marked reduction in the proportion of patients having their
diagnosis confirmed from a pathological specimen (histology or cytology)
explains the increase often observed at older ages in Figure A3, from
the age of around 70, reflecting fewer patients having an invasive
procedure performed on them as age increases. This is likely to be the
reason behind the increasing false negative proportions by age observed
overall and in most tumour groups (Figures 5 and 6).
Appendix 7 - False positive and false negative proportion by month
Figure 18 shows the False Negative and False Positive error
proportions by month for the broader matching criteria and matching
periods of 90 and 30 days.
Appendix 8 - Sensitivity testing of matching criteria
In this section, the sensitivity of the Rapid Registrations dataset
is illustrated for different matching criteria.
As expected, the stricter the criteria on the timing of events, the
more errors (both false negative and false positive) are observed. Not
including a match specification on tumour type (the second line of Table
1) improves both error proportions and demonstrates that approximately
40% of false positive tumours have a cancer diagnosis of some sort when
the necessity of matching by tumour group is removed.
Table A7: Proportions of false positive and negative errors under alternative matching criteria
| Tumour matching | Match within N days | False Negative % | False Positive % |
| Broader | 90 | 12.1% | 6.0% |
| Broader | 60 | 13.7% | 7.6% |
| Broader | 30 | 19.2% | 13.2% |
| Broader | 14 | 30.0% | 24.7% |
| Broader | 7 | 46.1% | 42.2% |
| Broader | 0 | 81.1% | 79.5% |
| Narrow | 90 | 20.1% | 13.9% |
| None | 90 | 10.6% | 4.6% |
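The two dimensions varied in Table A7, the tumour matching rule and the date window, can be illustrated with a small matching sketch. The definitions used here are assumptions made for illustration only: "narrow" is taken to mean the same three-character ICD-10 site, "broader" the same cancer group, and "none" no constraint on tumour type. The record layout, function names and toy data are hypothetical and are not the NCRAS linkage code.

```python
from datetime import date

TUMOUR_MATCHING_MODES = ("broader", "narrow", "none")


def is_match(gs, rapid, window_days, tumour_matching):
    """True if a Gold Standard and a Rapid record link under the given criteria."""
    if tumour_matching not in TUMOUR_MATCHING_MODES:
        raise ValueError(f"unknown tumour_matching: {tumour_matching!r}")
    if gs["patient"] != rapid["patient"]:
        return False
    if abs((gs["date"] - rapid["date"]).days) > window_days:
        return False
    if tumour_matching == "narrow":      # assumed: same 3-character ICD-10 site
        return gs["site"] == rapid["site"]
    if tumour_matching == "broader":     # assumed: same cancer group
        return gs["group"] == rapid["group"]
    return True                          # "none": no constraint on tumour type


def error_percentages(gs_records, rapid_records, window_days, tumour_matching):
    """Return (false_negative_pct, false_positive_pct); many-to-one links allowed."""
    false_neg = sum(
        not any(is_match(g, r, window_days, tumour_matching) for r in rapid_records)
        for g in gs_records
    )
    false_pos = sum(
        not any(is_match(g, r, window_days, tumour_matching) for g in gs_records)
        for r in rapid_records
    )
    return (100.0 * false_neg / len(gs_records),
            100.0 * false_pos / len(rapid_records))


# Toy records (invented): same patients, slightly different sites and dates.
gs_records = [
    {"patient": 1, "site": "C18", "group": "Colorectal", "date": date(2018, 3, 1)},
    {"patient": 2, "site": "C50", "group": "Breast", "date": date(2018, 4, 1)},
]
rapid_records = [
    {"patient": 1, "site": "C20", "group": "Colorectal", "date": date(2018, 3, 21)},
    {"patient": 2, "site": "C50", "group": "Breast", "date": date(2018, 5, 15)},
]
broader_90 = error_percentages(gs_records, rapid_records, 90, "broader")
broader_30 = error_percentages(gs_records, rapid_records, 30, "broader")
narrow_90 = error_percentages(gs_records, rapid_records, 90, "narrow")
```

In the toy data, tightening the window from 90 to 30 days, or moving from broader to narrow tumour matching, each breaks one of the two links, which is the qualitative pattern seen in Table A7.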
Appendix 9 - Code changes to the RCRD build process
In this section, code changes introduced in each monthly snapshot are
described.
Table A8: RCRD change log
AT_RAPID_PATHWAY: event list
|
snapshot
|
change_id
|
code_change
|
|
cas2603
|
|
None
|
|
cas2602
|
|
None
|
|
cas2601
|
|
None
|
|
cas2512
|
cas2512-1
|
Internal code changes to optimise build with regard to HES data
|
|
cas2511
|
cas2511-1
|
Update to HNA related events to include COSDv10 data and add
de-duplication
|
|
cas2510
|
cas2510-1
|
Further updates to support assignment of C48 tumours to a cancer group
depending on patient gender
|
|
cas2510
|
cas2510-2
|
Cleaning of odd values from stage field
|
|
cas2510
|
cas2510-3
|
Cleaning of dates of death prior to diagnosis
|
|
cas2509
|
cas2509-1
|
Further updates to support assignment of C48 tumours to a cancer group
depending on patient gender
|
|
cas2509
|
cas2509-2
|
Update to include persons with death-only information in group of proxy
tumours.
|
|
cas2508
|
cas2508-1
|
Further updates to support assignment of C48 tumours to a cancer group
depending on patient gender
|
|
cas2508
|
cas2508-2
|
Minor changes to surgery lookup table to align with standard treatment
reporting
|
|
cas2508
|
cas2508-3
|
Adding D48, D72, E85, M72 ICD-10 overall lookup table to align with
current cancer registration practice
|
|
cas2507
|
cas2507-1
|
C53 and C57 staging values moved into STAGE field from
EXPERIMENTAL_STAGE
|
|
cas2507
|
cas2507-2
|
C48 tumours now assigned to a cancer group depending on patient gender
|
|
cas2507
|
cas2507-3
|
Resective surgery lookup table better aligned with 2025 Cancer Flags
output
|
|
cas2506
|
cas2506-1
|
Internal changes to deal with multiple NHSnumbers per personid
|
|
cas2506
|
cas2506-2
|
Internal changes to prepare for improvements to assigning C48 tumours
|
|
cas2506
|
cas2506-3
|
Further development of event 102 and 103 experiemental events
|
|
cas2505
|
|
None
|
|
cas2504
|
cas2504-1
|
Further development of event 102 and 103 experiemental events
|
|
cas2504
|
cas2504-2
|
Update to basis of diagnosis code for 2023 cases onward to make
consistent with updated registration practice
|
|
cas2501
|
cas2501-1
|
Permanent fix to enact deuplication of experimental event 102
|
|
cas2412
|
cas2412-1
|
Include staging of C53 (cervical cancer) in experimental stage field
|
|
cas2412
|
cas2412-2
|
Correcting issue that excluded rapidly fatal cancers being included from
the HES data
|
|
cas2412
|
cas2412-3
|
Deduplication of experimental event 102 (hotfix)
|
|
cas2412
|
cas2412-4
|
Excluded lung screening Routes to Diagnosis prior to January 2019
|
|
cas2411
|
cas2411-1
|
Update to surgery code to use a combined table of all 3-digit ICD-10
codes, for all-stage and stage-specific procedures.
|
|
cas2411
|
cas2411-2
|
Filter OPCS4 procedure codes saved in initial HES tables, to include
only those relevant to later lookups.
|
|
cas2411
|
cas2411-3
|
Added filtering to exclude Welsh only patients within the rapid_fatality
section of event 101.
|
|
cas2411
|
cas2411-4
|
Two proposed new events, 102 and 103.
|
|
cas2410
|
cas2410-1
|
Refactored surgical lookup table code to be consistent with those used
in treatment flag output
|
|
cas2410
|
cas2410-2
|
Added GP Practice code to tumour table
|
|
cas2409
|
cas2409-1
|
Added C33 to allowed list for lung screening
|
|
cas2409
|
cas2409-2
|
Updated NSPL postcode lookup to NSPL published May 2024
|
|
cas2409
|
cas2409-3
|
Internal refactoring of surgical lookup table to prepare for a simpler
update process
|
|
cas2409
|
cas2409-4
|
Created internal experimental table showing patient GP practice at time
of diagnosis
|
|
cas2408
|
cas2408-1
|
Changed criteria for including Event 54 in rapid pathway table such that
there is a known nhsnumber instead of a known patient id (motivated by
changes to COSD v10 data submissions)
|
|
cas2407
|
cas2407-1
|
Added STAGE_EXPERIMENTAL field
|
|
cas2407
|
cas2407-2
|
Added staging for C57 ovarian tumours (into STAGE_EXPERIMENTAL field)
|
|
cas2407
|
cas2407-3
|
Opened selection for screening cases to include C34 lung cancers
|
|
cas2406
|
|
None
|
|
cas2405
|
cas2405-1
|
Updated assignment of trusts (reversing effect of cas2305-2 change),
reducing numbers of patients diagnosed at tertiary trusts and increasing
numbers diagnosed in nearby trusts.
|
|
cas2405
|
cas2405-2
|
Refactored order of properties in event 5 for consistency throughout the
code while maintaining fix for ethnicity made in cas2404.
|
|
cas2404
|
cas2404-1
|
Fixed issue with ethnicity ‘top up’ from HES data which was incorrectly
assigning ethnicity where it was present in HES but missing in COSD.
|
|
cas2404
|
cas2404-2
|
Update to allow creation of HES identified endocrine tumours based on
event 11, restoring diagnoses previously identified from event 13.
|
|
cas2403
|
cas2403-1
|
Added place of death to event 19, property 3.
|
|
cas2403
|
cas2403-2
|
Merged event 13 into event 11 and event 16 into event 14. As a result,
surgery codes consistent with CASSOP 4.5 are no longer distinguished
from those specific to the RCRD build.
|
|
cas2403
|
cas2403-3
|
Add LSOA21 and age at diagnosis to AT_RAPID_TUMOUR table.
|
|
cas2402
|
|
None
|
|
cas2401
|
|
None
|
|
cas2312
|
cas2312-1
|
Update ICD-10 site lookup table to include more D-coded tumour groups.
|
|
cas2311
|
cas2311-1
|
Filter ethnicity to 1 digit only.
|
|
cas2311
|
cas2311-2
|
Updated postcode lookup table to nspl_202305.
|
|
cas2311
|
cas2311-3
|
Added filter to morphology codes to only allow those beginning with ‘8’
or ‘9’.
|
|
cas2311
|
cas2311-4
|
After review of fields, removed ‘received_date’ from pathway table.
|
|
cas2311
|
cas2311-5
|
After review of fields, removed event type 10 as an effective duplicate
of event type 19.
|
|
cas2310
|
|
None
|
|
cas2309
|
cas2309-1
|
Allow HES and CWT records to create event-type 52 and 53 events even if
there is no patientid. These are screened out so that they do not go on to
create event-type 101 events, but they are now available for testing.
|
|
cas2308
|
|
None
|
|
cas2307
|
cas2307-1
|
Expose path and integrated TNM stage components in event 21.
|
|
cas2307
|
cas2307-2
|
Change offset for CWT diagnosis events to a fixed lookup table rather
than re-calculating each time.
|
|
cas2307
|
cas2307-3
|
Update CWT surgery codes to reflect changes to CWT data dictionary.
|
|
cas2307
|
cas2307-4
|
Updated surgery lookup table to reflect changes implemented in cancer
treatment flags output.
|
|
cas2306
|
cas2306-1
|
Move comparison of diagnosis date to date of death to earlier in the
processing (using vital status date for date of death if
appropriate).
|
|
cas2305
|
cas2305-1
|
Remove duplicate patients with multiple patientids and the same nhsnumber.
|
|
cas2305
|
cas2305-2
|
Revert to prior order to prioritise creation of event 101s without
prioritising those with a known trust.
|
|
cas2305
|
cas2305-3
|
Bring diagnosis trust through to AT_RAPID_TUMOUR table.
|
|
cas2305
|
cas2305-4
|
Added new basis of diagnosis codes to reflect changes to ENCR
definitions for diagnoses from 2023 onwards.
|
|
cas2305
|
cas2305-5
|
Replace diagnosisdate with date of death for cases where date of death
would otherwise have been within the 3 months before diagnosisdate.
|
|
cas2304
|
|
None
|
|
cas2303
|
|
None
|
|
cas2302
|
|
None
|
|
cas2301
|
|
None
|