Summary

This document summarises the relationship between the Rapid Cancer Registration Data (RCRD) and the National Cancer Registration Dataset (NCRD) in terms of the early diagnosis proportion: the proportion of cases diagnosed at an early stage (broadly, at stage 1 or 2) out of all staged cases. This proportion is sometimes colloquially called a rate and is abbreviated “ED%”. The relationship between the ED% from the two data sources is monitored here, and this document is updated quarterly.

We find:

  • the RCRD is included to May 2025 at time of writing, based on the August 2025 data snapshot, while the NCRD is available to September 2023
  • the RCRD can be used as a proxy for the NCRD in the more recent period where the NCRD is not yet available, for multiple uses including but not limited to measuring ED%
  • comparisons within datasets (either within the NCRD or within the RCRD) are preferred to direct comparisons between the two
  • overall there is a good correspondence between the ED% in the RCRD and NCRD, which was largely maintained even during the disruption of the COVID-19 period
  • there is some variation in the relationship between ED% in the RCRD and NCRD by cancer site and demographic factors; this should be taken into account, with reference to this document, when working with RCRD data for specific cancer types and demographic factors
  • there is variation in the ED% by geography, particularly for prostate cancer, which appears real rather than artefactual and may in that case be related to differential use of PSA testing
  • the relationship between the ED% in the RCRD and NCRD does not vary statistically significantly over time at the 95% confidence level: the p-value for trend is 0.724 in the most recent data. For some cancer types, there is a statistically significant change over time. NDRS continues to monitor these relationships
  • when first published, the most recent month of data generally under-estimates the early diagnosis proportion that it will eventually show when the data matures 2-3 months later, by around 0.3 percentage points. The (rolling) 12-month early diagnosis proportion does not have this property and is preferred for most measurement purposes.

Please contact with any queries, or suggestions on how to improve this document.

Introduction

The Early Diagnosis Ambition

The NHS Long Term Plan (LTP) was published in January 2019. One of the key ambitions is:

“By 2028, the NHS will diagnose 75% of cancers at stage 1 or 2.”

The LTP sets out a number of initiatives by which this may be achieved, for instance the modernisation of the Bowel Cancer Screening Programme or the extension of lung health checks. Other innovative approaches post-date the LTP (for example, the NHS Galleri Trial). For all these improvement initiatives it is desirable to be able to monitor, in as close to real time as possible, the proportion of cancers detected at an early stage.

The Rapid Cancer Registration and National Cancer Registration Datasets (RCRD/NCRD)

NDRS collects, curates and quality-assures the National Cancer Registration Dataset (NCRD) [1]. This data is the ‘gold standard’ for measuring the incidence and properties (including stage at diagnosis) of cancer in England, and is used to produce annual national statistics. The data undergoes careful review before publication to ensure high completeness and accuracy. This does, however, result in a delay of up to two years between a cancer being diagnosed and the data being reportable.

In response to the COVID-19 pandemic and the desire for more rapid reporting, NDRS developed an algorithmically generated Rapid Cancer Registration Dataset (RCRD) [2] using the standard administrative datasets which flow most rapidly into NDRS and are incorporated into the Cancer Analysis System (CAS). This represents the best available database of ‘proxy cancer registrations’ (including proxy staging information) for a more recent period than that available in finalised NCRD data.

The RCRD as a proxy for the NCRD

The RCRD was developed using a data-led approach that aimed to replicate the NCRD as closely as possible. It aims to simultaneously minimise: false negative errors (cases present in the NCRD but missing from the RCRD); false positive errors (cases present in the RCRD but not present in the NCRD); and cases with a substantially incorrect diagnosis date, topographic site of cancer in the body, and/or stage at diagnosis.
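
The two error types above can be illustrated with a minimal sketch that treats the NCRD as the reference set. The case identifiers and the function name `compare_case_sets` are hypothetical and used only for illustration; this document does not describe how cases are actually linked between the two datasets.

```python
# Hypothetical case identifiers (assumption: some shared key links the two datasets).
ncrd_cases = {"A001", "A002", "A003", "A004", "A005"}
rcrd_cases = {"A002", "A003", "A004", "A005", "A006"}


def compare_case_sets(rcrd, ncrd):
    """Count matched, false negative and false positive cases, with NCRD as reference."""
    matched = rcrd & ncrd
    false_negatives = ncrd - rcrd  # present in NCRD but missing from RCRD
    false_positives = rcrd - ncrd  # present in RCRD but not present in NCRD
    return {
        "matched": len(matched),
        "false_negatives": len(false_negatives),
        "false_positives": len(false_positives),
    }


result = compare_case_sets(rcrd_cases, ncrd_cases)
print(result)
```

This sketch only covers presence or absence of a case; errors in diagnosis date, site or stage would need a field-by-field comparison of the matched cases.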

The RCRD is updated monthly, and a data quality monitoring report is published for each release. This monitors the correspondence between the NCRD and RCRD datasets for the period in which they overlap, before the COVID-influenced period began in either dataset.

Stage by cancer sites in RCRD and NCRD

The NCRD attempts to stage all stageable cancers (some cancer sites do not have a specific staging system that allows their stage to be classified). Details of this classification can be found in the annual National Statistics on Cancer Registration, with the Staging data in England online tool allowing data to be explored and downloaded.

The RCRD attempts to stage 13 specific cancer sites representing approximately three quarters of all malignant cancers (excluding non-melanoma skin cancer). Details of these sites and how they compare to similar sites in the NCRD can be found in the monthly data quality monitoring report.
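
As a reminder of how the ED% is formed from staged cases, the following is a minimal sketch under simplifying assumptions: stages are coded as the integers 1 to 4, unstaged cases are coded as `None` and are excluded from the denominator, and "early" means exactly stage 1 or 2. The function name and the example data are hypothetical; the precise definition used in the published statistics may differ in detail.

```python
def early_diagnosis_proportion(stages):
    """Return ED% (0-100) from an iterable of stage values.

    Assumption: values 1-4 are staged cases; anything else (e.g. None)
    is unstaged and is left out of the denominator.
    """
    staged = [s for s in stages if s in (1, 2, 3, 4)]
    if not staged:
        raise ValueError("no staged cases")
    early = sum(1 for s in staged if s in (1, 2))
    return 100.0 * early / len(staged)


# Made-up stages for ten cases, two of which are unstaged.
example_stages = [1, 2, 2, 3, 4, 1, None, 4, None, 2]
ed_pct = early_diagnosis_proportion(example_stages)
print(f"ED% = {ed_pct:.1f}")  # 5 early cases out of 8 staged cases
```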

Comparisons between RCRD and NCRD

Overall comparison

Figure 1: The early diagnosis proportion (ED%) for patients in the RCRD and the NCRD.

Figure 1 shows that, across the whole time series, the smallest and largest absolute differences between RCRD and NCRD were 0% and 2%, respectively. There is typically an offset between the two of approximately 2 percentage points, with the RCRD being higher than the NCRD. During the initial COVID-19 lockdown, in quarter 2 of 2020, the ED% for RCRD and NCRD fell by 6% and 8%, respectively.

Comparison by cancer site

[Figure panels: Cancer B-K; Cancer Lu-Oth; Cancer Ov-U]

Figures 2a-c: The early diagnosis proportion (ED%) for patients in the RCRD and the NCRD, by cancer site.

Figures 2a-c show the ED% for patients in the RCRD compared to patients in the NCRD split by cancer site. There is a substantial difference in the relationship for different cancer sites. Across the whole time series, bladder had the largest absolute difference between RCRD and NCRD. The smallest and largest absolute differences for this cancer site were 6% and 11%, respectively.

During the initial Covid-19 lockdown, in quarter 2 of 2020, the ED% for prostate and kidney (RCRD) fell by 8% and 6%, respectively; these were the largest drops out of all the cancer sites.

The ‘other’ category only exists in NCRD, representing cancers staged in the NCRD that are not in the RCRD.

Comparison by gender

Figure 3: The early diagnosis proportion (ED%) for patients in the RCRD and the NCRD, by gender.

Figure 3 shows the ED% for patients in the RCRD compared to patients in the NCRD, split by gender. The match in ED% for males was considerably closer than for females. Across the whole time series the largest absolute differences, between RCRD and NCRD, were 1% and 3% for males and females, respectively.

During the initial COVID-19 lockdown, in quarter 2 of 2020, the ED% for RCRD and NCRD fell by 8% and 9% for males and 7% and 8% for females, respectively. Relative to the quarter before the initial lockdown, the ED% for females took 2 quarters to return to the same level, whereas for males it took approximately 5 quarters. On examination of gender and cancer site breakdowns, this was largely attributable to prostate cancer (see Figure 2c).

Comparison by age group

Figure 4: The early diagnosis proportion (ED%) for patients in the RCRD and the NCRD, by age group.

Figure 4 shows the ED% for patients in the RCRD compared to patients in the NCRD split by age group. Across the whole time series, age group 50-59 had the largest absolute differences, between RCRD and NCRD: up to 4%. During the initial Covid-19 lockdown, in quarter 2 of 2020, the ED% for RCRD fell by 8% (age group 50-59), 9% (60-69), 6% (70-79) and 6% (80+). The youngest age group, 00-49, did not experience such a drastic fall during the same period.

Comparison by deprivation

Figure 5: The early diagnosis proportion (ED%) for patients in the RCRD and the NCRD, by deprivation quintile.

Figure 5 shows the ED% for patients in the RCRD compared to patients in the NCRD split by quintile of the index of multiple deprivation (IMD). Across the whole time series, the least deprived quintile had the largest absolute differences, between RCRD and NCRD, up to 3%.

During the initial Covid-19 lockdown, in quarter 2 of 2020, the ED% for RCRD fell by 8%, 7%, 5%, 6% and 5% for IMD quintiles 1, 2, 3, 4 and 5 respectively.

Comparison by ethnicity

Figure 6: The early diagnosis proportion (ED%) for patients in the RCRD and the NCRD, by ethnicity.

Figure 6 shows the ED% for patients in the RCRD compared to patients in the NCRD split by broad ethnic category. Across the whole time series, Mixed ethnicity had the largest absolute difference between RCRD and NCRD, up to 7%.

During the initial COVID-19 lockdown, in quarter 2 of 2020, the ED% for RCRD fell by 6% (Asian ethnicity), 7% (Black), 12% (Mixed), 9% (Other), 10% (Unknown) and 6% (White). Some ethnicities, such as Mixed and Other, had a volatile trend over time; this is likely to be at least partly due to small numbers in the data.

Comparison by NHSE region

Figure 7: The early diagnosis proportion (ED%) for patients in the RCRD and the NCRD, by NHSE region.

Figure 7 shows the ED% for patients in the RCRD compared to patients in the NCRD split by NHS England region. Across the whole time series, the match between RCRD and NCRD was relatively good; however, there were some inconsistencies post-COVID.

During the initial Covid-19 lockdown, in Quarter 2 of 2020, the ED% for RCRD fell by 4%, 8%, 6%, 6%, 6%, 7% and 8% for regions East of England, London, Midlands, North East and Yorkshire, North West, South East and South West, respectively.

Comparison by Cancer Alliance

[Figure panels: Alliance 1-12; Alliance 13-21]

Figures 8a-b: The early diagnosis proportion (ED%) for patients in the RCRD and the NCRD, by Cancer Alliance.

Figures 8a-b show the ED% by Cancer Alliance for patients in the RCRD compared to the NCRD. Across all alliances, breast has the highest ED% values in both data sources, and North Central London, North East London and South East London had a larger variation between NCRD and RCRD ED% values than the other alliances. For colorectal, North Central London has a large variation between the NCRD and RCRD ED% from 2020 to 2022, and Thames Valley and West Midlands show a larger variation between NCRD and RCRD than for the other main tumour sites in those alliances. Lung, by contrast, shows closer agreement in ED% between the two datasets. As with the variation by ICB (below), the largest inter-alliance variation is in prostate cancer. Larger variations between the two datasets are seen in the financial year 2020-2021.

Comparison by ICB

[Figure panels: ICB 1-12; ICB 13-24; ICB 25-36; ICB 37-42]

Figures 9a-d: The early diagnosis proportion (ED%) for patients in the RCRD and the NCRD, by ICB.

Looking at the ED% in the RCRD and NCRD at ICB level, Figures 9a-d show that for breast in the 2020-2021 financial year, South East London had the greatest variation between RCRD and NCRD proportions. For prostate, Nottinghamshire had more variation than other ICBs, and for colorectal there are differences in ED% values between the two data sources for the Birmingham, Black Country, Oxfordshire and Devon ICBs. As in the alliance-level breakdown, lung did not have as much variation between the two datasets.

Of the four cancer sites pictured, the inter-ICB variation in ED% is substantially larger for prostate cancer than for the others.

Stability of relationship between RCRD and NCRD

Due to the different methodologies used in their creation, it is not surprising that the early diagnosis proportions observed in the two datasets are offset from each other. This offset is quantified below and broken down by available clinical and demographic factors.

The period from 2019-04 to the most recent data available in both datasets is selected. The difference in the overall ED% for all staged cancers in each dataset is calculated. This includes cancers in the ‘Other’ category in the NCRD, which have no direct equivalent in the RCRD, as the intention is to explore the relationship between ED% as reported from each dataset rather than to make the closest comparison possible.

The change in this offset over time is also explored using a linear fit for the difference in ED% between the two datasets. To maintain independence between data points, the monthly data is used in the fit rather than the 12-month rolling proportion discussed below.

Figure 10: The difference in the early diagnosis proportion (ED%) for patients in the RCRD and the NCRD for all staged cancers.

Figure 10 shows the difference in ED% between the RCRD and NCRD for all staged cancers. The mean difference between the two datasets over all the months in this period is 1.7% (inter-quartile range [IQR] 1.6-1.8%). The gradient of the fit line shows a divergence of less than 0.1% per year between the two, with a p-value of 0.724.

Figure 11: The difference in the early diagnosis proportion (ED%) for patients in the RCRD and the NCRD for all staged cancers excluding prostate.

Figure 11 shows the difference in ED% between the RCRD and NCRD for all staged cancers excluding prostate cancer. In this case the mean difference is 2.2% (IQR 2.1-2.3%) and the gradient is 0.1% per year with a p-value of 0.367.

Table 1, below, shows the same measures for each individual cancer site, demographic variable and geography. Both positive and negative gradients are observed for different cancer sites, demographic strata and different geographies.

Numeric comparison of NCRD and RCRD data from 2019-04
Variable | Value | Mean difference % | 25th Percentile % | 75th Percentile % | Yearly change % | p-value for trend
General | All | 1.7 | 1.6 | 1.8 | 0.0 | 0.724
General | All excl. Prostate | 2.2 | 2.1 | 2.3 | 0.1 | 0.367
Gender | Male | 0.6 | 0.2 | 0.8 | 0.2 | 0.019
Gender | Female | 2.4 | 2.4 | 2.5 | 0.0 | 0.630
Cancer type | Bladder | 7.0 | 6.3 | 8.0 | -0.9 | 0.001
Cancer type | Breast | 0.1 | -0.1 | 0.3 | -0.1 | 0.272
Cancer type | Cervical | -0.9 | -2.4 | -0.7 | -1.8 | 0.000
Cancer type | Colorectal | 2.4 | 2.3 | 2.6 | -0.3 | 0.033
Cancer type | Kidney | -1.0 | -1.5 | -0.7 | -0.3 | 0.274
Cancer type | Lung | -1.6 | -2.0 | -1.1 | -0.2 | 0.020
Cancer type | Lymphoma | 3.3 | 2.9 | 3.7 | -0.2 | 0.366
Cancer type | Melanoma | 4.4 | 3.7 | 5.0 | -0.2 | 0.327
Cancer type | Oesophagus | 8.2 | 7.7 | 8.8 | 0.0 | 0.995
Cancer type | Ovary | -0.4 | -0.8 | 0.1 | -0.5 | 0.080
Cancer type | Pancreas | 1.9 | 1.4 | 2.6 | 0.1 | 0.646
Cancer type | Prostate | -0.3 | -0.7 | 0.1 | 0.1 | 0.378
Cancer type | Stomach | 3.2 | 1.8 | 4.5 | 0.8 | 0.022
Cancer type | Uterine | -0.9 | -1.4 | -0.2 | -0.3 | 0.122
Ethnicity | Asian | 2.4 | 2.1 | 2.6 | 0.0 | 0.951
Ethnicity | Black | 1.8 | 1.6 | 2.3 | 0.3 | 0.177
Ethnicity | Mixed and Other | 1.3 | 1.0 | 1.6 | -0.4 | 0.248
Ethnicity | Unknown | 1.3 | 0.9 | 1.7 | 0.3 | 0.154
Ethnicity | White | 1.7 | 1.6 | 1.8 | 0.0 | 0.816
Alliance | Cheshire and Merseyside | 0.7 | 0.3 | 1.1 | 0.2 | 0.151
Alliance | East Midlands | 1.8 | 1.1 | 2.3 | -0.5 | 0.006
Alliance | East of England North | 1.5 | 1.1 | 2.0 | 0.2 | 0.445
Alliance | East of England South | 3.4 | 3.1 | 3.6 | -0.1 | 0.704
Alliance | Greater Manchester | 0.5 | 0.3 | 0.7 | 0.1 | 0.700
Alliance | Humber and North Yorkshire | 2.6 | 1.8 | 3.2 | 0.6 | 0.004
Alliance | Kent and Medway | 1.7 | 0.5 | 2.9 | -0.5 | 0.079
Alliance | Lancashire and South Cumbria | 0.3 | -0.1 | 0.7 | -0.4 | 0.090
Alliance | North Central London | 2.7 | 1.8 | 3.4 | 0.2 | 0.501
Alliance | North East London | 1.5 | 0.8 | 2.1 | 0.4 | 0.184
Alliance | North West and South West London | 1.1 | 0.8 | 1.4 | -0.2 | 0.476
Alliance | Northern | 1.6 | 1.1 | 1.9 | 0.1 | 0.446
Alliance | Peninsula | 1.9 | 1.7 | 2.2 | 0.2 | 0.317
Alliance | Somerset, Wiltshire, Avon and Gloucestershire | 2.7 | 2.4 | 3.0 | 0.3 | 0.114
Alliance | South East London | 0.1 | -0.6 | 0.8 | 0.9 | 0.002
Alliance | South Yorkshire and Bassetlaw | 2.3 | 1.2 | 3.3 | -0.2 | 0.661
Alliance | Surrey and Sussex | 0.9 | 0.7 | 1.0 | 0.1 | 0.679
Alliance | Thames Valley | 4.4 | 3.5 | 5.3 | -0.8 | 0.003
Alliance | Wessex | 1.4 | 1.1 | 1.8 | 0.0 | 0.885
Alliance | West Midlands | 1.9 | 1.5 | 2.3 | -0.2 | 0.121
Alliance | West Yorkshire and Harrogate | 1.2 | 0.7 | 1.7 | 0.4 | 0.017
Deprivation | Quintile 1 - most deprived | 1.1 | 1.1 | 1.2 | 0.0 | 0.839
Deprivation | Quintile 2 | 1.5 | 1.4 | 1.7 | 0.1 | 0.537
Deprivation | Quintile 3 | 1.7 | 1.6 | 1.7 | 0.0 | 0.933
Deprivation | Quintile 4 | 1.9 | 1.8 | 2.0 | 0.0 | 0.708
Deprivation | Quintile 5 - least deprived | 2.0 | 1.8 | 2.2 | 0.0 | 0.764
Ageband | 00-49 | 0.4 | 0.0 | 0.9 | -0.1 | 0.556
Ageband | 50-59 | 2.1 | 1.8 | 2.4 | 0.2 | 0.083
Ageband | 60-69 | 2.0 | 1.8 | 2.3 | 0.1 | 0.276
Ageband | 70-79 | 1.6 | 1.6 | 1.8 | 0.1 | 0.530
Ageband | 80+ | 2.2 | 1.9 | 2.5 | -0.2 | 0.009
ICB | NHS Bath and North East Somerset, Swindon and Wiltshire Integrated Care Board | 1.7 | 1.2 | 2.3 | 0.5 | 0.141
ICB | NHS Bedfordshire, Luton and Milton Keynes Integrated Care Board | 4.3 | 3.8 | 4.9 | -0.2 | 0.437
ICB | NHS Birmingham and Solihull Integrated Care Board | 3.1 | 2.4 | 3.7 | 0.0 | 0.987
ICB | NHS Black Country Integrated Care Board | 1.4 | 0.5 | 2.3 | -0.5 | 0.139
ICB | NHS Bristol, North Somerset and South Gloucestershire Integrated Care Board | 4.6 | 3.9 | 5.4 | 0.5 | 0.219
ICB | NHS Buckinghamshire, Oxfordshire and Berkshire West Integrated Care Board | 4.4 | 3.5 | 5.3 | -0.8 | 0.003
ICB | NHS Cambridgeshire and Peterborough Integrated Care Board | 1.6 | 1.3 | 1.9 | -0.3 | 0.479
ICB | NHS Cheshire and Merseyside Integrated Care Board | 0.7 | 0.3 | 1.1 | 0.2 | 0.151
ICB | NHS Cornwall and the Isles of Scilly Integrated Care Board | 1.1 | 0.3 | 2.1 | -0.1 | 0.800
ICB | NHS Coventry and Warwickshire Integrated Care Board | 1.4 | 1.0 | 1.8 | 0.1 | 0.699
ICB | NHS Derby and Derbyshire Integrated Care Board | 1.0 | 0.6 | 1.5 | -0.6 | 0.091
ICB | NHS Devon Integrated Care Board | 2.3 | 1.7 | 3.2 | 0.4 | 0.128
ICB | NHS Dorset Integrated Care Board | 0.7 | -0.2 | 1.4 | 0.5 | 0.066
ICB | NHS Frimley Integrated Care Board | 2.6 | 2.2 | 3.2 | -0.3 | 0.492
ICB | NHS Gloucestershire Integrated Care Board | 2.2 | 1.2 | 2.8 | -0.4 | 0.301
ICB | NHS Greater Manchester Integrated Care Board | 0.5 | 0.3 | 0.6 | 0.1 | 0.655
ICB | NHS Hampshire and the Isle of Wight Integrated Care Board | 1.8 | 1.3 | 2.3 | -0.2 | 0.309
ICB | NHS Herefordshire and Worcestershire Integrated Care Board | 1.7 | 1.3 | 2.3 | 0.4 | 0.237
ICB | NHS Hertfordshire and West Essex Integrated Care Board | 1.8 | 1.6 | 2.1 | 0.2 | 0.619
ICB | NHS Humber and North Yorkshire Integrated Care Board | 2.6 | 1.8 | 3.2 | 0.6 | 0.004
ICB | NHS Kent and Medway Integrated Care Board | 1.7 | 0.5 | 2.9 | -0.5 | 0.079
ICB | NHS Lancashire and South Cumbria Integrated Care Board | 0.3 | -0.1 | 0.7 | -0.4 | 0.090
ICB | NHS Leicester, Leicestershire and Rutland Integrated Care Board | 3.0 | 1.8 | 3.9 | -1.0 | 0.002
ICB | NHS Lincolnshire Integrated Care Board | 0.4 | -0.6 | 1.3 | -0.6 | 0.031
ICB | NHS Mid and South Essex Integrated Care Board | 4.2 | 3.1 | 5.0 | -0.2 | 0.709
ICB | NHS Norfolk and Waveney Integrated Care Board | 1.7 | 1.3 | 2.1 | 0.2 | 0.560
ICB | NHS North Central London Integrated Care Board | 2.7 | 1.8 | 3.4 | 0.2 | 0.501
ICB | NHS North East and North Cumbria Integrated Care Board | 1.6 | 1.1 | 1.9 | 0.1 | 0.446
ICB | NHS North East London Integrated Care Board | 1.5 | 0.8 | 2.1 | 0.4 | 0.184
ICB | NHS North West London Integrated Care Board | 1.4 | 0.7 | 2.0 | -0.6 | 0.041
ICB | NHS Northamptonshire Integrated Care Board | 1.6 | 0.6 | 2.5 | 0.7 | 0.078
ICB | NHS Nottingham and Nottinghamshire Integrated Care Board | 3.6 | 2.7 | 4.5 | -0.7 | 0.026
ICB | NHS Shropshire, Telford and Wrekin Integrated Care Board | 0.6 | -1.0 | 2.0 | -1.1 | 0.004
ICB | NHS Somerset Integrated Care Board | 2.7 | 2.2 | 3.2 | 0.5 | 0.201
ICB | NHS South East London Integrated Care Board | 0.1 | -0.6 | 0.8 | 0.9 | 0.002
ICB | NHS South West London Integrated Care Board | 0.9 | 0.4 | 1.4 | 0.4 | 0.243
ICB | NHS South Yorkshire Integrated Care Board | 2.0 | 1.0 | 3.0 | 0.0 | 0.898
ICB | NHS Staffordshire and Stoke-on-Trent Integrated Care Board | 2.6 | 1.8 | 3.4 | -0.6 | 0.087
ICB | NHS Suffolk and North East Essex Integrated Care Board | 1.5 | 0.6 | 2.5 | 0.4 | 0.284
ICB | NHS Surrey Heartlands Integrated Care Board | -0.1 | -0.5 | 0.5 | 0.4 | 0.242
ICB | NHS Sussex Integrated Care Board | 0.8 | 0.6 | 0.9 | 0.0 | 0.929
ICB | NHS West Yorkshire Integrated Care Board | 1.2 | 0.7 | 1.7 | 0.4 | 0.017
Note:
p-values below 0.05 are highlighted in red

Table 1: The difference in the early diagnosis proportion (ED%) for patients in the RCRD and the NCRD by demographic and clinical factors.

Early-diagnosis proportion from RCRD by time of publication

The RCRD data is updated monthly, resulting in changes to the ED% reported for a given month. The stability of the ED% between monthly snapshots of the RCRD data is assessed by comparing each monthly ED% for a given snapshot with the respective ED% from the previous snapshot, and calculating the standard deviation for all snapshot-to-snapshot monthly differences. All RCRD data presented here for comparison is for all staged cancers in England.
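
The snapshot-to-snapshot comparison described above can be sketched as follows. This is a minimal illustration with made-up values: each snapshot is represented as a dictionary mapping an integer month index to that month's ED%, and differences are grouped by how many months a data month lies behind the most recent month shared by the two snapshots. That grouping, the data layout and the function name `snapshot_stability` are assumptions made for illustration.

```python
import statistics
from collections import defaultdict


def snapshot_stability(snapshots):
    """Mean and standard deviation of ED% revisions between consecutive snapshots.

    snapshots: list of dicts, oldest first, each mapping month index -> ED%.
    Returns {months_behind_latest_shared_month: (mean_diff, sd_diff)}.
    """
    diffs_by_age = defaultdict(list)
    for previous, current in zip(snapshots, snapshots[1:]):
        shared = sorted(set(previous) & set(current))
        if not shared:
            continue
        latest_shared = shared[-1]
        for month in shared:
            age = latest_shared - month  # 0 = most recent month in both snapshots
            diffs_by_age[age].append(current[month] - previous[month])
    return {
        age: (
            statistics.fmean(diffs),
            statistics.stdev(diffs) if len(diffs) > 1 else 0.0,
        )
        for age, diffs in sorted(diffs_by_age.items())
    }


# Three made-up snapshots; the newest month is revised upwards in the next snapshot.
snapshot_1 = {1: 54.0, 2: 54.5, 3: 53.9}
snapshot_2 = {1: 54.0, 2: 54.5, 3: 54.2, 4: 54.0}
snapshot_3 = {1: 54.0, 2: 54.5, 3: 54.2, 4: 54.4, 5: 53.8}
stability = snapshot_stability([snapshot_1, snapshot_2, snapshot_3])
for age, (mean_diff, sd_diff) in stability.items():
    print(age, round(mean_diff, 2), round(sd_diff, 2))
```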

Figure 12: The change in early diagnosis proportion (ED%) for the same month, for each snapshot

Figure 12 shows the change in the RCRD ED% between a given snapshot and the previous snapshot, for the same calendar month, split by the number of months between publication. The most recent month of data available in both snapshots is the least stable, with a larger average difference (0.31 percentage points, compared to differences of between -0.09 and 0.00 percentage points for other months) and a larger standard deviation (0.26, compared to 0.04 to 0.13 for other months). The latest month generally has the lowest completeness because staging data accumulates over several months after diagnosis and is not necessarily complete at the time of diagnosis.

The average positive difference of 0.31 percentage points for the most recent month shows that the ED% for that month is generally revised upwards. That is, when a new snapshot is released, the ED% for its most recent month is likely to be an underestimate by around 0.31 percentage points.

Figure 13: The change in 12-month early diagnosis rolling proportion (ED%) for the same month, for each snapshot

Figure 13 shows the same comparisons when the analysis is repeated using the ED% calculated as a 12-month rolling average. The average difference for the most recent month available in both snapshots is reduced to 0%. However, the standard deviation for the most recent month is still relatively higher than for older months. This is expected, and indicates that whilst the average values between snapshots are similar, there is still variation for each individual snapshot.
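
To illustrate why a 12-month rolling measure is less sensitive to an incomplete latest month, here is a minimal sketch. It assumes the rolling ED% is a pooled proportion (early-stage cases summed over the window divided by staged cases summed over the window); the document does not state whether the published measure is pooled or an average of monthly proportions. The function name and the counts are hypothetical.

```python
def rolling_ed_proportion(early_counts, staged_counts, window=12):
    """Pooled ED% over each trailing window of `window` months."""
    if len(early_counts) != len(staged_counts):
        raise ValueError("count series must have the same length")
    results = []
    for end in range(window, len(staged_counts) + 1):
        early = sum(early_counts[end - window:end])
        staged = sum(staged_counts[end - window:end])
        results.append(100.0 * early / staged)
    return results


# Made-up counts: 13 months of 100 staged cases each, with the latest
# month understated (52 early cases instead of 55).
staged_counts = [100] * 13
early_counts = [55] * 12 + [52]
rolling = rolling_ed_proportion(early_counts, staged_counts)
print(rolling)
```

In this made-up example the latest single month is 3 percentage points low, while the rolling value that includes it moves by only a quarter of a percentage point.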

References

  1. Henson et al., Data Resource Profile: National Cancer Registration Dataset in England, Int J Epidemiol. 2020 Feb 1;49(1):16-16h. doi: 10.1093/ije/dyz076

  2. National Disease Registration Service. The Rapid Cancer Registration Data set