Introduction

This report presents the latest data on net cancer survival for Pennsylvanians. Current net survival is estimated using patients diagnosed between 2009 and 2015. The report is organized into these chapters:

Introduction: Explanation of terms and types of data
Summary: Important findings and general survival data
Trends: Changes in data from 2001 to 2016
Stage: How a cancer’s stage at diagnosis affects survival
Disparities: How poverty, race and insurance coverage affect cancer survival
Geographic: Comparison of net survival rates between counties or health districts
Technical Notes: How the data was collected, analyzed and presented
Download: Downloadable ZIP archive of all the data in the report
References: Other books, articles and websites about cancer survival

Basics of survival analysis

Survival analysis estimates how many people in a group will still be alive after a specific amount of time. For cancer survival, the time is usually five years after diagnosis. The ratio of survivors to everyone in the group is called a survival rate.

In the simplest case, the survival rate would be the number who survived divided by the total. However, this would require excluding recently diagnosed people. For example, we currently only know for sure whether somebody with cancer was alive up to the end of 2016. That means we don’t yet know if anybody diagnosed after 2011 survived five years.

To include people diagnosed more recently, we break the chosen amount of time into smaller intervals. For example, we could divide five years into 1,826 days. We then estimate the chance of death for each day. If we only have data for a patient up to one year after diagnosis, we can still use that data for those days. Finally, we recombine these daily chances to get the five-year chance of death.
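
The interval approach described above can be sketched in Python. This is a minimal illustration in the style of a product-limit (Kaplan-Meier) calculation, not the method used for this report, which is covered in the Technical Notes. The function name interval_survival and the four example patients are made up for illustration.

```python
from math import prod

def interval_survival(follow_up_days, died, horizon_days=1826):
    """Estimate the chance of surviving `horizon_days` after diagnosis.

    follow_up_days[i]: days patient i was observed after diagnosis.
    died[i]: True if patient i died on the last observed day,
             False if follow-up simply ended while the patient was alive.
    """
    daily_survival = []
    for day in range(1, horizon_days + 1):
        # Patients still under observation on this day.
        at_risk = sum(1 for t in follow_up_days if t >= day)
        if at_risk == 0:
            raise ValueError(f"no patients were followed to day {day}")
        deaths = sum(1 for t, d in zip(follow_up_days, died) if d and t == day)
        # Chance of surviving this day, given survival up to it.
        daily_survival.append(1 - deaths / at_risk)
    # Recombine the daily chances into one chance for the whole period.
    return prod(daily_survival)

# Hypothetical patients: one died on day 100, one was only followed for a
# year, one died on day 500, and one was still alive after day 1,826.
follow_up = [100, 365, 500, 2000]
died = [True, False, True, False]
five_year_survival = interval_survival(follow_up, died)  # 0.375
five_year_chance_of_death = 1 - five_year_survival       # 0.625
```

The patient followed for only one year still counts toward the daily estimates for days 1 through 365, which is the point of splitting the period into intervals.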

Net survival

A net cancer survival rate is what the survival rate would be if there were no deaths unrelated to cancer. You can use this data to focus on cancer-related differences between groups.

Net cancer survival compares the chances of death between people diagnosed with cancer and people without a diagnosis. It splits the daily chance of death into two parts, called hazards:

the population hazard experienced by similar people without cancer; and
the net hazard added by cancer.

Net survival only uses the net hazard. See the Technical Notes for details.

Figure 1: Illustration of How the Net Hazard Can Be Estimated from the Observed and Population Hazards

Some notes on interpreting net survival in this report:

Net survival rates are not actual survival chances.
A lower net survival rate means a greater chance of death.
This report controls for differences in chance of death from age, race, sex and year. It does not control for income, lifestyle or health habits.
Net survival rates can go above 100 percent. This happens when the group with cancer has less of a chance of death than the general population.
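
The split into population and net hazards described in this section can be sketched in Python. This is a simplified illustration, not the estimator used for this report (see the Technical Notes for that). The function name net_survival and all hazard values are made up, and the sketch assumes the hazards are already available for each interval.

```python
from math import exp

def net_survival(observed_hazards, population_hazards):
    """Net survival from per-interval hazards.

    observed_hazards[i]: hazard of death from any cause among the
                         patients in interval i.
    population_hazards[i]: hazard for similar people without cancer
                           in the same interval.
    """
    if len(observed_hazards) != len(population_hazards):
        raise ValueError("hazard lists must cover the same intervals")
    # The net hazard is the part of the observed hazard added by cancer.
    net_hazards = [obs - pop for obs, pop in zip(observed_hazards, population_hazards)]
    # Survival is exp(-cumulative hazard); only the net hazard is summed.
    return exp(-sum(net_hazards))

# Hypothetical yearly hazards over five years.
typical = net_survival([0.10, 0.08, 0.06, 0.05, 0.05], [0.02] * 5)  # about 0.79

# If patients have a lower chance of death than the general population,
# the net hazard is negative and net survival rises above 100 percent.
unusual = net_survival([0.01] * 5, [0.02] * 5)  # about 1.05
```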

Statistical accuracy

A population’s net survival rate comes from a weighted average of its members’ cumulative hazards. Nobody can directly observe a group’s true net survival, so we assume there is a true average for the group that can only be estimated. A confidence interval shows how precise an estimate is: a wider interval means a less precise estimate. Estimates for larger populations are generally more precise.
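
The link between group size and interval width can be sketched in Python. As a stand-in, this example treats a survival rate as a simple proportion and uses a normal approximation. That is an assumption made for illustration and not how the intervals in this report are calculated. The function name and the counts are hypothetical.

```python
from math import sqrt

def proportion_confidence_interval(survivors, total, z=1.96):
    """Approximate 95 percent confidence interval for a simple proportion."""
    rate = survivors / total
    standard_error = sqrt(rate * (1 - rate) / total)
    return rate - z * standard_error, rate + z * standard_error

# Same 60 percent rate, estimated from a small and a large group.
small_low, small_high = proportion_confidence_interval(60, 100)
large_low, large_high = proportion_confidence_interval(6000, 10000)

small_width = small_high - small_low  # wider: less precise
large_width = large_high - large_low  # narrower: more precise
```

With 100 times as many people, the interval in this sketch is one-tenth as wide.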

We say the difference between two rates is “statistically significant” if a difference at least that large would occur by chance less than 5 percent of the time when the true rates are the same.
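
One common way to check a difference against the 5 percent threshold is a two-sided z-test, sketched below in Python. It assumes the two estimates are independent and approximately normal, and that a standard error is available for each. The report's own test may differ. The function name and all numbers are hypothetical.

```python
from math import erfc, sqrt

def difference_p_value(rate_a, se_a, rate_b, se_b):
    """Two-sided p-value for the difference between two independent rates."""
    z = (rate_a - rate_b) / sqrt(se_a ** 2 + se_b ** 2)
    # Chance of a difference at least this large if the true rates are equal.
    return erfc(abs(z) / sqrt(2))

# Hypothetical rates, each with a standard error of 2 percentage points.
p_large_gap = difference_p_value(0.60, 0.02, 0.66, 0.02)
p_small_gap = difference_p_value(0.60, 0.02, 0.62, 0.02)

large_gap_significant = p_large_gap < 0.05  # True
small_gap_significant = p_small_gap < 0.05  # False
```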

Net Cancer Survival in Pennsylvania

Pennsylvania Division of Health Informatics

October 2019
