This report presents the latest data on net cancer survival for Pennsylvanians. Current net survival is estimated using patients diagnosed between 2010 and 2016. The report is organized into these chapters:


Explanation of terms and types of data


Important findings and general survival data


Changes in data from 2001 to 2017


How a cancer’s stage at diagnosis affects survival


How poverty, race, and insurance coverage affect cancer survival


Comparison of net survival rates between counties or health districts

Technical Notes

How the data was collected, analyzed, and presented


Downloadable ZIP archive of all the data in the report


Other books, articles, and websites about cancer survival

Basics of survival analysis

Survival analysis estimates how many people in a group will still be alive after a specific amount of time. For cancer survival, the time is usually 5 years after diagnosis. The ratio of survivors to everyone in the group is called a survival rate.

In the simplest case, the survival rate would be the number who survived divided by the total. However, this would require excluding recently diagnosed people. For example, we currently only know for sure whether somebody with cancer was alive up to the end of 2017. Therefore, we don’t know if anybody diagnosed in 2013 or later survived 5 years.

To include people diagnosed more recently, we break the chosen amount of time into smaller intervals. For example, we could divide 5 years into 1,826 days. We then estimate the chance of death for each day. If we only have data for a patient up to one year after diagnosis, we can still use that data for those days. Finally, we recombine these daily chances to get the 5-year chance of death.
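The interval approach described above can be sketched in Python. This is only an illustration with made-up follow-up data, not the method used in this report (see the Technical Notes for that). The function name `survival_estimate` and the `(days_followed, died)` patient format are assumptions made for the example.

```python
def survival_estimate(patients, horizon_days=1826):
    """Estimate survival to `horizon_days` by combining daily chances of death.

    `patients` is a list of (days_followed, died) tuples (hypothetical format):
      days_followed: days from diagnosis to death or to the last known date
      died: True if follow-up ended with the patient's death
    """
    survival = 1.0
    for day in range(1, horizon_days + 1):
        # Patients we still have data for on this day.
        at_risk = sum(1 for days, _ in patients if days >= day)
        if at_risk == 0:
            break  # no data beyond this day, so the estimate stops here
        deaths = sum(1 for days, died in patients if died and days == day)
        # Multiply in the chance of surviving this one day.
        survival *= 1 - deaths / at_risk
    return survival


# Made-up example: the second patient has only one year of follow-up,
# but still counts toward the daily chances for days 1 through 365.
sample_patients = [(200, True), (365, False), (900, True), (1826, False), (1826, False)]
print(survival_estimate(sample_patients))  # about 0.53
```

A patient with partial follow-up is counted among those at risk for every day they were observed, then drops out of the calculation without being counted as a death.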

Net survival

A net cancer survival rate is what the survival rate would be if there were no deaths unrelated to cancer. You can use this data to focus on cancer-related differences between groups.

Net cancer survival compares the chances of death between people diagnosed with cancer and people without a diagnosis. It splits the daily chance of death into 2 parts, called hazards:

  1. the population hazard experienced by similar people without cancer; and
  2. the net hazard added by cancer.

Net survival only uses the net hazard. See the Technical Notes for details.
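As a rough sketch of this split, the net hazard can be found by subtracting the population hazard from the observed hazard in each interval. The example below uses made-up yearly hazards and assumes the standard relationship that survival equals the exponential of the negative cumulative hazard. The function name `net_survival` is hypothetical, and the report's actual estimator is described in the Technical Notes.

```python
import math


def net_survival(observed_hazards, population_hazards):
    """Net survival from per-interval observed and population hazards."""
    # Net hazard = observed hazard minus the hazard expected without cancer.
    net_hazards = [obs - pop for obs, pop in zip(observed_hazards, population_hazards)]
    # Survival is exp(-cumulative hazard); here only the net hazard is used.
    return math.exp(-sum(net_hazards))


# Made-up yearly hazards over 5 years (illustration only).
observed = [0.10, 0.08, 0.06, 0.05, 0.05]
population = [0.02, 0.02, 0.02, 0.03, 0.03]
print(net_survival(observed, population))  # about 0.80
```

If the observed hazards were lower than the population hazards, the cumulative net hazard would be negative and the result would be greater than 1.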

Figure 1: Illustration of How the Net Hazard Can Be Estimated from the Observed and Population Hazards

Some notes on interpreting net survival in this report:

  • Net survival rates are not actual survival chances.
  • A lower net survival rate means a greater chance of death.
  • This report controls for differences in chance of death from age, race, sex, and year. It does not control for income, lifestyle, or health habits.
  • Net survival rates can go above 100%. This happens when the group with cancer has a lower chance of death than the general population.

Statistical accuracy

A population’s net survival rate comes from a weighted average of its members’ cumulative hazards. We assume the group has a true average that can only be estimated. A confidence interval shows the uncertainty of an estimate. Estimates for larger populations have more certainty.

Nobody can directly observe a group’s net survival. We use confidence intervals to show how precise a net survival estimate is. A wider interval means a less precise estimate.

We say the difference between two rates is “statistically significant” if a difference at least this large would occur less than 5% of the time when the true rates are the same.
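One common way to check for a significant difference is a two-sided z-test on two independent estimates. The sketch below assumes each rate comes with a standard error and that the estimates are approximately normal. It is an illustration, not necessarily the test used in this report, and the function name `difference_p_value` and the example numbers are made up.

```python
import math


def difference_p_value(rate_a, se_a, rate_b, se_b):
    """Two-sided p-value for the difference between two independent rates."""
    # Standard error of the difference between independent estimates.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = (rate_a - rate_b) / se_diff
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up net survival rates and standard errors for two groups.
p = difference_p_value(0.70, 0.02, 0.62, 0.03)
print(p < 0.05)  # True: this difference would count as statistically significant
```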