Breakthrough Infections Following Vaccination Against COVID-19: Descriptive Statistics From the COVID-19 Research Database

Alexandra Muir1, Etienne Holder1, Claire Cravero1, Sara Rogovin1, Mohak Jain1, Christine Horne1, Victor Cai1, Tim Suther2, Martin Aboitiz 3, Mark Cullen4,5,6

1Datavant Inc., San Francisco, CA, USA

2Change Healthcare, Nashville, TN, USA

3Healthjump, Inc., Philadelphia, PA, USA

4COVID-19 Research Database Scientific Steering Committee

5Department of Medicine, Stanford University, CA, USA

6Division of Primary Care and Population Health, Stanford University, CA, USA


As variants of COVID-19 emerge, such as the now-widespread Delta variant, questions remain about the long-term efficacy of COVID-19 vaccines. Recently, concerns about “breakthrough infections” (infections after a person has been fully vaccinated) have dominated the media. Current estimates of breakthrough infection rates in the United States range from 0.01% to 0.54%, but comprehensive data from all 50 states are lacking. As a result, health policy decision-making in the United States currently relies on studies from Israel and other geographies with comprehensive data capture.


Based on evidence from the Israeli experience, the Centers for Disease Control and Prevention began recommending booster shots eight months following full vaccination to shore up immunity, with plans to begin shots this month. However, some scientists have noted that the push for booster shots may be premature, as the data on breakthrough infections suggest that the course of illness is much milder than in unvaccinated individuals.


There is a pressing need for more comprehensive surveillance of breakthrough infections in the United States. We need to better understand the rate of breakthrough infections, the course of illness following breakthrough infection, and the long-term efficacy of different vaccines. The availability of these data is critical to public health policy regarding booster shots and their timing.


This post provides descriptive statistics concerning breakthrough infections and the preliminary trends we are seeing in the data currently available in the COVID-19 Research Database. This should serve as an orientation to the data available in this accessible resource.


Data Source

The COVID-19 Research Database is the product of a pro-bono, cross-industry collaborative, composed of institutions donating technology services, healthcare expertise, and de-identified data. It is currently the largest secure repository of HIPAA-compliant, de-identified, and limited patient-level data sets, which have been made available to public health and policy researchers to further their investigations of the direct and indirect effects of the COVID-19 pandemic.


We used electronic health records, claims data, and mortality data currently available in the COVID-19 Research Database. Electronic health record data included diagnoses, procedures, labs, vitals, medications, and histories sourced from participating members of the Healthjump network. A claims database from Change Healthcare and a separate claims database were also used for analyses. We identified COVID-19 vaccine administrations based on CVX or CPT codes (see note 3), and breakthrough infections (see note 2) based on ICD-10 codes with a diagnosis date at least 14 days after the date of full vaccination (see note 1). Restricting to diagnoses made 14 or more days after full vaccination better characterizes those who were infected despite achieving full immunity. Mortality data are provided by Datavant. The mortality data source contains obituary data sourced from online newspapers, funeral homes, online memorials, direct submissions, and more. The mortality data do not include cause of death; given this missing data and the design of this observational study, we can make no causal inference between COVID-19 breakthrough infection and death.


To track each patient’s journey from vaccination to potential breakthrough infection or death, we connected records within and across datasets using privacy-preserving record linkage (PPRL).
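As a rough illustration of what linking records across datasets can look like, the sketch below groups records that share a de-identified token and orders each patient's events by date. The `token`, `date`, and `event` field names are hypothetical, and the sketch does not show how PPRL tokens are actually generated or matched in the COVID-19 Research Database.

```python
from collections import defaultdict


def link_by_token(*datasets):
    """Group records from several datasets by a shared de-identified token.

    Each record is a dict with hypothetical keys "token", "date" (ISO
    string), and "event". Returns {token: [records sorted by date]}.
    """
    journeys = defaultdict(list)
    for records in datasets:
        for record in records:
            journeys[record["token"]].append(record)
    for events in journeys.values():
        # ISO-formatted dates sort chronologically as plain strings.
        events.sort(key=lambda r: r["date"])
    return dict(journeys)


# Hypothetical records from two sources that share the token "abc".
ehr = [{"token": "abc", "date": "2021-02-22", "event": "dose 2"}]
claims = [
    {"token": "abc", "date": "2021-02-01", "event": "dose 1"},
    {"token": "xyz", "date": "2021-03-10", "event": "dose 1"},
]
linked = link_by_token(ehr, claims)
print([r["event"] for r in linked["abc"]])
```

Grouping on a token rather than on names or birth dates is what allows a patient's vaccination, diagnosis, and mortality records to be connected without exposing identifying fields.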


There are limitations to the current dataset, and all results presented herein should be interpreted within the context of these limitations. First, the COVID-19 Research Database does not currently collect data from mass vaccination sites as provided by local health departments; it includes only vaccine doses captured in claims and electronic health records. Second, individuals seeking vaccination at their local doctor’s office may be, on average, sicker and more likely to seek medical care for any symptoms following vaccination. The rates presented below are therefore valuable information, but they may not be generalizable to the entire United States population.


Rates presented below are crude rates and percentages and have not been adjusted for rates of COVID-19 or Delta variant transmission within local communities. We expect that future studies using the COVID-19 Research Database can rigorously test hypotheses based on the descriptive statistics presented below.


Additionally, as with any study examining vaccine effectiveness at this point in time, the available data may not yet extend far enough to reveal the true rate of breakthrough infections. Because most of the data presented here are concentrated in February, March, and April, we will most likely have to wait until the winter months of 2021 to get a better sense of that rate.


As a note, new data is added to the COVID-19 Research Database weekly, providing real-time real-world data relating to the ongoing COVID-19 pandemic. For this post, analyses were conducted on August 17th, 2021.


1 To determine the date on which a patient was fully vaccinated, we used the most recent vaccine administration on record for that patient. For example, if two administrations of the Pfizer vaccine were present for a patient, the date of the later administration was taken as the date of full vaccination. If an individual received the Pfizer or Moderna vaccine, two distinct doses needed to be present for that individual to be included in analyses.

2 Breakthrough COVID-19 infections were defined as a record with an ICD-10 code of U07.1 or U07.2 and a diagnosis date at least 14 days after the date of the second dose (or the first dose in the case of Janssen).

3 Vaccine doses were defined as records with a CVX code of 207 (Moderna), 208 (Pfizer), or 212 (Janssen), or a CPT code of 91300, 91301, or 91303. Vaccine records for individuals under the age of 12 or dated before December 11th, 2020 were filtered out, as there is a high likelihood that these records are coding mistakes rather than actual vaccination records. Patients who received Moderna or Pfizer needed a distinct first and second dose to be included in analyses.
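The three definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline actually used: the record layout (tuples of CVX code and date), the function names, and the simplification that all of a patient's doses are the same product are hypothetical, and CPT codes are omitted for brevity.

```python
from datetime import date, timedelta

# Codes and cutoffs as listed in the notes above.
CVX_TO_VACCINE = {"207": "Moderna", "208": "Pfizer", "212": "Janssen"}
DOSES_REQUIRED = {"Moderna": 2, "Pfizer": 2, "Janssen": 1}
BREAKTHROUGH_ICD10 = {"U07.1", "U07.2"}
ROLLOUT_START = date(2020, 12, 11)
MIN_AGE = 12
IMMUNITY_WINDOW = timedelta(days=14)


def full_vaccination_date(doses, age):
    """Return (vaccine, date of last dose), or None if not fully vaccinated.

    doses: list of (cvx_code, administration_date) tuples for one patient.
    Assumption: all of a patient's doses are the same product; the product
    of the most recent valid dose is used.
    """
    if age < MIN_AGE:
        return None
    valid = [(CVX_TO_VACCINE[c], d) for c, d in doses
             if c in CVX_TO_VACCINE and d >= ROLLOUT_START]
    if not valid:
        return None
    vaccine = max(valid, key=lambda item: item[1])[0]
    # Distinct dates, so a duplicated record does not count as a second dose.
    distinct_dates = sorted({d for v, d in valid if v == vaccine})
    if len(distinct_dates) < DOSES_REQUIRED[vaccine]:
        return None
    return vaccine, distinct_dates[-1]


def is_breakthrough(icd10_code, diagnosis_date, fully_vaccinated_on):
    """True if a COVID-19 diagnosis falls 14 or more days after full vaccination."""
    return (icd10_code in BREAKTHROUGH_ICD10
            and diagnosis_date >= fully_vaccinated_on + IMMUNITY_WINDOW)


pfizer = [("208", date(2021, 2, 1)), ("208", date(2021, 2, 22))]
print(full_vaccination_date(pfizer, age=40))
print(is_breakthrough("U07.1", date(2021, 3, 8), date(2021, 2, 22)))
```

Counting distinct administration dates, rather than raw records, is one way to keep the same dose appearing in both a claims and an EHR source from being mistaken for a completed two-dose series.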



Demographic Information

The electronic health record database contained 755,834 vaccine doses, claims database #1 contained 662,320 vaccine doses, and claims database #2 contained 3,718,724 vaccine doses. In the combined, de-duplicated dataset there were records of a distinct first and second dose of the vaccine (or one dose in the case of Janssen) from 1,821,955 individuals. The average age of individuals was 48.2 years (range: 12 to 89, with ages top-coded at 89; Figure 1). Figure 2 displays the number of vaccine records from the electronic health record database by state to help characterize the density of coverage across the United States. Location information was not available in the two claims databases.

Figure 1

Figure 2: Number of vaccine records separated by state. The number of records per state ranges from 3 to 75,024. There are differences in the distribution of records compared to the real distribution of individuals in the United States. The current sample is not a perfect representation of the United States.


Splitting by vaccine type, 158,977 individuals received Janssen (8.7%), 914,063 individuals received Pfizer (50.0%), and 755,101 received Moderna (41.3%). All individuals included in this sample who received Moderna or Pfizer vaccines had a distinct first and second dose.
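The shares by vaccine type can be reproduced from the three counts above. The short check below is only illustrative and assumes the percentages are taken relative to the sum of the three per-vaccine counts.

```python
# Individuals per vaccine type, as reported above.
counts = {"Janssen": 158_977, "Pfizer": 914_063, "Moderna": 755_101}

total = sum(counts.values())
shares = {vaccine: round(100 * n / total, 1) for vaccine, n in counts.items()}

print(shares)  # {'Janssen': 8.7, 'Pfizer': 50.0, 'Moderna': 41.3}
```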


Breakthrough Infections

So far, among those with documented full vaccination, there have been 1,194 breakthrough cases (0.07%) in the combined dataset. The average time to breakthrough infection was 70.3 days (Figure 3). Split by vaccine type, the average time to breakthrough infection after full immunity was 46.5 days for Janssen, 81.1 days for Pfizer, and 79.8 days for Moderna (Figure 4). Figure 5 displays the rate of breakthrough infection by state in the electronic health record database.
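The overall average is consistent with a case-weighted mean of the three per-vaccine averages. The sketch below checks this using the per-vaccine breakthrough counts reported in Table 1; it is a sanity check on the published figures, not a description of how the authors computed them.

```python
# (breakthrough cases from Table 1, average days to breakthrough) per vaccine.
by_vaccine = {
    "Janssen": (352, 46.5),
    "Pfizer": (270, 81.1),
    "Moderna": (572, 79.8),
}

total_cases = sum(cases for cases, _ in by_vaccine.values())
# Weight each vaccine's average by its number of breakthrough cases.
overall_mean_days = sum(cases * days for cases, days in by_vaccine.values()) / total_cases

print(total_cases, round(overall_mean_days, 1))  # 1194 70.3
```

Because Janssen contributes a large share of cases with a short average interval, it pulls the overall mean well below the Pfizer and Moderna averages.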

Figure 3: Density on the y-axis represents the percent of data falling at each day following full immunization. All bars will sum to 1.

Figure 4

Figure 5: States with fewer than 100 records in the dataset were removed and marked in gray, as accurate breakthrough rates would be difficult to estimate from such a small sample. State information was only available in the electronic health record database.


Of the individuals who received vaccination in the combined dataset, 1,173 (0.06%) eventually passed away. Only 4 (<0.001%) of these individuals had a record of breakthrough infection. To reiterate, the cause of death is not available in the mortality dataset, so we cannot assume that these individuals passed away due to their breakthrough infection (data not shown).


All rates of breakthrough infection were well below 1% (range: 0.06% to 0.22%), following the trends seen in other data in the United States. Janssen had the highest crude incidence rate of breakthrough infections, followed by Moderna, and then Pfizer (Table 1).

| Vaccine Type | % of Sample | Count of Breakthrough Infections | Count of Vaccines | Rate of Breakthrough Infection (%) | Crude Incidence Rate (per 100,000 person-days) |
| Janssen | 8.7% | 352 | 158,977 | 0.22% | 1.7 |
| Pfizer | 50.0% | 270 | 914,063 | 0.06% | 0.2 |
| Moderna | 41.3% | 572 | 755,101 | 0.15% | 0.6 |

Table 1: Rate of breakthrough infections by vaccine type. Please note that these numbers have not been adjusted for well-known confounding factors, and therefore should be interpreted with caution.


To address the question of how vaccine efficacy changes over time, Table 2 displays the crude rate of breakthrough infections by the month in which full immunization was received. As expected, those vaccinated in the first months of 2021 (January through March) show the highest crude rates of breakthrough cases; only 57 individuals were fully vaccinated in December, none of whom had a recorded breakthrough infection.

| Month of Full Immunization | Total Individuals Fully Vaccinated | Breakthrough Cases | Crude Rate of Breakthrough Cases (%) | Average Days Until Breakthrough | Crude Incidence Rate (per 100,000 person-days) |
| December | 57 | 0 | 0 | N/A | N/A |
| January | 8,801 | 9 | 0.1 | 123.6 | 0.5 |
| February | 212,802 | 210 | 0.1 | 104.6 | 0.5 |
| March | 370,740 | 355 | 0.1 | 73.9 | 0.6 |
| April | 644,449 | 456 | 0.07 | 60.7 | 0.6 |
| May | 370,876 | 107 | 0.03 | 47.0 | 0.3 |
| June | 160,363 | 47 | 0.03 | 34.6 | 0.4 |
| July | 55,480 | 10 | 0.02 | 24.7 | 0.5 |
| August | 4,573 | N/A | N/A | N/A | N/A |

Table 2: Crude rates of breakthrough infection stratified by month vaccine was received. Breakthrough infection was defined as any record of COVID-19 diagnosis between the time of the second dose and August 17th, 2021.
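For readers who want to reproduce the two rate columns, the sketch below shows the standard calculations: a crude rate as cases divided by individuals, and a crude incidence rate as cases divided by person-days of follow-up. The monthly counts are taken from Table 2. Person-days are not reported in this post, so the value passed to the incidence function is a hypothetical round number chosen only to show the arithmetic, and the function names are illustrative.

```python
def crude_rate_percent(cases, individuals):
    """Breakthrough cases as a percentage of fully vaccinated individuals."""
    return 100 * cases / individuals


def incidence_per_100k_person_days(cases, person_days):
    """Cases per 100,000 person-days of follow-up."""
    return 100_000 * cases / person_days


# (individuals fully vaccinated, breakthrough cases) from Table 2.
table2 = {
    "January": (8_801, 9),
    "February": (212_802, 210),
    "March": (370_740, 355),
    "April": (644_449, 456),
    "May": (370_876, 107),
    "June": (160_363, 47),
    "July": (55_480, 10),
}

for month, (individuals, cases) in table2.items():
    print(f"{month}: {crude_rate_percent(cases, individuals):.2f}%")

# Hypothetical follow-up total, NOT a figure from the database:
# 9 cases over 1,800,000 person-days.
print(incidence_per_100k_person_days(9, 1_800_000))  # 0.5
```

The incidence rate accounts for how long each cohort has been observed, which is why it varies much less across months than the crude percentage does.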


Those who received the Janssen vaccine seem to have higher crude rates of breakthrough infections, along with those who received Moderna in the early months of vaccine roll-out (Figure 6).

Figure 6: The percentage reported above is the percent of individuals who received their full immunization in the month displayed on the x-axis that had a breakthrough infection before August 17th, 2021. 


Surprisingly, the bulk of breakthrough infections for all months and all vaccine types happened relatively soon after full immunization (Figure 7). It is important to note that many of these “early” breakthroughs may reflect the large surge in cases during the winter and spring months, as these data have not been adjusted for location, date, or the extent of circulating Delta variant. Moreover, most vaccinated individuals have had six months or less of follow-up after vaccination.

Figure 7: This figure only displays those who had a breakthrough infection after full immunization. Rates were calculated by dividing the number of breakthrough infections in a particular month and vaccination cohort (color of the bar) by the sum of all breakthrough infections for the vaccination cohort. As such, the bars of each color sum to 100%.
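The normalization described in the Figure 7 caption can be sketched as follows. The counts in the example are hypothetical and are used only to show that each cohort's percentages sum to 100; the function name and data layout are illustrative.

```python
def within_cohort_percent(counts):
    """Convert breakthrough counts to percentages within each cohort.

    counts: {vaccination_cohort: {month_of_infection: n_breakthroughs}}
    Returns the same structure with each cohort's values summing to 100.
    """
    result = {}
    for cohort, by_month in counts.items():
        cohort_total = sum(by_month.values())
        if cohort_total == 0:
            continue  # no breakthroughs in this cohort, nothing to plot
        result[cohort] = {
            month: 100 * n / cohort_total for month, n in by_month.items()
        }
    return result


# Hypothetical counts for one vaccination cohort, not taken from the database.
example = {"February": {"March": 30, "April": 50, "May": 20}}
percentages = within_cohort_percent(example)
print(percentages["February"])
```

Because each cohort is normalized to its own total, the figure compares the timing of breakthroughs across cohorts, not their overall frequency.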



This post presented preliminary descriptive statistics regarding rates of breakthrough infections following vaccination against COVID-19. Specifically, this post aimed to describe how crude rates of breakthrough infection differ by vaccine type received and timing of full immunization.


In sum, the overall rate of breakthrough infection was very low, with almost all estimates being well below 1%. This is encouraging, suggesting that even months after vaccination, breakthrough infections are not common. Additionally, although rates of breakthrough infections in those vaccinated in the early months of vaccine rollout seem higher than those vaccinated later, all rates remain under 1%.


As the Delta variant becomes the dominant variant being transmitted throughout the United States, questions regarding vaccine efficacy over time are jumping to the top of people’s minds. Research groups are noting that protection against infection following full immunization is decreasing over time.


When looking at the timing of breakthrough infection in the current sample, those who were fully immunized during the early rollout of the vaccines (January and February) on average experienced breakthrough infections approximately four months following full immunization. Those vaccinated later showed relatively shorter windows for breakthrough infection, averaging approximately 50 days. Combined with crude rates of breakthrough infection, we expect that the majority of breakthrough infections will occur in the upcoming months as Delta variant cases accrue and immunity wanes as predicted. As such, monitoring of comprehensive passive surveillance data regarding breakthrough infections in the United States in the upcoming months is essential.

In summary:

  • All three vaccines studied had a very low breakthrough infection rate, under 0.3%
  • The breakthrough infection rate decreased over the 200-day observation period included in the current study, although this may be due to a lack of data extending into the future
  • Preliminary trends suggest that individuals receiving the Janssen vaccine may need to wait beyond the originally prescribed 14 days to reach the levels of immunity seen with the Moderna and Pfizer vaccines

All of these observations need further study and may change as we account for rates of Delta variant transmission within the United States.

Call to Action

At this moment in time, questions of long-term vaccine efficacy and breakthrough infections are of the utmost importance. A better understanding of both vaccine efficacy and breakthrough infections will inform public health policy, including the necessity and timing of booster shots. Additionally, real-world data can be used to understand the severity of breakthrough infections and the long-term safety of vaccines, among other extremely pertinent questions.


We invite any interested non-commercial research group to use the pro-bono COVID-19 Research Database to rigorously test hypotheses concerning vaccine efficacy and breakthrough infections. All partners of the COVID-19 Research Database hope that this resource can be used in a meaningful way to help alleviate the COVID-19 pandemic. Please visit to learn more. Inquiries should be directed to