
Preparing for Influenza Season
Project Overview
Flu Season Staffing
Influenza season is a serious issue in the US, and hospitals and clinics are understaffed and not properly equipped for it. Vulnerable populations need extra staff, and the staffing agency needs sound data to decide where to send its workers.

My objective as a data analyst is to use historical patient and hospital data on influenza to determine the best course of action for the upcoming flu season. I will use data-driven analysis to identify the most "vulnerable" populations so that the medical staffing agency can plan its logistics accordingly.

Stakeholders & Key Questions
The scope of this project covers hospitals in all 50 states, as well as the following stakeholders:

- Medical frontline staff (nurses, physician assistants, and doctors)
- Hospitals and clinics using agency services
- Influenza patients
- Staffing agency administrators

The questions below are key to helping the stakeholders determine the best course of action before the next flu season:

- Who among the population is considered "vulnerable" to the virus? Is there a specific age group with a higher mortality rate?
- Which states and geographic locations should the agency send more medical personnel to?
- What is considered "flu season"? Is there a time of year when flu cases rise?
DATA
CDC Dataset
This data comes from the Centers for Disease Control and Prevention (CDC) and shows the number of influenza deaths by state and age.

The US Census Bureau provides this population data by geography, time, age, and gender.

This is the final integrated dataset used for the project.

TOOLS
- Microsoft Excel
- Tableau

SKILLS
- Visual & spatial analysis
- Forecasting
- Data mapping, transformation and integration
- Tableau storytelling
- Data cleansing
- Statistical hypothesis testing
- Stakeholder presentation
The Process

Data Preparation
The first step in this project was cleaning both the CDC and US Census datasets in Excel and checking their integrity. Once both datasets were clean, the next step was transforming the data with pivot tables. The transformed tables were then consolidated into one final integrated dataset (shown above) that combines population and influenza data for the next steps of the analysis.
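The pivot-and-consolidate steps described here can also be sketched in code. The snippet below is a minimal Python illustration, not the Excel workflow used in the project: the field names, the helper functions `pivot` and `integrate`, and every number in it are made-up placeholders.

```python
from collections import defaultdict

# Hypothetical long-format rows standing in for the cleaned CDC and Census
# extracts. All names and values are invented for illustration only.
cdc_rows = [
    {"state": "A", "year": 2016, "age_group": "65-74", "deaths": 120},
    {"state": "A", "year": 2016, "age_group": "75+", "deaths": 310},
    {"state": "B", "year": 2016, "age_group": "65-74", "deaths": 40},
    {"state": "B", "year": 2016, "age_group": "75+", "deaths": 95},
]
census_rows = [
    {"state": "A", "year": 2016, "age_group": "65-74", "population": 900_000},
    {"state": "A", "year": 2016, "age_group": "75+", "population": 600_000},
    {"state": "B", "year": 2016, "age_group": "65-74", "population": 300_000},
    {"state": "B", "year": 2016, "age_group": "75+", "population": 200_000},
]


def pivot(rows, value_field):
    """Sum value_field per (state, year), with one entry per age group."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[(row["state"], row["year"])][row["age_group"]] += row[value_field]
    return table


def integrate(deaths, population):
    """Join the two pivots on (state, year), keeping keys present in both."""
    merged = []
    for key in sorted(deaths.keys() & population.keys()):
        state, year = key
        record = {"state": state, "year": year}
        for age_group, count in deaths[key].items():
            record[f"deaths_{age_group}"] = count
        for age_group, count in population[key].items():
            record[f"population_{age_group}"] = count
        merged.append(record)
    return merged


integrated = integrate(pivot(cdc_rows, "deaths"), pivot(census_rows, "population"))
for record in integrated:
    print(record)
```

Joining only on keys present in both tables mirrors an integrity check: a state-year that appears in one source but not the other is left out rather than silently filled in.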
Data Storytelling
The last piece of the analysis was determining what exactly counts as "flu season". The line graph on the right shows all flu cases by month for the years 2009-2017.
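A monthly view like this comes down to grouping case counts by calendar month across years. The following is a small Python sketch of that aggregation, assuming a simple list of (year, month, cases) records; the records and the function name `average_cases_by_month` are hypothetical and do not come from the project data.

```python
from collections import defaultdict

# Hypothetical (year, month, cases) records. The counts are invented
# placeholders, not figures from the CDC data.
records = [
    (2015, 1, 900), (2015, 4, 300), (2015, 7, 100), (2015, 11, 600),
    (2016, 1, 1000), (2016, 4, 250), (2016, 7, 120), (2016, 11, 700),
]


def average_cases_by_month(rows):
    """Return {month: mean number of cases for that month across years}."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for _year, month, cases in rows:
        totals[month] += cases
        counts[month] += 1
    return {month: totals[month] / counts[month] for month in totals}


monthly_average = average_cases_by_month(records)

# The month with the highest average is a simple proxy for the seasonal peak.
peak_month = max(monthly_average, key=monthly_average.get)
print(f"Peak month in the sample data: {peak_month}")
```

Averaging across years keeps a single unusually severe season from dominating the picture of a typical year.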
With all the analysis in place, I created a final Tableau storyboard showcasing all the findings and recommendations, as well as a Vimeo video presentation.

Statistical Hypothesis Testing
The next step in the analysis was testing the correlation between age group and influenza deaths. Using Excel's statistical functions, I was able to support the hypothesis that the older a US citizen is, the more likely they are to die from the influenza virus.
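As a rough illustration of what a correlation check between age and mortality looks like, here is a minimal Python sketch of the Pearson correlation coefficient. It is not necessarily the exact test run in Excel for this project; the age midpoints and death rates are invented, and `pearson_r` is a hypothetical helper written for this example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical age-group midpoints and deaths per 100,000 people.
# These numbers are placeholders, not results from the project.
age_midpoints = [10, 30, 50, 70, 85]
death_rates = [0.2, 0.5, 1.5, 8.0, 40.0]

r = pearson_r(age_midpoints, death_rates)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 would indicate that mortality rises with age in the sample. Because Pearson's r measures linear association, a sharply curved relationship like the one in this toy data yields a value below 1 even though the trend is strictly increasing.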

Tableau Visualizations
Using the new integrated dataset, I imported the Excel file into Tableau to create visualizations that support the final stakeholder presentation. These included scatterplots, heat maps, histograms, bubble charts, box plots, and many others. The primary visual I wanted to include was a choropleth map (pictured above) showing both population and influenza deaths.

Conclusion
I've summarized the key points below: which population is considered vulnerable, which states should receive the most medical attention, and when to send the additional staff. Based on these recommendations, the medical staffing agency can act accordingly to prepare for the upcoming flu season.
Age Group
After analyzing both datasets and running statistical hypothesis tests, I determined that the "vulnerable" age group is US citizens over the age of 75. There is a strong correlation between this age group and mortality rate. Medical staff should prioritize their older patients during their shifts, especially those with flu-like symptoms.
States

The visualizations I created in Tableau revealed that the states with the most flu deaths are almost identical to the states with the largest populations. The agency should send the vast majority of its staff to California, Florida, Texas, Pennsylvania, and New York, and allocate fewer resources to less vulnerable states like Vermont, Wyoming, and Delaware.
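One way to examine the overlap between death counts and population is to rank states both by total deaths and by deaths per 100,000 residents. The Python sketch below shows the arithmetic only; it was not part of the Tableau analysis, and the state labels and figures are invented placeholders.

```python
# Hypothetical state totals as (population, flu deaths).
# All figures are invented for illustration only.
states = {
    "State A": (10_000_000, 5_000),
    "State B": (2_000_000, 1_200),
    "State C": (500_000, 200),
}


def deaths_per_100k(population, deaths):
    """Deaths scaled to a common population base of 100,000 people."""
    return deaths * 100_000 / population


# Ranking by raw deaths tends to follow population size.
by_total = sorted(states, key=lambda s: states[s][1], reverse=True)

# Ranking by rate shows where deaths are high relative to population.
by_rate = sorted(states, key=lambda s: deaths_per_100k(*states[s]), reverse=True)

print("By total deaths:", by_total)
print("By deaths per 100k:", by_rate)
```

Total deaths indicate where the most staff are needed in absolute terms, while the rate can flag smaller states whose burden is large for their size.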
Season
The historical data on influenza cases shows that "flu season" falls in the winter months from November through January, with deaths peaking each year in January. With the proper logistics in place, the staffing agency should send its nurses and doctors in early October as a preventative measure, to help reduce the number of cases before they become rampant in late winter.
