Hungarian physician Dr. Ignaz Semmelweis worked at the Vienna General Hospital with childbed fever patients. Childbed fever is a deadly disease affecting women who have just given birth, and in the early 1840s, as many as 10% of the women giving birth died from it at the Vienna General Hospital.
Dr. Semmelweis discovered that the cause was the contaminated hands of the doctors delivering the babies, and on June 1st, 1847, he decreed that everyone should wash their hands. This was an unorthodox and controversial request, since nobody in Vienna knew about bacteria at the time.
In this R Markdown document we reanalyze the data that led Semmelweis to discover the importance of handwashing, and we look at its impact on the hospital.
The yearly dataset contains the number of births and deaths at the two clinics of the Vienna General Hospital for the years 1841 to 1846.
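library(dplyr)    # provides %>%, mutate(), group_by() and summarise()
library(ggplot2)  # provides ggplot() and the plotting layers used below
yearly = read.csv("yearly_deaths_by_clinic.csv", sep = ",", header = TRUE)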
head(yearly)
##   year births deaths   clinic
## 1 1841   3036    237 clinic 1
## 2 1842   3287    518 clinic 1
## 3 1843   3060    274 clinic 1
## 4 1844   3157    260 clinic 1
## 5 1845   3492    241 clinic 1
## 6 1846   4010    459 clinic 1
The monthly dataset contains births and deaths for Clinic 1 of the hospital, where most deaths occurred.
monthly = read.csv("monthly_deaths.csv", sep = ",", header = TRUE)
head(monthly)
##         date births deaths
## 1 1841-01-01    254     37
## 2 1841-02-01    239     18
## 3 1841-03-01    277     12
## 4 1841-04-01    255      4
## 5 1841-05-01    255      2
## 6 1841-06-01    200     10
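In the output above the `date` column prints like a date, but `read.csv()` reads it as text by default in R 4.0 and later. A minimal sketch of converting it explicitly, assuming the ISO `YYYY-MM-DD` layout shown above; `monthly_sample` is a hypothetical name and holds only the six rows printed above, typed in by hand:

```r
# First six rows of the monthly data, copied from the printed output above
monthly_sample = data.frame(
  date   = c("1841-01-01", "1841-02-01", "1841-03-01",
             "1841-04-01", "1841-05-01", "1841-06-01"),
  births = c(254, 239, 277, 255, 255, 200),
  deaths = c(37, 18, 12, 4, 2, 10)
)

# Convert the text column to Date so that comparisons and date axes behave as dates
monthly_sample$date = as.Date(monthly_sample$date, format = "%Y-%m-%d")

# Inspect the resulting column types
str(monthly_sample)
```

The same `as.Date()` call could be applied to the full `monthly` data frame right after it is read.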
# Add the proportion of births that ended in the mother's death, per year and clinic
yearly = yearly %>%
  mutate(proportion_deaths = deaths / births)
head(yearly)
##   year births deaths   clinic proportion_deaths
## 1 1841   3036    237 clinic 1        0.07806324
## 2 1842   3287    518 clinic 1        0.15759051
## 3 1843   3060    274 clinic 1        0.08954248
## 4 1844   3157    260 clinic 1        0.08235667
## 5 1845   3492    241 clinic 1        0.06901489
## 6 1846   4010    459 clinic 1        0.11446384
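The six yearly proportions above can be condensed into a single figure for Clinic 1 in two ways, and they need not agree. The sketch below is illustrative only and uses base R; `clinic1`, `mean_of_yearly` and `pooled` are hypothetical names, and the numbers are the Clinic 1 rows copied from the output above:

```r
# Clinic 1 rows copied from the printed output above
clinic1 = data.frame(
  year   = 1841:1846,
  births = c(3036, 3287, 3060, 3157, 3492, 4010),
  deaths = c(237, 518, 274, 260, 241, 459)
)

# Average of the six yearly proportions: every year carries the same weight
mean_of_yearly = mean(clinic1$deaths / clinic1$births)

# Pooled proportion for the whole period: every birth carries the same weight
pooled = sum(clinic1$deaths) / sum(clinic1$births)

# Show both figures side by side
round(c(mean_of_yearly = mean_of_yearly, pooled = pooled), 4)
```

The two figures differ whenever the number of births varies between years, so it helps to state which one is being reported.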
# Add the same proportion to the monthly Clinic 1 data
monthly = monthly %>%
  mutate(proportion_deaths = deaths / births)
head(monthly)
##         date births deaths proportion_deaths
## 1 1841-01-01    254     37       0.145669291
## 2 1841-02-01    239     18       0.075313808
## 3 1841-03-01    277     12       0.043321300
## 4 1841-04-01    255      4       0.015686275
## 5 1841-05-01    255      2       0.007843137
## 6 1841-06-01    200     10       0.050000000
# Yearly proportion of deaths, one line per clinic
ggplot(yearly, aes(x = year, y = proportion_deaths, color = clinic)) +
  geom_line() +
  labs(x = "Year", y = "Proportion Deaths")
# Monthly proportion of deaths in Clinic 1, with one axis label per year
ggplot(monthly, aes(x = as.Date(date), y = proportion_deaths, group = 1)) +
  geom_line() +
  labs(x = "Date", y = "Proportion Deaths") +
  scale_x_date(date_labels = "%Y-%m", date_breaks = "12 months")
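# Date from which handwashing was mandatory
handwashing_start = as.Date("1847-06-01")
# Flag the months on or after that date; date is converted explicitly because read.csv() reads it as text
monthly = monthly %>%
  mutate(handwashing_started = as.Date(date) >= handwashing_start)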
head(monthly)
##         date births deaths proportion_deaths handwashing_started
## 1 1841-01-01    254     37       0.145669291               FALSE
## 2 1841-02-01    239     18       0.075313808               FALSE
## 3 1841-03-01    277     12       0.043321300               FALSE
## 4 1841-04-01    255      4       0.015686275               FALSE
## 5 1841-05-01    255      2       0.007843137               FALSE
## 6 1841-06-01    200     10       0.050000000               FALSE
# Same monthly plot, colored by whether handwashing had started
ggplot(monthly, aes(x = as.Date(date), y = proportion_deaths, group = 1, color = handwashing_started)) +
  geom_line() +
  labs(x = "Date", y = "Proportion Deaths") +
  scale_x_date(date_labels = "%Y-%m", date_breaks = "12 months")
# Mean monthly proportion of deaths before and after handwashing started
monthly_summary = monthly %>%
  group_by(handwashing_started) %>%
  summarise(mean_proportion_deaths = mean(proportion_deaths))
monthly_summary
## # A tibble: 2 × 2
##   handwashing_started mean_proportion_deaths
##   <lgl>                                <dbl>
## 1 FALSE                               0.105
## 2 TRUE                                0.0211
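The summary above gives two means but no sense of how uncertain their difference is. One possible way to quantify it, sketched here as an assumption rather than as the method used in this analysis, is a Welch two-sample t-test on the monthly proportions. `handwashing_effect()` is a hypothetical helper; it expects a data frame with the `proportion_deaths` and `handwashing_started` columns created earlier. Monthly values form a time series, so the independence assumption behind the test holds only approximately.

```r
# Hypothetical helper: difference in mean monthly death proportion
# (after handwashing minus before) with a Welch t-test confidence interval.
handwashing_effect = function(df, conf_level = 0.95) {
  before = df$proportion_deaths[!df$handwashing_started]
  after  = df$proportion_deaths[df$handwashing_started]

  # t.test() with two samples uses the Welch (unequal variance) test by default;
  # its confidence interval is for mean(after) - mean(before)
  test = t.test(after, before, conf.level = conf_level)

  list(
    mean_diff = mean(after) - mean(before),
    conf_int  = as.numeric(test$conf.int),
    p_value   = test$p.value
  )
}
```

Calling `handwashing_effect(monthly)` would return the difference in means together with its confidence interval and p-value. A negative `mean_diff` indicates a lower proportion of deaths after handwashing started.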