4 Formatting data
For all of the functions to run, the data must be in the long format. This means that each event must be on its own row. An event is a single point, a line segment, or an arrow. If a study unit has multiple events, they must be recorded over multiple rows. Data is often supplied in the wide format instead (e.g. one row per patient).
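To make the long format concrete, here is a minimal sketch of a data.frame with one row per event. The object name `long_sketch` and the column names `event` and `time` are assumptions made for illustration only; they are not names required by the package.

```r
# Hypothetical long-format data: one row per event, so a patient with
# several events occupies several rows.
long_sketch <- data.frame(
  ID    = c("ID:001", "ID:001", "ID:002"),
  event = c("AE", "SAE", "AE"),
  time  = c(5.0, 5.5, 1.0)  # years since the start of treatment
)

long_sketch

# Number of rows (events) recorded for each patient
table(long_sketch$ID)
```

Patient ID:001 has two events and therefore two rows, whereas a wide layout would keep both events in separate columns of a single row.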
4.1 Wide data
Here is an example data.frame in the wide format.
wide_example <- structure(list(ID = c("ID:001", "ID:002", "ID:003"), Date.begin.Treatment = structure(c(14307,
14126, 15312), class = "Date"), AE = structure(c(16133, 14491,
NA), class = "Date"), SAE = structure(c(16316, NA, 16042), class = "Date"),
Death.date = structure(c(16499, NA, 17869), class = "Date"),
Response1 = c("SD", "SD", NA), Response1.Start = structure(c(14745,
14345, NA), class = "Date"), Response1.End = structure(c(15111,
14418, NA), class = "Date"), Response2 = c("CR", "PR", NA
), Response2.Start = structure(c(15768, 14674, NA), class = "Date"),
Response2.End = structure(c(16133, 14856, NA), class = "Date"),
Response3 = c(NA, "CR", NA), Response3.Start = structure(c(NA,
14856, NA), class = "Date"), Response3.End = structure(c(NA,
15587, NA), class = "Date"), Last.follow.up = structure(c(16499,
17048, 17869), class = "Date")), class = "data.frame", row.names = c(NA,
-3L))

| ID | Date.begin.Treatment | AE | SAE | Death.date | Response1 | Response1.Start | Response1.End | Response2 | Response2.Start | Response2.End | Response3 | Response3.Start | Response3.End | Last.follow.up |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ID:001 | 2009-03-04 | 2014-03-04 | 2014-09-03 | 2015-03-05 | SD | 2010-05-16 | 2011-05-17 | CR | 2013-03-04 | 2014-03-04 | NA | NA | NA | 2015-03-05 |
| ID:002 | 2008-09-04 | 2009-09-04 | NA | NA | SD | 2009-04-11 | 2009-06-23 | PR | 2010-03-06 | 2010-09-04 | CR | 2010-09-04 | 2012-09-04 | 2016-09-04 |
| ID:003 | 2011-12-04 | NA | 2013-12-03 | 2018-12-04 | NA | NA | NA | NA | NA | NA | NA | NA | NA | 2018-12-04 |
All of the dates need to be converted to elapsed time. For each patient, Date.begin.Treatment is the starting point (time 0).
date_cols <- c("Date.begin.Treatment", "AE", "SAE", "Death.date",
               "Response1.Start", "Response1.End", "Response2.Start", "Response2.End",
               "Response3.Start", "Response3.End", "Last.follow.up") # Columns that hold dates
wide_example[date_cols] <- lapply(wide_example[date_cols], as.numeric) # Convert dates to day counts
# Time in years since the start of treatment, rounded to one decimal place
wide_example[date_cols] <- round((wide_example[date_cols] - wide_example$Date.begin.Treatment) / 365.25, 1)
knitr::kable(wide_example)

| ID | Date.begin.Treatment | AE | SAE | Death.date | Response1 | Response1.Start | Response1.End | Response2 | Response2.Start | Response2.End | Response3 | Response3.Start | Response3.End | Last.follow.up |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ID:001 | 0 | 5 | 5.5 | 6 | SD | 1.2 | 2.2 | CR | 4.0 | 5 | NA | NA | NA | 6 |
| ID:002 | 0 | 1 | NA | NA | SD | 0.6 | 0.8 | PR | 1.5 | 2 | CR | 2 | 4 | 8 |
| ID:003 | 0 | NA | 2.0 | 7 | NA | NA | NA | NA | NA | NA | NA | NA | NA | 7 |
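As a quick check of the arithmetic, the sketch below recomputes a single cell by hand: the AE time for ID:001, using the two dates shown in the first table. It is only an illustration; the variable names `start`, `ae`, `days` and `years` are assumptions and are not part of the original example.

```r
# Dates taken from the ID:001 row of the wide table
start <- as.Date("2009-03-04")  # Date.begin.Treatment
ae    <- as.Date("2014-03-04")  # AE

# Subtracting two Date objects gives the difference in days
days <- as.numeric(ae - start)
days   # 1826

# Convert days to years, using 365.25 days per year
years <- round(days / 365.25, 1)
years  # 5
```

The result matches the AE value of 5 reported for ID:001 in the converted table.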
The wide data can be used to create the bars of the swimmer plot, with each bar ending at the patient's Last.follow.up time.
plot <- swimmer_plot(df=wide_example,id='ID',end='Last.follow.up',col='black',fill='grey')
plot