1.4 Types of Data

1.4.1 Identifiers

  • No personally identifying data, including EMR numbers, should appear in the data sent for analysis
  • The MRN can be included at the data entry stage, but must be removed before the data are sent for analysis.
  • Keep the sheet linking Study IDs to patient IDs separate from the data
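
The separation described above can be sketched in Python. This is only an illustration under stated assumptions: the function name split_identifiers, the column names "MRN" and "StudyID", and the sequential "S0001" ID format are all hypothetical and not part of this guideline.

```python
def split_identifiers(rows, id_field="MRN"):
    """Return (deidentified_rows, linking_key) from a list of row dicts.

    All names here are hypothetical; adapt them to your own data sheet.
    """
    linking_key = {}    # patient ID -> Study ID; store this separately
    deidentified = []   # rows that are safe to send for analysis
    for row in rows:
        patient_id = row[id_field]
        # Give each new patient the next sequential Study ID.
        if patient_id not in linking_key:
            linking_key[patient_id] = f"S{len(linking_key) + 1:04d}"
        # Copy every field except the identifier, and add the Study ID.
        clean = {k: v for k, v in row.items() if k != id_field}
        deidentified.append({"StudyID": linking_key[patient_id], **clean})
    return deidentified, linking_key


rows = [
    {"MRN": "A100", "Age": 50},
    {"MRN": "B200", "Age": 61},
]
data, key = split_identifiers(rows)
```

Here data holds only Study IDs and study variables, while key is the linking sheet that stays with the data holder.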

1.4.2 Numeric Data

  • Enter continuous data, such as Age or Weight, as a single numeric field without any extra text (e.g. enter 50 instead of 50kg)
  • Do not enter two variables for the same piece of information (e.g. Age and Age Category). Instead, enter Age and define AgeCat as a calculated variable or in your statistical analysis plan.
  • Entering data once reduces the amount of data entry and the potential for errors.
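
As a rough sketch of both points, the Python below checks that an entry is a plain number and derives AgeCat from Age instead of entering it twice. The cutpoints (40 and 65), the labels and the function names are assumptions made for illustration; the real categories belong in your data dictionary or statistical analysis plan.

```python
from bisect import bisect_right

# Hypothetical cutpoints and labels, chosen only for this example.
AGE_CUTS = [40, 65]
AGE_LABELS = ["<40", "40-64", "65+"]


def is_plain_number(text):
    """True if the entry parses as a number, so '50' passes and '50kg' fails.

    Note that float() also accepts strings such as 'nan' and 'inf'.
    """
    try:
        float(text)
    except ValueError:
        return False
    return True


def age_category(age):
    """Derive the age category from a numeric age."""
    # bisect_right counts how many cutpoints are <= age,
    # which is the index of the matching label.
    return AGE_LABELS[bisect_right(AGE_CUTS, age)]
```

With these assumed cutpoints, age_category(50) gives "40-64", so only Age needs to be typed in.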

1.4.3 Categorical Data

  • Enter the levels of categorical and code variables in the order you would like them presented (e.g. CR=complete recovery, PR=partial recovery, SD=stable disease, PD=progressive disease)
  • Categorical data can be entered as numbers, letters or abbreviations instead of text
  • Categories are entered separated by commas
  • Example:
    • T1,T2,T3,T4
  • Codes are entered in the data dictionary in the format code=label separated by commas
  • Examples:
    • 1=Female, 2=Male
    • CR=Complete Recovery, PR=Partial Recovery, SD=Stable Disease, PD=Progressive Disease
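
To make the two formats above concrete, here is a minimal Python sketch of how such entries could be read. The function parse_levels is hypothetical and is not the tool's own parser; it assumes that labels never contain commas.

```python
def parse_levels(spec):
    """Parse 'T1,T2,T3' or '1=Female, 2=Male' into a dict of code -> label.

    Python dicts keep insertion order, so the levels stay in the order
    they were entered. Assumes labels do not themselves contain commas.
    """
    levels = {}
    for item in spec.split(","):
        code, sep, label = item.partition("=")
        code = code.strip()
        # A plain category (no '=') uses the code itself as its label.
        levels[code] = label.strip() if sep else code
    return levels


codes = parse_levels("CR= Complete Recovery, PR= Partial Recovery")
stages = parse_levels("T1,T2,T3,T4")
```

Stripping whitespace around each code and label means that "CR= Complete Recovery" and "CR=Complete Recovery" are read the same way.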

1.4.4 Dates

  • Dates should be entered in the unambiguous format “01-Jan-2020”; an all-numeric date such as 01/02/2020 could mean either 1 February or 2 January
  • Dates after the current date will be highlighted in red as a warning
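
The date checks above could be reproduced outside the tool roughly as follows. This is a sketch and not the tool's own implementation: the function names are hypothetical, and the "%b" month abbreviation is assumed to be parsed in an English locale.

```python
from datetime import date, datetime


def parse_entry_date(text):
    """Parse a date entered as '01-Jan-2020'; raises ValueError otherwise.

    '%b' matches abbreviated month names in the current locale
    (assumed here to be English).
    """
    return datetime.strptime(text, "%d-%b-%Y").date()


def is_future(entered, today=None):
    """True if the entered date falls after today (or after a supplied date)."""
    return entered > (today or date.today())


visit = parse_entry_date("01-Jan-2020")
```

Passing a fixed today value, such as is_future(visit, today=date(2019, 12, 31)), makes the check repeatable when testing.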