2.2 Variable Types

There are seven allowed types of variables:

  • calculated for variables where users specify calculations to be performed (age groups, overall survival)
  • category for variables which can take only pre-specified values (male, female), or where character short forms are used (M=male, F=female)
  • character for variables such as the id, or any variable not to be analysed
  • codes for variables where numeric short forms are entered (e.g. 1=male, 2=female)
  • date for all dates
  • integer for values which can take only whole numbers (ECOG score)
  • numeric for all other numeric values

For the date, integer and numeric types the user must specify both a minimum and a maximum value. The minimum and maximum values can be numeric, the names of other variables in the data (for example, to make sure a date of death falls after the date of diagnosis) or, for date types, ‘today’, which allows all dates up to the time of entry.
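The three kinds of bound described above can be illustrated with a short Python sketch. It is not the tool's own implementation: the function names, the row layout and the variable names diagnosis_date and death_date are hypothetical, and treating the bounds as inclusive is an assumption.

```python
from datetime import date

def resolve_bound(bound, row, today=None):
    """Turn a dictionary bound into a concrete value for one row.

    A bound may be a literal value, the name of another variable in
    the row, or the string 'today' (meaningful for date variables only).
    """
    if isinstance(bound, str):
        if bound == "today":
            return today or date.today()
        return row[bound]  # the bound names another variable
    return bound

def in_range(value, minimum, maximum, row, today=None):
    """Check minimum <= value <= maximum (inclusive bounds are assumed)."""
    low = resolve_bound(minimum, row, today)
    high = resolve_bound(maximum, row, today)
    return low <= value <= high

# Hypothetical row: death must fall between diagnosis and the entry date.
row = {"diagnosis_date": date(2020, 3, 1), "death_date": date(2021, 6, 15)}
ok = in_range(row["death_date"], "diagnosis_date", "today", row,
              today=date(2024, 1, 1))
```

Passing today explicitly keeps the check reproducible; when it is omitted the sketch falls back to the current date.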

For the codes and category variables the user needs to specify the allowed categories, or the codes, in the order they would like them presented. If this is difficult for data that have already been entered, the UNIQUE function in Excel may be useful: it lists all the unique values in a range.
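For data held outside Excel, the same idea can be sketched in Python. This is only an illustration and not a reimplementation of UNIQUE: the function name is hypothetical, and skipping blank entries and comparing text case-sensitively are choices made for this sketch.

```python
def unique_in_order(values):
    """List each distinct value once, in order of first appearance."""
    seen = set()
    result = []
    for value in values:
        if value is None or value == "":
            continue  # skip blank entries (a choice made for this sketch)
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result

# Hypothetical column of already-entered data.
entered = ["male", "female", "", "female", "male", None]
allowed = unique_in_order(entered)
```

The resulting list is a starting point only; the user may still want to reorder it to match the order in which the categories should be presented.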

Note that codes variables contain numeric data in the data entry sheet while category variables contain character data in the data entry sheet.
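The distinction can be made concrete with a small Python sketch. The names and the use of a mapping for codes and a list for categories are assumptions for illustration, not the tool's internal representation.

```python
# Hypothetical dictionary entries for the same underlying variable.
SEX_CODES = {1: "male", 2: "female"}   # codes: numeric entries with labels
SEX_CATEGORIES = ["male", "female"]    # category: character entries

def valid_code(entry, codes):
    """A codes entry must be a whole number listed in the dictionary."""
    # bool is a subclass of int in Python, so exclude it explicitly
    is_int = isinstance(entry, int) and not isinstance(entry, bool)
    return is_int and entry in codes

def valid_category(entry, categories):
    """A category entry must be text matching an allowed value."""
    return isinstance(entry, str) and entry in categories
```

Under these assumptions the entry 1 is valid for the codes variable but not for the category variable, and the entry "male" is the reverse.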

Once the dictionary is complete, clicking the Create Data Entry Sheet button will create the data entry table.

If the user wishes to change any of the ranges or codes after entering data, they can change the dictionary and click Re-format Existing Sheet to update the validation rules.

If codes or category variables are present then a _codes_ sheet will be created to contain the allowed values for data validation. This sheet has the same name as the data entry sheet, suffixed by codes, and is hidden from the user.