Australian secondary school students' use of tobacco, alcohol, and over-the-counter and illicit substances in 2011
Appendix 2: Data matters
Coding and editing of data
Following procedures established for the earlier surveys in this series, cleaning of data relating to all substance use questions involved checking for inconsistencies in reported use of substances across time periods (lifetime, past year, past month and past week). This cleaning procedure ensured maximum use of the data and operated on the principle that the participant’s response about personal use in the most recent time period was accurate. Cleaning involved checking that the response for the most recent time period was consistent with the responses for the longer time periods. If responses for other time periods were missing or inconsistent with the response for the most recent time period, they were recoded to indicate use that matched the response for the recent time period. For example, if students indicated they had used a substance in the past week and in the past month but indicated that they had not used it in the past year, or if the response to the past-year question was missing, the response for the past year was recoded to indicate that the substance had been used within this time period. This change was considered appropriate because using a substance in the past week and past month necessitates that it was used in the past year. However, if respondents indicated that they did not use a substance in the past week and the response for use in the past month was missing or yes, these responses were not changed, as it is possible for someone who did not use a substance in the past week to have used it in the past month. The missing response was retained, as it could not be determined whether or not the student had used the substance.
If students indicated that they had used a substance in the past week, month or year, but indicated that they had not used the substance in their lifetime, the response to this latter question was changed to ‘invalid’. Regardless of the students’ reported substance use, no change was made to their response indicating how they see their own substance use behaviour, as this question was intended to assess self-perception only. As in previous survey years, the impact of these sorts of recodes on the data set was minimal, with around three per cent of data recoded.
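The recoding rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used for the survey: the function name, the response labels ("yes", "no", None for missing) and the dictionary layout are all assumptions made for the example. It also assumes that a missing lifetime response is left unchanged, since the text only specifies what happens when lifetime use is answered "no".

```python
from typing import Dict, Optional

YES, NO, INVALID = "yes", "no", "invalid"

# Time periods ordered from most recent (shortest) to longest.
# Lifetime is handled separately because it is recoded to 'invalid', not 'yes'.
PERIODS = ["week", "month", "year"]


def clean_use_responses(resp: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Return a cleaned copy of one student's answers for one substance.

    Keys are "week", "month", "year" and "lifetime"; values are
    "yes", "no" or None (missing). All names here are hypothetical.
    """
    cleaned = dict(resp)  # do not modify the caller's record

    # A "yes" in a shorter period implies use in every longer period,
    # so longer periods that are missing or "no" are recoded to "yes".
    seen_yes = False
    for period in PERIODS:
        if cleaned.get(period) == YES:
            seen_yes = True
        elif seen_yes:
            cleaned[period] = YES
    # A "no" or missing answer in a shorter period says nothing about
    # longer periods, so nothing is changed in that direction.

    # Reported use in any period contradicts "never used in lifetime".
    if seen_yes and cleaned.get("lifetime") == NO:
        cleaned["lifetime"] = INVALID
    return cleaned
```

For instance, a record with past-week and past-month use but a missing past-year answer would have the past-year answer set to "yes", while a record with no past-week use and a missing past-month answer would be returned unchanged.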
Data analyses details
Logistic regression analyses were used to examine whether the proportions of students in 2011 who had used tobacco, alcohol and each of the illicit substances within different time periods (e.g. lifetime, past month, past week) were different from the proportions found in 2008 and 2005. For these analyses students were grouped into the age groups: 12- to 15-year-olds, 16- to 17-year-olds and 12- to 17-year-olds; and the proportions of all students, and male and female students using substances in each survey year were examined. In these analyses, the outcome variable was binary coded, with 1 indicating that the behaviour was engaged in and 0 indicating the behaviour did not occur. Age (within each of the two age groups), school type (government, Catholic and independent), state/territory and, where appropriate, gender were entered into the analyses first. Year of survey was entered as a categorical variable, and a χ² value associated with the main effect of year was estimated.
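One way to write down a model of the kind described above is given here as a sketch. The symbols, the choice of 2011 as the reference year, and the treatment of age as a single term are assumptions made for illustration; the text does not state them.

```latex
\operatorname{logit}\,\Pr(Y_i = 1)
  = \beta_0
  + \beta_1\,\mathrm{age}_i
  + \boldsymbol{\beta}_2^{\top}\,\mathbf{schooltype}_i
  + \boldsymbol{\beta}_3^{\top}\,\mathbf{state}_i
  + \beta_4\,\mathrm{gender}_i
  + \gamma_{2008}\,\mathbf{1}[\mathrm{year}_i = 2008]
  + \gamma_{2005}\,\mathbf{1}[\mathrm{year}_i = 2005]
```

Here Y_i is the binary outcome for student i (1 if the behaviour was engaged in, 0 otherwise). Under this parameterisation the main effect of year corresponds to the joint null hypothesis γ_2008 = γ_2005 = 0, which would be assessed with a χ² statistic on 2 degrees of freedom because three survey years are compared. Whether that statistic is a Wald or a likelihood-ratio test is not specified in the text.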
Because this study used a two-stage sampling procedure, the sample was less efficient than a simple random sample of the same size. As students within the sample were clustered by school, standard errors for prevalence estimates may have been underestimated if this clustering were ignored. Procedures within the statistical package STATA accommodate complex sample designs by adjusting for the clustering of observations. STATA was used for analyses comparing prevalence estimates across survey years, and standard errors robust to potential non-independence within subjects were obtained.
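The effect of clustering on standard errors can be illustrated with a small Python sketch. It compares the naive standard error of a proportion, which treats students as a simple random sample, with a textbook cluster-robust (linearisation) standard error that treats schools as the sampling units. This is not the STATA procedure used for the report: the function name and the toy data are hypothetical, and the sketch ignores weights and stratification.

```python
import math
from typing import List, Tuple


def proportion_with_cluster_se(clusters: List[List[int]]) -> Tuple[float, float, float]:
    """Return (proportion, naive SE, cluster-robust SE).

    `clusters` holds one inner list of 0/1 outcomes per school.
    At least two schools are required.
    """
    n_total = sum(len(c) for c in clusters)  # number of students
    g = len(clusters)                        # number of schools
    p = sum(sum(c) for c in clusters) / n_total

    # Naive SE: assumes all students are independent observations.
    naive_se = math.sqrt(p * (1 - p) / n_total)

    # Cluster-robust SE: sum the residuals (y - p) within each school,
    # then use the between-school variation of those totals.
    resid_totals = [sum(c) - p * len(c) for c in clusters]
    variance = (g / (g - 1)) * sum(r * r for r in resid_totals) / n_total ** 2
    cluster_se = math.sqrt(variance)
    return p, naive_se, cluster_se


# Toy data: four schools whose students tend to answer alike.
schools = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 0, 0], [0, 0, 0, 1]]
p, naive_se, cluster_se = proportion_with_cluster_se(schools)
print(f"p={p:.4f} naive SE={naive_se:.4f} cluster SE={cluster_se:.4f}")
```

When responses are similar within schools, as in the toy data, the cluster-robust standard error is larger than the naive one, which is the underestimation the adjustment is meant to correct.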