Analysis roadmap

This is a general guideline and should not be followed literally. See table guidelines for style.

General map

  1. Pre-analysis work
    1. Create or get DAG
    2. Data validation (use the test framework and data quality tools in SAS) and automated tests against extracted data
      1. Whenever data are imported, ensure that no fields were truncated (truncation occurs easily in SAS).
      2. Ensure no records are missing from any data subsets that are created.
    3. Collect all covariates in the DAG, plus our standard covariates, in an analytic dataset
    4. Test the analytic dataset (use the test framework) and write automated tests against final datasets
      1. In SAS: Use the test framework; test code should run separately from the main code
      2. In Stata: Use assert statements. If you make a decision about the analysis based on the data, then roll this into a runtime assert.
  2. Exploratory analysis = used for face validation and model building
    1. Descriptive tables
      1. Covariates by exposure status
      2. Covariates by outcome status
    2. Rate tables by outcome (if cohort design)
      1. Overall, age-standardized to Canadian population (gender-standardized and/or stratified by gender; also stratified by RHA): Numerator is outcome, denominator from source population
      2. Stratified by region, other relevant factors
    3. Univariate analysis
      1. Association between exposure and covariates (use univariate_analysis.ado Stata program)
      2. Association between outcome and covariates (use univariate_analysis.ado Stata program)
    4. Model building; base this on the broadest exposure and broadest outcome group(s)
      1. Association between exposure and outcome (use change_in_estimate.ado Stata program)
        • Crude
        • Adjusted for each individual covariate (with change in estimate)
      2. Check for effect modifiers (association between exposure and outcome stratified by each individual covariate level + likelihood ratio test) (use test_interaction.ado Stata program)
      3. Model is built based on
        • Prior information and associations (DAG), confirm that the associations are as expected
        • Other confounders (based on associations, change in estimate, and not being an effect modifier)
        • Stratification by effect modifiers
    5. Model verification (after building model)
      1. Check for effect modifiers in the adjusted model: estimate the association between exposure and outcome adjusted for the model; then take each covariate out of the model individually, stratify by it, and perform a likelihood ratio test (use test_interaction.ado Stata program)
  3. Detailed analysis = meat on the bones, looping over all reasonable permutations of stratification groups, exposures and outcomes.
    • Detail depends on study and proposed sensitivity analyses.
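
As a rough illustration of the Stata steps in the map above, the sketch below uses the cancer dataset that ships with Stata as a stand-in for a real analytic dataset. It shows runtime asserts, a crude versus adjusted estimate with the percentage change in estimate, and a likelihood ratio test for interaction. It uses only built-in commands and does not reproduce the syntax of the in-house change_in_estimate.ado or test_interaction.ado programs. The variable treated, the scalar names and the stored-estimate names are assumptions made for this example.

```stata
* Stand-in data: cancer.dta ships with Stata (studytime, died, drug, age)
sysuse cancer, clear

* Runtime asserts: the do-file stops here if the data break an assumption
assert !missing(studytime, died, drug, age)
assert inlist(died, 0, 1)
assert studytime > 0

* Hypothetical binary exposure: any active drug versus placebo (drug == 1)
generate byte treated = drug > 1

stset studytime, failure(died)

* Crude association between exposure and outcome
stcox treated, efron
scalar hr_crude = exp(_b[treated])

* Adjusted for one covariate, with the percentage change in estimate
stcox treated age, efron
scalar hr_adj = exp(_b[treated])
scalar pct_change = 100 * (hr_adj - hr_crude) / hr_crude
display "Change in estimate after adjusting for age: " %5.1f pct_change "%"

* Effect modification: compare models with and without the interaction term
stcox i.treated c.age, efron
estimates store m_main
stcox i.treated##c.age, efron
estimates store m_inter
lrtest m_main m_inter
```

In a real analysis the adjusted model and the interaction test would be repeated for each covariate in turn, and any data-driven decision (for example, a cut-point chosen after looking at a distribution) would get its own assert so that a data refresh cannot silently invalidate it.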

Specific analysis notes

  • Proportional hazards models: stcox should be run with the efron option for handling tied failure times
  • When matching is done, all analysis should be in the matched set:
    • Logistic regression becomes conditional logistic regression (clogit), and the group() option must be specified
    • Other analyses, such as stcox, should use the strata() option instead
  • For incidence rate ratios
    • If rates are based on counts and person-time, use ir
    • If rates are based on time-to-event (stset) data, use stir
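
The notes above might translate into Stata roughly as follows. This is a sketch under stated assumptions, not a template: treated is a hypothetical exposure variable, agegrp is only a stand-in for a real matched-set identifier, the counts passed to ir are illustrative numbers, and the clogit step assumes the Stata example dataset lowbirth2 (with variables low, smoke and pairid) can be downloaded with webuse.

```stata
* Cox model with Efron handling of ties, on cancer.dta (ships with Stata)
sysuse cancer, clear
* Hypothetical binary exposure: any active drug versus placebo
generate byte treated = drug > 1
stset studytime, failure(died)
stcox treated age, efron

* Matched analysis with stcox: pass the matched-set identifier to strata().
* agegrp is only a stand-in for a real matched-set identifier.
egen agegrp = cut(age), group(3)
stcox treated, efron strata(agegrp)

* Incidence rate ratio from time-to-event (stset) data
stir treated
scalar irr_st = r(irr)

* Incidence rate ratio from counts and person-time (illustrative numbers)
clear
input cases exposed persontime
41 1 28010
15 0 19017
end
ir cases exposed persontime
scalar irr_counts = r(irr)

* Matched case-control data: conditional logistic regression with group().
* Assumes the Stata example dataset lowbirth2 (needs internet access).
webuse lowbirth2, clear
clogit low smoke, group(pairid)
```
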
24/08/2018