Analysis roadmap
This is a general guideline and should not be followed literally. See table guidelines for style.
General map
- Pre-analysis work
- Create or get DAG
- Data validation (use the test framework and data quality tools in SAS), plus automated tests against the extracted data
- Whenever data are imported, ensure that no fields were truncated (truncation occurs easily in SAS).
- Ensure no records are missing from any data subsets that are created.
- Get all covariates in the DAG, plus our standard covariates, into an analytic dataset
- Test the analytic dataset (use the test framework) and write automated tests against the final datasets
- In SAS: Use the test framework; test code should run separately from the main code
- In Stata: Use `assert` statements. If you make a decision about the analysis based on the data, roll that decision into a runtime assert.
- Exploratory analysis = used for face validation and for building the model
- Descriptive tables
- Covariates by exposure status
- Covariates by outcome status
- Rate tables by outcome (if cohort design)
- Overall, age-standardized to Canadian population (gender-standardized and/or stratified by gender; also stratified by RHA): Numerator is outcome, denominator from source population
- Stratified by region, other relevant factors
- Univariate analysis
- Association between exposure and covariates (use the `univariate_analysis.ado` Stata program)
- Association between outcome and covariates (use the `univariate_analysis.ado` Stata program)
- Model building: base this on the broadest exposure and the broadest outcome group(s)
- Association between exposure and outcome (use the `change_in_estimate.ado` Stata program)
- Crude
- Adjusted for each individual covariate (with change in estimate)
- Check for effect modifiers (association between exposure and outcome stratified by each individual covariate level + likelihood ratio test) (use the `test_interaction.ado` Stata program)
- Model is built based on
- Prior information and associations (DAG), confirm that the associations are as expected
- Other confounders (based on associations, change in estimate, and not being an effect modifier)
- Stratification by effect modifiers
- Model verification (after building model)
- Check for effect modifiers in the adjusted model: estimate the association between exposure and outcome adjusted for the model, then take each covariate out of the model individually, stratify by it, and perform a likelihood ratio test (use the `test_interaction.ado` Stata program)
- Detailed analysis = meat on the bones, looping over all reasonable permutations of stratification groups, exposures and outcomes.
- Detail depends on study and proposed sensitivity analyses.
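The pre-analysis checks above (no truncated fields, no lost records, runtime asserts for data-driven decisions) could be sketched in Stata roughly as follows. This is a minimal illustration, not the group's test framework: the toy dataset, the variable names (id, region, exposed, age), the field width of 10 and the age range are all assumptions made up for the example.

```stata
* Sketch only: a tiny made-up dataset stands in for an extracted analytic dataset.
clear
input long id str10 region byte exposed byte age
1 "North"     1 34
2 "South"     0 51
3 "Interlake" 1 47
4 "North"     0 29
end

* One record per person, and no missing values in key fields.
isid id
assert !missing(id, exposed, age)
assert region != ""

* Truncation check: a value that fills the whole field (width 10 here, an
* assumed source width) may have been cut off on import.
generate int region_len = strlen(region)
summarize region_len, meanonly
assert r(max) < 10

* Record counts: the subsets must add up to the full dataset.
count
local n_all = r(N)
count if exposed == 1
local n_exposed = r(N)
count if exposed == 0
local n_unexposed = r(N)
assert `n_exposed' + `n_unexposed' == `n_all'

* Decisions based on the data, rolled into runtime asserts:
* exposure is analysed as binary, and only adults are in scope (assumed rule).
assert inlist(exposed, 0, 1)
assert inrange(age, 18, 110)
```

If any assert fails, Stata stops the do-file with an error, so an analysis cannot silently continue on data that no longer match the assumptions it was built on.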
Specific analysis notes
- Proportional hazards models: `stcox` should be run with the `efron` option for handling tied failure times
- When matching is done, all analysis should be in the matched set:
- Logistic regression becomes conditional logistic regression, `clogit`, and the `group()` option needs to be used
- Other analyses, like `stcox`, should use the `strata()` option instead
- For incidence rate ratios
- If the rates come from count data, use `ir`
- If the rates come from time-to-event data, use `stir`
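The notes above map onto Stata commands roughly as in the sketch below. It is illustrative only. The cancer dataset ships with Stata; the count data are made-up numbers; the matched sets are simulated; and treated, setid, case, exposed, t and event are hypothetical variable names. Real analyses would use the study's own matched-set identifier and exposure definitions.

```stata
* --- Proportional hazards model with the Efron method for ties ---
sysuse cancer, clear                  // example dataset shipped with Stata
stset studytime, failure(died)
stcox i.drug age, efron

* --- Incidence rate ratio from time-to-event data ---
generate byte treated = drug > 1      // hypothetical 0/1 exposure: any active drug
stir treated

* --- Incidence rate ratio from counts and person-time (made-up numbers) ---
clear
input byte exposed int cases long pyears
0 15 19017
1 41 28010
end
ir cases exposed pyears

* --- Matched analysis: simulated 1:1 matched sets, for illustration only ---
clear
set seed 20240101
set obs 200
generate long setid = ceil(_n / 2)    // 100 matched pairs
bysort setid: generate byte case = (_n == 1)
generate byte exposed = runiform() < cond(case, 0.5, 0.3)

* Logistic regression becomes conditional logistic regression on the matched set.
clogit case exposed, group(setid) or

* Other models keep the matching through strata(); simulated survival times.
generate double t = -ln(runiform()) / (0.1 * exp(0.5 * exposed))
generate byte event = 1
stset t, failure(event)
stcox exposed, efron strata(setid)
```

In both matched models, only sets whose members differ in exposure contribute to the estimate, which is why the analysis has to stay within the matched set.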