Methods Group Oct 28: Power Calculations

By Daniala Weir and Deepa Jahagirdar

We all learn about basic power calculations in Stats 101. But when it comes time to actually do one, it's as if we know nothing at all. In our methods group discussion on October 28th, we talked about this challenging aspect of every study. In the simplest case, software (Stata, SAS, R) and online power calculators are your best friends, for example when the effect measure is an unadjusted difference in proportions. Gillian Ainsworth provided us with a few great examples, including an overview by Dr. Hanley & Dr. Moodie.
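For that simplest case, base R already has what you need: power.prop.test solves for whichever of sample size, power, or effect size you leave blank. A minimal sketch (the 20% vs 30% proportions are illustrative, not figures from the discussion):

```r
# Sample size per group needed to detect a difference between two
# outcome proportions (illustrative: 20% vs 30%) with 80% power at
# the conventional two-sided alpha of 0.05.
calc <- power.prop.test(p1 = 0.2, p2 = 0.3, power = 0.8, sig.level = 0.05)
n_per_group <- ceiling(calc$n)  # round up to whole subjects
print(n_per_group)
```

Leaving n in and power = NULL instead makes the same function return the power achieved by a fixed sample size.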

But let's face it: the simplest case rarely applies. Simultaneously accounting for confounding, correlation, and ranges of prevalence for outcomes and exposures requires methods beyond those online calculators. In that case, it is best to simulate data. While the word 'simulation' can scare a lot of people off, Brooke Levis provided some very useful examples from her own research, along with R code written with Andrea Benedetti to conduct a simulation for her power calculations. The code is pasted below.

Finally, Dr. Hanley gave some important advice: "Every study is valuable for answering an overarching research question… Just because you don't have enough power to conduct the study you want doesn't mean it shouldn't be conducted at all. Think about it as contributing to an overall meta-analysis on a particular research question."

R Code for Power Calculations Simulations (credit to Brooke Levis and Dr Andrea Benedetti):
#this function generates a dataset of size n with a binary outcome, binary exposure, and binary confounder
#the exposure prevalence is given by prevx
#the confounder prevalence is given by prevconf
#the outcome prevalence is given by prevy
#the OR between the confounder and x is ORconfx
#the OR between the confounder and y is ORconfy
#the OR between the exposure and y is ORxy
#nreps is the number of times the data is generated and analyzed
#for each data set, a crude model and an adjusted model are fit and the significance of the exposure beta is assessed
getpower<-function(n, prevx, prevconf, prevy, ORconfx, ORconfy, ORxy, nreps){
  #make a matrix to hold the results
  res<-matrix(NA, ncol=11, nrow=nreps)
  colnames(res)<-c("n","prevx","prevconf","prevy","ORconfx","ORconfy","ORxy","pvaladjmodel","sigadjmodel","pvalcrudemodel","sigcrudemodel")

  for(i in 1:nreps){
    #generate the binary exposure - input prevalence of exposure
    x<-rbinom(n, 1, prevx)

    #generate the binary confounder - prevalence of confounder and OR between exposure and confounder
    #(intercept set so the confounder prevalence is approximately prevconf among the unexposed)
    regeqxconf<-log(prevconf/(1-prevconf)) + log(ORconfx)*x
    conf<-rbinom(n, 1, exp(regeqxconf)/(1+exp(regeqxconf)))

    #generate the binary outcome - prevalence of outcome, OR between exposure and outcome and OR between confounder and outcome
    regeq<-log(prevy/(1-prevy)) + log(ORxy)*x + log(ORconfy)*conf
    y<-rbinom(n, 1, exp(regeq)/(1+exp(regeq)))

    #adjusted model
    m1<-glm(y~x+conf, family=binomial)
    #get p value for exposure beta
    pvaladj<-summary(m1)$coefficients["x","Pr(>|z|)"]
    #is it significant?
    sigadj<-as.numeric(pvaladj<0.05)

    #crude model
    m0<-glm(y~x, family=binomial)
    #get p value for exposure beta
    pvalcrude<-summary(m0)$coefficients["x","Pr(>|z|)"]
    #is it significant?
    sigcrude<-as.numeric(pvalcrude<0.05)

    #hold onto data generation params and results for this replicate
    res[i,]<-c(n, prevx, prevconf, prevy, ORconfx, ORconfy, ORxy, pvaladj, sigadj, pvalcrude, sigcrude)
  }
  #return the results
  return(res)
}

#call the function
p1<-getpower(n=400, prevx=.5, prevconf=.1, prevy=.2, ORconfx=2, ORconfy=2, ORxy=2, nreps=500)
#estimated power = proportion of replicates where the exposure beta was significant
mean(p1[,"sigadjmodel"])

