paugmented

An Ethical Approach to Peeking at Data

This page contains R functions and Excel spreadsheets to calculate the statistics discussed in Sagarin, Ambler, and Lee's (2014, Perspectives on Psychological Science, 9, 293-304) "An Ethical Approach to Peeking at Data".

Statistics available

p_actual represents the actual Type I error rate stemming from dataset augmentation (see section "Augmenting Datasets and Type I Error Inflation" and Figure 3 in the paper)
p_crit represents the critical value necessary to maintain a desired Type I error rate (e.g., .05) while allowing for dataset augmentation (see section "Maintaining p < .05 while Augmenting the Dataset" and Figure 5 in the paper)
p_augmented represents the Type I error inflation resulting from post-hoc dataset augmentation (see section "Ethical Post-hoc Dataset Augmentation via p_augmented" in the paper)
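
The inflation that p_actual measures can be illustrated with a small Monte Carlo simulation. The sketch below is not part of paugmented.r; the function name simulate_p_actual and its reps and seed arguments are hypothetical, and it assumes a one-sample z-test on normally distributed data with a known standard deviation of 1 and a true effect of zero.

```r
# Monte Carlo sketch of the Type I error rate under dataset augmentation.
# Assumptions (not from the paper): one-sample z-test, known SD = 1, true mean = 0.
simulate_p_actual <- function(ns, pmax = 1, pcrit = 0.05, tails = 2,
                              reps = 1e5, seed = 1) {
  set.seed(seed)
  total <- numeric(reps)        # running sum of observations in each simulated study
  n <- 0                        # running sample size
  active <- rep(TRUE, reps)     # studies that are still collecting data
  rejected <- rep(FALSE, reps)  # studies that have reached p < pcrit
  for (k in seq_along(ns)) {
    # The sum of ns[k] standard normal observations is N(0, ns[k]).
    total <- total + rnorm(reps, sd = sqrt(ns[k]))
    n <- n + ns[k]
    z <- total / sqrt(n)
    p <- if (tails == 2) 2 * pnorm(-abs(z)) else pnorm(-z)
    rejected <- rejected | (active & p < pcrit)
    # A study continues only if it is not yet significant and p is below pmax.
    active <- active & p >= pcrit & p < pmax
  }
  mean(rejected)
}

# No augmentation: the estimate should be close to the nominal .05.
simulate_p_actual(c(100))
# Two possible rounds of augmentation: the estimate rises above .05.
simulate_p_actual(c(100, 50, 25))
```

Because the estimate is simulation-based, it varies slightly with reps and seed.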

Files available

SagarinAmblerLee_2014.pdf contains Sagarin, Ambler, and Lee (2014)
SagarinAmblerLee_2014_SupplementalMaterials.pdf contains a description of the calculation of p_actual, p_crit, and p_augmented
paugmented.r contains R functions for calculating p_actual, p_crit, and p_augmented with an unlimited number of rounds of dataset augmentation
pactual.xlsm calculates p_actual for up to two rounds of dataset augmentation
pcrit.xlsm calculates p_crit for up to two rounds of dataset augmentation
paugmented.xlsm calculates p_augmented for up to two rounds of dataset augmentation
Note: All three Excel spreadsheets use Visual Basic macros, so you will need to enable macros when opening the spreadsheets.

Instructions for R functions

pactual(ns, pmax = 1, pcrit = 0.05, slices = 1000, tails = 2, indent = "")

ns refers to a vector of sample sizes that indicate the initial number of participants run (N1) followed by the number of participants in the first round of augmentation (N2), the number of participants in the second round of augmentation (N3), etc.
pmax refers to the upper limit on the p-value in the data collected thus far for which the researcher would augment the sample with additional participants (a nonsignificant result with p < pmax is augmented).
pcrit refers to the value for determining statistical significance (typically .05).
slices refers to the number of slices into which the probability distribution is divided (higher numbers of slices will make the calculations slower but more precise).
tails refers to whether the tests are one-tailed (1) or two-tailed (2).
indent is used to format the countdown when this function is called from pcrit.
Example: pactual(c(100,50,25),.1,.05,1000,1) returns the actual alpha level resulting from an initial sample size of 100 followed by two rounds of dataset augmentation, the first with 50 participants, the second with 25. pcrit = .05, so a result with p < .05 is considered statistically significant. pmax = .1, so a result with .05 <= p < .1 would be augmented with additional participants (up to twice). slices = 1000, so the probability distribution for calculations is divided into 1000 slices. tails = 1, so significance tests are one-tailed.
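
To give a sense of what dividing the probability distribution into slices can look like, here is a minimal sketch for the simplest case: one round of augmentation with one-tailed tests. It is an illustration under a normal (z-test) approximation, not the code in paugmented.r, and the function name p_actual_two_stage is hypothetical. The range of first-look results that trigger augmentation is cut into equal-probability slices, and the chance of reaching significance after augmentation is evaluated at each slice midpoint.

```r
# Two-look, one-tailed p_actual by slicing the first-look distribution.
# Assumption: z-test (normal approximation); this is NOT the code in paugmented.r.
p_actual_two_stage <- function(n1, n2, pmax = 1, pcrit = 0.05, slices = 1000) {
  z_crit <- qnorm(1 - pcrit)
  # First-look results with pcrit <= p < pmax occupy the quantiles
  # between 1 - pmax and 1 - pcrit of the null distribution.
  width <- (pmax - pcrit) / slices
  u <- (1 - pmax) + (seq_len(slices) - 0.5) * width  # slice midpoints
  z1 <- qnorm(u)
  # z-value the new n2 participants must exceed for the combined z to pass z_crit
  z2_needed <- (z_crit * sqrt(n1 + n2) - sqrt(n1) * z1) / sqrt(n2)
  # Significant at the first look (pcrit) plus significant after augmentation
  pcrit + sum(width * pnorm(z2_needed, lower.tail = FALSE))
}

# First round of the example above: N1 = 100, N2 = 50, pmax = .1, pcrit = .05
p_actual_two_stage(100, 50, pmax = 0.1, pcrit = 0.05, slices = 1000)
```

More slices give a finer approximation of the integral at the cost of more evaluations, which matches the speed and precision trade-off described for the slices argument.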

pcrit(ns, pmax = 1, pdesired = 0.05, slices = 1000, tails = 2)

ns refers to a vector of sample sizes that indicate the initial number of participants run (N1) followed by the number of participants in the first round of augmentation (N2), the number of participants in the second round of augmentation (N3), etc.
pmax refers to the upper limit on the p-value in the data collected thus far for which the researcher would augment the sample with additional participants (a nonsignificant result with p < pmax is augmented).
pdesired refers to the desired Type I error rate (typically .05).
slices refers to the number of slices into which the probability distribution is divided (higher numbers of slices will make the calculations slower but more precise).
tails refers to whether the tests are one-tailed (1) or two-tailed (2).
Example: pcrit(c(200,150,50),.2,.05,1000,2) returns the critical value needed to maintain a Type I error rate of .05 with an initial sample size of 200 and up to two rounds of augmentation, the first with 150 additional participants, the second with 50 additional participants. Augmentation will be done if the data collected thus far show a two-tailed p-value that is below .2 but not yet below the critical value.
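
Conceptually, p_crit is the critical value at which the actual Type I error rate equals the desired one. The sketch below shows that idea for a single round of augmentation with two-tailed tests, using a normal (z-test) approximation and R's uniroot root finder. Both function names are hypothetical, and this is an illustration of the relationship rather than the algorithm in paugmented.r.

```r
# Two-look, two-tailed Type I error rate under a z-test approximation.
# Assumption: illustrative only; NOT the code in paugmented.r.
p_actual_two_tailed <- function(n1, n2, pmax, pcrit, slices = 1000) {
  z_crit <- qnorm(1 - pcrit / 2)
  n <- n1 + n2
  # Upper-tail continuation region: quantiles between 1 - pmax/2 and 1 - pcrit/2.
  width <- (pmax - pcrit) / 2 / slices
  z1 <- qnorm((1 - pmax / 2) + (seq_len(slices) - 0.5) * width)
  shift <- sqrt(n1) * z1
  reject_up <- pnorm((z_crit * sqrt(n) - shift) / sqrt(n2), lower.tail = FALSE)
  reject_down <- pnorm((-z_crit * sqrt(n) - shift) / sqrt(n2))
  # Double the integral to cover the mirror-image lower-tail region.
  pcrit + 2 * sum(width * (reject_up + reject_down))
}

# Search for the critical value whose actual error rate equals pdesired.
p_crit_two_stage <- function(n1, n2, pmax = 1, pdesired = 0.05, slices = 1000) {
  stopifnot(pmax > pdesired)
  gap <- function(pc) p_actual_two_tailed(n1, n2, pmax, pc, slices) - pdesired
  uniroot(gap, lower = 1e-6, upper = pdesired, tol = 1e-10)$root
}

# First round of the example above: N1 = 200, N2 = 150, pmax = .2
p_crit_two_stage(200, 150, pmax = 0.2, pdesired = 0.05)
```

The search interval ends at pdesired because the adjusted critical value cannot exceed the desired error rate once augmentation is allowed.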

paugmented(ns, plargest, pfinal, pcrit = 0.05, slices = 1000, tails = 2)

ns refers to a vector of sample sizes that indicate the initial number of participants run (N1) followed by the number of participants in the first round of augmentation (N2), the number of participants in the second round of augmentation (N3), etc.
plargest refers to the largest p-value obtained in the initial or augmented subsamples. For example, if the p-value for the initial N1 participants was .15, the p-value for the combined N1+N2 participants was .18, and the p-value for the combined N1+N2+N3 participants was .04, plargest would be .18.
pfinal refers to the p-value obtained in the final sample that includes all participants.
pcrit refers to the value for determining statistical significance (typically .05).
slices refers to the number of slices into which the probability distribution is divided (higher numbers of slices will make the calculations slower but more precise).
tails refers to whether the tests are one-tailed (1) or two-tailed (2).
Example: paugmented(c(40,40,20),.18,.04,.05,1000,2) calculates p_augmented bounds for a study with an initial sample size of 40, two rounds of augmentation, the first with 40 additional participants, the second with 20 additional participants, a largest p-value of .18 observed either in the initial 40 participants or in the 80 participants after the first round of augmentation, a final p-value of .04 in the full sample of 100 participants, 1000 slices, and two-tailed significance tests.
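
As a convenience, plargest and pfinal can be derived from the sequence of p-values observed at each look. The sketch below uses the p-values from the plargest description above and assumes, following that example, that plargest is taken over the looks before the final one. The variable names are hypothetical, and the call to paugmented runs only if paugmented.r is present in the working directory.

```r
# p-values at each look: N1, N1+N2, N1+N2+N3 (values from the plargest example)
p_by_look <- c(0.15, 0.18, 0.04)
ns <- c(40, 40, 20)

last <- length(p_by_look)
pfinal <- p_by_look[last]          # p-value in the full sample
plargest <- max(p_by_look[-last])  # largest p-value among the earlier looks (assumption)

# Call the function from this page only if its source file is available.
if (file.exists("paugmented.r")) {
  source("paugmented.r")
  print(paugmented(ns, plargest, pfinal, pcrit = 0.05, slices = 1000, tails = 2))
}
```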

Instructions for Excel spreadsheets

Three Microsoft Excel spreadsheets are also available to calculate p_actual, p_crit, and p_augmented for up to two rounds of dataset augmentation (for more than two rounds of augmentation, the R functions must be used). To use the spreadsheets, fill in values for cells B1 through B7 (on pactual.xlsm and pcrit.xlsm) or B1 through B9 (on paugmented.xlsm). Then run the calculation macro by pressing CTRL+z on Windows or CMD+ALT+z on Mac. The specific cells are as follows:

N1: The number of participants in the original sample
N2: The number of participants in the first round of augmentation
N3: The number of participants in the second round of augmentation (leave this blank if there was only one round of augmentation)
pmax: The value for pmax (e.g., 1)
pcrit: The value for pcrit (e.g., .05)
tails: 1 for a one-tailed test, 2 for a two-tailed test
slices: The number of slices to divide the probability distribution into (e.g., 10000)
pdesired: The desired Type I error rate (e.g., .05)
p1: The p-value from the first N1 participants
p12: The p-value from the combined N1+N2 participants
p123: The p-value from the combined N1+N2+N3 participants (leave this blank if there was only one round of augmentation)

For more information

Please send questions, comments, suggestions, and bug reports to bsagarin@niu.edu.