//
// Program SBBOOT
// Written by: Larry Taylor and Joe Cardinale, Fall 2007
// Purpose: To run a Parametric Bootstrap using SB.do and SBIN.do
//   This .do file simulates the SB regression to find critical values
//   and perform basic performance diagnostics on simulation results
//   given the underlying geometric distribution
//
program define SBBOOT, rclass
    clear
    do SBIN.do              /* SBIN reads in the empirical data */
    //
    // Generate the starting variables required for SB.do
    //
    gen Phase = 0           /* The number of phases */
    gen t = 0               /* Will be the number of observations */
    gen u = 0               /* Will be an element of the random uniform distribution */
    gen S = 0               /* Will be a binary dependent variable for State */
    gen DT = 0              /* Will be the duration of the current phase */
    gen count = 0           /* Will count the number of phases to be used in the sample */
    gen sampyes = 0         /* SAMPYES = 1 if the data point is included in generated sample */
    do SB.do                /* SB.do sets up the simulated data & regression */
    // The simulation runs 10,000 times and saves the following variables:
    //   teststat: SB regression t-test
    //   ttesthat: Empirical t-ratio
    //   Invalid: Number of invalid simulations (based on phase criteria)
    //
    simulate teststat=r(teststat) invalid=r(invalid) ttesthat=r(ttesthat), saving(tfile,replace) nodots reps(10000): SB
    //
    // Drop out invalid simulations where not enough phases were generated
    //
    egen tinvalid = count(invalid) if invalid == 1
    egen tvalid = count(invalid) if invalid == 0
    egen totinvalid = max(tinvalid)
    egen totvalid = max(tvalid)
    keep if invalid == 0
    drop invalid tinvalid tvalid
    //
    // Calculation of critical values for the simulation distribution of test statistics
    //
    egen tplow = pctile(teststat), p(2.5)       /* Finds the t-score at the 2.5 percentile */
    egen tphigh = pctile(teststat), p(97.5)     /* Finds the t-score at the 97.5 percentile */
    di " "
    di "Selected values for the regression test statistic:"
    di " The empirical test statistic is " ttesthat
    di " The bootstrapped critical value at 2.5% is " tplow
    di " The bootstrapped critical value at 97.5% is " tphigh
    di " "
    di " "
    //
    // Find p-value
    //
    egen tmean = mean(teststat)
    egen pupper = count(teststat) if (ttesthat > tmean) & (teststat > ttesthat)
    egen plower = count(teststat) if (ttesthat < tmean) & (teststat < ttesthat)
    if (ttesthat >= tmean) gen pnumerator = pupper      /* set pvalue numerator based on where the ... */
    else gen pnumerator = plower                        /* ... empirical value is relative to the mean */
    gen pvaluetemp = (pnumerator + 1) / (totvalid + 1)  /* calc pvalue */
    egen pvalue1t = max(pvaluetemp)                     /* since last data matrix element might be blank */
    gen pvalue2t = 2 * pvalue1t                         /* create a 2 tailed test */
    drop pnumerator pvaluetemp pvalue1t pupper plower   /* delete unnecessary variables */
    //
    // Compress data -- allow Stata to optimize data storage to minimize size restrictions
    //
    quietly compress
    //
    // List results in Stata
    //   Direction of duration dependence is based on the sign of the empirical t-ratio
    //   Directional test is based on a discrete data analysis
    //
    di " "
    di "SBBOOT generated " (totvalid + totinvalid) " replications"
    di " " totvalid " replications were used"
    di " " totinvalid " replications did not generate enough phases"
    di " "
    di " "
    di "Simulation Results"
    di " The null hypothesis is Duration Independence"
    di " The p-value for the SB test is " pvalue2t
    if ttesthat >= 0 {
        di " The sign of the SB ratio indicates Positive Duration Dependence"
    }
    else {
        di " The sign of the SB ratio indicates Negative Duration Dependence"
    }
    di " "
    di "SBBOOT has completed the simulation."
    //
    // Write the output to a file
    //
    outsheet using SBBOOT_output, replace
end