Statistical Simulation for Modeling Crash Count Data

Perla E. Reyes, Xiao Qin

Proceedings of the 10th International Conference of Chinese Transportation Professionals 2010. DOI: 10.1061/41127(382)33

ABSTRACT

Many models and statistical techniques have been proposed in the transportation literature to model crash count data. Although there has been some effort to compare and evaluate crash model performance, most of the results are difficult to generalize because they are based on real datasets, for which the true underlying model is unknown. It is argued that a common ground for comparing accident modeling techniques needs to be defined. A statistical simulation is proposed to serve this purpose, not only because the known model used to simulate the data is an ideal point of comparison, but also because of the flexibility of the technique.

Artificial hot spots, i.e. sites whose crash intensity is higher than expected for sites with similar known features, were created as part of the simulation. Three measures were used to rank sites and to identify hot spots: the Negative Binomial (NB) regression mean estimate, the NB Empirical Bayesian (EB) estimate, and Pearson regression residuals. The results show that only a small fraction of the artificial hot spots can be successfully identified regardless of the methodology, and that Pearson regression residuals lead to the best results.
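
The ranking comparison described above can be illustrated with a minimal sketch. It is not the authors' procedure: the covariate, coefficients, dispersion value, number of sites, and the hot-spot multiplier of 2 are all hypothetical, and the known baseline mean stands in for a fitted NB regression so that the example needs only NumPy. The EB estimate uses the common weighted form w*mu + (1 - w)*y with w = 1 / (1 + mu/phi), which is assumed here rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_sites = 1000   # hypothetical number of sites
n_hot = 50       # hypothetical number of artificial hot spots
phi = 2.0        # assumed NB inverse dispersion: Var = mu + mu**2 / phi

# Hypothetical covariate (log traffic volume) and coefficients.
log_volume = rng.normal(9.0, 0.5, n_sites)
mu = np.exp(-8.0 + 1.0 * log_volume)   # baseline expected crash count

# Artificial hot spots: crash intensity inflated beyond what the
# known features predict (the factor 2.0 is an assumption).
hot = np.zeros(n_sites, dtype=bool)
hot[rng.choice(n_sites, n_hot, replace=False)] = True
true_mean = np.where(hot, 2.0 * mu, mu)

# NB counts with mean true_mean: n = phi, p = phi / (phi + mean).
y = rng.negative_binomial(phi, phi / (phi + true_mean))

# Three ranking scores, computed from the known baseline mean
# (a stand-in for the fitted NB regression mean).
w = 1.0 / (1.0 + mu / phi)                       # EB weight on the model mean
eb = w * mu + (1.0 - w) * y                      # EB estimate
pearson = (y - mu) / np.sqrt(mu + mu**2 / phi)   # Pearson residual


def hits(score, k=n_hot):
    """Count true hot spots among the k highest-scoring sites."""
    top = np.argsort(score)[::-1][:k]
    return int(hot[top].sum())


results = {"nb_mean": hits(mu), "eb": hits(eb), "pearson": hits(pearson)}
print(results)
```

Because the hot spots are planted by the simulation, each score can be judged directly by how many of them it places among the top-ranked sites. The counts depend on the assumed parameters and the random seed, so a single run says nothing general about which measure is best.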

KEY WORDS: Statistical Simulation, Count Data, Negative Binomial, Empirical Bayesian