"The best thing about being a statistician is that you get to play in everyone's backyard"
"All models are wrong but some are useful"
Ph.D. in Statistics 2006 – 2010
Dissertation: Selection of Spatial and Spatial-Temporal Linear Models for Lattice Data Adviser: Dr. Jun Zhu
M.S. in Statistics 2008
University of Wisconsin-Madison, Madison, WI
B.S. in Actuarial Sciences 1997 – 2001
Assistant Professor 2012 – present
Department of Statistics, Kansas State University, Manhattan, KS
Postdoctoral Associate 2010 – 2012
Department of Applied Mathematics and Statistics, University of California, Santa Cruz, CA. Adviser: Dr. Abel Rodriguez
Teaching Assistant 2006 – 2010
Department of Statistics, University of Wisconsin-Madison, Madison, WI
Research Assistant 2008 – 2009
Traffic Operations and Safety (TOPS) Laboratory, Department of Civil and Environmental Engineering, University of Wisconsin-Madison, Madison, WI. Supervisor: Dr. Xiao Qin
Department of Actuarial Sciences, National Autonomous University of Mexico. Undergraduate-level: Statistics II.
Operative Statistics Coordinator 2001 – 2005
Nielsen Mexico, Mexico City, Mexico
• Stat 730 Online Multivariate Statistical Methods. Spring 2016, 2017; Fall 2016.
• Stat 730 Multivariate Statistical Methods. Spring 2014, 2015, 2016, 2017.
• Stat 950 Adv. Top. Bayesian Nonparametric Models. Spring 2015.
• Stat 713 Applied Linear Statistical Models. Fall 2014, 2015.
• Stat 880 Time Series Analysis. Fall 2013, 2015.
• Stat 704 Analysis of Variance. Spring 2013, Fall 2013.
• Stat 703 Statistical Methods for Natural Sciences. Fall 2012.
• Stat 775 Bayesian Decision and Control I. Fall 2009.
• Stat 571 Statistical Methods for Bioscience I. Fall 2006.
• Summer Institute for Training in Biostatistics 2010.
• Stat 441 Introduction to Biostatistics for Pharmacy. Spring 2010.
• Stat 301 Introduction to Statistical Methods. Spring 2007, 2008; Fall 2007, 2009.
Graduated: Alexander McLellan, M.S. 2016. Shu Zhou, M.S. 2015. Yang Wang, M.S. 2014.
Current: Samirah Alzubaidi, M.S. expected 2017.
Committee Member: Statistics: Zhinning Ou, PhD expected 2017; Xintong Li, PhD preliminary passed 2017; Mengjiao Wu, PhD preliminary passed 2017; Songnian Zhao, M.S. 2017; Yu Wang, M.S. 2017; Zachary Button, M.S. 2015. Industrial Engineering: Randal E. Hickman, PhD 2014; Prakarsh Tiwari, M.S. 2014.
PROCEEDINGS EDITOR 2015 & 2016
The Kansas State University Conference on Applied Statistics in Agriculture provides a forum for discussion of statistical issues motivated by a wide range of research problems in the discipline of agriculture in its broadest sense, including traditional agriculture as well as in related research areas such as genetics, biology, ecology, and the environment.
Curtis, K., Reyes, P. E., O’Connell, H. A. and Zhu, J.
Spatial Demography 2013, 1(2), 178–194. DOI:10.1007/BF03354897
This study assesses the social-structural, spatial, and temporal dimensions of aggregate-level poverty in the US Upper Midwest between 1960 and 2000. Central focus is on the links between local-area poverty, industrial structure and racial/ethnic composition, and the spatial and temporal dimensions of the linkages. During the study period, the region underwent significant industrial restructuring and dramatic change in racial/ethnic concentration. Using newly developed statistical methods for spatial-temporal regression, we explore hypotheses related to the spatial and temporal dimensions of the complex relationship between poverty, industrial structure, and race/ethnicity. Our approach yields reliable and interpretable estimates for structural factors of interest as well as the spatial-temporal autocorrelation structure underlying the data. Results inform theory about the implications of industrial structure and racial/ethnic composition for the concentration and persistence of poverty with clear direction for future research, and contribute to our understanding of the methodological approaches to investigating data that varies by and is dependent on space and time.
KEYWORDS: County poverty, race/ethnicity, industrial structure, Upper Midwest, spatial-temporal autocorrelation, maximum likelihood estimation.
Reyes, P. E., Zhu, J. and Aukema, B. H.
Journal of Agricultural, Biological, and Environmental Statistics 2012, 17(3), 508–525. DOI: 10.1007/s13253-012-0103-0
Insects are among the most significant indicators of a changing climate. Here we evaluate the impact of temperature, precipitation, and elevation on the tree-killing ability of an eruptive species of bark beetle in pine forests of British Columbia, Canada. We consider a spatial-temporal linear regression model and in particular, a new statistical method that simultaneously performs model selection and parameter estimation. This approach is penalized maximum likelihood estimation under a spatial-temporal adaptive Lasso penalty, paired with a computationally efficient algorithm to obtain approximate penalized maximum likelihood estimates. A simulation study shows that finite-sample properties of these estimates are sound. In a case study, we apply this approach to identify the appropriate components of a general class of landscape models which features the factors that propagate an outbreak. We interpret the results from ecological perspectives and compare our method with alternative model selection procedures.
KEYWORDS: Autoregressive models, Bark beetle, Lattice model, Model selection, Penalized maximum likelihood, Spatial-temporal process.
Qin, X. and Reyes, P. E.
Journal of Transportation Engineering 2011, 137, 601–607. DOI: 10.1061/(ASCE)TE.1943-5436.0000247
Crashes are important evidence for identifying deficiencies existing in highway systems, but they are random and rare. The investigation of the nature of the problem normally draws on crashes collected over a multiyear period and from different locations to obtain a sizable sample. Hence, the issue of data heterogeneity arises because the pooled data originated from different sources. Data heterogeneity has to be addressed to obtain stable and meaningful estimates for variable coefficients. A desirable method of handling heterogeneous data is quantile regression (QR) because it focuses on depicting the relationship between a family of conditional quantiles of the crash distribution and the covariates. The QR method is appealing because it offers a complete view of how the covariates affect the response variable from the full range of the distribution, which is of particular use for distributions without symmetric or normal forms (i.e., heavy tails, heteroscedasticity, multimodality, etc.). Crash data possess some of the properties that quantile analysis can handle, as demonstrated in an intersection crash study. The compelling results illustrate that conditional quantile estimates are more informative than conditional means. The findings provide information relative to the effect of traffic volume, intersection layout, and traffic control on crash occurrence.
KEYWORDS: Intersection Crashes, Quantile Regression, Data Heterogeneity
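The contrast between conditional means and conditional quantiles described in the abstract above can be illustrated with a minimal sketch. It is not the method or data of the paper: the crash counts are made up, and the helper names `check_loss` and `best_constant` are hypothetical. It only shows that minimizing the check (pinball) loss over a constant recovers a sample quantile, which a single extreme site does not distort the way it distorts the mean.

```python
def check_loss(residual, tau):
    """Check (pinball) loss used in quantile regression, for one residual."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def best_constant(values, tau):
    """Return the observed value that minimizes total check loss at level tau."""
    return min(values, key=lambda q: sum(check_loss(y - q, tau) for y in values))


# Illustrative (made-up) crash counts: four ordinary sites and one extreme site.
crash_counts = [1, 2, 3, 4, 100]

mean_count = sum(crash_counts) / len(crash_counts)  # pulled up by the extreme site
median_fit = best_constant(crash_counts, 0.5)       # tau = 0.5 gives the median
upper_fit = best_constant(crash_counts, 0.9)        # tau = 0.9 targets the upper tail
```

In a full quantile regression the constant would be replaced by a linear function of covariates such as traffic volume, fitted separately for each quantile level.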
Magle, S., Reyes, P. E., Zhu, J. and Crooks, K.
Biological Conservation 2010, 143, 2146–2155. DOI:10.1016/j.biocon.2010.05.027
Using 5 years of patch occupancy data for 384 habitat fragments, we evaluated population and habitat dynamics of the black-tailed prairie dog in urban habitat remnants in the rapidly developing landscape of Denver, CO, USA. Specifically, we evaluated the landscape factors, including fragment area, age, and connectivity, that characterize the habitat fragments most likely to be colonized by prairie dogs, as well as those experiencing local extinctions. In addition, we determined which patch types were most often removed by land development. Sites in proximity to colonies were more likely to be colonized by prairie dogs. Local extinctions were most common on isolated colonies, and older and more isolated colonies were more likely to be extirpated by human activity. In general, smaller and older habitat patches were at the highest risk of being lost to land development. Our results provide observations of dynamic changes to the distribution of a potential keystone species in an urban area, which can be used to inform island biogeographic and metapopulation models for wildlife persistence in developing landscapes. Although populations are currently in decline, most local extinctions are the direct result of human activity, and we suggest that prairie dogs in this area can persist with appropriate management.
KEYWORDS: Colonization, Connectivity, Extinction, Keystone, Prairie dog, Urban habitat
Qin, X., Ng, M. and Reyes, P. E.
Accident Analysis and Prevention 2010, 42(6), 1531–1537. DOI: 10.1016/j.aap.2010.03.009
Identifying locations that exhibit the greatest potential for safety improvements is becoming more and more important because of competing needs and a tightening safety improvement budget. Current crash modeling practices mainly target changes at the mean level. However, crash data often have skewed distributions and exhibit substantial heterogeneity. Changes at mean level do not adequately represent patterns present in the data. This study employs a regression technique known as the quantile regression. Quantile regression offers the flexibility of estimating trends at different quantiles. It is particularly useful for summarizing data with heterogeneity. Here, we consider its application for identifying intersections with severe safety issues. Several classic approaches for determining risk-prone intersections are also compared. Our findings suggest that relative to other methods, quantile regression yields a sensible and much more refined subset of risk-prone locations.
KEYWORDS: Quantile regression; Heterogeneity; Poisson-gamma; Confidence interval
Reyes, P. E. and Qin, X.
Proceedings of the 10th International Conference of Chinese Transportation Professionals 2010. DOI: 10.1061/41127(382)33
Many models and statistical techniques have been proposed in the transportation literature to model crash count data. Although there has been some effort to compare and evaluate crash model performance, most of the results are difficult to generalize because they focus on real datasets. It is argued that a common ground for comparing accident modeling techniques needs to be defined. A statistical simulation is proposed to serve this purpose, not only because the known model used to simulate data is an ideal point of comparison, but also because of the flexibility of the technique.
Artificial hot spots, i.e. sites whose crash intensity is higher than the one expected for sites with similar known features, were created as part of the simulation. The simulation used Negative Binomial (NB) regression mean estimate, NB Empirical Bayesian (EB) estimate, and Pearson regression residuals to rank sites and to identify hot spots. The results show that only a small fraction of the artificial sites can be successfully identified regardless of the methodologies and Pearson regression residuals lead to the best results.
KEYWORDS: Statistical Simulation, Count Data, Negative Binomial, Empirical Bayesian
Zhu, J., Huang, H. and Reyes, P. E.
Journal of the Royal Statistical Society Series B 2010, 72, 389–402. DOI: 10.1111/j.1467-9868.2010.00739.x
Spatial linear models are popular for the analysis of data on a spatial lattice, but statistical techniques for selection of covariates and a neighbourhood structure are limited. Here we develop new methodology for simultaneous model selection and parameter estimation via penalized maximum likelihood under a spatial adaptive lasso. A computationally efficient algorithm is devised for obtaining approximate penalized maximum likelihood estimates. Asymptotic properties of penalized maximum likelihood estimates and their approximations are established. A simulation study shows that the method proposed has sound finite sample properties and, for illustration, we analyse an ecological data set in western Canada.
KEYWORDS: Conditional auto-regressive model; Model selection; Penalized likelihood; Simultaneous auto-regressive model; Spatial statistics; Variable selection.
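For readers unfamiliar with the approach, a generic adaptive-lasso penalized likelihood can be written as below. This is a textbook-style sketch, not necessarily the exact criterion of the paper: the symbols $\ell$ (log-likelihood), $\theta$ (regression and spatial dependence parameters), $\tilde{\theta}$ (an initial unpenalized estimate), $\lambda$ and $\gamma$ (tuning constants), and the scaling by the sample size $n$ are all assumptions made for illustration.

```latex
\hat{\theta}
  = \arg\max_{\theta}
    \Big\{ \ell(\theta) - n \sum_{j} \lambda_j \,\lvert \theta_j \rvert \Big\},
\qquad
\lambda_j = \frac{\lambda}{\lvert \tilde{\theta}_j \rvert^{\gamma}},
\quad \lambda > 0,\ \gamma > 0 .
```

Components with small initial estimates receive large penalties and are shrunk to exactly zero, which is how selection and estimation happen in a single step.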