Research

My research interests include statistical learning, statistics education, applied statistical consulting, and sports analytics.

Statistical learning involves creating and evaluating predictive algorithms designed to “learn” from and adapt to information contained in complex, noisy data. My research focuses on random forest methodology. My latest article, “From Black Box to Shining Spotlight,” introduces a set of web applications (1, 2) that visualize random forest prediction intervals and compare them with intervals produced by linear regression models. I have also published articles introducing a residual-based approach to robust random forest regression, evaluating techniques for aggregating random forest predictions across trees, and using random forests to improve college student retention in STEM majors.

I also enjoy working collaboratively with social science colleagues to analyze data collected in their studies. I am currently working with a psychology colleague on a study examining the effect of mindfulness and mood-monitoring strategies on the mental health of college students.

Finally, I enjoy applying statistical methods to sports data and have supervised several student projects in this area. One of these projects, by Derek Brickley, examined relationships between colleges’ and universities’ academic and athletic performances and was published in SIAM Undergraduate Research Online in 2022.

Publications

Sage, A. J., Liu, Y., & Sato, J. (2022). From black box to shining spotlight: Using random forest prediction intervals to illuminate the impact of assumptions in linear regression. The American Statistician, 76(4), 414-429.

Smith, J. G., Sage, A. J., McGlenn, M., Robbins, J., & Garmon, S. L. (2022). Is Sexual Racism Still Really Racism? Revisiting Callander et al. (2015) in the USA. Archives of Sexual Behavior, 51(6), 3049-3062.

Sage, A. J., Genschel, U., & Nettleton, D. (2021). A Residual-Based Approach to Robust Random Forest Regression. Statistics and Its Interface, 14(4), 389-402.

Sage, A. J., Genschel, U., & Nettleton, D. (2020). Tree aggregation for random forest class probability estimation. Statistical Analysis and Data Mining: The ASA Data Science Journal.

Sage, A. J., Cervato, C., Genschel, U., & Ogilvie, C. (2018). Combining academics and social engagement: a major-specific early alert method to counter student attrition in STEM. Journal of College Student Retention: Research, Theory, and Practice. 1521025118780502.

Sage, A. J., & Wright, S. E. (2016). Obtaining cell counts for contingency tables from rounded conditional frequencies. European Journal of Operational Research, 250(1), 91-100.