Machine learning with crowdsourced probe vehicle data for predicting freeway crash risk
Author(s)
- Lee Chen, Don Chen, Chang Li, Bing Pan, Lixuan Zhang, Zheng Xiang
Abstract
Real-time prediction of crash risk can support traffic incident management by giving practitioners the information they need to allocate resources proactively in response to anticipated traffic crashes. Unlike previous studies that used archived traffic data covering a limited highway environment, such as a segment or corridor, this study uses a statewide live traffic database from HERE to develop real-time traffic crash prediction models. This database provides crowdsourced probe vehicle data in the form of high-resolution, real-time traffic speeds for the entire freeway network (nearly 2,000 miles) in Alabama. This study aims to use machine learning models to predict crash risk on freeways from pre-crash traffic dynamics (e.g., mean speed, speed reduction) along with static freeway attributes. Traffic speed characteristics were extracted from the HERE database for both pre-crash and crash-free traffic conditions. Random Forest (RF), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost) models were developed and compared. Separate models were estimated for three major crash types: single-vehicle, rear-end, and sideswipe crashes. In terms of prediction accuracy, the RF models outperformed the other models. Models for rear-end crashes were found to be more accurate than those for the other crash types, which implies that rear-end crashes have a significant relationship with pre-crash traffic dynamics and are more predictable. The traffic speed factors that rank highest in feature importance are the speed variance and the speed reduction prior to crashes. According to partial dependence plots, rear-end crash risk is positively related to both speed variance and speed reduction. More results are discussed in the paper.
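As an illustration of the kind of pre-crash speed features named in the abstract (mean speed, speed variance, speed reduction), the following is a minimal sketch, not the authors' implementation. The function name `pre_crash_speed_features`, the sample readings, and in particular the definition of speed reduction (average of the earlier half of the window minus average of the later half) are assumptions made for illustration; the abstract does not state how these features were computed.

```python
from statistics import mean, pvariance


def pre_crash_speed_features(speeds_mph):
    """Summarize a window of speed readings, ordered oldest first.

    The window is assumed to end just before a crash (or before a
    crash-free reference time). Hypothetical helper for illustration.
    """
    if len(speeds_mph) < 2:
        raise ValueError("need at least two speed readings")

    # Split the window into an earlier and a later part.
    half = len(speeds_mph) // 2
    earlier, later = speeds_mph[:half], speeds_mph[half:]

    return {
        "mean_speed": mean(speeds_mph),
        # Population variance of the readings in the window.
        "speed_variance": pvariance(speeds_mph),
        # Assumed definition: drop in average speed from the earlier
        # part of the window to the later part (positive = slowing).
        "speed_reduction": mean(earlier) - mean(later),
    }


# Hypothetical readings (mph) showing traffic slowing down.
features = pre_crash_speed_features([65, 64, 60, 52, 45, 40])
```

Feature dictionaries like this one, combined with static freeway attributes and a crash or no-crash label, could then be passed to any classifier, such as a random forest.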