Dr. Zhu has successfully mined social media data to identify motor vehicle collisions with animals (Zhu, et al., 2011) and to study bullying (Xu, et al., 2014) [project page]. His methodology for spatio-temporal signal recovery from social media data received the Best Paper in Knowledge Discovery award at the 2012 ECML PKDD conference (Xu, et al., 2012). Dr. Lee is also conducting ongoing work to investigate driver distraction based on tweets (link).
Figure 1. Temporal distribution of roadkill for four species. [ref]
Figure 2. Spatial distribution of species in roadkill tweets. [ref]
Figure 3. Venn diagram of bullying tweets. [ref]
Table 1. Number and percentage of author’s role in bullying traces. [ref]
Figure 4. Temporal distribution of bullying. [ref]
Figure 5. Network graph of frequent terms possibly related to driving distraction and their associations. [ref]
- Xiaojin Zhu, Jun-Ming Xu, Christine M. Marsh, Megan K. Hines, and F. Joshua Dein. Machine learning for zoonotic emerging disease detection. In ICML 2011 Workshop on Machine Learning for Global Challenges, 2011. [pdf, poster]
- Jun-Ming Xu, Hsun-Chih Huang, Amy Bellmore, and Xiaojin Zhu. School Bullying in Twitter and Weibo: a Comparative Study. In the Eighth International AAAI Conference on Weblogs and Social Media (ICWSM), 2014. [pdf]
- Jun-Ming Xu, Aniruddha Bhargava, Robert Nowak, and Xiaojin Zhu. Socioscope: Spatio-temporal signal recovery from social media. In The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), 2012. [pdf]
- Paper under review: Deciphering 140 Characters: Text Mining Tweets On #DriverDistraction [pdf]
SHRP2 Data Processing
The research team has extensive experience working with SHRP2 NDS data. Dr. Lee helped develop data reduction and analysis methods for SHRP2 data (Dozza, et al., 2012 and Boyle, et al., 2012) and is part of the team led by Chalmers University that is studying eye glances and distraction using SHRP2 NDS data (Yekhshatyan & Lee, 2013). Drs. Noyce and Lee are currently working on another EARP project, “Quantifying Driver Distraction and Engagement using Video Analytics,” using SHRP2 NDS video data (link).
Figure 1. Comparison between conventional data reduction procedure and the proposed chunking procedure. [ref]
Figure 2. The dynamic relationships between driver, vehicle, roadway, and environment and the resulting safety consequences. [ref]
Figure 3. Data sampling strategies (figure not to scale). [ref]
Figure 4. Examples of dynamic and static factors that relate to driver, vehicle, roadway, and environmental variables at the trip level and event level. [ref]
Figure 5. Relationships among S05 contractor questions (five and eight themes). [ref]
- Dozza, M., Bärgman, J., & Lee, J. D. (2012). Chunking: A procedure to improve naturalistic data analysis. Accident Analysis & Prevention. [pdf]
- Boyle, L. N., Hallmark, S., Lee, J. D., McGehee, D. V., Neyens, D. M., & Ward, N. J. (2012). Integration of Analysis Methods and Development of Analysis Plan. SHRP2 (Strategic Highway Research Program). [pdf]
- Yekhshatyan, L., & Lee, J. D. (2013). Changes in the correlation between eye and steering movements indicate driver distraction. IEEE Transactions on Intelligent Transportation Systems. [pdf]
Dr. Lee used text mining to decipher free-response consumer complaints in the NHTSA vehicle owner’s complaint database (Ghazizadeh, et al., 2014). The analysis showed that complaints increased just before the recalls for Toyota and Ford/Firestone.
VOQ Text Mining
Figure 1. Comparison of the Airbag clusters identified in fatal incidents and incidents involving injury. The vertical axis shows the frequency of each term relative to each level of severity, and the size and horizontal position of the terms reflect their average frequency. In plotting this graph, the term “airbag” was removed from both clusters, as it had a much higher frequency than the other terms and would make it difficult to see any other terms. In addition, those terms that occurred in fewer than 10% of the reports were removed to reduce clutter. [ref]
Figure 2. Comparison of the Contact clusters identified in fatal incidents, incidents involving injury, and minor incidents. The size of the terms reflects their average frequency. In plotting this graph, the term “contact” was removed from all three clusters, as it had a much higher frequency than the other terms and would make it difficult to see any other terms. In addition, those terms that occurred in fewer than 10% of the reports were removed to reduce clutter. [ref]
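The term filtering described in the two captions above can be sketched roughly as follows. This is a minimal illustration, not the authors' actual pipeline: the function name `frequent_terms`, the whitespace tokenization, and the toy complaint strings are all assumptions made for the example. It drops the cluster-naming term and keeps only terms that appear in at least a given share of reports.

```python
from collections import Counter

def frequent_terms(reports, cluster_term, min_share=0.10):
    """Return {term: share of reports containing it}, after dropping
    the cluster-naming term and terms below the min_share cutoff.
    Hypothetical helper; whitespace tokenization is an assumption."""
    n = len(reports)
    doc_freq = Counter()
    for text in reports:
        # Count each term once per report (document frequency).
        doc_freq.update(set(text.lower().split()))
    return {
        term: count / n
        for term, count in doc_freq.items()
        if term != cluster_term and count / n >= min_share
    }

# Toy complaints invented purely for illustration.
reports = [
    "airbag failed to deploy in crash",
    "airbag light on dashboard",
    "airbag deploy without warning",
    "airbag recall notice",
]
# A 50% cutoff is used here only because the toy sample is tiny;
# the captions describe a 10% cutoff.
shares = frequent_terms(reports, cluster_term="airbag", min_share=0.5)
```

Counting each term once per report makes the cutoff a share of reports rather than a share of all words, which matches the wording "occurred in fewer than 10% of the reports".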
- Ghazizadeh, M., McDonald, A. D., and Lee, J. D. (2014). Text mining to decipher free-response consumer complaints: Insights from the NHTSA vehicle owner’s complaint database. [pdf]
Dr. Noyce and his team have integrated weather data with crash data using spatial statistics for ice-, snow-, and rain-related crashes (Khan, et al., 2008 and Khan, et al., 2009).
Safety Analyses Related to Weather
Figure 1. Relative rain crash rates by Wisconsin county 2000 to 2002. [ref]
Figure 2. Local Moran’s I analysis for ice-related crashes. [ref]
- Khan, G., Qin, X., and Noyce, D. (2008). Spatial Analysis of Weather Crash Patterns. J. Transp. Eng., 134(5), 191–202. [pdf]
- Khan, G., Santiago-Chaparro, K., Qin, X., and Noyce, D. Application and Integration of Lattice Data Analysis, Network K Functions, and Geographic Information System Software to Study Ice-Related Crashes. Transportation Research Record, v2136, pp. 67–76, 2009. [pdf]
Safety Data Integration
Dr. Noyce has developed a safety data integration framework that combines roadway infrastructure, pavement marking, signing, weather, and traffic information with crash data (Khan, et al., 2010). This framework has been used to analyze the safety performance of highway curves (Khan, et al., 2012 and Khan, et al., 2013). A recent publication by Dr. Noyce studied secondary crashes by integrating multiple data sources for an entire state and received the 2014 Outstanding Paper Award from TRB Committee ANB20 (Zheng, et al., 2014).
Figure 1. Safety data integration.
Figure 2. Development process and description of horizontal curve (H. curve) data set for crash prediction models (HCM = Highway Capacity Manual). [ref]
Figure 3. Details of horizontal curve data set on Wisconsin STN roads. [ref]
Secondary Crash Identification
Figure 4. Framework of large-scale secondary crash identification using integrated highway and crash data. [ref]
- Khan, G., Santiago-Chaparro, K., Chitturi, M., and Noyce, D. Development of Data Collection and Integration Framework for Road Inventory Data. Transportation Research Record: Journal of the Transportation Research Board, v2160, pp. 29–39, 2010. [pdf]
- Khan, G., Bill, A., Chitturi, M., and Noyce, D. Horizontal Curves, Signs, and Safety. Transportation Research Record: Journal of the Transportation Research Board, v2279, pp. 124–131, 2012. [pdf]
- Khan, G., Bill, A., Chitturi, M., and Noyce, D. Safety Evaluation of Horizontal Curves on Rural Undivided Roads. Transportation Research Record: Journal of the Transportation Research Board, v2386, pp. 147–157, 2013. [pdf]
- Dongxi Zheng, Madhav Chitturi, Andrea Bill, and David A. Noyce. Secondary Crash Identification on a Large-Scale Highway System. In Proceedings of the 2014 TRB Annual Meeting, Jan. 2014. Outstanding Paper Award of the Safety Data, Analysis and Evaluation (ANB20) Committee. [pdf]
| Research Topic | David Noyce | John Lee | Jerry Zhu | Steven Parker |
| Social Media Analysis | | ○ | ● | |
● indicates substantial expertise (three or more papers)
○ indicates marginal expertise (at least one paper or project)