Finding the features that best describe the over-speeding problem in the state of Montana
This is a thesis project completed as part of ST606 - MSc in Data Science and Analytics - Project & Dissertation at Maynooth University, Ireland.
The project presents analysis and modelling techniques that attempt to address the over-speeding problem the state of Montana has faced since the government abolished the daytime speed limit. The dataset contains various attributes, and through the analysis we identify the set of variables most strongly associated with the state’s speeding violations. We look for modelling strategies that fit the variation in the data, and we obtain a fit that captures most of the variation in our dependent variable (the number of speeding stops for each outcome group: warning, citation and arrest). The data are over-dispersed: the variance of the counts exceeds their mean, whereas a Poisson model assumes the two are equal, so inferences drawn from a Poisson model would be incorrect. An alternative method for over-dispersed count data, the negative binomial model, is therefore used. When the fitted models are compared by AIC, the negative binomial model is found to fit the data best. The full negative binomial model also gave better results than the pruned negative binomial models.
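The Poisson versus negative binomial comparison described above can be illustrated with a minimal sketch. It is not the thesis code: it uses only the Python standard library, fits intercept-only models (no covariates), approximates the negative binomial size parameter with a coarse grid search rather than a proper optimiser, and runs on made-up counts rather than the Montana data. All function names are hypothetical.

```python
import math
from statistics import mean, variance


def poisson_loglik(counts, lam):
    """Log-likelihood of i.i.d. Poisson(lam) counts."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in counts)


def negbin_loglik(counts, mu, size):
    """Log-likelihood of i.i.d. negative binomial counts.

    Parameterised by mean mu and size, so Var(Y) = mu + mu**2 / size.
    """
    return sum(
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
        for y in counts
    )


def aic(loglik, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * loglik


def compare_models(counts):
    """Compare intercept-only Poisson and negative binomial fits by AIC."""
    mu = mean(counts)  # sample mean is the MLE of the mean in both models
    # Assumption: a coarse log-spaced grid (0.01 to 10^4) stands in for
    # a numerical optimiser when estimating the size parameter.
    grid = [10 ** (k / 50) for k in range(-100, 201)]
    size = max(grid, key=lambda s: negbin_loglik(counts, mu, s))
    return {
        "dispersion_ratio": variance(counts) / mu,  # about 1 under Poisson
        "size": size,
        "aic_poisson": aic(poisson_loglik(counts, mu), 1),
        "aic_negbin": aic(negbin_loglik(counts, mu, size), 2),
    }


# Made-up illustrative counts, NOT the thesis data.
counts = [0, 1, 1, 2, 3, 5, 8, 13, 21, 40]
result = compare_models(counts)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

A variance-to-mean ratio well above 1 signals over-dispersion, and the model with the lower AIC is preferred. The models in the thesis include covariates, for which a GLM routine from a statistics package would be the natural tool.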
For more information, please refer to the main thesis report.
Srishti Kakkar, MSc Data Science and Analytics candidate
Dr. Niamh Cahill, Lecturer / Assistant Professor at Maynooth University