An analysis of the data quality in reports of Brazilian federal highways for the data mining process
Abstract
This article presents a study on applying the data mining process to data on federal highways generated by the Federal Highway Police in 2012. The aim of the study is to analyze the feasibility of applying the process to these data in order to identify associations between variables related to traffic accidents on all Brazilian federal highways. We present the main difficulties encountered during implementation, report the results obtained using the PART and Apriori learning algorithms, and describe the future work to be performed based on this study.
Keywords:
electronic government
Published
2014-05-27
How to Cite
COSTA, Jefferson de J.; BERNARDINI, Flavia Cristina; LIMA, Thiago J. B. de; VITERBO, José. An analysis of the data quality in reports of Brazilian federal highways for the data mining process. In: LATIN AMERICAN SYMPOSIUM ON DIGITAL GOVERNMENT (LASDIGOV), 6., 2014, Londrina. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2014. p. 9-16. ISSN 2763-8723.
