Vehicle Energy Dataset - an enriched dataset, balancing data integrity and anonymization

  • João Batista Firmino Júnior Federal Institute of Education, Science and Technology of Paraíba (IFPB) http://orcid.org/0000-0001-8038-8516
  • Francisco Dantas Nobre Neto Federal Institute of Education, Science and Technology of Paraíba (IFPB)

Abstract


Obtaining displacement datasets for research on destination and trajectory prediction is a challenging task. This study describes and mines the Vehicle Energy Dataset (VED), which is suitable for evaluating clustering techniques and contains 384 vehicles and their displacements, concentrated in the city of Ann Arbor, Michigan (USA). The work combines the ST-DBSCAN algorithm, a spatio-temporal clustering technique, with map-matching under an adequate Coordinate Reference System, and counts how often the origin-destination cells of the trajectories repeat. In the end, 231 vehicles were obtained, each with at least 1 trajectory whose origin-destination pair repeats at least 2 times.
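The origin-destination counting step can be illustrated with a minimal Python sketch. It is not the authors' implementation. The grid size `CELL_DEG`, the trip tuple layout, the sample coordinates, and the function names `to_cell` and `repeated_od_vehicles` are all hypothetical. The sketch reads "a repetition of at least 2 times" as the same origin-destination pair occurring at least twice. For brevity it bins raw latitude and longitude in degrees, whereas the study works in an adequate Coordinate Reference System.

```python
from collections import Counter, defaultdict
from math import floor

CELL_DEG = 0.005  # assumed cell size in degrees; not a value from the paper


def to_cell(lat, lon, cell_deg=CELL_DEG):
    """Map a coordinate to an integer (row, col) grid-cell index."""
    return (floor(lat / cell_deg), floor(lon / cell_deg))


def repeated_od_vehicles(trips, min_repeats=2, cell_deg=CELL_DEG):
    """Count origin-destination cell pairs per vehicle.

    trips: iterable of (vehicle_id, o_lat, o_lon, d_lat, d_lon) tuples
           (a hypothetical layout, not the VED file format).
    Returns {vehicle_id: {(origin_cell, dest_cell): count}} keeping only
    pairs seen at least min_repeats times.
    """
    counts = defaultdict(Counter)
    for veh_id, o_lat, o_lon, d_lat, d_lon in trips:
        od = (to_cell(o_lat, o_lon, cell_deg), to_cell(d_lat, d_lon, cell_deg))
        counts[veh_id][od] += 1

    result = {}
    for veh_id, od_counts in counts.items():
        kept = {od: n for od, n in od_counts.items() if n >= min_repeats}
        if kept:
            result[veh_id] = kept
    return result


# Made-up trips near Ann Arbor, for illustration only.
trips = [
    (1, 42.2808, -83.7430, 42.2960, -83.7120),
    (1, 42.2809, -83.7431, 42.2961, -83.7121),  # same cells as the first trip
    (1, 42.2808, -83.7430, 42.2510, -83.7610),  # different destination cell
    (2, 42.3010, -83.7010, 42.2610, -83.7310),  # single trip, no repetition
]
repeated = repeated_od_vehicles(trips)
```

With this made-up input, only vehicle 1 is retained, because two of its trips fall into the same origin cell and the same destination cell. A projected, metre-based grid would give cells of uniform size, which is one reason a suitable Coordinate Reference System matters for this kind of counting.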

Keywords: Vehicle energy dataset, Integrity, Anonymization, Data, Trajectories

Published
2023-09-25
FIRMINO JÚNIOR, João Batista; NOBRE NETO, Francisco Dantas. Vehicle Energy Dataset - an enriched dataset, balancing data integrity and anonymization. In: DATASET SHOWCASE WORKSHOP (DSW), 5., 2023, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 109-118. DOI: https://doi.org/10.5753/dsw.2023.235742.