A Data-centric Model Transformation Approach using Model2GraphFrame Transformations





Model Extractor, Data-centric approach, Spark GraphFrames, Model Transformations


Data-centric (Dc) approaches are used for data processing in several application domains, such as distributed systems and natural language processing. Several data processing frameworks ease the task of parallel and distributed data processing. However, few research approaches study how to execute model manipulation operations, such as model transformations, on these frameworks. In addition, it is often necessary to extract models in XMI-based formats into possibly distributed models. In this paper, we present a Model2GraphFrame operation that extracts a model from a modeling technical space into the Apache Spark framework and its supported GraphFrame format. The operation generates a GraphFrame from the input models, which can then be used for partitioning and for processing model operations. We used two model partitioning strategies: one based on sub-graphs and one based on clustering. The approach allows model analysis to be performed by applying operations on the generated graphs, as well as Model Transformations (MT). The proof-of-concept results for Model2GraphFrame extraction, GraphFrame partitioning, GraphFrame connectivity, and GraphFrame model transformations indicate that our Model Extraction can be used in various application domains, since it enables the specification of analytical expressions on graphs. Furthermore, its model graph elements are used in model transformations on a scalable platform.
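As a rough illustration of the extraction idea summarized above, the following minimal sketch turns a small XMI-like document into vertex and edge rows and then groups the elements into connected sub-graphs. It is not the authors' implementation: the sample document, the `refs` attribute, and the helper names `model_to_tables` and `connected_subgraphs` are assumptions made for this example. Only the column names `id`, `src`, and `dst` follow the convention that GraphFrames expects for its vertex and edge tables.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical XMI-like input. The element names and the "refs" attribute are
# illustrative assumptions, not the metamodel used in the paper.
SAMPLE_XMI = """<Model>
  <Class id="c1" name="Family" refs="c2 c3"/>
  <Class id="c2" name="Member" refs=""/>
  <Class id="c3" name="Address" refs=""/>
  <Class id="c4" name="Orphan" refs="c5"/>
  <Class id="c5" name="Leaf" refs=""/>
</Model>"""


def model_to_tables(xmi_text):
    """Return (vertices, edges) as row dicts with id / src / dst columns."""
    root = ET.fromstring(xmi_text)
    vertices, edges = [], []
    for elem in root.iter():
        elem_id = elem.get("id")
        if elem_id is None:
            continue  # skip elements without an identifier, such as the root
        vertices.append(
            {"id": elem_id, "type": elem.tag, "name": elem.get("name", "")}
        )
        for target in elem.get("refs", "").split():
            edges.append({"src": elem_id, "dst": target, "relationship": "refs"})
    return vertices, edges


def connected_subgraphs(vertices, edges):
    """Group vertex ids into weakly connected sub-graphs using union-find."""
    parent = {v["id"]: v["id"] for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for e in edges:
        # Ignore dangling references whose endpoints are not model elements.
        if e["src"] in parent and e["dst"] in parent:
            parent[find(e["src"])] = find(e["dst"])

    groups = defaultdict(list)
    for v in vertices:
        groups[find(v["id"])].append(v["id"])
    return sorted(sorted(g) for g in groups.values())


vertices, edges = model_to_tables(SAMPLE_XMI)
partitions = connected_subgraphs(vertices, edges)
```

In a Spark setting, rows shaped like these could be loaded into two DataFrames and passed to the `GraphFrame(vertices, edges)` constructor of the graphframes package, after which a built-in operation such as `connectedComponents()` would take the place of the hand-written union-find used here.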








How to Cite

Camargo, L. C., & Del Fabro, M. D. (2021). A Data-centric Model Transformation Approach using Model2GraphFrame Transformations. Journal of Software Engineering Research and Development, 9(1), 10:1 – 10:17. https://doi.org/10.5753/jserd.2021.477



Research Article