Evolution of XR Research in Brazil according to the first 22 SVR editions

Although Virtual Reality technology was first developed almost sixty years ago, there has been little survey work giving an overview of how research in VR, AR and MR has evolved in Brazil, along with its future trends. We provide such an analysis by reviewing the developments made since the first WRV event, back in 1997, until SVR 2020. The first 22 event editions may help understand how the area was explored and provide a roadmap for future research. The 609 full papers analyzed were compiled into an open catalog, available on the internet. It features important filter capabilities, enabling the quick selection of papers based on citation number, conference topic, area of application, user experiments, statistical analysis and so on. We hope this tool will be of great value to the field, and that it will help researchers decide which topics to explore when they are beginning their own studies in the area. In this analysis, we also refer to the most frequent authors in the area and how they contributed to the field based on their expertise and research group.


Introduction
Extended Reality (XR) embraces technologies that allow humans or physical objects to interact with computer-generated virtual imagery in real time. While in Virtual Reality (VR) the user is completely immersed in a virtual environment (Burdea and Coiffet, 2003), Augmented Reality (AR) overlays virtual content on the real world (Azuma, 1997). Mixed Reality (MR) goes beyond that and is capable of anchoring such virtual content in the real world (Ohta and Tamura, 2014).
Although Morton Heilig patented one of the first VR experiences, the Sensorama 1, in 1962, it was only 35 years later that the first Brazilian VR conference was held: the Workshop on Virtual Reality '97 (WRV 1997). Since then, the scope of the conference has evolved to embrace more Human-Computer Interaction topics, including AR and MR, and in 2001 it was renamed SVR (Symposium on Virtual and Augmented Reality). This paper reviews the history of SVR and the earlier WRVs until 2020, which comprises 22 editions of the aforementioned events. Naturally, they are not the only venues for presenting XR research. However, SVR is the premier conference for the field in Brazil, so tracking research trends through it provides an interesting history of the evolution of such areas and helps identify areas for future research.
As Thomas S. Monson, an American religious leader, put it, one should "learn from the past, prepare for the future and live in the present" (Monson, 1985). That is exactly what this paper is about. We intend to show what was developed in the past, the developments being made at the moment and what to expect from the evolving technologies in the future. To that end, the remainder of this paper is structured as follows. We describe the method used in this research in Section 2. A broader analysis of the findings from the SVR publications is presented in Section 3. Section 4 lists and discusses the five most cited works on each topic in the history of SVR. Section 5 tries to find a relationship between what was researched and future directions for those technologies. Section 6 provides a profile of the most frequent authors and institutions involved in the analyzed publications. Finally, we conclude in Section 7.
1 https://patents.google.com/patent/US3050870A/en

Method
The main method used in this research was based on the work of Zhou et al. (2008) and reviews previously published full papers from the conference proceedings of WRV 97, WRV 99, WRV 2000, SVR 2001, SVR 2002, SVR 2003, SVR 2004, and SVR 2006 to SVR 2020. Note that there was no event in 1998 or 2005. After the first event in 1997, the researchers decided to hold its second edition only two years later, in 1999. In 2004, the scientific board of the event decided to adjust the schedule so that the event would happen in the first semester of the year. Since the rescheduled event would be too close to the next edition, they decided to skip 2005. There are 609 full papers contained in these proceedings, providing an interesting snapshot of emerging research trends in XR over the last twenty-two events. We exclude short papers and posters, which are typically shorter and usually not reviewed as rigorously. Our analysis of the collected research was specifically guided by the following three questions:
1. Which areas have been explored in XR?
2. What are the developments and key problems in these areas?
3. What are important future trends for XR research?
In addition to analyzing paper topics, we also measured their relative impact by counting the number of citations of the papers. This was obtained by taking the total number of citations as reported on Google Scholar in September 2021. There are issues with the absolute accuracy of citation data from Google Scholar, such as the reporting of citations from non-scholarly sources, but it at least provides some indication of the relative importance of the papers.
We also carefully collected information regarding the authors responsible for the publications, together with their corresponding research groups/institutions. This way, we can determine the most frequent authors and their areas of expertise, serving as a guide for new researchers who are in search of references for possible postgraduate research or other partnerships.

SVR proceedings review results
During the evolution of XR, a variety of related research topics have been developed and discussed extensively. In our work, we grouped past XR research into the eleven topics shown in Figure 1. These topics are based on the key topics used in the previous editions of the conference. It is important to note that such topics evolved over the years. For instance, the SVR 2020 topics include, besides the previous 11 topics:
• Virtual environments evaluation
• Machine learning for XR
• User studies and evaluation
• Advanced display technology
• Immersive projection technology
• Perception, presence, and cognition in XR
Based on a different approach, the call for papers from SVR 2021 states that it accepts "works that present original unpublished research in the areas of virtual, augmented and mixed realities and related areas". The usual topics were not adopted in 2021, which makes the call more generic, since "related areas" can cover any research the authors judge to be related to the fields of virtual/augmented reality.
From the total set of papers analyzed, it is apparent that the papers published can be broken down into two groups. The first group contains the five main research topics:
1. Systems, frameworks and toolkits (40.07%, i.e. 244/609)
2. Social, economic, and technical impacts of XR (13.63%)
3. Computer graphics techniques for XR (10.84%)
4. Multi-user and distributed XR (8.37%)
5. 3D interaction (7.39%)
These are the five topics on which papers are published most frequently, which is interesting because they are not all in the core technology areas needed to deliver an XR application (for instance, tracking and sensing and devices are core technologies). This means that most SVR publications are focused on using technologies to solve applied problems, instead of developing new core technologies. The second group of topics reflects some core technology areas and some specific application areas, such as Teleoperation and telepresence, including:
6. Virtual humans and avatars (5.42%)
7. Tracking and sensing (4.76%)
Table 1 shows the proportion of cited papers relative to the number of papers published for each topic.
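As a quick check of the arithmetic behind these shares, the sketch below recomputes the 40.07% figure from the 244/609 count given above and back-derives the whole number of papers implied by each of the other reported percentages. Only the total (609), the count 244 and the percentages come from the text; the names `share`, `implied_count` and `REPORTED_SHARES` are illustrative, and the derived counts are inferences from rounding, not figures reported by the authors.

```python
TOTAL_PAPERS = 609  # full papers analyzed, as stated in the text

# Percentages as reported in the text for the seven listed topics.
REPORTED_SHARES = {
    "Systems, frameworks and toolkits": 40.07,
    "Social, economic, and technical impacts of XR": 13.63,
    "Computer graphics techniques for XR": 10.84,
    "Multi-user and distributed XR": 8.37,
    "3D interaction": 7.39,
    "Virtual humans and avatars": 5.42,
    "Tracking and sensing": 4.76,
}


def share(count, total=TOTAL_PAPERS):
    """Percentage of `total` represented by `count`, rounded to two decimals."""
    return round(100.0 * count / total, 2)


def implied_count(percent, total=TOTAL_PAPERS):
    """Nearest whole number of papers implied by a reported percentage."""
    return round(percent / 100.0 * total)


if __name__ == "__main__":
    for topic, percent in REPORTED_SHARES.items():
        count = implied_count(percent)  # inferred, not reported in the text
        print(f"{topic}: {percent}% ~ {count} papers (recomputed {share(count)}%)")
```

Under these assumptions, each inferred count rounds back to the percentage stated in the text, which suggests the reported shares are internally consistent with a total of 609 papers.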
The evolution of the citation numbers per year is shown in Figure 2. It illustrates a varying curve composed of many peaks and valleys. Such behaviour leads us to believe there is a two-year interval between the publication of works considered more relevant to the scientific community. Table 2 provides an overview of the most cited works for each event. We also considered papers from SVR 2020, since some of them already have 4 citations, even within this short time span (7 months since SVR 2020).
We also grouped the analyzed papers according to their application areas, similar to the previous topic classification. The selected areas were:
• Education: works regarding education, learning, training and other related areas;
• Health: works regarding medicine, physiotherapy, phobia treatments and other related areas;
• Industry: works regarding improvement in industrial production, quality of manufacturing processes and other related areas;
• Games: works targeting games and entertainment related areas;
• Other: works regarding areas not listed before;
• N/A: works that were not specific to an area, for instance an AR toolkit or a body tracking solution (these may be used for different application areas).
Figure 3 shows how the published papers were distributed across the main application areas. Since the "N/A" category represents papers that have no specific target area and the "Other" category groups different areas that were not mentioned, the most important application areas targeted by SVR papers may be considered to be Education and Health.
One important point when developing new technologies and solutions is how to evaluate them. Usually, user studies are applied to that end, and for this reason this aspect was also analyzed across the 609 papers. For all of them, we checked whether they had a user study to validate the work and whether some statistical analysis of the study was made available as well. Figure 4 illustrates the results found. It is possible to perceive a growing tendency for both user studies and statistical analyses over the years, showing how important it is to perform such validation.
Despite its internationalization efforts, SVR is mainly comprised of publications from Brazilian authors. Of the 609 analyzed publications, only 7.39% (45 papers) are from foreign authors, distributed amongst Europe (28), North America (8), Asia (6) and South America (3). The geographical distribution of Brazilian authors is shown in Figure 5, considering the five Brazilian macro regions: North, Northeast, Midwest, Southeast and South.
All information discussed in this section is available in the form of an online open catalog 2, as shown in Figure 6.

The development of research topics
This section analyzes the five most cited works for each of the research topics previously described.

Systems/frameworks/toolkits
Being the most frequent topic across all 609 publications (40.07%), Systems/frameworks/toolkits is a general topic that comprises different kinds of work. According to the evolution of the number of papers in this area shown in Figure 7, this topic shows no clear upward or downward trend throughout the years. To perceive how generic this topic can be, one just needs to take a look at the five most cited works.
In 2001, Zuffo et al. (2001) described the first CAVE visualization system implemented in Latin America. It consisted of an integrated VR solution with multi-projection stereoscopy for fully immersive VR applications, composed of five 3 m x 3 m projection planes (4 walls and the floor) with a resolution of 2000x2500 pixels.
In 2006, Niniss and Inoue (2006) described a driving simulator for electric powered wheelchairs that uses a hemispherical display system capable of providing a very wide horizontal FOV. The authors also warned that cyber sickness should be prevented and minimized as much as possible to reduce the discomfort of the tested subjects. It is interesting to notice that cyber sickness was not yet a concern in the work of Zuffo et al. (2001).
In 2011, Wu and Boulanger (2011) proposed a framework to tackle the problem of missing markers due to occlusions or ambiguities in motion capture solutions. A Kalman filter estimated the positions of the missing markers in real time and effectively reconstructed the captured human motion. This work was important because there were not yet many works on the use of Microsoft Kinect V1 for human body tracking (it was released in November 2010), and the solution presented by Wu and Boulanger (2011) enhanced the tracking results of existing commercial systems such as OptiTrack.
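To illustrate the general idea of bridging occlusions with a Kalman filter, the following is a minimal sketch for a single marker coordinate. It assumes a constant-velocity motion model, scalar noise variances and the hypothetical function name `kalman_fill`; none of these details come from Wu and Boulanger (2011), whose actual formulation may differ. Frames in which the marker is occluded are passed as `None`, and the filter then relies on its prediction alone.

```python
def kalman_fill(measurements, dt=1.0, process_var=1e-3, meas_var=1e-2):
    """Estimate one marker coordinate per frame; None marks an occluded frame.

    Assumed model (illustrative only): state = [position, velocity],
    constant velocity between frames, position measured directly.
    """
    first = next((z for z in measurements if z is not None), None)
    if first is None:
        raise ValueError("at least one frame must contain a measurement")

    pos, vel = first, 0.0
    # 2x2 state covariance, stored as four scalars.
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0
    estimates = []

    for z in measurements:
        # Predict: advance the state and grow the uncertainty.
        pos = pos + vel * dt
        p00, p01, p10, p11 = (
            p00 + dt * (p01 + p10) + dt * dt * p11 + process_var,
            p01 + dt * p11,
            p10 + dt * p11,
            p11 + process_var,
        )

        if z is not None:
            # Update: blend the prediction with the observed position.
            innovation = z - pos
            s = p00 + meas_var
            k0, k1 = p00 / s, p10 / s
            pos += k0 * innovation
            vel += k1 * innovation
            p00, p01, p10, p11 = (
                (1.0 - k0) * p00,
                (1.0 - k0) * p01,
                p10 - k1 * p00,
                p11 - k1 * p01,
            )
        # When z is None, the prediction itself is the estimate.
        estimates.append(pos)

    return estimates


if __name__ == "__main__":
    # A marker moving one unit per frame, occluded during two frames.
    track = [0.0, 1.0, 2.0, 3.0, 4.0, None, None, 7.0, 8.0]
    print(kalman_fill(track))
```

A full motion capture pipeline would run one such filter per marker and per axis (or a joint 3D state), and would tune the noise variances to the capture hardware; those choices are outside the scope of this sketch.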
In 2014, de Paiva Guimarães and Martins (2014) developed a checklist to measure the usability of AR applications in a practical way. By adapting ISO 9241-11 and the Nielsen Heuristics to the AR context, they reached a good solution for AR usability evaluation. As a curiosity, the first AR-related works published in SVR are from 2003. This means that it took about 11 years from the time AR works started being published in Brazil for researchers to become interested in measuring the usability of general AR applications.
In 2016, De Oliveira et al. (2016) discussed the possibility of immersion not in another environment but in another person's body. The authors used a low budget system that reproduced a person's head movements as if one's own head were in another body viewed through a head mounted display (HMD) while having body agency, i.e., controlling the movements of another real body as if it were a "real avatar". They described the tool in detail and discussed its feasibility and preliminary results based on the analysis of the participants' perceptions, collected through validated questionnaires and in-depth interviews. Approaches such as the one proposed by De Oliveira et al. (2016) would become very important in the years following its publication, especially in the areas of neuroscience, psychology and education.

Social/economic/technical impacts of XR
Being the second most frequent topic across all 609 publications (13.63%), Social/economic/technical impacts of XR represents works that focus on the effect they have on their users, whether social, economic or technical. According to the evolution of the number of papers in this area shown in Figure 8, this topic demonstrates an increase in the number of publications over the years. This growth is most noticeable in the last two years. The five most cited works regarding this topic are listed as follows.
In 1999, Costa et al. (1999) addressed issues related to the rehabilitation of people with brain disorders and presented the main characteristics of an integrated virtual environment focused on cognitive rehabilitation. The work described the appearance of the environments, the cognitive function associated with each environment and the stimulus-generating task that composed this integrated virtual environment. This work is important since it is the first one related to Health applications in the history of SVR.
As another work related to Health applications, in 2003, Dainese et al. (2003) proposed an AR system for the cognitive development of deaf children. The proposal suggested exploring visual resources to facilitate the understanding and learning of new content. It defined the use of video cameras, head trackers, gloves, polarized glasses and monitors attached to a computer. For the development of the software, it proposed the use of Microsoft Visual C++, ARToolKit and OpenGL.
Coincidentally, in 2012, Mirzaei et al. (2012) described a system to help deaf people in the communication process. The system combined AR, Automatic Speech Recognition (ASR), Text-to-Speech Synthesis (TTS) and Audio-Visual Speech Recognition (AVSR). It converted the narrator's speech into readable text and displayed it on the AR display. The AVSR was used to improve system accuracy in noisy environments and the TTS to convert input text into speech. Of the five most cited papers in the Social/economic/technical impacts of XR category, three are related to Health. This demonstrates the strength of the topic and the interest of different areas in using VR/AR-related technologies as a means to a serious end.
Also in 2012, Simões et al. (2012) analyzed the advantages and drawbacks of using image-based 3D reconstruction techniques and tools in the uncontrolled environment of an electrical substation. The characteristics of the scenario and tools were considered from the point of view of their limitations. Some available tools were evaluated, and the relationship between the characteristics of the scenario or object and the quality of the reconstructed models was pointed out. This work showed a concern with finding the best cost-benefit solutions as an alternative to laser-based 3D reconstruction. Nowadays, laser-based solutions are still considered the most accurate ones, but vision-based solutions (using RGB or RGBD sensors) are capable of achieving comparable results and may serve as viable alternatives when financial resources are too limited.
In 2013, dos Santos et al. (2013) developed a Systematic Literature Review (SLR) to identify evidence in the literature on the use of Requirements Engineering (RE) for VR systems and the contributions of VR to the RE process. The results contributed to an understanding of the RE process in the field of VR and indicated gaps that highlighted opportunities for investigation. Since AR/VR-related software is unconventional (in comparison to software developed for other areas), it is important to pay attention to RE specifically tailored to this kind of development. Any means capable of easing the development process of AR/VR solutions and assuring their success is welcomed by the scientific community.

Computer graphics techniques for XR
Being the third most frequent topic across all 609 publications (10.84%), Computer graphics techniques for XR represents works that develop algorithms for data visualization, rendering, physics simulation and other related areas. According to the evolution of the number of papers in this area shown in Figure 9, it had its peak near 2008 and now demonstrates modest growth. The five most cited works regarding this topic are listed as follows. The most cited papers in this category are mostly focused on performance and photorealistic techniques.
In 2002, Arsenault and Ware (2002) discussed the issue of screen size with respect to the task of navigating in 3D computer graphics environments. The experimental results exemplify the "robustness of visual perspective" phenomenon that has been reported by vision researchers. The work analyzed the performance of the user while testing different screen sizes during a navigation task.
In the same year, Figueiredo et al. (2002) looked into the then-current approaches for collision detection that can be used in real time interactive virtual environments and presented a taxonomy for classifying different collision detection algorithms. The authors also compared the performance of collision detection algorithms based on axis-aligned bounding boxes, discrete orientation polytopes and oriented bounding boxes. This work also focused on performance, but this time on computational performance. Almost 20 years have passed and collision detection is still an open problem. It is really difficult to have an effective collision detection algorithm capable of processing complex triangle meshes in real time. The best solutions make use of GPUs for parallel processing, and engines such as Unity 3D or Unreal come with simple collision detectors, which are not always suitable for dealing with general triangle meshes.
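As an illustration of the simplest of the bounding volumes mentioned above, the sketch below builds axis-aligned bounding boxes (AABBs) from vertex lists and tests two of them for overlap. The names `AABB`, `aabb_from_points` and `aabbs_overlap` are hypothetical, and the code is not taken from Figueiredo et al. (2002); it only shows the standard per-axis interval test that makes AABBs a cheap first filter before more expensive mesh-level checks.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class AABB:
    """Axis-aligned bounding box described by its two extreme corners."""
    min_corner: Point
    max_corner: Point


def aabb_from_points(points: Sequence[Point]) -> AABB:
    """Smallest axis-aligned box enclosing all given 3D points."""
    if not points:
        raise ValueError("at least one point is required")
    lo = tuple(min(p[axis] for p in points) for axis in range(3))
    hi = tuple(max(p[axis] for p in points) for axis in range(3))
    return AABB(lo, hi)


def aabbs_overlap(a: AABB, b: AABB) -> bool:
    """Two AABBs overlap only if their intervals overlap on every axis."""
    return all(
        a.min_corner[axis] <= b.max_corner[axis]
        and b.min_corner[axis] <= a.max_corner[axis]
        for axis in range(3)
    )


if __name__ == "__main__":
    cube = aabb_from_points([(0, 0, 0), (1, 1, 1)])
    touching = aabb_from_points([(0.5, 0.5, 0.5), (2, 2, 2)])
    far_away = aabb_from_points([(3, 3, 3), (4, 4, 4)])
    print(aabbs_overlap(cube, touching), aabbs_overlap(cube, far_away))
```

Because the test only compares six pairs of numbers, it is typically used as a broad-phase rejection step; an overlap between boxes does not by itself mean that the enclosed triangle meshes intersect.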
In 2008, Pessoa et al. (2008) presented a combined solution to realistically insert virtual objects into real scenes. The techniques implemented obtained good visual results. As interactive performance was achieved, the solution presented is suitable for inserting virtual objects into AR applications in a photorealistic way. This was one of the first works published in SVR that focused on photorealistic results for AR. The authors made use of environment maps and shader programming, which would later be substituted by NVIDIA's CUDA-enabled algorithms.
In 2012, Dos  introduced a graphics rendering pipeline applied to AR, based on a real time ray tracing paradigm. As a proof of concept, a case study using both the ARToolKitPlus library and the Microsoft Kinect was developed. This work is important because it was one of the first efforts to implement ray tracing-based solutions for AR applications using the CUDA framework. In 2020, NVIDIA enabled real time ray tracing for most of its GPUs, making this type of rendering the common ground for 3D applications.
Also in 2012, de Figueiredo et al. (2012) described a method to generate virtual objects with realistic shadows in both indoor and outdoor scenes. The authors created smooth shadows in real time on mobile platforms and discussed the possibilities, limitations and specific implementation details of these platforms. This was the first work on real time light source estimation for soft shadow synthesis focusing on mobile AR. Nowadays, the graphics capabilities of mobile GPUs have evolved to a point where soft shadows may come at no cost when ray tracing techniques are used. In 2012, the approach was implemented using the conventional rasterization pipeline, through the use of shader programming.

Multi-user and distributed XR
Being the fourth most frequent topic across all 609 publications (8.37%), Multi-user and distributed XR represents works that encompass collaborative, multi-user and distributed solutions. According to the evolution of the number of papers in this area shown in Figure 10, there is a decrease in the number of papers in this category. This may be due to the increase in computational processing power (real time processing can be performed locally, using the GPU, for instance) and better network quality (higher bandwidths and lower delays diminish network-related problems). With most of the world population in quarantine due to the Covid-19 pandemic, there has been a boom in the adoption of distributed solutions, so this topic may reemerge in the coming years. The six most cited works regarding this topic are listed as follows. The majority of the works in this category analyze the performance of distributed XR solutions focusing on medical training. Collaborative simulators such as CyberMed and SimCEC were used in these studies.
In 2003, Jaramillo et al. (2003) proposed a communication architecture for scaling distributed environments, aimed at minimizing the number of required multicast groups. The goal was met by taking advantage of the free processing capacity of the participating entities' hosts and by providing a load-balancing mechanism that prevents the hosts' processing capacity from being exceeded. This paper was one of the first SVR works to address the problem of scaling distributed environments. About 8 years later, the next work analyzed network performance on hybrid networks instead of local ones. This work also gave birth to a research interest focused on the infrastructure of distributed XR solutions. How to share physics simulations and synchronize them is another example of work published at SVR that derived from this one.
In 2011, Paiva et al. (2011) presented the results of a simulation and performance analysis of the CyberMed framework. The main goal of the experiment was to evaluate the real conditions of CyberMed when executed over a non-dedicated hybrid network, like the Internet, comparing its results with other similar works. Following the concern regarding the scale of distributed XR applications, this work extended the previous ones by analyzing collaboration where distance was no longer a limitation.
In 2012, Paiva et al. (2012) disclosed implementation issues of a peer-to-peer multicast network architecture in the collaborative module of the CyberMed VR framework. The multicast protocol was known to provide better scalability and decrease the use of bandwidth in Collaborative Virtual Environments, allowing better Quality of Experience (QoE). The results of a performance evaluation experiment were also discussed. This work was a direct evolution of the previous one.
In the same year, Medeiros et al. (2012) analyzed the applicability of the 3C collaboration model as a methodology to model and define collaborative tools in the development of a collaborative virtual reality application. The presented case study illustrated the selection and evaluation of different tools that aim to support the actions of communication, cooperation and coordination between users that interact in a virtual environment. The authors showed how important it is to adopt an appropriate methodology for modeling the basic features of virtual environments.
In 2015, de Farias Paiva et al. (2015) discussed the possibilities and advantages of using Collaborative Virtual Environments for the training and assessment of surgical teams, and presented the steps of planning and developing a simulator with these features, called SimCEC. This work focused specifically on its application area, namely health and medical training. The authors linked the network infrastructure needs with the training task and evaluated them together.
In 2016, Paiva et al. (2016) discussed the theoretical and practical aspects of SimCEC, a collaborative simulator for the education and assessment of student groups in basic surgical routines, as well as its advantages and the new possibilities it offers for the surgical education area. This work directly followed the previous one.
One interesting point to notice is that, despite the fact that the Unity engine was publicly and freely released in 2009, and Unreal Engine in 2015, none of the aforementioned works used high level 3D engines or game engines. Nowadays, works focusing on collaboration and distributed solutions make use of high level engines so that they can concentrate on the network problem. In fact, there are commercial solutions focused on massive collaboration and virtual environments, such as Virbela 3, that are fully based on the Unity engine.

3D interaction
Being the fifth most frequent topic across all 609 publications (7.39%), 3D interaction represents works that focus on how the user performs spatial interaction. It may include using a single body part, the whole body, or even mimicking actions to naturally interact with the application. According to the evolution of the number of papers in this area shown in Figure 11, there was an isolated increase in publication numbers in 2017 (9 papers published), but for most of the years the number of papers in this category was not higher than 4. The five most cited works regarding this topic are listed as follows.
In 2004, Santin et al. (2004) presented interactive actions developed for AR environments based on ARToolKit, making possible the intuitive manipulation of virtual objects, so that users could superimpose virtual objects over the real world and manipulate them using their hands. The early works presented at SVR focused on interaction using markers. There are plenty of them, for the most diverse applications. Even when different sensors such as Microsoft Kinect and Leap Motion were available, there was still space for works using marker-based AR.
Figure 11. Overall distribution of papers from the "3D Interaction" topic.
In 2011, Santos et al. (2011) provided an application that used gestures to interact with virtual objects. To achieve this goal, the authors used Kinect for accessing both RGB and depth information. The user was allowed to perform operations on menus and to manipulate virtual objects. Seven years after the previously mentioned work, this one started using an RGB-D sensor so that the user did not need to wear or hold any marker. Most 3D interaction applications published at SVR and using Kinect were focused on human pose estimation and postural or gesture analysis, as are the two works described next.
The next two publications are directly related. In the first, the authors focused on the final application (health). The second one focused on the technology behind the movement identification and its completion estimation.
In 2012, Da Gama et al. (2012) provided a solution for guiding and correcting performed movements based on their therapeutic definition. The correct description of therapeutic movements was implemented in a prototype, along with a scoring mechanism to measure the patient's performance and to encourage improvement by displaying positive feedback whenever the patient performs a correct movement.
In 2012, Chaves et al. (2012) presented a technique that recognized human movements using the data provided by an RGB-D camera. It also described a way to identify not only whether the performed motion was valid, but also which point of the motion had been reached. The proposed technique was validated through a set of tests focused on analyzing its robustness under a series of variations during the interaction, such as fast and complex gestures.
In 2016, Abreu et al. (2016) evaluated the usage of electromyogram (EMG) data provided by the Myo armband as features for the classification of 20 stationary letter gestures from the Brazilian Sign Language (LIBRAS) alphabet. The results obtained showed that it is possible to identify the gestures, but substantial limitations were found. Here, a different sensor (the Myo armband, rather than a camera-based device) was used to gather muscle information and recognize sign language gestures performed in the air. Nowadays, deep learning-based techniques are capable of performing much better by directly processing the myoelectric signals from the sensor attached to the person's forearm.

Virtual humans/avatars
This topic corresponds to 5.42% of all 609 publications and represents works that deal with virtual representations of users as well as virtual agents. Such virtual agents may incorporate knowledge so that their behaviour is as realistic as possible. According to the evolution of the number of papers in this area shown in Figure 12, there was a peak in publication numbers in 2004 and 2007, but works in this area are sparse in the event, especially in the last nine years. The five most cited works regarding this topic are listed as follows. From the most cited works in this category, it is possible to perceive the interest in autonomous and synthetic humans (avatars). Instead of creating a representation of a human that is then controlled by that person, the avatar is conceived to be independent and to interact with different people.
In 2000, Vidal et al. (2000) presented an intelligent librarian simuloid, called BIA, responsible for accessing an electronic dictionary in a distributed virtual reality environment for language learning. The proposed solution could be extended as a means for efficient access to external resources from virtual reality environments. This was one of the first SVR works regarding avatars. Despite the fact that the avatar was created for a specific end, it could be extended to other applications.
In 2002, Torres et al. (2002) proposed an architecture to implement autonomous synthetic actors in virtual worlds, represented by an articulated structure and capable of cognitive reasoning using BDI logic. The architecture used an interpreter for the AgentSpeak language and an articulated body animation model. A communication interface using sockets was also suggested. The architecture proposed by this work focused on how to exchange information between different agents. Instead of focusing on a single agent, the authors were concerned with how to share information between agents and make their behaviour more realistic.
In 2004, Siscoutto and Tori (2004) presented AVTC, an architecture for teleconferencing based on augmented virtuality, integrating real stereoscopic images into three-dimensional virtual worlds modeled by computer graphics. It was validated through the implementation of a prototype and through tests, both static and dynamic, that analyzed the viability and the quality of the application. This work is directly related to a current hot topic. In these pandemic times, teleconferencing has become one of the main platforms for working from home while still communicating with others. The current interest is in how to produce virtual representations of real people using single images or short videos, and how to make these representations as realistic as possible.
In the same year, Longhi et al. (2004) presented studies about Embodied Conversational Agents (ECA) and a development proposal. The agent called Maga Vitta was developed to inhabit CIVITAS, an environment conceived to be used by children to allow the interactive construction of virtual cities. Seventeen years later, we can say that ECA are present in our daily lives. From TV commercials to support on e-commerce websites, this type of conversational agent was used to automate most part of the support nerwork used for client-store communication.
In 2007, Rodrigues et al. (2007) present a model for producing eye movements in synthetic agents, in order to improve the realism of facial animation of such agents during conversational interactions. The experimental results indicate that increasing details in eyes animation can significantly improve the realism of facial animation. While producing realist eye movements in synthetic agents was a problem in 2007, nowadays a related problem could be how to correct the eye gaze on teleconferencing applications. In this new world teleconferencing is a game changer in work relations, it still does not seems natural to most of its users, mainly due to the lack of direct eye contact. In a real life conversation, people are used to look into each others eyes. In a teleconferencing application, this does not happen because the user is looking at the screen instead of the camera. The current best solutions for this problem use deep neural networks for correcting the eye gaze, but the computational requirements for running them are so high (usually high end GPUs) that they are not mainstream yet.

Tracking/sensing
This topic corresponds to 4.76% of all 609 publications and represents works describing algorithms capable of detecting, tracking and sensing users, objects or the surrounding environment. According to the evolution of the number of papers in this area shown in Figure 13, papers in this category only started being published in 2006. Even after that, it is not common to find more than 2 papers from this category per year. The six most cited works regarding this topic are listed as follows.
Figure 13. Overall distribution of papers from the "Tracking/sensing" topic.
In 2009, Lima et al. (2009) implemented and evaluated model-based markerless 3D tracking techniques aiming at the development of augmented reality applications. Recursive and non-recursive methods based on edge and texture information were contemplated, enabling object tracking in a wide range of scenarios. Tracking performance and accuracy results obtained using different configurations of the implemented techniques were also compared and evaluated. This is one of the first SVR works focusing on markerless approaches for AR. Before this work, most SVR attention was solely directed to the use of 2D fiducial markers and how they could be applied to solve different problems.
In the same year, Farias et al. (2009) described an enhanced implementation of the Kanade-Lucas-Tomasi (KLT) feature tracker on the GPU. The implementation provided a speedup of 90x when compared to the CPU reference implementation. The authors were able to work with 1,024 x 1,024 video streams at 50 fps while extracting 1,000 features. This was the first CUDA-related work to be published at SVR. Following the markerless tendency stated by the previous work, a GPU-based KLT implementation could be used as a basis for markerless tracking algorithms based on the structure of the object being tracked or of the complete environment, such as SfM (Structure from Motion) techniques.
In 2010, Alves Fernandes and Fernández Sánchez (2010) proposed a hand interaction approach to AR tabletop applications. The user's hands were detected using Haar-like feature classifiers and their positions were correlated with the fiducial markers used. The solution allowed users to move, rotate and resize virtual objects with their bare hands. Despite using fiducial markers to place the objects, the 3D interaction was based solely on the hand pixels. Nowadays, 11 years later, we have far more precise solutions for bare hand tracking, such as Google's MediaPipe 4 .
In 2011, Bernadelli et al. (2011) proposed a solution for performing object tracking on live video through boundary detection using the Hough transform and color detection using color map transforms. An experiment was performed in which the tracking of a human iris was tested, and the variations of texture and contour measurements recorded during tracking were presented and discussed. Ten years later, researchers are focusing their attention on object trackers based on neural networks. Despite the high computational requirements of this approach, its generalization capability justifies the research effort.
In 2012, Sanches et al. (2012) applied a subjective video quality assessment method to obtain AR users' opinions about videos with different rates of misclassified pixels. The results showed that segmentation errors are perceived by users of AR applications, but the video quality was not related to the number of misclassified pixels. Also, the authors noticed that when the errors concentrated in the element of interest increase, the score of the associated video decreases. This work is an outlier in comparison to the other works from this category. Instead of proposing a sensing/tracking technique, it focuses on using the tracking of the person's attention to assess how segmentation is performed in AR applications.
4 https://mediapipe.dev/
In 2016, Araujo et al. (2016) introduced STAM, a Simple Tracking and Mapping system that was developed for desktop and evaluated in a challenging scenario. Additionally, STAM was ported to a mobile version, using the Android platform and Google's Tango tablet device. The desktop version presented better tracking performance in simple scenarios with respect to reprojection error, but it presented a few drawbacks when dealing with the most complex ones. The mobile version proved to be slower than its desktop counterpart. Although the Tango device was discontinued, this work could serve as a predecessor of ARCore, which was made publicly available only in 2018, while ARKit was made publicly available in 2017.
The point in common among the most cited works in this category is that all of them focus on markerless tracking. They provide no information about additional markers inserted into the scene, and this is a trend perceived today: the use of 2D fiducial markers is becoming rare, due to many factors such as sensor quality, 3D mapping of the environment, reliable algorithms for determining indoor location, etc.

Devices for XR
This topic corresponds to 3.28% of all 609 publications and represents works that developed any input/output device for XR purposes. According to the evolution of the number of papers in this area shown in Figure 14, a peak in publication numbers happened between 2006 and 2009, with a mean of 2.5 papers per year. Apart from that period, it is rare to find publications on XR-related devices in SVR (usually one per year). The five most cited works regarding this topic are listed as follows. A brief analysis of the most cited papers in this category reveals that the majority focus on input devices for XR. A single one deals with visualization, as described in the following paragraphs.
In 2006, Schwaiger et al. (2006) designed a locomotion device which enabled the user to walk naturally in any desired direction. The user stayed in place while walking due to the recentering control, which compensated all movements originating from the walker's movements. Unlike existing 2D foot followers, the device could realize free movement of the walker in any direction through new rotational degrees of freedom. It is interesting to see how such devices have evolved over the years. Improvements in both capabilities and size led to solutions such as the Virtuix Omni 5 , currently one of the most popular VR motion platforms on the market.
In 2008, de Miranda et al. (2008) designed and implemented AR X-Ray, an augmented reality tool that allowed visual exploration of internal details of buildings through the use of a portable projector that worked as a "magic lantern", projecting a Virtual X-Ray over real walls. The problem discussed was that the projector focus had to be manually adjusted within the range of distances from the wall in which the system was tested. This work inspired many others in the field of projective augmented reality. XR visualization solutions focused on multiple viewers are usually not as common as the ones focused on single viewers, due to their cost. In this case, with a single projector it was possible to see information registered according to the part of the wall that was being illuminated. The work provided a very interesting cost-benefit relationship since it had the best of both worlds.
In the same year, Pamplona et al. (2008) described the design of an image-based data glove prototype suitable for virtual object manipulation and interaction approaches. The proposed device used a camera to track visual markers at the fingertips, and a software module to compute the position of each fingertip and its joints in real time. The prototype was built and validated with 15 volunteers. At the time this paper was published, there were few data glove alternatives available on the market, all of them extremely expensive. The idea of using markers to detect how the hand joints were placed enabled the production of an extremely low-cost alternative to such devices.
In 2011, Gallotti et al. (2011) developed a device that mapped a touch interface in a virtual reality immersive environment. A wireless glove (v-Glove) was created, which had two main functionalities: tracking the position of the user's index finger and vibrating the fingertip when it reached an area mapped in the interaction space, to simulate a touch feeling. Quantitative and qualitative analyses were performed with users to evaluate the v-Glove, comparing it with a gyroscopic 3D mouse. Like the previous work, this one also focused on the development of a data glove (in this case, a haptic one). It is interesting to see that some of the hardware used to develop the v-Glove prototype, namely the Arduino microcontroller, is still widely used nowadays.
In 2014, Oliveira et al. (2014) presented the design and evaluation of a tactile vocabulary to aid navigation in an underground mine. The authors selected tactons based on the users' ability to perceive and process them during navigation in virtual environments, in order to design a more usable tactile interface. A user experiment in a virtual simulation of an emergency situation in an underground mine showed that the tactile feedback facilitated the execution of the task and increased its usability as well as the memorization of its signals. Haptic feedback helps increase immersion in XR applications, especially VR ones. Compared to the previous work, in which the haptic feedback was provided to the user with a single vibration motor, in this case the feedback is provided using 8 electromechanical tactors. Tactile solutions of this type that aid navigation have evolved into compact and accurate ones such as Ashirase 6 , which is a navigation system consisting of a smartphone app and a 3D vibration device including a motion sensor, attached inside the shoe.
5 https://www.virtuix.com/

Immersive gaming/serious games
This topic corresponds to 2.96% of all 609 publications and represents immersive or serious games that focus on health-related or training applications. According to the evolution of the number of papers in this area shown in Figure 15, the interest in this topic was strong in 2011 and has recently regained a small share (a mean of 1.6 papers per year in the last five years). The seven most cited works regarding this topic are listed as follows. Analyzing the most cited works in this category, it was possible to notice that the five most cited ones were published in the 2011 and 2012 SVR editions. We may infer from these five works that there is a close relation between serious games and health applications, although it is possible to find other application areas as well.
In 2011, de Abreu et al. (2011) described the development of a game that combines the technologies of Virtual Reality and Multi-Agents. The proposed system aimed at improving the cognitive functions of patients with neuropsychiatric disorders. To this day it is possible to find works focused on the use of XR applied to neurological diseases. As different sensors such as portable EEGs became available, new inputs can be used to infer valuable neurological information.
In the same year, Torres et al. (2011) presented a serious game as a way to enhance the user experience of medical training tools that use VR. The serious game contained entertaining aspects that were designed to stimulate the student to virtually perform a breast biopsy examination. In contrast to the previous work, this one was focused on the training of the doctors (students) themselves. Stimuli provided by XR technologies can be beneficial to other health-related areas, such as physiotherapy. In this case, they may stimulate patients to do their rehabilitation exercises in a gamified way.
Also in 2011, Brasil et al. (2011) presented a game which simulates a drilling rig and exposes the players to various work events. Artificial Intelligence techniques help users learn what to do in troublesome scenarios. The game is web-based, allowing it to be remotely executed on different platforms. This work is focused on training in the field of the gas industry. Unlike the majority of the top works, which are focused on health-related problems, this one was developed to help Petrobras workers act better in routine situations. As the SVR editions went by, we may find different virtual environment solutions focusing on the gas industry, mainly developed by Tecgraf with Petrobras support.
In 2012, de Carvalho Souza and dos  described the development of a serious game designed to support the recovery of motor function in patients who have suffered a recent stroke. The authors also discussed how to use the game as part of the patient's treatment and presented the results collected when real patients played the game as part of their recovery treatment. As stated before, this work is a representative example of serious games being used in the physiotherapy field.
In the same year, Brückheimer et al. (2012) presented a game-like virtual environment in which controllable situations were generated and users' limitations were considered in order to foster movements in a relaxed set of activities. The authors showed that close collaboration between physiotherapists and computer scientists is mandatory in order to achieve a useful application. Nowadays, it is possible to find mobile apps capable of working as exercise trainers. Such applications are capable of identifying the human body joints and detecting whether the movement is being performed correctly, as in Zenia 7 .
In 2015, Silva et al. (2015) described how they combined computer vision technology and tangible user interfaces to conceive an innovative product targeting children. The developed electronic game, Voxar Puzzle, was defined using a Blue Ocean strategy for setting its functionalities and features, comparing it with relevant competitors. The final prototype was validated with children from a local public school. This is one of the first works in the history of SVR to conceive a product based on XR technologies with a focus on innovation methodologies.
In 2017, Bastos et al. (2017) investigated specific features that confer the quality of immersion to an electronic game based on the analysis of titles the audience considered as immersive. A prototype of an immersive game based on these features was developed and its comparison with selected titles suggested that six features were able to provide an immersive experience in games. This type of work is important as it serves as a guideline for the development of new immersive solutions, especially regarding electronic games.

Haptics/audio/other non-visual interfaces
This topic corresponds to 1.97% of all 609 publications and comprises works that deal with unconventional interfaces, such as haptic, audio-based, voice-based and non-visual ones.
According to the evolution of the number of papers in this area shown in Figure 16, the number of papers in this topic is small, probably due to the difficulty of acquiring haptic devices in Brazil (availability and cost). Alternatively, authors usually produce their own devices. The five most cited works regarding this topic are listed as follows.
7 https://zenia.app/
In 2003, Pizzolato and Rezende (2003) discussed speech recognition technology and the technical and design issues related to adopting it in virtual worlds. Some speech recognition experiments with a NUANCE recogniser were detailed and future works were pointed out. Speech recognition is an important way of interacting with XR content. For instance, Microsoft HoloLens enables voice commands to aid interaction with the virtual content being shown and the applications being executed on the platform.
Also in 2003, Meiguins et al. (2003) presented a prototype that allowed interaction between users and a three-dimensional virtual world using voice commands, in order to improve the interaction between the user and the three-dimensional environment. The case study consisted of a virtual environment for the creation and manipulation of electric circuits, and the experience was considered positive by users. Similar to the previous work, this one also dealt with voice commands to allow interaction with virtual content. In this case, the chosen application was engineering-related, but nowadays it is possible to interact by voice with a wide variety of systems. For instance, Alexa (Amazon's personal assistant) recognizes the user's voice and is capable of interacting even with other electronic devices in the house. The highlight of this application is that voice commands escape the digital world and are able to connect to real devices. This is also possible using the Google Home Assistant.
In 2013, Berretta et al. (2013) proposed a method to help blind people develop the skills needed to walk in unfamiliar environments, based on the construction of spatial cognitive maps. Their results demonstrated that the proposed method was promising. This work resembles the objective of a previously mentioned commercial solution, capable of aiding the navigation of visually impaired people by placing vibration motors in the user's shoes.
In the same year, Correa et al. (2013) presented an analysis of problems related to training in anesthesia, and the possibility of solving them using VR. The authors provided details regarding the entire process, from a requirements survey with the participation of experts to the development of a prototype for preliminary testing. Haptic feedback can be valuable especially in tasks that require force control by the user. Anesthesia is a good example, in which the doctor must control his/her force to make the infusion reach a specific tissue level.
Also in 2013, Bogoni and Pinho (2013) proposed a method for haptic rendering with voxel-based rigid bodies that may be composed of various materials with different densities. An experiment showed that the method provided stability and allowed the haptic device to properly simulate the removal of voxels as well as different material densities. This work is interesting for providing dynamic interaction with the 3D objects, allowing them to be modified based on the interaction of the "tool" (a drill, for instance) with the object. This work can be considered complementary to the previous one.

Teleoperation/telepresence
This topic corresponds to 1.31% of all 609 publications and comprises works that deal with teleoperation or telepresence applications. According to the evolution of the number of papers in this area shown in Figure 17, the number of papers in this topic is small, and was concentrated between 2008 and 2014. The four most cited works regarding this topic are listed as follows.
Figure 17. Overall distribution of papers from the "Teleoperation/telepresence" topic.
From the 4 most cited works in this category, the first 3 are related to telepresence, while the last one regards teleoperation.
In 2009, Tokunaga et al. (2009) presented a video avatar system suitable for commodity hardware-based educational systems. By requiring only a single image and a dense depth map for it, the system makes it possible to control the viewpoint using head tracking and see the avatar from distinct angles. The system worked at interactive frame rates, requiring low-cost equipment and little setup. The result of the work presented is similar to the camera interpolation performed in sports events, which are captured using many cameras located at different positions in the stadium. This is interesting because it adds interactivity to the teleconference and comes closer to what can be done in a real-life conversation, in which participants may move in order to change their point of view in relation to the other person.
In 2011, Corrêa et al. (2011) aimed to increase the sense of immersion and presence of the participants in a teleconference by allowing the presenter to immerse himself in a 3D virtual environment. The key challenges and solutions adopted were also detailed. This work extends the previous one by using augmented virtuality, which means that real content is inserted into the virtual world.
In 2012, de Souza Almeida et al. (2012) proposed an AR approach to enhance social presence in video-mediated systems by allowing one user to be present in the other user's video image. A preliminary pilot study with 10 participants indicated that the system had a higher degree of social presence compared to traditional video-chat systems. We may compare the results shown in this paper to what happens nowadays with video filters for teleconferencing. In the latter, the user is segmented and background content can be added to the user's video.
In 2014, Teixeira et al. (2014) proposed using Google Glass for visualization and control of an Unmanned Aerial Vehicle, focusing on the structural inspection of buildings. The authors discussed both problems and limitations regarding the existing technology and how to overcome them. The result presented in the paper resembles the commercial product from Flyability 8 , which allows remote inspections to be performed using drones. The main difference is that no head-mounted display is used for visualization, nor are gestures used to control the inspection drone. Instead, a specific controller/base station is used to control the drone and visualize information from it.

Recent trends and future directions
Gartner, a well-known market analysis company, created the so-called Gartner Hype Cycles (Gartner, 2020), a graphic representation that helps in understanding the maturity and adoption of technologies and applications, and how they are potentially relevant to solving real business problems and exploiting new opportunities. Figure 18 represents the Gartner Hype Cycle of Emerging Technologies in 2017. It shows VR in the "Slope of Enlightenment" phase, while AR is in the "Trough of Disillusionment". What is important nowadays is that in the Gartner Hype Cycle of Emerging Technologies for 2019, there is no mention of VR, AR or MR. This means that these technologies are considered to be mature, and many companies believe in their potential for solving their problems. While this is true, some hot topics deserve attention.
The first of them was mentioned earlier and perceived in Figure 4. The increased interest in user studies for validating solutions is demonstrated in later SVR publications. But to make solutions accessible on a global scale, one has to pay careful attention to how the solutions are tested. According to John Quarles, from the University of Texas at San Antonio, latency, presence, avatars, and cybersickness in VR affect people differently. Different populations in research often yield very different results. Many populations have not been studied in VR. Psychology and HCI researchers have still not solved the diversity problem in their respective fields (John Quarles, 2020).
The second topic regards real-time understanding of video content captured by cameras and how companies may benefit from this intelligent analytics resource. According to Susan Persky, from the National Institutes of Health, the automatic capture and analysis of physical movement data available through the use of VR technologies may provide researchers, clinicians, and others with unprecedented ability to assess and even predict outcomes ranging from psychological states to disease progression (Susan Persky, 2020). The possibilities are endless. For instance, in the health field, this would allow us to apply routine VR use to better understand human psychology and behavior and to screen for disease and health threats even before symptoms manifest in other ways.
The third topic relates to XR adoption for immersive learning and collaborative virtual environments. According to Figure 10, SVR only had a few publications regarding multiuser and distributed XR after 2012. In multiple industries and common human forums (commerce, manufacturing, education and training, healthcare, entertainment, communications, and creative enterprises), the world and its capabilities are changing because of the new types of experience enabled by these technologies and the connections they are able to foster. In an era of social distancing resulting from the coronavirus pandemic, the need for collaboration in virtual environments is unprecedented. We are already seeing and participating in virtual conferences, and the conveniences they bring are numerous. From bringing remote teams closer together to providing distance learning, different solutions struggle to stand out in a fast-growing market.
It is not by chance that Artificial Intelligence (AI) is now one of the official topics of interest of SVR. AI is everywhere. According to Xiaoou Tang, from the Chinese University of Hong Kong, today's AI technology makes it possible for us to accurately comprehend the real world (Xiaoou Tang, 2020). With the combination of AI and AR, brand-new enriched experiences can be created. Hardware is evolving as fast as ever to reach that goal. For instance, the recently released OpenCV AI Kit with Depth 9 is defined as a "spatial AI camera with multi-stage inference". This means that machine learning algorithms based on OpenVINO run inside the camera and output the result of the recognition performed, freeing the host hardware from most of the necessary processing.
Iván Markman (2021), Chief Business Officer of Verizon Media, was asked about the future of XR in a post-pandemic world. According to him, the pandemic influenced interest in XR because hybrid reality experiences that blend in-person and digital elements have seen an adoption surge. This was justified by the fact that industries like e-commerce, retail, healthcare and entertainment have all relied heavily on XR to close the connection gap with consumers during lockdowns. Regarding the future, he believes that XR will continue to rise with new products, innovation and use cases. The pandemic helped consumers notice the value of XR experiences. It is up to researchers and companies to keep up this fast pace and expand the opportunities that are currently available.

Authors/Institutions profile
Analyzing how the area was explored throughout the years by SVR publications is as important as taking a look at the people and institutions responsible for them. For all 609 full papers published in SVR we have gathered author and institution information, also taking into account the research groups associated with the authors. This section intends to provide some metrics regarding the publications from the institutions' point of view, followed by the authors and their research groups. This is valuable information for new researchers who are looking for reference names in the area for possible academic and research collaborations.

Most active institutions
The institution information was gathered from the header section of the papers, in which the title of the paper, together with author names and affiliations, is made available. A compilation of such information was performed and Table 3 shows the results for the 15 most active institutions in the events throughout the years.
It is interesting to notice that some of these most active institutions began to participate in the events at different moments. For example, both USP and UFPE, the two most active institutions, were not present in the first edition of the events. PUC-Rio, despite being the fourth most active institution in the history of SVR, began participating in the events from 2002 on. LNCC, the sixth most active institution, started publishing papers in SVR only in 2006.
Another interesting piece of information is the number of different institutions that contributed published papers to each event edition. Figure 19 illustrates the number of institutions that published papers per event. We believe that this number is far above the mean (39.04 institutions per year) in SVR 2020 because it was the first time the event happened online, due to the COVID-19 pandemic. This fact may have contributed to the increase in participation of different institutions, since it was not necessary to pay for extra expenses besides the author's registration.

Most active authors
The compilation of the most active authors was more complex than the one related to the institutions. This was due to two main reasons: there was a total of 1,264 different authors related to the 609 full papers published in these 22 years of events; and there was no pattern for the authors' names, which means that papers from the same author could show the author's name written in different ways (abbreviations, for instance). Therefore, after listing all the authors from the papers, we carefully verified and specified a single name pattern (the most complete representation) for each author. After that, we were able to obtain a unique count of SVR authors.
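As an illustration of the name unification step, the following minimal sketch groups name variants and keeps the most complete (longest) representation of each group. It is only a hypothetical first-pass heuristic and not the procedure used in this work, which relied on careful manual verification. The grouping key (first initial plus last surname) and the function names are assumptions made for the example, and such a key may wrongly merge distinct authors who share an initial and a surname, so manual checking remains necessary.

```python
import unicodedata
from collections import defaultdict


def _grouping_key(name):
    """Hypothetical key: (first initial, last surname), accent-insensitive."""
    # Strip accents so that "Verônica" and "Veronica" fall in the same group.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    )
    parts = ascii_name.replace(".", " ").lower().split()
    return (parts[0][0], parts[-1])


def canonical_names(names):
    """Map each name variant to the longest variant found in its group."""
    groups = defaultdict(set)
    for name in names:
        if name.strip():  # ignore blank entries
            groups[_grouping_key(name)].add(name)
    mapping = {}
    for variants in groups.values():
        # The "most complete representation" is approximated by the longest string.
        canonical = max(variants, key=lambda v: (len(v), v))
        for variant in variants:
            mapping[variant] = canonical
    return mapping


variants = ["V. Teichrieb", "Veronica Teichrieb", "C. Kirner", "Claudio Kirner"]
mapping = canonical_names(variants)
unique_authors = set(mapping.values())
```

With the four variants above, the mapping collapses the list into two unique authors, which is the kind of unique count described in the text.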
The 15 most active authors are shown in Figure 20 and Figure 21. In this cumulative graph, it is possible to perceive how the number of publications of each author evolved as the years passed. Based on the image, we may find that the last time Claudio Kirner published a paper in SVR was in 2010. The total number of publications of each of the 15 most active authors, together with the mean number of published papers per event, is shown in Table 4. Column C represents the "productivity rate", calculated by dividing the total number of published papers by the number of years in which the author published. For example, Veronica Teichrieb, who has participated since the second event (WRV 1999), has a productivity rate of 2.95, which means that she publishes about 3 papers every time she participates in the event.
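The productivity rate described above can be sketched as follows. This is a minimal illustration that assumes the catalog can be exported as (author, year) pairs, one pair per authored paper. The function name, the input format and the sample records are hypothetical and do not reproduce the actual data behind Table 4.

```python
from collections import defaultdict


def productivity_rate(records):
    """Total papers divided by the number of distinct years with at least one paper.

    records: iterable of (author, year) pairs, one pair per authored paper.
    """
    totals = defaultdict(int)
    active_years = defaultdict(set)
    for author, year in records:
        totals[author] += 1
        active_years[author].add(year)
    return {author: totals[author] / len(active_years[author]) for author in totals}


# Hypothetical records: "Author A" has 3 papers spread over 2 distinct years.
sample = [
    ("Author A", 2001),
    ("Author A", 2001),
    ("Author A", 2003),
    ("Author B", 2010),
]
rates = productivity_rate(sample)
```

In this sample, "Author A" obtains a rate of 3 / 2 = 1.5 and "Author B" a rate of 1.0. Years without any publication do not enter the denominator, which is why the rate reflects output per participation rather than per calendar year.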

Most active research groups
Since 22 years of analysis comprise a long time span, we opted for a different strategy to obtain the most active research groups. Instead of listing the research groups throughout the years, we decided to list the ones that are related to the most active authors, using as reference the research groups associated with the authors in the CNPq platform. This choice was made because, as time went by, we perceived that some authors changed their institution and, together with that, their research group. Also, it is possible that a research group of a researcher with older publications does not exist anymore. Given that, the most active research groups are listed as follows:
• Recently, in the visualization area, a tool for the analysis of crime hotspots has been in use by the Public Safety Secretariat of Ceará, and is being deployed in many Public Safety Secretariats in other states. This tool was also adapted to analyze the propagation of COVID-19 in Ceará in 2020.
• Visualization, Interaction and Simulation Lab - Inf UFRGS: The Visualization, Interaction and Simulation Laboratory (VISLab) is part of the Computer Graphics, Image Processing and Interaction research group, which started its activities in 1978 developing projects mainly on rendering and animation. Over the years, as new researchers joined the group, new research fields such as image acquisition and analysis, virtual reality, non-conventional interaction, and visualization of complex data started to be investigated. Within this group, VISLab is mainly concerned with research on human-computer interaction, with emphasis on non-conventional 3D interaction and haptics, and immersive visualization in the context of virtual and augmented reality applications. VISLab's research focus is to enhance humans with computers, extending their perception capabilities and improving their power of action in a natural way.
• VOXAR labs: Voxar Labs is a research group focused on augmenting experiences through research, innovation and collaboration with academia and industry. It develops cutting-edge multi-disciplinary research in the broad area of Spatial Computing, tackling the inner areas of Extended Reality, Computer Vision and Natural Interaction. The laboratory aims to create impact through R&D&I, technology transfer, scientific publications, patents and human resources training. It is one of the most productive Augmented Reality research groups in Latin America, having also been recognized with seven best paper awards and ten first-place competition prizes over the first ten years of its existence. Voxar Labs is part of the Informatics Center of the Federal University of Pernambuco, located in Recife, Pernambuco, Brazil.

Conclusion
This paper has reviewed the development in XR research presented over the first 22 SVR editions (previously WRV). This research classified the 609 analyzed papers according to the main conference topics and inferred their relevance from their citation numbers. Currently, bringing XR research from laboratories to industry and widespread use is still challenging, but both academia and industry (as seen in the Gartner Hype Cycle (Gartner, 2020)) believe that there is huge potential for XR technology in a wide range of areas. Fortunately, more researchers are paying attention to these areas, and it is becoming easier than ever before to be involved in XR research. A suggestion would be to learn more about previous SVR publications, available online in the proposed open catalog. As future work, we suggest that this research be extended by considering short papers and other satellite publications that are also present in the events. This would allow a more complete understanding, including the impact that initial research has in the area.