Human-robot collaboration can be valuable in many challenging tasks. Previous research has considered only human-centered systems, but symmetrical reality (SR) systems introduce many changes because they contain two perceptual centers. In this paper, we introduce SR-based human-robot collaboration and interpret human-robot collaboration from the perspective of equivalent interaction. By analyzing task definition in symmetrical reality, we present the special features of human-robot collaboration. Furthermore, among the many fields in which symmetrical reality can produce a remarkable effect, we list some typical applications, such as service robots, remote training, interactive exhibitions, digital assistants, companion robots, and immersive entertainment communities. We also analyze the current state and future development of this framework to provide guidance for researchers.
Physical common sense is the intuitive knowledge that can be obtained from the physical world. However, this common sense breaks down in symmetrical reality because of the integration of the physical world and the virtual world. In this paper, we introduce the specific physics of symmetrical reality from two perspectives: existence and interaction. We emphasize the bi-directional mechanical control within the symmetrical reality framework and explain why the free will of machines can break common sense. We then present experiments on discovering new physical common sense in symmetrical reality systems. Experiment I concerns learning physical common sense from symmetrical reality and shows what can be learned, and how, in a symmetrical reality environment. Experiment II concerns changing the physical common sense of symmetrical reality and shows why physical common sense deserves attention. Finally, we draw an initial conclusion about physical common sense in symmetrical reality and give some suggestions for understanding symmetrical reality-based physical common sense.
We study the hierarchical knowledge transfer problem using a cloth-folding task, wherein the agent is first given a set of human demonstrations in the virtual world using an Oculus Headset, and the learned knowledge is later transferred to and validated on a physical Baxter robot. We argue that such an intricate robot task transfer across different embodiments is only realizable if an abstract and hierarchical knowledge representation is formed to facilitate the process, in contrast to prior sim2real literature in the reinforcement learning setting. Specifically, the knowledge in both the virtual and physical worlds is measured by information entropy built on top of a graph-based representation, so that the problem of task transfer becomes the minimization of the relative entropy between the two worlds. An And-Or-Graph (AOG) is introduced to represent the knowledge, induced from human demonstrations performed across six virtual scenarios inside Virtual Reality (VR). The success of the physical Baxter robot platform across all six tasks demonstrates the efficacy of the graph-based hierarchical knowledge representation.
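The entropy-minimization view of transfer described above can be illustrated with a minimal sketch: knowledge in each world is summarized as a categorical distribution over the branch choices of an AOG node, and transfer succeeds when the relative entropy between the two distributions vanishes. The function name, branch probabilities, and values below are illustrative assumptions, not taken from the paper.

```python
import math

def relative_entropy(p, q, eps=1e-12):
    """KL divergence D(p || q) between two categorical distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Hypothetical branch probabilities of one And-Or-Graph Or-node,
# estimated from virtual-world demos (p) and physical-world rollouts (q).
p_virtual = [0.6, 0.3, 0.1]
q_physical = [0.2, 0.5, 0.3]

# Task transfer = driving q toward p; the KL gap is positive before
# transfer and reaches zero when the two worlds agree.
print(relative_entropy(p_virtual, q_physical))  # positive gap before transfer
print(relative_entropy(p_virtual, p_virtual))   # zero after perfect transfer
```

In this framing, minimizing the relative entropy over all AOG nodes aligns the robot's task knowledge with the demonstrated knowledge.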
To investigate the visual discomfort caused by long-term immersion in virtual environments (VEs), we conducted a comparative study that evaluated users' visual discomfort over an eight-hour working rhythm and compared the differences between VEs and physical environments. Twenty-seven participants performed four different visual tasks with a head-mounted display (HMD) for the VE condition and with a monitor for the physical condition. Their subjective visual discomfort and objective oculomotor indicators were measured to evaluate their visual performance. The results show that subjective visual fatigue symptoms, objective pupil size, and relative accommodation response vary over time for the two conditions, with the VE condition affecting visual fatigue the most. The results also show that pupil size is negatively related to subjective visual fatigue, and that long-term display-based work influences only the maximum accommodation response of participants. This work supplements the necessary but under-researched field of visual fatigue during long-term immersion in VEs and should be valuable to researchers evaluating visual fatigue with HMDs.
We study the knowledge transfer problem by training the task of folding clothes in the virtual world using an Oculus Headset and validating it with a physical Baxter robot. We argue that such a complex transfer is realizable if an abstract graph-based knowledge representation is adopted to facilitate the process. An And-Or-Graph (AOG) grammar model is introduced to represent the knowledge, which can be learned from human demonstrations performed in Virtual Reality (VR), followed by a case analysis of folding clothes represented and learned by the AOG grammar model.
Human-computer interaction (HCI) plays an important role in near-field mixed reality, where hand-based interaction is one of the most widely used interaction modes, especially in applications based on optical see-through head-mounted displays (OST-HMDs). In this paper, two interaction modes, gesture-based interaction (GBI) and physics-based interaction (PBI), are developed within a mixed reality system to evaluate the advantages and disadvantages of each. The ultimate goal is to find an efficient hybrid paradigm for OST-HMD-based mixed reality applications that handles situations a single interaction mode cannot. The results of an experiment comparing GBI and PBI show that PBI leads to better user performance regarding work efficiency in the two proposed tasks. Statistical tests, including t-tests and one-way ANOVA, confirm that the difference in efficiency between the interaction modes is significant. Experiments combining both interaction modes were conducted to seek a good manipulation experience, and they show that a partially overlapping style helps improve work efficiency in manipulation tasks. The experimental results for the two hand-based interaction modes and their hybrid forms provide practical suggestions for the development of OST-HMD-based mixed reality systems.
In a mixed reality (MR) environment that combines physical objects with virtual environments, users' perception is immersed in the virtual world while their bodies remain in the physical world. Compared to purely physical environments, this characteristic creates special needs for users' long-term immersion. However, the deficiency needs that arise during long-term immersion remain under-researched. In this paper, we apply the theory of Maslow's Hierarchy of Needs (MHN) to guide the design of MR systems for long-term immersion. Taking the normal biological rhythm of human beings (24 hours) as the basic unit, we derive the fundamental needs for long-term immersion in VEs by combining the theory of MHN with the special needs of virtual reality (VR). To verify whether those needs can sustain users' long-term immersion, we design an MR office system for basic operations based on the theory of MHN. A long-term exposure experiment (eight hours) is designed to evaluate those needs by comparing the results with a physical work environment after a short-term preliminary study. The physiological and psychological effects are tested in both environments, and the deficiency needs for short-term and long-term immersion are compared. The results show that a design based on the theory of MHN can support users' long-term immersion, which means it can serve as a guideline for long-term use of MR systems.
We present HiFinger, an eyes-free, one-handed wearable text entry technique for immersive virtual environments based on thumb-to-finger touch. The technique enables users to input text quickly, accurately, and comfortably through the sense of touch and a two-step input mode, making it especially suitable for mobile scenarios where users need to move (e.g., walk) in virtual environments. Various input signals can be triggered by moving the thumb toward ultra-thin pressure sensors placed on the other fingers. After acquiring the comfortable range of touch between the thumb and the other fingers, we designed and tested six placement modes for text entry, resulting in an optimal placement that uses six pressure sensors for text entry and two for control functions. A three-day study evaluated the proposed technique, and the results show that novices can achieve an average text entry rate of 9.82 words per minute (WPM) in HMD-based virtual environments after a training period of 25 minutes.
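A two-step input mode of this kind can be sketched as a simple decoding scheme: the first press selects a character group and the second press selects a character within that group, so six sensors yield 6 x 6 = 36 codes, enough for the alphabet plus punctuation. The group layout below is a hypothetical illustration, not HiFinger's actual mapping.

```python
# Hypothetical two-step thumb-to-finger layout: first press picks a group,
# second press picks a character within the group (6 sensors -> 36 codes).
GROUPS = [
    "abcdef", "ghijkl", "mnopqr", "stuvwx", "yz.,?!", " 0123\n",
]

def decode(presses):
    """Decode a sequence of sensor indices (0-5), consumed two at a time."""
    text = []
    for first, second in zip(presses[::2], presses[1::2]):
        text.append(GROUPS[first][second])
    return "".join(text)

# "hi": both letters live in group 1 ("ghijkl"), at positions 1 and 2.
print(decode([1, 1, 1, 2]))  # -> "hi"
```

The two-step scheme trades one extra press per character for a small, eyes-free sensor set that fits on the fingers.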
We propose VRGym, a virtual reality testbed for realistic human-robot interaction. Unlike existing toolkits and virtual reality environments, VRGym emphasizes building and training both physical and interactive agents for robotics, machine learning, and cognitive science. VRGym leverages mechanisms that generate diverse 3D scenes with high realism through physics-based simulation. We demonstrate that VRGym is able to (i) collect human interactions and fine manipulations, (ii) accommodate various robots through a ROS bridge, (iii) support experiments for human-robot interaction, and (iv) provide toolkits for training state-of-the-art machine learning algorithms. We hope VRGym can help advance general-purpose robotics and machine learning agents, as well as assist human studies in the field of cognitive science.
In this paper, we review the background of physical reality, virtual reality, and some traditional mixed forms of the two. Based on current knowledge, we propose a new unified concept called symmetrical reality to describe the physical and virtual worlds from a unified perspective. Under the framework of symmetrical reality, traditional virtual reality, augmented reality, inverse virtual reality, and inverse augmented reality can be interpreted within a unified presentation. We analyze the characteristics of symmetrical reality from two different observation locations (i.e., from the physical world and from the virtual world), under which all other forms of physical and virtual reality can be treated as special cases of symmetrical reality.
Long-term exposure to VR will become increasingly important, but what users fundamentally need for long-term immersion is still under-researched. In this paper, we apply the theory of Maslow's Hierarchy of Needs to guide the design of VR for long-term immersion based on the normal biological rhythm of human beings (24 hours). An office environment is designed to verify those needs. The efficiency and the physical and psychological effects of this VR office system are tested. The results show that the VR office environment is as comfortable as the physical environment for short-term immersion and can support users' basic immersion. This means that Maslow's Hierarchy of Needs can serve as a guideline for long-term immersion.
This paper presents a design that jointly provides hand pose sensing, hand localization, and haptic feedback to facilitate real-time stable grasps in Virtual Reality (VR). The design is based on an easy-to-replicate glove-based system that reliably performs (i) high-fidelity hand pose sensing in real time through a network of 15 IMUs, and (ii) hand localization using a Vive Tracker. The supported physics-based simulation in VR can detect collisions and contact points for virtual object manipulation; collision events trigger the physical vibration motors on the glove to signal the user, providing better realism inside virtual environments. A caging-based approach using collision geometry is integrated to determine whether a grasp is stable. In the experiments, we showcase successful grasps of virtual objects with large geometric variations. Compared to the popular LeapMotion sensor, the proposed glove-based design yields a higher success rate in various VR tasks. We hope such a glove-based system can simplify the collection of human manipulation data in VR.
During continuous use of displays, a short rest can relax users' eyes and relieve visual fatigue. As one of the most important virtual reality devices, head-mounted displays (HMDs) can create an immersive 3D virtual world. When users take a short rest during the use of HMDs, they experience a transition from the virtual world to the real world. To investigate how this change affects users' eye condition, we designed a 2 × 2 experiment to explore the effects of short rests during continuous HMD use and compared the results with those of 2D displays. The Visual Fatigue Scale, critical flicker frequency, visual acuity, pupillary diameter, and accommodation response of 80 participants were measured to assess their performance. The experimental results indicate that a short rest during continuous use of 2D displays can significantly reduce users' visual fatigue. For HMDs, however, a short rest during continuous use induced more severe symptoms of subjective visual discomfort but reduced objective visual fatigue.
Building a human-centered editable world can be fully realized in a virtual environment. Both mixed reality (MR) and virtual reality (VR) are feasible solutions for supporting editability. Based on the current development of MR and VR, we present a vision-tangible interactive display method and its implementation in both MR and VR. We address MR and VR together because the proposed method applies similarly to both. An editable mixed/virtual reality system is useful for studies that exploit it as a platform. In this paper, we construct a VR environment based on the Oculus Rift and an MR system based on a binocular optical see-through head-mounted display. In the MR system, for manipulating a Rubik's cube, and in the VR system, for deforming virtual objects, the proposed vision-tangible interactive display method provides users with a more immersive environment. Experimental results indicate that the method can improve the user experience and is a promising way to make virtual environments better.
Vision-tangible mixed reality (VTMR) is a further development of traditional mixed reality. It provides the experience of directly manipulating virtual objects at the perceptual level of vision. In this paper, we propose a mixed reality system called "VTouch". VTouch is composed of an optical see-through head-mounted display (OST-HMD) and a depth camera, supporting direct six-degree-of-freedom transformations and detailed manipulation of the six faces of a Rubik's cube. All operations are performed based on spatial physical detection between virtual and real objects. We not only implement a qualitative analysis of the effectiveness of the system through a functional test, but also perform quantitative experiments to test the effects of depth occlusion. On this basis, we put forward basic design principles and give suggestions for the future development of similar systems. This kind of mixed reality system is significant for promoting intelligent environments with state-of-the-art interaction techniques.
Text entry is an imperative issue to be addressed in current virtual environment (VE) systems, and a physical keyboard is still the dominant choice for efficient text entry. In this paper, we propose a mixed reality typing system called HiKeyb, which achieves efficiency similar to that of a single physical keyboard in the real environment. The HiKeyb system consists of a depth camera, a pose tracking module, a head-mounted display (HMD), a QWERTY keyboard, and a black table mat. The system guarantees entry efficiency and comfort by introducing force feedback from a movable physical keyboard and by improving immersion with real hand images. In addition, an infrared-absorbing material improves the robustness of the system under different lighting conditions. Experiments show that users wearing HMDs can achieve an entry rate of 23.1 words per minute with an error rate of 2.76% in the Virtual Phrases session, and the entry-rate ratio of virtual reality to the real world is 78% when typing phrases. Moreover, we find that the proposed system provides entry efficiency relatively close to that of a pure physical keyboard in the real environment.
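Rates like those above can be computed from transcription logs with the standard text-entry metrics: the common WPM formula and an edit-distance-based error rate from the text-entry literature. It is an assumption that HiKeyb used exactly these definitions; the sketch below shows the usual forms.

```python
def words_per_minute(transcribed: str, seconds: float) -> float:
    """Standard text-entry rate: (chars - 1) / seconds * 60 / 5."""
    return (len(transcribed) - 1) / seconds * 60.0 / 5.0

def error_rate(transcribed: str, target: str) -> float:
    """Error rate as Levenshtein distance divided by the longer string length."""
    m, n = len(transcribed), len(target)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if transcribed[i - 1] == target[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n] / max(m, n)

print(words_per_minute("the quick brown fox", 12.0))  # 19 chars in 12 s -> 18.0
print(error_rate("helo", "hello"))                    # one missing letter -> 0.2
```

Dividing the character count by five reflects the conventional definition of a "word" in text-entry evaluation.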
We propose a framework called inverse augmented reality (IAR), which describes a scenario in which a virtual agent living in the virtual world can observe both virtual and real objects. This differs from traditional augmented reality. Traditional virtual reality, mixed reality, and augmented reality are all generated for humans, i.e., they are human-centered frameworks. In contrast, the proposed inverse augmented reality is a virtual-agent-centered framework, which represents and analyzes reality from a virtual agent's perspective. In this paper, we elaborate on the framework of inverse augmented reality to argue for the equivalence of the virtual world and the physical world with respect to the whole physical structure.
Calibration accuracy is one of the most important factors affecting the user experience in mixed reality applications. For a typical mixed reality system built on an optical see-through head-mounted display, a key problem is how to guarantee the accuracy of hand-eye coordination by reducing the instability of the eye and the head-mounted display during long-term use. In this paper, we propose a real-time latent active correction algorithm to decrease hand-eye calibration errors accumulated over time. Experimental results show that the proposed algorithm guarantees an effective calibration result and improves the user experience. Based on the proposed system, experiments with virtual buttons are also designed, and the interactive performance for different scales of virtual buttons is presented. Finally, a direct physics-inspired input method is constructed, which performs similarly to a gesture-based input method but has a lower learning cost due to its naturalness.
The single point active alignment method has been a widely used calibration method for optical see-through head-mounted displays (OST-HMDs) since its introduction. It requires high-accuracy alignment for data acquisition, and the collected data largely determine the calibration accuracy. However, many kinds of alignment errors occur during the calibration process, including random errors of manual alignment and system errors of the fixed eye-HMD model. To tackle these problems, we first leverage a random sample consensus approach to iteratively reduce the random error in the collected data sequence, and then use a region-induced data enhancement method to reduce the system error. We design a framework that enhances data acquisition for calibration by sequentially reducing the random error and the system error. Experimental results show that the proposed method makes the calibration significantly more robust by eliminating sampling points with large errors. At the same time, the calibration accuracy is increased by the proposed dynamic eye-HMD model, which takes eye movement into consideration. This improvement in calibration should help promote applications based on OST-HMDs.
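A random-sample-consensus stage of this kind can be sketched as follows: repeatedly fit a 3x4 projection matrix (the standard SPAAM unknown) to random minimal subsets of screen-world alignment pairs via the direct linear transformation, keep the model with the most inliers, and refit on the consensus set. The function names, thresholds, and synthetic data below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def fit_projection(X, x):
    """DLT fit of a 3x4 projection from 3D points X (N,3) and 2D alignments x (N,2)."""
    A = []
    for Xi, xi in zip(X, x):
        Xh = np.append(Xi, 1.0)
        A.append(np.concatenate([Xh, np.zeros(4), -xi[0] * Xh]))
        A.append(np.concatenate([np.zeros(4), Xh, -xi[1] * Xh]))
    _, _, Vt = np.linalg.svd(np.asarray(A))  # smallest right singular vector
    return Vt[-1].reshape(3, 4)

def reproj_error(P, X, x):
    """Per-point reprojection error in pixels."""
    Xh = np.hstack([X, np.ones((len(X), 1))])
    proj = (P @ Xh.T).T
    return np.linalg.norm(proj[:, :2] / proj[:, 2:3] - x, axis=1)

def ransac_calibrate(X, x, iters=200, thresh=2.0, seed=0):
    """Reject gross manual-alignment errors, then refit on the consensus set."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(X), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(X), 6, replace=False)  # small sample for the 11-DOF model
        inliers = reproj_error(fit_projection(X[idx], x[idx]), X, x) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    return fit_projection(X[best], x[best]), best

# Synthetic demo: camera-like ground truth, 30 alignment pairs, 5 gross outliers.
K = np.array([[500.0, 0, 320], [0, 500, 240], [0, 0, 1]])
P_true = K @ np.hstack([np.eye(3), [[0.0], [0.0], [5.0]]])
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(30, 3))
x = (P_true @ np.hstack([X, np.ones((30, 1))]).T).T
x = x[:, :2] / x[:, 2:3]
x[:5] += 50.0  # simulate badly aligned samples
P, inliers = ransac_calibrate(X, x)
print(inliers[:5].any(), inliers[5:].all())  # corrupted points rejected, clean points kept
```

Once the gross outliers are removed, a region-induced reweighting of the remaining samples (the RIDE step) can be applied before the final fit.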
This paper details the design, implementation, and initial evaluation of a collaborative platform named OptoBridge, which aims to enhance remote guidance and skill acquisition for spatially distributed users. OptoBridge integrates augmented reality (AR) and gesture interaction with video-mediated communication, and it is preliminarily applied to the experimental teaching of the adjustment task of a Michelson interferometer. An exploratory study was conducted to qualitatively and quantitatively evaluate the extent to which different viewpoints affect the student's sense of presence, task performance, learning outcomes, and subjective feelings in the remote collaborative augmented environment. Sixteen students from local universities participated in the evaluation. The results show the influence of the two different viewpoints and indicate that OptoBridge can effectively support remote guidance and enhance the collaborators' experience.
Hand-based interaction is one of the most widely used interaction modes in applications based on optical see-through head-mounted displays (OST-HMDs). In this paper, two interaction modes, gesture-based interaction (GBI) and physics-based interaction (PBI), are developed within a mixed reality system to evaluate the advantages and disadvantages of each for near-field mixed reality. The experimental results show that PBI leads to better user performance regarding work efficiency in the proposed tasks. A t-test confirms that the difference in efficiency between the interaction modes is significant.
Calibration accuracy is one of the most important factors affecting the user experience in mixed reality applications. For a typical mixed reality system built on an optical see-through head-mounted display (OST-HMD), a key problem is how to guarantee the accuracy of hand-eye coordination by reducing the instability of the eye and the HMD during long-term use. In this paper, we propose a real-time latent active correction (LAC) algorithm to decrease hand-eye calibration errors accumulated over time. Experimental results show that the LAC algorithm can be successfully applied to physics-inspired virtual input methods.
With the integration of artificial intelligence into virtual reality, a new branch of virtual reality, called inverse virtual reality (IVR), has emerged. A typical IVR system contains both intelligence-driven virtual reality and physical reality, thereby constructing an intelligence-driven, mutually mirrored world. We propose the concept of IVR and describe the definition, structure, and implementation of a typical IVR system. The parallel living environment is proposed as a typical application of IVR, revealing that IVR has significant potential to extend the human living environment.
The single point active alignment method (SPAAM) has become the basic calibration method for optical see-through head-mounted displays since its introduction. However, SPAAM is based on a simple static pinhole camera model that assumes a static relationship between the user's eye and the HMD. This theoretical defect limits calibration accuracy. We model the eye as a dynamic pinhole camera to account for the displacement of the eye during the calibration process, and we use region-induced data enhancement (RIDE) to reduce the system error in the acquisition process. The experimental results show that the proposed dynamic model performs better than the traditional static model, and that the RIDE method helps users obtain a more accurate calibration result based on the dynamic model, improving accuracy significantly compared to standard SPAAM.
The most commonly used single point active alignment method (SPAAM) is based on a static pinhole camera model, which assumes that both the eye and the HMD are fixed. This limits calibration precision. In this work, we propose a dynamic pinhole camera model based on the fact that the human eye undergoes noticeable displacement over the course of the calibration process. Based on this camera model, we propose a new calibration data acquisition method, called region-induced data enhancement (RIDE), to revise the calibration data. The experimental results show that the proposed dynamic model performs better than the traditional static model in actual calibration.
This paper presents an experimental teaching platform named OptoBridge, which supports sharing a collaborative space among spatially distributed users to assist skill acquisition. OptoBridge is based on augmented reality (AR) and integrates free-hand gesture interaction with video-mediated communication. The prototype is preliminarily applied in the field of optics to promote skill execution in the case of the Michelson interferometer. OptoBridge enables a remote teacher to monitor the experimental scenario, as well as detailed optical phenomena, through video transmitted from the local side. Meanwhile, the local learner, equipped with an optical see-through head-mounted display (OST-HMD), is guided by virtual hands and augmented annotations controlled by the teacher's gestures and can follow this guidance to practice their skills. The implementation of OptoBridge is also presented, aiming to provide a more engaging and efficient approach to remote skill teaching.
With the rapid development of virtual and augmented reality systems, it becomes increasingly important to develop an efficient calibration method for optical see-through head-mounted displays (OST-HMDs). In this paper, a modular calibration framework with two calibration phases is proposed. In the first phase, an eye-involved equivalent camera model is proposed to compute the spatial position of the human eye directly. In the second phase, gesture information is integrated into the system with a depth camera. In addition, a fast correction algorithm is introduced to ensure that the calibration result works for new users without additional complex recalibration procedures. The precision of the proposed modular calibration and optimization method is evaluated, and the results show that the proposed method can simplify recalibration procedures for OST-HMDs.
Cybersickness remains a major obstacle for virtual reality (VR) systems. Current research on cybersickness is mostly based on static or dynamic simulators, while studies in real movement states are rare. We propose an evaluation system that combines a head-mounted display (HMD) with a moving vehicle to study cybersickness in real movement states. In this system, users sitting in the vehicle see virtual scenes through the HMD that are consistent with the real motion. Subjective and objective experiments were conducted to analyze the different levels of cybersickness caused by visual-vestibular conflict. The results show that the consistency between real movement and visually perceived movement has a great impact on cybersickness: cybersickness worsens as the consistency decreases. Serious cybersickness may lead to extreme situations in which the discomfort becomes unbearable for users.
The combination of health and entertainment has become possible due to the development of wearable augmented reality equipment and corresponding application software. In this paper, we implement a fast calibration method, extended from SPAAM, for an optical see-through head-mounted display (OST-HMD) built in our lab. During the calibration, tracking and recognition techniques based on natural targets are used, and the spatial correspondence points are placed in dispersed, well-distributed positions. We evaluate the precision of this calibration for view angles ranging from 0 to 70 degrees. Based on these results, we compute the position of the human eye relative to the world coordinate system and render 3D objects of arbitrary complexity on the OST-HMD in real time, accurately matching the real world. Finally, we report users' satisfaction with our device in combining entertainment with the prevention of cervical vertebra diseases, based on user feedback.