Background
Immersive virtual reality (VR) simulation is increasingly used in medical and surgical education, offering interactive experiences in an immersive environment. With recent advances in hardware and software, VR has become more accessible and affordable, providing flexible and autonomous learning opportunities with automated instructions and integrated real-time feedback. Further, this technology provides an environment in which learners can safely explore procedural skills without compromising patient safety, and it is expected to enhance the acquisition and retention of technical skill and theoretical knowledge. Evidence for the effectiveness of VR simulation in medical and surgical education has been reported. In a recently reported randomized controlled trial, we found that VR was effective in teaching a technical skill (chest tube insertion) to medical students.
Although recent research has demonstrated VR’s educational potential, user experience with this technology has not been sufficiently studied. A randomized controlled trial comparing immersive VR and screen-based VR demonstrated higher cognitive load and impaired performance among learners trained with immersive VR. Cognitive load theory is an important consideration when developing an educational program delivered with new technology, given that learners need to learn how to use the technology in addition to the content for which it is being used. The main premise of this theory is that the capacity of our working memory, and of information processing for long-term memory, is limited. Because of the additional mental burden required to learn how to use the technology, it is important to assess and manage the learner’s cognitive load so that it does not exceed the individual learner’s cognitive capacity. The theory describes three types of cognitive load: intrinsic, extraneous, and germane. Intrinsic load is based on the learning task (inherent to the complexity of the task). Extraneous load is determined by how the teaching is delivered, how the learning task is presented, or the learning situation (poor teaching and instructions, and distractions in the learning environment, lead to higher extraneous load). Germane load refers to the learning process and retention (the effort required to process new information into long-term memory by developing mental schemas). In the medical education context, intrinsic load should be managed by adjusting the curriculum to learners’ pre-existing knowledge and skills, while extraneous load should be reduced by incorporating user-friendly technologies and avoiding irrelevant elements and complex instructions to foster generative processing.
In addition to cognitive load, usability is an essential concept for evaluating the integration and ease of use of technology as a learning tool. Usability problems can be a barrier to successful implementation of VR in medical education. It is postulated that the concepts of usability and cognitive load are interrelated, wherein poor usability can be a significant contributor to higher levels of extraneous cognitive load.
In a national survey of simulation centres, a lack of evidence for the utility of VR was found to be one of the major barriers to its implementation in medical education. Such evidence should be multifaceted, including the effectiveness and efficiency of learning and the user experience. The objective of this study is to address these gaps by investigating user experience in VR applications for technical skill learning, particularly with respect to usability and cognitive load.
Methods
Experimental Design
A previously reported randomized controlled study was conducted in July 2023 (IRB approval A03-E17-23B), involving 30 medical students with no or limited experience with VR and no clinical experience with chest tube insertion. In that study, knowledge and performance were compared between students trained with online learning alone and those trained with online learning plus VR training. Participants trained with VR simulation plus online learning showed greater technical skill when assessed in a mannequin-based simulation than participants prepared only with the online module. For the present study, we conducted further analysis of data from students trained with VR simulation between July and September 2023 (n = 22).
Participants completed a pre-training survey and multiple-choice knowledge assessment, then received access to the online learning module on chest tube insertion. They then received two training sessions with immersive VR simulation (Figure 2). The first session consisted of a tutorial to familiarize participants with use of the VR equipment and working in a virtual environment, followed by two repetitions of the immersive VR chest tube insertion simulation. The second session took place approximately a week after the first session with two additional trials of the VR simulation. After the fourth VR simulation, all participants were asked to perform chest tube insertion in a mannequin model (Figure 3), while being evaluated by a trained assessor. After this technical skill assessment, participants completed a post-training multiple-choice knowledge assessment and a survey on the user experience to evaluate the usability and cognitive load of using the VR simulator. In addition, participants were asked to provide suggestions for improvement in VR simulation through an open-ended question. Flow of the intervention and evaluation is shown in Figure 1.

Figure 1
Flowchart of the study. VR, virtual reality.
VR Simulation
Immersive VR simulation was provided using the chest tube module of a commercially available product (Vantari VR, Sydney, Australia) delivered through a Quest 2 VR Headset (Meta, Menlo Park, CA, United States) and its hand controllers. The system situates a learner in a fully immersive virtual operating room environment, and learners receive detailed verbal instructions for 26 steps to execute the procedure while the steps are highlighted on the screen (Figure 2). The module also provides real-time feedback, based on its embedded metrics and data, and reports the time to complete each step and the total procedure. The study team had verified that the instructions in the VR simulation were aligned with the Advanced Trauma Life Support guidelines.

Figure 2
Screenshots during the VR simulation. (A): A trainee wearing a headset and controllers for VR simulation. (B): The 26 steps shown on a board during the immersive VR simulation. (C): A step to identify the site of chest tube insertion. VR, virtual reality.
Assessment and Outcomes
The usability of the VR simulation was evaluated using the System Usability Scale (SUS) (Supplemental material 1). The SUS is composed of 10 questions asking the degree of agreement with five positive statements and five negative statements, using a 5-point Likert scale, where 1 represents “strongly disagree” and 5 “strongly agree.” To calculate the overall SUS score, ratings for positive statements are transformed as “[Score]-1” and ratings for negative statements are transformed as “5-[Score]”. The sum of the transformed scores is multiplied by 2.5, so that scores range from 0 to 100, where higher scores indicate better usability; a score greater than 68 is considered to be above average.
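The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's analysis code: the function name `sus_score` is hypothetical, the example ratings are invented, and the sketch assumes the questionnaire follows the standard SUS layout in which odd-numbered items are the positive statements and even-numbered items the negative ones (this should be checked against Supplemental material 1).

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one participant.

    `responses` holds the ten ratings (1-5) in questionnaire order.
    Assumption: odd-numbered items are positive statements and
    even-numbered items are negative statements (standard SUS layout).
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += rating - 1   # positive statement: [Score] - 1
        else:
            total += 5 - rating   # negative statement: 5 - [Score]
    return total * 2.5            # rescale the 0-40 sum to 0-100


# Hypothetical ratings for illustration only (not study data).
example = [5, 1, 4, 2, 5, 1, 4, 2, 5, 2]
print(sus_score(example))  # 87.5
```

With these invented ratings the transformed sum is 35, so the score is 87.5, which would fall above the 68-point benchmark mentioned in the text.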
Cognitive load theory describes three types of cognitive load: intrinsic, extraneous, and germane cognitive load. Intrinsic cognitive load is related to the inherent difficulty of the learning subject, whereas extraneous cognitive load is affected by the presentation of the information and is influenced by instructional procedures and technology. Germane cognitive load refers to the construction and storage of information. Although intrinsic cognitive load is generally expected to be unaffected by instructional technology, teaching technologies with lower extraneous cognitive load (a lower score on this scale) and higher germane cognitive load (a higher score on this scale) are considered optimal for learning. Cognitive load was evaluated using a 10-item questionnaire (three items for intrinsic, three items for extraneous, and four items for germane cognitive load) based on a 0-10 scale, where 0 represents “not at all the case” and 10 represents “completely the case” (Leppink’s scale) (Supplemental material 2). The mean score for each domain was calculated and used for the correlation analysis with other variables.
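The per-domain means described above could be computed as in the following sketch. The item order is an assumption made for illustration (items 1-3 intrinsic, 4-6 extraneous, 7-10 germane) and should be checked against Supplemental material 2; the names `DOMAINS` and `cognitive_load_means` and the example ratings are hypothetical.

```python
from statistics import mean

# Assumed item order (hypothetical; verify against the questionnaire):
# items 1-3 intrinsic, items 4-6 extraneous, items 7-10 germane.
DOMAINS = {
    "intrinsic": range(0, 3),
    "extraneous": range(3, 6),
    "germane": range(6, 10),
}


def cognitive_load_means(ratings):
    """Return the mean 0-10 rating for each cognitive load domain."""
    if len(ratings) != 10:
        raise ValueError("the questionnaire has exactly 10 items")
    if any(not 0 <= r <= 10 for r in ratings):
        raise ValueError("each rating must lie between 0 and 10")
    return {
        domain: mean(ratings[i] for i in indices)
        for domain, indices in DOMAINS.items()
    }


# Hypothetical ratings for one participant (not study data).
result = cognitive_load_means([2, 3, 4, 0, 1, 0, 9, 10, 8, 9])
print(result)
```

Each participant then contributes one mean per domain, and those per-participant means are what enter the correlation analysis.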
The time taken during each VR iteration and total VR times were measured. Participants’ knowledge about chest tube insertion was assessed by a 15-item multiple-choice assessment conducted pre- and post-training. Technical skills were assessed by an experienced surgeon using a modified Objective Structured Assessment of Technical Skills (OSATS) score (5-point, 11 items) (Supplemental material 3) as participants performed chest tube insertion on a mannequin (Figure 3). In this secondary analysis, we analyzed technical skill (OSATS scores) and knowledge as they relate to usability and cognitive load.
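As a small illustration of how the knowledge and OSATS scores described above map onto their reporting scales, the sketch below converts a count of correct answers on the 15-item test to a percentage and sums the 11 OSATS items to a total out of 55. The function names are hypothetical, and the sketch assumes each OSATS item is rated from 1 to 5, which the text implies but does not state explicitly.

```python
def percent_correct(n_correct, n_items=15):
    """Percentage of correct answers on the multiple-choice test."""
    if not 0 <= n_correct <= n_items:
        raise ValueError("n_correct must lie between 0 and n_items")
    return round(100 * n_correct / n_items, 1)


def osats_total(item_scores):
    """Sum of the 11 modified OSATS items (assumed to be rated 1-5 each)."""
    if len(item_scores) != 11:
        raise ValueError("the modified OSATS has 11 items")
    if any(s not in (1, 2, 3, 4, 5) for s in item_scores):
        raise ValueError("each item must be an integer from 1 to 5")
    return sum(item_scores)


print(percent_correct(7))       # 46.7
print(percent_correct(13))      # 86.7
print(osats_total([5] * 11))    # 55, the maximum possible total
```

For example, 7 of 15 correct answers is 46.7% and 13 of 15 is 86.7%.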

Figure 3
Mannequin-based simulation. (A): Participant preparing to perform a simulated chest tube insertion on a mannequin, prior to sterile draping of the surgical field. (B): Participant inserting a chest tube to the mannequin during the simulation.
Statistics
Quantitative analysis was used to examine scores from knowledge tests, technical skills, usability, and cognitive load. Numerical data are reported as median and interquartile range (IQR). The Wilcoxon signed-rank test was used to compare times to complete VR trials. Spearman rank correlation was used to describe the relationships between variables. Answers to the open-ended question were categorized based on our codebook, and the frequency of each category was reported.
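The descriptive statistics and rank correlation named above can be sketched with the Python standard library alone. This is an illustrative sketch, not the authors' analysis code: the function names and example values are hypothetical, the quartile interpolation method is an assumption (the text does not state which was used), and no P value is computed here. A statistics package such as SciPy (`scipy.stats.spearmanr`, `scipy.stats.wilcoxon`) would normally be used for significance testing.

```python
from statistics import mean, median, quantiles


def median_iqr(values):
    """Return (median, Q1, Q3); the 'inclusive' quartile method is an assumption."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical paired scores for five participants (not study data).
sus = [70.0, 82.5, 90.0, 65.0, 77.5]
osats = [38, 41, 50, 35, 44]
print(median_iqr(sus))
print(round(spearman_rho(sus, osats), 2))  # 0.9
```

Because Spearman's coefficient works on ranks, it describes monotonic association and is less sensitive to outliers than a correlation on the raw scores, which suits ordinal questionnaire data from a small sample.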
Results
Participants required a median total of 36.5 (IQR 30.0-39.7) minutes to complete four trials of chest tube insertion in VR simulation. The median time per trial progressively improved from the 1st through the 4th iteration, from 14.7 to 5.3 minutes (P < 0.001) (Figure 4).

Figure 4
Learning curve: time to complete 4 VR simulations. Median time decreased significantly over the course of the study. Box plot shows median time and interquartile range; whiskers show range. VR, virtual reality.
Usability of the VR simulation was rated as excellent, with a median SUS score of 82.5 (IQR 73.8-88.8) out of 100. The medians of intrinsic, extraneous, and germane cognitive load were 3.7 (1.8-6.1), 0.2 (0-1.4), and 9.2 (6.0-10.0) out of 10, respectively.
The percentage of correct answers in the knowledge assessments increased significantly from 46.7% (40.0-53.3) pre-training to 86.7% (80.0-90.3) post-training (P < 0.001). The OSATS score for technical skills in the post-training mannequin-based simulation was 40.5 (35.5-49.3) out of 55. There was a significant positive correlation between OSATS and SUS scores (r = 0.51, P = 0.04) (Figure 5). There was a weak positive correlation between germane cognitive load and post-training knowledge (r = 0.37), but this was not statistically significant (P = 0.11). There was no statistically significant correlation between extraneous cognitive load and usability (r = −0.15, P = 0.41).

Figure 5
Regression analysis: Relationship between technical performance (chest tube OSATS score) and usability (SUS score). There was a significant positive correlation between OSATS score and SUS score (r = 0.51, P = 0.04). OSATS, objective structured assessment of technical skill; SUS, system usability scale.
In response to the open-ended question regarding areas for improvement in the immersive VR simulation, eight participants (36.4%) cited some discrepancy between the content taught in the online learning module and the VR simulation. Five participants (22.7%) suggested that some steps, such as draping and suturing, which were automated in the module, should be more fully modelled in the VR simulation. Three participants (13.6%) suggested adding a self-assessment mode without instruction and wanted interactions with the controllers to resemble more closely the movements used in actually performing the procedure.
Discussion
Our previous randomized controlled trial showed the benefits of using immersive VR simulation for teaching technical skills to medical students in a blended learning framework. VR training added further value in learning the technical skills associated with the procedure. Since many medical students were inexperienced with immersive VR, there was concern that adding this new technology might increase their cognitive burden and diminish their capacity to learn. The present study shows that the usability of the VR technology was excellent and the cognitive load was not overwhelming. Participants familiarized themselves with the technology easily and quickly, and became progressively more comfortable within a few simulation trials. There was a significant positive correlation between SUS score and technical skill (r = 0.51), which suggests that the perceived usability of immersive VR may impact the efficiency and effectiveness of learning. In contrast, despite the theoretically ideal cognitive load profile (relatively low intrinsic load, low extraneous load, and high germane load), there was no significant correlation between cognitive load and other measures, although extraneous cognitive load and usability should theoretically be negatively correlated. We assume that the variability in extraneous and germane load reported by the participants was too small to show statistically significant correlations with usability and other learning outcomes. Reported intrinsic cognitive load was low considering that the participants were medical students with little or no prior training in chest tube insertion; however, the use of online material for preparation before the immersive VR simulation may have helped to lower the intrinsic cognitive load.
Regarding responses to the open-ended question, more than one third of participants cited discrepancies in content between the online learning material and the VR simulation. Before conducting these studies, a panel of experts reviewed the content of both modules and confirmed that they were compatible with the current ATLS guidelines. This result may indicate that the participants, as naïve learners, are very sensitive to subtle differences, such as variations in the names of instruments. Simulation of basic surgical manoeuvres such as draping and suturing was not the main purpose of this training, but these skills could be taught in separate immersive VR simulation modules if there is substantial demand. The participants’ suggestion of integrating an assessment mode (without real-time guidance) may be helpful feedback for the software developer. Currently, the module includes audio and visual instructions for the steps of the procedure at all times; an assessment mode without any instructions may be a useful capability of the immersive VR simulation for learners who would like to self-assess their performance in a competency-based learning framework.
This study has some limitations. The small sample size and the analysis of a single surgical procedure may limit the generalizability of the results. In addition, the measurements of usability and cognitive load were subjective and represented the overall learning experience; self-reported outcomes may have been influenced by individual biases. Objective measurements, such as counts of interaction errors for usability or physiologic monitoring (eg, heart rate and brain function), could provide real-time assessment and may help identify the interface elements to be improved.
Conclusion
In conclusion, usability and cognitive load of immersive VR simulation are important to evaluate. This secondary analysis of immersive VR simulation for training in chest tube insertion found excellent usability and low extraneous cognitive load, which may have contributed to its effectiveness for technical skill learning. Together with the results from the previous study, there is emerging evidence that well-designed immersive VR simulation can be useful for procedural training for medical students.
Author Contributions JT, RK, and FB designed and carried out the experiment, analyzed data, and wrote a draft of the manuscript; HF assisted with the experiments and provided technology support; TC provided critical feedback on the manuscript; GF supervised the project and reviewed the manuscript.
Declaration of Conflicting Interests The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Mr. Howard Fried is the senior director of strategic partnerships and professional affairs for Vantari VR. He coached participants in the optimal use of VR technology but did not participate in the assessments of performance.
Funding The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Ryan Knobovitch was funded through the Summer Research Bursary Program of the McGill University Faculty of Medicine and Health Sciences. Junko Tokuno was funded through The Foundation of Health Care Science Institute, Japan (Research Grant).
Supplemental Material Supplemental material for this article is available online.
References
- 1. Lohre R, Bois AJ, Pollock JW, et al. Effectiveness of immersive virtual reality on orthopedic surgical skills and knowledge acquisition among senior surgical residents: a randomized clinical trial. JAMA Netw Open. 2020;3(12):e2031217. doi:10.1001/jamanetworkopen.2020.31217
- 2. Kim HJ, Lee HK, Jang JY, et al. Immersive virtual reality simulation training for cesarean section: a randomized controlled trial. Int J Surg. 2024;110(1):194–201. doi:10.1097/JS9.0000000000000843
- 3. Knobovitch RM, Tokuno J, Botelho F, Fried HB, Carver TE, Fried GM. Virtual reality training improves procedural skills in mannequin-based simulation in medical students. Surg Innov. 2025:15533506251334693. doi:10.1177/15533506251334693. Published online on Apr 10.
- 4. Mao RQ, Lan L, Kay J, et al. Immersive virtual reality for surgical training: a systematic review. J Surg Res. 2021;268:40–58. doi:10.1016/j.jss.2021.06.045
- 5. Yi WS, Rouhi AD, Duffy CC, Ghanem YK, Williams NN, Dumon KR. A systematic review of immersive virtual reality for nontechnical skills training in surgery. J Surg Educ. 2024;81(1):25–36. doi:10.1016/j.jsurg.2023.11.012
- 6. Frederiksen JG, Sørensen SMD, Konge L, et al. Cognitive load and performance in immersive virtual reality versus conventional virtual reality simulation training of laparoscopic surgery: a randomized trial. Surg Endosc. 2020;34(3):1244–1252. doi:10.1007/s00464-019-06887-8
- 7. Davids MR, Halperin ML, Chikte UME. Optimising cognitive load and usability to improve the impact of e-learning in medical education. Afr J Health Prof Educ. 2015;7(2):147. doi:10.7196/AJHPE.659
- 8. Tokuno J, Carver TE, Fried GM. Measurement and management of cognitive load in surgical education: a narrative review. J Surg Educ. 2023;80(2):208–215. doi:10.1016/j.jsurg.2022.10.001
- 9. van Merriënboer JJ, Sweller J. Cognitive load theory in health professional education: design principles and strategies. Med Educ. 2010;44(1):85–93. doi:10.1111/j.1365-2923.2009.03498.x
- 10. Mayer RE. Applying the science of learning to medical education. Med Educ. 2010;44(6):543–549. doi:10.1111/j.1365-2923.2010.03624.x
- 11. Hochstrasser K, Stoddard HA. Use of cognitive load theory to deploy instructional technology for undergraduate medical education: a scoping review. Med Sci Educ. 2022;32(2):553–559. doi:10.1007/s40670-021-01499-1
- 12. Sauro J, Dumas JS. Comparison of three one-question, post-task usability questionnaires. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM; 2009:1599–1608.
- 13. Lie SS, Helle N, Sletteland NV, Vikman MD, Bonsaksen T. Implementation of virtual reality in health professions education: scoping review. JMIR Med Educ. 2023;9:e41589. doi:10.2196/41589
- 14. Tokuno J, Bilgic E, Gorgy A, Harley JM. Perceptions and reported use of extended reality technology in Royal College-Accredited Canadian Simulation Centres: a national survey of simulation centre directors. Can Med Educ J. 2024;15(5):64–74. doi:10.36834/cmej.79000
- 15. Tokuno J, Valanci-Aroesty S, Uchino H, et al. Teaching chest tube insertion by blended learning: a multi-dimensional analysis. Surg Innov. 2024;31(1):92–102. doi:10.1177/15533506231211049
- 16. Savir S, Khan AA, Yunus RA, et al. Virtual reality: the future of invasive procedure training? J Cardiothorac Vasc Anesth. 2023;37(10):2090–2097. doi:10.1053/j.jvca.2023.06.032
- 17. Savir S, Khan AA, Yunus RA, et al. Virtual reality training for central venous catheter placement: an interventional feasibility study incorporating virtual reality into a standard training curriculum of novice trainees. J Cardiothorac Vasc Anesth. 2024;38(10):2187–2197. doi:10.1053/j.jvca.2024.07.002
- 18. American College of Surgeons. ATLS Manual. 10th ed. Chicago, IL: American College of Surgeons; 2018.
- 19. Sauro J, Lewis JR. Standardized usability questionnaires. In: Sauro J, Lewis JR, eds. Quantifying the User Experience. 2nd ed. Morgan Kaufmann; 2016:185–248.
- 20. Leppink J, Paas F, Van der Vleuten CP, Van Gog T, Van Merriënboer JJ. Development of an instrument for measuring different types of cognitive load. Behav Res Methods. 2013;45(4):1058–1072. doi:10.3758/s13428-013-0334-1
- 21. Hornbaek K. Current practice in measuring usability: challenges to usability studies and research. Int J Human-Comput Stud. 2006;64:79–102.
- 22. Zakeri Z, Mansfield N, Sunderland C, Omurtag A. Physiological correlates of cognitive load in laparoscopic surgery. Sci Rep. 2020;10(1):12927. doi:10.1038/s41598-020-69553-3
- 23. Plazak J, DiGiovanni DA, Collins DL, Kersten-Oertel M. Cognitive load associations when utilizing auditory display within image-guided neurosurgery. Int J Comput Assist Radiol Surg. 2019;14(8):1431–1438. doi:10.1007/s11548-019-01970-w