Popular version of paper 5pSP6: “Assessing the Accuracy of Head Related Transfer Functions in a Virtual Reality Environment”, presented Friday afternoon, November 9, 2018, 2:30 – 2:45pm, RATTENBURY A/B, ASA 176th Meeting/2018 Acoustics Week in Canada, Victoria, Canada.
While visual graphics in Virtual Reality (VR) systems are very well developed, the way in which acoustic environments and sounds are recreated in a VR system is not. Currently, the standard procedure for representing sound in a virtual environment is to use a generic head-related transfer function (HRTF): a user selects a generic HRTF from a library, with limited personal information. It is essentially a ‘best-guess’ representation of an individual’s perception of a sound source. This limits the accuracy of the representation of the acoustic environment, because every person has a HRTF that is unique to them.
What is a HRTF?
If you close your eyes and someone jangles keys behind your head, you will be able to identify the general location of the keys just from the sound you hear. This is possible because the sound is altered by your head and outer ears on its way to each eardrum, and it reaches your two ears at slightly different times and levels. A HRTF is a mathematical function that captures these transformations, and can be used to recreate the sound of those keys in a pair of headphones, so that the recording of the keys appears to come from a particular direction. However, everyone has vastly different ear and head shapes, so HRTFs are unique to each person. The objective of our work was to determine how the accuracy of sound localization in a VR world varies for different users, and how we can improve it.
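To make the idea of playing a sound "through" a HRTF more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the processing used in our study: the function names are hypothetical, and the filter pair is a crude stand-in that models only a time delay and a level drop at the far ear (using the Woodworth spherical-head approximation with an assumed head radius). A real system would instead load measured impulse responses, one per ear and per direction, from a HRTF library.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # metres per second
HEAD_RADIUS = 0.0875    # metres; an assumed average, not a measured value


def toy_hrir_pair(azimuth_deg, sample_rate=44100, length=64):
    """Return a (left, right) impulse-response pair for a source at the given azimuth.

    Hypothetical stand-in for measured data: it only delays and attenuates
    the far ear. Positive azimuth means the source is to the listener's right.
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("this toy model only covers -90 to +90 degrees")
    theta = np.radians(abs(azimuth_deg))
    # Woodworth approximation of the interaural time difference
    itd_seconds = HEAD_RADIUS / SPEED_OF_SOUND * (theta + np.sin(theta))
    delay = int(round(itd_seconds * sample_rate))
    near = np.zeros(length)
    near[0] = 1.0
    far = np.zeros(length)
    far[delay] = 1.0 - 0.4 * np.sin(theta)  # assumed level drop at the far ear
    return (far, near) if azimuth_deg > 0 else (near, far)


def render_binaural(mono, hrir_left, hrir_right):
    """Filter a mono signal with one impulse response per ear."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=1)  # columns: left ear, right ear


# Usage: a 0.1 s noise burst placed 90 degrees to the listener's right
rng = np.random.default_rng(0)
burst = rng.standard_normal(4410)
stereo = render_binaural(burst, *toy_hrir_pair(90.0))
```

The two columns of `stereo` can be written to a stereo audio file and played over headphones. Because this toy filter ignores the shape of the outer ear, it cannot convey front/back or up/down cues, which is where individual differences in HRTFs matter most.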
In our tests, volunteers entered a VR world, which was essentially an empty room, and an invisible sound source made short bursts of noise at various positions in the room. Volunteers were asked to point to the location of the sound source, and their responses were captured to the nearest millimeter using the VR system’s motion tracking. We tested three cases: 1) where volunteers were not allowed to move their head to assist in the localization, 2) where some slight head movements were allowed to assist in sound localization, and 3) where volunteers could turn around freely and ‘search’ (with their ears) for the sound source. Head movement was monitored by using the VR system to track the volunteer’s eye movement, and if the volunteer moved, the sound source was switched off.
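One common way to score a pointing response like those described above is the angle, seen from the listener's head, between the true source direction and the pointed direction. The sketch below shows that calculation. It is an assumption made for illustration and not necessarily the metric used in our analysis; the function and argument names are hypothetical.

```python
import numpy as np


def angular_error_deg(head_pos, source_pos, pointed_pos):
    """Angle in degrees between the true and the pointed direction, seen from the head.

    All positions are 3-D coordinates in the same units (for example metres).
    Hypothetical helper for illustration only.
    """
    head = np.asarray(head_pos, dtype=float)
    true_dir = np.asarray(source_pos, dtype=float) - head
    resp_dir = np.asarray(pointed_pos, dtype=float) - head
    cos_angle = np.dot(true_dir, resp_dir) / (
        np.linalg.norm(true_dir) * np.linalg.norm(resp_dir)
    )
    # Clip to guard against rounding slightly outside [-1, 1]
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


# Usage: source straight ahead, volunteer points directly to one side
error = angular_error_deg(
    head_pos=(0.0, 0.0, 0.0),
    source_pos=(1.0, 0.0, 0.0),
    pointed_pos=(0.0, 1.0, 0.0),
)
```

Averaging this error per volunteer and per test case would give one simple way to compare the restricted, slight-movement, and free-movement conditions.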
We observed that the accuracy with which volunteers were able to localize the sound source varied significantly from person to person. Errors were large when volunteers’ head movements were restricted, but accuracy improved significantly when people were able to move around and listen to the sound source. This suggests that the initial impression of a sound’s location in a VR world is refined when the user can move their head to search for it.
We are currently analyzing our results in more detail to account for the different characteristics of each user (e.g. head size, size and shape of ear, etc.). We also aim to extend the experimental methodology to use machine learning algorithms that enable each user to create a pseudo-personalized HRTF, which would improve the immersive experience for all VR users.