Appropriateness of acoustic characteristics on perception of disaster warnings

Naomi Ogasawara
Kenta Ofuji      
Akari Harada

Popular version of paper, 5aSC43, “Appropriateness of acoustic characteristics on perception of disaster warnings.”
Presented Friday morning, December 2, 2016
172nd ASA Meeting, Honolulu

As you might know, Japan has often been hit by natural disasters, such as typhoons, earthquakes, flooding, landslides, and volcanic eruptions. According to the Japan Institute of Country-ology and Engineering [1], 20.5% of all the M6 and greater earthquakes in the world occurred in Japan, and 0.3% of deaths caused by natural disasters worldwide were in Japan. These numbers seem quite high compared with the fact that Japan occupies only 0.28% of the world’s land mass.

Municipalities in Japan issue and announce evacuation calls to local residents through the community wireless system or home receiver when a disaster is approaching; however, there have been many cases reported in which people did not evacuate even after they heard the warnings [2]. This is because people tend to not believe and disregard warnings due to a normalcy bias [3]. Facing this reality, it is necessary to find a way to make evacuation calls more effective and trustworthy. This study focused on the influence of acoustic characteristics (voice gender, pitch, and speaking rate) of a warning call on the listeners’ perception of the call and tried to make suggestions for better communication.

Three short warnings were created: 1) Kyoo wa ame ga furimasu. Kasa wo motte dekakete kudasai. ‘It’s going to rain today. Please take an umbrella with you.’ 2) Ookina tsunami ga kimasu. Tadachini hinan shitekudasai. ‘A big tsunami is coming. Please evacuate immediately.’ and 3) Gakekuzure no kiken ga arimasu. Tadachini hinan shitekudasai. ‘There is a risk of landslide. Please evacuate immediately.’ A female and a male native speaker of Japanese, who both have relatively clear voices and good articulation, read the warnings out aloud at a normal speed (see Table 1 for the acoustic information of the utterances), and their utterances were recorded in a sound attenuated booth with a high quality microphone and recording device. Each of the female and male utterances was modified using the acoustic analysis software PRAAT [4] to create stimuli with 20% higher or lower pitch and 20% faster or slower speech rate. The total number of tokens created was 54 (3 warning types x 2 genders x 3 pitch levels x 3 speech rates), but only 4 of the warning 1) tokens were used in the perception experiment as practice stimuli.

  1. Table 1: Acoustic Data of Normal Tokens
  2. oga1

34 university students listened to each stimulus through the two speakers placed on the right and left front corners in a classroom (930cm x 1,500cm). Another group of 42 students and 11 people from the public listened to the same stimuli through one speaker placed on the front in a lab (510cm x 750cm). All of the participants rated each token on 1-to-5 scale (1: lowest, 5: highest) in terms of Intelligibility, Reliability, and Urgency.

Figure 1 summarizes the evaluation responses (n=87) in a bar chart, with the average scores calculated from the ratings on a 1-5 scale for each combination of the acoustic conditions. Taking Intelligibility, for example, the average score was the highest when the calls were spoken with a female voice, with normal speed and normal pitch. Similar results are seen for Reliability as well. On the other hand, respondents felt a higher degree of Urgency for both faster speed and higher pitch.
Figure 1.  Evaluation responses (bar graph, in percent) and Average scores (data labels and
line graph on 1 – 5 scale)


The data were then analyzed with an analysis of variance (ANOVA, Table 2). Figure 2 illustrates the same results as bar charts. It was confirmed that for all of Intelligibility, Reliability, and Urgency, the main effect of speaking speed was the most dominant. In particular, Urgency can be influenced by the speed factor alone by up to 43%.
Table 2: ANOVA results

Figure 2: Decomposed variances in stacked bar charts based on the ANOVA results


Finally, we calculated the expected average evaluation scores, with respect to different levels of speed, to find out how much influence speed has on Urgency, with a female speaker and normal pitch (Figure 3). Indeed, by setting speed to fast, the perceived Urgency can be raised to the highest level, even at the expense of Intelligibility and Reliability to some degrees. Based on these results, we argue that the speech rate may effectively be varied depending on the purpose of an evacuation call, whether it prioritizes Urgency, or Intelligibility and Reliability.
Figure 3: Expected average evaluation scores on 1-5 scale, setting female voice and normal



  1. Japan Institute of Country-ology and Engineering (2015). Kokudo wo shiru [To know the national land]. Retrieved from:

2. Nakamura, Isao. (2008). Dai 6 sho Hinan to joho, dai 3 setsu Hinan to jyuumin no shinri [Chapter 6 Evacuation and Information, Section 3 Evacuation and Residents’ Mind]. In H. Yoshii & A. Tanaka (Eds.), Saigai kiki kanriron nyuumon [Introduction to Disaster Management Theory] (pp.170-176). Tokyo: Kobundo.

  1. Drabek, Thomas E. (1986). Human System Responses to Disaster: An Inventory of Sociological Findings. NY: Springer-Verlag New York Inc.
  1. Boersma, Paul & Weenink, David (2013). Praat: doing phonetics by computer [Computer program]. Retrieved from:

-Emergency warnings/response
-Natural disasters
-Speech rate


Share This