Masataka Goto - asa.m.goto@m.aist.go.jp
National Institute of Advanced Industrial Science and Technology (AIST)
1-1-1 Umezono, Tsukuba, Ibaraki 305-8568, JAPAN
Popular version of paper 2pMU6
Presented Tuesday afternoon, May 25, 2004
147th ASA Meeting, New York, NY
NEWS: |
I developed SmartMusicKIOSK, a new music-playback interface for music-listening stations. Traditionally in music stores, customers often search for the chorus or "hook" of a song by repeatedly pressing the fast-forward button, rather than passively listening to the music. This activity is not well supported by current technology. SmartMusicKIOSK technology, by analyzing song audio automatically, gives listeners a function for jumping to the chorus section and other key parts of a song, plus a function that lets users see the different parts of the song. These functions eliminate the hassle of searching for the chorus and make it easier for a listener to find desired parts of a song, thereby encouraging an active listening experience. This interface, which allows a listener to look for a section of interest by changing the playback position, is useful not only for music-listening stations but also for more general purposes in selecting and using music. The proposed functions are achieved through my automatic audio-based chorus-section detection method, and the results of implementing them in a listening station have demonstrated their usefulness. While entire songs of no interest to the listener can be skipped on conventional music-playback interfaces, SmartMusicKIOSK is the first interface that allows the user to easily skip sections of no interest even within a song.
BACKGROUND: |
Most music stores have listening stations that allow customers to hear compact discs (CDs) on a trial basis. In general, the main objective of listening to music is to appreciate it, and at home or in a car it is common for a listener to play a musical selection from start to finish. In trial listening, however, the objective is to quickly determine whether a selection is the music one has been looking for and whether one likes it, and the above manner of listening to full selections is rare here. In the case of popular music, for example, customers often want to listen to the most representative, uplifting part of a song, which is generally the chorus or refrain, to pass judgment on that song. This desire produces a special way of listening in which the trial listener first listens briefly to a song's "intro" and then jumps ahead in search of the chorus by pushing the fast-forward button repeatedly, eventually finding it and listening to it.
The functions provided by conventional listening stations for music CDs, however, do not support this unique way of trial listening. These listening stations are equipped with playback-operation buttons typical of an ordinary CD player, and among these, only the fast-forward and rewind buttons can be used to find the chorus section of a song. On the other hand, digital listening stations that have recently come to be installed in CD stores enable playback of musical selections from a hard disk or over a network. Here, however, only one part (e.g., the beginning) of each musical selection (an interval typically of about 30 or 45 seconds) is mechanically excerpted and stored, which means that a trial listener may not necessarily hear the chorus section.
Against the above background, I developed SmartMusicKIOSK, in which a trial listener can jump to the beginning of a song's chorus (perform an instantaneous fast-forward to the chorus) by simply pushing the button for this function.
INTERFACE: |
Music normally cannot be appreciated without taking some time to listen to it, so the problem here is how to let listeners move between specific playback positions before listening in earnest. To solve this problem, I propose two functions, assuming the main target to be popular music: a function for jumping to the chorus section and other key parts of a song, and a function that lets users see the different parts of the song.
TECHNOLOGY: |
To enable the handling of a large number of songs, this research aims for a general and robust chorus-section detection method using no prior information on acoustic features unique to choruses. To this end, I focus on the fact that chorus sections are usually the most repeated sections of a song and adopt the following basic strategy: find sections that repeat and output those that appear most often. It must be pointed out, however, that it is difficult for a computer to judge repetition because it is rare for repeated sections to be exactly the same.
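The basic strategy above (find sections that repeat, tolerate inexact repetition, output the most repeated ones) can be illustrated with a minimal Python sketch. This is not the detection method itself: it assumes the song has already been cut into segments, that each segment is summarized by a small feature vector, and that a cosine-similarity threshold of 0.95 decides whether two segments count as repeats. The function names, the toy vectors, and the threshold are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_repeats(segments, threshold=0.95):
    """Greedily group segment indices whose features are similar enough.

    Each segment joins the first group whose first member it resembles;
    otherwise it starts a new group. The threshold is an assumed value.
    """
    groups = []
    for index, features in enumerate(segments):
        for group in groups:
            representative = segments[group[0]]
            if cosine_similarity(features, representative) >= threshold:
                group.append(index)
                break
        else:
            groups.append([index])
    return groups


def most_repeated(segments, threshold=0.95):
    """Return the indices of the largest group of mutually similar segments."""
    return max(group_repeats(segments, threshold), key=len)


# Toy song: intro, verse, chorus, verse (varied), chorus (varied), chorus (varied).
toy_segments = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.05, 1.0, 0.0],
    [0.0, 0.1, 1.0],
    [0.05, 0.0, 1.0],
]
chorus_candidates = most_repeated(toy_segments)
```

In this toy input the repeated sections differ slightly from one another, which is why an exact-equality test would miss them while a similarity threshold groups them.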
I therefore developed a method that overcomes this difficulty and automatically detects the beginning and end points of chorus sections and repeated sections in compact-disc recordings of popular music. Most previous methods detected as a chorus a repeated section of a given length and had difficulty in identifying both ends of a chorus section and in dealing with modulations (key changes). By analyzing relationships between various repeated sections, my method can detect all the chorus sections in a song and estimate both ends of each section. It can also detect modulated chorus sections by introducing a similarity measure that enables modulated repetition to be judged correctly. Experimental results with a popular-music database show that this method detects the correct chorus sections in 80 of 100 songs.
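One common way to make a similarity measure tolerate key changes is sketched below in Python. It is offered only as an illustration under stated assumptions, not as the measure used in this research: it assumes each section is described by a 12-element pitch-class vector (one value per semitone), so that a modulation appears as a circular shift of the vector, and it takes the best cosine similarity over all 12 shifts. The function names and the example chords are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rotate(vector, steps):
    """Circularly shift a pitch-class vector upward by `steps` semitones."""
    values = list(vector)
    steps %= len(values)
    if steps == 0:
        return values
    return values[-steps:] + values[:-steps]


def transposition_invariant_similarity(a, b):
    """Return (best similarity, shift in semitones) over all 12 transpositions of `a`."""
    best_score, best_shift = -1.0, 0
    for shift in range(12):
        score = cosine_similarity(rotate(a, shift), b)
        if score > best_score:
            best_score, best_shift = score, shift
    return best_score, best_shift


def pitch_classes(*active):
    """Build a 12-element vector with 1.0 at the given pitch-class indices."""
    return [1.0 if i in active else 0.0 for i in range(12)]


c_major = pitch_classes(0, 4, 7)   # C, E, G
d_major = pitch_classes(2, 6, 9)   # D, F sharp, A: the same chord two semitones up

plain = cosine_similarity(c_major, d_major)
score, shift = transposition_invariant_similarity(c_major, d_major)
```

Here the plain comparison treats the two chords as unrelated, whereas the shift-tolerant comparison recognizes them as the same material raised by two semitones.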
SCREEN SNAPSHOTS: |
"NEXT CHORUS" button (jump to chorus) |
(A-1) When a user pushes the "PLAY" button, SmartMusicKIOSK starts playing. |
(A-2) SmartMusicKIOSK keeps playing. |
(A-3) When the user pushes the "NEXT CHORUS" button, SmartMusicKIOSK jumps to the start of the next chorus section in the song from the present cursor position. |
(A-4) When the user pushes the "NEXT CHORUS" button again, SmartMusicKIOSK jumps to the chorus section that follows the previous one. |
(A-5) When the user pushes the "NEXT CHORUS" button once more, it again jumps to the chorus section that follows the previous one. |
(A-6) When the user pushes the "NEXT CHORUS" button after the final chorus section, it returns to the first chorus section. |
"NEXT SECTION" button (jump to next section in song) |
(B-1) When a user pushes the "PLAY" button, SmartMusicKIOSK starts playing. |
(B-2) SmartMusicKIOSK keeps playing. |
(B-3) When the user pushes the "NEXT SECTION" button, SmartMusicKIOSK jumps to the start of the next repeated section in the song from the present cursor position. |
(B-4) When the user pushes the "NEXT SECTION" button again, SmartMusicKIOSK jumps to the repeated section that follows the previous one. |
(B-5) When the user pushes the "NEXT SECTION" button once more, it again jumps to the repeated section that follows the previous one. |
"PREV SECTION" button (jump to previous section in song) |
(C-1) When a user pushes the "PLAY" button, SmartMusicKIOSK starts playing. |
(C-2) SmartMusicKIOSK keeps playing. |
(C-3) When the user pushes the "PREV SECTION" button, SmartMusicKIOSK jumps to the start of the previous repeated section in the song from the present cursor position. |
(C-4) When the user pushes the "PREV SECTION" button again, it jumps to the previous repeated section. |
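The button behavior described in steps (A-1) to (C-4) can be sketched in Python as lookups in sorted lists of section start times. This is a minimal sketch, not the SmartMusicKIOSK implementation: the function names and the start times in seconds are hypothetical, and two behaviors are assumptions the captions do not specify, namely that "NEXT SECTION" leaves the cursor unchanged after the last section and that "PREV SECTION" goes to the section before the one containing the cursor.

```python
from bisect import bisect_right


def next_chorus(chorus_starts, position):
    """Start of the next chorus after `position`, wrapping to the first chorus
    once the final one has been passed (step A-6)."""
    if not chorus_starts:
        return position
    index = bisect_right(chorus_starts, position)
    return chorus_starts[index % len(chorus_starts)]


def next_section(section_starts, position):
    """Start of the next section after `position`.

    Assumption: after the last section the cursor stays where it is.
    """
    index = bisect_right(section_starts, position)
    if index < len(section_starts):
        return section_starts[index]
    return position


def prev_section(section_starts, position):
    """Start of the section before the one containing `position`.

    Assumption: inside the first section, jump to that section's start.
    """
    current = bisect_right(section_starts, position) - 1
    if current < 0:
        return position  # cursor lies before the first known section
    return section_starts[max(current - 1, 0)]


# Hypothetical analysis result for one song (start times in seconds, sorted).
section_starts = [0.0, 15.0, 45.0, 75.0, 105.0, 135.0, 165.0]
chorus_starts = [45.0, 105.0, 165.0]

cursor = 10.0
cursor = next_chorus(chorus_starts, cursor)   # first chorus
cursor = next_chorus(chorus_starts, cursor)   # second chorus
cursor = next_chorus(chorus_starts, cursor)   # final chorus
cursor = next_chorus(chorus_starts, cursor)   # wraps back to the first chorus
```

Keeping the start times sorted lets each button press be answered with a single binary search, so the jump can feel instantaneous regardless of song length.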
CONTRIBUTION: |
The main contribution of this research is a novel music-playback interface, SmartMusicKIOSK, proposed in light of the fact that conventional playback-operation buttons on CD players and media-player software have not been improved for a long time. One of the innovations brought by CD players was to enable a listener to immediately skip a song (track) of no interest, i.e., to "listen to any track of a CD whenever one likes." I believe that SmartMusicKIOSK brings about a similar innovation at a different level: it enables a listener to immediately skip a structural section (part) of no interest, i.e., to "listen to any part of a song whenever one likes," without having to follow the timeline of the original song. I hope this research opens up new vistas for future research that reexamines the entire functional makeup of music-playback interfaces to make interaction between people and music more active and enriching.
ACKNOWLEDGMENTS: |
This research utilized the RWC Music Database "RWC-MDB-P-2001" (Popular Music).