ASA PRESSROOM

147th ASA Meeting, New York, NY



SmartMusicKIOSK: Music Listening Station with Chorus-Search Function

Masataka Goto - asa.m.goto@m.aist.go.jp
National Institute of Advanced Industrial Science and Technology (AIST)
1-1-1 Umezono, Tsukuba, Ibaraki 305-8568, JAPAN

Popular version of paper 2pMU6
Presented Tuesday afternoon, May 25, 2004


NEWS:

"You can jump to the 'hook' of a song with just a push of a button!"

I developed SmartMusicKIOSK, a new music-playback interface for music-listening stations. In music stores, customers often search for the chorus or "hook" of a song by repeatedly pressing the fast-forward button rather than listening passively from start to finish. Current technology does not support this activity well. By automatically analyzing the audio of a song, SmartMusicKIOSK gives listeners a function for jumping to the chorus section and other key parts of the song, plus a function that lets them see how the song is divided into parts. These functions eliminate the hassle of searching for the chorus and make it easier to find desired parts of a song, thereby encouraging an active listening experience.

An interface that lets a listener look for a section of interest by changing the playback position is useful not only for music-listening stations but also more generally for selecting and using music. The proposed functions are built on my automatic audio-based chorus-section detection method, and implementing them in a listening station has demonstrated their usefulness. Conventional music-playback interfaces let a listener skip entire songs of no interest; SmartMusicKIOSK is the first interface that lets the user easily skip sections of no interest even within a song.

SmartMusicKIOSK implemented on a tablet PC.


BACKGROUND:

Most music stores have listening stations that allow customers to hear compact discs (CDs) on a trial basis. In general, the main objective of listening to music is to appreciate it, and at home or in a car a listener commonly plays a musical selection from start to finish. In trial listening, however, the objective is to determine quickly whether a selection is the music one has been looking for and whether one likes it, so listening to a full selection is rare. In the case of popular music, for example, customers often want to hear the most representative, uplifting part of a song, which is generally the chorus or refrain, in order to judge that song. This desire produces a special way of listening: the trial listener first listens briefly to a song's "intro" and then jumps ahead in search of the chorus by pushing the fast-forward button repeatedly, eventually finding it and listening to it.

The functions provided by conventional listening stations for music CDs, however, do not support this way of trial listening. These stations are equipped with the playback-operation buttons of an ordinary CD player, and of these only the fast-forward and rewind buttons can be used to find the chorus section of a song. Digital listening stations, which have recently been installed in CD stores, play musical selections from a hard disk or over a network. Here, however, only one part of each selection (e.g., the beginning, typically about 30 or 45 seconds) is mechanically excerpted and stored, so a trial listener will not necessarily hear the chorus section.

Against this background, I developed SmartMusicKIOSK, with which a trial listener can jump to the beginning of a song's chorus (an instantaneous fast-forward to the chorus) simply by pushing the button for this function.


INTERFACE:

SmartMusicKIOSK screen display (snapshot of RWC-MDB-P-2001 No. 18).

Some music cannot be appreciated without taking the time to listen to it, so the problem is how to let listeners move between specific playback positions before they have actually listened to the song. To solve this problem, I propose the following two functions, assuming popular music as the main target.

  1. "Jump to chorus" function: Automatic jumping to the beginning of sections relevant to a song's structure (lower window)
    The structure of a song is automatically analyzed beforehand, and functions are provided for jumping automatically to sections likely to interest listeners. These functions are "jump to chorus" (NEXT CHORUS button), "jump to previous section in song" (PREV SECTION button), and "jump to next section in song" (NEXT SECTION button). With these functions, a listener can jump directly to chorus sections and listen to them, or jump to the previous or next repeated section of the song.
  2. "Music map" function: Visualization of song contents (upper window)
    A function is provided to enable the contents of a song to be visualized to help the listener decide where to jump next. Specifically, this function provides a visual representation of the song's structure consisting of chorus sections and repeated sections (results of automatic chorus-section detection). In the above figure, the horizontal axis is the time axis covering the entire song; the top row shows chorus sections, the five lower rows show repeated sections, and the bottom thin horizontal bar is a playback slider. On each row, colored sections indicate similar (repeated) sections. For example, the bottom row with two short colored sections indicates the similarity between the "intro" and "ending" of this song.
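The layout of the music map can be illustrated with a small text-only sketch in Python. This is not the SmartMusicKIOSK display code; the function names (`render_row`, `render_music_map`), the character-cell rendering, and the interval format (start and end in seconds) are all assumptions made for illustration. It draws one row for chorus sections, one row per group of repeated sections, and a slider row marking the playback position.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_sec, end_sec); hypothetical format


def render_row(intervals: List[Interval], duration: float, width: int = 40) -> str:
    """Draw one row: '#' where a section is present, '.' elsewhere."""
    cells = ["."] * width
    for start, end in intervals:
        first = int(start / duration * width)
        # Always mark at least one cell so very short sections stay visible.
        last = max(first + 1, int(end / duration * width))
        for i in range(first, min(last, width)):
            cells[i] = "#"
    return "".join(cells)


def render_music_map(
    chorus: List[Interval],
    repeated_rows: List[List[Interval]],
    duration: float,
    position: float,
    width: int = 40,
) -> str:
    """Top row: chorus sections; lower rows: repeated sections; last row: slider."""
    lines = ["chorus  " + render_row(chorus, duration, width)]
    for n, row in enumerate(repeated_rows, start=1):
        lines.append(f"repeat{n} " + render_row(row, duration, width))
    cursor = min(width - 1, int(position / duration * width))
    lines.append("slider  " + "-" * cursor + "|" + "-" * (width - cursor - 1))
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up section times for a 200-second song.
    print(render_music_map(
        chorus=[(60, 90), (130, 160)],
        repeated_rows=[[(20, 60), (90, 130)], [(0, 10), (190, 200)]],
        duration=200,
        position=75,
    ))
```

The second repeated row in the example mimics the case described above, in which two short sections at the start and end of the song indicate that the "intro" and "ending" are similar.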


TECHNOLOGY:

To enable the handling of a large number of songs, this research aims for a general and robust chorus-section detection method using no prior information on acoustic features unique to choruses. To this end, I focus on the fact that chorus sections are usually the most repeated sections of a song and adopt the following basic strategy: find sections that repeat and output those that appear most often. It must be pointed out, however, that it is difficult for a computer to judge repetition because it is rare for repeated sections to be exactly the same.
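The basic strategy (find sections that repeat, then output those that appear most often) can be sketched in Python as follows. This is a deliberately simplified illustration, not the detection method itself: it assumes the song has already been cut into sections, each summarized by a feature vector, whereas the actual method must also find the section boundaries. The helper names, the cosine similarity, and the 0.9 threshold are all assumptions. Using a similarity threshold rather than exact equality reflects the point that repeated sections are rarely identical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def group_similar(sections: List[Sequence[float]], threshold: float = 0.9) -> List[List[int]]:
    """Greedy grouping: a section joins the first group whose first member it resembles."""
    groups: List[List[int]] = []
    for i, feat in enumerate(sections):
        for group in groups:
            if cosine(sections[group[0]], feat) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


def most_repeated(sections: List[Sequence[float]], threshold: float = 0.9) -> List[int]:
    """Indices of the sections in the largest group (the chorus candidates)."""
    return max(group_similar(sections, threshold), key=len)


if __name__ == "__main__":
    # Toy features: intro, verse, chorus, verse (varied), chorus (varied), chorus (varied).
    toy = [
        [1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
        [0.0, 0.95, 0.1],
        [0.05, 0.0, 1.0],
        [0.0, 0.1, 0.98],
    ]
    print(most_repeated(toy))
```

In the toy data, the three chorus-like sections differ slightly from one another, yet they still fall into one group, which is larger than the verse group and is therefore reported.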

I therefore developed a method that overcomes this difficulty and automatically detects the beginning and end points of chorus sections and repeated sections in compact-disc recordings of popular music. Most previous methods detected as a chorus a repeated section of a given length and had difficulty in identifying both ends of a chorus section and in dealing with modulations (key changes). By analyzing relationships between various repeated sections, my method can detect all the chorus sections in a song and estimate both ends of each section. It can also detect modulated chorus sections by introducing a similarity measure that enables modulated repetition to be judged correctly. Experimental results with a popular-music database show that this method detects the correct chorus sections in 80 of 100 songs.
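One way a similarity measure can tolerate modulation is sketched below in Python. The text above does not specify the measure, so everything here is an assumption for illustration: each section is represented by a 12-element pitch-class vector (one bin per semitone of the octave), a key change of s semitones is modeled as a circular shift of that vector by s bins, and two sections are compared under all 12 shifts. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def transpose(v: Sequence[float], semitones: int) -> List[float]:
    """Shift a 12-bin pitch-class vector up by `semitones` (circularly)."""
    v = list(v)
    s = semitones % len(v)
    return v[-s:] + v[:-s] if s else v


def modulation_tolerant_similarity(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Best similarity between b and any transposition of a.

    Returns (similarity, s), where s is the upward shift in semitones
    applied to a that matches b best.
    """
    return max((cosine(transpose(a, s), b), s) for s in range(12))


if __name__ == "__main__":
    # Toy vectors: energy on the notes of a C major triad and a D major triad.
    c_major = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
    d_major = [0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
    print(cosine(c_major, d_major))                          # plain comparison
    print(modulation_tolerant_similarity(c_major, d_major))  # comparison over all shifts
```

In the toy example the two vectors share no pitch classes, so a plain comparison treats them as unrelated, while the shift-aware comparison recognizes the second as the first raised by two semitones.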


SCREEN SNAPSHOTS:

"NEXT CHORUS" button (jump to chorus)
(A-1) When a user pushes the "PLAY" button, SmartMusicKIOSK starts playing.
(A-2) SmartMusicKIOSK keeps playing.
(A-3) When the user pushes the "NEXT CHORUS" button, SmartMusicKIOSK jumps from the present cursor position to the start of the next chorus section in the song.
(A-4) When the user pushes the "NEXT CHORUS" button again, it jumps to the chorus section after that one.
(A-5) When the user pushes the "NEXT CHORUS" button once more, it jumps to the following chorus section.
(A-6) When the user pushes the "NEXT CHORUS" button after the final chorus section, it returns to the first chorus section.

"NEXT SECTION" button (jump to next section in song)
(B-1) When a user pushes the "PLAY" button, SmartMusicKIOSK starts playing.
(B-2) SmartMusicKIOSK keeps playing.
(B-3) When the user pushes the "NEXT SECTION" button, SmartMusicKIOSK jumps from the present cursor position to the start of the next repeated section in the song.
(B-4) When the user pushes the "NEXT SECTION" button again, it jumps to the repeated section after that one.
(B-5) When the user pushes the "NEXT SECTION" button once more, it jumps to the following repeated section.

"PREV SECTION" button (jump to previous section in song)
(C-1) When a user pushes the "PLAY" button, SmartMusicKIOSK starts playing.
(C-2) SmartMusicKIOSK keeps playing.
(C-3) When the user pushes the "PREV SECTION" button, SmartMusicKIOSK jumps from the present cursor position to the start of the previous repeated section in the song.
(C-4) When the user pushes the "PREV SECTION" button again, it jumps to the repeated section before that one.
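The button behavior walked through above can be sketched in Python as three small lookups over sorted lists of section start times. This is only an illustration under stated assumptions, not the SmartMusicKIOSK implementation: the function names are hypothetical, start times are in seconds, and the lists are assumed to be sorted and non-empty. The wrap-around of NEXT CHORUS follows step (A-6); what the SECTION buttons do at the very start or end of a song is not described above, so staying at the current position is an assumption.

```python
from bisect import bisect_left, bisect_right
from typing import List


def next_chorus(chorus_starts: List[float], position: float) -> float:
    """Start of the first chorus after `position`; wraps to the first chorus (A-6)."""
    i = bisect_right(chorus_starts, position)
    return chorus_starts[i % len(chorus_starts)]


def next_section(section_starts: List[float], position: float) -> float:
    """Start of the first repeated section after `position`."""
    i = bisect_right(section_starts, position)
    # Assumption: with no later section, the playback position is left unchanged.
    return section_starts[i] if i < len(section_starts) else position


def prev_section(section_starts: List[float], position: float) -> float:
    """Latest section start strictly before `position`."""
    i = bisect_left(section_starts, position)
    # Assumption: with no earlier section, the playback position is left unchanged.
    return section_starts[i - 1] if i > 0 else position


if __name__ == "__main__":
    choruses = [60.0, 120.0, 200.0]      # made-up chorus start times
    sections = [0.0, 30.0, 60.0, 90.0]   # made-up repeated-section start times
    pos = 10.0
    for _ in range(4):                   # four presses of NEXT CHORUS
        pos = next_chorus(choruses, pos)
        print("NEXT CHORUS ->", pos)
    print("PREV SECTION ->", prev_section(sections, 45.0))
```

Because `prev_section` looks for a start strictly earlier than the current position, pressing the button repeatedly keeps stepping backward instead of returning to the same section start.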


CONTRIBUTION:

"SmartMusicKIOSK makes the music listening experience more active and interactive."

The main contribution of this research is a novel music-playback interface, SmartMusicKIOSK, proposed in light of the fact that the playback-operation buttons on CD players and media-player software have not been improved for a long time. One of the innovations brought by CD players was to let a listener immediately skip a song (track) of no interest, i.e., "listen to any track of a CD whenever one likes." I believe that SmartMusicKIOSK brings about a similar innovation at a different level: it lets a listener immediately skip a structural section (part) of no interest, i.e., "listen to any part of a song whenever one likes," without having to follow the timeline of the original song. I hope this research opens up new vistas for future research that reexamines the entire functional makeup of music-playback interfaces to make interaction between people and music more active and enriching.


ACKNOWLEDGMENTS:

This research utilized the RWC Music Database "RWC-MDB-P-2001" (Popular Music).

