Some Thoughts on Cross-Cultural Video Game and Music

This was originally written on September 15, 2020 and published on LinkedIn. I’ve decided to move the posts here for better searchability. Those who know me well know that I am a video game history researcher, and this is just one of my assorted published articles.

In his aptly named journal article ‘Are Video Games Art?’ (Smuts, 2005), Aaron Smuts addresses whether video games can accurately be defined as art in both a philosophical and an aesthetic sense. While ultimately declaring that video games are “an art or sister art of the moving image” (Smuts, 2005, p. 4), Smuts concedes that although some video games may be considered art, not all video games fall into this category. He nevertheless concludes that “several recent games have reached levels of excellence that exceed the majority of popular cinema” (Smuts, 2005, p. 12). Likewise, Henry Jenkins in his essay ‘Art Form for the Digital Age’ (Jenkins, 2000) asserts that video games are not simply an extension of film but are also an influence on contemporary cinema, and that they are “an emerging art, [and] a largely unrecognized art, but art nevertheless” (Jenkins, 2000, p. 1). Given that video games may be considered a sister art of the moving image, and given that the two share similar digital formats, it is not unreasonable to declare that video games, and the technological development of video games, have been inspired by cinema and, as both Smuts and Jenkins put it, have in some cases gone beyond contemporary cinema techniques. Indeed, for a medium which has all of the same aspects as film, such as moving images and audio, and more besides (due to interactivity), it would be hard to argue that video games do not in some respect share a likeness with film, or that their development has not been inspired by it. Likewise, video games have historically been treated with the same discretion as film: both have been censored from within their respective hegemonic industries, with the Hays Code of the American film industry (Black, 1989) on one side and Nintendo’s own “game development guidelines” (Stuckey, 2016, p. 114), which have been directly compared to the Hays Code (Donovan, 2010), on the other, and both have been censored from outside the industry through government regulation. It naturally follows that video games are largely comparable to films. Video games have followed film not only in an ontological sense but also along separate parallel strains: film genre to video game genre, film stories to video game stories, and film conventions to video game conventions (Arsenault, 2009). The development of music and sound for video games has followed suit, with various conventions of film sound and music finding their way into video games, albeit with a slightly different history owing to the limits of technical capability. Given that the conventions of diegetic film sound (such as spoken dialogue) were largely emulated by written dialogue in video games because of technological barriers, it is no surprise that ‘music’, in the sense of non-diegetic background melodies and theme songs, has historically been the main focus of video game audio. In this essay, I will argue that because of the limited technological affordances of early video games, it is both necessary and appropriate to treat in-game written dialogue as a ‘sound’. Using the Japanese video game ‘Moeru! Oniisan’ (Toho, 1989) as an example, I will evaluate how early video games tackled the issue of sound both by borrowing techniques developed by film and by creating substitutes, and how the localization of video game music often involved re-development or re-scoring rather than just dubbing. In comparison, I will explore how recent technological developments have allowed sound to be fully realized in video games, which has paved the way for the dubbing practices of cinema to be carried over to video game sound design.

Both films and video games have, within their respective generations, undergone dramatic changes in their production, in particular in their development of sounds, voices, and music. Film, which saw its first major synchronized-sound release with The Jazz Singer (Crosland, 1927), largely focused both on natural sound devices appropriate to the story being told (that is, diegetic sounds such as talking) and on non-diegetic musical devices such as film scores, which hold the story together with “motifs, themes, harmonics, [and] textures” (Reyland, 2012, p. 58). Music in early video games, however, focused only on the non-diegetic sounds and music stemming from early film music, such as background melodies and theme songs, while the technology afforded only basic boops and beeps for dynamic diegetic sound. For example, the leitmotifs of early films can be directly compared to the leitmotifs of various video game franchises, such as The Legend of Zelda (Nintendo, 1986), which uses them to create a “returning hero” (Walch, 2017, p. 8) effect. Clear diegetic dialogue was not realized. Although early video games do follow film in some of their uses of sound, such as looping audio, which Collins (2008, p. 34) notes was used in early film music to “create ambience” and “reduce the cost of production”, diegetic sound in early video games was limited by technology and largely had to be improvised through methods that were not developed for film. Up until 1978, video game music was largely static; the release of Space Invaders (Taito, 1978) brought continuous sound that depended on what was happening on-screen to the mainstream (Collins, 2008). Although there are examples of early video game sound practices following the early days of film, such as the use of mickey-mousing in Super Mario Bros. (Nintendo, 1985), which provided auditory feedback to the physical movements of the player (Whalen, 2004), perhaps the most important aspect of any visual story-telling was still technologically impossible: speech and dialogue. Given that spoken dialogue was not available in video games for many years, developers had to improvise and create medium-specific approaches to discourse. Ultimately, most video games opted for written dialogue. Instead of hearing voices from the speakers of whatever platform was being used, players were required to read off the screen; mental enunciation rather than verbal enunciation was used. Despite being text-based, however, it is not completely correct to say that dialogue transcribed on the screen was not audio-based. Sounds of mumbling, or other noises similar to those of a typewriter, were added to games and played while the text was being written to the screen, and moving faces that emulated real talking were sometimes part of the formula as well. Although there was no real in-game talking, the embodiment of such talking was created; players, despite not hearing specific words, were still ‘listening’ to ‘talking’, and therefore, to ‘listen to a game’, we must read text as well. The sonic culture of early video games, therefore, is not purely that of sounds, but also of text. Ultimately, it is clear that non-diegetic sound in early games was developed with cinema in mind, whereas diegetic sound methods, especially in regard to speech, had to be developed independently.

Now that the methods by which early video games handled sound have been established, it is possible to explore how cultural exchange was achieved in early video game sound, and to compare it to that of film. It is important to note that if in-game text is a ‘sound’ (in the sense that we, the player or audio-viewer, create sounds within our head based on what we see on the screen), then the translation of in-game text into another language must be an act of dubbing. Indeed, as Dwyer (2017, p. 27) considers, subbing and dubbing may be distinguished by their action: “dubbing substitutes whereas subtitling adds”. Likewise, Dwyer (2017, p. 28) comments that criticism of subtitles (subbing) revolves around “distortion or misrepresentation, disruption and elitism”. Given that the translation of in-game text substitutes for the original text, and neither disrupts nor distorts the translated game, it would be wrong to call the changing of in-game text subtitling.

The localization of both film and video games has traditionally run into the same obstacles. Video games developed in one country may have no symbolic meaning in the country next door, while films directed in one country may likewise have no meaning in another (or may carry a meaning that it is desirable to change). Despite this, as Dwyer (2017) says, the largest cultural barrier of film is generally considered to be language. To combat this, dubbing has largely been used in the western world to introduce audiences to foreign films. In early video games, however, “dubbing” was not the only major means of localization; games were also completely re-produced and re-developed for overseas audiences. Many early video game franchises have their history rooted in Japan, both physically through their development and culturally through their design. As Mangiron (2012) has explored, early video games were often modified to fit the cultural expectations of their respective markets, notably North America, when they were released. Although this included translating in-game text into English and sometimes, for smaller games, into French, Italian, German, and Spanish (FIGS), much as is done for films, as well as changing code to counter the difference between the NTSC and PAL formats of game consoles, which either sped up or slowed down in-game audio (sometimes unsuccessfully) (game.waiting, n.d., para. 1), it was not uncommon for games to be completely redesigned for the purpose of “toning down the traces of Japanese culture in order to make their products more appealing to a western audience” (Mangiron, 2012, p. 9) or to change storylines which publishers thought would be “distasteful outside of Japan – or outright banned” (Altice, 2015, p. 114). One of the most well-known examples of this is the original American release of ‘Super Mario Bros. 2’ (Nintendo, 1988), which, instead of being developed as a Mario game, was a previously published game re-developed to look like a Mario game (Donovan, 2010). In the context of audio, the Japanese-made game ‘Moeru! Oniisan’ (Toho, 1989) shows that the localization of a Japanese video game involved not only ‘dubbing’ (that is, the changing of in-game text) but also major changes to its non-diegetic sound. Released in Japan with a name that translates into English as ‘Burn! Older Brother’, the game was based on a Japanese anime series of the same name, with the story-line following various characters from the show. Both the iconography and the sonic design of the game were developed to resemble the show, under license from the series publisher. When the game was released in America, however, it underwent re-development and was aptly renamed ‘Circus Caper’. Not only were all of the characters changed to circus characters (clowns, magicians, etc.), but both the diegetic and non-diegetic sounds were changed as well. The background music was changed to play in B major rather than the D major that the Japanese version used, and some of the mini-games included child-friendly jingles rather than low-pitched ambient noise (Circus Caper/Regional Differences, n.d., para. 10). At first it may seem simple to differentiate the sounds of the two games by their difference in in-game text (and therefore in their ‘diegetic dialogue’), but, as Michel Chion discusses in relation to auditory perception (Chion, 2019), a reduced listening of the different versions reveals the important effect that the change in pitch has on the player. Although at first it may not seem to make much difference, this change in pitch effectively gives new meaning to the sound; a dark D major in the Japanese version is replaced with a happy B major in the American version, and therefore the whole meaning is changed. Dwyer (2017, p. 34) does note a similar case in which the localization of a film affected the specific reduced traits of the audio: in the re-release of La Jetée (Marker, 1962), “reviewers [...] noted that the music and special effects [were] ‘mixed far lower’ in this version, reducing the dramatic effect of the final scene in particular”. However, since this was not a deliberate change to the audio, it is not directly comparable to the case of video games. Given that it is not only the dialogue of ‘Moeru! Oniisan’ that was changed but, intentionally, the whole sound design, we can see that the game did not follow the simple dubbing procedures that film uses; instead of ‘a Japanese game made with English dialogue’, it is now a whole new ‘English game with English dialogue’, largely devoid of its original meaning.

As video game systems have become more powerful and the technological constraints that inhibited audio and sound have largely been lifted, video games have begun to include more complicated sounds in their sonic makeup. Full-fledged audio capabilities have been afforded, and sound design is now an important part of the creation of video games (Mangiron, 2013). Instead of ones and zeros, sound design involves various procedures which surfaced from film, such as the use of Foley, field recording, and complicated digital composition (Alten, 2013). Unlike classic systems, which were limited in the number of simultaneous tracks they could use, modern video games treat audio as collections in a library, which may be recorded in advance and played on command (Alten, 2013). Following this, on-screen dialogue has been pushed into the audio-sphere, and players now listen, literally, to dialogue. They are now both viewers and proper audio-viewers; true auditory retentive listening (Collins, 2008) can now be achieved, such as through remembering what an in-game character has said in exposition. With these technological changes, the ability to provide comprehensive dubbing, such as in films, has also been introduced. It has naturally followed that the dubbing practices developed for film are now an important part of the localization process of games. However, as Mangiron (2013) found, subbing for the most part has not been realized in games as it has in film, and guidelines have not been established in the industry. Unlike film, which has incorporated guidelines for subbing (Mangiron, 2013) to ensure that audiences can receive information properly, video games have largely neglected to standardize such processes, and games that do have subtitles either make their own rules or, as Fernández (2007) found in the case of one prominent video game that displayed 91 characters for only 4.6 seconds, make no rules at all. Whereas it is still possible to completely re-develop a video game to incorporate different graphics or sound design for different regions, it is no longer common to do so. Instead, video games are localized through the dubbing of dialogue, with their original story-lines kept largely intact and their adaptation faithful. In respect to film, as various researchers have identified, dubbing is largely considered by the public to be less authentic to the original story (Dwyer, 2017; Szarkowska, 2005; Mera, 1999) because the alteration of the audio removes information that is encoded, in the semantic sense (Hall, 2001), into the film. However, perhaps surprisingly, as Dwyer (2017, p. 43) notes of Mera’s work, it has been found that dubbing, not subbing, provides the most authentic representation of an original film’s experience: subtitles “reduce ‘original’ dialogue by around 50 per cent and are hence less respectful and less faithful than dubbing” (Dwyer, 2017, p. 43). This is also echoed in video games, where research has shown that “localizing a video game improves the game experience” (Esser, Smith & Bernal-Merino, 2016, p. 196), and that players actually prefer dubbed games over subbed games, “as they can get easily immersed [in the game]” (Esser, Smith & Bernal-Merino, 2016, p. 185). Ultimately, modern video game localization practices in respect to sound design have followed those of film only in the practice of dubbing. Much like film, where work can be published to ‘international’ audiences through the translation and replacement of spoken discourse, video games are offered to non-native-speaking audiences through this method as well. Subbing, however, has largely been ignored by the games industry and has not made a considerable impact on the medium.

As has been shown in this essay, the way in which video games adapted audio, and tackled transnational releases, cannot be completely compared to that of film. Since the in-game text of early video games may be considered a sound, it follows that, in early video games, non-diegetic sound conventions followed film through the use of melodies and background music, while diegetic sound was developed independently of the conventions of film. Likewise, as has been evaluated, the transnational sound practice of subbing for diegetic sounds was not used in early video games because of their technological affordances, while dubbing and the re-development of games, as shown in the case of ‘Moeru! Oniisan’, were used for diegetic and non-diegetic sounds respectively. In newer video games, dubbing has continued to be the prominent form of translation between cultures, which partially follows film, although games have stopped short of utilizing subbing as a major translation technique as film has.

References

Alten, S. (2013). Audio in media. Syracuse, America: Cengage Learning.

Altice, N. (2015). I Am Error. Massachusetts, America: The MIT Press.

Arsenault, D. (2009). Video game genre, evolution and innovation. Eludamos. Journal for computer game culture, 3(2), 149-176.

Black, G. (1989). Hollywood Censored: The Production Code Administration and the Hollywood Film Industry, 1930-1940. Film History, 3(3), 167-189.

Chion, M. (2019). Audio-vision: sound on screen. Columbia University Press.

Circus Caper/Regional Differences - The Cutting Room Floor. (n.d.). Retrieved May 30, 2020, from https://tcrf.net/Circus_Caper/Regional_Differences

Collins, K. (2008). Game Sound. Massachusetts, America: MIT Press.

Crosland, A. (1927). The Jazz Singer [Film]. Hollywood: Warner Bros.

Donovan, T. (2010). Replay: The history of video games. Yellow Ant.

Dwyer, T. (2017). Sub/Dub Wars: Attitudes to Screen Translation. In T. Dwyer (Ed.), Speaking in Subtitles: Revaluing Screen Translation (pp. 19-51). Edinburgh: Edinburgh University Press.

Esser, A., Smith, I. R., & Bernal-Merino, M. Á. (Eds.). (2016). Media across borders: Localising TV, film and video games. Routledge.

Fernández, A. (2007). Anàlisi de la localització de Codename: Kids Next Door–Operation VIDEOGAME [Analysis of the localisation of Codename: Kids Next Door–Operation VIDEOGAME]. Tradumàtica, 5, 1-7.

game.wiring. (n.d.). Retrieved 30 May 2020, from https://gw.eternal.dk/pal

Hall, S. (2001). Encoding/decoding. Media and cultural studies: Keyworks, 2.

Jenkins, H. (2000). Art form for the digital age. Technology Review-Manchester NH-, 103(5), 117-120.

Mangiron, C. (2012). The Localisation of Japanese Video Games: Striking the Right Balance. The Journal Of Internationalization And Localization, 2(1), 1-20. doi: 10.1075/jial.2.01man

Mangiron, C. (2013). Subtitling in game localisation: a descriptive study. Perspectives, 21(1), 42-56. doi: 10.1080/0907676x.2012.722653

Mera, M. (1999). Read my lips: Re-evaluating subtitling and dubbing in Europe. Links & Letters, 73-85.

Moeru! Oniisan [Computer software]. (1989). Tokyo: Toho.

Reyland, N. (2012). The Beginnings of a Beautiful Friendship?: Music Narratology and Screen Music Studies. Music, Sound & the Moving Image, 6(1), 55-71.

Smuts, A. (2005). Are Video Games Art? Contemporary Aesthetics, 35(1), 1-15.

Space Invaders [Computer software]. (1978). Tokyo: Taito.

Stuckey, H. (2016). Remembering Australian videogames of the 1980s: what museums can learn from retro gamer communities about the curation of game history (Doctoral dissertation, Flinders University).

Super Mario Bros. 2 [Computer software]. (1988). Tokyo: Nintendo.

Super Mario Bros. [Computer software]. (1985). Tokyo: Nintendo.

Szarkowska, A. (2005). The power of film translation. Translation Journal, 9(2).

Whalen, Z. (2004). Play Along - An Approach to Videogame Music. Game Studies: The International Journal of Computer Game Research, 4(1). Retrieved from http://www.gamestudies.org/0401/whalen/

Walch, B. (2017). The Legend of Zelda and Leitmotif: Backtracking in an Open World (Master’s thesis, Utrecht University).
