  • This is Animalese.

  • KK Slider: [unintelligible babble]

  • It's the pseudo-language of Animal Crossing. This is what

  • it sounds like in the Japanese version:

  • KK Slider: [unintelligible babble]

  • And here's what it sounds like in the

  • English version:

  • KK Slider: [slightly deeper unintelligible babble]

  • They sound different, which is weird because it's supposed to

  • be nonsense, right?

  • So... why does Nintendo dub KK Slider?

  • To understand we need to

  • do a taxonomy of all the ways games have tried to represent -- or avoid representing --

  • human speech.

  • In the beginning, there was the Word, and the Word was:

  • Oop! That... was supposed to say voice synthesis.

  • Early attempts at adding audio to games were a

  • mix of pre-recorded voices and genuine voice synthesizers. But they were mostly

  • gimmicky, expensive add-ons. Voice chips made more sense in arcade machines

  • because they were already a huge investment of space and money -- but it was

  • still a technical struggle to get them to work. Like Q*bert, known for his mad ups

  • and foul mouth, this drop of Tang was originally supposed

  • to speak English instead of:

  • Q*bert: [garbled synthetic phonemes]

  • but audio engineer David Thiel couldn't get the voice chip to

  • produce the sounds he was hoping for. So instead of continuing to mess with it,

  • he just said [Q*bert garble curse] and had it string together some incoherent phonemes instead.

  • Thiel, like many designers that followed, came to the conclusion that

  • human voices just weren't worth the fuss. Other developers opted for a style

  • that's entirely unique to video games, and I looked around but I couldn't find

  • a single definitive phrase used to describe this style -- which I think speaks

  • to how much we take it for granted, even though it is super weird.

  • I'm talking about using nonsensical sound effects to stand in for language, or simply put

  • [slow, low beeps that appear in time with the words]

  • [very high-pitched piercing beeps that appear in time with the words]

  • [sharp, high-pitch beeps that appear in time with the words]

  • for the purpose of this video -- and because it's cool to name things --

  • I'm going to call this beep speech.

  • The earliest examples of beep speech I could

  • find were in JRPGs like Star Arthur Legend - Planet Mephius [short pattern of beeps]

  • and Legend of Zelda [mid-pitch beeps]

  • Some American games used a similar trope of mimicking on-screen text, but

  • it's not meant to stand in for a voice so it's not quite the same.

  • That distinction is important because of beep speech's peculiar function; games that

  • use beep speech slowly reveal text and accompany each word with audio, which

  • makes the player process information as if they were really listening to somebody speak.

  • It's not a straight info-dump; it replicates the act of listening,

  • which makes it easier to stay engaged with the written text. That's

  • assuming you enjoy listening to bebe bebe be bebeep which is a great weakness

  • of the beep speech of the cartridge era. Because audio capabilities were still

  • limited, most games used the same beep for every character in every situation. Later

  • games - including Animal Crossing - could pitch the beeps higher or lower, and that

  • really helped spice things up.

  • Then there were games like Star Fox which gave each

  • character a different kind of "voice" so you could easily distinguish your kind

  • friend Slippy [synthetic sounds similar to frog croaks]

  • from that no-good hotshot Falco [deeper babble]

  • These were synthetic voices and

  • total nonsense with no real association with the text. Another strategy was to

  • use vocal grunts -- things like sighs and yells and other non-language forms

  • of communication. These were great for adding variety,

  • conveying emotion, and giving a character a voice without giving them language.

  • Although they use different strategies, Star Fox and Ocarina of Time have two

  • weird things in common:

  • first of all, both have friendly frogs that never get their due.

  • [rhythmic frog croaking]

  • Second of all, both have English language lines even in the original Japanese versions.

  • Navi in Ocarina of Time: Hello!

  • The [GOOD LUCK] and [HEY, LISTEN] were the same in every

  • version of the games. And that points to one of the biggest strengths of beep speech

  • and vocal grunts: you DON'T have to translate them. A shiver is a shiver

  • in every language.

  • Link in Ocarina of Time: [shivers]

  • Localizing a game was - and is - a huge expenditure of time and

  • money, which makes these non-voice options the perfect replacement for

  • voice lines. Quality localization is basically a requirement for most games

  • now, but the 90s and early 2000s were a dark time for translations and voice

  • acting alike, leaving us with such gems as:

  • Dracula in Castlevania: What is a man?!

  • Barry in Resident Evil: A Jill Sandwich!

  • And that's when localization happened at all - sorry EarthBound fans. During this period,

  • beep speech was usually a stand-in for a real language. But Banjo-Kazooie made a

  • huge innovation in that their gibberish was... just what it was.

  • Bottles: [a gentle honking]

  • Like Star Fox, the characters

  • had distinct voices - but they weren't synthetic. They were powered by

  • real human pipes, which is wild because it's human voices replicating a

  • synthetic style, that was made to replace human voices, like an aural ouroboros.

  • An auralboros.

  • Plenty of games of this era had full voice acting... but they weren't

  • on the N64. Nintendo's insistence on using cartridges would continue to

  • restrict their options for speech representation. On other consoles, games

  • became more invested in the cinematic experience of having characters say real shit.

  • Which meant a greater investment in voice acting. That caused a split in style where beep speech,

  • previously just fine for serious stories, came to represent a more lighthearted

  • cartoonish feel.

  • Mushi's Mama in Okami: [soft mid-tone beeps]

  • By the early 2000s, a trend emerged of

  • entirely fictional spoken languages. Whereas beep speech stood in for the

  • player's native speech, these constructed languages were more about making certain

  • characters and settings appear foreign -- while still empowering the player to

  • understand what they're saying. It's during this period that

  • both Animal Crossing and The Sims arrived.

  • Simlish actually predates The Sims; it first

  • appeared in SimCopter. But for The Sims, the team at Maxis knew they'd need

  • something more elaborate. Because the game was so much about the human

  • condition, they wanted to communicate emotions which would encourage players

  • to connect with their creations. Plus the practical considerations - anything that's

  • comprehensible can become repetitive, and having a huge scroll of dialogue meant

  • writing, translating, and redubbing a huge scroll of dialogue. Following the style

  • of Banjo-Kazooie, they captured the real human voices of two improvisers and then

  • spent a year remixing that audio to become the perfect blend of nonsense.

  • Sim 1: Dag dag aulf, Sim 2: Anamana blastamana

  • But that strategy can't work with every franchise. Animal Crossing had different

  • intentions and different styles, and so they needed a different approach. When

  • you hear Animalese for the first time, it sounds a lot like a standard

  • voice synthesis. But KK Slider is actually saying REAL WORDS.

  • Here he is slowed down:

  • KK Slider: [deeper and slower than normal, words that match the text box identifiable]

  • The synthetic voice doesn't exactly nail the pronunciation of each word, but

  • that works to its advantage; once it's sped back up, it's even harder to tell

  • that KK is speaking English. Dōbutsu no Mori, the original N64 Japanese

  • version of Animal Crossing, features Animalese in Japanese. Region-specific

  • Animalese is also the default language in New Leaf... but not Wild World or City

  • Folk. Instead they use a pretty standard sounding voice synthesis called Bebebese.

  • That's because Animal Crossing was never intended to be localized. In fact,

  • Nintendo didn't localize the first version of Animal Crossing; the American

  • release was based on the updated GameCube game Dōbutsu no Mori+.

  • Members of the Nintendo Treehouse had to advocate for it to be translated,

  • partially because they had already gotten addicted to playing it. Because

  • they never intended to localize the game, Nintendo included a lot of specific

  • Japanese cultural elements, including of course the language. All of those had to

  • be changed in the American version because the style of translation at the

  • time called for completely eradicating any hint of a foreign culture. The

  • prevailing notion was that American audiences didn't want anything that had

  • what cultural theorist Kōichi Iwabuchi called "cultural odor,"

  • a phrase I hate to say out loud but have to respect the usefulness of. The localizers for the

  • first Animal Crossing did an amazing job replacing content and adding new

  • events for American audiences -- so much so that their game was actually re-localized

  • back into Japanese and released for the GameCube as Dōbutsu no Mori e+.

  • So when it came time to make Animal Crossing: Wild World, Nintendo needed a

  • localization strategy from the start.

  • And that strategy was to make a game with no regionality at ALL.

  • No cherry blossoms.

  • No Halloween.

  • And no regional Animalese.

  • The Bebebese of Wild World stuck around for City Folk,

  • but by New Leaf, Animalese had made a triumphant,

  • multilingual return. Why?

  • Well!

  • I don't know.

  • But my theory is this! City Folk got

  • a lot of criticism for being too similar to earlier entries in the franchise. The

  • next game had to distinguish itself significantly to avoid another letdown.

  • Aya Kyogoku, who co-directed New Leaf alongside Isao Moro,

  • viewed the game as a tool to communicate with both animal characters and other

  • players. So it made sense that the communication in game would be more

  • elaborate than Bebebese.

  • But doing full voice acting would have

  • been exorbitantly expensive;

  • City Folk had a huge script: around 640,000 words.

  • For perspective, Infinite Jest clocks in at about 483,000 words,

  • so this cute little game about bugs and letters has it beat

  • by over a hundred thousand words, and that means

  • it's better.

  • On top of that it's just plain science that when creatures speak in adorable baby talk

  • they're cute and you just want to squish their widdle faces.

  • All of that probably