This is Animalese.
-
KK Slider: [unintelligible babble]
-
It's the pseudo-language of Animal Crossing. This is what
-
it sounds like in the Japanese version:
-
KK Slider: [unintelligible babble]
-
And here's what it sounds like in the
-
English version:
-
KK Slider: [slightly deeper unintelligible babble]
-
They sound different, which is weird because it's supposed to
-
be nonsense, right?
-
So... why does Nintendo dub KK Slider?
-
To understand we need to
-
do a taxonomy of all the ways games have tried to represent -- or avoid representing --
-
human speech.
-
In the beginning, there was the Word, and the Word was:
-
Oop! That... was supposed to say voice synthesis.
-
Early attempts at adding audio to games were a
-
mix of pre-recorded voices and genuine voice synthesizers. But they were mostly
-
gimmicky, expensive add-ons. Voice chips made more sense in arcade machines
-
because they were already a huge investment of space and money -- but it was
-
still a technical struggle to get them to work. Take Q*bert: known for his mad ups
-
and foul mouth, this drop of Tang was originally supposed
-
to speak English instead of:
-
Q*Bert: [garbled synthetic phonemes]
-
but audio engineer David Thiel couldn't get the voice chip to
-
produce the sounds he was hoping for. So instead of continuing to mess with it,
-
he just said [Q*bert garble curse] and had it string together some incoherent phonemes instead.
-
Thiel, like many designers that followed, came to the conclusion that
-
human voices just weren't worth the fuss. Other developers opted for a style
-
that's entirely unique to video games, and I looked around but I couldn't find
-
a single definitive phrase used to describe this style -- which I think speaks
-
to how much we take it for granted, even though it is super weird.
-
I'm talking about using nonsensical sound effects to stand in for language, or simply put
-
[slow, low beeps that appear in time with the words]
-
[very high-pitched piercing beeps that appear in time with the words]
-
[sharp, high-pitch beeps that appear in time with the words]
-
for the purpose of this video -- and because it's cool to name things --
-
I'm going to call this beep speech.
-
The earliest examples of beep speech I could
-
find were in JRPGs like Star Arthur Legend - Planet Mephius [short pattern of beeps]
-
and Legend of Zelda [mid-pitch beeps]
-
Some American games used a similar trope of mimicking on-screen text, but
-
it's not meant to stand in for a voice so it's not quite the same.
-
That distinction is important because of beep speech's peculiar function; games that
-
use beep speech slowly reveal text and accompany each word with audio, which
-
makes the player process information as if they were really listening to somebody speak.
-
It's not a straight info-dump; it replicates the act of listening,
-
which makes it easier to stay engaged with the written text. That's
-
assuming you enjoy listening to bebe bebe be bebeep which is a great weakness
-
of the beep speech of the cartridge era. Because audio capabilities were still
-
limited, most games used the same beep for every character in every situation. Later
-
games - including Animal Crossing - could pitch the beeps higher or lower, and that
-
really helped spice things up.
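
A rough sketch of the beep speech mechanic described above: text is revealed one word at a time, each word gets a beep, and a per-character multiplier pitches the beep higher or lower. The names (`BeepEvent`, `beep_speech`, `words_per_second`, `pitch_multiplier`) and the default values are made up for illustration; this is not how any particular game implements it.

```python
from dataclasses import dataclass


@dataclass
class BeepEvent:
    time_s: float       # when the word appears and the beep plays
    visible_text: str   # text shown in the dialogue box so far
    pitch_hz: float     # pitch of the beep for this word


def beep_speech(text, words_per_second=4.0, base_pitch_hz=440.0, pitch_multiplier=1.0):
    """Build a reveal schedule: one beep per word, at a pitch set per speaker."""
    words = text.split()
    events = []
    for index in range(len(words)):
        events.append(
            BeepEvent(
                time_s=index / words_per_second,
                visible_text=" ".join(words[: index + 1]),
                # A cartridge-era game would use pitch_multiplier=1.0 for everyone;
                # later games could vary it per character.
                pitch_hz=base_pitch_hz * pitch_multiplier,
            )
        )
    return events


if __name__ == "__main__":
    for event in beep_speech("It's dangerous to go alone", pitch_multiplier=0.5):
        print(f"{event.time_s:.2f}s  {event.pitch_hz:.0f} Hz  {event.visible_text}")
```

Calling `beep_speech` with a different `pitch_multiplier` for each character gives every speaker a distinct beep without recording any extra audio.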
-
Then there were games like Star Fox which gave each
-
character a different kind of "voice" so you could easily distinguish your kind
-
friends Slippy [synthetic sounds similar to frog croaks]
-
from that no-good hotshot Falco [deeper babble]
-
These were synthetic voices and
-
total nonsense with no real association with the text. Another strategy was to
-
use vocal grunts -- things like sighs and yells and other non-language forms
-
of communication. These were great for adding variety,
-
conveying emotion, and giving a character a voice without giving them language.
-
Although they use different strategies, Star Fox and Ocarina of Time have two
-
weird things in common:
-
first of all, both have friendly frogs that never get their due.
-
[rhythmic frog croaking]
-
Second of all, both have English language lines even in the original Japanese versions.
-
Navi in Ocarina of Time: Hello!
-
The [GOOD LUCK] and [HEY, LISTEN] were the same in every
-
version of the games. And that points to one of the biggest strengths of beep speech
-
and vocal grunts: you DON'T have to translate them. A shiver is a shiver
-
in every language.
-
Link in Ocarina of Time: [shivers]
-
Localizing a game was - and is - a huge expenditure of time and
-
money, which makes these non-voice options the perfect replacement for
-
voice lines. Quality localization is basically a requirement for most games
-
now, but the 90s and early 2000s were a dark time for translations and voice
-
acting alike, leaving us with such gems as:
-
Dracula in Castlevania: What is a man?!
-
Barry in Resident Evil: A Jill Sandwich!
-
And that's when localization happened at all - sorry Earthbound fans. During this period,
-
beep speech was usually a stand-in for a real language. But Banjo-Kazooie made a
-
huge innovation in that their gibberish was... just what it was.
-
Bottles: [a gentle honking]
-
Like Star Fox, the characters
-
had distinct voices - but they weren't synthetic. They were powered by
-
real human pipes, which is wild because it's human voices replicating a
-
synthetic style that was made to replace human voices, like an aural ouroboros.
-
An auralboros.
-
Plenty of games of this era had full voice acting... but they weren't
-
on the N64. Nintendo's insistence on using cartridges would continue to
-
restrict their options for speech representation. On other consoles, games
-
became more invested in the cinematic experience of having characters say real shit.
-
Which meant a greater investment in voice acting. That caused a split in style where beep speech,
-
previously just fine for serious stories, came to represent a more lighthearted
-
cartoonish feel.
-
Mushi's Mama in Okami: [soft mid-tone beeps]
-
By the early 2000s, a trend emerged of
-
entirely fictional spoken languages. Whereas beep speech stood in for the
-
player's native speech, these constructed languages were more about making certain
-
characters and settings appear foreign -- while still empowering the player to
-
understand what they're saying. It's during this period that
-
both Animal Crossing and The Sims arrived.
-
Simlish actually predates The Sims; it first
-
appeared in SimCopter. But for The Sims, the team at Maxis knew they'd need
-
something more elaborate. Because the game was so much about the human
-
condition, they wanted to communicate emotions which would encourage players
-
to connect with their creations. Plus the practical considerations - anything that's
-
comprehensible can become repetitive, and having a huge scroll of dialogue meant
-
writing, translating, and redubbing a huge scroll of dialogue. Following the style
-
of Banjo-Kazooie, they captured the real human voices of two improvisers and then
-
spent a year remixing that audio to become the perfect blend of nonsense.
-
Sim 1: Dag dag aulf, Sim 2: Anamana blastamana
-
But that strategy can't work with every franchise. Animal Crossing had different
-
intentions and different styles, and so they needed a different approach. When
-
you hear Animalese for the first time, it sounds a lot like a standard
-
voice synthesis. But KK Slider is actually saying REAL WORDS.
-
Here he is slowed down:
-
KK Slider: [deeper and slower than normal, words that match the text box identifiable]
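
One simple way to slow a clip down like this (or speed it back up) is to resample it: stretch or squeeze the list of audio samples and play it at the same sample rate, which changes the speed and the pitch together. The sketch below does that with linear interpolation. The function name `change_speed` and the `factor` parameter are hypothetical, and this is only an illustration of the general idea, not Nintendo's actual Animalese pipeline.

```python
def change_speed(samples, factor):
    """Resample audio by linear interpolation.

    factor > 1 shortens the clip (faster and higher when played back),
    factor < 1 lengthens it (slower and deeper).
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []

    last = len(samples) - 1
    out_len = max(1, int(len(samples) / factor))
    out = []
    for i in range(out_len):
        pos = i * factor                 # fractional position in the source clip
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        # Blend the two nearest source samples.
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


if __name__ == "__main__":
    clip = [0, 1, 2, 3, 4, 5, 6, 7]
    print(change_speed(clip, 2))     # half as many samples: sped up
    print(change_speed(clip, 0.5))   # twice as many samples: slowed down
```

Running a sped-up clip back through `change_speed` with a factor below 1 is roughly what "slowed down" means here: the samples are spread out again, so the words become easier to pick out.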
-
The synthetic voice doesn't exactly nail the pronunciation of each word, but
-
that works to its advantage; once it's sped back up, it's even harder to tell
-
that KK is speaking English. Dōbutsu no Mori, the original N64 Japanese
-
version of Animal Crossing, features Animalese in Japanese. Region-specific
-
Animalese is also the default language in New Leaf... but not Wild World or City
-
Folk. Instead they use a pretty standard sounding voice synthesis called Bebebese.
-
That's because Animal Crossing was never intended to be localized. In fact,
-
Nintendo didn't localize the first version of Animal Crossing; the American
-
release was based on the updated GameCube game Dōbutsu no Mori+.
-
Members of the Nintendo Treehouse had to advocate for it to be translated,
-
partially because they had already gotten addicted to playing it. Because
-
they never intended to localize the game, Nintendo included a lot of specific
-
Japanese cultural elements, including of course the language. All of those had to
-
be changed in the American version because the style of translation at the
-
time called for completely eradicating any hint of a foreign culture. The
-
prevailing notion was that American audiences didn't want anything that had
-
what cultural theorist Kōichi Iwabuchi called "cultural odor,"
-
a phrase I hate to say out loud but have to respect the usefulness of. The localizers for the
-
first Animal Crossing did an amazing job replacing content and adding new
-
events for American audiences -- so much so that their game was actually re-localized
-
back into Japanese and released for the GameCube as Dōbutsu no Mori e+.
-
So when it came time to make Animal Crossing: Wild World, Nintendo needed a
-
localization strategy from the start.
-
And that strategy was to make a game with no regionality at ALL.
-
No cherry blossoms.
-
No Halloween.
-
And no regional Animalese.
-
The Bebebese of Wild World stuck around for City Folk,
-
but by New Leaf, Animalese had made a triumphant,
-
multilingual return. Why?
-
Well!
-
I don't know.
-
But my theory is this! City Folk got
-
a lot of criticism for being too similar to earlier entries in the franchise. The
-
next game had to distinguish itself significantly to avoid another letdown.
-
Aya Kyogoku, who co-directed New Leaf alongside Isao Moro,
-
viewed the game as a tool to communicate with both animal characters and other
-
players. So it made sense that the communication in game would be more
-
elaborate than Bebebese.
-
But doing full voice acting would have
-
been exorbitantly expensive;
-
City Folk had a huge script: around 640,000 words.
-
For perspective, Infinite Jest clocks in at about 483,000 words,
-
so this cute little game about bugs and letters has it beat
-
by over a hundred thousand words, and that means
-
it's better.
-
On top of that it's just plain science that when creatures speak in adorable baby talk
-
they're cute and you just want to squish their widdle faces.
-
All of that probably
-
made it worthwhile to switch from Bebebese back to Animalese, an easy way
-
to show that you're turning over a New Leaf.
-
That brings us up to New Horizons and Nintendo has once again flipped the
-
script by making the language...
-
Tom Nook: [a few recognizable English words then nonsense babble]
-
semi-Animalese?
-
It's not quite Bebebese; there are
-
parts that still sound like words, and certain sounds do repeat with specific
-
text like "I" being pronounced like:
-
Goose: Ah
-
Rocket: Ah
-
Gulliver: Ah
-
but the weird thing is this has still been localized!
-
You can hear the difference in the way Nook addresses the audience in
-
the Japanese and English Nintendo Direct.
-
He's using the same quasi-English here that appears in New Horizons, which
-
brings us to an important question:
-
Why localize Animalese?
-
and why not localize Simlish?
-
Melissa Baese-Berk: One of the ways
-
that we best understand how languages
-
differ from each other is in terms of what's called their prosody or their
-
sort-of rhythmic and pitch information.
-
Jenna: This is Dr. Baese-Berk,
-
a psycholinguist at the University of Oregon, studying how people process speech,
-
especially as a second language. Prosody is a linguistic concept that
-
covers a lot of speech elements that aren't explicitly phonetic -- like if you
-
hear somebody talking through a wall, you can often tell if they're speaking your
-
native language or not, even if you can't hear specific words.
-
Dr. Baese-Berk: We know that the rhythm matters a ton for recognizability, and when you disturb the rhythm
-
information and pitch information, it can have really big consequences for how you
-
understand the speech. That said, there's a lot of variability, so I could have
-
sort of weird prosody, weird pitch and rhythm information, and you could
-
probably still tell that I'm a native speaker of English.
-
Jenna: Which is exactly why
-
Animalese is so difficult to parse, even though it's just been peppered with a
-
few audio artifacts. The sped-up pace alters the prosody and makes it very
-
hard to understand.
-
Hard... but not impossible.
-
Dr. Baese-Berk: There are ways in which you can distort the signal so much that it feels at
-
first just like it's impossible for you to understand it, but once you have
-
started to figure it out, it becomes easy to understand.
-
Jenna: I can vouch for that,
-
having listened to a lot of Animal Crossing clips while researching this
-
video. I've gotten to the point where I can sort of like... half-understand the
-
villagers while they're speaking, and I've also begun to dream in Animalese,
-
and that's... that's probably fine, right?
-
Dr. Baese-Berk: How we define gibberish is
-
going to be based on our native language, right? So how gibberish-y something
-
sounds is going to be related to how similar or different it might sound to
-
your native language, and... but I could imagine if it sounded so distinct from
-
your native language, it might not even sound like gibberish; it might just sound
-
like something that isn't really language-y.
-
Jenna: So even though it's gibberish,
-
the distance from your native language can determine, even
-
subconsciously, whether you perceive it as a language. Which means that Simlish
-
probably isn't as universal as the team at Maxis hopes.
-
Dr. Baese-Berk: The specific sounds that
-
they're using are English-like sounds, in part because we know producing
-
non-English-like sounds is something that's really hard if you're a native
-
English speaker. So if you're improvising and producing gibberish, you're going to
-
produce the sounds that are within your inventory.
-
Jenna: A lot of effort was put into
-
mixing and chopping up this audio but the raw materials were still inevitably
-
lacking in variety. Simlish could still be localized to make more regionally
-
accessible forms of gibberish.
-
But you're not a member of the community in The Sims;