
Emojis are Unicode?

December 4th 2015

In May of this year, the New Yorker published “How Emoji Got to the White House.” Two months ago, The New York Times wrote a follow-up, “How Emojis Find Their Way to Phones.”

Brief History of Emojis

In summary, emojis were “invented” in Japan and evolved from visual depictions popular in manga. Japanese cellular carriers began incorporating them into their texting platforms in the late 1990s, and each carrier competed to make better, more complex emojis! These emojis were proprietary, so an emoji sent from NTT DoCoMo (the largest Japanese cellular carrier) could not necessarily be read by your friend using Softbank. When Google launched Gmail in Japan, they realized they had to support emojis, but in order to support them in e-mail, they had to promote a standard. Enter the Unicode Consortium. Google, along with a handful of other American tech companies, had emojis added to Unicode, a character set standard that incorporates practically every written language in the world.

But Emojis are Unicode??

As the New York Times pointed out:

Some of these modern hieroglyphics have prompted debate. Sets of default emojis that included only white skin tones prompted Unicode to release more diverse characters last year. And one image in the latest group has prompted protest: The British gun control group Infer Trust has spoken out against a proposal for a rifle emoji.

And that gets to the crux of it. The purpose of Unicode is to provide an international character set that incorporates the languages of all regions. The Unicode character set includes Roman letters, French accents, Chinese characters, etc. This helps facilitate cross-border communication: I can copy Korean text and send it to an English speaker without a problem, as long as the text is encoded in a Unicode standard and the English speaker receives it in an application that supports that Unicode standard. (I’m saying “Unicode standard” here because technically Unicode defines a character set, and there are separate encoding standards, such as UTF-8 and UTF-16, that specify how each character in this set gets turned into bits, 0s and 1s.)
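To make the character set versus encoding distinction concrete, here is a small Python sketch. The helper name `describe` and the sample characters are my own choices for illustration; the point is only that each character has one code point in Unicode, while UTF-8 and UTF-16 turn that same code point into different bytes.

```python
def describe(ch):
    """Return the Unicode code point of one character and its encoded bytes."""
    return {
        "code_point": f"U+{ord(ch):04X}",           # position in the character set
        "utf8": ch.encode("utf-8").hex(),           # bytes under UTF-8
        "utf16": ch.encode("utf-16-be").hex(),      # bytes under UTF-16 (big-endian)
    }

# A Roman letter, an accented letter, a Korean syllable, and the hamburger emoji.
for ch in ["A", "\u00e9", "\ud55c", "\U0001F354"]:
    print(ch, describe(ch))

# The hamburger emoji is code point U+1F354:
#   UTF-8  -> f09f8d94  (four bytes)
#   UTF-16 -> d83cdf54  (a surrogate pair, also four bytes)
```

As far as the encodings are concerned, the emoji is handled exactly like the Korean syllable or the accented letter: it is just another code point, only one with a larger number.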

While emojis are a medium of communication, it’s a bit arbitrary exactly which emojis should be included in a standard and which shouldn’t. This becomes especially apparent when the set of standard emojis includes not only facial expressions but also animals, electronics, and food. Why is a burger (🍔) included in the Unicode standard and not a steak? There’s a symbol in the standard for a fax machine (📠) … why not also include a stylus? In fact, any image could essentially be a means of communication. Why are certain pictorial representations treated as text (as part of the Unicode standard) while others are just ordinary images?
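One way to see that these pictures really are treated as text is to look them up in Python’s built-in `unicodedata` module, which exposes the Unicode character database. This is only a sketch; the helper name `char_info` is hypothetical, and the exact set of characters available depends on the Unicode version bundled with your Python.

```python
import unicodedata

def char_info(ch):
    """Return (code point, official Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

# The letter "A", the burger emoji, and the fax machine emoji.
for ch in ["A", "\U0001F354", "\U0001F4E0"]:
    print(char_info(ch))

# ('U+0041', 'LATIN CAPITAL LETTER A', 'Lu')
# ('U+1F354', 'HAMBURGER', 'So')
# ('U+1F4E0', 'FAX MACHINE', 'So')
```

The burger and the fax machine each get a code point and an official name, just like the letter A. The only difference the database records is the category: “Lu” (uppercase letter) versus “So” (symbol, other).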

Emojis are here to stay, and having a standard mechanism to exchange them is certainly necessary. From a practical standpoint, incorporating emojis into Unicode is the most straightforward solution since many devices already have partial Unicode support. It just seems strange to add an arbitrary set of objects and expressions to an international language standard.