Korean.png
Let's start simple: ASCII. ASCII is just a straightforward way of encoding alphabetic characters, digits, and other commonly used symbols in binary. The original version used 7 bits (how many different things can it represent?).

ASCII_Code_Chart-Quick_ref_card.jpg
Note the following:
  • Representing the symbol "2" is different to representing the number 2
  • 000xxxx and 001xxxx are control codes (mostly obsolete except for HT, CR and LF!)
  • 010xxxx are symbols
  • 011xxxx are digits and symbols
  • 100xxxx and 101xxxx are upper case letters (and symbols)
  • 110xxxx and 111xxxx are lower case letters (and symbols)
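
The points above can be checked with a few lines of Python. This is only a sketch: the helper name `ascii_group` is made up for illustration, and the sample characters are arbitrary picks from each row of the chart.

```python
# The symbol "2" and the number 2 are stored as different bit patterns.
symbol = "2"
number = 2

print(format(ord(symbol), "07b"))  # 0110010 (ASCII code 50)
print(format(number, "07b"))       # 0000010 (the value two)


def ascii_group(ch):
    """Return the top three bits of a character's 7-bit ASCII code."""
    return format(ord(ch), "07b")[:3]


# One sample from each group: control code, symbol, digit, upper case, lower case.
for ch in ["\t", " ", "2", "A", "a"]:
    print(repr(ch), format(ord(ch), "07b"), ascii_group(ch))
```

The loop prints the full 7-bit code next to its three-bit prefix, so each sample can be matched against the groups listed above.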
See Wikipedia for general discussions of Character Encoding. Note the difference between Character Encodings and Character Sets.

Questions
  • What is "HI" in binary-encoded ASCII?
  • What is "HELLO" in hex-encoded ASCII?
  • What is "hello" in hex-encoded ASCII?

We can use a Binary-ASCII converter or Online Hex Editor to demonstrate this for longer pieces of text, e.g.
The race is not to the swift or the battle to the strong, nor does food come to the wise or wealth to the brilliant or favor to the learned;
but time and chance happen to them all
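
If no online converter is to hand, a rough equivalent can be sketched in Python. The function names `to_binary` and `to_hex` are invented for this example, and each character is shown as a full 8-bit byte (the 7-bit ASCII code with a leading 0), which is how ASCII text is normally stored today.

```python
def to_binary(text):
    """Return the ASCII bytes of text as space-separated 8-bit binary strings."""
    return " ".join(format(b, "08b") for b in text.encode("ascii"))


def to_hex(text):
    """Return the ASCII bytes of text as space-separated two-digit hex values."""
    return " ".join(format(b, "02X") for b in text.encode("ascii"))


quote = "The race is not to the swift"
print(to_binary(quote[:3]))  # 01010100 01101000 01100101
print(to_hex(quote))
```

Both functions raise a `UnicodeEncodeError` if the text contains anything outside ASCII.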
For some ASCII fun, see ASCIImation

nous avons un problème! We are being too Naïve!


Enter ISO/IEC 8859-1 !
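
ISO/IEC 8859-1 uses 8 bits per character: the first 128 codes match ASCII and the upper 128 add accented Latin letters and extra symbols. The sketch below uses Python's built-in "latin-1" codec, which is its name for ISO-8859-1, on the sentence above.

```python
text = "We are being too Na\u00efve!"  # \u00ef is the letter i with diaeresis

# ASCII has no code for the accented letter, so encoding fails.
try:
    text.encode("ascii")
except UnicodeEncodeError as err:
    print("ASCII cannot encode:", err.object[err.start:err.end])

# ISO/IEC 8859-1 fits every character of this sentence into one byte each.
latin1 = text.encode("latin-1")
print(len(text), len(latin1))

# The accented letter gets a code with the eighth (top) bit set.
print(format("\u00ef".encode("latin-1")[0], "08b"))  # 11101111
```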

But we still have a problem, one which crops up in many representations.... can you see what it is?

AÄÅaäáàâårghhh!!!!


This leads to sorting and equivalence problems...
See CodingHorror article - "natural" versus "asciibetical" sort
See Google search for Katy Borner vs Katy Börner
But what about schön schoen and schon?
And where would Ølstykke get sorted?
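
The sorting and equivalence problem can be seen in a short Python sketch. The word list and the helper names `strip_accents` and `naive_key` are assumptions made for this example. Stripping accents is only one naive workaround, not a proper language-aware collation: it does not know that German often treats ö as oe, and it leaves Ø untouched because that letter has no accent to strip.

```python
import unicodedata

words = ["schon", "schön", "schoen", "Zebra", "apple", "Ølstykke", "Börner", "Borner"]

# Default sort compares raw code points: upper case before lower case,
# and accented letters after every unaccented one.
print(sorted(words))


def strip_accents(s):
    """Decompose characters and drop the combining accent marks."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def naive_key(s):
    """Sort key that ignores accents and letter case."""
    return strip_accents(s).casefold()


print(sorted(words, key=naive_key))

# Equivalence: the two spellings differ as strings but match once accents are stripped.
print("Börner" == "Borner")
print(strip_accents("Börner") == strip_accents("Borner"))
```

With the naive key, Ølstykke still sorts after Zebra, which is one reason real applications rely on locale-aware collation rules instead.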

And we have a bigger problem: how do we handle our Korean text (or any other language)? Enter Unicode (Wikipedia; also see Unicode.org). Very simply, Unicode tries to represent every character in every language.
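
As a small illustration, Python strings are sequences of Unicode code points, so the number assigned to each character can be printed directly. The Korean word used here is an assumed example (it means "Korean language") and is not taken from the image above.

```python
word = "\ud55c\uad6d\uc5b4"  # the three Hangul syllables of the Korean word for "Korean language"

# Each character has its own Unicode code point, conventionally written U+HHHH.
for ch in word:
    print(ch, f"U+{ord(ch):04X}")

# An 8-bit encoding such as ISO/IEC 8859-1 has no codes for these characters.
try:
    word.encode("latin-1")
except UnicodeEncodeError:
    print("ISO/IEC 8859-1 cannot represent this text")
```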

See Unicode Charts

To use Unicode in webpages (HTML), you can use the form &#xHHHH; where HHHH is the hex code (note the trailing semicolon). For example, the letter Y in Unicode is U+0059, so in HTML you could write &#x0059;. You can try out these codes with this online HTML editor.
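
The conversion between a character and its HTML numeric reference can also be sketched in Python. The function `to_html_entity` is a made-up helper for this example; `html.unescape` is part of the standard library and decodes references back to characters.

```python
import html


def to_html_entity(ch):
    """Return the hexadecimal HTML character reference for one character."""
    return f"&#x{ord(ch):04X};"


print(to_html_entity("Y"))        # &#x0059;
print(html.unescape("&#x0059;"))  # Y
```

The same helper works for any single character, including ones outside ASCII.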