Character Sets
How computers represent text using ASCII and Unicode.
Think of it like: The Secret Agent Codebook
Imagine sending a secret message using numbers. You send "65", and your friend looks it up in their
book to see "A".
A Character Set is just that agreed book. If you use one book (ASCII) and your
friend uses another, the message will be garbled!
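The codebook idea can be sketched in a few lines of Python. This is only an illustration: `other_book` is a made-up second codebook (not a real character set), used to show what happens when sender and receiver disagree.

```python
# The sender turns each letter into its agreed number (its character code).
message = "HELLO"
codes = [ord(ch) for ch in message]
print(codes)  # [72, 69, 76, 76, 79]

# A receiver using the SAME book gets the original message back.
decoded_same = "".join(chr(n) for n in codes)
print(decoded_same)  # HELLO

# Hypothetical second book: every number maps to the next letter along.
def other_book(n):
    return chr(n + 1)

# A receiver using a DIFFERENT book reads the same numbers as nonsense.
decoded_other = "".join(other_book(n) for n in codes)
print(decoded_other)  # IFMMP
```

The numbers sent are identical in both cases; only the book used to look them up changes the result.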
ASCII
American Standard Code for Information Interchange.
- 7-bit code (128 characters).
- Extended ASCII uses 8 bits (256 characters).
- Limited to English language.
- Small file size.
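As a rough check of the points above, the sketch below tests whether a piece of text fits in 7-bit ASCII and shows the 7-bit pattern for a character. The helper names `fits_in_ascii` and `seven_bit` are invented for this example.

```python
def fits_in_ascii(text):
    """Return True if every character has a code from 0 to 127 (7 bits)."""
    return all(ord(ch) < 128 for ch in text)

def seven_bit(ch):
    """Return the 7-bit binary pattern for a single ASCII character."""
    if ord(ch) >= 128:
        raise ValueError("character is outside the 7-bit ASCII range")
    return format(ord(ch), "07b")

print(fits_in_ascii("Hello 123"))  # True
print(fits_in_ascii("café"))       # False: the accented letter is not in ASCII
print(seven_bit("A"))              # 1000001
```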
Unicode
Universal Character Set.
- Typically stored using 16 or 32 bits per character (room for over a million characters).
- Covers the writing systems of virtually all languages, plus emojis 🚀.
- Backwards compatible with ASCII.
- Larger file size.
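The sketch below illustrates these points using Python's built-in `ord` and `str.encode`. It assumes the 16-bit and 32-bit forms correspond to the UTF-16 and UTF-32 encodings (the `-be` variants are used so that no byte-order mark is added to the count); `bits_needed` is a helper name made up for this example.

```python
def bits_needed(ch, encoding):
    """Number of bits used to store one character in the given encoding."""
    return len(ch.encode(encoding)) * 8

# Backwards compatible: 'A' has the same code (65) in ASCII and Unicode.
print(ord("A"))  # 65

for ch in ["A", "é", "🚀"]:
    print(
        f"U+{ord(ch):04X}",
        bits_needed(ch, "utf-16-be"), "bits in UTF-16,",
        bits_needed(ch, "utf-32-be"), "bits in UTF-32",
    )
# U+0041 16 bits in UTF-16, 32 bits in UTF-32
# U+00E9 16 bits in UTF-16, 32 bits in UTF-32
# U+1F680 32 bits in UTF-16, 32 bits in UTF-32
```

Note that the emoji needs two 16-bit units in UTF-16, while UTF-32 always uses 32 bits per character.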
Calculating Text File Size
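You must be able to calculate the storage required for an uncompressed text file.
Formula: file size in bits = bits per character × number of characters.
Example: a 100-character message stored in 8-bit (extended) ASCII.
Bits: 8 bits × 100 characters = 800 bits.
Bytes: 800 bits / 8 = 100 bytes. (Tip: Since 8 bits is exactly 1 byte, 100 characters in 8-bit ASCII is simply 100 bytes!)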
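The same calculation can be written as a short Python sketch. The function names are invented for this example, and rounding up to whole bytes is an assumption made here for cases where the bit total is not a multiple of 8.

```python
import math

def text_size_bits(num_characters, bits_per_character):
    """Uncompressed size in bits: bits per character x number of characters."""
    return num_characters * bits_per_character

def bits_to_bytes(bits):
    """Convert bits to bytes, rounding up to a whole byte (assumption)."""
    return math.ceil(bits / 8)

bits = text_size_bits(100, 8)
print(bits)                 # 800
print(bits_to_bytes(bits))  # 100

# The same 100 characters at 32 bits per character:
print(bits_to_bytes(text_size_bits(100, 32)))  # 400
```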
Text Decoder
Type to see how the computer stores your text in Memory.
Max 20 chars.
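A minimal sketch of what such a decoder might do, assuming each character is shown as one 8-bit memory cell and input is cut off at 20 characters. The names `MAX_CHARS` and `decode_to_memory` are hypothetical and not taken from the original tool.

```python
MAX_CHARS = 20  # assumed limit, matching the note above

def decode_to_memory(text):
    """Return (character, decimal code, 8-bit binary) for each character."""
    text = text[:MAX_CHARS]
    # format(..., "08b") pads to 8 binary digits; codes above 255 would
    # produce longer patterns, so this sketch suits ASCII text only.
    return [(ch, ord(ch), format(ord(ch), "08b")) for ch in text]

for ch, code, binary in decode_to_memory("Hi"):
    print(ch, code, binary)
# H 72 01001000
# i 105 01101001
```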
Check Your Understanding
1. How many characters can standard 7-bit ASCII represent?
2. Why did computing globally transition towards using Unicode?
3. The letter 'A' is 65 in decimal (01000001 in binary). What happens if that same bit pattern is read as an image pixel instead?
Evaluation Exam Scenario (AO3)
"Ahmet is writing a simple text-based program that only outputs numbers and the English alphabet. He decides to use a 32-bit Unicode character set to guarantee it is future-proof. Explain why this is an incredibly inefficient decision for this specific scenario." (3 marks)
Identify the Overkill: A 32-bit Unicode character set uses 4 bytes of storage per character, giving over 4 billion possible codes.
Apply to Scenario: Ahmet only needs the English alphabet and numbers. This requires fewer than 128 characters, so a 7-bit ASCII character set (stored as 1 byte per character) is all he needs.
Conclusion: By using 32-bit Unicode instead of ASCII, Ahmet stores 24 to 25 unnecessary zero bits for every character, making his text file four times the size it needs to be (300% larger) and wasting both storage space and the memory needed to process it.
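The size difference in this scenario can be checked with a short Python sketch. The sample text is made up, and the 32-bit character set is assumed to behave like UTF-32 (the `utf-32-be` variant is used so that no byte-order mark is added to the count).

```python
# Sample output containing only English letters, digits and a space.
text = "Hello 123"

ascii_bytes = len(text.encode("ascii"))      # 1 byte per character
utf32_bytes = len(text.encode("utf-32-be"))  # 4 bytes per character

print(ascii_bytes)                # 9
print(utf32_bytes)                # 36
print(utf32_bytes / ascii_bytes)  # 4.0, four times the size

# Unused bits per character when the code would fit in 7 bits:
print(32 - 7)                     # 25
```

For text made only of ASCII characters, the 32-bit version always comes out at four times the size of the 1-byte-per-character version.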