SEC: 1.2.4 CHARACTERS

Character Set Lab

Mission: Understand how computers turn binary 0s and 1s into the text, emojis, and symbols you see every day.

Core Theory

Character Mapping

🔤
The Character Set

A character set is a defined list of characters that a computer system can recognize. It acts as a "lookup table" where each character is assigned a unique binary code.

🔢
The Binary Link

Computers only process binary. To store an 'A', the computer saves its binary code (e.g., 01000001). To display it, the computer "looks up" which symbol that code stands for and draws it on screen.
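The lookup can be tried out in Python. This is a minimal sketch, not part of the exam material: the helper names `char_to_binary` and `binary_to_char` are made up for illustration, and the 8-bit width is an assumption that matches the example above.

```python
def char_to_binary(ch: str, bits: int = 8) -> str:
    """Return the character's code as a fixed-width binary string."""
    # ord() gives the numeric code; format() pads it with leading zeros.
    return format(ord(ch), f"0{bits}b")


def binary_to_char(code: str) -> str:
    """Look up the character that a binary string stands for."""
    # int(code, 2) reads the string as a base-2 number; chr() finds the symbol.
    return chr(int(code, 2))


print(char_to_binary("A"))         # 01000001
print(binary_to_char("01000001"))  # A
```

Both directions use the same table: `ord` goes from symbol to number, `chr` goes from number back to symbol.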

📐
The Bit-to-Character Rule

The more bits you use per character, the more characters you can represent. This follows a mathematical pattern:

Characters = 2^n (where n is the number of bits)
7 Bits = 128 characters
8 Bits = 256 characters
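The pattern can be checked with a short Python sketch. The function name `characters_for_bits` is hypothetical, and the 16-bit row is an extra example beyond the two listed above.

```python
def characters_for_bits(n: int) -> int:
    """Number of unique codes available with n bits: 2 to the power n."""
    return 2 ** n


for n in (7, 8, 16):
    print(f"{n} bits = {characters_for_bits(n)} characters")
# 7 bits = 128 characters
# 8 bits = 256 characters
# 16 bits = 65536 characters
```

Each extra bit doubles the number of characters that can be represented.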

The Heavyweights

ASCII vs Unicode

ASCII

The Old Standard. Primarily for the English language and basic symbols.

  • Uses 7 bits per character in the original standard (8 bits in extended ASCII).
  • Small file size (efficient).
  • Limited: Cannot represent other scripts (e.g. Greek, Chinese) or emojis.

Unicode

The Global Standard. Designed to represent every character from every language.

  • Uses more bits per character than ASCII: typically 16 or 32, depending on the encoding.
  • Large character set (well over 100,000 characters).
  • Supports emojis and ancient scripts.
  • Drawback: Larger file size than ASCII.
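The trade-off between the two standards can be seen in Python. This is a rough sketch under stated assumptions: `size_in_bits` and `fits_in_ascii` are illustrative names, Python's "ascii" codec stores each character in one 8-bit byte, and "utf-32-be" is used as one example of a 32-bit Unicode encoding.

```python
def size_in_bits(text: str, encoding: str) -> int:
    """Bits needed to store the text in the given encoding."""
    return len(text.encode(encoding)) * 8


def fits_in_ascii(text: str) -> bool:
    """True if every character in the text exists in ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(size_in_bits("Hello", "ascii"))      # 40
print(size_in_bits("Hello", "utf-32-be"))  # 160
print(fits_in_ascii("Hello"))              # True
print(fits_in_ascii("Ω"))                  # False
```

The same five letters take four times the space at 32 bits per character, but only the Unicode encoding can store the Greek letter at all.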

👀
Examiner's Eye: Character Traps

The Bit Count: In the J277 exam, treat ASCII as 8 bits per character for calculations. The original standard uses 7 bits for the character code; the 8th bit can be used for error checking (parity bit).
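A typical calculation of this kind is the size of a text file: number of characters multiplied by bits per character. The sketch below assumes that is the calculation meant here; the function name `text_size_bits` and the sample word are illustrative only.

```python
def text_size_bits(text: str, bits_per_char: int = 8) -> int:
    """File size in bits = number of characters x bits per character."""
    return len(text) * bits_per_char


word = "Computer"                # 8 characters
bits = text_size_bits(word)      # 8 x 8 = 64 bits
print(bits, "bits =", bits // 8, "bytes")  # 64 bits = 8 bytes
```

Remember that spaces and punctuation count as characters too.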

The "Relationship" Question: If asked about bits vs characters, you must say: "Increasing the bits increases the number of unique characters that can be represented."

Order Logic: Character sets are logically ordered. If 'A' is 65, 'B' will be 66. This is a common 1-mark question!
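Because the codes run in order, one known code is enough to work out another. A small Python sketch of that reasoning follows; `code_from_known` is a hypothetical helper, and it uses Python's own character codes to count the distance between two letters.

```python
def code_from_known(known_char: str, known_code: int, target: str) -> int:
    """Work out a character's code by counting on from a known one."""
    # The distance between two letters equals the distance between their codes.
    steps = ord(target) - ord(known_char)
    return known_code + steps


print(ord("A"), ord("B"))              # 65 66
print(code_from_known("A", 65, "E"))   # 69
print(chr(ord("A") + 1))               # B
```

In an exam answer, show the counting: A = 65, so B = 66, C = 67, D = 68, E = 69.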
