ASCII explained: the 128 characters that run the web

Ask what letter the byte 65 means and every computer on Earth gives the same answer: A. That agreement — boring, total, invisible — is ASCII, and it's arguably the most successful standard in computing history. It was designed in 1963, for machines that printed on paper, and it is still load-bearing in every URL you type.

What ASCII actually is.

ASCII — the American Standard Code for Information Interchange — is a mapping between numbers and characters: 128 codes, 0 through 127, fitting in 7 bits. First published by the American Standards Association in 1963 and revised into its familiar form in 1967, it gave every letter, digit, punctuation mark, and machine-control signal of American English text a fixed number, so that any two machines could exchange text and mean the same thing by it.

The number 128 wasn't an accident of ambition but of hardware: 7 bits was what the teleprinter world could afford per character, and 2⁷ = 128 slots is what 7 bits buys. Into those 128 slots went 33 control codes (more on those below), the space, 10 digits, 26 uppercase letters, 26 lowercase letters, and 32 punctuation and symbol characters.
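That tally can be checked by classifying all 128 codes and counting them. The sketch below is only an illustration: it borrows the punctuation set from Python's standard string module, and the function name classify is made up for this example.

```python
import string
from collections import Counter

def classify(code: int) -> str:
    """Return the category of a 7-bit ASCII code (0-127)."""
    if not 0 <= code <= 127:
        raise ValueError(f"not a 7-bit ASCII code: {code}")
    ch = chr(code)
    if code < 32 or code == 127:
        return "control"          # 0-31 plus DEL
    if ch == " ":
        return "space"
    if ch in string.digits:
        return "digit"
    if ch in string.ascii_uppercase:
        return "uppercase"
    if ch in string.ascii_lowercase:
        return "lowercase"
    return "punctuation"          # everything left in 33-126

counts = Counter(classify(code) for code in range(128))
for category, n in counts.items():
    print(f"{category:12} {n}")
print("total", sum(counts.values()))
```

The categories sum to 128 with nothing left over, which is the point: every slot in the 7-bit space is assigned.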

The table's deliberate layout.

The assignments look arbitrary until you write them in binary — then the design jumps out:

Range	Contents	Why there
0–31	Control characters	Machine instructions, kept out of the printable range
32	Space	The first "printable" — blank on purpose
48–57	Digits 0–9	Low 4 bits equal the digit's value: `0` is 0110000
65–90	A–Z	Alphabetical, contiguous
97–122	a–z	Exactly uppercase + 32 — one bit apart
127	DEL	All bits set — punch-tape erasure (punch every hole)

Two of those rows are little gifts to programmers. Digits: because '7' is 55 and '0' is 48, converting a digit character to its value is a single subtraction, c - '0', and the low four bits of a digit's code are the digit, in binary. Letters: each case of the alphabet is contiguous, so 'z' - 'a' is 25 and range checks like c >= 'a' && c <= 'z' work. The contiguous alphabet in particular was not a given: EBCDIC, IBM's contemporary encoding, famously has gaps inside its alphabet.
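Both properties are easy to see in code. This is a minimal sketch in Python, where ord() plays the role of C's implicit character-to-integer conversion; the helper names digit_value and is_ascii_lower are hypothetical.

```python
def digit_value(ch: str) -> int:
    """Value of a single ASCII digit character, by subtraction from '0'."""
    if len(ch) != 1 or not ("0" <= ch <= "9"):
        raise ValueError(f"not an ASCII digit: {ch!r}")
    return ord(ch) - ord("0")

def is_ascii_lower(ch: str) -> bool:
    """Range check that relies on a-z occupying consecutive codes."""
    return len(ch) == 1 and ord("a") <= ord(ch) <= ord("z")

print(digit_value("7"))                 # 55 - 48
print(ord("7") & 0b1111)                # low four bits of the code
print(ord("z") - ord("a"))              # distance across the alphabet
print(is_ascii_lower("q"), is_ascii_lower("Q"))
```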

And 127, DEL, is a fossil worth keeping: on paper tape, you erase a mistyped character by punching out all the holes — which is why "delete" is the code with every bit set, 1111111.

The invisible 33: control characters.

Codes 0–31 (plus DEL) don't print — they instruct. Most are dead letters from the teleprinter age, but a handful still run your day:

LF (10, \n) and CR (13, \r): literally "line feed" (roll the platen one line) and "carriage return" (slide the print head back to column one). They were two separate physical motions on a teletype, hence two codes. Unix chose bare LF as its line ending; DOS and Windows kept the mechanical pair CR LF. That split is why files still show ^M artifacts and why git has a core.autocrlf setting. HTTP/1.1 sides with the pair: every header line ends in CRLF.
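Working on raw bytes makes the two conventions concrete. The following is a rough sketch, not a full line-ending library: to_lf and to_crlf are hypothetical helpers, and the request bytes are an invented example of an HTTP/1.1-style header block.

```python
def to_lf(data: bytes) -> bytes:
    """Turn CR LF pairs into bare LF; lone LF bytes are left untouched."""
    return data.replace(b"\r\n", b"\n")

def to_crlf(data: bytes) -> bytes:
    """Turn every line ending into CR LF without doubling existing CRs."""
    return to_lf(data).replace(b"\n", b"\r\n")

windows_text = b"first line\r\nsecond line\r\n"
print(to_lf(windows_text))
print(list(b"\r\n"))  # the two codes: CR = 13, LF = 10

# Header lines end in CR LF, and an empty line ends the header block.
request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
header_lines = request.split(b"\r\n\r\n", 1)[0].split(b"\r\n")
print(header_lines)
```

Normalizing through to_lf first is a deliberate choice: converting an already-CRLF file a second time then leaves it unchanged instead of producing CR CR LF.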

TAB (9, \t): horizontal tab, of tabs-versus-spaces fame. NUL (0, \0): all bits clear; C picked it to terminate strings, which arguably makes it the most security-consequential byte in history. ESC (27): escape, reborn as the prefix for terminal color codes (ESC[31m = red) and as the key you mash to exit vim. BEL (7, \a): rang an actual bell on the teletype; terminals still beep for it.
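These codes are all reachable from ordinary string escapes. A small sketch, with two hypothetical helpers: c_string mimics how C's string functions stop at the first NUL, and red wraps text in the ESC[31m sequence. The trailing ESC[0m, which resets terminal attributes, is an addition not mentioned above.

```python
# Escape sequences and the ASCII codes they stand for.
controls = {"NUL": "\0", "BEL": "\a", "TAB": "\t",
            "LF": "\n", "CR": "\r", "ESC": "\x1b"}
for name, ch in controls.items():
    print(f"{name:3} = {ord(ch)}")

def c_string(buffer: bytes) -> bytes:
    """What C's string functions would see: the bytes before the first NUL."""
    return buffer.split(b"\0", 1)[0]

def red(text: str) -> str:
    """Wrap text in ESC[31m (red foreground) and ESC[0m (reset)."""
    return f"\x1b[31m{text}\x1b[0m"

# Everything after the embedded NUL is invisible to NUL-terminated code.
print(c_string(b"admin\0.example.com"))
print(repr(red("error")))
```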

The uppercase bit trick.

Uppercase and lowercase letters differ by exactly 32 — which in binary is a single bit, bit 5:

A  =  1000001   (65)
a  =  1100001   (97)
       ^
       bit 5: 0 = uppercase, 1 = lowercase

That's deliberate. It means case conversion is one bitwise operation (OR with 32 to lowercase, AND with ~32 to uppercase), and case-insensitive comparison of letters can just mask that bit off. The same bit layout explains why Ctrl+letter shortcuts map the way they do: the terminal convention clears the top two bits, bits 5 and 6, and keeps the low five, so Ctrl+H (H = 72, 1001000) becomes 8, backspace, and Ctrl+I becomes 9, tab. The keyboard shortcuts you use today are ASCII arithmetic from the 1960s.
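The bit arithmetic fits in a few lines. This is a sketch for single ASCII characters only; the constant and function names are invented for the example, and the letter guards matter because characters such as @ and ` also differ by bit 5 without being a case pair.

```python
CASE_BIT = 0b0100000   # 32, bit 5
CTRL_MASK = 0b0011111  # 31, keeps the low five bits

def to_lower(ch: str) -> str:
    """Set bit 5 on A-Z; any other single character passes through."""
    return chr(ord(ch) | CASE_BIT) if "A" <= ch <= "Z" else ch

def to_upper(ch: str) -> str:
    """Clear bit 5 on a-z; any other single character passes through."""
    return chr(ord(ch) & ~CASE_BIT) if "a" <= ch <= "z" else ch

def ctrl(ch: str) -> int:
    """Code produced by Ctrl+key under the terminal convention."""
    return ord(ch) & CTRL_MASK

print(to_lower("A"), to_upper("a"))
print(f"{ord('A'):07b} {ord('a'):07b}")   # the codes differ only in bit 5
print(ctrl("H"), ctrl("I"), ctrl("M"))    # backspace, tab, carriage return
```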

"Extended ASCII" and the mojibake years.

When 8-bit bytes became universal, everyone noticed the spare bit: codes 128–255, another 128 slots, unclaimed. And every vendor claimed them differently. IBM PC code page 437 put box-drawing characters there; ISO 8859-1 (Latin-1) put Western European accents; Windows-1252 tweaked Latin-1; and dozens of other "code pages" covered Greek, Cyrillic, Hebrew, and more. All were casually called "extended ASCII," and none agreed with each other.

The result was a decades-long era of garbled text, or mojibake: a file written as Windows-1252 and read as code page 437 turns café into cafΘ, and the variant still seen today, UTF-8 bytes read as Windows-1252, turns it into cafÃ©. The bytes never changed; only the assumed meaning of the upper 128 did. The lesson, learned expensively: there is no such thing as plain text without a declared encoding. Below 128, everyone agreed. Above it, chaos.
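The effect is easy to reproduce, because decoding is just a table lookup on the same bytes. A minimal sketch using Python's built-in cp1252, cp437, and utf-8 codecs; the variable names are illustrative.

```python
text = "café"

# Same character, different bytes depending on the encoding used to write it.
cp1252_bytes = text.encode("cp1252")   # é is the single byte 0xE9
utf8_bytes = text.encode("utf-8")      # é is the two bytes 0xC3 0xA9

# Decoding with the wrong table does not fail here; it yields other characters.
print(cp1252_bytes.decode("cp437"))    # Windows-1252 bytes read as code page 437
print(utf8_bytes.decode("cp1252"))     # UTF-8 bytes read as Windows-1252

# The ASCII part, below 128, is identical in every case.
print(cp1252_bytes[:3], utf8_bytes[:3])
```

Note that the familiar cafÃ© comes from the second mix-up: two UTF-8 bytes each looked up as a separate Windows-1252 character.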

How UTF-8 made ASCII immortal.

Unicode's answer was one character set for every script, and UTF-8 — sketched by Ken Thompson and Rob Pike in 1992 — is the encoding that won the web. Its defining design decision: bytes 0–127 in UTF-8 mean exactly what they mean in ASCII. Every ASCII file is already a valid UTF-8 file, byte for byte. Multi-byte characters use only bytes 128–255, so they can never be confused with an ASCII character — a / is a path separator and a < opens a tag no matter what language surrounds it.
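Both halves of that guarantee can be checked directly. The sketch below assumes nothing beyond Python's built-in utf-8 and ascii codecs; the sample strings are invented for the example.

```python
ascii_text = "GET /index.html"
# Pure ASCII text produces the same bytes under either encoding.
print(ascii_text.encode("ascii") == ascii_text.encode("utf-8"))

# One to four bytes per character; only the ASCII one uses a byte below 128.
for ch in ["A", "é", "€", "😀"]:
    encoded = ch.encode("utf-8")
    print(repr(ch), len(encoded), [f"{b:08b}" for b in encoded])

# In mixed text, any byte below 128 is a real ASCII character,
# never a fragment of a multi-byte sequence.
mixed = "naïve/путь/<日本>"
ascii_found = bytes(b for b in mixed.encode("utf-8") if b < 0x80)
print(ascii_found)
```

This is why a byte-oriented parser that only looks for / or < keeps working on UTF-8 input without knowing anything about Unicode.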

That backward compatibility is why the transition to Unicode happened at all — no flag day, no conversion of the world's existing files — and it's why ASCII is effectively immortal: it's not a legacy encoding UTF-8 replaced, it's the first 128 codes of UTF-8. English text, JSON keys, HTML tags, and URLs encode one byte per character today because a 1963 teleprinter standard got grandfathered into the future.

Takeaways.

The thing to remember: ASCII is 128 codes in 7 bits, laid out on purpose — digits carry their value in their low bits, the alphabet is contiguous, and case is a single bit. The control characters explain \r\n, \0, and Ctrl-key shortcuts. "Extended ASCII" was never one thing, and UTF-8 deliberately embeds real ASCII as its first 128 codes.

Most standards live a decade or two; ASCII is past sixty and underneath more text than ever. It earned that by being small, cheap to implement, and cleverer than it looks — the rare table worth actually reading.

The whole table, searchable, in your browser.

The ASCII Table tool lists all 128 codes — decimal, hex, octal, binary, character, and name — with instant search. Handy for the "what's 0x1B again?" moments. Entirely client-side, like everything in the workshop.

Made with love by a very serious person pretending not to be. Tooly McToolface is a workshop of free, client-side web tools. If character-level archaeology is your thing, hex, binary, and why programmers count differently explains the notation this article writes codes in, and magic numbers covers what the bytes before the text say about a file.
