Reference 2026-04-16

HTML Entities Deep Dive

When to use HTML entities, when not to, and how UTF-8 changed everything.

HTML entities like &amp;amp; and &amp;copy; were once essential. With UTF-8 now ubiquitous, most are unnecessary, but a few remain critical.

What Entities Are

Three forms encode characters:

&amp;amp;     named entity

&amp;#38;     decimal numeric

&amp;#x26;    hex numeric

All three render as &.
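
The two numeric forms are the same code point written in different bases. The following is a minimal sketch of decoding one by hand; `decodeNumericReference` is a hypothetical helper, not a standard API, and it deliberately ignores named entities, which need a lookup table.

```javascript
// Decode a single numeric character reference such as "&#38;" or "&#x26;".
// Hypothetical helper for illustration; named entities like "&amp;" are not handled.
function decodeNumericReference(ref) {
  const match = /^&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));$/.exec(ref);
  if (!match) return null; // not a numeric reference

  // Group 1 holds hex digits, group 2 holds decimal digits.
  const codePoint = match[1] !== undefined
    ? parseInt(match[1], 16)
    : parseInt(match[2], 10);

  // String.fromCodePoint throws above U+10FFFF, so reject those first.
  if (codePoint > 0x10FFFF) return null;
  return String.fromCodePoint(codePoint);
}

console.log(decodeNumericReference('&#38;'));  // &
console.log(decodeNumericReference('&#x26;')); // &
```

Decimal 38 and hex 26 are the same number, which is why both forms produce the same character.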

The Five You Must Use

These special characters always need escaping in HTML:

| Character | Entity |
|-----------|--------|
| & | &amp;amp; |
| < | &amp;lt; |
| > | &amp;gt; |
| " | &amp;quot; (in attributes) |
| ' | &amp;#39; or &amp;apos; (in attributes) |

Forget the first three and the parser may read your text as markup or as the start of an entity. The last two only matter inside attribute values quoted with the same character.

Everything Else: Use UTF-8

Once HTML pages are served as UTF-8 (default for decades), entities are obsolete for typography:

  • &amp;copy; → ©
  • &amp;hearts; → ♥

Just type the character. Source files are easier to read; output is identical.

When Entities Still Matter

  • Whitespace control: &amp;nbsp; (non-breaking space), &amp;shy; (soft hyphen), &amp;zwj; (zero-width joiner); these are invisible in source if you type them directly.
  • Encoding boundaries: when content is generated by a system that may not preserve UTF-8.
  • Email HTML: some clients still struggle with non-ASCII bytes; entities are safer.
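
For the encoding-boundary and email cases, one option is to convert every non-ASCII character to a numeric reference before the content leaves your system. The sketch below assumes that approach; `toAsciiEntities` is a hypothetical name, and it does not escape &, <, >, or quotes, so it complements normal HTML escaping and does not replace it.

```javascript
// Replace every non-ASCII code point with a hex numeric character reference.
// Hypothetical helper; it leaves ASCII (including & and <) untouched.
function toAsciiEntities(text) {
  // The "u" flag makes the regex match whole code points, so characters
  // outside the Basic Multilingual Plane are not split into surrogate halves.
  return text.replace(/[^\x00-\x7F]/gu, ch =>
    `&#x${ch.codePointAt(0).toString(16).toUpperCase()};`);
}

console.log(toAsciiEntities('Caf\u00E9 \u00A9 2026')); // Caf&#xE9; &#xA9; 2026
```

The output is pure ASCII, so it survives a pipeline that mangles UTF-8 bytes, at the cost of less readable source.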

Escaping User Input

For any user-supplied string rendered into HTML:

// Named escapeHtml to avoid shadowing the deprecated global escape().
const escapeHtml = s => s.replace(/[&<>"']/g, c => ({
  '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;'
}[c]));

Frameworks (React, Vue, Svelte) do this automatically for text bindings; you only need to worry about the raw-HTML escape hatches: dangerouslySetInnerHTML, v-html, and {@html}.

URL Encoding Is Different

HTML entities are not URL encoding. %20 and &amp;amp; solve different problems. Use URL decoding for query strings and HTML decoding for display.
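
The two encodings often apply to the same value, one after the other. Here is a rough sketch of building a link from untrusted input: the value is URL-encoded for the query string first, then the whole URL is HTML-escaped for the attribute. The `searchLink` helper, the `/search` path, and the `lang` parameter are all made up for illustration.

```javascript
// Minimal HTML escaper for text and attribute values.
const escapeHtml = s => s.replace(/[&<>"']/g, c => ({
  '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;'
}[c]));

// Hypothetical helper: build a search link from an untrusted query string.
function searchLink(query) {
  // Step 1: URL-encode the value so it is safe inside a query string.
  const url = `/search?q=${encodeURIComponent(query)}&lang=en`;
  // Step 2: HTML-escape the URL for the attribute and the query for the link text.
  return `<a href="${escapeHtml(url)}">${escapeHtml(query)}</a>`;
}

console.log(searchLink('fish & chips'));
// <a href="/search?q=fish%20%26%20chips&amp;lang=en">fish &amp; chips</a>
```

The ampersand in the user's text becomes %26 inside the URL, while the ampersand that separates query parameters becomes &amp;amp; inside the attribute. Decoding runs in the opposite order: the browser HTML-decodes the attribute first, and the server then URL-decodes the query string.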

Decode and inspect with the [HTML Entity Decoder](https://sdk.is/html-entity-decoder).