HTML Entity Converter — Encode & Decode HTML Entities

Encode special characters (&, <, >) as HTML entities or decode them back to plain text.

🛡 HTML escaping is a fundamental technique for preventing XSS vulnerabilities.

About HTML Entity Converter — Encode & Decode HTML Entities

HTML Entity Converter encodes special characters like <, >, &, and " into HTML entities and decodes them back to plain text. Essential for safely embedding text in HTML pages and preventing XSS vulnerabilities.

How to Use

  1. Paste the text you want to encode or decode into the input area.
  2. Choose "Encode" to convert special characters to HTML entities, or "Decode" to reverse the process.
  3. Click the button and copy the result from the output field.

Features

  • Prevent XSS vulnerabilities by properly escaping HTML special characters
  • Supports both encoding and decoding in one tool
  • Handles all common HTML entities including named and numeric forms
  • No login required — runs instantly in the browser

Understanding HTML Entities and Character Encoding

HTML entities are special text sequences that represent characters that would otherwise be interpreted as HTML markup. Learning when and why to use them is essential for every web developer.

What Are HTML Entities?

HTML entities are short codes that represent characters with special meaning in HTML syntax. The most critical ones are the angle brackets (< and >), which the browser interprets as the start and end of HTML tags, and the ampersand (&), which begins every entity reference. When you want to display these characters as visible text rather than having the browser parse them as markup, you must use their entity equivalents: &lt; for <, &gt; for >, and &amp; for &. Other commonly encoded characters include the double quote (&quot;) and apostrophe (&apos;), which matter inside attribute values. HTML also supports hundreds of named entities for symbols and accented letters, for example &copy; for ©, &eacute; for é, and &mdash; for an em dash. Numeric entities (such as &#60; or &#x3C; for <) are available for every character, including those that have no named form. Understanding entities helps you write robust, standards-compliant HTML that renders as intended regardless of the page's character encoding.
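The encode and decode round trip described above can be sketched with Python's standard `html` module. This is only an illustration; it is not necessarily how this tool is implemented, and the sample string is made up for the example.

```python
import html

# Hypothetical sample input containing all the syntax-sensitive characters.
raw = 'Tom & Jerry say "1 < 2" and 3 > 2'

# html.escape replaces &, < and >, and (with the default quote=True) " and '.
encoded = html.escape(raw)
print(encoded)

# html.unescape converts named and numeric references back to characters.
decoded = html.unescape(encoded)
print(decoded)
```

Decoding the encoded string returns the original text, which is the property an encode/decode tool relies on.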

Named vs. Numeric Entities

HTML entities come in two flavors: named and numeric. Named entities use a descriptive keyword, such as &amp; (ampersand), &lt; (less-than), and &nbsp; (non-breaking space). They are human-readable and widely recognized, but not every character has a named form. Numeric entities reference a character by its Unicode code point, either in decimal form (&#38; for &) or hexadecimal form (&#x26; for &). Every Unicode character can be expressed numerically, making numeric entities more universal. Both forms are fully equivalent to browsers — &amp; and &#38; both render as an ampersand. The choice between them is mostly stylistic: use named entities for readability in hand-written HTML, and numeric entities when you need to represent a character that has no name or when generating HTML programmatically without worrying about entity name lookups.
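A small sketch of the equivalence described above, using Python's standard `html` module. The helper `to_numeric_entity` is a hypothetical name introduced for this example; it shows how a numeric reference can be generated for any character without a name lookup.

```python
import html

# Named, decimal, and hexadecimal references for the same character.
forms = ["&amp;", "&#38;", "&#x26;"]
decoded = [html.unescape(form) for form in forms]
print(decoded)

def to_numeric_entity(ch: str, hexadecimal: bool = False) -> str:
    """Return a numeric character reference for a single character."""
    code_point = ord(ch)
    if hexadecimal:
        return f"&#x{code_point:X};"
    return f"&#{code_point};"

print(to_numeric_entity("&"))                    # decimal form
print(to_numeric_entity("&", hexadecimal=True))  # hexadecimal form
```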

Character Encoding vs. HTML Entities

It is important to distinguish between HTML entity encoding and character encoding. Character encoding (such as UTF-8) defines how characters are mapped to bytes for storage and transmission. As long as your page declares UTF-8 with a proper <meta charset="UTF-8"> tag and your server sends the correct Content-Type header, you can include most Unicode characters, including Japanese, emoji, and accented letters, directly in your HTML source without entity-encoding them. HTML entities, by contrast, are specifically for escaping characters that have syntactic significance in HTML (&, <, >) or that must appear inside attribute values safely. A common misconception is that all non-ASCII characters must be entity-encoded; with modern UTF-8 pages, only the handful of HTML syntax characters truly require escaping.
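To make the distinction concrete, here is a minimal Python sketch that treats the same string in both ways. The sample text is invented for the example, and the ASCII-only step at the end is optional; it is shown only for the case where a page cannot be served as UTF-8.

```python
import html

# Hypothetical sample mixing accented, Japanese, and HTML syntax characters.
text = "café <b>日本語</b> & more"

# Character encoding: how the characters are stored as bytes.
utf8_bytes = text.encode("utf-8")
print(len(text), "characters,", len(utf8_bytes), "bytes in UTF-8")

# Entity escaping: only HTML syntax characters change; non-ASCII text is untouched.
escaped = html.escape(text, quote=False)
print(escaped)

# Optional: turn non-ASCII characters into numeric references so the
# result contains ASCII only.
ascii_only = escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(ascii_only)
```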


HTML Encoding and XSS Prevention

Cross-site scripting (XSS) is one of the most prevalent web security vulnerabilities, and HTML entity encoding is a fundamental defense against it. Understanding how encoding relates to XSS helps you write safer applications.

How XSS Attacks Work

Cross-site scripting occurs when an attacker injects malicious HTML or JavaScript into a web page that is then executed in another user's browser. The classic example is a comment form that accepts user input and renders it directly into the page. If a user submits <script>alert("hacked")</script> and the application outputs it without encoding, the browser executes the script rather than displaying it as text. This can allow attackers to steal session cookies, redirect users to phishing sites, log keystrokes, or deface pages. XSS has featured in every edition of the OWASP Top 10, most recently as part of the Injection category in the 2021 edition. The fundamental cause is always the same: untrusted data is placed into an HTML context without being properly escaped.
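The comment-form scenario above can be reproduced in a few lines of Python. The function name `render_comment_unsafe` and the markup are hypothetical, and the code is deliberately vulnerable to show what goes wrong.

```python
def render_comment_unsafe(comment: str) -> str:
    # Vulnerable on purpose: untrusted input is placed into markup unescaped.
    return f'<li class="comment">{comment}</li>'

payload = '<script>alert("hacked")</script>'
page_fragment = render_comment_unsafe(payload)

# The payload arrives intact, so a browser would parse it as a script element.
print(page_fragment)
```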

How HTML Encoding Prevents XSS

By converting characters like < and > into &lt; and &gt; before inserting user-supplied data into a page, you ensure the browser renders them as visible text rather than parsing them as HTML tags. A script tag becomes &lt;script&gt;alert()&lt;/script&gt;, which displays harmlessly on screen. This technique is called output encoding and is the first and most reliable layer of XSS defense. It must be applied in the correct context: HTML-entity encoding works for text nodes and quoted attribute values, but values placed into URLs in href attributes need URL (percent) encoding, and JavaScript string escaping is needed inside <script> blocks. Using a single encoding strategy for all contexts is insufficient. Modern web frameworks like React, Angular, and Vue automatically HTML-encode dynamic content in templates, but you must still be careful with features that allow raw HTML injection, such as React's dangerouslySetInnerHTML.
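The following sketch shows one way to apply a different encoding per context, using only the Python standard library. The variable names, the `/search?q=` URL, and the payload are assumptions made for the example, and a real application would normally rely on its template engine rather than hand-built strings.

```python
import html
import json
from urllib.parse import quote

# Hypothetical untrusted input that tries to break out of an attribute.
untrusted = '"><script>alert(1)</script>'

# HTML text node or quoted attribute value: entity-encode.
as_html_text = html.escape(untrusted)

# URL component inside an href: percent-encode the value, then entity-encode
# the finished URL because it sits inside an attribute.
url = "/search?q=" + quote(untrusted, safe="")
as_link = f'<a href="{html.escape(url)}">results</a>'

# JavaScript string literal: json.dumps yields a quoted, escaped string.
# Escaping "<" as well keeps "</script>" from closing the block early.
as_js_string = json.dumps(untrusted).replace("<", "\\u003c")

print(as_html_text)
print(as_link)
print(as_js_string)
```

Note that percent-encoding only protects a value embedded in a URL. If the whole URL comes from the user, its scheme also has to be checked, since a `javascript:` URL survives both kinds of encoding.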

FAQ

What characters are encoded?
At minimum <, >, &, ", and ' are encoded. Non-ASCII characters can also be encoded as numeric entities.
What is the difference between named and numeric entities?
Named entities like &amp; are human-readable; numeric entities like &#38; can represent any Unicode character. Both are valid HTML.
Does this protect against XSS?
Encoding HTML entities is a key step in preventing XSS, but full security requires proper context-aware sanitization on the server side as well.
When should I encode HTML entities in web development?
Always encode user-supplied data before inserting it into HTML to prevent Cross-Site Scripting (XSS) attacks. The five characters with special meaning in HTML, < (less-than), > (greater-than), & (ampersand), " (double quote), and ' (apostrophe), must be encoded as &lt;, &gt;, &amp;, &quot;, and &#x27; respectively. Modern web frameworks (React, Vue, Angular) do this automatically for text content, but you still need to encode manually whenever you build raw HTML strings yourself.
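For cases where you do build raw HTML strings by hand, a minimal escaping helper might look like the sketch below. The names `ESCAPES` and `escape_html` are hypothetical; in practice a library routine such as Python's `html.escape` does the same job.

```python
# The five special characters and their replacements.
ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#x27;",
}

def escape_html(text: str) -> str:
    """Replace each special character in a single pass over the input."""
    return "".join(ESCAPES.get(ch, ch) for ch in text)

print(escape_html("<a href='x'>T&C</a>"))
```

Working in a single pass avoids a common bug in chained replacements, where an ampersand produced by an earlier replacement gets escaped again. Escaping text that is already escaped still double-encodes it, so the helper should be applied exactly once, at output time.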
What is the difference between named entities and numeric entities?
Named entities use descriptive names like &nbsp; (non-breaking space), &copy; (copyright ©), and &euro; (€). Numeric entities use decimal (&#169;) or hexadecimal (&#xA9;) Unicode code points. Named entities are more readable but only cover a limited set of characters defined in the HTML specification. Numeric entities can represent any Unicode character, making them the universal option for encoding any character not already covered by a named entity.
