UTF-8 to Hex Converter — Inspect Byte-Level Encoding (Free)
Convert UTF-8 text to its hex byte representation. Configurable separators. Free, in-browser, no signup.
About UTF-8 to Hex Converter
A UTF-8 to hex converter shows a text string as the sequence of hexadecimal byte values that its UTF-8 encoding produces. This is useful for inspecting non-ASCII characters byte by byte, debugging encoding mismatches, and preparing raw bytes for places that accept hex input (Wireshark display filters, hex editors, hand-built network packets). The ZTools UTF-8 to Hex converter handles full Unicode (the BMP plus the supplementary planes, which include emoji and CJK extensions), supports multiple output styles (space, comma, no separator, 0x prefix), and is paired with an inverse Hex-to-UTF-8 converter for round-trip verification.
Use cases
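- Debugging encoding bugs. A character renders as "’" where "’" was expected. Convert the original text to UTF-8 hex (e2 80 99) and compare it with the bytes the system actually received. If the bytes differ, the data was altered in transit; if they match, the bytes are being decoded with the wrong charset.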
- Wireshark / network analysis. Filter rules need byte-level patterns (e.g. `tcp contains 48:65:6c:6c:6f`). Convert "Hello" to hex once.
- Embedding bytes in code. A test fixture or migration script embeds a known string as `\x48\x65\x6c\x6c\x6f` for byte-precision. Convert text once.
- Educational demos. Demonstrate that "€" is 3 UTF-8 bytes (E2 82 AC). Seeing the actual bytes makes the encoding rules tangible.
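The mojibake case in the first bullet can be reproduced in a few lines. This is a minimal Python sketch, assuming the wrong decoder is Windows-1252 (a frequent culprit, but an assumption here); the variable names are illustrative only.

```python
# Reproduce mojibake: UTF-8 bytes read back with the wrong decoder.
original = "\u2019"                     # right single quotation mark
utf8_bytes = original.encode("utf-8")   # the bytes actually stored or sent

# Assumption: the receiving system decodes as Windows-1252 instead of UTF-8.
garbled = utf8_bytes.decode("cp1252")

print(utf8_bytes.hex(" "))   # e2 80 99
print(garbled)               # ’

# Reversing the wrong step recovers the original text.
repaired = garbled.encode("cp1252").decode("utf-8")
```

The bytes are identical on both sides; only the decoding step differs, which is why a hex dump is the quickest way to tell the two situations apart.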
How it works
- Paste text. Any Unicode — Latin, CJK, emoji, etc.
- Encode UTF-8. Each codepoint encodes to 1–4 bytes per UTF-8 rules.
- Format as hex. Each byte → two-digit hex. Pick separator: space (`48 65 6c`), comma (`48,65,6c`), 0x (`0x48 0x65 0x6c`), or no separator.
- Copy. Output to clipboard. Round-trip via Hex-to-UTF-8 to verify.
- Inspect codepoints. Optional view shows codepoint, character, byte sequence side-by-side.
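The encode-and-format steps above can be sketched in Python. This is an illustration, not the ZTools implementation; the function name `utf8_to_hex` and its `style` and `upper` parameters are hypothetical.

```python
def utf8_to_hex(text: str, style: str = "space", upper: bool = False) -> str:
    """Encode text as UTF-8 and format each byte as two hex digits."""
    digits = [f"{b:02X}" if upper else f"{b:02x}" for b in text.encode("utf-8")]
    if style == "space":
        return " ".join(digits)
    if style == "comma":
        return ",".join(digits)
    if style == "0x":
        return " ".join("0x" + d for d in digits)
    if style == "none":
        return "".join(digits)
    raise ValueError(f"unknown style: {style!r}")


print(utf8_to_hex("Hello"))        # 48 65 6c 6c 6f
print(utf8_to_hex("Hel", "0x"))    # 0x48 0x65 0x6c
print(utf8_to_hex("€", "none"))    # e282ac
```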
Examples
Input: Hello (UTF-8 → hex, space-separated)
Output: 48 65 6c 6c 6f
Input: € (UTF-8)
Output: e2 82 ac
Input: 🚀 (UTF-8, 4 bytes)
Output: f0 9f 9a 80
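The examples above can be checked with a small side-by-side view of code point, character, and bytes. This is a sketch of what such a table might contain; the helper name `codepoint_rows` is hypothetical.

```python
def codepoint_rows(text: str) -> list[tuple[str, str, str]]:
    """Return (code point, character, UTF-8 bytes in hex) for each code point."""
    return [(f"U+{ord(ch):04X}", ch, ch.encode("utf-8").hex(" ")) for ch in text]


for row in codepoint_rows("H\u20ac\U0001F680"):
    print(*row, sep="\t")
```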
Frequently asked questions
Why is "Hello" 5 bytes but "héllo" is 6?
ASCII characters fit in 1 byte each. "é" (U+00E9) requires 2 bytes in UTF-8 (c3 a9). So "héllo" is 6 bytes.
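To see where the extra byte comes from, the UTF-8 bit layout can be applied by hand. The sketch below is for illustration (in practice `str.encode` does this for you); `encode_code_point` is a hypothetical helper name.

```python
def encode_code_point(cp: int) -> bytes:
    """Encode one Unicode code point to UTF-8 by hand (1 to 4 bytes)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x80:        # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:       # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


print(encode_code_point(0xE9).hex(" "))   # c3 a9
print(len("Hello".encode("utf-8")), len("h\u00e9llo".encode("utf-8")))   # 5 6
```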
How big can emoji be?
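A single emoji code point takes at most 4 bytes in UTF-8. Emoji in the supplementary planes, such as 🚀 (U+1F680), take 4 bytes; older symbols in the BMP, such as ☀ (U+2600), take 3. An emoji built from several code points (skin-tone modifiers, flags, ZWJ sequences) takes the sum of its parts, so 👍🏽 (U+1F44D U+1F3FD) is 8 bytes.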
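Byte counts for emoji are easy to check directly. A short sketch, with sample characters chosen for illustration; note that what looks like one emoji on screen can be more than one code point.

```python
samples = {
    "sun (U+2600)": "\u2600",
    "rocket (U+1F680)": "\U0001F680",
    "thumbs up + skin tone": "\U0001F44D\U0001F3FD",
}

# Number of UTF-8 bytes for each sample string.
byte_counts = {label: len(s.encode("utf-8")) for label, s in samples.items()}

for label, count in byte_counts.items():
    print(f"{label}: {count} bytes")
```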
Can I convert directly to UTF-16 hex?
Yes — switch the encoding selector to UTF-16 BE / LE. Output is hex of those bytes.
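For comparison, the same idea in Python: the byte order decides which half of each 16-bit unit comes first, and characters outside the BMP become a surrogate pair. This sketch uses Python's built-in codecs and is not tied to how the tool's selector works.

```python
euro = "\u20ac"
print(euro.encode("utf-16-be").hex(" "))     # 20 ac
print(euro.encode("utf-16-le").hex(" "))     # ac 20

rocket = "\U0001F680"
print(rocket.encode("utf-16-be").hex(" "))   # d8 3d de 80 (surrogate pair)
```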
What if my text has a BOM?
A leading BOM (EF BB BF for UTF-8) appears in the hex output. Toggle "strip BOM" to omit it.
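The effect of a BOM on the output can be sketched as follows. The helper `strip_utf8_bom` is a hypothetical stand-in for what a "strip BOM" toggle might do.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 BOM (EF BB BF) if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


with_bom = "Hi".encode("utf-8-sig")        # utf-8-sig prepends the BOM when encoding
print(with_bom.hex(" "))                   # ef bb bf 48 69
print(strip_utf8_bom(with_bom).hex(" "))   # 48 69
```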
Are uppercase or lowercase hex digits used?
Selectable. Lowercase is the modern convention; some legacy systems prefer uppercase.
Does it handle invalid UTF-8?
Input is assumed valid Unicode text from your browser; output is always valid UTF-8 hex. For decoding broken byte streams, use Hex-to-UTF-8 in lenient mode.
Pro tips
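- For Wireshark display filters, write byte sequences with colon separators (`48:65:6c:6c:6f`), as in `tcp contains 48:65:6c:6c:6f`. If you copy space-separated or no-separator output, convert it to colon-separated form before pasting.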
- For embedded code literals, `\xNN` style is most common — switch the format to that.
- Always round-trip through Hex-to-UTF-8. A mismatch points to a dropped or extra byte, or to a stray BOM.
- For mojibake debugging, also dump the bytes the system *thinks* it has — comparing the two reveals the wrong-decode step.
- Use the codepoint table for in-depth investigation — sometimes a "weird character" is a confusable Unicode lookalike.
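Several of these tips can be tried outside the tool. The Python sketch below formats bytes as `\xNN` escapes, checks a round trip, and names two lookalike letters; the helper names are hypothetical.

```python
import unicodedata


def to_x_escapes(text: str) -> str:
    """Format the UTF-8 bytes of text as \\xNN escapes."""
    return "".join(f"\\x{b:02x}" for b in text.encode("utf-8"))


def round_trips(text: str) -> bool:
    """Check that text survives text -> hex -> text unchanged."""
    hex_str = text.encode("utf-8").hex()
    return bytes.fromhex(hex_str).decode("utf-8") == text


print(to_x_escapes("Hello"))                   # \x48\x65\x6c\x6c\x6f
print(round_trips("h\u00e9llo \U0001F680"))    # True

# Two letters that look alike but are different code points.
for ch in ("a", "\u0430"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8").hex(" "))
```

The last loop prints LATIN SMALL LETTER A with byte 61 and CYRILLIC SMALL LETTER A with bytes d0 b0, which is the kind of difference a visual comparison misses.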
Reviewed by Ahsan Mahmood · Last updated 2026-05-05 · Part of ZTools.