
HMAC Explained: How Keyed Hashing Authenticates Messages

How HMAC works — the inner/outer hash construction, ipad/opad, key handling — why hash(key‖message) is insecure, and where HMAC is used (JWT, TLS, API signing).

Published on 10 min read

A plain hash tells you whether data changed. It cannot tell you who produced it, because anyone can recompute a hash over modified data. To prove that a message came from someone holding a shared secret — and that nobody tampered with it in transit — you need a keyed primitive. HMAC is the workhorse for this job: it turns an ordinary hash function like SHA-256 into a message authentication code, and it does so with a construction that is both elegant and provably resistant to attacks that break the obvious naive approaches.

This article explains what a MAC is, why the tempting hash(key ‖ message) design is dangerously broken, how HMAC's nested construction fixes it, and where you encounter HMAC every day.

What a MAC is and what it guarantees

A message authentication code (MAC) is a short tag computed from a message and a secret key. The sender computes tag = MAC(key, message) and transmits (message, tag). The receiver, who shares the same key, recomputes the tag over the received message and checks that it matches.

A secure MAC delivers two guarantees at once:

  • Integrity — if a single bit of the message changes, the recomputed tag will not match, so tampering is detected.
  • Authenticity — only a party holding the key could have produced a valid tag, so a matching tag proves the message originated from someone who knows the secret.
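The sender/receiver flow and both guarantees can be sketched with Python's standard `hmac` module. This is a minimal illustration, assuming HMAC-SHA-256 as the MAC; the key and messages are made-up demo values, and `make_tag` and `verify` are hypothetical helper names.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: tag = MAC(key, message), here HMAC-SHA-256."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

key = b"demo-shared-secret"            # hypothetical key, for illustration only
message = b"transfer 100 to alice"
tag = make_tag(key, message)

tampered = b"transfer 900 to alice"    # one character changed in transit

print(verify(key, message, tag))            # True
print(verify(key, tampered, tag))           # False: integrity check fails
print(verify(b"wrong-key", message, tag))   # False: a different key cannot verify
```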

Formally, a MAC is secure if it is existentially unforgeable under chosen-message attacks: even an adversary who can request valid tags for arbitrary messages of their choosing cannot produce a valid tag for any new message. That bar is higher than mere collision resistance, and it is exactly the bar the naive constructions fail to clear.

Note what a MAC does not provide: confidentiality (the message is not encrypted) and non-repudiation (because the key is shared, the receiver can forge tags too, so you cannot prove to a third party which of the two parties created a given message).

Why hash(key ‖ message) is broken

The intuitive way to build a MAC is to prepend the secret to the message and hash the result: tag = H(key ‖ message). This is the secret-prefix MAC, and on the most widely deployed hash functions it is catastrophically insecure because of length-extension attacks.

The vulnerability comes from the internal structure of MD5, SHA-1, and the SHA-2 family (SHA-256, SHA-512). These are Merkle–Damgård hashes: they process the input in fixed-size blocks, maintaining a running internal state that is updated block by block. Critically, the final hash output is simply that internal state after the last block (after length padding). The mechanics of this block-by-block design are covered in detail in how SHA-256 processes message blocks.

Here is why that breaks the secret-prefix MAC. Suppose an attacker sees a valid tag = H(key ‖ message) and also knows message (but not key). Because the tag is the hash's internal state at the end of processing key ‖ message ‖ padding, the attacker can load that state back into a fresh hash instance and continue hashing from there. They can append arbitrary new bytes and produce:

H(key ‖ message ‖ glue-padding ‖ attacker-data)

— a valid tag for an extended message — without ever knowing the key. The attacker only needs to guess the key's length to reconstruct the glue padding, which is a small search space. This is a real forgery against the unforgeability definition, and it has burned real systems (notably the Flickr API signature scheme).
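The attack can be reproduced in a few dozen lines. The sketch below assumes SHA-256 as the hash and re-implements its compression function in pure Python, because `hashlib` does not let you load an arbitrary internal state. The key, message, and appended data are invented for the demo, and a real attacker would try each candidate key length rather than being handed the right one.

```python
import hashlib
import struct

def _icbrt(n: int) -> int:
    """Integer cube root (floor), by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

# SHA-256 round constants: first 32 fractional bits of the cube roots
# of the first 64 primes (2 .. 311).
PRIMES = [p for p in range(2, 312) if all(p % d for d in range(2, p))]
K = [_icbrt(p << 96) & 0xFFFFFFFF for p in PRIMES]

def _rotr(x: int, n: int) -> int:
    return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF

def compress(state, block: bytes):
    """One SHA-256 compression step over a 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & 0xFFFFFFFF)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[i] + w[i]) & 0xFFFFFFFF
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & 0xFFFFFFFF
        h, g, f, e, d, c, b, a = (g, f, e, (d + t1) & 0xFFFFFFFF,
                                  c, b, a, (t1 + t2) & 0xFFFFFFFF)
    return [(x + y) & 0xFFFFFFFF for x, y in zip(state, (a, b, c, d, e, f, g, h))]

def md_padding(total_len: int) -> bytes:
    """Merkle-Damgard padding for a message of total_len bytes."""
    zeros = (55 - total_len) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", total_len * 8)

def extend(tag_hex: str, known_msg: bytes, key_len: int, extra: bytes):
    """Forge H(key || known_msg || glue || extra) without knowing the key."""
    glue = md_padding(key_len + len(known_msg))
    state = list(struct.unpack(">8I", bytes.fromhex(tag_hex)))  # resume from the tag
    hashed_so_far = key_len + len(known_msg) + len(glue)        # a multiple of 64
    tail = extra + md_padding(hashed_so_far + len(extra))
    for i in range(0, len(tail), 64):
        state = compress(state, tail[i:i + 64])
    return known_msg + glue + extra, struct.pack(">8I", *state).hex()

# Victim: secret-prefix "MAC" over a message the attacker can read.
key = b"server-secret-16"                      # unknown to the attacker
msg = b"user=alice&role=user"
tag = hashlib.sha256(key + msg).hexdigest()

# Attacker: knows msg and tag, guesses len(key) == 16.
forged_msg, forged_tag = extend(tag, msg, 16, b"&role=admin")

# The server's own check accepts the forgery.
assert forged_tag == hashlib.sha256(key + forged_msg).hexdigest()
```

The forged message contains the glue padding bytes in the middle, so the attack only matters when the receiving parser tolerates them, which in practice it often does.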

The secret-suffix variant H(message ‖ key) avoids length extension but leans entirely on collision resistance: two colliding messages produce the same tag regardless of the key. HMAC sidesteps both failure modes.

The HMAC construction

HMAC wraps the hash in two nested invocations with two distinct key-derived pads. Given a hash H with block size B bytes (64 for SHA-256, 128 for SHA-512), a key K, and a message m:

$$\text{HMAC}(K, m) = H\big((K' \oplus \text{opad}) \,\|\, H((K' \oplus \text{ipad}) \,\|\, m)\big)$$

The two constants are block-sized byte strings:

  • ipad = the byte 0x36 repeated B times (the inner pad)
  • opad = the byte 0x5c repeated B times (the outer pad)

K' is the key after preprocessing to exactly one block:

  • If the key is shorter than B, it is right-padded with zero bytes to length B.
  • If the key is longer than B, it is first hashed (K' = H(K)), and the resulting digest is then zero-padded to B. This is why an over-long key is no stronger than a digest-length key — and why supplying a key longer than the block size offers no security benefit.

The two pads differ in four of the eight bit positions of every byte (0x36 ⊕ 0x5c = 0x6a), so K' ⊕ ipad and K' ⊕ opad are always distinct blocks and the inner and outer hashes are effectively keyed independently.
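The key-preprocessing rules have two observable consequences that are easy to check with the standard library. The sketch below assumes HMAC-SHA-256; `preprocess_key` is a hypothetical helper that mirrors the rules above, and the keys are demo values.

```python
import hashlib
import hmac

B = hashlib.sha256().block_size        # 64 bytes for SHA-256

def preprocess_key(key: bytes) -> bytes:
    """Return K': the key hashed if over-long, then zero-padded to one block."""
    if len(key) > B:
        key = hashlib.sha256(key).digest()
    return key.ljust(B, b"\x00")

short_key = b"k" * 16
long_key = b"k" * 100                  # longer than one block
msg = b"hello"

assert len(preprocess_key(short_key)) == B
assert len(preprocess_key(long_key)) == B

# Consequence 1: an over-long key and its SHA-256 digest are interchangeable.
tag_long = hmac.new(long_key, msg, hashlib.sha256).digest()
tag_hashed = hmac.new(hashlib.sha256(long_key).digest(), msg, hashlib.sha256).digest()
assert tag_long == tag_hashed

# Consequence 2: trailing zero bytes on a short key do not change the tag,
# because zero-padding produces the same K' either way.
tag_a = hmac.new(short_key, msg, hashlib.sha256).digest()
tag_b = hmac.new(short_key + b"\x00", msg, hashlib.sha256).digest()
assert tag_a == tag_b
```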

Pseudocode

function HMAC(K, m):
    if length(K) > B:
        K = H(K)                  # hash over-long keys down to a digest
    K = K padded with 0x00 to B bytes

    inner_key = K XOR ipad        # ipad = 0x36 repeated B times
    outer_key = K XOR opad        # opad = 0x5c repeated B times

    inner = H(inner_key || m)     # first (inner) hash
    return H(outer_key || inner)  # second (outer) hash over the inner digest
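
A runnable Python version of the same steps, assuming SHA-256 as H (so B = 64). It is meant for study only: production code should call the standard `hmac` module, which the sketch is checked against here. The function name `hmac_sha256` and the test inputs are illustrative.

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    block_size = 64                                   # B for SHA-256
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()            # hash over-long keys
    key = key.ljust(block_size, b"\x00")              # zero-pad to one block

    inner_key = bytes(b ^ 0x36 for b in key)          # K' XOR ipad
    outer_key = bytes(b ^ 0x5C for b in key)          # K' XOR opad

    inner = hashlib.sha256(inner_key + message).digest()
    return hashlib.sha256(outer_key + inner).digest()

# Cross-check against the standard library for short, block-sized, and long keys.
for demo_key in (b"", b"short", b"x" * 64, b"y" * 200):
    for demo_msg in (b"", b"Hi There", b"z" * 1000):
        expected = hmac.new(demo_key, demo_msg, hashlib.sha256).digest()
        assert hmac_sha256(demo_key, demo_msg) == expected
```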

The output length equals the underlying hash's digest length: 32 bytes for HMAC-SHA-256, 64 bytes for HMAC-SHA-512. You can truncate the tag (HMAC-SHA-256-128 keeps the first 128 bits) when a shorter tag is acceptable.
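A short check of the tag lengths and of truncation, using the standard library. The key and message are demo values, and `verify_truncated` is a hypothetical helper; the important detail is that the verifier truncates its own recomputed tag to the same length before comparing.

```python
import hashlib
import hmac

key = b"demo-key"
msg = b"demo message"

tag256 = hmac.new(key, msg, hashlib.sha256).digest()
tag512 = hmac.new(key, msg, hashlib.sha512).digest()
print(len(tag256), len(tag512))        # 32 64

truncated = tag256[:16]                # leftmost 128 bits of HMAC-SHA-256

def verify_truncated(key: bytes, msg: bytes, tag: bytes, n: int = 16) -> bool:
    """Recompute the full tag, keep the first n bytes, compare in constant time."""
    expected = hmac.new(key, msg, hashlib.sha256).digest()[:n]
    return hmac.compare_digest(expected, tag)

assert verify_truncated(key, msg, truncated)
```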

Why the nesting defeats length extension

The outer hash is what closes the hole. Length extension works only because the attacker can treat a leaked tag as a resumable internal state. In HMAC the value an attacker observes is H(outer_key ‖ inner_digest), the output of the outer hash. They can still resume that state and append data, but the result is a hash of outer_key ‖ inner_digest ‖ glue-padding ‖ attacker-data, and no message has a tag of that shape: a valid tag's outer input is always exactly one keyed block followed by one digest. Forging a tag for an extended message would instead require extending the inner hash, and the inner digest is never exposed; only the final outer digest leaves the box. There is no useful resumable state to attack.

The security rests on a proof by Bellare, Canetti, and Krawczyk. The intuition: if the hash's underlying compression function behaves like a pseudorandom function (PRF) when keyed through its initialization vector, then HMAC is itself a secure PRF, and therefore a secure MAC. The two independent keyed states (inner_key and outer_key) and the fact that the outer hash consumes a fixed, small input (one block of keyed pad plus one digest) are what let the proof go through. This is intuition, not the full reduction — but the upshot is that HMAC's security reduces to a property of the compression function rather than to the hash's full collision resistance.

HMAC versus other primitives

HMAC versus an unkeyed hash. A bare hash provides integrity against accidental corruption only. Anyone can recompute SHA-256(message) over altered data, so a plain digest authenticates nothing against an active adversary. HMAC adds the secret, converting "did this change?" into "did someone with the key produce this?".

HMAC versus digital signatures. Both authenticate, but the trust models differ. HMAC is symmetric: signer and verifier share one secret, so either party can produce valid tags. Digital signatures (RSA, ECDSA, Ed25519) are asymmetric: the signer holds a private key, verifiers hold only the public key, and a valid signature provides non-repudiation — proof of origin that holds up to a third party. HMAC is far cheaper to compute and verify and needs no PKI, which is why it dominates high-throughput, point-to-point scenarios. Signatures win when verifiers must not be able to forge, or when there is no pre-shared secret.

HMAC as a building block for KDFs

HMAC is the engine inside the standard key-derivation functions:

  • HKDF (RFC 5869) is built entirely from HMAC. Its extract step is HMAC(salt, input_key_material) to concentrate entropy into a uniform pseudorandom key, and its expand step iterates HMAC to stretch that key into as many output bytes as needed. This is the KDF used inside TLS 1.3.
  • PBKDF2 repeatedly applies HMAC (typically HMAC-SHA-256) over a password and salt — hundreds of thousands of iterations — to derive a slow-to-brute-force key. The iteration count, HMAC's role, and why memory-hard alternatives like scrypt and Argon2 are often preferred are covered in PBKDF2 versus scrypt for password hashing.

In both cases HMAC's PRF behavior is exactly the property the KDF relies on.
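Both KDFs are short enough to write out in terms of HMAC. The sketch below assumes SHA-256 throughout: the HKDF functions follow the extract and expand steps described in RFC 5869, and `pbkdf2_first_block` computes only the first 32-byte output block of PBKDF2 so it can be compared with `hashlib.pbkdf2_hmac`. Function names, the password, salt, and iteration count are illustrative.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 digest size in bytes

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """PRK = HMAC(salt, input_key_material); an empty salt means HASH_LEN zero bytes."""
    if not salt:
        salt = b"\x00" * HASH_LEN
    return hmac.new(salt, ikm, hashlib.sha256).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """T(i) = HMAC(PRK, T(i-1) || info || i), concatenated and cut to length."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

def pbkdf2_first_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """First output block of PBKDF2-HMAC-SHA-256: XOR of the iterated HMAC chain."""
    u = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
    acc = int.from_bytes(u, "big")
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()
        acc ^= int.from_bytes(u, "big")
    return acc.to_bytes(HASH_LEN, "big")

# HKDF: derive two independent-looking keys from one shared secret.
prk = hkdf_extract(b"demo-salt", b"demo shared secret")
enc_key = hkdf_expand(prk, b"encryption", 32)
mac_key = hkdf_expand(prk, b"authentication", 32)
assert enc_key != mac_key

# PBKDF2: the hand-rolled chain matches the standard library.
assert pbkdf2_first_block(b"hunter2", b"demo-salt", 1000) == \
    hashlib.pbkdf2_hmac("sha256", b"hunter2", b"demo-salt", 1000, 32)
```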

Where HMAC shows up in practice

  • JWT HS256. A JSON Web Token signed with the HS256 algorithm is authenticated by HMAC-SHA-256 over the base64url-encoded header and payload, using a shared secret. (The asymmetric RS256/ES256 variants use real signatures instead.)
  • TLS. Older TLS versions used HMAC for record integrity and in the Finished message; TLS 1.3 moved bulk integrity into AEAD ciphers but still uses HMAC via HKDF throughout its key schedule.
  • AWS Signature Version 4. Each request is signed with a chain of HMAC-SHA-256 operations that derives a scoped signing key from your secret access key, date, region, and service, then HMACs a string-to-sign that includes the hash of the canonical request.
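  • Webhook signatures. Stripe, GitHub, Shopify, and most webhook providers sign payloads with HMAC-SHA-256 so receivers can verify the event genuinely came from the provider and was not tampered with. HMAC alone does not stop a validly signed payload from being replayed; that requires a timestamp or nonce inside the signed data, which the receiver must check.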
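
Two of these uses are small enough to sketch with the standard library. The first pair of functions signs and checks a JWT with HS256; the second derives an AWS Signature Version 4 signing key through its HMAC chain. Treat both as illustrations under stated assumptions: the function names and inputs are made up, and a real JWT verifier must also validate the header's `alg` field and claims such as expiry, which this sketch skips.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """base64url without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def jwt_hs256_sign(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    sig = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)

def jwt_hs256_verify(token: str, secret: bytes) -> bool:
    """Checks only the HMAC; header and claim validation are out of scope here."""
    try:
        signing_input, sig_b64 = token.rsplit(".", 1)
    except ValueError:
        return False
    expected = b64url(hmac.new(secret, signing_input.encode(), hashlib.sha256).digest())
    return hmac.compare_digest(expected.encode(), sig_b64.encode())

def sigv4_signing_key(secret_access_key: str, date: str, region: str, service: str) -> bytes:
    """HMAC chain: date -> region -> service -> the literal 'aws4_request'."""
    k = ("AWS4" + secret_access_key).encode()
    for part in (date, region, service, "aws4_request"):
        k = hmac.new(k, part.encode(), hashlib.sha256).digest()
    return k

token = jwt_hs256_sign({"sub": "1234567890", "name": "Jane Doe"}, b"demo-secret")
assert jwt_hs256_verify(token, b"demo-secret")
assert not jwt_hs256_verify(token, b"other-secret")

signing_key = sigv4_signing_key("demoSecretAccessKey", "20240101", "us-east-1", "s3")
assert len(signing_key) == 32
```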

Verify tags in constant time

A subtle but critical implementation detail: always compare MACs with a timing-safe comparison. A naive byte-by-byte comparison that returns early on the first mismatch leaks, through its runtime, how many leading bytes were correct — enough for an attacker to forge a tag one byte at a time over many requests. Use a constant-time routine such as crypto.timingSafeEqual (Node), hmac.compare_digest (Python), or subtle.ConstantTimeCompare (Go). Recompute the expected tag and compare the full fixed-length buffers.
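In Python the difference looks like this. The sketch assumes a webhook-style setup where the sender transmits a hex-encoded HMAC-SHA-256 of the raw body; that encoding, the function names, and the secret are assumptions for the example, not any particular provider's format.

```python
import hashlib
import hmac

def naive_equal(a: bytes, b: bytes) -> bool:
    """DO NOT USE for MACs: returns at the first mismatching byte."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False       # early exit: runtime depends on the matching prefix
    return True

def verify_webhook(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Recompute the tag and compare the full buffers in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).digest()
    try:
        received = bytes.fromhex(signature_hex)
    except ValueError:
        return False           # not valid hex, so it cannot be a valid tag
    return hmac.compare_digest(expected, received)

secret = b"demo-webhook-secret"
body = b'{"event":"payment.succeeded"}'
good_sig = hmac.new(secret, body, hashlib.sha256).hexdigest()

assert verify_webhook(secret, body, good_sig)
assert not verify_webhook(secret, body + b" ", good_sig)
assert not verify_webhook(secret, body, "not-hex")
```

Verifying over the raw request bytes, before any JSON parsing or re-serialization, avoids mismatches caused by whitespace or key-order changes.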

Security notes and algorithm choice

A famous nuance: HMAC can remain secure even when the underlying hash's collision resistance is broken. HMAC-MD5 and HMAC-SHA-1 are not broken by the collision attacks that destroyed MD5 and SHA-1 for signatures, because HMAC's security depends on the compression function being a PRF, not on the hash being collision-resistant. The key is secret, so an attacker cannot run the offline collision search those attacks rely on.

That said, this is no reason to deploy them. Use HMAC-SHA-256 or stronger for anything new: you get a comfortable security margin, the underlying hash is sound for every other purpose, and you avoid future surprises. For high performance on modern CPUs without hardware SHA acceleration, the BLAKE2 and BLAKE3 keyed-hashing modes offer built-in MAC functionality that is often faster than HMAC-SHA-256. To understand the Merkle–Damgård structure underneath all of this — and why keyed constructions matter — start with the pillar overview of how hashing works.
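For comparison, Python's `hashlib` exposes BLAKE2's keyed mode directly through the `key` parameter, with no HMAC wrapper needed. A minimal sketch with a demo key and message (BLAKE3 is not in the standard library, so it is not shown):

```python
import hashlib
import hmac

key = b"k" * 32                        # demo key; blake2b accepts keys up to 64 bytes
message = b"demo message"

# Keyed BLAKE2b with a 32-byte tag: the key is mixed in by the hash itself.
blake_tag = hashlib.blake2b(message, key=key, digest_size=32).digest()

# The HMAC-SHA-256 equivalent, for comparison.
hmac_tag = hmac.new(key, message, hashlib.sha256).digest()

def verify_blake2(key: bytes, message: bytes, tag: bytes) -> bool:
    expected = hashlib.blake2b(message, key=key, digest_size=32).digest()
    return hmac.compare_digest(expected, tag)

assert verify_blake2(key, message, blake_tag)
assert len(blake_tag) == len(hmac_tag) == 32
```

The two tags are unrelated values, so both sides of a protocol must agree on which construction is in use.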

Key hygiene matters too: use a key with at least as much entropy as the digest is wide (256 bits for SHA-256), generate it from a CSPRNG, and never reuse an authentication key for encryption.
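A sketch of both points in Python. `secrets.token_bytes` draws from the operating system's CSPRNG; `derive_subkey` is a hypothetical, deliberately simple way to get separate keys from one master secret by HMACing distinct labels. For real key separation, HKDF is the standard tool.

```python
import hashlib
import hmac
import secrets

# 32 random bytes = 256 bits, matching the SHA-256 digest width.
master_key = secrets.token_bytes(32)

def derive_subkey(master: bytes, label: bytes) -> bytes:
    """Illustrative key separation: one subkey per purpose label."""
    return hmac.new(master, label, hashlib.sha256).digest()

auth_key = derive_subkey(master_key, b"authentication")
enc_key = derive_subkey(master_key, b"encryption")

assert len(master_key) == 32
assert auth_key != enc_key             # each purpose gets its own key
```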

Try it in your browser

You can experiment with keyed hashing directly in our tool: compute HMAC-SHA256 in your browser. The generator has an HMAC key option, so you can paste a secret and a message and see the tag instantly. Everything runs client-side in WebAssembly — your key and message never leave the page and nothing is uploaded to any server, which makes it safe to test with real secrets.

Conclusion

HMAC takes a primitive built for integrity and, by nesting it around a secret with two carefully chosen pads, produces a MAC that delivers both integrity and authenticity — while sidestepping the length-extension and collision pitfalls that sink the naive hash(key ‖ message) design. Its provable reduction to the compression function's PRF property is why it has survived weaknesses in the very hashes it is built on, and why it underpins JWTs, TLS, cloud request signing, and the KDFs that protect passwords. Reach for HMAC-SHA-256, compare tags in constant time, and when you want to see exactly what a tag looks like, try the in-browser HMAC generator.
