ML-DSA: To Pre-Hash or Not to Pre-Hash
The TL;DR: Use pure ML-DSA.
FIPS 204 gives you two signing modes: pure ML-DSA and HashML-DSA. If you have come across this and are deciding between them, use pure ML-DSA. Every major protocol specification published since the standard's release (RFC 9881 for X.509, RFC 9882 for CMS, the TLS 1.3 ML-DSA draft) and the NSA's CNSA 2.0 guidance have either prohibited or excluded HashML-DSA.
Pure ML-DSA
In ECDSA or RSA-PSS, your application hashes the message and then signs the digest as two separate operations. ML-DSA is different: it takes the entire message as input. Internally, it uses SHAKE-256 to hash the message together with a digest of the signer's public key (a value called tr), producing a 64-byte message representative called mu, which binds the message to the signer's public key. All of the lattice arithmetic that follows operates only on mu, the private key, and randomness. The message is never touched again after mu is computed.
This design gives ML-DSA a property called non-resignability. Because mu is derived in part from the signer's public key, an attacker who obtains mu cannot reuse it to forge a signature under a different key. ECDSA digests don't have such a binding.
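The binding is easy to see in the mu derivation itself. Here is a minimal sketch (empty context string; the tr values are stand-ins derived from placeholder bytes, not real ML-DSA public keys):

```python
import hashlib

def compute_mu(tr: bytes, message: bytes) -> bytes:
    # FIPS 204 pure mode with an empty context string: the two prefix
    # bytes are the pure-mode domain separator (0x00) and the length
    # of the empty context string (0x00).
    return hashlib.shake_256(tr + b"\x00\x00" + message).digest(64)

# Stand-in tr values for two different signers (tr is normally the
# 64-byte SHAKE-256 hash of the actual public key).
tr_alice = hashlib.shake_256(b"alice-public-key").digest(64)
tr_bob = hashlib.shake_256(b"bob-public-key").digest(64)
msg = b"transfer 100 coins"

# Same message, different key, different mu: an attacker who captures
# Alice's mu cannot reuse it to forge a signature under Bob's key.
assert compute_mu(tr_alice, msg) != compute_mu(tr_bob, msg)
```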
Pure ML-DSA also keeps the signature scheme self-contained. The security proof covers one primitive with one set of assumptions (Module-LWE, Module-SIS, properties of SHAKE-256). There is one OID, one Verify() routine, and no external parameters. Every verifier that supports ML-DSA can verify your signatures without needing to know anything about how you sign messages.
HashML-DSA
HashML-DSA was added to FIPS 204 (Section 5.4) because implementers asked for a standardized way to pre-hash messages before signing, similar to the hash-then-sign pattern used with ECDSA. In HashML-DSA, you hash the message with an approved function like SHA-512, then pass the digest into a modified ML-DSA algorithm that flips a domain separation byte and encodes the hash function's OID into the signed data. The result is a different signature from what pure ML-DSA would produce over the same content.
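The framing difference is concrete: HashML-DSA signs a differently encoded input. A sketch of that encoding for SHA-512 with an empty context string (the OID bytes are the standard DER encoding of 2.16.840.1.101.3.4.2.3; treat the exact layout as an illustration of Section 5.4 rather than a drop-in implementation):

```python
import hashlib

# DER encoding of the SHA-512 OID, 2.16.840.1.101.3.4.2.3.
SHA512_OID_DER = bytes.fromhex("0609608648016503040203")

def hash_mldsa_signed_input(message: bytes) -> bytes:
    # Domain separator 0x01 (pure mode uses 0x00), empty context
    # (length byte 0x00), then the hash OID and the pre-hash digest.
    ph = hashlib.sha512(message).digest()
    return b"\x01" + b"\x00" + SHA512_OID_DER + ph

encoded = hash_mldsa_signed_input(b"hello")
# The flipped domain byte guarantees a pure-mode signature over the
# same content can never collide with a HashML-DSA one.
assert encoded[0] == 0x01
```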
The core problem with HashML-DSA is that it turns the hash function into a formal parameter of the signature primitive. With pure ML-DSA, if your protocol hashes the message before signing, ML-DSA does not know or care. The hash function choice lives in your protocol specification, external to the primitive, irrelevant to its security proof. With HashML-DSA, the hash function is inside the primitive: its OID is encoded in the signed data, the security argument formally depends on its collision resistance, and the verifier must know which hash function was used.
This creates a dilemma. Either the hash function is fixed by your protocol, in which case HashML-DSA is redundant (your protocol could specify application-layer hashing followed by pure ML-DSA and achieve the same result), or the hash function varies and must be communicated to the verifier as an untrusted parameter. The latter is the same structural weakness that has produced repeated vulnerabilities in systems like JWT, where the algorithm identifier travels with the token and can be manipulated.
There is also an interoperability cost. HashML-DSA uses separate OIDs and requires a different Verify() routine. RFC 9881 lists this as the primary reason for prohibiting it in X.509: supporting both modes would force operators to commit a key to pure or pre-hash at certificate creation time, before they necessarily know which mode they will need.
HashML-DSA also removes a secondary security property that pure ML-DSA provides for free. In pure ML-DSA, mu is computed by hashing the message together with tr (the public key fingerprint). A collision attack against the internal hash would need to target a specific public key: the attacker must find m1 and m2 where H(tr || m1) = H(tr || m2) (simplified for brevity) for a particular tr. In HashML-DSA, the pre-hash is computed before tr is mixed in, so a single collision H(m1) = H(m2) works against every key that ever signs using that hash function.
Handling large messages
Pure ML-DSA takes the full message as input, which is not always practical. For large payloads or constrained signing interfaces, there are two ways to reduce the input size while staying within pure ML-DSA.
Application-layer hashing
Nothing in FIPS 204 restricts what you pass to pure ML-DSA as the message. If you compute SHA-512(some_big_payload) in your application and hand that 64-byte digest to ML-DSA, it will hash it again internally (binding it to the public key via mu), produce a signature, and that signature will be a valid pure ML-DSA signature. The verifier does not need to know whether the original input was raw data or a digest. From ML-DSA's perspective, the digest is the message. The double hash costs one extra hash operation (computers are fast) and you keep non-resignability because mu is still bound to the public key.
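In practice this is a one-line change at the call site. The hashing below is concrete; the sign and verify calls are hypothetical placeholders for whatever ML-DSA library you use:

```python
import hashlib

big_payload = b"..." * 1_000_000  # stand-in for multi-megabyte content

# The 64-byte digest simply *is* the message from ML-DSA's point of view.
digest = hashlib.sha512(big_payload).digest()
assert len(digest) == 64

# Hypothetical library calls -- pure ML-DSA, no special mode:
# signature = ml_dsa.sign(private_key, digest)
# assert ml_dsa.verify(public_key, digest, signature)
# The verifier recomputes sha512(big_payload) itself; ML-DSA never
# learns the input was a digest rather than raw data.
```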
If the application-layer hash function weakens (say someone finds SHA-512 collisions), an attacker could construct two messages that produce the same digest, and a signature over that digest would apply to both. This is definitely a risk, but it is a protocol-layer risk identical to the one ECDSA carries today. ML-DSA signed a specific value, and that signature cannot be forged. Whether that value uniquely maps to one original message is your protocol's problem, not ML-DSA's.
The choice of which hash function to use, when to apply it, and how to handle the mapping between original content and signed content are all protocol decisions. They belong in your protocol specification, documented alongside your message format and signing policy. This is where hash function choices have always lived in well-designed systems. Keeping them at the protocol layer means ML-DSA remains a self-contained primitive with a pretty clean security proof. This approach works everywhere, requires nothing special, and is the simplest option for most systems.
External Mu
Application-layer hashing reduces the message to a digest, which is then passed to a standard ML-DSA.Sign() call. External Mu goes one step further by splitting ML-DSA's own internal computation across two modules.
ML-DSA computes mu from only the public key and the message. Everything after mu depends only on mu, the private key, and randomness. FIPS 204 Algorithm 7 includes an explicit comment on this: the message representative "may optionally be computed in a different cryptographic module". They even have a nice diagram for this. A module with access to the public key computes mu = SHAKE-256(tr || 0x00 || 0x00 || M, 64), where M is the message and tr is the public key fingerprint. That 64-byte mu is sent to the signing module, which calls ML-DSA.Sign_mu(sk, mu), a distinct function from ML-DSA.Sign(), to complete the signature. Sign_mu picks up the algorithm at exactly the point where mu has already been computed, executing only the lattice arithmetic and rejection sampling. The result is a standard pure ML-DSA signature, indistinguishable from one produced in a single module. The advantage over application-layer hashing is that the signing module is executing a FIPS-validated portion of the ML-DSA algorithm, not signing an opaque blob.
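The split looks something like this. The front-end mu computation matches the formula above (empty context string); ML_DSA.Sign_mu is a hypothetical stand-in for a library's internal-mu entry point, and tr here is derived from placeholder bytes rather than a real key:

```python
import hashlib

def frontend_compute_mu(tr: bytes, message: bytes) -> bytes:
    # Runs in the module that holds only public key material.
    # mu = SHAKE-256(tr || 0x00 || 0x00 || M, 64) for an empty context.
    return hashlib.shake_256(tr + b"\x00\x00" + message).digest(64)

tr = hashlib.shake_256(b"signer-public-key").digest(64)  # stand-in tr
mu = frontend_compute_mu(tr, b"large message body")
assert len(mu) == 64

# Only these 64 bytes cross the module boundary. The signing module
# picks up the algorithm from mu onward (lattice arithmetic and
# rejection sampling only):
# signature = ML_DSA.Sign_mu(sk, mu)  # hypothetical library call
```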
In summary, use pure ML-DSA.