Common DMARC XML report errors (and what they actually mean)
Most DMARC aggregate reports parse cleanly, but a handful of receiver-specific quirks trip up parsers. Here is what each error usually means and whether it is safe to skip.
TL;DR
- 1 Missing envelope_from / header_from: the RFC requires them, but many ESPs omit them. Skip the record, don't fail the report.
- 2 Empty selector field: legitimate — many receivers don't fill it. Don't fail the report.
- 3 Malformed disposition value: receiver bug, treat as 'none' and continue.
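As a rough illustration, the three rules above could be applied to a single record like this. It is a minimal sketch using Python's standard xml.etree.ElementTree, not PhishFence's actual code: the names summarize_record, normalize_disposition and REQUIRED_IDENTIFIERS are hypothetical, the report is assumed to carry no XML namespace, and skipping when either identifier is missing is one reading of rule 1.

```python
import xml.etree.ElementTree as ET

# Disposition values defined by the RFC 7489 schema.
VALID_DISPOSITIONS = {"none", "quarantine", "reject"}

# Assumption: rule 1 is read as "skip the record if either one is missing".
REQUIRED_IDENTIFIERS = ("envelope_from", "header_from")


def normalize_disposition(raw):
    """Rule 3: anything that is not a valid disposition is treated as 'none'."""
    value = (raw or "").strip().lower()
    return value if value in VALID_DISPOSITIONS else "none"


def summarize_record(record):
    """Return a dict for one <record> element, or None if it should be skipped."""
    identifiers = {
        name: (record.findtext(f"identifiers/{name}") or "").strip()
        for name in REQUIRED_IDENTIFIERS
    }
    if not all(identifiers.values()):
        return None  # rule 1: skip this record, keep the rest of the report

    # Rule 2: an absent or empty selector is legitimate and becomes "".
    selector = (record.findtext("auth_results/dkim/selector") or "").strip()

    return {
        **identifiers,
        "selector": selector,
        "disposition": normalize_disposition(
            record.findtext("row/policy_evaluated/disposition")
        ),
    }
```

A caller would drop the None results and keep everything else, so one incomplete record never costs the whole report.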
What it does
DMARC aggregate report XML is governed by RFC 7489 and the XML schema in its Appendix C. The schema specifies which fields are required (minOccurs=1) and which are optional. Almost every receiver violates the schema somewhere: some omit selectors, some put the policy reason in the wrong element, some send invalid date ranges.
Your parser has a choice: fail-fast (drop the report on the first schema violation) or fail-tolerant (extract what's parseable, skip the bad bits). PhishFence's parser is fail-tolerant by design — a partial report is more useful than no report.
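The difference between the two strategies comes down to where the exception handling sits. The sketch below is illustrative only: parse_record and parse_report are hypothetical names, the per-record extractor is deliberately tiny, and the report is assumed to use un-namespaced element names.

```python
import logging
import xml.etree.ElementTree as ET

log = logging.getLogger("dmarc.parser")


def parse_record(record):
    """Hypothetical per-record extractor; raises ValueError on a bad record."""
    count_text = record.findtext("row/count")
    if count_text is None:
        raise ValueError("missing row/count")
    return {
        "source_ip": record.findtext("row/source_ip") or "",
        "count": int(count_text),  # int() raises ValueError on non-numeric text
    }


def parse_report(xml_text, fail_fast=False):
    """Return (records, skipped).

    fail_fast=True  -> the first bad record aborts the whole report.
    fail_fast=False -> bad records are logged and skipped.
    """
    root = ET.fromstring(xml_text)
    records, skipped = [], 0
    for index, record in enumerate(root.iter("record")):
        try:
            records.append(parse_record(record))
        except ValueError as exc:
            if fail_fast:
                raise
            log.warning("skipping record %d: %s", index, exc)
            skipped += 1
    return records, skipped
```

Returning the skipped count alongside the records keeps the tolerance visible: a report that parsed with many skips can still be flagged for review.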
How it works
- 1 On parse error: log and skip the affected record, then continue with the rest of the report.
- 2 On missing field: use the empty-string / None default and continue.
- 3 On schema-divergent layout (reason under auth_results rather than under policy_evaluated, where the RFC 7489 schema puts it): fall back to the alternative location.
- 4 On unparseable dates: skip the date and use received_at from the email envelope.
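Steps 2 and 3 above can be sketched as two small helpers. This is an illustration under assumptions, not the parser's real implementation: text_or_default, extract_reasons and REASON_PATHS are hypothetical names, and the exact path of the alternative reason location is a guess based on the description in step 3.

```python
import xml.etree.ElementTree as ET

# Candidate locations for override reasons, tried in order. The first is where
# the RFC 7489 schema defines <reason>; the second is an assumed path for the
# divergent layout mentioned in step 3.
REASON_PATHS = ("row/policy_evaluated/reason", "auth_results/reason")


def text_or_default(element, path, default=""):
    """Step 2: a missing or blank field yields the default instead of an error."""
    found = element.findtext(path)
    return found.strip() if found and found.strip() else default


def extract_reasons(record):
    """Step 3: return reasons from the first location that has any."""
    for path in REASON_PATHS:
        reasons = [
            {
                "type": text_or_default(node, "type"),
                "comment": text_or_default(node, "comment", None),
            }
            for node in record.findall(path)
        ]
        if reasons:
            return reasons
    return []
```

Keeping the candidate paths in one tuple means a newly observed receiver layout is a one-line addition rather than a new code branch.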
Common pitfalls
- Dropping the whole report on one bad record. You lose 99 records' worth of signal because of one bug in the receiver's emitter.
- Trusting the report's date_range. Some receivers send reports with end < begin, or with timestamps in receiver-local time. Use the email envelope's received_at as a fallback.
- Parsing only with strict XSD validation. The RFC's own schema disagrees with the prose in places (for example, minOccurs on the fo element); strict parsers reject ~30% of real-world reports.
- Ignoring the policy_evaluated/reason override. Receivers use this to explain why they did NOT apply the published policy (forwarded mail, mailing list rewrite, etc.). Losing this data hides real auth signal.
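The date_range pitfall is the easiest one to guard against in code. The sketch below assumes begin and end arrive as epoch-second strings (as the RFC 7489 schema defines them) and that the caller already has a timezone-aware received_at from the email envelope. The function name resolve_date_range and the choice to use received_at for both bounds are assumptions for illustration. It only catches unparseable and inverted ranges; timestamps sent in receiver-local time cannot be detected this way.

```python
from datetime import datetime, timezone


def resolve_date_range(begin_text, end_text, received_at):
    """Return (begin, end, source), where source is 'report' or 'envelope'.

    Falls back to received_at when the report's own range is unusable.
    """
    try:
        begin = datetime.fromtimestamp(int(begin_text), tz=timezone.utc)
        end = datetime.fromtimestamp(int(end_text), tz=timezone.utc)
    except (TypeError, ValueError, OverflowError, OSError):
        # Missing, non-numeric, or out-of-range timestamps.
        return received_at, received_at, "envelope"
    if end < begin:
        # Inverted range: do not trust either bound.
        return received_at, received_at, "envelope"
    return begin, end, "report"


received = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
print(resolve_date_range("1704067200", "1704153599", received)[2])  # report
print(resolve_date_range("1704153599", "1704067200", received)[2])  # envelope
```

Recording the source alongside the range lets downstream reporting distinguish receiver-stated periods from envelope-derived ones.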