
Understanding PhishFence confidence levels

What each alert confidence band (registered, suspected, likely, confirmed) means, what signals drive it, and how to triage.

TL;DR

1. Confirmed phishing: brand + credential form + threat-intel hit. Take down immediately.
2. Likely phishing: brand + active infrastructure (HTTP, SSL, MX) but no external confirmation yet. Treat as confirmed.
3. Suspected impersonation / Registered lookalike: monitor, and investigate based on context rather than treating either as an emergency by default.

What confidence levels mean

Detection without prioritization is just noise. A single scan against your brand can surface hundreds of registered lookalike domains; the vast majority are inactive, parked, or unrelated to phishing. The job of confidence scoring is to separate "this is happening, act now" from "this exists, keep watching", so your team's response time goes to the right alerts.

PhishFence buckets every detection into one of four confidence levels based on the signals it collects on each scan. The bucket determines the recommended action: confirmed and likely both warrant immediate takedown work; suspected and registered are monitoring states. The bucket can change between scans as new infrastructure comes online or as threat-intel feeds confirm.

Buckets aren't a score; they're a routing rule. The point is to spend triage time on the alerts that need it and skip the ones that don't, without missing real threats by under-categorizing them.
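To make the "routing rule" idea concrete, here is a minimal sketch in Python. The `Bucket` enum, the `RECOMMENDED_ACTION` table, and both function names are hypothetical and exist only for illustration; they are not PhishFence's API. The mapping itself paraphrases the text above: confirmed and likely route to takedown work, suspected and registered route to monitoring.

```python
from enum import Enum


class Bucket(Enum):
    """The four confidence levels described above (names are illustrative)."""
    REGISTERED = "registered"
    SUSPECTED = "suspected"
    LIKELY = "likely"
    CONFIRMED = "confirmed"


# Recommended action per bucket, paraphrased from the text above.
RECOMMENDED_ACTION = {
    Bucket.CONFIRMED: "takedown",
    Bucket.LIKELY: "takedown",
    Bucket.SUSPECTED: "monitor",
    Bucket.REGISTERED: "monitor",
}


def route(bucket: Bucket) -> str:
    """Return the recommended action for a bucket."""
    return RECOMMENDED_ACTION[bucket]


def changed_route(previous: Bucket, current: Bucket) -> bool:
    """True when a rescan moves an alert to a different recommended action."""
    return route(previous) != route(current)
```

Because the bucket can change between scans, a check like `changed_route(Bucket.REGISTERED, Bucket.LIKELY)` is the kind of transition a team would probably want to be notified about, while a move from likely to confirmed leaves the recommended action unchanged.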

How each bucket is computed

1. Confirmed phishing. Brand match (typosquat / homoglyph / combosquat) + live HTTP + credential harvest form (login fields, payment form, or known phishing-kit fingerprint) + at least one threat-intel hit (URLhaus, PhishTank, Google Safe Browsing, VirusTotal). When all four agree, the case is closed: take it down.
2. Likely phishing. Brand match + live HTTP + at least one of {MX records, valid SSL, content references your brand, suspicious hosting}. Lacks the threat-intel confirmation of Confirmed, but the infrastructure tells you a phishing campaign is imminent. Treat as confirmed for response.
3. Suspected impersonation. Brand match + live HTTP, but the content doesn't trip the phishing classifier. Could be a parody site, a competitor, a fan site, or pre-content phishing infrastructure. Investigate manually; don't auto-escalate.
4. Registered lookalike. Brand match + DNS resolution, but minimal or no active infrastructure (no HTTP, no MX, no SSL). The variant exists but isn't doing anything yet. Monitor: if signals come online later, the bucket promotes automatically.
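The four rules above can be read as a top-down decision list. The following Python sketch applies them literally, checking the strongest bucket first. The `Signals` dataclass, its field names, and the `classify` function are assumptions made for this example, not PhishFence's internal data model, and the real classifier may weigh signals differently (for instance, this sketch puts a credential form with no threat-intel hit and no other infrastructure signal in the suspected bucket, which the rules above do not spell out).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signals:
    """Hypothetical per-scan signals; field names are illustrative only."""
    brand_match: bool = False
    dns_resolves: bool = False
    live_http: bool = False
    credential_form: bool = False
    threat_intel_hits: int = 0
    mx_records: bool = False
    valid_ssl: bool = False
    content_references_brand: bool = False
    suspicious_hosting: bool = False


def classify(s: Signals) -> Optional[str]:
    """Return a bucket name, or None when there is nothing to alert on."""
    if not s.brand_match:
        return None  # every bucket requires a brand match
    # Confirmed: live HTTP + credential form + at least one threat-intel hit.
    if s.live_http and s.credential_form and s.threat_intel_hits >= 1:
        return "confirmed"
    # Likely: live HTTP + at least one supporting infrastructure signal.
    supporting = (s.mx_records, s.valid_ssl,
                  s.content_references_brand, s.suspicious_hosting)
    if s.live_http and any(supporting):
        return "likely"
    # Suspected: live HTTP, but nothing stronger.
    if s.live_http:
        return "suspected"
    # Registered: the domain resolves, but no live HTTP yet.
    if s.dns_resolves:
        return "registered"
    return None
```

Running `classify` again on a later scan is how promotion falls out of this model: a domain that returned `"registered"` yesterday returns `"likely"` once `live_http` and `valid_ssl` become true.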

Common pitfalls

Treating Registered lookalikes as emergencies. Most never become active. Acting on every registered variant burns your team's time and creates alert fatigue that hides real threats.
Ignoring Likely phishing because there's no threat-intel hit. Threat-intel feeds have latency. A campaign launches before it appears in any feed; the infrastructure signals (active HTTP + SSL + MX + cloned content) are leading indicators.
Manual confidence overrides without audit logging. Sometimes you need to demote a Confirmed bucket because it's actually a fan site. Do it through the per-alert override (which logs the actor + reason), not by ignoring the alert. Otherwise you lose the institutional memory.
Reading confidence as a probability number. It's a routing decision based on signal combinations, not a calibrated probability. "Likely" doesn't mean "50% probability"; it means "active infrastructure with brand match, treat as a real threat."
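For the override pitfall above, it may help to see what "logs the actor + reason" amounts to as data. This is a minimal sketch, assuming a hypothetical `Alert` record and `override_bucket` helper; PhishFence's actual per-alert override is not shown in this guide, so treat every name here as illustrative.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Alert:
    """Hypothetical alert record with an append-only audit log."""
    domain: str
    bucket: str
    audit_log: list = field(default_factory=list)


def override_bucket(alert: Alert, new_bucket: str, actor: str, reason: str) -> None:
    """Change an alert's bucket and record who did it and why."""
    if not actor.strip() or not reason.strip():
        raise ValueError("an override needs both an actor and a reason")
    alert.audit_log.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "reason": reason,
        "from": alert.bucket,
        "to": new_bucket,
    })
    alert.bucket = new_bucket


# Usage: demote a confirmed alert that turned out to be a fan site.
alert = Alert(domain="brand-fans.example", bucket="confirmed")
override_bucket(alert, "suspected", actor="analyst@example.com",
                reason="Verified fan site, no credential form in use")
```

Refusing an override that lacks an actor or reason is the design choice that preserves institutional memory: the next analyst can see why the bucket changed instead of guessing.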
