Detecting and preventing misdirected emails with correspondence semantics
Identifying whether the recipient of a particular email appears correct in the wider context of user activity.
Data breaches do not always start with malicious actors; they are sometimes the result of human error. Misdirecting an email is a risk for every employee and every organization. Detecting and preventing misdirected emails has the potential to stop accidental personal data breaches before they happen.
Detecting misdirected emails involves more than checking whether the sender and the recipient are known correspondents. Whether through a simple mistake or misplaced trust in address-field autocomplete, it is easy to send an email to the wrong previous correspondent, so a familiar sender-recipient pair does not guarantee that the recipient is the intended one. Accurate classification must also consider the content of the message.
A practicable solution must also overcome two challenges. First, the training set of confirmed positive matches is very small. Misdirected emails are uncommon and, because they are embarrassing, seldom reported, which makes supervised training of a classifier difficult.
Second, response time is critical to functionality in this case. Intervention must happen before the email leaves the sender’s outbox. However, maintaining an acceptable response time is problematic because tracking correspondence within networks of varying size requires dynamic scaling.
We have made a classifier that tracks semantics shared over a network of correspondents. The classifier detects anomalous messages by comparing message content to that of previous communications between sender and recipient. It uses unsupervised learning to detect anomalies without relying on reported and confirmed breaches, and it produces anomaly scores that are weighted against previous correspondence from the entire network. Computational complexity is constant in the number of correspondents, while memory use grows sublinearly.
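To make the idea of comparing a message against previous correspondence concrete, here is a minimal sketch in Python. It is not the classifier described above: it assumes a simple hashed bag-of-words representation and a cosine distance, and the names `DIM`, `embed`, `PairProfiles`, `observe`, and `anomaly_score` are hypothetical. It also omits the network-wide weighting, and its memory grows linearly with the number of sender-recipient pairs rather than sublinearly.

```python
import math
import re
import zlib
from collections import defaultdict

DIM = 1024  # fixed vector length (hypothetical choice for this sketch)


def embed(text):
    """Hash lowercase word tokens into a fixed-length count vector."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # crc32 is used because it is deterministic across processes.
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class PairProfiles:
    """Keeps one running content profile per (sender, recipient) pair."""

    def __init__(self):
        self.profiles = defaultdict(lambda: [0.0] * DIM)

    def observe(self, sender, recipient, text):
        """Add a sent message to the profile of this pair."""
        profile = self.profiles[(sender, recipient)]
        for i, value in enumerate(embed(text)):
            profile[i] += value

    def anomaly_score(self, sender, recipient, text):
        """Return 1 - cosine similarity to the pair's history.

        A pair with no history scores 1.0 (assumption: unknown means
        maximally anomalous in this sketch).
        """
        profile = self.profiles.get((sender, recipient))
        if profile is None:
            return 1.0
        return 1.0 - cosine(embed(text), profile)


# Usage: the same draft scored against two recipients with different histories.
history = PairProfiles()
history.observe("alice@example.com", "bob@example.com",
                "quarterly budget forecast spreadsheet attached")
history.observe("alice@example.com", "bobby@example.org",
                "football practice moved to saturday")
draft = "updated budget forecast"
print(history.anomaly_score("alice@example.com", "bob@example.com", draft))
print(history.anomaly_score("alice@example.com", "bobby@example.org", draft))
```

Scoring a message in this sketch costs one dictionary lookup plus work proportional to `DIM`, so it does not depend on how many correspondents are tracked. No labelled examples of misdirected emails are needed, since the score is derived only from past traffic.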
The classifier responds quickly enough that Antigena (Darktrace’s Autonomous Response technology) will be able to warn the sender of possible misdirection and ask for confirmation. It might also be used to suggest more likely recipients, by finding similar addresses with a lower anomaly score, to which Antigena can divert the message with a single click.
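The recipient suggestion step could look something like the following Python sketch. It is an illustration only, not Antigena's behavior: it assumes address similarity is measured with the standard library's `difflib.SequenceMatcher`, and the function `suggest_recipients`, its `score_fn` parameter (any callable returning an anomaly score for an address), and the `min_similarity` threshold are hypothetical.

```python
from difflib import SequenceMatcher


def suggest_recipients(typed, known_addresses, score_fn, min_similarity=0.8):
    """Return similar-looking addresses that score as less anomalous.

    typed           -- the address currently in the recipient field
    known_addresses -- iterable of previous correspondents
    score_fn        -- callable mapping an address to an anomaly score
                       for the current draft (lower is more plausible)
    min_similarity  -- hypothetical cutoff on string similarity in [0, 1]
    """
    typed_score = score_fn(typed)
    candidates = []
    for address in known_addresses:
        if address == typed:
            continue
        similarity = SequenceMatcher(None, typed, address).ratio()
        if similarity < min_similarity:
            continue  # not close enough to be a likely slip or autocomplete error
        score = score_fn(address)
        if score < typed_score:
            candidates.append((score, address))
    # Most plausible (lowest anomaly score) first.
    return [address for _, address in sorted(candidates)]


# Usage with made-up scores for one draft message.
example_scores = {
    "j.smith@example.com": 0.9,   # typed address, looks anomalous
    "j.smyth@example.com": 0.2,   # similar address, fits the content
    "k.jones@example.org": 0.1,   # fits the content but is not similar
}
print(suggest_recipients("j.smith@example.com", example_scores, example_scores.get))
```

Restricting suggestions to addresses that are both textually close and lower-scoring keeps the list short, which matters if the sender is expected to accept a correction with a single click.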