In a previous post we discussed artificial intelligence and machine learning, and how these emerging technologies are augmenting existing email security solutions to round out our defenses against email attacks. In this post we look at how AI is actually put to work for defense: machine learning algorithms.
Hardly a solution, vendor or organization today fails to mention artificial intelligence (AI), and its most sought-after application, machine learning, in its strategy. But most announcements and marketing slogans are little more than “AI-washing” of existing technologies; in reality, these tools are closer to analytics than to intelligence, that is, the ability to automate behavior or processes. Little of this reaches email security, where AI is used by only a few vendors who know how to harness its potential against the greatest threats to enterprises: email-based phishing, business email compromise, malware and ransomware attacks.
In general, email defense is stuck in the past. It is difficult to evolve because email protocols are not well secured, and because the traditional solutions used to protect them rest mainly on simple rules, address lists and signatures of identified attacks. Meanwhile, hackers continue to improve their methods of attack. To protect themselves today, organizations must be reactive, rapidly detecting waves of ever more sophisticated attacks. They must also be predictive, spotting the early signs of a threat and anticipating an attack. That is where AI and ML become very useful, as complements to existing capabilities.
How AI works when it protects email
Classic AI relies on algorithms, rules and instructions, often statistical, that describe a problem and help to solve it. To separate infected emails from healthy ones, these algorithms are applied to email using well-known rules from traditional solutions. Thanks to its self-learning capabilities, ML can draw on the enormous volumes of data coming from protected messaging systems, processed at Big Data scale. It can compare events (emails or waves of emails) to detect changes, particularly those that could hide a threat. Using these data, and those provided by administrators, ML then helps to create new rules that feed the threat knowledge database.
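To make the idea of comparing events to detect changes more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how any particular vendor does it: the function name `flag_unusual_volumes`, the hourly counts and the threshold are all hypothetical, and a plain z-score against a rolling baseline stands in for whatever statistical model a real product would use.

```python
from statistics import mean, stdev

def flag_unusual_volumes(hourly_counts, baseline_hours=24, threshold=3.0):
    """Return the indices of hours whose email volume is far above the recent baseline.

    Hypothetical helper: each hour is compared with the `baseline_hours`
    hours that precede it, using a simple z-score.
    """
    flagged = []
    for i in range(baseline_hours, len(hourly_counts)):
        window = hourly_counts[i - baseline_hours:i]
        baseline = mean(window)
        # Floor the spread at 1.0 so a perfectly flat baseline
        # does not cause a division by zero.
        spread = max(stdev(window), 1.0)
        z_score = (hourly_counts[i] - baseline) / spread
        if z_score > threshold:
            flagged.append(i)
    return flagged

# Invented data: emails received per hour from one sender domain.
HOURLY_COUNTS = [10, 12, 11, 9, 10, 11, 10, 12, 95, 11]

print(flag_unusual_volumes(HOURLY_COUNTS, baseline_hours=8))  # → [8]
```

The hour with 95 emails stands out against the eight quiet hours before it. A flagged hour is only a candidate for review; deciding whether it hides a threat is left to other models and to administrators.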
Of the many algorithms available, the few email protection vendors able to develop and deploy them rely on two main approaches:
– Supervised algorithms
Although AI algorithms are generally very complex, the principle behind this type of algorithm is ‘simple’: the vendor, who knows the nature of the threats, defines decision models that are then continually fed by training the AI on both healthy and malicious emails. These data, verified and validated manually by operators, are transformed into a set of feature vectors characteristic of the threat to be detected; the vectors are used to build a model that produces a result according to the class of email. This is how ML can learn and reproduce the procedures used to qualify threats. Note that human expertise is still required to verify the characteristics one hopes to detect and the results obtained, to monitor and supervise the algorithms, to work on the precision of the results, and to update the data.
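As a rough illustration of the supervised principle described above, the following Python sketch trains a small naive Bayes classifier on a handful of labeled emails. Everything in it is an assumption made for the example: the class name `NaiveBayesEmailClassifier`, the word-count features and the six training emails are invented, and naive Bayes is simply one well-known supervised technique, not necessarily the one any vendor uses.

```python
import math
import re
from collections import Counter

def extract_features(text):
    """Turn an email body into a list of lowercase word tokens (the features)."""
    return re.findall(r"[a-z0-9']+", text.lower())

class NaiveBayesEmailClassifier:
    """Hypothetical multinomial naive Bayes classifier with Laplace smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word occurrences
        self.doc_counts = Counter()  # label -> number of training emails
        self.vocabulary = set()

    def train(self, labeled_emails):
        # Each item is (text, label); labels are assumed validated by operators.
        for text, label in labeled_emails:
            words = extract_features(text)
            self.word_counts.setdefault(label, Counter()).update(words)
            self.doc_counts[label] += 1
            self.vocabulary.update(words)

    def predict(self, text):
        words = extract_features(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior of the class plus log likelihood of each word.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(counts.values())
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented training set, standing in for manually validated emails.
TRAINING_EMAILS = [
    ("Your account is suspended, verify your password now", "malicious"),
    ("Urgent wire transfer needed, reply with bank details", "malicious"),
    ("Click this link to claim your prize and confirm your password", "malicious"),
    ("Meeting moved to Tuesday, agenda attached", "healthy"),
    ("Here are the quarterly figures we discussed", "healthy"),
    ("Lunch on Friday? Let me know what works", "healthy"),
]

classifier = NaiveBayesEmailClassifier()
classifier.train(TRAINING_EMAILS)
print(classifier.predict("Please verify your password to avoid suspension"))  # → malicious
print(classifier.predict("Agenda for the Tuesday meeting"))                   # → healthy
```

With so few examples the model is a toy, which is exactly why the human work of curating data, checking results and retraining matters in practice.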
– Unsupervised algorithms
These algorithms learn from unlabeled data to identify new threats. Phishing and malware attacks arrive in waves, changing regularly through small variations in implementation: modified content in emails, or modified code in polymorphic malware. These variations can trick traditional tools, because the attack signatures no longer match those of known attacks. Unsupervised algorithms use clustering. They group emails and waves of malware by resemblance, find correlations, and detect outliers, the emails that stick out. They also help humans to detect patterns that describe the threats, patterns which can then be exploited by supervised algorithms.
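The clustering idea described above can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the function names, the sample emails and the 0.6 threshold are hypothetical, and Jaccard similarity on word sets with single-link grouping is just one simple way to measure resemblance, chosen here for brevity.

```python
import re
from itertools import combinations

def word_set(text):
    """Represent an email by the set of lowercase words it contains."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Share of words two emails have in common (1.0 means identical sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def cluster_emails(emails, threshold=0.6):
    """Group emails linked by a chain of similar pairs (single-link clustering).

    Returns a list of clusters, each a list of indices into `emails`.
    """
    sets = [word_set(email) for email in emails]
    parent = list(range(len(emails)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(emails)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(emails)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

# Invented sample: three variants of one phishing wave plus two unrelated emails.
EMAILS = [
    "Dear customer, your invoice 4821 is overdue, open the attached file",
    "Dear client, your invoice 7730 is overdue, open the attached file",
    "Dear customer, your invoice 1094 is overdue, open the attached document",
    "Team lunch is moved to Thursday at noon",
    "Reminder: parking garage closed this weekend",
]

clusters = cluster_emails(EMAILS)
waves = [c for c in clusters if len(c) >= 2]
singletons = [c[0] for c in clusters if len(c) == 1]
print(waves)       # → [[0, 1, 2]]
print(singletons)  # → [3, 4]
```

The three invoice emails differ in small details, so an exact signature would miss two of them, yet they fall into one group. Emails that belong to no group are not threats by default; they are simply the ones that stick out and may deserve a closer look.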
Humans retain control of AI in cybersecurity
AI and machine learning are technologies that, once mastered, can detect and block an email attack along two dimensions: predictive and reactive. ML makes it possible to detect unknown attacks and to add new rules that continually strengthen existing solutions. Throughout, humans retain control: they qualify and validate the evolution of the algorithms, rework their initial specifications, and improve the efficacy of the solution.
The fight against fraud is never-ending. While we develop new solutions to protect email, hackers do the same to get around defenses. With the arrival of AI, they look for flaws in the AI itself, in a vicious circle where the user who receives the emails and the enterprise behind that user are the losers. That is why vendors that are truly committed to and skilled in AI and ML are working to apply these technologies to ever more domains and to extend these algorithms to all types of threats. They work on building corpora and extracting features to develop new specialized models that detect particular types of threats, for example the most virulent ones, which can bring down an entire enterprise.