Machine Learning and AI: Cybersecurity Use Cases
As a Chief Information Security Officer (CISO), I have a day job that affords me the luxury of attending cybersecurity conferences all over the world. These meet-ups provide invaluable opportunities for peer networking and knowledge sharing. Many attend conferences to hear directly from cybersecurity vendors in the event’s trade halls. In fact, many accept the subconscious trade-off of ‘free stuff versus data privacy’ and have their badges scanned in exchange for a t-shirt.
So why the primer on information security conferences? I’ll explain…
Over the past several years, the vendor halls of conferences have been filled with solution providers espousing the virtues of machine learning. Machine learning capabilities are certainly de rigueur. As a battle-weary industry CISO, I have seen many technologies come and go and am always cautious when technology in isolation is seen as a cyber-protection panacea. In this article, we will explore what machine learning is and how it can be used for cyber-good (and bad!).
Machine learning and artificial intelligence - the same thing?
I have been involved in many a forum discussion regarding the terminology associated with all things automation and orchestration. Many people see artificial intelligence (AI) and machine learning (ML) as the same thing. Such a generalisation will rarely get one into trouble, but strictly speaking AI is a broad catch-all term, almost a principle.
AI covers any situation where a machine is considered to make a decision on its own, thus exhibiting intelligence. ML is one mechanism for delivering AI. ML technologies invariably take sizeable datasets and use algorithms to identify patterns, thus solving problems which simply involve too many data points for the human brain to process.
Machine learning is everywhere!
The average human now experiences interaction with ML in many aspects of their daily lives. Virtual assistants are a ubiquitous addition to the modern western household. How many of us now use Siri, Google or Alexa to turn on our lights, ask for weather updates, or order a pizza?
ML solutions are the delivery mechanism for these technological advancements. Even children’s toys sometimes include ‘natural language processing’ capabilities, with toys moving in response to instructions from the child. Behind the scenes, a recording of the child’s voice is transmitted to the manufacturer’s datacentre, parsing the recording against a dataset of similar data before returning a response which instructs the doll to perform an action.
Perhaps many of us are not comfortable with a microphone implanted in our kids’ dolls…
Societal and moral implications of machine learning
ML technologies introduce a moral dilemma which is profoundly important in our modern lives: where do accountability and responsibility lie when something goes wrong?
As I mentioned earlier, ML solutions are permeating so many aspects of our existence. The costs of data storage and internet access continue to decline, and these technology elements underpin machine learning; it is here to stay and adoption will continue to grow.
Let us pause for a moment and consider questions of accountability and responsibility when computers are making decisions on behalf of humans. In a world where actions are taken based on ‘ones and zeros’, what happens when an incorrect decision is made?
Depending on the ML implementation, the impact of that mistake varies from the inconvenient to the life-threatening. Mistakes happen, and many of us perform a subconscious risk analysis when adopting virtual assistants or chatbots to order takeaway. Yes, I may receive pepperoni instead of a cheese supreme, but that is a low-likelihood, low-impact disruption.
But machine learning is embedded in the core of our existence. ML technologies are being used to deliver the cars of the future. What happens when your family saloon misreads a road sign or slams on the brakes after wrongly computing the characteristics of its surroundings? The impacts here could be fatal.
The permutations above are of the accidental variety; what about the cybercriminal? What if somebody intentionally aims to ‘game’ an instance of ML for nefarious purposes? What if threat actors intentionally affect the integrity of the information being fed into machine learning algorithms?
Contemporary technologies such as machine learning, blockchain and big data all need to undergo threat modelling exercises before being deployed within an enterprise environment. Organizations need to consider the business benefits that technology advancements bring in terms of operational efficiencies and bottom line growth, comparing these against the potential for business disruption manifesting as risk – be that financial, reputational or health & safety (for further details on this, read Cyber Risk Management).
Bridging the cybersecurity skills gap
If you work in cybersecurity, you’ll be aware that we have a global lack of people to fill cybersecurity roles. What does this mean? It means that organizations are making concessions – either important cybersecurity tasks are being carried out by staff ‘spinning too many plates’, or critical processes such as patch management, asset inventory and incident response are not being carried out at all.
Can machine learning help with our skills shortage? Yes (and no). ML models can be used to automate the time-intensive or the mundane, freeing up human beings to focus on trickier tasks which cannot (yet!) be performed by robots or computers.
In many cases, ML algorithms require labelled training data: they need to be fed with examples which delineate the wheat from the chaff. Algorithms trained in this way are referred to as supervised learning algorithms.
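To make the idea concrete, the sketch below trains a small supervised classifier on a handful of labelled examples and then scores an unseen sample. The two features, the tiny dataset and the choice of scikit-learn's random forest are my own illustrative assumptions, not a description of any particular product's detection pipeline.

```python
# A minimal, hypothetical sketch of supervised learning for triage.
# The two features (file entropy, outbound connections) and the tiny
# labelled dataset are invented for illustration only.
from sklearn.ensemble import RandomForestClassifier

# Labelled training data: each row is [file_entropy, outbound_connections]
X_train = [
    [7.9, 42],   # known-malicious sample
    [7.6, 35],   # known-malicious sample
    [4.2, 1],    # known-benign sample
    [3.8, 0],    # known-benign sample
]
y_train = [1, 1, 0, 0]   # 1 = malicious ('chaff'), 0 = benign ('wheat')

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Score a newly observed sample the model has never seen before
new_sample = [[7.4, 28]]
print(clf.predict(new_sample))        # predicted label
print(clf.predict_proba(new_sample))  # confidence, useful for prioritisation
```

The probability output is the operationally useful part: it is what allows an ML-assisted workflow to rank which alerts a human analyst should look at first.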
For all the automation and orchestration which is facilitated through machine learning, we still require human interaction. There is an intrinsic association between technology and people. Although automated methods of identifying suspicious files, or provisioning firewall rules will remove resource burden at the first line, there will (for now) always be a need for a human hand to triage and sanity check.
Machine learning helps us see the wood for the trees
ML is certainly capable of easing operational burden in many cybersecurity teams. Take, for example, signature-based anti-malware solutions and the deluge of alerts an operator will encounter on a daily basis - no wonder our industry has coined the phrase ‘alert fatigue’.
Signature-based malware solutions function by comparing the signature (a cryptographic hash) of a file with a list of ‘known bad’ signatures (malware). According to plenty of industry reports, the average lifetime of a specific piece of malware is now in the order of seconds and minutes, not the hours and days that were common when the original signature-based anti-malware solutions were devised in the late 1990s.
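For readers who have never looked under the bonnet, here is a minimal sketch of that signature-matching mechanism: hash a file and look the digest up in a ‘known bad’ list. The digests and file name below are placeholders for illustration; real products rely on far larger, constantly updated signature feeds.

```python
# Minimal sketch of signature-based detection: hash a file and compare the
# digest against a set of 'known bad' signatures. The digests and file name
# below are placeholders, not real indicators of compromise.
import hashlib

KNOWN_BAD_SHA256 = {
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae",
}

def sha256_of_file(path: str) -> str:
    """Return the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def is_known_malware(path: str) -> bool:
    return sha256_of_file(path) in KNOWN_BAD_SHA256

# Example usage (assumes the file exists on disk):
# print(is_known_malware("suspicious_download.exe"))
```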
If hundreds of thousands of malware variants are created on a daily basis, it’s futile to expect a human operative to triage and respond to each and every alert. ML algorithms provide the modern incident response team with an automated mechanism for the identification, classification and prioritization of incident activity.
Incident response isn’t the only cybersecurity field where machine learning can augment existing capabilities. ML is now being used to look for malicious content within web pages; ML is particularly good at identifying spam email. Machine learning is also ideally suited for the identification of insider threat activity – nefarious activity carried out by company employees for a myriad of reasons (financial motivation, disgruntlement).
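As a flavour of the spam use case, the toy sketch below trains a text classifier to separate spam from legitimate mail. The four training messages and the choice of a bag-of-words model with Naive Bayes are illustrative assumptions on my part; production email filters combine many more signals.

```python
# Toy illustration of ML-based spam detection: a bag-of-words model with
# Naive Bayes. The training messages below are invented for the example.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

messages = [
    "Congratulations, you have won a free prize, click here now",
    "Urgent: verify your account details to avoid suspension",
    "Minutes from this morning's project meeting attached",
    "Can we move our 1:1 to Thursday afternoon?",
]
labels = ["spam", "spam", "ham", "ham"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)

print(model.predict(["You have won a prize, click to claim"]))  # likely 'spam'
print(model.predict(["Agenda attached for Thursday afternoon"]))  # likely 'ham'
```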
Machine learning for nefarious purposes
We have spoken of the benefits ML can bring to the digitally enabled business. Unfortunately, financially motivated cybercriminals now operate much like traditional corporates, and they too can prosper from reduced staffing overheads and powerful data-crunching algorithms.
For example, ML algorithms are ideally suited to crafting phishing emails that evade anti-phishing controls, and supervised learning algorithms can be exploited to gather information about potential victims based on their social media profiles and browsing habits.
With great power comes great responsibility. Cybersecurity defenders must spend time thinking like an attacker. Activity such as red teaming and attack simulation can add immeasurable value to the cybersecurity team, enabling them to see things from an ‘outside-in’ perspective and design controls accordingly.
Conclusion
Technology is only part of the cybersecurity puzzle. Organizations need robust technical solutions and established processes to deal with incidents, and qualified, experienced people to design, maintain and operate infrastructure and applications.
Machine learning brings with it opportunities for enterprise enablement, but also opportunities for malevolence.