Artificial intelligence shows promise in fortifying cybersecurity, but programs are still in early stages and a scarcity of the data needed to train models is slowing progress, researchers and cyber specialists say.
Within the U.S. government, many applications of AI are still in exploratory phases, said Matt Hayden, who until January was an assistant secretary for cyber, infrastructure, risk and resilience at the Department of Homeland Security.
“We’re kicking the tires, trying to see where some of these advantages lie. We aren’t mature enough as a department yet to take it on as a whole-of-government approach,” Mr. Hayden said at a virtual conference Wednesday organized by Ai4 LLC.
Other speakers at the conference said the situation in the private sector is at a similarly early stage. Anna Trikalinou, a security research scientist at Intel, said the chip maker has a number of efforts under way, including a partnership to use AI to analyze and classify malware variants by examining their coding.
Zachary Hanif, senior director of machine learning at Capital One Financial, said the company is examining how AI could be used in cybersecurity through internal research and by partnering with vendors.
“Larger enterprises are probably understanding how it fits into their doctrines,” Mr. Hanif said. “Individual contributors and researchers are actively pursuing and leading the field as the tip of the spear, so to speak.”
Among the issues hindering faster adoption is a lack of available data to train AI models, said Charlie Greenbacker, head of federal and strategic technology programs at cybersecurity firm Snorkel AI Inc., who also spoke at the conference.
“Anyone who does anything in AI will tell you the biggest problem is usually training data,” he said.
Machine learning and other forms of AI rely on vast quantities of data to inform their algorithms and teach them how to differentiate between objects, behaviors and other elements. Facial-recognition algorithms, for instance, might analyze millions of photographs before they can begin to distinguish facial features. Likewise, an algorithm designed to detect and identify road signs will need to churn through a similarly large volume of data.
While libraries of training data exist for those applications, cybersecurity can be complex, and the data required to train an algorithm to spot hackers could be specific to a given company’s systems. Acquiring data in bulk can be difficult, Mr. Hanif of Capital One said, as it can touch on intellectual property or other material that companies generally prefer to keep in-house.
Getting hold of data from successful attacks to train models is even more difficult, Mr. Greenbacker said.
“If … you want to train models on publicly available cyber data, you’re pretty much out of luck since most organizations don’t want to share highly sensitive data like that, whether for operational reasons or it’s just embarrassing,” he said.
While many companies are still in the nascent stages of exploring AI, so too are hackers, Ms. Trikalinou of Intel warned. Cybersecurity professionals will need to leverage machine learning and similar technologies soon to keep ahead of attacks, she said.
“I see AI playing a critical role in advancing the security domain both from the defensive side and from the other side,” she said. “So I think this is something that we really, really need to pay attention to—right now.”
Write to James Rundle at firstname.lastname@example.org