ChatGPT Provides Limited Help Identifying Malware

A report published today by Endor Labs, a provider of a platform for identifying software dependencies, suggests that the ability of the large language models (LLMs) that drive generative artificial intelligence (AI) to accurately assess the level of risk malware represents is fairly limited.

The report found that current LLM-based technologies such as ChatGPT can accurately classify malware risk in only 5% of cases, and may never be able to recognize novel approaches used to create malware, simply because there hasn't been enough time to train the LLMs on them.


Henrik Plate, head of security research for Endor Labs, said that, ultimately, LLMs may do far more for the cybercriminals who use them to create new types of malware than they will for defenders.

Endor Labs has been making a case for a platform that leverages graph technology to enable IT organizations to improve application security by understanding what code is actually vulnerable to any given exploit.

The report found, for example, that while 71% of typical Java application code comes from open source components, only 12% of that imported code actually runs. Vulnerabilities in unused code are rarely exploitable, so organizations could eliminate or de-prioritize roughly 60% of remediation work if they had more visibility into which code actually runs in their production environments.
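Endor Labs' own analysis is proprietary, but the underlying idea, that much of what an application imports is never referenced, can be sketched with a toy static check. The snippet below is a minimal, hypothetical illustration using Python's standard `ast` module; the `SOURCE` code it inspects is invented for the example.

```python
import ast

# Hypothetical application snippet: two imports, only one is ever used.
SOURCE = """
import json
import xml.etree.ElementTree as ET

def load(path):
    with open(path) as f:
        return json.load(f)
"""

tree = ast.parse(SOURCE)

# Collect the local names each import statement introduces.
imported = set()
for node in ast.walk(tree):
    if isinstance(node, ast.Import):
        imported.update(alias.asname or alias.name.split(".")[0]
                        for alias in node.names)
    elif isinstance(node, ast.ImportFrom) and node.module:
        imported.update(alias.asname or alias.name for alias in node.names)

# Collect every bare name the code actually references.
used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}

unused = imported - used
print(sorted(unused))  # → ['ET']
```

A real reachability analysis would build a call graph across all dependencies rather than scan one file, but the principle is the same: code that is imported yet never invoked rarely needs urgent patching.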

However, the report also noted that organizations often become overconfident because they fail to account for the dependencies of their applications. The report found that while 45% of applications have no calls to security-sensitive application programming interfaces (APIs) in their codebase, that percentage drops to 5% when dependencies are included.

The report also found that ChatGPT's API is now used in 900 npm and PyPI packages across a range of domains; 75% of those packages are brand new.

In general, functional dependencies are created whenever developers download a third-party component. The most important thing for any software development team to determine when assessing the actual risk those dependencies create is how accessible any given vulnerability is to an attacker, regardless of its severity score.
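The triage logic described above, rank findings by whether an attacker can actually reach the vulnerable code before ranking by severity score, can be sketched in a few lines. The vulnerability records and field names below are hypothetical, not from any real scanner's output.

```python
# Hypothetical findings: "reachable" means analysis found a call path
# from the application into the vulnerable component.
vulns = [
    {"id": "CVE-A", "cvss": 9.8, "reachable": False},
    {"id": "CVE-B", "cvss": 6.5, "reachable": True},
    {"id": "CVE-C", "cvss": 8.1, "reachable": True},
]

# Reachable findings come first; within each group, higher CVSS first.
triaged = sorted(vulns, key=lambda v: (not v["reachable"], -v["cvss"]))

print([v["id"] for v in triaged])  # → ['CVE-C', 'CVE-B', 'CVE-A']
```

Note that the highest-severity finding (CVE-A, CVSS 9.8) lands last: if no call path reaches it, a lower-scored but reachable flaw is the more urgent fix.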

In the wake of a series of high-profile breaches, there has, fortunately, been an increased focus on software supply chains. The challenge is that most developers don't have much cybersecurity expertise. They often create patches for applications based on a list of potential vulnerabilities surfaced by cybersecurity teams. The issue, of course, is that the existence of a vulnerability does not always mean the affected software component can be externally accessed.

If cybersecurity teams want to bridge the divide that has existed between themselves and developers for years, they need to be able to identify which code needs to be fixed immediately versus which code could be updated at some point in the future in a DevSecOps workflow. Otherwise, developers will continue to ignore most alerts that, from their perspective, are little more than the proverbial cry of "Wolf!"


Michael Vizard

Mike Vizard is a seasoned IT journalist with over 25 years of experience. He also contributed to IT Business Edge, Channel Insider, Baseline and a variety of other IT titles. Previously, Vizard was the editorial director for Ziff-Davis Enterprise as well as Editor-in-Chief for CRN and InfoWorld.
