Why is Deterministic Threat Detection Critical for Uncovering Invalid File Threats?
1/18/2024 - Brian O'Neill


Protecting our systems from malware-based attacks should involve a combination of probabilistic and deterministic threat detection methods. These contrasting methods complement one another: probabilistic detection tends to be more successful against zero-day malware threats, while deterministic detection tends to be more successful against established (documented) malware threats. Threat actors can use malware from either category when they attempt to exploit vulnerabilities in our systems, and we need to be ready for both scenarios.

Malware, however, is only one category of threat we can expect our systems to encounter. Sophisticated threat actors can just as easily compromise our systems with non-malware threats (e.g., specially crafted invalid file uploads) designed to exploit vulnerabilities in our applications and bypass malware-focused detection policies. To combat threats in this category, we need to rely more heavily on deterministic (rules-based) threat detection methods.

Understanding Deterministic Threat Detection

At its core, deterministic threat detection is a binary concept: a file either matches a rule or it does not. It relies on a clearly defined set of threat detection rules to uncover specific types of threats with a high degree of accuracy. Though inherently limited against zero-day (unknown/previously unrecorded) threats, deterministic threat detection methods tend to be efficient, limiting false positives while consuming relatively few resources.

In the context of malware threat detection, deterministic threat detection often boils down to signature-based scanning (checking file signatures against a continuously updated database of recorded malware threats). This approach is a double-edged sword in malware detection: files that match a recorded malware signature are almost sure to contain a threat, while files that contain zero-day malware, which has no recorded signature yet, are just as sure to avoid detection.
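The signature lookup described above can be sketched in a few lines. This is a minimal illustration, assuming the "signature" is simply a SHA-256 digest of the whole file; production engines typically use richer signatures than this. The names `SIGNATURE_DB` and `matches_known_signature` are hypothetical, and the sample payload is a harmless stand-in.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of documented threats.
# It holds one entry here, computed from a harmless stand-in payload.
KNOWN_BAD_SAMPLE = b"stand-in bytes for a documented malware sample"
SIGNATURE_DB = {hashlib.sha256(KNOWN_BAD_SAMPLE).hexdigest()}


def matches_known_signature(data, signature_db=SIGNATURE_DB):
    """Return True if the file's digest appears in the signature database."""
    return hashlib.sha256(data).hexdigest() in signature_db


if __name__ == "__main__":
    # An exact copy of a recorded sample is caught.
    print(matches_known_signature(KNOWN_BAD_SAMPLE))
    # A one-byte change produces an unrecorded digest and slips through.
    print(matches_known_signature(KNOWN_BAD_SAMPLE + b"!"))
```

The second call shows the weakness discussed above: any payload without a recorded signature passes, no matter how harmful it is.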

When it comes to detecting customized non-malware threats, however, the efficacy of deterministic threat detection is greatly amplified. Rather than relying on previously recorded threat signatures to find a match (a moving target), we can instead lean on clearly defined, static file formatting standards to determine when files have been spoofed or tampered with (i.e., made invalid) in potentially threatening ways. We can also identify certain threatening content types, such as scripts, executables, or macros, based on rigid characteristics associated with those content types and safely block them without requiring a specific signature match.

Example Threat Actor Workflow

A malicious user could, for example, author a custom script or executable and rename the file with a “.JPG” extension so it appears valid to a particular image processing application. A malicious file crafted in this way would not contain discernible malware, so it would likely bypass forward-deployed malware threat detection policies. If the target application did not rigorously validate this file past the extension level, it might run the hidden script or executable, and the attacker might then be able to execute malicious code and/or gain access to the application servers.

By performing deterministic, in-depth content verification of the file format in question, however, we would be able to clearly identify that the contents of this alleged “.JPG” file did not conform to JPG formatting standards. Depending on the attacker’s commitment to obfuscation, we might also be able to separately identify the underlying script or executable contents themselves. Either way, this deterministic content verification process would give us what we need to decide whether the suspect file should be processed by the target application or blocked.
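As a rough sketch of this workflow, the check below looks at the bytes of a file claiming to be a JPG rather than at its extension. It is deliberately simplified: it only tests the JPEG start and end markers and a handful of well-known leading byte sequences, whereas thorough verification would parse the full file structure. The function names and the prefix table are assumptions made for illustration.

```python
JPEG_START = b"\xff\xd8\xff"  # start-of-image marker plus the next marker's lead byte
JPEG_END = b"\xff\xd9"        # end-of-image marker

# Leading bytes of a few executable or script formats (illustrative, not exhaustive).
EXECUTABLE_PREFIXES = {
    b"MZ": "Windows PE executable",
    b"\x7fELF": "ELF executable",
    b"#!": "script with shebang line",
}


def looks_like_jpeg(data):
    """Simplified framing check: correct first and last markers only."""
    return data.startswith(JPEG_START) and data.endswith(JPEG_END)


def inspect_claimed_jpeg(data):
    """Report whether the bytes frame a JPEG and whether they start like an executable."""
    findings = {"valid_jpeg_framing": looks_like_jpeg(data), "embedded_type": None}
    for prefix, label in EXECUTABLE_PREFIXES.items():
        if data.startswith(prefix):
            findings["embedded_type"] = label
            break
    return findings


if __name__ == "__main__":
    renamed_executable = b"MZ\x90\x00" + b"\x00" * 16  # contents of a fake "photo.JPG"
    print(inspect_claimed_jpeg(renamed_executable))
```

Either finding is enough to act on: a file that fails the format check can be rejected even when the underlying content type cannot be identified.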

Deterministic Threat Detection with the Cloudmersive Advanced Virus Scan API

The Cloudmersive Advanced Virus Scan API combines dynamic malware threat detection policies with deterministic content verification policies to provide 360-degree protection for web applications and cloud storage containers.

Custom rules can be set to flag invalid files or files containing scripts, executables, macros, password-protection measures, HTML, JSON, and other potentially threatening content types. Additionally, a custom comma-separated whitelist of file types can be provided (e.g., ‘.png,.jpg,.pdf’) to deterministically block any file whose contents fail to match the formatting standards of one of the listed types.
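To make the whitelist idea concrete, here is a small local sketch of how a comma-separated list such as ‘.png,.jpg,.pdf’ could drive a deterministic allow/block decision. It does not call or represent the Cloudmersive API; the `FORMAT_MAGIC` table and both function names are hypothetical, and comparing leading bytes is a far shallower check than full format verification.

```python
# Hypothetical mapping from a whitelisted extension to the leading bytes
# that files of that format are expected to start with.
FORMAT_MAGIC = {
    ".png": b"\x89PNG\r\n\x1a\n",
    ".jpg": b"\xff\xd8\xff",
    ".pdf": b"%PDF-",
}


def parse_whitelist(whitelist):
    """Split a comma-separated whitelist into normalized extensions."""
    return [ext.strip().lower() for ext in whitelist.split(",") if ext.strip()]


def is_allowed(data, whitelist):
    """Allow a file only if its contents begin like one of the whitelisted formats."""
    for ext in parse_whitelist(whitelist):
        magic = FORMAT_MAGIC.get(ext)
        if magic is not None and data.startswith(magic):
            return True
    return False  # anything that matches no whitelisted format is blocked


if __name__ == "__main__":
    print(is_allowed(b"%PDF-1.7\n", ".png,.jpg,.pdf"))    # PDF contents, PDF whitelisted
    print(is_allowed(b"MZ\x90\x00", ".png,.jpg,.pdf"))    # executable contents
```

The decision depends only on the file's contents and the list, so the same input always produces the same result, regardless of the extension the uploader chose.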

For more information on the Cloudmersive Advanced Virus Scan API, please do not hesitate to reach out to a member of our sales team.
