VirusTotal += Crowdsourced AI

We are pleased to announce the launch of Crowdsourced AI, a new VirusTotal initiative dedicated to combining the power of AI with community contributions. Spearheading this endeavor, Hispasec contributes an AI solution designed to analyze Microsoft Office documents (Word, Excel, and PowerPoint files), particularly those containing macros. We warmly invite all interested parties to join this effort and explore innovative ways to contribute features that will strengthen the cybersecurity community.

About three months ago, we rolled out Code Insight, an AI tool that helps security analysts understand unfamiliar code snippets by explaining them in natural language. In a more recent Q&A, we put out a call to anyone keen to contribute their own AI models or use cases to VirusTotal for the benefit of the community. Now Hispasec has stepped in with a powerful solution for Microsoft Office documents. They are using a different AI model not only to explain the macros but also to deliver a verdict on any potentially malicious content, extending VirusTotal’s capabilities.

In the words of the company:
“We are incorporating a specialized AI component from our Content Disarm & Reconstruction (CDR) solution, DeepClean, into VirusTotal. This component leverages a Large Language Model (LLM) to interpret and explain the code within macros in specific Microsoft document formats. Additionally, it offers a verdict—based on the model’s criteria—on whether the analyzed content can be considered malicious or benign. It’s important to emphasize that this is just one facet of DeepClean. Our broader solution recreates files into clean versions, eliminating executable code while preserving the essential content.”

This new integration not only bolsters our AI-driven security analysis but also exemplifies the strength of diversity, mirroring our existing crowdsourced initiatives such as IDS, Sigma, and YARA rules. In line with VirusTotal’s mission, we openly welcome complementary solutions, reaffirming our commitment to a collaborative defense strategy against cyber threats.

Let’s dive into a few examples that show how Hispasec’s contribution performs and how it is displayed in the new Crowdsourced AI section of VirusTotal.

In the example below, we see the verdict label “malicious” at the beginning of the explanation, emphasized in red for easy visibility. This is followed by a detailed description of how the macros within this .XLS file employ various obfuscation techniques, including Base64-encoded strings and the concatenation of variables with diverse names, in an attempt to disguise their behavior. The model nevertheless sees through the obfuscation and reveals the true intent of the macros: they attempt to download a script containing a PowerShell reverse shell and then execute it.
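
To make the obfuscation pattern described above more concrete, here is a minimal Python sketch of a Base64 payload split across oddly named variables and then reassembled by concatenation. It is only an illustration under stated assumptions: the macro text, the variable names (`qzx`, `lmn`, `payload`), and the helper functions are all hypothetical, the encoded command is a harmless placeholder, and this simple pattern matching is not how the Hispasec model works, since that analysis is performed by an LLM.

```python
import base64
import re

# Harmless placeholder standing in for the command a real macro would hide.
HIDDEN_COMMAND = "echo hello from a harmless test string"

# Build a hypothetical VBA-like macro: the Base64 text is split in two and
# stored in variables with meaningless names, then joined with "&".
_encoded = base64.b64encode(HIDDEN_COMMAND.encode("utf-8")).decode("ascii")
_half = len(_encoded) // 2
MACRO_SOURCE = (
    f'qzx = "{_encoded[:_half]}"\n'
    f'lmn = "{_encoded[_half:]}"\n'
    "payload = qzx & lmn\n"
)


def extract_assignments(source):
    """Map each variable name to the string literal assigned to it."""
    pattern = r'^\s*(\w+)\s*=\s*"([^"]*)"\s*$'
    return dict(re.findall(pattern, source, flags=re.MULTILINE))


def resolve_concatenation(source, target, literals):
    """Rebuild the string produced by a line like `target = a & b`."""
    pattern = rf"^\s*{re.escape(target)}\s*=\s*(.+)$"
    match = re.search(pattern, source, flags=re.MULTILINE)
    if match is None:
        raise ValueError(f"no assignment found for {target!r}")
    names = [part.strip() for part in match.group(1).split("&")]
    return "".join(literals[name] for name in names)


def deobfuscate(source, target):
    """Reassemble the concatenated fragments and decode the Base64 text."""
    literals = extract_assignments(source)
    encoded = resolve_concatenation(source, target, literals)
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    print(deobfuscate(MACRO_SOURCE, "payload"))
```

Neither fragment decodes to anything readable on its own, which is why splitting and renaming can hide intent from a casual reader. Real macros typically layer several such tricks, so a fixed set of regular expressions like this one would quickly fall short.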