Internet Scale Malware Analysis BH US2015



Video: Black Hat USA 2015 - Internet Scale File Analysis 




3 Tools

Problem being solved: “Current malware analysis is broken”:

The Novetta “Answer”: Leverage Big Data to analyze existing data


What it is

Analytics supported

Dynamic scaling



What it is

What it isn’t

Technology under the covers

Some interesting ideas re technology approaches:


“Blockbuster” group’s analysis of Sony malware




My guesses:

“While the SPE attack occurred over a year ago, we are releasing this report now to detail our technical findings, clarify details surrounding the SPE hack, and profile the Lazarus Group”[4]


“we were able to scan signatures over hundreds of millions of samples… From the billions of files scanned, Novetta’s signatures produced approximately 2000 samples, of which 1000 were manually vetted and catalogued as belonging to the Lazarus Group …”[5]

Note that “Lazarus Group” is Novetta’s name for the threat actors behind the Sony Incident.
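To make the "scan signatures over a corpus, then vet the hits by hand" workflow concrete, here is a minimal sketch. It is not Novetta's tooling: plain substring matching stands in for whatever signature engine was actually used, and the `Signature` class, the sample names, and the byte patterns are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    """Hypothetical signature: every byte pattern must occur in the sample."""
    name: str
    patterns: tuple


def matches(sig: Signature, data: bytes) -> bool:
    # A sample matches only if it contains all of the signature's patterns.
    return all(p in data for p in sig.patterns)


def scan_corpus(samples: dict, signatures: list) -> dict:
    """Map each sample id to the names of the signatures it matched.

    Samples with no match are left out, so the result is the candidate
    set that would still need manual vetting.
    """
    hits = {}
    for sample_id, data in samples.items():
        matched = [s.name for s in signatures if matches(s, data)]
        if matched:
            hits[sample_id] = matched
    return hits


# Toy corpus: made-up byte strings standing in for files.
corpus = {
    "sample_a": b"MZ\x90\x00 header MARKER_ONE body MARKER_TWO tail",
    "sample_b": b"MZ\x90\x00 header with nothing of interest",
    "sample_c": b"\x7fELF header MARKER_TWO tail",
}
sigs = [
    Signature("sig_one", (b"MZ", b"MARKER_ONE")),
    Signature("sig_two", (b"MARKER_TWO",)),
]

candidates = scan_corpus(corpus, sigs)
for sample_id, names in sorted(candidates.items()):
    print(sample_id, names)
```

The point of the sketch is the shape of the funnel the quote describes: a cheap automated pass over everything narrows the corpus to a small candidate set, and only that set goes to a human analyst.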



“Lazarus Group” activities

Section 3, describing the “Lazarus Group” activities, is rife with disclaimers:

“However, Novetta’s analysis and findings suggest that the SPE attack was one of several attacks”

“Although Novetta is unable to determine via technical malware analysis whether or not the SPE attack was carried out by an identified nation-state …”

“these numbers should not be considered reflective of the totality of Lazarus Group tools detected in this Operation, due to the nature of our approach”

“These linked cyber operations over several years … suggest actions of a single group, or perhaps very close groups with similar goals who share tools, methods, taskings, and even operational duties”

In addition, the report associates observations[6]:

“The above example is a media report discussing the May 2015 South Korean parliamentary election, which included candidates for the Saenuri Party, South Korea’s ruling party since 2008 …”

with possibly unrelated context:

“Saenuri has taken a much stronger stance toward North Korea aggressions … Saenuri is also a major advocate of cyber security and the National Intelligence Service …”

This sort of tactic can quickly fall into “guilt by association” / “argumentum ad hominem” fallacies.

Binary Code analysis

On the other hand, section 4 “Malware Tooling” is a really interesting read.

The analysis presents a detailed list of code characteristics that can be used to identify common code reused across the various malware samples. The authors develop a compelling argument for an underlying malware framework including wrappers:

The code characteristics enumerated include:

This base analysis is used to deduce the various malware making up the attack toolset. The malware is then classified into families.
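As a rough illustration of how shared code characteristics can drive a family classification, the sketch below groups samples by the overlap of their observed traits. This is not the method used in the report: the Jaccard measure, the 0.5 threshold, the sample names, and the trait labels are all assumptions (two labels loosely echo the observation quoted in note [7]).

```python
def jaccard(a: frozenset, b: frozenset) -> float:
    """Share of traits two samples have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_into_families(traits: dict, threshold: float = 0.5) -> list:
    """Greedy grouping: a sample joins the first family that contains any
    member whose trait overlap meets the threshold, else starts a new one."""
    families = []
    for name in sorted(traits):
        for family in families:
            if any(jaccard(traits[name], traits[m]) >= threshold for m in family):
                family.append(name)
                break
        else:
            # No existing family was similar enough.
            families.append([name])
    return families


# Hypothetical samples and trait labels, for illustration only.
samples = {
    "dropper_1": frozenset({"dup_hwp_check", "fixed_ext_order", "xor_key_a"}),
    "dropper_2": frozenset({"dup_hwp_check", "fixed_ext_order", "custom_wrapper"}),
    "wiper_1": frozenset({"raw_disk_write", "other_wrapper"}),
}

families = group_into_families(samples)
for family in families:
    print(family)
```

Distinctive quirks such as a repeated check or a fixed ordering are useful traits precisely because they are unlikely to appear by chance in unrelated code, which is why a small overlap can still be meaningful evidence of reuse.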

All in all, this particular section is a fascinating read.




[4] Pg 6, Executive Summary.

[5] Pg 10.

[6] Pg 19.

[7] Cf. pg 47: “First, the order of the extensions is constant. Second, the function has a typo where the file extension .hwp is checked for twice in a row.”