September 04, 2014

More Data, More Problems: Part #3

Big Data Server Side JavaScript Injection 

In Part #2 of this blog series, Mark Kraynak covered some of the application security risks that reside in Big Data implementations. As a follow-up, I’d like to address some of these potential risks in greater detail.

In the application world, input sanitization has always been one of the biggest concerns from a security standpoint. Unchecked or poorly checked input may lead to injection attacks that can cause a data breach, alter an application’s behavior, or hijack a server.

For 12 years, SQL Injection has been one of the biggest concerns in application security because it allows a hacker to manipulate the input to the application so that it alters a query to the backend database and may compromise or leak data. (More can be learned on SQL Injection here.)

With the rise of Big Data and the NoSQL concept, where the SQL language is replaced by scripted APIs, it became a common belief that the SQL Injection problem had been solved because the backend database was a NoSQL database. Well… that is not entirely true.

While several application security problems are inherent in some of the big data platforms, including Insecure Direct Object Reference, client side enforcement of server side security, and Server Side JavaScript Injection, it is the last of these, Server Side JavaScript Injection, that is very similar in its overall structure to SQL Injection. To make the point, we will demonstrate this phenomenon.

Server Side JavaScript (SSJS)

In the past few years, as browsers raced to become the best and the fastest and JavaScript became one of the core components of the web, JavaScript engines have become powerful and robust, which has helped increase the popularity of server side JavaScript as well. This makes a lot of sense: the technology behind it is very mature, and organizations have built up a large JavaScript knowledge base while developing their own applications, so harnessing JavaScript for server side development is a natural step.

For example, Node.js was born as a JavaScript runtime environment that can easily become your web server’s platform. Other implementations include database mapping, visual and hardware controls, and advanced server side computation. Finally, the technology is also heavily used in Big Data. Specifically, MongoDB uses JavaScript for some of its core functionality (for instance, in the query API).

Server Side JavaScript Injection (SSJI)

To set the premise, we must understand how the backend logic works. The backend big data server will either hold stored JavaScript logic or be queried with JavaScript functionality (much like a Stored Procedure and an SQL Query), and will then require input into these functions in order to produce the query itself. Also, some Big Data databases allow JavaScript execution within their query language as a means for advanced lookups, which is a key component of big data’s robustness.

This, however, means that queries executed on the server can potentially be altered by manipulating the input that the application places into the query!

Technically Speaking

To demonstrate the problem, I will show a flow using MongoDB. For demonstration purposes I will be using the zips.json sample dataset that is available online; the dataset maps US cities and states to their populations and is useful for such a demonstration.

First, let’s query our collection as a database administrator would. We would like to get all documents of cities where the population is greater than 100 people.

db.zips.find({'pop': {$gt:100}})

A MongoDB query can be more advanced, and include JavaScript in it in order to perform advanced computations.

db.zips.find( { $where: function() { return this.pop > 100; } } );
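To make the relationship between the two query forms concrete, here is a rough local emulation in plain JavaScript. It is only a sketch: the documents are made up for illustration (they are not taken from zips.json), and the filtering merely imitates the idea that the $where function is evaluated once per document with `this` bound to that document.

```javascript
// Made-up documents for illustration only; not taken from zips.json.
const docs = [
  { city: 'EXAMPLE A', pop: 50 },
  { city: 'EXAMPLE B', pop: 100 },
  { city: 'EXAMPLE C', pop: 2500 },
];

// The predicate from the $where query above.
const predicate = function () { return this.pop > 100; };

// Rough local emulation: call the predicate once per document,
// with `this` bound to that document.
const viaWhere = docs.filter((doc) => predicate.call(doc));

// The operator form { pop: { $gt: 100 } } expressed as a plain comparison.
const viaOperator = docs.filter((doc) => doc.pop > 100);

console.log(viaWhere.map((d) => d.city));
console.log(viaOperator.map((d) => d.city));
```

Both forms select the same documents here; the difference is that the $where form hands the server a piece of JavaScript to run.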

Of course, in a real application the query will take user input rather than a hard coded parameter like ‘100’. That is also where the problem starts… A simple application implementation of this MongoDB query in PHP might look something like this (pardon my French):

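  $mongo = new Mongo();
  $db = $mongo->test;  // assumed database name
  $population = $_GET['pop'];  // untrusted user input
  $collection = $db->zips;
  // The input is concatenated straight into server side JavaScript
  $query = 'function() { var query_pop = ' . $population . '; return this.pop > query_pop; }';
  $cursor = $collection->find(array('$where' => $query));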

In reality, the parameter ‘pop’ in this code is populated by the application via a GET parameter. A normal URL for that might look something like this:


However, an attacker could alter this parameter to a different value and introduce the injection point. In our case, as an abstract example, we will put the application to sleep. Our payload will be: 1;sleep(60000)


The result of this URL call will inject the sleep() function into the server, which upon execution will halt for 60 seconds. Depending on what the server side JavaScript environment exposes, the same technique could be used to gain file system access, execute remote code, or steal information remotely. Of course, escaping out of the query brackets is also an option, which opens this vector up even further.
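The effect of the payload on the query text can be reproduced locally with a few lines of JavaScript. This is a sketch under stated assumptions: buildWhereClause is a hypothetical helper that mirrors the string concatenation in the PHP example above, it assumes the value is placed into the function body without quoting, and the sleep stub only records the call instead of pausing anything.

```javascript
// Hypothetical helper mirroring the PHP concatenation; assumes the value
// is inserted into the function body without quotes.
function buildWhereClause(pop) {
  return 'function() { var query_pop = ' + pop + '; return this.pop > query_pop; }';
}

const benign = buildWhereClause('100');
const injected = buildWhereClause('1;sleep(60000)');

console.log(benign);
console.log(injected);

// Evaluate the injected text locally with a harmless stand-in for sleep()
// to show that the extra statement really executes.
const sleepCalls = [];
const sleepStub = (ms) => { sleepCalls.push(ms); };
const compiled = new Function('sleep', 'return (' + injected + ');')(sleepStub);
const matched = compiled.call({ pop: 500 });

console.log(sleepCalls, matched);
```

The benign input yields a single comparison, while the payload turns one statement into two: the assignment ends at the injected semicolon and sleep(60000) runs before the comparison is ever reached.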

Final words

While NoSQL is very different from SQL, the way applications use it isn’t. Developers are leveraging big data’s speed and data volumes to create applications that can do more, but injection points still exist that attackers can potentially hit. The query language by itself does not prevent the attack.

The conclusion is, of course, input sanitization at the server side and/or compensating controls. By detecting an injection attempt in a value that is not meant to carry code, such attacks can be stopped before they reach the database. The application threat for big data implementations is definitely more than meets the eye, and special attention should be paid to it.
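As one possible illustration of server side input checking, the sketch below accepts the ‘pop’ value only if it is a plain non-negative integer and then builds the operator form of the query shown earlier, so that no user input ever becomes JavaScript. The helper names parsePopulation and buildQuery and the nine digit limit are assumptions made for this example, not part of any MongoDB API.

```javascript
// Hypothetical helper: accept only a short string of digits.
function parsePopulation(raw) {
  if (typeof raw !== 'string' || !/^\d{1,9}$/.test(raw)) {
    throw new Error('pop must be a non-negative integer');
  }
  return Number(raw);
}

// Build the operator form of the query instead of a $where function,
// so the value is passed as data rather than as code.
function buildQuery(raw) {
  return { pop: { $gt: parsePopulation(raw) } };
}

console.log(JSON.stringify(buildQuery('100')));

try {
  buildQuery('1;sleep(60000)');
} catch (e) {
  console.log('rejected: ' + e.message);
}
```

With this approach the payload from the demonstration is rejected before any query is built, and a valid value can only ever be compared as a number.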

Where can I learn more?

  • Bryan Sullivan’s excellent paper and presentation from BlackHat 2011.
  • Felipe Aragon’s example-based article on JS Injection techniques.
  • A blog post on SQL Injection in MongoDB, here.
  • Earlier installments in this blog series from Mark Kraynak in Part #1 and Part #2
  • A blog on the MongoHQ breach from last year, here



