Eric Lippert Dissects CVE-2014-6332, a 19 year-old Microsoft bug

Posted by Eric, Comments

Today's Coverity Security Research Lab blog post is from guest blogger Eric Lippert.

[UPDATE 1: The MISSING_RESTORE checker regrettably doesn't find the defect in the code I've posted here. Its heuristics for avoiding false positives causes it to suppress reporting, ironically enough. We're working on tweaking that heuristic for an upcoming release.]

It was with a bizarre combination of nostalgia and horror that I read this morning about a 19-year-old rather severe security hole in Windows. Nostalgia because every bit of the exploited code is very familiar to me: working on the portion of the VBScript engine used to exploit the defect was one of my first jobs at Microsoft back in the mid-1990s. And horror because this is really a quite serious defect that has been present probably since Windows 3.1, [Update 2: heard that Windows 3.1 is in fact not affected, so you IE 2-5 users are safe ;)] and definitely exploitable since Windows 95. Fortunately we have no evidence that this exploit has actually been used to do harm to users, and Microsoft has released a patch. (Part of my horror was the fear that maybe this one was my bad, but it looks like the actual bug predates my time at Microsoft. Whew!)

The thirty-thousand foot view is the old familiar story. An attacker who wishes to run arbitrary code on a user's machine lures the user into browsing to a web page that contains some hostile script -- VBScript, in this case. The hostile script is running inside a "sandbox" which is supposed to ensure that it only does "safe" operations, but the script attempts to force a particular buggy code path through the underlying operating system code. If it does so successfully, it produces a corrupt data structure in memory which can then be further manipulated by the script. By cleverly controlling the contents of the corrupted data structure, the hostile script can read or write memory and execute code of their choice.

Today I want to expand a bit on Robert Freeman's writeup, linked above, to describe the underlying bug in more detail, the pattern that likely produced it, better ways to write the code, and whether static analysis tools could find this bug. I'm not going to delve into the specifics of how this initially-harmless-looking bug can be exploited by attackers.

What's so safe about a SAFEARRAY?

Many of the data structures familiar to COM programmers today, like VARIANT, BSTR and SAFEARRAY, were created for "OLE Automation"; old-timers will of course remember that OLE stood for "object linking and embedding", the "paste this Excel spreadsheet into that Word document" feature. OLE Automation was the engine that enabled Word and Excel objects to be accessed programmatically by Visual Basic. (In fact the B in BSTR stands for "Basic".) Naturally, Visual Basic uses these data structures for its representations of strings and arrays. The data structure which particularly concerns us today is SAFEARRAY:

typedef struct tagSAFEARRAY 
{
  USHORT         cDims;         // number of dimensions
  USHORT         fFeatures;     // type of elements
  ULONG          cbElements;    // byte size per element
  ULONG          cLocks;        // lock count
  PVOID          pvData;        // data buffer
  SAFEARRAYBOUND rgsabound[1];  // bounds, one per dimension
} SAFEARRAY;

typedef struct tagSAFEARRAYBOUND
{
  ULONG cElements; // number of indices in this dimension
  LONG  lLbound;   // lowest valid index
} SAFEARRAYBOUND;

SAFEARRAYs are so-called because unlike an array in C or C++, a SAFEARRAY inherently knows the dimensionality of the array, the type of the data in the array, the number of bytes in the buffer, and finally, the bounds on each dimension. How multi-dimensional arrays and arrays of unusual types are handled is irrelevant to our discussion today, so let's assume that the array involved in the attack is a single-dimensional array of VARIANT.

The operating system method which contained the bug was SafeArrayRedim, which takes an existing array and a new set of bounds for the least significant dimension -- though again, for our purposes, we'll assume that there is only one dimension. The function header is:

HRESULT SafeArrayRedim(
  SAFEARRAY      *psa,
  SAFEARRAYBOUND *psaboundNew
)

Now, we do not have the source code of this method, but based on the description of the exploit we can guess that it looks something like the code below that I made up just now.

Bits of code that are not particularly germane to the defect I will omit, and I'll assume that somehow the standard OLE memory allocator has been obtained. Of course there are many cases that must be considered here -- such as "what if the lock count is non zero?" -- that I am going to ignore in pursuit of understanding the relevant bug today.

As you're reading the code, see if you can spot the defect:

{
  // Omitted: verify that the arguments are valid; produce
  // E_INVALIDARG or other error if they are not.

  PVOID pResourcesToCleanUp = NULL; // We'll need this later.
  HRESULT hr = S_OK;

  // How many bytes do we need in the buffer for the original array?
  // and for the new array?

  LONG cbOriginalSize = SomehowComputeTotalSizeOfOriginalArray(psa);
  LONG cbNewSize = SomehowComputeTotalSizeOfNewArray(psa, psaboundNew);
  LONG cbDifference = cbNewSize - cbOriginalSize;

  if (cbDifference == 0)
  {
    goto DONE;
  }

  SAFEARRAYBOUND originalBound = psa->rgsabound[0];
  psa->rgsabound[0] = *psaboundNew;
  // continues below ...

Things are looking pretty reasonable so far. Now we get to the tricky bit.

Why is it so hard to shrink an array?

If the array is being made smaller, the variants that are going to be dropped on the floor might contain resources that need to be cleaned up. For example, if we have an array of 1000 variants containing strings, and we reallocate that to only 300, those 700 strings need to be freed. Or, if instead of strings they are COM objects, they need to have their reference counts decreased.

But now we are faced with a serious problem. We cannot clean up the resources after the reallocation. If the reallocation succeeds then we no longer have any legal way to access the memory that we need to scan for resources to free; that memory could be shredded, or worse, it could be reallocated to another block on another thread and filled in with anything. You simply cannot touch memory after you've freed it. But we cannot clean up resources before the reallocation either, because what if the reallocation fails? It is rare for a reallocation that shrinks a block to fail. While the documentation for IMalloc::Realloc doesn't call out it can fail when shrinking (doc bug?), it doesn't rule it out either. In that case we have to return the original array, untouched, and deallocating 70% of the strings in the array is definitely not "untouched".

The solution to this impass is we have to allocate a new block and copy the resources into that new block before the reallocation. After a successful reallocation we can clean up the resources; after a failed reallocation we of course do not.

  // ... continued from above
  if (cbDifference < 0)
  {
    pResourcesToCleanUp = pmalloc->Alloc(-cbDifference);
    if (pResourcesToCleanUp == NULL)
    {
      hr = E_OUTOFMEMORY;
      goto DONE;
    }
    // Omitted: memcpy the resources to pResourcesToCleanUp
  }

  PVOID pNewData = pmalloc->Realloc(psa->pvData, cbNewSize);
  if (pNewData == NULL)
  {
    psa->rgsabound[0] = originalBound;
    hr = E_OUTOFMEMORY;
    goto DONE;
  }
  psa->pvData = pNewData;

  if (cbDifference < 0)
  {
    // Omitted: clean up the resources in pResourcesToCleanUp
  }
  else
  {
    // Omitted: initialize the new array slots to zero
  }

  hr = S_OK; // Success!

DONE:

  // Don't forget to free that extra block.
  if (pResourcesToCleanUp != NULL)
    pmalloc->Free(pResourcesToCleanUp);
  return hr;
}

Did you spot the defect?

Part of the contract of this method is that when this method returns a failure code, the original array is unchanged. The contract is violated in the code path where the array is being shrunk and the allocation of pResourcesToCleanUp fails. In that case we return a failure code, but never restore the state of the bounds which were mutated earlier to the smaller values. Compare this code path to the code path where the reallocation fails, and you'll see that the restoration line is missing.

In a world where there is no hostile code running on your machine, this is not a serious bug. What's the worst that can happen? In the incredibly rare case where you are shrinking an array by an amount bigger than the memory you have available in the process, you end up with a SAFEARRAY that has the wrong bounds in a program that just produced a reallocation error anyways, and any resources that were in that memory are never freed. Not a big deal. This is the world in which OLE Automation was written: a world where people did not accidentally download hostile code off the Internet and run it automatically.

But in our world this bug is a serious problem! An attacker can make what used to be an incredibly rare situation -- running out of virtual address space at exactly the wrong time -- quite common by carefully controlling how much memory is allocated at any one time by the script. An attacker can cause the script engine to ignore the reallocation error and keep on processing the now-internally-inconsistent array. And once we have an inconsistent data structure in memory, the attacker can use other sophisticated techniques to take advantage of this corrupt data structure to read and write memory that they have no business reading and writing. Like I said before, I'm not going to go into the exact details of the further exploits that take advantage of this bug; today I'm interested in the bug itself. See the linked article for some thoughts on the exploit.

How can we avoid this defect? How can we detect it?

It is surprisingly easy to write these sorts of bugs in COM code. What can you do to avoid this problem? I wrote who knows how many thousands of lines of COM code in my early days at Microsoft, and I avoided these problems by application of a strict discipline. Among my many rules for myself were:

  • Every method has exactly one exit point.
  • Every local variable is initialized to a sensible value or NULL.
  • Every non-NULL local variable is cleaned up at the exit point
  • Conversely, if the resource is cleaned up early on a path, or if its ownership is ever transferred elsewhere, then the local is set back to NULL.
  • Methods which modify memory locations owned by their callers do so only at the exit point, and only when the method is about to return a success code.

The code which I've presented here today -- which I want to emphasize again I made up myself just now to illustrate what the original bug probably looks like -- follows some of these best practices, but not all of them. There is one exit point. Every local is initialized. One of the resources -- the pResourcesToCleanUp block -- is cleaned up correctly at the exit point. But the last rule is violated: memory owned by the caller is modified early, rather than immediately before returning success. The requirement that the developer always remember to re-mutate the caller's data in the event of an error is a bug waiting to happen, and in this case, it did happen.

Clearly the code I presented today does not follow my best practices for writing good COM methods. Is there a more general pattern to this defect? A closely related defect pattern that I see quite often in C, C++, C# and Java is:

someLocal = someExternal;
someExternal = differentValue;
DoSomethingThatDependsOnTheExternal();
//... lots of code ...
if (someError) return;
//... lots of code ...
someExternal = someLocal;

And of course the variation where the restoration of the external value is skipped because of an unhandled exception is common in C++, C# and Java.

Could a static analyzer help find defects like this? Certainly; Coverity's MISSING_RESTORE analyzer finds defects of the form I've just described. (Though I have not yet had a chance to run the code I presented today through it to see what happens.)

There are a lot of challenges in designing analyzers to find the defect I presented today; one is determining that in this code the missing restoration is a defect on the error path but correct on the success path. This real-world defect is a good inspiration for some avenues for further research in this area; have you seen similar defects that follow this pattern in real-world code, in any language? I'd love to see your examples; please leave a comment if you have one.

Detecting SSLv3 in Java

Posted by Jon, Comments

Our SAST product, Security Advisor, recently released a couple of new checkers and updated a couple existing ones. One of the checkers, RISKY_CRYPTO, is now looking for SSLv3. SSLv3 should be considered beyond deprecated because of POODLE, so its use is truly risky at this point. The checker looks for its use implicitly (e.g. some JRE defaults) or explicitly, in either client or server sockets.

An example defect is in org.alfresco.encryption.ssl.AuthSSLProtocolSocketFactory.createSocket, from the Alfresco project. The new version of the analysis flags a defect on line 175, when the socket is bound. Implicitly, the SSLv3 protocol is allowed in the JVM, so this socket potentially exposes itself to the POODLE vulnerability. (If CBC isn't allowed, then this would be a false positive.)



Not bad remediation advice, eh?

As a recovering security consultant, I really hated any tool that reported general crypto usage of say MD5 or something. While RISKY_CRYPTO does this, because sadly people do ask for us to look for this, we're also releasing a smarter crypto checker called WEAK_PASSWORD_HASH.

Take the method com.laoer.bbscs.comm.Util.hash from Java web application called BBS Community System.

public synchronized static final String hash(String data) {
    if (digest == null) {
        try {
            digest = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException nsae) {
            System.err
                    .println("Failed to load the MD5 MessageDigest. " + "We will be unable to function normally.");
            nsae.printStackTrace();
        }
    }
    // Now, compute hash.
    digest.update(data.getBytes());
    return encodeHex(digest.digest());
}

The RISKY_CRYPTO checker will flag the "MD5" as being bad, mmmkay. The issue here isn't that MD5 is being used, it's that it's being used to hash a password. Now pointing out MD5 might be good enough for a security professional. It's like blood in the water. However, just stating XYZ algorithm is in use isn't necessarily evil. Developers want more. If you're a security person, be ready to answer: "why is it bad in this context, in this piece of code?"

Glossing over some of the details, the new WEAK_PASSWORD_HASH checker flags data it thinks is a password. It then tracks the password data flow until it reaches a hashing sink it thinks is not adequate. (There's a bit about salts in there I'm skipping, but you get the idea.)

Case in point, WEAK_PASSWORD_HASH correctly infers the Struts 2 entry point com.laoer.bbscs.web.action.Cpasswd.setOldPassword as a source of password data. It tracks the data flow of this field to line 99:

UserInfo ui = this.getUserService().findUserInfoById(this.getUserSession().getId());
if (ui != null) {
    String op = Util.hash(this.getOldpasswd());
    if (!op.equals(ui.getRePasswd())) {
        this.addActionError(this.getText("error.changepasswdold"));
        return INPUT;
    }

... where the unsafe Util.hash method is called. Now that's a defect I'd rather see than RISKY_CRYPTO, or whatever your SAST tool's checker is called, flagging the use of MD5. Now, your developers have the answer to their question without involving anyone. Devs like that.

Understanding Python Bytecode

Posted by Romain, Comments

I've been working with Python bytecode recently, and wanted to share some of my experience working with it. To be more precise, I've been working exclusively on the bytecode for the CPython interpreter, and limited to versions 2.6 and 2.7.

Python is a dynamic language, and running it from the command line essentially triggers the following steps:

  • The source is compiled the first time it is encountered (e.g., imported as a module or directly executed). This step generates the binary file, with a pyc or pyo extension depending on your system.
  • The interpreter reads the binary file and executes the instructions (opcodes) one at a time.

The python interpreter is stack-based, and to understand the dataflow, we need to know what the stack effect is of each instruction (i.e., opcode and argument).

Inspecting a Python Binary File

The simplest way to get the bytecode of a binary file is to unmarshall the CodeType structure:

import marshal
fd = open('path/to/my.pyc', 'rb')
magic = fd.read(4) # python version specific magic num
date = fd.read(4)  # compilation date
code_object = marshal.load(fd)
fd.close()

The code_object now contains a CodeType object which represents the entire module from the loaded file. To inspect all nested code objects from this module, meaning class declarations, methods, etc. we need to recursively inspect the const pool from the CodeType; that means doing something like this:

import types

def inspect_code_object(co_obj, indent=''):
  print indent, "%s(lineno:%d)" % (co_obj.co_name, co_obj.co_firstlineno)
  for c in co_obj.co_consts:
    if isinstance(c, types.CodeType):
      inspect_code_object(c, indent + '  ')

inspect_code_object(code_object) # We resume from the previous snippet

In this case, we'll print a tree of code objects nested under their respective parents. For the following simple code:

class A:
  def __init__(self):
    pass
  def __repr__(self):
    return 'A()'
a = A()
print a

We'll get the tree:

 <module>(lineno:2)
   A(lineno:2)
     __init__(lineno:3)
     __repr__(lineno:5)

For testing, we can get the code object from a string that contains the Python source code by using the compile directive:

co_obj = compile(python_source_code, '<string>', 'exec')

For more inspection of the code object, we can have a look at the co_* fields from the Python documentation.

First Look Into the Bytecode

Once we get the code objects, we can actually start looking at the disassembly of it (in the co_code field). Parsing the bytecode to make sense out of it means:

  • Interpreting what the opcode means
  • Dereference any argument

The disassemble function in the dis module shows how to do that. It will actually provide the following output from our previous code example:

2   0 LOAD_CONST        0 ('A')
    3 LOAD_CONST        3 (())
    6 LOAD_CONST        1 (<code object A at 0x42424242, file "<string>", line 2>)
    9 MAKE_FUNCTION     0
   12 CALL_FUNCTION     0
   15 BUILD_CLASS
   16 STORE_NAME        0 (A)

8  19 LOAD_NAME         0 (A)
   22 CALL_FUNCTION     0
   25 STORE_NAME        1 (a)

9  28 LOAD_NAME         1 (a)
   31 PRINT_ITEM
   32 PRINT_NEWLINE
   33 LOAD_CONST        2 (None)
   36 RETURN_VALUE

Where we get:

  • The line number (when it changed)
  • The index of the instruction
  • The opcode of the current instruction
  • The oparg, which is what the opcode takes to resolve to the actual argument, it knows where to look based on the opcode. For example, with a LOAD_NAME opcode, the oparg will point to the index in the co_names tuple.
  • The resolved argument in parentheses

As we can see at the index 6, the LOAD_CONST opcode takes an oparg that points to which object should be loaded from the co_consts tuple. Here, it points to the type declaration of A. Recursively, we can go and decompile all code objects to get the full bytecode of the module.

The first part of the bytecode (index 0 to 16) relates to the type declaration of A while the rest represents the code where we instantiate an A and print it. Even in this code, there are constructs that are not relevant unless you plan on modifying the bytecode and changing types, etc.

Interesting Bytecode Constructs

The overall opcodes are fairly straight forward, but a few cases seem weird as they might come from:

  • Compiler optimizations
  • Interpreter optimizations (therefore leading to extra opcodes)

Variables Assignment with Sequences

In the first category, we can have a look at what happens when the source assign sequences of variables:

(1) a, b = 1, '2'
(2) a, b = 1, e
(3) a, b, c = 1, 2, e
(4) a, b, c, d = 1, 2, 3, e

These 4 statements produce quite a different bytecode.

The first case is the simplest one since the right-hand side (RHS) of the assignment contains only constants. In that case, CPython can create the tuple (1, '2'), use UNPACK_SEQUENCE to put 2 elements on the stack, and create a STORE_FAST for each variable a and b:

0 LOAD_CONST               5 ((1, '2'))
3 UNPACK_SEQUENCE          2
6 STORE_FAST               0 (a)
9 STORE_FAST               1 (b)

The second case however introduce a variable on the RHS, so the generic case is called where an expression is fetched (here, a simple one with a LOAD_GLOBAL). The compiler however does not need to create a new tuple from the values on the stack (at index 18) and use an UNPACK_SEQUENCE; it's sufficient to call the ROT_TWO which swaps the 2 top elements from the stack (it might have been enough to switch 19 and 22 though):

12 LOAD_CONST               1 (1)
15 LOAD_GLOBAL              0 (e)
18 ROT_TWO
19 STORE_FAST               0 (a)
22 STORE_FAST               1 (b)

The third case is where it becomes really strange. Putting the expressions on the stack is exactly the same mechanism as in the previous case, but after it first swap the 3 top elements, then swap again the 2 top elements:

25 LOAD_CONST               1 (1)
28 LOAD_CONST               3 (2)
31 LOAD_GLOBAL              0 (e)
34 ROT_THREE
35 ROT_TWO
36 STORE_FAST               0 (a)
39 STORE_FAST               1 (b)
42 STORE_FAST               2 (c)

The final one represents the generic case, where no more ROT_*-play seems possible and a tuple is created and then a call to UNPACK_SEQUENCE to put them on the stack:

45 LOAD_CONST               1 (1)
48 LOAD_CONST               3 (2)
51 LOAD_CONST               4 (3)
54 LOAD_GLOBAL              0 (e)
57 BUILD_TUPLE              4
60 UNPACK_SEQUENCE          4
63 STORE_FAST               0 (a)
66 STORE_FAST               1 (b)
69 STORE_FAST               2 (c)
72 STORE_FAST               3 (d)

Call Constructs

The last set of interesting examples are around the call constructs and the 4 different opcodes to create calls. I suppose the number of opcodes is to optimize the interpreter code, since it's not like in Java where it makes sense to have one of the invokedynamic, invokeinterface, invokespecial, invokestatic, or invokevirtual.

In Java, invokeinterface, invokespecial and invokevirtual are originally coming from the static typing of the language (and invokespecial is only used for calling constructors and superclasses AFAIK). invokestatic is self describing (no need to put the receiver on the stack) and there is no such concept (down to the interpreter and not through decorators) in Python. In short, Python calls could always be translated with an invokedynamic.

The different CALL_* opcodes in Python are indeed not here because of typing, static methods, or the need to have a special access for constructors. They are all targeting on how a method call can be specified in Python; from the grammar:

  Call(expr func, expr* args, keyword* keywords,
       expr? starargs, expr? kwargs)

The calls structure allow for code like this:

func(arg1, arg2, keyword=SOME_VALUE, *unpack_list, **unpack_dict)

The keyword arguments allow for passing formal parameters by name and not just position, the * puts all elements from the iterable as arguments (inlined, not in a tuple), and the ** expects a dictionary of keywords with values.

This example actually uses all possible features of the call site construction:

  • Variables argument list passing (_VAR): CALL_FUNCTION_VAR, CALL_FUNCTION_VAR_KW
  • Keyword based dict passing (_KW): CALL_FUNCTION_KW, CALL_FUNCTION_VAR_KW

The bytecode looks like this:

 0 LOAD_NAME                0 (func)
 3 LOAD_NAME                1 (arg1)
 6 LOAD_NAME                2 (arg2)
 9 LOAD_CONST               0 ('keyword')
12 LOAD_NAME                3 (SOME_VALUE)
15 LOAD_NAME                4 (unpack_list)
18 LOAD_NAME                5 (unpack_dict)
21 CALL_FUNCTION_VAR_KW   258

Usually, a CALL_FUNCTION takes as oparg the number of arguments for the function. Here however, more information is encoded. The first byte (0xff mask) carries the number of arguments and the second one ((value >> 8) & 0xff) the number of keyword arguments passed. To compute the number of elements to pop from the stack, we then need to get:

na = arg & 0xff         # num args
nk = (arg >> 8) & 0xff  # num keywords
n_to_pop = na + 2 * nk + CALL_EXTRA_ARG_OFFSET[op]

where CALL_EXTRA_ARG_OFFSET contains an offset specific to the call opcode (2 for CALL_FUNCTION_VAR_KW). Here, that gives us 6, the number of elements to pop before accessing the function name.

To relate to other CALL_* keywords, it then all depends if the code is either using the list passing or dictionary passing argument; it's all about combination here!

Building a Minimal CFG

For understanding how the code actually works, it's interesting to build a control-flow graph (CFG) so we can follow which unconditional sequences of opcodes (basic blocks) will be executed, and under what conditions.

Even if the bytecode is a fairly small language, building a reliable CFG requires more details than this blog post can allow, so for an actual implementation of a CFG construction, you can have a look at equip.

Here, we'll focus on loop/exception free code, where the control flow only depends on if statements.

There are a handful of opcodes that carry a jump address (for non-loop/exceptions); they are:

  • JUMP_FORWARD: Relative jump in the bytecode. Takes the amount of bytes to skip.
  • JUMP_IF_FALSE_OR_POP, JUMP_IF_TRUE_OR_POP, JUMP_ABSOLUTE, POP_JUMP_IF_FALSE, and POP_JUMP_IF_TRUE all take absolute index in the bytecode.

Building the CFG for a function means creating basic blocks (sequence of opcodes that have unconditional execution -- except when an exception can occur), and connecting them in a graph that contains conditions on branches. In our case, we only have True, False, and Unconditional branches.

Let's consider the following code example (which should never be used in practice):

def factorial(n):
  if n <= 1:
    return 1
  elif n == 2:
    return 2
  return n * factorial(n - 1)

As mentioned before, we get the code object for the factorial method:

module_co = compile(python_source, '<string>', 'exec')
meth_co = module_co.co_consts[0]

The disassembly looks like this (minus my annotations):

3           0 LOAD_FAST                0 (n)
            3 LOAD_CONST               1 (1)
            6 COMPARE_OP               1 (<=)
            9 POP_JUMP_IF_FALSE       16              <<< control flow

4          12 LOAD_CONST               1 (1)
           15 RETURN_VALUE                            <<< control flow

5     >>   16 LOAD_FAST                0 (n)
           19 LOAD_CONST               2 (2)
           22 COMPARE_OP               2 (==)
           25 POP_JUMP_IF_FALSE       32              <<< control flow

6          28 LOAD_CONST               2 (2)
           31 RETURN_VALUE                            <<< control flow

7     >>   32 LOAD_FAST                0 (n)
           35 LOAD_GLOBAL              0 (factorial)
           38 LOAD_FAST                0 (n)
           41 LOAD_CONST               1 (1)
           44 BINARY_SUBTRACT
           45 CALL_FUNCTION            1
           48 BINARY_MULTIPLY
           49 RETURN_VALUE                            <<< control flow

In this bytecode, we have 5 instructions that change the structure of the CFG (so adds constraints or allows for quick exit):

  • POP_JUMP_IF_FALSE: Jump to the absolute index 16 and 32,
  • RETURN_VALUE: Pop one element from the stack and returns it.

Extracting the basic blocks becomes easy since these instructions that change the control flow are the only one we're interested in detecting. In our case, we don't have jumps that impose no fall-through, but JUMP_FORWARD or JUMP_ABSOLUTE do that.

Example code to extract such structure:

import opcode
RETURN_VALUE = 83
JUMP_FORWARD, JUMP_ABSOLUTE = 110, 113
FALSE_BRANCH_JUMPS = (111, 114) # JUMP_IF_FALSE_OR_POP, POP_JUMP_IF_FALSE

def find_blocks(meth_co):
  blocks = {}
  code = meth_co.co_code
  finger_start_block = 0
  i, length = 0, len(code)
  while i < length:
    op = ord(code[i])
    i += 1
    if op == RETURN_VALUE: # We force finishing the block after the return,
                           # dead code might still exist after though...
      blocks[finger_start_block] = {
        'length': i - finger_start_block - 1,
        'exit': True
      }
      finger_start_block = i
    elif op >= opcode.HAVE_ARGUMENT:
      oparg = ord(code[i]) + (ord(code[i+1]) << 8)
      i += 2
      if op in opcode.hasjabs: # Absolute jump to oparg
        blocks[finger_start_block] = {
          'length': i - finger_start_block
        }
        if op == JUMP_ABSOLUTE: # Only uncond absolute jump
          blocks[finger_start_block]['conditions'] = {
            'uncond': oparg
          }
        else:
          false_index, true_index = (oparg, i) if op in FALSE_BRANCH_JUMPS else (i, oparg)
          blocks[finger_start_block]['conditions'] = {
            'true': true_index,
            'false': false_index
          }
        finger_start_block = i
      elif op in opcode.hasjrel:
        # Essentially do the same...
        pass

  return blocks

And we get the following basic blocks:

Block  0: {'length': 12, 'conditions': {'false': 16, 'true': 12}}
Block 12: {'length': 3, 'exit': True}
Block 16: {'length': 12, 'conditions': {'false': 32, 'true': 28}}
Block 28: {'length': 3, 'exit': True}
Block 32: {'length': 17, 'exit': True}

With the current structure of the blocks:

Basic blocks
  start_block_index :=
     length     := size of instructions
     condition  := true | false | uncond -> target_index
     exit*      := true

we have our control flow graph (minus the entry and implicit return blocks), and we can for example convert it to dot for visualization:

def to_dot(blocks):
  cache = {}

  def get_node_id(idx, buf):
    if idx not in cache:
      cache[idx] = 'node_%d' % idx
      buf.append('%s [label="Block Index %d"];' % (cache[idx], idx))
    return cache[idx]

  buffer = ['digraph CFG {']
  buffer.append('entry [label="CFG Entry"]; ')
  buffer.append('exit  [label="CFG Implicit Return"]; ')

  for block_idx in blocks:
    node_id = get_node_id(block_idx, buffer)
    if block_idx == 0:
      buffer.append('entry -> %s;' % node_id)
    if 'conditions' in blocks[block_idx]:
      for cond_kind in blocks[block_idx]['conditions']:
        target_id = get_node_id(blocks[block_idx]['conditions'][cond_kind], buffer)
        buffer.append('%s -> %s [label="%s"];' % (node_id, target_id, cond_kind))
    if 'exit' in blocks[block_idx]:
      buffer.append('%s -> exit;' % node_id)

  buffer.append('}')
  return '\n'.join(buffer)

To produce the source of that graph:

Why Bother?

It's indeed fairly rare to only have access to the Python bytecode, but I've had this case a few times in the past. Hopefully, this information can help someone starting a reverse engineering project on Python.

Right now however, I've been investigating the ability to instrument Python code, and especially its bytecode since there are no facilities for doing so in Python (and instrumenting source code often leaves with very inefficient instrumentation code with decorators, etc.). That's where equip comes from.

Shell Shock in Java Apps

Posted by Ian, Comments

A few weeks ago, security researchers disclosed a vulnerability in Bash, a shell commonly installed on most Unix-style operating systems. This vulnerability, commonly referred to as "Shell Shock," has the potential to allow arbitrary code execution on the target system. Furthermore, the ubiquity of Bash means that the majority of web servers are potentially vulnerable to this issue.

The vulnerability in Bash is caused by the shell executing the entirety of environment variables which represent functions. By appending commands to a function definition, an attacker could execute arbitrary commands on the system. However, for this vulnerability to be exploited the server must present some interface for an attacker to control environment variables before a shell is launched. The most direct avenue for such an attack is CGI which directly places user-specified values into environment variables before executing a target script or command. However, there are many other server applications which pass user-supplied values into environment variables such as Postfix and OpenVPN. SSH is also vulnerable to this attack when depending on a restrictive command in the authorized_keys file (in which case the original command is placed in the SSH_ORIGINAL_COMMAND environment variable). Gitolite is a very common use case for this and can also be exploited.

The original defect in Bash is not one that could be detected through a general static analysis. Although it likely wasn't, for all an analyzer could tell this semantic behavior of Bash could very well have been intentional, with the developer holding the expectation that any process executing it would set a sane environment before invoking the shell. To make the distinction an automated tool would require a formal specification of the intended behavior of Bash. On the other hand, server applications passing user-controllable data into environment variables is something that can be detected by static analysis, and it is worth looking for since this would provide precisely what an attacker needs in order to exploit a defect such as Shell Shock.

Allowing user-controllable data into environment variables has always been a vulnerability, but the ability to exploit it has been raised dramatically with the disclosure of Shell Shock. Given the raised impact of this defect, we at Coverity set upon the task of answering the question "Is it a common anti-pattern for Java web applications to pass user-controllable data into environment variables when spawning applications?" To answer this question, we built a new checker which looked for tainted data (i.e. input from an HTTP request, database, filesystem, etc.) flowing into the environment variables of a new process.

To implement the new checker we utilized our existing dataflow analysis tools which are used for our other checkers such as SQL injection, XSS, and OS command injection. The sinks for this dataflow are the JDK interfaces for creating processes: java.lang.Runtime and java.lang.ProcessBuilder. The former was a simple dataflow analysis, with the second parameter of the various Runtime.exec() methods acting as the sink. Analyzing ProcessBuilder, on the other hand, is a slightly more complicated task since the sink for tainted data is the Map.put() and Map.putAll() methods returned from the ProcessBuilder.environment() method. Just specifying the methods on the Map interface would be too general, but we found a number of applications which passed the Map returned from environment() into other methods (which themselves make no reference to ProcessBuilder), so we also cannot just rely on the contextual presence of a ProcessBuilder.

To handle this we modeled ProcessBuilder.environment() as the source of a new taint type. To report a defect on Map.put() and Map.putAll(), we required both the Map to have a dataflow path in which through which this new taint type was propagated, and that the second parameter have a dataflow path in which it received an untrusted source of taint (such as a servlet request). That these dataflow paths are considered independently is an over-approximation that could lead to false positives. For example:

public void entryPoint(HttpServletRequest request) {
  doPut(request.getParameter("name"), new HashMap<String, String>());
  doPut("name", new ProcessBuilder().environment());
}
private void doPut(String name, Map<String, String> env) {
  env.put(name);
}

This code would result in a false positive at the env.put(name) call since it has the two requisite dataflow paths described above. By not engineering a solution in which our dataflow engine understands a necessary overlap of two searches we may see false positives such as the above, but for the sake of an experiment this allowed us to create the checker from our existing tools with only half a day of work.

With the checker in-hand, we ran an analysis on a test suite of 76 Java applications. Although the checker did find some defects, the only source of taint flowing into the new environment variables was from the JVM's own environment variables. In some cases — such as in Bash — environment variables should be considered an untrusted source, but this is not usually part of the threat model for long-running Java web applications. With this taint source ignored, we were pleased to discover that no additional defects were detected (in spite of the aforementioned over-approximation). Although our necessarily limited search is not going to be a representative sampling of all Java applications, it suggests that it is not a typical pattern for Java applications to pass user-controlled data to other processes through environment variables. So although developers should remain vigilant in their handling of user-controllable inputs in all contexts, environment variable injection is unlikely to be a high-frequency defect in Java web applications.

Have you encountered Java applications which put user-controllable data into environment variables? If you have seen examples or believe this to be common, let us know!

Secure Code: By Design? Serendipity? Or...?

Posted by Jon, Comments

While researching Struts 2 again to expand our current framework support, I became a bit more familiar with its tag library. This lead to a better understanding of Struts 2 basic OGNL syntax and the creation of some test cases. During the test case development, I was amused at all the different ways one can obtain the value from a getter function. Here's a snippet for your amusement:

"%{tainted}": <s:property value="%{tainted}" escape="false" />
"getTainted()": <s:property value="getTainted()" escape="false" />
"%{getTainted()}": <s:property value="%{getTainted()}" escape="false" />
"#this.tainted": <s:property value="#this.tainted" escape="false" />
"%{#this.tainted}": <s:property value="%{#this.tainted}" escape="false" />
"top.tainted": <s:property value="top.tainted" escape="false" />
...

There are more ways, for sure, but getter synonyms isn't the purpose of this blog. Secure code by chance is.

Secure code isn't just having code that's devoid of vulnerabilities for whatever reasons. Just as insecure code isn't code that happens to have a couple of weaknesses. I've looked at code that had a couple of issues and it seemed relatively secure to me. Just as I've looked at code without any obvious vulnerabilities that I wouldn't consider secure. There are properties to code, hopefully not too subjective, that convey security. To me, one of these security properties is the ability to understand why something is secure or not. If you cannot understand if something is or isn't secure, it isn't secure; it's just serendipity.

Background

If you look at the last line in the JSP snippet above, notice the use of top as a value. This special value obtains the top value from the current ValueStack. In Struts 2 this often is a controller, which is sometimes a subclass of ActionSupport. Why's this important?

Struts 2 has been making the rounds again because of a couple of security advisories: S2-021 and S2-022. Good ol' ParametersInterceptor had a bad couple of days it seems, leading to remote code execution issues. Curiosity got the best of me and I started looking into the updated ParametersInterceptorTest test case and the updated / added ExcludedPatterns class. Looking at that list, you'll see that top isn't in it. Hmm...

Weakness

So top can be a parameter name. Does the framework actually do anything special? ParametersInterceptor eventually calls CompoundRootAccessor.getProperty when top is the parameter name. This returns the root or top of the ValueStack, which again is usually the Action controller that's being accessed. Nice! So we have access to the controller via a parameter name. Let's call some methods!

Since we're calling methods with a value, in OGNL, we're setting a property. This is handled by OgnlRuntime.setProperty:

public static void setProperty(OgnlContext context, Object target, Object name, Object value)
        throws OgnlException
{
    PropertyAccessor accessor;

    if (target == null) {
        throw new OgnlException("target is null for setProperty(null, \"" + name + "\", " + value + ")");
    }
    if ((accessor = getPropertyAccessor(getTargetClass(target))) == null) {
        throw new OgnlException("No property accessor for " + getTargetClass(target).getName());
    }

    accessor.setProperty(context, target, name, value);
}

When using top the target is the specific class, which is a custom class in this case. Since there's no specific PropertyAccessor for this class, this results in an accessor field set to an instance of ObjectAccessor. This calls its super, ObjectPropertyAccessor.setProperty, which calls ObjectPropertyAccessor.setPossibleProperty:

public Object setPossibleProperty(Map context, Object target, String name, Object value)
        throws OgnlException
{
// snip
    if (!OgnlRuntime.setMethodValue(ognlContext, target, name, value, true))
    {
        result = OgnlRuntime.setFieldValue(ognlContext, target, name, value) ? null : OgnlRuntime.NotFound;
    }

    if (result == OgnlRuntime.NotFound)
    {
        Method m = OgnlRuntime.getWriteMethod(target.getClass(), name);
        if (m != null)
        {
            result = m.invoke(target, new Object[] { value});
        }
    }
// snip

In this method, three different ways are used to set or write a value to a property:

  • OgnlRuntime.setMethodValue
  • OgnlRuntime.setFieldValue
  • Invoking the method with the user-provided value.

If all of these fail, the caller method ObjectPropertyAccessor.setProperty throws an exception. And when devMode is on, something like the below is logged:

Unexpected Exception caught setting 'top.foo' on 'class com.coverity.testsuite.property.PropertyAction: Error setting expression 'top.foo' with value '[Ljava.lang.String;@1ef1a094'

So if an exception like the above occurs, we know we hit a Whammy. Otherwise, if we don't hit a Whammy we might be in luck. :) So let's request ?top.text=%25{1*2} and get some RCE! No Whammy, no Whammy, no Whammy, and.... stop!

RCE Attempt 1

Whammy :(

Unexpected Exception caught setting 'top.text' on 'class com.coverity.testsuite.property.PropertyAction: Error setting expression 'top.text' with value '[Ljava.lang.String;@16321271'

Well, what happened? In this case, the call to OgnlRuntime.getWriteMethod returns null in ObjectPropertyAccessor.setPossibleProperty. Hmm...

public static Method getWriteMethod(Class target, String name, int numParms)
{
// snip 
    if ((methods[i].getName().equalsIgnoreCase(name)
         || methods[i].getName().toLowerCase().equals(name.toLowerCase())
         || methods[i].getName().toLowerCase().equals("set" + name.toLowerCase()))
        && !methods[i].getName().startsWith("get")) {
// snip

D'oh! Notice the last part of that conditional. There's a check disallowing one to set a property on a method that starts with get. Boo!

OK, so any methods outside of get* can be called. Guess what, there's yet another OGNL sink on ActionSupport that doesn't start with 'get': hasKey(String)! Let's request ?top.hasKey=%25{1*2}... No Whammy, no Whammy, no Whammy, and... stop!

RCE Attempt 2

Whammy again, drat.

Unexpected Exception caught setting 'top.hasKey' on 'class com.coverity.testsuite.property.PropertyAction: Error setting expression 'top.hasKey' with value '[Ljava.lang.String;@36c63134'

Debugging shows that OgnlRuntime.getWriteMethod when called from ObjectPropertyAccessor.setPossibleProperty did return something this time, awesome! So the method was invoked by reflection via m.invoke(target, new Object[] { value}); with tainted data, nice! Except, it wasn't....

Breaking on ObjectPropertyAccessor.setPossibleProperty and stepping through shows an IllegalArgumentException exception thrown with the message argument type mismatch. Hmm. Looking at the log output, you can see [Ljava.lang.String;, which is the mangled name for String[]. So it seems Struts 2 stores request parameters in a String array. That make sense since one could specify the same parameter name but have different values. And the parameter signature for hasKey is expecting a String, not a String[]. Mismatched argument type. :( Well, shucks. What's one to do w/ a vector that doesn't do anything?!

Recap

What did we learn?

  • The top value bypasses the regular expression exclusion and passes the acceptable names in ParameterInterceptor.
  • top should return a reference to the instance of the Action associated to the URL.
  • It's common, although not required, that Actions exposed are usually subclasses of ActionSupport
  • Methods that start with get* cannot be used as a parameter key. (ActionSupport.getText failed here.)
  • Methods that don't conform to the JavaBean getter / setter convertion need to accept a String[] parameter. (ActionSupport.hasKey failed here.)

It's just dumb luck that RCE didn't happen. For example, if the value was massaged from a String[] to a String, as what happens when the normal getter / setters are called (see XWorkBasicConverter.convertValue), then this could have been RCE. Is it obvious to anyone supporting this code that a custom public method on a class called addValues(String[] values) is accessible via ?top.addValues=value1&top.addValues=value2&... ?

Final Thoughts

I tried to think what the Struts 2 developers could do but I'm lost. I'd rather they remove the exclusion list and remove the functionality that's causing the code to be evaluated in such a way. Exclusion or black lists are like door braces trying to keep out the invading hordes; eventually the hordes break through and raid the castle. Maybe they could ensure ObjectAccessor isn't called on a parameter name, which is tainted. However, I'm guessing there are a lot of things I dunno about Struts 2 that makes this a horrible (and possibly insecure) design choice. Maybe in this case a black list is as good as it gets? If so, is the code still secure? Or is it just lucky?

Appendix

'top.hasKey' vs. 'hasKey'

If you're wondering why specifying top.hasKey is different than hasKey, debug the call to OgnlRuntime.setProperty. Here's the snippet from above again:

public static void setProperty(OgnlContext context, Object target, Object name, Object value)
        throws OgnlException
{
    PropertyAccessor accessor;

    if (target == null) {
        throw new OgnlException("target is null for setProperty(null, \"" + name + "\", " + value + ")");
    }
    if ((accessor = getPropertyAccessor(getTargetClass(target))) == null) {
        throw new OgnlException("No property accessor for " + getTargetClass(target).getName());
    }

    accessor.setProperty(context, target, name, value);
}

When called, if the target is of type CompoundRoot, which it will be for hasKey, accessor is set to an instance of CompoundRootAccessor. When calling top.hasKey the target is the specific class, which is a custom class in this case. This results in an accessor field set to an instance of ObjectAccessor. These two types perform different checks when calling their setProperty methods, with the top case having some potential holes.

Shenanigans with Numbers

Try out the following :)

  • ?0xdeadbeef['equals']=something
  • ?066['equals']=something
  • ?1L['equals']=something

You shouldn't notice any exceptions firing. The OGNL parser is parsing those as numbers and successfully calling the equals method on the respective boxed number class (e.g. Integer.equals).

Now try these:

  • ?12_['ignored']=whatever // Throws ognl.ParseException in ognl.OgnlParser.topLevelExpression()
  • ?123[066]=whatever // 066 converted to decimal 54, and Integer.54() is called, which results in a "54" property not found exception.

Handling web frameworks; a case of Spring MVC - Part 1

Posted by Romain, Comments

Coverity has been known for years for its static analysis technology for C/C++ applications. A couple of years ago, we started a new project to focus on the security analysis of Java web applications. During development, one of the first issues we faced analyzing open source applications was the prevalence and diversity of web frameworks: we did not find many security defects because we lacked the understanding of how untrusted data enters the application as well as how the control flow is affected by these frameworks. To change this, we started developing of framework analyzers.

This blog post presents examples and explains what the analysis needs to understand and extract to perform a solid analysis. We focus on Spring MVC, one of the most common and complex Java web frameworks.

Our example Spring MVC application

To illustrate the framework analysis, I've created a small Spring MVC application that you can find on Github blog-app-spring-mvc. It has features that most Spring applications are using: auto-binding, model attributes, JSPs, and JSON responses. The application itself is very simple and can add users to a persistent store; there is a simple interface to query it, and we also display the latest user.

To show the different features of the framework, I will use two kinds of defects: cross-site scripting (XSS) and path manipulation. The application can be run, and it's possible to exploit these issues; we have 2 simple XSS and 3 path manipulations that are mostly present to trigger defects from the analysis.

Here's the layout of the application that you can build and run using Maven.

src/main/
├── java
│   └── com
│       └── coverity
│           └── blog
│               ├── HomeController.java
│               ├── UsersController.java
│               ├── beans
│               │   └── User.java
│               └── service
│                   └── UsersService.java
└── webapp
    └── WEB-INF
        ├── spring
        │   ├── appServlet
        │   │   └── servlet-context.xml
        │   └── root-context.xml
        ├── views
        │   ├── error.jsp
        │   ├── home.jsp
        │   └── user
        │       └── list.jsp
        └── web.xml

The Java code lives under src/main/java and our package names, while the Spring configurations, JSP files, and web.xml are under the webapp directory. This is a very common structure.

Build and run

If you're not using Maven frequently, you'll need to get it here, go to the root of the project (where the pom.xml is), and run:

  mvn package

to create the WAR file.

You can also run the application directly with a Maven command (always good for proving framework behaviors):

  mvn jetty:run

Developer view: some simple code using Spring MVC

Spring MVC is one of the most common web frameworks. If you look at the abysmal documentation, you will see it has tons of features. Spring MVC implements the model-view-controller (MVC) pattern, where its definition of the MVC is essentially:

  • Model: Passing data around (typically from the controller to the view)
  • View: Presentation of the data (e.g., rendered with JSP, JSON, etc.)
  • Controller: What is responsible for getting data and calling business code

Here's our first controller example, HomeController.java:

@Controller
public class HomeController {
  // Being polite
  @RequestMapping(value="/hello", produces="text/plain")
  public @ResponseBody String sayHello(String name) {
    return "Hello " + name + "!"; // No XSS
  }

  // Display our view
  @RequestMapping(value="/index")
  public String index(User user, Model model) {
    model.addAttribute("current_user", user);
    return "home";
  }
}

In this case, the configuration is very basic. As usual, we need to specify that we want Spring's Servlet to analyze the bytecode to look for @Controller and map the associated entry points (@RequestMapping annotated) with the configuration in servlet-context.xml (we also need to enable Spring in the container, so there are references to it in the web.xml). In this class, we only have 2 entry points:

  1. HomeController.sayHello(...) which takes one parameter "name", and returns a String that contains the body of the response (what's being displayed by the browser).
  2. HomeController.index(...) which has 2 parameters and returns a String that points to the location of the view (a JSP in this case)

Workflow for HomeController.sayHello(...)

Executing this code by reaching /hello?name=Doctor produces the output to be displayed by the browser being:

  Hello Doctor!

The browser also receives the content type as text/plain, so no markup will be rendered here.

Workflow for HomeController.index(...)

The second entry point uses a common feature from web frameworks: auto binding. Spring MVC will instantiate a new User and pass it to the entry point, its fields will be populated by what's passed from the HTTP parameters; the Model is also given by Spring, but its meant to be a map-like object to pass to the view for rendering.

Executing /index?name=TheDoctor&email=doc@future.com will call the HomeController.index(...) method with the first parameter being the bean User({name: TheDoctor, email: doc@future.com}). We later add it to the model so it will automatically be dispatched to the view and accessible from the JSP using EL.

Our JSP is minimalist and contains the following code:

<c:choose>
  <c:when test="${not empty current_user.name}">
    Hello ${cov:htmlEscape(current_user.name)}! <%-- No XSS here --%>
  </c:when>
  <c:otherwise>
    The bean field `name` is reflected, such as <a href="/blog/index?name=TheDoctor">here</a>.
  </c:otherwise>
</c:choose>

where the bean current_user set from the entry point code in the Model is filled with the data populated from the HTTP request. The JSP code will display an HTML escaped name in the current_user bean if it's not null or empty, otherwise display the static contents, body of the c:otherwise tag.

Analysis view: call graph roots, etc.

When running a vanilla analysis on this type of code, not much happens. In fact, the type HomeController has no known instance and HomeController.sayHello(...) is never called in the entire program. A typical analysis would mark this type of method as dead code. The challenge of the framework analysis is to translate how Spring MVC is used in the application into a set of facts that can be acted upon by our static analysis.

The kind of properties we need to extract belong to the following areas:

  • Control flow: How can this function be reached, what's happening when the function returns
  • Data flow: How is the data provided by the framework (automatically dispatched, etc.)
  • Taint analysis: What data is tainted and how (e.g., what parts are tainted or what fields)
  • Domain specific: Facts related to HTTP requests, responses, URLs, Servlet filters in place, etc.

To achieve this, we've created a new phase in the analysis that looks for framework footprints in the source code as well as in the bytecode. The framework analyzers also require access to the configuration files in order to properly simulate how the framework will operate at runtime.

This framework analysis phase extracts the following facts for the first entry point:

  1. HomeController.sayHello(...) is an entry point (or callback)
  2. The name parameter cannot be trusted, so it is tainted (with a particular type of taint)
  3. The entry point is reachable via any HTTP method and the URL is /hello
  4. The return value of this method is a body of HTTP response (so a sink for cross-site scripting)
  5. The response has a content-type of text/plain (so the response is not prone to XSS)

In the case of the second entry point, here are the facts we extract:

  1. HomeController.index(...) is an entry point
  2. Only its first parameter user is tainted with a deep-write (i.e., all fields with a public setter are set as tainted)
  3. The entry point is reachable via any HTTP method and the URL is /index
  4. The return value "home" of this entry point is a location of a view
    1. Inspecting the Spring configuration servlet-context.xml, "home" resolves to WEB-INF/views/home.jsp
    2. Connect the return to the _jspService method of WEB-INF/views/home.jsp through a control-flow join
  5. The model is considered a map-wrapper that contains the bean current_user which is our tainted parameter

With these types of facts integrated in the analysis, it is then possible to properly identify the full execution paths and consider what's tainted or not based on the framework rules itself. We can conceptually create a "trace" that needs to act as a backbone for the analysis:

The "trace" has been annotated with facts directly coming directly from the framework analysis.

Final words

We've seen that the properties extracted by the framework analysis are important for the analysis tool to understand how Spring MVC instantiates the entry points and maps them to URLs. That's how we are able to understand when tainted data is entering the application, how it's entering it, and where it's going.

Without this analysis, we would have to make blunt assumptions and suffer from either a high number of false-negatives when we do not understand the framework configuration, or false-positives if we are overly permissive and don't try to properly resolve what's tainted and where it can possibly go. It's actually a good way to test the capabilities of a static analysis tool: modify the framework configurations, insert EL beans that will always be null at runtime, etc.

However, the framework analysis is not limited to providing value for the taint analysis, but also provides information about how the URLs are reached and what are the constraints attached to them, which is important to identify CSRF issues for example.

On Detecting Heartbleed with Static Analysis

Posted by Andy, Comments

Many of our customers have asked whether Coverity can detect Heartbleed. The answer is Not Yet - but we've put together a new analysis heuristic that works remarkably well and does detect it. (UPDATE: the Coverity platform now detects the Heartbleed defect) We wanted to tell our customers and readers about this heuristic and what it shows about the way we approach static analysis.

John Regehr blogged (1) last week about Coverity Scan, our free scanning service for the open source community. While there were interesting defects found in OpenSSL, Heartbleed was not among them. After adding the new heuristic designed to catch this and other similar defects, we shared our updated results with John and he was gracious enough to write a follow-up blog (2), which we think is fantastic.

Initially, we wanted to independently verify the results so we ran the latest production version of our analysis (7.0.3) against openssl-1.0.1. Coverity Scan uses a particular set of analysis options, and we wondered if different settings might cause the defect to appear. After a few experiments, we determined that analysis settings didn't make a difference for this particular defect.

So we dug into the code further to determine why. At its heart, Heartbleed is an out of bounds memory read based on tainted data being used as an argument to memcpy. The main difficulty in detecting it is in realizing the source data is tainted. Most of the descriptions of Heartbleed begin with this line:

unsigned char *p = &s->s3->rrec.data[0]

But for a static analysis, it is not obvious that the field data is tainted, and finding the evidence for this in the program can be difficult. One illustration of this is in the definition of the structure that contains data:

typedef struct ssl3_record_st
        {
/*r */  int type;               /* type of record */
/*rw*/  unsigned int length;    /* How many bytes available */
/*r */  unsigned int off;       /* read/write offset into 'buf' */
/*rw*/  unsigned char *data;    /* pointer to the record data */
/*rw*/  unsigned char *input;   /* where the decode bytes are */
/*r */  unsigned char *comp;    /* only used with decompression - malloc()ed */
/*r */  unsigned long epoch;    /* epoch number, needed by DTLS1 */
/*r */  unsigned char seq_num[8]; /* sequence number, needed by DTLS1 */
        } SSL3_RECORD;

The comments aid human comprehension, but static analysis doesn't benefit much from them. Instead, we attempt to trace the flow of tainted data from where it originates in a library call into the program's data structures. This can be difficult to do without introducing large numbers of false positives, or scaling performance exponentially poorly. In this case, balancing these and other factors in the analysis design caused us to miss this defect.

Program analysis is hard and approximations and trade-offs are absolutely mandatory. We've found that the best results come from a combination of advanced algorithms and knowledge of idioms that occur in real-world code. What's particularly insightful is to analyze critical defects for clues that humans might pick up on but are hard to derive from first principles. These patterns form pieces of evidence that can then be generalized and tested empirically to make the analysis "smarter." Our experience is that this is the only way to build analyses that scale to large programs with low false positive rates, yet find critical defects. Many program analysis problems are undecidable in general, and in practice NP-complete problems and severe time/space/accuracy trade-offs crop up everywhere. Giving the analysis intuition and developer "street smarts" is key to providing high quality analysis results.

The Heartbleed bug is a perfect example of this. I sat down with one of our analysis engineers to examine whether there was any hope for finding this defect in a smarter way. It seemed bleak. The flow of tainted data into the data field was convoluted, and even manually we had a hard time understanding exactly how the code worked.

Then we noticed that the tainted data was being converted via n2s, a macro that performs byte swapping:

Byte swaps are relatively rare operations. They can occur in cryptographic and image processing code, but perhaps the most widespread use is to convert between network and host endianness (e.g. ntohs). We had a hunch that byte swaps constitute fairly strong evidence that the data is from the outside network and therefore tainted (this also applies to reading a potentially untrusted binary file format such as an image). In addition to byte swapping, we also look for the bytes being subsequently recombined into a larger integer type. We also require that the tainted value flows into a tainted sink, such as an array index or, as in this case, a length argument to a memory operation. These additional conditions help avoid false positives when byte swapping is being used in a situation which isn't tainted. For example, outgoing data that is byte swapped is unlikely to flow into a tainted sink.

With that, Heartbleed revealed itself.

This heuristic bypasses the complex control-flow and data-flow path that reaches this point in the program, and instead infers and tracks tainted data near near the point where it is used. It generalizes to all programs that use byte swaps so it is not overly specific to OpenSSL. Nor is this restricted to intraprocedural cases. We've added this heuristic to the derivers that compute function summaries, so any tainted data inferred is automatically propagated throughout the rest of the program. By collecting this and other similar idioms together, we can pick up large amounts of tainted data without any codebase-specific modeling.

Beyond Heartbleed, we found a handful of additional issues in OpenSSL with this heuristic which we are investigating. We believe (hope?) they are false positives. If they are, we will further tune the analysis to understand why. Even without such tuning, we have not seen an "explosion" of false positives.

The entire set of results, including the new heuristic, will be made available to a selected group of users on the OpenSSL project on Coverity Scan shortly.

We plan on performing additional experiments on a larger corpus of code including 60M+ lines of open source and some additional proprietary code to validate our assumptions and determine if there are other common idioms for use of byte swapping that do not imply taintedness. These steps are part of our standard process for vetting all analysis changes before releasing them to our customers.

A quick post on Apple Security 55471, aka goto fail

Posted by Jon, Comments

goto... fail?

If you haven't heard about the ironically named "goto fail" vulnerability, please read Adam Langley's well written article. A summary of the issue is as follows:

static OSStatus
SSLVerifySignedServerKeyExchange(SSLContext *ctx, bool isRsa, SSLBuffer signedParams,
                                 uint8_t *signature, UInt16 signatureLen)
{
...
    if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
        goto fail;
        goto fail;
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
        goto fail;
...
fail:
    SSLFreeBuffer(&signedHashes);
    SSLFreeBuffer(&hashCtx);
    return err;
}

See the two goto fail; statements. The second ensures subsequent statements within the same block will not be executed, making that code unreachable. The problem, as Adam points out, is that the unreachable statement below actually performs a security-critical function. By not having it called, an invalid certificate can be generated with an incorrect signature when certain types of cipher suites are used. Again, read Adam's detailed blog post on the issue.

Would Static Analysis Find This Issue?

Yes.

A successful analysis via cov-analyze, part of Coverity Quality Advisor, on the reduced code shows the UNREACHABLE checker firing. We mocked certain portions of the code so others could quickly compile and analyze the code snippet. However, the SSLVerifySignedServerKeyExchange function is structurally equivalent to the original code.

Pic or it didn't happen:



Since the code snippet is open source, it qualifies for Coverity Scan. Click here to view the project. Or if you're an existing Coverity customer, skip to the bottom of the blog for information on how to reproduce the above.

What Is the Defect?

The UNREACHABLE checker reports many instances of code blocks that cannot be executed because there is no possible control flow to reach it. These defects often occur because of missing braces, which results in unreachable code after break, continue, goto or return statements. In the affected code, the second subsequent goto fail; statement causes the control flow to always bypass the line below it, jumping to the fail: block.

Looking at the Linux Kernel, an open source project in Scan, the UNREACHABLE checker fired 112 times since the Linux kernel has been in Scan. In 2013, 33 of those defects were fixed in the kernel. And since it's been in Scan, 74 of these defects have been resolved. This is about 9 million lines of code. The UNREACHABLE checker is a good example of a high-quality checker in that even on such a large project, it's firing about once ever 80,000 lines. Noisy? Not at all.

Even still, developers may have hundreds to thousands of defects to fix in a large code base, especially one that hasn't had a static analysis tool ran on it before. So which ones should be fixed first?

How Can Developers Prioritize Such Defects?

Any defects found via static analysis in security component code should be treated as a guilty until proven innocent. These defects should take priority in triaging and remedying. This begs the question how will developers know what's a security component? Here's a great spot where developers and security engineers can work together.

Security engineers and application architects should work together to identify security sensitive components. Here's a short list:

  • Authentication: passwords, password resets, login / logout, session management.
  • Authorization: function and data access controls, roles, entitlement.
  • User account management.
  • Cryptographic code, such as TLS. :)
  • Common security controls, such as escapers, validators, etc.

We've blogged about quality defects occurring in security components before. MySQL also had a defect found in an authentication component, which resulted in the ability to authenticate with an incorrect password (CVE-2012-2122).

Out of all the code in an application that should be "Coverity Clean", security components come to mind. What would an UNREACHABLE do to your code base?

Showing My Math

If you're a Coverity customer, you can capture the above command with cov-build via Coverity Quality Advisor 7.0.3 since it includes clang support. You can analyze the above on older versions of Coverity Quality Advisor, such as 6.6.x. Just point to your gcc instead of the clang binary. The rest of the command should be the same.

Grab the code:

git clone https://github.com/coverity-scan/AAPL-Security-55471 AAPL-Security-55471
cd AAPL-Security-55471

Build the code, using cov-build to capture the results.

rm *.o
/Applications/cov-analysis-macosx-7.0.3/bin/cov-build --dir ../55471-ir env LANG=en_US.US-ASCII /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang -x c -arch x86_64 -std=gnu99 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.8.sdk -c sslKeyExchange.c -o sslKeyExchange.o
Coverity Build Capture version 7.0.3 on Darwin 11.4.2 x86_64
Internal version numbers: 95ee7d28a9 p-fresno-push-17316.506

1 C/C++ compilation units (100%) are ready for analysis
The cov-build utility completed successfully.

And then analyze the code via cov-analyze:

/Applications/cov-analysis-macosx-7.0.3/bin/cov-analyze --dir ../55471-ir
Coverity Static Analysis version 7.0.3 on Darwin 11.4.2 x86_64
Internal version numbers: 95ee7d28a9 p-fresno-push-17316.506

Looking for translation units
|0----------25-----------50----------75---------100|
****************************************************
[STATUS] Computing links for 1 translation unit
|0----------25-----------50----------75---------100|
****************************************************
[STATUS] Computing virtual overrides
|0----------25-----------50----------75---------100|
****************************************************
[STATUS] Computing callgraph
|0----------25-----------50----------75---------100|
****************************************************
[STATUS] Topologically sorting 1 function
|0----------25-----------50----------75---------100|
****************************************************
[STATUS] Computing node costs
|0----------25-----------50----------75---------100|
****************************************************
[STATUS] Starting analysis run
|0----------25-----------50----------75---------100|
****************************************************
Analysis summary report:
------------------------
Files analyzed                 : 1
Total LoC input to cov-analyze : 3842
Functions analyzed             : 1
Paths analyzed                 : 17
Time taken by analysis         : 00:00:01
Defect occurrences found       : 1 UNREACHABLE

To Escape or Not to Escape, That Is The Question

Posted by Jon, Comments

ESAPI Canonicalization

From the Encoder.canonicalize JavaDoc:

Canonicalization is simply the operation of reducing a possibly encoded string down to its simplest form. This is important, because attackers frequently use encoding to change their input in a way that will bypass validation filters, but still be interpreted properly by the target of the attack. Note that data encoded more than once is not something that a normal user would generate and should be regarded as an attack.

Everyone says you shouldn't do validation without canonicalizing the data first. This is easier said than done. The canonicalize method can be used to simplify just about any input down to its most basic form. Note that canonicalize doesn't handle Unicode issues, it focuses on higher level encoding and escaping schemes. In addition to simple decoding, canonicalize also handles:

  • Perverse but legal variants of escaping schemes
  • Multiple escaping (%2526 or &lt;)
  • Mixed escaping (%26lt;)
  • Nested escaping (%%316 or &%6ct;)
  • All combinations of multiple, mixed, and nested encoding/escaping (%253c or ┦gt;)

What's wrong with this?

Not all data that looks doubly encoded is actually doubly encoded.

Hideous Java example:

public void getTitleSuggestionsEntryPoint(HttpServletRequest req) {
    String hint = req.getParameter("hint");
    // bug 12345, infosec told us to do this
    String safeHint = getEsapiEncoder().canonicalize(hint);
    Suggestions suggestions = dao.getTitleSuggestions(safeHint);
    ...
}

...

public Suggestions getTitleSuggestions(String query) {
  try {
    Connection conn = getConnection();
    String query = "SELECT * FROM Movies WHERE title LIKE ?";
    PreparedStatement pstmt = conn.prepareStatement(query);
    pstmt.setString(1, query);
    ResultSet rs = pstmt.executeQuery();
    ...
}

OK, the above is totally contrived, I get it. Even more contrived, say the application allows its users to enter the "%" and "_" characters to assist in the query, which are often interpreted by many SQL drivers as "zero or many characters" and "any single character", respectively.

Now I love Kubrick films. And I want to get suggestions based on '2001 Space Odyssey' as a hint. But I can't spell odyssey without having to look it up (truth). So I enter the following: %2001 Space%. If this is via a GET request, the hint parameter should look like this: %252001%20Space%25. When the Java container receives the request, it decodes the first level of URL encoding, setting the hint field to the value of %2001 Space%. Then the field safeHint is set to  01 Space%.

Wait, what?

The value %20 could be interpreted as the URL encoded value for a space character. However, its context actually is in a SQL LIKE context, not a URL, so this interpretation is incorrect. Encoder.canonicalize doesn't know this. As it iterates over its codecs, it calls PercentCodec.decode. PercentCodec.decode checks to see: if the sequence walks like a URL encoded value and quacks like a URL encoded value, it must be a URL encoded value. So, it decodes it. It doesn't understand that the data is going into a SQL LIKE context.

And here's the above as a simple test:

(jython-env)jpasski@jpasski-mac: ~/.virtualenvs/jython-env
$ jip install org.owasp.esapi:esapi:2.1.0
# ...
$ touch /Users/jpasski/.virtualenvs/jython-env/ESAPI.properties
$ jython-all
Jython 2.5.3 (2.5:c56500f08d34+, Aug 13 2012, 14:48:36)
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.7.0_17
Type "help", "copyright", "credits" or "license" for more information.
>>> from org.owasp.esapi.reference import DefaultEncoder
>>> encoder = DefaultEncoder.getInstance()
...
>>> encoder.canonicalize("%2001 Space%")
...
u' 01 Space%'

So What To Do?

Don't use Encoder.canonicalize. However, do canonicalize!

Encoder.canonicalize conflates decoding / unescaping with canonicalization. It also decodes / unescapes data without regard to the data's actual context. If it only did canonicalization I'd have no issue with it. But it doesn't.

Canonicalization is reducing multiple ways of describing the same data to just one way. For example, these two UNIX-style file paths are equivalent: /foo/../bar and /bar, with the latter being in the canonical form. In this example, the canonical form ought to be checked against a known trusted prefix, or else CWE-73 rears its ugly head. Using another example, these two sequences are not equivalent: &lt; and <. In an HTML context, the prior is an HTML named character reference that represents the escaped form of the latter character. The latter character is used to start a tag open state. Since they aren't equivalent, they shouldn't be treated as such. However, Encoder.canonicalize makes them equivalent.

Again, what to do?

Ideally, the application developer needs to understand the context to where the data is sent. A control applied at the entry point is farther away from the eventual sink contexts. One piece of tainted data could easily end up in two or more different contexts, each of which have their own security and usability requirements.

But there just isn't a silver bullet.

Canonicalize data when it can have equivalent but different forms. Validate the input, regardless if canonicalization occurs, against business requirements. And then at the point where the context requirements change, understand these new security requirements. This can be either via the use of a security library or via research / custom coding. But that need doesn't change. Once a developer understands the context, no magic needs to be performed. She or he can use the correct sanitizer, be that an escaper, filter, or validator, and move on to other things like actually adding features.

Deliberate null pointer dereferences in the Linux Kernel

Posted by Andy, Comments

A recent debate on the Linux kernel list about some Coverity defect reports highlights some interesting questions about how to interpret static analysis results.

Here's the patch in question:

---
xfs: fix possible NULL dereference in xlog_verify_iclog

In xlog_verify_iclog a debug check of the incore log buffers prints an
error if icptr is null and then goes on to dereference the pointer
regardless.  Convert this to an assert so that the intention is clear.
This was reported by Coverty.

Reported-by: Geyslan G. Bem 
Signed-off-by: Ben Myers 
---
 fs/xfs/xfs_log.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)
Index: b/fs/xfs/xfs_log.c
===================================================================
--- a/fs/xfs/xfs_log.c  2013-10-23 14:52:47.875216875 -0500
+++ b/fs/xfs/xfs_log.c  2013-10-23 14:53:53.775245830 -0500
@@ -3714,11 +3714,9 @@ xlog_verify_iclog(
    /* check validity of iclog pointers */
    spin_lock(&log->l_icloglock);
    icptr = log->l_iclog;
-   for (i=0; i < log->l_iclog_bufs; i++) {
-       if (icptr == NULL)
-           xfs_emerg(log->l_mp, "%s: invalid ptr", __func__);
-       icptr = icptr->ic_next;
-   }
+   for (i=0; i < log->l_iclog_bufs; i++, icptr = icptr->ic_next)
+           ASSERT(icptr);
+
    if (icptr != log->l_iclog)
        xfs_emerg(log->l_mp, "%s: corrupt iclog ring", __func__);
    spin_unlock(&log->l_icloglock);
--

You can read the debate on LKML. The checker that detects this defect attempts to identify code that follows this pattern:

if(p == NULL)
    ...
*p

where the body of the if-statement does not reassign p to a non-null value. The algorithm the checker uses is much more advanced than a simple pattern match, so it can deal with variations such as passing p into a function which subsequently dereferences it arbitrarily deep in the call stack. On its face, this kind of code presents a contradiction. If the pointer can actually be null, then the dereference causes the code to crash. If the pointer can never be null, then the test if(p == NULL) is unnecessary. Either way, the code signals disagreement with itself about the potential values of p.

An unnecessary test is usually a lot less harmful than a null pointer dereference. But for our purposes, we will count either as a defect. One motivation for removing unnecessary null tests is to enhance the precision of the analysis. For example, removing an unnecessary null test of a function parameter tells the analysis the function unambiguously expects the parameter to be non-null. This expectation is automatically propagated to callsites and then enforced. The analysis can then detect more serious defects caused by the callers of the function passing in null values.

Clarity of intent also helps other human developers to make sense of the code. When collaborating with other developers to maintain code, clarity and consistency can be a great aid in avoiding misunderstanding and speeding up code comprehension.

But we're realists. Not all code is going be perfectly clear, and the mantra of "it works" is deeply ingrained in the collective developer psyche. And beyond that, there are legitimate disagreements about what constitutes clarity. That's why we tune the checker carefully by identifying idioms developers have when writing code in the real world. By eliminating code that follows these idioms, we can reduce the noise level in the results, leaving mostly real defects remaining. Sampling across hundreds of millions of lines of code shows that this checker has a low false positive rate - less than 10% for most code bases.

A simple example of an idiom is a killpath, a function that terminates execution. Common killpath functions include abort() and panic(). There are also synthetic killpath functions that extend this concept, for example by logging a message and then calling a primitive killpath function.

Taking the example above, if we fill in the blank with a killpath function, the contradiction in the code suddenly disappears:

if(p == NULL)
    abort();
*p

Now, the path where p is null does not continue past abort(), so there's never a null dereference. You might ask, why bother with the abort if the pointer dereference will crash? If you replace the abort with a call to a logging function, then the program terminates on the null pointer dereference instead of an abort() - a marginal difference - plus you have a log entry to debug the problem. This is precisely the debate on the LMKL.

We recognize these situations occur. When they do, there are several options, some of which are:

  1. Triage the defect as Intentional or False Positive. By marking the defect in the Coverity database, you won't be bothered by it again. This requires having a trusted process for triage and review, otherwise someone could unilaterally ignore an important defect.

  2. Modeling. It might be worthwhile to tell the analysis that xfs_emerg is a killpath function. I am no XFS developer, but my reading of the code says that xfs_emerg is a logging function that signals an "XFS emergency". If so, then a crash shortly after a call to such a function is expected. It is easy to tell the analysis to treat xfs_emerg as a killpath, so defects containing path traces that go past it won't be reported.

  3. Components. Perhaps defects found in the file xfs_log.c are uninteresting because they are only used in debugging modes. A component can be created to segregate this file and temporarily hide the defects in it. This technique is often useful to prevent test code, 3rd party code, and other lower priority areas from drowning out defects in more important areas.

  4. Checker options. The analysis has hundreds of options to affect the behavior of the algorithms. Some of these options turn on detection of certain idioms that we determined to not be suitable as a default. They might work perfectly well for a specific code base, only experimentation will tell. In this case there don't appear to be any that affect this defect report.

  5. Fix the issue. Sometimes, it's just better to change the code. If it's confusing for the analyzer, it's potentially confusing to other human developers too. But if the fix is made solely to "shut the analysis up" then it's likely to be a missed opportunity. Often, examining the code nearby can be worthwhile as well. Not taking this opportunity is like not fully washing your hands after going to the bathroom. It's unhygienic and it's a false economy.

The bigger picture is that the decision about what to do with static analysis findings is not a purely technical one. If developers feel the analysis is forcing them to do a lot of "extra work" that yields little benefit, they might ignore it entirely. Gaining the trust and adoption of developers is more important than winning a technical debate on any particular bug, or covering any particular defect type. Ultimately it's about what standards developers want to hold themselves accountable to. The analysis is just an algorithm doing its job, trying to enforce some measure of consistency against that standard.