Handling web frameworks; a case of Spring MVC - Part 1

Posted by Romain, Comments

Coverity has been known for years for its static analysis technology for C/C++ applications. A couple of years ago, we started a new project to focus on the security analysis of Java web applications. During development, one of the first issues we faced analyzing open source applications was the prevalence and diversity of web frameworks: we did not find many security defects because we lacked the understanding of how untrusted data enters the application as well as how the control flow is affected by these frameworks. To change this, we started developing of framework analyzers.

This blog post presents examples and explains what the analysis needs to understand and extract to perform a solid analysis. We focus on Spring MVC, one of the most common and complex Java web frameworks.

Our example Spring MVC application

To illustrate the framework analysis, I've created a small Spring MVC application that you can find on Github blog-app-spring-mvc. It has features that most Spring applications are using: auto-binding, model attributes, JSPs, and JSON responses. The application itself is very simple and can add users to a persistent store; there is a simple interface to query it, and we also display the latest user.

To show the different features of the framework, I will use two kinds of defects: cross-site scripting (XSS) and path manipulation. The application can be run, and it's possible to exploit these issues; we have 2 simple XSS and 3 path manipulations that are mostly present to trigger defects from the analysis.

Here's the layout of the application that you can build and run using Maven.

src/main/
├── java
│   └── com
│       └── coverity
│           └── blog
│               ├── HomeController.java
│               ├── UsersController.java
│               ├── beans
│               │   └── User.java
│               └── service
│                   └── UsersService.java
└── webapp
    └── WEB-INF
        ├── spring
        │   ├── appServlet
        │   │   └── servlet-context.xml
        │   └── root-context.xml
        ├── views
        │   ├── error.jsp
        │   ├── home.jsp
        │   └── user
        │       └── list.jsp
        └── web.xml

The Java code lives under src/main/java and our package names, while the Spring configurations, JSP files, and web.xml are under the webapp directory. This is a very common structure.

Build and run

If you're not using Maven frequently, you'll need to get it here, go to the root of the project (where the pom.xml is), and run:

  mvn package

to create the WAR file.

You can also run the application directly with a Maven command (always good for proving framework behaviors):

  mvn jetty:run

Developer view: some simple code using Spring MVC

Spring MVC is one of the most common web frameworks. If you look at the abysmal documentation, you will see it has tons of features. Spring MVC implements the model-view-controller (MVC) pattern, where its definition of the MVC is essentially:

  • Model: Passing data around (typically from the controller to the view)
  • View: Presentation of the data (e.g., rendered with JSP, JSON, etc.)
  • Controller: What is responsible for getting data and calling business code

Here's our first controller example, HomeController.java:

@Controller
public class HomeController {
  // Being polite
  @RequestMapping(value="/hello", produces="text/plain")
  public @ResponseBody String sayHello(String name) {
    return "Hello " + name + "!"; // No XSS
  }

  // Display our view
  @RequestMapping(value="/index")
  public String index(User user, Model model) {
    model.addAttribute("current_user", user);
    return "home";
  }
}

In this case, the configuration is very basic. As usual, we need to specify that we want Spring's Servlet to analyze the bytecode to look for @Controller and map the associated entry points (@RequestMapping annotated) with the configuration in servlet-context.xml (we also need to enable Spring in the container, so there are references to it in the web.xml). In this class, we only have 2 entry points:

  1. HomeController.sayHello(...) which takes one parameter "name", and returns a String that contains the body of the response (what's being displayed by the browser).
  2. HomeController.index(...) which has 2 parameters and returns a String that points to the location of the view (a JSP in this case)

Workflow for HomeController.sayHello(...)

Executing this code by reaching /hello?name=Doctor produces the output to be displayed by the browser being:

  Hello Doctor!

The browser also receives the content type as text/plain, so no markup will be rendered here.

Workflow for HomeController.index(...)

The second entry point uses a common feature from web frameworks: auto binding. Spring MVC will instantiate a new User and pass it to the entry point, its fields will be populated by what's passed from the HTTP parameters; the Model is also given by Spring, but its meant to be a map-like object to pass to the view for rendering.

Executing /index?name=TheDoctor&email=doc@future.com will call the HomeController.index(...) method with the first parameter being the bean User({name: TheDoctor, email: doc@future.com}). We later add it to the model so it will automatically be dispatched to the view and accessible from the JSP using EL.

Our JSP is minimalist and contains the following code:

<c:choose>
  <c:when test="${not empty current_user.name}">
    Hello ${cov:htmlEscape(current_user.name)}! <%-- No XSS here --%>
  </c:when>
  <c:otherwise>
    The bean field `name` is reflected, such as <a href="/blog/index?name=TheDoctor">here</a>.
  </c:otherwise>
</c:choose>

where the bean current_user set from the entry point code in the Model is filled with the data populated from the HTTP request. The JSP code will display an HTML escaped name in the current_user bean if it's not null or empty, otherwise display the static contents, body of the c:otherwise tag.

Analysis view: call graph roots, etc.

When running a vanilla analysis on this type of code, not much happens. In fact, the type HomeController has no known instance and HomeController.sayHello(...) is never called in the entire program. A typical analysis would mark this type of method as dead code. The challenge of the framework analysis is to translate how Spring MVC is used in the application into a set of facts that can be acted upon by our static analysis.

The kind of properties we need to extract belong to the following areas:

  • Control flow: How can this function be reached, what's happening when the function returns
  • Data flow: How is the data provided by the framework (automatically dispatched, etc.)
  • Taint analysis: What data is tainted and how (e.g., what parts are tainted or what fields)
  • Domain specific: Facts related to HTTP requests, responses, URLs, Servlet filters in place, etc.

To achieve this, we've created a new phase in the analysis that looks for framework footprints in the source code as well as in the bytecode. The framework analyzers also require access to the configuration files in order to properly simulate how the framework will operate at runtime.

This framework analysis phase extracts the following facts for the first entry point:

  1. HomeController.sayHello(...) is an entry point (or callback)
  2. The name parameter cannot be trusted, so it is tainted (with a particular type of taint)
  3. The entry point is reachable via any HTTP method and the URL is /hello
  4. The return value of this method is a body of HTTP response (so a sink for cross-site scripting)
  5. The response has a content-type of text/plain (so the response is not prone to XSS)

In the case of the second entry point, here are the facts we extract:

  1. HomeController.index(...) is an entry point
  2. Only its first parameter user is tainted with a deep-write (i.e., all fields with a public setter are set as tainted)
  3. The entry point is reachable via any HTTP method and the URL is /index
  4. The return value "home" of this entry point is a location of a view
    1. Inspecting the Spring configuration servlet-context.xml, "home" resolves to WEB-INF/views/home.jsp
    2. Connect the return to the _jspService method of WEB-INF/views/home.jsp through a control-flow join
  5. The model is considered a map-wrapper that contains the bean current_user which is our tainted parameter

With these types of facts integrated in the analysis, it is then possible to properly identify the full execution paths and consider what's tainted or not based on the framework rules itself. We can conceptually create a "trace" that needs to act as a backbone for the analysis:

The "trace" has been annotated with facts directly coming directly from the framework analysis.

Final words

We've seen that the properties extracted by the framework analysis are important for the analysis tool to understand how Spring MVC instantiates the entry points and maps them to URLs. That's how we are able to understand when tainted data is entering the application, how it's entering it, and where it's going.

Without this analysis, we would have to make blunt assumptions and suffer from either a high number of false-negatives when we do not understand the framework configuration, or false-positives if we are overly permissive and don't try to properly resolve what's tainted and where it can possibly go. It's actually a good way to test the capabilities of a static analysis tool: modify the framework configurations, insert EL beans that will always be null at runtime, etc.

However, the framework analysis is not limited to providing value for the taint analysis, but also provides information about how the URLs are reached and what are the constraints attached to them, which is important to identify CSRF issues for example.

Comments!