ACIDRain: Concurrency-Related Attacks on Database-Backed Web Applications

Reference: Warszawski, Todd, and Peter Bailis. "ACIDRain: Concurrency-Related Attacks on Database-Backed Web Applications." Proceedings of the 2017 ACM International Conference on Management of Data. ACM, 2017.

0. Abstract

In theory, database transactions protect application data from corruption and integrity violations. In practice, database transactions frequently execute under weak isolation that exposes programs to a range of concurrency anomalies, and programmers may fail to correctly employ transactions. While low transaction volumes mask many potential concurrency-related errors under normal operation, determined adversaries can exploit them programmatically for fun and profit. In this paper, we formalize a new kind of attack on database-backed applications called an ACIDRain attack, in which an adversary systematically exploits concurrency-related vulnerabilities via programmatically accessible APIs. These attacks are not theoretical: ACIDRain attacks have already occurred in a handful of applications in the wild, including one attack which bankrupted a popular Bitcoin exchange. To proactively detect the potential for ACIDRain attacks, we extend the theory of weak isolation to analyze latent potential for non-serializable behavior under concurrent web API calls. We introduce a language-agnostic method for detecting potential isolation anomalies in web applications, called Abstract Anomaly Detection (2AD), that uses dynamic traces of database accesses to efficiently reason about the space of possible concurrent interleavings. We apply a prototype 2AD analysis tool to 12 popular self-hosted eCommerce applications written in four languages and deployed on over 2M websites. We identify and verify 22 critical ACIDRain attacks that allow attackers to corrupt store inventory, over-spend gift cards, and steal inventory.

1. Introduction

  • Although concurrency-related problems stay latent in many applications under normal operation, the rise of web-facing interfaces (i.e., APIs) makes higher concurrency, and the deliberate exploitation of concurrency-related errors, possible.
  • In this paper, the authors investigate the causes, detection, and prevalence of concurrency-related attacks on database-backed web applications.
  • The paper considers attacks that trigger two kinds of anomalies: (1) level-based isolation anomalies: if the database does not provide the application with serializable isolation, then concurrently issued transactions may lead to non-serializable behavior; (2) scope-based isolation anomalies: if the application does not correctly scope, or encapsulate, its logic using transactions, concurrent requests to the application may lead to behavior that would not have arisen sequentially.
  • The paper examines popular eCommerce platforms, such as OpenCart and Spree Commerce, and analyzes actual SQL traces (i.e., logs) using a new approach called Abstract Anomaly Detection (2AD).

2. ACIDRain Attacks

  • Target environment: this paper focuses on web applications, i.e., applications that expose functionality to third parties via programmatically accessible APIs, typically over the Internet through interfaces such as HTTP and REST.
  • Attack definition: an exploit allowing an attacker to elicit undesirable application behavior by issuing concurrent requests to trigger non-serializable access to database-managed state.
  • An application is vulnerable to an ACIDRain attack if two conditions are met: (1) Anomalies possible. Under concurrent API access, the application may exhibit behaviors (i.e., anomalies) that could not have arisen under a serial execution, including level-based isolation anomalies and scope-based isolation anomalies; (2) Sensitive invariants. The anomalies arising from concurrent access lead to violations of application invariants.
  • Threat model: the authors assume that an attacker can only access the web application via concurrent requests against publicly-accessible APIs (e.g., HTTP, REST).
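
The two vulnerability conditions can be illustrated with a small, deterministic simulation. This is only a sketch under stated assumptions: the `redeem` API call, the in-memory `db` dictionary, and the gift card name are hypothetical stand-ins for a real application and database, and Python generators are used to choose the interleaving by hand.

```python
from typing import Dict, Generator, List, Tuple


def redeem(db: Dict[str, int], card: str, amount: int,
           log: List[str]) -> Generator[None, None, None]:
    """Hypothetical API call: read the balance, then validate and deduct."""
    balance = db[card]                 # step 1: read
    yield                              # another request may run here
    if balance >= amount:              # step 2: validate against the earlier read
        db[card] = balance - amount    # ...and write
        log.append(f"spent {amount}")
    else:
        log.append("rejected")


def run(schedule: List[int], amount: int = 100) -> Tuple[int, List[str]]:
    """Run two redeem calls; `schedule` lists which call takes the next step."""
    db = {"gift-1": 100}
    log: List[str] = []
    calls = [redeem(db, "gift-1", amount, log) for _ in range(2)]
    for i in schedule:
        next(calls[i], None)           # advance call i by one step
    return db["gift-1"], log


serial = run([0, 0, 1, 1])        # call 0 finishes before call 1 starts
interleaved = run([0, 1, 0, 1])   # both calls read before either writes
print(serial)        # (0, ['spent 100', 'rejected'])
print(interleaved)   # (0, ['spent 100', 'spent 100'])
```

In the interleaved schedule both calls succeed, so 200 is spent from a card worth 100. No serial order of the two calls produces that outcome (condition 1), and it breaks the rule that a card cannot be over-spent (condition 2).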

3. 2AD: Detecting Anomalies

  • Static analysis tools are hard to apply to web applications, because these applications are written in a wide variety of languages and frameworks.
  • The core idea behind 2AD is to execute API calls against a live application and database to generate a (possibly sequential) trace of database activity, then analyze the trace for potential anomalies that could arise under concurrent execution.
  • Realizing 2AD takes considerable work for two primary reasons: (1) existing models for database isolation reason about anomalies in one particular concurrent execution, whereas 2AD must reason about every execution that could arise from the trace; (2) to detect scope-based anomalies, 2AD needs to reason about behavior across transactions within the same API call.

3.1 2AD Concepts and Procedures

3.1.1 Trace Generation

  • From the database logs, the framework extracts the sequence of transactions generated by each API call.
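
A minimal sketch of this extraction step is given below. The log format is an assumption made for illustration: each entry is a hypothetical `(api_call_id, sql)` pair, transactions are delimited by `BEGIN`/`COMMIT`, and a statement outside an explicit transaction is treated as its own single-statement (autocommit) transaction. How a real deployment attributes log entries to API calls is not covered here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

LogEntry = Tuple[str, str]  # (api_call_id, sql) -- hypothetical log format


def group_transactions(entries: List[LogEntry]) -> Dict[str, List[List[str]]]:
    """Return, per API call, the ordered list of transactions it issued."""
    txns: Dict[str, List[List[str]]] = defaultdict(list)
    open_txn: Dict[str, List[str]] = {}   # statements of the currently open transaction
    for call_id, sql in entries:
        stmt = sql.strip().rstrip(";").strip()
        if not stmt:
            continue
        keyword = stmt.split()[0].upper()
        if keyword in ("BEGIN", "START"):
            open_txn[call_id] = []
        elif keyword == "COMMIT":
            if call_id in open_txn:
                txns[call_id].append(open_txn.pop(call_id))
        elif keyword == "ROLLBACK":
            open_txn.pop(call_id, None)    # discard aborted work
        elif call_id in open_txn:
            open_txn[call_id].append(stmt)
        else:
            txns[call_id].append([stmt])   # autocommit: one statement, one transaction
    return dict(txns)


trace = [
    ("checkout#1", "SELECT qty FROM stock WHERE id = 7"),
    ("checkout#1", "BEGIN"),
    ("checkout#1", "UPDATE stock SET qty = 4 WHERE id = 7"),
    ("checkout#1", "INSERT INTO orders (item) VALUES (7)"),
    ("checkout#1", "COMMIT"),
]
print(group_transactions(trace))
```

Here the read of `stock` and the later update end up in two different transactions of the same API call, which is the shape that scope-based analysis has to look at.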

3.1.2 Abstract History Generation

  • Given a concrete trace generated by API calls, this paper determines whether concurrently executing a set of calls to the same APIs might result in non-serializable behavior. The primary challenge here is that existing theories of isolation pertain to concrete traces, or histories, of transactions.
  • Abstract History: a finite multigraph (i.e., a graph that allows multiple edges between the same pair of nodes) that represents the set of all possible expansions of a given trace, with nodes for each operation, supernodes of operations for each transaction, and supernodes of transactions for each API call. Undirected write and read edges capture interactions between pairs of writes and pairs of reads and writes, respectively. Intuitively, the abstract history captures all possible concurrent interleavings of the API calls.
  • How to construct the abstract history: see paper.

3.1.3 Witness Generation

  • Non-trivial abstract cycles: cycles of API nodes in the abstract history correspond to potentially anomalous behavior in API calls.
  • Witness-Finding Algorithm: see paper.
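
To make the ideas of 3.1.2 and 3.1.3 concrete, the following is a much-simplified sketch and not the paper's construction or witness-finding algorithm. It assumes each operation has already been reduced to a read or write of a named item, treats two operations as connected when they touch the same item and at least one is a write, and only looks for the shortest kind of cycle: one that enters an API call at one operation and leaves it at a different one. The `Op` fields and item names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Op:
    api: str    # API call the operation belongs to
    txn: int    # index of the transaction within that API call
    idx: int    # position of the operation within the API call's trace
    kind: str   # "r" (read) or "w" (write)
    item: str   # item accessed, e.g. "vouchers.balance"


def conflicts(a: Op, b: Op) -> bool:
    """Edge between two operations: same item, and at least one of them writes."""
    return a.item == b.item and "w" in (a.kind, b.kind)


def find_witnesses(ops: List[Op]) -> List[Tuple[Op, Op, str]]:
    """Report pairs (a, b) of distinct operations in one API call that a
    concurrent call (possibly another instance of the same API) can connect."""
    witnesses = []
    for a in ops:
        for b in ops:
            if a.api != b.api or a.idx >= b.idx:
                continue
            for other_api in sorted({o.api for o in ops}):
                others = [o for o in ops if o.api == other_api]
                if (any(conflicts(a, c) for c in others)
                        and any(conflicts(b, d) for d in others)):
                    # Different transactions: transactions alone cannot help.
                    kind = "scope-based" if a.txn != b.txn else "level-based"
                    witnesses.append((a, b, kind))
                    break
    return witnesses


checkout = [
    Op("checkout", 0, 0, "r", "vouchers.balance"),   # read in its own transaction
    Op("checkout", 1, 1, "w", "vouchers.balance"),   # write in a later transaction
]
for a, b, kind in find_witnesses(checkout):
    print(kind, a.item, (a.kind, a.txn), (b.kind, b.txn))
```

For this trace the sketch reports one scope-based witness: two concurrent `checkout` calls can both read the balance before either writes it. If the read and the write shared a transaction, the same pair would be reported as level-based.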

3.1.4 Witness Refinement

  • Not every witness corresponds to an anomaly that can actually occur. To reduce such false positives, thereby improving the soundness of 2AD analysis, this paper introduces an optional witness refinement step.
  • There are two main sources of knowledge: (1) Isolation-based refinement: the method records refinement information (e.g., the isolation level) while finding witnesses; (2) Application-level refinement: the method uses information about the application and its execution environment.
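
The refinement step can be pictured as a filter over candidate witnesses. The sketch below is illustrative only: the witness fields `kind` and `needs_same_session` are hypothetical, and the isolation rule assumes that the database's SERIALIZABLE level really provides serializable isolation.

```python
from typing import Dict, List

Witness = Dict[str, object]  # hypothetical witness record


def refine(witnesses: List[Witness], isolation: str,
           session_locked: bool) -> List[Witness]:
    """Drop witnesses that the database or the runtime rules out."""
    kept = []
    for w in witnesses:
        # Isolation-based refinement: level-based anomalies need weak isolation.
        # Scope-based witnesses survive, since they span several transactions.
        if w["kind"] == "level-based" and isolation == "SERIALIZABLE":
            continue
        # Application-level refinement: if the runtime serializes requests
        # within a session, a witness that needs two same-session calls cannot fire.
        if session_locked and w["needs_same_session"]:
            continue
        kept.append(w)
    return kept


candidates = [
    {"id": 1, "kind": "level-based", "needs_same_session": False},
    {"id": 2, "kind": "scope-based", "needs_same_session": True},
    {"id": 3, "kind": "scope-based", "needs_same_session": False},
]
print([w["id"] for w in refine(candidates, "SERIALIZABLE", session_locked=True)])
print([w["id"] for w in refine(candidates, "READ COMMITTED", session_locked=False)])
```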

3.2 2AD Overview and Discussion

  • Benefits
  • Soundness and Completeness
  • Limitations
  • Extensions
  • Summary

4. ACIDRain in the Wild

4.1 From Anomalies to Vulnerabilities

  • 2AD’s ability to highlight anomalies that affect particular data items (e.g., a table containing account balances) and determine the API calls that may trigger them (e.g., two concurrent withdrawal requests) allows users to determine which anomalies affect key program invariants.

4.2 Attacking Self-Hosted eCommerce

4.2.1 Target Application Corpus

  • The authors selected a set of 12 eCommerce applications written in four languages based on popularity measures including GitHub stars and references in popular articles.

4.2.2 Target Application Invariants

  • Inventory Invariant
  • Voucher Invariant
  • Cart Invariant

4.2.3 Prototype 2AD Analysis Tool

  • Workflow, False Positives, and Targeted Analysis
  • Running time
  • Tool Limitations

4.2.4 Experimental Methodology

  • To avoid configuring a custom HTTP request generator for each application, they reproduced all the anomalies manually, via rapid, successive HTTP requests (sometimes in separate browsers).
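
The authors reproduced the anomalies by hand. For readers who prefer a scripted reproduction, a generic helper along the following lines could release several requests at the same moment. This is an assumption-laden sketch, not the authors' methodology: `fire_concurrently` is a hypothetical name, and `request` stands for any callable that performs one API call (for example, a function wrapping an HTTP client).

```python
import threading
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def fire_concurrently(request: Callable[[int], T], n: int) -> List[Optional[T]]:
    """Start n threads, hold them at a barrier, then issue all requests together."""
    barrier = threading.Barrier(n)
    results: List[Optional[T]] = [None] * n

    def worker(i: int) -> None:
        barrier.wait()              # wait until every thread is ready
        results[i] = request(i)     # then send request number i

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Stand-in for a real HTTP call, so the sketch runs without a server.
print(fire_concurrently(lambda i: f"request {i} sent", 3))
```

The barrier narrows the window between requests, but it does not guarantee that the server processes them in any particular interleaving, so a reproduction may need several attempts.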

4.2.5 Analysis Results

  • Which vulnerabilities occurred?
  • Were particular applications more likely to contain vulnerabilities?
  • What types of anomalies caused vulnerabilities?
  • Were there false positives?

4.2.6 Avoiding ACIDRain Attacks

  • SELECT FOR UPDATE: appending FOR UPDATE to a SELECT query locks the rows it reads, so other transactions cannot modify them until the current transaction ends.
  • User-level concurrency control: a few applications used user-level locking to prevent concurrent execution of a section of code. For example, PHP automatically performs “session locking” on session files, preventing concurrent calls within the same session.
  • Single read of data
  • Multiple validations
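
As an illustration of the user-level concurrency control item above, the sketch below guards a check-then-deduct sequence with a lock so that it runs as one critical section. The `VoucherStore` class is hypothetical, and an in-process `threading.Lock` only protects a single process; an application served by several worker processes would need a lock shared among them (for example, one held in the database), which is not shown here.

```python
import threading
from typing import List


class VoucherStore:
    """Hypothetical in-memory voucher with a lock around check-then-deduct."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._lock = threading.Lock()

    def redeem(self, amount: int) -> bool:
        with self._lock:                  # read, validate and write together
            if self.balance >= amount:
                self.balance -= amount
                return True
            return False


def concurrent_redeem(store: VoucherStore, amount: int, n_calls: int) -> List[bool]:
    """Issue n_calls redemptions from separate threads and collect the outcomes."""
    results = [False] * n_calls

    def worker(i: int) -> None:
        results[i] = store.redeem(amount)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_calls)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


store = VoucherStore(balance=100)
outcomes = concurrent_redeem(store, amount=100, n_calls=8)
print(sum(outcomes), store.balance)   # exactly one redemption succeeds; balance is 0
```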

4.2.7 Response and Discussion

  • Potential fixes
  • Developer response

Written on April 26, 2017