Latest Articles

## A Classical Sequent Calculus with Dependent Types

Dependent types are a key feature of the proof assistants based on the Curry-Howard isomorphism. It is well known that this correspondence can be... (more)

## Context-Free Session Type Inference

Some interesting communication protocols can be precisely described only by context-free session types, an extension of conventional session types supporting a general form of sequential composition. The complex metatheory of context-free session types, however, hinders the definition of corresponding checking and inference algorithms. In this... (more)

## Probabilistic Termination by Monadic Affine Sized Typing

We introduce a system of monadic affine sized types, which substantially generalizes usual sized types and allows in this way to capture probabilistic... (more)

## ML, Visibly Pushdown Class Memory Automata, and Extended Branching Vector Addition Systems with States

We prove that the observational equivalence problem for a finitary fragment of the programming... (more)

### TOPLAS participates in ORCID

ORCID is a community-based effort to create a global registry of unique researcher identifiers for the purpose of ensuring proper attribution of works to their creators. When you submit a manuscript for review, you will be presented with the opportunity to register for your ORCID.

#### PYE: A Framework for Precise-Yet-Efficient Just-In-Time Analyses for Java Programs

Static Identification of Injection Attacks in Java

The most dangerous security-related software errors, according to the OWASP Top Ten list, are those affecting web applications and potentially leading to injection attacks, such as SQL injection, cross-site scripting injection, redirect injection, command injection, path-traversal and trust boundary violation. All these attacks exploit the same weakness: unconstrained propagation of data from sources that the user of a web application controls into sinks whose activation might trigger dangerous operations. This article defines a unified, sound protection mechanism against such attacks, based on the identification of all possible explicit flows of tainted data in Java code. Such flows can be arbitrarily complex, passing through dynamically allocated data structures in the heap. The analysis is based on abstract interpretation and is interprocedural, flow-sensitive, and context-sensitive. Its notion of taintness applies to reference (non-primitive) types dynamically allocated in the heap, and is object-sensitive and field-sensitive. The analysis works by translating the program into Boolean formulas that model all possible data flows. Its implementation, within the Julia analyzer for Java, found injection security vulnerabilities in the Internet banking service and in the customer relationship management of large Italian banks. For objective, repeatable results, this article also evaluates the implementation on two open-source security benchmarks: the Juliet Suite and the OWASP Benchmark for cybersecurity. We compared this technique against more than ten other static analyzers, both free and commercial. The result is that ours is the only sound analysis for injection currently available for industrial Java and it is much more precise than other tools.

Combinatorial Register Allocation and Instruction Scheduling

This paper introduces a combinatorial optimization approach to register allocation and instruction scheduling, two central compiler problems. Combinatorial optimization has the potential to solve these problems optimally and to exploit processor-specific features readily. Our approach is the first to leverage this potential in practice: it captures the complete set of program transformations used in state-of-the-art compilers, scales to medium-sized functions of up to 1000 instructions, and generates executable code. This level of practicality is reached by using constraint programming, a particularly suitable combinatorial optimization technique. Unison, the implementation of our approach, is open source, used in industry, and integrated with the LLVM toolchain. An extensive evaluation confirms that Unison generates better code than LLVM while scaling to medium-sized functions. The evaluation uses systematically selected benchmarks from MediaBench and SPEC CPU2006 and different processor architectures (Hexagon, ARM, MIPS). Mean estimated speedup ranges from 1% to 9.3% and mean code size reduction ranges from 0.8% to 3.9% for the different architectures. Executing the generated code on Hexagon confirms that the estimated speedup indeed results in actual speedup. Given a fixed time limit, Unison solves optimally functions of up to 647 instructions, improves functions of up to 874 instructions, and achieves more than 85% of the potentially optimal speed for 90% of the functions on Hexagon. The results show that our combinatorial approach can be used in practice to trade compilation time for code quality beyond the usual compiler optimization levels, fully exploit processor-specific features, and identify improvement opportunities in heuristic algorithms.

Modular Product Programs

Many interesting program properties like determinism or information flow security are hyperproperties, that is, they relate multiple executions of the same program. Hyperproperties can be verified using relational logics, but these logics require dedicated tool support and are difficult to automate. Alternatively, constructions such as self-composition represent multiple executions of a program by one product program, thereby reducing hyperproperties of the original program to trace properties of the product. However, existing constructions do not fully support procedure specifications, for instance, to derive the determinism of a caller from the determinism of a callee, making verification non-modular. We present modular product programs, a novel kind of product program that permits hyperproperties in procedure specifications and, thus, can reason about calls modularly. We provide a general formalization of our product construction and prove it sound and complete. We demonstrate its expressiveness by applying it to information flow security with advanced features such as declassification and termination-sensitivity. Modular product programs can be verified using off-the-shelf verifiers; we have implemented our approach for both secure information flow and general hyperproperties using the Viper verification infrastructure. Our evaluation demonstrates that modular product programs can be used to prove hyperproperties for challenging examples in reasonable time.

Consistent Subtyping for All

Consistent subtyping is employed in some gradual type systems to validate type conversions. The original definition by Siek and Taha serves as a guideline for designing gradual type systems with subtyping. Polymorphic types à la System F also induce a subtyping relation that relates polymorphic types to their instantiations. However Siek and Tahas definition is not adequate for polymorphic subtyping. The first goal of this paper is to propose a generalization of consistent subtyping that is adequate for polymorphic subtyping, and subsumes the original definition by Siek and Taha. The new definition of consistent subtyping provides novel insights with respect to previous polymorphic gradual type systems, which did not employ consistent subtyping. The second goal of this paper is to present a gradually typed calculus for implicit (higher-rank) polymorphism that uses our new notion of consistent subtyping. We develop both declarative and (bidirectional) algorithmic versions for the type system. The algorithmic version employs techniques developed by Dunfield and Krishnaswami to deal with instantiation. We prove that the new calculus satisfies all static aspects of the refined criteria for gradual typing. We also study an extension of the type system with static and gradual type parameters, in an attempt to support a variant of the dynamic criterion for gradual typing. Armed with a coherence conjecture for the extended calculus, we show that the dynamic gradual guarantee of our source language can be reduced to that of »B, which, at the time of writing, is still an open question.

Higher-Order Demand-Driven Program Analysis

Developing accurate and efficient program analyses for languages with higher-order functions is known to be difficult. Here we define a new higher-order program analysis, DDPA, which extends well-known demand-driven lookup techniques found in first-order CFL-reachability program analyses to higher-order programs. This task presents several unique challenges to obtain good accuracy, including the need for a new method for demand-driven lookup of non-local variable values. DDPA is flow- and context-sensitive and provably polynomial-time. To efficiently implement DDPA we develop a novel push-down automaton metaprogramming framework, the Push-Down Reachability automaton (PDR). The analysis is formalized and proved sound, and an implementation is described.

On the Impact of Programming Languages on Code Quality

This paper is a reproduction of work by Ray et al. which claimed to have uncovered a statistically significant association between eleven programming languages and software defects in projects hosted on GitHub. First we conduct an experimental repetition, repetition is only partially successful, but it does validate one of the key claims of the original work about the association of ten programming languages with defects. Next, we conduct a complete, independent reanalysis of the data and statistical modeling steps of the original study. We uncover a number of flaws that undermine the conclusions of the original study as only four languages are found to have a statistically significant association with defects, and even for those the effect size is exceedingly small. We conclude with some additional sources of bias that should be investigated in follow up 9 work and a few best practice recommendations for similar efforts.

Non-polynomial Worst-Case Analysis of Recursive Programs

We study the problem of developing efficient approaches for proving worst-case bounds of non-deterministic recursive programs. Ranking functions are sound and complete for proving termination and worst-case bounds of non-recursive programs. First, we apply ranking functions to recursion, resulting in measure functions. We show that measure functions provide a sound and complete approach to prove worst-case bounds of non-deterministic recursive programs. Our second contribution is the synthesis of measure functions in non-polynomial forms. We show that non-polynomial measure functions with logarithm and exponentiation can be synthesized through abstraction of logarithmic or exponentiation terms, Farkas' Lemma, and Handelman's Theorem using linear programming. While previous methods obtain polynomial worst-case bounds, our approach can synthesize bounds of various forms including $\mathcal{O}(n \log n)$ and $\mathcal{O}(n^r)$ where $r$ is not an integer. We present experimental results to demonstrate that our approach can efficiently obtain worst-case bounds of classical recursive algorithms such as (i)~Merge sort, Heap sort and the divide-and-conquer algorithm for the Closest Pair problem, where we obtain $\mathcal{O}(n \log n)$ worst-case bound, and (ii)~Karatsuba's algorithm for polynomial multiplication and Strassen's algorithm for matrix multiplication, for which we obtain $\mathcal{O}(n^r)$ bounds such that $r$ is not an integer and is close to the best-known bound for the respective algorithm. Besides the ability to synthesize non-polynomial bounds, we also show that our approach is equally capable of obtaining polynomial worst-case bounds for classical programs such as Quick sort and the dynamic programming algorithm for computing Fibonacci numbers.

Failure Recovery in Resilient X10

Cloud computing has made the resources needed to execute large-scale in-memory distributed computations widely available. Specialized programming models have emerged to offer transparent fault tolerance and recovery for specific computational patterns, but they sacrifice generality. In contrast, the Resilient X10 programming language adds failure containment and failure awareness to a general-purpose, distributed programming language. A Resilient X10 application spans over a number of places. Its formal semantics precisely specify how it continues executing after a place failure. Thanks to failure awareness, the X10 programmer can in principle build redundancy into an application to recover from failures. In practice however, correctness is elusive as redundancy and recovery are often complex programming tasks. This paper further develops Resilient X10 to shift the focus from failure awareness to failure recovery. We revisit the happens-before invariance principle and its implementation. We shift most of the burden of redundancy and recovery from the programmer to the runtime system and standard library. We make it easy to protect critical data from failure using resilient stores and harness elasticity to persist not just the data but also its spatial distribution. We demonstrate the flexibility and practical usefulness of Resilient X10 by building several representative high-performance in-memory parallel application kernels and frameworks. We empirically establish that the runtime overhead of resiliency is less than 8%. By comparing application kernels written in the Resilient X10 and Spark programming models we demonstrate that Resilient X10's more general programming model can enable significantly better application performance for resilient in-memory distributed computations.

#### Analysis and Optimization of Task Granularity on the Java Virtual Machine

###### All ACM Journals | See Full Journal Index

Search TOPLAS
enter search term and/or author name