publications | Hanxi "Gary" Chen

2025

Vellvm: Formalizing the Informal

Calvin Beck, Hanxi Chen, and Steve Zdancewic

Accepted by CoqPL 2025, Jan 2025

Abs

The Vellvm project is an extensive undertaking to formalize a large subset of LLVM IR in the Coq proof assistant. LLVM IR is an intermediate language that is widely used as a common target for front-end compilers. Compilers for different languages can share the LLVM infrastructure for performing optimizations and generating target code for various instruction set architectures. The bedrock of many languages is LLVM IR, and formalizing it gives us an important tool for compiler verification efforts. In this talk we will explore the difficulties that come up when attempting to formalize a large real world language. We’ll touch upon how proof assistants like Coq can reveal issues in informal specifications, how testing tools like QuickChick can help bridge the gap between informal real-world implementations and our formal Coq semantics, and we will also touch upon aspects of the LLVM IR could be updated to better suit verification efforts. Furthermore, we’ll discuss some of the challenges we have encountered along the way, including limitations with the Coq type system, extraction bugs, and performance issues for proofs and extracted code.
Vellvm: Formalizing the Informal

Calvin Beck, Hanxi Chen, and Steve Zdancewic

Submitted to NFM 2025, Jan 2025

Abs

This report presents our methodology for and experience with formalizing a specification of LLVM IR in the Verified LLVM Project (Vellvm). The project aims to mechanize a large, practical subset of LLVM IR in the Coq/Rocq Proof Assistant in the support of verified compilers and program transformations. The methodology implements the semantics via monadic interpreters, which rely on a coinductively- defined data structure called itrees, and a two-phase memory model that provides low-level operations, such as pointer-to-integer casts. Crucially, this methodology supports the extraction of an executable interpreter, proved to refine the specification. We use the reference interpreter to validate the accuracy of the formalization, employing random differen- tial testing between Clang and Vellvm implemented in our own LLVM IR program generator (GenLLVM), as well as via state-of-the-art C compiler testing frameworks (CSmith and YARPGen). Such testing has found bugs in both the Vellvm semantics and Clang. We believe that tools from the Vellvm project can be useful for other LLVM IR-related projects, and that the overall methodology applies to other formal verification efforts.

2024

A Two-Phase Infinite/Finite Low-Level Memory Model: Reconciling Integer–Pointer Casts, Finite Space, and undef at the LLVM IR Level of Abstraction

Calvin Beck, Irene Yoon, Hanxi Chen, Yannick Zakowski, and Steve Zdancewic

Proc. ACM Program. Lang., Aug 2024

Abs DOI

This paper provides a novel approach to reconciling complex low-level memory model features, such as pointer–integer casts, with desired refinements that are needed to justify the correctness of program transformations. The idea is to use a "two-phase" memory model, one with an unbounded memory and corresponding unbounded integer type, and one with a finite memory; the connection between the two levels is made explicit by a notion of refinement that handles out-of-memory behaviors. This approach allows for more optimizations to be performed and establishes a clear boundary between the idealized semantics of a program and the implementation of that program on finite hardware. The two-phase memory model has been incorporated into an LLVM IR semantics, demonstrating its utility in practice in the context of a low-level language with features like undef and bitcast. This yields infinite and finite memory versions of the language semantics that are proven to be in refinement with respect to out-of-memory behaviors. Each semantics is accompanied by a verified executable reference interpreter. The semantics justify optimizations, such as dead-alloca-elimination, that were previously impossible or difficult to prove correct.
Random Differential Testing at the LLVM IR Level

Hanxi Chen

CIS Undergraduate Thesis, supervised by Steve Zdancewic, read by Benjamin Pierce, May 2024

Abs

This thesis presents a randomized test-case generator, GenLLVM, for producing undefined-behavior-free, expressive, deterministic LLVM IR programs. GenLLVM generates LLVM IR programs directly, without relying on a frontend compiler, allowing users to tune program distributions to target particular features of LLVM IR. Motivated by the desire to find bugs in VIR, a formal semantics for LLVM IR implemented as part of the Vellvm project, this project uses GenLLVM to perform differential testing of the VIR interpreter against Clang. GenLLVM is able to uncover numerous flaws in the VIR implementation, miscompilation and crash errors in Clang, as well as ambiguities in the LLVM IR informal specification. This thesis compares GenLLVM’s coverage of LLVM IR features to that of Csmith, YARPGen, and YARPGen V2, confirming that these tools do indeed exercise different aspects of LLVM IR semantics.
GenLLVM: Randomized Differential Testing for LLVM IR

Calvin Beck, Hanxi Chen, and Steve Zdancewic

May 2024
Knapsack Problem with Precedence Constraint

Hanxi Chen

Wharton Research Scholars Thesis, supervised by Sanjeev Khanna, May 2024