CS358 2020-2021

CS350: SECURE COMPILATION

Below you will find the following information for the course:
– general course information,
– prerequisites and interest,
– location and timeslots,
– evaluation and grading,
– high-level syllabus outline,
– class and lecture outline.

Some entries are still TBD, please contact the instructor if you have questions.

PIAZZA, CANVAS AND OTHER LINKS:

Canvas: https://canvas.stanford.edu/courses/135723
Piazza: https://piazza.com/class/kmewl9o18oi3je
Gradescope: https://www.gradescope.com/courses/255564 (find the code in Piazza)

We will be using Miro, an online whiteboard, the one for the course is here: https://miro.com/app/board/o9J_lO0m4ZI=/

General Course Information

This course will explore the nascent field of secure compilation, which sits at the intersection between security and programming languages. The perspective from which we teach this field is that of formal methods and foundations of computer security.
The goal of a secure compiler is to compile programs in order to preserve source security properties like data confidentiality and code integrity. This is challenging because attackers operating at the level of the compiler output are inherently more powerful than attackers in the source language. For example, target attackers can mount buffer overflows that bypass compiled code security abstractions that derive e.g., from compiled source level type annotations.

This course will describe:

the threat model that secure compilers consider
correctness criteria for secure compilation,
specific instances of secure compilers
proof techniques for secure compilation.

Correctness criteria define that a compiler is secure. The course will explain why can we be sure that presented criteria have security meaning (i.e., what kinds of attacks can be defended against and which not).
Secondly, the course will discuss specific instances of secure compilers and how they achieve security (i.e., what mechanism — including types, crypto, security architecture, etc.— do these compilers exploit to realise one of the presented criteria and what are their tradeoffs).
Finally the course will cover proof techniques for secure compilation, i.e., how do you formally prove that a compiler adheres to one of the presented criteria and therefore is secure. For any question, please contact Marco Patrignani (see mail here).

Prerequisites and Interests

Students should have a basic understanding of programming and of how a compiler works (CS143, CS243). A background in formal languages and semantics is a plus (CS358, CS242).
Given the formal view on the subject, this course should appeal to students with interest in security, formal methods, foundations of computer science.

Location and Time Slots (all times are in Pacific)

Time slot: T-TH, 10:30-11:50
Location: Zoom : https://stanford.zoom.us/j/94556898594 (PWD in piazza)
Office hours:
Friday afternoon 12-13, but other slots can be opened if needed.
Please drop an email ahead of time if you plan to come.

Evaluation and Grading

Evaluation will be based on:

6 assignments:
[ass1.pdf] | [ass2.pdf] | [ass3.pdf] | [ass4.pdf] | [ass6.pdf] | [ass7.pdf]
1 non-optional student presentation [ass5.pdf] of a paper chosen from this list: [ list.pdf ]
1 final exam consisting of a written text + oral discussion (which we’ll run on zoom) [exam.pdf]

Grading:
60% assignments
20% presentation
20% final exam

Depending on the student’s interest, part of the evaluation can also be based on a project (research- or educational-focussed) related to the topics discussed in class. This can be discussed with the instructor and will be adjudicated based on student motivation and interest.
Please note that depending on the amount of attendees, some modification in the evaluation criteria may be necessary. All changes will be communicated to the students at the beginning of the class when the number of attendees is known.

High-level Syllabus

Below is a more precise outline of what we will cover in class, grouped by notions. Please note that there is not a direct mapping between notions and lectures, so covering a notion may take 1 lecture while covering another one may take 5. Also, some topics here may vary, though lightly.

Mostly we will follow my preliminary lecture notes:

[350-bt.pdf] main notes
The original .tex files for this document are here [bt.tex] | [cmds.tex]
[350-asm-lang.pdf] assembly language formalisation.

And the following recommended reading: (no deadline, please do during first 2 weeks)

Formal Approaches to Secure Compilation. [.pdf]
Journey Beyond Full Abstraction. [.pdf]
Robustly Safe Compilation, an Efficient Form of Secure Compilation. [.pdf]
Fully-Abstract Compilation by Approximate Back-Translation. [.pdf]
Exorcising Spectres with Secure Compilers. [.pdf]

all these papers are available in my publications page.

Notions 1:
motivation behind secure compilation
Notions 2: program equivalences for security; expressing security properties with equivalences, classic examples. Traces for security: expressing security properties and hyperproperties (HP) with traces, HP diagram (safety, liveness, hypersafety). Examples: confidentiality, integrity, noninterference, specific properties
Notions 3: full abstraction (FAC) theory: definition, justification for security, attacks and threat model, violations of FAC, common mistakes. Proving FAC: approaches, what is a backtranslation, context and traces backtranslation, when to use what
Notions 4: trace preserving compilation: definition, justification, attacks, analogy with FAC, threat model
Notions 5: Proving FAC. Recap of PL semantics notions: syntax/semantics, reading rules. Source language, target language: syntax and semantics, examples of equivalent programs. Compiler definition, compiler correctness. Backtranslation definition, backtranslation correctness. Target extension with context inspection, target traces, change backtranslation from context-based to trace-based
Notions 6: Robustness, Robust property preservation, property-ful and -free criteria and their equivalence, different backtranslation classes. Robustly safe compilation (RSP) RSP in detail.
Notions 7: Proving RSP starting from what we saw in Notions 5, comparison with FAC proof
Notions 8: Advanced Robust Criteria: RHSP in detail, example of preserving NI, RSCHP, RHP. Relational (hyper)properties and their reason to exist.

Class Outline

Please note that this is tentative and precise class contents can vary slightly

w1: mar 30 apr 1

lecture 1:
introduction & details
recap PL: source lang with types, target language without types, compiler

lecture 2:
compiler correctness (CC)
partial programs, contexts, compiler for partials, compositional CC
security props
limitations of CC

w2: apr 6 8

lecture 3:
contextual equivalence to express integrity and confidentiality
contextual equivalence formally
formalising assembly languages
lecture 4:
fully abstract compilation (fac): general statement, motivation, preservation
fac proofs: deriving reflection from CC
fac reflection for our compiler

w3: apr 13 15

lecture 5:
fac proofs: proving preservation
fac preservation: need a context backtranslation (ctx bt)
fix the ctx bt: inject/extract
fac preservation via bt correctness

lecture 6:
add target reflection and try the BT
define trivial target traces, fac with traces, derive what is needed of the traces
fac preservation via traces, fa traces

w4: apr 20 22

lecture 7:
impure languages in source and target, ceq for stateful languages, correct compilation
formalising protection mechanisms: capabilities
fac for stateful languages and new encodable security properties
complexity of deriving a fac proof, high-level scheme
lecture 8:
traces as a generic security mechanism,
properties and hyperproperties and relational properties
adding interaction traces to our languages as well as I/O,
robust property preservation as a criterion for secure compilation

w5: apr 27 29

lecture 9:
different criteria that preserve different classes
equivalent characterisation and why are they equivalent

Cheatsheet:
http://theory.stanford.edu/~mp/mp/Publications_files/RC-diagram.pdf

lecture 10:
safety preservation
liveness preservation
rsc definition, implications

Suggested reading:
https://squera.github.io/pdf/scAndHp.pdf

w6: may 4 6

lecture 11:
rsc via correctness of trace-based BT and via cross-language relation

different forms of BT for our languages: can we make a context based BT

lecture 12:
stlc and ulc, erasing compiler
compiler correctness and cross-language logical relation
fa compiler, intuition, uval, emuldv

w7: may 11 13

lecture 13:
approximate bt, pseudotype LR
technicalities of the approx BT: in/case, in/dn, emulate, inject/extract
fac reflection and preservation via approx BT
lecture 14:
presentations

w8: may 18 20

lecture 15:
presentations
lecture 16:
presentations

w9: may 25 27

oral exams:
discuss your exam, some assignments and some course questions