CS358 2020-2021
CS350: SECURE COMPILATION
By prof. Marco Patrignani
Below you will find the following information for the course:
– general course information,
– prerequisites and interest,
– location and timeslots,
– evaluation and grading,
– high-level syllabus outline,
– class and lecture outline.
Some entries are still TBD, please contact the instructor if you have questions.
PIAZZA, CANVAS AND OTHER LINKS:
Canvas: https://canvas.stanford.edu/courses/135723
Piazza: https://piazza.com/class/kmewl9o18oi3je
Gradescope: https://www.gradescope.com/courses/255564 (find the code in Piazza)
We will be using Miro, an online whiteboard, the one for the course is here: https://miro.com/app/board/o9J_lO0m4ZI=/
General Course Information
This course will explore the nascent field of secure compilation, which sits at the intersection between security and programming languages. The perspective from which we teach this field is that of formal methods and foundations of computer security.
The goal of a secure compiler is to compile programs in order to preserve source security properties like data confidentiality and code integrity. This is challenging because attackers operating at the level of the compiler output are inherently more powerful than attackers in the source language. For example, target attackers can mount buffer overflows that bypass compiled code security abstractions that derive e.g., from compiled source level type annotations.
This course will describe:
- the threat model that secure compilers consider
- correctness criteria for secure compilation,
- specific instances of secure compilers
- proof techniques for secure compilation.
Correctness criteria define that a compiler is secure. The course will explain why can we be sure that presented criteria have security meaning (i.e., what kinds of attacks can be defended against and which not).
Secondly, the course will discuss specific instances of secure compilers and how they achieve security (i.e., what mechanism — including types, crypto, security architecture, etc.— do these compilers exploit to realise one of the presented criteria and what are their tradeoffs).
Finally the course will cover proof techniques for secure compilation, i.e., how do you formally prove that a compiler adheres to one of the presented criteria and therefore is secure.
For any question, please contact Marco Patrignani (see mail here).
Prerequisites and Interests
Students should have a basic understanding of programming and of how a compiler works (CS143, CS243). A background in formal languages and semantics is a plus (CS358, CS242).
Given the formal view on the subject, this course should appeal to students with interest in security, formal methods, foundations of computer science.
Location and Time Slots (all times are in Pacific)
- Time slot: T-TH, 10:30-11:50
- Location: Zoom : https://stanford.zoom.us/j/94556898594 (PWD in piazza)
- Office hours:
Friday afternoon 12-13, but other slots can be opened if needed.
Please drop an email ahead of time if you plan to come.
Evaluation and Grading
Evaluation will be based on:
- 6 assignments:
[ass1.pdf] | [ass2.pdf] | [ass3.pdf] | [ass4.pdf] | [ass6.pdf] | [ass7.pdf] - 1 non-optional student presentation [ass5.pdf] of a paper chosen from this list: [ list.pdf ]
- 1 final exam consisting of a written text + oral discussion (which we’ll run on zoom) [exam.pdf]
Grading:
60% assignments
20% presentation
20% final exam
Depending on the student’s interest, part of the evaluation can also be based on a project (research- or educational-focussed) related to the topics discussed in class. This can be discussed with the instructor and will be adjudicated based on student motivation and interest.
Please note that depending on the amount of attendees, some modification in the evaluation criteria may be necessary. All changes will be communicated to the students at the beginning of the class when the number of attendees is known.
High-level Syllabus
Below is a more precise outline of what we will cover in class, grouped by notions. Please note that there is not a direct mapping between notions and lectures, so covering a notion may take 1 lecture while covering another one may take 5. Also, some topics here may vary, though lightly.
Mostly we will follow my preliminary lecture notes:
- [350-bt.pdf] main notes
The original .tex files for this document are here [bt.tex] | [cmds.tex] - [350-asm-lang.pdf] assembly language formalisation.
And the following recommended reading: (no deadline, please do during first 2 weeks)
- Formal Approaches to Secure Compilation. [.pdf]
- Journey Beyond Full Abstraction. [.pdf]
- Robustly Safe Compilation, an Efficient Form of Secure Compilation. [.pdf]
- Fully-Abstract Compilation by Approximate Back-Translation. [.pdf]
- Exorcising Spectres with Secure Compilers. [.pdf]
all these papers are available in my publications page.
- Notions 1:
motivation behind secure compilation - Notions 2: program equivalences for security; expressing security properties with equivalences, classic examples. Traces for security: expressing security properties and hyperproperties (HP) with traces, HP diagram (safety, liveness, hypersafety). Examples: confidentiality, integrity, noninterference, specific properties
- Notions 3: full abstraction (FAC) theory: definition, justification for security, attacks and threat model, violations of FAC, common mistakes. Proving FAC: approaches, what is a backtranslation, context and traces backtranslation, when to use what
- Notions 4: trace preserving compilation: definition, justification, attacks, analogy with FAC, threat model
- Notions 5: Proving FAC. Recap of PL semantics notions: syntax/semantics, reading rules. Source language, target language: syntax and semantics, examples of equivalent programs. Compiler definition, compiler correctness. Backtranslation definition, backtranslation correctness. Target extension with context inspection, target traces, change backtranslation from context-based to trace-based
- Notions 6: Robustness, Robust property preservation, property-ful and -free criteria and their equivalence, different backtranslation classes. Robustly safe compilation (RSP) RSP in detail.
- Notions 7: Proving RSP starting from what we saw in Notions 5, comparison with FAC proof
- Notions 8: Advanced Robust Criteria: RHSP in detail, example of preserving NI, RSCHP, RHP. Relational (hyper)properties and their reason to exist.
Class Outline
Please note that this is tentative and precise class contents can vary slightly
w1: mar 30 apr 1
- lecture 1:
introduction & details
recap PL: source lang with types, target language without types, compiler
Suggested reading:
https://xavierleroy.org/publi/compcert-CACM.pdf
https://xavierleroy.org/courses/EUTypes-2019/slides.pdf
- lecture 2:
compiler correctness (CC)
partial programs, contexts, compiler for partials, compositional CC
security props
limitations of CC
Suggested reading:
https://www.ccs.neu.edu/home/amal/papers/next700ccc.pdf
w2: apr 6 8
- lecture 3:
contextual equivalence to express integrity and confidentiality
contextual equivalence formally
formalising assembly languages - lecture 4:
fully abstract compilation (fac): general statement, motivation, preservation
fac proofs: deriving reflection from CC
fac reflection for our compiler
w3: apr 13 15
- lecture 5:
fac proofs: proving preservation
fac preservation: need a context backtranslation (ctx bt)
fix the ctx bt: inject/extract
fac preservation via bt correctness
Suggested reading:
https://pdfs.semanticscholar.org/aa2e/a7d0f1dd1301e2fb7479acfb2e1f8e53cbc4.pdf
- lecture 6:
add target reflection and try the BT
define trivial target traces, fac with traces, derive what is needed of the traces
fac preservation via traces, fa traces
Suggested reading:
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.44.9264&rep=rep1&type=pdf
w4: apr 20 22
- lecture 7:
impure languages in source and target, ceq for stateful languages, correct compilation
formalising protection mechanisms: capabilities
fac for stateful languages and new encodable security properties
complexity of deriving a fac proof, high-level scheme - lecture 8:
traces as a generic security mechanism,
properties and hyperproperties and relational properties
adding interaction traces to our languages as well as I/O,
robust property preservation as a criterion for secure compilation
Suggested reading:
https://www.cs.cornell.edu/fbs/publications/EnfSecPols.pdf
https://www.cs.cornell.edu/fbs/publications/defliveness.pdf
https://dl.acm.org/doi/10.5555/1891823.1891830
w5: apr 27 29
- lecture 9:
different criteria that preserve different classes
equivalent characterisation and why are they equivalent
Cheatsheet:
http://theory.stanford.edu/~mp/mp/Publications_files/RC-diagram.pdf
- lecture 10:
safety preservation
liveness preservation
rsc definition, implications
Suggested reading:
https://squera.github.io/pdf/scAndHp.pdf
w6: may 4 6
- lecture 11:
rsc via correctness of trace-based BT and via cross-language relation
different forms of BT for our languages: can we make a context based BT
- lecture 12:
stlc and ulc, erasing compiler
compiler correctness and cross-language logical relation
fa compiler, intuition, uval, emuldv
w7: may 11 13
- lecture 13:
approximate bt, pseudotype LR
technicalities of the approx BT: in/case, in/dn, emulate, inject/extract
fac reflection and preservation via approx BT - lecture 14:
presentations
w8: may 18 20
- lecture 15:
presentations - lecture 16:
presentations
w9: may 25 27
- oral exams:
discuss your exam, some assignments and some course questions