「软件基础 - LF」 12. IMP

Posted by Hux on January 12, 2019
Z ::= X;;
Y ::= 1;;
WHILE ~(Z = 0) DO
  Y ::= Y * Z;;
  Z ::= Z - 1
END

A weird convention through out all IMP is:

  • a-: arith
  • b-: bool
  • c-: command

Arithmetic and Boolean Expression

Abstract Syntax

a ::= 
    | nat
    | a + a
    | a - a
    | a * a
b ::= 
    | true
    | false
    | a = a
    | a ≤ a
    | ¬b
    | b && b
1
2
3
4
5
6
7
8
9
10
11
12
Inductive aexp : Type :=
  | ANum (n : nat)
  | APlus (a1 a2 : aexp)
  | AMinus (a1 a2 : aexp)
  | AMult (a1 a2 : aexp).
Inductive bexp : Type :=
  | BTrue
  | BFalse
  | BEq (a1 a2 : aexp)
  | BLe (a1 a2 : aexp)
  | BNot (b : bexp)
  | BAnd (b1 b2 : bexp).

Evaluation

TODO: is this considered as “denotational semantics”?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Fixpoint aeval (a : aexp) : nat :=
  match a with
  | ANum n  n
  | APlus a1 a2  (aeval a1) + (aeval a2)
  | AMinus a1 a2  (aeval a1) - (aeval a2)
  | AMult a1 a2  (aeval a1) * (aeval a2)
  end.
Fixpoint beval (b : bexp) : bool :=
  match b with
  | BTrue  true
  | BFalse  false
  | BEq a1 a2  (aeval a1) =? (aeval a2)
  | BLe a1 a2  (aeval a1) <=? (aeval a2)
  | BNot b1  negb (beval b1)
  | BAnd b1 b2  andb (beval b1) (beval b2)
  end.

Supposed we have a Fixpoint optimize_0plus (a:aexp) : aexp

1
2
Theorem optimize_0plus_sound: a,
  aeval (optimize_0plus a) = aeval a.

During the proof, many cases of destruct aexp are similar! Recursive cases such as APlus, AMinus, AMult all require duplicated IH application.

From Coq Intensive: when we simpl on APlus case. it’s not “simplified” but give us a pattern matching. That’s a hint that we need to furthur case analysis by destruct n as 0 case or _ case.

Coq Automation

Tacticals

“higher-order tactics”.

try T and ; tacticals

if T fail, try T successfully does nothing at all

T;T' : performs T' on each subgoal generated by T.

Super blindly but useful: (only leave the “interesting” one.)

1
2
3
4
5
induction a;
    (* Most cases follow directly by the IH... *)
    try (simpl; rewrite IHa1; rewrite IHa2; reflexivity).
    (* ... or are immediate by definition *)
    try reflexivity.

. is the atomic ; cannot be stepped into…

T; [T1 | T2 | ... | Tn] tacticals

general form or ; T;T' is shorthand for: T; [T' | T' | ... | T'].

repeat tacticals

1
2
3
Theorem In10 : In 10 [1;2;3;4;5;6;7;8;9;10].
Proof.
  repeat (try (left; reflexivity); right). Qed.
  • stop when it fails
  • always succeeds, then loop forever! e.g. repeat simpl

This does not affect Coq’s logical consistency, construction process diverges means we have failed to construct a proof, not that we have constructed a wrong one.

Defining New Tactic Notations

  • Tactic Notation: syntax extension for tactics (good for simple macros)
1
2
Tactic Notation "simpl_and_try" tactic(c) :=
  simpl; try c.
  • Ltac: scripting language for tactics (good for more sophisticated proof engineering)
  • OCaml tactic scripting API (for wizards)

The omega Tactic

Presburger arithmetic

  • arith, equality, ordering, logic connectives
  • O(doubly expontential)

A Few More Handy Tactics

  • clear H
  • subst x, subst
  • rename ... into ... (change auto-generated name that we don’t like…)

the below three are very useful in Coq Automation (w/ try T; T')

  • assumption
  • contradiction
  • constructor (try to apply all constructors. Problem: might have multiple constructors applicable but some fail)

Evaluation as a Relation

Defined as Binary relation on aexp × nat. Exactly Big Step / Structural Operational Semantics.

More flexible than Fixpoint (computation, or Denotational). …Since we can operate on Inductive as data I guess? …and we can also induction on the relation. …and when things getting more and more “un-computable” (see below).

译注:求值关系不满足对称性,因为它是有方向的。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Inductive aevalR : aexp  nat  Prop :=
  | E_ANum n :
      aevalR (ANum n) n
  | E_APlus (e1 e2: aexp) (n1 n2: nat) :
      aevalR e1 n1 
      aevalR e2 n2 
      aevalR (APlus e1 e2) (n1 + n2)
  | E_AMinus (e1 e2: aexp) (n1 n2: nat) :
      aevalR e1 n1 
      aevalR e2 n2 
      aevalR (AMinus e1 e2) (n1 - n2)
  | E_AMult (e1 e2: aexp) (n1 n2: nat) :
      aevalR e1 n1 
      aevalR e2 n2 
      aevalR (AMult e1 e2) (n1 * n2).

Noticed now we now define inductive in a mixed style: some arg is before : (named), some are after : (anonymous).

We could do this as well

1
2
3
4
 | E_APlus (e1 e2: aexp) (n1 n2: nat)
      (H1 : aevalR e1 n1)
      (H2 : aevalR e2 n2) :
      aevalR (APlus e1 e2) (n1 + n2)

Reserved Notation allow us using the notation during the definition!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
Reserved Notation "e '\\' n" (at level 90, left associativity).

Inductive aevalR : aexp  nat  Prop :=
  | E_ANum (n : nat) :
      (ANum n) \\ n
  | E_APlus (e1 e2 : aexp) (n1 n2 : nat) :
      (e1 \\ n1)  
      (e2 \\ n2)  
      (APlus e1 e2) \\ (n1 + n2)
  | E_AMinus (e1 e2 : aexp) (n1 n2 : nat) :
      (e1 \\ n1)  
      (e2 \\ n2)  
      (AMinus e1 e2) \\ (n1 - n2)
  | E_AMult (e1 e2 : aexp) (n1 n2 : nat) :
      (e1 \\ n1)  
      (e2 \\ n2)  
      (AMult e1 e2) \\ (n1 * n2)

  where "e '\\' n" := (aevalR e n) : type_scope.

I hated this infix \\ notation…it tries to mimic (double down arrow).

1
2
3
4
                           e1 \\ n1
                           e2 \\ n2
                     -------------------- (E_APlus)
                     APlus e1 e2 \\ n1+n2

is actually:

1
2
3
4
                           e1 ⇓ n1
                           e2 ⇓ n2
                     -------------------- (E_APlus)
                     APlus e1 e2 ⇓ n1+n2

Coq Intensive: If you have two variables above the line. Think about if you need generalize dependent.

Computational vs. Relational Definitions INTERESTING

In some cases, relational definition are much better than computational (a.k.a. functional).

for situations, where thing beingdefined is not easy to express as a function (or not a function at all)

case 1 - safe division

1
2
Inductive aexp : Type :=
| ADiv (a1 a2 : aexp). (* <--- NEW *)
  • functional: how to return ADiv (ANum 5) (ANum 0)? probably has to be option (Coq is total!)
  • relational: (a1 \\ n1) → (a2 \\ n2) → (n2 > 0) → (mult n2 n3 = n1) → (ADiv a1 a2) \\ n3.
    • we can add a constraint (n2 > 0).

case 2 - non-determinism

1
2
Inductive aexp : Type :=
| AAny (* <--- NEW *)
  • functional: not a deterministic function…
  • relational: E_Any (n : nat) : AAny \\ n … just say it’s the case.

Nonetheless, functional definition is good at:

  1. by definition deterministic (need proof in relational case)
  2. take advantage of Coq’s computation engine.
  3. function can be directly “extracted” from Gallina to OCaml/Haskell

In large Coq developments:

  1. given both styles
  2. a lemma stating they coincise (等价)

Expressions with Variables

State (Environment) 环境

A machine state (or just state) represents the current values of all variables at some point in the execution of a program.

1
Definition state := total_map nat.

Syntax

1
2
Inductive aexp : Type :=
  | AId (x : string) (* <--- NEW *)

Notations & Coercisons – “meta-programming” and AST quasi-quotation

Quasi-quotation

OCaml PPX & AST quasi-quotation

quasi-quotation enables one to introduce symbols that stand for a linguistic expression in a given instance and are used as that linguistic expression in a different instance.

e.g. in above OCaml example, you wrote %expr 2 + 2 and you get [%expr [%e 2] + [%e 2]].

Coq’s Notation Scope + Coercision == built-in Quasi-quotation

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
(** Coercision for constructors **)
Coercion AId : string >-> aexp.
Coercion ANum : nat >-> aexp.

(** Coercision for functions **)
Definition bool_to_bexp (b : bool) : bexp := if b then BTrue else BFalse.
Coercion bool_to_bexp : bool >-> bexp.

(** Scoped Notation **)
Bind Scope imp_scope with aexp.
Bind Scope imp_scope with bexp.

(** the Extension Point token **)
Delimit Scope imp_scope with imp.

(** now we can write... **)
Definition example_aexp := (3 + (X * 2))%imp : aexp.
Definition example_aexp : aexp := (3 + (X * 2))%imp. 
Definition example_aexp := (3 + (X * 2))%imp.    (* can be inferred *)

Evaluation w/ State (Environment)

Noticed that the st has to be threaded all the way…

1
2
3
4
5
6
7
8
Fixpoint aeval (st : state) (a : aexp) : nat :=
  match a with
  | AId x  st x (* <--- NEW *)              (** lookup the environment **)
  ...

Fixpoint beval (st : state) (b : bexp) : bool := ...

Compute (aeval (X !-> 5) (3 + (X * 2))%imp). (** ===> 13 : nat **)

Commands (Statement)

c ::= SKIP | x ::= a | c ;; c | TEST b THEN c ELSE c FI | WHILE b DO c END

we use TEST to avoid conflicting with the if and IF notations from the standard library.

1
2
3
4
5
6
Inductive com : Type :=
  | CSkip
  | CAss (x : string) (a : aexp)
  | CSeq (c1 c2 : com)
  | CIf (b : bexp) (c1 c2 : com)
  | CWhile (b : bexp) (c : com).

notation magics:

1
2
3
4
5
6
Bind Scope imp_scope with com.
Notation "'SKIP'" := CSkip : imp_scope.
Notation "x '::=' a" := (CAss x a) (at level 60) : imp_scope.
Notation "c1 ;; c2" := (CSeq c1 c2) (at level 80, right associativity) : imp_scope.
Notation "'WHILE' b 'DO' c 'END'" := (CWhile b c) (at level 80, right associativity) : imp_scope.
Notation "'TEST' c1 'THEN' c2 'ELSE' c3 'FI'" := (CIf c1 c2 c3) (at level 80, right associativity) : imp_scope.

Unset Notations

1
2
3
Unset Printing Notations.  (** e1 + e2 -> APlus e1 e2 **)
Set Printing Coercions.    (** n -> (ANum n) **)
Set Printing All.

The Locate command

1
2
3
4
5
6
Locate "&&".

(** give you two, [Print "&&"] only give you the default one **)
Notation
"x && y" := andb x y : bool_scope (default interpretation)
"x && y" := BAnd x y : imp_scope

Evaluating Commands

Noticed that to use quasi-quotation in pattern matching, we need

1
2
3
4
5
6
Open Scope imp_scope.
...
  | x ::= a1 =>     (**  CAss x a1  **)
  | c1 ;; c2 =>     (**  CSeq c1 c1 **)
...
Close Scope imp_scope.

An infinite loop (the %imp scope is inferred)

1
2
3
4
Definition loop : com :=
  WHILE true DO
    SKIP
  END.

The fact that WHILE loops don’t necessarily terminate makes defining an evaluation function tricky…

Evaluation as function (FAIL)

In OCaml/Haskell, we simply recurse, but In Coq

1
2
3
4
| WHILE b DO c END => if (beval st b)
                      then ceval_fun st (c ;; WHILE b DO c END)
                      else st
(** Cannot guess decreasing argument of fix **)

Well, if Coq allowed (potentially) non-terminating, the logic would be inconsistent:

1
Fixpoint loop_false (n : nat) : False := loop_false n.   (** False is proved! **)

Step-Indexed Evaluator (SUCC)

Chapter ImpCEvalFun provide some workarounds to make functional evalution works:

  1. step-indexed evaluator, i.e. limit the recursion depth. (think about Depth-Limited Search).
  2. return option to tell if it’s a normal or abnormal termination.
  3. use LETOPT...IN... to reduce the “optional unwrapping” (basicaly Monadic binding >>=!)
    • this approach of let-binding became so popular in ML family.

Evaluation as Relation (SUCC)

Again, we are using some fancy notation st=[c]=>st' to mimic : In both PLT and TaPL, we are almost exclusively use Small-Step, but in PLC, Big-Step were used:

1
2
3
4
                      beval st b1 = true
                       st =[ c1 ]=> st'
            ---------------------------------------  (E_IfTrue)
            st =[ TEST b1 THEN c1 ELSE c2 FI ]=> st'

is really:

1
2
3
4
                        H; b1 ⇓ true
                        H; c1 ⇓ H'
              ----------------------------------  (E_IfTrue)
              H; TEST b1 THEN c1 ELSE c2 FI ⇓ H'
1
2
3
4
5
6
7
8
9
10
11
12
13
Reserved Notation "st '=[' c ']⇒' st'" (at level 40).
Inductive ceval : com  state  state  Prop :=
...
| E_Seq : c1 c2 st st' st'',
    st =[ c1 ] st' 
    st' =[ c2 ] st'' 
    st =[ c1 ;; c2 ] st''
| E_IfTrue : st st' b c1 c2,
      beval st b = true 
      st =[ c1 ] st' 
      st =[ TEST b THEN c1 ELSE c2 FI ] st'
...
  where "st =[ c ]⇒ st'" := (ceval c st st').

By definition evaluation as relation (in Type level), we need to construct proofs (terms) to define example.

…noticed that in the definition of relaiton ceval, we actually use the computational aevel, beval.. …noticed that we are using explicit style rather than constructor argument style (for IDK reason). They are the same!

Determinism of Evaluation

Changing from a computational to a relational definition of evaluation is a good move because it frees us from the artificial requirement that evaluation should be a total function 求值不再必须是全函数

But it also raises a question: Is the second definition of evaluation really a partial function? 这个定义真的是偏函数吗?(这里的重点在于 偏函数 要求 right-unique 即 deterministic)

we can prove:

1
2
3
4
5
Theorem ceval_deterministic: c st st1 st2,
     st =[ c ] st1 
     st =[ c ] st2 
     st1 = st2.
Proof. ...

Reasoning About Imp Programs

Case plus2_spec

1
2
3
4
5
6
Theorem plus2_spec : st n st',
  st X = n 
  st =[ plus2 ] st' 
  st' X = n + 2.
Proof.
  intros st n st' HX Heval.

this looks much better as inference rules:

1
2
3
4
    H(x) = n
    H; x := x + 2 ⇓ H'
  --------------------- (plus2_spec)
    H'(x) = n + 2

By inversion on the Big Step eval relation, we can expand one step of ceval (对 derivation tree 的 expanding 过程其实就是展开我们所需的计算步骤的过程)

1
2
3
  st : string -> nat
  =================================
  (X !-> st X + 2; st) X = st X + 2

In inference rule:

1
2
3
  H : string → nat
  ================================
  (x ↦ H(x) + 2); H)(x) = H(x) + 2

Case no_whiles_terminating

1
2
3
4
5
6
7
8
Theorem no_whilesR_terminating_fail:
   forall c, no_whilesR c -> forall st, exists st', st =[ c ]=> st'.
Proof.
  intros.
  induction H; simpl in *. 
  - admit.
  - admit.
  - (* E_Seq *)

If we intros st before induction c, the IH would be for particular st and too specific for E_Seq (It’s actually okay for TEST since both branch derive from the same st)

1
2
3
4
IHno_whilesR1 : exists st' : state, st =[ c1 ]=> st'
IHno_whilesR2 : exists st' : state, st =[ c2 ]=> st'
============================
exists st' : state, st =[ c1;; c2 ]=> st'

So we’d love to

1
2
3
4
5
6
generalize dependent st.
induction H...
- specialize (IHno_whilesR1 st).  destruct IHno_whilesR1 as [st' Hc1].
  specialize (IHno_whilesR2 st'). destruct IHno_whilesR2 as [st'' Hc2].  (* specialize [IH2] with the existential of [IH1] **)
  exists st''.
  apply E_Seq with (st'); assumption.

Additional Exerciese

Stack Compiler

Things that evaluate arithmetic expressions using stack:

  • Old HP Calculators
  • Forth, Postscript
  • Java Virtual Machine
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
infix:
      (2*3)+(3*(4-2))

postfix:
      2 3 * 3 4 2 - * +

stack:
      [ ]           |    2 3 * 3 4 2 - * +
      [2]           |    3 * 3 4 2 - * +
      [3, 2]        |    * 3 4 2 - * +
      [6]           |    3 4 2 - * +
      [3, 6]        |    4 2 - * +
      [4, 3, 6]     |    2 - * +
      [2, 4, 3, 6]  |    - * +
      [2, 3, 6]     |    * +
      [6, 6]        |    +
      [12]          |

Goal: compiler translates aexp into stack machine instructions.

1
2
3
4
5
6
Inductive sinstr : Type :=
| SPush (n : nat)
| SLoad (x : string)   (* load from store (heap) *)
| SPlus
| SMinus
| SMult.

Correct Proof

1
2
Theorem s_compile_correct : forall (st : state) (e : aexp),
  s_execute st [] (s_compile e) = [ aeval st e ].

To prove this, we need a stronger induction hypothesis (i.e. more general), so we state:

1
2
Theorem s_execute_theorem : forall (st : state) (e : aexp) (stack : list nat) (prog : list sinstr),
  s_execute st stack (s_compile e ++ prog) = s_execute st ((aeval st e) :: stack) prog.

and go through!

IMP Break/Continue

1
2
3
Inductive result : Type :=
  | SContinue
  | SBreak.

The idea is that we can add a signal to notify the loop!

Fun to go through!

Slide Q & A

st =[c1;;c2] => st'

  • there would be intermediate thing after inversion so… we need determinism to prove this!
    • (It won’t be even true in undetermincy)
  • the WHILE one (would diverge)
    • true…how to prove?
    • induction on derivation…!
      • show contradiction for all cases
  • to prove ¬(∃st', ...), we intro the existentials and prove the False.

Auto

auto includes try

  1. Proof with auto.
  2. Set Intro Auto