(** * The Proved Program of The Month - Type-safe printf via delimited continuations. In this development we define the generalized monad typeclass and one particular instance: the continuation monad with variable input and output types. The latter lets us define [shift] and [reset] delimited control operators with effect typing, answer-type modification, and polymorphism. As an application of these operators, we build a type-safe sprintf. This is joint fun with Oleg Kiselyov who implemented this in Haskell ([http://okmij.org/ftp/Haskell/ShiftResetGenuine.hs], [http://www.haskell.org/pipermail/haskell/2007-December/020034.html]) which itself is based on the lambda-sr-let calculus presented in: AK07, Polymorphic delimited continuations, Asai and Kameyama, APLAS 2007, [http://logic.cs.tsukuba.ac.jp/~kam/paper/aplas07.pdf] The type-safe sprintf and its typing problems (resolved in this development) are described by Kenichi Asai in [http://pllab.is.ocha.ac.jp/~asai/papers/tr07-1.ps.gz] Brought to Coq and documented by Matthieu Sozeau([at lri.fr]). Reviewed and corrected by Oleg Kiselyov. This file can be processed using the latest coq version available ([svn checkout svn://scm.gforge.inria.fr/svn/coq/trunk]) and requires first to get and compile the Haskell prelude port available at: [darcs get http://www.lri.fr/~sozeau/repos/coq/monads] *) (** Life is much easier without arguments getting in the way ! *) Set Implicit Arguments. Unset Strict Implicit. (** ** The generalized monad [GMonad] A generalized monad (or parameterized monad, or indexed monad) is a monad parameterized by two types, input and output that can change during computations. However two monadic computations compose only if the output of the first one is the input of the other one. The laws are just the same. In more detail, the generalized monads are described in Robert Atkey, Parameterised Notions of Computation, Msfp2006 [http://homepages.inf.ed.ac.uk/ratkey/param-notions.pdf] see also [http://www.haskell.org/pipermail/haskell/2006-December/018917.html] *) Class GMonad (m : Type -> Type -> Type -> Type) := gunit : forall {i a} (x : a), m i i a ; gbind : forall {i o' o a b}, m o' o a -> (a -> m i o' b) -> m i o b ; gbind_gunit_left : forall i o a b (x : a) (f : a -> m i o b), gbind (gunit x) f = f x ; gbind_gunit_right : forall i o a (x : m i o a), gbind x gunit = x ; gbind_assoc : forall i o' o'' o a b c (x : m o'' o a) (f : a -> m o' o'' b) (g : b -> m i o' c), gbind x (fun a : a => gbind (f a) g) = gbind (gbind x f) g. (** We define overloaded notations for all generalized monads, resembling the usual monad notations. *) Notation "T >>== U" := (gbind T U) (at level 20). Notation "x <-- T ;; E" := (gbind T (fun x : _ => E)) (at level 30, right associativity). Notation "'ret' t" := (gunit t) (at level 20). (** The [join] of a generalized monad. *) Definition gjoin [ GMonad m ] {I O O' A B} (e1 : m O' O (A -> m I O' B)) (x : A) : m I O B := f <-- e1 ;; f x. (** We can trivially embed any monad in a general monad, ignoring the extra input/output types. The proofs of the laws reduce to the proofs for the monad, hence they are automatically solved. *) Require Import Prelude.Monad. Program Instance [ Monad m ] => monad_gmonad : GMonad (fun input output => m) := gunit i a x := unit x ; gbind i out out' a b m f := bind m f. (** ** The continuation monad [cont]. A [cont I O A] object is a function, which takes a continuation with the answer-type [I] consuming objects of the type [A]. The result, the final answer, is an object of the type [O]. In general, [I] is not the same as [O], hence our [cont I O A] permits answer-type modification. See AK07 for more explanations. The continuation monad is fully polymorphic in the object, input and output type, hence the "final answer" type [O] of the continuation is not fixed, and neither is [I]. We will define combinators that are indeed instantiated at different types in a single expression. In lambda-sr-let calculus of [AK07], a [cont I O A] is the type of terms [t] whose evaluation may incur a control effect. The types appear in judgments of the form G; I |- t : A; O. The judgment asserts that the term [t] has type [A] and that the answer type is transformed from [I] to [O] by [t]. The correspondence to Danvy and Filinski's function space with answer types is as follows: a function [A / I -> B / O] is represented here as [(A -> cont I O I) -> cont B B O]. This is the type of functions from [A] to [B], which also transform the answer type of the whole term in its context from the type [I] to the type [O]. Frequently [A] is instantiated with [()], hence we can change the notation to: [I ---> B / O] for computations which produce [B] objects while transforming the answer type from [I] to [O]. We'll see how it gets instantiated for [sprintf] later on. *) Definition cont I O A := (A -> I) -> O. Notation " { I ---> B | O } " := (cont I O I -> cont B B O) (at level 50) : type_scope. (** *** [GMonad cont] The type constructor [cont] is a generalized monad, as shown below. The implementation is the same as the continuation monad, only the types of the operations are more general. *) Program Instance cont_gmonad : GMonad cont := gunit I A x := fun k => k x ; gbind I O' O A B m f := fun k => m (fun a => f a k). Next Obligation. Proof. extensionality k. reflexivity. Qed. Next Obligation. Proof. extensionality k. f_equal. rewrite eta_expansion. reflexivity. Qed. (** ** [shift] and [reset] The monad lets us implement delimited control operators [shift] and [reset] with answer-type modifications and polymorphism. The operators can realize complex control structures. The types of these combinators implement the rules of the calculus as defined in [AK07] (fig. 3). *) (** The [reset] combinator takes a computation of the type [sigma] modifying the answer-type from [sigma] to [tau]. The combinator yields a pure computation of the type [tau]: That computation has no control effects, hence both the input and output answer types are the same, [a], and it is polymorphic over [a]. *) Definition reset {sigma tau a} (c : cont sigma tau sigma) : cont a a tau := fun k => k (c id). (** The [shift] combinator is third-order, hence not too easy to explain... [shift f] captures the continuation [f] and gives back a continuation producing a [b]. *) Definition shift {tau t a s b} (f : (tau -> cont t t a) -> cont s b s) : cont a b tau := fun k => f (fun tau => gunit (k tau)) id. (** The [run] function actually runs a continuation using the [id] continuation to start with. *) Definition run {T} (k : cont T T T) : T := k id. (** ** Examples Using these combinators we can define all examples of the [AK07] paper. *** [appnd] First we define [appnd], given section 2.1, which returns a continuation to append a list to another list. *) Require Import Coq.Lists.List. (** Returns a continuation which takes a function from [list A] to [T] to produce a function from [list A] to [cont T' T' T] for any [T], [T']. *) Fixpoint appnd {A T T'} (l : list A) : cont T (list A -> cont T' T' T) (list A) := match l with | [] => shift (fun k => gunit k) | hd :: tl => appnd tl >>== fun tl' => ret (hd :: tl') end. (** The values are usually polymorphic on the input and output types so that we can plug them to other monadic expressions. However the type of the result term is specified: it's a function from lists of [A] to continuations producing lists of [A]. *) Example appnd123 {A T} : cont A A (list nat -> cont T T (list nat)) := reset (appnd [1;2;3]). (** We can look at the actual computation defined by this expression using compute to get the normal form of a term. *) Eval compute in @appnd123. (** The notation [convertible x y] builds a proof of reflexivity of equality of [x] and asserts that this is a proof of [x = y]. This will typecheck only if [x] and [y] are actually convertible. It is useful when used together with [Check] which just typechecks a term. A [Check (convertible x y)] command succeeds if and only if [x] and [y] are convertible. *) Notation " 'convertible' x y " := ((refl_equal x) : x = y) (at level 0, x at next level, y at next level). (** The compute command gave us the term on the rhs here. *) Check (convertible @appnd123 fun (A T : Type) (k : (list nat -> (list nat -> T) -> T) -> A) => k (fun (tau : list nat) (k0 : list nat -> T) => k0 (1 :: 2 :: 3 :: tau))). (** Running the computation and applying the continuation to [[4;5;6]] actually finishes the computation and gives [[1;2;3;4;5;6]] as a final result. *) Example test1 : list nat := run (appnd123 >>== (fun k => k [4;5;6])). Check (convertible test1 ([1;2;3;4;5;6])). (** *** The [sprintf] function (sec 2.3 of [AK07]) The paper argues this test requires both the changing of the answer type and polymorphism (fmt is used polymorphically in stest3). We first need to define some operations to build formats. *) (** We define show functions for [nat] and [string] using the existing [Show] typeclass. *) Require Import Prelude.Show. Definition pr_int : nat -> string := show. Definition pr_str : string -> string := id. (** Then we define [conc]atenation of format specifiers, using do-notation. This is simply lifting the concatenation of [string]s to the [cont] monad. We use the infix notation [^^] for format concatenation and [##] for format application, which is in fact the [join] of the [cont] monad. *) Definition conc {I O O'} (e1 : cont O' O string) (e2 : cont I O' string) : cont I O string := x <-- e1 ;; y <-- e2 ;; ret (x +++ y). Infix "^^" := conc (right associativity, at level 70). Infix "##" := gjoin (at level 60). (** The [fmt] combinator takes a function from [A] to [string]s and builds a computation producing a [string] and changing the answer type from [B] to [A -> cont T T B] for any [B], [T]. The computation corresponding to the output answer-type takes an [A], transforms it to a string using [to_str] and passes the result to the current continuation [k]. *) Definition fmt {A B} (to_str : A -> string) {T} : cont B (A -> cont T T B) string := shift (fun k => ret (k ∘ to_str)). (** For [sprintf], the continuation [c] must build a string in the end and takes the result elaborated to this point (of type [string]) as an argument. However, the output type [T] of the whole sprintf computation is polymorphic, depending on how its argument modifies the answer type. It will become a function type if we're using formats. We can use the notation [I -> B / O] of Danvy & Filinski to describe the type of [sprintf]. It is simply an instance of the more general [reset] combinator. Intuitively, it will capture the continuations built using the [shift] operator inside [fmt] and return a function which must be applied to objects of the right type to be able to construct a runnable [cont] object. *) Definition sprintf T : { string ---> string | T } := reset. (** We can now start writing examples uses of [sprintf]. We open the [string] scope to be able to write strings using the common double quote notation. There are implicit conversion going on between the [string] type defined in [Prelude.Show] as [list ascii] and the isomorphic [string] datatype defined in [String]. We force the coercion using the str combinator here. *) Definition str (s : string) {I} : cont I I string := ret s. Open Scope string_scope. (** This example shows the type of [reset] applied to a [cont] object at a polymorphic type [A]. This continuation can be plugged in other computations because of this polymorphism. *) Definition polymorphic_reset A : cont A A string := reset (a:=A) (str "Hello world"). (** On the other hand, [sprintf] defines a restriction of the [reset] operation, and the only thing we can do with an [sprintf] result is run it because the answer type is now fixed. *) Definition monomorphic_reset : cont string string string := sprintf (str "Hello world"). (** Lets run it. *) Example hello_world : string := run (sprintf (str "Hello world")). Check (convertible hello_world "Hello world"). (** More interestingly, we can use format strings which actually change the answer type. *) Example hello_world_fmt : string := run (sprintf (str"Hello " ^^ fmt pr_str ^^ str"!") ## "world"). Check (convertible hello_world_fmt "Hello world!"). (** Let's decompose that expression into its subexpressions, everything is first class here: the format components, the corresponding function and the arguments. First, the [fmt] operator applied to a conversion function gives: *) Definition fmt_str {T A} := fmt pr_str (T:=T) (B:=A). (** An interesting point here is that we are forced to generalize the types by ourselves here. Indeed in Haskell one would not need to add the type abstractions for [T] and [A] nor the type annotations for [fmt] specifying the use of these types. The unbound type variables would be implicitly generalized in put in front of the type. However, we can do without the explicit instantiation using the [(x:=t)] syntax if we add a type signature to the definition instead. *) Definition fmt_int {T A} : cont A (nat -> cont T T A) string := fmt pr_int. (** We could even have a default behavior to implicitly generalize unbound variables appearing in the type of a definition to get rid of the [`T A`] binders, but that's a very fragile notion when the namespace of type variables is the global namespace of defined objects as in Coq (or other dependently-typed programming languages). Compare the following: *) Example aformat {A T} := str "The value of " ^^ fmt (T:=T) pr_str ^^ str " is " ^^ fmt (T:=T) (B:=A) pr_int. Example aformat_with_spec {A T} : cont A (string -> cont T T (nat -> cont T T A)) string := str "The value of " ^^ fmt pr_str ^^ str " is " ^^ fmt pr_int. (** Clearly, in this style of putting a typing constraint, the types will get too big and writing them will be a problem. Although the explicit types may be helpful to the type-checker, they are less and less helpful to us as they become unreadable. However there is some regularity here, which we can take advantage of. Indeed the only relevant part of a format type is the types which appear on the left of the arrows. I.e., the types of the holes of the format string. Hence, we can build a type function which given a list of types constructs the corresponding type specification. This is a regular fixpoint definition in Coq. *) Fixpoint format_type A T (l : list Set) : Type := match l with | [] => A | hd :: tl => hd -> cont T T (format_type A T tl) end. (** Then we can wrap it inside a toplevel [cont] constructor which gives a string. *) Definition format_spec l := forall A T, cont A (format_type A T l) string. (** We can check that it does the right thing for the [aformat] example. *) Check (convertible (format_spec [string;nat]) (forall A T, cont A (string -> cont T T (nat -> cont T T A)) string)). (** Hence we can define formats like this: *) Example aformat_with_format_spec : format_spec [string; nat] := fun _ _ => str "The value of " ^^ fmt pr_str ^^ str " is " ^^ fmt pr_int. (** Note that we still have to abstract by [A] and [T] in the format, but we can make these arguments implicit. This permits to apply the format more easily, [A] and [T] will get instantiated by the context of application (e.g. to [string] if [sprintf] is applied). *) Implicit Arguments aformat_with_format_spec [[A] [T]]. (** Let's use that to define a complex [sprintf] expression: *) Example value_of : string := run ((sprintf aformat_with_format_spec ## "x" ## 3)). Check (convertible value_of "The value of x is 3"). (** That's all folks! *)