2021-03-23

Expression ProblemにおけるOOPと関数型の対比について

OCaml

Expression Problemという計算機科学・ソフトウェア工学の世界でそれなりに有名な問題がある。

この記事がいい感じにまとめていると思う：

eli.thegreenplace.net

Imagine that we have a set of data types and a set of operations that act on these types. Sometimes we need to add more operations and make sure they work properly on all types; sometimes we need to add more types and make sure all operations work properly on them. Sometimes, however, we need to add both - and herein lies the problem. Most of the mainstream programming languages don't provide good tools to add both new types and new operations to an existing system without having to change existing code. This is called the "expression problem".

で、この記事でもそうだし他の扱いでも基本的に

「OOPだと新しいデータ型を追加するのは簡単、だけど新しい関数を追加するのは（既存のクラスにメソッド生やさないといけないので）大変」
「関数型だと新しい関数は他のコードを触らずに追加できる、でも新しいデータ型（ADTの直和型）を追加したら（既存の関数にその型に対応する部分を追加しないといけないので）大変」

という対比が示される。

私も以前は「ふんふん、確かにそうだなー」と考えていたのだけど。（さらにいうとOCamlやHaskellの場合は、パターンマッチでワイルドカードを濫用しなければコンパイラが問題点を教えてくれてそれを逐次直していくだけだからそこまで問題視もしていなかった。特にアプリケーションコードだったらそこまで気になることでもないように思う・・・）

最近OCamlでオブジェクトやクラスを使わない代わりに何をするのか、について考えていたところ、この「OOP vs 関数型」な論はかなり雑じゃないか？と感じ始めた。

というのも例えばこちらの記事：

camlspotter.hatenablog.com

ここに書かれているような形で一つのモジュールに特定のデータ型とそのデータ型に対する操作を定義するのはOCamlではかなり普及している手法である。そして上記の記事の最後にもある通り、この形にしておく大きな利点は「ML module system における強力なプログラム再利用装置（上の記事から）」であるファンクタに適用しやすいモジュールインタフェースになるということだ。

OCamlではアドホック多相はないのだけど、「特定のインタフェースを提供する型に対して、コードは一回書くだけで全部に対応できるようにする」という「プログラム再利用」はファンクタで実現できる。

まあ例えばこんな感じ：

module type ADD = sig
  type t
  val add : t -> t -> t
end

module Int = struct
  type t = int
  let add = (+)
end

module String = struct
  type t = string
  let add = (^)
end

module MakeMul (M : ADD) = struct
  include M
  let rec mul x n =
    if n <= 1 then x
    else add x (mul x (n-1))
end

module MulInt = MakeMul(Int)
module MulString = MakeMul(String)

utop # MulInt.mul 3 5;;
- : int = 15

utop # MulString.mul "hello" 5;;
- : string = "hellohellohellohellohello"

アドホック多相ではないので、mul 3 5やmul "hello" 5とは書けないが、いちいち同じ処理をInt.tとString.tに対して書くのではなく、ADDというインタフェースを持っているすべてのモジュールのM.tに対して同一のコードで対応できる。

で、ここに「しまったMulの中でエラーを出す必要が生じたので、値を文字列化して出力しないと・・・」となった場合

module type ADD = sig
  type t
  val add : t -> t -> t
  val to_string : t -> string (* ここ *)
end

module Int = struct
  type t = int
  let add = (+)
  let to_string = string_of_int (* ここ *)
end

module String = struct
  type t = string
  let add = (^)
  let to_string x = x (* ここ *)
end

module MakeMul (M : ADD) = struct
  include M
  let rec mul x n =
    if n <= 1 then x
    else add x (mul x (n-1))
  let error x = failwith @@ ("Something wrong with " ^ (to_string x)) (* ここ *)
end

module MulInt = MakeMul(Int)
module MulString = MakeMul(String)

utop # MulInt.error 5;;
Exception: Failure "Something wrong with 5".

utop # MulString.error "hello";;
Exception: Failure "Something wrong with hello".

とまあいちいち元のモジュールを変えていく必要が生じる。

MLモジュールシステムが特に非関数型だとも、ことさらOOPだとも思えないので、Expression Problemのトレードオフをその二つの対比に使うのはずれている気がしてしまう。（Haskellの型クラスだとまた違う話になりそうで、実際More thoughts on the expression problem in Haskellというような記事もあるのだが、ちゃんと考えきれてないので割愛）

どちらかというと「直和型で特定のデータの違いを表すと、その違いを完全に別のデータ型で表すのと違ったトレードオフになるよ」「違うトレードオフを選択できる手段があるのは便利だよ」という代数的データ型の特徴がよく出る問題だと考えるのがいいのではないだろうか？

2021-03-22

OCamlProのOCaml Cheat Sheets

OCaml

OCamlProのブログを過去に辿っていたらこういう記事を見つけた：

www.ocamlpro.com

OCamlの構文の１ページまとめと、標準ライブラリの２ページまとめ。（出たのは2019年9月）

流石に構文に関しては（Objects & Classes関連以外は）知らないところはなかったが（でもloopは記憶があやふや・・・）、Stdlibに関しては結構便利で例えばイミュータブルな連想配列のMapモジュールの大まかな機能とかすぐに見れるのは嬉しい。

まあOCamlの標準ライブラリは貧弱なことで有名でCoreやBatteriesといった外部による拡張標準ライブラリを大抵使うことになるので、どちらかというとそちらのチートシートが欲しい気もするが・・・

あと少し古いので、例えばapplicative/monadic let（let*やlet+）のような構文についてはなんの言及もない。（と思ったが、よく考えてみるとあの機能はOCaml 4.08で入ったのでバージョン的にはこのチートシートがカバーしているはず。そこまで重要な構文だとは認識されなかったのか？）

ちなみにopamのものも出ているようだ：

www.ocamlpro.com

こちらはオプション全然知らないのでありがたい。このチートシートを眺めて面白そうな機能はドキュメント読もう。

OCamlProのgithub repoにいくともう少しあった：

github.com

OCamlのC APIについてのものやOCamlの外部ライブラリ紹介のものは面白そうだが、どの程度編集済みなものかはわからないので参考までに読む、に留めておいたほうが良さそうかもしれない。

2011年にも同じようなチートシートを出していたらしく、2011-2019でのOCamlの言語的・標準ライブラリ的な進化についても別記事で回想していた：

www.ocamlpro.com

2019年までのちょっと新しめの主要機能についてさらっと追うには便利そうだ。

2021-03-21

Hanson & SussmanのSoftware Design for Flexibilityを読み始めた

Software Design for Flexibility Scheme

Chris HansonとGerald Jay Sussmanによる共著のSoftware Design for Flexibility（今後SDfFと略す）が昨日届いたので読み始めた。今は第二章を読んでいるところ。

Gerald Jay SussmanはStructure and Interpretation of Computer Programs（通称SICP)の共著者なので、その本が聖典な一部界隈ではSDfFが出版される前から期待が高まっていた。（SussmanはScheme言語の作者の一人でもある）

副題は「How to Avoid Programming Yourself into a Corner」で、Schemeを使って「開発後の拡張の自由度を最大限確保する手法」をいろいろと紹介する本になっている。Forewordに「This book is a master class in specific program organization strategies that maintain flexibility」、謝辞に「In many ways this book is an advanced sequel to SICP」とある通り、ある程度プログラミングに習熟した読者層を想定しているようだ。

紹介される手法としては

コンビネータを使ったDSL
マルチディスパッチ
パターンマッチや探索のための様々なバックトラック
データやコードの階層化による分離
特定の問題解決のためのインタプリタ実装

あたりで、確かにSICPに出てくるモチーフを深化させたような印象。

第1章終わりまでの雑感

ForewordがGuy Steele！（SussmanとともにSchemeの作者、Common LispやJavaの言語仕様制定にも関わっている）
謝辞でSussmanの恩師としてMinskyとPapertが最初に挙げられてる。Minskyはそうだろうなと思ったけどPapertとの関係は知らなかった
第１章は生物学的な観点から「変化にRobustで進化・対応していけるシステムの構造」を探るような内容になっている。Postel's Law、分離したコンポーネントを組み合わせるためのコンビネータ、冗長性（システムの部品の失敗を許容するredundancyと新しい状況に適応する手段を複数持ち得るdegeneracy）、パターンを作成するgeneratorとそのパターンを選別するtesterの分離とフィードバックなど、ソフトウェア開発に適応できる指針をいくつか提示して（実際の手法を深掘りする）今後の章に繋いでいる。
ここらへんの生物学的なシステムを指針とするところは少しアラン・ケイを思わせる
生物学的な問題提起から（SICPでも見られた）Programming Language Theoryに近いトピックでの問題解決に向かうところは面白くもあり、少し我田引水なのでは？という懸念も（少なくとも現段階では）抱いた
個人的にはSchemeで提示される様々な手法のどれほどが別の言語でも使う気になるかが気になっている。例えばC++で使おうと思ったらあまりにもボイラープレートが多くなりすぎてげんなり、というのはありそう。PythonやOCamlだとどうだろう。

非常に面白そうなので「早く読みすすめたい！」と逸る気持ちと「個別の手法をしっかり追って理解したい」という気持ちがあって複雑。とにかく読んでいきたい。

2021-03-20

Robert Harperの型クラス批判

OCaml

Robert Harperのブログに（10年くらい前に）載っていたHaskellの型クラスに対する批判について@elpin1alさんに教えていただいたのでメモ。@elpin1al ありがとうございます！

型クラスとモジュール

言語の抽象化機構の中心的存在は何かと考えると、Haskellなら型クラス、MLならモジュールが挙がる。

www.quora.com

例えばこのOCamlとHaskellを比較する質問で：

Typeclasses vs Modules

Haskell has typeclasses, which are simply awesome. OCaml has a superb module system which can play some of the same roles, but typeclasses are simply more useful, more often. Some specific patterns—like monads—are nicer and easier to use thank to typeclasses. People using OCaml in the real world do still use particular monads like option or async, but they're more awkward than in Haskell, especially when you want to use multiple ones in related parts of your code. Additionally, it's much harder to write generic code against monads in OCaml.

のようなことが書かれたりする。

これを踏まえてこんなジョークを飛ばした：

who needs functors when you can have type classes? https://t.co/3wwlw3gV9R
— zehnpaard (@zehnpaard) February 24, 2021

そうしたらこういうツッコミ（？）をもらった：

Still waiting for an answer to this question
— Daniピー (@kidviddy) February 25, 2021

私も個人的にはまだ型クラスによる制御されたアドホック多相を放棄するほどの価値がモジュールシステムにあるのかは判断できない・・・

Modules Matter Most

こういう時に思い出すのがThe Definition of Standard MLの作者の一人Robert HarperのこのModules Matter Mostというブログ記事である：

existentialtype.wordpress.com

（ちなみにHarperは共著で「Modularity Matters Most」という論文も出している）

全体的にHarper節炸裂な感のある記事だが：

In Haskell you have type classes, which are unaccountably popular (perhaps because it’s the first thing many people learn). There are two fundamental problems with type classes. The first is that they insist that a type can implement a type class in exactly one way. For example, according to the philosophy of type classes, the integers can be ordered in precisely one way (the usual ordering), but obviously there are many orderings (say, by divisibility) of interest. The second is that they confound two separate issues: specifying how a type implements a type class and specifying when such a specification should be used during type inference. As a consequence, using type classes is, in Greg Morrisett’s term, like steering the Queen Mary: you have to get this hulking mass pointed in the right direction so that the inference mechanism resolves things the way you want it to.

挙げられている二点の批判のうちの最初のものはよく聞くもの（例えばIntがモノイドとしては(Int, +, 0)として定義されていてそのままだと(Int, *, 1)と使い分けられない）だが、二点目がなかなか理解できないでいた。

Modular Type Classes

その旨呟いたら@elpin1alさんに

instance宣言をした瞬間に、そのinstanceがoverload解決に使われうることを指しています
— El Pin Al (@elpin1al) March 20, 2021

https://t.co/eKDmTsA7Gz の2.2節 "Separating the definition of an instance from its use"を見ると良いです
— El Pin Al (@elpin1al) March 20, 2021

なるほど、ありがとうございます！
とするとやはり「there can only ever be a single instance of a class at a certain type in one program」という点が問題だという話になるんですね
— zehnpaard (@zehnpaard) March 20, 2021

いえ、そちらの問題はnamed instanceなどで解決できるのでそうではなく、「複数instanceが存在する(特に別々のモジュールで定義された)場合にどのinstanceが選択されるかの予測・制御が難しい」という点について言っているのだと思います
— El Pin Al (@elpin1al) March 20, 2021

ブログだと”in Greg Morrisett’s term, ... the way you want it to.”の部分、論文だと”thus facilitating modular decomposition and constraining inference to make use only of a clearly specified set of instances.”の部分です
— El Pin Al (@elpin1al) March 20, 2021

と教えていただいた。

Modular Type Classesという論文で、Modular Implicits関連で話は聞いていたものの読んだことがなかったのだが大変面白かった。

つまりHarperの批判というのは

型クラスはHaskellで実装されているものだと一つの型が同じ型クラスの複数インスタンスになり得ない、がこれはダメだ
Haskell界でもこの問題は認識されているけど、それに対する提案（named instancesなど）だと（Moduleから始めていないので？）特定の箇所での型推論時に期待されるインスタンスを使うようにするのが非常に大変（それに対してモジュールを中心としたデザインに型クラスを上乗せするような実装だと上手くいくよ）

ということか。最初のポイントはHaskellにおける型クラスに対する批判、第二のポイントは（複数インスタンスを可能とさせるような）Haskell界で提案されている型クラス拡張（そしてその思想的背景（＝モジュールの軽視？））に対する批判、という感じだろうか？

2021-03-19

open recursionなモジュールに型定義を持たせる（下改）

OCaml

open recursionの記事に関して、より良い書き方があることに気づいたので書き留めておく。

あっちの記事ではシグニチャ二つを定義していて、GT.t = GT.s greet、NT.t = NT.s named greetと、GT.t型とNT.t型の形が違っていた：

module type NT = sig
  type s
  type t = s named greet
  type ctor

  val create : ctor -> t
  val addressee : t -> string
  val greet : t -> string
end

そのせいでNamedF内のSelfモジュール（NT型だ）を直接GreetFに渡すことができず、struct ... endで無名モジュールを作って型合わせに色々とやる必要が出てきた：

module NamedF(Self : NT)
  : NT with type s = Self.s and type t = Self.t and type ctor = Self.ctor
  = struct
    module Super = GreetF(struct
      type s = Self.s named
      type t = s greet
      include (Self : GT with type s := Self.s named and type t := s greet)
    end)
    include Self
    let greet = Super.greet
    let addressee x = x.greet_child.name
  end

しかしよく考えるとNT.t型とGT.t型を両方s greetにしてしまえばこういう問題は生じない。NTではNT.tが'a named greet型になるようにしたい、という制約があるが、それを実現するために新しく型を追加すればいい、というのが今回の気づきポイント：

module type NT = sig
  type r
  include GT with type s = r named
end

type rを導入することによってtype s = r namedとsの形に制約をつけられる。それ以外はすべてGTと同じなのでinclude GT with type s = r namedでNTが作れる。

これでGTと同じようにtype t = s greetになるのでNamedFがスッキリ書ける：

module NamedF(Self : NT)
  : NT with type r = Self.r and type s = Self.s and type t = Self.t and type ctor = Self.ctor
  = struct
    module Super = GreetF(Self) (* ここ *)
    include Self
    let addressee x = x.greet_child.name
    let greet = Super.greet
  end

Superをモジュールとして定義せずにincludeできるかも考えたのだが、Selfの中のほしいものをshadowしてしまう（あるいは逆にSelfにshadowされてしまう）のでうまくいかなそう。

コード全容：

type 'child greet = { message : string; greet_child : 'child }

module type GT = sig
  type s
  type t = s greet
  type ctor

  val create : ctor -> t
  val addressee : t -> string
  val greet : t -> string
end

module GreetF(Self : GT)
  : GT with type s = Self.s and type t = Self.t and type ctor = Self.ctor
  = struct
    include Self
    let addressee _x = "world"
    let greet x = x.message ^ " " ^ Self.addressee x
  end

module rec Greet
  : GT with type s = unit and type ctor = string (* ここもtype tが省略できることに気づいて少し簡単になっている *)
  = GreetF(struct
    include Greet
    let create id = { message=id; greet_child=() }
  end)

type 'child named = { name : string; named_child : 'child }

module type NT = sig
  type r
  include GT with type s = r named
end

module NamedF(Self : NT)
  : NT with type r = Self.r and type s = Self.s and type t = Self.t and type ctor = Self.ctor
  = struct
    module Super = GreetF(Self)
    include Self
    let addressee x = x.greet_child.name
    let greet = Super.greet
  end

module rec Named
  : NT with type r = unit and type ctor = (string * string) (* 上に同じ *)
  = NamedF(struct
    include Named
    let create (message, name) = { message=message; greet_child={ name=name; named_child=() }}
  end)

2021-03-18

let rec内での多相とlocally abstract typeとpolymorphic type constraints

OCaml

こういう記事を最近読んだ：

tatta.hatenadiary.org

（書かれたのは10年以上前だが）

こういうコードは型チェック通らないよ、という話：

let rec f x = x 
and g () = f 1
and h () = f true;;

Line 3, characters 13-17:
Error: This expression has type bool but an expression was expected of type int

let rec ... and ... and ...という構文の中ではlet多相が効かないということらしい。

let rec ... in ...ならもちろんlet多相で大丈夫：

let rec f x = x in
let g () = f 1 in
let h () = f true in
(g ()), (h ());;

- : int * bool = (1, true)

じゃあ明示的に型をつければ？とナイーブな付け方をしてみると：

let rec f : 'a -> 'a = fun x -> x
and g () = f 1
and h () = f true;;

Line 3, characters 13-17:
Error: This expression has type bool but an expression was expected of type int

これは失敗する。その理由については以前記事を書いた：

zehnpaard.hatenablog.com

のでこういう風には多相を強要できず、let rec ... and ... and ...が（この構文の中では）単相になってしまう。

locally abstract type

記事を読んで手元で試したときは「はえー」と思ったくらいだったのだが、ここ数日GADTについて調べているときにlocally abstract typeに遭遇した。

ふとこの問題にlocally abstract typeが使えるのでは？と思って試してみた：

let rec f : type a. a -> a = fun x -> x
and g () = f 1
and h () = f true;;

val f : 'a -> 'a = <fun>
val g : unit -> int = <fun>
val h : unit -> bool = <fun>

これは通るので成功。locally abstract type、小技として便利そう。

Polymorphic type constraints

ここまで記事を書いてちょっと寝かせておいたら、こんな記事に遭遇した：

www.craigfe.io

The punchline is that OCaml actually does have syntax for expressing polymorphic constraints – and it even involves an explicit quantifier – but sadly it's not often taught to beginners:

let id : 'a. 'a -> 'a = (fun x -> x + 1)

知らなかった・・・

let rec f : 'a. 'a -> 'a = fun x -> x
and g () = f 1
and h () = f true;;

val f : 'a -> 'a = <fun>
val g : unit -> int = <fun>
val h : unit -> bool = <fun>

これで問題なくうまくいくようだ・・・

この構文で

let rec ... and ...などで関数が単相にならないようにする
関数の実装的に単相的なものを型検査で弾く

の両方ができるらしい。

しかしGADTで：

type 'a t =
  Int : int -> int t
| Char : char -> char t

let get : 'a. 'a t -> 'a = function
  Int x -> x
| Char x -> x

とは書けない。ここの違いに関してはまだ理解できていない・・・　（'a. 'a -> 'aがダメな理由はなんとなくわかるが、locally abstract typeはそれ以上の何をしているのだろう？）

追記１

こういうご指摘を@camloebaさんにいただいた：

let rec で再帰定義されている値は、定義中で多相型に推論するのが決定不能なので諦めています。and が無くても同じようなことが起きます。例えば let rec f : 'a. 'a -> unit = fun x -> f x; f (x,x)
— Jun Furuse 🐫🌴 (@camloeba) March 18, 2021

なるほど・・・　この例だと実行時は最初のf xで無限ループになるが、こういうこともできるわけだ：

let rec f : 'a. ('a * int) -> unit =
  fun (x, y) ->
    if y = 0 then ()
    else (print_endline "hello"; f ((x,x), y-1));;

引数のタプルの第一要素の型は再帰するたびに変わる（倍々になっていく）がpolymorphic type constraintを入れているので大丈夫。let rec f = fun ...と型制約を入れない場合は多相だと型推論しないのでエラー。

追記２

こちらは@dico_lequeさんに教えていただいた：

これは、locally abstract typesで通るというよりは、locally abstract typesの
Polymorphic syntaxで通ると言うべきではhttps://t.co/TMDsXeciiJ https://t.co/kyUY35hz0a
— でこれき (@dico_leque) March 18, 2021

たしかに・・・(type a) : a -> aのようなOCaml 3.12で追加されたlocally abstract type構文ではダメで：

let rec f (type a) : a -> a = fun x -> x
and g () = f 1
and h () = f true;;

Line 3, characters 13-17:
Error: This expression has type bool but an expression was expected of type int

4.00で追加されたlocally abstract typeのpolymorphic syntax let rec f : type a. a -> aを使う必要がある。（GADTで使うのもこちらの構文）

@camloebaさん、@dico_lequeさんご教示いただきありがとうございます！

2021-03-17

OCamlでGADTを試してみた

OCaml

なんとなく幽霊型がわかったので、その勢いに乗ってGeneralized Algebraic Data Typesについても調べてみた。

OCamlの言語機能としては：

caml.inria.fr

OCaml 4.00から導入された（ちなみに2021/3で最新版は4.12）とのことで、リリーススケジュールを見てみると：

ocaml.org

4.00がリリースされGADTが使えるようになったのは2012年のようだ。

GADTとは

簡潔な解説としてはこちらのreddit投稿がすごくよかった：

www.reddit.com

ちょっとアレンジして説明してみる。

例えばこんな型を定義したとする：

type t = Int of int | Char of char

ゲッターもほしい。が、どう定義すればいいのか。

簡単なのは返ってくる型ごとに定義してみる方式：

let getInt = function
| Int x -> x
| _ -> failwith "cannot getInt on non-Int"

let getChat = function
| Char x -> x
| _ -> failwith "cannot getChar on non-Char"

これはちゃんとコンパイルする。が二つ関数を定義することになるし、さらに型が合わないバリアントの場合は実行時エラーを投げることになる（静的型の恩恵を受けない）

多相的なgetを定義しようとすると：

let get = function
| Int x -> x
| Char x -> x

Line 3, characters 12-13:
Error: This expression has type char but an expression was expected of type int

と怒られてしまう。

つまり問題点は二つ：

特定の型とバリアントを対象に考えてパターンマッチする時に、それ以外のバリアントで実行時エラーを投げるような処理を書くことになる
返り値の型が多相的な関数が書けない

この二点を解決するのがGeneralized Algebraic Data Typeだ。

GADTに直接いく前に少し回り道をして考えてみたい。返り値の多相に関しては

type 'a t = 'a list
let f = function [] -> None | x :: _ -> Some x

のような形ならパラメトリック多相に書ける。

だったら

type 'a t =
  Int of int
| Char of char

ならどうか？というとこれは前回見た幽霊型で、例えばlet x : bool t = Int 5のような値が型システム上は作れてしまう。なので

let f : 'a t -> a = function
  Int x -> x
| Char x -> x

という関数が書けない（let x : bool t = Int 5 in f xがboolにならないといけないので型チェックが通らない）。

なんとかして

type 'a t =
  (Int of int) : int t
| (Char of char) : char t

のようにバリアントごとに型をきっちり決めることができれば・・・

というわけでそれができるのがGADTだ。

実際の構文はちょっと変わっている：

type 'a t =
  Int : int -> int t
| Char : char -> char t

コンストラクタを関数とみなして、その関数の型を書くような形で定義している。これで例えばInt 5は必ずint tf型になる。

このGADTに対してgetIntやgetCharを書くと：

let getInt = function Int x -> x
let getChar = function Char x -> x

関数一つにつき一つのバリアントについてだけパターンを書くだけで済む。

コンパイラはgetInt : int t -> int、getChar : char t -> charと推論して、その場合はパターンマッチが網羅的であると型チェックを通す。

そして多相的なget：

let get : type a. a t -> a = function
  Int x -> x
| Char x -> x

ここでは一工夫必要になっていて、get : type a. a t -> aでaというlocally abstract typeを使うことでちゃんと多相性を保つようにする。

これで

get @@ Int 5
get @@ Char 'c'

と入力されるバリアントが違えば返り値の型も変わる関数が書けた。

前述の問題点２つが

特定の型とバリアントを対象に考えてパターンマッチする時に、それ以外のバリアントは書かなくてもいい（書かなくても網羅的であると型システムが判断してくれる）
返り値の型が多相的な関数が書ける（locally abstract typeも使えば）

としっかり解決されている。

ASTをGADTで表現してみる

前回の記事の終わり、「隠蔽しないと幽霊型にならない」という話題で言及した記事：

camlspotter.hatenablog.com

とその元である：

tatta.hatenadiary.org

こちらの例は（記事が書かれた時点ではOCamlになかった）GADTで書くと非常に綺麗になる。

型はこんな定義：

type 'a term =
| Lit : int -> int term
| Inc : int term -> int term
| IsZ : int term -> bool term
| If : (bool term * 'a term * 'a term) -> 'a term
| Pair : ('a term * 'b term) -> ('a * 'b) term
| Fst : ('a * 'b) term -> 'a term
| Snd : ('a * 'b) term -> 'b term

幽霊型を使う場合はいちいちコンストラクタを関数でwrapしてその関数の返り値の型を幽霊型にして、という作業が必要だったがGADTの場合はバリアント定義そのものにしっかり型を指定するのでそういったwrapperはいらない。

上記の定義だけで：

utop # Lit 0;;
- : int term = Lit 0

utop # Inc (Lit 0);;
- : int term = Inc (Lit 0)

utop # IsZ (Lit 0);;
- : bool term = IsZ (Lit 0)

utop # IsZ (IsZ (Lit 0));;
Line 1, characters 4-17:
Error: This expression has type bool term
       but an expression was expected of type int term
       Type bool is not compatible with type int

とダメなケースは型システムで静的に弾いてくれる。

GADTが幽霊型に対してこのケースでさらに優れている点としては、この型を使った関数でも静的型検査がうまく作動してくれること。

例えばこのASTを評価するeval関数を作成する：

let rec eval : type a. a term -> a = function
| Lit n -> n
| Inc e -> eval e + 1
| IsZ e -> eval e = 0
| If(cond, t, f) -> if eval cond then eval t else eval f
| Pair(e1, e2) -> (eval e1), (eval e2)
| Fst e -> fst @@ eval e
| Snd e -> snd @@ eval e

まあ非常にスムーズに書ける。GADTではなくただのバリアントで書いた場合はIf(cond, t, f)などでeval condがちゃんとbool（をwrapしたバリアント）になっているかチェック、boolじゃなければエラー、といった処理を書く必要が出てくるものだが、ここでは必要ない。Ifバリアントの定義からcondはbool term型でしかありえず、ということはeval condはbool型でしかありえない、ということが型システムに保証される。

さらには、例えばうっかり

let rec eval : type a. a term -> a = function
| Lit n -> n
| Inc e -> eval e + 1
| IsZ e -> eval e = 0
| If(cond, t, f) -> if eval t then eval cond else eval f (* ここ *)
| Pair(e1, e2) -> (eval e1), (eval e2)
| Fst e -> fst @@ eval e
| Snd e -> snd @@ eval e

とIf(cond, t, f)の部分の順序を間違えてeval tを分岐条件に持ってこようとした場合：

Line 5, characters 23-29:
Error: This expression has type a but an expression was expected of type
       bool because it is in the condition of an if-statement

と「eval tはbool型であるとは限りません」と静的に弾かれる。

他の使い方

前回の記事でも出たYaron MinskyのJane Street BlogにもGADTの（言語実装以外の実用的な）活用事例が載っている：

blog.janestreet.com

多相的なarrayとchar型しか格納できないbyteを統一的なバリアント型とインターフェイスで扱うのにGADTが非常に便利、という話。Jane Street Blogはこういう実用的・少し泥臭い話題も出てくるから面白い。

Arantium Maestum

プログラミング、囲碁、読書の話題

Expression ProblemにおけるOOPと関数型の対比について

OCamlProのOCaml Cheat Sheets

Hanson & SussmanのSoftware Design for Flexibilityを読み始めた

第1章終わりまでの雑感

Robert Harperの型クラス批判

型クラスとモジュール

Modules Matter Most

Modular Type Classes

open recursionなモジュールに型定義を持たせる（下改）

let rec内での多相とlocally abstract typeとpolymorphic type constraints

locally abstract type

Polymorphic type constraints

追記１

追記２

OCamlでGADTを試してみた

GADTとは

ASTをGADTで表現してみる

他の使い方