2018-04-16

`foldr`で`dropWhile`と不動点コンビネータ補足

Haskell

以下の記事の補足：

zehnpaard.hatenablog.com

式をいじっていて気付いたのだが、不動点コンビネータを使った解は：

dropWhile p = foldr (const f) undefined (repeat undefined)
  where f _ [] = []
        f g ys@(x:xs) = if p x then g xs else ys

に落とし込める。（fixとdropWhileNR関数を別にしていたから当初気付かなかった）

これはそもそも別解の：

dropWhile p xs = foldr f id xs xs
  where f x g ys@(_:zs)
           | p x = g zs
           | otherwise = ys

から式変形で導出できる。xとysの最初の要素が同一なので：

dropWhile p xs = foldr f id xs xs
  where f _ g ys@(z:zs)
           | p z = g zs
           | otherwise = ys

あるいはifを使って：

dropWhile p xs = foldr f id xs xs
  where f _ g ys@(z:zs) = if p z then g zs else ys

に変形できる。そうすると、foldrの第三引数になっているxsの要素はDon't Careパターンに適用されるだけなのでrepeat undefinedで代用できそう：

dropWhile p xs = foldr f id (repeat undefined) xs
  where f _ g ys@(z:zs) = if p z then g zs else ys

上記のコードには問題があって、dropWhile (<5) [1..4]などでエラーを起こす。

実はfoldrの第三引数のxsは、要素は使われなかったがリスト自体の長さは「いつ第二引数のidを返すか」を判定するのに使われていた。具体的にはxsが空リストになった時にidを返す。

なので明示的にその停止条件を入れる：

dropWhile p xs = foldr f id (repeat undefined) xs
  where f _ _ [] = [] 
        f _ g ys@(z:zs) = if p z then g zs else ys

これはf _ _ [] = id []と同じ。foldrの第二引数も使わないのでundefinedで代用できる：

dropWhile p xs = foldr f undefined (repeat undefined) xs
  where f _ _ [] = [] 
        f _ g ys@(z:zs) = if p z then g zs else ys

そしてdropWhile p xs = foldr f undefined (repeat undefined) xsのxsも、それ以降の定義ではまったく使われないので両辺から消してしまう：

dropWhile p = foldr f undefined (repeat undefined)
  where f _ _ [] = [] 
        f _ g ys@(z:zs) = if p z then g zs else ys

f _は(const f)と同義だ。fの第一引数は必ずDon't Careパターンで無視されているので：

dropWhile p = foldr (const f) undefined (repeat undefined)
  where f _ [] = [] 
        f g ys@(z:zs) = if p z then g zs else ys

に落ち着く。あまり複雑なこともせずに式変形で不動点コンビネータが導き出されるのは面白い。

2018-04-15

Haskellで50行以内のミニマルコンパイラ

Haskell 言語処理系

こんなツイートが流れてきた：

A complete #compiler to x86 in one page for my lecture today. pic.twitter.com/PwFl4V5czy
— Joe Gibbs Politz (@joepolitz) 2018年4月11日

元となっているのは"An Incremental Approach to Compiler Construction"という論文とのこと。

面白そうだったので、許可をとってHaskellで再現してみた。(Thanks @joepolitz !)

Minimal Lisp Compiler inspired by Joe Gibbs Politz (@joepolitz) and http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf · GitHub

S式のパース

元のOCamlのコードでは文字列をS式にパースするライブラリを使っている。多分Haskellでも存在すると思うが、今回は簡単なパーサを自前で作る：

data Sexp = Atom String
          | List [Sexp]

parseSexp :: Parser Sexp
parseSexp = parseAtom <|> parseList

parseAtom :: Parser Sexp
parseAtom = ((many1 letter) <|> (many1 digit)) >>= return . Atom

parseList :: Parser Sexp
parseList = between (char '(') (char ')') $ sepBy parseSexp (many1 space) >>= return . List

stringToSexp :: String -> Sexp
stringToSexp s = case parse parseSexp "compiler" s of
  Left err   -> error $ "Parse failed at String->Sexp conversion: " ++ show err
  Right sexp -> sexp

今の所、アルファベット文字だけのAtomか数字だけのAtom、そしてそれらを()で囲ったListだけをパースする。

エラーハンドリングもめんどくさがってParsecがエラーを返してきたら直接errorを投げるようにしている。さらにこのエラーはどこでもキャッチされない。

言語仕様とS式からASTへ

今回コンパイルするソース言語は非常に簡単。

Expressionは整数か、一つのExpressionに対するIncかDecオペレータの適用、だけ。

以下の定義のとおり：

data Op = Inc
        | Dec

data Expr = ENum Integer
          | EOp Op Expr

さきほど定義したパーサが返すS式をこのExpr型を使ったASTに変換する：

sexpToExpr :: Sexp -> Expr
sexpToExpr (Atom s)                 = ENum (read s :: Integer)
sexpToExpr (List [Atom "inc", arg]) = EOp Inc (sexpToExpr arg)
sexpToExpr (List [Atom "dec", arg]) = EOp Dec (sexpToExpr arg)
sexpToExpr _                        = error "Parse failed at Sexp->Expr conversion"

これも簡単。パターンマッチ便利。

ただ、今読み直してみると(inc a)とかだとreadがエラーを投げていて統一感がないな・・・

ASTからアセンブラ

Expr型のデータに表されるASTをアセンブラ命令の文字列に変換する。

使う命令が三つしかない上、そもそもASTが枝分かれとかしないのでものすごく簡単：

exprToInstrs :: Expr -> [String]
exprToInstrs  (ENum n)    = ["move eax, " ++ show n]
exprToInstrs  (EOp Inc e) = exprToInstrs e ++ ["add eax, 1"]
exprToInstrs  (EOp Dec e) = exprToInstrs e ++ ["sub eax, 1"]

組み合わせたらコンパイラ

あとは組み合わせて、すべてのプログラム共通のヘッダ部分・フッタ部分を付け加えるだけ：

compile :: String -> String
compile s = body `seq` header ++ body ++ footer
 where header = "section . text\nglobal our_code_starts_here\nour_code_starts_here:\n"
       body   = concat $ map (printf "  %s\n") $ exprToInstrs $ sexpToExpr $ stringToSexp s
       footer = "  ret\n"

面白いところがあるとすれば、パースエラーが起きた場合にはヘッダを出力したくないのでseqをつかってパース部分を先行評価している部分か。

あとはmain関数のIOモナド内で、ファイル名を受けとり、ファイルを読んでコンパイルし、標準出力に送る：

main :: IO ()
main = do args   <- getArgs
          source <- readFile $ head args
          putStrLn $ compile source

全部で50行以内。非常に簡単なソース言語とはいえ、おおまかな枠組みが「ひとめ」で展望できるような形で実装できたのは面白かった。

これからhttp://scheme2006.cs.uchicago.edu/11-ghuloum.pdfもちょこちょこ実装していきたい。

2018-04-14

`foldr`で`dropWhile`と不動点コンビネータ

Haskell

前回のブログの問題元は今は亡き？Monad.Reader第6号の"Getting a Fix from the Right Fold"という記事。

私が考えた答えの「失敗例」は出てくるけど、パターンマッチを使わないで遅延評価が無限リストにも効く解は出てこない。

しかし別解は非常にエレガントなものが一つ、理解の範疇になかったものが一つ。

`foldr`で関数を作成する

エレガントな解は「foldrと入力リストでまず関数を作り、その関数を再度リストに適用する」というもの。

まずは簡単な定義から：

dropWhile p xs = foldr f id xs xs
  where f x g
          | p x = g . tail
          | otherwise = id

どういうふうに作動するかといえば

dropWhile (<3) [1..10]
= foldr f id [1..10] [1..10]
= f 1 (foldr f id [2..10]) [1..10]
= (foldr f id [2..10] . tail) [1..10]
= (f 2 (foldr f id [3..10]) . tail) [1..10]
= (foldr f id [3..10] . tail . tail) [1..10]
= (f 3 (foldr f id [4..10]) . tail . tail) [1..10]
= (id . tail . tail) [1..10]
= (id . tail) [2..10]
= id [3..10]
= [3..10]

正常に作動するし無限リストも扱えそう。ただし、遅延評価している分(id . tail .... tail)の部分が一旦スペースリークを起こしているのがわかる。

foldr f id xs xsのfの定義を同値性を保ったまま少しいじることで効率化が図れる：

dropWhile p xs = foldr f id xs xs
  where f x g ys@(_:zs)
          | p x = g . zs
          | otherwise = ys

where f x gだったところをf x g ysに変え、さらにys@(_:zs)とtail部分にパターンマッチしている。

これで挙動は以下のとおりとなる：

dropWhile (<3) [1..10]
= foldr f id [1..10] [1..10]
= f 1 (foldr f id [2..10]) [1..10]
= foldr f id [2..10] [2..10]
= f 2 (foldr f id [3..10]) [2..10]
= foldr f id [3..10] [3..10]
= f 3 (foldr f id [4..10]) [3..10]
= [3..10]

tailを貯めてから最終的に適用するのではなく、都度都度最終的な引数であるリストに対してtailを適用するのと同値なパターンマッチをすることで、効率よく「foldrとリストで関数を作って、同じリストに適用する」を実装している。

`foldr`で不動点 コンビネータを作成

さて、不動点コンビネータというものがある。

fix f = f (fix f)

という性質を持つ関数fixのことだ。

foldrでfixを定義できる：

fix :: (a -> b) -> a -> b
fix f = foldr (const f) undefined (repeat undefined)

使われている関数をいくつか説明する。

constは二つの引数をとり、第二引数を評価することなく必ず第一引数を返す：

const :: a -> b -> a
const f _ = f

repeat xはxが無限に続くリストを返す：

repeat :: a -> [a]
repeat x = x : repeat x

fix f = f (fix f)が成り立つ証明：

fix f
= foldr (const f) undefined (repeat undefined)
= foldr (const f) undefined (undefined : repeat undefined)
= (const f) undefined (foldr (const f) undefined (repeat undefined))
= f (foldr (const f) undefined (repeat undefined)
= f (fixed f) -- QED

undefinedが任意の型を取り得ることを利用している。

こうして定義した不動点コンビネータを使って任意の再帰を表現できる、らしい。

たしかにTypes and Programming Languagesでも

The omega combinator has a useful generalization called the fixed-point combinator, which can be used to help define recursive functions such as factorial

などと書いてあった。

`sum`で試してみる。

sumの再帰的定義：

sum :: Num a => [a] -> a
sum [] = 0
sum (x:xs) = x + sum xs

この定義から再帰部分を抜き出して引数rとする：

sumNoRec r [] = 0
sumNoRec r (x:xs) = x + r xs

このsumNoRecをfixに引数として渡すと、再帰的sumと同じ関数になる：

sumfp = fix sumNoRec

確かめてみる：

sumfp [1, 2, 3]
= (fix sumNoRec) [1, 2, 3]
= (foldr (const sumNoRec) undefined (repeat undefined)) [1, 2, 3]
= (foldr (const sumNoRec) undefined (undefined : repeat undefined)) [1, 2, 3]
= (const sumNoRec) undefined (foldr (const sumNoRec) undefined (repeat undefined)) [1, 2, 3]
= sumNoRec (foldr (const sumNoRec) undefined (repeat undefined)) [1, 2, 3]
= 1 + (foldr (const sumNoRec) undefined (repeat undefined)) [2, 3]
= 1 + (foldr (const sumNoRec) undefined (undefined : repeat undefined)) [2, 3]
= 1 + (const sumNoRec) undefined (foldr (const sumNoRec) undefined (repeat undefined)) [2, 3]
= 1 + sumNoRec (foldr (const sumNoRec) undefined (repeat undefined)) [2, 3]
= 1 + (2 + (foldr (const sumNoRec) undefined (repeat undefined)) [3])
= 1 + (2 + (foldr (const sumNoRec) undefined (undefined : repeat undefined)) [3])
= 1 + (2 + (const sumNoRec) undefined (foldr (const sumNoRec) undefined (repeat undefined)) [3])
= 1 + (2 + sumNoRec (foldr (const sumNoRec) undefined (repeat undefined)) [3])
= 1 + (2 + (3 + (foldr (const sumNoRec) undefined (repeat undefined)) []))
= 1 + (2 + (3 + (foldr (const sumNoRec) undefined (undefined : repeat undefined)) []))
= 1 + (2 + (3 + (const sumNoRec) undefined (foldr (const sumNoRec) undefined (repeat undefined)) []))
= 1 + (2 + (3 + sumNoRec (foldr (const sumNoRec) undefined (repeat undefined)) []))
= 1 + (2 + (3 + 0))
= 6

たしかにsumNoRecの停止条件が満たされるまで、何回でもfixによってsumNoRecが適用されるのがわかる。

`dropWhile`を`fix`で

dropWhileの再帰的定義：

dropWhile p [] = []
dropWhile p ys@(x:xs)
  | p x = dropWhile p xs
  | otherwise = ys

dropWhileNRの定義：

dropWhileNR r p [] = []
dropWhileNR r p ys@(x:xs)
  | p x = r p xs
  | otherwise = ys

dropWhileFPの定義：

dropWhileFP = fix dropWhileNR

挙動チェック：

dropWhileFP (<3) [1..10]
= (fix dropWhileNR) (<3) [1..10]
= (foldr (const dropWhileNR) undefined (repeat undefined)) (<3) [1..10]
= (foldr (const dropWhileNR) undefined (undefined : repeat undefined)) (<3) [1..10]
= ((const dropWhileNR) undefined (foldr (const dropWhileNR) undefined (repeat undefined))) (<3) [1..10]
= (dropWhileNR (foldr (const dropWhileNR) undefined (repeat undefined))) (<3) [1..10]
= (foldr (const dropWhileNR) undefined (repeat undefined)) (<3) [2..10]
= (foldr (const dropWhileNR) undefined (undefined : repeat undefined)) (<3) [2..10]
= ((const dropWhileNR) undefined (foldr (const dropWhileNR) undefined (repeat undefined))) (<3) [2..10]
= (dropWhileNR (foldr (const dropWhileNR) undefined (repeat undefined))) (<3) [2..10]
= (foldr (const dropWhileNR) undefined (repeat undefined)) (<3) [3..10]
= (foldr (const dropWhileNR) undefined (undefined : repeat undefined)) (<3) [3..10]
= ((const dropWhileNR) undefined (foldr (const dropWhileNR) undefined (repeat undefined))) (<3) [3..10]
= (dropWhileNR (foldr (const dropWhileNR) undefined (repeat undefined))) (<3) [3..10]
= [3..10]

信じがたいことに実際に上手くいくようだ。というか、書き下してみてようやく何故上手くいくのかが腑に落ちてきた・・・

実用性はわからないが、とりあえず

再帰関数は不動点コンビネータで書き直せる
不動点コンビネータはfoldrで書き表せる

の二点は忘れることはないと思う。

TaPLにはこういうコンビネータによる計算がいろいろと載っていそうなので、ぜひ近いうちに読んでいきたい。

2018-04-13

`foldr`を使って`dropWhile`を定義する

Haskell

Haskellの高階関数のfoldrでdropWhileを定義せよ、という問題を見たのでやってみる。

`foldr`

foldrの定義は以下のとおり：

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f e [] = e
foldr f e (x:xs) = f x (foldr f e xs)

「わかってるねんで？foldrがreduce、foldlがfold。ときどきごっちゃになんねん」「（PythonやClojureの）reduceはfoldlです」

エスカレーターとエレベーター　……ちょっと時々ごっちゃになるねん
— 春日歩(大阪) (@osakabot) 2017年4月5日

とにかくややこしい。

foldr (+) 0 [1, 2, 3] = 1 + (2 + (3 + 0)))
foldl (+) 0 [1, 2, 3] = ((0 + 1) + 2) + 3

でどっちにベースの要素eがつくかでleft/rightを判断するのがよさそう。

`dropWhile`

dropWhileを高階関数を使わずに定義するなら：

dropWhile :: (a -> Bool) -> [a] -> [a]
dropWhile p [] = []
dropWhile p ys@(x:xs) = if p x then dropWhile p xs else ys

になる。

`foldr`で`filter`

まず、より簡単なfilterをfoldrで定義してみる：

filter :: (a -> Bool) -> [a] -> [a]
filter p = foldr f []
  where f x xs = if p x then (x:xs) else xs

filterがなぜ簡単かというと、xを含むかどうかはp xの結果のみで決定できるからだ。

逆に言うとdropWhileが複雑なのはp xだけでなく、「以前の要素がpを満たしていたか」も判断基準になってくるから。

`foldr`と`reverse`で`dropWhile`

ということで、一番簡単な解決法はreverseを使ってリストを逆転させてしまうこと。

dropWhile p = reverse . foldr f [] . reverse
  where f x xs = if p x && (null xs) then xs else x:xs

これでfoldr f e (x:xs)の結果を返す時、foldr f e xsの結果を判断材料にできる。

しかし問題も大きい。そもそもO(n)の処理を三回やっているし、さらに重要なのは遅延評価の強みを消してしまっていること。dropWhileの第一要素を計算するためにリストすべてをいったん調べなければいけない。とくにundefinedが含まれるpartial listや無限リストなどは扱えなくなってしまう。

`foldr`で`dropWhile`

というわけで本命：

dropWhile p = fst . foldr f ([], [])
  where f x xst = case p x of
                    True -> (fst xst, ys)
                    False -> (ys, ys)
                  where ys = x : snd xst

foldrなのでリストの右側から考えて

今後左側の要素がすべてpを満たした場合
左側にpを満たさない要素があった場合

の結果リストをタプルで持っている。

これで

Prelude> take 5 $ dropWhile even ([2, 4, 6, 7, 8] ++ [1..])
[7,8,1,2,3]

のように無限リストにも対応出来る。

パターンマッチによる失敗例

ちなみにこれは：

dropWhile p = fst . foldr f ([], [])
  where f x (a, b) = case p x of
                      True -> (a, c)
                      False -> (c, c)
                    where c = x : b

としてしまうと無限リストなどの場合stack overflowを起こしてうまくいかない。なんとなく直観的にはわかるのだが説明はできない・・・

2018-04-12

HaskellとParsecでLisp REPL　その８（変数定義）

Haskell

今回の変更

github.com

「ユーザによる変数定義」機能を追加する。

これはかなり大きな変更で、変数の内容を保持・評価するために、変更可能な「状態」を導入する必要がある。

Haskellだと状態をStateやSTモナドで管理することも可能（らしい）のだが、基本的にそれらは「ローカルに状態が必要な関数」に役にたつようだ。今回のケースでは変数の状態はほぼグローバルに必要な上、REPL上でユーザからのIOの間で状態を保持する必要がある、ということでIOモナド内で状態を持つIORefを使う。

`IORef`モナド

IORefというのはData.IORefに定義されているモナド。今回は以下の関数で扱う：

newIORef :: a -> IO (IORef a)
readIORef :: IORef a -> IO a
writeIORef :: IORef a -> a -> IO ()

あまり「他言語の機能との比喩」は好きではないが、newIORefはコンストラクタ、readIORefはgetter、writeIORefはsetter、そしてIORef自体はポインタ的なもの、という理解だ。上記の関数は、型からもわかる通り、使うと必ずIOモナドの中に入る。

`Environment`モジュール

新しくEnvironmentというモジュールを作成した。定義された変数すべてを持ち歩く「環境」変数を定義するモジュールである。

type Env = IORef [(String, IORef LispVal)]

個別のユーザ定義変数は(変数名, IORef 値)というタプルで表現される。

それらを集めたリストをさらにIORefモナドで包んだのが環境変数を保持するための型であるEnv。

そして、「初期状態」を表すEnv型のnullEnvも定義：

nullEnv :: IO Env
nullEnv = newIORef []

REPLが始まったばかりでユーザがまだなにも定義していない状態を表している。

`Variables`モジュール

Variablesモジュールも新しく追加。変数定義・代入・値取得などの関数をここに書いていく。

以下の関数を定義：

isBound :: Env -> String -> IO Bool // 変数が定義されているか確認
getVar :: Env -> String -> IO LispVal // 変数の値を取得
setVar :: Env -> String -> LispVal -> IO LispVal // 値を既存の変数に代入
defineVar :: Env -> String -> LispVal -> IO LispVal // 変数を定義してから値を代入

まずは、ある変数がすでに定義されているかどうかを調べるヘルパー関数：

isBound :: Env -> String -> IO Bool
isBound envRef var = readIORef envRef >>=
                     return . lookup var >>=
                     return . maybe False (const True)

環境を表すenvRefの中身をlookupして成功か失敗かを返す。

maybe関数の第二引数はJust値に適用する関数なので、const Trueで引数がなんだろうと常に「真」を返す関数を作成して渡している。

次に値取得：

getVar :: Env -> String -> IO LispVal
getVar envRef var = readIORef envRef >>=
                    return . lookup var >>=
                    maybe (return $ Bool False) readIORef

値が存在しない場合はLispValのBool Falseを返し、存在する場合はreadIORefで変数に含まれる値を返す。環境も個々の変数もIORefを使っているのでreadIORefを二回使っている。

すでに存在する変数に対する代入：

setVar :: Env -> String -> LispVal -> IO LispVal
setVar envRef var val = do {
    env <- readIORef envRef;
    maybe (return ()) (flip writeIORef val) (lookup var env);
    return val;
}

代入するべき変数がlookupで見つかったらwriteIORefで代入。見つからなかったら何もしない。式全体の値は代入されるべき値。

変数が未定義の時に暗黙裡に失敗しているのがいや。エラー処理を実装する時に直す。

変数の定義：

defineVar :: Env -> String -> LispVal -> IO LispVal
defineVar envRef var val = do {
    alreadyDefined <- isBound envRef var;
    if alreadyDefined
        then setVar envRef var val
        else do {
            valRef <- newIORef val;
            env <- readIORef envRef;
            writeIORef envRef ((var, valRef):env);
            return val;
        }
}

すでに定義されている場合は値を代入するだけ。

変数が未定義の場合、その変数と値のタプルを追加した新しい環境変数を作成し、それをenvRefに代入している。

評価器

evalにVariablesで定義した関数を追加していく。

今まで評価は

eval :: LispVal -> LispVal

だったのが

eval :: Env -> LispVal -> IO LispVal

に変わった。この型の違いは

あるLispValを評価するには、評価対象だけではなく、世界の状態をもつEnvも必要
「評価」自体がIOを発生させ得る

という大きな変更を反映している。

それをふまえて既存のevalにも少し変更を加えないといけない。

一番簡単なのはこういうふうに：

eval env lispVal = return lispVal

引数を増やして、IOモナドに入れるためにreturnを追加するだけ。

関数適用やif構文ももちょこちょことモナド的な構文に変えている：

それでようやくVariablesで定義した関数を追加できる：

eval env (Atom var) = getVar env var

Atom一個をそのまま評価する場合、以前はAtomそのものを返していたのだが、変数として扱うようになった。

あとはset!とdefineを定義するだけ：

eval env (List [Atom "set!", Atom var, form]) = eval env form >>= setVar env var 
eval env (List [Atom "define", Atom var, form]) = eval env form >>= defineVar env var

REPL・Main

あまり大きな変更はないのだが、まずreadEvalPrintが環境envを引数として取るようにしたこと：

readEvalPrint :: Env -> String -> IO ()
readEvalPrint env str = (eval env $ readExpr str) >>= putStrLn . show

今までmain関数でREPLの構成関数を組み合わせていたのを、nullEnvを初期値としてreadEvalPrintに渡す必要があったので独立したREPL関数にまとめたこと：

readEvalPrintLoop :: IO ()
readEvalPrintLoop = nullEnv >>=
                    loopUntil (== "quit") (readPrompt ">> ") . readEvalPrint

これでmain関数がすっきりした：

main :: IO ()
main = readEvalPrintLoop

実行

>> (define x 5)
5 // xを定義
>> (+ x 1)
6 // xを変数として使った式の評価
> (set! x 6)
6 // xに代入
>> (+ x 1)
7 // xの値が変わっている
>> (set! y 2)
2 // 未定義の変数yに代入
>> (+ x y)
6 // yはまだ未定義なので値は0として扱われる
>> (define y 1)
1 // yの定義
>> (+ x y)
7 // yの値がつかわれている
>> (set! y 2)
2 // yに代入
>> (+ x y)
8 // 新しいyの値が使われている

次回

次はモナド・トランスフォーマを使ったエラー処理の枠組みを追加する。

2018-04-12

HaskellとParsecでLisp REPL　その７（if構文）

Haskell

今回の変更

github.com

条件分岐のための構文を追加。

(if a b c)という式は、まずaを評価し、もし真ならbを、偽ならcを評価してその結果が値となる。

bとcのどちらかひとつしか評価されないのが重要なので、関数ではなく構文として定義。

実装としてはevalにルールをひとつ追加しただけ。

評価器

evalへの変更：

eval (List [Atom "if", ifForm, thenForm, elseForm]) = 
  case eval ifForm of
    Bool True  -> eval thenForm
    Bool False -> eval elseForm

条件を評価してからパターンマッチで次に評価する式を決めている。

この評価ルールは関数適用のルールより先に書く必要がある。関数適用の評価ルールは「先頭が任意のAtom型の任意のリスト」に当てはまるので、もしifのルールがその後に来てしまうと絶対に使われなくなってしまう。

`LispVal`・パーサ・REPL・Main

全部変更なし。構文をevalをちょこっといじるだけで加えられるのはいい。

注意点

(if a b)や(if a b c d)のような、ifと一緒に含まれる要素の数が３以外の場合、上記の評価ルールに当てはまらない
- 次の評価ルールに当てはまってしまい、ifが関数として評価される

という問題がある。回避しようと思うなら、「要素数が３以外の時はエラー」と明示的に評価ルールを追加すればいい。

実行

>> (if #t 5 0)
5
>> (if #f 5 0)
0
>> (if (= 5 5) (+ 5 2) (* 5 2))
7 // ネスト挙動も正常
>> (if (= 5 2) (+ 5 2) (* 5 2))
10
>> (if (> 4 3) 1 2 3)
0 // 要素数が一致しない場合、`if`が未定義の関数として扱われて結果が0になる
>> (if 1 10 0)
repl: src/Evaluator.hs:(8,3)-(10,31): Non-exhaustive patterns in case

おっと、ふと試してみたらバグが見つかった。たしかにifFormがBool型じゃない場合を想定していなかった。

    case ifBool of
        Bool True  -> eval envRef thenForm
        _          -> eval envRef elseForm

に修正しておこう・・・

次回

次回は「ユーザによる変数定義」機能を実装する。

2018-04-11

HaskellとParsecでLisp REPL　その６（数値比較と論理演算）

Haskell

今回の変更

github.com

前回Bool値を定義できたので、=、/=、<、>などやand/orを実装する。

Functions

Functionsモジュールのprimitivesにどんどん追加していく：

primitives :: [(String, [LispVal] -> LispVal)]
primitives = [
  ...
  ("=", numBoolBinOp (==)),
  ("/=", numBoolBinOp (/=)),
  (">", numBoolBinOp (>)),
  (">=", numBoolBinOp (>=)),
  ("<", numBoolBinOp (<)),
  ("<=", numBoolBinOp (<=)),
  ("and", boolBoolBinOp (&&)),
  ("or", boolBoolBinOp (||))
  ]

やはりHaskellの関数をwrapする形で[LispVal] -> LispVal型の関数を作成していく。

まず、数値二つをとってブール値を返すnumBoolBinOp：

numBoolBinOp :: (Integer -> Integer -> Bool) -> [LispVal] -> LispVal
numBoolBinOp op [x, y] = Bool $ op (lvToInteger x) (lvToInteger y)
numBoolBinOp op _      = Bool False

パターンマッチを使って、引数の数が２ならopを数値化した引数に適用してBool化、２以外ならBool Falseを返す。以前numerifyという名前にしていた関数をlvToIntegerに変えてみた。

次に、ブール値を一つ以上とってブール値を返すboolBoolBinOp：

boolBoolBinOp op args = Bool $ foldl1 op $ map lvToBool args

おっとboolBoolBinOpの型定義し忘れてる。型推論があるので記事を書くまで気付かなかった・・・

正しくは：

boolBoolBinOp :: (Bool -> Bool -> Bool) -> [LispVal] -> LispVal
boolBoolBinOp op args = Bool $ foldl1 op $ map lvToBool args

これはnumBinOpと似たような定義。opの引数と戻り値の型がBoolで一致しているのでfoldl1が使える。やはりbinOpではない・・・

LispValをHaskellのIntegerやBoolに変換する関数：

lvToInteger :: LispVal -> Integer
lvToInteger (Number n) = n
lvToInteger _          = 0

lvToBool :: LispVal -> Bool
lvToBool (Bool b) = b
lvToBool _        = False

現在はまだLispValの型がNumber型やBool型でない場合は0やFalseにしている。

Functions以外のコード

変更なし

実行

>> (> 5 4)
#t
>> (= 6 (* 3 2))
#t // 四則演算をネストできる
>> (= 4 4 4)
#f // 引数が２以外の場合は必ず#f
>> (and (> 6 3) (< 6 5))
#f // 論理演算も可能
>> (or (> 6 3) (< 6 5) (<= 10 7))
#t // 論理演算は引数２以外でも可
>> quit

次回

if構文を追加する。

S式のパース

言語仕様とS式からASTへ

ASTからアセンブラ

組み合わせたらコンパイラ

foldrで関数を作成する

foldrで不動点コンビネータを作成

sumで試してみる。

dropWhileをfixで

foldr

dropWhile

foldrでfilter

foldrとreverseでdropWhile

foldrでdropWhile

パターンマッチによる失敗例

今回の変更

IORefモナド

Environmentモジュール

Variablesモジュール

評価器

REPL・Main

実行

次回

今回の変更

評価器

LispVal・パーサ・REPL・Main

注意点

実行

次回

今回の変更

Functions

Functions以外のコード

実行

次回

`foldr`で関数を作成する

`foldr`で不動点コンビネータを作成

`sum`で試してみる。

`dropWhile`を`fix`で

`foldr`

`dropWhile`

`foldr`で`filter`

`foldr`と`reverse`で`dropWhile`

`foldr`で`dropWhile`

`IORef`モナド

`Environment`モジュール

`Variables`モジュール

`LispVal`・パーサ・REPL・Main