Monday, December 28, 2009

There's this integration with an external site that I've had to implement for months, but it doesn't work. It turned out that for the first month or so, the contacts from the external site wouldn't help due to some legal issues. Supposedly, the legal issues have been resolved. I'm not sure that they have or that there aren't other issues, because the contacts from the external site seem to be stonewalling to me. Their response has been to say that since many other clients have been using their site without problems, the problem must be with our client (duh), and suggested trying more things to narrow down what the problem is (without any guidance on what to try). I provided them tcpdumps of what I've tried, and I've been using their staging servers (rather than their production servers), so they should know when and what I tried. One of the responses from their system includes an error id that looks like it has a date and IP address embedded in it, and I imagine that error id could be used to help track down what is going on. However, none of the responses from them indicate that they've even looked any logs on their own staging servers.

I can work on other things, so I'm fine with letting other people worry about why there is no progress on this.

Monday, December 21, 2009

The RESOL programming language

RESOL (REtro Statement Oriented Language) is a language inspired by fixed-format FORTRAN. (See also: resol)

RESOL uses fixed format lines. If the first column contains C, the line is a comment. Otherwise, columns 1-5 contain an optional label consisting of digits (0-9). Column 6, if not empty nor containing a space, indicates a continuation line. Columns 7-72 contain the statement. The first line cannot be a continuation line and a continuation line cannot directly follow a comment line.
Statements

Every labeled statement (except the first statement), has a call stack. Having multiple call stacks permits convoluted code where the returns (from different call sites) are not in the reverse order of the calls.

It is an error for more than one statement to have the same label.

Statements can take one or two arguments. An argument is a sequence of one or more digits (0-9). If there are two arguments, they are separated by a comma.

Every labeled DATA statement has a queue stack. If the first statement is a labeled DATA statement, it has a special queue where reading the queue reads the program's input, and writing to the queue writes the program's output. The statement's queue is the top of queue stack.

The DATA statement can take one or two arguments. If the statement is labeled, the first argument defines the size of the queue item in digits, and the second argument, if present, defines the initial contents of the statement's queue, which would otherwise be empty. When executing a DATA statement, if the first argument does not match any labeled DATA statement, it is a NO-OP, otherwise, if there is one argument, the top item in the matching DATA statement's queue is removed, and if there are two arguments, the value of the second argument, as defined later, will be added to the end of the matching DATA statement's queue.

The CALL statement can take one or two arguments. The first argument must match a label. The statement following the CALL statement is pushed onto the call stack of the matching statement. If there is no second argument, a new empty queue is pushed onto the matching statement's queue stack, otherwise a new queue containing the value of the second argument, as defined later, is pushed onto the matching statement's queue stack. Execution continues with the matching statement. It is an error if the matching statement is the first statement and the first statement is a labeled DATA statement.

The CONTINUE statement can take one or two arguments. All arguments must match a label. If the statement matching the first argument is not a DATA statement, there must not be a second argument, and execution continues with the statement popped from the matching statement's call stack. Otherwise, if the DATA statement's queue is not empty, execution continues with the statement matching the second argument, if present, otherwise, execution continues with the matching DATA statement. If the DATA statement's queue is empty, then the DATA statement's queue stack is popped, and execution continues with the statement popped from the DATA statement's call stack. Stack underflow is an error. As a special case, if the first argument matches the first statement and the first statement is a labeled DATA statement, then, if there is more input available, execution continues with the statement matching the second argument if available, otherwise execution continues with the first statement. If there is no more available input, execution continues with the statement following the CONTINUE statement.

The IF statement must take two arguments. If the values of both arguments, as defined later, are not equal, the next statement is skipped.

The STOP statement takes no arguments and ends the program.

It is an error if there are no more statements to execute. This error can be avoided with a STOP or CONTINUE statement as the last reachable statement.

Here is the definition of the value of the second argument of an executed DATA statement, the second argument of a CALL statement or either argument of an IF statement: if the argument matches a label of a DATA statement, then the value is top item in the matching statement's queue. Otherwise, the value is the literal value of the argument.
Input and output

If the first statement is a labeled DATA statement, then that statement's queue is used for input and output. The size of the queue item determines how the input and output are converted between bits and digits. If the size is 1, then each digit corresponds to 3 bits, and the output bits are equal to the digit's value modulo 8. If the size is 2, then each digit corresponds to 6 bits, and the output bits are equal to the item's value modulo 64. If the size is 3, then each item corresponds to 9 bits, and the output bits are equal to the item's value modulo 512. If the size is 4, then each item corresponds to 13 bits, and the output bits are equal to the item's value modulo 8192. Larger sizes work analogously.
Examples

HELLO WORLD

07734 DATA 4 0000001
DATA 07734,23125425157546133742526705450266 0000002
STOP 0000003

CAT

0 DATA 1 0000001
1 CONTINUE 0,2 0000002
STOP 0000003
2 DATA 0,0 0000004
DATA 0 0000005
CONTINUE 0,2 0000006
STOP 0000007

Interpreter

Here is a RESOL interpreter in Haskell. I wrote it in multiple modules, but crammed everything into a single module for this post.

module Main(main) where

import Data.Bits(bit,shiftR,testBit)
import Data.Char(chr,ord)
import Data.Map((!))
import qualified Data.Map as M
import System.Environment(getArgs)

data Statement = Statement {
stmtFileName :: String,
stmtLineNumber :: Int,
stmtLabel :: [Int],
stmtCommand :: Command
}

data Command =
Data1 [Int]
| Data2 [Int] [Int]
| Call1 [Int]
| Call2 [Int] [Int]
| Continue1 [Int]
| Continue2 [Int] [Int]
| If [Int] [Int]
| Stop

parse :: String -> String -> [Statement]
parse fileName str =
reverse $ finishPartial fileName
$ foldl (parseLine fileName) (Nothing,[]) (zip (lines str) [1..])

data PartialStatement = PartialStatement {
psLineNumber :: Int,
psLabel :: String,
psStatement :: String
}

startPartial = PartialStatement

continuePartial label stmt ps = ps {psLabel = psLabel ps ++ label,
psStatement = psStatement ps ++ stmt }

finishPartial fileName (ps,stmts) =
maybe stmts ((: stmts) . parseStatement fileName) ps

parseLine fileName state ('C':_,_) = (Nothing,finishPartial fileName state)
parseLine fileName state@(ps,stmts) (l1:l2:l3:l4:l5:c:stmt,lineNumber)
| c == ' ' = (Just (startPartial lineNumber label (take 66 stmt)),
finishPartial fileName state)
| otherwise = maybe (errMsg fileName lineNumber
"Invalid continuation line")
(flip (,) stmts . Just
. continuePartial label (take 66 stmt))
ps
where label = [l1,l2,l3,l4,l5]
parseLine fileName state (label,lineNumber) =
(Just (startPartial lineNumber label ""), finishPartial fileName state)

parseStatement fileName ps =
Statement fileName (psLineNumber ps)
(parseNumber fileName (psLineNumber ps) (psLabel ps))
(parseCommand fileName (psLineNumber ps) (psStatement ps))

parseNumber fileName lineNumber str
| all (flip elem (' ':['0'..'9'])) str = map toDigit (filter (/= ' ') str)
| otherwise = errMsg fileName lineNumber "Invalid number"

toDigit c = ord c - ord '0'

parseCommand fileName lineNumber str =
let (cmd,args) = span (flip elem ['A'..'Z']) (filter (/= ' ') str)
(arg1,args') = span (/= ',') args
arg2 = drop 1 args'
in makeCommand fileName lineNumber cmd
(parseNumber fileName lineNumber arg1)
(parseNumber fileName lineNumber arg2)

makeCommand _ _ "DATA" arg@(_:_) [] = Data1 arg
makeCommand _ _ "DATA" arg1@(_:_) arg2@(_:_) = Data2 arg1 arg2
makeCommand _ _ "CALL" arg@(_:_) [] = Call1 arg
makeCommand _ _ "CALL" arg1@(_:_) arg2@(_:_) = Call2 arg1 arg2
makeCommand _ _ "CONTINUE" arg@(_:_) [] = Continue1 arg
makeCommand _ _ "CONTINUE" arg1@(_:_) arg2@(_:_) = Continue2 arg1 arg2
makeCommand _ _ "IF" arg1@(_:_) arg2@(_:_) = If arg1 arg2
makeCommand _ _ "STOP" [] [] = Stop
makeCommand fileName lineNumber cmd _ _ =
errMsg fileName lineNumber ("Invalid command: " ++ cmd)

errMsg fileName lineNumber msg =
error (fileName ++ ':' : show lineNumber ++ ": " ++ msg)

encode :: Int -> [Int] -> String
encode n = bitsToStr . digitsToBits n

decode :: Int -> String -> [Int]
decode n = bitsToDigits n . strToBits

digitsToInteger :: [Int] -> Integer
digitsToInteger ints = digitsToInteger' ints 0 where
digitsToInteger' [] acc = acc
digitsToInteger' (d:ds) acc = digitsToInteger' ds (10*acc + fromIntegral d)

digitCountToBitCount :: Int -> Int
digitCountToBitCount n = highBit (digitsToInteger (take n (repeat 9))) where
highBit 0 = -1
highBit m = 1 + highBit (shiftR m 1)

digitsToBits :: Int -> [Int] -> [Bool]
digitsToBits _ [] = []
digitsToBits n digits =
let bitIndices = reverse [0..digitCountToBitCount n - 1]
(item,rest) = splitAt n digits
in map (testBit (digitsToInteger item)) bitIndices ++ digitsToBits n rest

bitsToDigits :: Int -> [Bool] -> [Int]
bitsToDigits _ [] = []
bitsToDigits n bits =
let nbits = digitCountToBitCount n
bitIndices = reverse [0..nbits - 1]
toDigits :: Int -> Integer -> [Integer] -> [Int]
toDigits 0 _ acc = map fromIntegral acc
toDigits n' m acc = toDigits (n' - 1) (div m 10) (mod m 10 : acc)
bitValue flag bitIndex = if flag then bit bitIndex else 0
(item,rest) = splitAt nbits bits
in toDigits n (sum (zipWith bitValue item bitIndices)) []
++ bitsToDigits n rest

strToBits :: String -> [Bool]
strToBits = concatMap (flip map [7,6..0] . testBit . ord)

bitsToStr :: [Bool] -> String
bitsToStr [] = []
bitsToStr bits =
let (octet,rest) = splitAt 8 bits
byte = sum (zipWith (\ f b -> if f then bit b else 0) octet [7,6..0])
in if length octet == 8
then chr byte : bitsToStr rest
else []

data Stmt = Stmt Statement (Maybe Stmt)

data State = State {
firstStmtLabel :: [Int],
firstStmtItemSize :: Int,
stmtByLabel :: [Int] -> Maybe Stmt,
queueItemSize :: [Int] -> Int,
queues :: M.Map [Int] [[Int]],
stacks :: M.Map [Int] [Stmt]
}

interpRaw :: [Statement] -> String -> [Int]
interpRaw = interp' (const id)

interp :: [Statement] -> String -> String
interp = interp' encode

interp' :: (Int -> [Int] -> a) -> [Statement] -> String -> a
interp' convert stmts input =
let (stmt,state) = initState stmts
itemSize = firstStmtItemSize state
in convert itemSize $ run stmt state $ decode itemSize input

initState :: [Statement] -> (Stmt,State)
initState statements =
let stmts = makeStmts statements
makeStmts [] = []
makeStmts (s:ss) =
let stmts' = makeStmts ss
in Stmt s (if length stmts' == 0
then Nothing
else Just (head stmts'))
: stmts'
byLabel = M.fromList $ filter ((/= []) . fst)
$ zip (map stmtLabel statements) stmts
toInt [] acc = acc
toInt (d:ds) acc = toInt ds (acc*10 + d)
itemSize (Just (Stmt (Statement {stmtCommand=Data1 arg}) _))
= toInt arg 0
itemSize (Just (Stmt (Statement {stmtCommand=Data2 arg _}) _)) =
toInt arg 0
itemSize _ = 0
initQueue (Stmt (Statement {stmtCommand=Data1 _}) _) = Just [[]]
initQueue (Stmt (Statement {stmtCommand=Data2 _ arg}) _) = Just [arg]
initQueue _ = Nothing
in (head stmts,
State {firstStmtLabel=stmtLabel (head statements),
firstStmtItemSize=itemSize (Just (head stmts)),
stmtByLabel=flip M.lookup byLabel,
queueItemSize=itemSize . flip M.lookup byLabel,
queues=M.mapMaybe initQueue byLabel,
stacks=fmap (const []) byLabel})

pushCall :: [Int] -> [Int] -> Stmt -> State -> State
pushCall label args returnStmt state =
let stacks' = M.adjust (returnStmt:) label (stacks state)
queues' = M.adjust (args:) label (queues state)
in state {stacks=stacks', queues=queues'}

retCall :: [Int] -> State -> (Stmt,State)
retCall label state =
let stmt = head ((stacks state) ! label)
stacks' = M.adjust tail label (stacks state)
queues' = M.adjust tail label (queues state)
in (stmt,state {stacks=stacks', queues=queues'})

modifyQueue :: ([Int] -> [Int]) -> [Int] -> State -> State
modifyQueue modify label state =
let modifyHead queues = (modify (head queues)) : tail queues
in state {queues = M.adjust modifyHead label (queues state)}

dequeue :: [Int] -> State -> State
dequeue label state =
modifyQueue (drop (queueItemSize state label)) label state

queueItem :: [Int] -> State -> [Int]
queueItem label state =
take (queueItemSize state label) (head (queues state ! label))

enqueue :: [Int] -> [Int] -> State -> State
enqueue label digits state = modifyQueue (++ digits) label state

isQueue :: [Int] -> State -> Bool
isQueue label state = M.member label (queues state)

isFirst :: [Int] -> State -> Bool
isFirst label state = label == firstStmtLabel state

value :: [Int] -> State -> [Int] -> [Int]
value arg state input =
if isQueue arg state
then if isFirst arg state
then take (firstStmtItemSize state) input
else queueItem arg state
else arg

run :: Stmt -> State -> [Int] -> [Int]

run (Stmt (Statement {stmtCommand=Data1 label}) (Just next)) state input =
if isFirst label state
then run next state (drop (firstStmtItemSize state) input)
else run next (dequeue label state) input

run (Stmt (Statement {stmtCommand=Data2 arg1 arg2}) (Just next)) state input =
if isFirst arg1 state
then (value arg2 state input) ++ run next state input
else run next (enqueue arg1 (value arg2 state input) state) input

run stmt@(Stmt (Statement {stmtCommand=Call1 label}) _) state input =
call [] label stmt state input

run stmt@(Stmt (Statement {stmtCommand=Call2 label args}) _) state input =
call (value args state input) label stmt state input

run stmt@(Stmt (Statement {stmtCommand=Continue1 label}) _) state input =
if isFirst label state
then continueFirst label label stmt state input
else continue label label stmt state input

run stmt@(Stmt (Statement {stmtCommand=Continue2 label label2}) _) state input=
if isFirst label state
then continueFirst label label2 stmt state input
else continue label label2 stmt state input

run (Stmt (Statement {stmtCommand=If arg1 arg2}) (Just next)) state input =
if value arg1 state input == value arg2 state input
then run next state input
else case next of
(Stmt _ (Just next')) -> run next' state input
_ -> errorMsg next "No following statement"

run (Stmt stmt@(Statement {stmtCommand=Stop}) _) state input = []

run stmt@(Stmt _ Nothing) _ _ = errorMsg stmt "No following statement"

call :: [Int] -> [Int] -> Stmt -> State -> [Int] -> [Int]
call args label stmt@(Stmt _ (Just next)) state input =
let doCall calledStmt =
run calledStmt (pushCall label args next state) input
in maybe (errorMsg stmt ("Undefined label: " ++ concatMap show label))
doCall
(stmtByLabel state label)

continueFirst :: [Int] -> [Int] -> Stmt -> State -> [Int] -> [Int]
continueFirst label label2 stmt@(Stmt _ next) state [] =
maybe (errorMsg stmt "No following statement")
(flip (flip run state) [])
next

continueFirst label label2 stmt state input =
maybe (errorMsg stmt ("Undefined label: " ++ concatMap show label2))
(flip (flip run state) input)
(stmtByLabel state label2)

continue :: [Int] -> [Int] -> Stmt -> State -> [Int] -> [Int]
continue label label2 stmt state input =
let ret = let (stmt',state') = retCall label state
in run stmt' state' input
cont ([]:_) = ret
cont _ = maybe (errorMsg stmt ("Undefined label: "
++ concatMap show label2))
(flip (flip run state) input)
(stmtByLabel state label2)
in maybe ret cont (M.lookup label (queues state))

errorMsg :: Stmt -> String -> a
errorMsg (Stmt statement _) msg =
error (stmtFileName statement ++ ':' : show (stmtLineNumber statement)
++ ": " ++ msg)

main :: IO ()
main = do
(srcs,run) <- fmap (parseArgs interp) getArgs
statements <- sequence (map readSrc srcs)
interact (run (concat statements))

readSrc :: String -> IO [Statement]
readSrc "" = fmap (uncurry parse . (,) "<stdin>") getContents
readSrc src = fmap (uncurry parse . (,) src) (readFile src)

parseArgs :: ([Statement] -> String -> String) -> [String]
-> ([String],[Statement] -> String -> String)
parseArgs run ("-r":args) =
parseArgs (\ stmts -> concatMap show . interpRaw stmts) args
parseArgs run [] = ([""],run)
parseArgs run srcs = (srcs,run)

Monday, December 14, 2009

The Parenthesis Hell programming language

The Parenthesis Hell programming language is an obfuscated, Lisp-like programming language. The only values are a cons pair or nil, which can be represented by using a subset of S-expressions. Since nil can be written as (), the dotted pair notation is superfluous, and everything can be written in list notation. Thus, in Parenthesis Hell, everything must be written in list notation, so everything Parenthesis Hell consists of matched pairs of open and close parentheses. All other characters are ignored.

I will use the dotted-pair notation for clarity. In particular, it highlights certain ways Parenthesis Hell differs from Lisp. However, it must be translated to list notation to be legal Parenthesis Hell.

A Parenthesis Hell program consists of a single expression. Its argument is the program's input. The program's output is the value of the expression.

An expression is either nil, written as (), or a cons pair.

nil, or (), evaluates to the current argument.

A cons pair, (head . tail), is evaluated by looking up head in the current scope, with tail as its argument.

The initial scope defines 7 functions.








()quote(() . expr)
(())letrec((()) (definition-list) . expr)
(()())cdr((()()) . expr)
(()()())if((()()()) cond expr1 . expr2)
((()))car(((())) . expr)
((())())cons((()()) expr1 . expr2)
(((())))eval((((()))) . expr)


quote, (() . expr), evaluates to the literal expr.

letrec, ((()) (definition-list) . expr), creates a new scope with given definitions, which may shadow definitions in the enclosing scopes, including those in the initial scope, and evalutes to the evaluation of expr in the new scope.

cdr, ((()()) . expr), evaluates to the tail of the evaluation of expr in the current scope. If expr evaluates to nil, the value is nil.

if, ((()()()) cond expr1 . expr2), evaluates to the evaluation of expr1 in the current scope if the evaluation of cond in the current scope is not nil, otherwise it evaluates to the evaluation of expr2 in the current scope. If (cond expr1 . expr2) is nil or if (expr1 . expr2) is nil, the value is nil.

car, (((())) . expr), evaluates to the head of the evaluation of expr in the current scope. If expr evaluates to nil, the value is nil.

cons, (((())()) expr1 . expr2), evaluates to the cons of the evaluation of expr1 and the evaluation of expr2 in the current scope.
If (expr1 . expr2) is nil, the value is nil.

eval, ((((()))) . expr), evaluates to the evaluation of the evaluation of expr in the current scope. Obligatory in languages where code is data.

Creating new scopes

In a new letrec scope, ((()) (definition-list) . expr), definition-list is a list of (name . body), where name is defined
within the scope, which includes the bodies in the definition-list. When name is invoked as (name . expr), the evaluation of expr in the current scope becomes the argument, referred to by (), in scope of name's body. The value of (name . expr) is value of body evaluated in the scope of name's body. nils in the definition-list are ignored.

Extensions


The initial scope can optionally define additional functions.

concat

concat, ((()(())) expr1 . expr2), takes the evaluation of expr1 and the evaluation expr2 in the current scope as output, and returns a value that would output the bits of expr1, followed by the bits of expr2. An approximate implementation in Parenthesis Hell, which does not evaluate expr1 and expr2 separately, is

((())(((())(()()())(((())))((()()())(((()))((())))(((())())(((
))((())())(((()))((())))(()()))())(()()())((()())((())))(((())
())(())(())((())())((()())((())))(()()))(()()))(()())))(()))

(ph-concat . (((())()) expr1 . expr2)), or (ph-concat . (cons expr1 . expr2)), is equivalent to (concat expr1 . expr2). The Parenthesis Hell implementation can use huge amounts of memory, having implementing concat in the interpreter enables some code that would otherwise cause out of memory errors.

Example code


cat:

()


Hello world:

(()()(()()(()()()()((()()(()(()((()((()()()((()((()()()((()((((()()(()(
)()()()()(((()(((()((()((((()(((()()(()()((()((()()()((()()(()()()()(()
()()()(()()()()(()(())))))))))))))))))))))))))))))))))))))))))))))))))


quine:

((())(((())()((())())(()())((())())(((())())(((())())(()())((())())(())(()))()
)(())))((())())(()())((())())(((())())(((())())(()())((())())(())(()))())(()))

An interpreter, written in Haskell


I wrote it in multiple modules, but crammed everything into a single module for this post.

module Main(main) where

import Data.Bits(bit,testBit)
import Data.Char(chr,ord)
import Prelude hiding (lookup,head,tail)
import System.Environment(getArgs)

data Value = Nil | Cons Value Value

instance Show Value where
show a = '(' : shows a ")" where
shows Nil s = s
shows (Cons a b) s = '(' : shows a (')' : shows b s)

isNil :: Value -> Bool
isNil Nil = True
isNil _ = False

car :: Value -> Value
car (Cons a _) = a
car Nil = Nil

cdr :: Value -> Value
cdr (Cons _ b) = b
cdr Nil = Nil

instance Read Value where
readsPrec p s@('(':_) = [readNext s]
readsPrec p (_:s) = readsPrec p s
readsPrec p [] = []

readNext :: String -> (Value,String)
readNext ('(':s) = readNext' s
where
readNext' ('(':s) =
let (car,s') = readNext' s
(cdr,s'') = readNext' s'
in (Cons car cdr,s'')
readNext' (')':s) = (Nil,s)
readNext' (_:s) = readNext' s
readNext' [] = error "unmatched ("

readNext (')':_) = error "unmatched )"
readNext (_:s) = readNext s
readNext [] = error "unmatched ("

strToValue :: String -> Value
strToValue [] = Cons Nil Nil
strToValue (c:cs) = bitsToValue [7,6..0] c
where
bitsToValue [] _ = strToValue cs
bitsToValue (b:bs) byte =
(if testBit (ord c) b then Cons else flip Cons)
(bitsToValue bs byte) Nil

valueToStr :: Value -> String
valueToStr value = valueToBits [7,6..0] 0 value
where
valueToBits [] byte rest = chr byte : valueToStr rest
valueToBits _ _ Nil = []
valueToBits (_:bs) byte (Cons Nil rest) = valueToBits bs byte rest
valueToBits (b:bs) byte (Cons rest _) = valueToBits bs (byte + bit b) rest

data ValueTrie a = ValueTrie (ValueTrie a) (ValueTrie a) (Maybe a) | Empty

head Empty = Empty
head (ValueTrie hd _ _) = hd

tail Empty = Empty
tail (ValueTrie _ tl _) = tl

value Empty = Nothing
value (ValueTrie _ _ val) = val

empty :: ValueTrie a
empty = Empty

insert :: ValueTrie a -> Value -> a -> ValueTrie a
insert trie key val =
let insertValue trie = ValueTrie (head trie) (tail trie) (Just val)
in insert' trie key insertValue

insert' :: ValueTrie a -> Value -> (ValueTrie a -> ValueTrie a) -> ValueTrie a
insert' trie Nil insertValue = insertValue trie
insert' trie (Cons hd tl) insertValue =
let insertTail trie' =
ValueTrie (head trie')
(insert' (tail trie') tl insertValue)
(value trie')
in ValueTrie (insert' (head trie) hd insertTail) (tail trie) (value trie)

lookup :: ValueTrie a -> Value -> Maybe a
lookup = lookup' value

lookup' :: (ValueTrie a -> b) -> ValueTrie a -> Value -> b
lookup' value Empty _ = value Empty
lookup' value trie Nil = value trie
lookup' value trie (Cons hd tl) =
case lookup' id (head trie) hd of
Empty -> value Empty
trie' -> lookup' value (tail trie') tl

data Scope a b = Root | Nested (Scope a b) (a -> Maybe b)

root :: (a -> Maybe b) -> Scope a b
root = Nested Root

nested :: Scope a b -> (a -> Maybe b) -> Scope a b
nested = Nested

get :: Show a => Scope a b -> a -> b
get scope@(Nested outer getValue) name =
case getValue name of
Just value -> value
Nothing -> get outer name
get Root name = error (show name)

data Op = Op (Value -> Scope Value Op -> Value -> Value)
op (Op op) = op

rootScope :: Scope Value Op
rootScope = root getBinding where
getBinding Nil = Just (Op quoteOp)
getBinding (Cons Nil Nil) = Just (Op letOp)
getBinding (Cons Nil (Cons Nil Nil)) = Just (Op cdrOp)
getBinding (Cons Nil (Cons Nil (Cons Nil Nil))) = Just (Op ifOp)
getBinding (Cons Nil (Cons (Cons Nil Nil) Nil)) = Just (Op concatOp)
getBinding (Cons (Cons Nil Nil) Nil) = Just (Op carOp)
getBinding (Cons (Cons Nil Nil) (Cons Nil Nil)) = Just (Op consOp)
getBinding (Cons (Cons (Cons Nil Nil) Nil) Nil) = Just (Op evalOp)
getBinding _ = Nothing

quoteOp input scope arg = arg

letOp input scope Nil = Nil
letOp input scope (Cons bindings letBody) =
let bindingList Nil = empty
bindingList (Cons Nil rest) = bindingList rest
bindingList (Cons (Cons name body) rest) =
let definedOp input scope arg =
eval (eval input scope arg) letScope body
in insert (bindingList rest) name (Op definedOp)
letScope = nested scope (lookup (bindingList bindings))
in eval input letScope letBody

cdrOp input scope arg = cdr (eval input scope arg)

ifOp input scope Nil = Nil
ifOp input scope (Cons _ Nil) = Nil
ifOp input scope (Cons cond body)
| isNil (eval input scope cond) = eval input scope (cdr body)
| otherwise = eval input scope (car body)

carOp input scope arg = car (eval input scope arg)

consOp input scope Nil = Nil
consOp input scope (Cons head tail) =
Cons (eval input scope head) (eval input scope tail)

evalOp input scope arg = eval input scope (eval input scope arg)

concatOp input scope Nil = Nil
concatOp input scope (Cons head tail) =
let concat Nil rest = rest
concat (Cons Nil Nil) rest = rest
concat (Cons Nil tl) rest = Cons Nil (concat tl rest)
concat (Cons hd tl) rest = Cons (concat hd rest) tl
in concat (eval input scope head) (eval input scope tail)

eval input scope Nil = input
eval input scope expr = (op (get scope (car expr))) input scope (cdr expr)

main :: IO ()
main = do
args <- getArgs
(toStr,code,arg) <- processArgs valueToStr args
putStr $ toStr $ eval arg rootScope code

processArgs :: (Value -> String) -> [String] -> IO (Value -> String,Value,Value)
processArgs toStr [] = do
src <- getContents
return (toStr, read src, Nil)
processArgs _ ("-v":args) = processArgs show args
processArgs toStr (file:_) = do
src <- readFile file
input <- getContents
return (toStr, read src, strToValue input)

Monday, December 7, 2009

It had been over 10 years since I had last written anything in Haskell, so I decided to try using it to see if I still could. I returned to the simple 01_ programming language that I invented earlier this year, and wrote an interpreter, and it took less than a day. I didn't worry about making it fast, but the interpreter compiled with ghc was about twice as fast as the one I wrote in C

I had written an interpreter in Java, and the source was about 3 times longer than the Haskell version. Since Haskell has laziness built-in, I didn't have to code in the laziness. The source for the interpreter in C that I wrote was about twice the size of the Java version and about 6 times the Haskell version. A lot of the C code was junk for debugging the memory management, since I couldn't rely on a garbage collector.
Haskell216 lines
Java782 lines
C1527 lines


I first wrote the Haskell 01_ interpreter in multiple modules, and it would be more maintainable that way. However, I'll present here all smashed into a single module, so that it can be stuck into a single file and run.

module Main(main) where

import Data.Bits(bit,testBit)
import Data.Char(chr,ord)
import Data.List(splitAt,elemIndex,(!!))
import Data.Map(Map,empty,findWithDefault,insert,member,(!))
import qualified Data.Map(map)
import System.Environment(getArgs)

data Token = Symbol String | Zero | One | Equal | Dot | Nil

tokenize :: String -> [Token]
tokenize ('=':'=':cs) = tokenize (dropComment cs)
tokenize ('0':cs) = Zero : tokenize cs
tokenize ('1':cs) = One : tokenize cs
tokenize ('=':cs) = Equal : tokenize cs
tokenize ('.':cs) = Dot : tokenize cs
tokenize ('_':cs) = Nil : tokenize cs
tokenize (' ':cs) = tokenize cs
tokenize ('\t':cs) = tokenize cs
tokenize ('\r':cs) = tokenize cs
tokenize ('\n':cs) = tokenize cs
tokenize [] = []
tokenize cs = tokenizeSymbol cs ""

tokenizeSymbol s@(c:cs) symbol =
case c of
'=' -> Symbol (reverse symbol) : tokenize s
'0' -> Symbol (reverse symbol) : tokenize s
'1' -> Symbol (reverse symbol) : tokenize s
'.' -> Symbol (reverse symbol) : tokenize s
'_' -> Symbol (reverse symbol) : tokenize s
' ' -> Symbol (reverse symbol) : tokenize cs
'\t' -> Symbol (reverse symbol) : tokenize cs
'\r' -> Symbol (reverse symbol) : tokenize cs
'\n' -> Symbol (reverse symbol) : tokenize cs
_ -> tokenizeSymbol cs (c:symbol)
tokenizeSymbol [] symbol = [Symbol (reverse symbol)]

dropComment ('\n':cs) = cs
dropComment (_:cs) = dropComment cs
dropComment [] = []

type Bits = [Bool]
data Pattern = Literal Bits | Binding Bits String | Wildcard Bits
data Expr = LiteralBits Bits | Concat Expr Expr | Bound Int | Call String [Expr]
data Definition = Def [Pattern] Expr

data Signature = Sig String [Pattern] [Token]

parseSigs :: [Token] -> [Signature]
parseSigs (Symbol s:tokens) = parseSig s [] tokens []
parseSigs [] = []
parseSigs _ = error "name expected to start definition"

parseSig name patterns (Equal:tokens) [] =
collectBody name (reverse patterns) tokens []
parseSig name patterns (Equal:tokens) bits =
collectBody name (reverse (Literal (reverse bits) : patterns)) tokens []
parseSig name patterns (Zero:tokens) bits =
parseSig name patterns tokens (False:bits)
parseSig name patterns (One:tokens) bits =
parseSig name patterns tokens (True:bits)
parseSig name patterns (Nil:tokens) bits =
parseSig name (Literal (reverse bits) : patterns) tokens []
parseSig name patterns (Symbol binding:tokens) bits =
parseSig name (Binding (reverse bits) binding : patterns) tokens []
parseSig name patterns (Dot:tokens) bits =
parseSig name (Wildcard (reverse bits) : patterns) tokens []

collectBody name patterns (Dot:tokens) body =
Sig name patterns (reverse body) : parseSigs tokens
collectBody name patterns (token:tokens) body =
collectBody name patterns tokens (token:body)
collectBody name patterns [] body = error ("unexpected end of body of "++name)

groupDefs [] defs = Data.Map.map reverse defs
groupDefs (sig@(Sig name patterns _):sigs) defs =
case findWithDefault [] name defs of
[] -> groupDefs sigs (insert name [sig] defs)
siglist@(Sig _ otherPatterns _:_) ->
if length patterns /= length otherPatterns then
error ("arity mismatch:"++name)
else
groupDefs sigs (insert name (sig:siglist) defs)

parseDefs defs = Data.Map.map (map (parseDef arities)) defs
where
arities = Data.Map.map (\ (Sig _ patterns _:_) -> length patterns) defs

parseDef arities (Sig _ patterns body) =
case body of
[] -> Def patterns (LiteralBits [])
_ -> Def patterns
(foldr1 (\a b -> Concat a b) (parseExprs arities bindings body))
where
bindings = map (\ (Binding _ name) -> name)
(filter (\x -> case x of Binding _ _ -> True; _ -> False) patterns)

parseExprs arities bindings [] = []
parseExprs arities bindings tokens@(Zero:_) =
parseLiteral arities bindings tokens []
parseExprs arities bindings tokens@(One:_) =
parseLiteral arities bindings tokens []
parseExprs arities bindings tokens@(Nil:_) =
parseLiteral arities bindings tokens []
parseExprs arities bindings (Symbol name:tokens) =
case elemIndex name bindings of
Just i -> Bound i : parseExprs arities bindings tokens
Nothing ->
let (args,exprs) = splitAt (arities ! name) (parseExprs arities bindings tokens)
in Call name args : exprs

parseLiteral arities bindings (Zero:tokens) literal =
parseLiteral arities bindings tokens (False:literal)
parseLiteral arities bindings (One:tokens) literal =
parseLiteral arities bindings tokens (True:literal)
parseLiteral arities bindings [] literal = [LiteralBits (reverse literal)]
parseLiteral arities bindings [Nil] literal = [LiteralBits (reverse literal)]
parseLiteral arities bindings (Nil:tokens) literal =
LiteralBits (reverse literal) : parseExprs arities bindings tokens
parseLiteral arities bindings tokens literal =
LiteralBits (reverse literal) : parseExprs arities bindings tokens

parseBodies :: [Signature] -> Map String [Definition]
parseBodies sigs = parseDefs (groupDefs sigs empty)

arity :: Map String [Definition] -> String -> Int
arity defs fn = case defs ! fn of (Def patterns _):_ -> length patterns

apply :: Map String [Definition] -> String -> [Bits] -> Bits
apply defs fn args =
let evalMatching Nothing (Def patterns expr) =
case bind patterns args [] of
Just bindings -> Just (eval defs bindings expr)
_ -> Nothing
evalMatching result _ = result
in case foldl evalMatching Nothing (defs ! fn) of
Just bits -> bits
_ -> error ("No matching pattern for "++fn)

bind :: [Pattern] -> [Bits] -> [Bits] -> Maybe [Bits]
bind (pattern:patterns) (arg:args) acc =
case pattern of
Literal bits ->
case matchArg bits arg of
Just [] -> bind patterns args acc
_ -> Nothing
Binding bits _ ->
case matchArg bits arg of
Just binding -> bind patterns args (binding:acc)
_ -> Nothing
Wildcard bits ->
case matchArg bits arg of
Just _ -> bind patterns args acc
_ -> Nothing
bind [] [] acc = Just (reverse acc)

matchArg (bit:bits) (argBit:argBits)
| bit == argBit = matchArg bits argBits
| otherwise = Nothing
matchArg [] arg = Just arg
matchArg _ [] = Nothing

eval :: Map String [Definition] -> [Bits] -> Expr -> Bits
eval _ _ (LiteralBits bits) = bits
eval defs bindings (Concat head tail) =
eval defs bindings head ++ eval defs bindings tail
eval _ bindings (Bound i) = bindings !! i
eval defs bindings (Call fn args) =
apply defs fn (map (eval defs bindings) args)

main :: IO ()
main = do
stdinText <- getContents
interpArgs <- getArgs
let (files,fn,argFiles) = parseArgs interpArgs []
stdinBits = textToBits stdinText
sigs <- readSources files []
args <- readArgs stdinBits argFiles []
let defs = parseBodies sigs
let nils = []:nils
let result = apply defs fn (take (arity defs fn) (args ++ stdinBits:nils))
putStr (bitsToStr result)

parseArgs ("-":fn:args) files = (reverse files,fn,args)
parseArgs ["-"] files = (reverse files,defaultFn files,[])
parseArgs (file:args) files = parseArgs args (file:files)
parseArgs [] files = (reverse files,defaultFn files,[])

defaultFn (fn:_) =
(fst . (break (=='.')) . reverse . fst . (break (=='/')) . reverse) fn

readSources [] sigs = return sigs
readSources (file:files) sigs = do
text <- readFile file
let s = sigs ++ ((parseSigs . tokenize) text)
readSources files s

readArgs _ [] args = return (reverse args)
readArgs stdinBits ("-":files) args =
readArgs stdinBits files (stdinBits:args)
readArgs stdinBits (file:files) args = do
text <- readFile file
readArgs stdinBits files (textToBits text:args)

textToBits text = foldl (++) [] (map charToBits text) where
charToBits char = map (testBit (ord char)) [7,6..0]

bitsToStr [] = []
bitsToStr bits =
let (byte,rest) = splitAt 8 bits
bitToInt flag index = if flag then bit index else 0
bitsToChar bs = chr (foldl (+) 0 (zipWith bitToInt bs [7,6..0]))
in
bitsToChar byte : bitsToStr rest