Structurally here refers to a systematic way of manipulating s-expressions in which form has and equivalent Haskell ADT.
Before getting into all the functions in op
, such as Pass.moduleTransform
, we first discuss the underlying tool that allows us to handle these passes in a principled way.
We'll use the let/defun
form as an example of this transformation.
This code can be found in the Structure module.
;; From Design/S-expression Syntax
(:defun foo (x)
(:cond
((:infix == x 2) (:infix + 3 (:infix + 4 x)))
(else (:infix + x 2))))
defun
is broken into the name foo
, the arguments x
(note this can pattern match), and the body cond
.
In Structure/Frontend.hs we find a Haskell encoding of this form:
-- | @Defun@ is the base defun structure
-- currently it does not have matching
data Defun = Defun
{ defunName :: Sexp.T,
defunArgs :: Sexp.T,
defunBody :: Sexp.T
}
deriving (Show)
Note: Notice how we say nothing about the head being the cadr
of the structure, and the arguments the caddr
, and the body of course the caddr
. Instead, we just lay out the logical structures in a record, divorced from any representation.
We provide an API to deal with the actual representation. It would need to change with a lisp-form redesign.
----------------------------------------
-- Defun
----------------------------------------
nameDefun :: NameSymbol.T
nameDefun = ":defun"
isDefun :: Sexp.T -> Bool
isDefun (Sexp.Cons form _) = Sexp.isAtomNamed form nameDefun
isDefun _ = False
toDefun :: Sexp.T -> Maybe Defun
toDefun form
| isDefun form =
case form of
_Defun Sexp.:> sexp1 Sexp.:> sexp2 Sexp.:> sexp3 Sexp.:> Sexp.Nil ->
Defun sexp1 sexp2 sexp3 |> Just
_ ->
Nothing
| otherwise =
Nothing
fromDefun :: Defun -> Sexp.T
fromDefun (Defun sexp1 sexp2 sexp3) =
Sexp.list [Sexp.atom nameDefun, sexp1, sexp2, sexp3]
All records in that file have a corresponding interface
of name<structure>
, is<structure>
, to<strcuture>
, and
from<structure>
. These deal with the small details of cars
and
cdrs
.
This can be tested by opening the Easy file in the Juvix
repl and running
λ> Right defun = Sexp.parse "(:defun foo (x) (:cond ((:infix == x 2) (:infix + 3 (:infix + 4 x))) (else (:infix + x 2))))"
λ> import qualified Juvix.Sexp.Structure as Structure
λ> Structure.toDefun defun
Just (Defun {defunName = "foo", defunArgs = ("x"), defunBody = (":cond" ((":infix" "==" "x" 2) (":infix" "+" 3 (":infix" "+" 4 "x"))) ("else" (":infix" "+" "x" 2)))})
We can see this transformation agrees with our documentation.
What this buys us, is that when it comes time to do the passes, we
don't have to worry about these details, while trying to make sure we
don't change the semantics of the passes themselves. This properly
decouples our concerns so we can worry about the abstraction meaning
of the syntax in the passes while worrying about the details here.
However all is not fine, notice, we had to write nineteen lines of
boilerplate in order to get rid of the details of the syntax! This is
rather tedious, and if we count the numbers of these there are at
least 22 structures in that file, and over 500 lines of this rather
straightforward API. This is especially annoying given that we are
essentially just translating over BNF translations.
It is due to this tedium early on that the lisp haskell generator was
created. The code itself isn't the best (due to using string
interpolation) and can with a bit of engineering be improved (with a
proper S-expression DSL in the lisp code), however it works fairly
well for our purposes.
Here are some choice snippets that cover every case of the generator
;; 1
(generate-haskell "Defun" '("sexp" "sexp" "sexp") ":defun")
;; 2
(generate-haskell "ArgBody" '("sexp" "sexp") nil)
;; 3
(generate-haskell "DeconBody" (repeat 2 "sexp") nil)
(generate-haskell "Case" '("sexp" "deconBody") "case" :list-star t)
;; 4
(generate-haskell "NotPunned" '("sexp" "sexp") nil)
(generate-haskell "RecordNoPunned" '("notPunnedGroup") ":record-no-pun"
:list-star t
:un-grouped t)
;; 5
(generate-haskell "ArgBody" '("sexp" "sexp") nil)
;; bodys here, as there are multiple!
(generate-haskell "LetMatch" '("sexp" "argBodys" "sexp") ":let-match")
The first one correlates to the structure we've already seen. Namely
the Defun
struct. the first argument is the type name we wish to
generate. This is of course Defun
. the second argument is the
list we wish to parse, in this case it says all three arguments are
just Sexp.T
's. and binds them in order to name
, args
, and
body
in the struct itself. The third argument is the name of the
structure itself if it has one. Our lets are translated to the
:defun structure, and so we note that here.
The second case correlates to a very similar structure, except for
that it lacks a name. so the structure we care about is (arg
body)
with no name attached. The nil
in the third argument
reflects this change.
the third case correlates to a scenario with two changes. The first
being is that we can define types for another to rely on. Here we
are saying that case has the type DecondBody
, we use lower case
to reflect the proper variable name this associates with. Further
the :list-start t
aspect of this tells us that the last argument,
in this case of DeconBody
, is the rest of the s-expression and
not just the cadr
.
data Case = Case
{ caseOn :: Sexp.T,
caseImplications :: [DeconBody]
}
deriving (Show)
The fourth is the last new concept of the bunch, namely we have
:un-grouped t
, which states the form in which this parses are not
grouped like
((name₁ type₁) (name₂ type₂) … (nameₙ typeₙ))
, but rather
(name₁ type₁ name₂ type₂ … nameₙ typeₙ)
.
This means that we have to tell it explicitly that it occurs over 2
positions with that :un-grouped
argument.
The fifth case is an interesting one. The key insight is that we
say argBodys
rather than argBody
. This is because our generator
is limited. Thus we manually write
-- these are ungrouned fromArgBodys, where we groupBy2 both ways
fromArgBodys :: [ArgBody] -> Sexp.T
fromArgBodys = Sexp.unGroupBy2 . toStarList fromArgBody
-- these are ungrouned fromArgBodys, where we groupBy2 both ways
toArgBodys :: Sexp.T -> Maybe [ArgBody]
toArgBodys = fromStarList toArgBody . Sexp.groupBy2
To be able to handle the scenario where the list-star like form
happens in not the last argument, but rather in a list in some
other position!
Improvements Upon the Generator
The generator can be improved a lot, as the previous section mentions
it was hacked together using string interpolation which is not a good
structural way to handle this sort of problem. The alternative would
however take a week of hacking, so it is not a priority to undergo.
However there are two worthwhile hacks that we should undergo.
Change the Maybe
of the to<Name>
to an Either
.
This change at the current moment does not matter! Namely because
Nothing
should never be triggered. This is due to the parser in
frontend excludes this possibility. However when syntax redesign
comes in, this will no longer be the case. It would be great if the
generated code could instead generate an Either
where the left
counts the number of arguments given and states precisely why it
can't translate the given S-expression form into the type we want.
list-star like behavior anywhere. Currently we bake it into the
last form, but it would be nice if we could have a way to denote
this for the middle slots, so we can avoid hacks, like the manual
argBodys
example given in the last section.