LLVM
Note that this description is merely a design, and does not reflect the current state of the LLVM backend yet. The design and implementation might change in the future.
Introduction to LLVM
LLVM is an intermediate language and a collection of compiler tools. The intermediate language can be compiled to several targets, including AMD64 and WebAssembly, which gives us a lot of flexibility.
Structure of the backend
Primitives
The LLVM backend uses an intermediate representation for primitives that is specific to LLVM: RawPrimVal for values and functions, and PrimTy for types. Each of these has a rather direct relation to constructs available in LLVM.
Application
Primitives may be terms that can be reduced further, e.g. the addition of two constants can be replaced by a single constant. To support this, the LLVM backend implements application of terms by providing instances of the CanApply class for both types and values.
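As a minimal sketch of the idea, the snippet below folds the addition of two integer constants into a single constant. The class CanApplySketch, the Result type and the constructors Add and LitInt are hypothetical, simplified stand-ins chosen for illustration; they are not the backend's actual CanApply class or RawPrimVal definition.

```haskell
-- Hypothetical stand-in for the backend's raw primitive values.
data RawPrimVal
  = Add            -- the addition primitive
  | LitInt Integer -- an integer literal
  deriving (Eq, Show)

-- Outcome of applying a primitive to some arguments.
data Result a
  = Reduced a     -- fully applied and reduced to a single value
  | Partial a [a] -- still waiting for more arguments
  deriving (Eq, Show)

-- Simplified, hypothetical version of an "application" class.
class CanApplySketch a where
  -- Number of arguments the primitive expects.
  arity :: a -> Int
  -- Apply a primitive to arguments, reducing where possible.
  apply :: a -> [a] -> Either String (Result a)

instance CanApplySketch RawPrimVal where
  arity Add = 2
  arity (LitInt _) = 0

  -- Addition of two constants is replaced by a single constant.
  apply Add [LitInt x, LitInt y] = Right (Reduced (LitInt (x + y)))
  -- A literal applied to nothing is already a value.
  apply v@(LitInt _) [] = Right (Reduced v)
  apply f args
    | length args < arity f = Right (Partial f args)
    | length args > arity f = Left ("too many arguments for " ++ show f)
    | otherwise = Left ("cannot reduce " ++ show f ++ " applied to " ++ show args)
```

In this sketch, `apply Add [LitInt 2, LitInt 3]` reduces to `LitInt 5`, while `apply Add [LitInt 2]` stays a partial application until the second argument arrives.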
Parameterisation
The parameterisation of the LLVM backend is straightforward and follows the fields of the Parameterisation datatype:
- hasType is a function that checks the types of raw primitive values. Currently the typechecking is very basic: it only checks that the given types match the primitive value in arity. For operator primitives, it also checks that the types of the arguments are equal. This implementation will have to be refined in the future.
- builtinTypes and builtinValues provide a mapping from source code literal names to the actual literals.
- integerToRawPrimVal creates a literal based on an integer. Currently it doesn't take the size of the literal into account, as integer literals always share the same type. This should be improved in the future.
- There are currently no implementations for stringVal and floatVal, as the backend does not support these types yet.
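The arity-based check described for hasType could look roughly like the sketch below. It assumes, purely for illustration, that a type is passed as a non-empty list of argument types followed by the result type; the constructors I8, I16, Add, Sub and LitInt and the helper arityRaw are hypothetical and not taken from the backend.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- Hypothetical primitive types and values.
data PrimTy = I8 | I16
  deriving (Eq, Show)

data RawPrimVal = Add | Sub | LitInt Integer
  deriving (Eq, Show)

-- Number of arguments each primitive expects (hypothetical helper).
arityRaw :: RawPrimVal -> Int
arityRaw Add = 2
arityRaw Sub = 2
arityRaw (LitInt _) = 0

-- True when all elements of the list are equal (vacuously true if empty).
allEqual :: Eq a => [a] -> Bool
allEqual [] = True
allEqual (x : xs) = all (== x) xs

-- The type is given as the argument types followed by the result type.
-- The check only compares the arity and, for operators, whether the
-- argument types agree with each other.
hasType :: RawPrimVal -> NonEmpty PrimTy -> Bool
hasType prim tys = arityMatches && argumentsAgree
  where
    arityMatches = NE.length tys == arityRaw prim + 1
    argumentsAgree = case prim of
      LitInt _ -> True
      _ -> allEqual (NE.init tys)
```

For example, `hasType Add (I8 :| [I8, I8])` holds in this sketch, whereas `hasType Add (I8 :| [I16, I8])` does not because the two argument types differ. Note that the result type is not compared with the argument types, which is one of the places where such a basic check would need refining.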
Substitution and weakening
For simplicity, the current HasWeak, HasSubstValue and HasSubstTerm instances do not do anything at all. This will likely change in the future.
Connection to the pipeline
The LLVM instance of HasBackend relies on the default implementation of the parse function. Additionally:
- The typecheck function relies on the general typechecker implementation: it calls the typecheck' function with the parameterisation and the LLVM representation of a set type.
- The compile function takes a filename and an annotated term (AnnTermT) as the program. The annotated term with typed primitives is first translated to an annotated term with raw primitives (RawPrimVal), which is then translated to LLVM code.
Compilation
With the input program represented as an annotated Term, we can translate it to LLVM code. We rely mostly on the llvm-hs-pure package, as it provides a monadic interface, LLVM.IRBuilder, that keeps track of variables for us. The implementation is therefore rather straightforward:
- The whole program is translated to an LLVM module by the compileProgram function. This module contains a hardcoded main function, with the type taken from the outermost annotated term.
- The term itself is compiled by the compileTerm function, which recurses into any subterms.
- For any primitive function, we check whether enough arguments are available; if so, the primitive function is written, with the unnamed references to the arguments taken care of by the IRBuilder monad.
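The steps above can be illustrated with a toy version that does not use llvm-hs-pure at all: it emits LLVM-like text and threads a register counter by hand, which is roughly the bookkeeping that the IRBuilder monad would otherwise do. The Term and Prim types, the Builder record and the fixed i32 type are assumptions made for this sketch; the real backend works on annotated terms and takes the type from the outermost term.

```haskell
import Data.List (intercalate)

-- Hypothetical stand-ins for primitives and terms.
data Prim = Add | Sub
  deriving (Eq, Show)

data Term
  = Lit Integer
  | App Prim [Term]
  deriving (Eq, Show)

arity :: Prim -> Int
arity _ = 2

opcode :: Prim -> String
opcode Add = "add"
opcode Sub = "sub"

-- Builder state: the next unnamed register and the instructions
-- emitted so far (most recent first).
data Builder = Builder {nextReg :: Int, instrs :: [String]}

-- Compile a term, returning the operand that holds its value.
compileTerm :: Term -> Builder -> Either String (String, Builder)
compileTerm (Lit n) b = Right (show n, b)
compileTerm (App p args) b
  | length args /= arity p =
      Left (show p ++ " expects " ++ show (arity p) ++ " arguments")
  | otherwise = do
      -- Recursively compile the subterms first.
      (ops, b') <- compileArgs args b
      let reg = "%" ++ show (nextReg b')
          line = "  " ++ reg ++ " = " ++ opcode p ++ " i32 " ++ intercalate ", " ops
      Right (reg, Builder (nextReg b' + 1) (line : instrs b'))

-- Compile arguments left to right, threading the builder state.
compileArgs :: [Term] -> Builder -> Either String ([String], Builder)
compileArgs [] b = Right ([], b)
compileArgs (t : ts) b = do
  (op, b1) <- compileTerm t b
  (ops, b2) <- compileArgs ts b1
  Right (op : ops, b2)

-- Wrap the compiled term in a hardcoded main function.
compileProgram :: Term -> Either String String
compileProgram t = do
  (result, b) <- compileTerm t (Builder 0 [])
  Right . unlines $
    ["define i32 @main() {", "entry:"]
      ++ reverse (instrs b)
      ++ ["  ret i32 " ++ result, "}"]
```

Calling `compileProgram (App Add [Lit 1, App Sub [Lit 5, Lit 2]])` first emits the instruction for the inner subtraction and then the addition that refers to its register. Unlike the description above, this sketch rejects any arity mismatch rather than only under-application.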
The parse, typecheck and compile functions are each called in their own stage of the pipeline: the output of parse is the input of typecheck, and the output of typecheck is the input of compile.
References
LLVM website: https://llvm.org