LLVM
Note that this description is merely a design, and does not reflect the current state of the LLVM backend yet. The design and implementation might change in the future.
Introduction to LLVM
LLVM is an intermediate language together with a collection of compiler tools. The intermediate language can be compiled to several targets, including AMD64 and WebAssembly, which gives us a lot of flexibility.
Structure of the backend
Primitives
The LLVM backend uses an intermediate representation for primitives that is
specific to LLVM: RawPrimVal for values and functions, and PrimTy for
types. Each of these has a rather direct relation to constructs available in
LLVM.
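As a rough illustration of this split, the following Haskell sketch models primitive types and raw primitive values as plain data types. Only the type names RawPrimVal and PrimTy come from the design above; every constructor and the arity helper are assumptions made for this example and may not match the backend's actual definitions.

```haskell
-- Simplified stand-ins for the backend's primitive representation.
-- All constructor names below are hypothetical.

-- | Primitive types: a fixed-width integer or a function type.
data PrimTy
  = PrimInt Int             -- integer type of the given bit width, e.g. i32
  | PrimFun [PrimTy] PrimTy -- argument types and return type
  deriving (Eq, Show)

-- | Raw primitive values: literals and built-in operators.
data RawPrimVal
  = LitInt Integer -- integer literal
  | Add            -- maps to LLVM's add instruction
  | Sub            -- maps to LLVM's sub instruction
  | Mul            -- maps to LLVM's mul instruction
  deriving (Eq, Show)

-- | Number of arguments a primitive expects (hypothetical helper).
arity :: RawPrimVal -> Int
arity (LitInt _) = 0
arity _          = 2
```

Keeping the constructors close to LLVM instructions and types is what makes the later translation step mostly mechanical.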
Application
Primitives may be terms that can be reduced further, e.g. the addition of
two constants can be replaced by a single constant. To support this, the
LLVM backend implements application of terms by providing instances of the
CanApply class for both types and values.
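The idea of reducing an applied primitive can be sketched as follows. The class below is a deliberately simplified, hypothetical version of CanApply (the real class may have a different signature), and the constructors of RawPrimVal and PrimTy are assumptions made for this example.

```haskell
-- Hypothetical, simplified primitives.
data RawPrimVal = LitInt Integer | Add | Sub | Mul
  deriving (Eq, Show)

data PrimTy = PrimInt Int | PrimFun [PrimTy] PrimTy
  deriving (Eq, Show)

-- | Simplified stand-in for the CanApply class: try to reduce a
-- primitive applied to arguments, or return Nothing if no reduction applies.
class CanApply a where
  apply :: a -> [a] -> Maybe a

-- Values: an operator applied to two constants folds into one constant.
instance CanApply RawPrimVal where
  apply Add [LitInt x, LitInt y] = Just (LitInt (x + y))
  apply Sub [LitInt x, LitInt y] = Just (LitInt (x - y))
  apply Mul [LitInt x, LitInt y] = Just (LitInt (x * y))
  apply _   _                    = Nothing

-- Types: a function type applied to matching argument types
-- reduces to its return type.
instance CanApply PrimTy where
  apply (PrimFun args ret) given
    | args == given = Just ret
  apply _ _ = Nothing
```

In this sketch, applying Add to the literals 1 and 2 reduces to the literal 3, while a partial application such as Add with a single argument is left unreduced.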
Parameterisation
The parameterisation of the LLVM backend is straightforward and follows the
fields of the Parameterisation datatype:
hasType is a function that checks the types of raw primitive values. Currently the typechecking is very basic: it only checks that the number of given types matches the arity of the primitive value. For operator primitives, it also checks that the types of the arguments are equal. This implementation will definitely have to be refined in the future.
builtinTypes and builtinValues just provide a mapping from source code literal names to the actual literals.
integerToRawPrimVal creates a literal based on an integer. Currently it doesn't take the size of the literal into account, as integer literals always share the same type. This should be improved in the future.
There are currently no implementations for stringVal and floatVal, as the backend does not support these types yet.
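A minimal sketch of the checks described above is given below. It assumes that the types are passed as a list holding the argument types followed by the result type, and it uses hypothetical constructors and builtin names; the actual Parameterisation fields may have different signatures.

```haskell
-- Hypothetical, simplified primitives.
data PrimTy = PrimInt Int
  deriving (Eq, Show)

data RawPrimVal = LitInt Integer | Add | Sub | Mul
  deriving (Eq, Show)

-- | Number of arguments a primitive expects.
arity :: RawPrimVal -> Int
arity (LitInt _) = 0
arity _          = 2

-- | Basic check: the number of types must match the arity (plus one
-- for the result type), and all argument types must be equal.
hasType :: RawPrimVal -> [PrimTy] -> Bool
hasType val tys = arityMatches && allEqual argTys
  where
    n            = arity val
    arityMatches = length tys == n + 1
    argTys       = take n tys

-- | True if all elements of the list are equal (also for the empty list).
allEqual :: Eq a => [a] -> Bool
allEqual []       = True
allEqual (x : xs) = all (== x) xs

-- | Mapping from source-level names to primitives (names are made up).
builtinValues :: [(String, RawPrimVal)]
builtinValues = [("add", Add), ("sub", Sub), ("mul", Mul)]

-- | Every integer becomes a literal; the size is ignored, as in the text.
integerToRawPrimVal :: Integer -> Maybe RawPrimVal
integerToRawPrimVal = Just . LitInt
```

Note that this sketch, like the description, never compares the result type with the argument types, which is one of the places where a refined check would be stricter.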
Substitution and weakening
For simplicity, the current HasWeak, HasSubstValue and HasSubstTerm
instances do not do anything at all. This will likely change in the future.
Connection to the pipeline
The LLVM instance of HasBackend relies on the default implementation of the
parse function. Additionally:
The typecheck function relies on the general typechecker implementation: it calls the typecheck' function with the parameterisation and the LLVM representation of a set type.
The compile function takes a filename and an annotated term (AnnTermT) as the program. The annotated term with typed primitives is translated to an annotated term with raw primitives (RawPrimVal), which is then translated to LLVM code.
Compilation
With the input program represented by an annotated term, we can now
translate it to LLVM code. We rely mostly on the llvm-hs-pure package,
as it provides a monadic interface, LLVM.IRBuilder, that keeps track of
variables for us. The implementation is therefore rather straightforward:
The whole program is translated to an LLVM module by the compileProgram function. This module contains a hardcoded main function, with the type taken from the outermost annotated term.
The term itself is compiled by the compileTerm function, using recursion for any subterms.
For any primitive function, we check whether enough arguments are available, and if so the primitive function is written, with the unnamed references to the arguments all taken care of by the IRBuilder monad.
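The recursive structure of these steps can be sketched without llvm-hs-pure by emitting textual LLVM instructions and threading a counter for unnamed references by hand, which is the bookkeeping that the IRBuilder monad would otherwise do. The Term and Prim types, the fixed i32 type, and the function bodies are assumptions made for this example; only the names compileProgram and compileTerm come from the description above.

```haskell
-- Hypothetical, simplified term language.
data Prim = Add | Sub | Mul
  deriving (Eq, Show)

data Term
  = Lit Integer     -- integer literal
  | App Prim [Term] -- primitive applied to its arguments
  deriving (Eq, Show)

-- | An operand is a constant or an unnamed reference such as %1.
type Operand = String

-- | LLVM instruction name for each primitive.
mnemonic :: Prim -> String
mnemonic Add = "add"
mnemonic Sub = "sub"
mnemonic Mul = "mul"

-- | Compile a term, given the next free reference number. Returns the
-- operand holding the result, the emitted instructions, and the next
-- free number. Nothing means a primitive lacks (or has too many) arguments.
compileTerm :: Int -> Term -> Maybe (Operand, [String], Int)
compileTerm n (Lit i) = Just (show i, [], n)
compileTerm n (App p [a, b]) = do
  (opA, insA, n1) <- compileTerm n a
  (opB, insB, n2) <- compileTerm n1 b
  let ref = "%" ++ show n2
      ins = ref ++ " = " ++ mnemonic p ++ " i32 " ++ opA ++ ", " ++ opB
  Just (ref, insA ++ insB ++ [ins], n2 + 1)
compileTerm _ (App _ _) = Nothing

-- | Wrap the compiled term in a hardcoded main function. Numbering
-- starts at 1 because the unnamed entry block implicitly takes %0.
-- The return type is fixed to i32 here, unlike in the design above.
compileProgram :: Term -> Maybe [String]
compileProgram t = do
  (op, ins, _) <- compileTerm 1 t
  Just (["define i32 @main() {"]
        ++ map ("  " ++) (ins ++ ["ret i32 " ++ op])
        ++ ["}"])
```

For the term App Add [Lit 1, App Mul [Lit 2, Lit 3]], this sketch first emits the mul instruction into %1 and then the add instruction into %2, which main returns.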
Each of these functions is called in its own stage of the pipeline: the
output of parse is the input of typecheck, and the output of typecheck
is the input of compile.
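The staging can be pictured as a chain of functions in which each stage may fail. The sketch below is a toy model under stated assumptions: the Parsed and Typed types, the error type, and the bodies of the three functions are invented for illustration and do not reflect the actual HasBackend class.

```haskell
import Text.Read (readMaybe)

type Error = String

-- Hypothetical intermediate results of each stage.
newtype Parsed = Parsed Integer
newtype Typed  = Typed Integer

-- | Toy parser: the whole program is a single integer literal.
parse :: String -> Either Error Parsed
parse src =
  maybe (Left ("cannot parse: " ++ src)) (Right . Parsed) (readMaybe src)

-- | Toy typechecker: the literal must fit in a signed 32-bit integer.
typecheck :: Parsed -> Either Error Typed
typecheck (Parsed n)
  | n >= negate (2 ^ 31) && n < 2 ^ 31 = Right (Typed n)
  | otherwise                          = Left "literal does not fit in i32"

-- | Toy compiler: emit a single return instruction.
compile :: Typed -> Either Error String
compile (Typed n) = Right ("ret i32 " ++ show n)

-- | The output of each stage is the input of the next one;
-- the first failure stops the pipeline.
pipeline :: String -> Either Error String
pipeline src = parse src >>= typecheck >>= compile
```

Giving each stage its own input and output type means the stages cannot be chained in the wrong order without a type error.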
References
LLVM website: https://llvm.org