This the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Design

A document on various language design decisions and philosophies for Juvix. Also hosts a variety of sub documents on various language aspects

1: Extensible
2: Image System
3: Subtyipng
4: Trees-That-Grow
5: S-expression Syntax

5.1: Core S-Expression Syntax

6: Syntax

Design Philosophy

The philosophy of Juvix is one of generality and flexibility. It should not be the case that the Juvix compiler developers should have many extra tools over the users of the language. The compiler/language developers/designers should be able to chose what is canonical, however the forms of the language should be generic in a way that ordinary users could have made if they so wish.

For example, the form of def in the frontend language, is not a fixed piece of syntax, but rather a simple macro over a primitive that is most suitable for our core type theory. This means that if one is an application developer for some domain, like web-browsers, they can make their own version of def (def-command) that fits into their domain better (registering hooks and functions into the web-browser itself!).

In order to achieve this goal of flexibility, Juvix has made many design decisions that allow traditionally built in tools to be first class in Juvix.

Some of the major design decisions
1. Small core syntax
  - This means our syntax isn't special
2. Image based system
  - This means Juvix can talk about the environment in which it lives in.
Some Applications that are traditionally built in or hard to achieve without these systems
1. Build tools can be made in the language!
2. Live documentation tools can be made while developing!
3. Proof Assistant with more power to effect the environment for a better interactive experience.
4. Better proof generation and tools to introspect various aspects of their Code.

Another design goal of Juvix is to allow for safer programs that are mission critical. When one is writing smart contracts that have millions of dollars at stake, a mistake is unacceptable. Thus Juvix has dependent types. Dependent types are wonderful at removing entire classes of errors with proofs about behavior. Further they can make many idioms in a typed language simpler.

However dependent types on their own can be confusing, as the specific incantations one needs to write to be able to effectively prove code is often not intuitive. Juvix thus seeks to make tooling in this area better, offering abstractions that help the language work to help you achieve your goals.

Design Influences

Growing a language
- A wonderful talk by guy Steele on why a language should allow user's features to feel built in.

Decisions

1 - Extensible

Talking about the Extensible system.

Like Trees that Grow. When you give an extensible declaration. What it does is add extra field to every constructor. It does this via type fields for each constructor. For example on Either, we'd get Left a Annotation, Right b Annotation, and an extension field. The type families for each constructor also has the extension type. If the extension is a type annotation, you can add the primty and primval to have the primitive types.

These live in the Extend modules. We also generate a constructor


  extTerm =
    \_primTy _primVal ->
      IR.defaultExtTerm
        { IR.nameLam = "Lam0",
          IR.typeLam = Just [[t|NameSymbol.T|]],
          IR.namePi = "Pi0",
          IR.typePi = Just [[t|NameSymbol.T|]],
          IR.nameSig = "Sig0",
          IR.typeSig = Just [[t|NameSymbol.T|]],
          IR.nameLet = "Let0",
          IR.typeLet = Just [[t|NameSymbol.T|]]
        }

  extElim :: p1 -> p2 -> IR.ExtElim
  extElim =
    \_primTy _primVal ->
      IR.defaultExtElim
        { IR.typeBound = Nothing,
          IR.typeFree = Nothing,
          IR.typeElimX = [("Var", [[t|NameSymbol.T|]])]
        }

  IR.extendTerm "Term" [] [t|T|] extTerm

  -- TODO allow extendTerm to reorder fields?
  pattern Lam x t = Lam0 t x

The typeElimX can add multiple constructors.
The name<Constructor> is the new constructor name.
The type<Constructor> talks about an extra argument.
The extendTerm is how we instantiate the alias and extensions system.
The pattern are for synonyms to make the argument order better
- This is needed for more than 1 extra constructor, as it'll generate lefts and rights.

2 - Image System

In which we talk about the Image System

What Is the Juvix Image System

The definition for Image can be found in the terminology section. As for the image system itself, it is the way we are able to talk about Juvix in Juvix. It allows us primitives in Juvix to be able to reason over what is in scope, or to change and add to the system while it's live. It is a very powerful technique that allows Juvix to do a lot with little effort.

Image System Goals

Research Areas

Images are as old as computation, with the Lisp 1.5 compiler being an image which wrote to and read from a magnetic strip (source required). However there is very little literature on the subject. This was a list of resources we were able to collect about the subject. Further, since the technique has fallen out of fashion in the past few decades, there are many areas we can explore on this front

First Class Environments

First class environments are a way of making the environment code lives in first class. This means we could talk about discrete environments and more easily distinguish between startup, compilation time, evaluation time, etc environments. There are many bigger implications as well, like being able to talk about environments over the network, having sandboxed environments, etc.

While the benefits are quite promising, the technique seems to suffer from a lack of research. There is the SICL paper that talks about adding it to Common Lisp, and the SBCL compiler which at it's bootstrapping phase works on creating it's own environment. There is also MIT scheme which has first class environments

With Juvix, we can explore these ideas a lot further, as we have many systems which facilitate this kind of reasoning. Three particular constraints come together for this.

We have an algebraic effects system
- This is very significant for making progress, as we can see precisely how our effects touch the image. If we can determine that, we can likely create a system that can precisely denote the scope of the image we need to reason over.
- With that said, we need to be careful with our thoughts here, as free variables are an effect that are discharged by the default handler. Meaning that if the system fully throws this information away as it can lookup in the image what answers are bound to and throw away what it relies upon in the image to function, then this will be harder to manage.
We have multiple backends we must support in a single program.
- This means that we already have to tackle the issue of some parts of our language being out of scope.
- We are likely to include parts of Racket's #lang research, which should help talking about environments on a module level.
We have a first class module system.
- We might be able to formalize this idea as our module system, or view it as an extension of our module system.
- After all the image is one big module!

If we are able to synthesize 1., 2., and 3. properly along with our type system, we should have a very ergonomic way of creating very eloquent systems that can talk about library versioning, backend support, and evaluation time processing.

Possible Benefits

Talk formally about the various environments
- Boot Strapping environment
- Evaluation Environment
- Compilation Environment
Better Integration with Block chains
- We could ingest Anoma data into the image, and have multiple different conflicting nets around.
- We can fully explore them as if they were on our live image, and switch between the environments and see what effect our code would have on different sets of conflicting data!
Give us a formal way to talk about the environment where our type system gets injected
- The base image will have to not have an image, as that is rather hairy to do.
- What this gives us is an environment in which we can talk about everything having a nice type system.
  - We can even push this further for other kinds of type systems ala what Agda does, but as libraries, not baked in(?)
Leads into a Possible packge/versioning system.
- This could give us an LTS model like Haskell has, and maybe even better as we can have multiple versions on our image that doesn't conflict with each other?

3 - Subtyipng

This article covers the discussion on subtyping and considerations we’ve had

This document is bound to change once we have a proper writeup of row polymorphism

Subtyping

An argument about the extent of subtyping

Subtyping is an important part of any module or object system. The rules these systems usually care about revolve around the subtype of Records and Records in relation to functions.

About Subtyping Rules on Records

An example of a record subtyping is as follows, we will use a :> b to mean that a is a subtype of b

∀x, a … v : Type. {x : Type, a : Type … v : Type} :> {a … v : Type}
∀x, a … v : Type. ∀ f : {a : Type … v : Type} → Type
f :> f : {x : Type, a : Type … v : Type} → Type

What 1 tells us is that we can forget a field, and thus a record with more fields is a sub type of the same record with less fields

What 2 tells us is that a function which accepts a record with fields a through v can also accept a record with those fields and extra fields. The intuition is that in functions the sub typing relation is reversed, but we still need to forget information to get our relationship.

Implications of Rules and Arguments about Behavior

These implications can be quite worrying, consider the longer-form example below. We will introduce all the forms first then make an argument about what various object systems have done and what we are proposing to do.

  f0 : λX:>{x : int} -> x1 -> x2 -> x1
  f0 rec1 rec2 = rec1 { x = rec1.x + rec2.x}


  f1 : λX:>{x : int} -> X -> X -> X
  f1 rec1 rec2 = rec1 {x = rec1.x + rec2.x}


  f2 : λX:>{x : int} -> X -> X -> X
  f2 rec1 rec2 = {x = rec1.x + rec2.x}


  -- note that the t is opaque
  sig set : int -> {x : t,  type t, add : t -> t -> t}
  mod set x =
    type t = int
    add = (+)
    x
  end

  -- We include the set
  sig bag : int -> {x : t,  type t, add : t -> t -> t}
  mod bag x = include (set x)

  foo a b c =
    let ab = a in
    ...

   foo1 a b c =
     let ab = new bag in
     ...


  f3 : λX:>{x : int} -> X -> -> X
  f3 rec1 = rec1 {x = rec1.x + rec1.x}

NOTE: f₀ uses a short hand notation. x1 and x2 are sub types of {x : int}.

The main functions we wish to discuss are f₀, f₁, and f₂. In an object oriented system all three of these are permissible. This is due to in a sub typing relation these are all fine, we just end up with some subtype of {x : int}.

However this is a major issue, see Subtyping, Subclassing, and Trouble with OOP, written by Oleg for how insidious this can make errors.

Thus on <2020-11-19 Thu> we have decided that f₂ signature is invalid as we don't just have a subtype of {x : int}, but in fact our return type is {x : int}. Therefore, we don't have the issue of the function foo in the article above, which we will now quickly detail.

In our code above, foo and foo1 both return a derivative of the ab value. In an object oriented language the following type would be applicable to both

  λX:>bag X -> X -> X

The issue is that if we send in a set instead, then foo gives us a set, while foo1 would give us a bag. Thus in our type system foo1's signature would be

  λX:>bag X -> X -> bag

However what was more contentious was what to do with f₁ vs f₀. Should we allow

  f1 {x = 5; y = 3} {x = 1}

if f₁ were f₀ then obviously yes, as we know we get two seperate subtypes and return the first.

An argument for allowing this stems on the answer of This is how subtyping relations work. While a fair argument, we have concluded that either way is a valid way of working with sub typing, as we can merely check the two arguments are the same, then begin doing subtyping.

Another argument is around the flexibility of functors if we disallow the above example. However this argument was also flawed in that we still have a subtyping relation so f₃ in our code block can be called like

  f3 {x = 3; y = 4}

Thus the argument regarding functors with a single argument can be disregarded. However, for multiple arguments the legitimacy of the rest of the claim can be discredited by having a mechanism which allows the user to specify that the two arguments are really subtypes of each other. This explicit system still allows the mixup of bag and set in Oleg's article, however it must be done so explicitly and full intention, with the most serious error still being impossible!

Finally the last claim for f₁'s legitimacy rests on flexibility of a library user and the incorrect choices of the author. f₁ should really allow different subtypes, it just needs to specify it so! For this we can gain back our expressiveness by adding an coerce function which forgets fields to make the two records the same, and thus legitimate in f₁

  f1 {x = 5; y = 3} (coerce {x = 1})

  -- or
  f1 (coerce {x = 5; y = 3}) {x = 1}

Further Arguments/Unresolved Issues

As of <2020-11-19 Thu> we still don't have a full answer for the following questions

How do we handle usages
- We have already have some arguments about ω and 0 usage items and how subtyping works there, however we still need to flesh out the mechanism for other usages and how subtyping will work in relation to usages inside records.
How do we handle refinements
- What do we do with this information, forget it for the application and allow implicit subtyping of refinements?
  - Currently we plan on translating refinements to Σ's and thus this will not be the default behavior.

4 - Trees-That-Grow

This covers why we use the extensible abstraction over trees that grow

Trees that Grow

In the Juvix codebase we have a variety of similar but different abstract syntaxes for Core—for example, adding type annotations or names (plain IR is locally nameless). Rather than duplicating code that needs to work on multiple kinds of syntax, or keep a single abstract syntax and write partial functions, we instead use the idiom described in the Trees that Grow paper by Najd and Peyton Jones.

The basic idea is to have a type family for each branch of the constructor and place anything not present in plain IR in there. In addition, there is also an extra "extension" type family for each datatype in the syntax to add extra constructors (for example, names). The first group all default to (), providing new information, and the second group to Void, ensuring they don't appear unexpectedly.

Functions which don't care about the annotations can simply ignore them and thus be applied to any kind of syntax (as long as the extensions are still Void). Alternatively, if they need to work on the annotations as well, they can be encoded using type classes (see Juvix.Core.IR.Evaluator.subst for an example of this).

5 - S-expression Syntax

The concrete syntax of the Juvix syntax after turning into a LISP

This document is bound to change once a syntax redesign will happen. This will be greatly minimized

For this page we give examples of the syntax rather than a hard bnf unlike what is found in the syntax page

Generation

Top Level

Let

basic

input

  let foo x
    | x == 2 = 3 + 4 + x
    | else   = x + 2

output

  (:defun foo (x)
    (:cond
      ((:infix == x 2) (:infix + 3 (:infix + 4 x)))
      (else            (:infix + x 2))))

Sig

basic

input
```
  sig foo : int -> int -> int
```

output

  (:defsig foo (:infix -> int (:infix -> int int)))

usage

input
```
  sig foo 0 : int -> int -> int
```

output

  (:defsig foo (:usage 0 (:infix -> int (:infix -> int int))))

This form is a bit odd, ideally it'd mimic the Types version like
```
  (:defsig (foo :usage 0) (:infix -> int (:infix -> int int)))
```
- However for now I think it's fine… Though this is a place for improvement for the future

Types

record

input

  type foo x y z =
    { y-axis : y
    , x-axis : x
    , z-axis 1 : z }

output

  (type foo (x y z)
     (:record-d
       (y-axis (:primitve Builtin.Omega) y)
       (x-axis (:primitve Builtin.Omega) x)
       (z-axis 1 z)))

first sum

input

  type foo x y z =
    | Foo {y-axis : y, x-axis : x, z-axis : z}
    | Bar (var1 : x) (var2 : y)
    | Car : hd : x -> cdr : foo y -> foo x y z

output

  (type foo (x y z)
    (Foo (:record-d (y-axis (:primitve Builtin.Omega) y)
                    (x-axis (:primitve Builtin.Omega) x)
                    (z-axis (:primitve Builtin.Omega) z)))
    (Bar (:paren (:infix : var1 x))
         (:paren (:infix : var2 y)))
    (Car (:arrow
          (:infix : hd
                  (:infix -> x
                          (:infix : cdr
                                  (:infix -> (foo y) (foo x y z))))))))

Signature

input

  type foo x y z : typ -> typ -> typ = Foo a b c

output

  (type (foo :type (:infix -> typ (:infix -> typ typ))) (x y z)
     (Foo a b c))

Decalration

basic

input
```
  declare infixl foo 3
```
output
```
  (declare infixl foo 3)
```

Module

basic

input

  mod foo =
    let bar = 3
    type zar = Boo
  end

output

  (:defmodule foo ()
    (:defun bar () 3)
    (type zar () (Boo)))

guards

input

  mod foo x
    | x == 3 =
      let foo = 3
      let bar = 5
    | else =
      let foo = 5
  end

output

  (:defmodule foo (x)
    (:cond ((:infix == x 3)
            (:defun foo () 3)
            (:defun bar () 5))
           (else
            (:defun foo () 5))))

Expression

Let

basic

input
```
  let foo y =
    let fi = 3 in
    fi
```

output

  (:defun foo (y)
    (let fi () 3
      fi))

arguments

input

  let foo y =
    let fi x = 3 + x in
    fi y

output

  (:defun foo (y)
    (let fi (x) (:infix + 3 x)
      (fi y)))

Module

basic

input

  let foo =
    mod Bar =
      let bar = 3
      type zar = Boo
    end in
    Bar.bar

output

  (:defun foo ()
    (:let-mod Bar ()
       ((:defun bar () 3)
        (type zar () (Boo)))
      Bar.bar))

Tuples

basic

input
```
  let foo = (1,2,3)
```
output
```
  (:defun foo () (:tuple 1 2 3))
```

Lists

basic

input
```
  let foo = [1,2,3,4]
```
output
```
  (:defun foo () (:list 1 2 3 4))
```

Records

basic

input
```
  let foo = {a, b = 2}
```
output
```
  (:defun foo () (:record (a) (b 2)))
```

Do

basic

input

  let foo xs =
    a <- xs;
    more-comp;
    pure a

output

  (:defun foo (xs)
    (:do (%<- a xs)
         more-comp
         (pure a)))

Lambda

basic

input
```
  let foo y =
    \x -> x + y
```

output

  (:defun foo (y)
    (:lambda (x) (:infix + x y)))

Open

basic

input

  let foo y =
    open Prelude in
    x + y

output

  (:defun foo (y)
    (:open-in Prelude
       (:infix + x y)))

Parens

basic

input
```
  let foo y = (y + 3) * 9
```

output

  (:defun foo (y)
    (:infix * (:paren (:infix + y 3)) 9))

Block

basic

input
```
  let foo y = 3 * begin y + y end
```

output

  (:defun foo (y)
    (:infix * 3 (:progn (:infix + y y))))

Primitive

basic

input
```
  let add = %Michelson.add
```

output

  (:defun add ()
    (:primitive Michelson.add))

Declaration

basic

input
```
  let foo = declare infixl (+) 3 in 3
```

output

  (:defun foo ()
    (:declaim (infixl + 3) 3))

Cond

basic

input

  let add x =
    if | x == 3 = 1
       | x == 4 = 5
       | else   = 0

output

  (:defun add (x)
    (:cond ((:infix == x 3) 1)
           ((:infix == x 4) 5)
           (else            0)))

Case

basic

input

  let foo =
     case x of
     | (A (B {a, b})) -> a + b
     | A (B c) (C d)  -> c + d

output

  (:defun foo ()
     (case x
        ((A (B (:record (a) (b)))) (:infix + a b))
        ((A (B c) (C d))           (:infix + c d))))

Notes

Design Principles

Overall the goal of the S-expression incantation is twofold ¹.

An unambiguous syntax that is faithful to the original ML syntax
- This is a necessity to have limited strife between the ML syntax. and the work we need to do on the generic IR-syntax layer.
- Unambiguous in this case implies bijection, as we have to convert between the two forms.
A fully realized syntax that is readable and looks as if it were a standalone language.
- This is needed so inspection of the output is obvious. Further this will empower ease of reasoning over the syntax without too many surprises.

On Verbosity

Overall a few forms in this representation are quite verbose and thus harder to read than they should be, see


  (:arrow
   (:infix : hd
           (:infix -> x
                   (:infix : cdr
                           (:infix -> (foo y) (foo x y z))))))

  ;; should be
  [ hd : x -> cdr : (foo y) -> (foo x y z) ]

  (:record
   y-axis y
   x-axis x
   z-axis z)

  ;; should be
  {y-axis y
   x-axis x
   z-axis z}

This is due to lacking reader macros at this level. Overall this can be largely improved in the future for better inspection (we thus will offer another level of inspection that can do away with the infix view!)

On Forms

The syntax is heavily inspired from Common Lisp (CL). There are of course a plethora of changes, however some key concepts have survived through this transition.

Namely for adding extra property based information in declarations we got for an alist concept

  (type (foo :type (:infix -> typ (:infix -> typ typ))) (x y z)
     (Foo a b c))

which is taken directly from CL's defstruct construct.

Even naming conventions like defun are lifted directly from Common Lisp.

With that said, many forms (namely the lets) lack an extra set of parenthesis when in comparison to CL and the Schemer language.

  (let fi (x) (:infix + 3 x)
      (fi y))

  (let fi () 3
      fi)

This is due to us fully abusing the fact that we must represent non let blocks currently along with having a single expression each name is bound to.

Inconsistencies

There are likely inconsistencies between some of the forms. These should be removed as soon as possible. I have tried my best on this front, but likely some forms are inconsistent.

Known

Signatures

Currently we have a minor point of alists on the name itself to describe extra properties

  (type (foo :type (:infix -> typ (:infix -> typ typ))) (x y z)
     (Foo a b c))

This is great and we'll be adding it to Modules, however we currently do an odd transformation in regards to signatures

  (:defsig foo (:usage 0 (:infix -> int (:infix -> int int))))
  ;; should be
  (:defsig (foo :usage 0) (:infix -> int (:infix -> int int)))

This likely should be changed

Footnotes

The Whimsical spell like nature of sexpressions are heavily derived from SICP and the general imagery of other literature such as the dragon book.

5.1 - Core S-Expression Syntax

The concrete syntax of Juvix once we Hit Core, namely the Named version (HR)

Generation

Term

Star

Haskell ADT

  = -- | (sort i) i th ordering of (closed) universe.
    Star Universe

ML Syntax

  -- Universe 0
  ×₀
  -- Universe 1
  ×₁
  -- Universe n
  ×ₙ

S-expression Syntax

  ;; universe 0
  (:star 0)
  ;; universe 1
  (:star 1)
  ;; universe n
  (:star n)

Prim type

Haskell ADT

  | -- | PrimTy primitive type
    PrimTy primTy

ML Syntax
```
  primTy Michelson.Int
```
S-expression Syntax
```
  (:prim-ty Michelson.add)
```

Prim

Haskell ADT

  | -- | primitive constant
    Prim primVal

ML Syntax
```
  prim Michelson.Add
```
S-expression Syntax
```
  (:prim Michelson.add)
```

Π / Pi

Haskell ADT

  | -- | formation rule of the dependent function type PI.
    -- the Usage(π) tracks how many times x is used.
    Pi Usage NameSymbol.T (Term primTy primVal) (Term primTy primVal)

ML Syntax

  (Π (0 | typ : ×₀) (Π (1 | _ : typ) typ))

S-expression Syntax

  (:pi (typ 0 (:star 0)) (:pi (_ 1 typ) typ))

Lam

Haskell ADT

  | -- | LAM Introduction rule of PI.
    -- The abstracted variables usage is tracked with the Usage(π).
    Lam NameSymbol.T (Term primTy primVal)

ML Syntax

  -- Note we don't have (λ x : unit -> x) : unit -> unit, as that requires
  -- the type to be given in the Haskell ADT, and is thus handled in Π
  λ x ⟶ ∅

S-expression Syntax
```
  (:named-lambda x :unit)
```

Sig

Haskell ADT

  | -- | Dependent pair (Σ) type, with each half having its own usage
    Sig Usage NameSymbol.T (Term primTy primVal) (Term primTy primVal)

ML Syntax

  Π (ω | x : A) ⟶
    Σ (ω | y : B) ⟶
     C x y

S-expression Syntax

  (:pi (x ω A)
     (:sigma (y ω b)
        (C x y)))

Pair

Haskell ADT

  | -- | Pair value
    Pair (Term primTy primVal) (Term primTy primVal)

ML Syntax
```
  ‹f x y, g x y›
```
S-expression Syntax
```
  (:pair (f x y) (g x y))
```

Let

Haskell ADT

  | -- | Let binder.
    -- the local definition is bound to de Bruijn index 0.
    Let Usage NameSymbol.T (Elim primTy primVal) (Term primTy primVal)

ML Syntax
```
  let 2 | a = f x y in
    <a,a>
```

S-expression Syntax

  (:named-let (a 2 (f x y))
    (:pair a a))

Unit type

Note
Should this be treated specially?
Haskell ADT
```
  | -- | Unit type.
    UnitTy
```
ML Syntax
```
  Unit
```
S-expression Syntax
```
  :unit-type
```

Unit

Note
Should this be treated specially?
Haskell ADT
```
  | -- | Unit Value
    Unit
```
ML Syntax
```
  ∅
```
S-expression Syntax
```
  :unit
```

Elim

Should we even have Elims with this strategy, we know by exclusion, an explicit tome for it seems needless? Depends how the code turns out.

Record

Haskell ADT

  | -- | Constructor for a record type.
    Record [(Symbol, Term primTy primVal)]

ML Syntax

  {name₁ = term₁, name₂ = term₂, …, nameₙ = termₙ}

S-expression Syntax

  (:record name₁ term₁ name₂ term₂ … nameₙ termₙ)

RecordTy

Haskell ADT

  | -- | Record type. (It's a list not a map because the dependency order
    -- needs to be preserved.)
    RecordTy [(Symbol, Usage, Term primTy primVal)]

ML Syntax

  {name₁ usage₁ : term₁, name₂ usage₂ : term₂, …, nameₙ usageₙ : termₙ}

S-expression Syntax

  (:record-ty
    (name₁ usage₁ term₁)
    (name₂ usage₂ term₂)
    ...
    (nameₙ usageₙ termₙ))

Elim

Var

Haskell ADT

  = -- | Named Variables
    Var NameSymbol.T

ML Syntax
```
  λ x ⟶ x
```
S-expression Syntax
```
  (:named-lambda x x)
```
- We treat vars as simple any atom in the list that isn't a special form.

App

Haskell ADT

  | -- | elimination rule of PI (APP).
    App (Elim primTy primVal) (Term primTy primVal)

ML Syntax
```
  f a b
```
S-expression Syntax
```
  (f a b)
```
- We simply represent application as the car of the list if it's not a special form

Ann

Haskell ADT

  | -- | Annotation with usage.
    Ann Usage (Term primTy primVal) (Term primTy primVal) Universe

ML Syntax
```
  (ω | f x y : C x y : * 0)
```

S-expression Syntax

  (: (f x y) (:usage ω :universe 0)
     (C x y))

  ;; Should we adopt it without names?
  (: (f x y) (ω 0)
     (C x y))

Here we have a choice, I′m not sure which one I prefer, the first is more explicit, but may be a bit too verbose

Lookup

Haskell ADT

  | -- | Project an element from a record type.
    Lookup (Elim primTy primVal) Symbol

ML Syntax
```
  name.elim
```
S-expression Syntax
```
  (:lookup elim name)
```

Value

Instead of giving each variant a section, we will instead note that Values relate to Terms, except instead of an Elim they have a Neutral.

Thus, it's an extension of Term.

Neutral

Neutral is the same as Value, except we remove annotations from the possibilities

Pattern

Constructor

Haskell ADT

  = PCon GlobalName [Pattern primTy primVal]

ML Syntax
```
  (Cons a as)
```

S-expression Syntax

  ;; No other form has an unattended name with arguments, so this is
  ;; safe
  (Cons a as)

Pair

Haskell ADT

  | PPair (Pattern primTy primVal) (Pattern primTy primVal)

ML Syntax

  -- We have syntax for multi pairs
  ‹x, y, z›
  -- This is just an alias for
  ‹x, ‹y, z››

S-expression Syntax

  (:pair x y z)
  ;; This is just an alias for
  (:pair x (:pair y z))

Unit

Note
Should this be treated specially?
Haskell ADT
```
  | PUnit
```
ML Syntax
```
  ∅
```
S-expression Syntax
```
  :unit
```

Var

Haskell ADT
```
  | PVar PatternVar
```
ML Syntax
```
  (Constructor a b c)

  a

  z
```
S-expression Syntax
```
  (Constructor a b c)

  a

  z
```

Dot

Haskell ADT

  -- Deduction by the unifier, that the term should already be
  -- known
  | PDot (Term primTy primVal)

ML Syntax

  -- Taken From
  -- data Double (n : ℕ) : ℕ → Set where
  --   double : Double n (n + n)

  -- foo : ∀ m n → Double m n → {!!}
  -- foo m .(m + m) double = {!!}

  .(m + m)

S-expression Syntax
```
  (:dot (+ m m))
```

Primitive

Haskell ADT
```
  | PPrim primVal
```
ML Syntax
```
  (Prim +)
```
S-expression Syntax
```
  (:prim +)
```

Raw Globals

This is the target that the context type should hold. However we have to note a few facts about how things are stored.
We don't really have ML syntax counterparts for these, so instead we will give the raw structure as the syntax.

FunClause

In this the full type has many overlapping elements, thus we should just steal the RawFunClause field

Haskell ADT

  data RawFunction' ext primTy primVal = RawFunction
    { rawFunName :: GlobalName,
      rawFunUsage :: GlobalUsage,
      rawFunType :: Term' ext primTy primVal,
      -- The value we want
      rawFunClauses :: NonEmpty (RawFunClause' ext primTy primVal)
    }
    deriving (Generic)

  -- The Type that we will emulate
  data RawFunClause' ext primTy primVal = RawFunClause
    { rawClauseTel :: RawTelescope ext primTy primVal,
      rawClausePats :: [Pattern' ext primTy primVal], --TODO [SplitPattern]
      rawClauseBody :: Term' ext primTy primVal,
      rawClauseCatchall :: Bool
    }
    deriving (Generic)

Haskell Context ADT

  data Def term ty = D
      -- This mimics rawFunUsage
    { defUsage :: Maybe Usage.T,
      -- This mimics rawFunType
      defMTy :: Maybe ty,
      -- This field will mimic rawFunClause'
      defTerm :: term,
      defPrecedence :: Precedence
    }
    deriving (Show, Read, Generic, Eq, Data)

Raw Fun Clause Example

  RawFunClause
    { rawClauseTel = []
    , rawClausePats =
        [ PVar "x" ]
    , rawClauseBody = Elim'
                      ( Var "x" )
    , rawClauseCatchall = False } :| []

Currently I can't get a better example sadly.

S-expression Syntax

  (:raw-fun-clause () (x)
    x
    :false)

6 - Syntax

This is the BNF of Juvix’s syntax

Syntax

  header ::= mod <symbol> where <top-level>*
           | <top-level>*

  top-level ::= <type>
              | <module-open>
              | <type-class>
              | <type-class-instance>
              ; These four last, due to lack of words before
              | <module-signature>
              | <infix-declare>
              | <module>
              | <module-open>
              | <signature>
              | <function>

  ;;;; Declarations ======================================================

  declarations ::= declare <declare-adt>

  declare-adt ::= <infix-declare>

  infix-declare ::= infix <symbol> <number>
                  | infixl <symbol> <number>
                  | infixr <symbol> <number>

  declaration-expression ::= <declarations> in <expression>

  ;;;; Functions ======================================================

  ;; let until indent sensitivity is gone
  function ::= let  <function-body>

  function-body ::= <name> <args>+ = <expression>
                  | <name> <args>+ <cond-logic>*

  ;; Match-logic could just be a name, and so is thus fine
  ;; # is for implicit
  ;; Should we have implicit pattern matching like here?
  args ::= <match-logic>
         | #<match-logic>

  ;; sig only here while indent sensitivity is not here
  ;; if we only have usage in arrows, we should allow a way to have
  ;; usage information of variables as well!
  signature ::= sig <name>         :                  <expression>
              | sig <name>         : <constraint> =\> <expression>
              | sig <name> <usage> :                  <expression>
              | sig <name> <usage> : <constraint> =\> <expression>

  constraint ::= <type-name>
               | ( <constBody>* )

  constBody ::= <type-name> , <type-name>
              | <type-name>
  ;;;; Types ==========================================================

  type ::= <data-declaration>

  ;; Should we let there be a refinement in adt type declaration?
  ;; ? means it may or may not be there
  ;; it makes sense to have the number here, as it can't be confused with an
  ;; expression type
  data-declaration
    ::= type u#<usage>? <non-capital-name> <symbols>+ : <expression> = <adt>
      | type u#<usage>? <non-capital-name> <symbols>+                = <adt>

  adt ::= <sum1> <sum>*
        | <product1>


  sum1 ::= <name>
         | <name> <product>

  ;; | is special in bnf
  sum ::= \| <name>
        | \| <name> <product>

  product1 ::= { <name-type-comma>+ <name-type>* } -\> <type-name>
             | { <name-type-comma>* }              -\> <type-name>
             | { <name-type-comma>+ <name-type>* }
             | { <name-type-comma>* }


  product ::= { <name-type-comma>+ <name-type>* } -\> <type-name>
            | { <name-type-comma>* }              -\> <type-name>
            | { <name-type-comma>+ <name-type>* }
            | { <name-type-comma>* }
            | : <expression>
            | <expression''>*

  name-type      ::= <name> <usage>?  <type-signature>
                   | #<name> <usage>? <type-signature>
  name-type-comma ::= <name-type> <comma>


  ;; { } here are a refinement type!
  type-refine ::= <expression> { <expression> }
                | <expression>
                | <name> : <expression>
                | #<name> : <expression>
                | <name> : <expression> { <expression> }
                | #<name> : <expression> { <expression> }

  ;;;; Arrows =========================================================

  arrow ::= -<usage>-\>

  ;;; Modules ========================================================

  ;; For all intensive purposes, modules are the same as values, just with
  ;; top level expressions, and a minimal amount of sugar

  ;; This doubles as our import
  module-open ::= open <module-name>


  module-open-expression ::= open <module-name> in
                           | <module-name>.( <expression> )

  ;; We are going to make modules a bit more interesting, syntax wise
  ;; imagine modules were functions with capital name to delineate
  ;; thus module signatures have the same signature look as functions

  ;; Interestingly enough module "functors" can take more than just modules
  ;; they can take any value, however for examples, we will encourage the use
  ;; of taking (and thus parameterizing) modules

  ;; let and end are there until indent sensitivity is in
  module ::= mod <name> <args>+ = <top-level>* end
           | mod <name> <args>+ <cond-top>* end


  ;; we use mod here to disambiguate it for parsing speed
  moduleE ::= mod <name> <args>+ = <top-level>* end in <expression>
            | mod <name> <args>+ <cond-top>* end in <expression>

  ;; what should we allow in a module signature?
  ;; begin and end are only here while we aren't indent sensitive



  module-signature ::= <module-signature-helper> Module <sigs-and-types>+ end

  sigs-and-types ::= <sig>
                   | <module-signature>
                   | <type>

  module-signature-helper
    ::= sig <name>           :                 <expression>
      | sig <name>           : <type-name> =\> <expression>
      | sig <name> <usage-f> :                 <expression>
      | sig <name> <usage-f> : <type-name> =\> <expression>


  cond-top ::= \| <expression> = <top-level>*
  ;;;; Types Classes ==================================================

  ;; Need end if we are indent sensitive!
  type-class ::= class <type-name> where
               | class <type-name> =\> <type-name> where

  ;; Need end if we are indent sensitive!
  type-class-instance ::= instance <type-name> where

  ;;;; Expressions ====================================================

  ;; See comments about which to keep and which to maybe remove
  expression'' ::= <match>
                 | <if>
                 | <cond>
                 | <record-access>
                 | <module-lookup>
                 | <let>
                 | <moduleE>
                 | <let-type>
                 | <module-open-expression>
                 | <where>
                 | <string>
                 | <number>
                 | <lambda>
                 | <tuple>
                 | <list>
                 | <parens>
                 | <symbol>
                 ; This is useful for having nested do's or matchs
                 | <block>
                 | <do>
                 | <comments>
                 | <arrow>
                 | <infix>
                 | <record-creation>
                 | <type-refine>
                 ; TODO
                 | <record-update>
                 | <declaration-expression>

  expression ::= <application>
              | <expression''>

  usage ::= <expression>

  usage-f ::= <constant> | ( <expression> )

  record-access ::= <name>.<name>

  module-lookup ::= <module-name>.<name>

  application ::= <name> <expressions>*

  lambda ::= \\ <match-logic>* -\> <expression>

  symbol ::= <name>

  ;; useful for match, and nested do's!
  block ::= begin <expression> end

  do ::= <do-body>*

  do-body ::= <exprsesion> \; <expression>

  list ::= [ <command-list>* ]

  commad-list ::= <exprsesion> , <expression>


  tuple ::= ( <command-tuple>* )

  commad-tuple ::= <exprsesion> , <expression>

  parens ::= ( <expression> )

  comments ::= -- <any-text-not-new-line> \n
             | \n-- <any-text-not-new-line> \n
             | <comments-rec>

  comments-rec ::= <multi-comments>
                 | {- <comments-rec> -}

  multi-comments ::= {- <any-text-not-{-> -}

  infix ::= <expression> <inifx-name> <expression>

  record-creation ::= { <name-set-comma>* }


  name-set-comma ::= <name-set-creation> ,
                   | <name-set-creation>

  name-set-creation ::= <name> = <expression>
                      | <name>


  ;;; Matching ===================================

  match ::= case <expression> of <match-l>*

  match-l ::= \| <match-logic> -\> <expression>

  match-logic ::= <name>@<match-logic'>
                | <match-logic'>

  match-logic' ::= ( <match-logic''> )
                 | <match-logic''>

  match-logic'' ::= <record-match>
                  | <constructor-name> <match-args>+
                  | <constant>

  match-args ::= <name>
               | <match-logic>
               | <constant>

  record-match ::= { <name-set-comma-match>* }

  name-set-comma-match ::= <name-set> ,
                         | <name-set>


  name-set ::= <name> = <match-logic>
             | <name>

  ;; we should remove either if or cond!?
  if   ::= if   <cond-logic>*
  cond ::= cond <cond-logic>*

  constant ::= <string>
             | <number>
             | <char>

  ;;; Bindings ===================================

  ;; Due to trying to be less indent sensitive,
  ;; we only look for the in alternative,
  ;; is that we only have a single binding per let.
  let ::= let <function-body> in <expression>

  type-let ::= let <type> in <expression>

  ;; Does this even make sense to have?
  ;; Juvix is not lazy, how is order determined?
  ;; is it only for pure values???
  where ::= <expression> where <bindings>*

  binding ::= <match-logic> = <expression>


  ;; note it's fine to use |,
  ;; as matches have to be a pattern,
  ;; and thus not some expression

  ;; note in stdlib else and otherwise will both be true

  cond-logic ::= \| <expression> = <expression>

  ;;; Numbers ====================================

  number ::= <digits>*.<digits>*
           | <digits>*<exp>
           | <digits>*.<digits>*<exp>


  digits ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9


  exp ::= e <digits>*
  ;;; Strings ====================================

  ;; Give nicer string syntax?
  string ::= " <escaped-string>+ "

  escaped-string ::= <ascii-no-quotes-no-backslash> <escaped-string>+
                   | \" <escaped-string>+
                   | \ <escaped-string>+

  ;;; Universe ====================================

  ;; for now, set it up to what F* has, expand it later
  universe-expression ::= u#<name>
                       | u#<name> + u#<name>
                       | max u#<name>*

  ;;;; Misc ===========================================================
  ;; ; is comment in bnf
  comma            ::= ,
  semi             ::= \;
  name             ::= <utf8-non-reserved>
  non-capital-name ::= <utf8-no-capital>
  capital-name     ::= <utf8-capital>
  ;; may want m e or Map.t int string?
  type-name   ::= <name> <others-names>+
  infix-symbol ::= <utf8-infix>

  module-name ::= <name> ; enforce capital names?

  constructor-name ::= <capital-name-and-symbols>

  utf8-infix        ::= `<utf-non-reserved>`
                      | <UTF.Symbol>
  utf8-non-reserved ::= <UTF.Alpha>
                      | (<utf8-infix>)
  utf8-no-capital   ::=
  utf8-capital      ::=