Haskell-like core IR backend/optimizer (M2 project)

OCaml 86.3%
Forth 4.9%
Standard ML 3.6%
Shell 1.8%
Fortran 1.5%
Other 1.8%

Find a file

Hubert Gruniaux 317f18c0f9 Add references for inlining		2026-03-02 18:59:30 +01:00
src	Add comments for substitution_kind	2026-03-02 18:56:16 +01:00
test	Fully implement custom rewrite rules	2026-02-28 11:55:00 +01:00
.gitignore	Add global .gitignore	2026-02-10 18:58:47 +01:00
Makefile	Better build system and test system	2026-02-24 14:58:59 +01:00
README.md	Add references for inlining	2026-03-02 18:59:30 +01:00
sujet.pdf	initial import of the project files	2026-01-17 23:03:33 +01:00

README.md

The MiniCore project

Code written by Hubert Gruniaux.

Instructions:

# To build the project and run all tests
make

Supported arguments:

Usage: joujou.exe <options> <filename>

  --echo                   Prior to typechecking, echo intermediary steps
  --optimize               Typecheck, optimize, and display the optimized program
  --case-of-case           Enable case-of-case optimization
  --Omax-iters             Set the maximum number of iterations for the simplifier
  --Osmart-inline          Enable smart inlining
  --Orules                 Enable rewrite rules optimization
  --Oinline-threshold      Set the threshold for inlining (default: 100)
  --Oinline-arg-discount   Set the discount to apply for each argument in a function call for the inlining cost (default: 2)
  --Oinline-sat-discount   Set the additional discount to apply if the call is saturated for the inlining cost (default: 10)
  --Oexplain               Enable verbose logging for optimizations
  --Oexplain-rule-filter   Only log optimizations whose rule name matches the given regex
  --Oexplain-show-success  Show successful optimizations in the log
  --Oexplain-show-fail     Show failed optimizations in the log
  --Oexplain-show-info     Show informational messages about optimizations in the log
  -help                    Display this list of options
  --help                   Display this list of options

Extensions

In this project, I have implemented the following extensions:

smart/better inlining (--Osmart-inline)
optimizations feedbacks (--Oexplain)
user-defined rewrite rules (--Orules)

They are described more in details in the below sections.

Inlining

To enable this extension, should pass the --Osmart-inline option to the optimizer. This switch the inling decision procedure from the naive one (always inlining let bindings) to a smarter one based on multiple heuristics. The implementation is heavely inspired by the paper Secrets of the Glasgow Haskell Compiler inliner (Simon L. Peyton Jones and Simon Marlow; 2002) and a personal discussion with Manuel Serrano.

Mainly, this extension comes in two parts. First, the inlining of let-bindings can be done individually at each occurences, so the simplier can decide to just inline some occurences of a variable. Second, the implementation of an inlining decision procedure that decides for each occurences of a let-binding if it should be inlined or not.

The first part is directly implemented inside the src/simplifier.ml file by modifying a bit the Subst module and updating the code in simplify2 that does the variable substitution. We support two kinds of substitutions: Salways which is a variable substitution that should always be done (they come mainly from the match clauses or when smart inlining is disabled) and Sinline_candidate which is a variable that may or not be inlined depending on the decision from the inliner.

The second part, the inlining decision procedure, is implemented in the src/inliner.ml file. It contains a main function should_inline_here that takes a let-binding occurence and decide if yes or not we should inline it. It also takes other parameters used for optimization feedbacks logs and are irrelevant for this section.

The idea is to distinguish between five types of occurences of let-bindings in the code:

Odead: For let-bindings that do not have any occurences. These can always be safely removed. They are handled specially in the simplifier (this is the RULE(drop)).
Oonce: For let-bindings that have only one occurence which is neither inside a constructor nor a lambda abstraction. Inlining such let-binding is always beneficial because it doesn't increase code size nor duplicates work.
Oonce_in_lambda: For let-bindings that have only occurence inside a lambda abstraction. Inlining such let-binding does not increase code size must may duplicates work if the lambda function is called multiple times. Therefore, it is not always beneficial to inline it.
Omany_branch: For let-bindings that have only a single occurence in distinct match branches. Inlining such let-binding does increase code size but will not duplicate work. Therefore, it may be beneficial to inline it if it does not increase too much the code size (the bound term is small enough).
Omany: When all above cases do not apply. The let-binding occurs in many places and inlining it may both increase code size and duplicate work. However, it may still be beneficial to inline it in some cases. The types of occurences is computed once in the simplifier at the let-binding (find RULE(inline) comment in src/simplifier.ml) using the inliner function compute_occurences.

To avoid duplicating work when applicable, we only inline terms that are in weak head normal form (WHND), that is:

variables
data constructors, and
abstractions. Typed abstractions or applications are ignored as they erased at the time of code generation. Moreover, some specials nodes like type annotations or location nodes are also ignored.

To avoid too much code size increase, we compute an inlining cost which is the term size (excluding special nodes and typed abstractions/applications) minus some discounts. For example, if the inline candidate appears as a function in an application term then we add discount for each arguments and one more if the call is saturated. Indeed, if the call is saturated (more arguments that the count expected in the let-binding definition0; inlining it may avoid closure allocation. Moreover, more there are arguments, more there are optimizations opportunities. And, if inlined, we avoid the runtime cost for preparing arguments to the call (pusing them to stack, moving them to registers, etc.)

If the total inlining cost is less than some threshold (configurable with --Oinline-threshold), then the occurence is inlined.

Configuration parameters:

--Oinline-threshold max inlining cost threshold for inlining occurences that may increase code size,
--Oinline-arg-discount discount for each arguments in the inlining cost, and
--Oinline-sat-discount additional discount if the call is saturated in the inlining cost.

Optimization feedbacks

The implementation is quite simple and can be enabled with --Oexplain command-line argument. It aims to provide feedbacks about when or not optimizations rules are applied, and why. The most interesting showcase of the implementation is in conjunction with the previously mentionned smart inline extension as many parameters may impact the inlining procedure.

The feedbacks are grouped into two categories:

Successes, emitted when a rule was successfully applied. They are emitted by all rules.
Fails, emitted when a rule could not be applied (or was decided to not be applied). Not all rules emits a fail feedback. In particular, rule that depend on a specific context do not generally emit a fail feedback, such as case-of-case or beta-reduction.

In any case, the feedback may be followed by some info logs that may explain why the rule was applied or not.

Examples of some feedbacks for the rule drop:

# For a success in the rule_drop.f test case:
RULE(drop) succeed at "rule_drop.f":3:9-27: dropped let-binding of x__4
# And a failure in the inline/many_duplicate_too_big.f test case:
RULE(drop) failed at "inline/many_duplicate_too_big.f":7:9-442: failed to drop let-binding of x__7

As said earlier, the most interesting feedbacks comes from the inlining rule when --Osmart-inline is enabled. Consider the following code:

type bool
data constructor True : {} -> bool
data constructor False : {} -> bool

(* INLINE: f appears once in each distinct branches! *)
program
    fun [a] (c : bool) (v : a) (x : a) (y : a) (plus : a -> a -> a) : a -> a =
        let f (z : a) = plus v z in
        let g (z : a) = f in
        match c return a -> a with
        | False {} -> g x
        | True {} -> g x
        end

Then the simplifier emits the following feedbacks for the rule inline:

' The first feedback tells us that f was inlined because its appears only once inside a lambda,
' but as it is in WFNF it does not duplicate work!
RULE(inline) succeed at "inline/many_branch.f":9:25-26: inlined variable f__9 here
info: the let-binding appears only once inside a lambda
info: variable f__9 does not duplicate work when inlined because its binding term is in WHNF or is trivial
' The second feedback tells us that g was inlined because its only appears once two distinct branches.
' So inlining it cannot duplicate work. However, it can increase code size, 
' but its inlining cost was estimated to be 3.
RULE(inline) succeed at "inline/many_branch.f":11:23-24: inlined variable g__10 here
info: the let-binding appears appears only once in distinct branches (2 occurences)
info: computed discount for inlining this occurence as its appears in an application context:
  - number of arguments in the application: 1
  - is the call saturated: false (needed 2 arguments)
  - total discount: 2
info: variable g__10 is small enough to be inlined:
  - cost is 3 (term size = 5 and discount = 2)
  - threshold is 100
' Same here!
RULE(inline) succeed at "inline/many_branch.f":12:22-23: inlined variable g__10 here
info: the let-binding appears appears only once in distinct branches (2 occurences)
info: computed discount for inlining this occurence as its appears in an application context:
  - number of arguments in the application: 1
  - is the call saturated: false (needed 2 arguments)
  - total discount: 2
info: variable g__10 is small enough to be inlined:
  - cost is 3 (term size = 5 and discount = 2)
  - threshold is 100

All of this clearly explain why the inliner has decided, or not, to inline the different occurences. It is now the job of the developper to change the code to help the inliner or add some annotations (not implemented) if he is not happy with the inliner decision.

User-defined rewrite rules

Rules are global declarations and are done at the top level (like data constructors or types), just before the real program. They are composed of three elements:

The rule name, which is only used for debugging purposes and for optimization feedbacks,
The matching pattern,
The replacement expression.

There is an example of user-defined rule:

rule plus_zero_left: plus (Zero {}) x = x

Note: To better show and play with this feature, opaque globals were added to the language. A global is a variable that is defined at the file scope (like data constructors or types), and has no definition except a type.

A global can be declared with the following syntax:
global my_global_name : my_global_type
And then be referenced in the program code as any other variable.

Example:
global map : forall a b. (a -> b) -> list a -> list b
program
  let everything_to_false =
    map [bool] [bool] (fun (x : bool) = False {})
  in everything_to_false (Nil [bool] {})

Supported features

Not all types of expressions may be matched by the current implementation. In particular, only the following constructs are supported:

variables and globals
(types) applications,
(types) abstractions, and
data constructors.

Note that type annotations are ignored for the matching. So (x : bool) and x are considered the same by the pattern matching engine.

Warning about incorrect rules

The compiler has now way to verify that the rules verify the semantic of the problem. In fact, in most cases, rules are about opaque globals and types that the compiler has no information about. In other words, it is very possible to break the semantic of the program by writing incorrect rules such as:

rule true_is_false: True {} = False {}

which will replace all occurences of the True constructor by the False constructor. Therefore, itt is the responsability of the user to write correct rules.

An example (for others see test cases in test/rules)

For example, the following program:

global map : forall a b. (a -> b) -> list a -> list b

rule map_id: map [a] [a] (fun (x : a) = x) = (fun (l : list a) = l)

program
    fun [a] (l : list a) : list a =
        map [a] [a] (fun (x : a) = x) l

will be simplified to:

program
    fun [a] (l : list a) : list a = l (* IDENTITY *)

This optimization was impossible without specific rules as the compiler has no information about map. In this specific case, the compiler will not be able to optimize even if it had access to the function body.