mit-plv / koika Goto Github PK

View Code? Open in Web Editor NEW

136.0 24.0 9.0 4.78 MB

A core language for rule-based hardware design 🦑

License: GNU General Public License v3.0

Makefile 2.59% Coq 60.86% Emacs Lisp 0.72% Shell 0.54% Python 0.72% OCaml 28.99% C++ 5.49% Verilog 0.08%

coq programming-languages formal-methods semantics hardware-description-language compilation

koika's People

Contributors

Stargazers

Watchers

Forkers

mbty matteobusi ju-sh 00mjk namin barkhausen-institut jnbrq mhbtz1

koika's Issues

Context unknown during type checking

This is a minor issue but the error requests it is reported :) The following action correctly fails type checking due to the input to the external function having the incorrect type. However, Koika fails to discover/print the context of where the type checking failure is. Note removing the if in the _simple action generates the right context.

  Inductive reg_t := foo | bar.

  Definition R r :=
    match r with
    | foo => bits_t 32
    | bar => bits_t 4
    end.

  Definition r idx : R idx :=
    match idx with
    | foo => Bits.zero
    | bar => Bits.zero
    end.

  Inductive ext_fn_t := f1.

  Definition Sigma (fn: ext_fn_t) :=
    match fn with
    | f1 => {$ bits_t 1 ~> bits_t 32 $}
    end.

  Inductive rule_name_t := simple .

  Definition _simple : uaction reg_t ext_fn_t :=
    {{ if |4`d1| == |4`d0| then pass
       else (
        let y := extcall f1 (read0(foo)) in
        write0(foo, y)
        )
    }}.

  Definition simple_device : scheduler :=
    simple |> done.

  Definition rules :=
    tc_rules R Sigma
             (fun r => match r with
                    | simple => _simple
                    end).

Koika status:

## Type error:
(TypeMismatch (bits_t 32) (bits_t 1))

## Context unknown; please report this bug.

`unit_t` in ext_fn_t input generates invalid verliog

Perhaps not conventional usage but currently unit_t/bits_t 0 sized input from an external function generates invalid Verilog. e.g.

    Definition Sigma (fn: ext_fn_t) :=
      match fn with
      | f1 => {$ unit_t ~> bits_t 32 $}
      | f2 => {$ unit_t ~> bits_t 1 $}
      end.

Generated verilog with efr_internal := false:

  // -*- mode: verilog -*-
  // Generated by cuttlec from a Kôika module
  module simple_device(input wire[0:0] CLK, input wire[0:0] RST_N, output wire[-1:0] f1_arg, output wire[-1:0] f2_arg, input wire[31:0] f1_out, input wire[0:0] f2_out);
   reg[31:0] total /* verilator public */ = 32'b0;

   assign f2_arg = 0'b0;
   assign f1_arg = 0'b0;

   always @(posedge CLK) begin
    if (!RST_N) begin
     total <= 32'b0;
    end else begin
     total <= ~f2_out && ~f2_out ? f1_out + total : (f2_out ? 32'b0 : total);
    end
   end
  endmodule

Note the output wire[-1:0] f1_arg output wire[-1:0] f2_arg and also assign f1_arg = 0'b0; assign 21_arg = 0'b0; are invalid as constants cannot be zero length (IEEE 1800-2012 - 5.7 Numbers).

Generated verilog with efr_internal := true:

// -*- mode: verilog -*-
// Generated by cuttlec from a Kôika module
module simple_device(input wire[0:0] CLK, input wire[0:0] RST_N);
  reg[31:0] total /* verilator public */ = 32'b0;

  wire[0:0] mod_f20_out;
  f2 mod_f20(.CLK(CLK), .RST_N(RST_N), .arg(0'b0), .out(mod_f20_out));
  wire[31:0] mod_f10_out;
  f1 mod_f10(.CLK(CLK), .RST_N(RST_N), .arg(0'b0), .out(mod_f10_out));

  always @(posedge CLK) begin
    if (!RST_N) begin
      total <= 32'b0;
    end else begin
      total <= ~mod_f20_out && ~mod_f20_out ? mod_f10_out + total : (mod_f20_out ? 32'b0 : total);
    end
  end
endmodule

Note .arg(0'b0) again is a zero sized constant.

Vectors and name collisions

In Kôika projects including vectors, obtaining a name collision is trivial - simply add a register named <vector_name>_0. This is due to the way names are built through concatenation in coq/DerivingShow.v: the names generated are themselves valid Coq identifiers. For a demonstration, see commit 6e2010a on my fork. In this example, a register named rData_0 has been added to examples/vector.v and g++ doesn't appreciate that.

There are several possible workarounds, and the simplest I can think of is to use the $ character in the concatenation phase instead of _. The reason why this works is that the C++, Verilog and DOT backends all accept this character in register names whereas Coq doesn't allow it in identifiers. This follows from the definition of an identifier in the standard for C++ and Verilog, and from the fact that register names are printed between quotes for .dot files. See commit c11c463 for a demonstration of this possible fix (examples/vector.v has not changed since the previous commit yet it compiles now).

If support for $ in identifiers is to be added in some future Coq release, then this character should be considered reserved by Kôika. If another backend without support for $ in identifiers is to be added to Kôika in a future release, then there wouldn't be much of a problem. What matters is that Kôika outputs unique names, and these can always be adapted from cuttlec to suit the specifics of given backends.

Some other related but less problematic issues exist. For instance, there is no upper bound to the length of Coq identifiers yet the same can not be said of Verilog, where the lowest maximum length tolerated by the standard is 1024. On a similar note, Coq is in general much more tolerant than Verilog when it comes to the characters allowed in identifiers: using accents (let alone non-breaking spaces) in a Verilog identifier is not accepted. Writing Kôika code that typechecks in Coq yet results in an error further down the line is therefore possible and even easy. As Kôika outputs unique names, this kind of problem can always be managed from cuttlec. In practice, this problem is easy to deal with and in fact the solution is always some register renamings away.

The context of this question is that I am looking for a way to generate Verilog through Kôika in such a way as to get BRAM blocks to be used through inference when using vectors past a certain size. Assuming that registers stored in BRAM don't behave differently from registers stored using flip-flops, this shouldn't violate any of Kôika's expectations. It could for instance be interesting to model the memory of the RISC-V example directly in Kôika instead of relying on the current workarounds.

However, it seems to be possible to define a vector of registers of varying sizes through Vect as follows:

Definition R r :=
  match r with
  | rData n => bits_t (index_to_nat n)
  end.

This does not correspond to a vector in the traditional sense and it would in particular not be suitable for BRAM storage. Is this behavior desired or just a natural consequence of the way registers are managed? Why is reg_t defined directly through Inductive rather than through a custom construct?

There is a link between the previous question and the "related but less problematic issues" mentioned above: a way of solving them would be to limit the accepted names for register names by avoiding direct reliance on Inductive for their definition (and maybe likewise of Definition for R and such). Furthermore, not treating registers and vectors uniformly may reveal practical for my purpose. In fact, limiting the definition of reg_t to raw registers and vectors (in the traditional sense of the word) covers all the use cases of Kôika I can think of (I mean, registers alone would suffice but vectors make life way easier) - am I missing situations where a full blown inductive type is useful?

I think this digression matters as quite a lot of systems that would be interesting to model using Kôika have some notion of memory, which is best represented using vectors, which are in turn best stored in BRAM when targeting FPGAs.

Build issues on macOS

I encounter the following compiling issue.

== Building OCaml library and executables ==
dune build ocaml/cuttlec.exe @install
ocamlopt ocaml/cuttlec.exe
ld: warning: directory not found for option '-L/opt/local/lib'
make: *** No rule to make target examples/_objects/collatz.lv', needed by examples'. Stop.

Did I miss some configuration in my _Coqproject file?
Any advice is welcome. Thanks!

Potential optimization in the RISC-V example

I noticed that the getFields function of the RISC-V example often gets called just to get the value of a single field (with calls along the line of let funct3 := get(getFields(inst), funct3)). Since fetching the values of all the fields just to extract the value of one of them looked a bit overkill, I tried replacing getFields with individual getXxx functions (e.g. getFunct3). This provides a decent boost to simulation performance:

-- Running tests with Cuttlesim --
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/rvbench_qsort.rv32 -1
[before] real: 0m0.018s	user: 0m0.017s	sys: 0m0.001s
[after ] real: 0m0.013s	user: 0m0.013s	sys: 0m0.000s
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/rvbench_median.rv32 -1
[before] real: 0m0.012s	user: 0m0.012s	sys: 0m0.000s
[after ] real: 0m0.008s	user: 0m0.007s	sys: 0m0.001s
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/img.rv32 -1
[before] real: 0m0.216s	user: 0m0.195s	sys: 0m0.020s
[after ] real: 0m0.160s	user: 0m0.142s	sys: 0m0.018s
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/primes.rv32 -1
[before] real: 0m4.421s	user: 0m4.417s	sys: 0m0.001s
[after ] real: 0m3.090s	user: 0m3.084s	sys: 0m0.001s
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/morse.rv32 -1
[before] real: 0m1.584s	user: 0m1.583s	sys: 0m0.000s
[after ] real: 0m1.086s	user: 0m1.085s	sys: 0m0.000s
-- Running tests with Verilator --
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/rvbench_qsort.vmh -1
[before] real: 0m0.033s	user: 0m0.032s	sys: 0m0.001s
[after ] real: 0m0.030s	user: 0m0.030s	sys: 0m0.000s
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/morse.vmh -1
[before] real: 0m2.689s	user: 0m2.687s	sys: 0m0.000s
[after ] real: 0m2.186s	user: 0m2.184s	sys: 0m0.000s
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/rvbench_median.vmh -1
[before] real: 0m0.022s	user: 0m0.022s	sys: 0m0.000s
[after ] real: 0m0.018s	user: 0m0.017s	sys: 0m0.001s
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/primes.vmh -1
[before] real: 0m8.358s	user: 0m8.349s	sys: 0m0.001s
[after ] real: 0m6.778s	user: 0m6.767s	sys: 0m0.001s
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/img.vmh -1
[before] real: 0m0.399s	user: 0m0.385s	sys: 0m0.014s
[after ] real: 0m0.327s	user: 0m0.310s	sys: 0m0.017s

This results in a speedup of roughly 1.43 for Cuttlesim and of 1.23 for Verilator compared to master. I did not yet check the effects on synthesis but I might do it soon. I assume that this kind of optimization is applied automatically when synthesizing (isn't it?), so I don't expect changes on this side (alas). See commit 04cfdf8 for a demonstration.

Infinite loop

Dear @cpitclaudel ,

We are trying out Koika to run part of an operating system on an FPGA.
In a test case (a.k.a. Example) we are using tc_compute and interpret_action to evaluate an action.
When we do so, we enter an infinite loop.

So far I could only identify two source where this can happen:

In type class resolution (https://issuehint.com/issue/coq/coq/16216)
In tactics such as repeat (https://stackoverflow.com/questions/53750792/can-composing-try-and-repeat-lead-to-an-infinite-loop-in-coq)

I tried to get some debug output on following the instructions shared in the link above but to no avail.
There is quite some code in Koika that uses repeat but I was not sure how to debug this properly.

It is hard to boil this down to a small example that would fit in a bug post.
But if you could possibly find the time, I could share the according repository with you.

Any help would be very much appreciated.

External function incorrectly pruned if rule is otherwise uneffectful

In the following example, an external function is written to but the input value is unused/dropped. The generated verilog drops the call to total_o completely (even though it is effectful as we write a value to it). I would guess the logic optimisation pass does not correctly account for external functions.

Full source example. Relevant sections:

  Definition Sigma (fn: ext_fn_t) :=
    match fn with
    | f1 => {$ bits_t 1 ~> bits_t 32 $}
    | do_zero_i => {$ bits_t 1 ~> bits_t 1 $}
    | total_o => {$ bits_t 32 ~> bits_t 1 $} (* <-- This external function *)
    end.

  Definition _write_back : uaction reg_t ext_fn_t :=
    {{ let previous_total := read0(total) in
       let drop := extcall total_o ( previous_total ) in pass
    }}.

  Definition simple_device : scheduler :=
    write_back |> ...

  Definition ext_fn_specs (fn: ext_fn_t) :=
    match fn with
    | f1 => {| efr_name := "f1"; efr_internal := true |}
    | do_zero_i => {| efr_name := "f2"; efr_internal := true |}
    | total_o => {| efr_name := "f3"; efr_internal := true |} (* <--- *)
    end.

Note no mention of the external call in generated verilog (occurs also for efr_internal := false):

// -*- mode: verilog -*-
// Generated by cuttlec from a Kôika module
module simple_device(input wire[0:0] CLK, input wire[0:0] RST_N);
	reg[31:0] total /* verilator public */ = 32'b0;

	wire[0:0] mod_f20_out;
	f2 mod_f20(.CLK(CLK), .RST_N(RST_N), .arg(1'b1), .out(mod_f20_out));
	wire[31:0] mod_f10_out;
	f1 mod_f10(.CLK(CLK), .RST_N(RST_N), .arg(1'b1), .out(mod_f10_out));

	always @(posedge CLK) begin
		if (!RST_N) begin
			total <= 32'b0;
		end else begin
			total <= ~mod_f20_out && ~mod_f20_out ? mod_f10_out + total : (mod_f20_out ? 32'b0 : total);
		end
	end
endmodule

Adding a dummy register and writing to it with a nonconstant value generates correct verilog:

  Definition _write_back : uaction reg_t ext_fn_t :=
    {{ let previous_total := read0(total) in
       let still_dropped := extcall total_o ( previous_total ) in (* <- still dropping value *)
       write0 (dummy, previous_total)                             (* <- but preventing rule being optimised away *) 
    }}.

Building error (Ocaml 4.11.1)

Hello, I'm trying to build Koika with OCaml 4.11.1 and I get the following error log in my command line when running make:

== Building Coq library ==
dune build @@coq/all
File "./coq/FiniteType.v", line 44, characters 4-49:
Warning: omega is deprecated since 8.12; use “lia” instead.
[omega-is-deprecated,deprecated]
File "./coq/FiniteType.v", line 44, characters 4-49:
Warning: omega is deprecated since 8.12; use “lia” instead.
[omega-is-deprecated,deprecated]
File "./coq/FiniteType.v", line 45, characters 4-58:
Warning: omega is deprecated since 8.12; use “lia” instead.
[omega-is-deprecated,deprecated]
File "./coq/FiniteType.v", line 52, characters 4-10:
Warning: omega is deprecated since 8.12; use “lia” instead.
[omega-is-deprecated,deprecated]
File "./coq/FiniteType.v", line 158, characters 0-67:
Warning: The default value for hint locality is currently "local" in a
section and "global" otherwise, but is scheduled to change in a future
release. For the time being, adding hints outside of sections without
specifying an explicit locality is therefore deprecated. It is recommended to
use "export" whenever possible. [deprecated-hint-without-locality,deprecated]
File "./coq/DeriveShow.v", line 106, characters 2-73:
Warning: The default value for hint locality is currently "local" in a
section and "global" otherwise, but is scheduled to change in a future
release. For the time being, adding hints outside of sections without
specifying an explicit locality is therefore deprecated. It is recommended to
use "export" whenever possible. [deprecated-hint-without-locality,deprecated]
File "./coq/EqDec.v", line 24, characters 2-3:
Error: [Focus] Wrong bullet -: Current bullet - is not finished.

make: *** [Makefile:17: coq] Error 1

Readme says that OCaml version should lie between 4.07 and 4.09. Is it a strict requirement and I've received the result of it's disturbance?

Proof failures with (unsupported) Coq 8.12

Some changes in Coq 8.12 tactics break proofs in a few places.

firstorder is now weaker (related coq change coq/coq#11760), firstorder with bool works in failing cases here
lia seems to be weaker, failing at coq/PrimitiveProperties.v:354&357 the first being:

  353         * f_equal.
  354           rewrite N.mul_mod_distr_l by (destruct sz; cbn; lia).  
  355           reflexivity.

This can be fixed by an intermediate remember:

  353         * f_equal.               
  354           rewrite N.mul_mod_distr_l ; (destruct sz; cbn; try lia).  
  355           remember (2^(Pos.of_succ_nat sz))%positive; lia

This seems odd to me (I can't find a relevant changelog for lia), does the Koika team think this is a Coq issue?

Multiplier correctness example

Hi, I'm new to coq and I am facing a few issues with understanding the code MultiplierCorrectness.v. I could understand till the lemma mul_to_partial_mul. I don't understand what the rest of the code does. @math-fehr It would be really helpful if I could get what are we trying to prove in lemmas enq_preserves_invariant, deq_preserves_invariant, step_preserves_result_invariant, step_preserves_result_finished_invariant. It would be really helpful for beginners like me if a few comments could be added giving the description of the lemma.

I am currently working on proving correctness of different modules of a processor based on riscv architecture. I would also like to start a discussion (if interested) as to how can we prove the functionality and timing correctness of different modules. For instance, I currently wrote a koika code for alu module of the processor. I was thinking how could the functionality correctness of this module could be proved with the help of coq (ie, what lemmas would be good to prove since its not a mathematical problem).

Any sort of idea or suggestion would be really helpful. Thank you :)

Errors in examples/rv/MultiplierCorrectness.v

Most of the proofs of the lemmas defined in the file examples/rv/MultiplierCorrectness.v do not pass. I did not check them all individually, but most of them seem to be outdated. At least enq_preserves_invariant (no matching clauses for match), deq_preserves_invariant (attempt to save an incomplete proof), step_preserves_finished_invariant,step_preserves_step_invariant and step_preserves_result_finished_invariant are impacted.

This is easy to miss as no target in the Makefile checks these proofs. It might be useful to add a correctness target to give an easy way to test them, especially since this file contains the most complex examples of proofs on Kôika designs of the examples folder. Not including it by default makes sense considering that it takes some time to pass and is not always directly relevant.

For the record, my version of coq is 8.11.0.

Compilation error with the cosimulation example when verilator >= 5.004

I get the following error when I attempt to compile examples/cosimulation.v :

/usr/bin/ld: blackbox.obj_dir.opt/verilated.o: in function `VerilatedContext::threadPoolp()':
verilated.cpp:(.text+0x3212): undefined reference to `VlThreadPool::VlThreadPool(VerilatedContext*, unsigned int)'
collect2: error: ld returned 1 exit status

It is caused by the changes made to Verilator in commit d6126c4b3.
From this version on, the VlThreadPool constructor mentioned in the error message appears in examples/_objects/cosimulation.v/blackbox.obj_dir.opt/verilated.o. The VlThreadPool class is defined in verilated_threads.cpp, and linking this file manually fixes the issue.

There is no mention of verilated_threads.cpp in any of the dependency files. Maybe this is a Verilator issue ? I did not find signs of other people having similar issues.

Translating Koika's bits_t (Vect) to MIT's bbv

Hi,

Is there a particular reason why Koika bitvector's are designed with the non-inductive type vect ? Why not use MIT's bbv ?

Context: the Koika implementations I'm working on have to be proven correct against a spec that works with bbv, I've written a few functions to translate types (like the one below) but ideally I'd need to define some sort of rewritable equality, I suppose.

Fixpoint bbvword_to_bits_sized {sz} (wd : Word.word sz) : bits sz :=
    match wd with
    | Word.WO => vect_nil
    | @Word.WS hd n tl => if (Bool.eqb hd true)
        then {|vhd := true; vtl := bbvword_to_bits_sized tl|}
        else {|vhd := false; vtl := bbvword_to_bits_sized tl|}
    end.

Outdated information in examples/rv/README.rst

An "FPGA" section has been (relatively) recently added to the main README of the repository, including the information that the RV32I model does not fit on the TinyFPGA BX at the time. However, the README in examples/rv contains some older information that gives the impression that the TinyFPGA BX is supposed to work. Maybe replacing this duplication with a redirection to the "FPGA" section of the main README would be clearer.

I am in fact interested in demonstrating some properties of a slightly modified version of the RV32I example, and I recently got my hands on this FPGA model. I was looking for a way of testing it on this device to change from simulation and tried to synthesize Kôika's RV32I as a first step, and I stumbled upon an error unrelated to the size issue mentionned in the main README (of which I was not aware of at the time):

[...]
3.2.1. Analyzing design hierarchy..
ERROR: Module `\usb_uart_i40' referenced in module `\top_ice40_usb' in cell `\uart' is not part of the design.
make: *** [Makefile:269: top_ice40_usb.json] Error 1

I noticed that this file exists in the tinyfpga_bx_usbserial repository and spotted an issue seemingly related to Kôika there, which leads me to assume that there is some undocumented dependency on this project but I did not go further than this.

If the break did not happen too far back, I would be interested in trying to make my version run on the FPGA. Could you possibly recommend an older Kôika commit with which synthesis should work? Also, do you think that removing the multiplication module on my end would be enough to fix the size issue, or is this hard to say?

Warning message about Boost version triggers spuriously

In the file ocaml/backends/resources/cuttlesim.hpp (around line 85)

#define MULTIPRECISION_THRESHOLD 64

#ifdef NEEDS_BOOST_MULTIPRECISION

#if BOOST_VERSION < 106800
// https://github.com/boostorg/multiprecision/commit/bbe819f8034a3c854deffc6191410b91ac27b3d6
// Before 1.68, static_cast<uint16_t>(uint128_t{1 << 16}) gives 65535 instead of 0
#pragma message("Bignum truncation is broken in Boost < 1.68; if you run into issues, try upgrading.")
#endif

#include <boost/multiprecision/cpp_int.hpp>

should be:

#define MULTIPRECISION_THRESHOLD 64

#include <boost/multiprecision/cpp_int.hpp>

#ifdef NEEDS_BOOST_MULTIPRECISION
#if BOOST_VERSION < 106800
// https://github.com/boostorg/multiprecision/commit/bbe819f8034a3c854deffc6191410b91ac27b3d6
// Before 1.68, static_cast<uint16_t>(uint128_t{1 << 16}) gives 65535 instead of 0
#pragma message("Bignum truncation is broken in Boost < 1.68; if you run into issues, try upgrading.")
#endif

Otherwise, the BOOST_VERSION macro is not defined at the time of the check, which results in the message triggering regardless of Boost's real version.

Some circuits that include extcalls behave unexpectedly when simulated through Verilator

The underlying problem seems to be that cuttlec treats extcalls like normal Kôika code when targetting Verilator, ignoring side effects.

For instance, unconditional calls to external modules with side effects are optimized away when the result of the extcall is left unused. For a minimal example of this behavior, see this commit on my fork. I modified the RISC-V example, adding an external module printing a value to terminal for both Cuttlesim and Verilator and calling it from the tick function in RVCore.v. make cuttlesim-tests quickly fills the output with messages, while the output of the tests ran by make verilator-tests is left unchanged. The commit 9638936 stores the result to a debug register, and in this case the behavior of Cuttlesim and Verilator are equivalent.

Another instance of this problem can be seen by nesting an extcall inside an if block, in which case the call is triggered without regard to the condition in some situations. See commit 9f1853a for a minimal example of this behavior (once again, the core of the example is in the tick function). Commit 65621bd is a slightly more complex example showing that the effects of calls to write<n> present in the same blocks as the triggered extcalls are ignored. Commit 0e34888 shows that binding the result of the extcall to another debug register somehow fixes that behavior. Maybe in that case the optimizer manages to remove the whole block? Commit 65621bd is quite similar to 0e34888, but in this case the condition starts as true and the write inside the if block makes it wrong for future calls. Although the value of the extcall is still stored in a register unrelated to the condition, this time the situation seems to be too complex for the optimizer and Verilator ends up printing a message on each tick. It looks like whenever the extcall can't be entirely removed through optimization, it is just triggered. Is this why Maybes are used as arguments to extcalls, with some repetition between the valid field and the conditions already checked in the context of the calls? Couldn't values be passed directly to extcalls and Maybes be generated automatically for Verilator at compile time based on the context of the call? I guess this is a feature that was never fully integrated and that functions well enough for its current purpose but I am interested in getting it to work in what I feel would be a cleaner way and would happily contribute to that end.

"make verilator-tests" results in a failure (cannot find file containing module: 'ext_host_id')

On the other hand, make cuttlesim-tests still works fine.
Looks like examples/rv/etc/sv/ext_host_id.v was not pushed.
See https://gist.github.com/mbty/1a764c50afd8b169954612a65add433b for full details.

Avoidable stalling in the RISC-V example

valid_rs1 and valid_rs2 are never read in the RISC-V example. Making sure that the values really correspond to rs1 or rs2 instead of stalling whenever at least one of them is associated with something other than 0 in the scoreboard yields better performance:

-- Running tests with Cuttlesim --
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/rvbench_qsort.rv32 -1
  [before] real: 0m0.018s	user: 0m0.017s	sys: 0m0.001s
  [after ] real: 0m0.013s	user: 0m0.013s	sys: 0m0.000s
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/rvbench_median.rv32 -1
  [before] real: 0m0.012s	user: 0m0.012s	sys: 0m0.000s
  [after ] real: 0m0.009s	user: 0m0.009s	sys: 0m0.000s
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/img.rv32 -1
  [before] real: 0m0.213s	user: 0m0.197s	sys: 0m0.016s
  [after ] real: 0m0.157s	user: 0m0.137s	sys: 0m0.019s
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/primes.rv32 -1
  [before] real: 0m4.413s	user: 0m4.407s	sys: 0m0.003s
  [after ] real: 0m3.084s	user: 0m3.080s	sys: 0m0.001s
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/morse.rv32 -1
  [before] real: 0m1.583s	user: 0m1.581s	sys: 0m0.000s
  [after ] real: 0m1.142s	user: 0m1.141s	sys: 0m0.000s
-- Running tests with Verilator --
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/rvbench_qsort.vmh -1
  [before] real: 0m0.033s	user: 0m0.033s	sys: 0m0.000s
  [after ] real: 0m0.031s	user: 0m0.030s	sys: 0m0.001s
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/morse.vmh -1
  [before] real: 0m2.692s	user: 0m2.689s	sys: 0m0.001s
  [after ] real: 0m2.547s	user: 0m2.544s	sys: 0m0.000s
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/rvbench_median.vmh -1
  [before] real: 0m0.022s	user: 0m0.022s	sys: 0m0.000s
  [after ] real: 0m0.021s	user: 0m0.021s	sys: 0m0.000s
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/primes.vmh -1
  [before] real: 0m8.383s	user: 0m8.371s	sys: 0m0.002s
  [after ] real: 0m7.892s	user: 0m7.885s	sys: 0m0.001s
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/img.vmh -1
  [before] real: 0m0.401s	user: 0m0.388s	sys: 0m0.013s
  [after ] real: 0m0.378s	user: 0m0.361s	sys: 0m0.017s

I kept only the tests that run for more than 2ms to keep this short. This results in tests taking roughly 25% less time with Cuttlesim and 5% less time for Verilator. Of course, your mileage may vary. I did not check the effects on synthesis.

See related commit 66b59e9.

Different behavior in simulation and FPGA

When synthesizing the RV32E example with the unit/led program on my TinyFPGA BX, the LED stays off.
On the other hand, Cuttlesim and Verilator both behave as I expect them to following the definition of the test.

I tried the following things independently from each other:

replacing

if (led_wr_valid)
  led <= led_wr_data;

with

if (!led_wr_valid)
  led <= !led_wr_data;

in top_uart.v results in the LED turning (and staying) on (FPGA only, simulation not impacted although making the same change to top.v indeed inverts Verilator's behavior);

replacing both may_run and on by Ob~1 in the extcall to ext_led in RVCore.v results in the LED turning (and staying) on, as should be expected (FPGA and simulation);
replacing the contents of led.c with a simpler infinite loop containing only a call to putled(1) does not fix anything - the LED is still off, and simulation spams the output with the "☀" symbol.

Do you observe the same thing on your side? I don't have access to an ULX3S-85k as of now, does this test behave any differently on there? Are other tests featuring extcalls impacted?

I'm quite surprised by Verilator and synthesis disagreeing.

About the upcoming module system and Kôika's semantics

I'm interested in learning more about the upcoming module system. What are its main design objectives and how far along is it in its development? I also have one use case in mind that Kôika doesn't quite handle yet, and I wonder how the module system will impact it.

Here is a simplified, abstract example of it. Assume the following:

In rule r1, there is a call to function f1 guarded by an if statement. The result of f1 is written to register g.
In a separate rule, if g holds the value 1 at the beginning of the cycle, the external rule e is called.
The only call to f1 that can possibly occur is the one in r1.

We want to prove the following property: "given schedule s and for all possible external semantics, whenever a call to f1 returns 1, e is called during the next cycle". This is complicated as of today, as all available semantics are register-centric, yet registers do not make an appearance here. More precisely, external calls are not considered for the reasons you detailed in issue #2 and both "TypedSemantics" and "CompactSemantics" erase the causality information that could be used to prove "when x is true, f1 is called and returns 1 and 1 is stored in g". It is of course possible to extract x, that is, the conditions under which f1 is called (remember that it is guarded by an if) and returns 1 (depends on the definition of r1), however in proving that when these are verified, 1 is stored in g, the link to the call to f1 is lost.

A more concrete example is as follows.

The rv example defines an isLegalInstruction function that is used to check whether a given word corresponds to an instruction according to the RISC-V specification. Currently, when an instruction fetched from memory turns out to be invalid, the program counter is set to zero, and the execution continues. If we wanted the processor to shutdown on an invalid instruction instead, we would have to modify RVCore to some extent. As the notion of shutdown is external to Kôika, we would have to rely on an external call to handle it (in fact ext_finish is already available).

We may be interested in proving the following property: "On the first cycle following one on which an invalid instruction was decoded (that is, one on which isLegalInstruction returned false), only one external call is issued. This call should be a call to the shutdown external rule. No other external calls can happen from this point on.".

I understand that the module system will probably have an impact on how external rules (including both those from other Kôika modules and "real" external calls such as ext_finish) are handled, with consequences on the semantics. Will the notion of calls to external rules indeed appear in the semantics? That should solve part of my problem. I guess it might not be the case for calls to rules of Kôika modules as those could be inlined as far as the logs go.

However, the more general problem of the lack of traceability of the causes of the actions tracked in the logs will likely persist as it is not really related to the notion of module as well as somewhat niche. The most obvious solution to this problem would be to develop yet another semantics with more detailed trace information and a proof of equivalence to the other ones in the situations they both cover. I guess that this is the route I will have to follow for the property I want to prove. Does this sound like the way to go or is there something simpler (maybe simply waiting for the module system is)?