TODO
Amaranth is released under the very permissive two-clause BSD license. Under the terms of this license, you are authorized to use Amaranth for closed-source proprietary designs.
See LICENSE.txt file for full copyright and license info.
System on Chip toolkit for Amaranth HDL
License: BSD 2-Clause "Simplified" License
Many methods on MemoryMap claim to iterate objects "in ascending order of their address", but this does not appear to be the case in the current code. As it stands, the code appears to iterate in order of object addition, as this is how Python dicts natively iterate.
This documentation was written five years ago, so it's unclear whether something changed or this was never implemented. Iterating in address order is a useful property and will help the determinism of downstream code, such as the SoC bus decoders, so the class should be modified to do this properly.
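The difference between the two iteration orders can be sketched in plain Python. This does not use MemoryMap itself; the `ranges` dict and both helper functions are hypothetical stand-ins that only illustrate insertion order versus address order.

```python
# Hypothetical stand-in for a memory map: name -> (start, end) address range.
# A plain dict iterates in insertion order, which mirrors the behaviour described above.
ranges = {}
ranges["uart"] = (0x10, 0x14)
ranges["timer"] = (0x00, 0x08)
ranges["gpio"] = (0x08, 0x0C)


def in_insertion_order(ranges):
    """Iterate the way a dict natively does: in order of addition."""
    return list(ranges.items())


def in_address_order(ranges):
    """Iterate in ascending order of start address, independent of add order."""
    return sorted(ranges.items(), key=lambda item: item[1][0])


insertion = [name for name, _ in in_insertion_order(ranges)]
by_address = [name for name, _ in in_address_order(ranges)]
print(insertion)
print(by_address)
```

Sorting at iteration time keeps insertion cheap; keeping the container sorted on insertion would be an alternative if iteration is the hot path.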
Apparently, resources in a dense memory window cannot be smaller than the data width of the parent memory map. This can be demonstrated in test_memory.py by making the resource in the dense window smaller than the ratio of the window (2 < 4):
# from this, which works
self.win3.add_resource(self.res6, name="name6", size=16)
# to this, which raises AssertionError in MemoryMap._translate(), via MemoryMap.all_resources() later on
self.win3.add_resource(self.res6, name="name6", size=2)
Here is another test case I created:
import pytest
from amaranth_soc.memory import MemoryMap

def test_dense():
    regs = ("reg0", "reg1", "reg2", "reg3")
    window = MemoryMap(addr_width=2, data_width=8)
    for reg in regs:
        window.add_resource(reg, name=reg, size=1)
    memory_map = MemoryMap(addr_width=1, data_width=16)
    (start, end, ratio) = memory_map.add_window(window, sparse=False)
    assert start == 0
    assert end == 2
    assert ratio == 2
    assert memory_map.decode_address(0) == regs[0]  # unexpected! (would expect regs[0], regs[1])
    assert memory_map.decode_address(1) == regs[0]  # completely unexpected!
    with pytest.raises(AssertionError):  # unexpected!
        list(memory_map.all_resources())
    for reg in regs:
        with pytest.raises(AssertionError):  # unexpected!
            res = memory_map.find_resource(reg)
I would expect that it is possible to address multiple units with a single address in this case. Am I making a logical mistake with this assumption?
If not, is this just a problem with the implementation (e.g. ResourceInfo not being able to represent "fractional" addresses)?
In either case, I think this should be caught in add_window()
already (or at least with an explanatory exception text).
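The arithmetic behind the failing case can be sketched in plain Python. This is not the library's code: translate_dense is a hypothetical helper, and the idea that MemoryMap._translate() asserts on a similar divisibility condition is an assumption.

```python
def translate_dense(start, end, ratio):
    """Map a [start, end) range in window units to parent units for a dense window.

    Hypothetical helper: with a dense window, `ratio` window units share one
    parent address, so a range only maps cleanly if both bounds are multiples
    of `ratio`. Otherwise the parent address would be "fractional".
    """
    if start % ratio or end % ratio:
        raise ValueError(
            f"range [{start}, {end}) is not aligned to ratio {ratio}; "
            f"it would occupy a fraction of a parent address")
    return start // ratio, end // ratio


# A 2-unit resource with ratio 2 occupies exactly one parent address.
aligned = translate_dense(0, 2, ratio=2)

# A 1-unit resource with ratio 2 (as in test_dense above) cannot be represented.
try:
    translate_dense(0, 1, ratio=2)
    fractional_rejected = False
except ValueError:
    fractional_rejected = True
```

A check of this kind in add_window() would report the problem where the window is added, rather than later during iteration.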
Issue by whitequark
Tuesday Sep 10, 2019 at 09:50 GMT
Originally opened as m-labs/nmigen-soc#1
Let's collect requirements for CSRs here.
Overview of oMigen design:
CSRStatus (R/O) and CSRStorage (R/W);
AutoCSR collects CSRs from submodules via Python introspection;
CSRConstant allows specifying limited supplementary data in generated code.
According to @sbourdeauducq, the primary semantic problem with oMigen CSR design is that it lacks atomicity, and fixing that would involve ditching the CSR bus and instead using Wishbone directly.
According to @mithro and @xobs, a significant problem with oMigen CSR design is that it does not allow generating documentation.
According to @whitequark, a problem with oMigen CSR design is that the code generation is very tightly coupled to MiSoC internals, and is quite hard to maintain.
Peripherals are a currently missing building block from nmigen-soc.
They would provide wrappers to cores by means of a CSR interface (also interrupts, but handling these could be the subject of a separate issue).
For example, an AsyncSerialPeripheral wrapper in nmigen-soc would provide access to an AsyncSerial core in nmigen-stdio. Baudrate, RX/TX data, strobes etc. would be accessed through CSRs.
Integration would be straightforward for peripherals that provide nothing more than CSRs:
csr.Multiplexer, whose bus interface is exposed by the peripheral
csr.Decoder
csr.Decoder bus interface is bridged to the SoC interconnect
But what about peripherals that also provide a memory interface? (e.g. DRAM controllers, flash controllers, etc.)
I see two possible approaches:
CSRs would be handled the same way as described above, but the peripheral would also provide a separate bus interface to access its memories (e.g. WB4). I think LiteX follows a similar approach.
This has the consequence of locating the CSRs and memories of a given peripheral in separate regions of the SoC address space.
pros:
csr.Decoder, and the WB4 interface of a peripheral is directly connected to its logic.
cons:
Instead of two separate interfaces, a memory-capable peripheral would expose a single bus interface like WB4 or AXI4. This has the consequence of locating all the resources of a peripheral in the same address space region.
wishbone.Decoder, whose bus interface would be exposed
csr.Multiplexer -> WishboneCSRBridge -> wishbone.Decoder)
pros:
cons:
Any thoughts on this ?
cc @whitequark @awygle @enjoy-digital and others
I feel the need for a way to configure the i2c-controlled devices I have on my fpga board from a wishbone bus driven by a cpu core. That means a single master and no real surprises on the bus, everything documented (hopefully) and no collisions. So the question has been what the wishbone-level interface would be to keep the complexity as low as possible in both the cpu code and the module implementation. There's also an idea of "you pay for what you need and no more".
The ideas I currently have:
A possible starting point on the interface:
Advantages of the interface:
Disadvantages:
Any ideas to complement/replace that one?
Hey,
if I understand the intended use of MemoryMap correctly, it should hold all the information needed to write software for the SoC one is generating.
This already works quite well for me if I only have one "logical register" at each address. However, if I want registers that are smaller than 8 bits, and want them packed (e.g. when emulating the memory map of an existing peripheral to reuse driver code), I wasn't able to come up with a clean way to express that using the current MemoryMap class.
Currently, csr.Builder works around this by casting array indices to strings when calling MemoryMap.add_resource().
Repro:
from amaranth import *
from amaranth_soc import csr

class FooRegister(csr.Register, access="r"):
    a: csr.Field(csr.action.R, unsigned(8))

regs = csr.Builder(addr_width=1, data_width=8)
for n in range(2):
    with regs.Index(n):
        regs.add("foo", FooRegister())

for reg, reg_name, reg_range in regs.as_memory_map().resources():
    print(reg_name)
Current output:
('0', 'foo')
('1', 'foo')
Expected output:
(0, 'foo')
(1, 'foo')
Currently, the Decoder does not catch illegal addresses, meaning that the initiator hangs waiting for ack (unless a timeout is implemented).
When constructed with features = {"err"}, the Decoder only propagates errors from the subordinate buses. Would it be appropriate to also signal an invalid address?
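The requested behaviour can be sketched as a small behavioural model in plain Python. This is not the Decoder implementation; the `decode` function and the `subordinates` table are hypothetical, and the model only shows an error being raised when no subordinate claims the address.

```python
def decode(addr, subordinates):
    """Return (name, err) for an address.

    `err` is True when no subordinate range contains `addr`, which is the
    case where an initiator would otherwise wait for an ack that never comes.
    """
    for name, (start, end) in subordinates.items():
        if start <= addr < end:
            return name, False
    return None, True


# Hypothetical address ranges, given as [start, end).
subordinates = {"ram": (0x00, 0x40), "uart": (0x80, 0x84)}

hit = decode(0x10, subordinates)
miss = decode(0x50, subordinates)
```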
Issue by HarryHo90sHK
Friday Sep 27, 2019 at 07:10 GMT
Originally opened as m-labs/nmigen-soc#2
Following the general ideas discussed in #1, I have made a new abstraction for CSR objects in my fork (HarryMakes/nmigen-soc@ba5f354 fixed). An example script is given below, which prints all the fields and their properties in a CSR named "test". You can also test the slicing functionality using the format mycsr[beginbit:endbit], but please note that the upper boundary is exclusive.
from nmigen import *
from nmigen_soc.csr import *

if __name__ == "__main__":
    mycsr = CSRGeneric("test", size=64, access=ACCESS_R_W, desc="A R/W test register")
    mycsr.f += CSRField("enable", desc="Enable signal", enums=['OFF', 'ON'])
    mycsr.f += CSRField("is_writing", size=2, access=ACCESS_R, desc="Status signal of writing or not",
                        enums=[
                            ("YES", 1),
                            ("NO", 0),
                            ("UNDEFINED", 2)
                        ])
    mycsr.f += CSRField("is_reading", size=2, access=ACCESS_R, desc="Status signal of reading or not")
    mycsr.f.is_reading.e += [
        ("UNDEFINED", 2),
        ("YES", 1),
        ("NO", 0)
    ]
    mycsr.f += CSRField("is_busy", size=2, access=ACCESS_R_WONCE, desc="Busy signal",
                        enums=[
                            ("YES", 1),
                            ("NO", 0),
                            ("UNKNOWN", -1)
                        ])
    mycsr.f += [
        CSRField("misc_a", size=32),
        CSRField("misc_b"),
        CSRField("misc_c")
    ]
    mycsr.f.misc_a.e += [
        ("HOT", 100000000),
        ("COLD", -100000000),
        ("NEUTRAL", 0)
    ]
    #mycsr.f += CSRField("impossible", size=30, startbit=6)
    print("{} (size={}) is {} : {}".format(
        mycsr.name,
        mycsr.size,
        mycsr.access,
        mycsr.desc))
    for x in mycsr._fields:
        print(" {} [{},{}] (size={}) is {}{}".format(
            mycsr._fields[x].name,
            mycsr._fields[x].startbit,
            mycsr._fields[x].endbit,
            mycsr._fields[x].size,
            mycsr._fields[x].access,
            (" : "+mycsr._fields[x].desc if mycsr._fields[x].desc is not None else "")))
After the changes introduced in Amaranth in amaranth-lang/amaranth@422ba9e, the definition of Signature.create in e.g. the wishbone module is incompatible: it lacks the src_loc_at keyword argument that is required by this call in Amaranth, hence it throws an error:
File "(...)/venv/lib/python3.10/site-packages/amaranth_soc/wishbone/bus.py", line 467, in __init__
super().__init__()
File "(...)/venv/lib/python3.10/site-packages/amaranth/lib/wiring.py", line 873, in __init__
self.__dict__.update(self.signature.members.create(path=()))
File "(...)/venv/lib/python3.10/site-packages/amaranth/lib/wiring.py", line 242, in create
attrs[name] = create_dimensions(member.dimensions, path=(*path, name),
File "(...)/venv/lib/python3.10/site-packages/amaranth/lib/wiring.py", line 237, in create_dimensions
return create_value(path, src_loc_at=1 + src_loc_at)
File "(...)/venv/lib/python3.10/site-packages/amaranth/lib/wiring.py", line 233, in create_value
return member.signature.create(path=path, src_loc_at=1 + src_loc_at)
TypeError: Signature.create() got an unexpected keyword argument 'src_loc_at'
Issue by jfng
Thursday Jan 23, 2020 at 11:44 GMT
Originally opened as m-labs/nmigen-soc#4
Repro:
from nmigen import *
from nmigen.back import rtlil
from nmigen_soc import wishbone

class Top(Elaboratable):
    def elaborate(self, platform):
        m = Module()
        m.submodules.dec = dec = wishbone.Decoder(addr_width=5, data_width=8, granularity=8)
        bus = wishbone.Interface(addr_width=4, data_width=8, granularity=8)
        dec.add(bus)
        return m

if __name__ == "__main__":
    print(rtlil.convert(Top()))
Output:
Traceback (most recent call last):
File "repro.py", line 18, in <module>
print(rtlil.convert(Top()))
File "/home/jf/src/nmigen/nmigen/back/rtlil.py", line 1007, in convert
fragment = ir.Fragment.get(elaboratable, platform).prepare(**kwargs)
File "/home/jf/src/nmigen/nmigen/hdl/ir.py", line 67, in get
obj = obj.elaborate(platform)
File "/home/jf/src/nmigen/nmigen/hdl/dsl.py", line 484, in elaborate
fragment.add_subfragment(Fragment.get(self._named_submodules[name], platform), name)
File "/home/jf/src/nmigen/nmigen/hdl/ir.py", line 67, in get
obj = obj.elaborate(platform)
File "/home/jf/src/nmigen-soc/nmigen_soc/wishbone/bus.py", line 247, in elaborate
with m.Case(sub_pat[:-log2_int(self.bus.data_width // self.bus.granularity)]):
File "/usr/lib/python3.7/contextlib.py", line 112, in __enter__
return next(self.gen)
File "/home/jf/src/nmigen/nmigen/hdl/dsl.py", line 283, in Case
.format(pattern, len(switch_data["test"])))
nmigen.hdl.dsl.SyntaxError: Case pattern '' must have the same width as switch value (which is 5)
(discussed during the 06/07 IRC meeting - log).
Examples of constants:
Constants could be associated both to individual CSRs (e.g. counter time limit) or to an entire peripheral (e.g clock frequency).
Different languages represent constants in different ways:
u and l suffixes. You can often get away with #define'ing a constant.
u8, or an i16, etc.
Name bikeshedding: ConstantDict or ConstantMap
Nesting:
Approach for the next iteration (first attempt was in #19):
In multiple places I end up needing an enable signal that generates ones at a given frequency in relation to the domain clock. This is a proposal for a generic generator for such a thing. Please someone find a nice name for it, EnableFrequencyGenerator is too weird.
I think the best method is to go for a Bresenham variant. The algorithm is simple. Let's call the domain frequency fd and the target frequency ft (with ft < fd). Then:
The circuit is then "if carry at the previous clock, add delta to the counter and output one, else add ftr and output 0".
Some special cases can be simplified. If ftr is a power of two then delta and fdr are equal, and a mux is dropped. If fd is a multiple of ft, we end up with a simple divider. I don't know which is the most efficient LUT-wise: muxing on the adder input and using the carry, or clearing to zero and using an equality comparison on the counter value.
There probably should be two versions of the class. FixedEFG takes fd and ft and generates a fixed-frequency generator. ProgrammableEFG takes a number of bits for the counter and provides a wishbone endpoint with a couple of registers to write ftr and delta. The gcd aspect can be ignored; it is only there to reduce the number of bits of the counter.
The ProgrammableEFG could have only one register for fd and do the subtraction by itself, but it's a little sad to have a wide adder used just once for that instead of relying on the computational capabilities of whatever cpu core is around. It should probably reset the counter on a write to the second register. Writing 0/0 stops the enable generation since no carry happens anymore.
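A software model helps check the long-run rate of such a generator. The sketch below uses a compare-and-subtract accumulator, a common Bresenham-style variant, rather than the carry/delta formulation described above. Here fdr and ftr are assumed to be fd and ft divided by their gcd, and the function name enable_pattern is hypothetical.

```python
from math import gcd


def enable_pattern(fd, ft, cycles):
    """Return a list of 0/1 enable values, one per clock cycle.

    Assumption: fdr = fd / gcd(fd, ft) and ftr = ft / gcd(fd, ft).
    Each cycle adds ftr to an accumulator; when it reaches fdr, the
    accumulator wraps (subtracting fdr) and the enable is asserted.
    Over any fdr consecutive cycles exactly ftr enables are produced.
    """
    if not 0 < ft < fd:
        raise ValueError("expected 0 < ft < fd")
    g = gcd(fd, ft)
    fdr, ftr = fd // g, ft // g
    acc = 0
    out = []
    for _ in range(cycles):
        acc += ftr
        if acc >= fdr:
            acc -= fdr
            out.append(1)
        else:
            out.append(0)
    return out


# 3 enables every 10 clocks, spread as evenly as integer arithmetic allows.
pattern = enable_pattern(fd=10, ft=3, cycles=10)
print(pattern)
```

When fd is a multiple of ft, ftr reduces to 1 and the model degenerates into a plain divide-by-fdr counter, which matches the "simple divider" special case.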
It would be nice to have a peripheral that enables a CPU to measure time intervals and busy wait to delay. This peripheral would use the system clock as a reference, and has two registers:
The sizes of both registers/counters will be configurable as usual. The timer counter cannot be reset: this is a monotonic counter that only becomes lower than its previous value on natural (binary) overflow.
Proposed name: amaranth_soc.clock.MonotonicClock? Later we may add amaranth_soc.clock.RealtimeClock perhaps.
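The natural-overflow behaviour still allows interval measurement, as long as intervals are shorter than the counter period. The following is a plain-Python model, not the proposed peripheral; the class name MonotonicCounterModel and its methods are hypothetical.

```python
class MonotonicCounterModel:
    """Software model of a free-running counter of a configurable width."""

    def __init__(self, width):
        self.mask = (1 << width) - 1
        self.value = 0

    def tick(self, cycles=1):
        # The counter is never reset; it only wraps on binary overflow.
        self.value = (self.value + cycles) & self.mask

    def elapsed(self, earlier):
        # Modular subtraction gives the right interval even across one wrap.
        return (self.value - earlier) & self.mask


counter = MonotonicCounterModel(width=8)
counter.tick(250)
t0 = counter.value          # sample just before the wrap
counter.tick(10)            # the 8-bit counter wraps from 255 to 0 here
wrapped_value = counter.value
interval = counter.elapsed(t0)
```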
It would be nice to have a Wishbone-attached SRAM peripheral, to enable CPU cores to have scratchpad RAM.
Since there is an ambiguity about what to do for non-power-of-2 sizes of such RAM, I propose that they be prohibited. There is some usefulness but it's unclear what to do on out-of-bounds accesses (wrap? set err? if you set err when do you clear it?) so it's probably best to punt on this.
For the simple but very usual case of a memory range mapping a device that does not need internal decoding (ram, rom, that kind of stuff) the boilerplate is a little annoying. Specifically it is:
self.bus = wishbone.Interface(addr_width = self.ram_width-2, data_width = 32, granularity = 8, name="ram")
map = memory.MemoryMap(addr_width = self.ram_width, data_width=8, name="ram")
map.add_resource(self.ram, name="ram", size=self.ram_size)
self.bus.memory_map = map
The redundancy I suspect makes things error-prone, especially with the data_width/granularity issues between the Interface and the MemoryMap. It probably could be done in one helper function call, not sure what it should look like though.
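One possible shape for such a helper is to derive both sets of parameters from a single description, so the widths cannot drift apart. The sketch below only computes the numbers and does not call amaranth_soc. The function name ram_bus_params and the power-of-two restriction are assumptions made for illustration.

```python
def ram_bus_params(ram_size, data_width=32, granularity=8):
    """Derive consistent Interface and MemoryMap parameters for a flat RAM.

    `ram_size` is counted in units of `granularity` bits (bytes by default).
    Assumption: sizes and lane counts are powers of two, to keep the
    address-width arithmetic trivial.
    """
    if data_width % granularity:
        raise ValueError("data_width must be a multiple of granularity")
    lanes = data_width // granularity
    if lanes & (lanes - 1):
        raise ValueError("data_width / granularity must be a power of two")
    if ram_size < lanes or ram_size & (ram_size - 1):
        raise ValueError("ram_size must be a power of two covering at least one bus word")
    map_addr_width = (ram_size - 1).bit_length()
    bus_addr_width = map_addr_width - (lanes.bit_length() - 1)
    return {
        "interface": {"addr_width": bus_addr_width,
                      "data_width": data_width,
                      "granularity": granularity},
        "memory_map": {"addr_width": map_addr_width,
                       "data_width": granularity},
    }


params = ram_bus_params(ram_size=4096)
print(params)
```

For a 4096-byte RAM on a 32-bit bus with byte granularity, the bus address width comes out two bits narrower than the memory map address width, matching the `self.ram_width-2` in the snippet above.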
Issue by jfng
Wednesday Jan 22, 2020 at 14:18 GMT
Originally opened as m-labs/nmigen-soc#3
This arbiter should be compatible with all the optional features of wishbone.Interface.
The round-robin implementation could perhaps be moved to nmigen.lib and imported back here, but that would make nmigen-soc depend on a later revision than nmigen v0.1.
jfng included the following code: https://github.com/m-labs/nmigen-soc/pull/3/commits
"TODO" isn't a very convincing pitch.
The data_width attribute of a MemoryMap reflects not the actual data width, but the granularity of the associated bus. A 32-bit bus with byte lanes hence has a memory map with data_width = 8.
MemoryMap.add_window() performs the following check:
if window.data_width > self.data_width:
    raise ValueError(f"Window has data width {window.data_width}, and cannot be added to a "
                     f"memory map with data width {self.data_width}")
This restriction makes it impossible to add a window without byte lanes (e.g. with granularity 32) to a parent memory map with granularity 8, even if both of the associated buses have an actual data width of 32.
I would like to have a 32-bit wide CSR bus connected to a 32-bit CPU with byte lanes. While the byte lanes are trivial to shim, MemoryMap doesn't support this arrangement, so instead of adding the target memory map as a window, I have to work around it by copying and transforming each individual resource entry to a new memory map.
Example implementation of a simple granularity converter that silently ignores writes smaller than the target granularity: https://paste.jvnv.net/view/g35PB
Issue by jfng
Thursday Jan 23, 2020 at 12:22 GMT
Originally opened as m-labs/nmigen-soc#5
Use case:
from nmigen_soc import csr

def wrapper(width, access):
    return csr.Element(width, access, src_loc_at=1)

foo = wrapper(8, "r")
print(foo.name)
# before this commit: None
# after: foo
jfng included the following code: https://github.com/m-labs/nmigen-soc/pull/5/commits
Repro:
from nmigen.back import rtlil
from nmigen_soc import csr
csr_mux = csr.Multiplexer(addr_width=16, data_width=8, alignment=8)
csr_mux.add(csr.Element(1, "r"))
print(rtlil.convert(csr_mux))
Output:
<snip>
File "/home/jf/src/nmigen/nmigen/hdl/ast.py", line 546, in <lambda>
op_shapes = list(map(lambda x: x.shape(), self.operands))
File "/home/jf/src/nmigen/nmigen/hdl/ast.py", line 546, in shape
op_shapes = list(map(lambda x: x.shape(), self.operands))
File "/home/jf/src/nmigen/nmigen/hdl/ast.py", line 546, in <lambda>
op_shapes = list(map(lambda x: x.shape(), self.operands))
File "/home/jf/src/nmigen/nmigen/hdl/ast.py", line 546, in shape
op_shapes = list(map(lambda x: x.shape(), self.operands))
File "/home/jf/src/nmigen/nmigen/hdl/ast.py", line 546, in <lambda>
op_shapes = list(map(lambda x: x.shape(), self.operands))
File "/home/jf/src/nmigen/nmigen/hdl/ast.py", line 546, in shape
op_shapes = list(map(lambda x: x.shape(), self.operands))
File "/home/jf/src/nmigen/nmigen/hdl/ast.py", line 546, in <lambda>
op_shapes = list(map(lambda x: x.shape(), self.operands))
File "/home/jf/src/nmigen/nmigen/hdl/ast.py", line 546, in shape
op_shapes = list(map(lambda x: x.shape(), self.operands))
File "/home/jf/src/nmigen/nmigen/hdl/ast.py", line 546, in <lambda>
op_shapes = list(map(lambda x: x.shape(), self.operands))
File "/home/jf/src/nmigen/nmigen/hdl/ast.py", line 643, in shape
return Shape(self.stop - self.start)
File "<string>", line 1, in __new__
RecursionError: maximum recursion depth exceeded while calling a Python object
Dumping CSR names from a MemoryMap by iterating over .all_resources() is subject to name collisions.
These can happen in two scenarios:
class X:
    def __init__(self):
        foo = csr.Element(1, "r")
        mux = csr.Multiplexer(addr_width=1, data_width=8)
        mux.add(foo)
        self.bus = mux.bus

a = X()
b = X()
decoder = csr.Decoder(addr_width=2, data_width=8)
decoder.add(a.bus)
decoder.add(b.bus)
for elem, elem_range in decoder.bus.memory_map.all_resources():
    print(elem.name, elem_range)
# output:
# foo (0, 1)
# foo (1, 2)
In this case, we could add a name attribute to MemoryMap. Each window would then have a namespace for its resources. Resolving the full resource name would be the responsibility of the BSP generator, by walking through the memory hierarchy.
The result could look like this:
class X:
    def __init__(self, *, name=None, src_loc_at=0):
        foo = csr.Element(1, "r")
        mux = csr.Multiplexer(addr_width=1, data_width=8, name=name, src_loc_at=1 + src_loc_at)
        mux.add(foo)
        self.bus = mux.bus

a = X()
b = X()
class Y:
    def __init__(self):
        self.foo = csr.Element(1, "r")

c = Y()
d = Y()
mux = csr.Multiplexer(addr_width=1, data_width=8)
mux.add(c.foo)
mux.add(d.foo)
In this case, I don't see an easy way to disambiguate the two names. The BSP generator would have to detect this and throw an error.
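Both scenarios can be sketched with a plain-Python model of the hierarchy walk a BSP generator might do. Nothing here uses the real MemoryMap API. The nested-dict layout and the functions full_names and check_unique are hypothetical.

```python
def full_names(window, prefix=()):
    """Yield the full name path of every resource, prefixed by window names."""
    path = prefix + ((window["name"],) if window["name"] else ())
    for resource_name in window["resources"]:
        yield path + (resource_name,)
    for sub in window["windows"]:
        yield from full_names(sub, path)


def check_unique(window):
    """Return all full names, raising ValueError on the first collision."""
    seen = set()
    for path in full_names(window):
        if path in seen:
            raise ValueError(f"name collision: {'.'.join(path)}")
        seen.add(path)
    return sorted(seen)


# First scenario: two windows named "a" and "b" each hold a resource "foo".
namespaced = {"name": None, "resources": [], "windows": [
    {"name": "a", "resources": ["foo"], "windows": []},
    {"name": "b", "resources": ["foo"], "windows": []},
]}
names = check_unique(namespaced)

# Second scenario: two "foo" resources in the same window cannot be told apart.
flat = {"name": None, "resources": ["foo", "foo"], "windows": []}
try:
    check_unique(flat)
    collision_detected = False
except ValueError:
    collision_detected = True
```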
I am working on my Retro_uC. In it I combine a 32-bit M68K with two 8-bit CPUs (a MOS6502 and a Z80). I would like all three of them to access the same memory map, and I would like the M68K to be able to fetch a 32-bit word in one bus cycle.
AFAICS, currently neither the Wishbone Arbiter nor the Decoder allows the data_width of the initiator bus to be smaller than the data_width of the subordinate bus(es), even if the granularity is the same.
I see different solutions to this problem:
This issue can be assigned to me after it is clear what the preferred implementation is.
Installing the PyPI package only installs dist-info and no Python modules.