tillitis / tillitis-key1

Board designs, FPGA Verilog, firmware for TKey, the flexible and open USB security key 🔑

Home Page: https://www.tillitis.se

Makefile 5.00% Verilog 46.67% C 30.61% Assembly 0.26% C++ 1.00% Python 14.51% Shell 0.61% Dockerfile 0.53% CMake 0.81%
fpga open-hardware security-token

tillitis-key1's People

Contributors

bjoto, blaufish, cibomahto, dehanj, mchack-work, monrad-aas, quite, sallsim, secworks, sylv-io

tillitis-key1's Issues

Changes to upstream icestorm break icebram replacement

After setting up a new build environment by either following the toolchain setup or building a fresh Docker image, the application gateware build fails to replace the bram correctly:

$ make prog_flash
...
icebram -v bram_fw.hex firmware.hex < application_fpga.asc > application_fpga.asc.tmp
Loaded pattern for 32 bits wide and 1536 words deep memory.
Extracted 192 bit slices from from/to hexfile data.
Found 21 initialized bram cells in asc file.
No memory instances were replaced.
make: *** [Makefile:201: application_fpga.bin] Error 1

The issue appears to have been introduced somewhere between commits 45f5e5f3889afb07907bab439cf071478ee5a2a5 and d20a5e9001f46262bf0cef220f1a6943946e421d in icestorm: https://github.com/YosysHQ/icestorm/commits/master

Reverting icestorm back to commit 45f5e5f3889afb07907bab439cf071478ee5a2a5 seems to fix the issue, so let's pin icestorm to that commit.

It's possible that upstream yosys/nextpnr have evolved to be compatible with the changes to icebram, so this could be reconsidered when the other tools are also updated.

Hardening firmware

Make an attempt at hardening firmware:

  • Use memory and read functions that also pass size of destination buffer.
  • Introduce assert() that aborts firmware (infinite loop) when some expression doesn't hold true. Sprinkle assert() calls at important places.
  • Avoid copying UDS, even to fw_ram, since it is now byte readable.
  • Investigate protecting firmware stack by using fw_ram.

Document NVCM programming procedure

This needs to be placed somewhere in the project documentation.

First, the script needs to be run in a python virtual environment. It can use the same one as the production test, so this would be sufficient to set it up:

./run (then ctrl+c to exit the production test once it's set up)
. venv/bin/activate

Next, there are three separate steps that need to be performed for NVCM programming: write, verify, then secure. Note that the device will boot from NVCM once the write stage is completed, but it will still allow readback until the secure command is completed.

./pynvcm.py --write ../application_fpga/application_fpga.bin --my-design-is-good-enough
./pynvcm.py --verify ../application_fpga/application_fpga.bin
./pynvcm.py --secure --my-design-is-good-enough

Increase the size of FW-RAM to at least 2 kByte

We have some EBRs currently unallocated. Currently the FW-RAM is 1 kByte (256 32-bit words). If FW-RAM could be increased, FW could move its stack into FW-RAM. We should investigate the cost of at least doubling the size of the FW-RAM.

Add support for ASLR and data protection of the main RAM

Currently the main RAM has no protection beyond the fact that it is internal to the FPGA. A warm-boot attack, or any other attack that would allow the attacker to read out the contents of the SPRAM blocks would gain access to the contents, including any secrets derived by or handled by an app.

We would like to improve this by adding some sort of address space layout randomization (ASLR). We would also like to scramble the actual data values being stored in RAM. And we want to change the randomization and data scrambling every time the device starts, and also between FW and App modes.

An important limitation is that we can't add any cycles, since this would drastically lower execution performance. So whatever solution we come up with must be purely combinational and performed as part of a normal memory operation, without dropping the clock speed below 18 MHz.
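As a starting point for discussion, here is a minimal Python model of one scheme that needs only XOR and wiring, so it adds no cycles. The key widths, the 15-bit word address (assuming 128 kByte of RAM organised as 32-bit words), the class name and the rekey points are all assumptions for illustration, not a proposed final design.

```python
import secrets

ADDR_BITS = 15                      # assumption: 32768 words of 32 bits
ADDR_MASK = (1 << ADDR_BITS) - 1
DATA_MASK = 0xFFFFFFFF


class ScrambledRam:
    """Toy model of XOR-only address and data scrambling."""

    def __init__(self):
        self.cells = [0] * (1 << ADDR_BITS)
        self.rekey()

    def rekey(self):
        # Hypothetical: called at every start and when switching FW -> app.
        self.addr_key = secrets.randbits(ADDR_BITS)
        self.data_key = secrets.randbits(32)

    def _phys(self, addr):
        # XOR with a constant is a bijection, so no two addresses collide.
        return (addr ^ self.addr_key) & ADDR_MASK

    def _mask(self, addr):
        # Mix the address into the mask so equal words differ per cell.
        return (self.data_key ^ ((addr << 16) | addr)) & DATA_MASK

    def write(self, addr, word):
        self.cells[self._phys(addr)] = (word ^ self._mask(addr)) & DATA_MASK

    def read(self, addr):
        return self.cells[self._phys(addr)] ^ self._mask(addr)
```

Plain XOR is weak against an attacker who knows some of the plaintext, so this only illustrates the "no extra cycles" constraint, not the required strength.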

CI

CI for main repo: Probably at least do a complete build and run linters.

Maybe start things under Verilator.

Add watchdog core

We want to be able to handle TKey FW and apps getting stuck. We do this by adding a watchdog core.

The core when activated will count cycles from a preset value down to zero. When it reaches zero it will trigger the reset process in the FPGA design. Note that this will NOT force the FPGA device to perform configuration by reading the configuration bitstream, just reset the FPGA design already configured.

FW and SW can avoid the reset by periodically writing the 'start' bit in the watchdog core API. FW and SW can also disable the watchdog by writing to the 'stop' bit while the watchdog counter is running.
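The behaviour described above can be sketched as a small cycle-level model. This is only an illustration in Python; the class, method and attribute names are hypothetical and do not reflect the actual core or its register map.

```python
class Watchdog:
    """Toy cycle-level model of the proposed watchdog core."""

    def __init__(self, preset):
        self.preset = preset          # value loaded when 'start' is written
        self.counter = 0
        self.running = False
        self.reset_triggered = False

    def write_start(self):
        # Writing 'start' (re)loads the counter. Doing this periodically
        # keeps the watchdog from expiring.
        self.counter = self.preset
        self.running = True

    def write_stop(self):
        # 'stop' only has an effect while the counter is running.
        if self.running:
            self.running = False

    def tick(self):
        # Advance one clock cycle.
        if not self.running:
            return
        if self.counter > 0:
            self.counter -= 1
        if self.counter == 0:
            self.running = False
            # Resets the already configured design, no reconfiguration.
            self.reset_triggered = True
```

With a preset of 3, software that writes 'start' at least every second cycle never sees the reset, while software that stops writing is reset three cycles after its last write.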

Increase size of RX-FIFO

The RX FIFO in the UART currently has a capacity of 256 bytes. The EBR allocated for the RX-FIFO, however, is able to store 512 bytes. Increasing the FIFO to 512 bytes would allow at least three complete frames to be stored, increasing the number of frames in flight and improving throughput. Getting a performance increase for "free" just by using the whole EBR is nice.

The following branch contains a change to the FIFO to handle 512 bytes. A test build shows that no additional EBR is allocated.
https://github.com/tillitis/tillitis-key1/tree/bigger_rx_fifo
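The "three complete frames" figure can be checked with a quick calculation. The maximum frame size used here (one header byte plus a 128-byte payload) is an assumption about the framing protocol; adjust it if the protocol differs.

```python
# Assumption: the largest frame is 1 header byte + 128 payload bytes.
MAX_FRAME_BYTES = 1 + 128


def complete_frames(fifo_bytes, frame_bytes=MAX_FRAME_BYTES):
    """Number of maximum-size frames that fit entirely in the FIFO."""
    return fifo_bytes // frame_bytes


for size in (256, 512):
    print(size, "byte FIFO:", complete_frames(size), "complete frame(s)")
```

Under that assumption the current 256-byte FIFO holds one complete maximum-size frame and the 512-byte FIFO holds three.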

Write FPGA bitstream to NVCM

Write FPGA bitstream to NVCM inside the ice40 instead of depending on external flash.

Verify that it can be locked down by our tools.

Warm boot attack protection: Random time for when UDS is present in FW_RAM

To protect against a warm boot attack where someone manages to trap execution at a precise clock cycle and capture all contents of the EBRs, we insert a random time delay in firmware before the UDS is present in FW_RAM.

We can use the TRNG to select a random timer value, then loop and wait for the timer to expire, and then do the CDI computation, which temporarily holds the UDS in FW_RAM.

CRC in FW protocol

Add a simple CRC to the FW protocol and add CRC checking to both ends. Maybe only on the FW protocol and not the framing protocol that is also used by some device apps.

This way we might be able to detect bit errors and it might help in figuring out why queueing doesn't seem to work.
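To make the idea concrete, here is a sketch of appending and checking a CRC on a message. The choice of CRC-16/CCITT-FALSE, the big-endian two-byte trailer and the function names are assumptions for illustration; the issue does not specify which CRC to use or where it goes in the frame.

```python
def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16/CCITT-FALSE (poly 0x1021, init 0xFFFF, no reflection)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def add_crc(payload: bytes) -> bytes:
    """Sender side: append the CRC as two big-endian bytes."""
    return payload + crc16_ccitt(payload).to_bytes(2, "big")


def check_crc(message: bytes) -> bool:
    """Receiver side: recompute the CRC and compare with the trailer."""
    if len(message) < 2:
        return False
    payload, received = message[:-2], message[-2:]
    return crc16_ccitt(payload) == int.from_bytes(received, "big")
```

A bitwise loop like this is small enough for firmware. A table-driven variant would be faster at the cost of ROM space.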

Change timer API to have explicit start and stop bits to fix TOCTOU.

The current API for the timer contains a combined start/stop bit. Based on the current state (idle or running), writing to the bit will either start or stop the timer. The problem is that between reading the status and writing to this bit, the status may have changed (the timer finished). A nice example of a TOCTOU issue.
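The race can be reproduced in a small model. This is a Python sketch with hypothetical method names, not the real register interface: it shows the combined bit restarting an expired timer, while an explicit stop bit is a harmless no-op.

```python
class Timer:
    """Toy timer model with both the combined and the split API."""

    def __init__(self, cycles):
        self.preset = cycles
        self.count = 0
        self.running = False

    def tick(self):
        # One clock cycle; the timer stops by itself when it reaches zero.
        if self.running:
            self.count -= 1
            if self.count == 0:
                self.running = False

    def write_start_stop(self):
        # Current API: the meaning of the write depends on the state.
        if self.running:
            self.running = False
        else:
            self.count = self.preset
            self.running = True

    def write_start(self):
        # Proposed API: always starts.
        self.count = self.preset
        self.running = True

    def write_stop(self):
        # Proposed API: always stops, no effect when idle.
        self.running = False


def stop_if_running(timer, stop):
    status = timer.running    # time of check
    timer.tick()              # the timer may expire right here
    if status:
        stop()                # time of use
```

Calling stop_if_running(t, t.write_start_stop) on a timer with one cycle left ends with the timer running again, whereas stop_if_running(t, t.write_stop) leaves it idle as intended.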

Move CH552 firmware directory?

The firmware for the CH552 is in a non-obvious location:
/hw/boards/mta1-usb-v1/ch552_fw

Is there a better place that this could go? It's also used in the TKey design ( /hw/boards/tk1 ), so I don't think the current location makes sense any more. I put it there originally to copy how the application FPGA firmware is stored ( /hw/application_fpga/fw ).

verilator broken by 90a57c4 (pll)

90a57c4 on the pll branch broke compilation with make verilator. But perhaps it had issues before that.

%Error: /home/quite/t/tillitis-key1/hw/application_fpga/tb/application_fpga_vsim.v:172:3: Cannot find file containing module: 'reset_gen'
  172 |   reset_gen #(.RESET_CYCLES(200))
      |   ^~~~~~~~~
%Error: /home/quite/t/tillitis-key1/hw/application_fpga/tb/application_fpga_vsim.v:172:3: This may be because there's no search path specified with -I<dir>.
  172 |   reset_gen #(.RESET_CYCLES(200))
      |   ^~~~~~~~~
        ... Looked in:
             reset_gen
             reset_gen.v
             reset_gen.sv
             verilated/reset_gen
             verilated/reset_gen.v
             verilated/reset_gen.sv
%Error: Exiting due to 2 error(s)
make: *** [Makefile:155: verilator] Error 1

Add tags for PCB and Cases

For tracking hardware releases, we need to add tags that identify PCBA releases, and case plastic. Here's a proposal for how to name them:

release  tag          commit hash
mta1     PCB-v1.0.0   5c69549
mta1     CASE-v1.0.0  5c69549
tk1      PCB-v2.0.0   e71d700
tk1      CASE-v2.0.0  4995fdb

Cleanup CPU core instance configuration

In the instantiation of the PicoRV32 core we configure the core to what we want. But some of the custom configurations are the same as the default. We should only have explicit configurations in the instance where we want something different from the default. We should also analyze whether there are any other changes we want, or don't want.

[tillitis-key1/hw/boards/mta1-usb-v1/] USB-C connector might not be very durable?

Hi,

First of all, thanks for the great work.

If I may, I would like to critique the USB-C connector choice a bit. It seems you are limited in PCBA space (but then you did your own housing design), and it feels like the SMT USB-C connector used is not very durable and would come off the board pretty quickly after some insertion cycles.
Any chance of changing it to a part that offers some THT mechanical support? This can even be a pin-in-paste type, so that you do not need an additional THT solder step but can reflow the THT connector part.

Also, I noticed a few instances in the layout where the minimum gap is around 100 um. These could easily be extended to a 125 um minimum track gap to improve manufacturability a bit.

Check blake2s performance

We need to check the performance of the blake2s implementation in the firmware.
How many cycles does it take to perform a blake2s_update()?

Byte-access (to ROM, FW-RAM)

We'd like to hash/sign our firmware residing in ROM. The functions for doing that require byte-addressable memory, which ROM currently is not.

document new hardware features more extensively

We have new important hw features. Where and how to document exe-monitor, illegal instruction trap, scrambling?

We think it should be more comprehensively explained than just the API addresses being documented in header file and software.md.

We will be using the response-status bit in the framing header. Needs more docs?

Change of firmware's use of white LED has been documented in readmes that already mentioned it (as has the less flashing by tkey-programs). There are no docs about red-flashing on error.

Also release notes for everything, not to forget.

Verify device

Provide some way of authenticating that the device is a true TK1.

In software.md we mention:

#### `FW_{CMD,RSP}_VERIFY_DEVICE`

Verification that the device is an authentic Mullvad
device. Implemented using challenge/response.

but nothing is said about how to actually go about this.

Programming board: implement faster USB protocol

I'd like to upgrade the USB protocol that the programmer boards speak, in order to have drastically better flash performance. Unfortunately the protocol change is incompatible with the firmware that we currently ship on the programmer boards, so they will need to be updated for this to work.

To upgrade, at the moment you need to:

  1. Update the firmware in your programmer to the 'raw_usb' version: https://github.com/blinkinlabs/ice40_flasher/tree/raw_usb (this is already done on the production test jig pico)
  2. Update iceprog to the 'raw_usb' version: https://github.com/tillitis/icestorm/tree/raw_usb/iceprog (this should work on Linux, but will need modification again to work on macOS)
  3. Anyone using the production test tools (either reset.py, or the actual production test) will need to switch over to the raw_usb branch in tillitis-key1 as well: https://github.com/tillitis/tillitis-key1/tree/raw_usb

Ideally I'd like to merge all of these changes into the main branch on each repo, but doing so will mean that any programmers with older firmware won't work correctly, and will need to be updated.

My proposal is to make this a clean break, and to merge these changes into each of the updated branches. I added a check to iceprog, that warns the user if their programmer firmware needs to be updated. Perhaps this is enough?

Flash the key from the container (a rootless podman container)

I have been trying to figure out how to flash the device from within the build container without needing root. This issue serves both as documentation for others coming here, and a request to update the documentation around it :)

  1. Add udev rules to allow the dialout group to access the programmer board. Your user must then of course be in the dialout group:

    $ cat /etc/udev/rules.d/55-tillitis.rules
    SUBSYSTEM=="usb", ATTR{idVendor}=="cafe", ATTR{idProduct}=="4004", MODE="0666", GROUP="dialout"
    KERNEL=="hidraw*", ATTRS{idVendor}=="cafe", ATTRS{idProduct}=="4004", MODE="0666", GROUP="dialout"
    
    $ sudo udevadm control --reload-rules && sudo udevadm trigger
    

    Make sure to unplug and plug your device again after this.

    If you have SELinux on your system (for example Fedora), you will also need to allow containers to access forwarded hardware devices. Run the following:

    setsebool container_use_devices=true
    
  2. Build the container:

    podman build -t tillitis-dev contrib/
    
  3. Open hw/application_fpga/Makefile and remove all usages of sudo (It's not installed in the container, and not needed)

  4. Build the firmware and flash the device, all from within the container:

    podman run --rm \
        --device /dev/bus/usb/$(lsusb | grep -m 1 cafe:4004 | awk '{ printf "%s/%s", $2, substr($4,1,3) }') \
        -v .:/build:Z -w /build/hw/application_fpga \
        -it tillitis-dev make prog_flash
    

Add support for USS

The current version of the FW and host side does not support including a USS. The protocol, the FW and the host SW need to be updated to handle a USS.

Change name to TK1 everywhere

There are a lot of MTA1 and MTA1-MKDF everywhere in the code and filenames.

@quite started branch tk1 in:

  • qemu
  • tillitis-key1
  • tillitis-key1-apps

Add app access to BLAKE2s in FW

Add functionality to allow apps to call the BLAKE2s function in the FW for their own use.
The suggested solution is to expose the address to the function entry to the app using a readable 32-bit register.
The FW will have to get the correct address and write it to the register as part of the FW boot and app load process.

Allow FW, SW to read number of bytes in Rx FIFO

Right now, FW, SW can only know that there is at least one byte from the host in the Rx FIFO to consume. This basically means reading the status for every byte. The FIFO contains a byte counter. Exposing the counter to FW, SW in the API would allow FW, SW to find out how many bytes to extract and read them out without polling for each byte. This would be more efficient and save CPU cycles.
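The saving can be illustrated by counting register accesses in a small model. This is a Python sketch; read_count() stands in for the proposed byte-counter register and all names are hypothetical.

```python
from collections import deque


class UartRx:
    """Toy model of the RX side of the UART that counts register reads."""

    def __init__(self, data):
        self.fifo = deque(data)
        self.reg_reads = 0

    def read_status(self):
        # Existing behaviour: only tells whether at least one byte is waiting.
        self.reg_reads += 1
        return 1 if self.fifo else 0

    def read_count(self):
        # Proposed (hypothetical) register: number of bytes in the FIFO.
        self.reg_reads += 1
        return len(self.fifo)

    def read_data(self):
        self.reg_reads += 1
        return self.fifo.popleft()


def drain_polling(uart):
    """Check the status before every single byte."""
    out = bytearray()
    while uart.read_status():
        out.append(uart.read_data())
    return bytes(out)


def drain_counted(uart):
    """Read the counter once, then read that many bytes."""
    n = uart.read_count()
    return bytes(uart.read_data() for _ in range(n))
```

For n waiting bytes the polling loop needs 2n + 1 register reads, while the counted version needs n + 1.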

llvm missing from toolchain setup

Using a fresh install of Ubuntu 22.10 Desktop and following the toolchain instructions to set up the environment, the application fpga build fails with a missing llvm-size:

matt@tillitis-ubuntu:~/tillitis-key1/hw/application_fpga$ make
icebram -v -g 32 1536 > bram_fw.hex
(snip)
llvm-size firmware.elf
/bin/bash: line 1: llvm-size: command not found
make: *** [Makefile:140: firmware.bin] Error 127

Installing llvm works:

sudo apt install llvm

Not sure if this also applies to Ubuntu 22.04 LTS; will try that next.

FW: Always start app when it's fully loaded?

After a discussion with @secworks :

Instead of leaving it up to the host when to start the program by sending FW_CMD_RUN_APP we might want to automatically start the device app when it's fully loaded.

To keep the check if the app data has been transmitted correctly we might want to return the app digest in response to the last chunk of app data instead of from FW_CMD_GET_APP_DIGEST.

We might also want to use a timeout between chunks for an even smaller time window for someone to control when the device app starts.

We probably don't want to allow someone to restart app loading by setting FW_CMD_LOAD_APP_SIZE again.

Pros:

  • Less turnaround until app start.
  • Might mean less possibility for timing attacks.

Cons:

  • Need to change host programs as well.
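A rough sketch of the proposed flow, modelled in Python: the digest comes back with the response to the last chunk, and the app is started right away. It assumes the app digest is a plain BLAKE2s hash over the app data. The class and method names are hypothetical, and the timeout between chunks is left out.

```python
import hashlib


class AppLoader:
    """Toy model of loading an app in chunks and auto-starting it."""

    def __init__(self, app_size):
        # The size is fixed once; loading cannot be restarted afterwards.
        self.remaining = app_size
        self.hasher = hashlib.blake2s()
        self.started = False

    def load_chunk(self, chunk):
        """Return the app digest with the last chunk, otherwise None."""
        if self.started or not chunk or len(chunk) > self.remaining:
            raise ValueError("unexpected app data")
        self.hasher.update(chunk)
        self.remaining -= len(chunk)
        if self.remaining:
            return None
        digest = self.hasher.digest()
        self.started = True   # stands in for jumping to the loaded app
        return digest
```

The host compares the returned digest with its own hash of the app, so the integrity check survives without separate digest and run commands.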

Wishlist: a way to reset device to firmware mode

If we had a way to reset the device to firmware mode, we could get much better UX. It would be possible to load a new app without unplugging and plugging the device in again! Can this be done without compromising security?

Handling of illegal instructions

We need to at least document how the CPU in the TKey1 handles illegal instructions. Possibly we should also add functionality to signal to the user that an illegal instruction event has occurred.

Checking if TKey is running an App

Currently, if you try to send app commands when the TKey is in firmware mode, the TKey locks up and cannot load apps.

It would be nice if there were a way to check that the TKey is currently running an app so you do not need to re-flash software each time.

I/O problems

I put together a script to run signing in a loop. After some 1000-2000 iterations, the signing hangs. This was at 38400 bps.

It seemed to have hung on getSig in the signerapp. So possibly a byte was lost, and the ReadFull() hung because it never got the whole frame of the size it expected?

When trying this at 500_000 bps, it would hang after some 10s of signing iterations.

Build fails on Ubuntu 22.04

Following the toolchain setup instructions on an Ubuntu 22.04 machine (as specified in the toolchain instructions), building the application fpga fails because Clang doesn't support the rv32iczmmul architecture. I think the fix is to update the toolchain instructions to specify Ubuntu 22.10?

Stack size?

The size of the stack controls how large apps we can load into memory.
The current stack is huge at 64 k and we have only 64 k to load the
app into.

I have done experiments with as low as 8 k stack and it still works,
including the signerapp. If we go this path we can load apps that are
120 k.

A memory map like this is a bit complicated, though, since a smaller
app then will have free memory both above and below itself. If an app
wants to move itself in memory to get more unfragmented free memory to
play with we provide APP_ADDR and APP_SIZE which should be enough
for the app to move itself, right?

I suggest we just decide on a stack size, say 8 or 16 k, and run with
that. Perhaps also write an example app that moves itself around in
memory.

@quite says long ago in another organization:

When the signer app has just returned the signature to its client,
some 6380 bytes of stack has been written to -- from 0x8000ffff and
down (as set in crt0.S). Tested this by dumping the stack mem at this
point, and also inserting some deadbeefs in mem to see exactly what
got overwritten.

Also noting that when app main is entered, 691 bytes (i think) of
stack has already been used. Because the first stack var in main named
stack is at 0x8000fd48 . Not sure who eats that, the C runtime?
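The trade-off between stack size and maximum app size can be written down as a quick calculation. The 128 k RAM total is inferred from the 64 k stack plus 64 k app figures above. The function name is only for illustration.

```python
KIB = 1024
RAM_BYTES = 128 * KIB   # inferred: 64 k stack + 64 k for the app


def max_app_bytes(stack_bytes, ram_bytes=RAM_BYTES):
    """Bytes left for the app once the stack is reserved."""
    return ram_bytes - stack_bytes


for stack_k in (64, 16, 8):
    app_k = max_app_bytes(stack_k * KIB) // KIB
    print(f"{stack_k} k stack -> {app_k} k app")
```

A stack of 8 k leaves 120 k for the app, and 16 k leaves 112 k. Both sit well above the roughly 6380 bytes of stack the signer app was measured to use.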

Icebram errors when no BRAM instances are replaced

When running make prog_flash, I get this error during the build process:

python3 /Users/mpatil/Documents/Programming/Projects/tillitis-key1/hw/application_fpga/tools/makehex/makehex.py firmware.bin 1536 > firmware.hex
icebram -v bram_fw.hex firmware.hex < application_fpga.asc > application_fpga.asc.tmp
Loaded pattern for 32 bits wide and 1536 words deep memory.
Extracted 192 bit slices from from/to hexfile data.
Found 19 initialized bram cells in asc file.
Found and replaced 0 instances of the memory.
No memory instances were replaced.

How do I fix this? I'm building this on a Mac M1.

Remove UDA from documentation

The UDA has been deprecated (and has never been supported by the HW design).
Remove from software.md and elsewhere.

picorv32 built without DIVision, but extension "m" includes div/rem (rv32imc)

@bjoto wrote:

The picorv32 on the key is built with ENABLE_DIV=0, and we're compiling the source (fw/app) with the "m extension" enabled. The m-ext includes div/rem so we're just lucky that we're not getting those in the app/fw. Two options:

  • Enable division in the picorv32 (more LUTs)
  • Use the Zmmul extension instead of m (recent clang versions have support): "RISC-V Zmmul Multiply Only enables low-cost implementations that require multiplication operations but not division, and is part of the RISC-V Unprivileged Specification."

Thank you. We currently use clang 14 and it does not have Zmmul, checked with llc -march=riscv32 -mcpu=help |& grep -i zmmul. For catching this early and right now we could do llvm-objdump | grep (div|rem) or "look into the exception handling of illegal instruction in picorv32... maybe it's possible to trap and blink the leds aggressively!"
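The grep idea could be scripted roughly like this so it can run as part of the build. This is a sketch that scans disassembly text for the RV32M division and remainder mnemonics. The function name and the sample listing are made up for illustration, and the text is expected to come from something like llvm-objdump -d firmware.elf.

```python
import re

# RV32M division/remainder mnemonics. The mul* instructions are fine
# with ENABLE_DIV=0, so they are deliberately not matched.
DIV_REM = re.compile(r"\b(divu?|remu?)\b")


def find_div_rem(disassembly: str):
    """Return the disassembly lines that contain a div/rem instruction."""
    return [line for line in disassembly.splitlines() if DIV_REM.search(line)]


# Made-up sample in the general shape of an objdump listing.
SAMPLE = """\
00000000 <main>:
       0: 02b50533      mul     a0, a0, a1
       4: 02b54533      div     a0, a0, a1
       8: 02b57533      remu    a0, a0, a1
       c: 00008067      ret
"""

for line in find_div_rem(SAMPLE):
    print("illegal for ENABLE_DIV=0:", line.strip())
```

Library helpers such as __divsi3 are not flagged, since the pattern only matches the mnemonics as whole words.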

Add EXE monitor

Currently all context of a TKey application is stored in the same memory - code (text section), heap and stack. A malicious (or inventive - dynamic programming is fun.) App (or a successful injection attack) would be able to change the code loaded. It would also be possible to have the CPU execute data stored on the stack as instructions.

We would like to improve this by being able to prevent the CPU from writing to the code in App mode. We would also like to prevent execution from the stack. This is similar in concept to W^X in OpenBSD (and other OSes), that is, a given memory area allows either writes or execution, never both.

This issue deals with how we can accomplish this. And what the proper response should be when an incorrect behavior is encountered.
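As a way to reason about the rule itself, here is a toy Python model of a W^X check. The single code range, the class name and the method names are assumptions; how the range gets configured and what happens on a violation are exactly the open questions of this issue.

```python
class ExeMonitor:
    """Toy W^X check: the code range may be executed but not written,
    all other RAM may be written but not executed."""

    def __init__(self, code_start, code_end):
        # Half-open range [code_start, code_end) holding the app's text.
        self.code_start = code_start
        self.code_end = code_end

    def _in_code(self, addr):
        return self.code_start <= addr < self.code_end

    def write_allowed(self, addr):
        return not self._in_code(addr)

    def fetch_allowed(self, addr):
        return self._in_code(addr)
```

In hardware this would amount to two address comparators on the CPU's memory interface. A failed check could, for example, force the CPU into a trap state.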

Update release_notes.md

And should we have a new tagged release?

We tagged apps-repo last week, v0.0.2, but now we have changes to the timer_api. So perhaps we should tag it again?

How do we keep these in sync, and do we need to? At least as long as the hardware API is in flux (which our goal is for it not to be, of course).

Move to the zmmul subset of the RISC-V M extensions

This will allow us to bring in the PicoRV32 with ENABLE_DIV=0, and use the freed resources in the FPGA for other things.

This will require clang-15. Is this sensible to do until Ubuntu 22.04 LTS gets clang-15 -- will it get it?
