
vbe0201 / faucon


NVIDIA Falcon Microprocessor Suite

Home Page: https://vbe0201.github.io/faucon

License: Apache License 2.0

Rust 100.00%
emulator emulation assembler disassembler rust rust-lang nvidia nvidia-gpu microprocessor falcon

faucon's People

Contributors

lockna, stupremee, vbe0201


faucon's Issues

Unknown instruction with 0xF8 opcode

There is a group of instructions that use 0xF8 as their opcode. Notably, none of them take any operands, and one instruction in this group is unknown. The known instructions are mainly used for interrupt logic and for querying the DMA controller.

The instruction in question uses the 0xF8 opcode with 0x6 as its subopcode. It doesn't take any operands, and its behavior is unknown.

Proper debugger repl

Add a proper debugger command line using rustyline.

This will allow for cool things like command completion, custom keybinds, or syntax highlighting.

Unknown interrupt flags

The Falcon has a special-purpose register for various flag bits that are used and modified by certain events and instructions. While most of the bits that are actually used and implemented are known, a few still aren't.

The bits in question are the ones located at 0x1A-0x1F. It can be observed that 0x1A-0x1C are being modified on interrupt/trap delivery. The former values of these bits are copied to 0x1D-0x1F and later restored by the iret instruction.

Neither the purpose of these bits nor the actual meaning of their values is known. Any help would be greatly appreciated, since these bits cannot be implemented until that is understood.
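The observed save/restore behavior can be sketched as follows. This is a minimal illustration, assuming the flags register is 32 bits wide and that 0x1A-0x1F are bit indices; the function and constant names are hypothetical and not taken from faucon. Since it is unknown what value bits 0x1A-0x1C take on interrupt delivery, the sketch leaves them unchanged at that point.

```rust
/// Lowest bit index of the live (current) unknown bits, 0x1A-0x1C.
const LIVE_SHIFT: u32 = 0x1A;
/// Lowest bit index of the saved copy, 0x1D-0x1F.
const SAVED_SHIFT: u32 = 0x1D;
/// Both fields are three bits wide.
const FIELD_MASK: u32 = 0b111;

/// Interrupt/trap delivery: copy bits 0x1A-0x1C into bits 0x1D-0x1F.
/// The new value of the live bits is unknown, so they are left untouched.
fn save_on_interrupt(flags: u32) -> u32 {
    let live = (flags >> LIVE_SHIFT) & FIELD_MASK;
    (flags & !(FIELD_MASK << SAVED_SHIFT)) | (live << SAVED_SHIFT)
}

/// `iret`: copy bits 0x1D-0x1F back into bits 0x1A-0x1C.
fn restore_on_iret(flags: u32) -> u32 {
    let saved = (flags >> SAVED_SHIFT) & FIELD_MASK;
    (flags & !(FIELD_MASK << LIVE_SHIFT)) | (saved << LIVE_SHIFT)
}
```

Whatever the hardware writes into the live bits during the handler, a round trip through both functions brings the original three-bit value back.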

Support for more Falcon revisions

While this project explicitly targets Falcon v5 at the moment, we will want to support v0, v3, v4 and v6 at some point in the future. The most notable difference between these versions is that either some instructions did not exist at all or different types of encoding were used.

Thus, we will want faucon-asm-derive to accept a version attribute in its Instruction derive macro, which allows filtering out specific InstructionKind variants in accordance with the given version bitmask. Since faucon always depends on a configuration file no matter what command is invoked, the desired MCU revision can be extracted from there and then used for instruction lookup.
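A rough sketch of how such a version bitmask could drive instruction lookup is shown below. Everything here is an assumption for illustration: the bit assignment per revision, the `VariantMeta` struct, and the `variants_for` helper are hypothetical and do not reflect the code faucon-asm-derive generates.

```rust
/// One bit per Falcon revision (hypothetical assignment).
const V0: u8 = 1 << 0;
const V3: u8 = 1 << 1;
const V4: u8 = 1 << 2;
const V5: u8 = 1 << 3;
const V6: u8 = 1 << 4;

/// Hypothetical stand-in for the per-variant metadata of an instruction.
#[derive(Debug, Clone, Copy, PartialEq)]
struct VariantMeta {
    mnemonic: &'static str,
    opcode: u8,
    /// Bitmask of the revisions on which this variant exists.
    versions: u8,
}

/// Keeps only the variants that are available on the requested revision.
fn variants_for(table: &[VariantMeta], revision: u8) -> Vec<VariantMeta> {
    table
        .iter()
        .copied()
        .filter(|meta| meta.versions & revision != 0)
        .collect()
}
```

The revision read from the configuration file would be mapped to one of the constants once, and every later lookup would go through the filtered table.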

Assembler implementation

faucon will want, for the sake of completeness, an implementation of an assembler that assembles the intermediate representation used by this project into valid machine code.

Formal specification

File Organization

  • Input to the assembler is a text file consisting of various statements

  • One statement per line, lines are separated by the first occurrence of either a newline character (\n) or a semicolon (;)

    • Technically, ; thus makes it possible to have multiple statements in a single text line
  • Comments are supported and start with two slashes (//); the span from these characters to the EOL is considered the content of the comment

    • If a line starts with this sequence, it should be discarded
    • If a line contains this sequence, everything after it to the EOL should be stripped away
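The file organization rules above could be implemented roughly as follows. This is a sketch under two assumptions: `//` is the only comment marker, and comments are stripped before splitting on `;` so that a semicolon inside a comment does not start a new statement. The function name `split_statements` is hypothetical.

```rust
/// Splits assembler source text into trimmed, non-empty statements.
fn split_statements(source: &str) -> Vec<String> {
    let mut statements = Vec::new();
    for line in source.lines() {
        // Drop everything from the comment marker to the end of the line.
        let code = match line.find("//") {
            Some(idx) => &line[..idx],
            None => line,
        };
        // A semicolon allows several statements on one text line.
        for stmt in code.split(';') {
            let stmt = stmt.trim();
            if !stmt.is_empty() {
                statements.push(stmt.to_string());
            }
        }
    }
    statements
}
```

Empty statements are discarded right away here, so later stages only ever see directives and machine operations.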

Statements

A statement is only considered valid if one of the following scenarios applies to it:

  • An empty statement; consists of nothing but spaces, tabs or formfeed characters

    • Statements of this kind have no real meaning to the assembler and should be discarded
  • A directive statement; starts with a dot (.), followed by a directive symbol and ends with a :

    • Statements of this kind may not necessarily generate code but always invoke a specific assembler behavior
  • A machine operation statement; starts with a mnemonic of the machine opcode, optionally followed by operands

    • These intermediate representations must be translated to machine code in little-endian byteorder by the assembler
  • Furthermore, a statement remains a statement even if it is prefixed by a label

    • Labels consist of a symbol followed by a colon (:)
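One way to classify a single statement according to these rules is sketched below. The types `Kind` and `Statement` and the helper `is_symbol` are hypothetical, and the symbol grammar (ASCII alphanumerics and underscores) is an assumption since the specification does not define it. The sketch also does not enforce any terminator on directive statements.

```rust
#[derive(Debug, PartialEq)]
enum Kind<'a> {
    /// Nothing but whitespace; carries no meaning.
    Empty,
    /// Text after the leading dot of a directive statement.
    Directive(&'a str),
    /// Mnemonic plus optional operands, not parsed any further here.
    MachineOp(&'a str),
}

#[derive(Debug, PartialEq)]
struct Statement<'a> {
    label: Option<&'a str>,
    kind: Kind<'a>,
}

/// Assumed symbol grammar: non-empty, ASCII alphanumerics or underscores.
fn is_symbol(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Splits off an optional leading label and classifies the remainder.
fn classify(stmt: &str) -> Statement<'_> {
    let stmt = stmt.trim();
    let (label, rest) = match stmt.split_once(':') {
        Some((head, tail)) if is_symbol(head) => (Some(head), tail.trim()),
        _ => (None, stmt),
    };
    let kind = if rest.is_empty() {
        Kind::Empty
    } else if let Some(directive) = rest.strip_prefix('.') {
        Kind::Directive(directive)
    } else {
        Kind::MachineOp(rest)
    };
    Statement { label, kind }
}
```

A line that holds only a label, such as `main:`, comes out as a labeled empty statement, which keeps the label attached to whatever follows it in the output.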

Symbols

The following is a non-exhaustive list of symbols that may occur in directive statements:

  • equ: Assigns a value to a symbol; it consists of a symbol prefixed by a hash (#), followed by a value

  • size: Inserts a fixed value into the resulting binary; it consists of an operand size mnemonic, followed by a value

  • align: Aligns a statement to a given boundary; it consists of the boundary in bits

  • section: Defines a relocatable section of the given name; it consists of a symbol prefixed by a hash (#)

Machine Instruction

As mentioned previously, instructions consist of their opcode and optional operands. If an instruction has two or more operands, the first operand always serves as the destination storage.

The opcode is the case-insensitive literal machine instruction mnemonic to be used, e.g. mov, ADD, IOwr. It is followed by an operand size notation if the instruction is sized, or directly by the operands if the instruction is unsized. Sized instructions may operate on 8-bit, 16-bit or 32-bit quantities of the supplied operands; unsized instructions always operate on the full 32 bits. The operand size must be explicitly denoted by an operand size symbol (b8, b16, b32) between the opcode mnemonic and the operands.

The notation of the operands varies depending on the instruction:

  • Registers always start with a dollar symbol ($), followed by the name

    • general-purpose registers: r0, r1, r2, r3, r4, r5, r6, r7, r8, r9, r10, r11, r12, r13, r14, r15
    • special-purpose registers: iv0, iv1, iv2, tv, sp, pc, xcbase, xdbase, flags, cx, cauth, xtargets, tstatus
  • Immediates are denoted in standard two's complement binary notation and may either be 8, 16, 24 or 32 bits in size

  • Memory access always starts with the mnemonic for the memory space (I or D), with the expression that derives the address within brackets ([])

Examples:

// I am a pointless comment.
.equ #MAGIC_VALUE 0xDEADBEEF

add $sp -0x49  // Inline comments rock.
ld b32 $r1 D[$r10 + 0x80]
mov $r15 #MAGIC_VALUE
call #main

.align 4
main:
    ret
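Part of the operand notation described above can be sketched as a small parser. It only covers registers and immediates; memory accesses and `#symbol` references are left out. The `Operand` type and both function names are hypothetical, and the accepted immediate syntax (optional minus sign, then `0x` hexadecimal or plain decimal) is an assumption based on the examples.

```rust
#[derive(Debug, PartialEq)]
enum Operand {
    /// General-purpose register $r0 to $r15.
    Gpr(u8),
    /// Special-purpose register, identified by name.
    Spr(&'static str),
    /// Immediate value, kept signed for later range checks.
    Imm(i64),
}

const SPRS: &[&str] = &[
    "iv0", "iv1", "iv2", "tv", "sp", "pc", "xcbase", "xdbase", "flags", "cx",
    "cauth", "xtargets", "tstatus",
];

/// Parses an optionally negative hexadecimal or decimal immediate.
fn parse_immediate(tok: &str) -> Option<i64> {
    let (negative, body) = match tok.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, tok),
    };
    // Parsing the magnitude as u32 rejects a second sign and limits it to 32 bits.
    let magnitude = match body.strip_prefix("0x") {
        Some(hex) => u32::from_str_radix(hex, 16).ok()?,
        None => body.parse::<u32>().ok()?,
    };
    let magnitude = i64::from(magnitude);
    Some(if negative { -magnitude } else { magnitude })
}

/// Parses a single register or immediate operand token.
fn parse_operand(tok: &str) -> Option<Operand> {
    if let Some(name) = tok.strip_prefix('$') {
        if let Some(spr) = SPRS.iter().copied().find(|s| *s == name) {
            return Some(Operand::Spr(spr));
        }
        let digits = name.strip_prefix('r')?;
        if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
            return None;
        }
        let index: u8 = digits.parse().ok()?;
        return (index < 16).then_some(Operand::Gpr(index));
    }
    parse_immediate(tok).map(Operand::Imm)
}
```

Returning `None` for anything unrecognized keeps the sketch short; a real assembler would want an error type that records what was expected and where.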

Implementation Notes

  • Due to the fairly simple and consistent intermediate representation of a Falcon assembly input source, nom could be used to easily work out a parser, since it's a project dependency anyway

  • faucon-asm-derive could be extended to return a Vec<InstructionMeta> with all possible variants of a specific InstructionKind mnemonic that was read by the parser

  • Some matching based on the found instruction operands in assembly could be done to extract the matching InstructionMeta

  • The internally used Arguments could be extended by a write method so they can serialize themselves into machine code

  • Given the InstructionMeta, the main opcode (first byte) can be constructed accordingly, followed by all the operands

  • Good error messages are important (that's where envyas is weak)
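As an illustration of the serialization step mentioned in these notes, an immediate operand could be written out in little-endian byte order as sketched below. The function `write_immediate` is hypothetical and is not the `write` method proposed for Arguments; it simply truncates the two's complement value to the requested width without range checking.

```rust
/// Appends the low `width` bytes of `value` to `out` in little-endian order.
/// `width` is the immediate size in bytes (1, 2, 3 or 4).
fn write_immediate(out: &mut Vec<u8>, value: i32, width: usize) {
    assert!((1..=4).contains(&width), "immediates are 8 to 32 bits wide");
    // `to_le_bytes` yields the least significant byte first.
    out.extend_from_slice(&value.to_le_bytes()[..width]);
}
```

For example, writing -0xAB with a width of two bytes appends 0x55 followed by 0xFF.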

Pretty-print immediates correctly

For now, we're handling signed instruction immediate values by converting them into unsigned representations here. When working with immediate operands that are known to be signed, wrapping arithmetic operations are used to get correct results by discarding integer overflows.

As a consequence, however, it is not possible for us to pretty-print the real values of signed immediates for now. Say an instruction takes -0xAB as a 16-bit operand: it gets converted to 0xFF55, and that is the value that would be printed. Needless to say, this is bad for everyone who relies on the accuracy of the disassembler.

At least for the purpose of printing, we need to introduce an indicator that can be used to distinguish signed from unsigned values (even though both are converted to unsigned representations). In addition to the previously stated issue, Rust's default formatter doesn't print negative hex values with a sign (as you know it from Python), which is the desired output. Some manual work needs to be done there as well to produce the correct output.
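A possible shape for such a formatter is sketched below. It assumes the immediate is stored as an unsigned value together with its width in bits and a signedness indicator; the function name `format_immediate` and its parameters are hypothetical.

```rust
/// Formats an immediate stored in unsigned form.
/// `bits` is the width of the immediate (1 to 32); `signed` tells whether the
/// value should be interpreted as two's complement.
fn format_immediate(raw: u32, bits: u32, signed: bool) -> String {
    assert!((1..=32).contains(&bits), "unsupported immediate width");
    let mask = if bits == 32 { u32::MAX } else { (1u32 << bits) - 1 };
    let raw = raw & mask;
    let sign_bit = 1u32 << (bits - 1);
    if signed && raw & sign_bit != 0 {
        // Two's complement negation within the given width.
        let magnitude = (!raw).wrapping_add(1) & mask;
        format!("-{:#X}", magnitude)
    } else {
        format!("{:#X}", raw)
    }
}
```

With this, `format_immediate(0xFF55, 16, true)` produces `-0xAB`, whereas formatting a negative `i16` with `{:X}` directly would print its two's complement bit pattern.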

Unknown instruction in the iord group

The iord instruction is used to read the contents of an MMIO register in the Falcon I/O space into a destination register. This specific variant has 0xFF encoded as its opcode and 0xF as its subopcode. In certain programs, it can be observed that an unknown instruction with the same opcode and the same operand scheme is used. The only difference is that this unknown instruction has the subopcode 0xE.

The behavior of the instruction is unknown.
