pmret / papermario

Decompilation of Paper Mario

Home Page: https://papermar.io

Assembly 3.18% Python 2.77% Shell 0.04% C 94.00% C++ 0.01% Nix 0.01%
paper-mario decompilation n64 reverse-engineering nintendo

papermario's Issues

dead.h

ethteck HARD agreeing

We should create a dead.h header that is included at the top (i.e., before common.h is included) which aliases the symbols from dead_syms to their 'alive' counterparts using a bunch of #defines. This can be included in all 'dead' code units (C files referenced in splat.yaml, not .inc.cs) and removes the need for dead-specific implementations of duplicated functions like #349 adds. For non-matching adventurers in the future it will make porting the dead maps to use alive symbols as easy as removing the #include "dead.h" line.

e.g.

#define gItemTable dead_gItemTable
#define get_variable dead_get_variable
#define set_variable dead_set_variable

Preferably we would not refer to any dead symbols explicitly anywhere in src, preferring to use alive names.
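As a rough sketch of the aliasing mechanism (not the actual header), the block below shows how a #define placed before the rest of the code redirects an alive name to its dead counterpart. The function bodies and signatures are hypothetical stand-ins; only the names get_variable and dead_get_variable come from the example above.

```c
/* Stand-ins for the two copies of the same function. The bodies and
 * signatures are hypothetical and exist only for illustration. */
int get_variable(int index) { return index + 1; }        /* "alive" copy */
int dead_get_variable(int index) { return index + 100; } /* "dead" copy  */

/* What dead.h would contain: alias the alive name to the dead symbol.
 * In the real tree this would arrive via #include "dead.h", placed
 * before common.h. */
#define get_variable dead_get_variable

/* Code in a dead unit keeps using the alive name, but after the
 * #define it resolves to dead_get_variable. */
int read_var(int index) {
    return get_variable(index);
}
```

Removing the #define (or the #include "dead.h" line in a real unit) makes read_var call the alive function again without touching its body.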

Move persist-varTable function and data to a src/world/common/*.inc.c

As of creating this issue there are roughly 85 instances of this function - search for i = heap_malloc(16 - across the codebase, and we're likely to see more as we match more maps. At least one map includes this function and associated data twice.

I suggest naming this function something like N(PersistLocalVars), or similar. On first call it saves the script's varTable, and on a second call it restores the varTable and resets itself.

Here it is in arn_03:

/// Pushes/pops script local variables to D_80241C68_BE09F8
ApiStatus N(func_802412B0_BE0040)(ScriptInstance* script, s32 isInitialCall) {
    s32** ptr = &N(D_80241C68_BE09F8);
    s32 i;
    s32* test;

    if (*ptr == NULL) {
        i = heap_malloc(16 * sizeof(s32));
        *ptr = (s32*) i;
        for (i = 0, test = *ptr; i < 16; i++) {
            *test++ = script->varTable[i];
        }
    } else {
        for (i = 0, test = *ptr; i < 16; i++) {
            script->varTable[i] = *test++;
        }
        ptr = &N(D_80241C68_BE09F8);
        heap_free(*ptr);
        *ptr = NULL;
    }
    return ApiStatus_DONE2;
}
s32* N(D_80241C68_BE09F8) = NULL;

Notice that this include will need to be both data and a function. This means ordering is important: it's not as simple as removing both the data and the function and adding a #include. (There are a bunch of other includes like this that should bring along data such as scripts but are hard to add, for example the texture-panning functions and scripts.)

It could also work to have the data be a static variable inside the function body, i.e. static s32** ptr = NULL.
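A minimal sketch of the static-variable variant follows, under some loud assumptions: ScriptInstance is reduced to just its varTable, malloc/free stand in for heap_malloc/heap_free, the function returns void instead of an ApiStatus, and the static is an s32* (the saved copy itself) rather than the s32** written above. Whether this form would match the original codegen is untested.

```c
#include <stdlib.h>

typedef int s32;

/* Minimal stand-in for the real script struct; only varTable matters here. */
typedef struct ScriptInstance {
    s32 varTable[16];
} ScriptInstance;

/* Name suggested in the issue. The first call saves the script's varTable;
 * the second call restores it, frees the copy and resets itself. */
void PersistLocalVars(ScriptInstance* script) {
    static s32* saved = NULL; /* replaces the separate N(D_...) data symbol */
    s32 i;

    if (saved == NULL) {
        saved = malloc(16 * sizeof(s32)); /* heap_malloc in the real code */
        if (saved == NULL) {
            return; /* allocation failed; nothing was saved */
        }
        for (i = 0; i < 16; i++) {
            saved[i] = script->varTable[i];
        }
    } else {
        for (i = 0; i < 16; i++) {
            script->varTable[i] = saved[i];
        }
        free(saved); /* heap_free in the real code */
        saved = NULL;
    }
}
```

Keeping the pointer inside the function would sidestep the data/function ordering problem, at the cost of the symbol no longer being visible outside the function.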

Tracking issue for DSLs

Known DSLs we need macros and disassemblers for:

  • si scripts (evt)
  • sparkle scripts
  • hud element animations (AKA icon scripts)
  • entity scripts
  • entity model scripts
  • item entity scripts
  • model animations
  • sprite animations (supported in star rod xml, icky)

...this game is ridiculous

Convert DSL Scripts to Macros

See #562 for examples.

Basically use update_evts.py on the file you want to convert and do some manual cleanup afterwards. Search for Script({ in VSC to find the files that have to be converted.

Investigate Rodata Alignment Issues

There are some maps / C files where we have to create explicit rodata vars just to get alignment to work properly. Sometimes rodata seems 0x10 aligned (maps with manually-created rodata_alignment vars) and sometimes it seems 0x8 aligned (jump tables right next to each other). What's going on?

Use hex for enums

Enum values (= xyz) and, in the case of dummy entries, names should use hex rather than decimal.
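For illustration only, this is roughly what the convention could look like. The enum and its member names are hypothetical; the only value taken from elsewhere on this page is 0x157, which another issue identifies as the coin item ID.

```c
/* Hypothetical enum; names are illustrative, not taken from enums.h. */
enum ExampleItemIDs {
    EXAMPLE_ITEM_NONE   = 0x000,
    EXAMPLE_ITEM_UNK_0A = 0x00A, /* dummy entry: the name carries the hex value */
    EXAMPLE_ITEM_UNK_1F = 0x01F, /* rather than UNK_31 = 31 */
    EXAMPLE_ITEM_COIN   = 0x157
};
```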

Deduplicate item refund funcs

I think these are duplicated across every item except for a couple of constants. With some macros, and now that we have BSS support, we can deduplicate them. I think there could be an inc.c for refund stuff with the BSS symbol for the icon, which always seems to be at the beginning of every item.

Resolve asset handling for maps

Following maps have assets/models/etc. included in them that have to be handled by creating sub segments in Splat:

  • dgb_01
  • dro_02
  • iwa_01
  • kmr_02
  • kmr_22
  • kzn_19
  • pra_31

There are probably a bunch more of these that have to be handled, and the list should be updated accordingly, if possible.

Merge version-specific asset directories

Related to #383.

It's been suggested that we merge common assets in assets/us and assets/jp into an assets/core directory that, similar to src, only uses different asset files (in version-specific directories) for each version where necessary.

That is, the version stack for us will look like:

- core
- us

And jp:

- core
- jp

(we may want to have core hold all us things, I'm unsure)

Shiftability

From @ethteck on Discord:

Disassemble all data

basically anything in the splat.yaml that's "data" (no dot before it) or "bin" is asm data or just binary data, respectively
often, the asm data contains pointers that need to have actual symbol names (D_12341234 instead of 0x12341234)
so like, figuring those out is not trivial sometimes. they could be the start of a segment, they don't have to be an actual c variable necessarily
as for the bin stuff, we need to figure out what the binary data actually is and handle it appropriately. sometimes it's C data, sometimes it's like a custom format for something.

Correct mistaken pointers in c files and data (#277)

this one is trickier. sometimes two symbols have the same value, but one is technically more correct than the other
there are times when we DMA an overlay, and one of the things the function call wants is the RAM of the overlay. but instead, we're giving it the first function. which just happens to have the same ram address as the start ram address of the overlay, because it's currently the first thing there. but what happens if we write a function above it? suddenly we're not pointing at the beginning of the overlay anymore but just some random spot in the middle of it
these kinds of things are probably going to come up as we start being able to shift here and there, and we'll fix them as we go

Clear undefined_syms

okay so this one is probably the most important atm, and I saved it for last
undefined_syms includes all the symbols that are referenced in our asm files and c files but aren't actually declared anywhere
the idea is to declare them properly and then remove the entry from undefined_syms
a lot of these, if not all, are bss variables. bss vars are weird in that they take up no space in the rom and are at the end of each segment. they're supposed to be defined but not initialized, so like s32 someVar; would presumably end up in BSS
all of the BSS vars should be declared in their proper files so we can remove the undefined_syms entry for them
however, we can't just get rid of the undefined_syms entry ...for reasons I can explain later. so for now we're throwing all the known bss symbols into main_bss_syms.txt
so the work one could do is ...find BSS vars that aren't declared and declare them in the c files that they belong to
everything in main_bss_syms is from the first (main) segment, up until like 0x75000 or wherever it stops. so you could start with those if you wanted

Write a tool to merge .inc.c files

It's likely that we have .inc.c files that are always included together and can therefore be merged into the same file, especially in cases of related NPC AI funcs.

This tool should look at all C files and determine includes that can be 'grouped', e.g. if foo.inc.c, bar.inc.c and baz.inc.c are always included together and in the same order in all C files, then they can be grouped.

#include "foo.inc.c"
#include "bar.inc.c"
#include "baz.inc.c"
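The core check such a tool needs can be sketched as below. This is only one possible approach: it assumes the include lists have already been extracted from each C file into NULL-terminated arrays, and the function name always_adjacent is made up for this example. A run like foo, bar, baz is groupable when each consecutive pair passes the check.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if, across all include lists, `a` is always immediately
 * followed by `b` and `b` is always immediately preceded by `a`, and the
 * pair occurs at least once; returns 0 otherwise.
 * Each list is a NULL-terminated array of include names for one C file. */
int always_adjacent(const char* const* const* files, size_t nfiles,
                    const char* a, const char* b) {
    size_t f, i;
    int seen = 0;

    for (f = 0; f < nfiles; f++) {
        const char* const* inc = files[f];
        for (i = 0; inc[i] != NULL; i++) {
            if (strcmp(inc[i], a) == 0) {
                /* `a` must be directly followed by `b` */
                if (inc[i + 1] == NULL || strcmp(inc[i + 1], b) != 0) {
                    return 0;
                }
                seen = 1;
            }
            if (strcmp(inc[i], b) == 0) {
                /* `b` must be directly preceded by `a` */
                if (i == 0 || strcmp(inc[i - 1], a) != 0) {
                    return 0;
                }
            }
        }
    }
    return seen;
}
```

A real tool would still have to parse the #include lines out of each file and decide what the merged file should be called; neither is covered here.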

Rename string funcs to msg

e.g. draw_string -> draw_msg

"string" is a leftover from Star Rod's naming conventions. "msg" is consistent with TTYD symbols (msgDrv.o) and with the pm_msg segment we already have.

itemEntity struct and 2 functions

While helping my friend out with his emulator, I found some info on the itemEntity struct and 2 functions. We used a dump of the RAM in his emulator to see what was going on.

The field currently named unk_34 seems to hold only 4 byte sized values. There is a function called update_item_entities, that should update item entities. In the ROM this is located at 900c85ec. This function is copied to RAM location 80131eec. In a for loop, this function checks if the itemID of some itemEntity structs is equal to 0x157, which from https://tcrf.net/Notes:Paper_Mario I gathered were coins. If they are coins, there is a 10% chance some values get updated in the struct, which I believe are values related to a (sparkling) animation sequence. The more interesting function call is that to FUN_90130acc, which is nothing, but after being copied to RAM this is a call to 80130acc corresponding to a function at 900c71cc. The decompilation for this function should be something like

void do_animation(ItemEntity* item_entity) {
    if (--item_entity->frames_left < 1)  {  // Ghidra says < 1, presumably this is just == 0 and the field is an unsigned integer
        do { } while (next_step(item_entity));
    }
}

The frames_left field is the field at offset 0x3c in the itemEntity struct. Essentially, every animation sequence step (I am just guessing it is an animation sequence, it might be some other sequence) lasts for a certain amount of frames.
The next_step function in ROM is a call to 90130a04, which again points to nothing, but in RAM this is a call to 80130a04, corresponding to 900c7104 in the ROM. This function essentially does one step in the animation and returns whether a next step should be taken. It should look something like this:

int next_step(ItemEntity *item_entity)
{
  undefined4 uVar1;
  int *current_state_ptr;
  uint *next_ptr;
  
  current_state_ptr = item_entity->current_state_ptr;
  next_ptr = (uint *)(current_state_ptr + 1);  // this pointer has a different meaning depending on the state
  switch(*current_state_ptr) {
  case 0:  // this is an error state it seems, and will hang the above function
    return 1;
  case 1:
    item_entity->frames_left = *next_ptr;
    uVar1 = current_state_ptr[2];
    item_entity->current_state_ptr = current_state_ptr + 3;
    item_entity->field_0x44 = uVar1;
    break;
  case 2:
    item_entity->current_state_ptr = item_entity->sequence_start;
    return 1;
  case 3:
    item_entity->sequence_start = (int *)next_ptr;
    item_entity->current_state_ptr = (int *)next_ptr;
    return 1;
  case 4:
    item_entity->current_state_ptr = current_state_ptr + 2;
    return 1;
  case 5:
  case 6:
    break;
  case 7:
    item_entity->frames_left = *next_ptr;
    item_entity->field_0x4c = (int *)current_state_ptr[2];
    item_entity->field_0x50 = (int *)current_state_ptr[3];
    item_entity->field_0x54 = current_state_ptr[4];
    uVar1 = current_state_ptr[5];
    item_entity->current_state_ptr = current_state_ptr + 6;
    item_entity->field_0x58 = uVar1;
    break;
  default:
    return 0;
  }
  return 0;
}

Essentially, current_state_ptr (field offset 0x40) points to a struct whose layout depends on the state. In all cases it starts with an int identifying the state. In the case of 0 it is just an int, and the do_animation function will hang. In the case of 1, it looks like

struct sequence_state_1 {
    int state; // is always 1
    uint frames_left; // amount of frames it stays in this state
    int field_0x44; // struct field 0x44 is set to this value
};

Case 2 resets the animation; the struct is then also just an int. Case 3 switches to a new animation, and the struct looks like

struct sequence_state_3 {
    int state;  // is always 3
    int* next_animation;
};

Case 4 simply advances the state. The struct does hold an extra 32-bit piece of information, but it does not seem to be used here. Cases 5 and 6 do nothing, but will not hang the do_animation function. Case 7 is the most interesting one; it holds a struct that looks like

struct sequence_state_7 {
    int state;  // always 7
    uint frames_left;  // amount of frames it stays in this state
    void* pointer_0x4c;  // pointer to 0x20 bytes of data
    void* pointer_0x50;  // pointer to at least 0x20 bytes of data
    int field_0x54;
    int field_0x58;
};

Of course, the integers showing the state are likely some enum. The struct sizes can be checked by the amount that item_entity->current_state_ptr is advanced (in strides of 4 bytes). An example of such a sequence can be found in ROM at 9009df70, copied to RAM at 80104ac0. This is a sequence that starts at state 4, then state 7 a few times and then state 0 (which presumably the N64 never reaches, have not figured this out yet). Looking at the values for the pointers of field 0x4c and 0x50, 4c seems to point at 0x20 bytes of data, and 0x50 at some multiple of 0x20.

Because of this, I also think that the itemEntity struct will look something like

typedef struct ItemEntity {
    /* 0x00 */ s32 flags;
    /* 0x04 */ s16 boundVar; /* see make_item_entity */
    /* 0x06 */ char unk_06[2];
    /* 0x08 */ Vec3f position;
    /* 0x14 */ struct ItemEntityPhysicsData* physicsData;
    /* 0x18 */ s16 itemID; /* into item table, also worldIconID */
    /* 0x1A */ u8 state;
    /* 0x1B */ u8 type;
    /* 0x1C */ u8 pickupDelay; /* num frames before item can be picked up */
    /* 0x1D */ char unk_1D;
    /* 0x1E */ s16 wsFaceAngle; /* < 0 means none */
    /* 0x20 */ s16 shadowIndex;
    /* 0x22 */ char unk_22[2];
    /* 0x24 */ u32* readPos;
    /* 0x28 */ u32* savedReadPos;
    /* 0x2C */ char unk_2C[2];
    /* 0x2E */ u8 unkCounter;
    /* 0x2F */ s8 unk_2F;
    /* 0x30 */ f32 scale;
    /* 0x34 */ u32 unk_34;
    /* 0x38 */ u32 unk_38;
    /* 0x3c */ u32 frames_left;
    /* 0x40 */ u32* current_state_ptr;
    /* 0x44 */ u32 unk_44;
    /* 0x48 */ u32* sequence_start_ptr;
    /* 0x4c */ u32* unk_4c;  // 32 bytes of data
    /* 0x50 */ u32* unk_50;  // 32 bytes or multiple of 32 bytes of data
    /* 0x54 */ u32 unk_54;
    /* 0x58 */ u32 unk_58;
} ItemEntity; // size = 0x5C
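Since the state values are likely an enum, the findings above could be summarised as in the sketch below. All names here are hypothetical guesses at meaning; only the numeric values and the word counts (how far current_state_ptr advances, in 4-byte strides) are taken from the decompiled function and the struct descriptions.

```c
/* Hypothetical names for the state values observed in the switch. */
enum SequenceState {
    SEQ_STATE_HANG      = 0x0, /* hangs do_animation if reached */
    SEQ_STATE_WAIT      = 0x1, /* sets frames_left and field 0x44 */
    SEQ_STATE_RESTART   = 0x2, /* jumps back to the sequence start */
    SEQ_STATE_SET_START = 0x3, /* switches to a new sequence */
    SEQ_STATE_SKIP      = 0x4, /* advances past one unused word */
    SEQ_STATE_NOP_5     = 0x5,
    SEQ_STATE_NOP_6     = 0x6,
    SEQ_STATE_WAIT_DATA = 0x7  /* sets frames_left and fields 0x4c-0x58 */
};

/* Number of 32-bit words each state's struct occupies in the sequence
 * data, or 0 where the size has not been established (states 5 and 6,
 * and anything unknown). */
int sequence_state_words(int state) {
    switch (state) {
        case SEQ_STATE_HANG:      return 1;
        case SEQ_STATE_WAIT:      return 3;
        case SEQ_STATE_RESTART:   return 1;
        case SEQ_STATE_SET_START: return 2;
        case SEQ_STATE_SKIP:      return 2;
        case SEQ_STATE_WAIT_DATA: return 6;
        default:                  return 0;
    }
}
```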

building on ARM host fails to match compiled US rom

The build fails when verifying the built US ROM on an aarch64 (Raspberry Pi) host. x86 hosts are not affected.
I suspect the cause is the arm64-built C compiler tools/arm/cc1, or possibly some matched snippets of code.

Update disasm_script.py

Update the script to better work with the new syntax, as a few Macros are currently receiving faulty data or are missing relevant data from enums.h

  • GotoMap: An EVT_CALL that calls GotoMap will have a faulty pointer as the first argument, one that doesn't resemble a map string. Most likely this is a disassembly issue where it points to some function instead of the rodata containing the string.
  • enums.h: Most of the new enums that have been created in any earlier PR have not been included into the disasm process and are thus not replaced correctly.
    • StoryProgress: Does not seem to be working when used in a Switch-Case currently. See arn_03/header.c and arn_04/header.c for an example
    • Missing elements: Keep the value in its original format (dec or hex) if the respective enum doesn't have a fitting element, instead of turning them all into decimals.
  • Symbols: Output the symbol name when using the script directly like EvtSource D_80242504 = {...

List is most definitely incomplete and will be updated as new issues arise.

Investigate linking overlays separately

Related to #438, since it should also fix that issue.

There is some evidence that overlays were linked by themselves, not together at the end. This evidence includes:

  • Existence of dead references. If everything was linked at once, this would not be possible
  • Requirement for the namespace macro N where static does not work. It can be presumed that the original developers did not do this

My proposal is that we Do As The Devs Did. I'm unsure if this is possible with GNU tooling (e.g. was it a Nintendo thing?) but we need to figure out how to:

  • Link the core game together
  • Link overlays (maps) against the core game as if it was a library, throwing away all but a select few 'public' symbols needed by core datastructures (e.g. MapConfig and the init function for maps)
  • Produce an ELF at the end

Doing this would make bad references like #277 not compile, which is good.

Fix warnings

Compiling the rom emits a ton of warnings that we should be able to fix - mostly implicit casts that should really be made explicit.
Additionally it would be nice to be able to enable -Wimplicit -Wredundant-decls so headers are used properly.

Once this is done we can enable -Werror on Jenkins so people fix their warnings as they match stuff :shipit:

Consume releases from gcc-papermario

Instead of storing build/linux, mac, etc dirs in the repo that contain the GCC 2.8.1 compiler, grab the correct architecture's compiler as part of the install setup from the releases of gcc-papermario based

Give (5.0f / 7.0f) a macro

I'm thinking SPRITE_PIXEL_SCALE or something - IIRC I read somewhere in the Star Rod source that this (roughly 0.7f) is how many in-game units large a single pixel on a sprite is by default, and it's used in a number of places e.g. #572
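A sketch of what the macro might look like. SPRITE_PIXEL_SCALE is the name floated above and is not confirmed to exist in the codebase; the helper function is hypothetical and only demonstrates the conversion.

```c
/* Proposed name from the issue: in-game units per sprite pixel (5/7). */
#define SPRITE_PIXEL_SCALE (5.0f / 7.0f)

/* Hypothetical helper: convert a sprite dimension in pixels to in-game
 * units using the default scale. */
float sprite_pixels_to_units(float pixels) {
    return pixels * SPRITE_PIXEL_SCALE;
}
```

Call sites that currently spell out (5.0f / 7.0f) would use the macro directly; keeping the expression as a division inside the macro, rather than a rounded decimal literal, leaves the constant unchanged for the compiler.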

Merge jp and us source directories

At the moment jp's splat.yaml defines src/jp/ as its root source directory. Ideally it should be src - all the existing segment names need to be prefixed with jp/ to keep the existing structure, though - besides the first C file, since it's identical to is_debug from us.

Handle dead symbols with dark magic

See also: #432

A few functions, such as func_80242EC4_EA37C4, use a dead version of the libultra sqrtf, but codegen does not match with the typical macro magic we do with dead.h currently. I believe this is because sqrtf is an intrinsic function and gets optimised differently.

If this is the case, there is no way for us to tell the compiler to treat "dead_sqrtf" as if it were the real sqrtf. These functions cannot be matched right now.

My proposed solution is:

  • Use sqrtf in the source code
  • Remove dead.h, compile 'dead' maps normally
  • Perform objcopy dark magic to reroute function X to dead_X

Consider using LW, GW, GSW, etc. macros instead of Star-Rod style SI_VAR in scripts

Pros:

  • matches TTYD symbols
  • mnemonics are more succinct
  • there's a debug-print operation in scripts that prints out the mnemonic (most important imo)

Cons:

  • goes against Star Rod's convention of Var, AreaVar, Flag, etc.
  • this could confuse modders initially (but we could define both styles of macro?)

This is extra-relevant if we're moving to macro-based scripts.

@ethteck thoughts?

Migrate map and battle data to C

This is the first pass of map/battle data work. Future passes will involve disassembling scripts and other structs, moving any assets into actual files (and not including them in raw form in the repo!), documenting, etc.

This can probably be achieved with a simple script that takes in a .data.s or .rodata.s file and spits out an equivalent .c file, with each piece of data being an s32 array.

Move everything out of functions.h/variables.h

functions.h and variables.h have always been a temporary measure to make it quicker to match stuff. They define a ton of random functions and variables that should really be declared elsewhere.

The not-yet-enabled warnings -Wimplicit -Wredundant-decls described in #366 are relevant here too, as they will enforce good use of header files.

Additionally, we should move all declarations out of .c files and into .h files.


Steps to migrate functions from functions.h:

  1. Pick a C file with nonstatic functions you'd like to make a header for
  2. Create a header file (.h) next to the C file with an include guard
  3. Add an extern declaration for global variables defined in the C file (usually marked with g at the start of the variable name)
  4. Add a declaration for every non-static function in the C file
  5. Remove declarations from functions.h and variables.h
  6. Add a #include "myfoo.h" to every file that uses variables or functions declared in your header

Include guard example

#ifndef _MYFOO_H_
#define _MYFOO_H_

// ...

#endif
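Putting steps 2 to 4 together, a header might look like the sketch below. The names gMyFooCounter and myfoo_increment are hypothetical, and the contents of myfoo.c are shown in the same block only so the example is self-contained.

```c
/* --- myfoo.h --- */
#ifndef _MYFOO_H_
#define _MYFOO_H_

/* Step 3: extern declaration for a global defined in myfoo.c
 * (gMyFooCounter is a hypothetical name). */
extern int gMyFooCounter;

/* Step 4: one declaration per non-static function in myfoo.c
 * (myfoo_increment is a hypothetical name). */
int myfoo_increment(int amount);

#endif

/* --- myfoo.c (would normally begin with #include "myfoo.h") --- */
int gMyFooCounter = 0;

int myfoo_increment(int amount) {
    gMyFooCounter += amount;
    return gMyFooCounter;
}
```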

only Debian/Ubuntu is supported for building rom

Only Linux distros that provide the cpp-mips-linux-gnu package are supported right now, because the KMC GCC compiler executes mips-linux-gnu-cpp, which is not available on some distros. You can fix this by asking Pink Horned Man White Nose to use the host's cpp or, if he doesn't want to, by using my version instead. (I can say more about it if you agree to use my version of his.)

Build running on Arch Linux:

[screenshot: 2022-01-18_16-49]

Figure out global loading nonsense

Example funcs:
gravity_use_fall_params
DisablePulseStone

We often have to create redundant temps for globals in functions.

PlayerStatus* playerStatus = &gPlayerStatus;
PlayerStatus* playerStatus2 = &gPlayerStatus;

tools/cc1 is an x86 binary

The executable tools/cc1 is an x86 binary, which means it can't run in WSL. Could an x64 binary be provided in the next commit, or the source code for building an x64 version?

Remove global-ref macros like BATTLE_STATUS

The layer of abstraction is pointless and arguably confusing for people new to the codebase.

- BattleStatus* battleStatus = BATTLE_STATUS;
+ BattleStatus* battleStatus = &gBattleStatus;

Type naming

Continuing from this note, it looks like a number of us would prefer that we use CamelCase for types (although, I assume, value types like s32 can stay as they are from ultra64.h).
Currently we mostly(!) use snake_case, following Star Rod's lead, even though SR is also inconsistent in places.

It might be worth considering how we currently name enums, too - the SCREAMING_CAPS for enum members - as they're constants - is fine, in my opinion, but the actual enum type is less awesome. Similarly, the current discrepancy between the type npc (struct npc) and the enum type NPC (which is a union between npc* and some special values) is a bit awkward.

Parameterise all flags as enums

We currently parameterise only some flags, like NPC flags, in enums.h. Ideally, we'd do this for all flag types, such as that of BattleStatus.
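As an illustration of the idea, the sketch below names individual bits in an enum and tests them through a small helper. Every identifier and bit assignment here is hypothetical; the real BattleStatus flags would need to be worked out from the code that reads them.

```c
/* Hypothetical flag names and bit positions, for illustration only. */
enum ExampleBattleStatusFlags {
    EXAMPLE_BS_FLAG_1 = 0x00000001,
    EXAMPLE_BS_FLAG_2 = 0x00000002,
    EXAMPLE_BS_FLAG_4 = 0x00000004
};

/* Minimal stand-in for a struct carrying a flags word. */
typedef struct ExampleBattleStatus {
    int flags;
} ExampleBattleStatus;

/* Returns 1 if every bit in `flag` is set in status->flags, else 0. */
int example_has_flag(const ExampleBattleStatus* status, int flag) {
    return (status->flags & flag) == flag;
}
```

With named members, a check such as `status->flags & 0x4` becomes `example_has_flag(status, EXAMPLE_BS_FLAG_4)`, or the equivalent bitwise expression with the enum name.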
