pmret / papermario
Decompilation of Paper Mario
Home Page: https://papermar.io
We should create a dead.h header that is included at the top (i.e., before common.h is included) which aliases the symbols from dead_syms to their 'alive' counterparts using a bunch of #defines. This can be included in all 'dead' code units (C files referenced in splat.yaml, not .inc.c files) and removes the need for dead-specific implementations of duplicated functions like the ones #349 adds. For non-matching adventurers in the future, it will make porting the dead maps to use alive symbols as easy as removing the #include "dead.h" line.
e.g.
#define gItemTable dead_gItemTable
#define get_variable dead_get_variable
#define set_variable dead_set_variable
Preferably we would not refer to any dead symbols explicitly anywhere in src, preferring to use alive names.
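A minimal sketch of how the aliasing would behave, assuming two copies of the same function exist under an alive and a dead name. The function bodies, their return values, and dead_map_read_var are made up for illustration; only the #define pattern comes from the proposal above.

```c
#include <assert.h>

/* Hypothetical stand-ins: in the real project these would be two copies of
 * the same engine function living at different addresses. */
int get_variable(int index)      { return 1000 + index; } /* "alive" copy */
int dead_get_variable(int index) { return 2000 + index; } /* "dead" copy  */

/* --- what dead.h would contain (included before any other header) --- */
#define get_variable dead_get_variable

/* --- a 'dead' map file, written purely with alive names --- */
int dead_map_read_var(void) {
    /* The preprocessor rewrites this call to dead_get_variable(5). */
    return get_variable(5);
}

/* Small self-check showing which copy was actually called. */
void demo_dead_alias(void) {
    assert(dead_map_read_var() == 2005);
}
```

Deleting the #define line (the equivalent of dropping the include) makes dead_map_read_var call the alive copy without touching the map code.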
As of creating this issue there are roughly 85 instances of this function across the codebase (search for "i = heap_malloc(16" to find them), and we're likely to see more as we match more maps. At least one map includes this function and associated data twice.
I suggest naming this function something like N(PersistLocalVars), or similar. On first call it saves the script's varTable, and on a second call it restores the varTable and resets itself.
Here it is in arn_03:
/// Pushes/pops script local variables to D_80241C68_BE09F8
ApiStatus N(func_802412B0_BE0040)(ScriptInstance* script, s32 isInitialCall) {
    s32** ptr = &N(D_80241C68_BE09F8);
    s32 i;
    s32* test;
    if (*ptr == NULL) {
        i = heap_malloc(16 * sizeof(s32));
        *ptr = (s32*) i;
        for (i = 0, test = *ptr; i < 16; i++) {
            *test++ = script->varTable[i];
        }
    } else {
        for (i = 0, test = *ptr; i < 16; i++) {
            script->varTable[i] = *test++;
        }
        ptr = &N(D_80241C68_BE09F8);
        heap_free(*ptr);
        *ptr = NULL;
    }
    return ApiStatus_DONE2;
}
s32* N(D_80241C68_BE09F8) = NULL;
Notice that this include will need to be both data and a function. This means ordering is important: it's not as simple as removing both the data and the function and adding a #include. (There are a bunch of other includes like this that should bring along data such as scripts but are hard to add, for example the texture-panning functions and scripts.)
It could also work to have the data be a static variable inside the function body, i.e. static s32** ptr = NULL.
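A rough sketch of the static-variable variant, written so it compiles on its own. The ScriptInstance stand-in, the persist_local_vars name, the use of malloc/free in place of heap_malloc/heap_free, and the 1/0 return value are all assumptions for illustration. Whether a function-local static would still match the original data layout is not something this sketch can show.

```c
#include <assert.h>
#include <stdlib.h>

typedef int s32;

/* Minimal stand-in for the real ScriptInstance; only varTable is modelled. */
typedef struct ScriptInstance {
    s32 varTable[16];
} ScriptInstance;

/* First call saves the 16 local vars; the next call restores them and resets. */
int persist_local_vars(ScriptInstance* script) {
    static s32* saved = NULL; /* takes the place of the separate data symbol */
    s32 i;

    if (saved == NULL) {
        saved = malloc(16 * sizeof(s32));
        if (saved == NULL) {
            return 0; /* allocation failed; nothing was saved */
        }
        for (i = 0; i < 16; i++) {
            saved[i] = script->varTable[i];
        }
    } else {
        for (i = 0; i < 16; i++) {
            script->varTable[i] = saved[i];
        }
        free(saved);
        saved = NULL;
    }
    return 1;
}

void demo_persist_local_vars(void) {
    ScriptInstance script = { { 0 } };
    script.varTable[3] = 42;
    persist_local_vars(&script); /* save */
    script.varTable[3] = -1;     /* clobber */
    persist_local_vars(&script); /* restore and reset */
    assert(script.varTable[3] == 42);
}
```

Because the saved pointer lives inside the function, the include no longer has to bring a separate data definition along with it, which sidesteps the ordering problem described above.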
Known DSLs we need macros and disassemblers for:
...this game is ridiculous
We have a function, gfx_init_state, that we just can't match. We think the compiler or settings might be wrong. Let's investigate and share progress here.
https://docs.google.com/spreadsheets/d/1-2mmsy7v3NJZorGnP6liRSaT-66C6mMc60NVz-lEQSQ/edit#gid=0
See #562 for examples.
Basically use update_evts.py on the file you want to convert and do some manual cleanup afterwards. Search for Script({ in VSC to find the files that have to be converted.
There are some maps / c files where we have to create explicit rodata vars just to get alignment to work properly. Sometimes rodata seems 0x10 aligned (maps with manually-created rodata_alignment vars) and sometimes it seems 0x8 aligned (jump tables right next to each other). What's going on?
soon
Fixing this issue will allow much quicker iteration when reorganizing the layout of source files, such as matching libultra functions.
See #224
Enum values ( = xyz) and names, in the case of dummy stuff, should use hex and not decimal
I think these are duplicated across every item except for a couple of constants. With some macros, and now that we have BSS support, we can deduplicate them. I think there could be an inc.c for refund stuff with the bss symbol for the icon. It seems to always be at the beginning of every item.
and use these in functions when temps for these are needed and used as ints / pointers so we don't have to cast
Following maps have assets/models/etc. included in them that have to be handled by creating sub segments in Splat:
There are probably a bunch more of these that have to be handled, and the list should be updated accordingly, if possible.
Related to #383.
It's been suggested that we merge common assets in assets/us and assets/jp into an assets/core directory that, similar to src, only uses different asset files (in version-specific directories) for each version where necessary.
That is, the version stack for us will look like:
- core
- us
And jp:
- core
- jp
(we may want to have core hold all us things, I'm unsure)
This can give us macros that use the ASCII names for models/colliders rather than their IDs in scripts and modelLists
From @ethteck on Discord:
basically anything in the splat.yaml that's "data" (no dot before it) or "bin" is asm data or just binary data, respectively
often, the asm data contains pointers that need to have actual symbol names (D_12341234 instead of 0x12341234)
so like, figuring those out is not trivial sometimes. they could be the start of a segment, they don't have to be an actual c variable necessarily
as for the bin stuff, we need to figure out what the binary data actually is and handle it appropriately. sometimes it's C data, sometimes it's like a custom format for something.
this one is trickier. sometimes two symbols have the same value, but one is technically more correct than the other
there are times when we DMA an overlay, and one of the things the function call wants is the RAM of the overlay. but instead, we're giving it the first function. which just happens to have the same ram address as the start ram address of the overlay, because it's currently the first thing there. but what happens if we write a function above it? suddenly we're not pointing at the beginning of the overlay anymore but just some random spot in the middle of it
these kinds of things are probably going to come up as we start being able to shift here and there, and we'll fix them as we go
okay so this one is probably the most important atm, and I saved it for last
undefined_syms includes all the symbols that are referenced in our asm files and c files but aren't actually declared anywhere
the idea is to declare them properly and then remove the entry from undefined_syms
a lot of these, if not all, are bss variables. bss vars are weird in that they take up no space in the rom and are at the end of each segment. they're supposed to be defined but not initialized, so like s32 someVar; would presumably end up in BSS
all of the BSS vars should be declared in their proper files so we can remove the undefined_syms entry for them
however, we can't just get rid of the undefined_syms entry ...for reasons I can explain later. so for now we're throwing all the known bss symbols into main_bss_syms.txt
so the work one could do is ...find BSS vars that aren't declared and declare them in the c files that they belong to
everything in main_bss_syms is from the first (main) segment, up until like 0x75000 or wherever it stops. so you could start with those if you wanted
It's likely that we have .inc.c files that are always included together and can therefore be merged into the same file, especially in cases of related NPC AI funcs.
This tool should look at all C files and determine includes that can be 'grouped', e.g. if foo.inc.c, bar.inc.c and baz.inc.c are always included together and in the same order in all C files, then they can be grouped.
#include "foo.inc.c"
#include "bar.inc.c"
#include "baz.inc.c"
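The core check such a tool needs could be sketched as below. This assumes the #include lines have already been extracted into one NULL-terminated list per C file; the parsing step, the function names, and the rule of only looking at the first occurrence of each include are illustrative assumptions, not an existing tool.

```c
#include <assert.h>
#include <string.h>

/* Index of `name` in a file's NULL-terminated include list, or -1 if absent. */
static int find_include(const char* const* includes, const char* name) {
    int i;
    for (i = 0; includes[i] != NULL; i++) {
        if (strcmp(includes[i], name) == 0) {
            return i;
        }
    }
    return -1;
}

/* Returns 1 if, in every file, `first` and `second` either both appear with
 * `second` directly after `first`, or neither appears at all. */
int always_adjacent(const char* const* const* files, int fileCount,
                    const char* first, const char* second) {
    int f;
    for (f = 0; f < fileCount; f++) {
        int a = find_include(files[f], first);
        int b = find_include(files[f], second);
        if (a == -1 && b == -1) {
            continue; /* this file uses neither include */
        }
        if (a == -1 || b != a + 1) {
            return 0; /* one is missing, or they are not adjacent in order */
        }
    }
    return 1;
}

void demo_always_adjacent(void) {
    const char* const fileA[] = { "foo.inc.c", "bar.inc.c", "baz.inc.c", NULL };
    const char* const fileB[] = { "qux.inc.c", "foo.inc.c", "bar.inc.c", NULL };
    const char* const* const files[] = { fileA, fileB };

    /* foo is always followed by bar, so that pair can be grouped. */
    assert(always_adjacent(files, 2, "foo.inc.c", "bar.inc.c") == 1);
    /* fileB includes bar without baz, so bar/baz cannot be grouped. */
    assert(always_adjacent(files, 2, "bar.inc.c", "baz.inc.c") == 0);
}
```

Longer groups fall out of chaining the pairwise result: if foo/bar and bar/baz are both always adjacent, all three can be merged.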
e.g. draw_string -> draw_msg
"string" is a leftover from Star Rod's naming conventions. "msg" is consistent with TTYD symbols (msgDrv.o) and with the pm_msg segment we already have.
While helping my friend out with his emulator, I found some info on the itemEntity struct and 2 functions. We used a dump of the RAM in his emulator to see what was going on.
The field currently named unk_34 seems to hold only 4 byte sized values. There is a function called update_item_entities that should update item entities. In the ROM this is located at 900c85ec. This function is copied to RAM location 80131eec. In a for loop, this function checks if the itemID of some itemEntity structs is equal to 0x157, which, from https://tcrf.net/Notes:Paper_Mario, I gathered is the ID for coins. If they are coins, there is a 10% chance some values get updated in the struct, which I believe are values related to a (sparkling) animation sequence. The more interesting function call is the one to FUN_90130acc, which is nothing, but after being copied to RAM this is a call to 80130acc, corresponding to a function at 900c71cc. The decompilation for this function should be something like
void do_animation(itemEntity* item_entity) {
    if (--item_entity->frames_left < 1) { // Ghidra says < 1, presumably this is just == 0 and the field is an unsigned integer
        do { } while (next_step(item_entity));
    }
}
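The three ROM/RAM address pairs quoted for the functions here all differ by the same constant, which can be checked with a short worked example. The helper name rom_to_ram and the idea that one fixed offset covers the whole code segment are assumptions; only the addresses come from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Offset between the ROM-side address and the RAM address, derived from the
 * update_item_entities pair: 0x900c85ec -> 0x80131eec. */
#define ITEM_CODE_DELTA (0x900c85ecu - 0x80131eecu)

/* Translate a ROM-side address to the RAM address it is copied to.
 * Assumed to be valid only inside the segment the delta was derived from. */
uint32_t rom_to_ram(uint32_t romAddr) {
    return romAddr - ITEM_CODE_DELTA;
}

void demo_rom_to_ram(void) {
    assert(ITEM_CODE_DELTA == 0x0FF96700u);
    /* The other two function pairs from the write-up use the same offset. */
    assert(rom_to_ram(0x900c71ccu) == 0x80130accu);
    assert(rom_to_ram(0x900c7104u) == 0x80130a04u);
}
```

The sequence data pair mentioned later (9009df70 in ROM, 80104ac0 in RAM) does not fit this offset, which suggests that data is loaded from a different segment.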
The frames_left field is the field at offset 0x3c in the itemEntity struct. Essentially, every animation sequence step (I am just guessing it is an animation sequence, it might be some other sequence) lasts for a certain amount of frames.
The next_step function in ROM is a call to 90130a04, which again points to nothing, but in RAM this is a call to 80130a04, corresponding to 900c7104 in the ROM. This function essentially does one step in the animation and returns whether a next step should be taken. It should look something like this:
int next_sequence_step(ItemEntity *item_entity)
{
    undefined4 uVar1;
    int *current_state_ptr;
    uint *next_ptr;
    current_state_ptr = item_entity->current_state_ptr;
    next_ptr = (uint *)(current_state_ptr + 1); // this pointer has a different meaning depending on the state
    switch(*current_state_ptr) {
    case 0: // this is an error state it seems, and will hang the above function
        return 1;
    case 1:
        item_entity->frames_left = *next_ptr;
        uVar1 = current_state_ptr[2];
        item_entity->current_state_ptr = current_state_ptr + 3;
        item_entity->field_0x44 = uVar1;
        break;
    case 2:
        item_entity->current_state_ptr = item_entity->sequence_start;
        return 1;
    case 3:
        item_entity->sequence_start = (int *)next_ptr;
        item_entity->current_state_ptr = (int *)next_ptr;
        return 1;
    case 4:
        item_entity->current_state_ptr = current_state_ptr + 2;
        return 1;
    case 5:
    case 6:
        break;
    case 7:
        item_entity->frames_left = *next_ptr;
        item_entity->field_0x4c = (int *)current_state_ptr[2];
        item_entity->field_0x50 = (int *)current_state_ptr[3];
        item_entity->field_0x54 = current_state_ptr[4];
        uVar1 = current_state_ptr[5];
        item_entity->current_state_ptr = current_state_ptr + 6;
        item_entity->field_0x58 = uVar1;
        break;
    default:
        return 0;
    }
    return 0;
}
Essentially, the current_state_ptr (field offset 0x40) points to a struct whose layout depends on the state. In all cases it starts with an int showing what state it actually is. In the case of 0 it is just an int, and the do_animation function will hang. In case of 1, it looks like
struct sequence_state_1 {
int state; // is always 1
uint frames_left; // amount of frames it stays in this state
int field_0x44; // struct field 0x44 is set to this value
}
Case 2 resets the animation. The struct is also just an int then. Case 3 switches to a new animation, and the struct looks like
struct {
int state; // is always 3
int* next_animation;
}
Case 4 simply advances the state. The struct does hold an extra 32 bit piece of information, but this is not really used here it seems. Case 5 or 6 do nothing, but will not hang the do_animation function. Case 7 is the most interesting one, it holds a struct that looks like
struct {
int state; // always 7
uint frames_left; // amount of frames it stays in this state
void* pointer_0x4c; // pointer to 0x20 bytes of data
void* pointer_0x50; // pointer to at least 0x20 bytes of data
int field_0x54;
int field_0x58;
}
Of course, the integers showing the state are likely some enum. The struct sizes can be checked by the amount that item_entity->current_state_ptr is advanced (in strides of 4 bytes). An example of such a sequence can be found in ROM at 9009df70, copied to RAM at 80104ac0. This is a sequence that starts at state 4, then state 7 a few times and then state 0 (which presumably the N64 never reaches, have not figured this out yet). Looking at the values for the pointers of field 0x4c and 0x50, 4c seems to point at 0x20 bytes of data, and 0x50 at some multiple of 0x20.
Because of this, I also think that the itemEntity struct will look something like
typedef struct ItemEntity {
/* 0x00 */ s32 flags;
/* 0x04 */ s16 boundVar; /* see make_item_entity */
/* 0x06 */ char unk_06[2];
/* 0x08 */ Vec3f position;
/* 0x14 */ struct ItemEntityPhysicsData* physicsData;
/* 0x18 */ s16 itemID; /* into item table, also worldIconID */
/* 0x1A */ u8 state;
/* 0x1B */ u8 type;
/* 0x1C */ u8 pickupDelay; /* num frames before item can be picked up */
/* 0x1D */ char unk_1D;
/* 0x1E */ s16 wsFaceAngle; /* < 0 means none */
/* 0x20 */ s16 shadowIndex;
/* 0x22 */ char unk_22[2];
/* 0x24 */ u32* readPos;
/* 0x28 */ u32* savedReadPos;
/* 0x2C */ char unk_2C[2];
/* 0x2E */ u8 unkCounter;
/* 0x2F */ s8 unk_2F;
/* 0x30 */ f32 scale;
/* 0x34 */ u32 unk_34;
/* 0x38 */ u32 unk_38;
/* 0x3c */ u32 frames_left;
/* 0x40 */ u32* current_state_ptr;
/* 0x44 */ u32 unk_44;
/* 0x48 */ u32* sequence_start_ptr;
/* 0x4c */ u32* unk_4c; // 32 bytes of data
/* 0x50 */ u32* unk_50; // 32 bytes or multiple of 32 bytes of data
/* 0x54 */ u32 unk_54;
/* 0x58 */ u32 unk_58;
} ItemEntity; // size = 0x5C
The build fails after verifying the built US ROM on an aarch64 (Raspberry Pi) host; x86 hosts are not affected.
I suspect the cause is the arm64 build of the C compiler, tools/arm/cc1, or possibly some matched snippets of code.
Update the script to better work with the new syntax, as a few macros are currently receiving faulty data or are missing relevant data from enums.h.
An EVT_CALL that calls GotoMap will have a faulty pointer as the first argument that doesn't resemble a map string. Most likely this is an issue with the disassembly, where the pointer refers to some function instead of the rodata containing the string.
See arn_03/header.c and arn_04/header.c for an example.
EvtSource D_80242504 = {...
The list is most definitely incomplete and will be updated as new issues arise.
Distros that provide the ability to install multiple package managers (Fedora) or mix various distros together (Bedrock) will have some trouble with the install script.
Related to #438, since it should also fix that issue.
There is some evidence that overlays were linked by themselves, not together at the end. This evidence includes:
N where static does not work. It can be presumed that the original developers did not do this.
My proposal is that we Do As The Devs Did. I'm unsure if this is possible with GNU tooling (e.g. was it a Nintendo thing?) but we need to figure out how to:
Doing this would make bad references like #277 not compile, which is good.
e.g. pos, position
Compiling the rom emits a ton of warnings that we should be able to fix, mostly implicit casts that should really be made explicit.
Additionally it would be nice to be able to enable -Wimplicit -Wredundant-decls so headers are used properly.
Once this is done we can enable -Werror on Jenkins so people fix their warnings as they match stuff.
Also make available at https://docs.papermar.io
Instead of storing build/linux, mac, etc dirs in the repo that contain the GCC 2.8.1 compiler, grab the correct architecture's compiler as part of the install setup from the releases of gcc-papermario based
I'm thinking SPRITE_PIXEL_SCALE or something - IIRC I read somewhere in the Star Rod source that this (roughly 0.7f) is how many in-game units large a single pixel on a sprite is by default, and it's used in a number of places e.g. #572
e.g. bActorTattles
At the moment jp's splat.yaml defines src/jp/ as its root source directory. Ideally it should be src. All the existing segment names need to be prefixed with jp/ to keep the existing structure, though, besides the first C file, since it's identical to is_debug from us.
https://discord.com/channels/688807550715560050/688849682373410819/826819524669603851
I'll add more info here later
See also: #432
A few functions, such as func_80242EC4_EA37C4, use a dead version of the libultra sqrtf, but codegen does not match with the typical macro magic we do with dead.h currently. I believe this is because sqrtf is an intrinsic function and gets optimised differently.
If this is the case, there is no way for us to tell the compiler to treat "dead_sqrtf" as if it were the real sqrtf. These functions cannot be matched right now.
My proposed solution is:
dead.h, compile 'dead' maps normally
Pros:
Cons:
This is extra-relevant if we're moving to macro-based scripts.
@ethteck thoughts?
Resolves #230
This is the first pass of map/battle data work. Future passes will involve disassembling scripts and other structs, moving any assets into actual files (and not including them in raw form in the repo!), documenting, etc.
This can probably be achieved with a simple script that takes in a .data.s or .rodata.s file and spits out an equivalent .c file, with each piece of data being an s32 array.
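The per-line core of such a converter might look like the sketch below. It assumes the data appears as ".word" directives with numeric operands, which is an assumption about the .s files; operands that are symbol names are not handled and stop the conversion of that line. convert_word_line and the sample values are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Appends the values of one ".word a, b, c" line to `out` as C initialisers.
 * `out` must already hold a valid (possibly empty) string.
 * Returns the number of values converted (0 for lines without .word). */
int convert_word_line(const char* line, char* out, size_t outSize) {
    const char* p = strstr(line, ".word");
    int count = 0;

    if (p == NULL) {
        return 0;
    }
    p += strlen(".word");

    for (;;) {
        char* end;
        unsigned long value = strtoul(p, &end, 0); /* base 0 accepts 0x... */
        size_t used = strlen(out);

        if (end == p) {
            break; /* nothing numeric left on this line */
        }
        snprintf(out + used, outSize - used, "0x%08lX, ", value);
        count++;
        p = end;
        while (*p == ',' || *p == ' ' || *p == '\t') {
            p++;
        }
    }
    return count;
}

void demo_convert_word_line(void) {
    char out[128] = "";
    /* Made-up sample values, not taken from the ROM. */
    int n = convert_word_line("  .word 0x00000043, 0x802D5B10", out, sizeof(out));
    assert(n == 2);
    assert(strcmp(out, "0x00000043, 0x802D5B10, ") == 0);
    /* Label lines contain no .word and contribute nothing. */
    assert(convert_word_line("glabel D_80241C68", out, sizeof(out)) == 0);
}
```

A full tool would wrap the collected values for each label in an s32 array definition and write them out to the .c file.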
functions.h and variables.h have always been a temporary measure to make it quicker to match stuff. They define a ton of random functions and variables that should really be declared elsewhere.
The not-yet-enabled warnings -Wimplicit -Wredundant-decls described in #366 are relevant here too, as they will enforce good use of header files.
Additionally, we should move all declarations out of .c files and into .h files.
Steps to migrate functions from functions.h:
- Create a header file (.h) next to the C file with an include guard
- Add an extern declaration for global variables defined in the C file (usually marked with g at the start of the variable name)
- Remove the corresponding declarations from functions.h and variables.h
- Add #include "myfoo.h" to every file that uses variables or functions declared in your header
#ifndef _MYFOO_H_
#define _MYFOO_H_
// ...
#endif
Only Linux distros that provide the cpp-mips-linux-gnu package are supported right now, because the KMC GCC compiler executes mips-linux-gnu-cpp, which is not available on some distros. You can fix this by asking Pink Horned Man White Nose to use the host's cpp or, if he doesn't want to, by using mine instead. (I will say more about it if you agree to use my version of his.)
Example funcs:
gravity_use_fall_params
DisablePulseStone
We often have to create redundant temps for globals in functions.
PlayerStatus* playerStatus = &gPlayerStatus;
PlayerStatus* playerStatus2 = &gPlayerStatus;
The executable tools/cc1 is an x86 binary, which means it can't run in WSL. Could an x64 binary be provided in the next commit, or the source code for building an x64 version?
The layer of abstraction is pointless and arguably confusing for people new to the codebase.
- BattleStatus* battleStatus = BATTLE_STATUS;
+ BattleStatus* battleStatus = &gBattleStatus;
Continuing from this note, it looks like a number of us would prefer that we use CamelCase for types (although, I assume, value types like s32 can stay as they are from ultra64.h).
Currently we mostly(!) use snake_case, following Star Rod's lead, even though SR is also inconsistent in places.
It might be worth considering how we currently name enums, too. The SCREAMING_CAPS for enum members, as they're constants, is fine in my opinion, but the actual enum type is less awesome. Similarly, the current discrepancy between the type npc (struct npc) and the enum type NPC (which is a union between npc* and some special values) is a bit awkward.
We currently parameterise only some flags, like NPC flags, in enums.h. Ideally, we'd do this for all flag types, such as that of BattleStatus.
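For illustration, parameterised flags for another struct could follow the pattern below. The flag names, bit values, and the ExampleStatus struct are placeholders and do not reflect the real BattleStatus layout or the contents of enums.h.

```c
#include <assert.h>

/* Hypothetical flag names and bit positions, for illustration only. */
typedef enum ExampleBattleFlags {
    EXAMPLE_BATTLE_FLAG_A = 0x00000001,
    EXAMPLE_BATTLE_FLAG_B = 0x00000002,
    EXAMPLE_BATTLE_FLAG_C = 0x00000040
} ExampleBattleFlags;

/* Stand-in for a status struct that carries a flags word. */
typedef struct ExampleStatus {
    int flags;
} ExampleStatus;

/* Returns non-zero if every bit in `flag` is set... for a single-bit flag
 * this is simply whether that bit is set. */
int example_has_flag(const ExampleStatus* status, int flag) {
    return (status->flags & flag) == flag;
}

void demo_flags(void) {
    ExampleStatus status = { 0 };

    /* Named flags read better than status.flags |= 0x41. */
    status.flags |= EXAMPLE_BATTLE_FLAG_A | EXAMPLE_BATTLE_FLAG_C;
    assert(status.flags == 0x41);
    assert(example_has_flag(&status, EXAMPLE_BATTLE_FLAG_C));
    assert(!example_has_flag(&status, EXAMPLE_BATTLE_FLAG_B));

    status.flags &= ~EXAMPLE_BATTLE_FLAG_A;
    assert(status.flags == 0x40);
}
```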