c-how-to's Issues

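Can you add more examples?

I'm new to C. I previously learned Python, and I want to understand more about how high-level languages are implemented under the hood. Your tutorial is very useful to me, but I'd also like to learn about features such as garbage collection and object orientation. Where can I learn about those?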

Malloc_tutorial.pdf has many issues

I'll ignore grammar and spelling (I get that English isn't your first language, that's okay) and matters of opinion I frankly disagree with (like saying that using typedef on a struct or especially a pointer is good practice when both of these things are widely considered code smells if not outright antipatterns) and point out two MASSIVE, program-crashing flaws I noticed in the code.

The first one is on page 12, in this implementation of calloc:

[image: the tutorial's calloc implementation]

For one, it is neither malloc's nor calloc's responsibility to ensure that memory is aligned per request. That should be handled by whatever code shrinks or grows a node. calloc need not know that the size it requested from malloc got modified, nor does it need to zero-fill the padding. It only needs to zero-fill the region the caller asked for. On top of that, calloc is supposed to check whether multiplying its two arguments overflows and, if it does, return NULL (on POSIX systems, with errno set to ENOMEM). Your code does not check for this. Your code also clears more than it should: the loop bound is shifted left by 2, so it clears four times as much as it should, which is an outright bug.
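To make the two requirements concrete, here is a minimal sketch of a calloc that checks for overflow and clears only the requested bytes. The name checked_calloc is hypothetical, and the sketch sits on top of the system malloc rather than the tutorial's allocator, so treat it as an illustration rather than a drop-in fix.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name; built on the system malloc, not the tutorial's. */
void *checked_calloc(size_t number, size_t size)
{
    /* number * size overflows size_t exactly when number > SIZE_MAX / size. */
    if (size != 0 && number > SIZE_MAX / size) {
        errno = ENOMEM;
        return NULL;
    }

    size_t total = number * size;
    void *p = malloc(total);
    if (p != NULL)
        memset(p, 0, total);   /* clear only what the caller asked for */
    return p;
}
```

Any rounding that malloc applies internally stays invisible here: the memset length is the caller's byte count, never the padded block size.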

for simplicity, let's pretend a bookkeeping node is 4 bytes and we have a pool of 36 bytes

[amount: 32]XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
a = malloc(4)
[amount: 4]XXXX[amount: 24]XXXXXXXX XXXXXXXX XXXXXXXX
b = malloc(4)
[amount: 4]XXXX[amount: 4]XXXX[amount: 16]XXXXXXXX XXXXXXXX
free(a)
[amount: 4 (free)]XXXX[amount: 4]XXXX[amount: 16]XXXXXXXX XXXXXXXX
c = calloc(2, 1) 
// aligns to 4, malloc is called with 2, malloc is unable to shrink since a node won't fit
// so it secretly returns the whole 4-byte block
[amount: 4]XXXX[amount: 4]XXXX[amount: 16]XXXXXXXX XXXXXXXX
loop: 2 is up-aligned to 4 and then <<= 2, resulting in 16: for 16 bytes, set to 0
[amount: 4]0000 0000 0000 0000 XXXXXXXX XXXXXXXX
                ^^^^ ^^^^ ^^^^ none of this belongs to c, yet it has all been zeroed out!

Now let's also assume that before calloc was called, b had a pointer stored in it

*(struct table_row **)b = (struct table_row *)big_static_table;

what do you suppose happens, after calloc is called and clears both the region that contained b's bookkeeping node and the memory it pointed to, when the code tries to dereference that pointer?

Well, the pointer is 0 now, so the caller just dereferenced a null pointer through no fault of its own. The program receives a SIGSEGV.
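The walkthrough above can be replayed on a plain byte array. This is only a model under the same pretend numbers (4-byte headers, a 36-byte pool). The helper names are made up for illustration, and align4 is an assumed stand-in for the tutorial's alignment macro.

```c
#include <stddef.h>
#include <string.h>

enum { HEADER = 4, POOL = 36 };

/* Assumed equivalent of the tutorial's 4-byte alignment macro. */
static size_t align4(size_t n) { return (n + 3) & ~(size_t)3; }

/* Length the buggy calloc clears in the walkthrough: align4(n) << 2. */
size_t buggy_clear_len(size_t n) { return align4(n) << 2; }

/* Models the pool just before c = calloc(2, 1): c reuses a's old block,
 * so its payload starts right after the first header. Clears clear_len
 * bytes from there and counts how many bytes beyond c's own 4-byte
 * payload were modified. */
size_t bytes_clobbered(size_t clear_len)
{
    unsigned char pool[POOL];
    const size_t c_payload = HEADER;
    const size_t c_size = 4;

    memset(pool, 'X', sizeof pool);
    if (clear_len > POOL - c_payload)
        clear_len = POOL - c_payload;   /* keep the model itself in bounds */
    memset(pool + c_payload, 0, clear_len);

    size_t clobbered = 0;
    for (size_t i = c_payload + c_size; i < POOL; i++)
        if (pool[i] != 'X')
            clobbered++;
    return clobbered;
}
```

With the buggy length of 16, twelve bytes past c's block are wiped: b's header, b's payload, and the header of the remaining free block. Clearing just the 2 requested bytes touches nothing outside c.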

The second major flaw is on pages 13-15.

The flaw is that free attempts to determine if the pointer is a valid pointer. This is not only impossible to check, but it's not free's job to check. The standard clearly states that if the pointer passed to free is neither NULL nor a pointer previously returned by malloc, calloc, or realloc (and not already freed), the behavior is undefined. Your "solution" is to stuff an unnecessary pointer into the bookkeeping struct that points back to the memory, presumably to check that the pointer in the block matches the pointer passed in. But how is this going to work if you aren't even sure that the block prior to the passed-in address is a valid block?

So let's assume the pointer passed in is (void *)8. What happens? The code subtracts the size of a block from the pointer, and OOPS, you've just overflowed below zero. The code then tries to dereference this invalid pointer to check if the pointer inside points to the initial pointer. An action which is, of course, undefined behavior. Double oops.
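The wraparound is easy to see with integer arithmetic. The sketch below assumes a header struct shaped like the one described in this issue. The struct and function names are hypothetical and only model the address computation; nothing is dereferenced.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed header shape: size, two links, a free flag, a back-pointer. */
struct block_header {
    size_t size;
    struct block_header *next;
    struct block_header *prev;
    int free;
    void *ptr;
};

/* What a naive get_block computes, done on integers so the wraparound
 * is well defined and observable. */
uintptr_t naive_header_addr(uintptr_t addr)
{
    return addr - sizeof(struct block_header);
}

/* True when the subtraction above would wrap below address zero. */
bool header_would_underflow(uintptr_t addr)
{
    return addr < sizeof(struct block_header);
}
```

For an address of 8 the computed header address wraps to a value near the top of the address space. Even a guard like header_would_underflow only rules out this one case; it says nothing about whether a real header lives at the resulting address.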

Your code does an extra check to see if the initial pointer is within the range between the first node and the program break, but this is still easy to break. For example, if you compute the address sbrk(0) + 1 before making any calls to malloc and pass that value to free later, after the heap has grown, the address will be valid per the check, but free will then read out-of-bounds memory and dereference it, which is UB. Here is another way to break it. Let's assume the caller is malicious and has read your implementation.

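#include <stdlib.h>
#include <string.h>

/* Same layout as the tutorial's bookkeeping struct. */
struct foo {
    size_t a;
    void * b;      /* link pointer (prev/next) */
    void * c;      /* link pointer (prev/next) */
    int    d;
    void * e;      /* back-pointer that free compares against its argument */
    char   f[1];
};

/* Assumes the tutorial's malloc and free are the ones linked in. */
int main(void)
{
    size_t chunksize = sizeof (struct foo);
    char *foo = malloc(chunksize * 2);
    struct foo *bar = (struct foo *)foo;

    memset(foo, 0, chunksize);
    bar->e = foo + chunksize;   /* forged back-pointer */
    bar->b = (void *)1;         /* bogus links */
    bar->c = (void *)1;
    free(bar->e);               /* never returned by malloc: UB */
    return 0;
}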

What do you suppose this code will do given the implementation of free on pages 13-15?

[image: the tutorial's free implementation]

Firstly, free will see the address and call valid_addr on it.

[image: the tutorial's valid_addr implementation]

This function will check if base is null. It isn't, so then it checks if the address is in range. It is. We got it from malloc, and the offset we jumped to is still within the chunk we allocated. So then it calls get_block on it.

[image: the tutorial's get_block implementation]

This function does what we expect it to do: it subtracts the block size from the address to get what must be the block, and valid_addr then dereferences that block to read the back-pointer stored in it. However, the caller is malicious and used the struct definition they read from the implementation to forge a block whose back-pointer matches the pointer that gets passed into free, so free will use this imposter block as if it were real. This is NOT a valid pointer, but your implementation of free will attempt to free it and, per the standard, the behavior is undefined. What's worse, the malicious caller also set the prev and next pointers to bogus values, so at some point free may try to dereference those, and at that point the program will receive a SIGSEGV.

Again, it is not free's responsibility to check for this and it's not possible to reliably check. The onus is on the caller to make sure it only passes pointers obtained from malloc into free. If the pointer is not valid, the whole program is invalid (and will likely crash) either way. Why bother with the unnecessary and non-deterministic bounds check?
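For comparison, a free that follows this contract can be very small. The sketch below uses a reduced, hypothetical header (struct block) and a hypothetical name (sketch_free); it leaves out coalescing and returning memory to the system, and only shows where the single required check sits.

```c
#include <stddef.h>

/* Hypothetical, reduced bookkeeping header; no back-pointer needed. */
struct block {
    size_t size;
    int free;
};

/* The header sits immediately before the payload. */
static struct block *get_block(void *p)
{
    return (struct block *)p - 1;
}

void sketch_free(void *p)
{
    if (p == NULL)
        return;              /* the only case free has to detect */
    get_block(p)->free = 1;  /* anything else is the caller's responsibility */
}
```

If the caller passes a pointer that did not come from the allocator, this version misbehaves just as the checked one eventually would, but it does so without paying for a test that cannot be made reliable.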
