Comments (15)

bet4it commented on May 23, 2024

Could we store a Range into the PacketBuf in ParseCommand, rather than a &[u8]?

from gdbstub.

daniel5151 commented on May 23, 2024

Some relevant discussion regarding this issue: #72 (comment)


bet4it commented on May 23, 2024

Summary of discussion in #72 (comment):

Because of the existence of vFile:readlink, we must use PacketBuf to hold the input data, so it's impossible to use 100% of PacketBuf as the response buffer.
We have three ways to solve it:

  1. Accept an optional PacketBufOut parameter; the handler code would check for its existence when deciding which buffer to pass to the target implementation.
    Drawback: if PacketBufOut is not provided, we can only use part of PacketBuf as the response buffer, which is smaller than PacketSize. In some situations that leads to unreasonable behavior, e.g. the buffer isn't large enough to store the result of readlink.
  2. Split the user-provided PacketBuf in two, and set PacketSize to half the length of PacketBuf.
    Drawback: because most packets don't need a huge input buffer and a huge output buffer at the same time, this reserves a bunch of memory that is rarely used.
  3. Make the gdbstub constructor take a const-generic value for the PacketBuf length; then, when needed, we can allocate a buffer of that size on the stack to hold the input data.
    Drawback: some constrained environments may not be able to allocate such a large stack buffer.

I prefer the third way. Environments that can't allocate a large stack buffer are really rare, and in such environments it's very likely that the Host I/O extensions (or other extensions that need the stack space) aren't implemented anyway; alternatively, the length of PacketBuf can be set to a smaller value that fits on the stack.
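The const-generic idea (option 3) could be sketched roughly as follows. Note that Stub and handle_readlink are hypothetical names for illustration, not gdbstub's actual API:

```rust
// Hypothetical sketch of option 3: the stub is parameterized over the
// packet-buffer length, so a same-sized scratch buffer can live on the stack.
struct Stub<const PACKET_SIZE: usize> {
    packet_buf: [u8; PACKET_SIZE],
}

impl<const PACKET_SIZE: usize> Stub<PACKET_SIZE> {
    fn new() -> Self {
        Stub { packet_buf: [0; PACKET_SIZE] }
    }

    // When a packet needs both the input data and the full response buffer,
    // copy the input into a stack-allocated scratch buffer of the same size.
    fn handle_readlink(&mut self, filename_len: usize) -> usize {
        let mut scratch = [0u8; PACKET_SIZE]; // PACKET_SIZE bytes on the stack
        scratch[..filename_len].copy_from_slice(&self.packet_buf[..filename_len]);
        // ...the entire packet_buf is now free to hold the response...
        self.packet_buf.len() // the full PacketSize is available
    }
}

fn main() {
    let mut stub = Stub::<4096>::new();
    assert_eq!(stub.handle_readlink(16), 4096);
}
```

The drawback described above is visible here: `scratch` always costs PACKET_SIZE bytes of stack, whether or not the packet needs it.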


daniel5151 commented on May 23, 2024

This is a great summary of our discussion - thank you!

Personally, I prefer the first way. It gives the end user the ultimate flexibility wrt. where they'd like to allocate this additional response buffer, and exactly how large it should be. Much of the "complexity" of this approach will be hidden within gdbstub itself, whereby the user-facing API will be a single additional GdbStubBuilder method.

That said, as I've stated earlier: I believe the current readlink implementation - even with its edge-case limitations - should work just fine in almost all use-cases. As such, I don't think this is a task we should prioritize at the moment, and leave things as-is until someone expresses a need for a more robust solution.


bet4it commented on May 23, 2024

I think the first way is quite meaningless. Consider why we want to use the whole PacketBuf as the response buffer in the first place: the spec tells us that the size of a response won't exceed PacketSize, which is the size of PacketBuf. So if we can use the whole PacketBuf, we can be sure our implementation is complete: it can handle every packet that ought to be handleable.

If we use the first way, our implementation is only complete when the user provides a PacketBufOut that is at least PacketSize bytes. That's really awkward: how would we tell the user when to use it? "The current implementation may not work in some situations, unless you provide a big enough PacketBufOut"? Or "If your code uses vFile:readlink, you need to provide a PacketBufOut of PacketSize bytes"?

But if we choose the second or the third way, we can guarantee our implementation is always complete.

That said, as I've stated earlier: I believe the current readlink implementation - even with its edge-case limitations - should work just fine in almost all use-cases. As such, I don't think this is a task we should prioritize at the moment, and leave things as-is until someone expresses a need for a more robust solution.

I really agree with that! Situations where readlink is used should be really rare, and in most cases the data read shouldn't be very long. If you can live with the imperfection, we can leave it as is.


bet4it commented on May 23, 2024

We may need to do something like what the bytes crate does:

use bytes::Bytes;

let mut mem = Bytes::from("Hello world");
let a = mem.slice(0..5);

assert_eq!(a, "Hello");

let b = mem.split_to(6);

assert_eq!(mem, "world");
assert_eq!(b, "Hello ");

(But bytes requires alloc.)


daniel5151 commented on May 23, 2024

I'm not entirely sure what you're suggesting with that bytes example, but as you've pointed out, bytes requires alloc, which is a non-starter for core gdbstub code. Any solution will have to rely on built-in slice methods and/or custom no_std utility code.
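For reference, the built-in slice methods can already carve one user-provided buffer into disjoint views without any allocation - a minimal sketch of the no_std-friendly equivalent of the bytes example above:

```rust
// A no_std-friendly alternative to `bytes`: std/core slice methods can split
// one caller-owned buffer into non-overlapping views with zero allocation.
fn main() {
    let mut buf = *b"Hello world";
    // split_at_mut yields two disjoint mutable views of the same buffer
    let (head, tail) = buf.split_at_mut(6);
    assert_eq!(head, b"Hello ");
    assert_eq!(tail, b"world");
}
```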


bet4it commented on May 23, 2024

Bytes keeps both a pointer to the shared state containing the full memory slice and a pointer to the start of the region visible by the handle. Bytes also tracks the length of its view into the memory.

Isn't this the approach you used for ReadLinkPacket in #72 (comment)?

And Bytes can track different parts of a buffer at the same time, which is what we need when parsing a packet (for example, addr and kind when parsing a breakpoint packet).


I have a new idea: what if we only guarantee the code is complete (as described above) when alloc is enabled? In most situations alloc should be enabled, and then we can allocate a temporary buffer to store the filename. The constrained environments that don't enable alloc most probably don't need the Host I/O extension anyway.


daniel5151 commented on May 23, 2024

Ahh, yes, sure, that's certainly a valid way of doing it (i.e: instead of storing a slice reference directly, store the offsets into buf corresponding to the field data).

FWIW, that approach would enable using 100% of the PacketBuf for any fields of target-dependent size (e.g: addr, offset, kind, etc...). Those can be parsed into the correct type within gdbstub prior to calling the handler API, thereby allowing the entire buffer to be used in the handler as well.

The real complexity is how to handle packets like ReadLinkPacket, where the handler API needs access to variable-length incoming packet data (e.g: the filename field). In that case, you either have to do the aforementioned ownership juggling I sketched out in the linked comment above, or use an entirely separate buffer.
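The offset-based idea might look roughly like this sketch; ParsedMemRead and parse_m_packet are illustrative names, and the parsing is deliberately simplified (no hex decoding):

```rust
use core::ops::Range;

// Sketch of "store offsets, not slices": a parsed packet records *where*
// each field lives in the packet buffer, instead of borrowing from it.
// Field names model an `m addr,len` (memory read) packet.
struct ParsedMemRead {
    addr: Range<usize>,
    len: Range<usize>,
}

fn parse_m_packet(buf: &[u8]) -> Option<ParsedMemRead> {
    // e.g. b"m4000,100": addr occupies bytes 1..comma, len the rest
    let comma = buf.iter().position(|&b| b == b',')?;
    Some(ParsedMemRead { addr: 1..comma, len: comma + 1..buf.len() })
}

fn main() {
    let buf = b"m4000,100";
    let parsed = parse_m_packet(buf).unwrap();
    // Because only offsets are stored, buf can later be reused wholesale as
    // the response buffer once the fields are decoded into integers.
    assert_eq!(&buf[parsed.addr.clone()], b"4000");
    assert_eq!(&buf[parsed.len.clone()], b"100");
}
```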


bet4it commented on May 23, 2024

We may need to figure out whether readlink is the only command that uses variable-length incoming packet data and also needs a buffer to prepare the response. If it is the only such command, we can handle it in an easy way (use alloc when it's enabled, as I mentioned above, or just ignore the problem), and I still hope we can use the entire PacketBuf for all the other commands.
(Then we wouldn't need this loop:)

while n != 0 {
    let chunk_size = n.min(buf.len());
    use num_traits::NumCast;
    let addr = addr + NumCast::from(i).ok_or(Error::TargetMismatch)?;
    let data = &mut buf[..chunk_size];
    match target.base_ops() {
        BaseOps::SingleThread(ops) => ops.read_addrs(addr, data),
        BaseOps::MultiThread(ops) => {
            ops.read_addrs(addr, data, self.current_mem_tid)
        }
    }
    .handle_error()?;
    n -= chunk_size;
    i += chunk_size;
    res.write_hex_buf(data)?;
}


If what bytes does is what we need, we might suggest that upstream not use alloc when the buffer is provided by the user?


daniel5151 commented on May 23, 2024

So, to reiterate - it is entirely possible to rewrite the current m packet implementation (and many others) to use 100% of the packet buffer, using the "store a start..end of the addr/len instead of a &'a [u8]" approach I described above, without requiring any surface-level API changes.

I agree - now that we have the extra protocol constraint of "responses must fit in the packet buffer", we can absolutely remove that gnarly NumCast loop, and swap it out with a significantly simpler implementation. If you'd like to take a crack at that, feel free. Otherwise, it's something I'll get around to... eventually.
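A rough sketch of what the simplified handler could look like under the "responses fit in the packet buffer" constraint; read_addrs and handle_m_packet here are stand-ins for illustration, not gdbstub's real APIs:

```rust
// Stand-in for the target's memory-read hook: fills `data` with bytes
// starting at `addr` (here, just a deterministic pattern for the demo).
fn read_addrs(addr: u64, data: &mut [u8]) {
    for (i, b) in data.iter_mut().enumerate() {
        *b = (addr as u8).wrapping_add(i as u8);
    }
}

// With `len <= buf.len()` guaranteed by the protocol constraint, one read
// into the buffer suffices: no chunking loop, no NumCast.
fn handle_m_packet(addr: u64, len: usize, buf: &mut [u8]) -> usize {
    let data = &mut buf[..len];
    read_addrs(addr, data);
    len // caller would then hex-encode `buf[..len]` into the response
}

fn main() {
    let mut buf = [0u8; 8];
    assert_eq!(handle_m_packet(0x10, 4, &mut buf), 4);
    assert_eq!(&buf[..4], &[0x10, 0x11, 0x12, 0x13]);
}
```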


Could you rephrase your comment regarding bytes?


Also, I should probably clarify something important about alloc: even with the alloc feature enabled, I would not want to introduce any "hidden" allocations inside gdbstub. i.e: any buffers gdbstub uses should be opt-in by the user, either by them providing the buffer directly, or by setting some kind of option to enable some kind of one-time internal heap allocation.

The rationale for this is to support the kind of use cases like gz's kernel debugger - a use-case where alloc is available + enabled, but gdbstub is still (sometimes) driven via interrupt handlers. Heap allocation from an interrupt handler can lead to Bad Things™️ happening, and should usually be avoided.

Right now, the only time gdbstub does any kind of internal hidden heap allocation is when the trace-pkt feature is enabled, which is strictly for debugging purposes.


bet4it commented on May 23, 2024

I know we can solve it with ReadLinkPacket; I just think that's a bit ugly, so I want to find a more elegant way to do it. It would be best if an existing crate already did the same thing, so we could use it directly.

I think the bytes crate does what we want: it tracks the whole buffer and different parts of the buffer at the same time. But we can't use it, because it allocates memory by itself; we can't provide a buffer to it.


daniel5151 commented on May 23, 2024

Again - and I really cannot stress this enough: using bytes or any similar alloc-dependent crates is not an option.

Period.

gdbstub is a no_std + no_alloc library. Every single packet must support no_std + no_alloc.

No exceptions.

With ReadLinkPacket, it really seems that there are only two approaches:

  1. Rewrite the read_link API as something like #72 (comment), giving the implementation a chance to read+copy the filename into its own local buffer, before proceeding to write the result into the (now free to use) PacketBuf.
  2. Add an optional output_buffer argument to the GdbStubBuilder, which gives users an option to provide a separate output buffer, enabling "transparent" use of 100% of the packet buffer without requiring any API contortions.

FWIW, I actually lean towards the first option, as while it results in a slightly less "elegant" API, it doesn't require the end user to allocate a buffer that will be left unused almost all of the time.
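The first option might be sketched like this; the function shape, the closure-based hand-off, and the 64-byte local buffer are all illustrative assumptions, not gdbstub's actual read_link API:

```rust
// Sketch of approach 1: the handler gets a chance to copy the filename out of
// the packet buffer into storage of *its own* choosing, after which the whole
// packet buffer is free to hold the response.
fn handle_read_link(
    packet_buf: &mut [u8],
    filename_len: usize,
    target_read_link: impl Fn(&[u8], &mut [u8]) -> usize,
) -> usize {
    // Handler-chosen local storage for the filename (could just as well be
    // heap-allocated or a long-lived global; that decision stays with the user).
    let mut local = [0u8; 64];
    let filename = &mut local[..filename_len];
    filename.copy_from_slice(&packet_buf[..filename_len]);
    // packet_buf is now free to receive a full-PacketSize response.
    target_read_link(filename, packet_buf)
}

fn main() {
    let mut buf = *b"/tmp/link\0\0\0\0\0\0\0";
    let n = handle_read_link(&mut buf, 9, |name, out| {
        assert_eq!(name, b"/tmp/link");
        out[..6].copy_from_slice(b"target");
        6
    });
    assert_eq!(n, 6);
    assert_eq!(&buf[..6], b"target");
}
```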


Also, some more thoughts on gdbstub's approach to alloc:

If gdbstub was an alloc library, the entire API and internal structure of the library would look completely different.

It would probably look a lot closer to https://github.com/luser/rust-gdb-remote-protocol, with plenty of Vecs and Strings being used as part of the API, and the internal implementation of the library would look a lot "cleaner" to the typical Rust programmer, as it would be devoid of any lifetimes and complex ownership juggling.

But gdbstub is not an alloc library. The fact that it is no_std + no_alloc means that implementing code in gdbstub is harder and more annoying - that's just the name of the game when working with no_std code!

The constraints of the platform that no_std code is expected to run on results in library code that doesn't look like "typical" Rust code you'd see on hosted platforms. Instead of heap-allocation + .clone(), you'll see delicate lifetime annotations + zero copy references to buffers. Instead of allocating heap memory when a buffer is needed, you'll see that the buffer has to be provided by the user, as not to "force" the implementation to have some kind of inherent memory requirement.
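As a tiny illustration of that caller-provides-the-buffer pattern (hex_encode is a made-up helper for this sketch, not part of gdbstub):

```rust
// no_std-style API shape: the library never allocates; the caller supplies
// the output buffer and receives a borrowed view into it.
fn hex_encode<'a>(input: &[u8], out: &'a mut [u8]) -> &'a [u8] {
    const HEX: &[u8; 16] = b"0123456789abcdef";
    for (i, b) in input.iter().enumerate() {
        out[2 * i] = HEX[(b >> 4) as usize];
        out[2 * i + 1] = HEX[(b & 0xf) as usize];
    }
    &out[..input.len() * 2]
}

fn main() {
    // The caller decides where this buffer lives: stack, static, or heap.
    let mut buf = [0u8; 8];
    assert_eq!(hex_encode(b"hi", &mut buf), b"6869");
}
```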

I honestly feel like I should put this explanation somewhere more visible, like a top-level CONTRIBUTING.md, since it really cuts to the core of gdbstub's implementation philosophy.

To really summarize the point: gdbstub is no_std first, and will only ever use a light sprinkle of alloc as an entirely optional way to make life easier on hosted platforms.


bet4it commented on May 23, 2024

using bytes or any similar alloc-dependent crates is not an option

I know this, so I said that if what bytes does is what we need, we might suggest that upstream not use alloc when the buffer is provided by the user; then bytes could be no_std + no_alloc.


giving the implementation a chance to read+copy the filename into its own local buffer

I actually lean towards the first option, as while it results in a slightly less "elegant" API, it doesn't require the end user to allocate a buffer that will be left unused almost all of the time.

Didn't you say it's impossible for us to allocate a local buffer of unknown size?


daniel5151 commented on May 23, 2024

Ah, apologies - I didn't catch the part about wanting to upstream these changes...

I don't believe that bytes would want to upstream that kind of change, as it doesn't really fit with what that crate is trying to do. Feel free to open an issue upstream to ask, but I would be very surprised if that were a change they'd be willing to make / implement.

For gdbstub, we'd want to roll our own abstraction.


Yes - "us", as in "us who write the code in gdbstub", cannot allocate any buffers on the user's behalf. There is no reason why the user cannot allocate however they like from within the readlink handler itself - that code is out of our control.

If they want to allocate some buffer on the heap, sure. If they want to stack allocate it, also cool. If they want to write data into some long-lived global buffer, that's also fine. The important thing is we don't make the decision for them.

