Coder Social home page Coder Social logo

Comments (20)

jkryl avatar jkryl commented on July 17, 2024 1

Hi @xxks-kkk! Most likely this is the same problem as I had. The upgrade of spdk would most likely help. The PR#8 you mentioned cannot be used in its current form because it has a problem with linking SPDK libraries. We are considering multiple options of how to fix it but none of them is easy. I don't dare to estimate how and when this could be fixed.

from starfish.

jkozlowski avatar jkozlowski commented on July 17, 2024

Hey @jkryl,

Nice work! I haven't touched the code in a while due to other committments, but hoping to get back to it in the new year.

Good investigative work! I have run into the same issue before with librte_mempool_ring missing, but I think at that point using the version of spdk that produced a static library solved it for me. My suggestion (if you want to get yourself ublocked) is to just use in-memory backend for all this (the way it's all setup in the tests here), but I might be wrong? The spdk_nvmf_subsystem_state_change kinda suggests to me you're trying to run on actual hardware, so maybe using the in-memory backend would be ok?

Anyway, I don't have a good suggestion off the top of my head right now, in the past I was using cargo config to pass custom linker flags, but it was really hacky (example is here: https://github.com/jkozlowski/starfish/blob/083afb35e5ec870af3d490a0a9b53760947223bf/spdk-sys/.cargo/config) so I'd rather not go back to that.

Out of curiosity what's your motivation? The library is really unfinished, so I wouldn't get your hopes up, but I'm hacking on it whenever time permits. If you're interested in contributing, would be great to collaborate. One thing I haven't bothered figuring out is what happens when futures are cancelled (there was an issue on https://github.com/rust-lang-nursery/wg-net about this but I can't find it): basically I'm currently using references to pass buffers to the library but those might dissapear, because futures are polling and spdk is callback/completion based.

Anyway, lemme know what you think and please push a PR if you manage to solve this before me!

from starfish.

gila avatar gila commented on July 17, 2024

Just using the memory backend won't solve the problem I think. These subsystem modules are loaded before main (i.e using the attribute flags) The only way to disable the nvmf target pieces then is simply to not link them in.

As for the rustc flags, I've actually tried that and works. However, when running a release build, they still get lost :(

from starfish.

jkozlowski avatar jkozlowski commented on July 17, 2024

Hmm, then I'm not sure what's happening. As you can see tests compile and run on CircleCI and I used the same exact setup on my Ubuntu box to run them and I'm not getting those failures :(

As for the rustc flags, I've actually tried that and works. However, when running a release build, they still get lost :(

Yeah, that's why I'd rather not go back to them, they were so hit and miss :( Another approach I tried in the past was writing a custom Makefile that pulls in the spdk Makefile, requests all the libs and builds the libraries. Getting a single statically linked lib was so sweet, why did they pull the plug on that?

In any case, lemme know how it goes, it might just force me to sacrifice my other commitments and play with this myself...

from starfish.

jkozlowski avatar jkozlowski commented on July 17, 2024

Hmm, I'm wondering if this is actually worth reporting to spdk folks:

Just running the example for the shared libraries doesn't seem to work for me.

./configure --with-shared
make
ldconfig -v -n ./build/lib
LD_LIBRARY_PATH=./build/lib/ ./app/spdk_tgt/spdk_tgt

from starfish.

jkozlowski avatar jkozlowski commented on July 17, 2024

As a data point, I have followed your instructions and managed to run the tests. I would actually be willing to take a PR with your changes anyway: it's already pretty hard to run these apps that I have a script to run tests locally anyway. So send a PR and add this file https://github.com/jkozlowski/starfish/pull/7/files#diff-40703290474a4270c464349966a9dc2c with suitable LD_* stuff and I'll happily merge.

Keeping up with spdk is more important for me now than a bit of a hacky setup; and in the meantime I might raise an issue on the spdk repo (or feel free to do it yourself, I'm going offline in a sec).

Also, hi @gila, I originally didn't realise there was yet another person on this issue!

from starfish.

jkozlowski avatar jkozlowski commented on July 17, 2024

Just running the example for the shared libraries doesn't seem to work for me.

Nevermind, just forgot sudo...

from starfish.

jkryl avatar jkryl commented on July 17, 2024

@jkozlowski thanks for your prompt response! I submitted a PR with the changes. In a long term, using static libraries would be probably even more useful. But yea, good to be able to run latest spdk version. To answer your question regarding motivation, I just saw the library and thought it was interesting so I decided to give it a try and learn from it. I understand it is not complete and that's fine. I see a great value in combining SPDK with rust async/await feature. Thanks for pioneering this area! :-)

from starfish.

jkozlowski avatar jkozlowski commented on July 17, 2024

I would also prefer a statically linked spdk! Any idea how we can get that?

I really don't understand (I mean I understand , but I don't buy that it's a problem in general) why you wouldn't want that!

from starfish.

jkozlowski avatar jkozlowski commented on July 17, 2024

I see a great value in combining SPDK with rust async/await feature.

I know right! I would love to get a bit more time to work on this, the potential for something of the seastar architecture with the expressivity of rust is rather great! Throw in some quic networking and some nice RPC library on top of that and things could get pretty interesting still.

from starfish.

jkryl avatar jkryl commented on July 17, 2024

absolutely! šŸ‘

from starfish.

xxks-kkk avatar xxks-kkk commented on July 17, 2024

I run cargo test and hit the following error on SPDK 18.07. I'm not sure if it is the same error message that is reported:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fcfeb492535 in __GI_abort () at abort.c:79
#2  0x00007fcfeb4f9516 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7fcfeb61dc00 "%s\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007fcfeb5003aa in malloc_printerr (str=str@entry=0x7fcfeb61fc88 "malloc(): smallbin double linked list corrupted") at malloc.c:5336
#4  0x00007fcfeb504144 in _int_malloc (av=av@entry=0x7fcfe4000020, bytes=bytes@entry=32) at malloc.c:3632
#5  0x00007fcfeb506bc1 in __libc_calloc (n=n@entry=1, elem_size=elem_size@entry=32) at malloc.c:3420
#6  0x00007fcfeb749b27 in spdk_nvmf_poll_group_remove_subsystem (group=0x7fcfe4414600, subsystem=0x7fcfe4414430, cb_fn=0x7fcfeb7472b0 <subsystem_state_change_continue>, cb_arg=0x7fcfe441c690) at nvmf.c:993
#7  0x00007fcfeb73e7d2 in _spdk_event_queue_run_batch (reactor=0x7fcfe400d100, reactor=0x7fcfe400d100) at reactor.c:207
#8  _spdk_reactor_run (arg=0x7fcfe400d100) at reactor.c:506
#9  0x00007fcfeb73ee29 in spdk_reactors_start () at reactor.c:692
#10 0x00007fcfeb73daa5 in spdk_app_start (opts=0x7fcfeabfa3a0, start_fn=0x5586eca1a140, arg1=0x7fcfeabfa198, arg2=0x0) at app.c:576

I guess SPDK 18.10 is the way to go? The print out from the program:

running 1 test
Starting SPDK v18.07 / DPDK 18.05.0 initialization...
[ DPDK EAL parameters: hello_blob -c 0x1 --legacy-mem --file-prefix=spdk_pid3004 ]
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/spdk_pid3004/mp_socket
EAL: Probing VFIO support...
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
app.c: 530:spdk_app_start: *NOTICE*: Total cores available: 1
reactor.c: 718:spdk_reactors_init: *NOTICE*: Occupied cpu socket mask is 0x1
reactor.c: 492:_spdk_reactor_run: *NOTICE*: Reactor started on core 0 on socket 0
error: process didn't exit successfully: `/home/ubuntu/starfish/target/debug/deps/spdk_sys-3165ff8a68d39986` (signal: 11, SIGSEGV: invalid memory reference)

Since the repo is in the middle of change to support SPDK 18.10, I'm also curious with the pull request #8 Will the test case pass @jkryl ?

Thanks!

from starfish.

jkryl avatar jkryl commented on July 17, 2024

I have some new interesting ideas around this. @jkozlowski if you are interested to review them please do. thanks!

spdk-sys crate

Problem statement

Creating a crate with bindings for spdk lib is challenging:

  1. The public interface of spdk lib contains countless number of function, constant and structure definitions. That makes creating the bindings by hand unfeasible. rustā€™s bindgen must be used. The result is a huge file with all public member definitions in rust with tens thousands of lines, which makes the compilation slower. In fact we donā€™t generate bindings for all spdk header files, but just some and even then the file is big enough.
  2. SPDK is modular framework and contains a lot of optional modules which can but donā€™t have to be included. Depending on which modules are used, the right header files must be used for generating rust bindings and the right libraries must be used for linking. The question is how to master this complexity without having hard coded list of headers and libs in a build script.
  3. SPDK is very much optimised for the cpu it was compiled on. This goes much further than just different cpu architectures. Even various x86 64-bit processors are not compatible between each other. It is non-trivial to produce a binary which works on letā€™s say all x86 64-bit cpus manufactured in last 10 years.
  4. Linking spdk libraries together with rust program is troublesome because some spdk libs are not included in the resulting binary. Thatā€™s because there is no explicit dependency on some of the libs. The libs are rather plugable modules which if loaded by the dynamic linker (shlib) or present in the binary (static) register themselves with spdk using ā€œconstructorā€ functions which are executed before the main starts. Linker cannot detect such dependencies and therefore ignores the libraries as it thinks they are not needed. However spdk cannot even start without some of such modules. So this is even more fundamental problem than the previous ones.

The purpose of this doc is to suggest a solution for the last problem mentioned above. As for the first three problems we assume that:

  1. Compilation time does not concern us.
  2. Size of the produced library/binary does not concern us. We expose all modules and functionality which spdk provides even if not used by the application in the end.
  3. The only supported cpu architecture is x86_64 and the cpu must not be too old.

Though there are ways how to address first two problems in rust using ā€œfeaturesā€, it would require more work for the initial implementation and there is also maintenance burden as spdk libraries and headers get removed or added between SPDK releases. That said it would be a good extension of spdk sys crate later when it gets more mature.

Goals

We wish to have a spdk-sys crate which:

  1. follows the best practises for sys crates in rust
  2. is a general rust interface to spdk usable by any project making use of spdk
  3. allow using modified SPDK which differs from the upstream

Best practices

Best practices follow from articles listed in the links section and exemplar rust sys crates (libgit2-sys).

  • static vs dynamic: sys crates for libraries come in two basic flavours depending on how they link to the library: static or dynamic. In most cases static libraries are preferred because they minimise number of build and runtime dependencies. The program just works after being installed without having to do other prerequisite steps like using package manager to install the dependencies. This makes the program more robust and less error prone user experience.
  • building the C library: A neat optional feature of many sys crates is that they can build the library without relying on it being installed on the system. It avoids problems and unexpected errors during the build stage. The sys crate is used as a dependency in another project and usually it will be a dependency of dependency of dependency ā€¦ Errors about missing libs from deep of the build chain donā€™t contribute to a good user experience. sys crates bundle the source code of the library which they wrap, in form of a git submodule. And many of them donā€™t even use the build system of the library and have their own rules how to build it in the build.rs script. The important fact to note is that if the sys crate builds the library, then it is used exclusively for static libs. The built product which is an object archive is stored in a temporary build directory and apps depending on it take it from there during a link phase. This cannot be done for dynamic libraries as the build script would have to install the shlib to a system location requiring root privileges which is inappropriate action for a build script. Storing the shlib in a build directory is possible, but it would have to be distributed along with the application binary and the path of a temporary build directory would have to be hardcoded (using -rpath) in the app binary which is fragile and complicated.

Switching between different options is done by using rust ā€œfeaturesā€ or environment variables. The default should be set to the most popular way of using the library which depends on the platform and the library itself. Detecting if the lib is installed on a system is done almost exclusively using pkg-config.

Reality of SPDK

SPDK can be built static or dynamic. It is a set of following libraries:

SPDK:

app_rpc bdev bdevio bdev_delay bdev_error bdev_gpt bdev_iscsi bdev_lvol bdev_malloc bdev_null bdev_nvme bdev_passthru bdev_id bdev_rpc bdev_split bdev_virtio blob blob_bdev blobfs conf copy copy_it env_dpdk event event_bdev event_copy event_iscsi event_nbd event_net event_nvmf event_scsi event_vhost event_vmd ftl ioat iscsi json jsonrpc log log_rpc lvol nbd net notify nvme nvmf rpc rte_vhost scsi sock sock_posix thread tce tce_rpc util vhost virtio vmd

DPDK:

bus_pci bus_vdev cmdline compressdev cryptodev eal ethdev hash krgs mbuf mempool mempool_bucket mempool_ring meter net pci ring vhost

ISA-L:

isal

There is no monolithic libspdk.so and libdpdk.so containing all libraries above. When building SPDK with --with-shared configure option, DPDK libs are built as static. This is a bug in SPDK build system and can be easily fixed.

Applying the best practices to SPDK

The most preferred solution would be to build SPDK object archives and link them statically to app. We bang a head against a wall when doing so. Object archives are linked with as-needed or no-whole-archive linker flag in rust and all libs which are not explicitly used by the app are omitted. Since there is no way how to change the default linker behaviour in rust, the only viable
workaround is to pretend that we use all of the libraries by referencing a symbol from each of them. The problem is that some of the libs have only private symbols and we canā€™t reference them anyhow without patching them. There is now way how to make that when using vanilla SPDK (from upstream). Unless the rust build system is enhanced to support -whole-archive linker option (ticket rust-lang/rust#56306), we have to use shared libraries.

Shared libraries suffer from the same problem as static libs as rust build system uses as-needed linker flag. But there is an elegant workaround for that. If we create one big shared library out of all smaller ones then it will be surely referenced because there will be surely at least one call to the library - if there was none then it would not make sense to require the lib by the app in the first place - and the dynamic lib loader must load it all to the memory in one piece . The only problem is that we must create this fat library (as I will be calling it) ourselves. Good news is that it is super simple. We just need to take all object archives and combine them into a single shlib:

cc -shared -o libspdk_fat.so -Wl,--whole-archive *.a -Wl,--no-whole-archive

The result is a suboptimal solution because the shared library must be built and installed to the system prior to building the app and deployed to a target system along with the app. Detecting if libspdk fat lib is installed on the system is cumbersome as we canā€™t use pkg-config. SPDK in general is missing support for pkg-config ( https://trello.com/c/uBM2PR4c/19-generate-pkg-config-pc-file-during-make-install ). In spite of all drawbacks it is kinda standard rust solution for sys crates and works with upstream spdk with no changes required.

Using spdk sys crate

High level steps of how to use spdk sys crate:

Phase 1:

  1. Check out spdk-sys git repository including spdk sources as submodule.
  2. Run a build script which automates steps needed to build SPDK and creates the fat lib.
  3. Install the fat lib to a system location (as root).

Phase 2:

  1. Use spdk-sys crate in dependencies in Cargo.toml.
  2. When spdk-sys is built by rust, the build script merely checks that the fat shlib is installed on the system and generates bindings using bindgen and spdk header files for it (it does not build the library).

The way how the fat shlib is deployed along with the app to a target system is out of scope. In case of a docker image it can be copied from the system where the image is built to a docker image. On other systems it may be delivered in form of a package as it is usually done for other system libraries.

SPDK with patches

Some projects using SPDK in rust will need to use their own version of SPDK which differs from the upstream. There is probably no better way than to clone spdk-sys repository on github and override spdk git submodule in the repository so that it points to their own version of SPDK. In dependencies section of Cargo.toml file must be used a github URL of the cloned spdk-sys repo.

Tests

The sys crate should come with tests. It is out of scope to test each function provided by SPDK. At minimum calling a single function from spdk would be sufficient. Later when support for opt-in modules is added, there should be a similar test for each module.

Links

from starfish.

jkozlowski avatar jkozlowski commented on July 17, 2024

I like the fat shared library approach. I upgraded the startfish-executor project to latest nightly, I will try to do the same for the spdk-sys package. Then split out the async wrappers into a separate project, and try your suggestion.

Can we not build a fat static library? (my knowledge in this domain is not the greatest)

from starfish.

jkryl avatar jkryl commented on July 17, 2024

Hi Jakub, it would be great if we could give it a try. Here is the repository with spdk-sys: https://github.com/jkryl/spdk-sys . Note that it is still very fresh so expect problems when using it šŸ˜› Possible way how to use it in another project is to include it as a submodule, because the fat library needs to be built and installed to the system before running the cargo build so it makes sense for it to be there and use it twice (first time to build and install fat lib and second time to build bindings from headers and serve as a proper sys crate). I'm doing the same work for our private project just now, I suppose it should not be that hard to do the same for starfish for me. I will try to submit a PR ..

from starfish.

jkozlowski avatar jkozlowski commented on July 17, 2024

from starfish.

jkryl avatar jkryl commented on July 17, 2024

@jkozlowski : it would be cool if we could chat on spdk's slack (spdk-team.slack.com) and discuss spdk-sys crate in greater detail there. cheers

from starfish.

jkozlowski avatar jkozlowski commented on July 17, 2024

Happy to, but you'd need to invite me, says it's invite only.

from starfish.

xxks-kkk avatar xxks-kkk commented on July 17, 2024

Is it possible for me to join the conversation as well?

from starfish.

jkryl avatar jkryl commented on July 17, 2024

I have created a "rust" channel on spdk's slack. Anyone is welcomed to join. Link to spdk's slack is at https://spdk.io/community/.

@jkozlowski : I have sent an invite to your email address that I got from linked in. I hope that will work.

from starfish.

Related Issues (2)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    šŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. šŸ“ŠšŸ“ˆšŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ā¤ļø Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.