ofiwg / libfabric

Open Fabric Interfaces

Home Page: http://libfabric.org/

License: Other

Shell 0.91% Perl 0.15% C 95.46% C++ 0.67% Ruby 0.05% Makefile 0.26% M4 1.21% PowerShell 0.01% Roff 0.15% Python 0.98% Batchfile 0.11% Assembly 0.02% Groovy 0.03%

libfabric's People

Contributors

a-szegel, acgoldma, aingerson, belynam, darrylabbate, dmitrygx, github-actions[bot], goodell, hppritcha, iziemba, j-xiong, jithinjose, jsquyres, jswaro, ofiwg-bot, ooststep, pkcoff, rajachan, rwespetal, shefty, shijin-aws, sungeunchoi, sunkuamzn, swelch, sydidelot, vkrishna, wenduwan, wzamazon, zachdworkin, zhngaj

libfabric's Issues

Add fork support for new connections

From the OFIWG F2F, there was a request to allow an application to indicate that a new connection request would be handed off to another process (forked or otherwise). The idea is that the provider could arrange its data structures accordingly, so that the new connection could successfully be migrated to another process.

Are src_addr and dest_addr in fi_info needed

struct fi_info contains a source and destination address, which correspond to an endpoint address. The fi_getinfo call takes a node and service parameter, which represent either the source or destination address. Determine if the src_addr and dest_addr fields are needed. The actual addresses can be retrieved from the endpoint getname/getpeer calls, once an endpoint has been created. With the librdmacm, the addresses were used to identify a local device, but that can be determined through the fi_info::domain_name field.

Allow user to register for miscellaneous events

Expand the EQ API to allow an application to register for specific types of events, including fabric and provider specific events -- e.g. remote node available, port up/down, topology change, congestion notification, receive buffer consumed, etc.

Split EQs into control and data event domains

Identify EQs as either belonging to a control or data domain. Data EQs are equivalent to CQs -- used to report data transfer completions -- and are optimized for performance, expected to be implemented in HW. Control EQs will be used to report all other events, and will trade off performance for ease of use by the app.

Support memory window like registration method

Registering memory exposes a memory region for access by remote processes immediately after the registration completes. The region is open to access by all endpoints associated with a domain. Define a mechanism by which the region is 'closed' for access until it is attached to a specific endpoint.

Define use case for fi_info auth_key or remove

The auth_key and auth_keylen fields of struct fi_info are intended to be used for job authorization. Determine if there's a use case for these fields as defined (since they come from the application). If not, remove them and determine what mechanism, if any, libfabric needs to support job isolation.

Investigate triggered EQ operations

Triggered requests are somewhat defined for generating a data transfer when an event occurs. Investigate whether it makes sense for a triggered request to take other actions, such as inserting an item into an EQ. Review application requirements to see if this makes sense, versus using an existing mechanism, such as selective event generation.

Remove verbs/rdmacm from libfabric

From the OFIWG F2F - a decision was made to delay any extension support to the existing verbs interfaces. Remove the verbs and rdma cm code bases from libfabric, and instead use whatever version may be installed on the current system. This will also help prevent conflicts between distro, OFED, and/or vendor versions of the libraries and that used by libfabric.

Document priority of provider endpoint types

Applications need some sort of hint regarding the optimal way to use a provider, in the absence of application usage hints. Document a method by which a provider can indicate the best method for using their hardware. This may be as simple as returning fi_info structures in priority order.

Define and document flow control

There needs to be a mechanism for applications to enable/disable flow control, along with events defined when flow control or buffer overruns occur.

Add fi_addr_t data type

We can provide stronger type checking in the data transfer APIs by using a typedef for remote addresses. E.g. typedef uint64_t fi_addr_t. The output of AV insert would be type fi_addr_t. The data transfer calls, sendto, writeto, etc. would accept this type. This would force the use of an AV for all unconnected endpoint types. It also guarantees that 64-bits of addressing data is available to the provider to return from AV insert, making it simpler to encode raw address data.
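A minimal sketch of the proposal, assuming the AV is a simple table and the returned fi_addr_t is an index into it. Only the typedef comes from the issue; every `demo_` name is hypothetical and is not libfabric API. Note that a C typedef is an alias rather than a distinct type, so the compiler will still accept any uint64_t here; the gain is mostly in documenting intent in the function signatures.

```c
#include <stddef.h>
#include <stdint.h>

/* From the issue: a dedicated name for addresses returned by AV insert. */
typedef uint64_t fi_addr_t;

/* Hypothetical stand-in for an address vector (not libfabric API). */
#define DEMO_AV_CAPACITY 8
#define DEMO_ADDR_INVALID UINT64_MAX

struct demo_av {
    uint32_t raw[DEMO_AV_CAPACITY]; /* raw transport addresses, illustrative */
    size_t count;                   /* number of entries in use */
};

/* Insert a raw address. The returned fi_addr_t is the table index
 * (an assumption; a provider could encode raw address data instead). */
fi_addr_t demo_av_insert(struct demo_av *av, uint32_t raw_addr)
{
    if (av->count >= DEMO_AV_CAPACITY)
        return DEMO_ADDR_INVALID;
    av->raw[av->count] = raw_addr;
    return (fi_addr_t)av->count++;
}

/* A sendto-style call accepts fi_addr_t and resolves it through the AV. */
int demo_sendto(const struct demo_av *av, fi_addr_t dest, uint32_t *resolved)
{
    if (dest >= av->count)
        return -1;
    *resolved = av->raw[dest];
    return 0;
}
```

Because the data transfer call only accepts the value returned from the insert, an unconnected endpoint cannot be addressed without going through the AV, which is the constraint the issue describes.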

Support EP getname aligning with AV insert

Verify and document that the output from EP getname may be used with AV insert. This allows an application to do an all-to-all exchange of addresses and insert the results into an AV table.

Control EQ needs to report event type

An EQ used to report control related events (e.g. CM requests, memory registration, AV insertions, etc.) must indicate the type of event that was read. Either we need a generic event structure for this purpose, or the EQ read must return the event through a separate parameter. Control EQs may need to return a single event per read.
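One way to picture the first option (a generic event structure) is sketched below. The event types, the structure layout, and the fixed-depth ring are all assumptions made for illustration; none of the `demo_` names exist in libfabric. The read call returns a single event, matching the note that control EQs may need to return one event per read.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control event types, taken from the examples in the issue. */
enum demo_event_type {
    DEMO_EV_CM_REQUEST,
    DEMO_EV_MR_COMPLETE,
    DEMO_EV_AV_COMPLETE
};

/* Generic event: the type comes first so a reader can always decode it. */
struct demo_event {
    enum demo_event_type type;
    void *context;  /* user context associated with the request */
    uint64_t data;  /* type-specific payload */
};

#define DEMO_EQ_DEPTH 4

struct demo_ctrl_eq {
    struct demo_event ring[DEMO_EQ_DEPTH];
    size_t head; /* count of events read so far */
    size_t tail; /* count of events written so far */
};

/* Queue one event. Returns 0 on success, -1 if the queue is full. */
int demo_eq_write(struct demo_ctrl_eq *eq, struct demo_event ev)
{
    if (eq->tail - eq->head == DEMO_EQ_DEPTH)
        return -1;
    eq->ring[eq->tail % DEMO_EQ_DEPTH] = ev;
    eq->tail++;
    return 0;
}

/* Read exactly one event. Returns 1 if an event was copied out, 0 if empty. */
int demo_eq_read(struct demo_ctrl_eq *eq, struct demo_event *ev)
{
    if (eq->head == eq->tail)
        return 0;
    *ev = eq->ring[eq->head % DEMO_EQ_DEPTH];
    eq->head++;
    return 1;
}
```

A caller would switch on `ev.type` after each successful read. The alternative in the issue, returning the type through a separate parameter, would move `type` out of the structure and into the read signature.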

Fix conflict binding EP to EQ

An EQ may be associated with a domain or a fabric object. (The fabric EQ may be modified to be unassociated.) When binding an EP to an EQ, there's no way to know if the EQ was associated with the domain or fabric object. This can result in a provider attempting to dereference a fabric EQ as a domain EQ, resulting in a crash.
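The crash described above can be avoided if every object carries a class tag that is checked before any cast. The sketch below assumes such a tag in a common header structure; the types and the `demo_ep_bind_eq` function are hypothetical and only illustrate the check, not how any provider actually implements binding.

```c
#include <stddef.h>

/* Hypothetical class tags distinguishing the two kinds of EQ. */
enum demo_fclass {
    DEMO_CLASS_FABRIC_EQ,
    DEMO_CLASS_DOMAIN_EQ
};

/* Common header placed first in every object so it can be inspected safely. */
struct demo_fid {
    enum demo_fclass fclass;
};

struct demo_fabric_eq {
    struct demo_fid fid;
    int fabric_state;
};

struct demo_domain_eq {
    struct demo_fid fid;
    int domain_state;
};

/* Bind an endpoint to an EQ. The class is checked before the cast, so a
 * fabric EQ is rejected instead of being dereferenced as a domain EQ. */
int demo_ep_bind_eq(struct demo_fid *eq_fid, int *bound_state)
{
    struct demo_domain_eq *eq;

    if (eq_fid == NULL || eq_fid->fclass != DEMO_CLASS_DOMAIN_EQ)
        return -1;

    /* Valid in C: a pointer to the first member converts to the struct. */
    eq = (struct demo_domain_eq *)eq_fid;
    *bound_state = eq->domain_state;
    return 0;
}
```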

Support writing user events to an EQ

Update EQ API to allow an application to insert user defined events onto an EQ. This will be an optional feature for data EQs, but supported on control EQs.

Define raw/packet endpoints

Document the use of raw and packet endpoint types. Define flow steering mechanisms for packet endpoints. The flow steering defined in libibverbs is a reasonable starting point.

Add AV lookup call

Add a call to retrieve one or more addresses stored in an AV. This may be useful to apps for debugging purposes, or for extracting addresses from an AV in order to share them with another process.

Expand iovec support to include other data formats

Other data formats may be more concise than iovec for referencing multiple buffers. For example, strided operations may be able to point to a buffer, a size, an offset between buffers, and the number of buffers using a single structure, rather than chaining together a large set of SGEs. Incorporate 'expanded iovec' support into the APIs.
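As a rough illustration of the strided case, the structure below carries the four values named above (buffer, size, offset between buffers, number of buffers), and a helper expands it into the equivalent iovec array. The structure and function names are hypothetical; the issue does not define a layout.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical strided descriptor: one structure in place of `count` SGEs. */
struct demo_stride_vec {
    void *base;    /* address of the first buffer */
    size_t len;    /* size of each buffer in bytes */
    size_t stride; /* offset in bytes from one buffer start to the next */
    size_t count;  /* number of buffers */
};

/* Expand the descriptor into at most max_iov iovec entries.
 * Returns the number of entries written. */
size_t demo_stride_expand(const struct demo_stride_vec *sv,
                          struct iovec *iov, size_t max_iov)
{
    size_t n = sv->count < max_iov ? sv->count : max_iov;
    size_t i;

    for (i = 0; i < n; i++) {
        iov[i].iov_base = (char *)sv->base + i * sv->stride;
        iov[i].iov_len = sv->len;
    }
    return n;
}
```

The descriptor stays the same size regardless of `count`, while the expanded form grows by one iovec per buffer, which is the conciseness argument made in the issue.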

Add queue sizes to endpoint attribute

The endpoint attribute structure should be expanded to expose the size of the underlying queue. Now that the EP attributes exist, we can simplify things for the user and avoid needing to use control interfaces to override the default values. But default values should still be available to the user, with the actual values returned when an endpoint is created.

Allow re-use of fi_info returned from fi_getinfo

The struct fi_info returned from fi_getinfo may only be used once. Redefine the API to allow the fi_info to be used multiple times. This will require changes from the verbs provider to handle the 'data' field differently. Note that the data field is also used when establishing a connection request.

Modify use of fi_sync

fi_sync was intended to allow applications to block until all data transfers of a specific type have completed. It's actually exceptionally hard to implement over all existing hardware. Remove it. Applications can use the primitive EQ events or counters to wait until all necessary operations have completed.

Data structure versioning

From the OFIWG F2F, data structure versions will be indicated using a version parameter to fi_getinfo. The version parameter will indicate the version of the set of data structures known to the application. libfabric will adjust its behavior accordingly, based on the data structures and fields known to the app. This mechanism will replace the field/mask concept in the current data structure scheme.
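A sketch of how a version parameter could gate field access, assuming a major/minor encoding packed into one integer. The macros, the structure, its fields, and `demo_getinfo` are all hypothetical stand-ins; in particular, the `mode` field being "added in 1.1" is invented for the example.

```c
#include <stdint.h>

/* Hypothetical encoding: major in the upper 16 bits, minor in the lower. */
#define DEMO_VERSION(major, minor) \
    (((uint32_t)(major) << 16) | (uint32_t)(minor))
#define DEMO_MAJOR(v) ((uint32_t)(v) >> 16)
#define DEMO_MINOR(v) ((uint32_t)(v) & 0xFFFFu)

/* Hypothetical info structure that grew a field between releases. */
struct demo_info {
    uint64_t caps; /* known to every 1.x application */
    uint64_t mode; /* imagined as added in 1.1 */
};

/* Fill only the fields that the application's version knows about.
 * Returns 0 on success, -1 if the major version is unsupported. */
int demo_getinfo(uint32_t app_version, struct demo_info *info)
{
    if (DEMO_MAJOR(app_version) != 1)
        return -1;

    info->caps = 0x3;
    if (DEMO_MINOR(app_version) >= 1)
        info->mode = 0x1; /* never touched for a 1.0 application */
    return 0;
}
```

The application passes the version it was compiled against, so a 1.0 binary never has memory written past the fields it declared, without any per-field mask.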

Interface structure versioning

From OFIWG F2F, interface structure versioning will be done using a size field within the struct. A query method (or static inline or define) will indicate if a specific interface is available.
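The size-field approach might look like the sketch below, where a macro plays the role of the "define" query mentioned above. The operations table and its members are hypothetical, and the old provider is simulated by reporting a smaller size rather than by compiling against an older header.

```c
#include <stddef.h>

/* Hypothetical endpoint operations table; not libfabric's actual layout. */
struct demo_ep_ops {
    size_t size;              /* bytes of this struct the provider filled in */
    int (*send)(int value);
    int (*recv)(int value);
    int (*cancel)(int value); /* imagined as added in a later release */
};

/* True when `member` lies inside the provider's struct and is non-NULL.
 * The size test short-circuits, so an absent member is never read. */
#define DEMO_EP_OP_AVAILABLE(ops, member)                       \
    ((ops)->size >= offsetof(struct demo_ep_ops, member) +      \
                        sizeof((ops)->member) &&                \
     (ops)->member != NULL)

/* Trivial operation used to populate the tables. */
int demo_echo(int value)
{
    return value;
}

/* Simulates a provider built before `cancel` existed. */
struct demo_ep_ops demo_old_provider_ops(void)
{
    struct demo_ep_ops ops = { 0 };

    ops.size = offsetof(struct demo_ep_ops, cancel);
    ops.send = demo_echo;
    ops.recv = demo_echo;
    return ops;
}
```

An application would guard each optional call with `DEMO_EP_OP_AVAILABLE(ops, cancel)` before invoking it.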

Make EQ readfrom support optional

Allow a provider to optionally support EQ readfrom. It may not be possible for a provider to implement readfrom efficiently, compared to the app carrying the source address in the message. Also figure out how to write the first sentence without using a split infinitive.

Define or remove data_flow

The data flow endpoint attribute is basically defining sessions. Either fully define it or remove it from the API until it can be defined. It may be possible to remove data flow in favor of fully defined sessions.

Determine if EP attribute fields should be ssize_t

There are several fi_ep_attr fields which are size_t. Determine which, if any, should be ssize_t, so that a negative value can be used to indicate that the provider can select the maximum or best default value.
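To make the trade-off concrete, the sketch below resolves a requested ssize_t value against provider limits. The two sentinel values, the limits structure, and the choice to clamp oversized requests are assumptions for illustration only; the issue leaves all of these open.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical sentinels an application could place in an ssize_t field. */
#define DEMO_USE_DEFAULT ((ssize_t)-1) /* provider picks its best default */
#define DEMO_USE_MAX     ((ssize_t)-2) /* provider picks its maximum */

/* Hypothetical provider limits for one attribute. */
struct demo_limits {
    size_t best_default;
    size_t max;
};

/* Turn the requested value into the size the provider will actually use.
 * Any negative value other than DEMO_USE_MAX is treated as "default";
 * positive requests above the maximum are clamped (an assumption). */
size_t demo_resolve_size(ssize_t requested, const struct demo_limits *lim)
{
    if (requested == DEMO_USE_MAX)
        return lim->max;
    if (requested < 0)
        return lim->best_default;
    if ((size_t)requested > lim->max)
        return lim->max;
    return (size_t)requested;
}
```

With size_t fields, the same intent would need a reserved positive value such as zero or SIZE_MAX, which is the ambiguity the issue is trying to avoid.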

Decide whether to rename FI_RDM to FI_RUM

RDM = reliable datagram message. This is based on the socket type of a similar name. Decide whether this is acceptable or if we should rename this to RUM - reliable unconnected message.

Expose maximum remote EQ data size

Define a mechanism to request and report the maximum size of remote EQ data (i.e. immediate data). Need to decide if the size is reported in bits or bytes.

Define an endpoint as a session address

Endpoints are vaguely defined. Refine the endpoint definition to indicate that an endpoint represents a session level address (using the OSI model). As such, multiple endpoints may share the same transport and network address, if multiple sessions are defined. Expand the API and data structures to handle this.

Define only fi_close

Eliminate all references in the man pages to close calls other than fi_close. Do not define object specific close calls, such as fi_ep_close.

Replace AV FI_RANGE with FI_SYMMETRIC

In practice, FI_RANGE is difficult to use and harder to implement. Define an alternative method, FI_SYMMETRIC, instead, which allows an app to indicate that addresses on remote systems use the same transport addresses (port numbers), with an equal number of processes placed per node. This will simplify the implementation and the application usage of the interface, plus enable optimal storage of the data.
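The storage benefit can be sketched as follows: with a symmetric layout, the AV keeps one address per node plus two integers, and each process address is computed on demand. The node-major ordering and the use of consecutive ports starting at a base port are assumptions made for this example, as are all `demo_` names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symmetric AV: one entry per node, not one per process. */
struct demo_sym_av {
    const uint32_t *node_addrs; /* network address of each node */
    size_t node_count;
    size_t procs_per_node;      /* equal on every node, per the proposal */
    uint16_t base_port;         /* port of the first process on each node */
};

/* Compute the address of process `index`, assuming node-major ordering
 * and consecutive ports on each node. Returns 0 on success, -1 if the
 * index is out of range. */
int demo_sym_lookup(const struct demo_sym_av *av, size_t index,
                    uint32_t *node_addr, uint16_t *port)
{
    if (av->procs_per_node == 0 ||
        index >= av->node_count * av->procs_per_node)
        return -1;

    *node_addr = av->node_addrs[index / av->procs_per_node];
    *port = (uint16_t)(av->base_port + index % av->procs_per_node);
    return 0;
}
```

For N nodes with P processes each, this stores N addresses instead of N times P.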

Provide mechanism to define tagged fields

The tagged message API uses a tag and a mask to identify messages. However, applications are really using the tag space as a collection of independent fields. For example, 32 bits may represent the message, 16 bits the source address, and 16 bits a group address. By exposing the use of the fields to a provider, it enables additional optimizations within the provider. For example, a provider could maintain separate queues, to greatly improve search times. Define a mechanism by which the app and provider can communicate the number of fields and their size.
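The example split of a 64-bit tag into message, source, and group fields can be written down as a small field table. The descriptor structure, the bit positions chosen for each field, and the helper functions are hypothetical; the issue only gives the field widths.

```c
#include <stdint.h>

/* Hypothetical field descriptor: position and width within a 64-bit tag. */
struct demo_tag_field {
    unsigned shift; /* bit offset of the field's least significant bit */
    unsigned bits;  /* field width in bits */
};

/* Layout from the example: 32-bit message, 16-bit source, 16-bit group.
 * The ordering of the fields within the tag is an assumption. */
const struct demo_tag_field DEMO_TAG_MSG   = { 0, 32 };
const struct demo_tag_field DEMO_TAG_SRC   = { 32, 16 };
const struct demo_tag_field DEMO_TAG_GROUP = { 48, 16 };

/* Mask covering exactly the bits of one field. */
uint64_t demo_tag_mask(struct demo_tag_field f)
{
    uint64_t low = f.bits >= 64 ? UINT64_MAX
                                : ((UINT64_C(1) << f.bits) - 1);
    return low << f.shift;
}

/* Return `tag` with field `f` replaced by `value` (truncated to fit). */
uint64_t demo_tag_set(uint64_t tag, struct demo_tag_field f, uint64_t value)
{
    uint64_t mask = demo_tag_mask(f);
    return (tag & ~mask) | ((value << f.shift) & mask);
}

/* Extract the value of field `f` from `tag`. */
uint64_t demo_tag_get(uint64_t tag, struct demo_tag_field f)
{
    return (tag & demo_tag_mask(f)) >> f.shift;
}
```

A receive that matches on source only would pass `demo_tag_mask(DEMO_TAG_SRC)` as its mask. If the provider were told the field table up front, it could, for instance, index a queue per source value instead of scanning one list.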

Define link between fabric object and providers

There's not a clear link between what's needed to implement the fabric object and related interfaces and the providers. Maybe we need a fabric provider? Or providers need interfaces that allow the framework to implement fabric interface support in a generic fashion.

Separate FI_REMOTE_ACK from FI_REMOTE_COMPLETE

There are 3 identified cases where an operation can complete on the initiator side of a data transfer. The first is when the data buffer is reusable. The second is when the transfer has been ack'ed by the remote side (FI_REMOTE_ACK). The third is when the remote side has placed the data into a fault domain outside of the fabric hardware (such as memory, NVM, or hard disk) -- FI_REMOTE_COMPLETE. Applications may have use for any of these notification types. See if remote ack and remote complete both need to exist, and if so, define them.

Redefine msg_tag_value

The endpoint attribute msg_tag_value is defined as a maximum value. Redefine as a number of bits. This is needed to align with defining the tagged bits as fields, rather than generic bits.
