python-rdma's People

Contributors

jgunthorpe, roidayan, scotttaggart, slavashw

python-rdma's Issues

Discovery fails on broken links

Hi, Jason!

If a bad link is encountered, the subnet discovery fails with an exception. Here is a simple patch to ignore bad links.

Best regards,
Alexander Daryin
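
The patch itself is not reproduced in this listing. As a rough illustration of the idea only, here is a minimal sketch of a breadth-first discovery loop that records a failing link and carries on instead of aborting. `LinkError`, `discover` and the `neighbours` callable are hypothetical names made up for this example; they are not part of python-rdma.

```python
from collections import deque


class LinkError(Exception):
    """Hypothetical error raised when a link cannot be queried."""


def discover(root, neighbours):
    """Breadth-first walk from root that skips links which fail to respond.

    neighbours(node) is a hypothetical callable returning the nodes
    reachable from node; it may raise LinkError for a bad link.
    """
    seen = {root}
    bad = []
    todo = deque([root])
    while todo:
        node = todo.popleft()
        try:
            peers = neighbours(node)
        except LinkError as err:
            # Record the failure and keep going instead of aborting the sweep.
            bad.append((node, str(err)))
            continue
        for peer in peers:
            if peer not in seen:
                seen.add(peer)
                todo.append(peer)
    return seen, bad
```

Returning the list of bad links alongside the discovered nodes keeps the failures visible to the caller rather than silently dropping them.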

No module named ibverbs

I built python-rdma, but when I run
./ibtools help
I get an error.

I looked in
python-rdma/build/lib.linux-x86_64-2.7/rdma

There is no ibverbs.py there, only ibverbs.so.

Can you help me find what's wrong?

union ibv_gid::raw must be defined as "unsigned char"

diff --git a/rdma/libibverbs.pxd b/rdma/libibverbs.pxd
index 055a6d1..3ff90b1 100644
--- a/rdma/libibverbs.pxd
+++ b/rdma/libibverbs.pxd
@@ -6,7 +6,7 @@ include 'libibverbs_enums.pxd'
 cdef extern from 'infiniband/verbs.h':

     union ibv_gid:
-        char raw[16]
+        unsigned char raw[16]

This is how it is defined in verbs.h (OFED 4.4):

union ibv_gid {
	uint8_t			raw[16];
	struct {
		uint64_t	subnet_prefix;
		uint64_t	interface_id;
	} global;
};
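
To see why the signedness matters: whether plain `char` is signed is implementation-defined in C, and on x86 Linux it is signed, so any GID byte of 0x80 or above reads back as a negative number. The sketch below uses Python's ctypes only to illustrate the difference; the GID value is made up for the example.

```python
import ctypes

# Example GID (made up): link-local prefix fe80::/64 plus an interface ID.
gid = bytes([0xFE, 0x80] + [0] * 6 +
            [0x00, 0x02, 0xC9, 0x02, 0x00, 0x44, 0xE8, 0x90])

# c_byte is a signed char, as `char raw[16]` is where char is signed.
signed = (ctypes.c_byte * 16).from_buffer_copy(gid)
# c_ubyte is an unsigned char, matching `uint8_t raw[16]` in verbs.h.
unsigned = (ctypes.c_ubyte * 16).from_buffer_copy(gid)

print(list(signed)[:2])    # [-2, -128]
print(list(unsigned)[:2])  # [254, 128]
```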

ibverbs library is missing

I cloned the repository and ran the sample code, but there is no ibverbs.py file, only files with other extensions, and those do not seem to work: I get an ImportError.

import rdma.ibverbs as ibv;
ImportError: No module named ibverbs
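
A compiled extension such as ibverbs.so can be imported without any ibverbs.py, provided the interpreter is looking in the directory that contains it and the file suffix is one that interpreter accepts. The thread does not confirm the cause here, so the following is only a diagnostic sketch (written for Python 3; `find_module_files` is a name made up for this example) that lists the candidate files in a package directory and whether the running interpreter could load each one.

```python
import importlib.machinery
import os


def find_module_files(package_dir, name):
    """Return (filename, loadable) pairs for files in package_dir named `name`.

    `loadable` is True when the file's suffix is one the running
    interpreter can import (source, bytecode, or a compiled extension).
    """
    suffixes = tuple(importlib.machinery.all_suffixes())
    found = []
    for entry in sorted(os.listdir(package_dir)):
        if entry.split(".", 1)[0] == name:
            rest = entry[len(name):]
            found.append((entry, rest in suffixes))
    return found
```

Calling it as `find_module_files(os.path.dirname(rdma.__file__), "ibverbs")` shows which copy of the `rdma` package was actually imported (for instance the source tree rather than the build directory) and whether it contains a loadable ibverbs module.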

Port to Python 3.x

Hello,

Is there a plan to add Python 3.x compatibility?

Best regards,

Andreas

IB Support Through UCX

Hi @jgunthorpe,

I came across this library only recently and wanted to make you aware of a separate effort to provide RDMA support for Python users. A number of us have worked on Python wrappers for UCX (https://ucx-py.readthedocs.io/), which provides RDMA and GPURDMA transport. My guess is that these projects serve different users. Still, I wanted to make you generally aware of it.

Creating a gold master cache-file to set the optimal fabric status?

Hi there,

We maintain a fairly large setup with a broad range of nodes, some of which are down at any given time.
I like the subnet_diff function a lot, but I do not have a 'clean' cache file to check against.
Is it possible to edit an existing file, or can I create a fictitious one?

Cheers
Christian

Slower than TCP sockets?

I know this library isn't performance-focused but I was expecting it to still be faster than normal sockets. Instead, doing some basic benchmarks I found that it (with two-sided verbs) is about two orders of magnitude slower than normal sockets. I wonder if I am doing anything wrong or if this is the normal performance of the library.

Thanks a lot.

ibsim functionality

Hey there,

I am revisiting my IB past and currently checking out your wonderful library.
On a host with a real card this works without a problem:

python -c "import rdma;print rdma.get_devices()"
{'mlx4_0': <rdma.devices.RDMADevice object for mlx4_0 at 0x2968e10>}

But if I run it on a node using ibsim, it fails:

# echo $LD_PRELOAD
/usr/lib64/umad2sim/libumad2sim.so
# python -c "import rdma;print rdma.get_devices()"
{}

I used your ibsim repo (0.5) and the upstream Fedora 20 package (0.6).

What am I missing? The branch 'sim' seems to be pretty old.

Cheers
Christian

sbn is empty when not using the global view of the subnet

For debugging reasons I changed discovery.py a bit:

diff --git a/libibtool/discovery.py b/libibtool/discovery.py
index b88f54a..3708a62 100644
--- a/libibtool/discovery.py
+++ b/libibtool/discovery.py
@@ -378,12 +378,19 @@ def cmd_iblinkinfo(argv,o):
             "all_NodeDescription",
             "all_PortInfo",
             "all_topology"]);
+        print "#### sbn.nodes"
+        for k,v in sbn.nodes.items():
+            print k,v
+        print "#### \sbn.nodes"
         root = sbn.ports[umad.parent.port_guid];
         for I in sbn.iterbfs(root):
             if isinstance(I.parent,rdma.subnet.Switch):
                 print_switch(sbn,args,I.parent);
     else:
         sbn = lib.get_subnet(sched,());
+        print "#### sbn.nodes"
+        print sbn.nodes
+        print "#### \sbn.nodes"
         sched.run(queue=rdma.discovery.subnet_get_port(sched,sbn,lib.path));
         port = sbn.path_to_port(lib.path);
         if not isinstance(port.parent,rdma.subnet.Switch):

If I run iblinkinfo without arguments, I get all the nodes within the subnet:

root@emv111 python-rdma # ibtool iblinkinfo --up

#### sbn.nodes

0002:c902:0044:e890 <rdma.subnet.Switch object at 0x190a5d0>
0008:f104:0399:09ec <rdma.subnet.CA object at 0x190a7d0>
0008:f104:0399:0a64 <rdma.subnet.CA object at 0x190a350>
0008:f104:0399:0980 <rdma.subnet.CA object at 0x190a850>
0008:f104:0399:0944 <rdma.subnet.CA object at 0x190f290>
0008:f104:0399:0ab0 <rdma.subnet.CA object at 0x190aa90>
0008:f104:0399:01d4 <rdma.subnet.CA object at 0x190f450>
0008:f104:0399:0a98 <rdma.subnet.CA object at 0x190ae10>
0008:f104:0041:27bc <rdma.subnet.Switch object at 0x190a6d0>
0008:f104:0399:0a7c <rdma.subnet.CA object at 0x190ad10>

#### \sbn.nodes

Switch 0008:f104:0041:27bc 'ISR9024S-M Voltaire':
1 1[ ] ==( 4x SDR Active/Link UP) ==> 4 1[ ] 'butragueno HCA-1'
1 2[ ] ==( 4x SDR Active/Link UP) ==> 6 1[ ] 'puskas HCA-1'
1 8[ ] ==( 4x SDR Active/Link UP) ==> 9 1[ ] 'emv107 HCA-1'
1 9[ ] ==( 4x SDR Active/Link UP) ==> 10 1[ ] 'emv109 HCA-1'
1 10[ ] ==( 4x SDR Active/Link UP) ==> 7 1[ ] 'emv110 HCA-1'
1 11[ ] ==( 1x SDR Active/Link UP) ==> 3 1[ ] 'emv111 HCA-1'
1 23[ ] ==( 4x SDR Active/Link UP) ==> 2 7[ ] 'Infiniscale-IV Mellanox Technologies'
1 24[ ] ==( 4x SDR Active/Link UP) ==> 2 8[ ] 'Infiniscale-IV Mellanox Technologies'
Switch 0002:c902:0044:e890 'Infiniscale-IV Mellanox Technologies':
2 1[ ] ==( 4x SDR Active/Link UP) ==> 8 1[ ] 'emv108 HCA-1'
2 2[ ] ==( 4x SDR Active/Link UP) ==> 5 1[ ] 'emv104 HCA-1'
2 7[ ] ==( 4x SDR Active/Link UP) ==> 1 23[ ] 'ISR9024S-M Voltaire'
2 8[ ] ==( 4x SDR Active/Link UP) ==> 1 24[ ] 'ISR9024S-M Voltaire'
root@emv111 python-rdma #

With options like LID (or GUID or DirectPath), I get no nodes at all within the subnet.

root@emv111 python-rdma # ibtool iblinkinfo --up -L 1

#### sbn.nodes

{}

#### \sbn.nodes

E: RPC MAD_METHOD_GET(1) SMPFormat(1.1) SMPPortInfo(21) got error status 0x1c - Invalid attr or modifier

root@emv111 python-rdma # ibtool iblinkinfo --up -L 2

#### sbn.nodes

{}

#### \sbn.nodes

Switch 0002:c902:0044:e890 'Infiniscale-IV Mellanox Technologies':
2 1[ ] ==( 4x SDR Active/Link UP) ==> 8 1[ ] 'emv108 HCA-1'
2 2[ ] ==( 4x SDR Active/Link UP) ==> 5 1[ ] 'emv104 HCA-1'
Traceback (most recent call last):
  File "/usr/local/bin/ibtool", line 93, in <module>
    if not func(argv,o):
  File "/usr/lib64/python2.6/site-packages/libibtool/discovery.py", line 402, in cmd_iblinkinfo
    print_switch(sbn,args,port.parent);
  File "/usr/lib64/python2.6/site-packages/libibtool/discovery.py", line 333, in print_switch
    if better_possible(pinf.linkWidthSupported,peer_port.pinf.linkWidthSupported,
AttributeError: 'NoneType' object has no attribute 'linkWidthSupported'

I have not found the code that fills the list and causes that error.

Cheers
Christian
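
The AttributeError above comes from reading `linkWidthSupported` on a peer port whose `pinf` is None, presumably because PortInfo was never fetched for it in the non-global case. A defensive check could look like the sketch below. `Port`, `PortInfo` and `common_widths` are simplified stand-ins written for this example, not the real rdma.subnet classes or the real `print_switch` logic.

```python
class PortInfo:
    """Hypothetical stand-in holding only the field used in the traceback."""

    def __init__(self, linkWidthSupported):
        self.linkWidthSupported = linkWidthSupported


class Port:
    """Hypothetical stand-in for a port; pinf stays None until PortInfo is fetched."""

    def __init__(self, pinf=None):
        self.pinf = pinf


def common_widths(port, peer_port):
    """Return the link-width bits both ends support.

    Returns None when either end has no PortInfo yet, for example
    because the peer was never queried.
    """
    if port.pinf is None or peer_port is None or peer_port.pinf is None:
        return None
    return port.pinf.linkWidthSupported & peer_port.pinf.linkWidthSupported
```

A guard like this only hides the symptom. Whether the peer's PortInfo should have been fetched in the first place is a separate question.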

IOError: [Errno 25] Inappropriate ioctl for device

I am trying this out:

# ibtool sminfo
Traceback (most recent call last):
  File "/root/p27/bin/ibtool", line 95, in <module>
    if not func(argv,o):
  File "/root/p27/lib/python2.7/site-packages/libibtool/inquiry.py", line 258, in cmd_sminfo
    with lib.get_umad_for_target(values[0]) as umad:
  File "/root/p27/lib/python2.7/site-packages/libibtool/libibopts.py", line 276, in get_umad_for_target
    umad = rdma.get_umad(self.end_port);
  File "/root/p27/lib/python2.7/site-packages/rdma/__init__.py", line 265, in get_umad
    return rdma.umad.UMAD(port,**kwargs);
  File "/root/p27/lib/python2.7/site-packages/rdma/umad.py", line 102, in __init__
    if not self._ioctl_enable_pkey():
  File "/root/p27/lib/python2.7/site-packages/rdma/umad.py", line 123, in _ioctl_enable_pkey
    return fcntl.ioctl(self.dev.fileno(),self.IB_USER_MAD_ENABLE_PKEY) == 0;
IOError: [Errno 25] Inappropriate ioctl for device

But if I just run

#sminfo
sminfo: sm lid 1 sm guid 0x248a0703008f9e88, activity count 438477803 priority 15 state 3 SMINFO_MASTER

What am I missing? Thanks.
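
Errno 25 is ENOTTY, which means the kernel did not recognise the ioctl request on that file descriptor. The thread does not say why that happens here. As a sketch only, this is one way to tell that specific failure apart from other errors. `try_ioctl` is a helper name made up for this example (written for Python 3), and whether it is safe to continue after the failure is an open question, not something the library documents.

```python
import errno


def try_ioctl(do_ioctl):
    """Call do_ioctl() and report whether the device accepted the request.

    Returns True on success and False if the kernel answered ENOTTY
    ("Inappropriate ioctl for device"); any other error propagates.
    """
    try:
        do_ioctl()
    except OSError as err:  # IOError is an alias of OSError on Python 3
        if err.errno == errno.ENOTTY:
            return False
        raise
    return True
```

It would be used as `try_ioctl(lambda: fcntl.ioctl(fd, request))`, with `fd` and `request` standing for the descriptor and request code from the traceback.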

FDR10 is wrongly displayed as QDR

Hey,
unfortunately FDR10 connections are displayed as QDR in ibtool.

$ ibtool ibnetdiscover
[...]
[1]	"S-<GUID>"[5]	# "desc" lid 2 4xQDR
[...]
$ ibnetdiscover
[...]
[1]	"S-<GUID>"[5]	# "desc" lid 2 4xFDR10
[...]
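
One possible explanation, which is an assumption and not confirmed in this thread: an FDR10 link reports the same base LinkSpeedActive value as QDR (4, meaning 10 Gbps), and the FDR10 state is only visible through vendor-specific port data that ibtool does not read. Under that assumption, the naming logic would need an extra input, roughly as sketched here. `speed_name` and its `fdr10_active` flag are hypothetical.

```python
# LinkSpeedActive encodings from the base PortInfo attribute.
BASE_SPEEDS = {1: "SDR", 2: "DDR", 4: "QDR"}


def speed_name(link_speed_active, fdr10_active=False):
    """Name a link speed.

    fdr10_active is a hypothetical flag that the caller would fill in
    from vendor-specific port data; it is not a python-rdma field.
    """
    name = BASE_SPEEDS.get(link_speed_active, "unknown")
    if name == "QDR" and fdr10_active:
        return "FDR10"
    return name
```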

Performance benchmarks

Hello,

I wonder if you have done any performance benchmarks to measure, for example, latency of different RDMA operations, and to get an idea about the possible speedup to achieve as compared to traditional TCP/IP.
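
No benchmark numbers are given in this listing. For anyone wanting a TCP baseline to compare against, here is a minimal sketch of a loopback ping-pong latency measurement using only the standard library. It measures the TCP side only; the RDMA side would need an equivalent loop written against the verbs API, which is not shown here. The function name and defaults are made up for this example.

```python
import socket
import threading
import time


def tcp_round_trip(iterations=1000, size=64):
    """Return the mean round-trip time in seconds for a ping-pong over loopback TCP."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)

    def echo():
        # Echo everything back until the client closes the connection.
        conn, _ = server.accept()
        with conn:
            while True:
                data = conn.recv(size)
                if not data:
                    break
                conn.sendall(data)

    thread = threading.Thread(target=echo, daemon=True)
    thread.start()

    payload = b"x" * size
    with socket.create_connection(server.getsockname()) as client:
        # Disable Nagle's algorithm so small messages are sent immediately.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(iterations):
            client.sendall(payload)
            received = 0
            while received < size:
                chunk = client.recv(size - received)
                if not chunk:
                    raise ConnectionError("echo server closed the connection")
                received += len(chunk)
        elapsed = time.perf_counter() - start
    thread.join()
    server.close()
    return elapsed / iterations
```

Loopback numbers say nothing about a real network path, so for a fair comparison both ends should run on the same pair of hosts as the RDMA test.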

Memory Windows Support

I couldn't find anything about memory windows in this library. Is there any way to use them with the current implementation?

Thanks.

Bug in ibverbs.pyx::to_ah_attr

diff --git a/rdma/ibverbs.pyx b/rdma/ibverbs.pyx
index 16891a6..59bddf0 100644
--- a/rdma/ibverbs.pyx
+++ b/rdma/ibverbs.pyx
@@ -108,7 +108,7 @@ cdef to_ah_attr(c.ibv_ah_attr *cattr, object attr):
                 raise TypeError("attr.grh must be a global_route")
             if not isinstance(attr.grh.dgid, IBA.GID):
                 raise TypeError("attr.grh.dgid must be an IBA.GID")
-            tmp = <uint8_t *>PyString_AsString(attr.DGID);
+            tmp = <uint8_t *>PyString_AsString(attr.grh.dgid);
             for 0 <= i < 16:
                 cattr.grh.dgid.raw[i] = tmp[i];
             cattr.grh.flow_label = attr.grh.flow_label
@@ -466,6 +466,22 @@ cdef class Context:
                          active_speed = cattr.active_speed,
                          phys_state = cattr.phys_state)

How to create WRs?

I am not very familiar with Cython, so I have some basic questions like this one. How can I create work requests for post_send or post_recv? I tried rdma.ibverbs.send_wr() and rdma.ibverbs.recv_wr(), but they do not accept any arguments. How can I specify memory locations and other parameters?

Thank you.
