jgunthorpe / python-rdma
Python interface to the Linux RDMA stack
Home Page: https://jgunthorpe.github.io/python-rdma/
License: Other
Hi, Jason!
If a bad link is encountered, the subnet discovery fails with an exception. Here is a simple patch to ignore bad links.
Best regards,
Alexander Daryin
diff --git a/rdma/libibverbs.pxd b/rdma/libibverbs.pxd
index 055a6d1..3ff90b1 100644
--- a/rdma/libibverbs.pxd
+++ b/rdma/libibverbs.pxd
@@ -6,7 +6,7 @@ include 'libibverbs_enums.pxd'
cdef extern from 'infiniband/verbs.h':
union ibv_gid:
- char raw[16]
+ unsigned char raw[16]
This is how it is defined in verbs.h (OFED 4.4):
union ibv_gid {
uint8_t raw[16];
struct {
uint64_t subnet_prefix;
uint64_t interface_id;
} global;
};
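The practical effect of the signedness change can be sketched outside Cython with ctypes. This is only an illustration: the two union classes and the sample GID bytes are made up for the example, and `c_byte` stands in for a `char` that is signed, which depends on the platform.

```python
import ctypes


class GidSignedRaw(ctypes.Union):
    # Mirrors the old declaration, assuming plain `char` is signed.
    _fields_ = [("raw", ctypes.c_byte * 16)]


class GidUnsignedRaw(ctypes.Union):
    # Mirrors the patched declaration and the uint8_t in verbs.h.
    _fields_ = [("raw", ctypes.c_ubyte * 16)]


# Sample GID starting with the link-local prefix bytes 0xfe 0x80.
GID_BYTES = bytes([0xFE, 0x80] + [0] * 14)

signed_first = GidSignedRaw.from_buffer_copy(GID_BYTES).raw[0]
unsigned_first = GidUnsignedRaw.from_buffer_copy(GID_BYTES).raw[0]

print(signed_first, unsigned_first)
```

Any byte of 0x80 or above reads back as a negative number through the signed view, which is the kind of surprise the `unsigned char` declaration avoids.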
I cloned the repository and ran the sample code, but there is no ibverbs.py file, only files with other extensions, and these do not seem to work because I get an ImportError:
import rdma.ibverbs as ibv;
ImportError: No module named ibverbs
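One possible explanation, going by the `rdma/ibverbs.pyx` path in the diffs on this page, is that `rdma.ibverbs` is a Cython extension that has to be compiled before it can be imported, so a plain source checkout has nothing to load. A minimal sketch for checking this up front follows; `can_import` is a hypothetical helper, not part of python-rdma.

```python
import importlib


def can_import(module_name):
    """Return True if `module_name` imports cleanly, False on ImportError."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if not can_import("rdma.ibverbs"):
    print("rdma.ibverbs is not importable; the extension may not be built yet")
```

If the check fails in a fresh clone, running the project's build step and importing from the build or install location, rather than from the source tree, is the first thing to try.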
Can you point me to the right documentation, if this is available on Linux? I have two Emulex 10GbE cards that behave strangely, so I'd like to try something else. I don't really understand what converged Ethernet is; I should probably look that up first.
Hello,
is there a plan to add Python 3.x compatibility?
Best regards,
Andreas
Hi @jgunthorpe,
I came across this library only recently and wanted to make you aware of a separate effort to provide RDMA support for Python users. A number of us have worked to add Python wrappers for UCX (https://ucx-py.readthedocs.io/), which provides RDMA and GPU RDMA transports. My guess is that these projects serve different users. Still, I wanted to make you aware of it.
Hi there,
we maintain a fairly large setup with a broad range of nodes, and some of them are down at any given time.
I like the subnet_diff function a lot, but I do not have a 'clean' cache file to check against.
Is it possible to edit an existing file, or can I create a fictitious one?
Cheers
Christian
I know this library isn't performance-focused, but I expected it to still be faster than normal sockets. Instead, in some basic benchmarks I found that it (with two-sided verbs) is about two orders of magnitude slower than normal sockets. I wonder whether I am doing something wrong or whether this is the normal performance of the library.
Thanks a lot.
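For a comparison like this, it helps to pin down what the socket baseline measures. The sketch below times a ping-pong over a local `socket.socketpair()`, which is an assumption made for the example: it is not TCP between two hosts and does not reproduce the benchmark described above. The function names and the default message size are illustrative.

```python
import socket
import time


def recv_exact(sock, size):
    """Read exactly `size` bytes from `sock`."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def mean_round_trip_seconds(message_size=64, iterations=1000):
    """Average time for one send/echo round trip over a local socket pair."""
    client, server = socket.socketpair()
    payload = b"x" * message_size
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            client.sendall(payload)
            echoed = recv_exact(server, message_size)
            server.sendall(echoed)
            recv_exact(client, message_size)
        elapsed = time.perf_counter() - start
    finally:
        client.close()
        server.close()
    return elapsed / iterations


baseline = mean_round_trip_seconds()
print("mean round trip: %.1f microseconds" % (baseline * 1e6))
```

Timing the verbs path with the same message size, the same iteration count, and setup work (memory registration, queue pair creation) kept outside the timed loop makes the two numbers easier to compare.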
Hey there,
I am revisiting my IB past and currently checking out your wonderful library.
On a host with a real card this works without a problem:
python -c "import rdma;print rdma.get_devices()"
{'mlx4_0': <rdma.devices.RDMADevice object for mlx4_0 at 0x2968e10>}
But if I run it on a node using ibsim, it fails:
# echo $LD_PRELOAD
/usr/lib64/umad2sim/libumad2sim.so
# python -c "import rdma;print rdma.get_devices()"
{}
I tried both your ibsim repo (0.5) and the upstream version shipped with Fedora 20 (0.6).
What am I missing? The 'sim' branch seems to be pretty old.
Cheers
Christian
For debugging purposes I changed discovery.py a bit:
diff --git a/libibtool/discovery.py b/libibtool/discovery.py
index b88f54a..3708a62 100644
--- a/libibtool/discovery.py
+++ b/libibtool/discovery.py
@@ -378,12 +378,19 @@ def cmd_iblinkinfo(argv,o):
"all_NodeDescription",
"all_PortInfo",
"all_topology"]);
print "#### sbn.nodes"
for k,v in sbn.nodes.items():
print k,v
print "#### \sbn.nodes"
root = sbn.ports[umad.parent.port_guid];
for I in sbn.iterbfs(root):
if isinstance(I.parent,rdma.subnet.Switch):
print_switch(sbn,args,I.parent);
else:
sbn = lib.get_subnet(sched,());
print "#### sbn.nodes"
print sbn.nodes
print "#### \sbn.nodes"
sched.run(queue=rdma.discovery.subnet_get_port(sched,sbn,lib.path));
port = sbn.path_to_port(lib.path);
if not isinstance(port.parent,rdma.subnet.Switch):
If I run iblinkinfo without arguments, I get all the nodes within the subnet:
root@emv111 python-rdma # ibtool iblinkinfo --up
0002:c902:0044:e890 <rdma.subnet.Switch object at 0x190a5d0>
0008:f104:0399:09ec <rdma.subnet.CA object at 0x190a7d0>
0008:f104:0399:0a64 <rdma.subnet.CA object at 0x190a350>
0008:f104:0399:0980 <rdma.subnet.CA object at 0x190a850>
0008:f104:0399:0944 <rdma.subnet.CA object at 0x190f290>
0008:f104:0399:0ab0 <rdma.subnet.CA object at 0x190aa90>
0008:f104:0399:01d4 <rdma.subnet.CA object at 0x190f450>
0008:f104:0399:0a98 <rdma.subnet.CA object at 0x190ae10>
0008:f104:0041:27bc <rdma.subnet.Switch object at 0x190a6d0>
0008:f104:0399:0a7c <rdma.subnet.CA object at 0x190ad10>
Switch 0008:f104:0041:27bc 'ISR9024S-M Voltaire':
1 1[ ] ==( 4x SDR Active/Link UP) ==> 4 1[ ] 'butragueno HCA-1'
1 2[ ] ==( 4x SDR Active/Link UP) ==> 6 1[ ] 'puskas HCA-1'
1 8[ ] ==( 4x SDR Active/Link UP) ==> 9 1[ ] 'emv107 HCA-1'
1 9[ ] ==( 4x SDR Active/Link UP) ==> 10 1[ ] 'emv109 HCA-1'
1 10[ ] ==( 4x SDR Active/Link UP) ==> 7 1[ ] 'emv110 HCA-1'
1 11[ ] ==( 1x SDR Active/Link UP) ==> 3 1[ ] 'emv111 HCA-1'
1 23[ ] ==( 4x SDR Active/Link UP) ==> 2 7[ ] 'Infiniscale-IV Mellanox Technologies'
1 24[ ] ==( 4x SDR Active/Link UP) ==> 2 8[ ] 'Infiniscale-IV Mellanox Technologies'
Switch 0002:c902:0044:e890 'Infiniscale-IV Mellanox Technologies':
2 1[ ] ==( 4x SDR Active/Link UP) ==> 8 1[ ] 'emv108 HCA-1'
2 2[ ] ==( 4x SDR Active/Link UP) ==> 5 1[ ] 'emv104 HCA-1'
2 7[ ] ==( 4x SDR Active/Link UP) ==> 1 23[ ] 'ISR9024S-M Voltaire'
2 8[ ] ==( 4x SDR Active/Link UP) ==> 1 24[ ] 'ISR9024S-M Voltaire'
root@emv111 python-rdma #
With options such as LID (or GUID or DirectPath), I get no nodes at all within the subnet.
root@emv111 python-rdma # ibtool iblinkinfo --up -L 1
{}
E: RPC MAD_METHOD_GET(1) SMPFormat(1.1) SMPPortInfo(21) got error status 0x1c - Invalid attr or modifier
root@emv111 python-rdma # ibtool iblinkinfo --up -L 2
{}
Switch 0002:c902:0044:e890 'Infiniscale-IV Mellanox Technologies':
2 1[ ] ==( 4x SDR Active/Link UP) ==> 8 1[ ] 'emv108 HCA-1'
2 2[ ] ==( 4x SDR Active/Link UP) ==> 5 1[ ] 'emv104 HCA-1'
Traceback (most recent call last):
File "/usr/local/bin/ibtool", line 93, in <module>
if not func(argv,o):
File "/usr/lib64/python2.6/site-packages/libibtool/discovery.py", line 402, in cmd_iblinkinfo
print_switch(sbn,args,port.parent);
File "/usr/lib64/python2.6/site-packages/libibtool/discovery.py", line 333, in print_switch
if better_possible(pinf.linkWidthSupported,peer_port.pinf.linkWidthSupported,
AttributeError: 'NoneType' object has no attribute 'linkWidthSupported'
I have not found the code fragment that fills the list and causes that error.
Cheers
Christian
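The traceback shows that one of the `pinf` objects passed to `better_possible` is None when `print_switch` runs. Why it is None is not established here; one guess is that the peer's PortInfo was never fetched when discovery is limited to a single target. A defensive check could look like the sketch below, where `PortInfo`, `Port` and `peer_link_width_supported` are stand-ins written for this example, not the classes from rdma.subnet.

```python
class PortInfo(object):
    """Stand-in for a fetched PortInfo record (hypothetical)."""

    def __init__(self, link_width_supported):
        self.linkWidthSupported = link_width_supported


class Port(object):
    """Stand-in for a port whose PortInfo may not have been loaded."""

    def __init__(self, pinf=None):
        self.pinf = pinf


def peer_link_width_supported(peer_port):
    """Return the peer's linkWidthSupported, or None if it is unknown."""
    if peer_port is None or peer_port.pinf is None:
        return None
    return peer_port.pinf.linkWidthSupported


loaded = peer_link_width_supported(Port(PortInfo(3)))
missing = peer_link_width_supported(Port())
print(loaded, missing)
```

With a guard of this kind the caller can skip the "better width possible" comparison for peers whose PortInfo is missing instead of raising AttributeError.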
I am trying this out:
# ibtool sminfo
Traceback (most recent call last):
File "/root/p27/bin/ibtool", line 95, in <module>
if not func(argv,o):
File "/root/p27/lib/python2.7/site-packages/libibtool/inquiry.py", line 258, in cmd_sminfo
with lib.get_umad_for_target(values[0]) as umad:
File "/root/p27/lib/python2.7/site-packages/libibtool/libibopts.py", line 276, in get_umad_for_target
umad = rdma.get_umad(self.end_port);
File "/root/p27/lib/python2.7/site-packages/rdma/__init__.py", line 265, in get_umad
return rdma.umad.UMAD(port,**kwargs);
File "/root/p27/lib/python2.7/site-packages/rdma/umad.py", line 102, in __init__
if not self._ioctl_enable_pkey():
File "/root/p27/lib/python2.7/site-packages/rdma/umad.py", line 123, in _ioctl_enable_pkey
return fcntl.ioctl(self.dev.fileno(),self.IB_USER_MAD_ENABLE_PKEY) == 0;
IOError: [Errno 25] Inappropriate ioctl for device
If I just run the stock sminfo instead, it works:
#sminfo
sminfo: sm lid 1 sm guid 0x248a0703008f9e88, activity count 438477803 priority 15 state 3 SMINFO_MASTER
What am I missing? Thanks.
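Errno 25 is ENOTTY, which means the file descriptor did not recognize the ioctl request. The cause in this report is not established. The sketch below only shows how such a failure can be detected and turned into a boolean instead of an unhandled exception; `ioctl_supported` is a hypothetical helper, and the demonstration uses a terminal ioctl on a regular temporary file (Linux assumed) purely to provoke the same errno.

```python
import errno
import fcntl
import tempfile
import termios


def ioctl_supported(fd, request, arg=0):
    """Return False if `fd` rejects `request` with ENOTTY, True on success."""
    try:
        fcntl.ioctl(fd, request, arg)
    except (IOError, OSError) as exc:
        if exc.errno == errno.ENOTTY:
            return False
        raise  # Any other failure is unexpected and should stay visible.
    return True


# A regular file does not implement terminal ioctls, so this reports False.
with tempfile.TemporaryFile() as regular_file:
    supported = ioctl_supported(regular_file.fileno(),
                                termios.TIOCGWINSZ, b"\0" * 8)

print(supported)
```

Whether it is safe for the umad code to continue when the P_Key ioctl is rejected is a separate question that this sketch does not answer.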
Hey,
unfortunately FDR10 connections are displayed as QDR in ibtool.
$ ibtool ibnetdiscover
[...]
[1] "S-<GUID>"[5] # "desc" lid 2 4xQDR
[...]
$ ibnetdiscover
[...]
[1] "S-<GUID>"[5] # "desc" lid 2 4xFDR10
[...]
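A likely reason, stated here as an assumption rather than a confirmed diagnosis: FDR10 is a Mellanox-specific rate that is not encoded in the standard PortInfo speed fields, so a tool that reads only LinkSpeedActive sees the QDR value. The sketch below shows the decoding order that would distinguish the two. The `fdr10_active` flag is hypothetical and stands for whatever vendor-specific query reports FDR10; it is not an existing python-rdma parameter.

```python
# Encodings of the standard PortInfo speed fields.
LINK_SPEED_ACTIVE = {1: "SDR", 2: "DDR", 4: "QDR"}
LINK_SPEED_EXT_ACTIVE = {1: "FDR", 2: "EDR"}


def describe_link_speed(link_speed_active, link_speed_ext_active=0,
                        fdr10_active=False):
    """Pick a speed name, preferring extended and vendor-reported rates."""
    if link_speed_ext_active in LINK_SPEED_EXT_ACTIVE:
        return LINK_SPEED_EXT_ACTIVE[link_speed_ext_active]
    if fdr10_active:
        # Hypothetical input: the vendor extension says the link runs FDR10.
        return "FDR10"
    return LINK_SPEED_ACTIVE.get(link_speed_active, "unknown")


print(describe_link_speed(4))
print(describe_link_speed(4, fdr10_active=True))
```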
Hello,
I wonder if you have done any performance benchmarks to measure, for example, latency of different RDMA operations, and to get an idea about the possible speedup to achieve as compared to traditional TCP/IP.
I couldn't find anything about memory windows in this library. Is there any way to use them with the current implementation?
Thanks.
diff --git a/rdma/ibverbs.pyx b/rdma/ibverbs.pyx
index 16891a6..59bddf0 100644
--- a/rdma/ibverbs.pyx
+++ b/rdma/ibverbs.pyx
@@ -108,7 +108,7 @@ cdef to_ah_attr(c.ibv_ah_attr *cattr, object attr):
raise TypeError("attr.grh must be a global_route")
if not isinstance(attr.grh.dgid, IBA.GID):
raise TypeError("attr.grh.dgid must be an IBA.GID")
- tmp = <uint8_t *>PyString_AsString(attr.DGID);
+ tmp = <uint8_t *>PyString_AsString(attr.grh.dgid);
for 0 <= i < 16:
cattr.grh.dgid.raw[i] = tmp[i];
cattr.grh.flow_label = attr.grh.flow_label
@@ -466,6 +466,22 @@ cdef class Context:
active_speed = cattr.active_speed,
phys_state = cattr.phys_state)
I am not very familiar with Cython, so I have some basic questions like this one. How can I create work requests for post_send or post_recv? I tried rdma.ibverbs.send_wr() and rdma.ibverbs.recv_wr(), but they do not accept any arguments. How can I specify memory locations and other parameters?
Thank you.