I am part of a team building an appliance VM that needs a Yubikey HSM for security.
During setup, we struggled to get the yubihsm-connector running reliably inside this VM.
It might just be an issue with the underlying libusb version or the Linux kernel itself, but I would nevertheless like to point out the issue and mention the workaround we implemented to fix it for the moment.
Environment
Some details about the environment:
The host system is VMware ESXi 6.7.0 Update 1 (Build 11675023).
The hardware is a NUC (NUC7i5BNH). We will switch to real server gear later this month, so it might also be a misbehaviour of the underlying hardware :-).
The guest VM is CentOS Linux release 7.6.1810 (Core), kernel version 3.10.0-957.12.1.el7.x86_64.
The Yubikey HSM is set up as a passthrough device to the Linux VM:
$ lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 004: ID 1050:0030 Yubico.com
Bus 002 Device 003: ID 0e0f:0002 VMware, Inc. Virtual USB Hub
Bus 002 Device 002: ID 0e0f:0003 VMware, Inc. Virtual Mouse
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
What is the error?
In 50% of our tests, the setup just works as expected.
The rest of the tests fail, with the yubihsm-connector reporting:
$ curl localhost:12345/connector/status
status=NO_DEVICE
serial=*
version=2.0.0
pid=6677
address=localhost
port=12345
The connector's debug log then shows:
DEBU[0000] preflight complete cert= config= key= pid=6658 seccomp=false serial= syslog=false version=2.0.0
DEBU[0000] takeoff TLS=false listen="localhost:12345" pid=6658
DEBU[0003] reopening usb context Correlation-ID=d3a27afe-67b7-69ad-4b43-852b71978459 why="status request"
DEBU[0003] usb context not yet open Correlation-ID=d3a27afe-67b7-69ad-4b43-852b71978459
WARN[0005] status failed to open usb device X-Request-ID=d3a27afe-67b7-69ad-4b43-852b71978459 error="device not found"
INFO[0005] handled request Content-Length=0 Content-Type= Method=GET RemoteAddr="127.0.0.1:45628" StatusCode=200 URI=/connector/status User-Agent=curl/7.29.0 X-Real-IP=127.0.0.1 X-Request-ID=d3a27afe-67b7-69ad-4b43-852b71978459 latency=1.629251502s
DEBU[0007] reopening usb context Correlation-ID=acaa699f-bd22-25ae-4cfe-6d1030559047 why="status request"
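For scripted checks, a single field can be pulled out of the connector's status response, which consists of plain key=value lines as shown above. The following is only a minimal sketch; the function name connector_field is made up for illustration and is not part of the connector.

```shell
#!/bin/bash
# Print the value of one key from "key=value" lines read on stdin.
# Prints nothing if the key is absent.
# NOTE: connector_field is a hypothetical helper, not a connector feature.
connector_field() {
    local key="$1"
    awk -F= -v k="$key" '$1 == k { print substr($0, length(k) + 2); exit }'
}

# Example (needs a connector listening on localhost:12345):
#   curl -s localhost:12345/connector/status | connector_field status
```

In the failing case above, this would yield NO_DEVICE for the status key.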
Workaround
Our first attempt was to disable USB autosuspend via kernel parameters. That did not work as expected, although the error description that led us to it was similar to ours.
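To rule out that the autosuspend setting simply did not take effect, the effective value can be read back from sysfs. This is a sketch under the assumption that the kernel parameter in question was usbcore.autosuspend=-1 (the exact parameter is not stated above); the function name and its optional path argument are illustrative only.

```shell
#!/bin/bash
# Succeed (exit 0) if the usbcore autosuspend delay is negative, which
# means autosuspend is disabled for newly detected USB devices.
# Returns 1 if autosuspend is enabled, 2 if the value cannot be read.
# ASSUMPTION: the parameter checked is usbcore.autosuspend; the optional
# path argument exists only to make the function easy to try out.
autosuspend_disabled() {
    local param="${1:-/sys/module/usbcore/parameters/autosuspend}"
    local value
    value=$(cat "$param" 2>/dev/null) || return 2
    [ "$value" -lt 0 ]
}

# Example:
#   autosuspend_disabled && echo "autosuspend is off"
```

This only reports the module-wide default; it says nothing about whether autosuspend is related to the NO_DEVICE status at all.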
What finally worked was a script run from cron that checks the connector status and resets all USB devices if the status is not "OK":
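#!/bin/bash
# Reset all USB host controllers if the connector does not report status=OK.
# Must run as root, since it writes to the sysfs bind/unbind files.
# An empty STATUS (connector unreachable) also triggers the reset.
STATUS=$(curl -s localhost:12345/connector/status | grep '^status=')
if [ "$STATUS" != "status=OK" ]; then
    for i in /sys/bus/pci/drivers/[uoex]hci_hcd/*:*; do
        [ -e "$i" ] || continue
        # Unbind and rebind the controller to force USB re-enumeration
        echo "${i##*/}" > "${i%/*}/unbind"
        echo "${i##*/}" > "${i%/*}/bind"
    done
fi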
Maybe an update of go-libusb (as mentioned in #3) will solve the issue. Otherwise, it would be nice to have the connector itself deal with this kind of issue (although resetting USB devices requires root access rights :-/).