Comments (4)
I'm debugging with this:
diff --git a/src/zdaemon/tests/tests.py b/src/zdaemon/tests/tests.py
index dd09ae7..d8e85ce 100644
--- a/src/zdaemon/tests/tests.py
+++ b/src/zdaemon/tests/tests.py
@@ -319,6 +319,11 @@ def test_start_timeout():
     ... start-test-program cat x
     ... start-timeout 1
     ... </runner>
+    ... <eventlog>
+    ... <logfile>
+    ... path /tmp/WAT-HAPPEN.log
+    ... </logfile>
+    ... </eventlog>
     ... ''' % sys.executable)

     >>> import time
diff --git a/src/zdaemon/zdrun.py b/src/zdaemon/zdrun.py
index 32b4bfe..018664e 100755
--- a/src/zdaemon/zdrun.py
+++ b/src/zdaemon/zdrun.py
@@ -149,6 +149,8 @@ class Subprocess:
         starttestprogram = self.options.starttestprogram
         try:
             while self.pid == pid:
+                self.options.logger.info("spawned test process %s",
+                                         starttestprogram)
                 if not subprocess.call(starttestprogram):
                     break
                 time.sleep(1)
@@ -175,7 +177,8 @@ class Subprocess:
                 thread.setDaemon(True)
                 thread.start()

-            self.options.logger.info("spawned process pid=%d" % pid)
+            self.options.logger.info("spawned process pid=%d: %s %s",
+                                     pid, self.filename, self.args)
             return pid
         else: # pragma: nocover
             # Child
@@ -318,10 +321,18 @@ class Daemonizer:
     def sigchild(self, sig, frame):
         try:
             pid, sts = os.waitpid(-1, os.WNOHANG)
-        except os.error:
+        except os.error as e:
+            self.logger.info("Ignoring SIGCHLD: waitpid returned %s", e)
             return
         if pid:
+            self.logger.info("Got SIGCHLD for pid %s", pid)
+            if self.waitstatus:
+                es, msg = decode_wait_status(self.waitstatus[1])
+                msg = "lost wait status for pid %d: " % self.waitstatus[0] + msg
+                self.logger.warn(msg)
             self.waitstatus = pid, sts
+        else:
+            self.logger.info("Ignoring SIGCHLD: waitpid returned pid %s", pid)

     transcript = None
Reminder of what's happening in the test:
- zdaemon spawns a child process (t.py) that sleeps for 10 seconds
- zdaemon also spawns a test process ('cat x') that returns 1 (because 'x' doesn't exist), and it does so every second in a loop in a background thread
- there's a start-timeout of 1 second which means zdaemon is supposed to give up and kill t.py because the test program didn't return 0 in that time.
AFAICT there's a race condition between subprocess.call(starttestprogram) inside Subprocess.test and the os.waitpid() call inside Daemonizer.sigchild. On Python 3.4, waitpid() in sigchild returns 0. On PyPy with set_wakeup_fd, waitpid() in sigchild returns the PID of the test process, which makes zdaemon terminate after emitting a
2016-04-19T09:28:47 WARNING root unknown pid 7367: exit status 1
I cannot explain why using signal.set_wakeup_fd changes PyPy's timings so sigchild wins the race.
This is a problem: if our SIGCHLD handler uses waitpid(-1) and collects the exit status of the test process, then subprocess.call() never gets a chance to see it. subprocess.py handles this by blithely returning an exit status of 0 when its waitpid(actual_test_program_pid) call fails with OSError(errno=ECHILD).
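The lost exit status can be reproduced outside zdaemon. The following is a minimal sketch, not zdaemon code: reap_then_wait is a hypothetical helper, the child is a stand-in for 'cat x', and a blocking os.waitpid(-1, 0) plays the role of the SIGCHLD handler so that the handler deterministically "wins". It assumes a POSIX system and CPython's subprocess module.

```python
import os
import subprocess
import sys


def reap_then_wait():
    """Reap a child with waitpid(-1) before subprocess can (POSIX only)."""
    # Stand-in for the test program: exits with status 1, like 'cat x'.
    proc = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(1)"])

    # Stand-in for the SIGCHLD handler winning the race. A blocking wait is
    # used here (instead of os.WNOHANG) only to make the outcome deterministic.
    pid, sts = os.waitpid(-1, 0)
    real_code = os.WEXITSTATUS(sts)

    # subprocess's own waitpid(proc.pid) now fails with ECHILD, and the
    # subprocess module falls back to reporting an exit status of 0.
    seen_code = proc.wait()
    return pid == proc.pid, real_code, seen_code


if __name__ == "__main__":
    print(reap_then_wait())
```

The real exit code (1) is only visible to whoever called waitpid(-1); subprocess reports 0, which is exactly the value the start-test loop treats as "the program started successfully".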
I think this can be fixed by changing sigchild() to invoke os.waitpid(self.pid, os.WNOHANG), but I've had no luck with that approach. It makes the test fail in zdaemon stop, which says "daemon manager not running", and my event log is truncated in an inexplicable way (anything up to the first call to select.select() is logged, anything after isn't, including in the except: part).
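For reference, here is the idea in isolation, as a sketch under assumptions rather than a patch for zdaemon: poll_one_child and demo are hypothetical names, and the two children are stand-ins for the managed program and the start-test program. It only shows that waiting on one specific PID leaves other children's exit statuses available to subprocess; it does not reproduce the failure described above.

```python
import os
import subprocess
import sys
import time


def poll_one_child(pid):
    """Non-blocking wait for one specific child (POSIX only).

    Returns (pid, status) if that child has exited, otherwise None.
    """
    try:
        wpid, sts = os.waitpid(pid, os.WNOHANG)
    except ChildProcessError:
        return None  # no such child, e.g. already reaped elsewhere
    return (wpid, sts) if wpid else None


def demo():
    # Hypothetical stand-ins: a long-running "daemon" and a failing test program.
    daemon = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"])
    test = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(1)"])
    try:
        time.sleep(0.5)  # let the test program exit; not needed for correctness
        polled = poll_one_child(daemon.pid)  # daemon is still running
        test_code = test.wait()  # its status was not consumed by the poll
    finally:
        daemon.kill()
        daemon.wait()
    return polled, test_code


if __name__ == "__main__":
    print(demo())
```

Because the poll names the daemon's PID instead of -1, the test program's status stays in the kernel until subprocess collects it.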
> I've had no luck with that approach.
Oh, duh: self.pid isn't None initially, it doesn't exist at all, so accessing it raises an AttributeError.
If I make it None, and change the waitpid call in sigchild to os.waitpid(self.pid or -1), then the code loops forever on Python 3.4 (even though my logs show that self.pid is always None inside sigchild, which makes no sense to me).
I give up today.
PyPy works fine today. But then I've seen a different failure on Python 2.7 on git master.
I suspect the tests might simply be prone to race conditions.
from zdaemon.
Currently the tests run well on PyPy in GHA, so I decided to close this issue.
Yes, I saw some problems locally when running tests of multiple Python versions in parallel but they went away when running them sequentially.