
Test breakage on pypy (zdaemon issue, closed)

zopefoundation commented on June 9, 2024
Test breakage on pypy

from zdaemon.

Comments (4)

mgedmin commented on June 9, 2024

I'm debugging with this:

diff --git a/src/zdaemon/tests/tests.py b/src/zdaemon/tests/tests.py
index dd09ae7..d8e85ce 100644
--- a/src/zdaemon/tests/tests.py
+++ b/src/zdaemon/tests/tests.py
@@ -319,6 +319,11 @@ def test_start_timeout():
     ...   start-test-program cat x
     ...   start-timeout 1
     ... </runner>
+    ... <eventlog>
+    ...   <logfile>
+    ...     path /tmp/WAT-HAPPEN.log
+    ...   </logfile>
+    ... </eventlog>
     ... ''' % sys.executable)

     >>> import time
diff --git a/src/zdaemon/zdrun.py b/src/zdaemon/zdrun.py
index 32b4bfe..018664e 100755
--- a/src/zdaemon/zdrun.py
+++ b/src/zdaemon/zdrun.py
@@ -149,6 +149,8 @@ class Subprocess:
         starttestprogram = self.options.starttestprogram
         try:
             while self.pid == pid:
+                self.options.logger.info("spawned test process %s",
+                                         starttestprogram)
                 if not subprocess.call(starttestprogram):
                     break
                 time.sleep(1)
@@ -175,7 +177,8 @@ class Subprocess:
                 thread.setDaemon(True)
                 thread.start()

-            self.options.logger.info("spawned process pid=%d" % pid)
+            self.options.logger.info("spawned process pid=%d: %s %s",
+                                     pid, self.filename, self.args)
             return pid
         else:  # pragma: nocover
             # Child
@@ -318,10 +321,18 @@ class Daemonizer:
     def sigchild(self, sig, frame):
         try:
             pid, sts = os.waitpid(-1, os.WNOHANG)
-        except os.error:
+        except os.error as e:
+            self.logger.info("Ignoring SIGCHLD: waitpid returned %s", e)
             return
         if pid:
+            self.logger.info("Got SIGCHLD for pid %s", pid)
+            if self.waitstatus:
+                es, msg = decode_wait_status(self.waitstatus[1])
+                msg = "lost wait status for pid %d: " % self.waitstatus[0] + msg
+                self.logger.warn(msg)
             self.waitstatus = pid, sts
+        else:
+            self.logger.info("Ignoring SIGCHLD: waitpid returned pid %s", pid)

     transcript = None

Reminder of what's happening in the test:

  • zdaemon spawns a child process (t.py) that sleeps for 10 seconds
  • zdaemon also spawns a test process ('cat x') that returns 1 (because 'x' doesn't exist), and it does so every second in a loop in a background thread
  • there's a start-timeout of 1 second which means zdaemon is supposed to give up and kill t.py because the test program didn't return 0 in that time.

AFAICT there's a race condition between subprocess.call(starttestprogram) inside Subprocess.test and the os.waitpid() call inside Daemonizer.sigchild. On Python 3.4, waitpid() in sigchild returns 0. On PyPy with set_wakeup_fd, waitpid() in sigchild returns the PID of the test process, which makes zdaemon terminate after emitting this warning:

2016-04-19T09:28:47 WARNING root unknown pid 7367: exit status 1

I cannot explain why using signal.set_wakeup_fd changes PyPy's timings so sigchild wins the race.

It's a problem: if our SIGCHLD handler uses waitpid(-1) to get the exit status of the test process, then subprocess.call() gets no chance to see it. This is handled in subprocess.py by blithely returning the exit status of 0 when its waitpid(actual_test_program_pid) call fails with OSError(errno=ECHILD).
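The stolen exit status can be reproduced outside zdaemon. This is a minimal sketch, assuming a POSIX system; reap_twice is a hypothetical helper written for illustration, not part of zdaemon or subprocess. It reaps a child with waitpid(-1) the way the SIGCHLD handler does, then waits on the same pid the way subprocess.call() would:

```python
import os


def reap_twice():
    """Fork a child that exits with status 1, reap it with waitpid(-1),
    then wait on its pid a second time (hypothetical demo helper)."""
    pid = os.fork()
    if pid == 0:
        os._exit(1)  # child: exit at once, like 'cat x' failing

    # Stand-in for the SIGCHLD handler: wait for *any* child.
    reaped_pid, status = os.waitpid(-1, 0)

    # Stand-in for subprocess.call(): wait for the specific pid.
    try:
        os.waitpid(pid, 0)
        second_wait = "ok"
    except ChildProcessError:  # OSError with errno == ECHILD
        second_wait = "ECHILD"

    return reaped_pid == pid, os.WEXITSTATUS(status), second_wait
```

The first wait gets the real exit status (1). The second wait has nothing left to collect and fails with ECHILD, which is the case where subprocess falls back to reporting 0.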

I think this can be fixed by changing sigchild() to invoke os.waitpid(self.pid, os.WNOHANG), but I've had no luck with that approach. With that change the test fails at zdaemon stop, which says "daemon manager not running", and my event log is truncated in an inexplicable way (everything up to the first call to select.select() is logged, nothing after it is, not even from the except: part).
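For reference, the idea of waiting only on the daemon's own pid looks roughly like this. It is a sketch under assumptions: Reaper and demo are hypothetical names, not zdaemon's Daemonizer, and the sketch says nothing about whether the change cures the test failure. The pid attribute gets a class-level default of None so the handler can run before any child has been spawned.

```python
import os
import time


class Reaper:
    """Hypothetical stand-in for the daemon manager."""

    pid = None         # default, so sigchild() works before a spawn
    waitstatus = None

    def sigchild(self, sig=None, frame=None):
        if self.pid is None:
            return  # no child of ours yet; leave other children alone
        try:
            pid, sts = os.waitpid(self.pid, os.WNOHANG)
        except OSError:
            return
        if pid:
            self.waitstatus = pid, sts


def demo():
    reaper = Reaper()
    helper = os.fork()
    if helper == 0:
        os._exit(1)  # plays the role of the failing test program
    main = os.fork()
    if main == 0:
        os._exit(0)  # plays the role of the managed child
    reaper.pid = main

    # Poll the handler until the managed child has been reaped.
    deadline = time.monotonic() + 5
    while reaper.waitstatus is None and time.monotonic() < deadline:
        reaper.sigchild()
        time.sleep(0.01)

    # The helper's status was not consumed by the handler.
    _, helper_status = os.waitpid(helper, 0)
    return reaper.waitstatus[0] == main, os.WEXITSTATUS(helper_status)
```

Because the handler never calls waitpid(-1), the helper's exit status stays available to whoever waits on the helper's pid.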


mgedmin commented on June 9, 2024

I've had no luck with that approach.

Oh, duh, self.pid is not None initially; it doesn't exist at all, so accessing it raises an AttributeError.

If I make it None, and change the waitpid call in sigchild to os.waitpid(self.pid or -1) then the code loops forever on Python 3.4 (even though my logs show that self.pid is always None inside sigchild, which makes no sense to me).

I give up today.


mgedmin commented on June 9, 2024

PyPy works fine today. But then I've seen a different failure on Python 2.7 on git master.

I suspect the tests might be simply prone to race conditions.


icemac commented on June 9, 2024

Currently the tests run well on PyPy in GHA, so I decided to close this issue.

Yes, I saw some problems locally when running tests of multiple Python versions in parallel but they went away when running them sequentially.
