Coder Social home page Coder Social logo

Comments (8)

cosmos72 avatar cosmos72 commented on June 14, 2024

Hello Hans,

you found a very subtle bug, either in STMX or in SBCL. I managed to reproduce your first error a few times on SBCL 1.1.15, then it disappeared and I could not reproduce it anymore, not even by cleaning ~/.cache/common-lisp/ and restarting SBCL.

Can you please provide the following information, to help me investigate it further:

  • CPU model and version
  • Operating system name and version (and 32 or 64 bit?)
  • SBCL version (and 32 or 64 bit?)
  • STMX version (I guess the one from quicklisp, but please confirm)

Thanks,

Massimiliano

from stmx.

 avatar commented on June 14, 2024

Hi Massimiliano,

here all the relevant information:

I.) STMX version seems to be the latest:

CL-USER> (ql:system-apropos "stmx")
<SYSTEM stmx / stmx-stable-ef96af7a-git / quicklisp 2014-05-25>
<SYSTEM stmx.test / stmx-stable-ef96af7a-git / quicklisp 2014-05-25>

II.) CPU is NOT Haswell architecture:

vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
stepping : 7
microcode : 0x25
cpu MHz : 1627.750
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 6787.70
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

III.) SBCL version is latest:
SBCL 1.1.18

IV.) OS is latest Archlinux 64 Bit:
Linux 3.14.5-1-ARCH #1 SMP PREEMPT Sun Jun 1 07:36:23 CEST 2014 x86_64 GNU/Linux

Hope this helps and good luck! :-)

Regards,
Hans-J.

from stmx.

cosmos72 avatar cosmos72 commented on June 14, 2024

Hello Hans-J,

commit 707d3dc should fix this bug.
The relevant change is:

diff --git a/stmx/main/tlog.lisp b/stmx/main/tlog.lisp
index b215f4d..57c891c 100644
--- a/stmx/main/tlog.lisp
+++ b/stmx/main/tlog.lisp
@@ -54,7 +54,9 @@ return LOG itself."
 (declaim (inline new-tlog))
 (defun new-tlog ()
   "Get a TLOG from pool or create one, and return it."
-  (let1 log (the tlog (nth-value 0 (fast-vector-pop-macro *tlog-pool* (make-tlog))))
+  (let1 log (or (fast-vector-pop *tlog-pool*)
+                (make-tlog))
+    (declare (type tlog log))
     (setf (tlog-read-version log) +invalid-version+)
     log))

Can you check if it fixes the problem also for you?

Regards,

Massimiliano

from stmx.

 avatar commented on June 14, 2024

Hi Massimiliano,

it got a bit better. ;-)

I tried several times and the experience is mixed. A few times this error message showed up when I started take-thread and put-thread:

BUG! Transaction 794 called (retry), but no TVARs to wait for changes.
This is a bug either in the STMX library or in the application code.
Possible reason: some code analogous to (atomic (retry)) was executed.
Such code is not allowed, because at least one TVAR or one TOBJ slot
must be read before retrying an ATOMIC block.
[Condition of type SIMPLE-ERROR]

Restarts:
0: [ABORT] Abort thread (#<THREAD "BFS" RUNNING {1005D51BF3}>)

Backtrace:
0: (STMX::LISTEN-TVARS-OF #<STMX::TLOG v1460 {1005D2F3B3}>)
1: (STMX::WAIT-TLOG #<STMX::TLOG v1510 {1005D2F3B3}>)
2: (STMX::%RUN-SW-ATOMIC #<CLOSURE (LAMBDA NIL :IN STMX.UTIL:TAKE) {100600FE2B}>)
3: (TM-TEST::LOOP1)
4: ((FLET #:WITHOUT-INTERRUPTS-BODY-1226 :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE))
5: ((FLET SB-THREAD::WITH-MUTEX-THUNK :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE))
6: ((FLET #:WITHOUT-INTERRUPTS-BODY-660 :IN SB-THREAD::CALL-WITH-MUTEX))
7: (SB-THREAD::CALL-WITH-MUTEX #<CLOSURE (FLET SB-THREAD::WITH-MUTEX-THUNK :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE) {7FFFF50AEC9B}> #<SB-THREAD:MUTEX "thread result lock" owner: #<SB-THREAD:THR..
8: (SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE #<SB-THREAD:THREAD "BFS" RUNNING {1005D51BF3}> #S(SB-THREAD:SEMAPHORE :NAME "Thread setup semaphore" :%COUNT 0 :WAITCOUNT 0 :MUTEX #<SB-THREAD:MUTEX (fre..
9: ("foreign function: call_into_lisp")

In other situations it just locked up.

When I only start put-thread it will shutdown after a short while as I can see in htop. When I want to list all the running threads in the REPL with "(sb-thread:list-all-threads)" the command does not return and hangs up the REPL. Peeking or taking test-fifo while put-thread is running doesn't work. This seems to be consistent and could be used as a test case for troubleshooting.

Regards,
Hans-J.

from stmx.

cosmos72 avatar cosmos72 commented on June 14, 2024

Hello Hans-J,

then it means the first error you reported "The value NIL is not of type STMX::TLOG" has disappeared. good. This is what the fix was supposed to do.

About the two remaining issues:

  1. When you only start put-thread, you're basically executing something like
(let ((fifo nil))
  (loop for n from 0
    (push n fifo)))

No wonder it shutdowns/hangs after a while, as it keeps allocating until SBCL runs out of memory. Remember that TFIFO is unbounded... If you look, you will probably see SBCL low-level debugger (LDB) reporting an out-of-memory error and waiting for user input - if you use emacs & slime, check the inferior-lisp buffer

  1. I could not yet reproduce the error "BUG! Transaction 794 called (retry), but no TVARs to wait for changes." I will leave SBCL running on your example and see if it happens to me as well - I will open a separate bug report for this, as it is a different problem.

from stmx.

 avatar commented on June 14, 2024

Hi Massimiliano,

you are right. I was just looking at htop which updates in certain intervals and it didn't seem there would be a memory issue. But actually heap got exhausted very, very quickly. And I thought STM is slow! ;-)

Thanks a lot for your support,
Hans-J.

from stmx.

cosmos72 avatar cosmos72 commented on June 14, 2024

I just opened issue #8 for the remaining error

from stmx.

cosmos72 avatar cosmos72 commented on June 14, 2024

Solved the remaining error as well - see issue #8 for details.
The short answer is: you must call (bt:make-thread) instead of (sb-thread:make-thread)

from stmx.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.