Coder Social home page Coder Social logo

Comments (17)

picnixz avatar picnixz commented on July 17, 2024

I'm not really sure about this but would it actually make sense to only check for protocol-based queues and not the actual types of queues? strictly speaking, we only need a queue-like interface right? If so, I can work on that one.

from cpython.

shmilee avatar shmilee commented on July 17, 2024

After reading the codes above and this pull request, I tend to not check the queue or queue-like classes, as the main purpose of the code from lines 789 to 804 is to handle the special cases of isinstance(qspec, str) and isinstance(qspec, dict) and then update config['queue'] to be a valid queue instance.
Other cases are considered that config['queue'] is already a queue, like an instance of queue.Queue or any other Queue type, which is the common use case in the previous version of Python.

Perhaps we can ask @vsajip for some advice. Is there anything else we should consider?

from cpython.

vsajip avatar vsajip commented on July 17, 2024

Can you explain the use cases which require multiprocessing.Manager().Queue() (the AutoProxy)? Also, since logging isn't async, why would you want to pass an asyncio.queues.Queue()?

from cpython.

shmilee avatar shmilee commented on July 17, 2024

I have only used multiprocessing.Manager().Queue(). The asyncio.queues.Queue() mentioned is just a possible use case I imagined after searching the standard library for queues. So I can only talk about the former.

Using Python 3.12 on Linux, I change manager.Queue() (manager is a Manager instance ) to multiprocessing.Queue(), in my codes logging also works. So, currently, it seems that using manager.Queue() is not mandatory.
The reason for choosing it before is that in previous versions of Python, multiprocessing.Queue() may cause some errors related to multiprocessing Pool, spawn, and fork things, on Linux or Windows.

Sorry for leaking an example code with logging, because tests on Windows, and installing previous versions of Python, may take a lot of time. However, some information about using use a queue created with a manager to avoid some potential issues can be provided below after I searched:

  1. python-multiprocessing-queue-vs-multiprocessing-manager-queue
  2. https://docs.python.org/3.12/library/multiprocessing.html#pipes-and-queues

    If they really bother you then you can instead use a queue created with a manager.

  3. google keywords: RuntimeError: Queue objects should only be shared between processes through inheritance.

from cpython.

vsajip avatar vsajip commented on July 17, 2024

Do you have an actual use case where you're running into problems? Or is it just a theoretical exercise? If an actual use case, please give details.

from cpython.

shmilee avatar shmilee commented on July 17, 2024

Here is an actual use case,and quickly show the problem:

cd /tmp/
git clone --depth 1 --branch v0.8.3 https://github.com/shmilee/gdpy3.git
cd gdpy3/
python3.12 -m unittest src/processors/tests/test_multiprocessor.py

Then I will get errors from more and more ForkPoolWorker-XXX, XXX is an increasing number.

Process ForkPoolWorker-210:
Traceback (most recent call last):
  File "/usr/lib/python3.12/logging/config.py", line 581, in configure
    handler = self.configure_handler(handlers[name])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/logging/config.py", line 801, in configure_handler
    raise TypeError('Invalid queue specifier %r' % qspec)
TypeError: Invalid queue specifier <AutoProxy[Queue] object, typeid 'Queue' at 0x7f5250423d70>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.12/multiprocessing/pool.py", line 109, in worker
    initializer(*initargs)
  File "/tmp/gdpy3/src/glogger.py", line 167, in __call__
    logging.config.dictConfig(self.glogger_config)
  File "/usr/lib/python3.12/logging/config.py", line 914, in dictConfig
    dictConfigClass(config).configure()
  File "/usr/lib/python3.12/logging/config.py", line 588, in configure
    raise ValueError('Unable to configure handler '
ValueError: Unable to configure handler 'queue'

from cpython.

shmilee avatar shmilee commented on July 17, 2024
  1. Using multiprocessing.Queue() can fix it. see here

  2. I also test this patch on logging/config.py. The error also disappears.
    (cannot find class AutoProxy[Queue] to use in managers module, so just test with BaseProxy)

--- config.py	2024-05-31 10:16:24.905893630 +0800
+++ config.py-fix3.12	2024-05-31 13:18:07.223555882 +0800
@@ -786,8 +786,11 @@
                     # raise ValueError('No handlers specified for a QueueHandler')
                 if 'queue' in config:
                     from multiprocessing.queues import Queue as MPQueue
+                    from multiprocessing.managers import BaseProxy as MPBaseProxy                    
                     qspec = config['queue']
-                    if not isinstance(qspec, (queue.Queue, MPQueue)):
+                    print('============', isinstance(qspec, MPBaseProxy))
+                    print(qspec, type(qspec))
+                    if not isinstance(qspec, (queue.Queue, MPQueue, MPBaseProxy)):

from cpython.

vsajip avatar vsajip commented on July 17, 2024

No, that's not the use case I meant. What I meant is, do you have a real problem where you have to use the proxy and can't use multiprocessing.Queue?

from cpython.

shmilee avatar shmilee commented on July 17, 2024

No. For me, using logging with multiprocessing.Queue also works fine.

from cpython.

shmilee avatar shmilee commented on July 17, 2024

And I think using the proxy "manager.Queue()" should not be excluded / droped, it was working fine before 3.12.
Even in version 3.12, there is still a way to bypass the existing restrictions (Line-792), using the much more powerful dict qspec configuration. About the listener part, I haven't figured out how to set up a "manager.Queue()" together with handler yet.

import logging.config
import multiprocessing as mp
import asyncio

def main(i, qd):
    print('%d. qspec dict: ' % i, qd)
    config = {
        'version': 1,
        'handlers': {
            'sink': {
                'class': 'logging.handlers.QueueHandler',
                'queue': qd,
            },
        },
        'root': {
            'handlers': ['sink'],
        },
    }
    logging.config.dictConfig(config)
    l = logging.getLogger()
    s = l.handlers[0]
    print('USE: ', id(s.queue), s.queue, type(s.queue))
    print('------')

def get_mp_queue(manager=None, maxsize=None):
    q = manager.Queue(maxsize)
    print('GET: ', id(q), q)
    return q

def get_asy_queue(maxsize=None):
    q = asyncio.Queue(maxsize)
    print('GET: ', id(q), q)
    return q

if __name__ == '__main__':
    m = mp.Manager()
    main(1, {'()': get_mp_queue, 'manager': m, 'maxsize': 10})
    main(2, {'()': m.Queue, 'maxsize': 20})
    main(3, {'()': get_asy_queue, 'maxsize': 10})  # not async, so the user is responsible for his choice.

Output:

1. qspec dict:  {'()': <function get_mp_queue at 0x7fd645687100>, 'manager': <multiprocessing.managers.SyncManager object at 0x7fd644d2c440>, 'maxsize': 10}
GET:  140558252776848 <queue.Queue object at 0x7fd644bccbf0>
USE:  140558252776848 <queue.Queue object at 0x7fd644bccbf0> <class 'multiprocessing.managers.AutoProxy[Queue]'>
------
2. qspec dict:  {'()': <bound method BaseManager.register.<locals>.temp of <multiprocessing.managers.SyncManager object at 0x7fd644d2c440>>, 'maxsize': 20}
USE:  140558252622480 <queue.Queue object at 0x7fd644bccf50> <class 'multiprocessing.managers.AutoProxy[Queue]'>
------
3. qspec dict:  {'()': <function get_asy_queue at 0x7fd645532200>, 'maxsize': 10}
GET:  140558252955872 <Queue maxsize=10>
USE:  140558252955872 <Queue maxsize=10> <class 'asyncio.queues.Queue'>
------

from cpython.

vsajip avatar vsajip commented on July 17, 2024

And I think using the proxy "manager.Queue()" should not be excluded / droped

Why should it not be excluded, since multiprocessing.Queue () is available to be used? You can't check the class easily as AutoProxy could be used to proxy other things, not just queues.

from cpython.

picnixz avatar picnixz commented on July 17, 2024

My 2 cents: one concrete use case I can imagine is when you want to pass that queue to a pool without using a global variable and a custom initializer. AFAIK, you need a manager-based queue since multiprocessing.Queue cannot be pickled. There are other (possibly rare) issues mentioned at the end of the https://docs.python.org/3.10/library/multiprocessing.html#pipes-and-queues paragraph, so maybe some people might be affected?

When I suggested the protocol-based approach, it was mostly to deal with the issue arising from the proxy interface because checking the runtime type would not help in that case.

from cpython.

shmilee avatar shmilee commented on July 17, 2024

Why should it not be excluded, since multiprocessing.Queue () is available to be used?

Except for the mentioned reasons, 1) some people may get errors when updating to py3.12 due to their previous code using proxy queue, 2) py3.12 excludes but doesn't completely exclude proxy queue (dict qspec), I don't have any other new reasons.

You can't check the class easily as AutoProxy could be used to proxy other things, not just queues.

Yes, that's true and makes checking the type of proxy indeed tricky.

  1. If it really needs to be checked, I think picnixz's suggestion is quite good.
    (PS: In my personal code, I might directly use its __class__ to check the class, because, for personal projects, I treat the class names in the standard library basically unchanged. Of course, in a standard library this will be weird.)

  2. If the final decision is not to check proxy, perhaps the documentation should explicitly mention it.
    Because even though its class name is <class 'multiprocessing.managers.AutoProxy[Queue]'>, it still is a <queue.Queue object at xxx>. Perhaps, besides me, someone also believes they are the same AutoProxy[Queue] can be treated as queue.Queue.

SyncManager.register('Queue', queue.Queue)
SyncManager.register('JoinableQueue', queue.Queue)

from cpython.

vsajip avatar vsajip commented on July 17, 2024

Please check if the changes in the linked PR #120030 fix the problem reported..

from cpython.

shmilee avatar shmilee commented on July 17, 2024

I just checked. The PR fixes the problem. Thanks!

#!/bin/bash
TEST_DIR='/tmp/py3.12-logging-queue-fdjsa'
mkdir -v $TEST_DIR
cd $TEST_DIR/
wget -c https://github.com/python/cpython/pull/120030.patch
filterdiff --include=a/Lib/logging/config.py ./120030.patch > ./120030-part.patch
cat > ./test-mp.py <<EOF
import logging.config
import multiprocessing as mp
def main(q):
    config = {
        'version': 1,
        'handlers': {
            'sink': {
                'class': 'logging.handlers.QueueHandler',
                'queue': q,
            },
        },
        'root': {
            'handlers': ['sink'],
        },
    }
    logging.config.dictConfig(config)
if __name__ == '__main__':
    main(mp.Manager().Queue())
    print('===> PASS. Manager().Queue()')
    #print('-'*20)
    #main(mp.Manager().JoinableQueue())
    #print('===> PASS. Manager().JoinableQueue()')
EOF

echo -e '\n--------before and test--------\n'
python3.12 $TEST_DIR/test-mp.py

echo -e '\n--------patch and test--------\n'
md5sum /usr/lib/python3.12/logging/*.py
sudo cp -iv /usr/lib/python3.12/logging/config.py /usr/lib/python3.12/logging/config.py-backup
sudo patch -p3 -i  $TEST_DIR/120030-part.patch -d /usr/lib/python3.12/logging
diff -Nur /usr/lib/python3.12/logging/config.py-backup /usr/lib/python3.12/logging/config.py
python3.12 $TEST_DIR/test-mp.py

echo -e '\n--------back--------\n'
sudo mv -iv /usr/lib/python3.12/logging/config.py-backup /usr/lib/python3.12/logging/config.py
sudo rm -iv /usr/lib/python3.12/logging/config.py.orig
md5sum /usr/lib/python3.12/logging/*.py

Output:

--------patch and test--------

5bbe3f40092754f7c0048141d8e18d46  /usr/lib/python3.12/logging/config.py
9e8979dac3100e20627ffb5ec3eb0f88  /usr/lib/python3.12/logging/handlers.py
1b36e4639409218821586449d57ed36b  /usr/lib/python3.12/logging/__init__.py
'/usr/lib/python3.12/logging/config.py' -> '/usr/lib/python3.12/logging/config.py-backup'
patching file config.py
Hunk #1 succeeded at 786 (offset 5 lines).
--- /usr/lib/python3.12/logging/config.py-backup	2024-06-05 10:54:15.791962361 +0800
+++ /usr/lib/python3.12/logging/config.py	2024-06-05 10:54:15.813962361 +0800
@@ -786,8 +786,10 @@
                     # raise ValueError('No handlers specified for a QueueHandler')
                 if 'queue' in config:
                     from multiprocessing.queues import Queue as MPQueue
+                    from multiprocessing import Manager as MM
+                    proxy_queue = MM().Queue()
                     qspec = config['queue']
-                    if not isinstance(qspec, (queue.Queue, MPQueue)):
+                    if not isinstance(qspec, (queue.Queue, MPQueue, type(proxy_queue))):
                         if isinstance(qspec, str):
                             q = self.resolve(qspec)
                             if not callable(q):
===> PASS. Manager().Queue()

from cpython.

shmilee avatar shmilee commented on July 17, 2024

Before closing the issue, I will paste a potential error that may never happen. Not sure if anyone has used JoinableQueue and logging together, so only paste it here and let someone who is searching JoinableQueue issue confirm their problem.

if __name__ == '__main__':
    main(mp.Manager().Queue())
    print('===> PASS. Manager().Queue()')
    print('-'*20)
    main(mp.Manager().JoinableQueue())
    print('===> PASS. Manager().JoinableQueue()')
--------------------
Traceback (most recent call last):
  File "/usr/lib/python3.12/logging/config.py", line 581, in configure
    handler = self.configure_handler(handlers[name])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/logging/config.py", line 803, in configure_handler
    raise TypeError('Invalid queue specifier %r' % qspec)
TypeError: Invalid queue specifier <AutoProxy[JoinableQueue] object, typeid 'JoinableQueue' at 0x7fc795981e80>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/tmp/py3.12-logging-queue-fdjsa/test-mp.py", line 23, in <module>
    main(mp.Manager().JoinableQueue())
  File "/tmp/py3.12-logging-queue-fdjsa/test-mp.py", line 17, in main
    logging.config.dictConfig(config)
  File "/usr/lib/python3.12/logging/config.py", line 916, in dictConfig
    dictConfigClass(config).configure()
  File "/usr/lib/python3.12/logging/config.py", line 588, in configure
    raise ValueError('Unable to configure handler '
ValueError: Unable to configure handler 'sink'

from cpython.

shmilee avatar shmilee commented on July 17, 2024

Thank you again! @vsajip @picnixz

from cpython.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.