Comments (9)

Ch41r05 commented on June 8, 2024

Hi @AmenRa, thanks for your help! I ran the sample code you provided and, sadly (?), everything worked fine: the output was as expected in both the console and the log file. It seems tqdm isn't the culprit here. I agree that the problem may lie in the fact that I'm using Windows. As I was saying, this kind of behavior seems to be linked to how files are opened under the hood. I'll dig deeper and keep you updated if I find anything interesting, since this behavior is strange and, I guess, not intended!

from retriv.

Ch41r05 commented on June 8, 2024

Hi @AmenRa, thanks for your support, I finally solved the issue.
The problem doesn't lie in retriv per se, but in the fact that it uses multiprocessing.
The logger was instantiated outside the if __name__ == "__main__": section of the code, so multiprocessing executed the instantiation more than once, leading to unexpected output. Once the instantiation of the logger was moved under the if, everything worked as expected. Leaving this here so other Windows users won't be caught out by this OS difference.
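
A minimal sketch of the fix described above, assuming a standard-library logging.FileHandler opened in "w" mode. The names configure_logger, square, and main, as well as the log file name, are hypothetical and not taken from the original script; multiprocessing.Pool stands in for whatever retriv does internally.

```python
import logging
import multiprocessing
import tempfile
from pathlib import Path


def configure_logger(log_path, name="example"):
    """Attach a file handler that truncates log_path (mode="w") when opened."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path, mode="w", encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("[%(asctime)s] %(levelname)s - %(message)s")
    )
    logger.addHandler(handler)
    return logger


def square(x):
    # Stand-in for the work a library hands to its worker processes.
    return x * x


def main():
    # Hypothetical log location, chosen only for this example.
    log_path = Path(tempfile.gettempdir()) / "guarded_logger_example.log"
    logger = configure_logger(log_path)
    logger.info("Building index...")
    with multiprocessing.Pool(processes=2) as pool:
        pool.map(square, range(10))
    logger.info("Index built.")


if __name__ == "__main__":
    # With the "spawn" start method (the default on Windows), every worker
    # re-imports this module. Only the parent process has
    # __name__ == "__main__", so the log file is opened once.
    main()
```

If configure_logger were called at module level instead, each spawned worker would reopen the same file in "w" mode and truncate it while the parent is still writing.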

AmenRa commented on June 8, 2024

Could you provide a reproducible snippet?

Ch41r05 commented on June 8, 2024

Hi @AmenRa,

Here is a script that reproduces the error. The logger class only has this problem when SearchEngine().index() is run; if that call is commented out, the logs are written correctly.

I may have done something wrong, but I can't find the cause, since none of the other tests I've run reproduced the issue.

binary_log.zip

AmenRa commented on June 8, 2024

Hi, I removed the line from commons.logger import Logger, as I do not know what commons is.

Everything seems to work fine on my end.
I get this in both my terminal and the log file:

[2023-09-02 09:16:28,031] {binary_log.py:55} INFO - Building index...
[2023-09-02 09:16:28,303] {binary_log.py:57} INFO - Index built.

Ch41r05 commented on June 8, 2024

@AmenRa I checked and confirmed that, even without the import, the console log works fine but the log file still has the problem. I tested this script on two different PCs, both running Windows. Could this be related to the operating system?

Also, may I ask about your configuration? Things like OS, Python version, and so on.
Thanks a lot
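
For comparing setups like this, a small standard-library sketch along the following lines could collect the relevant details. The helper name describe_environment is hypothetical; the multiprocessing start method is included on the assumption that it matters here, because its default differs between platforms.

```python
import multiprocessing
import platform
import sys


def describe_environment():
    """Collect the details that are useful when comparing two machines."""
    return {
        "os": platform.platform(),
        "python": sys.version.split()[0],
        # Windows defaults to "spawn"; other platforms have other defaults.
        "start_method": multiprocessing.get_start_method(),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```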

Ch41r05 commented on June 8, 2024

Hi @AmenRa, I did some further testing and found something interesting. If I change the order of the lines

 logger.info("Building index...")
 SearchEngine("new-index").index(collection, show_progress=False)
 logger.info("Index built.")

to

 logger.info("Building index...")
 logger.info("Index built.")
 SearchEngine("new-index").index(collection, show_progress=False)

the log file becomes empty. So, from the tests I conducted, it seems that retriv is somehow still sending something to stdout even with show_progress=False. If I comment out the indexing line, the log is written to the file correctly. I'm running Windows 11 with Python 3.10.0. I don't know whether I can give you any more information, but please tell me if you need anything else to reproduce the error in your environment.

Ch41r05 commented on June 8, 2024

Hi @AmenRa, quick update: I tried replacing the logging library with loguru, and the issue still occurs when the file is opened in "w" mode. After further digging, I found that the log file has NUL (yes, with just one "L") bytes before the last line. This seems to be an issue related to the underlying file-open call and multithreading/multiprocessing, which, given the high speed of retriv, I guess is being used under the hood. I'll see if I can dig up more information.
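
One mechanism that would produce this symptom, sketched here as an assumption rather than a confirmed diagnosis: if the same file is opened in write mode a second time while an earlier handle is still in use, the file is truncated but the earlier handle keeps its offset, so its next write lands past the new end of the file and the gap is filled with NUL bytes. The function name demo_truncation is hypothetical, and the example uses two handles in one process instead of two processes.

```python
import os
import tempfile


def demo_truncation(path):
    """Reopen a file in write mode while an earlier handle is still writing."""
    first = open(path, "wb")
    first.write(b"Building index...\n")
    first.flush()

    # A second open in write mode truncates the file to zero length.
    # The first handle is unaffected and keeps its current offset.
    second = open(path, "wb")
    second.close()

    # This write starts at the old offset, beyond the new end of the file.
    first.write(b"Index built.\n")
    first.close()

    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    fd, tmp_path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        print(demo_truncation(tmp_path))
    finally:
        os.remove(tmp_path)
```

The result is a run of NUL bytes as long as the first message, followed by the last line only.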

AmenRa commented on June 8, 2024

Hi, show_progress is passed to a tqdm progress bar in this fashion:

tqdm(
  ...
  disable=not show_progress,
)

Try running a loop with tqdm like this to verify whether tqdm is (one of) the causes:

from tqdm import tqdm

logger.info("Building index...")
for _ in tqdm(range(10_000), disable=True):
    continue
logger.info("Index built.")

Other tqdm options that I use are:

desc="something"
dynamic_ncols=True
mininterval=0.5

Honestly, I am not aware of other things that may interfere with a logger.
Also, I do not have a Windows machine, so maybe the issue is related to Windows / Windows + tqdm.
