kts-o7 / better_bing_image_downloader

Python library to download images from Bing in bulk

Home Page: https://pypi.org/project/better-bing-image-downloader/

License: MIT License

Python 100.00%
bing-image-downloader bing-image-scrapping image-dataset-maker image-datasets image-downloader-python image-scraper image-scrapping python-image-download python-image-downloader python-image-webcrawler

better_bing_image_downloader's Introduction

Better Bing Image Downloader

Disclaimer

This program lets you download tons of images from Bing. Please do not download or use any image that violates its copyright terms.

Installation

git clone https://github.com/KTS-o7/better_bing_image_downloader.git
python -m venv ./env
source env/bin/activate
cd better_bing_image_downloader
pip install .

or

pip install better-bing-image-downloader

PyPI

Package Link: https://pypi.org/project/better-bing-image-downloader/

Usage

Using as a Package:

from better_bing_image_downloader import downloader

downloader(query_string, limit=100, output_dir='dataset', adult_filter_off=True,
           force_replace=False, timeout=60, filter="", verbose=True, badsites=[], name='Image')

query_string : String to be searched.
limit : (optional, default is 100) Number of images to download.
output_dir : (optional, default is 'dataset') Name of output dir.
adult_filter_off : (optional, default is True) Enable or disable adult filtering.
force_replace : (optional, default is False) Delete folder if present and start a fresh download.
timeout : (optional, default is 60) timeout for connection in seconds.
filter : (optional, default is "") filter, choose from [line, photo, clipart, gif, transparent]
verbose : (optional, default is True) Enable downloaded message.
badsites : (optional, default is empty list) List of sites to exclude; the query will not access them.
name : (optional, default is 'Image') Can add a custom name for the images that are downloaded.
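
A minimal sketch of a call that uses the parameters listed above. The helper names `build_download_kwargs` and `download` are assumptions for illustration and are not part of the package. The package import is deferred so that the validation part can be run without the package installed.

```python
# Filter values documented in the parameter list above ("" means no filter).
ALLOWED_FILTERS = {"", "line", "photo", "clipart", "gif", "transparent"}


def build_download_kwargs(limit=100, output_dir="dataset", image_filter="", name="Image"):
    """Hypothetical helper: validate a few options and return keyword arguments."""
    if image_filter not in ALLOWED_FILTERS:
        raise ValueError(f"unsupported filter: {image_filter!r}")
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return {
        "limit": limit,
        "output_dir": output_dir,
        "filter": image_filter,
        "name": name,
    }


def download(query, **options):
    """Call the package's downloader with validated options (needs network access)."""
    # Imported here so the helper above works even without the package installed.
    from better_bing_image_downloader import downloader

    downloader(query, **build_download_kwargs(**options))
```

With the package installed, `download("cool doggos", limit=20, image_filter="photo")` would pass `limit=20` and `filter="photo"` through to `downloader` and leave the other parameters at their documented defaults.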

Using as a Command Line Tool:

    git clone https://github.com/KTS-o7/better_bing_image_downloader.git
    cd better_bing_image_downloader
    python -m venv ./env
    source env/bin/activate
    pip install -r requirements.txt
    cd better_bing_image_downloader
    # This is an example query
    python multidownloader.py "cool doggos" --engine "Bing"  --max-number 50 --num-threads 5 --driver "firefox_headless"

Command Line Arguments:

multidownloader.py "keywords" [-h] [--engine {Google,Bing}] [--driver {chrome_headless,chrome,api,firefox,firefox_headless}] [--max-number MAX_NUMBER] [--num-threads NUM_THREADS] [--timeout TIMEOUT] [--output OUTPUT] [--safe-mode] [--face-only] [--proxy_http PROXY_HTTP] [--proxy_socks5 PROXY_SOCKS5] [--type {clipart,linedrawing,photograph}] [--color COLOR]
  • "keywords": Keywords to search. ("in quotes")
  • -h, --help: Show the help message and exit
  • --engine, -e: Image search engine. Choices are "Google" and "Bing". Default is "Bing".
  • --driver, -d: Driver to use for the search. Choices are "chrome_headless", "chrome", "api", "firefox", "firefox_headless". Default is "firefox_headless".
  • --max-number, -n: Maximum number of images to download for the keywords. Default is 100.
  • --num-threads, -j: Number of threads used to download images concurrently. Default is 50.
  • --timeout, -t: Timeout in seconds when downloading an image. Default is 10.
  • --output, -o: Output directory to save downloaded images. Default is "./download_images".
  • --safe-mode, -S: Turn on safe search mode. (Only effective in Google)
  • --face-only, -F: Only search for faces.
  • --proxy_http, -ph: Set http proxy (e.g. 192.168.0.2:8080)
  • --proxy_socks5, -ps: Set socks5 proxy (e.g. 192.168.0.2:1080)
  • --type, -ty: What kinds of images to download. Choices are "clipart", "linedrawing", "photograph".
  • --color, -cl: Specify the color of desired images.
# Example usage
python multidownloader.py "Cool Doggos" --engine "Google" --driver "chrome_headless" --max-number 50 --num-threads 10 --timeout 60 --output "./doggo_images" --safe-mode --proxy_http "192.168.0.2:8080" --type "photograph" --color "blue"
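
The same kind of command can be assembled from Python. This sketch builds an argument list from flags and defaults documented in the list above. `build_command` is a hypothetical helper, not part of the repository, and it covers only a subset of the flags.

```python
import sys


def build_command(keywords, engine="Bing", driver="firefox_headless",
                  max_number=100, num_threads=50, timeout=10,
                  output="./download_images", safe_mode=False):
    """Hypothetical helper: return an argv list for multidownloader.py."""
    cmd = [
        sys.executable, "multidownloader.py", keywords,
        "--engine", engine,
        "--driver", driver,
        "--max-number", str(max_number),
        "--num-threads", str(num_threads),
        "--timeout", str(timeout),
        "--output", output,
    ]
    if safe_mode:
        # Documented as only effective with the Google engine.
        cmd.append("--safe-mode")
    return cmd
```

The resulting list can be handed to `subprocess.run(build_command("cool doggos", max_number=50), check=True)` from the directory that contains multidownloader.py. Passing a list avoids shell quoting problems with multi-word keywords.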

License

This project is licensed under the terms of the MIT license.

Contact

If you have any questions or feedback, please contact us at email.

better_bing_image_downloader's People

Contributors

hyoungsooo, kts-o7, theharshithh

better_bing_image_downloader's Issues

Dependency Installation Error with memoria Leading to Failed Execution in better-bing-image-downloader

Environment:

  • better-bing-image-downloader Version: 1.1.0
  • Python Version: Tried on 3.9.5 and 3.11.8
  • Operating System: Windows 11

Description:
The better-bing-image-downloader installs without issues, but upon execution, it fails with an error related to missing dependencies, specifically memoria. Following the error chain to install memoria leads to a deprecated dependency issue involving sklearn, which is now replaced by scikit-learn. Attempts to rectify this through environment variables and direct installation of scikit-learn were unsuccessful.

Steps to Reproduce:

  1. Install better-bing-image-downloader version 1.1.0.
  2. Run a script utilizing better-bing-image-downloader.
  3. The execution prompts an installation of memoria due to a missing dependency.
  4. Installing memoria leads to an error involving a deprecated package sklearn, suggesting the use of scikit-learn instead.

Expected Behavior:
better-bing-image-downloader should run without requiring the manual resolution of deprecated dependencies, particularly between sklearn and scikit-learn.

Actual Behavior:
The tool fails to run due to a cascade of dependency-related errors, starting with memoria and leading to issues with deprecated sklearn package installations.

Impact:
This issue blocks the usage of better-bing-image-downloader, as it can't execute its core functionality without resolving these dependency issues, impacting project progress where image downloading is required.

Logs/Error Messages:
For a detailed traceback and error messages, please refer to the initial queries in this ticket submission.

Suggested Fix or Workaround:
A review and update of the dependencies within better-bing-image-downloader and its related packages (memoria, and further dependencies therein) could potentially resolve this issue. Ensuring compatibility with scikit-learn instead of the deprecated sklearn package could be a critical part of the solution.
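
One way to inspect the environment described in this report is to list which of the named distributions are installed. This is a diagnostic sketch only, using the standard library's `importlib.metadata`. The helper name is hypothetical, and the snippet does not fix the dependency chain.

```python
from importlib import metadata


def installed_versions(dist_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in dist_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names taken from the report above.
report = installed_versions(
    ["better-bing-image-downloader", "memoria", "sklearn", "scikit-learn"]
)
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```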

multidownloader example in readme is not working

I ran the following example:

python multidownloader.py "Cool Doggos" --engine "Google" --driver "chrome_headless" --max-number 50 --num-threads 10 --timeout 60 --output "./doggo_images" --safe-mode --proxy_http "192.168.0.2:8080" --type "photograph" --color "blue"

And it finds 0 images.

I am on Ubuntu 22.04, and Bing works. I think it's something to do with Chrome.

Call back for progress

First of all, thank you for your library. I am using it in a PyQt5 Python application. Is there any way to pass progress updates to my calling function so that I can update the progress bar in my GUI?
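
The `downloader` signature documented in the README shows no progress callback. One possible workaround, sketched here as an assumption rather than a library feature, is to poll the output directory and count files while the download runs in another thread. The helper names are hypothetical, and the sketch assumes every file under the output directory is a downloaded image.

```python
from pathlib import Path


def count_downloaded(output_dir):
    """Count files anywhere under output_dir (assumed to be downloaded images)."""
    root = Path(output_dir)
    if not root.exists():
        return 0
    return sum(1 for path in root.rglob("*") if path.is_file())


def progress_percent(output_dir, limit):
    """Return download progress as an integer percentage, capped at 100."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return min(100, count_downloaded(output_dir) * 100 // limit)
```

In a PyQt5 application, a `QTimer` could call `progress_percent` periodically and pass the result to the progress bar's `setValue`.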

Add method to rename the downloaded images

Add functionality to rename the images in a custom format when downloading.
It must be a command line argument. Please fork the repository and open a PR with the changes. It will be reviewed and merged if found useful.

Document

Hello,

Is there a brief explanation on why it is better?

Improve the README.md

  • Add a comprehensive user manual to the README.
  • Add code snippets to use as examples.
  • Add useful statistics
