
valentinitnelav / img-with-box-from-excel


boxcel: Integrate Excel with Python for visualizing images with their corresponding bounding boxes for object detection annotation workflows

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%
annotation-tool excel image-annotation image-annotation-macro image-annotation-tool via pyinstaller tkinter xlwings image-labeling

img-with-box-from-excel's Introduction


Overview - What is this about?

How to use the free functionality of the xlwings library to integrate Python with Excel for visualizing annotated images with their associated bounding boxes for object annotation workflows in your object detection project.

excel_main_view image_with_box

The Syrphid image was downloaded from Wikipedia.

In our AI object detection project, we used VGG Image Annotator (VIA) to manually annotate insects in images, that is, to manually place a bounding box and record general taxa information. However, it is difficult to filter and edit metadata fields with VIA, while Excel is more user friendly for such tasks. Therefore, it was necessary to visualize the annotated images directly from Excel.

This repository provides the tools to view images directly from within Excel, together with the associated bounding box of an annotated object.

From Excel, one can click on any row, and a Python script will read the image path together with the coordinates of the bounding box and display the image in a window together with the bounding box.

Installation - How to make it work?

Installation of xlwings addin (for Windows)

  • The xlwings addin needs conda, and installing conda also installs Python. Follow this tutorial from Anaconda's documentation: Installing on Windows. To test whether conda is already installed, open the Anaconda Prompt (or Anaconda Powershell Prompt) from the Start menu, then type the command conda list. A list of installed packages should appear.
  • You also need git installed and you can download the executable file from here
  • I assume that Python is installed; if not, you can check this tutorial;
  • To check if Python is installed, in Anaconda Prompt type python and you should see something like this:
    C:\Users\your_user_name> python
    
    Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 
  • Install the Excel add-in with the commands:
    pip install xlwings
    xlwings addin install
    # You should see something like:
    # xlwings version: 0.28.5
    # Successfully installed the xlwings add-in!
  • In any Excel file, enable macros: menu File > Options > Trust Center > Trust Center Settings > Macro Settings > "Enable all macros...". For safety reasons, you can disable this after you are done with your work.
  • In any Excel file, enable the xlwings add-in: menu File > Options > Add-ins > button "Go..." (usually at the bottom, to the right of "Manage: Excel Add-ins"); click "Browse" and search for a path similar to this one: C:\Users\your_user_name\AppData\Roaming\Microsoft\Excel\XLSTART; select the file xlwings.xlam; OK; YES (if asked to replace the existing file); OK again;
  • At this point, you should see a new menu/tab named "xlwings" in the Excel file (after the Help menu/tab);

Optionally, if you prefer to run the boxcel tool from the command line, clone this repository to a location of your choice, for example C:\Users\your_user_name\Documents, and then install the dependencies:

cd C:\Users\%USERNAME%\Documents
git clone https://github.com/valentinitnelav/img-with-box-from-excel
cd img-with-box-from-excel
pip install -r requirements.txt
# You'll get a series of messages and finally should see something like:
# Successfully installed Pillow-9.3.0 et-xmlfile-1.1.0 numpy-1.23.5 openpyxl-3.0.10 etc.
# If they are already installed, then you will see messages like:
# "Requirement already satisfied: ..."

Excel data structure:

  • In our case, the annotation data is stored in an Excel file (referred to below as data_file.xlsx) where each row represents information about a single bounding box.
  • The first row of the Excel file must act as the header of the data and must not have empty cells between cells with data (each column must have a name);
  • Each row must have at least the following columns (exactly these names) so that the tool works without any other adjustments:
    • windows_img_path: string type, the full/absolute path to the image, e.g. I:\data\field-images\2021-07-06\Centaurea-scabiosa-01\IMG_0377.JPG;
    • id_box: integer, the id of each box as recorded by the VGG Image Annotator (VIA);
    • x, y, width & height integer type columns as given by VIA; these are the bounding box coordinates, where x & y represent the upper left corner (the origin).
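The column layout above maps directly onto the two-corner box that Pillow's ImageDraw.Draw.rectangle accepts. The following is a minimal sketch of that conversion, not the repository's actual code; the function name via_box_to_corners is hypothetical.

```python
def via_box_to_corners(x, y, width, height):
    """Convert a VIA-style box (upper-left x, y plus width, height)
    to the [x0, y0, x1, y1] corner list used for drawing rectangles."""
    return [x, y, x + width, y + height]


# Example values are made up for illustration.
corners = via_box_to_corners(x=120, y=80, width=200, height=150)
print(corners)  # [120, 80, 320, 230]
```

The resulting list can then be passed as the xy argument of a rectangle-drawing call.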

excel_data_structure

Run the tool

Via the Graphical User Interface (GUI)

Download the file boxcel.exe from this repository and save it anywhere on your computer. Or if you already cloned this repository (see above), then navigate with Windows Explorer to the folder where you cloned this repository and there you find boxcel.exe. This file can be moved anywhere on your computer and it will still execute the Python tool with the associated GUI without the need of installing Python. However, the 3rd party software, xlwings addin, requires conda to be installed (see above).

Open boxcel.exe. Click the button "Browse & execute". Choose the desired Excel file (e.g. data_file.xlsx). The file must respect the data structure described above. Click "Open". The tool will then create the needed Python file (e.g. data_file.py) in the same folder as the Excel file. If successful, you will be notified with a message like "All good! Python code generated. Choose another file or close the application."

Open the Excel file, click on any cell, go to the xlwings menu, and press the green play button named "Run main". The tool will read the current row information with the image path from the column windows_img_path, the id_box and the box coordinates from x, y, width & height columns, and will display the image with its bounding box and a label with the box id. It will work on any sheet in your data_file.xlsx file as long as it can find the required columns mentioned above and they contain valid values.

1) Open boxcel.exe
2) Click "Browse & execute"
3) Open/Choose Excel file
4) Message when successful
5) Can close the application or run it for another file

The documentation for how PyInstaller was used to produce the boxcel.exe is here.

Alternatively, if all Python dependencies are in place you can also run the tool like this:

Navigate with Windows Explorer to the folder where you cloned this repository, then to the img-with-box-from-excel\src\boxcel and right-click on the gui.py file and choose "Open with..." then Python. This will start the GUI.

If the above fails, then you can also start the GUI script from command line:

# In a terminal/command line navigate to the cloned repository and then to the src/boxcel folder
cd C:\Users\%USERNAME%\Documents\img-with-box-from-excel\src\boxcel
python gui.py

Via command line

  • Assuming you have a file called data_file.xlsx (with the requirements from above), to make it ready to run with this xlwings tool, in the Anaconda Prompt (or Anaconda Powershell Prompt) do this:
# In a terminal/command line navigate to the cloned repository and then to the src/boxcel folder
cd C:\Users\%USERNAME%\Documents\img-with-box-from-excel\src\boxcel
# Execute the start_project.py which takes as argument the path to your Excel file:
python start_project.py path\to\your\data_file.xlsx # or python3 ...
# Example:
# python start_project.py C:\Users\%USERNAME%\Downloads\data_file.xlsx

# You should see something like:
# xlwings version: 0.28.5
# Copied the Python code from C:\Users\%USERNAME%\Documents\img-with-box-from-excel\src\boxcel\display_images.py to C:\Users\%USERNAME%\Downloads\data_file.py
# All good!

This just created data_file.py in the same folder as data_file.xlsx.
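The copy step reported in the log above can be sketched as follows. This is an illustration under the assumption that the script is simply copied next to the workbook and renamed after it; the function name copy_script_next_to_workbook is hypothetical and is not taken from start_project.py.

```python
import shutil
from pathlib import Path


def copy_script_next_to_workbook(template_py, excel_file):
    """Copy the template script next to the Excel file, giving it the
    same base name as the workbook but a .py extension."""
    target = Path(excel_file).with_suffix(".py")
    shutil.copyfile(template_py, target)
    return target
```

For example, calling it with display_images.py and C:\Users\your_user_name\Downloads\data_file.xlsx would write C:\Users\your_user_name\Downloads\data_file.py.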

Additional resources for xlwings and the xlwings add-in:

How to cite this repository?

If this work helped you in any way and you would like to cite it, you can do so with a DOI from Zenodo, like:

Valentin Ștefan. (2022). boxcel (v2.0.0) - Integrate Excel with Python for visualizing images with their corresponding bounding boxes for object detection annotation workflows. Zenodo. https://doi.org/10.5281/zenodo.7487550

img-with-box-from-excel's People

Contributors

valentinitnelav

img-with-box-from-excel's Issues

Error message - "ValueError: incorrect coordinate type"

Not sure at the moment why this is happening.

File: dt_annotations_with_cor_plants_2023_04_20.xlsx
Note that this doesn't happen at the moment with hymenoptera_Demetra.xlsm. Check why - is it the file type?


---------------------------
Error
---------------------------
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "i:\artificial-intelligence\field-img-taxa-annotation\archive\dt_annotations_with_cor_plants_2023_04_20.py", line 82, in main
    display_img()
  File "i:\artificial-intelligence\field-img-taxa-annotation\archive\dt_annotations_with_cor_plants_2023_04_20.py", line 63, in display_img
    draw.rectangle(xy=coord, outline='Red', width=3)
  File "C:\ProgramData\Anaconda3\lib\site-packages\PIL\ImageDraw.py", line 281, in rectangle
    self.draw.draw_rectangle(xy, ink, 0, width)
ValueError: incorrect coordinate type

Press Ctrl+C to copy this message to the clipboard.
---------------------------
OK   
---------------------------
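The cause is not identified in this issue. One possibility, stated here as an assumption, is that one of the x, y, width or height cells holds a value that is not a plain number (empty cell, text, NaN) by the time it reaches draw.rectangle. The sketch below, with the hypothetical name clean_box_values, coerces the four values to plain Python integers and fails with a readable message otherwise.

```python
import math


def clean_box_values(x, y, width, height):
    """Return the four box values as plain Python ints,
    or raise a ValueError that names the offending column."""
    cleaned = {}
    for name, value in {"x": x, "y": y, "width": width, "height": height}.items():
        try:
            number = float(value)
        except (TypeError, ValueError):
            raise ValueError(
                f"Column '{name}' holds {value!r}, which is not a number."
            ) from None
        if not math.isfinite(number):
            raise ValueError(f"Column '{name}' is empty or not a finite number.")
        cleaned[name] = int(round(number))
    return cleaned
```

Running the values through such a check before drawing would at least show which column triggers the error.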

ModuleNotFoundError: No module named your_excel_file_name

---------------------------
Error
---------------------------
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'df_iou_false_spl_box_1'

Press Ctrl+C to copy this message to the clipboard.
---------------------------
OK   
---------------------------

Such an error message indicates that the df_iou_false_spl_box_1.py file is missing. Follow this: https://github.com/valentinitnelav/img-with-box-from-excel#run-the-tool

Implement informative error messages if the needed columns are missing

If any of these columns is missing then one gets a cryptic error message at the moment: windows_img_path, id_box, x, y, width & height.

Example of the current, unhelpful error message when the column windows_img_path is missing (e.g. renamed by accident, or simply missing):


---------------------------
Error
---------------------------
Traceback (most recent call last):
  File "c:\python38\lib\site-packages\pandas\core\indexes\base.py", line 2895, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas\_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'windows_img_path'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\...\field-img-taxa-annotation\vba_python_img_with_box\xlwings_test\xlwings_test_project.py", line 68, in main
    display_img()
  File "C:\...\field-img-taxa-annotation\vba_python_img_with_box\xlwings_test\xlwings_test_project.py", line 44, in display_img
    img_path = df['windows_img_path'][0]
  File "c:\python38\lib\site-packages\pandas\core\frame.py", line 2902, in __getitem__
    indexer = self.columns.get_loc(key)
  File "c:\python38\lib\site-packages\pandas\core\indexes\base.py", line 2897, in get_loc
    raise KeyError(key) from err
KeyError: 'windows_img_path'

Press Ctrl+C to copy this message to the clipboard.
---------------------------
OK   
---------------------------
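A check along the following lines could run before any column is accessed. It is a minimal sketch that works on a plain list of header names; the names REQUIRED_COLUMNS and check_required_columns are hypothetical, and only the column names come from the README.

```python
REQUIRED_COLUMNS = ("windows_img_path", "id_box", "x", "y", "width", "height")


def check_required_columns(header):
    """Raise a ValueError listing every required column that is
    absent from the header row (a sequence of cell values)."""
    present = {str(name).strip() for name in header if name is not None}
    missing = [name for name in REQUIRED_COLUMNS if name not in present]
    if missing:
        raise ValueError(
            "Missing required column(s): " + ", ".join(missing)
            + ". Check the header row of the Excel sheet."
        )
```

With a pandas DataFrame, the header could be passed as list(df.columns).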

Windows - cannot open the xlsm created file; works on Linux though

On Windows, when trying to open the newly created xlsm file with Excel, I get this error message: "Excel cannot open the file '...xlsm' because the file format or file extension is not valid."
There is no problem on Linux & LibreOffice.

An xlsm test file was created with:

C:\Users\vs66tavy\Documents\img-with-box-from-excel\src\boxcel>python start_project.py C:\Users\vs66tavy\Downloads\hymenoptera_sample.xlsx
xlwings version: 0.28.3
Writing sheet Andrenidae to xlsm file C:\Users\vs66tavy\Downloads\hymenoptera_sample\hymenoptera_sample.xlsm
Writing sheet No family to xlsm file C:\Users\vs66tavy\Downloads\hymenoptera_sample\hymenoptera_sample.xlsm
Writing sheet _xlwings.conf to xlsm file C:\Users\vs66tavy\Downloads\hymenoptera_sample\hymenoptera_sample.xlsm
Path to the current executed python file is: C:\Users\vs66tavy\Documents\img-with-box-from-excel\src\boxcel\display_images.py
Copied the Python code from C:\Users\vs66tavy\Documents\img-with-box-from-excel\src\boxcel\display_images.py to C:\Users\vs66tavy\Downloads\hymenoptera_sample\hymenoptera_sample.py
All good!

Allow multiple boxes per image

There might be a need to visualise the "ground truth" and the predicted bounding boxes on the same image.
For example, if there are extra box coordinates (predictions) in the Excel file for a given row, display those as well, but in a different color.
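One way this could look is sketched below. The prediction column names (pred_x, pred_y, pred_width, pred_height) and the function boxes_for_row are hypothetical; the issue does not specify how the extra coordinates would be named.

```python
def boxes_for_row(row, prefixes=(("", "red"), ("pred_", "blue"))):
    """Collect (corner_list, color) pairs for every complete box in a row.

    `row` is a dict of column name -> value. A box is skipped when any
    of its four columns is absent or None.
    """
    boxes = []
    for prefix, color in prefixes:
        keys = [prefix + name for name in ("x", "y", "width", "height")]
        if all(row.get(key) is not None for key in keys):
            x, y, width, height = (row[key] for key in keys)
            boxes.append(([x, y, x + width, y + height], color))
    return boxes
```

Each returned pair could then be drawn with a call such as draw.rectangle(xy=coords, outline=color, width=3), so that ground truth and prediction appear in different colors.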

Implement informative error messages if the values in needed columns are missing or are unexpected

If any of these columns (windows_img_path, id_box, x, y, width & height) contain unexpected values, then one gets a cryptic error message at the moment.

Example of a current not-user-friendly error message when in the column windows_img_path one has a number instead of the expected absolute file path:


---------------------------
Error
---------------------------
Traceback (most recent call last):
  File "c:\python38\lib\site-packages\PIL\Image.py", line 3096, in open
    fp.seek(0)
AttributeError: 'numpy.float64' object has no attribute 'seek'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\...\field-img-taxa-annotation\vba_python_img_with_box\xlwings_test\xlwings_test_project.py", line 68, in main
    display_img()
  File "C:\...\field-img-taxa-annotation\vba_python_img_with_box\xlwings_test\xlwings_test_project.py", line 46, in display_img
    with Image.open(img_path) as im:
  File "c:\python38\lib\site-packages\PIL\Image.py", line 3098, in open
    fp = io.BytesIO(fp.read())
AttributeError: 'numpy.float64' object has no attribute 'read'

Press Ctrl+C to copy this message to the clipboard.
---------------------------
OK   
---------------------------

When the absolute path has a typo, the error message is clearer and can be left as it is for now:

---------------------------
Error
---------------------------
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\...\field-img-taxa-annotation\vba_python_img_with_box\xlwings_test\xlwings_test_project.py", line 68, in main
    display_img()
  File "C:\...\field-img-taxa-annotation\vba_python_img_with_box\xlwings_test\xlwings_test_project.py", line 46, in display_img
    with Image.open(img_path) as im:
  File "c:\python38\lib\site-packages\PIL\Image.py", line 3092, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\...\\field-img\\2021-07-06\\Centaurea-scabiosa-01\\IMG_0377.JPG'

Press Ctrl+C to copy this message to the clipboard.
---------------------------
OK   
---------------------------
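For the windows_img_path column, a small guard could turn the AttributeError shown earlier into a message that names the column. This is only a sketch; check_image_path is a hypothetical helper, not part of the tool.

```python
import os


def check_image_path(value):
    """Return the path unchanged if it is a string pointing to an
    existing file; otherwise raise an error that explains the problem."""
    if not isinstance(value, str) or not value.strip():
        raise TypeError(
            "Column 'windows_img_path' should hold an absolute file path "
            f"(text), but holds {value!r}."
        )
    if not os.path.isfile(value):
        raise FileNotFoundError(f"Image file not found: {value}")
    return value
```

The FileNotFoundError branch keeps the behaviour of the clearer message above, while the TypeError branch covers numbers and empty cells.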

Linux vs Windows differences for __file__ (path to current/calling script file)

It looks like __file__ & os.path.realpath(__file__) return different things depending on the OS (Windows vs Linux).

On Windows:

cd  C:\Users\vs66tavy\Documents\img-with-box-from-excel\src\boxcel
python start_project.py C:\Users\vs66tavy\Downloads\hymenoptera_sample.xlsx 
# Gives an error like
Traceback (most recent call last):
  File "start_project.py", line 94, in <module>
    with open(display_images_py_file,'r') as firstfile, open(target_py_file,'w') as secondfile:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\vs66tavy\\Downloads\\display_images.py'

print('path to current file using __file__ is: ' + __file__)
# path to current file using __file__ is: 
start_project.py

print('path to current file using os.path.realpath(__file__) is: ' + os.path.realpath(__file__))
# path to current file using os.path.realpath(__file__) is: 
C:\Users\vs66tavy\Downloads\start_project.py

On Linux:

cd '/home/vs66tavy/iDiv Dropbox/Valentin Stefan/GitHub/img-with-box-from-excel/src/boxcel'
python3 start_project.py '/home/vs66tavy/iDiv Dropbox/Valentin Stefan/GitHub/img-with-box-from-excel/sandbox/hymenoptera_sample.xlsx'

print('path to current file using __file__ is: ' + __file__)
# path to current file using __file__ is: 
/home/vs66tavy/iDiv Dropbox/Valentin Stefan/GitHub/img-with-box-from-excel/src/boxcel/start_project.py

print('path to current file using os.path.realpath(__file__) is: ' + os.path.realpath(__file__))
# path to current file using os.path.realpath(__file__) is: 
/home/vs66tavy/iDiv Dropbox/Valentin Stefan/GitHub/img-with-box-from-excel/src/boxcel/start_project.py

On Windows, os.path.realpath(__file__) returns a path inside the folder of the Excel file that is used as the argument for start_project.py (C:\Users\vs66tavy\Downloads\start_project.py) and not the real path to the executed script.
On Linux, this behaves as expected: it returns the path to the executed script. No error is therefore given under Linux and the tool works as expected.

Python versions:

python # On Windows it returns this:
Python 3.8.2 (tags/v3.8.2:7b3ab59, Feb 25 2020, 23:03:10) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.

python # On Ubuntu it returns this:
Command 'python' not found, did you mean:
  command 'python3' from deb python3
  command 'python' from deb python-is-python3

python3
Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

Need to fix this.
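A possible explanation, offered as an assumption rather than a confirmed diagnosis, is the Python version and not the OS. Before Python 3.9, __file__ of the main script is the relative path given on the command line, so if the working directory is changed (for example with os.chdir) before os.path.realpath(__file__) is called, the relative name is resolved against the new directory. From Python 3.9 on, __file__ of the main script is already absolute. The sketch below resolves the script path against the directory recorded at start-up; resolve_script_path is a hypothetical helper.

```python
import os


def resolve_script_path(file_attr, start_dir):
    """Return an absolute script path.

    `file_attr` is the value of __file__ (relative or absolute) and
    `start_dir` is the working directory recorded before any os.chdir.
    An absolute `file_attr` is returned as is (normalised).
    """
    return os.path.normpath(os.path.join(start_dir, file_attr))


# Intended use at the very top of the script, before any os.chdir call:
#   START_DIR = os.getcwd()
#   SCRIPT_PATH = resolve_script_path(__file__, START_DIR)
```

Paths to sibling files such as display_images.py could then be built from os.path.dirname(SCRIPT_PATH), independent of later directory changes.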

Simplify usage by not creating an xlsm file - xlwings should read directly from an xlsx file

If the xlwings add-in is installed, then the python script (display_images.py) should work directly from an xlsx file: https://docs.xlwings.org/en/stable/addin.html#xlwings-addin

Then there is no need to run the following in start_project.py:

os.chdir(path_to_xlsx_file)
os.system("xlwings quickstart " + xlsx_file_name_without_extension)

There is also no need to copy the data from xlsx to xlsm. This will simplify the code in start_project.py as well.

Define function to create the xlsm file and its corresponding python xlwings script

The function should operate like this:

  • is executed from the command line;
  • takes an xlsx file as input;
  • creates the python script with the xlwings functionality connected with the xlsx file;
  • creates the needed xlsm file and its structure that allows communication with the python script from above;
  • also has an optional argument that allows the opening of the xlsm file;
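The command-line interface described in this list could be sketched with argparse as below. The flag name --open and the attribute open_after are hypothetical choices for illustration; only the xlsx input and the optional "open the file" argument come from the list.

```python
import argparse


def build_parser():
    """Build a parser with one required xlsx path and one optional flag."""
    parser = argparse.ArgumentParser(
        description="Prepare an Excel file for use with the xlwings tool."
    )
    parser.add_argument("xlsx_file", help="path to the input .xlsx file")
    parser.add_argument(
        "--open",
        action="store_true",
        dest="open_after",
        help="open the generated file when done (hypothetical flag)",
    )
    return parser


args = build_parser().parse_args(["data_file.xlsx", "--open"])
print(args.xlsx_file, args.open_after)  # data_file.xlsx True
```

In a real script, parse_args() would be called without arguments so that it reads the command line.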

Use tkinter to make a GUI for the tool

Build a simple GUI that does:

  • the user chooses the path to the Excel file
  • click on a run button that will create the needed python script which will enable the xlwings functionality

Convert the tool into an executable file for Windows

Convert the python scripts into a standalone executable to run the tool with the implemented GUI.

For inspiration:

I'll have to execute this on the Windows virtual machine, because if the operation runs under Linux, then it produces an executable for Linux.

Window name doesn't match the name of the image file in Windows OS

This is because PIL's Image.show() method saves the current image to a temporary PNG file and opens that file instead (under Windows).

In im.show(title=img_path) title doesn't work as expected, see:

Not sure if I can solve this for now or if it will prove to be problematic for our workflows. Possibly check image visualization with opencv-python.
