
Work Flow about asdc_mwe HOT 36 CLOSED

littlerob84 avatar littlerob84 commented on August 18, 2024
Work Flow

from asdc_mwe.

Comments (36)

littlerob84 avatar littlerob84 commented on August 18, 2024 2

Ahhhh, sorry, I copied and pasted the command from a Google Doc and it formatted the " differently. If I type the " directly in the command prompt window, it runs fine, sorry.


NicoMandel avatar NicoMandel commented on August 18, 2024

Hi Rob

this one is really hard for me to figure out and sort out. Could you provide some form of sorting / indexing that allows me to find out what files are created where and when?
That way I can break it down into the steps that need to happen at what stage and where we could place checkpoints, or split scripts to also distribute to different machines / software packages that can run in parallel.

Either like this example that I provided as an image:
[image: WorkflowV1]
Or in terms of GitHub Markdown formatting, e.g.:

Script to visualise bboxes

visualise_bboxes.py

inputs

  • input root folder with 2 subfolders. Names same across folders:
    • images. Only .jpg files
    • labels. Only .txt files.

expected output

New Folder with output:

  • images with bounding boxes drawn on them. Naming convention same as input

description

takes an image and a label file, draws the bounding boxes from the label file on the image and stores the image again in a new location. Works on entire folders, not only single images.
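A minimal sketch of the label-reading half of that description, assuming the label files use the common YOLO text layout of `class x_center y_center width height` with all values normalised to 0..1. The layout and the function names `yolo_line_to_box` and `read_boxes` are assumptions for illustration, not taken from the repository.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def yolo_line_to_box(line: str, img_w: int, img_h: int) -> Box:
    """Convert one 'class xc yc w h' line (normalised 0..1) to pixel corners."""
    fields = line.split()
    xc, yc, w, h = (float(v) for v in fields[1:5])
    x_min = round((xc - w / 2) * img_w)
    y_min = round((yc - h / 2) * img_h)
    x_max = round((xc + w / 2) * img_w)
    y_max = round((yc + h / 2) * img_h)
    return x_min, y_min, x_max, y_max


def read_boxes(label_text: str, img_w: int, img_h: int) -> List[Box]:
    """Parse a whole label file; blank lines are skipped, so an empty file gives []."""
    return [
        yolo_line_to_box(line, img_w, img_h)
        for line in label_text.splitlines()
        if line.strip()
    ]
```

The resulting pixel corners could then be passed to whatever drawing routine is in use, for example OpenCV's `cv2.rectangle`.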

Breaking it down

From what I can tell so far, it should be something like this:

  1. Process site
    • input: folder of site to process
    • make folder for each flight in top site folder
    • rewrite stats file on the fly
    • output: image files with bounding boxes on them
  2. geotag images
    • not quite sure how this one works - pretty sure you have that sorted, won't need my input for this.

Is that about right?


littlerob84 avatar littlerob84 commented on August 18, 2024

Sorry, I'm not sure I understand exactly what you are after, but I 100% agree that what I wrote is very hard to follow, sorry, there is just a lot in this job!

In simple terms, we just want to point a script at the root folder (Site folder) and then have it process all images and spit out the detection images somewhere. We don't need or care about the label files or anything; we just need images with boxes on them that we can look through. So just one command line entry with the site folder path as input, that's it.

We will need a way to process an individual folder of images though and not a whole site, so a command line entry with the input of <flight folder path> will need to be an option.

The geotagging can be sorted out later as this is a much easier issue to work on and I have a script that pretty much already does it, I can adjust this to suit the output.

This is an example folder structure of a site. It shows the final state of a folder that's been processed.
[image]

So it starts with just the Flight folders and tracklog folder, then makes the detection root folder and flight folders within that when the script starts.

One option to make it a bit easier with this algorithm could be to put all detections from all flights into just one folder, called "Detections". You would also need your "Labels" folder as well I suppose, but then you can just run the bounding box script over one folder and not have to make it batch through a root folder. The issue with this is that there will be multiple images with the same file name, so that will need to be handled, and each duplicate will need to be labeled in a way that lets the georeferencing know which flight it came from. I think this issue will make this approach harder in the long run.
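For illustration only, one way the duplicate-name problem in a single "Detections" folder could be handled is to prefix each file name with its flight folder name. The `__` separator and both function names are hypothetical, and the scheme breaks if a flight folder name itself contains `__`.

```python
from pathlib import Path
from typing import Tuple

SEPARATOR = "__"  # hypothetical separator; must not occur in flight folder names


def flat_detection_name(image_path: Path) -> str:
    """Prefix the file name with the name of its parent (flight) folder.

    'Flight 1/DSC00001.jpg' and 'Flight 2/DSC00001.jpg' then no longer clash.
    """
    return f"{image_path.parent.name}{SEPARATOR}{image_path.name}"


def split_flat_name(flat_name: str) -> Tuple[str, str]:
    """Inverse of flat_detection_name: returns (flight_folder, original_file_name)."""
    flight, _, original = flat_name.partition(SEPARATOR)
    return flight, original
```

A georeferencing step could call `split_flat_name` to recover which flight each detection image came from.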

Happy to chat about this on the phone if that will be easier? If I share my screen I can demo an actual workflow. I'm over covid now so can do my late nights/your mornings if that works for you?


NicoMandel avatar NicoMandel commented on August 18, 2024

Hi Rob

this is good insight. What I can do is just provide a script that has a command line argument taking an input folder and an extra flag that would then be for single flights.
It'd look something like this:
python inference_folder.py -i "2023.12.22 - Toolong South" -o "Toolong South Detections"
or, for single flights:
python inference_folder.py -i "2023.12.22 - Toolong South/Flight 1" -o "Toolong South Detections/Flight 1" -f
The -f flag would then be used to process just that one folder, not its subfolders.

  • How do you want me to handle already existing Folders? Append a new number and write anyways?
  • How do you want me to handle non-existent output folders? E.g. should the program create it or can we assume that the output folder already exists?
    I can provide naming checks, I've done that before. So if something exists, it'll change the name and append a number to it. However, it'll be some form of bloaty code.
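A rough sketch of what such a command line interface might look like with `argparse`. The short flags `-i`, `-o` and `-f` come from the commands above; the long option names and the help texts are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command line interface with an input folder, an output folder and a flight flag."""
    parser = argparse.ArgumentParser(
        description="Run detection on a site folder or on a single flight folder."
    )
    parser.add_argument("-i", "--input", required=True,
                        help="site folder, or a flight folder when -f is given")
    parser.add_argument("-o", "--output", required=True,
                        help="folder the detection images are written to")
    parser.add_argument("-f", "--flight", action="store_true",
                        help="treat the input as one flight and do not process subfolders")
    return parser


# Example: parsing an argument list instead of the real command line.
example = build_parser().parse_args(
    ["-i", "2023.12.22 - Toolong South/Flight 1", "-o", "Toolong South Detections/Flight 1", "-f"]
)
```

Because the shell hands a quoted path to Python as a single argument, spaces inside the quotes need no special handling in the parser itself.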

Let me get a baseline version ready for you first and then we can provide user-based wrappers and exception handling later.


littlerob84 avatar littlerob84 commented on August 18, 2024

Those ideas sound perfect.

Existing folders: I have handled existing folders and files by just overwriting, with no warning needed. This is fine as we are only overwriting detection outputs and not original images. I did add the version number to the detection folder (you can see the V1.34 in that screenshot). I did that so that if I ran different models over the same images, we could compare the outputs, so this would be good to keep if possible, but it's not critical.

Non-existent output folders: I handled this the way you suggested, with bloaty code, checking if it's there and making the folder if not. The ideal is to have as little setup on each site folder as possible, so the script making the folder(s) itself is definitely the preference. As per the top point, overwriting the folder or the files inside it is fine, so there is no need to check whether the folder has contents and warn the user about that.
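As a sketch of the "create it if missing, reuse it otherwise" behaviour, `pathlib` can do the existence check and the creation in one call. The folder naming pattern `Detections (V <version>)` and the function name are assumptions made for the example.

```python
from pathlib import Path


def prepare_output_dir(site_dir: Path, flight_name: str, model_version: str) -> Path:
    """Create '<site>/Detections (V <version>)/<flight>' if needed and return it.

    exist_ok=True means an existing folder is reused silently, so files written
    into it later simply overwrite older detection outputs of the same name.
    """
    out_dir = site_dir / f"Detections (V {model_version})" / flight_name
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir
```

Keeping the model version in the folder name means outputs from different models land in separate folders and can be compared side by side.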


NicoMandel avatar NicoMandel commented on August 18, 2024

Just a note here for myself: when running visualise_bbox.py on its own, everything looks fine in the matplotlib plot. However, calling the function inside visualise_bbox from the script full_process does not plot for some reason, and the stored images look different. Will have a look tomorrow.


NicoMandel avatar NicoMandel commented on August 18, 2024

An important caveat on the folder naming is the different file naming conventions across operating systems, see here: https://stackoverflow.com/questions/8384737/extract-file-name-from-path-no-matter-what-the-os-path-format/8384788#8384788
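One possible way to get the last folder name regardless of separator style, sketched with the standard library. This uses `pathlib.PureWindowsPath` rather than the `ntpath` approach from the linked answer, and the function name is made up for the example.

```python
from pathlib import PureWindowsPath


def folder_name(path_str: str) -> str:
    """Return the last path component for Windows-style or POSIX-style input.

    PureWindowsPath treats both backslash and forward slash as separators,
    ignores a trailing separator, and can be used on any operating system.
    """
    return PureWindowsPath(path_str).name
```

The trade-off is that a POSIX file name containing a literal backslash would be split at that backslash, which should not matter for the folder names discussed here.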


littlerob84 avatar littlerob84 commented on August 18, 2024

Looks like it's all coming along well. Let me know when you want me to test it on a fresh dataset from this year's data capture.


NicoMandel avatar NicoMandel commented on August 18, 2024

Hi Rob
yes I am making reasonable progress, and I would call this 2/3 finished. It would work on a site folder now, but not when only running single flights etc. The detection and overlay are now working in the same script full_process.py and all of them are using the same functions defined in utils.py, to make it also usable separately.
Can you make a reduced test set available for me? e.g. a folder for a site with 2-3 flight folders inside, each of them with ballpark 5-10 images and all kinds of auxiliary files that would be in those folders in a deployment setting. That way I can test processing errors when site runs restart and those kinds of things?
It's hard for me to catch potential errors when I do not have the same setup available, so I just want to make sure that things work as desired.


littlerob84 avatar littlerob84 commented on August 18, 2024

Hi Nico,

Folder with a test set is here - https://drive.google.com/drive/folders/1V8RqDiLxlTCYMjF5iROt81eHi5PLpBmf?usp=sharing

Sorry it's so large. There are only 2 ARWs in each flight folder, but they are just so big it adds up. If you want more images in each folder, it's probably best to just copy and paste the same images in there a few times.

I've thrown all the various file types, incorrect naming conventions and human errors I can think of (or that I have had to deal with myself) in amongst the files and folders, so if it can handle this dataset, it should be able to handle anything for this project.


NicoMandel avatar NicoMandel commented on August 18, 2024

Hi Rob
this is what I currently have. I just ran the command python full_process.py -i <2023...blablabla> -o and this is the output I get. Note that the top part doesn't expand the subfolder path, so I've run that again below. From the folder structure in the second part that I got from you, the first part are the outputs I generate.
[image]
I have also decided to split the processing of a site and the processing of a single flight into two separate scripts. that makes the scripts less convoluted and allows us to handle more specific cases. I'll ensure shared functions are in utils.py so that they all use the same conventions and filenames.
I'll need guidance how you want to handle edge cases when files / folders already exist. One example I can instantly think of is the following: someone has run a full detection on a site, but it's crashed somewhere in flight 2, let's say at file 3 of 100.
Where should the checks happen and the program restart with its processing?
Checking whether a folder already exists is MUCH faster than checking for every file AFTER inference. So it is much more time consuming for the model to check at every file level, but there are chances that files will be missed if only checking at folder level whether things exist.

another idea would be to add a hidden logfile (csv or some custom format) in the detections folder, that can be read for the last folder / file that has been written and skip if it already exists. This is cumbersome to implement and will take a fair while, because it is not clear how to structure that file (also at file / folder level) and we need read as well as write functionalities.

Thoughts on this?



littlerob84 avatar littlerob84 commented on August 18, 2024

I attacked the "where to restart" problem by writing a log file at the end of each flight folder. See the attached examples: a "temp" one, which is made during the process after each folder, and a "final" one, which is made on successful completion of the folder. It then deletes the temp one. That is probably a nice-to-have function and not worth your time for now. I think just overwriting a flight folder and its files is an acceptable way to handle a mid-site crash. Yes, that's time inefficient, but it's not the end of the world running half a flight again to A) make your life easier and B) ensure we don't miss anything. I had an option in my GUI specifically for this, where you could tell it to process an entire site but starting from flight 3 (for example). This obviously adds a lot of work for you, so for now maybe we just run a full site, and if there is a problem and the last 4 flights didn't get processed, we just run those 4 flights individually. As this will be mainly used by me for now, it's easy enough to keep track of all this.

AA - Temp Detection log for 2023.12.30 - MacGregors South East - (V 4.41).txt (https://github.com/NicoMandel/asdc_mwe/files/13823672/AA.-.Temp.Detection.log.for.2023.12.30.-.MacGregors.South.East.-.V.4.41.txt)
AA - Final Detection log for Flights - (V 4.41).txt


NicoMandel avatar NicoMandel commented on August 18, 2024

Hi Rob
thanks for the comments. I'll reply to both in the same message.

  1. I deleted the comment, that's why you couldn't find it. The screenshot in the previous comment gave me enough clarification, but now with the new info I'll put a detections folder in the site folder. I'll also put a labels folder next to it, toggleable by a flag in the command line arguments.
  2. What I'd suggest then is a check whether the "flight" subfolder exists, and just skipping it if it does. So if one site was unsuccessful, you would have to manually delete it and then just rerun.
    There are obviously more sophisticated ways, e.g. with log files, however this should serve for now. Agreed?


littlerob84 avatar littlerob84 commented on August 18, 2024

Sounds good, but how would we know if a flight was unsuccessful? Will there be a txt file to tell us it got completed, or will the script just exit once it encounters a problem and we just have to see where it got up to?


NicoMandel avatar NicoMandel commented on August 18, 2024

Could do both. I could provide a simple log file in each flight subfolder in the detections folder, with a counter of how many images it has already processed.


littlerob84 avatar littlerob84 commented on August 18, 2024

That would be great, cheers.


NicoMandel avatar NicoMandel commented on August 18, 2024

Hi Rob

please download at the current stage and let it run, and let me know if everything works as desired.
Running a site or a single flight now makes use of the same function, run_single_flight, which is tucked away in the process_flight script. This ensures that both do exactly the same thing, and that changes to one do not break the other or make it behave differently. It also makes for cleaner, simpler programming, as all files are under 200 lines and things that are used multiple times sit neatly in other files and are imported where necessary.

I've now split the utilities into three files: log_utils just for logging, file_utils just for file handling, and model_utils for model things like visualisation.
I've included two types of logs now. One per flight, where each image that has been processed is written into the log file (a .log.txt) inside the flight folder. Upon next execution, that file is read, and any image that has already been processed is skipped.
The second log does the same, but at the level of whole flights. The process_site script just launches run_single_flight on each of its subfolders and writes to its own log file once it has finished a flight, so flights that are already considered processed are skipped. Note that the log entry is written AFTER the flight has run, so a flight will not be considered finished if the program exits due to an error or is killed.
That should significantly help.
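A minimal sketch of the per-flight log idea, not the code in log_utils. It assumes a plain text log with one processed image name per line; the file name `.log.txt` follows the comment above, while `read_processed`, `run_flight_with_log` and the `run_one` callback are hypothetical names.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Set

LOG_NAME = ".log.txt"  # assumed layout: one processed image name per line


def read_processed(log_path: Path) -> Set[str]:
    """Names already recorded in the log; an absent log means nothing was processed."""
    if not log_path.exists():
        return set()
    lines = log_path.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def run_flight_with_log(image_names: Iterable[str], out_dir: Path,
                        run_one: Callable[[str], None]) -> List[str]:
    """Call run_one on every image not yet in the log; return the names processed now."""
    out_dir.mkdir(parents=True, exist_ok=True)
    log_path = out_dir / LOG_NAME
    done = read_processed(log_path)
    newly_done = []
    with log_path.open("a", encoding="utf-8") as log:
        for name in image_names:
            if name in done:
                continue  # finished in an earlier run
            run_one(name)  # may raise; the log entry below is then never written
            log.write(name + "\n")
            log.flush()  # keep the log current even if the process is killed later
            newly_done.append(name)
    return newly_done
```

Writing the entry only after `run_one` returns is what makes a rerun pick up at the image that failed, rather than skipping it.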

Caveats:

  • One thing I noticed in the folder you sent me is that some flights contain subfolders with .JPG files or similar. Those are ignored.
  • All folders that should be processed are required to have some form of the word "flight" in their name. Folders with typos and such will be ignored, but can be processed afterwards by running process_flight on that specific folder.

Please run and see if it does what you want it to do and let me know if I can close this issue and move to accelerating inference by batching things up (issue #2)
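The folder-name rule could be sketched roughly as below. The case-insensitive comparison and the function name `find_flight_dirs` are assumptions, since the exact matching used in the repository is not shown in this thread.

```python
from pathlib import Path
from typing import List


def find_flight_dirs(site_dir: Path) -> List[Path]:
    """Direct subfolders whose name contains 'flight', compared case-insensitively.

    Files are skipped, and so are folders with a misspelled name such as 'Fligth 3'.
    """
    return sorted(
        p for p in site_dir.iterdir()
        if p.is_dir() and "flight" in p.name.lower()
    )
```

Only direct children of the site folder are inspected, so image subfolders nested inside a flight are not picked up as flights.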


littlerob84 avatar littlerob84 commented on August 18, 2024

Hi Nico,

The model is now missing in this repo, so I had to find it in the old one and place it in the config folder; I assume that is the correct place for it. I also found a few modules missing and had to install them manually. It was rawpy and tqdm from memory, so it would be good to add these to the requirements file.

I also had to place the \inference\png\labels into the data folder as it was looking for that directory. Is that correct?

It did then start running and I tried to run it on a single folder "Flight 1" by entering

python inference_single_image.py -i "D:/2023.12.28 - Farm Ridge NW Redo/Flight 1"

Is that correct? To run a whole site do I just use the site folder name i.e. python inference_single_image.py -i "D:/2023.12.28 - Farm Ridge NW Redo"?

It started, ran through 16 images and then gave this error. I assume I am missing the "exiq" file?

`(OHWAlgorithm) C:\Users\littl\Desktop\OHW CSU Stuff\asdc_mwe-main>python inference_single_image.py -i "D:/2023.12.28 - Farm Ridge NW Redo/Flight 1"
YOLOv5 2024-1-7 Python-3.10.13 torch-1.13.1 CUDA:0 (NVIDIA GeForce RTX 3080 Ti Laptop GPU, 16384MiB)

Fusing layers...
Model summary: 291 layers, 20863236 parameters, 0 gradients, 48.2 GFLOPs
Adding AutoShape...
There are 780 listed files in folder: Flight 1/
Running on device:_CudaDeviceProperties(name='NVIDIA GeForce RTX 3080 Ti Laptop GPU', major=8, minor=6, total_memory=16383MB, multi_processor_count=58)
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing prediction on 126 number of slices.
Performing inference on D:/2023.12.28 - Farm Ridge NW Redo/Flight 1: 2%|▏ | 15/780 [02:51<2:25:39, 11.42s/it]
Traceback (most recent call last):
File "C:\Users\littl\Desktop\OHW CSU Stuff\asdc_mwe-main\inference_single_image.py", line 107, in <module>
convert_pred_to_txt(result, target_dir, imgf)
File "C:\Users\littl\Desktop\OHW CSU Stuff\asdc_mwe-main\model_utils.py", line 104, in convert_pred_to_txt
for x, y, w, h, score, category_id in yolo_bboxes:
TypeError: 'NoneType' object is not iterable

(OHWAlgorithm) C:\Users\littl\Desktop\OHW CSU Stuff\asdc_mwe-main>
(OHWAlgorithm) C:\Users\littl\Desktop\OHW CSU Stuff\asdc_mwe-main>exiq
'exiq' is not recognized as an internal or external command,
operable program or batch file.`


NicoMandel avatar NicoMandel commented on August 18, 2024

Hi Rob

sorry for the confusion. The scripts to run now are process_site.py (for a whole site) and process_flight.py (for a single flight).

to run either, just put in python <script name> -i <input_folder_location, either flight or site> -o

And yes, concerning the model file, you are correct, it's not part of the repository. Just copy it into the config folder, and it will be found automagically. Otherwise, you can specify where it is by using the -m flag at the end of the command to run the script, e.g. python <script name> -i <input folder location> -o -m <path to model>

Let me know if that works
Cheers
Nico


NicoMandel avatar NicoMandel commented on August 18, 2024

Also, could you make the file where the error occurred available to me? I want to run some tests with it.


littlerob84 avatar littlerob84 commented on August 18, 2024

Hi Nico,

Just running a flight now and it's working through the images. It is waaaaay too sensitive though, picking up detections in every single image that are 100% not OHW. Where do I tweak the confidence level now?

Output PNG files are ~120 MB, can we make these a lower quality/file size?

Also, while writing this just got this error:

Performing prediction on 77 number of slices.
Performing inference on E:\Upper\Flights\1:   2%|▌                                  | 15/900 [03:22<3:18:41, 13.47s/it]
Traceback (most recent call last):
  File "C:\Users\littl\Desktop\OHW CSU Stuff\asdc_mwe-main\Process_flight.py", line 163, in <module>
    run_single_flight(
  File "C:\Users\littl\Desktop\OHW CSU Stuff\asdc_mwe-main\Process_flight.py", line 101, in run_single_flight
    image, _ = visualize_object_predictions(
  File "C:\Users\littl\Desktop\OHW CSU Stuff\asdc_mwe-main\visualise_bbox.py", line 71, in visualize_object_predictions
    someret = xywhn2xyxy(object_prediction_list[:,1:], w, h, padh=padding_px, padw=padding_px)
IndexError: too many indices for array: array is 0-dimensional, but 2 were indexed


littlerob84 avatar littlerob84 commented on August 18, 2024

Also just noticed we can't have spaces in the path. This will be a problem, as ALL sites ever have got spaces, hyphens and full stops in their paths, e.g. "E:\2023.12.23 - Kellys Hut\Flight 1"

Got another error on this image, same error message - https://drive.google.com/open?id=1V8RqDiLxlTCYMjF5iROt81eHi5PLpBmf&usp=drive_fs

It's the image called "DSC00001.ARW"

Error message was

`Running on device:_CudaDeviceProperties(name='NVIDIA GeForce RTX 3080 Ti Laptop GPU', major=8, minor=6, total_memory=16383MB, multi_processor_count=58)
YOLOv5 2024-1-7 Python-3.10.13 torch-1.13.1 CUDA:0 (NVIDIA GeForce RTX 3080 Ti Laptop GPU, 16384MiB)

Fusing layers...
Model summary: 291 layers, 20863236 parameters, 0 gradients, 48.2 GFLOPs
Adding AutoShape...
Writing visuals to E:\Upper\Flights\Detections\2
There are 1152 listed files in folder: 2/
Performing prediction on 77 number of slices.
Performing inference on E:\Upper\Flights\2: 0%| | 0/1152 [00:06<?, ?it/s]
Traceback (most recent call last):
File "C:\Users\littl\Desktop\OHW CSU Stuff\asdc_mwe-main\Process_flight.py", line 163, in <module>
run_single_flight(
File "C:\Users\littl\Desktop\OHW CSU Stuff\asdc_mwe-main\Process_flight.py", line 101, in run_single_flight
image, _ = visualize_object_predictions(
File "C:\Users\littl\Desktop\OHW CSU Stuff\asdc_mwe-main\visualise_bbox.py", line 71, in visualize_object_predictions
someret = xywhn2xyxy(object_prediction_list[:,1:], w, h, padh=padding_px, padw=padding_px)
IndexError: too many indices for array: array is 0-dimensional, but 2 were indexed`


NicoMandel avatar NicoMandel commented on August 18, 2024

Hi Rob

The confidence level is set in the configuration file in the config folder. Just tweak as desired. The default I used during development was 0.4.

Thanks for the hint on the error. Please make both images available; I have a fleeting suspicion and it shouldn't be a difficult fix.

Not sure what you mean with spaces. If you put the flight path into the command line, just wrap it in double quotes, e.g.
python process_site.py -i E:\2023.12.23 - Kellys Hut\Flight 1
should be
python process_site.py -i "E:\2023.12.23 - Kellys Hut\Flight 1"
Is that what you were referring to?
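To illustrate what the confidence setting does, here is a small sketch that filters detections by score. The tuple layout follows the field order `x, y, w, h, score, category_id` seen in the first traceback; the function name and the sample values are made up.

```python
from typing import List, Tuple

# (x, y, w, h, score, category_id)
Detection = Tuple[float, float, float, float, float, int]


def filter_by_confidence(detections: List[Detection], threshold: float) -> List[Detection]:
    """Keep only detections whose score is at least `threshold`."""
    return [d for d in detections if d[4] >= threshold]


# Made-up sample: one weak and one strong detection.
sample = [
    (0.10, 0.10, 0.05, 0.05, 0.15, 0),
    (0.50, 0.50, 0.10, 0.10, 0.62, 0),
]
```

With this sample, a threshold of 0.1 keeps both boxes, 0.4 keeps only the second, and 0.9 leaves an empty list, so any later step has to cope with an image that has no detections at all.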

Cheers
Nico



littlerob84 avatar littlerob84 commented on August 18, 2024

Hi Nico, found that config file just before you replied, sorry, I should have spent 2 mins looking first.

I've just added another image called "0.4 DSC00001.ARW" that is fine on the 0.1 confidence level and produces an output detection image, but gives the same error on 0.4?

Also uploaded the first image that it errored on and that was with confidence level 0.1 (the one that it was pre-set to)

Hopefully that helps.


littlerob84 avatar littlerob84 commented on August 18, 2024

I changed the output to jpg in the config file and now have 20 MB JPEGs being created.

Still no joy with spaces in paths. This failed:

(OHWAlgorithm) C:\Users\littl\Desktop\OHW CSU Stuff\asdc_mwe-main>Python Process_flight.py -i “E:\2023.12.22 - Upper Teddys Creek\Flights\1” -o
usage: Process_flight.py [-h] [-i INPUT] [-o] [-m MODEL] [--labels] [-c CONFIG]
Process_flight.py: error: unrecognized arguments: - Upper Teddys Creek\Flights\1”

but this passed

Python Process_flight.py -i E:\Upper\Flights\3 -o

I changed the folder name to "Upper" to test it
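The failing command uses typographic quotes (“ ”), which the command prompt does not treat as quoting, so the path is split at every space before Python sees it. Typing straight double quotes is the real fix. Purely as a defensive sketch, a parser could also collect the pieces and strip the stray quote characters; everything here except the `-i` and `-o` flags is hypothetical.

```python
import argparse
from typing import List

# Typographic quotes that word processors substitute for the straight " character.
SMART_QUOTES = "\u201c\u201d\u2018\u2019"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # nargs="+" collects every piece of a path that was split at spaces
    parser.add_argument("-i", "--input", nargs="+", required=True)
    parser.add_argument("-o", "--output", action="store_true")
    return parser


def rebuild_path(parts: List[str]) -> str:
    """Join the pieces with single spaces and strip stray quote characters."""
    return " ".join(parts).strip(SMART_QUOTES + '"')


# The argument list as the failing command would have delivered it.
argv = ["-i", "\u201cE:\\2023.12.22", "-", "Upper", "Teddys",
        "Creek\\Flights\\1\u201d", "-o"]
args = build_parser().parse_args(argv)
recovered = rebuild_path(args.input)
```

This cannot restore runs of several spaces inside a folder name, which is one reason to treat it as a safety net rather than a replacement for correct quoting.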


NicoMandel avatar NicoMandel commented on August 18, 2024

Hi Rob

thanks for the comment. I think that's what happens when there are no detections in the image, so that the array is empty. I'll get on top of it.
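A sketch of the kind of guard that avoids the `'NoneType' object is not iterable` error from the first traceback, assuming the prediction step returns `None` or an empty sequence when nothing is detected. The function name and the output line format are assumptions, not the code in model_utils.

```python
from typing import Iterable, List, Optional, Tuple

Detection = Tuple[float, float, float, float, float, int]  # x, y, w, h, score, category_id


def label_lines(yolo_bboxes: Optional[Iterable[Detection]]) -> List[str]:
    """Format detections as text lines; None or an empty input yields no lines."""
    if yolo_bboxes is None:
        return []  # nothing detected: iterating over None would raise TypeError
    return [
        f"{category_id} {x:.6f} {y:.6f} {w:.6f} {h:.6f} {score:.4f}"
        for x, y, w, h, score, category_id in yolo_bboxes
    ]
```

For NumPy arrays the analogous check would be on the array's dimensions or size before slicing it, since wrapping `None` in an array gives a 0-dimensional array that cannot be indexed with `[:, 1:]`.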

Just for my own info on image compression, how to put it into code, see the docs on the imwrite() function here and the parameter list here. Scaling from 0 to 9 should be the simple way.


littlerob84 avatar littlerob84 commented on August 18, 2024

That makes perfect sense given some confidence threshold testing I just did: fewer and fewer boxes were drawn as I increased the threshold, and then the error got thrown.


NicoMandel avatar NicoMandel commented on August 18, 2024

Hi Rob

a little stumped on the filename with double quotes, because that's the convention, see here
Can you try wrapping the test folder you renamed to "Upper" in double quotes and see if that changes anything? process_flight.py should also be lowercase, but I don't know if that makes a difference.

I'll consider the compression parameters fixed then if you use jpg now.


NicoMandel avatar NicoMandel commented on August 18, 2024

Could you try with -i="E:\2023.12.22 - Upper Teddys Creek\Flights\1"


NicoMandel avatar NicoMandel commented on August 18, 2024

alright so problem remaining is the error it throws on empty files? I will fix that the rest of today, no problem


littlerob84 avatar littlerob84 commented on August 18, 2024

Yep, correct


NicoMandel avatar NicoMandel commented on August 18, 2024

Hey Rob

Please download again (you can also use GitHub Desktop ) and have a run
it should work fine now. Let me know if any errors come up!

Cheers
Nico


littlerob84 avatar littlerob84 commented on August 18, 2024

Working fine so far. I'm getting around 6.7-7.5 s/it on my 3080 laptop.


NicoMandel avatar NicoMandel commented on August 18, 2024

Sounds good. I'll close this for now as "resolved". If issues come up again, please use the "open again with comment" button on the bottom of this page. Maybe keep the email that comes through with this comment as a reference so you can find the link to the issue again.

