Hello @Avaneesh-S, thank you for your interest in YOLOv5! Please visit our Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a Bug Report, please provide a minimum reproducible example to help us debug it.
If this is a custom training Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.
Requirements
Python>=3.8.0 with all requirements.txt installed including PyTorch>=1.8. To get started:
git clone https://github.com/ultralytics/yolov5 # clone
cd yolov5
pip install -r requirements.txt # install
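The Python version requirement above can also be checked programmatically before installing (a minimal sketch; the `meets_requirements` helper is illustrative and not part of the YOLOv5 repo):

```python
import sys

def meets_requirements(version=None):
    """Check an interpreter version against YOLOv5's Python>=3.8.0 requirement."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= (3, 8)

print(meets_requirements())  # True on Python 3.8+
```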
Environments
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
- Notebooks with free GPU:
- Google Cloud Deep Learning VM. See GCP Quickstart Guide
- Amazon Deep Learning AMI. See AWS Quickstart Guide
- Docker Image. See Docker Quickstart Guide
Status
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.
Introducing YOLOv8
We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8!
Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.
Check out our YOLOv8 Docs for details and get started with:
pip install ultralytics
@Avaneesh-S hello,
Thank you for reaching out and for providing detailed information about your issue. It's great to see that you've already followed the model export tutorial and tested the models on a Colab T4 GPU.
Performance Differences Across Model Formats
Performance differences between model formats can occur due to several factors, including differences in how each framework handles operations, optimizations, and precision. Here are a few points to consider:
- Precision and Optimizations: Ensure that the precision (FP16 vs. FP32) and any optimizations applied during export are consistent across formats. For instance, using the `--half` flag during export can reduce model size and potentially affect performance.
- Export Parameters: Verify that the export parameters are consistent. For example: `python export.py --weights yolov5s.pt --include torchscript onnx engine --half`
- Framework-Specific Optimizations: TensorRT and ONNX may apply different optimizations that can affect performance. These optimizations are generally aimed at improving inference speed but can sometimes lead to slight variations in accuracy.
- Validation Script: Ensure that the validation script (`val.py`) is consistent and that no additional preprocessing or postprocessing steps are inadvertently altering the results.
Ensuring Consistent Performance
To preserve performance across exports, you can try the following:
- Consistency in Precision: Ensure that you are using the same precision (FP16 or FP32) across all exports.
- Validation: Use the same validation script and dataset for all formats to ensure consistency.
- Export Options: Experiment with different export options to see if they affect performance. For example, use `--dynamic` for dynamic input shapes or `--simplify` for simplified ONNX models.
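To make the "same validation script and dataset" point concrete, a small helper can quantify how far apart the mAP numbers from different export formats are (the function and the example values below are hypothetical, not from an actual run):

```python
def max_map_spread(results):
    """Return the largest absolute gap between mAP values across formats."""
    vals = list(results.values())
    return max(vals) - min(vals)

# Hypothetical mAP@0.5:0.95 values from running val.py on each format:
runs = {"pt": 0.374, "onnx": 0.374, "engine": 0.373}
spread = max_map_spread(runs)  # a small FP16-scale drift like this is typical
```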
Example Export Command
Here's an example command to export a model to multiple formats with consistent precision:
python export.py --weights yolov5s.pt --include torchscript onnx engine --half
Documentation Note
Thank you for pointing out the issue with the numpy version for TensorRT export. We will look into updating the documentation to reflect this requirement.
For further details, you can refer to the model export tutorial.
If the issue persists, please ensure you are using the latest versions of `torch` and the YOLOv5 repository. If you have any further questions or need additional assistance, feel free to ask!
Hey @glenn-jocher, I have verified using the 'netron' website that the .pt model's weights are float16 by default. So if I don't use --half while exporting, my .onnx models are float32, meaning using --half shouldn't affect things much, right?
(I can't verify the .engine model on 'netron'.)
Exporting with --half means I also have to use --half with val.py. I noticed that using the --half flag with val.py changes the performance:
for the .onnx model (exported using --half and --device 0):
You can see the difference by comparing these results to the ones I originally shared.
I have also verified that using --dynamic and --simplify did not affect the performance.
Do let me know if you know of any method to convert without losing performance (I am looking for a way to get the exact performance results of the .pt model with the .onnx and .engine models).
PS: I would have shared screenshots of all the above runs, including TensorRT, but there is an issue with the yolov5 repo right now that was not there previously (until yesterday for me). Commands that fetch data from https://ultralytics.com/assests/ are not running; they return a 301 Moved Permanently response.
This happens only when fetching data through Linux terminals (I tried with wget). The default run of
val.py is not working on Colab either, with the same error when it tries to fetch data from there (like the coco128 dataset).
It works on Windows though; kindly look into this too. If resolved, I can test the other methods.
Hello @Avaneesh-S,
Thank you for your detailed follow-up and for providing additional insights into your issue. Let's address your concerns step-by-step.
Precision and Performance
You are correct that using the `--half` flag during export and validation can affect performance due to the change in precision from FP32 to FP16. This is expected behavior, as FP16 models are generally faster but may exhibit slight differences in accuracy compared to their FP32 counterparts.
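The FP16 effect can be seen directly with Python's standard library by round-tripping a weight-like value through IEEE 754 half precision (a minimal sketch using `struct`'s half-float format; the value is arbitrary):

```python
import struct

def to_fp16(x):
    """Round-trip a float through IEEE 754 half precision (FP16)."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

w = 0.1234567                    # an arbitrary weight-like value
w16 = to_fp16(w)
rel_err = abs(w - w16) / abs(w)  # FP16 carries ~3 decimal digits of precision,
                                 # so a small per-weight rounding error remains
```

Millions of such per-weight rounding errors accumulating through a network are why FP16 validation numbers drift slightly from FP32 ones.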
Ensuring Consistent Precision
To maintain consistent precision across all formats, you should ensure that both the export and validation processes use the same precision. Here's how you can do it:
- Export with FP16 Precision: `python export.py --weights yolov5s.pt --include torchscript onnx engine --half`
- Validate with FP16 Precision: `python val.py --weights yolov5s.onnx --half`
Issue with Data Fetching
Regarding the issue with fetching data from https://ultralytics.com/assets/, it seems there might be a temporary redirection issue. We recommend checking whether this persists and trying the following workarounds:
- Manual Download: Download the dataset manually from the browser and upload it to your Colab environment.
- Alternative Data Source: Use an alternative data source or mirror the dataset to a different location.
Next Steps
- Reproducible Example: If the issue persists, please provide a minimum reproducible code example. This will help us investigate further. You can find more details on how to create one here.
- Latest Versions: Ensure you are using the latest versions of `torch` and the YOLOv5 repository. This can resolve many issues related to compatibility and bugs.
Example Code for Consistent Precision
Here's an example of how you can export and validate your model with consistent precision:
# Export with FP16 precision
python export.py --weights yolov5s.pt --include torchscript onnx engine --half
# Validate with FP16 precision
python val.py --weights yolov5s.onnx --half
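One way to guard against mismatched flags is a tiny check that the export and validation invocations agree on `--half` (a hypothetical helper for illustration; it is not part of export.py or val.py):

```python
def precision_consistent(export_flags, val_flags):
    """Export and validation should agree on whether --half (FP16) is used."""
    return ("--half" in export_flags) == ("--half" in val_flags)

precision_consistent({"--half"}, {"--half"})  # consistent FP16 run
precision_consistent({"--half"}, set())       # mismatch: FP16 export, FP32 val
```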
If you have any further questions or need additional assistance, feel free to ask. Weβre here to help!
Hey @glenn-jocher, I looked into using the different flags while exporting the .pt model to .onnx, as well as with the val.py script.
These are my results:
- --dynamic and --simplify do not affect the performance
- --half drops the performance (a very small drop)
- no conversion option from .pt to .onnx gave the exact same performance results for both
- .engine gives the same results as .onnx (since the .pt to .engine conversion goes .pt -> .onnx -> .engine)
For detailed results, you can view the following Colab file which I created: testing_yolov5.
Also, I have seen that while converting .pt to .engine I get the following line:
onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
but it then builds the FP32 engine. Why is the ONNX model generated with INT64 weights?
To conclude, do check and let me know whether there is any way to convert without a drop in performance, or whether a drop is always expected.
Hello @Avaneesh-S,
Thank you for your detailed investigation and for sharing your results. It's great to see your thorough approach to testing the different export options and their impact on performance.
Addressing Your Findings
- Dynamic and Simplify Flags: It's good to know that `--dynamic` and `--simplify` do not affect performance. These flags are primarily for optimizing the model structure and handling dynamic input shapes.
- Half Precision: The slight drop in performance when using `--half` (FP16) is expected due to the reduced precision. FP16 models are generally faster but may exhibit minor accuracy differences compared to FP32 models.
- Performance Consistency: Achieving exact performance parity between `.pt` and `.onnx` models can be challenging due to differences in how each framework handles operations and optimizations. The same applies to `.engine` models, as the conversion process involves multiple steps.
ONNX to TensorRT Conversion
The warning message you encountered during the conversion from ONNX to TensorRT:
onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
This indicates that the ONNX model includes INT64 tensors, which TensorRT does not support directly; PyTorch's ONNX exporter represents shape and index tensors as INT64, which is why they appear even in an FP32/FP16 model. TensorRT attempts to cast these values to INT32, which can introduce minor discrepancies. This is a known limitation and is part of the conversion process.
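For intuition, the narrowing cast can be modeled in plain Python (an illustrative sketch, not TensorRT's actual code): values that fit in 32 bits survive unchanged, while values outside the int32 range would wrap:

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def cast_int64_to_int32(v):
    """Wrap a 64-bit integer into the int32 range, mimicking a narrowing cast."""
    return ((v - INT32_MIN) % 2**32) + INT32_MIN

cast_int64_to_int32(640)    # a typical shape value: unchanged
cast_int64_to_int32(2**31)  # out of int32 range: wraps to INT32_MIN
```

Since the INT64 tensors in an exported ONNX graph are typically shapes and indices well within the int32 range, the cast is usually lossless in practice.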
Recommendations
- Precision Consistency: Ensure that you are using consistent precision across all formats. If you prefer FP32 precision, avoid using the `--half` flag during export and validation.
- Latest Versions: Verify that you are using the latest versions of `torch` and the YOLOv5 repository. This can help mitigate any compatibility issues and ensure you have the latest optimizations and bug fixes.
- Reproducible Example: If you encounter any specific issues or bugs, please provide a minimum reproducible code example. This will help us investigate further. You can find more details on how to create one here.
Conclusion
While achieving exact performance parity across different model formats can be challenging due to inherent differences in how each framework handles operations, following the recommendations above can help minimize discrepancies. If you have any further questions or need additional assistance, feel free to ask. We're here to help!
Thank you for your contributions and for being an active member of the YOLO community!
Hey @glenn-jocher, I have followed your recommendations but they didn't help. For the reproducible example you can refer to testing_yolov5. Do let me know your thoughts on it.
Hello @Avaneesh-S,
Thank you for sharing the detailed Colab notebook and for following the recommendations provided earlier. I appreciate your thorough approach and the effort you've put into this.
Reviewing Your Colab Notebook
I've reviewed your Colab notebook, and it provides a clear and comprehensive example of the issue you're encountering. This is very helpful for us to understand and investigate the problem further.
Next Steps
- Latest Versions: Please ensure that you are using the latest versions of `torch` and the YOLOv5 repository. This can help mitigate any compatibility issues and ensure you have the latest optimizations and bug fixes.
- Precision Consistency: As discussed earlier, maintaining consistent precision across all formats is crucial. Ensure that both the export and validation processes use the same precision (FP32 or FP16).
- ONNX to TensorRT Conversion: The warning about INT64 weights being cast to INT32 during the ONNX to TensorRT conversion is a known limitation. This can introduce minor discrepancies, but it's part of the conversion process.
Example Code for Consistent Precision
Here's an example of how you can export and validate your model with consistent precision:
# Export with FP32 precision
python export.py --weights yolov5s.pt --include torchscript onnx engine
# Validate with FP32 precision
python val.py --weights yolov5s.onnx
Further Investigation
If the issue persists, we can investigate further based on the reproducible example you've provided. Your detailed notebook will be instrumental in helping us identify any potential issues or optimizations.
Thank you for your patience and for being an active member of the YOLO community! If you have any further questions or need additional assistance, feel free to ask. We're here to help!
Hey @glenn-jocher, Thanks for the help. I have got what I needed. Closing this issue.
Hello @Avaneesh-S,
Thank you for the update! I'm glad to hear that you found the information helpful and that your issue has been resolved.
If you have any more questions or need further assistance in the future, feel free to open a new issue or join the discussions. The YOLO community and the Ultralytics team are always here to help!
Best of luck with your projects, and happy coding!