Coder Social home page Coder Social logo

Comments (7)

Kas-code avatar Kas-code commented on September 25, 2024

I'm using the following method to try to detect the timeout.
It runs a Task.Delay as well as cam.ProcessAsync at the same time, and then uses Task.WhenAny to wait for either of them to finish.
If the Task.Delay returns first then it knows that cam.ProcessAsync is hanging.

public static async Task<bool> CancelAfterAsync(Task startTask, TimeSpan timeout)
        {
            using (var timeoutCancellation = new CancellationTokenSource())
            {
                var delayTask = Task.Delay(timeout, timeoutCancellation.Token);
                Serilog.Log.Logger.Debug("await Task.WhenAny");
                var completedTask = await Task.WhenAny(startTask, delayTask);
                Serilog.Log.Logger.Debug("Finished await Task.WhenAny");
                // Cancel timeout to stop either task:
                // - Either the original task completed, so we need to cancel the delay task.
                // - Or the timeout expired, so we need to cancel the original task.
                // Canceling will not affect a task, that is already completed.
                timeoutCancellation.Cancel();
                if (completedTask == startTask)
                {
                    // original task completed
                    Serilog.Log.Logger.Debug(" await startTask;");
                    await startTask;
                    return true;
                }
                else
                {
                    Serilog.Log.Logger.Debug("Timed out");
                    // timeout
                    return false;
                }
            }
        }

I then use this in the following way:

public static async Task<byte[]> CaptureRawData()
        {
            MMALCamera cam = MMALCamera.Instance;
            MMALCameraConfig.StillEncoding = MMALEncoding.BGR24;
            MMALCameraConfig.StillSubFormat = MMALEncoding.BGR24;
            using (var imgCaptureHandler = new MemoryStreamCaptureHandler())
            using (var renderer = new MMALNullSinkComponent())
            {
                cam.ConfigureCameraSettings(imgCaptureHandler);
                cam.Camera.PreviewPort.ConnectTo(renderer);

                // Camera warm up time
                await Task.Delay(2000);
                Serilog.Log.Logger.Debug("cam.ProcessAsync(cam.Camera.StillPort)");
                if (await TaskUtils.CancelAfterAsync(cam.ProcessAsync(cam.Camera.StillPort), TimeSpan.FromSeconds(100)))
                {
                    Serilog.Log.Logger.Debug("Got response, returning data");
                    return imgCaptureHandler.CurrentStream.ToArray();
                }
                else
                {
                    Serilog.Log.Logger.Warning("Camera timed out, return null");
                    return null;
                }
            }
        }

        public static async Task<byte[]> CaptureImage()
        {
            byte[] data = null;

            for (var i = 0; i < 10; i++)
            {
                data = await CaptureRawData();
                if (data != null)
                {
                    break;
                }
                else
                {
                    Serilog.Log.Logger.Warning("Camera timed out, retrying");
                }
            }

            if (data == null || data.Length == 0)
            {
                //Todo: Better exception message
                throw new Exception("Image capture failed");
            }

            return data;
        }

If the cam.ProcessAsync method hangs, then the CancelAfterAsync method should detect this and return false.
When this returns false, the CaptureRawData method should return null.
Then the CaptureImage method should detect a null result and retry up to 10 times.
But instead, the logging output I get is the following:

[13:54:54 DBG] Running CaptureRawData
[13:54:54 DBG] cam.ProcessAsync(cam.Camera.StillPort)
[13:54:54 DBG] await Task.WhenAny
[13:56:34 DBG] Finished await Task.WhenAny
[13:56:34 DBG] Timed out
[13:56:34 WRN] Camera timed out, return null

It then hangs indefinitely on the return null line inside CaptureRawData.
This makes no sense.
It's clear that CancelAfterAsync has detected the hang and returned false, but then it hangs when returning null from CaptureRawData.
It never gets to the line where it logs "Camera timed out, retrying", which is confusing because all it's doing is returning null?

If anyone can just help me with how to detect the hang and retry safely then that would be great.

from mmalsharp.

techyian avatar techyian commented on September 25, 2024

I'm struggling for time to invest in MMALSharp at the moment due to work commitments. My advice would be to enable verbose logging using MMALCameraConfig.Debug = true and see what is happening before the hanging happens which may help identify where the problem is happening. I suspect it is happening because a buffer header that you've sent to MMAL is not being processed and returned to the callback handler, therefore causing an indefinite "hang". I would observe how much memory is available to the system at the point this problem occurs by running free in a terminal window and see if there has been a memory leak of any kind.

You don't say in your message whether you're using .NET Core or Mono. If you're using .NET Core, please can you try the same application under the latest version of Mono and see if you can reproduce?

I have encountered issues like this in the past, although it is very difficult to understand the root cause and which component is actually at fault; whether it be the MMALSharp library itself, the MMAL library we're consuming, hardware related, or an issue with the platform (.NET Core/Mono).

from mmalsharp.

Kas-code avatar Kas-code commented on September 25, 2024

I've enabled MMALCameraConfig.Debug = true but I'm not seeing any extra output, where does the verbose logging go to?

I'm logging memory usage every minute in a python script completely independently to this and it stays a solid 25-30% memory usage over 4-5 hours so there's almost definitely not a memory leak.

I'm using .NET Core. I may try Mono as well.

I've also tried removing all async and changing to use the following:

        private static volatile bool _camProcessing = false;

        private static bool WaitForCam(MMALCamera cam)
        {
            _camProcessing = true;
            Task.Run(() =>
            {
                cam.ProcessAsync(cam.Camera.StillPort).ConfigureAwait(false).GetAwaiter().GetResult();
                _camProcessing = false;
            });
            for (var i = 0; i < 1000; i++)
            {
                Thread.Sleep(100);
                if (!_camProcessing)
                {
                    Serilog.Log.Logger.Debug($"cam processing finished");
                    return true;
                }
            }
            Serilog.Log.Logger.Warning($"Reached timeout, camera might have locked up");
            return false;
    }

It reaches the "Camera might have locked up" line and then hangs outside the Task.Run, which is just bizarre.

from mmalsharp.

Kas-code avatar Kas-code commented on September 25, 2024

It's possible that it's locking while disposing either one of these using statements:

using (var imgCaptureHandler = new MemoryStreamCaptureHandler())
using (var renderer = new MMALNullSinkComponent())

I'm going to run some further tests and report back.

from mmalsharp.

techyian avatar techyian commented on September 25, 2024

I think the issue will be a background thread that is failing to complete due to the buffer sending issue I spoke about previously. The native MMAL library sends buffer headers to the callback handlers (function pointers) which invokes a background thread. If an operation on those background threads fails to complete then the application will hang indefinitely because it's unable to terminate cleanly until all threads have completed. As you're processing raw data, the callback will be sent to here and in turn here. The issue will most probably be happening here.

Regarding the logging, I think the output you see will depend on the Log Level you've set against Serilog and you will want to set a minimum level of "Debug". The output will then go to your desired target i.e. console/file.

It will be interesting to find out whether it's the sending or the releasing of the buffer that's causing the lock but I would encourage you to try Mono to see if the issue is .NET Core specific.

from mmalsharp.

Kas-code avatar Kas-code commented on September 25, 2024

Ok, so the locking up was also happening during disposing one of those two usings in my previous comment.
After cam.ProcessAsync(cam.Camera.StillPort) hangs, if I then dispose one of those two usings, that will also hang.
I can detect that cam.ProcessAsync(cam.Camera.StillPort) is hanging by wrapping it in a Task.Run, so I tried the following:
Only dispose when there hasn't been any hang detected.
If there has been a hang detected, then don't dispose but just try again regardless.
I can get to the point of calling var renderer = new MMALNullSinkComponent() again, but that gets the following error:

[08:01:14 DBG] var renderer = new MMALNullSinkComponent();
mmal: mmal_vc_component_create: failed to create component 'vc.null_sink' (1:ENOMEM)
mmal: mmal_component_create_core: could not create component 'vc.null_sink' (1)
[08:01:14 FTL] Fatal error
MMALSharp.MMALNoMemoryException: Out of memory. Unable to create component
at MMALSharp.MMALNativeExceptionHelper.MMALCheck(MMAL_STATUS_T status, String message)
at MMALSharp.MMALComponentBase.CreateComponent(String name)
at MMALSharp.MMALComponentBase..ctor(String name)
at MMALSharp.Components.MMALDownstreamComponent..ctor(String name)
at MMALSharp.Components.MMALRendererBase..ctor(String name)
at MMALSharp.Components.MMALNullSinkComponent..ctor()

Although it's definitely not out of memory, the memory usage is only 30% when this happens.
If I can detect the hang, can I just reset everything and start again by using MMALCamera.Instance.Cleanup or ForceStop or StopCapture?
If I Ctrl+C the process and then run it again everything is fine for another few hours, so if there's a way of just resetting everything and starting fresh from within the code, then that would be great.

from mmalsharp.

Kas-code avatar Kas-code commented on September 25, 2024

I've found an acceptable workaround that is a good enough solution for me for now. I can detect the first hang and just exit the program. I've configured it as a linux service that will automatically re-run if it exits. Since the program is running a continual loop anyway, it's not really detrimental if it exits and starts running again, and it only does so a couple of times per day.
If I get some more time in the future I can try it in mono and do some further investigation, but for now, the solution means it does what I need it to do.

from mmalsharp.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.