Hi, this looks like a neat project, and a few quick tests with some videos showed promising results. Unfortunately, it's much slower than other scene detection tools I've experimented with, so unless it gets faster or the detection capability is much better, I'm not sure it will meet my needs.
The other tools I've used for content scene detection are ffprobe:
ffprobe -show_frames -of compact=p=0 -f lavfi "movie=${FILE},select=gt(scene\,${THRESHOLD})"
and x264 (transcoding the video with ffmpeg first to make it faster):
ffmpeg -i ${FILE} -vf scale=320:-1 -sws_flags neighbor -an -pix_fmt yuv420p -f yuv4mpegpipe - 2>/dev/null | x264 - --demuxer y4m --bframes 0 --min-keyint 10 --scenecut ${THRESHOLD} --preset superfast --crf 30 --threads 1 -v --output /dev/null 2>&1 | grep scene | cut -d ' ' -f 6
(ffmpeg also has a blackframe filter, but I haven't used it much and am more interested in detection when there aren't black frames to make things easy.)
I'm sure you're familiar with these approaches, and I'm wondering if you've spent any time comparing them to your tool for quality / speed. I see you have an open issue to reduce the file size to make things faster, but even with a ~2m long 320x180 file, I'm seeing 17 seconds for your tool vs. 8 seconds for x264 vs. less than a second for ffprobe. I've subjectively found x264 to do a better job than ffprobe, so I think parity with it would be compelling enough reason to switch.
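In case it helps with reproducing the timings above, here is a rough sketch of how wall-clock time could be compared across the tools. The helper name elapsed_seconds is made up for illustration, and it only has whole-second resolution (via date +%s), which is coarse but enough to tell 17 s from 8 s from under 1 s:

```shell
# Hypothetical helper (not part of any tool mentioned here):
# print the whole wall-clock seconds a command takes to run.
# The command's own output and exit status are discarded.
# The body runs in a subshell so the variables do not leak into the caller.
elapsed_seconds() (
    start=$(date +%s)        # seconds since the epoch before the run
    "$@" >/dev/null 2>&1     # run the command exactly as given
    end=$(date +%s)          # seconds since the epoch after the run
    echo $((end - start))
)

# Example usage, assuming FILE and THRESHOLD are already set:
#   elapsed_seconds ffprobe -show_frames -of compact=p=0 -f lavfi \
#       "movie=${FILE},select=gt(scene\,${THRESHOLD})"
# A pipeline such as the ffmpeg | x264 one has to be wrapped in its own
# shell first, e.g. elapsed_seconds sh -c '...pipeline...'.
```

Running each tool a few times on the same file and taking the best result should smooth out disk-cache effects.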
So, my actual questions:
- Do you think that's an achievable performance target for your tool?
- If not, do you think the slowness is coming from OpenCV or Python?
- Would you consider adding some text to the README documenting what's better about your tool over these others?
I know you have plans to add some other detection methods, which sounds great, but right now, I don't see a compelling reason to switch given the speed difference.
Nonetheless, this is a really cool project, and I'll be interested to see how it develops in the future.