Comments (5)

breunigs commented on August 22, 2024

issue 1: close key frame insertion and reservoir-frame-delay

Playing around a bit, I suspect this is a rate control issue that occurs when rav1e inserts a key frame early (before the max keyframe interval) and reservoir-frame-delay <= keyint. Since by default keyint=240 and reservoir-frame-delay=min(240, 1.5*keyint)=240, the latter is true unless rav1e's defaults are changed.
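To make the condition concrete, here is a minimal sketch. It only encodes the default as stated in this comment (min(240, 1.5 * keyint)); the function names are hypothetical and this is not a copy of rav1e's code.

```rust
/// Default reservoir frame delay as described in this comment:
/// min(240, 1.5 * keyint). Hypothetical helper, not rav1e's actual code.
fn default_reservoir_frame_delay(max_keyint: u64) -> u64 {
  (max_keyint * 3 / 2).min(240)
}

/// True when the reservoir window cannot look past one full GOP,
/// i.e. the suspected trigger condition reservoir-frame-delay <= keyint.
fn window_limited_to_one_gop(max_keyint: u64, reservoir_frame_delay: u64) -> bool {
  reservoir_frame_delay <= max_keyint
}
```

With the defaults (keyint=240) the condition holds; with keyint=120 the default delay would be 180 and the condition would not hold.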

Let's look at some graphs. Here's with rav1e's default settings (i.e. -i 12 -I 240 --reservoir-frame-delay 240):
image

These metrics have been grabbed from

rav1e/src/rate.rs

Lines 890 to 897 in c7c72b5

// If we've been missing our target, add a penalty term.
let rate_bias = (self.rate_bias / (self.nencoded_frames + 100))
  * (reservoir_frames as i64);
// rate_total is the total bits available over the next
// reservoir_tus TUs.
let rate_total = self.reservoir_fullness - self.reservoir_target
  + rate_bias
  + (reservoir_tus as i64) * self.bits_per_tu;
and

rav1e/src/rate.rs

Lines 1227 to 1228 in c7c72b5

// Adjust the bias for the real bits we've used.
self.rate_bias += estimated_bits - bits;
.
The x-axis counts each call of the code path in question, so it roughly matches video progress. The y-axis is in bits for the top two graphs and in temporal units (TUs) for the bottom one.

We can see that rate_total ("rate_total is the total bits available over the next reservoir_tus TUs") becomes negative at some point. This matches the period of distorted video. We can also see that shortly before that, real_bits (just bits in rav1e's code) is much larger than the estimate: this is where the "calm period" in my video ends and the "action" starts. The reservoir empties accordingly. However, as far as I can tell that is not the (only?/main?) reason for rate_total becoming negative, since fiddling with the rate_bias (not graphed) calculation keeps the reservoir full enough, but rate_total will still be negative.

The interesting bit seems to be reservoir_tus, which changes abruptly and makes rate_total negative. Basically, as far as rate control is concerned there are no more bits to spend, and none are coming until the next key frame. So it starves the last frames before the next key frame.
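A small worked example of how a drop in reservoir_tus alone can flip the sign. The function mirrors the rate_total expression quoted from rate.rs, but with plain integers; all numbers in the usage note are made up for illustration and are not measurements from my runs.

```rust
/// Same arithmetic as the quoted rate_total expression, using plain
/// integers instead of rav1e's rate control state.
fn rate_total(
  reservoir_fullness: i64,
  reservoir_target: i64,
  rate_bias: i64,
  reservoir_tus: i64,
  bits_per_tu: i64,
) -> i64 {
  reservoir_fullness - reservoir_target + rate_bias + reservoir_tus * bits_per_tu
}
```

Assume a reservoir that sits 1,400,000 bits below its target, no bias, and 20,000 bits per TU. With reservoir_tus=240 the total is 3,400,000 bits; with reservoir_tus=30 it is -800,000 bits, even though the reservoir itself has not changed.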

The dip in reservoir_tus is triggered by the insertion of a new key frame earlier than the max key frame interval would require. In my case that's some 30 frames or so ahead. But the exact distance doesn't matter much, since guess_frame_subtypes will only consider frames up to the last key frame within reservoir-frame-delay. Since a new key frame was just inserted 30 frames away, the (n+1)th key frame will be at 30+240, which is outside the reservoir-frame-delay range. It therefore returns only the TUs it sees until the newly inserted key frame, so ~30 give or take, which is very little and causes a negative rate_total.
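The window logic as I understand it can be sketched like this. This is a simplified model, not guess_frame_subtypes itself: the function name is hypothetical, key frames are given as offsets from the current frame, and the fallback to the full window when no key frame is visible is my assumption.

```rust
/// Number of frames counted by the reservoir in this simplified model:
/// the distance to the last key frame that still lies within the
/// reservoir-frame-delay window. If no key frame is visible, fall back
/// to the whole window (an assumption for this sketch).
fn frames_until_last_keyframe(keyframe_offsets: &[u64], reservoir_frame_delay: u64) -> u64 {
  keyframe_offsets
    .iter()
    .copied()
    .filter(|&offset| offset <= reservoir_frame_delay)
    .max()
    .unwrap_or(reservoir_frame_delay)
}
```

With key frames 30 and 270 frames ahead, a window of 240 only reaches the one at 30, while a window of 360 reaches the one at 270.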

Increasing reservoir-frame-delay to 360 (i.e. 1.5x keyint) avoids the negative rate_total:
image
The result is still "low quality", but the horrible distortion is gone.

Trying out different variants where the "reservoir isn't startled by a surprise key frame" yields similar results: low quality, but not completely distorted. E.g.

-i 240 -I 240 --reservoir-frame-delay 240:
image

-i 360 -I 360 --reservoir-frame-delay 360:
image

Looking at the comment https://github.com/xiph/rav1e/blob/master/src/rate.rs#L589-L596 specifically: "but long enough to allow looking into the next GOP (avoiding the case where the last frames before an I-frame get starved)". It seems that with keyint = reservoir-frame-delay = 240 this is exactly what happens – it only checks the current GOP, which is coincidentally very short.

So we need reservoir-frame-delay > max_keyint by some suitable margin, but not so large that rate control becomes too slow to react. Maybe reservoir-frame-delay = min(max_keyint * 1.5, max_keyint + min_keyint * 4)? For the default values of min_keyint=12 max_keyint=240 that'd be reservoir-frame-delay=288, i.e. +48 frames.
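The proposed default written out as a sketch. The function name is hypothetical, and integer arithmetic is used for the 1.5x factor.

```rust
/// Proposed default from this comment:
/// min(max_keyint * 1.5, max_keyint + min_keyint * 4).
fn proposed_reservoir_frame_delay(min_keyint: u64, max_keyint: u64) -> u64 {
  let one_and_a_half = max_keyint * 3 / 2;
  let with_margin = max_keyint + min_keyint * 4;
  one_and_a_half.min(with_margin)
}
```

For min_keyint=12 and max_keyint=240 the two candidates are 360 and 288, so the result is 288.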

issue 2: all bits spent early

Adjusting reservoir-frame-delay is not a magic fix, unfortunately. For example, with keyint=120 reservoir-frame-delay=180 I still run into distortions, i.e. when rav1e's log_hard_limit condition triggers. Essentially the "surprise key frame insertion" has been moved back, so the rate control has more room to adapt to the change. But it's still possible for the RC to spend most/all of its bits and then be unable to react to the KF insertion. Sudden complexity changes likely contribute to this.

For single pass, the reservoir_tus that affects rate_total exhibits a sawtooth pattern, i.e. spend bits early on the key frame, and conserve them towards the "last KF in reservoir". The rate then spikes when a new KF is observed in the reservoir. The simple idea here would be to be more conservative if the next (guaranteed) KF is far away, to allow more room to react to "surprising" KF insertions.
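To illustrate where the sawtooth comes from, here is a toy model. It assumes key frames sit at every multiple of max_keyint (no early insertions) and that the reservoir counts frames up to the last key frame inside the window; both are simplifications of mine and the function name is hypothetical. max_keyint must be greater than 0.

```rust
/// Toy model of the reservoir_frames guess at a given frame index,
/// assuming key frames at every multiple of max_keyint (> 0).
fn modelled_reservoir_frames(frame: u64, max_keyint: u64, reservoir_frame_delay: u64) -> u64 {
  // Last frame index covered by the look-ahead window.
  let window_end = frame + reservoir_frame_delay;
  // Position of the last key frame at or before the window end.
  let last_keyframe = window_end / max_keyint * max_keyint;
  if last_keyframe > frame {
    last_keyframe - frame
  } else {
    // No upcoming key frame in the window: use the whole window.
    reservoir_frame_delay
  }
}
```

With max_keyint=120 and reservoir-frame-delay=180 the value falls from 120 at frame 0 to 61 at frame 59, jumps to 180 at frame 60 when the next key frame enters the window, and then falls again.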

The general idea is along the lines of reservoir_tus.powf(0.9), but this can be expressed linearly instead by looking at the reservoir_frames guess. Both variants need to account for the influence that reservoir-frame-delay (defines "max") and max_keyint (defines the "range" of the sawtooth) have on reservoir_tus. The code for this is surprisingly verbose, but I have been unable to find a more concise variant. Something like this:

// magic value! higher = conserve more bits early on
let penalize_early_strength = 0.25;
let max = self.reservoir_frame_delay as f32;
let min = if ctx.config.max_key_frame_interval == 0 {
  0.0
} else {
  max - ctx.config.max_key_frame_interval as f32
};
// 0.0 = next KF is far; 1.0 = next KF imminent
let next_keyframe_ratio = (reservoir_frames as f32 - min) / (max - min);
let conservative_rtu = reservoir_tus as f32
  * (1.0 - penalize_early_strength * next_keyframe_ratio);
// and then use conservative_rtu instead of reservoir_tus for rate_total

This graph shows the difference between the old and the proposed approach with reservoir_frame_delay=180 max_keyint=120:
image
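For experimenting outside of rav1e, the snippet can be pulled into a free-standing function. This is only a sketch: the signature is hypothetical, and the guard against a zero-width range and the clamp of the ratio to 0..1 are additions of mine that the snippet above does not have.

```rust
/// Stand-alone version of the proposed scaling of reservoir_tus.
/// All inputs are plain f32 values instead of rav1e's state.
fn conservative_rtu(
  reservoir_tus: f32,
  reservoir_frames: f32,
  reservoir_frame_delay: f32,
  max_key_frame_interval: f32,
  penalize_early_strength: f32,
) -> f32 {
  let max = reservoir_frame_delay;
  let min = if max_key_frame_interval == 0.0 {
    0.0
  } else {
    max - max_key_frame_interval
  };
  // Guard added for this sketch: avoid dividing by zero.
  if max <= min {
    return reservoir_tus;
  }
  // Position of reservoir_frames within [min, max], clamped (also added here).
  let ratio = ((reservoir_frames - min) / (max - min)).clamp(0.0, 1.0);
  reservoir_tus * (1.0 - penalize_early_strength * ratio)
}
```

With reservoir_frame_delay=180, max_keyint=120 and a strength of 0.25, a reservoir_frames guess of 180 scales reservoir_tus by 0.75, a guess of 120 by 0.875, and a guess of 60 leaves it unchanged.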

tuning

Obviously these changes affect both quality and size. I tested them with some of my videos, and that's where the magic values come from. But my corpus isn't very diverse, I have checked only a few conditions/scenarios, and I wouldn't rate my eyeballs as a "trustworthy and objective measure".

Additionally, I'm definitely lacking understanding of the surrounding code and video encoding in general, so it's hard for me to come up with a sensible test strategy. Put differently, if my changes are good or not, I have no clue. Point out flaws in my line of reasoning, please.

Should I polish these proposed changes into PRs?

Recap

  • reservoir-frame-delay should be suitably larger than max_keyint, to give room to adapt to key frame insertions that come sooner than max_keyint.
  • flatten the reservoir_tus sawtooth pattern for single pass mode to give room in case of complexity spikes.
  • please advise on the next steps

Related

  • Increasing the reservoir-frame-delay to 360 also fixes #2857 . It also works for keyint=120 rfd=180. It doesn't need any other of the proposed changes to be fine. Presumably these two issues are related.
  • There's also cap_underflow that's off by default:
    https://github.com/xiph/rav1e/blob/master/src/rate.rs#L1222-L1226
    Enabling it also helps somewhat, since reservoir_fullness no longer becomes negative, which makes at least some bits available more quickly. It slightly increases the file size, but that's hard to judge from my reproducer because it's so short.
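A rough sketch of what capping the underflow means for the reservoir. This is a simplified per-TU update written for illustration, not the code behind the link above; the function name and the exact order of operations are assumptions.

```rust
/// Simplified reservoir update for one TU: add the per-TU budget,
/// subtract the bits actually spent, and optionally refuse to let the
/// fullness drop below zero. Hypothetical helper, not rav1e's code.
fn update_reservoir(fullness: i64, bits_per_tu: i64, bits_used: i64, cap_underflow: bool) -> i64 {
  let updated = fullness + bits_per_tu - bits_used;
  if cap_underflow {
    updated.max(0)
  } else {
    updated
  }
}
```

Without the cap, an expensive frame leaves a debt that later frames have to pay off first. With the cap, the debt is forgiven, which is presumably why the file size grows slightly.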

from rav1e.

breunigs commented on August 22, 2024

From IRC:

# is the issue present with higher bitrate?
NO    rav1e raw9.yuv --speed 10 --bitrate 5000  -o 73.ivf --skip 50

# is the issue present when using multi-pass?
NO    rav1e raw9.yuv --speed 10 --bitrate 500  -o 74_first.ivf --skip 50 --first-pass 74.stats
NO    rav1e raw9.yuv --speed 10 --bitrate 500  -o 74_second.ivf --skip 50 --second-pass 74.stats

NO    rav1e raw9.yuv --speed 10 --bitrate 500  -o 75_first.ivf --first-pass 75.stats

I have also failed to find the commit in which this bug was introduced. Either it's been there "mostly from the beginning", or it's in some dependency that I fail to recompile/pull the old version for. I gave up trying to make versions older than 8b19a94 build, but as far as I can tell the bug is already present there. Note that I had to

cargo update -p [email protected] --precise 1.3.9
cargo update -p [email protected] --precise 0.3.49

at some point to keep the build working in a debian:stable docker container due to the Cargo.lock not being present for some period, I think. So low confidence that I didn't make a bisecting mistake.

BlueSwordM commented on August 22, 2024

Can you reproduce the bug with the quantizer mode instead of bitrate?

breunigs commented on August 22, 2024

Can you reproduce the bug with the quantizer mode instead of bitrate?

rav1e 0.7.1 from Debian

X = chosen quantizer level

rav1e raw9.yuv --speed 10 --bitrate 500 --skip 50 --quantizer X -o qX.ivf
↓             rav1e raw9.yuv --speed 10 --skip 50 --quantizer X -o pX.ivf
              ↓

NO q1         NO p1
NO q11        NO p11
NO q21        NO p21
NO q31        NO p31
NO q41        NO p41
NO q51        NO p51
NO q61        NO p61
NO q71        NO p71
NO q81        NO p81 
NO q91        NO p91 
NO q101       NO p101
NO q111       NO p111
NO q121       NO p121
NO q131       NO p131
NO q141       NO p141
NO q151       NO p151
NO q161       NO p161
NO q171       NO p171
NO q181       NO p181
NO q191       NO p191
NO q201       NO p201
NO q211       NO p211
NO q221       NO p221
NO q231       NO p231
NO q241       NO p241
NO q251       NO p251
NO q252       NO p252
NO q253       NO p253
NO q254       NO p254

YES q255      NO p255

lu-zero commented on August 22, 2024

Your analysis seems correct for all I could see, but I'd defer to @barrbrain for informed opinions :)
