aomediacodec / av1-spec

AV1 Bitstream & Decoding Process Specification

Home Page: https://aomedia.org/

License: Other


av1-spec's Introduction

This document provides instructions for working with the draft AV1 Bitstream & Decoding Process Specification.

The specification document is built from plaintext section and subsection Markdown files (more specifically, kramdown files) using the Jekyll static site generator tool.

The document build process employs a common NodeJS-based web development toolchain, requiring Node, npm (the Node package manager), and GruntJS, a Node-based task runner.

Note: As a general rule, the packages described below should be installed in user space, not at the system level -- in other words, do not install them as root or via sudo. An exception is Ruby development headers, which are usually needed to build Ruby and certain Ruby gems. {:.alert .alert-info }

Ruby and rbenv

This project currently depends on Ruby v3.2.0. Because your distro may lack this version -- or installing it may conflict with your system's installed version -- first install rbenv, then install Ruby v3.2.0 within it.

# list all available versions:
$ rbenv install -l
3.1.4
3.2.0
3.2.2
jruby-9.4.2.0
mruby-3.2.0

# install a Ruby version:
$ rbenv install 3.2.0

Depending on your distro and environment, you may have trouble building a particular Ruby version. The rbenv project site maintains a wiki page with troubleshooting help.

Bundler

Gem dependencies are managed with bundler.

$ gem install bundler

# Filesystem location where gems are installed
$ gem env home
# => ~/.rbenv/versions/<ruby-version>/lib/ruby/gems/...

Clone the Repo

git clone [email protected]:AOMediaCodec/av1-spec.git
cd av1-spec

Set Local Ruby Version (rbenv)

In the directory of your local clone, do:

rbenv local 3.2.0

Regardless of any other Rubies installed on your system, the project environment will now use v3.2.0 and gems appropriate for it.

Install Gem Dependencies with Bundler

In the directory of your local clone, run

bundle install

Bundler will resolve dependencies and install the needed gems as listed in Gemfile.lock.

NodeJS, npm and GruntJS

Follow these instructions for installing NodeJS and npm.

Next -- from the project directory -- update npm, install the grunt package, and install the project's Node dependencies:

## Update npm globally
npm update -g npm

## Install grunt globally
npm install -g grunt-cli

## Install the project's Node dependencies
## (uses package.json and Gruntfile)
npm install

Grunt Tasks

Building the document is done via Grunt tasks, configured in the CoffeeScript file Gruntfile.coffee.

There are tasks to

  • clean output directories
  • perform some automated text transformations on the content files
  • build the document with Jekyll
  • copy output files to the correct directories for serving and viewing

These tasks are invoked in turn by the Grunt default task:

$ grunt

PDF

Additionally, there is a grunt exec task for generating the spec as a PDF file. It invokes the commercial tool Prince (formerly PrinceXML), which must first be installed on your system. PDF generation can then be done with:

$ grunt && grunt exec

Note that while we are generating PDF as a convenience format during document development, the canonical spec will likely be served as HTML, not PDF. {:.alert .alert-info }

av1-spec's People

Contributors

agrange, andrey-norkin, barrbrain, cconcolato, dependabot[bot], eleft, jackhaughton-argon, jzern, kqyang, louquillio, ltrudeau-twoorioles, luctrudeau, peterderivaz, stefanhamburger, stemidts, wantehchang


av1-spec's Issues

Single reference restriction on sub8x8 MI block

In decodeframe.c, function dec_build_inter_predictors(): if (sub8x8_inter), then assert(!is_compound).

I don't see where the spec restricts it from being compound. I see Section 5.11.25 of the spec, but it can still be skip_mode and therefore compound prediction, right?

Thanks.

Sub 8x8 inter prediction for chroma

Hello,

I would like to clarify a point on the sub-8x8 inter prediction. My understanding from the code is that if an 8x8 luma block is split into four 4x4 luma blocks, and the last block is inter but not all blocks are inter, then the corresponding 4x4 chroma block is inter coded using a single motion vector from the last luma inter block. If all blocks are inter coded, then the chroma inter prediction uses four 2x2 inter predictions.

In the specification, section 5.9.32, residual(), there is a loop that sets a variable someUseIntra if any of the blocks are detected as intra, which I think implements the condition described above. However, this loop iterates over ( r = 0; r < num4x4H; r++ ) and ( c = 0; c < num4x4W; c++ ), where as far as I can tell num4x4H = num4x4W = 1. So it only seems to look at the top-left block (r = c = 0) rather than all four blocks. Please can you clarify whether this loop should really iterate r = 0, 1 and c = 0, 1. Thanks.

Description of OperatingPointIdc

In 6.4, the following statement is found:

It is a requirement of bitstream conformance that if OperatingPointIdc is equal to 0, then obu_extension_flag is equal to 0.

Perhaps it would be clearer to add "for all OBUs that follow the sequence header OBU from which OperatingPointIdc was derived, until the next sequence header" (to emphasize that the requirement applies following the zero OperatingPointIdc, and that OperatingPointIdc COULD change when a new sequence header is received).

I'm not sure my wording is the best, but I hope you get what I'm trying to convey.

TM_PRED name should be replaced by PAETH_PRED

This includes "FILTER_TM_PRED" changed to "FILTER_PAETH_PRED" as well.

This will ensure that this new PAETH_PRED mode is not confused with a different TM_PRED mode in VP9, and will also match the name used in the reference encoder/decoder.
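For context, the Paeth predictor (as in PNG filtering) selects whichever of the left, top, and top-left neighbors is closest to left + top - topleft; a minimal sketch, assuming AV1's PAETH_PRED follows this scheme:

```python
def paeth_pred(left, top, topleft):
    # Initial estimate: gradient prediction.
    base = left + top - topleft
    # Pick the neighbor closest to the estimate, preferring
    # left, then top, then top-left on ties.
    p_left = abs(base - left)
    p_top = abs(base - top)
    p_topleft = abs(base - topleft)
    if p_left <= p_top and p_left <= p_topleft:
        return left
    if p_top <= p_topleft:
        return top
    return topleft
```

This tie-breaking order (left, top, top-left) matches the PNG convention; the reference decoder should be consulted for AV1's exact definition.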

terms for Transform block

Transform block is defined as below. I am not sure, but I think a transform block might be square or rectangular, like a Block. Would you please check this?

  1. Terms and definitions
    Transform block
    A square transform coefficient matrix, used as input to the inverse transform process.

I have a simple question related to Quant.
In the case of HEVC, we can easily know the maximum bit width for the quantized coefficients as below.

It is a requirement of bitstream conformance that the value of coeff_abs_level_remaining[ n ] shall be constrained such that the corresponding value of TransCoeffLevel[ x0 ][ y0 ][ cIdx ][ xC ][ yC ] is in the range of CoeffMinY to CoeffMaxY, inclusive, for cIdx equal to 0 and in the range of CoeffMinC to CoeffMaxC, inclusive, for cIdx not equal to 0.

However, we can't find it easily in the AV1 spec. Of course, the AV1 spec uses a bitwise AND with 0xFFFFF. (We can tell that the maximum bit width is 20, but this varies according to the input bit depth.)
Quant[ pos ] = Quant[ pos ] & 0xFFFFF

From the AV1 reference code, I can see the comments below. Can I know the max and min values of the quantized coefficients according to the input bit depth? (Quantized coefficients here means the output of the entropy decoder and the input of the inverse quantization block.)

      // Bitmasking to clamp level to valid range:
      //   The valid range for 8/10/12 bit vdieo is at most 14/16/18 bit
      level &= 0xfffff;
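Based only on that comment ("at most 14/16/18 bit" for 8/10/12-bit video), the signed ranges per bit depth would work out as follows; this is a sketch derived from the quoted comment, not a spec-confirmed bound:

```python
def quant_level_range(bit_depth):
    # Per the reference-code comment quoted above, the valid level range
    # is at most 14/16/18 bits (signed) for 8/10/12-bit video.
    bits = {8: 14, 10: 16, 12: 18}[bit_depth]
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
```

Note that all three ranges fit comfortably within the 20-bit mask applied by `Quant[ pos ] & 0xFFFFF`.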

same variable "i" is used twice

In the description for operating_point_idc:

operating_point_idc[ i ] contains a bitmask that indicates which spatial and temporal layers should be decoded for operating point i. Bit i is equal to 1 if temporal layer i should be decoded (for i between 0 and 7). Bit j+8 is equal to 1 if spatial layer j should be decoded (for j between 0 and 3).

The variable "i" is used twice: once to represent the operating point, and again to indicate the temporal layer. A different variable should be used for one of them; otherwise it is confusing.
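As I read the semantics, the bitmask decodes like this (illustrative sketch only):

```python
def decode_operating_point_idc(idc):
    # Bit i (i = 0..7): temporal layer i should be decoded.
    # Bit j + 8 (j = 0..3): spatial layer j should be decoded.
    temporal = [i for i in range(8) if (idc >> i) & 1]
    spatial = [j for j in range(4) if (idc >> (j + 8)) & 1]
    return temporal, spatial
```

For example, idc = 0x103 selects temporal layers 0 and 1 and spatial layer 0.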

Clarification on ReadDeltas

There is a comment in section 6.10.2 of the specification:

"ReadDeltas specifies whether the current block may read delta values for the quantizer index and loop filter. Delta values for the quantizer index and loop filter are only read on the first non-skipped block of a superblock."

Looking at the syntax tables 5.11.12 and 5.11.13, it appears that the values are read for the first block of a superblock unless the block is skipped and the block is the size of the whole superblock:

"if ( MiSize == sbSize && skip ) return"

So my understanding is that if a superblock starts with a skipped block which is not the whole superblock then ReadDeltas will apply - and so the comment on ReadDeltas seems wrong.

dynamic range after column transform (iwht case)

I see a discrepancy between the AV1 spec and the reference code in the iwht4x4 case.

  1. AV1 spec is described as below.

"Between the row and column transforms, Residual[ i ][ j ] is set equal to Clip3( - ( 1 << (colClampRange - 1 ) ), ( 1 << (colClampRange - 1 ) ) - 1, Residual[ i ][ j ] ) for i = 0..(h-1), for j = 0..(w-1)."

  2. AV1 Reference Code

static INLINE tran_high_t check_range(tran_high_t input, int bd) {
  // AV1 TX case
  // - 8 bit: signed 16 bit integer
  // - 10 bit: signed 18 bit integer
  // - 12 bit: signed 20 bit integer
  // - max quantization error = 1828 << (bd - 8)
  const int32_t int_max = (1 << (7 + bd)) - 1 + (914 << (bd - 7));
  const int32_t int_min = -int_max - 1;
#if CONFIG_COEFFICIENT_RANGE_CHECKING
  assert(int_min <= input);
  assert(input <= int_max);
#endif  // CONFIG_COEFFICIENT_RANGE_CHECKING
  return (tran_high_t)clamp64(input, int_min, int_max);
}

What is the maximum bit width for the coeffs between row and column transform in the case of iwht4x4 ?
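Evaluating the quoted formula gives the clamp bounds per bit depth; this is a sketch mirroring the quoted reference code, not a statement of what the spec requires:

```python
def check_range_bounds(bd):
    # Mirrors the quoted check_range():
    #   int_max = (1 << (7 + bd)) - 1 + (914 << (bd - 7))
    int_max = (1 << (7 + bd)) - 1 + (914 << (bd - 7))
    return -int_max - 1, int_max
```

For bd = 8 this gives [-34596, 34595], i.e. slightly wider than a signed 16-bit range, consistent with the "max quantization error" margin mentioned in the comment.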

Minor typo (doesn't affect meaning or interpretation)

I noticed a tiny typo:

currently on page 637 of the PDF:

Note: The level uses a a X.Y format. X is equal to 2 + (seq_level_idx >> 2). Y is given by (seq_level_idx & 3).

("a" is repeated in the first sentence.)

typo in 7.13.5.1

May need to change "else if()" to only "if()".


mask = 0
if ( filterLen >= 4 ) {
    mask |= (Abs( p1 - p0 ) > limitBd)
    mask |= (Abs( q1 - q0 ) > limitBd)
    mask |= (Abs( p0 - q0 ) * 2 + Abs( p1 - q1 ) / 2 > blimitBd)
} else if ( filterLen >= 6 ) {
    mask |= (Abs( p2 - p1 ) > limitBd)
    mask |= (Abs( q2 - q1 ) > limitBd)
} else if ( filterLen >= 8 ) {
    mask |= (Abs( p3 - p2 ) > limitBd)
    mask |= (Abs( q3 - q2 ) > limitBd)
}

PARTITION_HORZ_4 / VERT_4 outside frame

Hi Peter,

I'm wondering if the spec allows a coded VERT_4 / HORZ_4 partition that is completely outside the frame, while the reference decoder doesn't?

E.g. consider a 24x24 frame, using a VERT_4 partition at the 32x32 level.
The spec then seems to call "decode_block" 4 times, once for each 8x32 partition.
However, the reference decoder seems to call decode_block only 3 times, not coding the rightmost partition.

Is a check missing in the spec to avoid coding partitions completely outside the frame? (such check is already present for the simple 1:2, 2:1 rectangles).

Regards
/Ola

Warp valid constraints

In Section 7.11.3.6 (Setup Shear Process), warpValid is determined based on alpha0, beta0, gamma0 and delta0.

Then these four variables are rounded to alpha, beta, gamma, delta with this type of code:
alpha = Round2Signed( alpha0, WARP_PARAM_REDUCE_BITS ) << WARP_PARAM_REDUCE_BITS

This type of rounding could make alpha larger than alpha0 if the rounding is upward.

If so, then it is possible for warpValid to be true for alpha0, beta0, gamma0 and delta0, but false for alpha, beta, gamma, delta.

The purpose of the warpValid constraints is to ensure that offs (in block warp process) is in the range 0 to 192 (corresponding to -1 to 2 in the 1/64-sample space).

So due to the rounding of these variables, is it possible that offs could exceed the 0 to 192 range?

If so, shouldn't we determine warpValid using alpha, beta, gamma, delta instead of alpha0, beta0, gamma0, delta0?
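To see how the rounding can increase the magnitude, here is a sketch; the Round2/Round2Signed forms follow the usual round-half-up definitions, and WARP_PARAM_REDUCE_BITS = 6 is an assumption for illustration (check the spec's constant table):

```python
def round2(x, n):
    # Round2: add half, then shift right (round half up).
    return (x + (1 << (n - 1))) >> n

def round2_signed(x, n):
    # Round2Signed: round the magnitude, keep the sign.
    return round2(x, n) if x >= 0 else -round2(-x, n)

REDUCE = 6  # assumed value of WARP_PARAM_REDUCE_BITS
alpha0 = 33
alpha = round2_signed(alpha0, REDUCE) << REDUCE  # rounds 33 up to 64
```

Here |alpha| = 64 > |alpha0| = 33, so a warpValid check done on alpha0 does not necessarily bound the rounded alpha.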

Clarification on TotalDisplayLumaSampleRate

In section A.3, TotalDisplayLumaSampleRate is defined as: "the luma sample rate (i.e. based on width * height of the output luma plane) for all output frames with the flag show_frame equal to 1 and frames referenced in frame headers with show_existing_frame = 1 for the scalability layer that conforms to the level indicated for this layer". It seems to me that the expression "frames referenced in frame headers" could be read two ways: that you count such frames once, or that you count such frames for each reference (now that we allow multiple references). I assume it should be for each reference (since the frame is displayed multiple times if referenced multiple times). Just wondering if it would be possible to clarify the text.

Link to specification goes via Outlook

On your website - https://aomedia.org/av1-bitstream-and-decoding-process-specification/

The link to the specification is https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Faomediacodec.github.io%2Fav1-spec%2Fav1-spec.pdf&data=04%7C01%7Cgfrost%40microsoft.com%7Cc01143f4353e426231d508d590e3a9c1%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C1%7C636574229902920663%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwifQ%3D%3D%7C-1&sdata=lLQibtMygoLH30UNXZcUZGAA1i%2FqNE%2Ff6fgotaX3uhI%3D&reserved=0

That link seems to be a copy & paste from Gabe Frost's email account.

I think it should point directly to https://aomediacodec.github.io/av1-spec/av1-spec.pdf

typo in 7.13

Should the following "plane > 0" be replaced with "plane == 0"?

rowStep = ( plane > 0 ) ? 1 : ( 1 << subsampling_y )
colStep = ( plane > 0 ) ? 1 : ( 1 << subsampling_x )

update_cdf semantics

The spec says:
"update_cdf is a function call that indicates that the CDF arrays are set equal to the final CDFs at the end of the largest tile (measured in bytes). This process is described in section 7.4."

However, the update_cdf() function is called only once, in the last tile, not the largest:

if ( tg_end == NumTiles - 1 ) {
    if ( !error_resilient_mode && !frame_parallel_decoding_mode ) {
        update_cdf( )
    }
    ...
}

Should the semantics or the syntax be fixed?

spec for warp process

In spec 7.10.2, step 9: if useWarp is 1, section 7.10.2.4 is invoked with i8=0...((h-1)>>3), j8=0...((w-1)>>3). So i8 and j8 start with 0.
In 7.10.2.4: srcX = (j8 * 8+4)<<subX, srcY = (i8 * 8+4)<<subY. And input x, y (location relative to top-left sample of current picture) are never used in this section.
Based on the code, I believe srcX should be (x + j8 * 8 + 4) << subX and srcY should be (y + i8 * 8 + 4) << subY.

indent issue on RESTORE_SGRPROJ

https://github.com/AOMediaCodec/av1-spec/blob/master/06.bitstream.syntax.md#decode-loop-restoration-unit-syntax

The indent of the line with RESTORE_SGRPROJ should be four spaces, not six.

decode_lr_unit(plane, unitRow, unitCol) {
    if (FrameRestorationType[ plane ] == RESTORE_NONE) {
        restoration_type = RESTORE_NONE
    } else if (FrameRestorationType[ plane ] == RESTORE_WIENER) {
        @@use_wiener                                                           S
        restoration_type = use_wiener ? RESTORE_WIENER : RESTORE_NONE
      } else if (FrameRestorationType[ plane ] == RESTORE_SGRPROJ) {
        @@use_sgrproj                                                          S
        restoration_type = use_sgrproj ? RESTORE_SGRPROJ : RESTORE_NONE
    } else {
        @@restoration_type                                                     S
    }

frame_offset_update

The current spec indicates:

if (show_frame == 0) {
       frame_offset_update                      f(5)
}

and then:

frame_offset_update specifies how many frames later this frame will be shown.

A frame may never be shown. In that case, what should be the value for frame_offset_update? 0?

Spec for loop filter

I found several problems in spec related to loop filter:

[1]. 6.7.9 If primary_ref_frame is equal to PRIMARY_REF_NONE, loop_filter_delta_enabled should also be set to 1.

[2]. 5.9.12. if ( delta_lf_abs == DELTA_LF_SMALL ) --> if ( delta_lf_abs >= DELTA_LF_SMALL )

[3]. 6.9.12. If delta_lf_abs is larger or equal to DELTA_LF_SMALL, the value is encoded using delta_lf_rem_bits and delta_lf_abs_bits.

[4]. 7.13.1: Logic of applyFilter is not correct.
Should be:
Otherwise, if (isBlockEdge is equal to 1 or skip is equal to 0 or prevSkip is equal to 0) and (current block filter level is not 0 or prev block filter level is not 0), applyFilter is set equal to 1.

[5]. 7.13.2: filterSize is not correct.
Should be:
If plane is equal to 0, filterSize is set equal to Min( 14, baseSize ),
Otherwise, (plane is greater than 0), filterSize is set equal to Min( 6, baseSize ).

[6]. Need to update loop filter advance step size.
The loop filter no longer uses a fixed step of one 4x4 unit. Instead, the step size depends on the tx size.
For example, if the current block tx size is TX_8X4 and filtering is in pass 0 (filtering vertical edges), then the step is tx_size_wide_unit[TX_8X4], equal to 2 4x4 units. Similarly, for pass 1, the step is tx_size_high_unit[TX_8X4], equal to 1 4x4 unit.
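The step computation described in [6] could be sketched as follows (hypothetical helper, not reference-code API; transform dimensions in pixels, step in 4x4 units):

```python
def filter_step_4x4_units(tx_w_px, tx_h_px, pass_idx):
    # Step in 4x4 units: the transform width for pass 0 (vertical edges),
    # the transform height for pass 1 (horizontal edges).
    return (tx_w_px if pass_idx == 0 else tx_h_px) >> 2
```

For TX_8X4 this yields a step of 2 units in pass 0 and 1 unit in pass 1, matching the example above.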

Upscaling process

In Section 7.16, the following line is used to compute the pixel position at the coding (downscaled) resolution.

srcX = -(1 << SUPERRES_SCALE_BITS) + initialSubpelX + x*stepX

Checking against the code, I could not locate the part corresponding to -(1 << SUPERRES_SCALE_BITS), which in some cases makes srcX negative.

LAST_FRAME and other keywords not defined

The constant LAST_FRAME is used in places such as ref_frame_sign_bias[ LAST_FRAME + i ] but its value is never defined.

Similar remark for the other keywords: GOLDEN_FRAME, ...

Independent tiles and context clear

In section 5.8.1 there is the text:

if ( !dependent_tiles ||
(TileCol == 0 && !AllowDependentTileRow[ TileRow ]) ) {
clear_above_context( )
}

I was just wondering why the context is cleared in every tile column when dependent_tiles == 0, but only in column 0 when AllowDependentTileRow[ TileRow ] is 0.

add "disable_deblocking" flag in sequence header

Currently, to effectively "disable" deblocking, we must set specific values in loop_filter_params() so that the filter does not actually change any pixels.

Also, renaming "loop filter" to "deblocking" would be more accurate, since the general loop filter should include deblocking + CDEF + LR.

still_picture and bitstream vs. coded video sequence

The spec says:

still_picture equal to 1 specifies that the bitstream contains only one coded frame. still_picture equal to 0 specifies that the bitstream contains one or more coded frames.

It should say "coded video sequence" instead of "bitstream".

Large Scale Tile height

In Section 7.2 on the large scale tile decoding process there is no limit specified on tileHeight. Our recollection is that it was agreed in the working group that tileHeight must be one superblock - so that a specific area to decode is either Wx64 or Wx128 pixels depending on superblock size.

Note about differences between KF and IOF is wrong

The spec currently says:

Note: A key frame is different to an intra-only frame even though both only use intra prediction. The difference is that a key frame fully resets the decoding process. For example, for a sequence of frames A, B, C, frame C can use frame A as a reference if B was an intra-only frame - but not if B was a keyframe.

This is more complex than that. The reset of the decoding process is controlled by refresh_frame_flags.

Chroma 4x4 prediction

From 6.9.29 Compute Prediction Semantics: "Normally, a single prediction is performed for the entire chroma residual block based on the mode info of the bottom right luma block."

Now, in 5.9.4 Decode Block Syntax, HasChroma is non-zero only for the bottom-right luma block with odd MiRow and/or MiCol.

Then, in 5.9.32 Compute Prediction Syntax, the outer-most for loop has plane > 0 only for odd MiRow and/or MiCol because HasChroma > 0 in these cases only.

So, for odd MiRow and/or MiCol,
candRow = (MiRow >> subY) << subY
candCol = (MiCol >> subX) << subX

Therefore, candRow, candCol point to the mode information of the TOP LEFT luma block.

This contradicts the quoted statement from 6.9.29: "... mode info of the BOTTOM RIGHT...".

Please clarify.

"Page [x] of [y]" count at bottoms of pages is quirky

Expected:

  • Page-numbering scheme should start from "Page 1".
  • On the last page, numbers should match.

Here's what I'm seeing:

  • "Page [x] of [y]" count starts with "Page 0", and goes up from there.
    • (Non-techies might find this confusing, and I think it defies expectations.)
  • Last page says "Page 636 of 647"

Typo in Read Inter Intra Semantics

In 6.10.26, I think this
"interintra_mode specifies the type of intra PREDICTION to be used"
should be replaced by this
"interintra_mode specifies the type of intra BLENDING to be used"

Bitstream format terminology clarification

  • The Abstract/Scope say

This document defines the bitstream format

  • "bitstream" is only defined as:

The sequence of bits generated by encoding a sequence of frames.

  • "AV1 bitstream" is not defined

  • "bitstream format" is not defined

  • The current title for Annex B is:

Annex B: Nominal bitstream interchange format

  • The titles of B.1 and B.2 are:

Bitstream syntax
Bitstream semantics

Overall, this is misleading people into thinking that THE recommended format for AV1 is Annex B.

I would suggest:

  • Adding a top-level section to the spec or as a high-level subsection of one of the existing sections, entitled "low overhead bitstream format", with the following content:

This specification defines a low-overhead bitstream format as a sequence of the OBU syntactical elements defined in Section 5. When using this format, the obu_has_size_field may be used to enable parsers to skip OBUs without having to parse them. For applications requiring a format where it is easier to skip through frames or temporal units, a length-delimited bitstream format is defined in Annex B. Derived specifications, such as container formats enabling storage of AV1 videos together with audio or subtitles, should indicate which of these formats they rely on.

  • renaming Annex B to "Annex B: length-delimited bitstream format"
  • Replace the overview as follows:

Section 5 defines the syntax for OBUs. Section XXX defines the low-overhead bitstream format. This annex defines a length-delimited format for packing OBUs into a format that enables skipping through temporal units and frames more easily.

  • Rename B1 and B2 to

Length-delimited bitstream syntax
Length-delimited bitstream semantics

  • Change the Abstract and Scope to use "bitstream formats" (plural)

coded video sequence term

The term coded video sequence is used once in the spec in 6.3 but never defined:

level specifies the level for the coded video sequence.

It should either be defined or replaced.

compound type definitions.

In the spec, compound_type is defined as wedge (0), diffwtd (1), average (2), intra (3), distance (4).
But in the code (enums.h), it is defined as:

typedef enum {
  COMPOUND_AVERAGE,
  COMPOUND_WEDGE,
  COMPOUND_DIFFWTD,
  COMPOUND_TYPES,
} COMPOUND_TYPE;

So the code doesn't have COMPOUND_INTRA or COMPOUND_DISTANCE mode. And constants for average/wedge/diffwtd are also different. Is the spec going to follow the code and remove intra/distance modes?

Thanks.

Profile clarification

Hi,

In section "A.1.1 Profiles" the three profiles Main, High and Professional are defined by bullets. A Main profile decoder must decode seq_profile=0, a High profile decoder seq_profile=0,1, and a Professional profile decoder seq_profile=0,1,2. This seems fine to me. However, the following Note seems confusing. It says, for example, that "monochrome is not allowed in high profile". The above bullet has just said that a High profile decoder must support seq_profile=0, and therefore must support monochrome.

It seems to me it would be clearer not to use the Main, High, Professional terminology (which refers to an increasing sequence of capability) when referring to an individual seq_profile number (such as 0, 1, 2), which, as I understand it, is defined so as to have a unique setting for each content type.

So, for example the note could say that monochrome is not permitted when seq_profile=1 rather than not permitted in high profile?

frame_size_override_flag description

In 6.7.1, the following statement is in the spec:

frame_size_override_flag equal to 1 specifies that the frame size will be contained in the bitstream. frame_size_override_flag equal to 0 specifies that the frame size is equal to the size in the sequence header.

The first statement above seems a little misleading at best. If the flag is 1, we could wind up getting the frame size from one of the references, so technically the statement "the frame size will be contained in the bitstream" is not 100% accurate. (What's contained in the bitstream is a pointer to the reference frame from which the frame size is derived; the frame size itself is not in the bitstream.)

su(n) definition simplification

In section 4.9.4, the su(n) definition uses:

  value = value - 2 * signBit

You either keep the 'if' and keep it as
value = value - 2

Or delete the 'if' and keep it as:
value = value - 2 * signBit

The text is not in error, it just has unnecessary computations.
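For comparison, here is one common two's-complement formulation of an n-bit signed read (not necessarily the draft's exact wording; the bit-reader `bits` is a hypothetical iterator yielding 0/1, MSB first):

```python
def read_su(bits, n):
    # Read n bits as an unsigned value, i.e. f(n).
    value = 0
    for _ in range(n):
        value = (value << 1) | next(bits)
    # Interpret the top bit as the sign (two's complement).
    sign_mask = 1 << (n - 1)
    if value & sign_mask:
        value -= 2 * sign_mask
    # Equivalently, without the 'if': value -= 2 * (value & sign_mask)
    return value
```

As the issue notes, the conditional and unconditional forms are equivalent; keeping both the test and the multiplication is redundant.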

CDEF skip for sub 8x8 blocks

Hi Peter,

It seems that "7.14.1 CDEF Block Process" specifies that the CDEF processing for the 8x8 block at (r,c) should be skipped if the Skips[r][c] is set (i.e. if the top-left 4x4 in the 8x8 is skip).

However, it appears that the reference decoder loops over all 4x4 blocks in the 8x8, and only if all 4 are skip, the CDEF for the 8x8 is skipped (see is_8x8_block_skip()).
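The reference-decoder behaviour described here could be sketched as follows (hypothetical helper, not the actual is_8x8_block_skip() signature; `skips` is assumed indexed in 4x4 units):

```python
def cdef_8x8_is_skipped(skips, r, c):
    # CDEF for the 8x8 block at 4x4 coordinates (r, c) is skipped only
    # if all four constituent 4x4 blocks are skip.
    return all(skips[r + dr][c + dc] for dr in (0, 1) for dc in (0, 1))
```

The spec text as quoted would instead test only skips[r][c], the top-left 4x4.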

I think this might be a spec bug?

Regards
/Ola

Self guided parameter w1

Hi Peter,

In section "7.16.2 Self Guided Filter Process" I'm wondering if

   v = w1 * u

should rather be

   v = w2 * u

Regards
/Ola Hugosson

Encoding of lr_sgr_set

The symbol lr_sgr_set is encoded as NS(SGRPROJ_PARAMS_BITS)=NS(4) in the specification which seems to encode the range 0 to 3 according to 4.9.6. However, in the code it seems to be range 0 to 15 encoded as a 4 bit literal. I'm assuming it should be a 4 bit raw value.
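If it helps to compare, here is a sketch of how I read the ns(n) decode in 4.9.6 (value range 0..n-1; the bit-reader `bits` is a hypothetical iterator yielding 0/1, MSB first):

```python
def ns_decode(bits, n):
    # Non-symmetric decode of a value in 0..n-1.
    w = n.bit_length()          # FloorLog2(n) + 1 for n >= 1
    m = (1 << w) - n
    v = 0
    for _ in range(w - 1):      # f(w - 1)
        v = (v << 1) | next(bits)
    if v < m:
        return v
    extra = next(bits)          # f(1)
    return (v << 1) - m + extra
```

For n = 4 this consumes exactly 2 bits and yields 0..3, whereas a 4-bit literal yields 0..15, which is the discrepancy described above.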
