aomediacodec / av1-spec

AV1 Bitstream & Decoding Process Specification

Home Page: https://aomedia.org/

License: Other


av1-spec's Introduction

This document provides instructions for working with the draft AV1 Bitstream & Decoding Process Specification.

The specification document is built from plaintext section and subsection Markdown files (more specifically, kramdown files) using the Jekyll static site generator tool.

The document build process employs a common NodeJS-based web development toolchain, requiring Node, npm (the Node package manager), and GruntJS, a Node-based task runner.

Note: As a general rule, the packages described below should be installed in user space, not at the system level -- in other words, do not install them as root or via sudo. An exception is Ruby development headers, which are usually needed to build Ruby and certain Ruby gems. {:.alert .alert-info }

Ruby and rbenv

This project currently depends on Ruby v3.2.0. Because your distro may lack this version -- or installing it may conflict with your system's installed version -- first install rbenv, then install Ruby v3.2.0 within it.

# list all available versions:
$ rbenv install -l
3.1.4
3.2.0
3.2.2
jruby-9.4.2.0
mruby-3.2.0

# install a Ruby version:
$ rbenv install 3.2.0

Depending on your distro and environment, you may have trouble building a particular Ruby version. The rbenv project site maintains a wiki page with troubleshooting help.

Bundler

Gem dependencies are managed with bundler.

$ gem install bundler

# Filesystem location where gems are installed
$ gem env home
# => ~/.rbenv/versions/<ruby-version>/lib/ruby/gems/...

Clone the Repo

git clone [email protected]:AOMediaCodec/av1-spec.git
cd av1-spec

Set Local Ruby Version (rbenv)

In the directory of your local clone, do:

rbenv local 3.2.0

Regardless of any other Rubies installed on your system, the project environment will now use v3.2.0 and gems appropriate for it.

Install Gem Dependencies with Bundler

In the directory of your local clone, run

bundle install

Bundler will resolve dependencies and install the needed gems as listed in Gemfile.lock.

NodeJS, npm and GruntJS

Follow these instructions for installing NodeJS and npm.

Next -- from the project directory -- update npm, install the grunt package, and install the project's Node dependencies:

## Update npm globally
npm update -g npm

## Install grunt globally
npm install -g grunt-cli

## Install the project's Node dependencies
## (uses package.json and Gruntfile)
npm install

Grunt Tasks

Building the document is done via Grunt tasks, configured in the CoffeeScript file Gruntfile.coffee.

There are tasks to

  • clean output directories
  • perform some automated text transformations on the content files
  • build the document with Jekyll
  • copy output files to the correct directories for serving and viewing

These tasks are invoked in turn by the Grunt default task:

$ grunt

PDF

Additionally, there is a grunt exec task for generating the spec as a PDF file. It invokes the commercial tool Prince (formerly PrinceXML), which must first be installed on your system. PDF generation can then be done with:

$ grunt && grunt exec

Note that while we are generating PDF as a convenience format during document development, the canonical spec will likely be served as HTML, not PDF. {:.alert .alert-info }

av1-spec's People

Contributors

agrange, andrey-norkin, barrbrain, cconcolato, dependabot[bot], eleft, jackhaughton-argon, jzern, kqyang, louquillio, ltrudeau-twoorioles, luctrudeau, peterderivaz, stefanhamburger, stemidts, wantehchang


av1-spec's Issues

Single reference restriction on sub8x8 MI block

In decodeframe.c, function dec_build_inter_predictors(): if (sub8x8_inter), then assert(!is_compound).

I don't see where the spec restricts it from being compound. I see Section 5.11.25 of the spec, but it can still be skip_mode and therefore compound prediction, right?

Thanks.

Sub 8x8 inter prediction for chroma

Hello,

I would like to clarify a point on the sub-8x8 inter prediction. My understanding from the code is that if an 8x8 luma block is split into four 4x4 luma blocks, and the last block is inter but not all blocks are inter, then the corresponding 4x4 chroma block is inter coded using a single motion vector from the last luma inter block. If all blocks are inter coded, then the chroma inter prediction uses four 2x2 inter predictions.

In the specification, section 5.9.32, residual(), there is a loop that sets a variable someUseIntra if any of the blocks are detected as intra, which I think implements the condition described above. However, this loop iterates over ( r = 0; r < num4x4H; r++ ) and ( c = 0; c < num4x4W; c++ ), where as far as I can tell num4x4H = num4x4W = 1. So it only seems to look at the top-left block (r = c = 0) rather than all four blocks. Please can you clarify whether this loop should really iterate r = 0, 1 and c = 0, 1. Thanks.

Description of OperatingPointIdc

In 6.4, the following statement is found:

It is a requirement of bitstream conformance that if OperatingPointIdc is equal to 0, then obu_extension_flag is equal to 0.

Perhaps it would be clearer to add "for all OBUs that follow the sequence header OBU from which OperatingPointIdc was derived, until the next sequence header" (to emphasize that the requirement applies following the zero OperatingPointIdc, and that OperatingPointIdc COULD change when a new sequence header is received).

I'm not sure my wording is the best, but I hope you get what I'm trying to convey.

TM_PRED name should be replaced by PAETH_PRED

This includes "FILTER_TM_PRED" changed to "FILTER_PAETH_PRED" as well.

This will ensure that this new PAETH_PRED mode is not confused with a different TM_PRED mode in VP9, and will also match the name used in the reference encoder/decoder.
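For context, the Paeth predictor (as in PNG filtering) selects whichever of the left, top, and top-left neighbors is closest to left + top - topleft; a minimal sketch, assuming AV1's PAETH_PRED follows this scheme:

```python
def paeth_pred(left, top, topleft):
    # Initial estimate: gradient prediction.
    base = left + top - topleft
    # Pick the neighbor closest to the estimate, preferring
    # left, then top, then top-left on ties.
    p_left = abs(base - left)
    p_top = abs(base - top)
    p_topleft = abs(base - topleft)
    if p_left <= p_top and p_left <= p_topleft:
        return left
    if p_top <= p_topleft:
        return top
    return topleft
```

This tie-breaking order (left, top, top-left) matches the PNG convention; the reference decoder should be consulted for AV1's exact definition.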

terms for Transform block

Transform block is defined as below. I am not sure, but I think a transform block might be square or rectangular, like a Block. Would you please check this?

  1. Terms and definitions
    Transform block
    A square transform coefficient matrix, used as input to the inverse transform process.

I have a simple question related to Quant.
In the case of HEVC, we can easily know the maximum bit width for the quantized coefficients as below.

It is a requirement of bitstream conformance that the value of coeff_abs_level_remaining[ n ] shall be constrained such that the corresponding value of TransCoeffLevel[ x0 ][ y0 ][ cIdx ][ xC ][ yC ] is in the range of CoeffMinY to CoeffMaxY, inclusive, for cIdx equal to 0 and in the range of CoeffMinC to CoeffMaxC, inclusive, for cIdx not equal to 0.

However, we can't find it easily in the AV1 spec. Of course, the AV1 spec uses a bitwise AND with 0xFFFFF. (We can tell that the maximum bit width is 20, but this varies according to the input bit depth.)
Quant[ pos ] = Quant[ pos ] & 0xFFFFF

From the AV1 reference code, I can see the comments below. Can I know the max and min values of the quantized coefficients according to the input bit depth? (Quantized coefficients here means the output of the entropy decoder and the input of the inverse quantization block.)

      // Bitmasking to clamp level to valid range:
      //   The valid range for 8/10/12 bit vdieo is at most 14/16/18 bit
      level &= 0xfffff;
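Based only on that comment ("at most 14/16/18 bit" for 8/10/12-bit video), the signed ranges per bit depth would work out as follows; this is a sketch derived from the quoted comment, not a spec-confirmed bound:

```python
def quant_level_range(bit_depth):
    # Per the reference-code comment quoted above, the valid level range
    # is at most 14/16/18 bits (signed) for 8/10/12-bit video.
    bits = {8: 14, 10: 16, 12: 18}[bit_depth]
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
```

Note that all three ranges fit comfortably within the 20-bit mask applied by `Quant[ pos ] & 0xFFFFF`.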

same variable "i" is used twice

In the description for operating_point_idc:

operating_point_idc[ i ] contains a bitmask that indicates which spatial and temporal layers should be decoded for operating point i. Bit i is equal to 1 if temporal layer i should be decoded (for i between 0 and 7). Bit j+8 is equal to 1 if spatial layer j should be decoded (for j between 0 and 3).

The variable "i" is used twice: once to represent the operating point, and again to indicate the temporal layer. A different variable should be used for one of them; otherwise it is confusing.
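As I read the semantics, the bitmask decodes like this (illustrative sketch only):

```python
def decode_operating_point_idc(idc):
    # Bit i (i = 0..7): temporal layer i should be decoded.
    # Bit j + 8 (j = 0..3): spatial layer j should be decoded.
    temporal = [i for i in range(8) if (idc >> i) & 1]
    spatial = [j for j in range(4) if (idc >> (j + 8)) & 1]
    return temporal, spatial
```

For example, idc = 0x103 selects temporal layers 0 and 1 and spatial layer 0.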

Clarification on ReadDeltas

There is a comment in section 6.10.2 of the specification:

"ReadDeltas specifies whether the current block may read delta values for the quantizer index and loop filter. Delta values for the quantizer index and loop filter are only read on the first non-skipped block of a superblock."

Looking at the syntax tables 5.11.12 and 5.11.13, it appears that the values are read for the first block of a superblock unless the block is skipped and the block is the size of the whole superblock:

"if ( MiSize == sbSize && skip ) return"

So my understanding is that if a superblock starts with a skipped block which is not the whole superblock then ReadDeltas will apply - and so the comment on ReadDeltas seems wrong.

dynamic range after column transform (iwht case)

I see a discrepancy between the AV1 spec and the reference code in the iwht4x4 case.

  1. AV1 spec is described as below.

"Between the row and column transforms, Residual[ i ][ j ] is set equal to Clip3( - ( 1 << (colClampRange - 1 ) ), ( 1 << (colClampRange - 1 ) ) - 1, Residual[ i ][ j ] ) for i = 0..(h-1), for j = 0..(w-1)."

  2. AV1 Reference Code

static INLINE tran_high_t check_range(tran_high_t input, int bd) {
  // AV1 TX case
  // - 8 bit: signed 16 bit integer
  // - 10 bit: signed 18 bit integer
  // - 12 bit: signed 20 bit integer
  // - max quantization error = 1828 << (bd - 8)
  const int32_t int_max = (1 << (7 + bd)) - 1 + (914 << (bd - 7));
  const int32_t int_min = -int_max - 1;
#if CONFIG_COEFFICIENT_RANGE_CHECKING
  assert(int_min <= input);
  assert(input <= int_max);
#endif  // CONFIG_COEFFICIENT_RANGE_CHECKING
  return (tran_high_t)clamp64(input, int_min, int_max);
}

What is the maximum bit width for the coeffs between row and column transform in the case of iwht4x4 ?
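Evaluating the quoted formula gives the clamp bounds per bit depth; this is a sketch mirroring the quoted reference code, not a statement of what the spec requires:

```python
def check_range_bounds(bd):
    # Mirrors the quoted check_range():
    #   int_max = (1 << (7 + bd)) - 1 + (914 << (bd - 7))
    int_max = (1 << (7 + bd)) - 1 + (914 << (bd - 7))
    return -int_max - 1, int_max
```

For bd = 8 this gives [-34596, 34595], i.e. slightly wider than a signed 16-bit range, consistent with the "max quantization error" margin mentioned in the comment.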

Minor typo (doesn't affect meaning or interpretation)

I noticed a tiny typo:

currently on page 637 of the PDF:

Note: The level uses a a X.Y format. X is equal to 2 + (seq_level_idx >> 2). Y is given by (seq_level_idx & 3).

("a" is repeated in the first sentence.)

typo in 7.13.5.1

May need to change "else if()" to only "if()".


mask = 0
if ( filterLen >= 4 ) {
    mask |= (Abs( p1 - p0 ) > limitBd)
    mask |= (Abs( q1 - q0 ) > limitBd)
    mask |= (Abs( p0 - q0 ) * 2 + Abs( p1 - q1 ) / 2 > blimitBd)
} else if ( filterLen >= 6 ) {
    mask |= (Abs( p2 - p1 ) > limitBd)
    mask |= (Abs( q2 - q1 ) > limitBd)
} else if ( filterLen >= 8 ) {
    mask |= (Abs( p3 - p2 ) > limitBd)
    mask |= (Abs( q3 - q2 ) > limitBd)
}

PARTITION_HORZ_4 / VERT_4 outside frame

Hi Peter,

I'm wondering if the spec allows a coded VERT_4 / HORZ_4 partition that is completely outside the frame, while the reference decoder doesn't?

E.g. consider a 24x24 frame, using a VERT_4 partition at the 32x32 level.
The spec then seems to call "decode_block" 4 times, once for each 8x32 partition.
However, the reference decoder seems to call decode_block only 3 times, not coding the rightmost partition.

Is a check missing in the spec to avoid coding partitions completely outside the frame? (such check is already present for the simple 1:2, 2:1 rectangles).

Regards
/Ola

Warp valid constraints

In Section 7.11.3.6 (Setup Shear Process), warpValid is determined based on alpha0, beta0, gamma0 and delta0.

Then these four variables are rounded to alpha, beta, gamma, delta with this type of code:
alpha = Round2Signed( alpha0, WARP_PARAM_REDUCE_BITS ) << WARP_PARAM_REDUCE_BITS

This type of rounding could make alpha larger than alpha0 if the rounding is upward.

If so, then it is possible for warpValid to be true for alpha0, beta0, gamma0 and delta0, but false for alpha, beta, gamma, delta.

The purpose of the warpValid constraints is to ensure that offs (in block warp process) is in the range 0 to 192 (corresponding to -1 to 2 in the 1/64-sample space).

So due to the rounding of these variables, is it possible that offs could exceed the 0 to 192 range?

If so, shouldn't we determine warpValid using alpha, beta, gamma, delta instead of alpha0, beta0, gamma0, delta0?
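To see how the rounding can increase the magnitude, here is a sketch; the Round2/Round2Signed forms follow the usual round-half-up definitions, and WARP_PARAM_REDUCE_BITS = 6 is an assumption for illustration (check the spec's constant table):

```python
def round2(x, n):
    # Round2: add half, then shift right (round half up).
    return (x + (1 << (n - 1))) >> n

def round2_signed(x, n):
    # Round2Signed: round the magnitude, keep the sign.
    return round2(x, n) if x >= 0 else -round2(-x, n)

REDUCE = 6  # assumed value of WARP_PARAM_REDUCE_BITS
alpha0 = 33
alpha = round2_signed(alpha0, REDUCE) << REDUCE  # rounds 33 up to 64
```

Here |alpha| = 64 > |alpha0| = 33, so a warpValid check done on alpha0 does not necessarily bound the rounded alpha.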

Clarification on TotalDisplayLumaSampleRate

In section A.3, TotalDisplayLumaSampleRate is defined as: "the luma sample rate (i.e. based on width * height of the output luma plane) for all output frames with the flag show_frame equal to 1 and frames referenced in frame headers with show_existing_frame = 1 for the scalability layer that conforms to the level indicated for this layer". It seems to me that the expression "frames referenced in frame headers" could be read two ways: that you count such frames once, or that you count such frames for each reference (now that we allow multiple references). I assume it should be for each reference (since the frame is displayed multiple times if referenced multiple times). Just wondering if it would be possible to clarify the text.

Link to specification goes via Outlook

On your website - https://aomedia.org/av1-bitstream-and-decoding-process-specification/

The link to the specification is https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Faomediacodec.github.io%2Fav1-spec%2Fav1-spec.pdf&data=04%7C01%7Cgfrost%40microsoft.com%7Cc01143f4353e426231d508d590e3a9c1%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C1%7C636574229902920663%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwifQ%3D%3D%7C-1&sdata=lLQibtMygoLH30UNXZcUZGAA1i%2FqNE%2Ff6fgotaX3uhI%3D&reserved=0

That link seems to be a copy & paste from Gabe Frost's email account.

I think it should point directly to https://aomediacodec.github.io/av1-spec/av1-spec.pdf

typo in 7.13

Should the following "plane > 0" be replaced with "plane == 0"?

rowStep = ( plane > 0 ) ? 1 : ( 1 << subsampling_y )
colStep = ( plane > 0 ) ? 1 : ( 1 << subsampling_x )

update_cdf semantics

The spec says:
"update_cdf is a function call that indicates that the CDF arrays are set equal to the final CDFs at the end of the largest tile (measured in bytes). This process is described in section 7.4."

However, the update_cdf() function is called only once, in the last tile, not the largest:

if ( tg_end == NumTiles - 1 ) {
    if ( !error_resilient_mode && !frame_parallel_decoding_mode ) {
        update_cdf( )
    }
    ...
}

Should the semantics or the syntax be fixed?

spec for warp process

In spec 7.10.2, step 9: if useWarp is 1, section 7.10.2.4 is invoked with i8=0...((h-1)>>3), j8=0...((w-1)>>3). So i8 and j8 start with 0.
In 7.10.2.4: srcX = (j8 * 8+4)<<subX, srcY = (i8 * 8+4)<<subY. And input x, y (location relative to top-left sample of current picture) are never used in this section.
Based on the code, I believe srcX should be (x + j8 * 8 + 4) << subX and srcY should be (y + i8 * 8 + 4) << subY.

indent issue on RESTORE_SGRPROJ

https://github.com/AOMediaCodec/av1-spec/blob/master/06.bitstream.syntax.md#decode-loop-restoration-unit-syntax

The indent of the line with RESTORE_SGRPROJ should be four spaces, not six.

decode_lr_unit(plane, unitRow, unitCol) {
    if (FrameRestorationType[ plane ] == RESTORE_NONE) {
        restoration_type = RESTORE_NONE
    } else if (FrameRestorationType[ plane ] == RESTORE_WIENER) {
        @@use_wiener                                                           S
        restoration_type = use_wiener ? RESTORE_WIENER : RESTORE_NONE
      } else if (FrameRestorationType[ plane ] == RESTORE_SGRPROJ) {
        @@use_sgrproj                                                          S
        restoration_type = use_sgrproj ? RESTORE_SGRPROJ : RESTORE_NONE
    } else {
        @@restoration_type                                                     S
    }

frame_offset_update

The current spec indicates:

if (show_frame == 0) {
       frame_offset_update                      f(5)
}

and then:

frame_offset_update specifies how many frames later this frame will be shown.

A frame may never be shown. In that case, what should be the value for frame_offset_update? 0?

Spec for loop filter

I found several problems in spec related to loop filter:

[1]. 6.7.9 If primary_ref_frame is equal to PRIMARY_REF_NONE, loop_filter_delta_enabled should also be set to 1.

[2]. 5.9.12. if ( delta_lf_abs == DELTA_LF_SMALL ) --> if ( delta_lf_abs >= DELTA_LF_SMALL )

[3]. 6.9.12. If delta_lf_abs is larger or equal to DELTA_LF_SMALL, the value is encoded using delta_lf_rem_bits and delta_lf_abs_bits.

[4]. 7.13.1: Logic of applyFilter is not correct.
Should be:
Otherwise, if (isBlockEdge is equal to 1 or skip is equal to 0 or prevSkip is equal to 0) and (current block filter level is not 0 or prev block filter level is not 0), applyFilter is set equal to 1.

[5]. 7.13.2: filterSize is not correct.
Should be:
If plane is equal to 0, filterSize is set equal to Min( 14, baseSize ),
Otherwise, (plane is greater than 0), filterSize is set equal to Min( 6, baseSize ).

[6]. Need to update loop filter advance step size.
The loop filter no longer uses a fixed step of one 4x4 unit. Instead, the step size depends on the tx size.
For example, if the current block tx size is TX_8X4 and filtering is in pass 0 (filtering vertical edges), then the step is tx_size_wide_unit[TX_8X4], equal to 2 4x4 units. Similarly, for pass 1, the step is tx_size_high_unit[TX_8X4], equal to 1 4x4 unit.
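The step computation described in [6] could be sketched as follows (hypothetical helper, not reference-code API; transform dimensions in pixels, step in 4x4 units):

```python
def filter_step_4x4_units(tx_w_px, tx_h_px, pass_idx):
    # Step in 4x4 units: the transform width for pass 0 (vertical edges),
    # the transform height for pass 1 (horizontal edges).
    return (tx_w_px if pass_idx == 0 else tx_h_px) >> 2
```

For TX_8X4 this yields a step of 2 units in pass 0 and 1 unit in pass 1, matching the example above.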

Upscaling process

In Section 7.16, the following line is used to compute the pixel position at the coding (downscaled) resolution.

srcX = -(1 << SUPERRES_SCALE_BITS) + initialSubpelX + x*stepX

Checking against the code, I could not locate the part corresponding to -(1 << SUPERRES_SCALE_BITS), which in some cases makes srcX negative.

LAST_FRAME and other keywords not defined

The constant LAST_FRAME is used in places such as ref_frame_sign_bias[ LAST_FRAME + i ] but its value is never defined.

Similar remark for the other keywords: GOLDEN_FRAME, ...

Independent tiles and context clear

In section 5.8.1 there is the text:

if ( !dependent_tiles ||
(TileCol == 0 && !AllowDependentTileRow[ TileRow ]) ) {
clear_above_context( )
}

I was just wondering why the context is cleared in every tile column when dependent_tiles == 0, but only in column 0 when AllowDependentTileRow[ TileRow ] is 0.

add "disable_deblocking" flag in sequence header

Currently, to effectively "disable" deblocking, we must set specific values in loop_filter_params() so that the filter does not actually change any pixels.

Also, renaming "loop filter" to "deblocking" would be more accurate, since the general loop filter should include deblocking + CDEF + LR.

still_picture and bitstream vs. coded video sequence

The spec says:

still_picture equal to 1 specifies that the bitstream contains only one coded frame. still_picture equal to 0 specifies that the bitstream contains one or more coded frames.

It should say "coded video sequence" instead of "bitstream".

Large Scale Tile height

In Section 7.2 on the large scale tile decoding process there is no limit specified on tileHeight. Our recollection is that it was agreed in the working group that tileHeight must be one superblock - so that a specific area to decode is either Wx64 or Wx128 pixels depending on superblock size.

Note about differences between KF and IOF is wrong

The spec currently says:

Note: A key frame is different to an intra-only frame even though both only use intra prediction. The difference is that a key frame fully resets the decoding process. For example, for a sequence of frames A, B, C, frame C can use frame A as a reference if B was an intra-only frame - but not if B was a keyframe.

This is more complex than that. The reset of the decoding process is controlled by refresh_frame_flags.

Chroma 4x4 prediction

From 6.9.29 Compute Prediction Semantics: "Normally, a single prediction is performed for the entire chroma residual block based on the mode info of the bottom right luma block."

Now, in 5.9.4 Decode Block Syntax, HasChroma is non-zero only for the bottom-right luma block with odd MiRow and/or MiCol.

Then, in 5.9.32 Compute Prediction Syntax, the outer-most for loop has plane > 0 only for odd MiRow and/or MiCol because HasChroma > 0 in these cases only.

So, for odd MiRow and/or MiCol,
candRow = (MiRow >> subY) << subY
candCol = (MiCol >> subX) << subX

Therefore, candRow, candCol point to the mode information of the TOP LEFT luma block.

This contradicts the quoted statement from 6.9.29: "... mode info of the BOTTOM RIGHT...".

Please clarify.

"Page [x] of [y]" count at bottoms of pages is quirky

Expected:

  • Page-numbering scheme should start from "Page 1".
  • On the last page, numbers should match.

Here's what I'm seeing:

  • "Page [x] of [y]" count starts with "Page 0", and goes up from there.
    • (Non-techies might find this confusing, and I think it defies expectations.)
  • Last page says "Page 636 of 647"

Typo in Read Inter Intra Semantics

In 6.10.26, I think this
"interintra_mode specifies the type of intra PREDICTION to be used"
should be replaced by this
"interintra_mode specifies the type of intra BLENDING to be used"

Bitstream format terminology clarification

  • The Abstract/Scope say

This document defines the bitstream format

  • "bitstream" is only defined as:

The sequence of bits generated by encoding a sequence of frames.

  • "AV1 bitstream" is not defined

  • "bitstream format" is not defined

  • The current title for Annex B is:

Annex B: Nominal bitstream interchange format

  • The titles of B.1 and B.2 are:

Bitstream syntax
Bitstream semantics

Overall, this is misleading people into thinking that THE recommended format for AV1 is Annex B.

I would suggest:

  • Adding a top-level section to the spec or as a high-level subsection of one of the existing sections, entitled "low overhead bitstream format", with the following content:

This specification defines a low-overhead bitstream format as a sequence of the OBU syntactical elements defined in Section 5. When using this format, the obu_has_size_field may be used to enable parsers to skip OBUs without having to parse them. For applications requiring a format where it is easier to skip through frames or temporal units, a length-delimited bitstream format is defined in Annex B. Derived specifications, such as container formats enabling storage of AV1 videos together with audio or subtitles, should indicate which of these formats they rely on.

  • renaming Annex B to "Annex B: length-delimited bitstream format"
  • Replace the overview as follows:

Section 5 defines the syntax for OBUs. Section XXX defines the low-overhead bitstream format. This annex defines a length-delimited format for packing OBUs into a format that enables skipping through temporal units and frames more easily.

  • Rename B1 and B2 to

Length-delimited bitstream syntax
Length-delimited bitstream semantics

  • Change the Abstract and Scope to use "bitstream formats" (plural)

coded video sequence term

The term coded video sequence is used once in the spec in 6.3 but never defined:

level specifies the level for the coded video sequence.

It should either be defined or replaced.

compound type definitions.

In the spec, compound_type is defined as wedge (0), diffwtd (1), average (2), intra (3), distance (4).
But in the code (enums.h), it is defined as:

typedef enum {
  COMPOUND_AVERAGE,
  COMPOUND_WEDGE,
  COMPOUND_DIFFWTD,
  COMPOUND_TYPES,
} COMPOUND_TYPE;

So the code doesn't have COMPOUND_INTRA or COMPOUND_DISTANCE mode. And constants for average/wedge/diffwtd are also different. Is the spec going to follow the code and remove intra/distance modes?

Thanks.

Profile clarification

Hi,

In section "A.1.1 Profiles" the three profiles Main, High and Professional are defined by bullets. A Main profile decoder must decode seq_profile=0, a High profile decoder seq_profile=0,1, and a Professional profile decoder seq_profile=0,1,2. This seems fine to me. However, the following Note seems confusing. It says, for example, that "monochrome is not allowed in high profile". The above bullet has just said that a High profile decoder must support seq_profile=0, and therefore must support monochrome.

It seems to me it would be clearer not to use the Main, High, Professional terminology (which refers to an increasing sequence of capability) when referring to an individual seq_profile number (such as 0, 1, 2), which, as I understand it, is defined so as to have a unique setting for each content type.

So, for example the note could say that monochrome is not permitted when seq_profile=1 rather than not permitted in high profile?

frame_size_override_flag description

In 6.7.1, the following statement is in the spec:

frame_size_override_flag equal to 1 specifies that the frame size will be contained in the bitstream. frame_size_override_flag equal to 0 specifies that the frame size is equal to the size in the sequence header.

The first statement above seems a little misleading at best. If the flag is 1, we could wind up getting the frame size from one of the references, so technically the statement "the frame size will be contained in the bitstream" is not 100% accurate. (What's contained in the bitstream is a pointer to the reference frame from which the frame size is derived; the frame size itself is not in the bitstream.)

su(n) definition simplification

In section 4.9.4, the su(n) definition uses:

  value = value - 2 * signBit

You either keep the 'if' and keep it as
value = value - 2

Or delete the 'if' and keep it as:
value = value - 2 * signBit

The text is not in error, it just has unnecessary computations.
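For comparison, here is one common two's-complement formulation of an n-bit signed read (not necessarily the draft's exact wording; the bit-reader `bits` is a hypothetical iterator yielding 0/1, MSB first):

```python
def read_su(bits, n):
    # Read n bits as an unsigned value, i.e. f(n).
    value = 0
    for _ in range(n):
        value = (value << 1) | next(bits)
    # Interpret the top bit as the sign (two's complement).
    sign_mask = 1 << (n - 1)
    if value & sign_mask:
        value -= 2 * sign_mask
    # Equivalently, without the 'if': value -= 2 * (value & sign_mask)
    return value
```

As the issue notes, the conditional and unconditional forms are equivalent; keeping both the test and the multiplication is redundant.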

CDEF skip for sub 8x8 blocks

Hi Peter,

It seems that "7.14.1 CDEF Block Process" specifies that the CDEF processing for the 8x8 block at (r,c) should be skipped if the Skips[r][c] is set (i.e. if the top-left 4x4 in the 8x8 is skip).

However, it appears that the reference decoder loops over all 4x4 blocks in the 8x8, and only if all 4 are skip, the CDEF for the 8x8 is skipped (see is_8x8_block_skip()).
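The reference-decoder behaviour described here could be sketched as follows (hypothetical helper, not the actual is_8x8_block_skip() signature; `skips` is assumed indexed in 4x4 units):

```python
def cdef_8x8_is_skipped(skips, r, c):
    # CDEF for the 8x8 block at 4x4 coordinates (r, c) is skipped only
    # if all four constituent 4x4 blocks are skip.
    return all(skips[r + dr][c + dc] for dr in (0, 1) for dc in (0, 1))
```

The spec text as quoted would instead test only skips[r][c], the top-left 4x4.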

I think this might be a spec bug?

Regards
/Ola

Self guided parameter w1

Hi Peter,

In section "7.16.2 Self Guided Filter Process" I'm wondering if

   v = w1 * u

should rather be

   v = w2 * u

Regards
/Ola Hugosson

Encoding of lr_sgr_set

The symbol lr_sgr_set is encoded as NS(SGRPROJ_PARAMS_BITS)=NS(4) in the specification which seems to encode the range 0 to 3 according to 4.9.6. However, in the code it seems to be range 0 to 15 encoded as a 4 bit literal. I'm assuming it should be a 4 bit raw value.
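If it helps to compare, here is a sketch of how I read the ns(n) decode in 4.9.6 (value range 0..n-1; the bit-reader `bits` is a hypothetical iterator yielding 0/1, MSB first):

```python
def ns_decode(bits, n):
    # Non-symmetric decode of a value in 0..n-1.
    w = n.bit_length()          # FloorLog2(n) + 1 for n >= 1
    m = (1 << w) - n
    v = 0
    for _ in range(w - 1):      # f(w - 1)
        v = (v << 1) | next(bits)
    if v < m:
        return v
    extra = next(bits)          # f(1)
    return (v << 1) - m + extra
```

For n = 4 this consumes exactly 2 bits and yields 0..3, whereas a 4-bit literal yields 0..15, which is the discrepancy described above.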
