
libiamf's People

Contributors

cconcolato, felicialim, jwcullen, sunghee-hwang, tdaede, yeroro, yilun-zhangs

libiamf's Issues

Reviewing test_000013 - test_000035

Hello~

Common issues (Opus codec):
Issue 1: The pre-skip value is 0 in codec_config_metadata { }.
According to the spec, pre-skip shall be the same as the number of audio samples to be trimmed at the start of the substreams.

Issue 2: The relative_offset of OBU_DATA_TYPE_PARAMETER is 0.
Referring to AOMediaCodec/iamf#331, in my opinion the relative_offset of the parameter should be the same as the trimming-start value.

Issue 3: There is no edit list box in the mp4 file.
According to the spec, the number of trimmed samples should be taken from the IA sequence and converted to the sample rate used by the elst boxes.

Issue 4: After decoding, the audio quality is poor compared with the original wav file.
I think this could be resolved by raising the bitrate setting; I guess you have set a low bitrate.

Individual issues:
Issue 5: In test_000015, the parameter_id values in parameter_block_metadata{} are 100 and 101, but in element_mix_config{} and element_mix_config{} both parameter IDs are 100, so they do not match.

Thanks.

Improve human_readable_description

I tried to understand what the tests were about. I opened this test proto and looked at the human_readable_description: https://github.com/AOMediaCodec/libiamf/blob/main/tests/test_000056.textproto#L5.
It says "A variant of test_000055". test_000055 pointed me to test_000054, which pointed me to test_000050 ... and, following the chain, I ended up at test_000004, which does not exist.

In general, each test should be understandable on its own; a list of keywords for the features it exercises might help.

[Cleanup test 000000-000056][test_000056_s.mp4] ctts box in stbl box is not helpful.

Currently, the IAMF specification assumes that PTS and DTS are the same, as with other ordinary audio codecs.
According to https://aomediacodec.github.io/iamf/#isobmff-singletrack-basicencapsulationscheme, the pre-skip value is stored in elst.media_time, not in a ctts box.
Although ctts is useful for ordinary video codecs, where PTS and DTS generally differ,
the ctts box in the stbl box is not helpful in an IAMF bitstream.
I think the ctts box may mislead our IAMF decoder,
so it should be omitted.

[test_000000_3.iamf] the samples of last frame obu.

The last frame OBU has only 64 samples, not 128.
From test_000000_3.textproto:
codec_config_metadata {
  codec_config_id: 200
  codec_config {
    codec_id: 0x6970636d # "ipcm"
    num_samples_per_frame: 128
    roll_distance: 0
    decoder_config_lpcm {
      sample_format_flags: LPCM_LITTLE_ENDIAN
      sample_size: 16
      sample_rate: 16000
    }
  }
}

[test_000051 ~ 56] the input bitstream range

The input bitstream for test_000051 to test_000056 seems to be part of the wav file audiolab-acoustic-guitar_2OA_470_ALLRAD_5s.wav.
Is it correct that the input bitstream has 239,722 samples while the wav file has 240,000 samples?
If so, could you describe in the textproto which part of the wav file matches the input bitstream?
Alternatively, you could provide the input bitstream as a separate wav file.

[Cleanup test 000000-000056] abnormal signal when use opus coding

#30

Hi,

I think that in the Opus cases the sample files contain strange signals.

Even though we trim as many samples as the pre-skip, there is additional strange data of pre-skip size.

If trim_at_start is not applied in the decoder, the strange signal is twice the size of the pre-skip.
Therefore, it seems that the tail of the input signal is not included.

(test_000020~000028 etc., the cases using Opus coding)

For example:
Input sample: 24000
num_samples_per_frame: 128

What I expected:
312(pre-skip) | 24000(input sample) | 8(padding for num_samples_per_frame)

What I guess:
312(??) + 312(pre-skip) | 23696(shorter than input)

Please check and correct it.
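
The two layouts above can be checked with a little arithmetic. The following is a minimal sketch, assuming the encoder prepends exactly pre-skip samples and pads the last frame up to a multiple of num_samples_per_frame; the function name is hypothetical and not part of libiamf.

```python
import math

def opus_frame_layout(input_samples, pre_skip, samples_per_frame):
    """Return (num_frames, padding) when pre_skip samples precede the input."""
    total = pre_skip + input_samples
    num_frames = math.ceil(total / samples_per_frame)
    padding = num_frames * samples_per_frame - total
    return num_frames, padding

# Values taken from the example in the report.
num_frames, padding = opus_frame_layout(24000, 312, 128)
encoded_total = num_frames * 128
print(num_frames, padding, encoded_total)

# Layout guessed in the report: two pre-skip-sized regions and no padding,
# which leaves fewer samples for the actual input signal.
guessed_audio = encoded_total - 2 * 312
print(guessed_audio)
```

Under these assumptions the expected layout needs 8 padding samples, and the guessed layout leaves 23696 samples of input within the same total, which matches the numbers quoted above.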

Mp4 container issues

I have found some issues with the updated mp4 files (#19).

  • Issue 1: in the edts box, the edit list has 2 entries:

Segment duration | Media time | Media rate integer | Media rate fraction
20               | 4294967295 | 1                  | 0
1481             | 312        | 1                  | 0

In my opinion, one entry with a media time of 312 would be enough. I am not sure what the first entry is for.

  • Issue 2: stts box.

I think the sample delta of the first entry should be 960, not 648 (I checked only one of the test cases, test_000026).
Even though the trimming start is 312, libopus encodes 960 samples.

  • Issue 3: missing last audio frame.

(Again checking only test_000026.) The total duration after decoding is not the same as that of the original wav file.
In my opinion, there are 312 pending samples that libopus has not output.
If the encoder outputs the pending samples, the sample delta of the last frame is 312 and the trimming end is 648.
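
As a side note on Issue 1, the value 4294967295 is what -1 looks like when a signed 32-bit field is printed as unsigned, and in ISOBMFF a media_time of -1 denotes an empty edit. The sketch below is one possible reading of the two entries, assuming a version 0 edit list (32-bit fields); the helper names are hypothetical.

```python
EMPTY_EDIT = -1  # a media_time of -1 marks an empty edit in ISOBMFF

def to_signed32(value):
    """Reinterpret an unsigned 32-bit field as a signed integer."""
    return value - (1 << 32) if value >= (1 << 31) else value

def describe_edits(entries):
    """Classify (segment_duration, media_time) pairs from an edit list."""
    described = []
    for segment_duration, media_time in entries:
        signed_time = to_signed32(media_time)
        kind = "empty edit" if signed_time == EMPTY_EDIT else "media edit"
        described.append((kind, segment_duration, signed_time))
    return described

# The two entries reported for the updated mp4 files.
entries = [(20, 4294967295), (1481, 312)]
for row in describe_edits(entries):
    print(row)
```

Read this way, the first entry would insert 20 units of empty presentation time before playback starts at media time 312. Whether that empty edit is intended is still a question for the authors of the files.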

Decoded wav files

Could you tell us how you got the decoded wav files in PR #19?
Can the decoded wav files be used as reference waves for test verification?

PR #30 [Cleanup test 000000-000056] duration of parameter_block_meta seems to be wrong

The duration of parameter_block_metadata should cover the pre-skip and the padding data of the last frame as well as the input wave samples. (PR #433)
However, the duration in the textproto of the test vectors seems to cover only the pre-skip and the input wave samples.

In the case of test_000049, the duration of parameter_block_metadata should be as follows:

The number of samples in the input: 644971
pre_skip: 312
num_samples_per_frame: 2880

The number of encoded samples:
312(pre_skip) + 644971 + 2717(padding for the last frame) = 648000 = 2880 x 225

Therefore, Parameter block duration = 648000
However, duration in textproto is 312(pre_skip) + 644971 = 645283

test_000002 and test_000036 (other tests including padding for the last frame) also have the same problem.

Please check it. Thanks.
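
The check described above could be automated along these lines. This is a minimal sketch, assuming the duration is expected to equal the whole number of encoded frames times num_samples_per_frame; the function names are hypothetical.

```python
def expected_duration(input_samples, pre_skip, samples_per_frame):
    """Duration covering pre-skip, input and last-frame padding."""
    total = pre_skip + input_samples
    num_frames = -(-total // samples_per_frame)  # ceiling division
    return num_frames * samples_per_frame

def missing_ticks(declared, input_samples, pre_skip, samples_per_frame):
    """How far the declared duration falls short of the expected one."""
    return expected_duration(input_samples, pre_skip, samples_per_frame) - declared

# Values reported for test_000049.
print(expected_duration(644971, 312, 2880))
print(missing_ticks(645283, 644971, 312, 2880))
```

A non-zero result from missing_ticks would flag a textproto whose duration omits the padding of the last frame.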

Missing files(test_000041)

In the case of test_000041, there is only a textproto file and no iamf or mp4 file.

Also, file_name_prefix in test_vector_metadata is set to test_000005, so please check and correct it.
Thanks.

Test vectors review issues

1. All parameter types should have a unique OBU ID; as far as I know, this was discussed at the last weekly meeting.
2. The duration of parameters may need to refer to AOMediaCodec/iamf#366.
In my opinion, setting 'obu_redundant_copy' to 1 should not be allowed during mp4 encapsulation; otherwise we cannot seek.
3. test 12:
In the stts box of test_000012_s.mp4, the last audio packet has 62 samples and the trimming end is 2.
So, after decoding, the total will be 124*64 + 62 = 7998 samples, which does not match the number of samples in the original file (sawtooth_100_stereo.wav).

Thanks.

[test_000014] obu_id

I remember that you used all obu_id:0 before changing to 100, 200, etc.
It seems that the modification is not reflected.

If you don't have any other intention, please fix it so that it's the same as other cases.

Review test_000049, test_000050, test_000052~test_000056

Hello ~

Issue 1: There are no demix mode parameters or recon gain parameters in the bitstream; it seems that the encoder does not downmix channels as described in https://aomediacodec.github.io/iamf/#iamfgeneration-scalablechannelaudio

Issue 2: The L&R channels are switched with the SL&SR channels.
For example, in test_000049.textproto:
audio_frame_metadata {
  ...
  substream_id_ordering: [0, 1, 2, 3] # L/R, Ls/Rs, C, LFE
  ...
}
The channel order should be Ls/Rs, L/R, C, LFE;
please refer to figure 21 in https://aomediacodec.github.io/iamf/#iamfgeneration-scalablechannelaudio-channelgroupgenerationrule

Reference wav files

Hi

There are no reference files for test_000030 (sawtooth_10000_stereo_44100hz_s16le.wav) and test_000031 (sawtooth_10000_stereo_48khz_s24le.wav).

Please upload these files.

[test_000048] parameter_rate value seems to be strange.

In test_000048, 16000 is used as the parameter_rate value of the mix_gain parameter.

However, the duration value of parameter_block_metadata seems to be set based on 48000
(because the input file length is 24000 samples and the duration is 23688).
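
To make the mismatch concrete, a sample count can be converted into ticks of parameter_rate. This is a minimal sketch, assuming the audio sample rate is 48000 Hz (as the report implies) and that the duration is expressed in ticks of parameter_rate; the function name is hypothetical.

```python
from fractions import Fraction

def samples_to_ticks(num_samples, sample_rate, parameter_rate):
    """Convert a sample count into ticks at parameter_rate, exactly."""
    return Fraction(num_samples * parameter_rate, sample_rate)

# The duration found in the textproto, read as a count of 48 kHz samples.
print(samples_to_ticks(23688, 48000, 16000))  # ticks if parameter_rate is 16000
print(samples_to_ticks(23688, 48000, 48000))  # ticks if parameter_rate is 48000
```

Under these assumptions, a duration of 23688 is consistent with a parameter_rate of 48000, while a parameter_rate of 16000 would give 7896 ticks for the same span.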

If this is intentional, please let us know.

I also have a personal question: do you use the textproto file as an input option, extract the values from the encoder, or write it manually (maybe not)?
I wonder how you are using protocol buffers.
Thanks :)

Generate list of test report

It would be good to generate an HTML report listing all the tests (filename, human description, is valid or not...) maybe in the form of a table. That report could be generated automatically whenever a PR is merged and published on GitHub pages. That would offer a synthesis of what is tested.
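
A report of this kind could start from something as small as the following. This is a minimal sketch, assuming each test has already been summarized into a dictionary; the field names and the sample entries are hypothetical, and reading them from the textproto files is left out.

```python
from html import escape

def render_report(tests):
    """Render a list of test summaries as a plain HTML table."""
    header = "<tr><th>File</th><th>Description</th><th>Valid</th></tr>"
    rows = [header]
    for test in tests:
        rows.append(
            "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
                escape(test["filename"]),
                escape(test["description"]),
                "yes" if test["is_valid"] else "no",
            )
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"

# Hypothetical entries used only to exercise the function.
tests = [
    {"filename": "example_a.textproto", "description": "first example", "is_valid": True},
    {"filename": "example_b.textproto", "description": "a < b example", "is_valid": False},
]
report = render_report(tests)
print(report)
```

The resulting string could be written to an index.html file by whatever job publishes to GitHub Pages.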

test_000022.iamf

Hello, I'm Yongmin.

test_000022 is meant to be an example of an incorrect roll_distance: the roll_distance value is -5 with a num_samples_per_frame of 960.

According to Section 4.6 of RFC 7845 (https://www.rfc-editor.org/rfc/rfc7845#page-11), decoding should start at least 3840 samples before the target, and this case seems to satisfy that condition (960 * 5 = 4800 samples).

Please check this issue.
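
The arithmetic behind this can be sketched as follows, assuming the only constraint is the 3840-sample pre-roll from RFC 7845. Whether the IAMF specification requires an exact roll_distance rather than a minimum is not checked here, and the function names are hypothetical.

```python
import math

OPUS_MIN_PREROLL_SAMPLES = 3840  # 80 ms at 48 kHz, RFC 7845 Section 4.6

def min_preroll_frames(samples_per_frame):
    """Smallest number of frames that covers the recommended pre-roll."""
    return math.ceil(OPUS_MIN_PREROLL_SAMPLES / samples_per_frame)

def preroll_samples(roll_distance, samples_per_frame):
    """Samples covered by a (negative) roll_distance."""
    return -roll_distance * samples_per_frame

print(min_preroll_frames(960))
print(preroll_samples(-5, 960))
print(preroll_samples(-5, 960) >= OPUS_MIN_PREROLL_SAMPLES)
```

With 960 samples per frame, 4 frames are enough to cover 3840 samples, so a roll_distance of -5 covers more than the minimum.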

Compute metadata overhead

We can compare the size of the standalone iamf streams to the component bitstreams (or their bitrates) to compute the overall metadata overhead.
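
One way to express that comparison is shown below. This is a minimal sketch, assuming the sizes are given in bytes and that the component bitstreams account for all coded audio in the IAMF file; the function name and the sizes are hypothetical.

```python
def metadata_overhead(iamf_size, component_sizes):
    """Return (overhead_bytes, overhead_ratio) for one IAMF stream."""
    payload = sum(component_sizes)
    overhead = iamf_size - payload
    return overhead, overhead / iamf_size

# Hypothetical sizes in bytes, for illustration only.
overhead_bytes, overhead_ratio = metadata_overhead(105000, [60000, 40000])
print(overhead_bytes, round(overhead_ratio * 100, 2))
```

In practice the sizes could come from os.path.getsize on the .iamf file and on each component bitstream.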
