author Jean-Marc Valin <jmvalin@jmvalin.ca> 2016-06-30 18:01:04 -0400
committer Jean-Marc Valin <jmvalin@jmvalin.ca> 2016-06-30 18:01:04 -0400
commit 4a4bc08031cf3c909a568d47e007131085b5d350
tree 256d87df22f422c9c13b1ac16656b58139c640b5 /doc
parent d6642d694333309a9676499c1d7910953290e5ed
Adding hybrid folding section and new testvectors to the update draft
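
The index arithmetic behind the hybrid folding change in this commit can be tried in isolation. The following is a minimal sketch, not the libopus code: `norm_t`, `hybrid_edges` and `extend_folding_data` are hypothetical names, `memcpy` stands in for the `OPUS_COPY` macro, and the edge values are assumptions chosen to match the band edges quoted in the draft text (8, 9.6 and 12 kHz, in 200 Hz units). The `count <= 0` guard and the `assert` are additions for the sketch and do not appear in the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the codec's normalized-coefficient type. */
typedef float norm_t;

/* Assumed band edges in 200 Hz units for M == 1, matching the edges
   quoted in the draft text: 8 kHz, 9.6 kHz and 12 kHz. */
static const int hybrid_edges[3] = { 40, 48, 60 };

/* Duplicate the tail of the first coded band right after it, so that the
   second (wider) band has as many folding samples as it has bins.
     norm  : folding buffer indexed by absolute bin
     edges : start of band 1, start of band 2, end of band 2
     M     : frame-size multiplier
   Returns the number of samples copied. */
static int extend_folding_data(norm_t *norm, const int *edges, int M)
{
    int n1 = M * (edges[1] - edges[0]);  /* width of the first band  */
    int n2 = M * (edges[2] - edges[1]);  /* width of the second band */
    int offset = M * edges[0];           /* first bin of the first band */
    int count = n2 - n1;                 /* samples still missing */

    if (count <= 0)
        return 0;                        /* equal widths: nothing to copy */

    /* The source region must lie inside the first band. */
    assert(2 * n1 >= n2);

    /* Source [offset+2*n1-n2, offset+n1) and destination
       [offset+n1, offset+n2) are adjacent, so they never overlap. */
    memcpy(&norm[offset + n1], &norm[offset + 2 * n1 - n2],
           (size_t)count * sizeof(norm_t));
    return count;
}
```

With the assumed edges and M == 1, n1 is 8 and n2 is 12, so the last 4 samples of the first band (bins 44 to 47) are repeated at bins 48 to 51, giving the 12 samples the second band needs. When the first two bands have equal width, as the patch comment indicates for CELT-only mode, nothing is copied.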
Diffstat (limited to 'doc')
-rw-r--r-- doc/draft-ietf-codec-opus-update.xml 123
1 files changed, 110 insertions, 13 deletions
diff --git a/doc/draft-ietf-codec-opus-update.xml b/doc/draft-ietf-codec-opus-update.xml
index 74147221..cace9680 100644
--- a/doc/draft-ietf-codec-opus-update.xml
+++ b/doc/draft-ietf-codec-opus-update.xml
@@ -10,7 +10,7 @@
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
-<rfc category="std" docName="draft-ietf-codec-opus-update-01"
+<rfc category="std" docName="draft-ietf-codec-opus-update-02"
ipr="trust200902">
<front>
<title abbrev="Opus Update">Updates to the Opus Audio Codec</title>
@@ -47,7 +47,7 @@
- <date day="4" month="September" year="2014" />
+ <date day="1" month="July" year="2016" />
<abstract>
<t>This document addresses minor issues that were found in the specification
@@ -79,8 +79,9 @@
during a mode switch. The old stereo memory can produce a brief impulse
(i.e. single sample) in the decoded audio. This can be fixed by changing
silk/dec_API.c at line 72:
- <figure>
- <artwork><![CDATA[
+ </t>
+<figure>
+<artwork><![CDATA[
for( n = 0; n < DECODER_NUM_CHANNELS; n++ ) {
ret = silk_init_decoder( &channel_state[ n ] );
}
@@ -93,11 +94,9 @@
}
]]></artwork>
</figure>
- This change affects the normative part of the decoder. Fortunately,
- the modified decoder is still compliant with the original specification because
- it still easily passes the testvectors. For example, for the float decoder
- at 48 kHz, the opus_compare (arbitrary) "quality score" changes from
- from 99.9333% to 99.925%.
+ <t>
+ This change affects the normative part of the decoder, although the
+ amount of change is too small to make a significant impact on testvectors.
</t>
</section>
@@ -107,8 +106,9 @@
This is due to an integer overflow if the signaled padding exceeds 2^31-1 bytes
(the actual packet may be smaller). The code can be fixed by applying the following
changes at line 596 of src/opus_decoder.c:
- <figure>
- <artwork><![CDATA[
+ </t>
+<figure>
+<artwork><![CDATA[
/* Padding flag is bit 6 */
if (ch&0x40)
{
@@ -126,7 +126,6 @@
}
]]></artwork>
</figure>
- </t>
<t>This packet parsing issue is limited to reading memory up
to about 60 kB beyond the compressed buffer. This can only be triggered
by a compressed packet more than about 16 MB long, so it's not a problem
@@ -158,6 +157,7 @@
was ever a problem. However, proving that is non-obvious.
</t>
<t>The code can be fixed by applying the following changes to line 70 of silk/resampler_private_IIR_FIR.c:
+ </t>
<figure>
<artwork><![CDATA[
)
@@ -214,6 +214,7 @@ RESAMPLER_ORDER_FIR_12 * sizeof( opus_int16 ) );
}
]]></artwork>
</figure>
+ <t>
Note: due to RFC formatting conventions, lines exceeding the column width
in the patch above are split using a backslash character. The backslashes
at the end of a line and the white space at the beginning
@@ -223,7 +224,7 @@ RESAMPLER_ORDER_FIR_12 * sizeof( opus_int16 ) );
</t>
</section>
- <section title="Downmix to Mono">
+ <section title="Downmix to Mono" anchor="stereo">
<t>The last issue is not strictly a bug, but it is an issue that has been reported
when downmixing an Opus decoded stream to mono, whether this is done inside the decoder
or as a post-processing step on the stereo decoder output. Opus intensity stereo allows
@@ -237,6 +238,102 @@ RESAMPLER_ORDER_FIR_12 * sizeof( opus_int16 ) );
outside of the decoder).
</t>
</section>
+
+ <section title="Hybrid Folding" anchor="folding">
+ <t>When encoding in hybrid mode at low bitrate, we sometimes only have
+ enough bits to code a single CELT band (8 - 9.6 kHz). When that happens,
+ the second band (CELT band 18, from 9.6 to 12 kHz) cannot use folding
+ because it is wider than the amount already coded, and falls back to
+ LCG noise. Because it can also happen on transients (e.g. stops), it
+ can cause audible pre-echo.
+ </t>
+ <t>
+ To address the issue, we change the folding behaviour so that it is
+ never forced to fall back to LCG due to not enough folding data. This
+ is achieved by simply repeating part of the first band in the folding
+ of the second band. This changes the code in celt/bands.c around line 237:
+ </t>
+<figure>
+<artwork><![CDATA[
+ b = 0;
+ }
+
+- if (resynth && M*eBands[i]-N >= M*eBands[start] && \
+(update_lowband || lowband_offset==0))
++ if (resynth && (M*eBands[i]-N >= M*eBands[start] || \
+i==start+1) && (update_lowband || lowband_offset==0))
+ lowband_offset = i;
+
++ if (i == start+1)
++ {
++ int n1, n2;
++ int offset;
++ n1 = M*(eBands[start+1]-eBands[start]);
++ n2 = M*(eBands[start+2]-eBands[start+1]);
++ offset = M*eBands[start];
++ /* Duplicate enough of the first band folding data to \
+be able to fold the second band.
++ Copies no data for CELT-only mode. */
++ OPUS_COPY(&norm[offset+n1], &norm[offset+2*n1 - n2], n2-n1);
++ if (C==2)
++ OPUS_COPY(&norm2[offset+n1], &norm2[offset+2*n1 - n2], \
+n2-n1);
++ }
++
+ tf_change = tf_res[i];
+ if (i>=m->effEBands)
+ {
+]]></artwork>
+</figure>
+
+ <t>
+ as well as line 260:
+ </t>
+
+<figure>
+<artwork><![CDATA[
+ fold_start = lowband_offset;
+ while(M*eBands[--fold_start] > effective_lowband);
+ fold_end = lowband_offset-1;
+- while(M*eBands[++fold_end] < effective_lowband+N);
+ while(++fold_end < i && M*eBands[fold_end] < \
+effective_lowband+N);
+ x_cm = y_cm = 0;
+ fold_i = fold_start; do {
+ x_cm |= collapse_masks[fold_i*C+0];
+
+]]></artwork>
+</figure>
+ <t>
+ The fix does not impact compatibility, because the improvement does
+ not depend on the encoder doing anything special. There is also no
+ reasonable way for an encoder to use the original behaviour to
+ improve quality over the proposed change.
+ </t>
+ </section>
+
+ <section title="New Test Vectors">
+ <t>Changes in <xref target="stereo"/> and <xref target="folding"/> have
+ sufficient impact on the testvectors to make them fail. For this reason,
+ this document also updates the Opus test vectors. The new test vectors now
+ include two decoded outputs for the same bitstream. The outputs with
+ suffix 'm' do not apply the CELT 180-degree phase shift as allowed in
+ <xref target="stereo"/>, while the outputs with suffix 's' do. An
+ implementation is compliant as long as it passes either the 'm' or the
+ 's' set of vectors.
+ </t>
+ <t>
+ In addition, any Opus implementation
+ that passes the original test vectors from <xref target="RFC6716">RFC 6716</xref>
+ is still compliant with the Opus specification. However, newer implementations
+ SHOULD be based on the new test vectors rather than the old ones.
+ </t>
+ <t>The new test vectors are located at
+ <eref target="https://jmvalin.ca/misc_stuff/opus_newvectors.tar.gz"/>. (EDITOR:
+ change link to ietf.org when ready).
+ </t>
+ </section>
+
<section anchor="IANA" title="IANA Considerations">
<t>This document makes no request of IANA.</t>