summaryrefslogtreecommitdiff
path: root/doc/isoflac.txt
blob: 46a218d0177f27b40c234e815b484cb6281be7e8 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
Encapsulation of FLAC in ISO Base Media File Format
Version 0.0.1 (draft)

Table of Contents
1 Scope
2 Supproting Normative References
3 Terms and Definitions
4 Design Rules of Encapsulation
  4.1 File Type Indentification
  4.2 Overview of Track Structure
  4.3 Definition of FLAC sample
        4.3.1 Sample entry format
        4.3.2 FLAC Specific Box
        4.3.3 Sample format
        4.3.4 Duration of FLAC sample
        4.3.5 Sub-sample
        4.3.6 Random Access
            4.3.6.1 Random Access Point
   4.4 Basic Structure (informative)
        4.4.1 Initial Movie
   4.5 Example of Encapsulation (informative)
5 Acknowledgements
6 Author's Address

1 Scope

    This document specifies the normative mapping for encapsulation of
    FLAC coded audio bitstreams in ISO Base Media file format and its
    derivatives. The encapsulation of FLAC coded bitstreams in
    QuickTime file format is outside the scope of this specification.

2 Supporting Normative References

    [1] ISO/IEC 14496-12:2012 Corrected version

        Information technology — Coding of audio-visual objects — Part
        12: ISO base media file format

    [2] ISO/IEC 14496-12:2012/Amd.1:2013

        Information technology — Coding of audio-visual objects — Part
        12: ISO base media file format AMENDMENT 1: Various
        enhancements including support for large metadata

    [3] FLAC format specification

        https://xiph.org/flac/format.html

	Definition of the FLAC Audio Codec stream format

    [4] FLAC-in-Ogg mapping specification

    	https://xiph.org/flac/ogg_mapping.html

	Ogg Encapsulation for the FLAC Audio Codec

    [5] Matroska specification

3 Terms and Definitions

    3.1 active track

        enabled track from the non-alternate group or selected track
        from alternate group

    3.2 edit

        entry in the Edit List Box

    3.3 sample-accurate

        for any PCM sample, a timestamp exactly matching its sampling
        timestamp is present in the media timeline.


4 Design Rules of Encapsulation

    4.1 File Type Indentification
    
        This specification does not define any brand to declare files
        are conformant to this specification.  Files conformant to
        this specification shall contain at least one brand which
        supports the requirements and the requirements described in
        this clause without contradiction in the compatible brands
        list of the File Type Box.  The minimal support of the
        encapsulation of FLAC bitstreams in ISO Base Media file format
        requires the 'isom' brand.
	
    4.2 Overview of Track Structure

        FLAC coded audio shall be encapsulated into the ISO Base
        Media File Format as media data within an audio track.
	
            + The handler_type field in the Handler Reference Box
              shall be set to 'soun'.
	    
            + The Media Information Box shall contain the Sound Media
              Header Box.
	    
            + The codingname of the sample entry is 'fLaC'.
	    
                This specification does not define any encapsulation
                using MP4AudioSampleEntry with objectTypeIndication
                specified by the MPEG-4 Registration Authority
                (http://www.mp4ra.org/).  See section 'Sample entry
                format' for the definition of the the sample entry.
		
            + The 'dfLa' box is added to the sample entry to convey
                initializing information for the decoder.

		See section 'FLAC Specific Box' for the definition of
                the box contents.
		
            + A FLAC sample is exactly one FLAC packet.  See section
                'Sample format' for details of the packet contents.
		
            + Every FLAC sample is a sync sample.  No pre-roll or
                lapping is required.  See section 'Random Access' for
                further details.

    4.3 Definition of a FLAC sample
    
        4.3.1 Sample entry format

	    For any track containing one or more FLAC bitstreams, a
            sample entry describing the corresponding FLAC bitstream
            shall be present inside the Sample Table Box. This version
            of the specification defines only one sample entry format
            named FLACSampleEntry whose codingname is 'fLaC'. This
            sample entry includes exactly one FLAC Specific Box
            defined in section 'FLAC specific box' as a mandatory box
            and indicates that FLAC samples described by this sample
            entry are stored by the sample format described in section
            'Sample format'.

            The syntax and semantics of the FLACSampleEntry is shown
            as follows.  The data fields of this box and native
            FLAC[3] structures encoded within FLAC blocks are both
            stored in big-endian format, though for purposes of the
            ISO BMFF container, FLAC native metadata and data blocks
            are treated as unstructured octet streams.
	    
            class FLACSampleEntry() extends AudioSampleEntry ('fLaC'){
                FLACSpecificBox();
            }

            + channelcount:

	        The channelcount field shall be set equal to the
	        channel count specified by the FLAC bitstream's native
	        METADATA_BLOCK_STREAMINFO header as described in [3].
	        Note that the FLAC FRAME_HEADER structure that begins
	        each FLAC sample redundantly encodes channel number;
	        the number of channels declared in each FRAME_HEADER
	        MUST match the number of channels declared here and in
	        the METADATA_BLOCK_STREAMINFO header.

            + samplesize:

	        The samplesize field shall be set equal to the bits
	        per sample specified by the FLAC bitstream's native
	        METADATA_BLOCK_STREAMINFO header as described in [3].
	        Note that the FLAC FRAME_HEADER structure that begins
	        each FLAC sample redundantly encodes the number of
	        bits per sample; the bits per sample declared in each
	        FRAME_HEADER MUST match the samplesize declared here
	        and the bits per sample field declared in the
	        METADATA_BLOCK_STREAMINFO header.
		
            + samplerate:

                The samplerate field shall be set equal to the sample
                rate specified by the FLAC bitstream's native
                METADATA_BLOCK_STREAMINFO header as described in [3],
                left-shifted by 16 bits.  Note that the FLAC
                FRAME_HEADER structure that begins each FLAC sample
                redundantly encodes the sample rate; the sample rate
                declared in each FRAME_HEADER MUST match the sample
                rate declared here and in the
                METADATA_BLOCK_STREAMINFO header.
		
            + FLACSpecificBox
	    
                This box contains initializing information for the
                decoder as defined in section 'FLAC specific box'

        4.3.2 FLAC Specific Box
	
            Exactly one FLAC Specific Box shall be present in each
            FLACSampleEntry.  The FLAC Specific Box contains the
            Version field and this specification defines version 0 of
            this box.  If incompatible changes occur in the fields
            after the Version field within the FLACSpecificBox in the
            future versions of this specification, another version
            will be defined.  The data fields of this box and native
            FLAC[3] structures encoded within FLAC blocks are both
            stored in big-endian format, though for purposes of the
            ISO BMFF container, FLAC native metadata and data blocks
            are treated as unstructured octet streams.

            The syntax and semantics of the FLAC Specific Box is shown
            as follows.

            aligned(8) class FLACMetadataBlock {
	    	unsigned int(1) LastMetadataBlockFlag;
	    	unsigned int(7) BlockType;
		unsigned int(24) Length;
                unsigned int(8) MetadataBlockData[BlockLength];
            }

	    aligned(8) class FLACSpecificBox
	          (unsigned int32 MetadataBlocks) extends Box('dfLa'){
                unsigned int(8) Version;
		for(i=0; i <= MetadataBlocks; i++){
	            FLACMetadataBlock();
		}
            }

	    + MetadataBlocks:

	        The number of FLAC[3] native metadata blocks to
		follow.  This value must be at least 1 as a native
		METADATA_BLOCK_STREAMINFO structure is required to
		decode FLAC audio data.

		This value is not coded as the end of the
		FLACMetadataBlock list can be intuited from the
		LastMetadataBlockFlag (see below).

            + Version:

		The Version field shall be set to 0.

		In the future versions of this specification, this
                field may be set to other values. And without support
                of those values, the reader shall not read the fields
                after this within the FLACSpecificBox.

	    The Version field is followed by a sequence of FLAC[3]
	    native-metadata block structures that fill the remainder
	    of the box length.

	    + LastMetadataBlockFlag:

	        The LastMetadataBlockFlag field maps semantically to
	        the FLAC[3] native MEATADATA_BLOCK_HEADER
	        Last-metadata-block flag as defined in the FLAC[3]
	        file specification.
		
	        The LastMetadataBlockFlag is set to 1 if this
	        MetadataBlock is the last metadata block in the
	        FLACSpecificBox.  It is set to 0 otherwise.
	       
	    + BlockType:

	        The BlockType field maps semantically to the FLAC[3]
	        native MEATADATA_BLOCK_HEADER BLOCK_TYPE field as
	        defined in the FLAC[3] file specification.

		The BlockType is set to a valid FLAC[3] BLOCK_TYPE
 	        value that identifies the type of this native metadata
 	        block.  The BlockType of the first FLACMetadataBlock
 	        must be set to 0, signifying this is a FLAC[3] native
 	        METADATA_BLOCK_STREAMINFO block.
	       
	    + Length:

	        The Length field maps semantically to the FLAC[3]
	        native MEATADATA_BLOCK_HEADER Length field as
	        defined in the FLAC[3] file specification.

		The length field specifies the number of bytes of
		MetadataBlockData to follow.

	    + MetadataBlockData

	        The MetadataBlockData field maps semantically to the
	        FLAC[3] native MEATADATA_BLOCK_HEADER
	        METADATA_BLOCKDATA as defined in the FLAC[3] file
	        specification.

	    The FLACMetadataBlock structure consists of three fields
	    filling a total of four bytes that form a FLAC[3] native
	    METADATA_BLOCK_HEADER, followed by raw octet bytes that
	    comprise the FLAC[3] native METADATA_BLOCK_DATA.  Taken
	    together, the bytes of the FLACMetadataBlock form a
	    complete FLAC[3] native METADATA_BLOCK structure.

	    Note that a minimum of a single FLACMetadataBlock,
	    consisting of a FLAC[3] native METADATA_BLOCK_STREAMINFO
	    structure, is required.  Should the FLACSpecificBox
	    contain more than a single FLACMetadataBlock structure,
	    the FLACMetadataBlock contianing the FLAC[3] native
	    METADATA_BLOCK_STREAMINFO must occur first in the list.

	    Other containers that package FLAC audio streams, such as
	    Ogg[4] and Matroska[5], wrap FLAC[3] native metadata without
	    modification similar to this specification.  When
	    repackaging or remuxing FLAC[3] streams from another
	    format that contains FLAC[3] native metadata into an ISO
	    BMFF file, the complete FLAC[3] native metadata should be
	    preserved in the ISO BMFF stream as described above.  It
	    is also allowed to parse this native metadata and include
	    contextually redundant ISO BMFF-native repackagings and/or
	    reparsings of FLAC[3] native metadata, so long as the
	    native metadata is also preserved.

        4.3.3 Sample format
	
            A FLAC sample is exactly one FLAC audio FRAME packet (as
            defined in the FLAC[3] file specification) belonging to a
            FLAC bitstreams. The FLAC sample data begins with a
            complete FLAC FRAME_HEADER, followed by one FLAC SUBFRAME
            per channel, any necessary bit padding, and ends with the
            usual FLAC FRAME_FOOTER.

	    Note that the FLAC native FRAME_HEADER structure that
	    begins each FLAC sample redundantly encodes channel count,
	    sample rate, and sample size.  The values of these fields
	    must agree both with the values declared in the FLAC
	    METADATA_BLOCK_STREAMINFO structure as well as the
	    FLACDSampleEntry box.

        4.3.4 Duration of a FLAC sample

	    The duration of any given FLAC sample is determined by
            dividing the decoded block size of a FLAC frame, as
            encoded in the FLAC FRAME's FRAME_HEADER structure, by the
            value of the timescale field in the Media Header Box.
            FLAC samples are permitted to have variable durations
            within a given audio stream.  FLAC does not use padding
            values.

        4.3.5 Sub-sample

            Sub-samples are not defined for FLAC samples in this
            specification.

        4.3.6 Random Access
	
            This subclause describes the nature of the random access of FLAC sample.

            4.3.6.1 Random Access Point
	    
                All FLAC samples can be independently decoded
                i.e. every FLAC sample is a sync sample. The Sync
                Sample Box shall not be present as long as there are
                no samples other than FLAC samples in the same
                track. The sample_is_non_sync_sample field for FLAC
                samples shall be set to 0.

    4.4 Basic Structure (informative)

        4.4.1 Initial Movie
	
            This subclause shows a basic structure of the Movie Box as follows:

            +----+----+----+----+----+----+----+----+------------------------------+
            |moov|    |    |    |    |    |    |    | Movie Box                    |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |mvhd|    |    |    |    |    |    | Movie Header Box             |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |trak|    |    |    |    |    |    | Track Box                    |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |tkhd|    |    |    |    |    | Track Header Box             |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |edts|*   |    |    |    |    | Edit Box                     |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |elst|*   |    |    |    | Edit List Box                |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |mdia|    |    |    |    |    | Media Box                    |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |mdhd|    |    |    |    | Media Header Box             |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |hdlr|    |    |    |    | Handler Reference Box        |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |minf|    |    |    |    | Media Information Box        |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |    |smhd|    |    |    | Sound Media Header Box       |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |    |dinf|    |    |    | Data Information Box         |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |    |    |dref|    |    | Data Reference Box           |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |    |    |    |url |    | DataEntryUrlBox              |
            +----+----+----+----+----+----+ or +----+------------------------------+
            |    |    |    |    |    |    |urn |    | DataEntryUrnBox              |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |    |stbl|    |    |    | Sample Table                 |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |    |    |stsd|    |    | Sample Description Box       |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |    |    |    |fLaC|    | FLACSampleEntry              |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |    |    |    |    |dfLa| FLAC Specific Box            |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |    |    |stts|    |    | Decoding Time to Sample Box  |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |    |    |stsc|    |    | Sample To Chunk Box          |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |    |    |stsz|    |    | Sample Size Box              |
            +----+----+----+----+----+ or +----+----+------------------------------+
            |    |    |    |    |    |stz2|    |    | Compact Sample Size Box      |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |    |    |    |stco|    |    | Chunk Offset Box             |
            +----+----+----+----+----+ or +----+----+------------------------------+
            |    |    |    |    |    |co64|    |    | Chunk Large Offset Box       |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |mvex|*   |    |    |    |    |    | Movie Extends Box            |
            +----+----+----+----+----+----+----+----+------------------------------+
            |    |    |trex|*   |    |    |    |    | Track Extends Box            |
            +----+----+----+----+----+----+----+----+------------------------------+

                    Figure 1 - Basic structure of Movie Box

            It is strongly recommended that the order of boxes should
            follow the above structure.  Boxes marked with an asterisk
            (*) may or may not be present depending on context.  For
            most boxes listed above, the definition is as is defined
            in ISO/IEC 14496-12 [1]. The additional boxes and the
            additional requirements, restrictions and recommendations
            to the other boxes are described in this specification.
	    
    4.5 Example of Encapsulation (informative)
        [File]
            size = 17790
            [ftyp: File Type Box]
                position = 0
                size = 24
                major_brand = mp42 : MP4 version 2
                minor_version = 0
                compatible_brands
                    brand[0] = mp42 : MP4 version 2
                    brand[1] = isom : ISO Base Media file format
            [moov: Movie Box]
                position = 24
                size = 757
                [mvhd: Movie Header Box]
                    position = 32
                    size = 108
                    version = 0
                    flags = 0x000000
                    creation_time = UTC 2014/12/12, 18:41:19
                    modification_time = UTC 2014/12/12, 18:41:19
                    timescale = 48000
                    duration = 33600 (00:00:00.700)
                    rate = 1.000000
                    volume = 1.000000
                    reserved = 0x0000
                    reserved = 0x00000000
                    reserved = 0x00000000
                    transformation matrix
                        | a, b, u |   | 1.000000, 0.000000, 0.000000 |
                        | c, d, v | = | 0.000000, 1.000000, 0.000000 |
                        | x, y, w |   | 0.000000, 0.000000, 1.000000 |
                    pre_defined = 0x00000000
                    pre_defined = 0x00000000
                    pre_defined = 0x00000000
                    pre_defined = 0x00000000
                    pre_defined = 0x00000000
                    pre_defined = 0x00000000
                    next_track_ID = 2
                [iods: Object Descriptor Box]
                    position = 140
                    size = 33
                    version = 0
                    flags = 0x000000
                    [tag = 0x10: MP4_IOD]
                        expandableClassSize = 16
                        ObjectDescriptorID = 1
                        URL_Flag = 0
                        includeInlineProfileLevelFlag = 0
                        reserved = 0xf
                        ODProfileLevelIndication = 0xff
                        sceneProfileLevelIndication = 0xff
                        audioProfileLevelIndication = 0xfe
                        visualProfileLevelIndication = 0xff
                        graphicsProfileLevelIndication = 0xff
                        [tag = 0x0e: ES_ID_Inc]
                            expandableClassSize = 4
                            Track_ID = 1
                [trak: Track Box]
                    position = 173
                    size = 608
                    [tkhd: Track Header Box]
                        position = 181
                        size = 92
                        version = 0
                        flags = 0x000007
                            Track enabled
                            Track in movie
                            Track in preview
                        creation_time = UTC 2014/12/12, 18:41:19
                        modification_time = UTC 2014/12/12, 18:41:19
                        track_ID = 1
                        reserved = 0x00000000
                        duration = 33600 (00:00:00.700)
                        reserved = 0x00000000
                        reserved = 0x00000000
                        layer = 0
                        alternate_group = 0
                        volume = 1.000000
                        reserved = 0x0000
                        transformation matrix
                            | a, b, u |   | 1.000000, 0.000000, 0.000000 |
                            | c, d, v | = | 0.000000, 1.000000, 0.000000 |
                            | x, y, w |   | 0.000000, 0.000000, 1.000000 |
                        width = 0.000000
                        height = 0.000000
                    [mdia: Media Box]
                        position = 273
                        size = 472
                        [mdhd: Media Header Box]
                            position = 281
                            size = 32
                            version = 0
                            flags = 0x000000
                            creation_time = UTC 2014/12/12, 18:41:19
                            modification_time = UTC 2014/12/12, 18:41:19
                            timescale = 48000
                            duration = 34560 (00:00:00.720)
                            language = und
                            pre_defined = 0x0000
                        [hdlr: Handler Reference Box]
                            position = 313
                            size = 51
                            version = 0
                            flags = 0x000000
                            pre_defined = 0x00000000
                            handler_type = soun
                            reserved = 0x00000000
                            reserved = 0x00000000
                            reserved = 0x00000000
                            name = Xiph Audio Handler
                        [minf: Media Information Box]
                            position = 364
                            size = 381
                            [smhd: Sound Media Header Box]
                                position = 372
                                size = 16
                                version = 0
                                flags = 0x000000
                                balance = 0.000000
                                reserved = 0x0000
                            [dinf: Data Information Box]
                                position = 388
                                size = 36
                                [dref: Data Reference Box]
                                    position = 396
                                    size = 28
                                    version = 0
                                    flags = 0x000000
                                    entry_count = 1
                                    [url : Data Entry Url Box]
                                        position = 412
                                        size = 12
                                        version = 0
                                        flags = 0x000001
                                        location = in the same file
                            [stbl: Sample Table Box]
                                position = 424
                                size = 321
                                [stsd: Sample Description Box]
                                    position = 432
                                    size = 79
                                    version = 0
                                    flags = 0x000000
                                    entry_count = 1
                                    [fLaC: Audio Description]
                                        position = 448
                                        size = 63
                                        reserved = 0x000000000000
                                        data_reference_index = 1
                                        reserved = 0x0000
                                        reserved = 0x0000
                                        reserved = 0x00000000
                                        channelcount = 2
                                        samplesize = 16
                                        pre_defined = 0
                                        reserved = 0
                                        samplerate = 48000.000000
                                        [dfLa: FLAC Specific Box]
                                            position = 484
                                            size = 48
                                            Version = 0
					    MetadataBlocks = 1
					    LastMetadataBlockFlag = 1
					    BlockType = 0
					    Length = 34
					    MetadataBlockData[34];
                                [stts: Decoding Time to Sample Box]
                                    position = 490
                                    size = 24
                                    version = 0
                                    flags = 0x000000
                                    entry_count = 1
                                    entry[0]
                                        sample_count = 18
                                        sample_delta = 1920
                                [stsc: Sample To Chunk Box]
                                    position = 514
                                    size = 40
                                    version = 0
                                    flags = 0x000000
                                    entry_count = 2
                                    entry[0]
                                        first_chunk = 1
                                        samples_per_chunk = 13
                                        sample_description_index = 1
                                    entry[1]
                                        first_chunk = 2
                                        samples_per_chunk = 5
                                        sample_description_index = 1
                                [stsz: Sample Size Box]
                                    position = 554
                                    size = 92
                                    version = 0
                                    flags = 0x000000
                                    sample_size = 0 (variable)
                                    sample_count = 18
                                    entry_size[0] = 977
                                    entry_size[1] = 938
                                    entry_size[2] = 939
                                    entry_size[3] = 938
                                    entry_size[4] = 934
                                    entry_size[5] = 945
                                    entry_size[6] = 948
                                    entry_size[7] = 956
                                    entry_size[8] = 955
                                    entry_size[9] = 930
                                    entry_size[10] = 933
                                    entry_size[11] = 934
                                    entry_size[12] = 972
                                    entry_size[13] = 977
                                    entry_size[14] = 958
                                    entry_size[15] = 949
                                    entry_size[16] = 962
                                    entry_size[17] = 848
                                [stco: Chunk Offset Box]
                                    position = 646
                                    size = 24
                                    version = 0
                                    flags = 0x000000
                                    entry_count = 2
                                    chunk_offset[0] = 686
                                    chunk_offset[1] = 12985
            [free: Free Space Box]
                position = 670
                size = 8
            [mdat: Media Data Box]
                position = 678
                size = 17001

5 Acknowledgements

    This spec draws heavily from the Opus-in-ISOBMFF specification
    work done by Yusuke Nakamura <muken.the.vfrmaniac |at| gmail.com>

6 Authors' Address

    Monty Montgomery <cmontgomery@mozilla.com>