<?xml version="1.0" standalone="no"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"/usr/share/sgml/docbook/dtd/xml/4.1.2/docbookx.dtd">
<article id="index">

<articleinfo>
	<authorgroup>
		<corpauthor>
			<ulink url="http://www.freedesktop.org">
			X Desktop Group
			</ulink>
		</corpauthor>
		<author>
			<firstname>Thomas</firstname>
			<surname>Leonard</surname>
			<affiliation>
				<address><email>tal197@users.sf.net</email></address>
			</affiliation>
		</author>
	</authorgroup>

	<title>Shared MIME-info Database</title>
	<date>05 Sep 2002</date>
</articleinfo>

<sect1>
	<title>Introduction</title>
	<sect2>
		<title>Version</title>
		<para>
This is version 0.10 of the Shared MIME-info Database spec, last updated 05 Sep 2002.</para>
	</sect2>
	<sect2>
		<title>What is this spec?</title>
		<para>
Many programs and desktops use the MIME system<citation>MIME</citation>
to represent the types of files. Frequently, it is necessary to work out the
correct MIME type for a file. This is generally done by examining the file's
name or contents, and looking up the correct MIME type in a database.
		</para>
		<para>
It is also useful to store information about each type, such as a textual
description of it, or a list of applications that can be used to view or edit
files of that type.
		</para>
		<para>
For interoperability, it is useful for different programs to use the same
database so that different programs agree on the type of a file and
information is not duplicated. It is also helpful for application authors to
only have to install new information in one place.
		</para>
		<para>
This specification attempts to unify the MIME database systems currently in
use by GNOME<citation>GNOME</citation>, KDE<citation>KDE</citation> and
ROX<citation>ROX</citation>, and provide room for future extensibility.
		</para>
	</sect2>
	<sect2>
		<title>Language used in this specification</title>
		<para>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this document are to be
interpreted as described in RFC 2119.  
		</para>
	</sect2>
</sect1>
<sect1>
	<title>Overview of previous systems</title>
	<sect2>
		<title>KDE</title>
		<para>
KDE uses <filename>.desktop</filename> files with Type=MimeType, one file
per type, to determine a file's type from its name. The files are arranged in the
filesystem to mirror the two-level MIME type hierarchy.
The syntax is very similar to other <filename>.desktop</filename> files,
with Name=, Comment= etc.
		</para>
		<para>
Example file:
			<programlisting><![CDATA[
[Desktop Entry]
Encoding=UTF-8
MimeType=application/x-kword
Comment=KWord
Comment[af]=kword
[... etc. other translations ]
Icon=kword
Type=MimeType
Patterns=*.kwd;*.kwt;
X-KDE-AutoEmbed=false

[Property::X-KDE-NativeExtension]
Type=QString
Value=.kwd
]]></programlisting>
		</para>
		<para>
KDE does not have a separate system for specifying extension matches, but
uses case-sensitive glob patterns for everything.
		</para>
		<para>
A single file stores all the rules for recognising files by content. This
is almost identical to <citerefentry><refentrytitle>file</refentrytitle>
<manvolnum>1</manvolnum></citerefentry>'s <filename>magic.mime</filename>
database file, but without the encoding field.
		</para>
		<para>
The format is described in the file itself as follows:
		<programlisting><![CDATA[
# The format is 4-5 columns:
#    Column #1: byte number to begin checking from, ">" indicates continuation
#    Column #2: type of data to match
#    Column #3: contents of data to match
#    Column #4: MIME type of result
]]></programlisting>
		</para>
	</sect2>
	<sect2>
		<title>GNOME</title>
		<para>
GNOME uses the gnome-vfs library to determine the MIME type of a file.
This library loads name-to-type rules from files with a '.mime' extension
in a system-wide directory (set at install time), and merges them with those in the
user's directory. It loads textual descriptions for the types from
files in the same directories, ending with '.keys'. The file
<filename>gnome-vfs.mime</filename> in the system directory is always loaded
first (allowing everything else to override it). The file
<filename>user.mime</filename> in the user's directory is always loaded
last, making these settings take precedence over all others.
		</para>
		<para>
The format of the .mime files is described as follows:
			<programlisting>
# Mime types as provided by the GNOME libraries for GNOME.
#
# Applications can provide more mime types by installing other
# .mime files in the PREFIX/share/mime-info directory.
#
# The format of this file is:
#
# mime-type
#	ext[,prio]:	list of extensions for this mime-type
#	regex[,prio]:	a regular expression that matches the filename
#
# more than one ext: and regex: fields can be present.
#
# prio is the priority for the match, the default is 1. This is required
# to distinguish composed filenames, for example .gz has a priority of 1
# and .tar.gz has a priority of 2 (thus a file having the filename
# something.tar.gz will match the mime-type for tar.gz before the mime-type
# for .gz
#
# The values in this file are kept in alphabetical order for convenience.
# Please maintain this when adding new types. Also consider adding a
# human-readable description to gnome-vfs.keys when adding a new type here.
#
# Also do please not add illegal mime types, observe the mime standard when
# adding new types.
			</programlisting>
When looking up the type for a file, gnome-vfs looks first for an exact-case
match for the extension, then an all upper-case match, then an all lower-case
match. If no matches are found, or there is no '.' in the name, then the
regular expression matches are checked. It does this first for rules with
priority 2, then for those with priority 1. The modification time on the
<filename>mime-info</filename>
directories is used to detect changes.
		</para>
		<para>
The .keys files contain type-to-description rules, eg:
			<programlisting>
application/msword
	description=Microsoft Word document
	[de]description=Microsoft Word-Dokument
	...
			</programlisting>
Guidelines for writing descriptions can be found in the
<filename>mime-descriptions-guidelines.txt</filename> file.
		</para>
		<para>
The format for magic entries is defined as:
			<programlisting><![CDATA[
# The format of magic entries is:
#
#     offset_start[:offset_end] pattern_type pattern [&pattern_mask] type
#
# <offset_start> and <offset_end> are decimal numbers (file offsets).
#
# <pattern_type> is (byte | short | long | string | date | beshort |
#                    belong | bedate | leshort | lelong | ledate).
#
# <pattern> is an ASCII string with non-printable characters escaped
# as hex or octal escape sequences, and spaces and other important
# whitespace escaped with '\'.
#
# <pattern_mask> is a string of hex digits. The mask must be the same
# length as the pattern.
#
# <type> is a valid MIME type.
#
# Order magic patterns such that ambiguous ones (such as
# application/x-ms-dos-executable) are at the end of the list and
# therefore get applied last.
#
# Avoid rules that require a seek deep into the examined file. If you
# must, locate such rules at the end of the list so that they get
# applied last
#
# When designing new document formats, make them easily recognizable
# by defining a sufficiently unique magic pattern near the document
# start. A good pattern is at least four bytes long and contains one
# or two non-printable characters so that text files won't be
# misidentified.
]]></programlisting>
		</para>
	</sect2>
	<sect2>
		<title>ROX</title>
		<para>
Note that ROX is now using this specification. This section details the
previous implementation.
		</para>
		<para>
ROX searches <filename>MIME-info</filename> directories in
<envar>CHOICESPATH</envar> (<filename>~/Choices/MIME-info:/usr/local/share/Choices/MIME-info:/usr/share/Choices/MIME-info</filename> by
default). Files from earlier directories override those in later ones, but
the order within a directory is not specified.
		</para>
		<para>
The files are in the same format as GNOME, except:
			<itemizedlist>
				<listitem><para>
There are no .keys files, so files of all extensions are loaded.
				</para></listitem>
				<listitem><para>
The priority is ignored.
				</para></listitem>
				<listitem><para>
A case-sensitive match is tried first, then a lower-case match. No upper-case
match is tried.
				</para></listitem>
				<listitem><para>
Multiple extensions are allowed. Eg:
					<programlisting>
application/x-compressed-postscript
	ext: ps.gz eps.gz
					</programlisting>
				</para></listitem>
			</itemizedlist>
		</para>
		<para>
When looking up the type for a file, ROX starts with the first '.'
and tries a case-sensitive match of the remaining text against the extensions.
Then it tries again with the filename in lower-case. It then tries again
from the second '.', and so on. If no type is found, it tries the regular
expressions.
		</para>
		<para>
ROX has no rules for determining a file's type from its contents.
		</para>
	</sect2>
</sect1>



<sect1>
	<title>Unified system</title>
	<para>
In discussions about these systems, it was clear that the differences between
the databases were simply a result of them being separate, and not due to any
fundamental disagreements between developers. Everyone is keen to see them
merged.
	</para>
	<para>
This spec proposes:

		<itemizedlist>
			<listitem><para>
A standard way for applications to install new MIME related information.
			</para></listitem>
			<listitem><para>
A standard way of getting the MIME type for a file.
			</para></listitem>
			<listitem><para>
A standard way of getting information about a MIME type.
			</para></listitem>
			<listitem><para>
Standard locations for all the files, and methods of resolving conflicts.
			</para></listitem>
		</itemizedlist>
Further, the existing databases have been merged into a single package
<citation>SharedMIME</citation>.
	</para>
	<sect2 id="s2_layout">
		<title>Directory layout</title>
		<para>
There are two important requirements for the way the MIME database is stored:
			<itemizedlist>
				<listitem><para>
Applications must be able to extend the database in any way when they are installed,
to add both new rules for determining type, and new information about specific types.
				</para></listitem>
				<listitem><para>
It must be possible to install applications in /usr, /usr/local and the user's home directory
(in the normal Unix way) and have the MIME information used.
				</para></listitem>
			</itemizedlist>
		</para>
		<para>
The directories to be used to store the files in the database are:
			<itemizedlist>
				<listitem><para>
<filename>/usr/share/mime/</filename>
				</para></listitem>
				<listitem><para>
<filename>/usr/local/share/mime/</filename>
				</para></listitem>
				<listitem><para>
<filename>~/.mime/</filename>
				</para></listitem>
			</itemizedlist>
In the rest of this document, paths shown with the prefix
<filename>&lt;MIME&gt;</filename> indicate the files should be loaded from
all the directories listed above. For example, <quote>Load all the
<filename>&lt;MIME&gt;/text/html.xml</filename> files</quote> means to load
<filename>/usr/share/mime/text/html.xml</filename>,
<filename>/usr/local/share/mime/text/html.xml</filename>, and
<filename>~/.mime/text/html.xml</filename> (if they exist).
		</para>
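		<para>
As an illustration only, the <filename>&lt;MIME&gt;</filename> prefix can be expanded
with a few lines of Python. The function names and the <userinput>home</userinput>
parameter are hypothetical and not part of this specification:
			<programlisting><![CDATA[
import os

# Base directories from the list above, in the order they are listed.
MIME_DIRS = ["/usr/share/mime", "/usr/local/share/mime", "~/.mime"]

def expand_mime_path(relative_path, home=None):
    """Return every candidate location for a <MIME>/relative_path file."""
    paths = []
    for base in MIME_DIRS:
        if base.startswith("~"):
            # Replace the leading '~' with the user's home directory.
            base = (home or os.path.expanduser("~")) + base[1:]
        paths.append(base + "/" + relative_path)
    return paths

def existing_mime_files(relative_path):
    """Keep only the candidates that are present on this system."""
    return [p for p in expand_mime_path(relative_path) if os.path.isfile(p)]

print(expand_mime_path("text/html.xml", home="/home/alice"))
]]></programlisting>
		</para>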
		<para>
Each application that wishes to contribute to the MIME database will install a
single XML file, named after the application, into one of the three
<filename>&lt;MIME&gt;/packages/</filename> directories (depending on where the user requested
the application be installed). After installing, uninstalling or modifying this
file, the application MUST run the <command>update-mime-database</command> command,
which is provided by the freedesktop.org shared database<citation>SharedMIME</citation>.
		</para>
		<para>
<command>update-mime-database</command> is passed the <filename>mime</filename>
directory containing the <filename>packages</filename> subdirectory which was
modified as its only argument. It scans all the XML files in the <filename>packages</filename>
subdirectory, combines the information in them, and creates a number of output files.
		</para>
		<para>
Where the information from these files is conflicting, information from directories
lower in the list takes precedence.
Any file named <filename>Override.xml</filename> takes precedence over all other files in
the same <filename>packages</filename> directory. Tools which let the user edit the
database should edit the file <filename>~/.mime/packages/Override.xml</filename>.
		</para>
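		<para>
One possible way to apply the <filename>Override.xml</filename> rule within a single
<filename>packages</filename> directory is sketched below in Python. It assumes that
files are merged in sequence and that the last definition wins. Sorting the remaining
files alphabetically is an assumption made here for reproducibility, and both function
names are hypothetical:
			<programlisting><![CDATA[
def merge_order(package_files):
    """Order source files so that later files override earlier ones.

    Override.xml is placed last because it takes precedence over all
    other files in the same packages directory.
    """
    others = sorted(name for name in package_files if name != "Override.xml")
    overrides = [name for name in package_files if name == "Override.xml"]
    return others + overrides

def resolve(records):
    """records: (pattern, mime_type) pairs in merge order; the last one wins."""
    result = {}
    for pattern, mime_type in records:
        result[pattern] = mime_type
    return result

print(merge_order(["Override.xml", "kword.xml", "abiword.xml"]))
]]></programlisting>
		</para>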
		<para>
The files created by <command>update-mime-database</command> are:
			<itemizedlist>
				<listitem><para>
<filename>&lt;MIME&gt;/globs</filename> (contains a mapping from extension to MIME type)
				</para></listitem>
				<listitem><para>
<filename>&lt;MIME&gt;/magic</filename> (contains a mapping from file contents to MIME type)
				</para></listitem>
				<listitem><para>
<filename>&lt;MIME&gt;/MEDIA/SUBTYPE.xml</filename> (one file for each MIME
type, giving details about the type)
				</para></listitem>
			</itemizedlist>
The formats of these generated files and of the source files in <filename>packages</filename>
are explained in the following sections. Generating these files serves several purposes. First, it allows
applications to quickly get the data they need without parsing all the source XML files (the
base package alone is over 700K). Second, it allows the database to be used for other
purposes (such as creating the <filename>/etc/mime.types</filename> file if desired). Third, it
allows some validation to be performed on the input data, and removes the need for other
applications to carefully check the input for errors themselves.
		</para>
	</sect2>
	<sect2>
		<title>The source XML files</title>
		<para>
Each application provides only a single XML source file, which is installed in the
<filename>packages</filename> directory as described above. This file is an XML file
whose document element is named <userinput>mime-info</userinput> and whose namespace URI
is <ulink url="http://www.freedesktop.org/standards/shared-mime-info"/>. All elements
described in this specification MUST have this namespace too.
		</para><para>
The document element may contain zero or more <userinput>mime-type</userinput> child nodes,
in any order, each describing a single MIME type. Each element has a <userinput>type</userinput>
attribute giving the MIME type that it describes.
		</para><para>
Each <userinput>mime-type</userinput> node may contain any combination of the following elements,
and in any order:
			<itemizedlist>
				<listitem><para>
<userinput>glob</userinput> elements have a <userinput>pattern</userinput> attribute. Any file
whose name matches this pattern will be given this MIME type (subject to conflicting rules in
other files, of course).
				</para></listitem>
				<listitem><para>
<userinput>magic</userinput> elements contain a list of
<userinput>match</userinput> elements, any of which may match, and an optional
<userinput>priority</userinput> attribute for all of the contained rules. Low
numbers should be used for more generic types (such as 'gzip compressed data')
and higher values for specific subtypes (such as a word processor format that
happens to use gzip to compress the file). The default priority value is 50.
				</para><para>
Each <userinput>match</userinput> element must have a type of
<userinput>string</userinput>, <userinput>host16</userinput>,
<userinput>host32</userinput>, <userinput>big16</userinput>,
<userinput>big32</userinput>, <userinput>little16</userinput>,
<userinput>little32</userinput> or <userinput>byte</userinput>. It must also have
<userinput>offset</userinput>, <userinput>value</userinput> and, optionally,
<userinput>mask</userinput> attributes. Each element corresponds to one line of
<citerefentry><refentrytitle>file</refentrytitle>
<manvolnum>1</manvolnum></citerefentry>'s <filename>magic.mime</filename> file.
They can be nested in the same way to provide the equivalent of continuation
lines.
				</para></listitem>
				<listitem><para>
<userinput>action</userinput> elements introduce an action that can be performed on files of this
type. There may be several actions for each type. The format for this element has not yet been
decided. Applications which can handle arbitrary streams of data can indicate
this by setting an action for the type `application/octet-stream'.
				</para></listitem>
				<listitem><para>
<userinput>comment</userinput> elements give a human-readable textual description of the MIME
type. There may be many of these elements with different <userinput>xml:lang</userinput> attributes
to provide the text in multiple languages.
				</para></listitem>
			</itemizedlist>
Applications may also define their own elements, provided they are namespaced to prevent collisions.
Unknown elements are copied directly to the output XML files like <userinput>comment</userinput>
elements.
A typical use for this would be to indicate the default handler application for a particular desktop
("Galeon is the GNOME default text/html browser"). Note that this doesn't indicate the user's preferred
application, only the (fixed) default.
		</para>
		<para>
Here is an example source file, named <filename>diff.xml</filename>:
		<programlisting><![CDATA[
<?xml version="1.0"?>
<mime-info xmlns='http://www.freedesktop.org/standards/shared-mime-info'>
  <mime-type type="text/x-diff">
    <comment>Differences between files</comment>
    <comment xml:lang="af">verskille tussen lêers</comment>
    ...
    <magic priority="50">
      <match type="string" offset="0" value="diff	"/>
      <match type="string" offset="0" value="***	"/>
      <match type="string" offset="0" value="Common subdirectories: "/>
    </magic>
    <glob pattern="*.diff"/>
    <glob pattern="*.patch"/>
  </mime-type>
</mime-info>
]]></programlisting>
In practice, common types such as text/x-diff are provided by the freedesktop.org shared
database. Also, only new information needs to be provided, since this information will be merged
with other information about the same type.
		</para>
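		<para>
A source file of this kind could be read with Python's standard
<userinput>xml.etree.ElementTree</userinput> module, as in the minimal sketch below.
The <userinput>read_source</userinput> function and the shape of the dictionary it returns
are illustrative assumptions. The sketch handles only <userinput>glob</userinput> and
<userinput>comment</userinput> elements:
			<programlisting><![CDATA[
import xml.etree.ElementTree as ET

NS = "http://www.freedesktop.org/standards/shared-mime-info"
# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SOURCE = """<?xml version="1.0"?>
<mime-info xmlns='http://www.freedesktop.org/standards/shared-mime-info'>
  <mime-type type="text/x-diff">
    <comment>Differences between files</comment>
    <glob pattern="*.diff"/>
    <glob pattern="*.patch"/>
  </mime-type>
</mime-info>
"""

def read_source(xml_text):
    """Return {mime type: {"globs": [...], "comments": {lang: text}}}."""
    root = ET.fromstring(xml_text)
    if root.tag != "{%s}mime-info" % NS:
        raise ValueError("document element is not a namespaced mime-info")
    types = {}
    for node in root.findall("{%s}mime-type" % NS):
        entry = types.setdefault(node.get("type"), {"globs": [], "comments": {}})
        for glob in node.findall("{%s}glob" % NS):
            entry["globs"].append(glob.get("pattern"))
        for comment in node.findall("{%s}comment" % NS):
            # Comments without xml:lang are stored under the empty string.
            entry["comments"][comment.get(XML_LANG, "")] = comment.text
    return types

info = read_source(SOURCE)
print(info["text/x-diff"]["globs"])
]]></programlisting>
		</para>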
	</sect2>
	<sect2>
		<title>The MEDIA/SUBTYPE.xml files</title>
		<para>
These files have a <userinput>mime-type</userinput> element as the root node. The format is
as described above. They are created by merging all the <userinput>mime-type</userinput>
elements from the source files and creating one output file per MIME type. Each file may contain
information from multiple source files. The <userinput>magic</userinput> and
<userinput>glob</userinput> elements will have been removed.
		</para>
		<para>
The example source file given above would (on its own) create an output file called
<filename>&lt;MIME&gt;/text/x-diff.xml</filename> containing the following:
			<programlisting><![CDATA[
<?xml version="1.0" encoding="utf-8"?>
<mime-type xmlns="http://www.freedesktop.org/standards/shared-mime-info" type="text/x-diff">
<!--Created automatically by update-mime-database. DO NOT EDIT!-->
  <comment>Differences between files</comment>
  <comment lang="af">verskille tussen lêers</comment>
  ...
</mime-type>

]]></programlisting>
		</para>
	</sect2>
	<sect2>
		<title>The glob files</title>
		<para>
This is a simple list of lines containing a MIME type and pattern, separated by a colon.
For example:
			<programlisting><![CDATA[
# This file was automatically generated by the
# update-mime-database command. DO NOT EDIT!
...
text/x-diff:*.diff
text/x-diff:*.patch
...
]]></programlisting>
		</para>
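		<para>
Reading this format takes only a few lines. The Python sketch below assumes that lines
beginning with '#' are comments, that blank lines can be ignored, and that the MIME type
never contains a colon, so each line is split at the first colon.
<userinput>parse_globs</userinput> is a hypothetical name:
			<programlisting><![CDATA[
def parse_globs(text):
    """Return (mime_type, pattern) pairs from the contents of a globs file."""
    rules = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        mime_type, sep, pattern = line.partition(":")
        if sep and pattern:
            rules.append((mime_type, pattern))
    return rules

SAMPLE = """# This file was automatically generated by the
# update-mime-database command. DO NOT EDIT!
text/x-diff:*.diff
text/x-diff:*.patch
"""

print(parse_globs(SAMPLE))
]]></programlisting>
		</para>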
		<para>
KDE's glob system replaces GNOME's and ROX's ext/regex fields, since it
is trivial to detect a pattern in the form '*.ext' and store it in an
extension hash table internally. The full power of regular expressions was
not being used by either desktop, and glob patterns are more suitable for
filename matching anyway.
		</para>
		<para>
Applications MUST first try a case-sensitive match, then a case-insensitive
one. This is so that <filename>main.C</filename> will be seen as a C++ file,
but <filename>IMAGE.GIF</filename> will still use the *.gif pattern.
		</para>
		<para>
If several patterns match then the longest pattern SHOULD be used. In
particular, files with multiple extensions (such as
<filename>Data.tar.gz</filename>) MUST match the longest sequence of extensions
(eg '*.tar.gz' in preference to '*.gz'). Literal patterns (eg, 'Makefile') must
be matched before all others. It is acceptable to match patterns of the form
'*.text' before other wildcarded patterns (that is, to special-case extensions
using a hash table).
		</para>
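		<para>
The matching rules above might be implemented along the following lines. This Python sketch
is one interpretation, not a reference implementation. It runs the whole procedure
(literal patterns first, then the longest matching wildcard pattern) case-sensitively,
and repeats it case-insensitively only if nothing matched. That ordering is an assumption,
as are the function names and the MIME type names in the usage example:
			<programlisting><![CDATA[
import fnmatch

def is_literal(pattern):
    """A literal pattern contains no glob metacharacters."""
    return not any(ch in pattern for ch in "*?[")

def _best(filename, rules):
    # Literal patterns (such as 'Makefile') are matched before all others.
    literal = [t for t, p in rules if is_literal(p) and p == filename]
    if literal:
        return literal[0]
    # Otherwise the longest matching wildcard pattern is used.
    matches = [(len(p), t) for t, p in rules
               if not is_literal(p) and fnmatch.fnmatchcase(filename, p)]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0])[1]

def guess_type(filename, rules):
    """rules is a list of (mime_type, pattern) pairs."""
    found = _best(filename, rules)  # case-sensitive pass
    if found is None:
        # Case-insensitive pass: compare lower-cased name and patterns.
        lowered = [(t, p.lower()) for t, p in rules]
        found = _best(filename.lower(), lowered)
    return found

RULES = [
    ("text/x-c++src", "*.C"),
    ("text/x-csrc", "*.c"),
    ("image/gif", "*.gif"),
    ("application/x-gzip", "*.gz"),
    ("application/x-compressed-tar", "*.tar.gz"),
    ("text/x-makefile", "Makefile"),
]

for name in ("main.C", "IMAGE.GIF", "Data.tar.gz", "Makefile"):
    print(name, guess_type(name, RULES))
]]></programlisting>
An implementation that special-cases '*.ext' patterns with a hash table, as permitted
above, would replace the linear scan in <userinput>_best</userinput>.
		</para>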
		<para>
There may be several rules mapping to the same type. They should all be merged.
If the same pattern is defined twice, then the definitions MUST be ordered by the
directory the rule came from, as described above.
		</para>
		<para>
Common types (such as MS Word Documents) will be provided in the X Desktop
Group's package, which MUST be required by all applications using this
specification. Since each application will then only be providing information
about its own types, conflicts should be rare.
		</para>
	</sect2>
	<sect2>
		<title>The magic files</title>
		<para>
The magic data is stored in a binary format for ease of parsing. The old magic database
had complex escaping rules; these are now handled by <command>update-mime-database</command>.
		</para><para>
The file starts with the magic string "MIME-Magic" followed by two zero bytes.
There is no version number in the file. Incompatible changes will be handled by
creating both the current `magic' file and a newer `magic2' in the new format.
Where possible, compatible changes only will be made.
		</para><para>
The file is made of a sequence of entries, each corresponding to one line of
<citerefentry><refentrytitle>file</refentrytitle>
<manvolnum>1</manvolnum></citerefentry>'s magic file. All numbers are big-endian, so they need to be
byte-swapped on little-endian machines.
Each entry has the following format:
<informaltable>
	<tgroup cols="3">
	<thead><row><entry>Byte offset</entry><entry>Size</entry><entry>Value</entry></row></thead>
	<tbody>

	<row><entry>0</entry><entry>1</entry><entry>Indent</entry></row>
	<row><entry>1</entry><entry>1</entry><entry>Priority (0-100)</entry></row>
	<row><entry>2</entry><entry>1</entry><entry>Word size in bytes (1, 2, 4 or 8)</entry></row>
	<row><entry>3</entry><entry>1</entry><entry>Flags</entry></row>
	<row><entry>4</entry><entry>4</entry><entry>Range start (byte offset)</entry></row>
	<row><entry>8</entry><entry>4</entry><entry>Range end (byte offset)</entry></row>
	<row><entry>12</entry><entry>4</entry><entry>Total entry size</entry></row>
	<row><entry>16</entry><entry>2</entry><entry>Value length (bytes)</entry></row>

	<row><entry>18</entry><entry>-</entry><entry>Value, mask, type name, and unused data</entry></row>
	
	</tbody>
	</tgroup>
</informaltable>
		</para><para>
Indent corresponds to the nesting depth of the rule. Top-level rules have an indent of zero. The parent
of an entry is the preceding entry with an indent one less than the entry.
		</para><para>
The word size is used for byte-swapping. Little-endian systems should reverse the order of groups of bytes
in the value and mask if this is greater than one. This only affects `host'
matches (`big32' entries still have a word size of 1, for example, because no swapping is necessary, whereas
`host32' has a word size of 4).
		</para><para>
Bit 0 of the flags byte indicates that a mask is present. Bit 1 indicates that
the entry should be skipped. All other bits should be ignored. If bit 0 is 1,
then the value is followed by a mask of the same size.
		</para><para>
The range start and end points are byte offsets into the file being checked. All offsets from the start to the
end inclusive should be checked. They will be equal if only one offset is to be checked. These values are
big endian.
		</para><para>
The total entry size (also big-endian) gives the offset to the next entry from the start of this one. This
is always a multiple of four.
		</para><para>
The value length is a 2 byte big-endian number, giving the number of bytes used for the value. If a mask
is present, it follows directly after the value and is the same size. The MIME type name comes last, and is
a nul-terminated string.
		</para><para>
There may be any amount of unused space at the end of each entry. This is for future expansion and/or padding.
		</para><para>
The above example would create a magic file starting with:
			<programlisting><![CDATA[
4d 49 4d 45 2d 4d 61 67 69 63 00 00

00 32 01 00 00 00 00 00 00 00 00 00
00 00 00 23 00 05 64 69 66 66 20 74
65 78 74 2f 78 2d 64 69 66 66 00
]]></programlisting>
		</para>
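		<para>
The Python sketch below reads such a file. It assumes that the fixed fields are packed
back to back as in the hex dump above, so that the two-byte value length directly follows
the total entry size and the value starts 18 bytes into the entry. The names
<userinput>parse_magic</userinput> and <userinput>to_host_order</userinput> are hypothetical:
			<programlisting><![CDATA[
import struct
import sys

HEADER = b"MIME-Magic\x00\x00"
# indent, priority, word size, flags, range start, range end,
# total entry size, value length; all big-endian, 18 bytes in total.
FIXED = struct.Struct(">BBBBIIIH")

def to_host_order(raw, word_size, byteorder=sys.byteorder):
    """Reverse each word_size group of bytes on little-endian hosts."""
    if word_size > 1 and byteorder == "little":
        return b"".join(raw[i:i + word_size][::-1]
                        for i in range(0, len(raw), word_size))
    return raw

def parse_magic(data):
    if not data.startswith(HEADER):
        raise ValueError("missing MIME-Magic header")
    entries = []
    pos = len(HEADER)
    while pos + FIXED.size <= len(data):
        (indent, priority, word_size, flags,
         start, end, total, value_len) = FIXED.unpack_from(data, pos)
        body = pos + FIXED.size
        value = data[body:body + value_len]
        body += value_len
        mask = None
        if flags & 1:  # bit 0: a mask of the same size follows the value
            mask = data[body:body + value_len]
            body += value_len
        nul = data.index(b"\x00", body)  # type name is nul-terminated
        mime_type = data[body:nul].decode("ascii")
        if not (flags & 2):  # bit 1: the entry should be skipped
            entries.append({
                "indent": indent, "priority": priority,
                "word_size": word_size, "range": (start, end),
                "value": value, "mask": mask, "type": mime_type,
            })
        if total == 0:
            break  # malformed entry; avoid looping forever
        pos += total
    return entries

EXAMPLE = bytes.fromhex(
    "4d 49 4d 45 2d 4d 61 67 69 63 00 00 "
    "00 32 01 00 00 00 00 00 00 00 00 00 "
    "00 00 00 23 00 05 64 69 66 66 20 74 "
    "65 78 74 2f 78 2d 64 69 66 66 00"
)

print(parse_magic(EXAMPLE))
]]></programlisting>
		</para>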
	</sect2>
	<sect2>
		<title>Storing the MIME type using Extended Attributes</title>
		<para>
An implementation may also get a file's MIME type from the <userinput>user.mime_type</userinput> extended
attribute. <!-- The attr(5) man page documents this name --> The type given here should normally be used
in preference to any guessed type, since the user is able to set it explicitly. Applications may choose to
set the type when saving files. Since many applications and filesystems do not support extended attributes,
implementations should not rely on this method being available.
		</para>
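		<para>
On platforms where Python provides <userinput>os.getxattr</userinput> (currently Linux), the
attribute could be consulted as in the sketch below, falling back to a guessed type when it
is missing or unsupported. The function names and the <userinput>guess</userinput> callback
are hypothetical, and treating the attribute value as UTF-8 text is an assumption:
			<programlisting><![CDATA[
import os

def xattr_mime_type(path):
    """Return the user.mime_type attribute of path, or None if unavailable."""
    getxattr = getattr(os, "getxattr", None)  # not present on every platform
    if getxattr is None:
        return None
    try:
        raw = getxattr(path, "user.mime_type")
    except OSError:
        # Attribute not set, file missing, or filesystem lacks xattr support.
        return None
    return raw.decode("utf-8", "replace").strip() or None

def file_type(path, guess):
    """Prefer the explicitly stored type over any guessed type."""
    return xattr_mime_type(path) or guess(path)

print(xattr_mime_type("example.txt"))
]]></programlisting>
		</para>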
	</sect2>
	<sect2>
		<title>Security implications</title>
		<para>
The system described in this document is intended to allow different programs
to see the same file as having the same type. This is to help interoperability.
The type determined in this way is only a guess, and an application MUST NOT
trust a file based simply on its MIME type. For example, a downloader should
not pass a file directly to a launcher application without confirmation simply
because the type looks `harmless' (eg, text/plain).
		</para>
		<para>
Do not rely on two applications getting the same type for the same file, even
if they both use this system. The spec allows some leeway in implementation,
and in any case the programs may be following different versions of the spec.
		</para>
	</sect2>
	<sect2>
		<title>User modification</title>
		<para>
The MIME database is NOT intended to store user preferences. Users should never
edit the database. If they wish to make corrections or provide MIME entries for
software that doesn't provide these itself, they should do so by means of the
<filename>Override.xml</filename> file mentioned in <xref linkend="s2_layout"/>. Information such as
"text/html files need to be opened with Mozilla" should NOT go in the database.
		</para>
	</sect2>
</sect1>

<sect1>
	<title>Contributors</title>
	<simplelist>
		<member>
			Thomas Leonard <email>tal197@users.sf.net</email>
		</member>
		<member>
			David Faure <email>david@mandrakesoft.com</email>
		</member>
		<member>
			Alex Larsson <email>alexl@redhat.com</email>
		</member>
		<member>
			Seth Nickell <email>snickell@stanford.edu</email>
		</member>
		<member>
			Keith Packard <email>keithp@keithp.com</email>
		</member>
		<member>
			Filip Van Raemdonck <email>mechanix@debian.org</email>
		</member>
		<member>
			Christos Zoulas <email>christos@zoulas.com</email>
		</member>
	</simplelist>
</sect1>

<bibliography>
	<title>References</title>

	<bibliomixed>
		<abbrev>GNOME</abbrev><citetitle>The GNOME desktop,
		<ulink url="http://www.gnome.org"/></citetitle>
	</bibliomixed>
	<bibliomixed>
		<abbrev>KDE</abbrev><citetitle>The KDE desktop,
		<ulink url="http://www.kde.org"/></citetitle>
	</bibliomixed>
	<bibliomixed>
		<abbrev>ROX</abbrev><citetitle>The ROX desktop,
		<ulink url="http://rox.sourceforge.net"/></citetitle>
	</bibliomixed>
	<bibliomixed>
		<abbrev>DesktopEntries</abbrev><citetitle>Desktop Entry Specification,
		<ulink url="http://www.freedesktop.org/standards/desktop-entry-spec.html"/>
		</citetitle>
	</bibliomixed>
	<bibliomixed>
		<abbrev>SharedMIME</abbrev><citetitle>Shared MIME-info Database,
		<ulink url="http://www.freedesktop.org/standards/shared-mime-info.html"/>
		</citetitle>
	</bibliomixed>

</bibliography>

</article>