<page xmlns="http://www.gnome.org/~shaunm/mallard"
      type="topic"
      id="its">

<info>
  <link type="guide" xref="explore#i18n"/>

  <version number="0.1" date="2009-05-26" status="incomplete"/>

  <credit type="author">
    <name>Shaun McCance</name>
    <email>shaunm@gnome.org</email>
  </credit>
  <copyright>
    <year>2009</year>
    <name>Shaun McCance</name>
  </copyright>
</info>

<title>ITS Conformance</title>

<p>This page discusses Mallard's conformance to the requirements in the
<link href="http://www.w3.org/TR/itsreq/">W3C Internationalization
and Localization Markup Requirements</link>, as well as its usage of
attributes from the <link href="http://www.w3.org/TR/its/">W3C
Internationalization Tag Set</link>.</p>

<section id="R002">
  <title>R002: Span-Like Element</title>

  <quote>
    <cite href="http://www.w3.org/TR/itsreq/#span">W3C
    Internationalization and Localization Markup Requirements</cite>
    <p>[R002] span-like element is required to allow authors to mark sections
    text that may have special properties, from a localization and
    internationalization point of view.</p>
  </quote>

  <p>Mallard provides the <code xref="mal_inline_span">span</code> element,
  a general-purpose span-like element.  The <code>span</code> element accepts
  attributes from external namespaces, allowing attributes such as
  <code>xml:lang</code> and
  <code href="http://www.w3.org/TR/its/#trans-datacat">its:translate</code>
  to be used in Mallard documents.</p>
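
  <p>As a minimal sketch, a <code>span</code> carrying both attributes might
  look like the following.  The <code>its</code> prefix is assumed to be bound
  to the ITS 1.0 namespace, and the sentence itself is purely illustrative.</p>

  <code mime="application/xml"><![CDATA[
<p xmlns:its="http://www.w3.org/2005/11/its">
  The French greeting
  <span xml:lang="fr" its:translate="no">bonjour</span>
  is left untranslated.
</p>
]]></code>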
</section>

<section id="R004">
  <title>R004: Unique Identifier</title>

  <quote>
    <cite href="http://www.w3.org/TR/itsreq/#uid">W3C
    Internationalization and Localization Markup Requirements</cite>
    <p>[R004] It should be possible to attach a unique identifier to any
    localizable item. This identifier should be unique within a document set,
    but should be identical across all translations of the same item.</p>
  </quote>

  <p>While the <code>id</code> attribute is only allowed on
  <code xref="mal_page">page</code> and <code xref="mal_section">section</code>
  elements, Mallard does allow attributes from external namespaces to be used
  on all elements.  If necessary for translation purposes, any attribute from
  an external namespace may be used as a unique identifier.  In particular,
  Mallard does not use the common <code>xml:id</code> attribute for page and
  section IDs, so it may be used on any element to provide a unique identifier
  for translation or any other purpose.</p>
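
  <p>For instance, a translation tool could rely on <code>xml:id</code> to
  track individual paragraphs.  The following sketch assumes such a tool; the
  identifier value is hypothetical and carries no meaning to Mallard itself.</p>

  <code mime="application/xml"><![CDATA[
<section id="printing">
  <title>Printing</title>
  <p xml:id="printing-intro">You can print a page from the File menu.</p>
</section>
]]></code>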
</section>

<section id="R006">
  <title>R006: Identifying Language/Locale</title>

  <quote>
    <cite href="http://www.w3.org/TR/itsreq/#langlocale">W3C
    Internationalization and Localization Markup Requirements</cite>
    <p>[R006] Any document at its beginning should declare a language/locale
    that is applied to both main content and external content stored separately.
    While the language/locale may be declared for the whole document, when an
    element or a text span is in a different language/locale from the
    document-level language, it should be labeled appropriately. Therefore,
    DTD/Schema should allow any elements to have a language/locale specifying
    attribute. The language/locale declaration should use industry standard
    approaches.</p>
  </quote>

  <p>Mallard allows the standard <code>xml:lang</code> attribute to be used
  on all elements.</p>
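
  <p>A rough illustration: the language can be declared once on the
  <code>page</code> element and overridden on any element whose content is in
  a different language.  The page ID and text below are hypothetical.</p>

  <code mime="application/xml"><![CDATA[
<page xmlns="http://www.gnome.org/~shaunm/mallard"
      type="topic" id="greetings" xml:lang="en">
  <title>Greetings</title>
  <p>In German, people often say
  <span xml:lang="de">Guten Tag</span>.</p>
</page>
]]></code>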

  <p>Note that those working with Mallard are likely to encounter two different
  methods of identifying language and locale information.  Since Mallard is an
  XML format, language identifiers are expected to conform to
  <link href="http://tools.ietf.org/html/rfc3066">IETF RFC 3066</link>.
  However, since Mallard is designed to be used in a desktop help system,
  <link href="http://en.wikipedia.org/wiki/Locale">POSIX locale identifiers</link>
  are more convenient.  This is a potentially serious interchange issue, and this
  document currently offers no solution to it.</p>
</section>

<section id="R007">
  <title>R007: Identifying Terms</title>

  <quote>
    <cite href="http://www.w3.org/TR/itsreq/#termid">W3C
    Internationalization and Localization Markup Requirements</cite>
    <p>[R007] It should be possible to identify terms inside an element or a
    span and to provide data for terminology management and index generation.
    Terms should be either associated with attributes for related term
    information or linked to external terminology data.</p>
  </quote>

  <comment>
    <cite date="2009-05-26">shaunm</cite>
    <p>FIXME: not sure what this is asking for.  Is this something we need
    to address directly, or something we get for free with external attrs?</p>
  </comment>
</section>

<section id="R008">
  <title>R008: Purpose Specification/Mapping</title>

  <quote>
    <cite href="http://www.w3.org/TR/itsreq/#mapping">W3C
    Internationalization and Localization Markup Requirements</cite>
    <p>[R008] Currently, it does not appear to be realistic that all XML
    vocabularies tag localization-relevant information identical (e.g. all
    use the "term" tag for terms). One way to take care of diverse
    localization-relevant markup in localization environments is a mapping
    mechanism which maps localization-relevant markup onto a canonical
    representation (such as the Internationalization Tag Set).</p>
  </quote>

  <comment>
    <cite date="2009-05-26">shaunm</cite>
    <p>FIXME: need to look into this more.</p>
  </comment>
</section>

<section id="R011">
  <title>R011: Bidirectional Text Support</title>

  <quote>
    <cite href="http://www.w3.org/TR/itsreq/#bidi">W3C
    Internationalization and Localization Markup Requirements</cite>
    <p>[R011] Markup should be available to support the needs of bidirectional
    scripts.</p>
  </quote>

  <p>Mallard allows attributes from external namespaces to be used on all
  elements.  Consequently, the
  <code href="http://www.w3.org/TR/its/#directionality">its:dir</code>
  attribute may be used to specify text directionality.</p>
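
  <p>The following sketch shows one way this might look, assuming the
  <code>its</code> prefix is bound to the ITS 1.0 namespace.  The Hebrew word
  is only sample content.</p>

  <code mime="application/xml"><![CDATA[
<p xmlns:its="http://www.w3.org/2005/11/its">
  The Hebrew word for peace is
  <span xml:lang="he" its:dir="rtl">שלום</span>.
</p>
]]></code>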
</section>

<section id="R012">
  <title>R012: Indicator of Translatability</title>

  <quote>
    <cite href="http://www.w3.org/TR/itsreq/#transinfo">W3C
    Internationalization and Localization Markup Requirements</cite>
    <p>[R012] Methods must exist to allow to specify the parts of a document
    that are to be translated or not.</p>
  </quote>

  <p>Mallard allows attributes from external namespaces to be used on all
  elements.  Consequently, the
  <code href="http://www.w3.org/TR/its/#trans-datacat">its:translate</code>
  attribute may be used to specify whether parts of a document are to be
  translated.</p>
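
  <p>As an illustrative sketch, a literal command could be excluded from
  translation as follows.  The <code>its</code> prefix is assumed to be bound
  to the ITS 1.0 namespace, and the command is hypothetical.</p>

  <code mime="application/xml"><![CDATA[
<p xmlns:its="http://www.w3.org/2005/11/its">
  Type <code its:translate="no">yelp help:example</code> and press Enter.
</p>
]]></code>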

  <p>Additionally, the
  <code href="http://www.w3.org/TR/its/#trans-datacat">its:rules</code>
  element may be used in any <code xref="mal_info">info</code> element to
  provide translatability rules for a page or section.</p>
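
  <p>A possible rule set is sketched below.  It assumes ITS 1.0 global rules
  and binds the hypothetical <code>mal</code> prefix to the Mallard namespace
  used by this page so that the XPath selector can match Mallard elements.
  Whether excluding every <code>code</code> element is appropriate depends on
  the document.</p>

  <code mime="application/xml"><![CDATA[
<info xmlns:its="http://www.w3.org/2005/11/its"
      xmlns:mal="http://www.gnome.org/~shaunm/mallard">
  <its:rules version="1.0">
    <!-- Treat the content of all code elements as not translatable -->
    <its:translateRule selector="//mal:code" translate="no"/>
  </its:rules>
</info>
]]></code>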
</section>

<section id="R014">
  <title>R014: Limited Impact</title>

  <quote>
    <cite href="http://www.w3.org/TR/itsreq/#impact">W3C
    Internationalization and Localization Markup Requirements</cite>
    <p>[R014] All solutions proposed should be designed to have as less impact
    as possible on the tree structure of the original document and on the
    content models in the original schema.</p>
  </quote>

  <p>Mallard allows tool-specific extensibility using attributes and elements
  from external namespaces.  Mallard has
  <link xref="mal_external">clearly defined rules</link> for how attributes
  and elements from external namespaces are to be processed in various contexts.
  Tool writers are expected to be aware of these rules.  Whenever possible,
  this document notes issues that can arise from extensions, including those
  for translation purposes.</p>

  <p>While it is impossible to predict all issues one might encounter, Mallard
  was developed after years of developing translation tools for other formats.
  Internationalization and localization were primary concerns in the design
  of Mallard.</p>
</section>

<section id="R015">
  <title>R015: Attributes and Translatable Text</title>

  <quote>
    <cite href="http://www.w3.org/TR/itsreq/#transattr">W3C
    Internationalization and Localization Markup Requirements</cite>
    <p>[R015] Provisions must be taken to ensure that attributes with
    translatable values do not impair the localization process.</p>
  </quote>

  <p>Mallard never places translatable text in attribute values.</p>
</section>

<section id="R017">
  <title>R017: Localization Notes</title>

  <quote>
    <cite href="http://www.w3.org/TR/itsreq/#locnotes">W3C
    Internationalization and Localization Markup Requirements</cite>
    <p>[R017] A method must exist for authors to communicate information to
    localizers about a particular item of content.</p>
  </quote>

  <p>Mallard allows attributes from external namespaces to be used on all
  elements.  Consequently, the
  <code href="http://www.w3.org/TR/its/#trans-datacat">its:locNote</code>
  attribute may be used to provide localization notes, as may the
  <code href="http://www.w3.org/TR/its/#trans-datacat">its:locNoteRule</code>
  element.</p>
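
  <p>For example, a short note could be attached directly to a paragraph.
  This sketch assumes the <code>its</code> prefix is bound to the ITS 1.0
  namespace; the wording of the note is illustrative.</p>

  <code mime="application/xml"><![CDATA[
<p xmlns:its="http://www.w3.org/2005/11/its"
   its:locNote="Trash is the name of a folder here, not a verb.">
  Move the file to the Trash.
</p>
]]></code>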

  <p>If more extensive localization notes are needed, the
  <code xref="mal_block_comment">comment</code> element may be used.  Using a
  <code href="http://www.w3.org/TR/its/#trans-datacat">its:rules</code>
  element in an <code xref="mal_info">info</code> element, one can clearly
  specify which editorial comments are localization notes.</p>
</section>

<section id="R020">
  <title>R020: Annotation Markup</title>

  <quote>
    <cite href="http://www.w3.org/TR/itsreq/#annomark">W3C
    Internationalization and Localization Markup Requirements</cite>
    <p>[R020] There must be a way to support markup up of text annotations of
    the 'ruby' type.</p>
  </quote>

  <p>Elements from external namespaces may be used in all
  <link xref="mal_inline">inline contexts</link>.  While this allows Ruby
  annotations to be embedded within a Mallard document, the
  <link xref="mal_inline#processing">fallback processing expectations</link>
  are unlikely to produce satisfactory results for tools that do not support
  Ruby.  Future versions of this document should address this issue.</p>
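
  <p>As a rough sketch, simple ruby markup from the XHTML namespace could be
  embedded in a paragraph as shown below.  The choice of the XHTML ruby
  vocabulary is an assumption made for this example; this document does not
  prescribe a particular ruby vocabulary.</p>

  <code mime="application/xml"><![CDATA[
<p xmlns:h="http://www.w3.org/1999/xhtml" xml:lang="ja">
  <h:ruby>
    <h:rb>東京</h:rb>
    <h:rp>(</h:rp><h:rt>とうきょう</h:rt><h:rp>)</h:rp>
  </h:ruby>
</p>
]]></code>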
</section>

<section id="R022">
  <title>R022: Nested Elements</title>

  <quote>
    <cite href="http://www.w3.org/TR/itsreq/#nestedelems">W3C
    Internationalization and Localization Markup Requirements</cite>
    <p>[R022] Great care must be taken when defining or using nested
    translatable elements.</p>
  </quote>

  <p>Mallard explicitly disallows mixing block and inline content,
  except in well-defined cases which can easily be detected and
  handled.  In Mallard, any block element which can contain text
  directly is considered to be a translation unit.  Since these
  elements do not allow general block content to be mixed into
  the inline content, translation units can always be presented
  to translators without the need for placeholders.</p>

  <p>Note that this may not be the case if a translation tool chooses
  to treat certain container elements as translation units.  For example,
  under some circumstances a translation tool might choose to present
  <link xref="mal_table">tables</link> or
  <link xref="mal_block_list">lists</link> as translatable to allow
  translators to reorder the rows or items.  In these cases, the block
  content inside the entries or items would still constitute discrete
  translation units, making placeholders necessary.</p>
</section>

<section id="R025">
  <title>R025: Elements and Segmentation</title>

  <quote>
    <cite href="http://www.w3.org/TR/itsreq/#elemseg">W3C
    Internationalization and Localization Markup Requirements</cite>
    <p>[R025] Methods, independent of the semantic, of the elements must
    exist to provide hints on how to break down document content into
    meaningful runs of text.</p>
  </quote>

  <p>Making meaningful distinctions is ultimately the job of a processing
  tool, although the design of an XML vocabulary can have a significant
  impact on implementation difficulty.  The following notes will be relevant
  to most tool implementors.</p>

  <list>
    <item>
      <p>In Mallard, no elements contain “pernicious mixed content”: a
      problematic content model wherein an element can contain either
      inline content or block content, but not both.  Resolving such
      content models generally involves testing for the existence of
      one of a certain set of elements, which can be difficult as
      content models grow.</p>
      <p>In Mallard, pernicious mixed content would be particularly
      problematic, since certain element names are used in both block
      and inline contexts.</p>
    </item>

    <item>
      <p>In Mallard, elements generally contain either block content or
      inline content.  Thus, for example, you cannot place a paragraph
      inside a paragraph.  This is simpler for translators, as well as
      for translation tool implementors, because it reduces the need
      to use placeholders for separate translation units.</p>
    </item>

    <item>
      <p>One notable exception to the above is the <code>item</code>
      element in <link xref="mal_block_tree">tree lists</link>.  To
      simplify writing, tree list items simply take inline content
      followed by any number of nested tree list items.  Since the
      block-like items are not interspersed with the inline content,
      however, translation tools should be able to handle this case
      without placeholders.</p>
    </item>

    <item>
      <p>It is noteworthy that Mallard reuses some element names in both block
      and inline contexts.  The <code xref="mal_block_code">code</code> and
      <code xref="mal_block_media">media</code> elements are two examples of
      this.  Since Mallard never allows general block content to be mixed with
      general inline content, the purpose of these elements is unambiguous when
      processed in context.  Thus, it is important that tools always process
      elements in context to handle them correctly.</p>
    </item>
  </list>
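
  <p>The last point can be illustrated with a short sketch.  The same
  <code>code</code> element name appears once as a block, as a sibling of a
  paragraph, and once inline, inside a paragraph.  The section ID and content
  are hypothetical.</p>

  <code mime="application/xml"><![CDATA[
<section id="listing-files">
  <title>Listing Files</title>
  <!-- Block context: code is a sibling of p and forms its own unit -->
  <code>ls -l</code>
  <!-- Inline context: code is part of the paragraph's translation unit -->
  <p>Run <code>ls -l</code> to list files in long format.</p>
</section>
]]></code>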
</section>

</page>