<page xmlns="http://www.gnome.org/~shaunm/mallard"
type="topic"
id="its">
<info>
<link type="guide" xref="explore#i18n"/>
<version number="0.1" date="2009-05-26" status="incomplete"/>
<credit type="author">
<name>Shaun McCance</name>
<email>shaunm@gnome.org</email>
</credit>
<copyright>
<year>2009</year>
<name>Shaun McCance</name>
</copyright>
</info>
<title>ITS Conformance</title>
<p>This page discusses Mallard's conformance to the requirements in the
<link href="http://www.w3.org/TR/itsreq/">W3C Internationalization
and Localization Markup Requirements</link>, as well as its usage of
attributes from the <link href="http://www.w3.org/TR/its/">W3C
Internationalization Tag Set</link>.</p>
<section id="R002">
<title>R002: Span-Like Element</title>
<quote>
<cite href="http://www.w3.org/TR/itsreq/#span">W3C
Internationalization and Localization Markup Requirements</cite>
<p>[R002] span-like element is required to allow authors to mark sections
of text that may have special properties, from a localization and
internationalization point of view.</p>
</quote>
<p>Mallard provides the <code xref="mal_inline_span">span</code> element,
a general-purpose span-like element. The <code>span</code> element accepts
attributes from external namespaces, allowing attributes such as
<code>xml:lang</code> and
<code href="http://www.w3.org/TR/its/#trans-datacat">its:translate</code>
to be used in Mallard documents.</p>
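<p>As a minimal sketch of this usage, the fragment below marks one span as
not translatable and another as being in a different language. The
<code>its</code> prefix binding, the page ID, and the sample text are
illustrative assumptions rather than anything mandated by Mallard.</p>
<code><![CDATA[
<page xmlns="http://www.gnome.org/~shaunm/mallard"
      xmlns:its="http://www.w3.org/2005/11/its"
      its:version="1.0"
      type="topic" id="span-example">
  <title>Span Example</title>
  <!-- its:translate="no" asks translation tools to leave the name as-is -->
  <p>Start <span its:translate="no">Frobnicator</span> from the menu.</p>
  <!-- xml:lang labels a run of text in another language -->
  <p>The French term is <span xml:lang="fr">logiciel libre</span>.</p>
</page>
]]></code>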
</section>
<section id="R004">
<title>R004: Unique Identifier</title>
<quote>
<cite href="http://www.w3.org/TR/itsreq/#uid">W3C
Internationalization and Localization Markup Requirements</cite>
<p>[R004] It should be possible to attach a unique identifier to any
localizable item. This identifier should be unique within a document set,
but should be identical across all translations of the same item.</p>
</quote>
<p>While the <code>id</code> attribute is only allowed on
<code xref="mal_page">page</code> and <code xref="mal_section">section</code>
elements, Mallard does allow attributes from external namespaces to be used
on all elements. If necessary for translation purposes, any attribute from
an external namespace may be used as a unique identifier. In particular,
Mallard does not use the common <code>xml:id</code> for page and section
IDs, but it may be used on any element to provide a unique identifier for
translation or any other purposes.</p>
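<p>For instance, a translation workflow might attach <code>xml:id</code>
values to individual paragraphs. The following is only a sketch: the
identifier values are hypothetical, and nothing in Mallard requires this
particular naming scheme.</p>
<code><![CDATA[
<page xmlns="http://www.gnome.org/~shaunm/mallard"
      type="topic" id="install">
  <title>Installing</title>
  <!-- The page uses the plain id attribute; the paragraphs use xml:id -->
  <p xml:id="install-intro">This page explains how to install the program.</p>
  <p xml:id="install-prereq">You need administrator access to continue.</p>
</page>
]]></code>
<p>A translated copy of the page would keep the same <code>xml:id</code>
values, so a tool could match each paragraph with its translation.</p>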
</section>
<section id="R006">
<title>R006: Identifying Language/Locale</title>
<quote>
<cite href="http://www.w3.org/TR/itsreq/#langlocale">W3C
Internationalization and Localization Markup Requirements</cite>
<p>[R006] Any document at its beginning should declare a language/locale
that is applied to both main content and external content stored separately.
While the language/locale may be declared for the whole document, when an
element or a text span is in a different language/locale from the
document-level language, it should be labeled appropriately. Therefore,
DTD/Schema should allow any elements to have a language/locale specifying
attribute. The language/locale declaration should use industry standard
approaches.</p>
</quote>
<p>Mallard allows the standard <code>xml:lang</code> attribute to be used
on all elements.</p>
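<p>As an illustrative sketch, the page below declares a document-level
language and labels one span that differs from it. The page ID and sample
text are assumptions made for this example.</p>
<code><![CDATA[
<page xmlns="http://www.gnome.org/~shaunm/mallard"
      type="topic" id="lang-example"
      xml:lang="en">
  <title>Language Example</title>
  <!-- The span overrides the page-level language for its own content.
       The identifier uses a hyphen (pt-BR); a POSIX locale for the same
       language would typically be written pt_BR.UTF-8. -->
  <p>The Brazilian Portuguese word for help is
  <span xml:lang="pt-BR">ajuda</span>.</p>
</page>
]]></code>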
<p>Note that there are two different methods of identifying language and locale
information that are likely to be encountered by those working with Mallard.
Since Mallard is an XML format, language identifiers are expected to conform
to <link href="http://tools.ietf.org/html/rfc3066">IETF RFC 3066</link>.
Since Mallard is designed to be used in a desktop help system,
<link href="http://en.wikipedia.org/wiki/Locale">POSIX locale identifiers</link>
are more convenient. This is a potentially serious interchange issue, and this
document currently offers no solutions to this problem.</p>
</section>
<section id="R007">
<title>R007: Identifying Terms</title>
<quote>
<cite href="http://www.w3.org/TR/itsreq/#termid">W3C
Internationalization and Localization Markup Requirements</cite>
<p>[R007] It should be possible to identify terms inside an element or a
span and to provide data for terminology management and index generation.
Terms should be either associated with attributes for related term
information or linked to external terminology data.</p>
</quote>
<comment>
<cite date="2009-05-26">shaunm</cite>
<p>FIXME: not sure what this is asking for. Is this something we need
to address directly, or something we get for free with external attrs?</p>
</comment>
</section>
<section id="R008">
<title>R008: Purpose Specification/Mapping</title>
<quote>
<cite href="http://www.w3.org/TR/itsreq/#mapping">W3C
Internationalization and Localization Markup Requirements</cite>
<p>[R008] Currently, it does not appear to be realistic that all XML
vocabularies tag localization-relevant information identical (e.g. all
use the "term" tag for terms). One way to take care of diverse
localization-relevant markup in localization environments is a mapping
mechanism which maps localization-relevant markup onto a canonical
representation (such as the Internationalization Tag Set).</p>
</quote>
<comment>
<cite date="2009-05-26">shaunm</cite>
<p>FIXME: need to look into this more.</p>
</comment>
</section>
<section id="R011">
<title>R011: Bidirectional Text Support</title>
<quote>
<cite href="http://www.w3.org/TR/itsreq/#bidi">W3C
Internationalization and Localization Markup Requirements</cite>
<p>[R011] Markup should be available to support the needs of bidirectional
scripts.</p>
</quote>
<p>Mallard allows attributes from external namespaces to be used on all
elements. Consequently, the
<code href="http://www.w3.org/TR/its/#directionality">its:dir</code>
attribute may be used to specify text directionality.</p>
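<p>A minimal sketch of such usage follows. The <code>its</code> prefix
binding, the page ID, and the sample text are assumptions made for
illustration.</p>
<code><![CDATA[
<page xmlns="http://www.gnome.org/~shaunm/mallard"
      xmlns:its="http://www.w3.org/2005/11/its"
      its:version="1.0"
      type="topic" id="bidi-example">
  <title>Directionality Example</title>
  <!-- its:dir="rtl" marks the span as right-to-left text -->
  <p>The greeting is
  <span xml:lang="he" its:dir="rtl">שלום עולם</span>
  in Hebrew.</p>
</page>
]]></code>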
</section>
<section id="R012">
<title>R012: Indicator of Translatability</title>
<quote>
<cite href="http://www.w3.org/TR/itsreq/#transinfo">W3C
Internationalization and Localization Markup Requirements</cite>
<p>[R012] Methods must exist to allow to specify the parts of a document
that are to be translated or not.</p>
</quote>
<p>Mallard allows attributes from external namespaces to be used on all
elements. Consequently, the
<code href="http://www.w3.org/TR/its/#trans-datacat">its:translate</code>
attribute may be used to specify whether parts of a document are to be
translated.</p>
<p>Additionally, the
<code href="http://www.w3.org/TR/its/#trans-datacat">its:rules</code>
element may be used in any <code xref="mal_info">info</code> element to
provide translatability rules for a page or section.</p>
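<p>The two approaches might be combined as in the sketch below. The
<code>mal</code> prefix used in the selector, the page ID, and the sample
content are assumptions made for this example.</p>
<code><![CDATA[
<page xmlns="http://www.gnome.org/~shaunm/mallard"
      xmlns:its="http://www.w3.org/2005/11/its"
      type="topic" id="translate-example">
  <info>
    <its:rules version="1.0"
               xmlns:mal="http://www.gnome.org/~shaunm/mallard">
      <!-- Global rule: do not translate any code element on this page -->
      <its:translateRule selector="//mal:code" translate="no"/>
    </its:rules>
  </info>
  <title>Translate Example</title>
  <p>Run <code>make install</code> to install the program.</p>
  <!-- Local attribute on a single span -->
  <p>The project is named <span its:translate="no">Mallard</span>.</p>
</page>
]]></code>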
</section>
<section id="R014">
<title>R014: Limited Impact</title>
<quote>
<cite href="http://www.w3.org/TR/itsreq/#impact">W3C
Internationalization and Localization Markup Requirements</cite>
<p>[R014] All solutions proposed should be designed to have as less impact
as possible on the tree structure of the original document and on the
content models in the original schema.</p>
</quote>
<p>Mallard allows tool-specific extensibility using attributes and elements
from external namespaces. Mallard has
<link xref="mal_external">clearly defined rules</link> for how attributes
and elements from external namespaces are to be processed in various contexts.
Tool writers are expected to be aware of these issues. Whenever possible,
this document discusses issues that can arise from extensions, including
those for translation purposes.</p>
<p>While it is impossible to predict all issues one might encounter, Mallard
was developed after years of developing translation tools for other formats.
Internationalization and localization were primary concerns in the design
of Mallard.</p>
</section>
<section id="R015">
<title>R015: Attributes and Translatable Text</title>
<quote>
<cite href="http://www.w3.org/TR/itsreq/#transattr">W3C
Internationalization and Localization Markup Requirements</cite>
<p>[R015] Provisions must be taken to ensure that attributes with
translatable values do not impair the localization process.</p>
</quote>
<p>Mallard never places translatable text in attribute values.</p>
</section>
<section id="R017">
<title>R017: Localization Notes</title>
<quote>
<cite href="http://www.w3.org/TR/itsreq/#locnotes">W3C
Internationalization and Localization Markup Requirements</cite>
<p>[R017] A method must exist for authors to communicate information to
localizers about a particular item of content.</p>
</quote>
<p>Mallard allows attributes from external namespaces to be used on all
elements. Consequently, the
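<code href="http://www.w3.org/TR/its/#trans-datacat">its:locNote</code>
attribute may be used to provide localization notes. The
<code href="http://www.w3.org/TR/its/#trans-datacat">its:locNoteRule</code>
element may be used within an <code>its:rules</code> element to associate
localization notes with selected elements.</p>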
<p>If more extensive localization notes are needed, the
<code xref="mal_block_comment">comment</code> element may be used. Using an
<code href="http://www.w3.org/TR/its/#trans-datacat">its:rules</code>
element in an <code xref="mal_info">info</code> element, one can clearly
specify which editorial comments are localization notes.</p>
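<p>The sketch below shows one way this might look. It assumes that a comment
placed immediately before a paragraph is meant as a note for that paragraph.
This convention, the <code>mal</code> prefix, and the sample content are
assumptions made for this example and are not defined by Mallard.</p>
<code><![CDATA[
<page xmlns="http://www.gnome.org/~shaunm/mallard"
      xmlns:its="http://www.w3.org/2005/11/its"
      type="topic" id="locnote-example">
  <info>
    <its:rules version="1.0"
               xmlns:mal="http://www.gnome.org/~shaunm/mallard">
      <!-- Use the text of a comment as the note for the paragraph
           that directly follows it -->
      <its:locNoteRule
          selector="//mal:p[preceding-sibling::*[1][self::mal:comment]]"
          locNoteType="description"
          locNotePointer="preceding-sibling::mal:comment[1]/mal:p"/>
      <!-- The comments themselves are not translated -->
      <its:translateRule selector="//mal:comment" translate="no"/>
    </its:rules>
  </info>
  <title>Localization Note Example</title>
  <!-- Short note given directly as an attribute -->
  <p its:locNote="Save is the label of a button.">Click
  <gui>Save</gui>.</p>
  <!-- Longer note given as an editorial comment -->
  <comment>
    <cite date="2009-05-26">author</cite>
    <p>The next paragraph refers to the main window, not a dialog.</p>
  </comment>
  <p>Close the window when you are finished.</p>
</page>
]]></code>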
</section>
<section id="R020">
<title>R020: Annotation Markup</title>
<quote>
<cite href="http://www.w3.org/TR/itsreq/#annomark">W3C
Internationalization and Localization Markup Requirements</cite>
<p>[R020] There must be a way to support markup up of text annotations of
the 'ruby' type.</p>
</quote>
<p>Elements from external namespaces may be used in all
<link xref="mal_inline">inline contexts</link>. While this allows Ruby
annotations to be embedded within a Mallard document, the
<link xref="mal_inline#processing">fallback processing expectations</link>
are unlikely to produce satisfactory results for tools that do not support
Ruby. Future versions of this document should address this issue.</p>
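<p>As a rough sketch, Ruby markup from the ITS namespace might be embedded
in a paragraph as follows. The choice of the ITS Ruby elements, the page ID,
and the sample text are assumptions made for this example. Mallard itself
does not define any Ruby markup.</p>
<code><![CDATA[
<page xmlns="http://www.gnome.org/~shaunm/mallard"
      xmlns:its="http://www.w3.org/2005/11/its"
      its:version="1.0"
      type="topic" id="ruby-example"
      xml:lang="ja">
  <title>Ruby Example</title>
  <p>
    <its:ruby>
      <!-- Base text -->
      <its:rb>東京</its:rb>
      <!-- Parentheses for display when Ruby rendering is unavailable -->
      <its:rp>(</its:rp>
      <!-- Ruby text giving the reading of the base text -->
      <its:rt>とうきょう</its:rt>
      <its:rp>)</its:rp>
    </its:ruby>
  </p>
</page>
]]></code>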
</section>
<section id="R022">
<title>R022: Nested Elements</title>
<quote>
<cite href="http://www.w3.org/TR/itsreq/#nestedelems">W3C
Internationalization and Localization Markup Requirements</cite>
<p>[R022] Great care must be taken when defining or using nested
translatable elements.</p>
</quote>
<p>Mallard explicitly disallows mixing block and inline content,
except in well-defined cases which can easily be detected and
handled. In Mallard, any block element which can contain text
directly is considered to be a translation unit. Since these
elements do not allow general block content to be mixed into
the inline content, translation units can always be presented
to translators without the need for placeholders.</p>
<p>Note that this may not be the case if a translation tool chooses
to treat certain container elements as translation units. For example,
under some circumstances a translation tool might choose to present
<link xref="mal_table">tables</link> or
<link xref="mal_block_list">lists</link> as translatable to allow
translators to reorder the rows or items. In these cases, the block
content inside the entries or items would still constitute discrete
units of translation, making placeholders necessary.</p>
</section>
<section id="R025">
<title>R025: Elements and Segmentation</title>
<quote>
<cite href="http://www.w3.org/TR/itsreq/#elemseg">W3C
Internationalization and Localization Markup Requirements</cite>
<p>[R025] Methods, independent of the semantic, of the elements must
exist to provide hints on how to break down document content into
meaningful runs of text.</p>
</quote>
<p>Making meaningful distinctions is ultimately the job of a processing
tool, although the design of an XML vocabulary can have a significant
impact on implementation difficulty. The following notes will be relevant
to most tool implementors.</p>
<list>
<item>
<p>In Mallard, no elements contain “pernicious mixed content”: a
problematic content model wherein an element can contain either
inline content or block content, but not both. Resolving such
content models generally involves testing for the existence of
one of a certain set of elements, which can be difficult as
content models grow.</p>
<p>In Mallard, pernicious mixed content would be particularly
problematic, since certain element names are used in both block
and inline contexts.</p>
</item>
<item>
<p>In Mallard, elements generally contain either block content or
inline content. Thus, for example, you cannot place a paragraph
inside a paragraph. This is simpler for translators, as well as
for translation tool implementors, because it reduces the need
to use placeholders for separate translation units.</p>
</item>
<item>
<p>One notable exception to the above is the <code>item</code>
element in <link xref="mal_block_tree">tree lists</link>. To
simplify writing, tree list items simply take inline content
followed by any number of nested tree list items. Since the
block-like items are not interspersed with the inline content,
however, translation tools should be able to handle this case
without placeholders.</p>
</item>
<item>
<p>It is noteworthy that Mallard reuses some element names in both block
and inline contexts. The <code xref="mal_block_code">code</code> and
<code xref="mal_block_media">media</code> elements are two examples of
this. Since Mallard never allows general block content to be mixed with
general inline content, the purpose of these elements is unambiguous when
processed in context. Thus, it is important that tools always process
elements in context to handle them correctly.</p>
</item>
</list>
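<p>The notes above can be illustrated with a short sketch. The page ID and
sample content are assumptions made for this example, and the
<code>tree</code> element name is assumed from the tree list
definition.</p>
<code><![CDATA[
<page xmlns="http://www.gnome.org/~shaunm/mallard"
      type="topic" id="segmentation-example">
  <title>Segmentation Example</title>
  <!-- Block context: this code element is a block of its own -->
  <code>yelp help:example</code>
  <!-- Inline context: this code element is part of the paragraph,
       and the paragraph as a whole is the translation unit -->
  <p>Run <code>yelp</code> to open the help viewer.</p>
  <!-- Tree list: each item holds inline content followed by nested
       items, so the inline text is never interleaved with blocks -->
  <tree>
    <item>src
      <item>main.c</item>
      <item>util.c</item>
    </item>
  </tree>
</page>
]]></code>
<p>A tool can only tell the two <code>code</code> elements apart by looking
at the parent element, which is why processing in context matters.</p>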
</section>
</page>