summaryrefslogtreecommitdiff
path: root/libs/endian/doc/choosing_approach.html
diff options
context:
space:
mode:
Diffstat (limited to 'libs/endian/doc/choosing_approach.html')
-rw-r--r--libs/endian/doc/choosing_approach.html412
1 files changed, 412 insertions, 0 deletions
diff --git a/libs/endian/doc/choosing_approach.html b/libs/endian/doc/choosing_approach.html
new file mode 100644
index 000000000..effb79fdd
--- /dev/null
+++ b/libs/endian/doc/choosing_approach.html
@@ -0,0 +1,412 @@
+<html>
+
+<head>
+<meta name="GENERATOR" content="Microsoft FrontPage 5.0">
+<meta name="ProgId" content="FrontPage.Editor.Document">
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<title>Choosing Approach</title>
+<link href="styles.css" rel="stylesheet">
+</head>
+
+<body>
+
+<table border="0" cellpadding="5" cellspacing="0" style="border-collapse: collapse" bordercolor="#111111" width="100%">
+ <tr>
+ <td width="339">
+<a href="../../../index.html">
+<img src="../../../boost.png" alt="Boost logo" align="middle" border="0" width="277" height="86"></a></td>
+ <td align="middle" width="1253">
+ <font size="6"><b>Choosing the Approach</b></font></td>
+ </tr>
+</table>
+
+<table border="0" cellpadding="5" cellspacing="0" style="border-collapse: collapse"
+ bordercolor="#111111" bgcolor="#D7EEFF" width="100%">
+ <tr>
+ <td><b>
+ <a href="index.html">Endian Home</a>&nbsp;&nbsp;&nbsp;&nbsp;
+ <a href="conversion.html">Conversion Functions</a>&nbsp;&nbsp;&nbsp;&nbsp;
+ <a href="arithmetic.html">Arithmetic Types</a>&nbsp;&nbsp;&nbsp;&nbsp;
+ <a href="buffers.html">Buffer Types</a>&nbsp;&nbsp;&nbsp;&nbsp;
+ <a href="choosing_approach.html">Choosing Approach</a></b></td>
+ </tr>
+</table>
+<p></p>
+
+<table border="1" cellpadding="5" cellspacing="0" style="border-collapse: collapse" bordercolor="#111111" align="right">
+ <tr>
+ <td width="100%" bgcolor="#D7EEFF" align="center">
+ <i><b>Contents</b></i></td>
+ </tr>
+ <tr>
+ <td width="100%" bgcolor="#E8F5FF">
+<a href="#Introduction">Introduction</a><br>
+<a href="#Choosing">Choosing between conversion functions,</a><br>
+ &nbsp; <a href="#Choosing">buffer types, and arithmetic types</a><br>
+&nbsp;&nbsp;&nbsp;<a href="#Characteristics">Characteristics</a><br>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Endianness-invariants">Endianness invariants</a><br>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Conversion-explicitness">Conversion explicitness</a><br>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Arithmetic-operations">Arithmetic operations</a><br>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Sizes">Sizes</a><br>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Alignments">Alignments</a><br>
+&nbsp;&nbsp;&nbsp;<a href="#Design-patterns">Design patterns</a><br>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#As-needed">Convert only as needed (i.e. lazy)</a><br>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Anticipating-need">Convert in anticipation of need</a><br>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Convert-generally-as-needed-locally-in-anticipation">Generally
+as needed, locally in anticipation</a><br>
+&nbsp;&nbsp;&nbsp;<a href="#Use-cases">Use case examples</a><br>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Porting-endian-unaware-codebase">Porting endian unaware codebase</a><br>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Porting-endian-aware-codebase">Porting endian aware codebase</a><br>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Reliability-arithmetic-speed">Reliability and arithmetic-speed</a><br>
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Reliability-ease-of-use">Reliability and ease-of-use</a></td>
+ </tr>
+ </table>
+
+<h2><a name="Introduction">Introduction</a></h2>
+
+<p>Deciding which is the best endianness approach (conversion functions, buffer
+types, or arithmetic types) for a particular application involves complex
+engineering trade-offs. It is hard to assess those trade-offs without some
+understanding of the different interfaces, so you might want to read the
+<a href="conversion.html">conversion functions</a>, <a href="buffers.html">
+buffer types</a>, and <a href="arithmetic.html">arithmetic types</a> pages
+before diving into this page.</p>
+
+<h2><a name="Choosing">Choosing</a> between conversion functions, buffer types,
+and arithmetic types</h2>
+
+<p>The best approach to endianness for a particular application depends on the interaction between
+the application&#39;s needs and the characteristics of each of the three approaches.</p>
+
+<p><b>Recommendation:</b> If you are new to endianness, uncertain, or don&#39;t want to invest
+the time to
+study
+engineering trade-offs, use <a href="arithmetic.html">endian arithmetic types</a>. They are safe, easy
+to use, and easy to maintain. Use the
+<a href="#Anticipating-need"> <i>
+anticipating need</i></a> design pattern locally around performance hot spots
+like lengthy loops, if needed.</p>
+
+<h3><a name="Background">Background</a> </h3>
+
+<p>A dealing with endianness usually implies a program portability or a data
+portability requirement, and often both. That means real programs dealing with
+endianness are usually complex, so the examples shown here would really be
+written as multiple functions spread across multiple translation units. They
+would involve interfaces that can not be altered as they are supplied by
+third-parties or the standard library. </p>
+
+<h3><a name="Characteristics">Characteristics</a></h3>
+
+<p>The characteristics that differentiate the three approaches to endianness are the endianness
+invariants, conversion explicitness, arithmetic operations, sizes available, and
+alignment requirements.</p>
+
+<h4><a name="Endianness-invariants">Endianness invariants</a></h4>
+
+<blockquote>
+
+<p><b>Endian conversion functions</b> use objects of the ordinary C++ arithmetic
+types like <code>int</code> or <code>unsigned short</code> to hold values. That
+breaks the implicit invariant that the C++ language rules apply. The usual
+language rules only apply if the endianness of the object is currently set to the native endianness for the platform. That can
+make it very hard to reason about logic flow, and result in difficult to
+find bugs.</p>
+
+<p>For example:</p>
+
+<blockquote>
+ <pre>struct data_t // big endian
+{
+ int32_t v1; // description ...
+ int32_t v2; // description ...
+ ... additional character data members (i.e. non-endian)
+ int32_t v3; // description ...
+};
+
+data_t data;
+
+read(data);
+big_to_native_inplace(data.v1);
+big_to_native_inplace(data.v2);
+
+...
+
+++v1;
+third_party::func(data.v2);
+
+...
+
+native_to_big_inplace(data.v1);
+native_to_big_inplace(data.v2);
+write(data);
+</pre>
+ <p>The programmer didn&#39;t bother to convert <code>data.v3</code> to native
+ endianness because that member isn&#39;t used. A later maintainer needs to pass
+ <code>data.v3</code> to the third-party function, so adds <code>third_party::func(data.v3);</code>
+ somewhere deep in the code. This causes a silent failure because the usual
+ invariant that an object of type <code>int32_t</code> holds a value as
+ described by the C++ core language does not apply.</p>
+</blockquote>
+<p><b>Endian buffer and arithmetic types</b> hold values internally as arrays of
+characters with an invariant that the endianness of the array never changes.
+That makes these types easier to use and programs easier to maintain. </p>
+<p>Here is the same example, using an endian arithmetic type:</p>
+<blockquote>
+ <pre>struct data_t
+{
+ big_int32_t v1; // description ...
+ big_int32_t v2; // description ...
+ ... additional character data members (i.e. non-endian)
+ big_int32_t v3; // description ...
+};
+
+data_t data;
+
+read(data);
+
+...
+
+++v1;
+third_party::func(data.v2);
+
+...
+
+write(data);
+</pre>
+ <p>A later maintainer can add <code>third_party::func(data.v3)</code>and it
+ will just-work.</p>
+</blockquote>
+
+</blockquote>
+
+<h4><a name="Conversion-explicitness">Conversion explicitness</a></h4>
+
+<blockquote>
+
+<p><b>Endian conversion functions</b> and <b>buffer types</b> never perform
+implicit conversions. This gives users explicit control of when conversion
+occurs, and may help avoid unnecessary conversions.</p>
+
+<p><b>Endian arithmetic types</b> perform conversion implicitly. That makes
+these types very easy to use, but can result in unnecessary conversions. Failure
+to hoist conversions out of inner loops can bring a performance penalty.</p>
+
+</blockquote>
+
+<h4><a name="Arithmetic-operations">Arithmetic operations</a></h4>
+
+<blockquote>
+
+<p><b>Endian conversion functions</b> do not supply arithmetic
+operations, but this is not a concern since this approach uses ordinary C++
+arithmetic types to hold values.</p>
+
+<p><b>Endian buffer types</b> do not supply arithmetic operations. Although this
+approach avoids unnecessary conversions, it can result in the introduction of
+additional variables and confuse maintenance programmers.</p>
+
+<p><b>Endian</b> <b>arithmetic types</b> do supply arithmetic operations. They
+are very easy to use if lots of arithmetic is involved. </p>
+
+</blockquote>
+
+<h4><a name="Sizes">Sizes</a></h4>
+
+<blockquote>
+
+<p><b>Endianness conversion functions</b> only support 1, 2, 4, and 8 byte
+integers. That&#39;s sufficient for many applications.</p>
+
+<p><b>Endian buffer and arithmetic types</b> support 1, 2, 3, 4, 5, 6, 7, and 8
+byte integers. For an application where memory use or I/O speed is the limiting
+factor, using sizes tailored to application needs can be useful.</p>
+
+</blockquote>
+
+<h4><a name="Alignments">Alignments</a></h4>
+
+<blockquote>
+
+<p><b>Endianness conversion functions</b> only support aligned integer and
+floating-point types. That&#39;s sufficient for most applications.</p>
+
+<p><b>Endian buffer and arithmetic types</b> support both aligned and unaligned
+integer and floating-point types. Unaligned types are rarely needed, but when
+needed they are often very useful and workarounds are painful. For example,</p>
+
+<blockquote>
+ <p>Non-portable code like this:<blockquote>
+ <pre>struct S {
+ uint16_t a;&nbsp; // big endian
+ uint32_t b;&nbsp; // big endian
+} __attribute__ ((packed));</pre>
+ </blockquote>
+ <p>Can be replaced with portable code like this:</p>
+ <blockquote>
+ <pre>struct S {
+ big_uint16_ut a;
+ big_uint32_ut b;
+};</pre>
+ </blockquote>
+ </blockquote>
+
+</blockquote>
+
+<h3><a name="Design-patterns">Design patterns</a></h3>
+
+<p>Applications often traffic in endian data as records or packets containing
+multiple endian data elements. For simplicity, we will just call them records.</p>
+
+<p>If desired endianness differs from native endianness, a conversion has to be
+performed. When should that conversion occur? Three design patterns have
+evolved.</p>
+
+<h4><a name="As-needed">Convert only as needed</a> (i.e. lazy)</h4>
+
+<p>This pattern defers conversion to the point in the code where the data
+element is actually used.</p>
+
+<p>This pattern is appropriate when which endian element is actually used varies
+greatly according to record content or other circumstances</p>
+
+<h4><a name="Anticipating-need">Convert in anticipation of need</a></h4>
+
+<p>This pattern performs conversion to native endianness in anticipation of use,
+such as immediately after reading records. If needed, conversion to the output
+endianness is performed after all possible needs have passed, such as just
+before writing records.</p>
+
+<p>One implementation of this pattern is to create a proxy record with
+endianness converted to native in a read function, and expose only that proxy to
+the rest of the implementation. If a write function, if needed, handles the
+conversion from native to the desired output endianness.</p>
+
+<p>This pattern is appropriate when all endian elements in a record are
+typically used regardless of record content or other circumstances</p>
+
+<h4><a name="Convert-generally-as-needed-locally-in-anticipation">Convert
+only as needed, except locally in anticipation of need</a></h4>
+
+<p>This pattern in general defers conversion but for specific local needs does
+anticipatory conversion. Although particularly appropriate when coupled with the endian buffer
+or arithmetic types, it also works well with the conversion functions.</p>
+
+<p>Example:</p>
+
+<blockquote>
+ <pre>struct data_t
+{
+ big_int32_t v1;
+ big_int32_t v2;
+ big_int32_t v3;
+};
+
+data_t data;
+
+read(data);
+
+...
+++v1;
+...
+
+int32_t v3_temp = data.v3; // hoist conversion out of loop
+
+for (int32_t i = 0; i &lt; <i><b>large-number</b></i>; ++i)
+{
+ ... <i><b>lengthy computation that accesses </b></i>v3_temp<i><b> many times</b></i> ...
+}
+data.v3 = v3_temp;
+
+write(data);
+</pre>
+</blockquote>
+
+<p dir="ltr">In general the above pseudo-code leaves conversion up to the endian
+arithmetic type <code>big_int32_t</code>. But to avoid conversion inside the
+loop, a temporary is created before the loop is entered, and then used to set
+the new value of <code>data.v3</code> after the loop is complete.</p>
+
+<blockquote>
+
+<p dir="ltr">Question: Won&#39;t the compiler&#39;s optimizer hoist the conversion out
+of the loop anyhow?</p>
+
+<p dir="ltr">Answer: VC++ 2015 Preview, and probably others, does not, even for
+a toy test program. Although the savings is small (two register <code>
+<span style="font-size: 85%">bswap</span></code> instructions), the cost might
+be significant if the loop is repeated enough times. On the other hand, the
+program may be so dominated by I/O time that even a lengthy loop will be
+immaterial.</p>
+
+</blockquote>
+
+<h3><a name="Use-cases">Use case examples</a></h3>
+
+<h4><a name="Porting-endian-unaware-codebase">Porting endian unaware codebase</a></h4>
+
+<p>An existing codebase runs on big endian systems. It does not
+currently deal with endianness. The codebase needs to be modified so it can run
+on&nbsp; little endian systems under various operating systems. To ease
+transition and protect value of existing files, external data will continue to
+be maintained as big endian.</p>
+
+<p dir="ltr">The <a href="arithmetic.html">endian
+arithmetic approach</a> is recommended to meet these needs. A relatively small
+number of header files dealing with binary I/O layouts need to change types. For
+example,&nbsp;
+<code>short</code> or <code>int16_t</code> would change to <code>big_int16_t</code>. No
+changes are required for <code>.cpp</code> files.</p>
+
+<h4><a name="Porting-endian-aware-codebase">Porting endian aware codebase</a></h4>
+
+<p>An existing codebase runs on little-endian Linux systems. It already
+deals with endianness via
+<a href="http://man7.org/linux/man-pages/man3/endian.3.html">Linux provided
+functions</a>. Because of a business merger, the codebase has to be quickly
+modified for Windows and possibly other operating systems, while still
+supporting Linux. The codebase is reliable and the programmers are all
+well-aware of endian issues. </p>
+
+<p dir="ltr">These factors all argue for an <a href="conversion.html">endian conversion
+approach</a> that just mechanically changes the calls to <code>htobe32</code>,
+etc. to <code>boost::endian::native_to_big</code>, etc. and replaces <code>&lt;endian.h&gt;</code>
+with <code>&lt;boost/endian/conversion.hpp&gt;</code>.</p>
+
+<h4><a name="Reliability-arithmetic-speed">Reliability and arithmetic-speed</a></h4>
+
+<p>A new, complex, multi-threaded application is to be developed that must run
+on little endian machines, but do big endian network I/O. The developers believe
+computational speed for endian variable is critical but have seen numerous bugs
+result from inability to reason about endian conversion state. They are also
+worried that future maintenance changes could inadvertently introduce a lot of
+slow conversions if full-blown endian arithmetic types are used.</p>
+
+<p>The <a href="buffers.html">endian buffers</a> approach is made-to-order for
+this use case.</p>
+
+<h4><a name="Reliability-ease-of-use">Reliability and ease-of-use</a></h4>
+
+<p>A new, complex, multi-threaded application is to be developed that must run
+on little endian machines, but do big endian network I/O. The developers believe
+computational speed for endian variables is <b>not critical</b> but have seen
+numerous bugs result from inability to reason about endian conversion state.
+They are also concerned about ease-of-use both during development and long-term
+maintenance.</p>
+
+<p>Removing concern about conversion speed and adding concern about ease-of-use
+tips the balance strongly in favor the <a href="arithmetic.html">endian
+arithmetic approach</a>.</p>
+
+<hr>
+<p>Last revised:
+<!--webbot bot="Timestamp" s-type="EDITED" s-format="%d %B, %Y" startspan -->19 January, 2015<!--webbot bot="Timestamp" endspan i-checksum="38903" --></p>
+<p>© Copyright Beman Dawes, 2011, 2013, 2014</p>
+<p>Distributed under the Boost Software License, Version 1.0. See
+<a href="http://www.boost.org/LICENSE_1_0.txt">www.boost.org/ LICENSE_1_0.txt</a></p>
+
+<p>&nbsp;</p>
+
+</body>
+
+</html> \ No newline at end of file