<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE article PUBLIC 
   "-//OASIS//DTD DocBook XML V4.1.2//EN"
   "docbook/docbookx.dtd" [
<!ENTITY howto         "http://en.tldp.org/HOWTO/">
<!ENTITY mini-howto    "http://en.tldp.org/HOWTO/mini/">
<!ENTITY homepage      "http://catb.org/~esr/">
]>
<article>
<title>Where's the Latency?  Performance analysis of GPSes and GPSD</title>

<articleinfo>

<author>
  <firstname>Eric</firstname>
  <othername>Steven</othername>
  <surname>Raymond</surname>
  <affiliation>
    <orgname><ulink url="&homepage;">
    Thyrsus Enterprises</ulink></orgname> 
    <address>
    <email>esr@thyrsus.com</email>
    </address>
  </affiliation>
</author>
<copyright>
  <year>2005</year>
  <holder role="mailto:esr@thyrsus.com">Eric S. Raymond</holder> 
</copyright>

<revhistory>
   <revision>
      <revnumber>2.0</revnumber>
      <date>23 September 2011</date>
      <authorinitials>esr</authorinitials>
      <revremark>
	 Update to include whole-cycle profiling.
      </revremark>
   </revision>
   <revision>
      <revnumber>1.2</revnumber>
      <date>27 September 2009</date>
      <authorinitials>esr</authorinitials>
      <revremark>
	 Endnote about the big protocol change.
      </revremark>
   </revision>
   <revision>
      <revnumber>1.1</revnumber>
      <date>4 January 2008</date>
      <authorinitials>esr</authorinitials>
      <revremark>
	 Typo fixes and clarifications for issues raised by Bruce Sutherland.
      </revremark>
   </revision>
   <revision>
      <revnumber>1.0</revnumber>
      <date>21 February 2005</date>
      <authorinitials>esr</authorinitials>
      <revremark>
	 Initial draft.
      </revremark>
   </revision>
</revhistory>

<abstract>
<para>Many GPS manufacturers tout binary protocols more dense than
NMEA and baud rates higher than the NMEA-standard 4800 bps as ways to
increase GPS performance.  While working on
<application>gpsd</application>, I became interested in evaluating
these claims, which have some implications for the design of
<application>gpsd</application>.  Later, the average and peak latency
became of interest for modeling the performance of GPS as a time
service. This paper discusses the theory and the results of profiling
the code, and reaches some conclusions about system tuning and latency
control.</para>
</abstract>
</articleinfo>

<sect1><title>What Can We Measure and Improve?</title>

<para>The most important characteristic of a GPS, the positional
accuracy of its fixes, is a function of the GPS system and receiver 
design; GPSD can do nothing to affect it.  However, a GPS fix has a
timestamp, and the transmission path from the GPS to a user
application introduces some latency (that is, delay between the time
of fix and when it is available to the user).</para>

<para>Latency could be a significant source of position error for a GPS in
motion. It may also be a significant issue if the GPS is being used as
a high-precision time source.</para>

<para>This paper describes the results of two rounds of measurement
using different tools.  Both yielded interesting results, and can usefully 
be juxtaposed with one another.</para>

<para>The first round was performed in 2005 using a version of the
code very near 2.13.  This was before the JSON protocol change and before
GPSD developed the capability to do automated cycle detection.
Consequently the statistics extracted were primarily timings of
packets coming over the wire from the GPS.</para>

<para>The second round was performed in 2011 using a version of the
code between 3.1 and 3.2. At this time GPSD was being evaluated as a 
high-precision time source for use in measuring network latency and
checking the performance of NTP.  The profiling tools built into
GPSD had been rebuilt with an emphasis on timing the entire GPS
reporting cycle, rather than individual packets.</para>

<para>Both the 2005 and 2011 rounds used the same consumer-grade
sensor attached to a Linux machine via a USB link, reporting
to <application>gpsd</application>, which was polled via sockets by a
profiling application.  In all cases the GPS was stationary at
approximately 40N 75W.</para>

</sect1>
<sect1><title>Per-sentence profiling</title>

<sect2><title>Modeling the reporting chain</title>

<para>Consider the whole transmission path of a PVT
(position/velocity/time) report from a GPS to the user or user
application.  It has the following stages:</para>

<orderedlist>
<listitem>
<para>A PVT report is generated in the GPS</para>
</listitem>
<listitem>
<para>It is encoded (into NMEA or a vendor binary protocol) 
and buffered for transmission via serial link.</para>
</listitem>
<listitem>
<para>The encoding is transmitted via serial link to a buffer in <application>gpsd</application>.</para>
</listitem>
<listitem>
<para>The encoding is decoded and translated into a notification in
GPSD's protocol.</para>
</listitem>
<listitem>
<para>The GPSD-protocol notification is polled and read over a 
client socket.</para>
</listitem>
<listitem>
<para>The GPSD-protocol notification is decoded by libgps and unpacked
into a session structure available to the user application.</para>
</listitem>
</orderedlist>

<para>It is also relevant that consumer-grade GPSes do not expect to
be polled, but are designed to issue PVT reports on a fixed cycle
time, which we'll call C and which is usually 1
second. <application>gpsd</application> expects this.  A few GPSes
(notably SiRF-II-based ones) can be polled, and we might thus be able
to pull PVT reports out of them at a higher rate.
<application>gpsd</application> doesn't do this; one question this
investigation can address is whether there would be any point to
that.</para>

<para>At various times GPS manufacturers have promoted proprietary
binary protocols and transmission speeds higher than the NMEA-standard
4800bps as ways to improve GPS performance.  Obviously these cannot
affect positional accuracy; all they can change is the latency at
stages 2, 3, and 4.</para>

<para>The implementation of <application>gpsd</application> affects
how much latency is introduced at stage 4.  The design of the
<application>gpsd</application> protocol (in particular, the average
and worst-case size and complexity of a position/velocity/time report)
affects how much latency is introduced at stages 5 and 6.</para>

<para>At stages 5 and later, the client design and implementation 
matter a lot.  In particular, it matters how frequently the client
samples the PVT reports that <application>gpsd</application> makes
available.</para>

<para>The list of stages above implies the following formula for 
expected latency L, and a set of tactics for reducing it:</para>

<literallayout>
L = C/2 + E1 + T1 + D1 + W + E2 + T2 + D2
</literallayout>

<para>where:</para>

<orderedlist>
<listitem>
<para>C/2 is the expected delay introduced by a cycle time of C
(worst-case delay would just be C). We can decrease this by decreasing
C, but consumer-grade GPSes don't go below 1 second.</para>
</listitem>
<listitem>
<para>E1 is PVT encoding time within the GPS. We can't affect this.</para>
</listitem>
<listitem>
<para>T1 is transmission time over the serial link.  We can decrease
this by raising the baud rate or increasing the information density 
of the encoding.</para>
</listitem>
<listitem>
<para>D1 is decode time required for <application>gpsd</application>
to update its session structure.  We can decrease this, if need be,
by tuning the implementation or using faster hardware.</para>
</listitem>
<listitem>
<para>W is the wait until the application polls
<application>gpsd</application>. This can only be reduced by designing
the application to poll frequently.</para>
</listitem>
<listitem>
<para>E2 is PVT encoding time within the daemon. We can speed this up
with faster hardware or a simpler GPSD format.</para>
</listitem>
<listitem>
<para>T2 is transmission time over the client socket. Faster hardware,
a better TCP/IP stack or a denser encoding can decrease this.</para>
</listitem>
<listitem>
<para>D2 is decoding time required for the client library to update
the session structure visible to the user application.  A simpler 
GPSD format could decrease this</para>
</listitem>
</orderedlist>
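
<para>To make the formula concrete, here is a minimal back-of-envelope
sketch in Python.  Every component value is an illustrative assumption,
not a measurement; only the transmission-time term uses the
ten-bits-per-character serial-framing arithmetic relied on elsewhere in
this paper.</para>

<programlisting>
# Back-of-envelope latency model.  Every constant below is an
# illustrative assumption, not measured data.

def transmission_time(chars, bps):
    # 8N1 framing: one start bit + eight data bits + one stop bit,
    # so ten bits on the wire per character.
    return chars * 10.0 / bps

C  = 1.0                              # GPS cycle time (seconds)
E1 = 0.05                             # assumed encoding time in the GPS
T1 = transmission_time(70, 4800)      # one 70-character NMEA sentence
D1 = 0.001                            # assumed gpsd decode time
W  = 0.0                              # watcher mode: no polling wait
E2 = 0.001                            # assumed daemon encoding time
T2 = 0.001                            # assumed client-socket transit
D2 = 0.001                            # assumed client decode time

L = C / 2 + E1 + T1 + D1 + W + E2 + T2 + D2
print("expected latency: %.3f sec" % L)
</programlisting>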

<para>The total figure L is of interest, of course.  The first
question to ask is how it compares to C. But to know where
tuning this system is worth the effort and where it isn't, the
relative magnitude of these eight components is what is important. In
particular, if C or E1 dominate, there is no point in trying to tune
the system at all.</para>

<para>The rule on modern hardware is that computation is cheap,
communication is expensive.  By this rule, we expect E1, D1, E2, and D2 to
be small relative to T1 and T2.  We can't predict W.  Thus there is no
knowing how the sum of the other terms will compare to C, but we know
that E1 + T1 is the other statistic GPS vendors can easily measure.  Setting C
&lt; E1 + T1 would be a bad idea (reports would overrun the cycle), and we
can guess that competition among GPS vendors will tend to push C
downwards to the point where it's not much larger than E1 + T1.</para>

<para>C is known from manufacturer specifications.
<application>gpsd</application> and its client libraries can be built
with profiling code that measures all the other timing variables.  The
tool
<citerefentry><refentrytitle>gpsprof</refentrytitle><manvolnum>1</manvolnum></citerefentry>
collects this data and generates reports and plots from it.  There
are, however, some sources of error to be aware of:</para>

<itemizedlist>
<listitem>
<para>Our way of measuring E1 and T1 is to collect a timestamp on the 
first character read of a new NMEA sentence, then on the terminating 
newline, and compare those to the GPS timestamp on the sentence.  
While this will measure E1+T1 accurately, it will underestimate
the contribution of T1 to the whole because it doesn't measure
RS232 activity taking place before the first character becomes
visible at the receive end (a sketch of this timestamping scheme
follows the list).</para>
</listitem>
<listitem>
<para>Because we compare GPS sentence timestamps with local ones, 
inaccuracy in the computer's clock fuzzes the measurements.  The test
machine kept its clock synchronized via NTP, so the expected
inaccuracy from this source should be no more than about ten
milliseconds.</para>
</listitem>
<listitem>
<para>The $ clause that the daemon uses to ship per-sentence profiling info to
the client adds substantial bulk to the traffic.  Thus, it will tend 
to inflate E2, T2, and D2 somewhat.</para>
</listitem>
<listitem>
<para>The client library used for profiling is written in Python,
which will further inflate D2 relative to the C client library most
applications are likely to use.</para>
</listitem>
<listitem>
<para>The system-call overhead of profiling (seven
<citerefentry><refentrytitle>gettimeofday</refentrytitle><manvolnum>2</manvolnum></citerefentry>
calls per sentence to collect timestamps, plus several other time-library
calls per sentence to convert ISO 8601 timestamps) will introduce a
small amount of noise into the figures.  These are cheap calls that
don't induce disk activity; thus, on modern hardware, we may expect
the overhead per call to be at worst in the microsecond range.  The
entire per-sentence system-call overhead should be on the
order of ten microseconds.</para>
</listitem>
</itemizedlist>
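
<para>Here is a minimal sketch of the first-character/terminating-newline
timestamping described in the first item above.  It is illustrative
Python, not the daemon's actual C implementation; the device path is an
assumption, and it presumes the serial line has already been configured
(e.g. with stty).</para>

<programlisting>
# Illustrative per-sentence timestamping (gpsd itself does this in C).
# Assumes /dev/ttyUSB0 is already configured to the right baud rate.
import time

with open("/dev/ttyUSB0", "rb", buffering=0) as port:
    first_stamp, sentence = None, b""
    while True:
        ch = port.read(1)
        if not ch:
            continue
        if first_stamp is None:
            first_stamp = time.time()  # first character of the sentence
        sentence += ch
        if ch == b"\n":
            last_stamp = time.time()   # terminating newline
            # Lower bound on T1: time the sentence spent arriving.
            print(sentence.strip(), last_stamp - first_stamp)
            first_stamp, sentence = None, b""
</programlisting>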

</sect2>
<sect2><title>Data and Analysis</title>

<para>I took measurements using a Haicom 204s USB GPS mouse.  This
device, using a SiRF-II GPS chipset and PL2303 USB-to-serial chipset, is
very typical of 2005's consumer-grade GPS hardware; the Haicom people
themselves estimated to me in late 2004 that the SiRF-II had about 80%
and rising market share, and the specification sheets I find with
Web searches back this up. Each profile run used 100 samples.</para>

<para>My host system for the measurements was an Opteron 3400 running an
"everything" installation of Fedora Core 3.  This is still a
moderately fast machine in early 2005, but average processor
utilization remained low throughout the tests.</para>

<para>The version of the GPSD software I used for the test was
released as 2.13.  It was configured with
--enable-profiling.  All graphs and figures were generated
with
<citerefentry><refentrytitle>gpsprof</refentrytitle><manvolnum>1</manvolnum></citerefentry>,
a tool built for this purpose and included in the distribution.</para>

<para>One of the effects of building with
--enable-profiling is that a form of the B command that
normally just reports the RS232 parameters can be used to set them (it
ships a SiRF-II control string to the GPS and then changes the line
settings).</para>

<para>Another effect is to enable a Z command to switch on profiling.
When profiling is on, each time 
<application>gpsd</application>
reports a fix with a timestamp (e.g. on GPGGA, GPRMC and GPGLL
sentences) it also reports timing information from five checkpoints 
inside the daemon.  The client library adds two more checkpoints.</para>

<para>Our first graph is with profile reporting turned off, to give us
a handle on performance with the system disturbed as little as
possible.  This was generated with <command>gpsprof  -t "Haicom 204s" -T png -f
uninstrumented -s 4800</command>.  We'll compare it to later plots to
see the effect of profiling overhead.</para>

<figure><title>Total latency</title>
<mediaobject>
  <imageobject>
    <imagedata fileref='graph1.png'/>
  </imageobject>
</mediaobject>
</figure>

<para>The first thing to notice here is that the fix latency can be
just over a second; you can see the exact figures in the <ulink
url='profile1.txt'>raw data</ulink>.  Where is the time going?  Our next
graph was generated with <command>gpsprof -T png -t
"Haicom 204s" -f raw -s 4800</command></para>

<figure><title>Instrumented latency report</title>
<mediaobject>
  <imageobject>
    <imagedata fileref='graph2.png'/>
  </imageobject>
</mediaobject>
</figure>

<para>By comparing this graph to the previous one, it is pretty clear
that the profiling reports are not introducing any measurable latency.
But what is more interesting is to notice that D1 + W + E2 + T2 + D2
vanishes &mdash; at this timescale, all we can see is E1 and T1.</para>

<para>The <ulink url='profile2.txt'>raw data</ulink> bears this out.
All times besides E1 and T1 are so small that they are comparable to
the noise level of the measurements.  This may be a bit surprising
unless one knows that a W near 0 is expected in this setup;
<application>gpsprof</application> sets watcher mode.  Also, a modern
zero-copy TCP/IP stack like Linux's implements local sockets with very
low overhead. It is also a little surprising that E1 is so large
relative to E1+T1. Recall, however, that this may be measurement
error.</para>

<para>Our third graph (generated with <command>gpsprof -t "Haicom 204s" -T png -f split -s 4800</command>) changes the presentation so we can see
how latency varies with sentence type.</para>

<figure><title>Split latency report</title>
<mediaobject>
  <imageobject>
    <imagedata fileref='graph3.png'/>
  </imageobject>
</mediaobject>
</figure>

<para>The reason for the comb pattern in the previous graphs is now
apparent; latency is constant for any given sentence type. The obvious
correlate would be sentence length &mdash; but looking at the <ulink
url='profile3.txt'>raw data</ulink>, we see that that is not the only
factor.  Consider this table:</para>

<informaltable>
<tgroup cols='3'>
<thead>
<row>
<entry>Sentence type</entry>
<entry>Typical length</entry>
<entry>Typical latency</entry>
</row>
</thead>
<tbody>
<row>
<entry>GPRMC</entry>
<entry>70</entry>
<entry>1.01</entry>
</row>
<row>
<entry>GPGGA</entry>
<entry>81</entry>
<entry>0.23</entry>
</row>
<row>
<entry>GPGLL</entry>
<entry>49</entry>
<entry>0.31</entry>
</row>
</tbody>
</tgroup>
</informaltable>

<para>For illustration, here are some sample NMEA sentences logged 
while I was conducting these tests:</para>

<literallayout>
$GPRMC,183424.834,A,4002.1033,N,07531.2003,W,0.00,0.00,170205,,*11
$GPGGA,183425.834,4002.1035,N,07531.2004,W,1,05,1.9,134.7,M,-33.8,M,0.0,0000*48
$GPGLL,4002.1035,N,07531.2004,W,183425.834,A*27
</literallayout>

<para>Though GPRMCs are shorter than GPGGAs, they consistently have an
associated latency four times as long.  The graph tells us most of
this is E1.  There must be something the GPS is doing that is
computationally very expensive when it generates GPRMCs.  It may well
be that it actually computes the fix at that point in the send cycle
and buffers the results for retransmission in GPGGA and GPGLL form.
Alternatively, perhaps the speed/track computation is
expensive.</para>
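
<para>A quick check of the serial-framing arithmetic supports this: at
ten bits on the wire per character, pure transmission time cannot
account for the spread in the table above.</para>

<programlisting>
# Pure RS232 transmission time for the table's typical sentence
# lengths at 4800bps, assuming ten bits per character.
for name, length in (("GPRMC", 70), ("GPGGA", 81), ("GPGLL", 49)):
    print("%s: %2d chars, %.3f sec on the wire" % (name, length,
                                                   length * 10.0 / 4800))
# GPRMC: 70 chars, 0.146 sec; GPGGA: 81 chars, 0.169 sec;
# GPGLL: 49 chars, 0.102 sec.  None of these comes close to GPRMC's
# observed 1.01-sec latency, so most of that time must be E1.
</programlisting>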

<para>Now let's look at how the picture changes when we double the
baud rate.  <command>gpsprof -t "Haicom 204s" -T png -s 9600</command>
gives us this:</para>

<figure><title>Split latency report, 9600bps</title>
<mediaobject>
  <imageobject>
    <imagedata fileref='graph4.png'/>
  </imageobject>
</mediaobject>
</figure>

<para>This graph looks almost identical to the previous one, except
for vertical scale &mdash; latency has been cut neatly in half.
Transmission times for GPRMC go from about 0.15sec to 0.075sec. Oddly,
average E1 is also cut almost in half.  I don't know how to explain
this, unless a lot of what looks like E1 is actually RS232
transmission time spent before the first character appears in the
daemon's receive buffers.  You can also view the 
<ulink url='profile4.txt'>raw data</ulink>.</para>

<para>For comparison, here's the same plot made with a BU303b, a
different USB GPS mouse using the same SiRF-II/PL2303 combination:</para>

<figure><title>Split latency report, 9600bps</title>
<mediaobject>
  <imageobject>
    <imagedata fileref='graph5.png'/>
  </imageobject>
</mediaobject>
</figure>

<para>This, and the <ulink url='profile5.txt'>raw data</ulink>, look
very similar to the Haicom numbers.  The main difference seems to be
that the BU303b firmware doesn't ship GPGLL by default.</para>

</sect2>
</sect1>

<sect1><title>Per-cycle profiling</title>

<sect2><title>Modeling the reporting chain</title>

<para>When the old GPSD protocol was replaced by an application of JSON
and the daemon developed the capability to perform automatic detection of
the beginning and end of GPS reporting cycles, it became possible to measure
whole-cycle latency.  Also, embedding timing statistics in the JSON digest
of an entire cycle rather than as a $ sentence after each GPS packet
significantly reduced the overhead of profiling in the report stream.</para>

<para>The model for these measurements is as follows:</para>

<orderedlist>
<listitem>
<para>A PVT report is generated in the GPS (at time 'T')</para>
</listitem>
<listitem>
<para>It is encoded into a burst of sentences in NMEA or a vendor
binary protocol and buffered for transmission via serial link.</para>
</listitem>
<listitem>
<para>The encoding is transmitted via serial link to a buffer in
<application>gpsd</application>, beginning at a time we shall call
'S'.</para>
</listitem>
<listitem>
<para>Because the burst consists of multiple packets, a period combining
serial transmission time with <application>gpsd</application>
processing (packet-sniffing and analysis) time will follow.</para>
</listitem>
<listitem>
<para>At the end of this interval (at a time we shall call 'E'),
<application>gpsd</application> has seen the GPS data it needs and is
ready to produce a report to ship to clients.</para>
</listitem>
<listitem>
<para>Meanwhile, the GPS may still be transmitting data that GPSD does
not use.  But when the transmission burst is done, there will be quiet
time on the link (exception: as we noted in 2005, some devices' transmissions 
may slightly overflow the 1-second cycle time at 4800bps).</para>
</listitem>
<listitem>
<para>The JSON report is shipped, arriving at the client at a time we
shall call 'R'.</para>
</listitem>
</orderedlist>

<para>We cannot know T directly.  The GPS's timestamp on the fix will
tell us when it thinks T was, but because we don't know how our local
clock diverges from GPS's atomic-clock timebase we don't actually know
what T was in system time (call that T').  If we trust NTP, we then
believe that the skew between T and T' is no more than 10ms.</para>

<para>We catch time S by recording system time each time data becomes
available from the device.  If adjacent returns of select(2) are separated 
by more than 250msec, we have good reason to believe that the second one 
follows end-of-cycle quiet time.  This guard interval is reliable at 9600bps
and will only be more so at higher speeds.</para>
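
<para>A minimal sketch of this start-of-cycle detection, in Python
rather than the daemon's C; the 250msec guard interval comes from the
text above, and the rest is illustrative.</para>

<programlisting>
# Illustrative start-of-cycle detection.  A gap of more than 250msec
# between successive data arrivals is taken to be the end-of-cycle
# quiet time, so the next arrival marks time S.
import os, select, time

GUARD = 0.250          # guard interval; reliable at 9600bps and up

def watch(fd):
    last_ready = None
    while True:
        select.select([fd], [], [])          # block until data is ready
        now = time.time()
        if last_ready is None or now - last_ready > GUARD:
            print("cycle start (time S) at %.3f" % now)
        last_ready = now
        os.read(fd, 4096)                    # drain the available data
</programlisting>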

<para>We catch time E just before a JSON report is generated from the 
per-device session structures. This is the end of the analysis phase.
If timing is enabled, extra members carrying S, E and the number
of characters transmitted during the cycle (C) are included in the JSON.</para>

<para>We catch time R by noting when the JSON arrives at the client.</para>

<para>We know that the transmission-time portion of [S, E] can be
approximated by the formula (C * 10) / B where B is the
transmission rate in bits per second.  (Each character costs eight data
bits plus one start bit and one stop bit on an 8N1 line.)</para>

<para>Knowing this, we can subtract (C * 10) / B from (E-S) to approximate
the internal processing time spent by <application>gpsd</application>.
Due to other UART overheads, this formula will slightly underestimate
transmission time and thus overestimate processing time, but even a rough
comparison of the two is interesting.</para>
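
<para>As a minimal sketch, assuming S and E are in seconds of system
time and C is the cycle's character count as defined above:</para>

<programlisting>
# Split the [S, E] interval into estimated RS232 transmission time
# and gpsd processing time, per the formula above.
def split_cycle(s_time, e_time, chars, bps):
    rs232 = chars * 10.0 / bps    # ten bits on the wire per character
    processing = (e_time - s_time) - rs232
    return rs232, processing

# Example with assumed values: a 190-character cycle at 9600bps.
rs232, processing = split_cycle(0.000, 0.350, 190, 9600)
print("RS232 %.3f sec, gpsd processing %.3f sec" % (rs232, processing))
</programlisting>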
</sect2>

<sect2><title>Data and Analysis</title>

<para>With the new profiling tools, one graph (made with
<command>gpsprof -f instrumented -n 100 -T png</command>) tells the
story.  This is from the same Haicom 204s used in the 2005 tests.  You
can see the exact figures in the <ulink url='profile6.txt'>raw
data</ulink>.</para>

<figure><title>Per-cycle latency report, 19200bps</title>
<mediaobject>
  <imageobject>
    <imagedata fileref='graph6.png'/>
  </imageobject>
</mediaobject>
</figure>

<para>Fix latency (S - T, the purple time segment in each sample) is
consistently about 70msec.  Some of this represents on-chip
processing.  Some may represent actual skew between NTP time and GPS
time.</para>

<para>RS232 time (the blue segment) is the character transmission time
estimate we computed. It seems steady at around 100ms. This is
probably a bit low, proportionately speaking.</para>

<para>The green segment is (E-S) with RS232 computed time subtracted.
It approximates the time required by <application>gpsd</application>
for itself. It too seems steady at around 100ms. This is probably a bit
high, proportionately speaking.</para>

<para>The red dots that are just barely visible at the tops of some
sample bars represent R-E, the client reception delta. Inspection of
the raw data reveals that it is on the close order of 1ms.</para>

<para>Total fix latency is steady at about 260ms.</para>

<para>It is instructive to compare this with the graph (and the 
<ulink url='profile7.txt'>raw data</ulink>) from the same device 
at 9600bps.</para>

<figure><title>Per-cycle latency report, 9600bps</title>
<mediaobject>
  <imageobject>
    <imagedata fileref='graph7.png'/>
  </imageobject>
</mediaobject>
</figure>

<para>As we might expect, RS232 time changes drastically and the other
components barely change at all.  This gives us reason to be confident that
computed RS232 time is in fact tracking actual transmission time pretty 
closely.  It also confirms that the most effective way to decrease total
fix latency is simply to bump up the transmission speed.
</para>
</sect2>

</sect1>

<sect1><title>Conclusions</title>

<para>All these conclusions apply to the consumer-grade GPS hardware
generally available both in 2005 and in 2011, i.e. with a cycle time
of one second. As it happens, 2005 was just after the point at which
consumer-grade GPS chips stabilized as a technology, and though unit
prices have fallen they have changed very little in technology and
performance over the intervening six years.</para>

<sect2><title>For Application Programmers</title>

<para>For the best tradeoff between minimizing latency and use of
application resources, Nyquist's Theorem tells us to poll
<application>gpsd</application> once every half-cycle &mdash; that is,
on almost all GPSes at time of writing, twice a second.</para>
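
<para>For instance, here is a minimal client sketch that polls at the
half-cycle rate.  It assumes the JSON protocol of gpsd 2.90 and later
on the default port 2947, and that the ?POLL response carries its fixes
in a "tpv" array (true of recent versions); a production client would
use libgps or the gps Python module rather than raw sockets.</para>

<programlisting>
# Minimal half-cycle polling client (sketch; assumes the post-2.90
# JSON protocol and recent ?POLL response layout).
import json, socket, time

sock = socket.create_connection(("localhost", 2947))
fp = sock.makefile("rw")
fp.write('?WATCH={"enable":true}\n')     # watcher mode enables ?POLL
fp.flush()
while True:
    time.sleep(0.5)                      # half the 1-second cycle
    fp.write("?POLL;\n")
    fp.flush()
    line = fp.readline()
    while line:                          # skip greeting/ack lines
        report = json.loads(line)
        if report.get("class") == "POLL":
            for tpv in report.get("tpv", []):
                print(tpv.get("time"), tpv.get("lat"), tpv.get("lon"))
            break
        line = fp.readline()
</programlisting>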

<para>With the SiRF chips still used in most consumer-grade GPSes at time of
writing, 9600bps is the optimal line speed.  4800 is slightly too low,
not guaranteeing updates within the 1-second cycle time.  9600bps
yields updates in about 0.45sec, 19200bps in about 0.26sec.</para>

</sect2>
<sect2><title>For Manufacturer Claims</title>

<para>At the 1-second cycle time of consumer-grade devices, being able
to operate at 9600bps is useful, but higher speeds would probably not
be worth the extra computation unless your sensor is in rapid motion. Even
whole-cycle latency, most sensitive to transmission speed, is only cut
by less than 200ms by going to 19200bps. Higher speeds will exhibit 
diminishing returns.</para>

<para>Comparing the SiRF-II performance at 4800bps and 9600bps shows a
drop in E1+T1 that looks about linear, suggesting that for a cycle of
n seconds, the optimal line speed would be about 9600/n. Since future
GPS chips are likely to have faster processors and thus less latency,
this may be considered an upper bound on useful line speed.</para>
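
<para>A trivial application of that rule of thumb (the cycle times
are illustrative):</para>

<programlisting>
# Optimal line speed of about 9600/n bps for an n-second reporting
# cycle, per the roughly linear scaling observed above.
for cycle in (1.0, 0.5, 0.25):
    print("cycle %.2f sec: about %d bps" % (cycle, 9600 / cycle))
# 1.00 sec: 9600 bps; 0.50 sec: 19200 bps; 0.25 sec: 38400 bps
</programlisting>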

<para>Because 9600bps is readily available, the transmission- and
decode-time advantages of binary protocols over NMEA are not
significant within a 1-per-second update cycle.  Because line speeds
up to 38400 are readily available through standard UARTs, we may
expect this to continue to be the case even with cycle times as
low as 0.25sec.</para>

<para>More generally, binary protocols are largely pointless except as
market-control devices for the manufacturers.  The additional
capabilities they support could be just as effectively supported
through NMEA's $P-prefix extension mechanism.</para>

</sect2>
<sect2><title>For GPSD as a Time Service</title>

<para>We have measured a typical intrinsic time latency of about 70msec due to
on-GPS processing and the USB polling interval. While this is noticeably 
higher than NTP's expected accuracy of &plusmn;10msec, it should be
adequate for most applications other than physics experiments.</para>

</sect2>
<sect2><title>For the Design of GPSD</title>

<para>In 2005, I wrote that <application>gpsd</application> does not
introduce measurable latency into the path from GPS to application.  I
said that cycle times would have to decrease by two orders of
magnitude for this to change.</para>

<para>In 2011, with better whole-cycle-oriented profiling tools and a
faster test machine, the latency incurred by
<application>gpsd</application> can be measured.  It is less than 0.1
sec on a 2.66GHz Intel Core Duo under normal load. How much less depends
on how much the model computations underestimate RS232 transmission time
for the GPS data.</para>
</sect2>
</sect1>
</article>

<!--
Local Variables:
compile-command: "xmlto xhtml-nochunks performance.xml"
End:
-->