This is the Hacker's Guide to gpsd.  If you're viewing it with Emacs, try
doing Ctl-C Ctl-t and browsing through the outline headers.

If you're looking for things to hack on, first see the TODO file.

** Contribution guidelines

We prefer diff -u format, but diff -c is acceptable.  Do not send
patches in the default (-e or ed) mode, as they are too brittle.

If you are introducing a new feature, include a documentation patch.

If you are contributing a driver for a new GPS, please also do the
following things:

1) Send us a representative sample of the GPS output for future 
   regression-testing.

2) Write a hardware entry describing the GPS for the hardware page at
   <http://gpsd.berlios.de/hardware.html>.

Before shipping a patch, it is a good idea to make sure the patched
code displays no warnings when you run 'make splint'.  See
http://www.splint.org for a description of this tool.

** The license on contributions

The GPSD libraries are under the BSD license.  Please do not send 
contributions with GPL attached!

The reason for this policy is to avoid making people nervous about
linking the GPSD libraries to applications that may be under other
licenses (such as MIT, BSD, AFL, etc.).

** Debugging

For debugging purposes, it may be helpful to configure with --disable-shared.
This turns off all the shared-library crud, making it somewhat easier to
use gdb.

There is a script called logextract in the distribution that you can use
to strip clean NMEA out of the log files produced by gpsd.  This can be
useful if someone ships you a log that they allege caused gpsd to 
misbehave.

gpsfake enables you to repeatedly feed a packet sequence to a gpsd
instance running as non-root.  Watching such a session with gdb should
smoke out any repeatable bug pretty quickly.

The parsing of GPGSV sentences in the NMEA driver has been a
persistent and nasty trouble spot, causing more buffer overruns and
weird secondary damage than all the rest of the code put together.
Any time you get a bug report that seems to afflict NMEA devices
only, suspicion should focus here.

** The Y2.1K problem and other calendar issues

Because of limitations in various GPS protocols (e.g., they were
designed by fools who weren't looking past the ends of their noses) 
this code unavoidably includes some assumptions that will turn around
and bite on various future dates. 

The two specific problems are:

1) NMEA delivers only two-digit years.

2) SiRF chips at firmware level 231 deliver only GPS time in binary mode,
not leap-second-corrected UTC.

See the timebase.h file for various constants that will need to
be tweaked occasionally to cope with these problems.

Note that gpsd does not rely on the system clock in any way.  This
is so you can use it to set the system clock.
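
As a rough illustration of why both problems need periodically
adjusted constants, here is a minimal Python sketch.  The names
CENTURY_BASE and LEAP_SECONDS and their values are assumptions made
for the example; check timebase.h for what the code really uses.

```python
# Hypothetical stand-ins for the kind of constants timebase.h holds.
CENTURY_BASE = 2000   # assumed century for two-digit NMEA years
LEAP_SECONDS = 13     # assumed GPS-minus-UTC offset in seconds; it grows over time


def nmea_year(two_digit_year, century_base=CENTURY_BASE):
    """Expand a two-digit NMEA year; goes wrong once the century rolls over."""
    if not 0 <= two_digit_year <= 99:
        raise ValueError("NMEA years are two digits")
    return century_base + two_digit_year


def gps_to_utc(gps_seconds, leap_seconds=LEAP_SECONDS):
    """GPS time runs ahead of UTC by the accumulated leap seconds."""
    return gps_seconds - leap_seconds
```

Both helpers silently give wrong answers when their constant goes
stale, which is exactly the hazard described above.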

** Profiling

There is a barely-documented Z command in the daemon that will cause it
to emit a $ clause on every request.  The $ clause contains eight
space-separated fields:

(1) An identifying sentence tag.

(2) The character length of the sentence containing the timestamp data.

(3) The timestamp associated with the sentence, in seconds since
    the Unix epoch (this time *is* leap-second corrected, like UTC).
    This timestamp may be zero.  If nonzero, it is the base time for
    the packet.

(4) An offset from the timestamp telling when gpsd believes the
    transmission of the current packet started (this is actually 
    recorded just before the first read of the new packet).  If
    the sentence timestamp was zero, this offset is a full timestamp 
    and the base time of the packet.

(5) An offset from the base time telling when gpsd received the last
    bytes of the packet.

(6) An offset from the base time telling when gpsd decoded the data.

(7) An offset from the base time taken just before encoding the
    response -- effectively, when gpsd was polled to transmit the data.

(8) An offset from the base time telling when gpsd transmitted 
    the data.
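
A client-side reading of the eight fields listed above might look like
the following Python sketch.  It assumes the fields have already been
separated from whatever prefix the daemon puts on the clause; the
record type and function name are made up for the example.

```python
from typing import NamedTuple


class ProfileRecord(NamedTuple):
    tag: str            # field 1
    length: int         # field 2
    base_time: float    # field 3, or field 4 when field 3 is zero
    xmit_start: float   # offsets from base_time from here on
    received: float
    decoded: float
    polled: float
    emitted: float


def parse_profile_fields(text):
    """Split a $ clause body into its eight space-separated fields."""
    fields = text.split()
    if len(fields) != 8:
        raise ValueError("expected 8 fields, got %d" % len(fields))
    timestamp = float(fields[2])
    start = float(fields[3])
    if timestamp == 0:
        # No sentence timestamp: field 4 is itself the base time.
        base, start_offset = start, 0.0
    else:
        base, start_offset = timestamp, start
    rest = [float(f) for f in fields[4:]]
    return ProfileRecord(fields[0], int(fields[1]), base, start_offset, *rest)
```

For instance, parse_profile_fields("GGA 72 0 1118327688.5 0.05 0.06
0.07 0.08") (values invented) yields a record whose base_time is
1118327688.5 and whose xmit_start is 0.0.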

The Z figures measure components of the latency between the GPS's time
measurement and when the sentence data became available to the
client. For it to be meaningful, the GPS has to ship timestamps with
sub-second precision. SiRF-II and Evermore chipsets ship times with
millisecond resolution.  Your machine's time reference must also be
accurate to subsecond precision; I recommend using ntpd, which will
normally give you about 15 microseconds precision (two orders of
magnitude better than GPSes report).

Note, some inaccuracy is introduced into the start- and end-of-packet
timestamps by the fact that the last read of a packet may grab a few
bytes of the next one.

The distribution includes a Python script, gpsprof, that uses the
Z command to collect profiling information from a running gpsd instance.
You can use this to measure the latency at each stage -- GPS to daemon,
daemon to client library -- and to estimate the portion of the latency 
induced by serial transmit time.  The gpsprof script creates latency
plots using gnuplot(1).  It can also report the raw data.
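
The serial-transmit portion can be estimated from the sentence length
and the line speed.  The Python sketch below is not taken from
gpsprof; it assumes 8N1 framing, i.e. ten bits on the wire per byte
(start bit, eight data bits, stop bit).

```python
def serial_transmit_time(byte_count, baud, bits_per_byte=10):
    """Seconds needed to clock byte_count bytes down a serial line.

    bits_per_byte=10 assumes 8N1 framing; adjust for other settings.
    """
    return byte_count * bits_per_byte / baud
```

At 4800bps a 72-byte sentence spends 72 * 10 / 4800 = 0.15 seconds
on the wire, which is why transmit time is worth separating out.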

** Architecture and how to hack it

gpsd is not a complicated piece of code.  Essentially, it spins in a loop 
polling for input from one of three sources:

1) A client making requests over a TCP/IP port.

2) A set of GPSes, connected via serial or USB devices.

3) A DGPS server issuing periodic differential-GPS updates.

The daemon only connects to a GPS when clients are connected to it.
Otherwise all GPS devices are closed and the daemon is quiescent, but
retains fix and timestamp data from the last active period. 

All writes to client sockets go through throttled_write().
This code addresses two cases.  First, the client has dropped the
connection.  Second, the client is connected but not picking up data and
our buffers are
backing up.  If we let this continue, the write buffers will fill and 
the effect will be denial-of-service to clients that are better behaved.

Our strategy is brutally simple and takes advantage of the fact that
GPS data has a short shelf life.  If the client doesn't pick it up 
within a few minutes, it's probably not useful to that client.  So if
data is backing up to a client, drop that client.  That's why we set
the client socket to nonblocking.
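
The drop-on-backlog idea can be sketched in Python as follows.  This
is an illustration of the strategy only, not a transcription of
throttled_write(); treating a short write as a backlog is an
assumption of the sketch.

```python
import socket


def throttled_write(sock, data):
    """Try one nonblocking write; drop the client if it cannot take it all.

    Returns True if the data was written, False if the client was dropped.
    The socket is assumed to have been set nonblocking already.
    """
    try:
        sent = sock.send(data)
    except OSError:
        # Covers both cases: the peer went away, or the send buffer is
        # full (BlockingIOError is a subclass of OSError).
        sent = -1
    if sent == len(data):
        return True
    sock.close()
    return False
```

Because the socket is nonblocking, a client that stops reading turns
into an immediate error rather than a stalled daemon.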

GPS input updates an internal data structure which has slots in it for
all the data you can get from a GPS.  Client commands mine that
structure and ship reports up the socket to the client.  DGPS data is
passed through, raw, to the GPS.

The trickiest part of the code is the handling of input sources in gpsd.c 
itself.  It has to tolerate clients connecting and disconnecting at random
times, and the GPS being unplugged and replugged, without leaking file 
descriptors; also arrange for the GPS to be open when and only when clients 
are active.

The function is_input_waiting() is not strictly necessary for the most
important use of the low-level interface, which is when it gets called
from the daemon mainline.  In that context, FD_ISSET() on the element
of the file-descriptor set representing the GPS would tell us if there
were input waiting.  The explicit test is there for other programs
that might call gps_poll() without such a guarantee.
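
For programs outside the daemon, an explicit test of this kind can be
as simple as a zero-timeout select.  The Python sketch below borrows
the function's name for clarity, but the real C implementation may
well work differently.

```python
import select


def is_input_waiting(fd):
    """True if a read on fd would not block right now."""
    readable, _, _ = select.select([fd], [], [], 0)  # timeout 0: just poll
    return bool(readable)
```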

** Security Issues

Between versions 2.16 and 2.20, hotplugging was handled in the most
obvious way, by allowing the F command to declare new GPS devices for
gpsd to look at.  Because gpsd runs as root, this had problems:

1) A malicious client with non-root access on the host could use F to
point gpsd at a spoof GPS that was actually a pty feeding bogus
location data.

2) A malicious client could use repeated probes of a target tty or
other device to cause data loss to other users.  This is a potential
remote exploit! Not too bad if the bytes he steals are your mouse, it
would just get jumpy and jerky -- but suppose they're from your disk
drive and sections drop out of a file you are retrieving for edit?

The conclusion was inescapable.  Switching among and probing devices
that gpsd already knows about can be an unprivileged operation, but 
editing gpsd's device list must be privileged.  Hotplug scripts 
should be able to do it, but ordinary clients should not.

Adding an authentication mechanism was considered and rejected (can you
say "can of big wriggly worms"?).  Instead, there is a separate
control channel for the daemon, only locally accessible, only
recognizing "add device" and "remove device" commands.

The channel is a Unix-domain socket owned by root, so it has
file-system protection bits.  An intruder would need root permissions
to get at it, in which case you'd have much bigger problems than a
spoofed GPS.
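
A minimal Python sketch of such a channel follows.  The socket path,
the "add"/"remove" command spelling, and both function names are
hypothetical; only the overall shape (a Unix-domain socket guarded by
file permissions, accepting two commands) comes from the description
above.

```python
import os
import socket


def open_control_socket(path):
    """Create a local-only control socket readable and writable by owner only."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    # There is a brief window between bind and chmod; a tight umask
    # set before bind would close it.
    os.chmod(path, 0o600)
    sock.listen(1)
    return sock


def handle_control(devices, command):
    """Apply one control command to the device set; ignore anything else."""
    verb, _, device = command.strip().partition(" ")
    if verb == "add" and device:
        devices.add(device)
        return True
    if verb == "remove" and device in devices:
        devices.discard(device)
        return True
    return False
```

Ordinary clients never see this socket's command set; they can only
switch among devices the daemon already knows about.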

More generally, gpsd certainly needs to treat command input as
untrusted and for safety's sake should treat GPS data as untrusted
too (in particular this means never assuming that either source won't
try to overflow a buffer).

Daemon versions after 2.21 drop privileges after startup, setting UID
to "nobody" and GID to whichever group owns the GPS device specified
at startup time -- or, if it doesn't exist, the system's
lowest-numbered TTY device named in PROTO_TTY.  It may be necessary to
change PROTO_TTY in gpsd.c for non-Linux systems.
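
The privilege drop described here amounts to the following, shown as
a Python sketch rather than the daemon's C.  The function names and
the proto_tty parameter are illustrative; the real fallback device is
whatever PROTO_TTY names in gpsd.c.

```python
import os
import pwd


def pick_group(device_path, proto_tty):
    """GID owning the GPS device, or the prototype tty if it is absent."""
    try:
        return os.stat(device_path).st_gid
    except OSError:
        return os.stat(proto_tty).st_gid


def drop_privileges(device_path, proto_tty, user="nobody"):
    """Give up root for good.  Must be called while still root."""
    gid = pick_group(device_path, proto_tty)
    uid = pwd.getpwnam(user).pw_uid
    os.setgid(gid)   # group first: after setuid we could not change it
    os.setuid(uid)
```

The order matters: once the UID is no longer root, the process has
lost the right to change its GID.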

** Autoconfiguration

One of the design goals for gpsd is to be as near zero-configuration
as possible.  Under most circumstances, it doesn't require either
the GPS type or the serial-line parameters to connect to it to be
specified.  Presently, here's how the autoconfig works.

1. At each baud rate gpsd grabs packets until it sees either a
   well-formed and checksum-verified NMEA packet, a well-formed and
   checksum-verified SiRF packet, a well-formed and checksum-verified
   Zodiac packet, or it sees one of the two special trigger strings
   EARTHA or ASTRAL, or it fills a long buffer with garbage (in which
   case it steps to the next baud rate).

2. If it finds a SiRF packet, it queries the chip for firmware
   version.  If the version is < 231.000 it drops back to SiRF NMEA.
   We're done.

3. If it finds a Zodiac binary packet (led with 0xff 0x81), it
   switches to the Zodiac driver.  We're done.

4. If it finds EARTHA, it selects the Earthmate driver, which then
   flips the connection to Zodiac binary mode.  We're done.

5. If it finds ASTRAL, it feeds the TripMate on the other end what
   it wants and goes to TripMate NMEA mode.  We're done.

6. If it finds a NMEA packet, it selects the NMEA driver.  This
   initializes by shipping all vendor-specific initialization strings
   to the device.  The objectives are to enable GSA, disable GLL, and
   disable VTG.  Probe strings go here too, like the one that turns 
   on SiRF debugging output in order to detect SiRF chips.

7. Now gpsd reads NMEA packets.  If it sees a driver trigger string it
   invokes the matching driver.  Presently there is really only one of
   these: "$Ack Input 105.\r\n", the response to the SiRF probe. On
   seeing this, gpsd switches from NMEA to SiRF binary mode, probes
   for firmware version, and either stays in binary or drops back 
   to SiRF NMEA.

The outcome is that we know exactly what we're looking at, without any
driver-type or baud rate options.
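
The recognition part of step 1 can be sketched as below in Python.
This is a simplification under stated assumptions: it checks the NMEA
checksum (the XOR of the bytes between '$' and '*') but identifies
Zodiac only by its two lead bytes, and it leaves SiRF out because its
framing is not described here.  The function names are invented.

```python
from functools import reduce


def nmea_checksum_ok(sentence):
    """True if a '$...*hh' sentence carries a correct XOR checksum."""
    sentence = sentence.strip()
    if not sentence.startswith(b"$") or b"*" not in sentence:
        return False
    body, _, tail = sentence[1:].partition(b"*")
    if len(tail) < 2:
        return False
    try:
        claimed = int(tail[:2].decode("ascii"), 16)
    except ValueError:
        return False
    return reduce(lambda a, b: a ^ b, body, 0) == claimed


def classify(buf):
    """Guess what kind of device produced buf."""
    if buf.startswith(b"\xff\x81"):
        return "zodiac"      # real code would verify the checksum too
    if buf.startswith(b"EARTHA"):
        return "earthmate"
    if buf.startswith(b"ASTRAL"):
        return "tripmate"
    if nmea_checksum_ok(buf):
        return "nmea"
    return "unknown"         # keep reading, or step to the next baud rate
```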

** Don't add options!

If you send a patch that adds a command-line option to the daemon, it
will almost certainly be refused.  Ditto for any patch that requires
gpsd to parse a dotfile.  

One of the major objectives of this project is for gpsd *not to
require administration* -- under Linux, at least.  It autobauds,
it does protocol discovery, and it's activated by the hotplug
system.  Arranging these things involved quite a lot of work, 
and we're not willing to lose the zero-configuration property
that work gained us.

Instead of adding a command-line option to support whatever feature
you had in mind, try to figure out a way that the feature can
autoconfigure itself by doing runtime checks.  If you're not clever
enough to manage that, consider whether your feature control might be
implemented with an extension to the gpsd protocol or the
control-socket command set.

** Error modeling

To estimate errors (which we must do if the GPS, unlike a Garmin, does
not report them in meters), we need to multiply an estimate of
User Equivalent Range Error (UERE) by the appropriate dilution factor.

The UERE estimate is usually computed as the square root of the sum of
the squares of individual error estimates from a physical model.  The
following is a representative physical error model for satellite range
measurements:

From R.B. Langley's 1997 "The GPS error budget",
GPS World, Vol. 8, No. 3, pp. 51-56:

Atmospheric error -- ionosphere                 7.0m
Atmospheric error -- troposphere                0.7m
Clock and ephemeris error                       3.6m
Receiver noise                                  1.5m
Multipath effect                                1.2m

From Hofmann-Wellenhof et al. (1997), "GPS: Theory and Practice", 4th
Ed., Springer.

Code range noise (C/A)                          0.3m
Code range noise (P-code)                       0.03m
Phase range                                     0.005m

We're assuming these are 1-sigma error ranges. This needs to
be checked in the sources.

See http://www.seismo.berkeley.edu/~battag/GAMITwrkshp/lecturenotes/unit1/
for discussion.

Carl Carter of SiRF says: "Ionospheric error is typically corrected for,
at least in large part, by receivers applying the Klobuchar model using
data supplied in the navigation message (subframe 4, page 18, Ionospheric 
and UTC data).  As a result, its effect is closer to that of the 
troposphere, amounting to the residual between real error and corrections.

"Multipath effect is dramatically variable, ranging from near 0 in
good conditions (for example, our roof-mounted antenna with few if any
multipath sources within any reasonable range) to hundreds of meters in
tough conditions like urban canyons.  Picking a number to use for that
is, at any instant, a guess."

"Using Hofmann-Wellenhof is fine, but you can't use all 3 values.
You need to use one at a time, depending on what you are using for
range measurements.  For example, our receiver only uses the C/A
code, never the P code, so the 0.03 value does not apply.  But once
we lock onto the carrier phase, we gradually apply that as a
smoothing on our C/A code, so we gradually shift from pure C/A code
to nearly pure carrier phase.  Rather than applying both C/A and
carrier phase, you need to determine how long we have been using
the carrier smoothing and use a blend of the two."

On Carl's advice we would apply tropospheric error twice (the second
time in place of the 7.0m ionospheric figure), and use the largest
Hofmann-Wellenhof figure:

UERE = sqrt(0.7^2 + 0.7^2 + 3.6^2 + 1.5^2 + 1.2^2 + 0.3^2) = 4.2

DGPS corrects for atmospheric distortion, ephemeris error, and satellite/
receiver clock error.  Thus:

UERE =  sqrt(1.5^2 + 1.2^2 + 0.3^2) = 1.9

which we round up to 2.

Due to multipath uncertainty, Carl says the computed non-DGPS figure is
too low and recommends a non-DGPS UERE estimate of 8.  That's what we use.
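
The root-sum-square arithmetic above is easy to recheck.  The Python
sketch below recomputes both figures from the budget terms quoted in
this section and shows the final multiplication by a dilution factor;
the function names are made up for the example.

```python
import math


def uere(*terms):
    """Root-sum-square of independent 1-sigma range error terms, in meters."""
    return math.sqrt(sum(t * t for t in terms))


def error_estimate(uere_m, dop):
    """Position error estimate: UERE scaled by the dilution of precision."""
    return uere_m * dop


# Tropospheric term twice, clock/ephemeris, receiver noise, multipath, C/A noise.
no_dgps = uere(0.7, 0.7, 3.6, 1.5, 1.2, 0.3)
# With DGPS only receiver noise, multipath, and C/A noise remain.
with_dgps = uere(1.5, 1.2, 0.3)
```

With the adopted non-DGPS UERE of 8, a horizontal dilution of 1.5
gives error_estimate(8, 1.5), or 12 meters.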

** Adding new GPS types

Almost all GPSes speak NMEA 0183.  However, it may occasionally be necessary
to add support for some odd binary format.  We're told that the hex dump
functions in CuteCom <http://cutecom.sourceforge.net/> can be useful for
investigating such protocols.

Internally, gpsd supports multiple GPS types.  All are represented by
driver method tables; the main loop knows nothing about the driver
methods except when to call them.  At any given time one driver is
active; by default it's the NMEA one.  To add a new device, populate
another driver structure and add it to the list.

Each driver may have a trigger string that the NMEA interpreter
watches for.  When that string is recognized at the start of a 
line, the interpreter switches to its driver.  The new driver 
initializer method is called immediately. 
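
In outline, the method-table arrangement and trigger-string switch
work like this Python sketch.  The real tables are C structures with
more methods than shown; the class and field names here are invented
for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Driver:
    name: str
    trigger: Optional[bytes]          # None: never selected by trigger
    initializer: Callable[[], None]


class Session:
    def __init__(self, drivers):
        self.drivers = drivers
        self.active = drivers[0]      # by default, the NMEA driver

    def nmea_line(self, line):
        """Switch drivers if line starts with some driver's trigger string."""
        for drv in self.drivers:
            if drv.trigger and line.startswith(drv.trigger):
                self.active = drv
                drv.initializer()     # called immediately on the switch
                return True
        return False
```

Adding a device type then means adding one more Driver entry to the
list; the dispatch loop itself does not change.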

Note: it is not necessary to add a driver just because your GPS wants
some funky initialization string.  Simply ship the string in the
initializer for the NMEA driver.  Because vendor control strings live
in vendor-specific namespaces (PSRF for SiRF, PGRM for Garmin, etc.)
your initializing control string will almost certainly be ignored by
anything not specifically watching for it.

Another good thing to send from the NMEA initializer is probe strings.
These are strings which should elicit an identifying response from
the GPS that you can use as a trigger string for a driver.  This is
how we detect SiRF chips (see steps 6 and 7 under autoconfiguration above).

If you're writing a driver, look in gpsutils.c; driver helper
functions live there.

Your packet parser must return field-validity mask bits (using the
*_SET macros in gps.h), suitable to be put in session->gpsdata.valid.
The watcher-mode logic relies on these as its way of knowing what to
publish.
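
The mask convention looks roughly like this, sketched in Python.  The
bit values below are placeholders, not the ones gps.h defines, and
parse_position is a hypothetical parser used only to show the
pattern.

```python
# Placeholder bit assignments; the real *_SET values live in gps.h.
TIME_SET = 1 << 0
LATLON_SET = 1 << 1
ALTITUDE_SET = 1 << 2


def parse_position(time=None, lat=None, lon=None, alt=None):
    """Store whichever fields the packet supplied and report them in a mask."""
    data, mask = {}, 0
    if time is not None:
        data["time"] = time
        mask |= TIME_SET
    if lat is not None and lon is not None:
        data["lat"], data["lon"] = lat, lon
        mask |= LATLON_SET
    if alt is not None:
        data["alt"] = alt
        mask |= ALTITUDE_SET
    return data, mask
```

A caller can then test, say, mask & ALTITUDE_SET before publishing an
altitude, instead of guessing from the stored value.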

Your packet parser is also responsible for setting the tag field 
in the gps_data_t structure.  This is the string that will be emitted
as the first field of each $ record for profiling.  The packet getter
will set the sentence-length for you; it will be raw byte length, 
including both payload and header/trailer bytes.

Note, also, that all the timestamps your driver puts in the session
structure should be UTC (with leap-second corrections) not just Unix
seconds since the epoch.  The report-generator function for D
does *not* apply a timezone offset.

** Blind alleys

Things we've considered doing and rejected.

*** Allowing clients to ship arbitrary control strings to a GPS

Tempting -- it would allow us to do sirfmon-like things with the
daemon running -- but a bad idea.  It would make denial-of-service 
attacks on applications using the GPS far too easy.  For example,
suppose the control string were a baud-rate change?

*** Using libusb to do USB device discovery

There has been some consideration of going to the cross-platform libusb
library to do USB device discovery. This would create an external
dependency that gpsd doesn't now have, and bring more complexity on
board than is probably desirable.

We've chosen instead to rely on the local hotplug system.  That way
gpsd can concentrate solely on knowing about GPSes.

*** Setting FIFO threshold to 1 to reduce jitter in serial-message times

When using gpsd as a time reference, one of the things we'd like to do
is make the amount of lag in the message path from GPS to gpsd small
and with as little jitter as possible, so we can correct for it with
a constant offset.

A possibility we considered is to set the FIFO threshold on the serial
device UART to 1 using TIOCGSERIAL/TIOCSSERIAL.  This would, in
effect, disable transmission buffering, increasing lag but decreasing
jitter.

But it's almost certainly not worth the work.  Rob Janssen, our timekeeping
expert, reckons that at 4800bps the UART buffering can cause at most
about 15msec of jitter.  This is, observably, swamped by other less
controllable sources of variation.

Local variables:
mode: outline
paragraph-separate: "[ 	]*$"
end: