HACKING



This is the Hacker's Guide to gpsd.  If you're viewing it with Emacs, try
doing Ctl-C Ctl-t and browsing through the outline headers.

If you're looking for things to hack on, first see the TODO file.

** Debugging

For debugging purposes, it may be helpful to configure with --disable-shared.
This turns off all the shared-library crud, making it somewhat easier to
use gdb.

There is a script called logextract in the distribution that you can use
to strip clean NMEA out of the log files produced by gpsd.  This can be
useful if someone ships you a log that they allege caused gpsd to 
misbehave.

gpsfake enables you to repeatedly feed a packet sequence to a gpsd
instance running as non-root.  Watching such a session with gdb should
smoke out any repeatable bug pretty quickly.

** Profiling

There is a barely-documented Z command in the daemon that will cause it
to emit a $ clause on every D request.  The $ clause contains eight
space-separated fields:

(1) An identifying sentence tag.

(2) The character length of the sentence containing the timestamp data.

(3) The timestamp associated with the sentence, in seconds since
    the Unix epoch (this time *is* leap-second corrected, like UTC).

(4) An offset from the timestamp telling when gpsd believes the
    transmission of the current packet started (this is actually
    recorded just before the first read of the new packet).

(5) An offset from the timestamp telling when gpsd received the last
    bytes of the packet.

(6) An offset from the timestamp telling when gpsd decoded the data.

(7) An offset from the timestamp taken just before encoding the
    response -- effectively, when gpsd was polled to transmit the data.

(8) An offset from the timestamp telling when gpsd transmitted 
    the data.
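
The eight fields above could be unpacked on the client side roughly as
follows.  This is only a minimal sketch: ProfileRecord,
parse_profile_fields() and stage_latencies() are hypothetical names, and
the code assumes the caller has already stripped whatever marker
precedes the fields, since the exact wire framing is not spelled out
here.

```python
from typing import NamedTuple


class ProfileRecord(NamedTuple):
    """Hypothetical container for the eight fields of a $ clause."""
    tag: str            # (1) identifying sentence tag
    length: int         # (2) sentence length in characters
    timestamp: float    # (3) seconds since the Unix epoch
    xmit_start: float   # (4) offset: packet transmission started
    recv_end: float     # (5) offset: last bytes received
    decoded: float      # (6) offset: data decoded
    polled: float       # (7) offset: just before encoding the response
    emitted: float      # (8) offset: response transmitted


def parse_profile_fields(payload: str) -> ProfileRecord:
    """Split the space-separated payload of a $ clause into typed fields.

    Assumption: payload holds only the eight fields.
    """
    fields = payload.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    tag, length, *times = fields
    return ProfileRecord(tag, int(length), *(float(t) for t in times))


def stage_latencies(rec: ProfileRecord) -> dict:
    """Differences between successive offsets: time spent in each stage."""
    return {
        "receive": rec.recv_end - rec.xmit_start,
        "decode": rec.decoded - rec.recv_end,
        "wait_for_poll": rec.polled - rec.decoded,
        "emit": rec.emitted - rec.polled,
    }
```

Subtracting adjacent offsets, as stage_latencies() does, is one simple
way to see where the time goes between the GPS and the client.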

The Z figures measure components of the latency between the GPS's time
measurement and when the sentence data became available to the
client. For them to be meaningful, the GPS has to ship timestamps with
sub-second precision. SiRF-II and Evermore chipsets ship times with
millisecond resolution.  Your machine's time reference must also be
accurate to subsecond precision; I recommend using ntpd, which will
normally give you about 15 microseconds precision (two orders of
magnitude better than GPSes report).

Note that some inaccuracy is introduced into the start- and end-of-packet
timestamps by the fact that the last read of a packet may grab a few
bytes of the next one.

The distribution includes a Python script, gpsprof, that uses the
Z command to collect profiling information from a running gpsd instance.
You can use this to measure the latency at each stage -- GPS to daemon,
daemon to client library -- and to estimate the portion of the latency 
induced by serial transmit time.  The gpsprof script creates latency
plots using gnuplot(1).  It can also report the raw data.
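
The serial-transmit portion of the latency can be estimated from the
sentence length and the line speed.  The sketch below is not taken from
gpsprof; serial_transmit_time() is a hypothetical helper, and it assumes
8N1 framing (one start bit, eight data bits, one stop bit), which makes
ten bits on the wire per character.

```python
def serial_transmit_time(n_chars: int, baud: int, bits_per_char: int = 10) -> float:
    """Seconds needed to clock n_chars over a serial line.

    bits_per_char defaults to 10, i.e. 8N1 framing: one start bit,
    eight data bits, one stop bit.  Other framings are an assumption
    the caller must supply.
    """
    if baud <= 0:
        raise ValueError("baud must be positive")
    return n_chars * bits_per_char / baud
```

Under these assumptions a 48-character sentence at 4800 baud occupies
the line for about 0.1 seconds, which is large compared with the
millisecond resolution of the timestamps.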

** Architecture and how to hack it

This is not a complicated piece of code.  Essentially, it spins in a loop 
polling for input from one of three sources:

1) A client making requests over a TCP/IP port.

2) The GPS, connected via serial or USB device.

3) A DGPS server issuing periodic differential-GPS updates.
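
The shape of such a loop can be sketched in Python with select().  This
is an illustration of the idea only, not the daemon's C code; poll_once()
and the three labels are hypothetical names.

```python
import select


def poll_once(listener, gps, dgps, timeout=1.0):
    """Wait for input on any of the three sources; report which are ready.

    Returns a set of labels drawn from {"client", "gps", "dgps"}.
    Sources passed as None (for example, a closed GPS device) are skipped.
    """
    sources = {"client": listener, "gps": gps, "dgps": dgps}
    watched = [s for s in sources.values() if s is not None]
    readable, _, _ = select.select(watched, [], [], timeout)
    return {label for label, s in sources.items() if s in readable}
```

A main loop would call poll_once() repeatedly and dispatch on the
labels it returns.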

The daemon only connects to the GPS when clients are connected to it.
Otherwise the GPS device is closed and the daemon is quiescent, but
retains fix and timestamp data from the last active period.  This is
better functional design than starting the daemon from a hotplug
script would be; that would lose the old data, leaving no fix at all
available if the GPS were momentarily unplugged.

All writes to client sockets go through throttled_write().
This code addresses two cases.  First, the client has dropped the
connection.  Second, the client is connected but not picking up data and
our buffers are
backing up.  If we let this continue, the write buffers will fill and 
the effect will be denial-of-service to clients that are better behaved.

Our strategy is brutally simple and takes advantage of the fact that
GPS data has a short shelf life.  If the client doesn't pick it up 
within a few minutes, it's probably not useful to that client.  So if
data is backing up to a client, drop that client.  That's why we set
the client socket to nonblocking.
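
The drop-on-backpressure strategy can be sketched like this.  It is a
Python illustration, not the daemon's C implementation: the function
borrows the name throttled_write() only for readability, and treating a
partial write as "backing up" is an assumption of this sketch.

```python
import socket


def throttled_write(client: socket.socket, data: bytes) -> bool:
    """Try to send data to a nonblocking client socket.

    Returns True if the kernel accepted everything.  Returns False,
    after closing the socket, if the client has gone away or its
    buffer is full; the caller should then forget about this client.
    """
    try:
        sent = client.send(data)
    except BlockingIOError:
        sent = 0      # buffer full: client is not picking up data
    except OSError:
        sent = -1     # connection dropped (EPIPE, ECONNRESET, ...)
    if sent == len(data):
        return True
    client.close()
    return False
```

Because the socket is nonblocking, a slow client shows up as an
immediate error or short write instead of stalling the daemon.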

GPS input updates an internal data structure which has slots in it for
all the data you can get from a GPS.  Client commands mine that
structure and ship reports up the socket to the client.  DGPS data is
passed through, raw, to the GPS.

The trickiest part of the code is the handling of input sources in gpsd.c
itself.  It has to tolerate clients connecting and disconnecting at random
times, and the GPS being unplugged and replugged, without leaking file
descriptors; it must also arrange for the GPS to be open when and only
when clients are active.

The function is_input_waiting() is not strictly necessary for the most
important use of the low-level interface, which is when it gets called
from the daemon mainline.  In that context, FD_ISSET() on the element
of the file-descriptor set representing the GPS would tell us if there
were input waiting.  The explicit test is there for other programs
that might call gps_poll() without such a guarantee.
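
An explicit test of this kind amounts to a select() with a zero
timeout.  The Python sketch below shows the idea; it reuses the name
is_input_waiting() for illustration, and the real C function may use a
different mechanism.

```python
import select


def is_input_waiting(fd) -> bool:
    """Return True if a read on fd would not block right now."""
    readable, _, _ = select.select([fd], [], [], 0)
    return bool(readable)
```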

** Autoconfiguration

One of the design goals for gpsd is to be as near zero-configuration
as possible.  Under most circumstances, you need not specify either
the GPS type or the serial-line parameters used to connect to it.
Presently, here's how the autoconfig works.

1. At each baud rate gpsd grabs packets until it sees a well-formed
   and checksum-verified NMEA packet, a well-formed and
   checksum-verified SiRF packet, or one of the two special trigger
   strings EARTHA or ASTRAL, or until it fills a long buffer with
   garbage (in which case it steps to the next baud rate).

2. If it finds a SiRF packet, it queries the chip for firmware
   version.  If the version is < 231.000 it drops back to SiRF NMEA.
   We're done.

3. If it finds EARTHA, it selects the Earthmate driver, which then
   flips the connection to Zodiac binary mode.  We're done.

4. If it finds ASTRAL, it feeds the TripMate on the other end what
   it wants and goes to TripMate NMEA mode.  We're done.

5. If it finds an NMEA packet, it selects the NMEA driver.  This
   initializes by shipping all vendor-specific initialization strings
   to the device.  Presently there are two such, one for SiRF and one
   for the FV-18.  The FV-18 one just sets some sentence frequencies,
   but the SiRF one is itself a probe.

6. Now gpsd reads NMEA packets.  If it sees a driver trigger string it
   invokes the matching driver.  Presently there is really only one of
   these: "$Ack Input 105.\r\n", the response to the SiRF probe. On
   seeing this, gpsd switches from NMEA to SiRF binary mode, probes
   for firmware version, and either stays in binary or drops back 
   to SiRF NMEA.

The outcome is that we know exactly what we're looking at, without any
driver or baud rate options.
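
The text-based part of the recognition in step 1 can be sketched as
follows.  An NMEA checksum is the XOR of the characters between the
leading "$" and the "*", written as two hex digits.  The function names
and return labels are hypothetical, and SiRF binary packets are left out
of this sketch entirely.

```python
import string
from functools import reduce


def nmea_checksum(payload: str) -> int:
    """XOR of all characters between the leading '$' and the '*'."""
    return reduce(lambda acc, ch: acc ^ ord(ch), payload, 0)


def is_valid_nmea(sentence: str) -> bool:
    """True for $<payload>*<two hex digits> with a matching checksum.

    Leading and trailing whitespace, including CR/LF, is ignored.
    """
    sentence = sentence.strip()
    if not sentence.startswith("$"):
        return False
    payload, star, given = sentence[1:].rpartition("*")
    if not star or len(given) != 2:
        return False
    if any(c not in string.hexdigits for c in given):
        return False
    return int(given, 16) == nmea_checksum(payload)


def classify(line: str):
    """Label a line of input, or return None if it is unrecognized."""
    if line.startswith("EARTHA"):
        return "earthmate"
    if line.startswith("ASTRAL"):
        return "tripmate"
    if is_valid_nmea(line):
        return "nmea"
    return None
```

A baud-rate hunt would call something like classify() on each candidate
line and step to the next speed once too much unrecognized input has
accumulated.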

** Adding new GPS types

Almost all GPSes speak NMEA 0183.  However, it may occasionally be necessary
to add support for some odd binary format.  We're told that the hex dump
functions in CuteCom <http://cutecom.sourceforge.net/> can be useful for
investigating such protocols.

Internally, gpsd supports multiple GPS types.  All are represented by
driver method tables; the main loop knows nothing about the driver
methods except when to call them.  At any given time one driver is
active; by default it's the NMEA one.  To add a new device, populate
another driver structure and add it to the list.

Each driver may have a trigger string that the NMEA interpreter
watches for.  When that string is recognized at the start of a 
line, the interpreter switches to its driver.  The new driver's
initializer method is called immediately.
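
The method-table and trigger-string arrangement can be modelled in a few
lines of Python.  Everything here (Driver, Session, DRIVERS,
maybe_switch()) is a hypothetical stand-in for the C structures; only
the SiRF trigger string is taken from the autoconfiguration steps above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Driver:
    """Hypothetical stand-in for a driver method table."""
    name: str
    trigger: Optional[str] = None                    # selects this driver
    initializer: Optional[Callable[..., None]] = None


@dataclass
class Session:
    driver: Driver
    log: List[str] = field(default_factory=list)


def on_init(session: Session) -> None:
    """Placeholder initializer: just records that it ran."""
    session.log.append(f"init {session.driver.name}")


NMEA = Driver("NMEA")
DRIVERS = [
    NMEA,
    Driver("SiRF binary", trigger="$Ack Input 105.", initializer=on_init),
]


def maybe_switch(session: Session, line: str) -> bool:
    """Switch drivers if line starts with another driver's trigger."""
    for drv in DRIVERS:
        if (drv.trigger and line.startswith(drv.trigger)
                and drv is not session.driver):
            session.driver = drv
            if drv.initializer:
                drv.initializer(session)   # called right after the switch
            return True
    return False
```

Adding a device in this model means appending one more Driver entry;
the loop that calls maybe_switch() does not change.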

Note: it is not necessary to add a driver just because your GPS wants
some funky initialization string.  Simply ship the string in the
initializer for the NMEA driver.  Because vendor control strings live
in vendor-specific namespaces (PSRF for SiRF, PGRM for Garmin, etc.)
your initializing control string will almost certainly be ignored by
anything not specifically watching for it.

Another good thing to send from the NMEA initializer is probe strings.
These are strings which should elicit an identifying response from
the GPS that you can use as a trigger string for a driver.  This is
how we detect SiRF chips (see step 5 under autoconfiguration above).

If you're writing a driver, look in gpsutils.c; driver helper
functions live there.

Your packet parser must return field-validity mask bits (using the
*_SET macros in gps.h), suitable to be put in session->gpsdata.valid.
The watcher-mode logic relies on these as its way of knowing what to
publish.
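
The idea of a validity mask can be illustrated as below.  The flag names
imitate the *_SET style, but the set of flags, their values, and the
fields_to_mask() helper are assumptions made for this sketch; the
authoritative definitions are the macros in gps.h.

```python
from enum import IntFlag


class Valid(IntFlag):
    """Illustrative mask bits; real names and values live in gps.h."""
    TIME_SET = 0x01
    LATLON_SET = 0x02
    ALTITUDE_SET = 0x04
    SPEED_SET = 0x08


def fields_to_mask(fields: dict) -> Valid:
    """Return one bit for every datum the parser actually decoded."""
    mask = Valid(0)
    if "time" in fields:
        mask |= Valid.TIME_SET
    if "lat" in fields and "lon" in fields:
        mask |= Valid.LATLON_SET
    if "alt" in fields:
        mask |= Valid.ALTITUDE_SET
    if "speed" in fields:
        mask |= Valid.SPEED_SET
    return mask
```

A consumer then publishes only the data whose bits are set, for
example by testing mask & Valid.LATLON_SET before reporting a position.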

Your packet parser is also responsible for setting the tag field 
in the gps_data_t structure.  This is the string that will be emitted
as the first field of each $ record for profiling.  The packet getter
will set the sentence-length for you; it will be raw byte length, 
including both payload and header/trailer bytes.

Note also that all the timestamps your driver puts in the session
structure should be UTC (with leap-second corrections), not just Unix
seconds since the epoch.  The report-generator function for D
does *not* apply a timezone offset.

Local variables:
mode: outline
paragraph-separate: "[ 	]*$"
end: