Contents

If you're looking for things to hack on, first see the TODO file in the source distribution.

  1. Goals and philosophy of the project
  2. Audience and supported hardware
    1. The time and location service
    2. The testing and tuning tools
    3. The upgrading tools
    4. The GPS/GNSS monitoring tools
  3. Contribution guidelines
    1. Verify your patch or commit
    2. Send patches in diff -u or -c format
    3. The license on contributions
    4. Don't add invocation options!
    5. Don't create static variables in the libraries!
    6. Don't use malloc!
    7. Avoid use of sizeof(<int type>)!
  4. Understanding the code
    1. Debugging
    2. Profiling
    3. Porting to weird machines: endianness, width, and signedness issues.
    4. Architecture and how to hack it
    5. Autoconfiguration
    6. Error modeling
    7. Ancient history
  5. Known trouble spots
    1. The Y2.1K problem and other calendar issues
    2. Hotplug interface problems
    3. Security Issues
  6. Adding new GPS types
    1. Driver architecture
    2. When not to add a driver
    3. Initializing time and date
    4. How drivers are invoked
    5. Where to put the data you get from the GPS
    6. Report errors with a 95% confidence interval
    7. Log files for regression testing
  7. Future Protocol Directions
    1. Proposed sentences
  8. Blind alleys
    1. Reporting fix data only once per cycle
    2. Allowing clients to ship arbitrary control strings to a GPS
    3. Setting FIFO threshold to 1 to reduce jitter in serial-message times
    4. Stop using a compiled-in UTC-TAI offset
    5. Subsecond polling
  9. Release Checklist

Goals and philosophy of the project

If the GPSD project ever needs a slogan, it will be "Hide the ugliness!" GPS technology works, but is baroque, ugly and poorly documented. Our job is to know the grotty details so nobody else has to.

Audience and supported hardware

Our paradigm user, the one we have in mind when making design choices, is running navigational or wardriving software on a Linux laptop or PDA. Some of our developers are actively interested in supporting GPS-with-SBC (Single-Board-Computer) hardware such as is used in balloon telemetry, marine navigation, and aviation.

These two use cases have similar issues in areas like client interfaces and power/duty-cycle management. The one place where they differ substantially is that in the SBC case we generally know in advance what devices will be connected and when. Thus, by designing for the less predictable laptop/PDA environment, we cover both. But it is not by accident that the source code can be built with support for only a single GPS type compiled in.

While we will support survey-grade GPSes when and if we have that hardware for testing, our focus will probably remain on inexpensive and readily-available consumer-grade GPS hardware, especially GPS mice.

The time and location service

The primary aim of the GPSD project is to support a simple time-and-location service for users and their geographically-aware applications.

A GPS is a device for delivering fourteen numbers: x, y, z, t, vx, vy, vz, and error estimates for each of these seven coordinates. The gpsd daemon's job is to deliver these numbers to user applications with minimum fuss. This is a "TPV" — time-position-velocity report. A GPS is a TPV oracle.
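As a rough illustration of what "fourteen numbers" means, here is a minimal Python sketch of a TPV report. The class name, the field names for the error estimates, and the use of NaN to mark a value the GPS has not supplied are all assumptions made for this example; this is not the structure the daemon actually uses.

```python
import math
from dataclasses import dataclass, fields


@dataclass
class TPV:
    """Hypothetical TPV report: seven coordinates plus seven error estimates."""
    # The seven coordinates named in the text.
    x: float = math.nan
    y: float = math.nan
    z: float = math.nan
    t: float = math.nan
    vx: float = math.nan
    vy: float = math.nan
    vz: float = math.nan
    # One error estimate per coordinate (names invented for this sketch).
    ex: float = math.nan
    ey: float = math.nan
    ez: float = math.nan
    et: float = math.nan
    evx: float = math.nan
    evy: float = math.nan
    evz: float = math.nan


def known_fields(report):
    """Return the names of the fields that hold a real value, not NaN."""
    return [f.name for f in fields(report)
            if not math.isnan(getattr(report, f.name))]
```

A device that has only delivered a time and a horizontal position would leave the remaining slots unset, and known_fields() shows which of the fourteen numbers a client can actually rely on.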

'Minimum fuss' means that the only action the user should have to take to enable location service is to plug in a GPS. The gpsd daemon, and its associated hotplug scripts or local equivalent, is responsible for automatically configuring itself. That includes autobauding, handshaking with the device, determining the correct mode or protocol to use, and issuing any device-specific initializations required.

Features (such as GPS type or serial-parameter switches) that would require the user to perform administrative actions to enable location service will be rejected. GPSes that cannot be autoconfigured will not be supported. 99% of the GPS hardware on the market in 2005 is autoconfigurable, and the design direction of GPS chipsets is such that this percentage will rise rather than fall; we deliberately choose simplicity of interface and zero administration over 100% coverage.

Here is a concrete example of how this principle applies. At least one very low-end GPS chipset (the San Jose Navigation GM-38) does not deliver correct checksums on the packets it ships to a host unless it has a fix. Typically, GPSes do not have a fix when they are plugged in, at the time gpsd must recognize and autoconfigure the device. Thus, supporting this chipset would require that we either (a) disable packet integrity checking in the autoconfiguration code, making detection of other more well-behaved devices unreliable, or (b) add an invocation switch to disable packet integrity checking for that chipset alone. We refuse to do either, and do not support this chipset.

The testing and tuning tools

Another principal goal of the GPSD software is that it be able to demonstrate its own correctness, give technical users good tools for measuring GPS accuracy and diagnosing GPS idiosyncrasies, and provide a test framework for gpsd-using applications.

Accordingly, we support the gpsfake tool that simulates a GPS using recorded or synthetic log data. We support gpsprof, which collects accuracy and latency statistics on GPSes and the GPS+gpsd combination. And we include a comprehensive regression-test suite with the package. These tools are not accidents; they are essential to ensure that the basic GPS-monitoring code is not merely correct but demonstrably correct.

We support a tool, gpsmon, which is a low-level packet monitor and diagnostic tool. gpsmon is capable of tuning some device-specific control settings such as the SiRF static-navigation mode. A future direction of the project is to support diagnostic monitoring and tuning for our entire range of chipsets.

The GPS/GNSS monitoring tools

Another secondary goal of the project is to provide open-source tools for diagnostic monitoring and accuracy profiling not just of individual GPSes but of the GPS/GNSS network itself. The protocols (such as IS-GPS-200 for satellite downlink and RTCM104 for differential-GPS corrections) are notoriously poorly documented, and open-source tools for interpreting them have in the past been hard to find and only sporadically maintained.

We aim to remedy this. Our design goal is to provide lossless translators between these protocols and readable, documented text-stream formats.

We currently provide a tool for decoding RTCM104 reports on satellite health, almanacs, and pseudorange information from differential-GPS radios and reference stations. A future direction of the project is to support an RTCM104 encoder.

Contribution guidelines

Our languages are C, Python, and sh

The project implementation languages are C and Python. The core libgpsd libraries (and the daemon, which is a thin wrapper around them) are written in C; the test and profiling tools are written in Python, with a limited amount of glue in POSIX-conformant sh.

Code in other languages will, in general, be accepted only if it supplies a language binding for the libgps or libgpsd libraries that we don't already have. This restriction is an attempt to keep our long-term maintenance problem as tractable as possible.

We require C for anything that may have to run on an embedded system. Thus, the daemon and libgpsd libraries need to stay pure C. Anything that links directly to the core libraries should also be in C, because Python's alien-type facilities are still just a little too complex and painful to be a net win for our situation. (We know this may be about to change with the advent of the ctypes module in Python 2.5 and will keep an open mind, especially to anyone who actually supplies a ctypes Python wrapper for libgpsd.)

We prefer Python anywhere we aren't required to use C by technical constraints — in particular, for test/profiling/logging tools, hotplug agents, and miscellaneous scripts. Again, this is a long-term maintainability issue; there are whole classes of potential C bugs that simply don't exist in Python, and Python programs have a drastically lower line count for equivalent capability.

Shell scripts are acceptable for test and build code that only has to run in our development and test environments, as opposed to target or production environments. Note that shell scripts should not assume bash is available but rather stick to POSIX sh; among other benefits, this helps portability to BSD systems. Generally code that will run in the Ubuntu/Debian dash can be considered safe.

Here are two related rules:

Any complexity that can be moved out of the gpsd daemon to external test or framework code doesn't belong in the daemon.

Any complexity that can be moved out of C and into a higher-level language (Python, in particular) doesn't belong in C.

Both rules have the same purpose: to move complexity and resource costs from the places in the codebase where we can least afford it to the places where it is most manageable and inflicts the least long-term maintenance burden.

Verify your patch or commit

GPSD is written to a high quality standard, and has a defect rate that is remarkably low for a project of this size and complexity. Our first Coverity scan, in March 2007, flagged only 4 potential problems in 22,397 LOC — and two of those were false positives. This is three orders of magnitude cleaner than typical commercial software, and about half the defect density of the Linux kernel itself.

This did not happen by accident. We put a lot of effort into test tools and regression tests so we can avoid committing bad code. For committers, using those tests isn't just a good idea, it's the law, which is to say that if you make a habit of not using them when you should, your commit access will be yanked.

Before shipping a patch or committing to the repository, you should go through the following checklist:

  1. If you are introducing a new feature or driver, include documentation.
  2. If your patch changes executable code in any way that is more than trivial, use the regression-test suite — "make testregress" — to check that your patch doesn't break the handling of any already-supported GPS.
  3. In the rare case where your patch or commit breaks the regression test and it's for a good reason, part of your responsibility is to (a) rebuild the regression tests, (b) include the test changes in your patch, and (c) explain in detail why the regression broke in your change comment.
  4. Check that the patched code displays no warnings when you run 'make splint' (see the Splint website for further description of this tool if you don't already know about it). Yes, tweaking your code to be splint-clean is a pain in the ass. Experience shows it's worth it.
  5. After code changes that do anything to the storage handling, run valgrind-audit and look out for reports of memory leaks and other dynamic-allocation problems (see the Valgrind website for a description of this tool if you don't already know about it).

Not breaking the regression tests is especially important. We rely on these to catch damaging side-effects of seemingly innocent but ill-thought-out changes, and to nail problems before they become user-visible.

The reason we use splint is twofold: (1) it's good at catching static buffer overruns, and (2) gpsd does a lot of low-level bit-bashing that can be sensitive to 32-vs.-64-bit problems. Getting the code splint-clean tends to prevent these.

If you are contributing a driver for a new GPS, please also do the following things:

  1. Send us a representative sample of the GPS output for future regression-testing.
  2. Write a hardware entry describing the GPS for the hardware page.

There's a whole section on adding new drivers later in this document.

Send patches in diff -u or -c format

We prefer diff -u format, but diff -c is acceptable. Do not send patches in the default (-e or ed) mode, as they are too brittle.

When you send a patch, we expect you to do at least the first three of the same verification steps we require from our project committers. Doing all of them is better, and makes it far more likely your patch will be accepted.

The license on contributions

The GPSD libraries are under the BSD license. Please do not send contributions with GPL attached!

The reason for this policy is to avoid making people nervous about linking the GPSD libraries to applications that may be under other licenses (such as MIT, BSD, AFL, etc.).

Don't add invocation options!

If you send a patch that adds a command-line option to the daemon, it will almost certainly be refused. Ditto for any patch that requires gpsd to parse a dotfile.

One of the major objectives of this project is for gpsd not to require administration — under Linux, at least. It autobauds, it does protocol discovery, and it's activated by the hotplug system. Arranging these things involved quite a lot of work, and we're not willing to lose the zero-configuration property that work gained us.

Instead of adding a command-line option to support whatever feature you had in mind, try to figure out a way that the feature can autoconfigure itself by doing runtime checks. If you're not clever enough to manage that, consider whether your feature control might be implemented with an extension to the gpsd protocol or the control-socket command set.

Here are three specific reasons command-line switches are evil:

(1) Command-line switches are often a lazy programmer's way out of writing correct adaptive logic. This is why we keep rejecting requests for a baud-rate switch and a GPS type switch — the right thing is to make the packet-sniffer work better, and if we relented in our opposition the pressure to get that right would disappear. Suddenly we'd be back to end-users having to fiddle with settings the software ought to figure out for itself, which is unacceptable.

(2) Command-line switches without corresponding protocol commands pin the daemon's behavior for its entire lifespan. Why should the user have to fix a policy at startup time and never get to change his/her mind afterwards? Stupid design...

(3) The command-line switches used for a normal gpsd startup can only be changed by modifying the hotplug script. Requiring end-users to modify hotplug scripts (or anything else in admin space) is a crash landing.

Don't create static variables in the libraries!

Don't create static variables in library or driver files; it makes them non-reentrant and hard to light. In practice, this means you shouldn't declare a static in any file that doesn't have a main() function of its own, and silly little test mains cordoned off by a preprocessor conditional don't count.

Instead, use the 'driver' union in gps_device_t and the gps_context_t storage area that's passed in common in a lot of the calls. These are intended as places to stash stuff that needs to be shared within a session, but would be thread-local if the code were running in a thread.

Don't use malloc!

The best way to avoid having dynamic-memory allocation problems is not to use malloc/free at all. The gpsd daemon doesn't (though the client-side code does). Thus, even the longest-running instance can't have memory leaks. The only cost for this turned out to be embedding a PATH_MAX-sized buffer in the gpsd.h structure.

Don't undo this by using malloc/free in a driver or anywhere else.

Avoid use of sizeof(<int type>)!

It's tempting to extract parts of packets by using a loop of the form "for(i = 0; i < len; i += sizeof(long))". Don't do that; not all integer types have the same length across architectures. A long may be 4 bytes on a 32-bit machine and 8 bytes on a 64-bit one. If you mean to skip 4 bytes in a packet, then say so (or use sizeof(int32_t)).
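The rule above is about C, but the same trap shows up in our Python tools, so here is a small sketch using the standard struct module. The packet layout (a run of little-endian 32-bit fields) and the helper name are invented for illustration. The point is that the step width is stated explicitly as 4 bytes rather than borrowed from whatever the host thinks a long is.

```python
import struct

FIELD_WIDTH = 4  # bytes per field in this hypothetical packet layout


def extract_int32_fields(payload):
    """Decode a payload made of consecutive little-endian 32-bit integers."""
    if len(payload) % FIELD_WIDTH:
        raise ValueError("payload is not a whole number of 4-byte fields")
    # Step through the buffer in explicit 4-byte strides.
    return [struct.unpack_from("<i", payload, offset)[0]
            for offset in range(0, len(payload), FIELD_WIDTH)]


# "<l" is struct's standard size and is 4 bytes on every platform;
# "@l" is the native C long and varies with the host, just like sizeof(long).
standard_long = struct.calcsize("<l")
native_long = struct.calcsize("@l")
```

On a typical 64-bit Linux box native_long differs from standard_long, which is exactly the mismatch that breaks a loop stepping by sizeof(long).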

Understanding the code

Debugging

For debugging purposes, it may be helpful to configure with --disable-shared. This turns off all the shared-library crud, making it somewhat easier to use gdb. (Don't forget to set CFLAGS=-g in the configure environment.)

There is a script called logextract in the devtools directory of the source distribution that you can use to strip clean NMEA out of the log files produced by gpsd. This can be useful if someone ships you a log that they allege caused gpsd to misbehave.

gpsfake enables you to repeatedly feed a packet sequence to a gpsd instance running as non-root. Watching such a session with gdb(1) should smoke out any repeatable bug pretty quickly.

Almost all the C programs have a -D option that enables logging of progress messages to standard error. In gpsd itself this ups the syslogging level if it is running in background; see the LOG_* defines in gpsd.h to get an idea of what the log levels do. Most of the test clients accept this switch to enable progress messages from the libgps code; you can use it, for example, to watch what the client-side parser for the wire protocol is actually doing.

The parsing of GPGSV sentences in the NMEA driver has been a persistent and nasty trouble spot, causing more buffer overruns and weird secondary damage than all the rest of the gpsd put together. Any time you get a bug report that seems to afflict NMEA devices only, suspicion should focus here.

Profiling

There is a barely-documented timing policy flag in the WATCH command that will cause it to emit a TIMING object on every sentence. The TIMING response contains the following attributes:

  1. tag: An identifying sentence tag.
  2. len: The character length of the sentence containing the timestamp data.
  3. timebase: The timestamp associated with the sentence, in seconds since the Unix epoch (this time is leap-second corrected, like UTC). This timestamp may be zero. If nonzero, it is the base time for the packet.
  4. xmit: An offset from the timestamp telling when gpsd believes the transmission of the current packet started (this is actually recorded just before the first read of the new packet). If the sentence timestamp was zero, this offset is a full timestamp and the base time of the packet.
  5. recv: An offset from the base time telling when gpsd received the last bytes of the packet.
  6. decode: An offset from the base time telling when gpsd decoded the data.
  7. poll: An offset from the base time taken just before encoding the response — effectively, when gpsd was polled to transmit the data.
  8. elapsed: An offset from the base time to the time of the TIMING emission.

The TIMING figures measure components of the latency between the GPS's time measurement and when the sentence data became available to the client. For them to be meaningful, the GPS has to ship timestamps with sub-second precision. SiRF-II and Evermore chipsets ship times with millisecond resolution. Your machine's time reference must also be accurate to subsecond precision; I recommend using ntpd, which will normally give you about 15 microseconds precision (two orders of magnitude better than GPSes report).

Note, some inaccuracy is introduced into the start- and end-of-packet timestamps by the fact that the last read of a packet may grab a few bytes of the next one.
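To make the relationship between the TIMING attributes concrete, here is a minimal Python sketch that splits one report into per-stage durations. It assumes the report has already been parsed into a dictionary keyed by the attribute names listed above; the function name and the names of the output stages are invented for this example, and this is not how gpsprof itself is written.

```python
def stage_latencies(timing):
    """Split one TIMING report into per-stage durations, in seconds."""
    if timing["timebase"]:
        # Normal case: xmit is an offset from the sentence timestamp, so it
        # is the delay between the GPS's measurement and the start of the read.
        xmit = timing["xmit"]
        before_read = xmit
    else:
        # Zero timestamp: xmit is itself the base time, so the remaining
        # offsets count from the start of the read and the first stage
        # cannot be measured.
        xmit = 0.0
        before_read = None
    return {
        "before_read": before_read,
        "serial_receive": timing["recv"] - xmit,
        "decode": timing["decode"] - timing["recv"],
        "wait_for_poll": timing["poll"] - timing["decode"],
        "emit": timing["elapsed"] - timing["poll"],
    }
```

The serial_receive figure is the one to compare against the expected transmit time for the sentence length at the current baud rate.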

The distribution includes a Python script, gpsprof, that uses the timing support to collect profiling information from a running gpsd instance. You can use this to measure the latency at each stage (GPS to daemon, daemon to client library) and to estimate the portion of the latency induced by serial transmit time. The gpsprof script creates latency plots using gnuplot(1). It can also report the raw data.

Porting to weird machines: endianness, width, and signedness issues.

The gpsd code is well-tested on 32- and 64-bit IA chips, also on PPCs. Thus, it's known to work on mainstream chips of either 32 or 64 bits and either big-endian or little-endian representation with IEEE 754 floating point.

Handling of NMEA devices should not be sensitive to the machine's internal numeric representations. However, because the binary-protocol drivers have to mine bytes out of the incoming packets and mung them into fixed-width integer quantities, there could potentially be issues on weird machines. The regression test should spot these.

If you are porting to a true 16-bit machine, or something else with an unusual set of data type widths, take a look at bits.h. We've tried to collect all the architecture dependencies here.

Architecture and how to hack it

There are two useful ways to think about the GPSD architecture. One is in terms of the layering of the software, the other is in terms of the normal flow of information through it.

The software-layering view

The gpsd daemon breaks naturally into four pieces: the drivers, the packet sniffer, the core library and the multiplexer. We'll describe these from the bottom up.

The drivers are essentially user-space device drivers for each kind of chipset we support. The key entry points are methods to parse a data packet into time-position-velocity or status information, change its mode or baud rate, probe for device subtype, etc. See Driver Architecture for more details about them.

The packet sniffer is responsible for mining data packets out of serial input streams. It's basically a state machine that's watching for anything that looks like a properly checksummed packet. Because devices can hotplug or change modes, the type of packet that will come up the wire from a serial or USB port isn't necessarily fixed forever by the first one recognized.
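To give a feel for what "properly checksummed" means in the simplest case, here is a small Python sketch of the standard NMEA 0183 checksum test: XOR together every character between the leading "$" and the "*", and compare with the two hex digits that follow. The function names are invented for this example. The real sniffer is a C state machine that also handles the binary protocols; this only illustrates the kind of check it applies.

```python
import string


def nmea_checksum(body):
    """XOR of all characters between '$' and '*', as NMEA 0183 defines it."""
    value = 0
    for ch in body:
        value ^= ord(ch)
    return value


def nmea_checksum_ok(sentence):
    """Return True if the sentence is framed as $...*HH with a matching checksum."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, given = sentence[1:].rpartition("*")
    # The checksum field must be exactly two hex digits.
    if len(given) != 2 or any(c not in string.hexdigits for c in given):
        return False
    return nmea_checksum(body) == int(given, 16)
```

A sentence that fails this test is treated as line noise rather than as evidence about the device type or baud rate.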

The core library manages a session with a GPS device. The key entry points are (a) starting a session by opening the device and reading data from it, hunting through baud rates and parity/stopbit combinations until the packet sniffer achieves synchronization lock with a known packet type, (b) polling the device for a packet, and (c) closing the device and wrapping up the session.

A key feature of the core library is that it's responsible for switching each GPS connection to using the correct device driver depending on the packet type that the sniffer returns. This is not configured in advance and may change over time, notably if the device switches between different reporting protocols (most chipsets support NMEA and one or more vendor binary protocols, and devices like AIS receivers may report packets in two different protocols on the same wire).

Finally, the multiplexer is the part of the daemon that handles client sessions and device assignment. It is responsible for passing TPV reports up to clients, accepting client commands, and responding to hotplug notifications. It is essentially all contained in the gpsd.c source file.

The first three components (other than the multiplexer) are linked together in a library called libgpsd and can be used separately from the multiplexer. Our other tools that talk to GPSes directly, such as gpsmon and gpsctl, do it by calling into the core library and driver layer directly.

Under some circumstances, the packet sniffer by itself is separately useful. gpscat uses it without the rest of the lower layer in order to detect and report packet boundaries in raw data. So does gpsfake, in order to chunk log files so they can be fed to a test instance of the daemon packet-by-packet with something approximating realistic timing.

The data-flow view

Essentially, gpsd spins in a loop polling for input from one of these sources:

  1. A set of clients making requests over a TCP/IP port.
  2. A set of GPSes, connected via serial or USB devices.
  3. A set of DGPS or NTRIP servers issuing periodic differential-GPS updates.
  4. The special control socket used by hotplug scripts and some configuration tools.

The daemon only connects to a GPS when clients are connected to it. Otherwise all GPS devices are closed and the daemon is quiescent.

All writes to client sockets go through throttled_write(). This code addresses two cases. First, the client has dropped the connection. Second, the client is connected but not picking up data and our buffers are backing up. If we let this continue, the write buffers will fill and the effect will be denial-of-service to clients that are better behaved.

Our strategy is brutally simple and takes advantage of the fact that GPS data has a short shelf life. If the client doesn't pick it up within a few minutes, it's probably not useful to that client. So if data is backing up to a client, drop that client. That's why we set the client socket to non-blocking.
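The policy is easy to sketch in Python. The helper below is hypothetical (it is not the daemon's throttled_write(), which is C); it assumes the socket has already been set non-blocking and simply reports whether the caller should keep the client or drop it.

```python
import socket


def try_write_or_drop(sock, data):
    """Attempt one non-blocking send; return False if the client should be dropped."""
    try:
        sent = sock.send(data)
    except BlockingIOError:
        # The kernel buffer is full: the client is not picking up data.
        return False
    except OSError:
        # The client has gone away (broken pipe, reset, and so on).
        return False
    # A short write also means data is backing up, so treat it the same way.
    return sent == len(data)
```

Because the socket never blocks, one slow client costs the daemon a single failed system call instead of stalling every other client.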

For similar reasons, we don't try to recover from short writes to the GPS, e.g. of DGPS corrections. They're packetized, so the device will ignore a fragment, and there will generally be another correction coming along shortly. Fixing this would require one of two strategies:

  1. Buffer any data not shipped by a short write for retransmission. Would require us to use malloc and just be begging for memory leaks.

  2. Block till select indicates the hardware or lower layer is ready for writing. Could introduce arbitrary delays for time-sensitive data.

So far, as of early 2009, we've only seen short writes on Bluetooth devices under Linux. It is not clear whether this is a problem with the Linux Bluetooth driver (it could be failing to coalesce and buffer adjacent writes properly) or with the underlying hardware (Bluetooth devices tend to be cheaply made and dodgy in other respects as well).

GPS input updates an internal data structure which has slots in it for all the data you can get from a GPS. Client commands mine that structure and ship reports up the socket to the client. DGPS data is passed through, raw, to the GPS.

The trickiest part of the code is the handling of input sources in gpsd.c itself. It has to tolerate clients connecting and disconnecting at random times, and the GPS being unplugged and replugged, without leaking file descriptors; it also has to arrange for the GPS to be powered up when and only when clients are active.

The special control socket is primarily there to be used by hotplug facilities like Linux udev. It is intended to be written to by scripts activated when a relevant device (basically, a USB device with one of a particular set of vendor IDs) is connected to or disconnected from the system. On receipt of these messages, gpsd may add a device to its pool, or remove one and (if possible) shift clients to a different one.

The reason these scripts have to look for vendor IDs is that USB has no GPS class. Thus, GPSes present the ID of whatever serial-to-USB converter chip they happen to be using. Fortunately there are fewer types of these in use than there are GPS chipsets; in fact, just two of them account for 80% of the USB GPS market and don't seem to be used by other consumer-grade devices.

Driver initialization and wrapup

Part of the job gpsd does is to minimize the amount of time attached GPSes are in a fully powered-up state. So there is a distinction between initializing the gpsd internal channel block structure for managing a GPS device (which we do when the hotplug system tells us it's available) and activating the device (when a client wants data from it) which actually involves opening it and preparing to read data from it. This is why gpsd_init() and gpsd_activate() are separate library entry points.

There is also a distinction between deactivating a device (which we do when no users are listening to it) and finally releasing the gpsd channel block structure for managing the device (which typically happens either when gpsd terminates or the hotplug system tells gpsd that the device has been disconnected). This is why gpsd_deactivate() and gpsd_wrap() are separate entry points.

gpsd has to configure some kinds of GPS devices when it recognizes them; this is what the event_identify and event_configure hooks are for. gpsd tries to clean up after itself, restoring settings that were changed by the configurator method; this is done by gpsd_deactivate(), which fires the deactivate event so the driver can revert settings.

Autoconfiguration

One of the design goals for gpsd is to be as near zero-configuration as possible. Under most circumstances, it doesn't require either the GPS type or the serial-line parameters to connect to it to be specified. Presently, here's how the autoconfig works.

  1. At each baud rate gpsd grabs packets until the sniffer sees either a well-formed and checksum-verified NMEA packet, a well-formed and checksum-verified packet of one of the binary protocols, or it sees one of the two special trigger strings EARTHA or ASTRAL, or it fills a long buffer with garbage (in which case it steps to the next baud-rate/parity/stop-bit).
  2. If it finds a SiRF packet, it queries the chip for firmware version. If the version is < 231.000 it drops back to SiRF NMEA. We're done.
  3. If it finds a Zodiac binary packet (led with 0xff 0x81), it switches to the Zodiac driver. We're done.
  4. If it finds an Evermore binary packet (led with DLE=0x10 followed by STX=0x02) it switches to Evermore binary protocol. We're done.
  5. If it finds a Navcom binary packet (led with 0x10 0x02 0x99 0x66) it switches to Navcom driver. We're done.
  6. If it finds a TSIP binary packet (led with 0x10=DLE), it switches to the TSIP driver. We're done.
  7. If it finds an iTalk v3 binary packet (led with <!), it switches to the iTalk driver. We're done.
  8. If it finds a UBX binary packet (led with µb, that is, 0xb5 0x62), it switches to the UBX driver. We're done.
  9. If it finds EARTHA, it selects the Earthmate driver, which then flips the connection to Zodiac binary mode. We're done.
  10. If it finds ASTRAL, it feeds the TripMate on the other end what it wants and goes to Tripmate NMEA mode. We're done.
  11. If it finds a NMEA packet, it selects the NMEA driver. This initializes by shipping all vendor-specific initialization strings to the device. The objectives are to enable GSA, disable GLL, and disable VTG. Probe strings go here too, like the one that turns on SiRF debugging output in order to detect SiRF chips.
  12. Now gpsd reads NMEA packets. If it sees a driver trigger string it invokes the matching driver.

The outcome is that we know exactly what we're looking at, without any driver-type or baud rate options.

(The above sequence of steps may be out of date. If so, it will be because we have added more recognized packet types and drivers.)
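The leader-byte part of the steps above can be sketched as a lookup table in Python. This is only an illustration: the real sniffer is a C state machine that verifies whole packets and their checksums, not just prefixes. The table and function names are invented. The SiRF (0xa0 0xa2) and NMEA ("$") leaders and the UBX bytes 0xb5 0x62 are not spelled out in the steps above and are filled in from our understanding of those protocols.

```python
# (leader bytes, driver name) pairs for a toy recognizer.
LEADERS = [
    (b"\xa0\xa2", "SiRF"),            # assumption: not stated in the steps above
    (b"\xff\x81", "Zodiac"),
    (b"\x10\x02", "Evermore"),
    (b"\x10\x02\x99\x66", "Navcom"),
    (b"\x10", "TSIP"),
    (b"<!", "iTalk"),
    (b"\xb5\x62", "UBX"),
    (b"EARTHA", "Earthmate"),
    (b"ASTRAL", "TripMate"),
    (b"$", "NMEA"),                   # assumption: standard NMEA sentence leader
]

# Try the longest leaders first. Navcom, Evermore and TSIP all begin with
# DLE (0x10), so a shorter prefix must not be allowed to win.
_BY_LENGTH = sorted(LEADERS, key=lambda pair: len(pair[0]), reverse=True)


def guess_driver(packet):
    """Return the driver name suggested by the packet's leading bytes, or None."""
    for leader, driver in _BY_LENGTH:
        if packet.startswith(leader):
            return driver
    return None
```

The ordering problem is the interesting part: three of the binary protocols share the same first byte, which is one reason the real code insists on a fully verified packet before committing to a driver.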

Error modeling

To estimate errors (which we must do if the GPS isn't nice and reports them in meters with a documented confidence interval), we need to multiply an estimate of User Equivalent Range Error (UERE) by the appropriate dilution factor.

The UERE estimate is usually computed as the square root of the sum of the squares of individual error estimates from a physical model. The following is a representative physical error model for satellite range measurements:

From R.B. Langley's 1997 "The GPS error budget", GPS World, Vol. 8, No. 3, pp. 51-56:

  Error source                        Error
  Atmospheric error (ionosphere)      7.0 m
  Atmospheric error (troposphere)     0.7 m
  Clock and ephemeris error           3.6 m
  Receiver noise                      1.5 m
  Multipath effect                    1.2 m

From Hofmann-Wellenhof et al. (1997), "GPS: Theory and Practice", 4th Ed., Springer:

  Error source                        Error
  Code range noise (C/A)              0.3 m
  Code range noise (P-code)           0.03 m
  Phase range                         0.005 m

We're assuming these are 2-sigma error ranges. This needs to be checked in the sources. If they're 1-sigma the resulting UEREs need to be doubled.

See this discussion of conversion factors.

Carl Carter of SiRF says: "Ionospheric error is typically corrected for at least in large part, by receivers applying the Klobuchar model using data supplied in the navigation message (subframe 4, page 18, Ionospheric and UTC data). As a result, its effect is closer to that of the troposphere, amounting to the residual between real error and corrections."

"Multipath effect is dramatically variable, ranging from near 0 in good conditions (for example, our roof-mounted antenna with few if any multipath sources within any reasonable range) to hundreds of meters in tough conditions like urban canyons. Picking a number to use for that is, at any instant, a guess."

"Using Hoffman-Wellenhoff is fine, but you can't use all 3 values. You need to use one at a time, depending on what you are using for range measurements. For example, our receiver only uses the C/A code, never the P code, so the 0.03 value does not apply. But once we lock onto the carrier phase, we gradually apply that as a smoothing on our C/A code, so we gradually shift from pure C/A code to nearly pure carrier phase. Rather than applying both C/A and carrier phase, you need to determine how long we have been using the carrier smoothing and use a blend of the two."

On Carl's advice we count the tropospheric figure twice (once for the troposphere itself and once as a stand-in for the residual ionospheric error after correction), and use the largest Hofmann-Wellenhof figure, the C/A code range noise:

UERE = sqrt(0.7^2 + 0.7^2 + 3.6^2 + 1.5^2 + 1.2^2 + 0.3^2) = 4.2

DGPS corrects for atmospheric distortion, ephemeris error, and satellite/receiver clock error. Thus:

UERE = sqrt(1.5^2 + 1.2^2 + 0.3^2) = 1.9

which we round up to 2 (95% confidence).

Due to multipath uncertainty, Carl says the computed non-DGPS figure is too low and recommends a non-DGPS UERE estimate of 8 (95% confidence). That's what we use.
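The root-sum-square arithmetic above is easy to check mechanically. Here is a minimal sketch in Python; the constant and function names are ours, chosen for illustration, and are not taken from the gpsd sources.

```python
import math

# Per-source range error estimates in meters, copied from the tables above.
TROPOSPHERE = 0.7
CLOCK_AND_EPHEMERIS = 3.6
RECEIVER_NOISE = 1.5
MULTIPATH = 1.2
CA_CODE_NOISE = 0.3


def root_sum_square(*terms):
    """Combine independent error estimates as sqrt(sum of squares)."""
    return math.sqrt(sum(t * t for t in terms))


# Non-DGPS: the tropospheric figure is counted twice; the second copy
# stands in for the residual ionospheric error after correction.
uere_plain = root_sum_square(TROPOSPHERE, TROPOSPHERE, CLOCK_AND_EPHEMERIS,
                             RECEIVER_NOISE, MULTIPATH, CA_CODE_NOISE)

# DGPS: the atmospheric, clock and ephemeris terms drop out.
uere_dgps = root_sum_square(RECEIVER_NOISE, MULTIPATH, CA_CODE_NOISE)


def position_error(uere, dop):
    """Scale a range error by a dilution-of-precision factor."""
    return uere * dop
```

For example, position_error(8.0, 1.5) gives 12.0 meters for the recommended non-DGPS UERE of 8 and an illustrative dilution factor of 1.5.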

Ancient history

The project is presently kept in a git repository. Up until mid-August 2004 (r256) it was kept in CVS, which was mechanically upconverted to Subversion. On 12 March 2010 the Subversion repository was converted to git. The external releases from the Subversion era still exist as tags.

Known trouble spots

The Y2.1K problem and other calendar issues

Because of limitations in various GPS protocols (e.g., they were designed by fools who weren't looking past the ends of their noses) this code unavoidably includes some assumptions that will turn around and bite on various future dates.

The three specific problems are:

  1. The GPS radio format has a Y2K-style bug, the week counter rollover, which happens either every 1024 weeks (roughly 20 years) or every 8192 weeks (roughly 157 years), depending on whether your receiver can decode a 10-bit or 13-bit GPS week field. At time of writing the last 0 week was in 1999, the next 10-bit wraparound will be in 2019, and the next 13-bit wraparound will be in 2157.
  2. NMEA delivers only two-digit years.
  3. SiRF chips at firmware level 231 deliver only GPS time in binary mode, not leap-second-corrected UTC.

Because of the first problem, the receiver's notion of the year may reset to the year of the last zero week if it is cold-booted on a date after a rollover. This can have side effects:

  1. The year part of the reported date will be invalid.
  2. UTC time will be correct, but a local time calculated from it will sometimes be off by ±1 hour, due to incorrect DST calculation.
  3. Some receivers may fail to get a fix, especially if they don't have a recent ephemeris.

The public documentation is unclear, but it appears from a reference in the Transmission Week Number section of IS-GPS-200 PIRN-002 that whether you can get 10 or 13 bits is a function of the satellite firmware revision, with 13 bits in the Block IIF and later birds (the first of these was launched in May 2010). Of course your receiver firmware also has to know that the extra three bits are present; at time of writing in late 2010 this capability is very rare and unavailable on consumer-grade receivers.

For these reasons, GPSD needs the host computer's system clock to be accurate to within one second. (Work is in progress to relax this requirement to accuracy within half of a rollover period.)
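To make the rollover problem concrete, here is a sketch of how a truncated 10-bit week number could be disambiguated against a reference date such as the host's system clock. It picks the full week number closest to the reference, which is why the reference only has to be good to within half a rollover period. This is an illustration under stated assumptions, not the code gpsd uses: the function names are invented, and leap seconds are ignored because only whole days matter here.

```python
from datetime import date, timedelta

GPS_EPOCH = date(1980, 1, 6)   # start of GPS week 0
ROLLOVER_WEEKS = 1024          # period of a 10-bit week counter


def resolve_gps_week(truncated_week, reference_date):
    """Return the full GPS week nearest the reference date.

    truncated_week is the 10-bit value (0..1023) from the receiver;
    reference_date is an independent estimate of today's date.
    """
    ref_week = (reference_date - GPS_EPOCH).days // 7
    base = ref_week - (ref_week % ROLLOVER_WEEKS)
    # Try the previous, current and next rollover eras.
    candidates = [base + k * ROLLOVER_WEEKS + truncated_week
                  for k in (-1, 0, 1)]
    return min(candidates, key=lambda week: abs(week - ref_week))


def week_start(full_week):
    """Calendar date on which a full (unwrapped) GPS week begins."""
    return GPS_EPOCH + timedelta(weeks=full_week)
```

With a reference date of 10 April 2019, a reported week of 0 resolves to full week 2048 and a reported week of 1023 resolves to 2047, the week just before the rollover.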

When debugging time and date issues, you may find an interactive GPS calendar useful.

Hotplug interface problems

The hotplug interface works pretty nicely for telling gpsd which device to look at, at least on Fedora and Ubuntu Linux machines. But it's Linux-specific. OpenBSD (at least) features a hotplug daemon with similar capabilities. We ought to do the right thing there as well.

Hotplug is nice, but on Linux it appears to be a moving target. For help debugging a hotplug problem, see Udev Hotplug Troubleshooting.

Security Issues

Between versions 2.16 and 2.20, hotplugging was handled in the most obvious way, by allowing the F command to declare new GPS devices for gpsd to look at. Because gpsd ran as root, this had problems:

  1. A malicious client with non-root access on the host could use F to point gpsd at a spoof GPS that was actually a pty feeding bogus location data.
  2. A malicious client could use repeated probes of a target tty or other device to cause data loss to other users. This is a potential remote exploit! Not too bad if the bytes he steals are from your mouse; it would just get jumpy and jerky. But suppose they're from an actual tty and sections drop out of a serial data stream you were relying on?

The conclusion was inescapable. Switching among and probing devices that gpsd already knows about can be an unprivileged operation, but editing gpsd's device list must be privileged. Hotplug scripts should be able to do it, but ordinary clients should not.

Adding an authentication mechanism was considered and rejected (can you say "can of big wriggly worms"?). Instead, there is a separate control channel for the daemon, only locally accessible, only recognizing "add device" and "remove device" commands.

The channel is a Unix-domain socket owned by root, so it has file-system protection bits. An intruder would need root permissions to get at it, in which case you'd have much bigger problems than a spoofed GPS.
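The shape of such a control channel can be sketched in a few lines of Python. This is an illustration only, not the daemon's implementation: the "+path" / "-path" command syntax, the "OK" / "ERROR" replies and both function names are assumptions made for the example.

```python
import os
import socket


def open_control_socket(path):
    """Create a Unix-domain listening socket that only its owner can use."""
    if os.path.exists(path):
        os.unlink(path)
    old_umask = os.umask(0o177)      # socket file is created with mode 0600
    try:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.bind(path)
    finally:
        os.umask(old_umask)
    sock.listen(1)
    return sock


def handle_control_command(devices, line):
    """Apply one control command to the device list, editing it in place.

    Assumed syntax for this sketch: "+<path>" adds a device and
    "-<path>" removes one.  Everything else is rejected.
    """
    line = line.strip()
    if len(line) < 2 or line[0] not in "+-":
        return "ERROR"
    op, path = line[0], line[1:]
    if op == "+":
        if path in devices:
            return "ERROR"
        devices.append(path)
        return "OK"
    if path not in devices:
        return "ERROR"
    devices.remove(path)
    return "OK"
```

The point of the design is that the command vocabulary is tiny and access control is delegated entirely to file-system permissions on the socket, so no authentication protocol is needed.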

More generally, gpsd certainly needs to treat command input as untrusted, and for safety's sake should treat GPS data as untrusted too (in particular this means never assuming that either source won't try to overflow a buffer).

Daemon versions after 2.21 drop privileges after startup, setting UID to "nobody" and GID to whichever group owns the GPS device specified at startup time — or, if it doesn't exist, the system's lowest-numbered TTY device named in PROTO_TTY. It may be necessary to change PROTO_TTY in gpsd.c for non-Linux systems.

Adding new GPS types

This section explains the conventions drivers for new devices should follow.

Driver architecture

Internally, gpsd supports multiple GPS types. All are represented by driver method tables; the main loop knows nothing about the driver methods except when to call them. At any given time one driver is active; by default it's the NMEA one.

To add a new device, populate another driver structure and add it to the null-terminated array in drivers.c.

Unless your driver is a nearly trivial variant on an existing one, it should live in its own C source file named after the driver type. Add it to the libgps_c_sources name list in Makefile.am.

The easiest way to write a driver is probably to copy the driver_proto.c file in the source distribution, change names appropriately, and write the guts of the analyzer and writer functions. Look in gpsutils.c before you do; driver helper functions live there. Also read some existing drivers for clues.

You can read an implementer's Notes On Writing A GPSD Driver.

There's a second kind of driver architecture for gpsmon, the real-time packet monitor and diagnostic tool. It works from monitor-object definitions that include a pointer to the device driver for the GPS type you want to monitor. See monitor_proto.c for a prototype and technical details.

When not to add a driver

It is not necessary to add a driver just because your NMEA GPS wants some funky initialization string. Simply ship the string in the initializer for the default NMEA driver. Because vendor control strings live in vendor-specific namespaces (PSRF for SiRF, PGRM for Garmin, etc.) your initializing control string will almost certainly be ignored by anything not specifically watching for it.

Initializing time and date

Some mode-changing commands have a time field that initializes the GPS clock. If the designers were smart, they included a control bit that allows the GPS to retain its clock value (and previous fix, if any) and lets you leave those fields empty (sometimes this is called "hot start").

If the GPS-Week/TOW fields are required, as on the Evermore chip, don't just zero them. GPSes do eventually converge on the correct time when they've tracked the code from enough satellites, but the time required for convergence is related to how far off the initial value is. Most modern receivers can cold start in 45 seconds given good reception; under suboptimal conditions this can take upwards of ten minutes. So make a point of getting the time of week right.

How drivers are invoked

Drivers are invoked in one of three ways: (1) when the NMEA driver notices a trigger string associated with another driver, (2) when the packet state machine in packet.c recognizes a special packet type, or (3) when a probe function returns true during device open.

Each driver may have a trigger string that the NMEA interpreter watches for. When that string is recognized at the start of a line, the interpreter switches to its driver.

When a driver switch takes place, the old driver's wrapup method is called. Then the new driver's initializer method is called.
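The trigger-string mechanism and the wrapup/initializer ordering can be modeled as follows. This is a toy Python model, not the C code: the class and method names mirror the prose, and the "$PXYZ," trigger is made up.

```python
class Driver:
    """Stand-in for a driver method table (names are illustrative)."""

    def __init__(self, name, trigger=None):
        self.name = name
        self.trigger = trigger       # string to watch for, or None

    def initializer(self, session):
        session.events.append(self.name + ":init")

    def wrapup(self, session):
        session.events.append(self.name + ":wrapup")


class Session:
    def __init__(self, drivers):
        self.drivers = drivers
        self.active = drivers[0]     # by default, the NMEA driver
        self.events = []             # record of method calls, for inspection

    def switch_driver(self, new_driver):
        if new_driver is self.active:
            return
        self.active.wrapup(self)     # old driver cleans up first
        self.active = new_driver
        new_driver.initializer(self)

    def nmea_line(self, line):
        """Switch drivers if the line starts with a known trigger string."""
        for driver in self.drivers:
            if driver.trigger and line.startswith(driver.trigger):
                self.switch_driver(driver)
                return True
        return False
```

Feeding this model a line beginning with "$PXYZ," while the NMEA driver is active records "NMEA:wrapup" followed by the new driver's ":init" event, in that order.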

A good thing to send from the NMEA probe_subtype method is probe strings. These are strings which should elicit an identifying response from the GPS that you can use as a trigger string for a native-mode driver.

Don't worry about probe strings messing up GPSes they aren't meant for. In general, all GPSes have rather rigidly defined packet formats with checksums. Thus, for this probe to look legal in a different binary command set, not only would the prefix and any suffix characters have to match, but the checksum algorithm would have to be identical.
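As an illustration of how rigid these formats are, here is the NMEA 0183 framing check in Python: the checksum is the XOR of every character between the '$' and the '*', written as two hex digits. A receiver that does not recognize the sentence, or sees a checksum that does not match, simply discards it. The "PXYZ" sentence is a made-up placeholder, not a real vendor command, and the function names are ours.

```python
def nmea_checksum(body):
    """XOR of all characters between '$' and '*', as two hex digits."""
    value = 0
    for ch in body:
        value ^= ord(ch)
    return "%02X" % value


def frame(body):
    """Wrap a sentence body in NMEA framing with its checksum."""
    return "$%s*%s\r\n" % (body, nmea_checksum(body))


def is_valid(sentence):
    """Check the framing and checksum of one received sentence."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, given = sentence[1:].rpartition("*")
    return given.upper() == nmea_checksum(body)
```

Here frame("PXYZ,1") yields "$PXYZ,1*16" plus the line terminator, and changing a single payload character without recomputing the checksum makes is_valid() return False.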

Incoming characters from the GPS device are gathered into packets by an elaborate state machine in packet.c. The purpose of this state machine is so gpsd can autobaud and recognize GPS types automatically. The other way for a driver to be invoked is for the state machine to recognize a special packet type associated with the driver. It will look through the list of drivers compiled in to find the (first) one that handles that packet type.
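The "first driver that handles that packet type" rule amounts to a linear scan of the driver list. A minimal Python sketch, with invented packet-type constants and driver names:

```python
from collections import namedtuple

# Hypothetical packet-type codes; the real ones live in the C sources.
NMEA_PACKET = 1
BINARY_PACKET = 2

Driver = namedtuple("Driver", "name packet_type")


def find_driver(drivers, packet_type):
    """Return the first compiled-in driver handling packet_type, or None."""
    for driver in drivers:
        if driver.packet_type == packet_type:
            return driver
    return None
```

Because the scan stops at the first match, the order of entries in the driver list matters when two drivers claim the same packet type.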

If you have to add a new packet type to packet.c, add tests for the type to the TESTMAIN code.

Probe-detect methods are intended for drivers that don't use the packet getter because they read from a device with special kernel support. See the Garmin binary driver for an example.

Where to put the data you get from the GPS

Your driver should put new data from each incoming packet or sentence in the 'gpsdata' member of the GPS (fixes go in the 'newdata' member), and return a validity flag mask telling what members were updated. (All float members are initially set to not-a-number.) There is driver-independent code that is responsible for merging that new data into the existing fix. To assist this, the CYCLE_START_SET flag is special. Set this when the driver returns the first timestamped message containing fix data in an update cycle. (This excludes satellite-picture messages and messages about GPS status that don't contain fix data.)

Your packet parser must return field-validity mask bits (using the _SET macros in gps.h), suitable to be put in session->gpsdata.valid. The watcher-mode logic relies on these as its way of knowing what to publish. Also, you must ensure that gpsdata.fix.mode is set properly to indicate fix validity after each message; the framework code relies on this. Finally, you must set gpsdata.status to indicate when DGPS fixes are available, whether through RTCM or WAAS/Egnos.
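The contract between a driver and the merge code can be sketched like this. It is a Python model under stated assumptions: the flag values and field names are illustrative rather than copied from gps.h, and clearing the fix at the start of a cycle is our reading of the buffering policy, not a transcription of the C code.

```python
import math
from enum import IntFlag


class Valid(IntFlag):
    """Illustrative validity bits; the values are made up for this sketch."""
    TIME_SET = 1
    LATLON_SET = 2
    ALTITUDE_SET = 4
    CYCLE_START_SET = 8


# Which fix fields each validity bit vouches for.
FIELDS = {
    Valid.TIME_SET: ("time",),
    Valid.LATLON_SET: ("latitude", "longitude"),
    Valid.ALTITUDE_SET: ("altitude",),
}


def empty_fix():
    """A fix with every float member set to not-a-number."""
    return {name: math.nan for names in FIELDS.values() for name in names}


def merge(fix, newdata, mask):
    """Fold one packet's new data into the accumulated fix."""
    if mask & Valid.CYCLE_START_SET:
        fix.update(empty_fix())          # a new cycle starts from scratch
    for flag, names in FIELDS.items():
        if mask & flag:                  # copy only what the driver vouches for
            for name in names:
                fix[name] = newdata[name]
    return fix
```

Fields whose bits are not set in the mask are never read from newdata, so a driver that forgets to set a bit silently loses that data; that is why the mask has to be accurate.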

Your packet parser is also responsible for setting the tag field in the gps_data_t structure. This is the string that will be emitted as the first field of each $ record for profiling. The packet getter will set the sentence-length for you; it will be raw byte length, including both payload and header/trailer bytes.

Note, also, that all the timestamps your driver puts in the session structure should be UTC (with leap-second corrections) not just Unix seconds since the epoch. The report-generator function for D does not apply a timezone offset.

Report errors with a 95% confidence interval

gpsd drivers are expected to report position error estimates with a 95% confidence interval. A few devices (Garmins and Zodiacs) actually report error estimates. For the rest we have to compute them using an error model.

Here's a table that explains how to convert from various confidence interval units you might see in vendor documentation.

 
sqrt(alpha)   Probability   Notation
1.00          39.4%         1-sigma or standard ellipse
1.18          50.0%         Circular Error Probable (CEP)
1.414         63.2%         Distance RMS (DRMS)
2.00          86.5%         2-sigma ellipse
2.45          95.0%         95% confidence level
2.828         98.2%         2DRMS
3.00          98.9%         3-sigma ellipse

There are constants in gpsd.h for these factors.
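Converting a vendor figure to the 95% level is a matter of dividing out the vendor's factor and multiplying by the 95% one. A small Python sketch using factors from the table; the dictionary keys and the function name are ours, not the gpsd.h constant names.

```python
# Scale factors from the table above, keyed by illustrative labels.
SCALE = {
    "1-sigma": 1.00,
    "CEP": 1.18,
    "DRMS": 1.414,
    "2-sigma": 2.00,
    "95%": 2.45,
    "3-sigma": 3.00,
}


def convert_error(value, from_unit, to_unit="95%"):
    """Rescale an error radius from one confidence notation to another."""
    return value * SCALE[to_unit] / SCALE[from_unit]
```

For example, a vendor-quoted 5 meter CEP corresponds to roughly 10.4 meters at 95% confidence.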

Log files for regression testing

Any time you add support for a new GPS type, you should also send us a representative log for your GPS. This will help ensure that support for your device is never broken in any gpsd release, because we will run the full regression before we ship.

The correct format for a capture file is described in the FAQ entry on reporting bugs.

See the header comment of the gpsfake.py module for more about the logfile format.

An ideal log file for regression testing would include an initial portion during which the GPS has no fix, a portion during which it has a fix but is stationary, and a portion during which it is moving.

Future Protocol Directions

The new protocol based on JSON (JavaScript Object Notation) shipped in 2.90.

A major virtue of JSON is its extensibility. There are lots of other things a sensor wedded to a GPS might report that don't fit the position-velocity-time model of the oldstyle O report. Depth of water. Temperature of water. Compass heading. Roll. Pitch. Yaw. We've already had requests to handle some of these for NMEA-emitting devices like magnetic compasses (which report heading via a proprietary TNTHTM sentence) and fish finders (which report water depth and temperature via NMEA DPT and MTW sentences). JSON gives a natural way to add ad-hoc fields, and we expect to exploit that in the future.
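Here is a sketch of why this is painless for clients: a reader that picks out the fields it knows about keeps working when a report grows extra ones. The field names in this Python example (including "depth" and "wtemp") are illustrative assumptions, not a specification of the protocol.

```python
import json

# Fields this hypothetical client understands.
CORE_FIELDS = ("time", "lat", "lon", "alt")


def split_report(line):
    """Split one JSON report into (class, known fields, everything else)."""
    report = json.loads(line)
    core = {k: report[k] for k in CORE_FIELDS if k in report}
    extras = {k: v for k, v in report.items()
              if k not in CORE_FIELDS and k != "class"}
    return report.get("class"), core, extras
```

An old client simply ignores the extras; a newer one can look there for depth, temperature, heading and the like without any change to the framing.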

Proposed sentences:

Chris Kuethe has floated the following list of requests for discussion:

?A .. ?Z -> map to the original A..Z gpsd commands
?almanac -> poll the almanac from the receiver
?ephemeris -> poll the ephemeris from the receiver
?assist -> load assistance data (time, position, etc) into the receiver
?raw -> get a full dump of the last measurement cycle... at least
           clock, doppler, pseudorange and carrier phase.
?readonly -> get/set read-only mode. no screwing up bluetooth devices
?listen -> set the daemon's bind address. added privacy for laptop users.
?port -> set the daemon's control port used by the regression
            tests, at least.
?debug -> set/modify the daemon's debug level, including after launch.

Blind alleys

Things we've considered doing and rejected.

Reporting fix data only once per cycle

See the discussion of the buffering problem, above. The "Buffer all, report then clear on start-of-cycle" policy would introduce an unpleasant amount of latency. gpsd actually uses the "Buffer all, report on every packet, clear at start-of-cycle" policy.

Allowing clients to ship arbitrary control strings to a GPS

Tempting — it would allow us to do gpsmon-like things with the daemon running — but a bad idea. It would make denial-of-service attacks on applications using the GPS far too easy. For example, suppose the control string were a baud-rate change?

Setting FIFO threshold to 1 to reduce jitter in serial-message times

When using gpsd as a time reference, one of the things we'd like to do is make the amount of lag in the message path from GPS to gpsd small and with as little jitter as possible, so we can correct for it with a constant offset.

A possibility we considered is to set the FIFO threshold on the serial device UART to 1 using TIOCGSERIAL/TIOCSSERIAL. This would, in effect, disable transmission buffering, increasing lag but decreasing jitter.

But it's almost certainly not worth the work. Rob Janssen, our timekeeping expert, reckons that at 4800bps the UART buffering can cause at most about 15msec of jitter. This is, observably, swamped by other less controllable sources of variation.

Stop using a compiled-in UTC-TAI offset

Instead, from the hotplug script, we could maintain a local offset file:

  1. If there is no local offset file, download the current leap-second offset from IERS or the U.S. Naval Observatory and copy it to a local offset file
  2. If there is a local offset file, consider it stale after five months and reload it.
  3. gpsd should read the local offset file when it starts up, if it exists.

However, it turns out this is only an issue for EverMore chips. SiRF GPSes can get the offset from the PPS or subframe data; NMEA GPSes don't need it; and the other binary protocols supply it. Looks like it's not worth doing.

Subsecond polling

gpsd relies on the GPS to periodically send TPV reports to it. A few GPSes have the capability to change their cycle time so they can ship reports more often (gpsd 'c' command). These all send in some vendor-binary format; no NMEA GPS I've ever seen allows you to set a cycle time of less than a second, if only because at 4800bps, a full TPV report takes just under one second in NMEA.

But most GPSes send TPV reports once a second. At 50km/h (31mi/h) that's about 13.9 meters change in position between updates, about the same as the uncertainty of position under typical conditions.

There is, however, a way to sample some GPSes at higher frequency. SiRF chips, and some others, allow you to shut down periodic notifications and poll them for TPV. At 57600bps we could poll a NMEA GPS 16 times a second, and a SiRF one maybe 18 times a second.
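The rate figures in this section follow from simple serial-line arithmetic, sketched below in Python. Two assumptions are ours: 10 bits on the wire per character (8N1 framing), and a report size of 360 characters, picked only because it reproduces the "just under one second at 4800bps" and "16 times a second at 57600bps" figures quoted here.

```python
BITS_PER_CHAR = 10   # assumption: 8 data bits plus start and stop bits


def reports_per_second(bps, chars_per_report):
    """Upper bound on report rate imposed by the serial line speed."""
    return (bps / BITS_PER_CHAR) / chars_per_report


def meters_between_fixes(speed_kmh, fixes_per_second=1.0):
    """Distance travelled between consecutive fixes at a given speed."""
    return speed_kmh * 1000.0 / 3600.0 / fixes_per_second
```

With those assumptions a 360-character report takes 0.75 seconds at 4800bps, and the line could carry 16 of them per second at 57600bps. The line speed is only an upper bound; the receiver still has to be willing to compute fixes that often.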

Alas, Chris Kuethe reports: "At least on the SiRF 2 and 3 receivers I have, you get one fix per second. I cooked up a test harness to disable as many periodic messages as possible and then poll as quickly as possible, and the receiver would not kick out more than one fix per second. Foo!"

So subsecond polling would be a blind alley for all SiRF devices and all NMEA devices. That's well over 90% of cases and renders it not worth doing.

Release Procedure

First, defining some terms. There are three tiers of code in our tarballs. Each has a different risk profile.

1) Exterior code

Test clients and test tools: xgps, cgps, gpsfake, gpsprof, gpsmon, etc. These are at low risk of breakage and are easy to eyeball-check for correctness -- when they go wrong they tend to do so in obvious ways. Point errors in a tool don't compromise the other tools or the daemon.

2) Device drivers

Drivers for the individual GPS device types. Errors in a driver can be subtle and hard to detect, but they generally break support for one class of device without affecting others. Driver maintainers can test their drivers with high confidence.

3) Core code

Core code is shared among all device types; most notably, it includes the packet-getter state machine, the channel-management logic, and the error-modeling code. Historically these are the three most bug-prone areas in the code.

We also need to notice that there are two different kinds of devices with very different risk profiles:

A) Static-testable: These are devices like NMEA and SiRF chipsets that can be effectively simulated with a test-load file using gpsfake. We can verify these with high confidence using a mechanical regression test.

B) The problem children: Currently this includes Garmin USB and the PPS support. In the future it must include any other devices that aren't static-testable. When the correctness of drivers and core code in handling is suspect, they have to be live-tested by someone with access to the actual device.

The goal of our release procedure is simple: prevent functional regressions. No device that worked in release N should break in release N+1. Of course we also want to prevent shipping broken core code.

For static-testable devices this is fairly easy to ensure. Now that we've fixed the problems with ill-conditioned floating-point, the regression-test suite does a pretty good job of exercising those drivers and the related core code and producing repeatable results. Accordingly, I'm fairly sure we will never again ship a release with serious breakage on NMEA or SiRF devices.

The problem children are another matter. Right now our big exposure here is Garmins, but we need to have a good procedure in case we get our TnT support unbroken, and for other ill-behaved devices we might encounter in the future.

Here are the new release-readiness states:

State 0 (red): There are known blocker bugs. Blocker bugs include functional regressions in core or driver code.

State 1 (blue): There are no known blocker bugs. 'make testregress' passes, but problem-children (USB Garmin and serial PPS) have not been live-tested.

State 2 (yellow): There are no known blocker bugs. 'make testregress' passes. Problem children have been live-tested. From this state, we drop back to state 1 if anyone commits a logic change to core code or the driver for a problem child. In state 2, devs with release authority (presently myself, Chris, and Gary) may ship a release candidate at any time.

State 3 (green): We've been in state 2 for 7 days. In state 3, a dev with release authority can call a freeze for release.

State 4 (freeze): No new features that could destabilize existing code. Release drops us to state 3.

When you do something that changes our release state -- in particular, when you commit a patch that touches core or a problem-child driver at state 2 -- you must notify the dev list.

Anyone notifying the list of a blocker bug drops us back to state 0.

When you announce a state change on the dev list, do it like this:

     Red light: total breakage in Garmin USB, partial breakage in Garmin serial
     Blue light: no known blockers, cosmetic problems in xgps
     Yellow light: Garmins tested successfully 20 Dec 2007
     Green light: I'm expecting to call freeze in about 10 days
     Freeze: Scheduled release date 1 Feb 2008

Release Checklist

This is a reminder for release engineers preparing to ship a public gpsd version. Release requires the following steps in the following order:

1. Check the Berlios bug tracker for release blockers.
2. Issue the pre-release heads-up
About 48 hours before release, announce that it's coming so people will have a day or so to get their urgent fixes in.
3. Code pull from the public repo
This is the revision the release will be built from.
4. Check that the version number is correct
It may need to be modified in configure.ac, and another autogen.sh done. Make sure the 'dev' suffix is gone.
5. Rebuild leapcheck.cache.
Rebuild the leap-second list used for rollover checks with 'make leapcheck.cache'. Note: This requires Internet access, and will fail (leaving the existing leap-second list in place) if that is not available.
6. Update the NEWS file
Make sure the topmost entry in the changelog portion of gpsd.spec.in is up-to-date and properly timestamped.
7. Run the regression tests and other checks
If it doesn't pass regressions, it isn't ready to ship.
  1. make testbuild
  2. make testregress
  3. make splint
  4. make cppcheck
  5. make xmllint
  6. devtools/valgrind-audit
  7. devtools/flocktest
  8. scan-build
8. Ship to Berlios
Make the tarball (make dist) and upload to berlios.de (make upload-ftp), and do Berlios's release procedure.
9. Tag the release in git
Doing 'make release-tag' should accomplish this. The tag push will trigger mail to the announce list.
10. Bump the release number and push that commit
Bump the release number in configure.ac, adding a '~dev' suffix so tarball instances pulled from the repo will be clearly distinguishable from the next public release.
11. Announce the release
Announce the release, and the resumption of regular commits, on the dev list.

Packaging links

Debian build controls