summaryrefslogtreecommitdiff
path: root/TODO
blob: bd153d5f2a25294780fa1a392b27ee293539ee2d (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
This is the gpsd to-do list.  If you're viewing it with Emacs, try
doing Ctl-C Ctl-t and browsing through the outline headers.  Ctl-C Ctl-a 
will unfold them again.

** Bugs in gpsd and its clients:

*** EPH and EPV reports are zeroed too often in the TSIP driver 

There is some bad interaction between the policy code in
libgpsd_core.c and the TSIP driver that we haven't figured out.

This may be a symptom of more general problems in data management
on devices that ship several sentences of fix and related data
per cycle.  It does not affect devices speaking SiRF or Zodiac or 
Garmin-binary protocol.

*** PPS code is flaky, possibly due to a pthreads bug

Some code attempting to terminate the PPS-monitoring thread when there
is no DCD (e.g., on a USB device) seems to have tickled some kind of
bug in pthreads -- termination seems to close the GPS device or
otherwise do something nasty to the serial I/O layer.  

The default build has ENABLE_PPS off until we figure this one out.

** Bugs gpsd tickles in other software

*** A long-running xgps may induce a memory leak in the X server

Rob Janssen writes, reporting from SuSE 9.2 + Xorg 6.8.1 + KDE 3.3:
>I have found something that leaks.  Not in our software, but in the X server.
>But caused by xgps.
>
>After 3 copies of xgps ran for a couple of days, I noticed a lot of swap
>is in use:
>             total       used       free     shared    buffers     cached
>Mem:       1035776     992324      43452          0     174580      54756
>-/+ buffers/cache:     762988     272788
>Swap:      2104440    1444944     659496
>
>This is more than normal for my system.  I noticed it when switching to a
>virtual screen where mozilla is running.  It took several seconds to
>redraw.
>
>I did a "ps axu" and found these interesting lines:
>
>USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
>root      7004  2.5 33.6 1861128 348460 ?    SL   Jun19 210:46 /usr/X11R6/bi
>ntp      18011  0.0  0.2  2696 2692 ?        SLs  Jun19   0:00 /usr/sbin/ntp
>nobody   23105  0.0  0.0  7004  900 ?        S<sl Jun21   0:17 /home/rob/src
>rob      28724  0.2  0.1  7084 1968 pts/18   S    Jun22   9:03 /home/rob/src
>rob      28744  0.3  0.1  7084 2040 pts/18   S    Jun22  14:52 /home/rob/src
>rob      28759  0.2  0.1  7084 1936 pts/18   S    Jun22   9:44 /home/rob/src
>
>So the gps programs were not consuming that much memory.
>I still stopped the 3 xgps programs (the last three in this list) and then
>the X server showed:
>root      7004  2.5  2.2 367140 23424 ?      SL   Jun19 210:49 /usr/X11R6/bi
>
>So there certainly is a relation here.
>Unfortunately I have zero knowledge about X programming.  I would guess
>some kind of session or operation is started and never ended, or something
>is said to be saved, but apparently it remains related to the specific
>window because the X server neatly frees it once the program disconnects.

But Rob's xgps memory leak doesn't reproduce on a stock Fedora Core 3
system.  ESR tested for this in the simplest possible way, by doing
system("free t") at the end of each handle_input() call.

*** Under Reiserfs gpsd may conspire with I/O buffering traffic to eat the CPU.

Rob Janssen writes:
> When gpsd is running and I start a backup of my system, which starts and
> mounts a normally idle disk and does a tar cvzf to it, the load of the
> system is quite high and gpsd seems to get out of sync with the received
> data from the SiRF receiver.
> It then gets stuck in a tight loop.  As gpsd is running at nice
> --10, this further increases the load and slows down the backup to a crawl.
> Killing gpsd makes the backup finish and then it can be started again.

He further observers that the FIONREAD ioctl() we use to check for
input waiting spins returning 0 when this happens.  We suspect that
memory resource crises do something nasty to the buffering in the
Linux serial layer.  The bug seems not to occur during ext3 backups,
only Reiserfs ones.

** To do:

*** Track error computation

Presently the track error member in the fix structure is neither
reported by any GPS nor filled in by computation.

*** Hotplug interface problems

The hotplug interface works pretty nicely for telling gpsd which
device to look at, at least on my FC3 Linux machines.  The fly in the
ointment is that I'm using a deprecated version of the interface, the
old-style /etc/hotplug version with usermap files.

It is unlikely this interface will be dropped by distro makers any
time soon, because it's supporting a bunch of popular USB cameras.
Still, it would be nice not to be using a deprecated interface.

I tried moving to the new-style /etc/hotplug.d interface, but I ran
into a nasty race condition.  My hotplug agent got woken up on a USB
add event as it should, but in the new interface the creation of
/dev/ttyUSB* can be delayed arbitrarily long after the wakeup event.
Thus, it may not be there when gpsd goes to probe it unless I
busy-wait in the script.

Ultimately this should all be done through udev.  The problem is that at
the current state of udev, we'd need to do it through a script that would
fire every time a tty activates.  Because of virtual consoles firing up at
boot time, this would introduce significant boot lag.

This would be antisocial and I'm not willing to do it, so udev needs
to grow better filtering before I'll use it.

When and if udev supports HOTPLUG and ACTION keys, this will work:

# The Prolific Technology 2303 (commonly in tandem with SiRF chips)
BUS="usb" SYSFS{vendor}="067b" SYSFS{product}="2303" \
		NAME="gps%e" \
		HOTPLUG="/usr/bin/gps-probe"
# FTDI 8U232AM
BUS="usb" SYSFS{vendor}="0403" SYSFS{product}="6001" \
		NAME="gps%e" \
		HOTPLUG="/usr/bin/gps-probe"
# Cypress M8/CY7C64013 (DeLorme uses these)
BUS="usb" SYSFS{vendor}="1163" SYSFS{product}="0100" \
		NAME="gps%e" \
		HOTPLUG="/usr/bin/gps-probe"

More generally, the hotplug code we have is Linux-specific.  OpenBSD
(at least) features a hotplug daemon with similar capabilities.

*** The mess near error modeling

One of my goals has been to report an uncertainty along with every
dimension of PVT, so that the return from the GPS actually (and
realistically) describes the volume of kinematic state space within
which it is located at 1-sigma or 66% confidence. (Because the errors
are taken to be normally distributed, we can square the error to get
2-sigma or 95% confidence.)

There are several problems with this. 

A. I don't know how to derive or estimate uncertainty of time in the
general case.  There are clock drift and bias fields in the SiRF
binary protocol, but I don't know how to interpret these.  Does
anyone?

B. Only Garmin devices report estimated position uncertainties in meters.
They won't say what the confidence interval is, but it is generally
believed to be 1-sigma.  See <http://gpsinformation.net/main/epenew.txt>.

Here is what I am presently doing in the new E command:

1. I pass up the Garmin PGRME fields (uncertainty in meters) if
   they're available.

2. Otherwise, I apply the error model described in the HACKING dociment.

What non-Garmin GPSes will return in the E command is UERE multiplied
by PDOP/HDOP/VDOP.  Annoyingly, SiRF binary mode only offers HDOP,
one respect in which it is functionally inferior to SiRF NMEA.  We
compute VDOP and PDOP using an algorithm supplied by SiRF.

I don't know, because my sources didn't give, the confidence level
associated with the range uncertainties in gpsd.h.  My educated guess
is that they are 1-sigma (66%), and that's what the gpsd documentation
now says, but it needs to be confirmed.

This area needs some attention from somebody who cares a lot about
GPS accuracy and is willing to do research on error budgets to pin
down the numbers and confidence levels.

*** Do the research to figure out just what is going on with status bits

NMEA actually has *four* kinds of validity bits: Mode, Status, the
Active/Void bit (some sources interpret 'V' as 'Navigation receiver
warning'), and in later versions the FAA indicator mode.  Sentences
that have an Active/Void send V when there is no fix, so the position
data is no good.

Let's look at which sentences send what:

                GPRMC     GPGLL     GPGGA     GPGSA
Returns fix      Yes       Yes       Yes        No
Returns status   No        Yes       Yes        No
Returns mode     No        No        No         Yes
Returns A/V      Yes       Yes       No         No

In addition, some sentences use empty fields to signify invalid data.

My first conclusion from looking at this table is that the designers
of NMEA 0183 should be hung for galloping incompetence.  But never mind that.
What are we to make of this mess?

The fact that the FV18 sends GPMRC/GPGLL/GPGGA but not GPGSA
argues that GPGSA is optional.  I don't see how it can be, since it
seems to be the only status bit that applies to altitude.  Just how are
we supposed to know when altitude is valid if it doesn't ship GSA?  
Can a receiver ever ship a non-empty but invalid altitude?

Which of these override which other bits?  I don't think status is ever
nonzero when mode is zero. So status overrides mode.  What other such
relationships are there?

News flash: it develops that the "Navigation receiver warning" is
supposed to indicate a valid fix that has a DOP too high or fails
an elevation test.

** Future features (?)

*** Subsecond polling

gpsd relies on the GPS to periodically send PVT reports to it.

Most GPSes send PVT reports once a second.  No GPS I am aware of
allows you to set a cycle time of less than a second.  This is because
at 4800bps, a full PVT report takes just under one second in NMEA.

At 50km/h (31mi/h) that's 13.8 meters change in position between
updates, about the same as the uncertainty of position under typical
conditions.

There is, however, a way to sample GPSes at higher frequency.  SiRF
chips, and some others, allow you to shut down periodic notifications
and poll them for PVT.  At 57600bps we could poll a NMEA GPS 16 times
a second, and a SiRF one maybe 18 times a second.

Is this worth doing?  Maybe.  It would reduce fix latency, possibly
to good effect if your GPS is in motion.  Opinions?  Calculations?

*** Set the system time zone from latitude/longitude

If we're going to give gpsd the capability to set system time via
ntpd, why not let it set timezone as well?  A good thing for hackers
travelling with laptops!

The major issue here is that I have not yet found code, or a
database, that would allow mapping from lon/lat to timezone.
And the rules change from year to year.

Actually this should be built as a specialized client, as some
people won't want it.

From <http://www.linuxsa.org.au/tips/time.html>:

    The timezone under Linux is set by a symbolic link from
    /etc/localtime[1] to a file in the /usr/share/zoneinfo[2] directory
    that corresponds with what timezone you are in. For example, since I'm
    in South Australia, /etc/localtime is a symlink to
    /usr/share/zoneinfo/Australia/South. To set this link, type:

    ln -sf ../usr/share/zoneinfo/your/zone /etc/localtime

    Replace your/zone with something like Australia/NSW or
    Australia/Perth. Have a look in the directories under
    /usr/share/zoneinfo to see what timezones are available.

    [1] This assumes that /usr/share/zoneinfo is linked to /etc/localtime as it is under Red Hat Linux.

    [2] On older systems, you'll find that /usr/lib/zoneinfo is used
    instead of /usr/share/zoneinfo.

Changing the hardlink will, of course, update the system timezone for
all users.  If I were designing this feature, I'd ensure that the
system timezone can be overridden by a user-set TZ, but I don't know
if it actually works that way.

If I'm reading the tea leaves correctly, this functionality is actually
embedded in the GCC library version of tzset(), so the same method will
work on any system that uses that.

Problem: system daemons use the timezone set when they start up. You
can't get them to grok a new one short of rebooting

Sources: 

Sources for Time Zone and Daylight Saving Time Data
http://www.twinsun.com/tz/tz-link.htm

Free time-zone maps of the U.S.
http://www.manifold.net/download/freemaps.html

Local variables:
mode: outline
paragraph-separate: "[ 	]*$"
end: