This is the gpsd to-do list.  If you're viewing it with Emacs, try
doing Ctl-C Ctl-t and browsing through the outline headers.  Ctl-C Ctl-a 
will unfold them again.

** Bugs:

*** EPH and EPV reports are zeroed too often in the TSIP driver 

There is some bad interaction between the policy code in
libgpsd_core.c and the driver that we haven't figured out.

*** PPS code is flaky, possibly due to a pthreads bug

Some code attempting to terminate the PPS-monitoring thread when there
is no DCD (e.g., on a USB device) seems to have tickled some kind of
bug in pthreads -- termination seems to close the GPS device or
otherwise do something nasty to the serial I/O layer.  

The default build has ENABLE_PPS off until we figure this one out.

*** Under uncertain circumstances, gpsd eats the processor

Rob Janssen writes:
> When gpsd is running and I start a backup of my system, which starts and
> mounts a normally idle disk and does a tar cvzf to it, the load of the
> system is quite high and gpsd seems to get out of sync with the received
> data from the SiRF receiver.
> It then gets stuck in a tight loop.  As gpsd is running at nice
> --10, this further increases the load and slows down the backup to a crawl.
> Killing gpsd makes the backup finish and then it can be started again.

He adds that this occurs with SiRF+Zodiac and SiRF+TSIP.  He suspects
this may be due to a bug in the SiRF support, interacting with extreme
conditions in the Linux kernel's memory management. 

>The method to trigger this is caused by a Linux problem that I think is
>mainly visible with Reiserfs, but I am not sure.  For years I am using
>SuSE linux and they like Reiserfs, so it has been the default FS on many
>systems I administer.
>What I observe is: when you copy many gigabytes of data from disk to disk,
>the kernel's buffer management goes completely haywire.  The copy
>operation is reading and writing many files, and the kernel tries to
>buffer as much filedata as possible, even to the point where it starts to
>swap out most of the running applications to get more memory available for
>buffers.
>I think this is caused both by allocation of many buffers for reading
>files, and by accumulation of many dirty buffers that still have to be
>written.
>At some point, programs like gpsd (but also all interactive programs and
>the X display manager) come to a complete standstill while the system is
>swapping like mad.
>(of course the swap partition is on one of the disks involved in the copy,
>so swapping really does not help things in this case, it only slows down
>the system)

Rob recommends:

>When you have a system with ext3 maybe you could do a test to see if it at
>least performs similar to what I see.  You need a system with 2 disks, of
>"modern" size (say 20GB or more).  One disk has a lot of files on it
>(preferably in not too small files), the other has space.
>
>Then you do something like:
>
>cp -a * /disk2
>
>or
>
>tar cf /disk2/backup.tar.gz .
>
>or
>
>rsync -a * /disk2
>
>well, anything that makes it read several gigabytes of data and write it
>to another disk.
>while this is running, look at "top" to see if the system is swapping, and
>if it starts feeling sluggish.  in extreme cases (like on my system) it
>will become that slow that you will have trouble moving focus from window
>to window, and obtaining even character echo in a terminal window.
>
>the "load average" reported by Linux is high, but that value is buggy
>because Linux counts processes waiting for disk I/O completion as loading

I have not yet reproduced this.

*** Possible resource-leak bug, not yet reproduced or confirmed

Wojciech Kazubski <wk@ire.pw.edu.pl> reports: when I connect to gpsd first
time everything goes fine, and several clients can connect without
problem but if the last client disconnects, the gpsd does not respond
to any inquiry. It is living and accepting commands but responding
with GPSD,P=? or so. And possibly after some time (few hours?) it
stops responding but the process still looks active (running out of
resources??).

** To do:

*** Track error computation

Presently the track error member in the fix structure is neither
reported by any GPS nor filled in by computation.

*** Hotplug interface problems

The hotplug interface works pretty nicely for telling gpsd which
device to look at, at least on my FC3 Linux machines.  The fly in the
ointment is that I'm using a deprecated version of the interface, the
old-style /etc/hotplug version with usermap files.

It is unlikely this interface will be dropped by distro makers any
time soon, because it's supporting a bunch of popular USB cameras.
Still, it would be nice not to be using a deprecated interface.

I tried moving to the new-style /etc/hotplug.d interface, but I ran
into a nasty race condition.  My hotplug agent got woken up on a USB
add event as it should, but in the new interface the creation of
/dev/ttyUSB* can be delayed arbitrarily long after the wakeup event.
Thus, it may not be there when gpsd goes to probe it unless I
busy-wait in the script.
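
The busy-wait described above could be sketched roughly as follows.
This is in Python for brevity rather than as a real hotplug agent, and
the function name, timeout and polling interval are illustrative
assumptions, not anything gpsd ships.

```python
import os
import time


def wait_for_device(path, timeout=5.0, interval=0.1):
    """Poll until `path` exists or `timeout` seconds have passed.

    Returns True if the device node appeared, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        # Sleep between checks so the wait does not spin the CPU.
        time.sleep(interval)
```

An agent would call something like wait_for_device("/dev/ttyUSB0")
(the device name is only an example) before telling gpsd to probe, and
give up if it returns False.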

Ultimately this should all be done through udev.  The problem is that at
the current state of udev, we'd need to do it through a script that would
fire every time a tty activates.  Because of virtual consoles firing up at
boot time, this would introduce significant boot lag.

This would be antisocial and I'm not willing to do it, so udev needs
to grow better filtering before I'll use it.

When and if udev supports HOTPLUG and ACTION keys, this will work:

# The Prolific Technology 2303 (commonly in tandem with SiRF chips)
BUS="usb" SYSFS{vendor}="067b" SYSFS{product}="2303" \
		NAME="gps%e" \
		HOTPLUG="/usr/bin/gps-probe"
# FTDI 8U232AM
BUS="usb" SYSFS{vendor}="0403" SYSFS{product}="6001" \
		NAME="gps%e" \
		HOTPLUG="/usr/bin/gps-probe"
# Cypress M8/CY7C64013 (DeLorme uses these)
BUS="usb" SYSFS{vendor}="1163" SYSFS{product}="0100" \
		NAME="gps%e" \
		HOTPLUG="/usr/bin/gps-probe"

More generally, the hotplug code we have is Linux-specific.  OpenBSD
(at least) features a hotplug daemon with similar capabilities.

*** The mess near error modeling

One of my goals has been to report an uncertainty along with every
dimension of PVT, so that the return from the GPS actually (and
realistically) describes the volume of kinematic state space within
which it is located at 1-sigma or 68% confidence. (Because the errors
are taken to be normally distributed, we can double the error to get
2-sigma or 95% confidence.)
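
As a quick check on the sigma-to-confidence relationship, here is a
minimal sketch for a single dimension, assuming the error really is
normally distributed.  The function name is hypothetical.

```python
import math


def confidence_within(k_sigma):
    """Probability that a normally distributed error lies within
    +/- k_sigma standard deviations of the mean (one dimension)."""
    return math.erf(k_sigma / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"{k}-sigma: {confidence_within(k):.4f}")
```

For k=1 this gives roughly 0.683 and for k=2 roughly 0.954, so going
from 1-sigma to 2-sigma means multiplying the error figure by two.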

There are several problems with this. 

A. I don't know how to derive or estimate uncertainty of time in the
general case.  There are clock drift and bias fields in the SiRF
binary protocol, but I don't know how to interpret these.  Does
anyone?

B. Only Garmin devices report estimated position uncertainties in meters.
They won't say what the confidence interval is, but it is generally
believed to be 1-sigma.  See <http://gpsinformation.net/main/epenew.txt>.

Here is what I am presently doing in the new E command:

1. I pass up the Garmin PGRME fields (uncertainty in meters) if
   they're available.

2. Otherwise, I apply the error model described in the HACKING document.

What non-Garmin GPSes will return in the E command is UERE multiplied
by PDOP/HDOP/VDOP.  Annoyingly, SiRF binary mode only offers HDOP,
one respect in which it is functionally inferior to SiRF NMEA.  We
compute VDOP and PDOP using an algorithm supplied by SiRF.
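
The UERE-times-DOP scaling can be sketched as below.  The UERE and DOP
values are made-up illustrations, not the constants in gpsd.h, and the
function name is hypothetical.

```python
def estimated_error(uere_m, dop):
    """Scale a range error in metres by a dilution-of-precision factor."""
    return uere_m * dop


# Hypothetical inputs: neither the UERE nor the DOPs come from gpsd.h
# or from a real receiver.
UERE_M = 8.0
dops = {"horizontal": 1.3, "vertical": 2.1, "position": 2.5}

for name, dop in dops.items():
    print(f"{name}: {estimated_error(UERE_M, dop):.1f} m")
```

Whatever confidence level the UERE carries is inherited by the
result, which is why the confidence level of the range uncertainties
matters.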

I don't know, because my sources didn't give, the confidence level
associated with the range uncertainties in gpsd.h.  My educated guess
is that they are 1-sigma (68%), and that's what the gpsd documentation
now says, but it needs to be confirmed.

This area needs some attention from somebody who cares a lot about
GPS accuracy and is willing to do research on error budgets to pin
down the numbers and confidence levels.

*** Do the research to figure out just what is going on with status bits

NMEA actually has *four* kinds of validity bits: Mode, Status, the
Active/Void bit (some sources interpret 'V' as 'Navigation receiver
warning'), and in later versions the FAA indicator mode.  Sentences
that have an Active/Void send V when there is no fix, so the position
data is no good.

Let's look at which sentences send what:

                GPRMC     GPGLL     GPGGA     GPGSA
Returns fix      Yes       Yes       Yes        No
Returns status   No        Yes       Yes        No
Returns mode     No        No        No         Yes
Returns A/V      Yes       Yes       No         No

In addition, some sentences use empty fields to signify invalid data.

My first conclusion from looking at this table is that the designers
of NMEA 0183 should be hung for galloping incompetence.  But never mind that.
What are we to make of this mess?

The fact that the FV18 sends GPRMC/GPGLL/GPGGA but not GPGSA
argues that GPGSA is optional.  I don't see how it can be, since it
seems to be the only status bit that applies to altitude.  Just how are
we supposed to know when altitude is valid if it doesn't ship GSA?  
Can a receiver ever ship a non-empty but invalid altitude?

Which of these override which other bits?  I don't think status is ever
nonzero when mode is zero. So status overrides mode.  What other such
relationships are there?

News flash: it develops that the "Navigation receiver warning" is
supposed to indicate a valid fix that has a DOP too high or fails
an elevation test.
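
To make the table above concrete, here is a minimal sketch that pulls
the validity-related field out of each of the four sentences.  The
field positions follow the commonly documented NMEA 0183 layouts; the
function name and dictionary keys are hypothetical, and the checksum
is stripped without being verified.  Which GPGLL field should count as
"status" is one of the open questions, so only its A/V flag is
extracted.

```python
def validity_fields(sentence):
    """Return the validity-related fields carried by one NMEA sentence."""
    # Drop the leading '$' and the trailing '*hh' checksum (not verified).
    body = sentence.strip().lstrip("$").split("*")[0]
    fields = body.split(",")
    tag = fields[0]

    def field(i):
        # Short or truncated sentences yield an empty field, not an error.
        return fields[i] if i < len(fields) else ""

    if tag == "GPRMC":
        return {"active_void": field(2)}   # 'A' or 'V'
    if tag == "GPGLL":
        return {"active_void": field(6)}   # 'A' or 'V'
    if tag == "GPGGA":
        return {"status": field(6)}        # fix quality: 0, 1, 2, ...
    if tag == "GPGSA":
        return {"mode": field(2)}          # 1 = no fix, 2 = 2D, 3 = 3D
    return {}
```

An empty string in any of these fields is the "empty field" case
mentioned above and would have to be treated as invalid by the caller.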

** Future features (?)

*** Subsecond polling

gpsd relies on the GPS to periodically send PVT reports to it.

Most GPSes send PVT reports once a second.  No GPS I am aware of
allows you to set a cycle time of less than a second.  This is because
at 4800bps, a full PVT report takes just under one second in NMEA.

At 50km/h (31mi/h) that's 13.9 meters change in position between
updates, about the same as the uncertainty of position under typical
conditions.
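
The arithmetic behind that figure, as a small sketch (the function
name is hypothetical):

```python
def metres_between_updates(speed_kmh, updates_per_second=1.0):
    """Distance travelled between consecutive PVT reports."""
    metres_per_second = speed_kmh * 1000.0 / 3600.0
    return metres_per_second / updates_per_second


print(f"{metres_between_updates(50):.1f} m at 1 Hz")
print(f"{metres_between_updates(50, 16):.2f} m at 16 Hz")
```

50 km/h is about 13.9 m/s, so one update per second means about 13.9
meters between fixes; a higher update rate divides that distance
accordingly.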

There is, however, a way to sample GPSes at higher frequency.  SiRF
chips, and some others, allow you to shut down periodic notifications
and poll them for PVT.  At 57600bps we could poll a NMEA GPS 16 times
a second, and a SiRF one maybe 18 times a second.
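
A rough upper bound on the polling rate can be estimated from the line
speed alone.  In this sketch the report size is a hypothetical
parameter (the real size depends on which sentences or packets the
receiver sends), 8N1 framing is assumed, and the poll request and the
receiver's own latency are ignored.

```python
def max_polls_per_second(bps, report_bytes, bits_per_byte=10):
    """Upper bound on full reports per second over a serial line.

    bits_per_byte=10 assumes 8N1 framing (start + 8 data + stop).
    """
    bytes_per_second = bps / bits_per_byte
    return bytes_per_second / report_bytes


# 360 bytes is an assumed report size chosen only for illustration.
print(max_polls_per_second(57600, 360))
```

With these assumptions a 360-byte report at 57600bps allows 16 polls a
second; a shorter binary report would allow proportionally more.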

Is this worth doing?  Maybe.  It would reduce fix latency, possibly
to good effect if your GPS is in motion.  Opinions?  Calculations?

*** Set the system time zone from latitude/longitude

If we're going to give gpsd the capability to set system time via
ntpd, why not let it set timezone as well?  A good thing for hackers
travelling with laptops!

The major issue here is that I have not yet found code, or a
database, that would allow mapping from lon/lat to timezone.
And the rules change from year to year.

Actually this should be built as a specialized client, as some
people won't want it.

From <http://www.linuxsa.org.au/tips/time.html>:

    The timezone under Linux is set by a symbolic link from
    /etc/localtime[1] to a file in the /usr/share/zoneinfo[2] directory
    that corresponds with what timezone you are in. For example, since I'm
    in South Australia, /etc/localtime is a symlink to
    /usr/share/zoneinfo/Australia/South. To set this link, type:

    ln -sf ../usr/share/zoneinfo/your/zone /etc/localtime

    Replace your/zone with something like Australia/NSW or
    Australia/Perth. Have a look in the directories under
    /usr/share/zoneinfo to see what timezones are available.

    [1] This assumes that /usr/share/zoneinfo is linked to
    /etc/localtime as it is under Red Hat Linux.

    [2] On older systems, you'll find that /usr/lib/zoneinfo is used
    instead of /usr/share/zoneinfo.

Changing the symlink will, of course, update the system timezone for
all users.  If I were designing this feature, I'd ensure that the
system timezone can be overridden by a user-set TZ, but I don't know
if it actually works that way.
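
One way to check the TZ question on a given system is the sketch
below.  It assumes a Unix C library that honors the POSIX TZ
environment variable (time.tzset() is not available on Windows); the
function name is hypothetical, and "EST5" is a POSIX-format zone
string chosen so the test does not depend on the zoneinfo database.

```python
import os
import time


def offset_west_of_utc(tz_string):
    """Apply tz_string through TZ for this process only and return
    the resulting offset in seconds west of UTC.

    The previous TZ setting is restored before returning.
    """
    saved = os.environ.get("TZ")
    os.environ["TZ"] = tz_string
    time.tzset()  # make the C library re-read TZ
    try:
        return time.timezone
    finally:
        if saved is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = saved
        time.tzset()


print(offset_west_of_utc("EST5"))
```

If this prints 18000 regardless of where /etc/localtime points, then
a per-process TZ setting does take precedence over the system-wide
link on that system.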

If I'm reading the tea leaves correctly, this functionality is actually
embedded in the GNU C library version of tzset(), so the same method will
work on any system that uses that.

Problem: system daemons use the timezone set when they start up. You
can't get them to grok a new one short of rebooting.

Sources: 

Sources for Time Zone and Daylight Saving Time Data
http://www.twinsun.com/tz/tz-link.htm

Free time-zone maps of the U.S.
http://www.manifold.net/download/freemaps.html

Local variables:
mode: outline
paragraph-separate: "[ 	]*$"
end: