| Commit message (Collapse) | Author | Age | Files | Lines |
| |
We found a distbuild controller stuck in a busy loop, with the logs
full of the same error message repeated:
... _flush(): Exception 'IOError: [Errno 32] Broken pipe' from sock.write()
We suspect this came about because the initiator disconnected without
sending an EOF. The initiator was in a VM on a laptop so it seems
possible that the host OS turned off the wireless adaptor without giving
the VM a chance to close its connections gracefully.
The busy loop happens because nothing in the SocketBuffer class handles
the SocketError events queued by the _flush() method. Unhandled events
are ignored, so the SocketBuffer stays in the 'w' state without ever
shifting any data and never returns. Adding transitions to handle the
SocketError event will fix the problem.
If a socket error happens now in the same scenario, it will be handled
as if the initiator disconnected.
Change-Id: I0f6834f7186a01ca2bc74aef899a4cccbc891e51
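The failure mode described above can be sketched as follows. This is a
simplified, hypothetical model and not distbuild's real StateMachine API:
the class TinyStateMachine, the transition-table layout and the use of
None as the "finished" state are all assumptions made for illustration.
It shows how an event with no matching transition is silently dropped,
and how adding one transition lets the machine leave the 'w' state.

```python
# Hypothetical, simplified model; NOT distbuild's actual classes.

class SocketError:
    """Event queued when a write to the socket fails."""

    def __init__(self, exception):
        self.exception = exception


class TinyStateMachine:
    def __init__(self, initial_state, transitions):
        self.state = initial_state
        # Maps (state, event class) -> (new state, callback or None).
        self.transitions = transitions

    def handle_event(self, event):
        key = (self.state, type(event))
        if key not in self.transitions:
            # Unhandled events are ignored: the state never changes.
            return False
        new_state, callback = self.transitions[key]
        if callback is not None:
            callback(event)
        self.state = new_state
        return True


def demo():
    error = SocketError(BrokenPipeError(32, 'Broken pipe'))

    # No transition for SocketError: the machine stays in 'w' forever.
    broken = TinyStateMachine('w', {})
    handled_before = broken.handle_event(error)
    state_before = broken.state

    # With a transition, the error is treated like a disconnect and the
    # machine moves to its final state (None in this sketch).
    closed = []
    fixed = TinyStateMachine(
        'w', {('w', SocketError): (None, closed.append)})
    handled_after = fixed.handle_event(error)

    return handled_before, state_before, handled_after, fixed.state, len(closed)
```

In this sketch the caller that keeps retrying the write is left out; the
point is only that a missing table entry leaves the state unchanged.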
| |
Change-Id: Ibda7a938cd16e35517a531140f39ef4664d85c72
| |
Change-Id: I992dc0c1d40f563ade56a833162d409b02be90a0
| |
I found an issue in distbuild where the controller was stuck in a busy
loop, continually writing to a closed socket. With 'strace' I saw
write(), SIGPIPE, write(), SIGPIPE, ad infinitum. I got this much of a
Python backtrace using GDB:
distbuild.socketsrc.SocketEventSource.write()
distbuild.sockbuf.SocketBuffer._flush()
distbuild.sm.StateMachine.handle_event()
I didn't manage to get further. However, I suspect one of the state
machine transitions may be creating an event loop instead of correctly
handling the error.
The log file was quiet at this point; the last entries were:
2014-06-19 08:57:36 INFO There seems to be nothing to build
2014-06-19 08:57:36 INFO Requested artifact is built
2014-06-19 08:57:36 DEBUG InitiatorConnection: sent to 10.24.1.215:53818: {'message': 'Need to build 0 artifacts', 'type': 'build-progress', 'id': 790629564}
2014-06-19 08:57:36 DEBUG Notifying initiator of successful build
2014-06-19 08:57:36 DEBUG MainLoop.remove_state_machine: <BuildController at 0xb6c554c, request-id InitiatorConnection-93>
2014-06-19 08:57:36 DEBUG InitiatorConnection: sent to 10.24.1.215:53818: {'type': 'build-finished', 'id': 790629564, 'urls': [u'http://hawkdevtrove:8080/1.0/artifacts?filename=861f640923494ca3626bbd65655b350ce1bebea4c0bf7a57693bc06ed122cef4.system.devel-system-x86_32-chroot-rootfs']}
2014-06-19 08:57:36 DEBUG InitiatorConnection: 10.24.1.215:53818: closing: <JsonMachine at 0xc6cb22c: socket 10.24.1.164:7878 -> 10.24.1.215:53818, max_buffer 16384>
2014-06-19 08:57:36 DEBUG MainLoop.remove_state_machine: <InitiatorConnection at 0xc6cbcec: remote 10.24.1.215:53818>
2014-06-19 08:57:36 DEBUG MainLoop.remove_state_machine: <JsonMachine at 0xc6cb22c: socket 10.24.1.164:7878 -> 10.24.1.215:53818, max_buffer 16384>
2014-06-19 08:57:36 DEBUG MainLoop.remove_state_machine: <SocketBuffer at 0xc6cbe2c: socket None max_buffer 16384>
This commit should improve matters a little: in future the log file will
show the ID of the SocketEventSource object and the error we hit when
calling its write() function.
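A minimal sketch of that kind of logging, assuming a source object with
a write() method. FakeEventSource, the flush() helper and the exact log
wording are hypothetical stand-ins, not the code from this commit.

```python
import logging


class FakeEventSource:
    """Hypothetical stand-in for an event source whose peer has gone."""

    def write(self, data):
        raise BrokenPipeError(32, 'Broken pipe')


def flush(source, data, log=logging.getLogger(__name__)):
    """Try to write data to source.

    Returns whatever source.write() returns, or None if the write
    failed. On failure, the log records which object failed and why.
    """
    try:
        return source.write(data)
    except OSError as e:
        log.debug('flush(): source 0x%x: exception %r from write()',
                  id(source), e)
        return None
```

With the object's ID in the message, repeated errors in the log can be
tied back to one specific connection.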
| |
Makes it easier to see what they mean at a glance.
| |
New DistbuildSocket class that wraps socket.socket(), providing a
descriptive repr() handler showing where the socket is connected, and
providing a couple of helper methods for fetching local and remote
endpoint names.
This commit also adds a descriptive repr() handler to a few other
objects (mostly giving socket connection details).
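As a rough illustration of the idea, a wrapper along these lines could
look like the sketch below. Only the class name DistbuildSocket comes
from the commit message; the method names localname() and remotename(),
the '(not connected)' fallback and the repr() layout are assumptions,
and the code is written for Python 3 with IPv4/IPv6 sockets in mind.

```python
import socket


class DistbuildSocket:
    """Wrap a socket.socket and describe its endpoints in repr()."""

    def __init__(self, real_socket):
        self.real_socket = real_socket

    def __getattr__(self, name):
        # Only called for attributes not found on the wrapper itself,
        # so send(), recv(), fileno() etc. go to the wrapped socket.
        return getattr(self.real_socket, name)

    @staticmethod
    def _format(address_getter):
        try:
            host, port = address_getter()[:2]
        except OSError:
            # For example, getpeername() on an unconnected socket.
            return '(not connected)'
        return '%s:%s' % (host, port)

    def localname(self):
        return self._format(self.real_socket.getsockname)

    def remotename(self):
        return self._format(self.real_socket.getpeername)

    def __repr__(self):
        return '<DistbuildSocket at 0x%x: %s -> %s>' % (
            id(self), self.localname(), self.remotename())
```

A wrapped socket can then be passed straight to a logging call, and the
log line shows which connection it refers to.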