summaryrefslogtreecommitdiff
path: root/docs/manual/misc/fin_wait_2.html
diff options
context:
space:
mode:
Diffstat (limited to 'docs/manual/misc/fin_wait_2.html')
-rw-r--r--docs/manual/misc/fin_wait_2.html324
1 files changed, 0 insertions, 324 deletions
diff --git a/docs/manual/misc/fin_wait_2.html b/docs/manual/misc/fin_wait_2.html
deleted file mode 100644
index e1e313d75c..0000000000
--- a/docs/manual/misc/fin_wait_2.html
+++ /dev/null
@@ -1,324 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
-<HTML>
-<HEAD>
-<TITLE>Connections in FIN_WAIT_2 and Apache</TITLE>
-<LINK REV="made" HREF="mailto:marc@apache.org">
-
-</HEAD>
-
-<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
-<BODY
- BGCOLOR="#FFFFFF"
- TEXT="#000000"
- LINK="#0000FF"
- VLINK="#000080"
- ALINK="#FF0000"
->
-<!--#include virtual="header.html" -->
-
-<H1 ALIGN="CENTER">Connections in the FIN_WAIT_2 state and Apache</H1>
-<OL>
-<LI><H2>What is the FIN_WAIT_2 state?</H2>
-Starting with the Apache 1.2 betas, people are reporting many more
-connections in the FIN_WAIT_2 state (as reported by
-<CODE>netstat</CODE>) than they saw using older versions. When the
-server closes a TCP connection, it sends a packet with the FIN bit
-sent to the client, which then responds with a packet with the ACK bit
-set. The client then sends a packet with the FIN bit set to the
-server, which responds with an ACK and the connection is closed. The
-state that the connection is in during the period between when the
-server gets the ACK from the client and the server gets the FIN from
-the client is known as FIN_WAIT_2. See the <A
-HREF="ftp://ds.internic.net/rfc/rfc793.txt">TCP RFC</A> for the
-technical details of the state transitions.<P>
-
-The FIN_WAIT_2 state is somewhat unusual in that there is no timeout
-defined in the standard for it. This means that on many operating
-systems, a connection in the FIN_WAIT_2 state will stay around until
-the system is rebooted. If the system does not have a timeout and
-too many FIN_WAIT_2 connections build up, it can fill up the space
-allocated for storing information about the connections and crash
-the kernel. The connections in FIN_WAIT_2 do not tie up an httpd
-process.<P>
-
-<LI><H2>But why does it happen?</H2>
-
-There are numerous reasons for it happening, some of them may not
-yet be fully clear. What is known follows.<P>
-
-<H3>Buggy clients and persistent connections</H3>
-
-Several clients have a bug which pops up when dealing with
-<A HREF="../keepalive.html">persistent connections</A> (aka keepalives).
-When the connection is idle and the server closes the connection
-(based on the <A HREF="../mod/core.html#keepalivetimeout">
-KeepAliveTimeout</A>), the client is programmed so that the client does
-not send back a FIN and ACK to the server. This means that the
-connection stays in the FIN_WAIT_2 state until one of the following
-happens:<P>
-<UL>
- <LI>The client opens a new connection to the same or a different
- site, which causes it to fully close the older connection on
- that socket.
- <LI>The user exits the client, which on some (most?) clients
- causes the OS to fully shutdown the connection.
- <LI>The FIN_WAIT_2 times out, on servers that have a timeout
- for this state.
-</UL><P>
-If you are lucky, this means that the buggy client will fully close the
-connection and release the resources on your server. However, there
-are some cases where the socket is never fully closed, such as a dialup
-client disconnecting from their provider before closing the client.
-In addition, a client might sit idle for days without making another
-connection, and thus may hold its end of the socket open for days
-even though it has no further use for it.
-<STRONG>This is a bug in the browser or in its operating system's
-TCP implementation.</STRONG> <P>
-
-The clients on which this problem has been verified to exist:<P>
-<UL>
- <LI>Mozilla/3.01 (X11; I; FreeBSD 2.1.5-RELEASE i386)
- <LI>Mozilla/2.02 (X11; I; FreeBSD 2.1.5-RELEASE i386)
- <LI>Mozilla/3.01Gold (X11; I; SunOS 5.5 sun4m)
- <LI>MSIE 3.01 on the Macintosh
- <LI>MSIE 3.01 on Windows 95
-</UL><P>
-
-This does not appear to be a problem on:
-<UL>
- <LI>Mozilla/3.01 (Win95; I)
-</UL>
-<P>
-
-It is expected that many other clients have the same problem. What a
-client <STRONG>should do</STRONG> is periodically check its open
-socket(s) to see if they have been closed by the server, and close their
-side of the connection if the server has closed. This check need only
-occur once every few seconds, and may even be detected by a OS signal
-on some systems (<EM>e.g.</EM>, Win95 and NT clients have this capability, but
-they seem to be ignoring it).<P>
-
-Apache <STRONG>cannot</STRONG> avoid these FIN_WAIT_2 states unless it
-disables persistent connections for the buggy clients, just
-like we recommend doing for Navigator 2.x clients due to other bugs.
-However, non-persistent connections increase the total number of
-connections needed per client and slow retrieval of an image-laden
-web page. Since non-persistent connections have their own resource
-consumptions and a short waiting period after each closure, a busy server
-may need persistence in order to best serve its clients.<P>
-
-As far as we know, the client-caused FIN_WAIT_2 problem is present for
-all servers that support persistent connections, including Apache 1.1.x
-and 1.2.<P>
-
-<H3>A necessary bit of code introduced in 1.2</H3>
-
-While the above bug is a problem, it is not the whole problem.
-Some users have observed no FIN_WAIT_2 problems with Apache 1.1.x,
-but with 1.2b enough connections build up in the FIN_WAIT_2 state to
-crash their server.
-
-The most likely source for additional FIN_WAIT_2 states
-is a function called <CODE>lingering_close()</CODE> which was added
-between 1.1 and 1.2. This function is necessary for the proper
-handling of persistent connections and any request which includes
-content in the message body (<EM>e.g.</EM>, PUTs and POSTs).
-What it does is read any data sent by the client for
-a certain time after the server closes the connection. The exact
-reasons for doing this are somewhat complicated, but involve what
-happens if the client is making a request at the same time the
-server sends a response and closes the connection. Without lingering,
-the client might be forced to reset its TCP input buffer before it
-has a chance to read the server's response, and thus understand why
-the connection has closed.
-See the <A HREF="#appendix">appendix</A> for more details.<P>
-
-The code in <CODE>lingering_close()</CODE> appears to cause problems
-for a number of factors, including the change in traffic patterns
-that it causes. The code has been thoroughly reviewed and we are
-not aware of any bugs in it. It is possible that there is some
-problem in the BSD TCP stack, aside from the lack of a timeout
-for the FIN_WAIT_2 state, exposed by the <CODE>lingering_close</CODE>
-code that causes the observed problems.<P>
-
-<H2><LI>What can I do about it?</H2>
-
-There are several possible workarounds to the problem, some of
-which work better than others.<P>
-
-<H3>Add a timeout for FIN_WAIT_2</H3>
-
-The obvious workaround is to simply have a timeout for the FIN_WAIT_2 state.
-This is not specified by the RFC, and could be claimed to be a
-violation of the RFC, but it is widely recognized as being necessary.
-The following systems are known to have a timeout:
-<P>
-<UL>
- <LI><A HREF="http://www.freebsd.org/">FreeBSD</A> versions starting at
- 2.0 or possibly earlier.
- <LI><A HREF="http://www.netbsd.org/">NetBSD</A> version 1.2(?)
- <LI><A HREF="http://www.openbsd.org/">OpenBSD</A> all versions(?)
- <LI><A HREF="http://www.bsdi.com/">BSD/OS</A> 2.1, with the
- <A HREF="ftp://ftp.bsdi.com/bsdi/patches/patches-2.1/K210-027">
- K210-027</A> patch installed.
- <LI><A HREF="http://www.sun.com/">Solaris</A> as of around version
- 2.2. The timeout can be tuned by using <CODE>ndd</CODE> to
- modify <CODE>tcp_fin_wait_2_flush_interval</CODE>, but the
- default should be appropriate for most servers and improper
- tuning can have negative impacts.
- <LI><A HREF="http://www.linux.org/">Linux</A> 2.0.x and
- earlier(?)
- <LI><A HREF="http://www.hp.com/">HP-UX</A> 10.x defaults to
- terminating connections in the FIN_WAIT_2 state after the
- normal keepalive timeouts. This does not
- refer to the persistent connection or HTTP keepalive
- timeouts, but the <CODE>SO_LINGER</CODE> socket option
- which is enabled by Apache. This parameter can be adjusted
- by using <CODE>nettune</CODE> to modify parameters such as
- <CODE>tcp_keepstart</CODE> and <CODE>tcp_keepstop</CODE>.
- In later revisions, there is an explicit timer for
- connections in FIN_WAIT_2 that can be modified; contact HP
- support for details.
- <LI><A HREF="http://www.sgi.com/">SGI IRIX</A> can be patched to
- support a timeout. For IRIX 5.3, 6.2, and 6.3,
- use patches 1654, 1703 and 1778 respectively. If you
- have trouble locating these patches, please contact your
- SGI support channel for help.
- <LI><A HREF="http://www.ncr.com/">NCR's MP RAS Unix</A> 2.xx and
- 3.xx both have FIN_WAIT_2 timeouts. In 2.xx it is non-tunable
- at 600 seconds, while in 3.xx it defaults to 600 seconds and
- is calculated based on the tunable "max keep alive probes"
- (default of 8) multiplied by the "keep alive interval" (default
- 75 seconds).
- <LI><A HREF="http://www.sequent.com">Sequent's ptx/TCP/IP for
- DYNIX/ptx</A> has had a FIN_WAIT_2 timeout since around
- release 4.1 in mid-1994.
-</UL>
-<P>
-The following systems are known to not have a timeout:
-<P>
-<UL>
- <LI><A HREF="http://www.sun.com/">SunOS 4.x</A> does not and
- almost certainly never will have one because it as at the
- very end of its development cycle for Sun. If you have kernel
- source should be easy to patch.
-</UL>
-<P>
-There is a
-<A HREF="http://www.apache.org/dist/contrib/patches/1.2/fin_wait_2.patch"
->patch available</A> for adding a timeout to the FIN_WAIT_2 state; it
-was originally intended for BSD/OS, but should be adaptable to most
-systems using BSD networking code. You need kernel source code to be
-able to use it. If you do adapt it to work for any other systems,
-please drop me a note at <A HREF="mailto:marc@apache.org">marc@apache.org</A>.
-<P>
-<H3>Compile without using <CODE>lingering_close()</CODE></H3>
-
-It is possible to compile Apache 1.2 without using the
-<CODE>lingering_close()</CODE> function. This will result in that
-section of code being similar to that which was in 1.1. If you do
-this, be aware that it can cause problems with PUTs, POSTs and
-persistent connections, especially if the client uses pipelining.
-That said, it is no worse than on 1.1, and we understand that keeping your
-server running is quite important.<P>
-
-To compile without the <CODE>lingering_close()</CODE> function, add
-<CODE>-DNO_LINGCLOSE</CODE> to the end of the
-<CODE>EXTRA_CFLAGS</CODE> line in your <CODE>Configuration</CODE> file,
-rerun <CODE>Configure</CODE> and rebuild the server.
-<P>
-<H3>Use <CODE>SO_LINGER</CODE> as an alternative to
-<CODE>lingering_close()</CODE></H3>
-
-On most systems, there is an option called <CODE>SO_LINGER</CODE> that
-can be set with <CODE>setsockopt(2)</CODE>. It does something very
-similar to <CODE>lingering_close()</CODE>, except that it is broken
-on many systems so that it causes far more problems than
-<CODE>lingering_close</CODE>. On some systems, it could possibly work
-better so it may be worth a try if you have no other alternatives. <P>
-
-To try it, add <CODE>-DUSE_SO_LINGER -DNO_LINGCLOSE</CODE> to the end of the
-<CODE>EXTRA_CFLAGS</CODE> line in your <CODE>Configuration</CODE>
-file, rerun <CODE>Configure</CODE> and rebuild the server. <P>
-
-<STRONG>NOTE:</STRONG> Attempting to use <CODE>SO_LINGER</CODE> and
-<CODE>lingering_close()</CODE> at the same time is very likely to do
-very bad things, so don't.<P>
-
-<H3>Increase the amount of memory used for storing connection state</H3>
-<DL>
-<DT>BSD based networking code:
-<DD>BSD stores network data, such as connection states,
-in something called an mbuf. When you get so many connections
-that the kernel does not have enough mbufs to put them all in, your
-kernel will likely crash. You can reduce the effects of the problem
-by increasing the number of mbufs that are available; this will not
-prevent the problem, it will just make the server go longer before
-crashing.<P>
-
-The exact way to increase them may depend on your OS; look
-for some reference to the number of "mbufs" or "mbuf clusters". On
-many systems, this can be done by adding the line
-<CODE>NMBCLUSTERS="n"</CODE>, where <CODE>n</CODE> is the number of
-mbuf clusters you want to your kernel config file and rebuilding your
-kernel.<P>
-</DL>
-
-<H3>Disable KeepAlive</H3>
-<P>If you are unable to do any of the above then you should, as a last
-resort, disable KeepAlive. Edit your httpd.conf and change "KeepAlive On"
-to "KeepAlive Off".
-
-<H2><LI>Feedback</H2>
-
-If you have any information to add to this page, please contact me at
-<A HREF="mailto:marc@apache.org">marc@apache.org</A>.<P>
-
-<H2><A NAME="appendix"><LI>Appendix</A></H2>
-<P>
-Below is a message from Roy Fielding, one of the authors of HTTP/1.1.
-
-<H3>Why the lingering close functionality is necessary with HTTP</H3>
-
-The need for a server to linger on a socket after a close is noted a couple
-times in the HTTP specs, but not explained. This explanation is based on
-discussions between myself, Henrik Frystyk, Robert S. Thau, Dave Raggett,
-and John C. Mallery in the hallways of MIT while I was at W3C.<P>
-
-If a server closes the input side of the connection while the client
-is sending data (or is planning to send data), then the server's TCP
-stack will signal an RST (reset) back to the client. Upon
-receipt of the RST, the client will flush its own incoming TCP buffer
-back to the un-ACKed packet indicated by the RST packet argument.
-If the server has sent a message, usually an error response, to the
-client just before the close, and the client receives the RST packet
-before its application code has read the error message from its incoming
-TCP buffer and before the server has received the ACK sent by the client
-upon receipt of that buffer, then the RST will flush the error message
-before the client application has a chance to see it. The result is
-that the client is left thinking that the connection failed for no
-apparent reason.<P>
-
-There are two conditions under which this is likely to occur:
-<OL>
-<LI>sending POST or PUT data without proper authorization
-<LI>sending multiple requests before each response (pipelining)
- and one of the middle requests resulting in an error or
- other break-the-connection result.
-</OL>
-<P>
-The solution in all cases is to send the response, close only the
-write half of the connection (what shutdown is supposed to do), and
-continue reading on the socket until it is either closed by the
-client (signifying it has finally read the response) or a timeout occurs.
-That is what the kernel is supposed to do if SO_LINGER is set.
-Unfortunately, SO_LINGER has no effect on some systems; on some other
-systems, it does not have its own timeout and thus the TCP memory
-segments just pile-up until the next reboot (planned or not).<P>
-
-Please note that simply removing the linger code will not solve the
-problem -- it only moves it to a different and much harder one to detect.
-</OL>
-<!--#include virtual="footer.html" -->
-</BODY>
-</HTML>