summaryrefslogtreecommitdiff
path: root/docs/manual/vhosts/details.html
blob: f14bd088a5b25d026248a7aef58a40cd5603d178 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML><HEAD>
<TITLE>An In-Depth Discussion of Virtual Host Matching</TITLE>
</HEAD>

<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
<BODY
 BGCOLOR="#FFFFFF"
 TEXT="#000000"
 LINK="#0000FF"
 VLINK="#000080"
 ALINK="#FF0000"
>
<!--#include virtual="header.html" -->
<H1 ALIGN="CENTER">An In-Depth Discussion of Virtual Host Matching</H1>

<P>The virtual host code was completely rewritten in
<STRONG>Apache 1.3</STRONG>.
This document attempts to explain exactly what Apache does when
deciding what virtual host to serve a hit from. With the help of the
new <A HREF="../mod/core.html#namevirtualhost"><SAMP>NameVirtualHost</SAMP></A>
directive  virtual host configuration should be a lot easier and safer
than with versions prior to 1.3.

<P>If you just want to <CITE>make it work</CITE> without understanding
how, here are <A HREF="examples.html">some examples</A>.

<H3>Config File Parsing</H3>

<P>There is a <EM>main_server</EM> which consists of all
the definitions appearing outside of <CODE>&lt;VirtualHost&gt;</CODE> sections.
There are virtual servers, called <EM>vhosts</EM>, which are defined by
<A HREF="../mod/core.html#virtualhost"><SAMP>&lt;VirtualHost&gt;</SAMP></A>
sections.

<P>The directives
<A HREF="../mod/core.html#port"><SAMP>Port</SAMP></A>,
<A HREF="../mod/core.html#servername"><SAMP>ServerName</SAMP></A>,
<A HREF="../mod/core.html#serverpath"><SAMP>ServerPath</SAMP></A>,
and
<A HREF="../mod/core.html#serveralias"><SAMP>ServerAlias</SAMP></A>
can appear anywhere within the definition of
a server.  However, each appearance overrides the previous appearance
(within that server).

<P>The default value of the <CODE>Port</CODE> field for main_server
is 80.  The main_server has no default <CODE>ServerPath</CODE>, or
<CODE>ServerAlias</CODE>. The default <CODE>ServerName</CODE> is
deduced from the servers IP address.

<P>The main_server Port directive has two functions due to legacy
compatibility with NCSA configuration files.  One function is
to determine the default network port Apache will bind to.  This
default is overridden by the existence of any
<A HREF="../mod/core.html#listen"><CODE>Listen</CODE></A> directives.
The second function is to specify the port number which is used
in absolute URIs during redirects.

<P>Unlike the main_server, vhost ports <EM>do not</EM> affect what
ports Apache listens for connections on.

<P>Each address appearing in the <CODE>VirtualHost</CODE> directive
can have an optional port.  If the port is unspecified it defaults to
the value of the main_server's most recent <CODE>Port</CODE> statement.
The special port <SAMP>*</SAMP> indicates a wildcard that matches any port.
Collectively the entire set of addresses (including multiple
<SAMP>A</SAMP> record
results from DNS lookups) are called the vhost's <EM>address set</EM>.

<P>Unless a <A HREF="../mod/core.html#namevirtualhost">NameVirtualHost</A>
directive is used for a specific IP address the first vhost with
that address is treated as an IP-based vhost.

<P>If name-based vhosts should be used a <CODE>NameVirtualHost</CODE>
directive <EM>must</EM> appear with the IP address set to be used for the
name-based vhosts. In other words, you must specify the IP address that
holds the hostname aliases (CNAMEs) for your name-based vhosts via a
<CODE>NameVirtualHost</CODE> directive in your configuration file.

<P>Multiple <CODE>NameVirtualHost</CODE> directives can be used each
with a set of <CODE>VirtualHost</CODE> directives but only one
<CODE>NameVirtualHost</CODE> directive should be used for each
specific IP:port pair.

<P>The ordering of <CODE>NameVirtualHost</CODE> and 
<CODE>VirtualHost</CODE> directives is not important which makes the
following two examples identical (only the order of the
<CODE>VirtualHost</CODE> directives for <EM>one</EM> address set
is important, see below):

<PRE>
                                |
  NameVirtualHost 111.22.33.44  | &lt;VirtualHost 111.22.33.44&gt;
  &lt;VirtualHost 111.22.33.44&gt;    | # server A
  # server A  		        | &lt;/VirtualHost&gt;
  ... 			        | &lt;VirtualHost 111.22.33.55&gt;
  &lt;/VirtualHost&gt;	        | # server C
  &lt;VirtualHost 111.22.33.44&gt;    | ...
  # server B  		        | &lt;/VirtualHost&gt;
  ... 			        | &lt;VirtualHost 111.22.33.44&gt;
  &lt;/VirtualHost&gt;	        | # server B
                                | ...
  NameVirtualHost 111.22.33.55  | &lt;/VirtualHost&gt;
  &lt;VirtualHost 111.22.33.55&gt;    | &lt;VirtualHost 111.22.33.55&gt;
  # server C  		        | # server D
  ... 			        | ...
  &lt;/VirtualHost&gt;	        | &lt;/VirtualHost&gt;
  &lt;VirtualHost 111.22.33.55&gt;    |
  # server D  		        | NameVirtualHost 111.22.33.44
  ... 			        | NameVirtualHost 111.22.33.55
  &lt;/VirtualHost&gt;	        |
                                |
</PRE>

<P>(To aid the readability of your configuration you should prefer the
left variant.)

<P> After parsing the <CODE>VirtualHost</CODE> directive, the vhost server
is given a default <CODE>Port</CODE> equal to the port assigned to the
first name in its <CODE>VirtualHost</CODE> directive.

<P>The complete list of names in the <CODE>VirtualHost</CODE> directive
are treated just like a <CODE>ServerAlias</CODE> (but are not overridden by any
<CODE>ServerAlias</CODE> statement) if all names resolve to the same address
set.  Note that subsequent <CODE>Port</CODE> statements for this vhost will not
affect the ports assigned in the address set.

<P>During initialization a list for each IP address
is generated and inserted into an hash table. If the IP address is
used in a <CODE>NameVirtualHost</CODE> directive the list contains
all name-based vhosts for the given IP address. If there are no
vhosts defined for that address the <CODE>NameVirtualHost</CODE> directive
is ignored and an error is logged. For an IP-based vhost the list in the
hash table is empty.

<P>Due to a fast hashing function the overhead of hashing an IP address
during a request is minimal and almost not existent. Additionally
the table is optimized for IP addresses which vary in the last octet.

<P>For every vhost various default values are set. In particular:

<OL>
<LI>If a vhost has no
    <A HREF="../mod/core.html#serveradmin"><CODE>ServerAdmin</CODE></A>,
    <A HREF="../mod/core.html#resourceconfig"><CODE>ResourceConfig</CODE></A>,
    <A HREF="../mod/core.html#accessconfig"><CODE>AccessConfig</CODE></A>,
    <A HREF="../mod/core.html#timeout"><CODE>Timeout</CODE></A>,
    <A HREF="../mod/core.html#keepalivetimeout"
    ><CODE>KeepAliveTimeout</CODE></A>,
    <A HREF="../mod/core.html#keepalive"><CODE>KeepAlive</CODE></A>,
    <A HREF="../mod/core.html#maxkeepaliverequests"
    ><CODE>MaxKeepAliveRequests</CODE></A>,
    or
    <A HREF="../mod/core.html#sendbuffersize"><CODE>SendBufferSize</CODE></A>
    directive then the respective value is
    inherited from the main_server.  (That is, inherited from whatever
    the final setting of that value is in the main_server.)

<LI>The &quot;lookup defaults&quot; that define the default directory
    permissions
    for a vhost are merged with those of the main_server.  This includes
    any per-directory configuration information for any module.

<LI>The per-server configs for each module from the main_server are
    merged into the vhost server.
</OL>

Essentially, the main_server is treated as &quot;defaults&quot; or a
&quot;base&quot; on which to build each vhost.
But the positioning of these main_server
definitions in the config file is largely irrelevant -- the entire
config of the main_server has been parsed when this final merging occurs.
So even if a main_server definition appears after a vhost definition
it might affect the vhost definition.

<P> If the main_server has no <CODE>ServerName</CODE> at this point,
then the hostname of the machine that httpd is running on is used
instead.  We will call the <EM>main_server address set</EM> those IP
addresses returned by a DNS lookup on the <CODE>ServerName</CODE> of
the main_server.

<P> For any undefined <CODE>ServerName</CODE> fields, a name-based vhost
defaults to the address given first in the <CODE>VirtualHost</CODE>
statement defining the vhost.

<P>Any vhost that includes the magic <SAMP>_default_</SAMP> wildcard
is given the same <CODE>ServerName</CODE> as the main_server.


<H3>Virtual Host Matching</H3>

<P>The server determines which vhost to use for a request as follows:

<H4>Hash table lookup</H4>

<P>When the connection is first made by a client, the IP address to
which the client connected is looked up in the internal IP hash table.

<P>If the lookup fails (the IP address wasn't found) the request is
served from the <SAMP>_default_</SAMP> vhost if there is such a vhost
for the port to which the client sent the request. If there is no
matching <SAMP>_default_</SAMP> vhost the request is served from the
main_server.

<P>If the lookup succeeded (a corresponding list for the IP address was
found) the next step is to decide if we have to deal with an IP-based
or a name-base vhost.

<H4>IP-based vhost</H4>

<P>If the entry we found has an empty name list then we have found an
IP-based vhost, no further actions are performed and the request is
served from that vhost.

<H4>Name-based vhost</H4>

<P>If the entry corresponds to a name-based vhost the name list contains
one or more vhost structures. This list contains the vhosts in the same
order as the <CODE>VirtualHost</CODE> directives appear in the config
file.

<P>The first vhost on this list (the first vhost in the config file with
the specified IP address) has the highest priority and catches any request
to an unknown server name or a request without a <CODE>Host:</CODE>
header field.

<P>If the client provided a <CODE>Host:</CODE> header field the list is
searched for a matching vhost and the first hit on a <CODE>ServerName</CODE>
or <CODE>ServerAlias</CODE> is taken and the request is served from
that vhost. A <CODE>Host:</CODE> header field can contain a port number, but
Apache always matches against the real port to which the client sent
the request.

<P>If the client submitted a HTTP/1.0 request without <CODE>Host:</CODE>
header field we don't know to what server the client tried to connect and
any existing <CODE>ServerPath</CODE> is matched against the URI
from the request. The first matching path on the list is used and the
request is served from that vhost.

<P>If no matching vhost could be found the request is served from the
first vhost with a matching port number that is on the list for the IP
to which the client connected (as already mentioned before).

<H4>Persistent connections</H4>
The IP lookup described above is only done <EM>once</EM> for a particular
TCP/IP session while the name lookup is done on <EM>every</EM> request
during a KeepAlive/persistent connection. In other words a client may
request pages from different name-based vhosts during a single
persistent connection.


<H4>Absolute URI</H4>

<P>If the URI from the request is an absolute URI, and its hostname and
port match the main server or one of the configured virtual hosts
<EM>and</EM> match the address and port to which the client sent the request,
then the scheme/hostname/port prefix is stripped off and the remaining
relative URI is served by the corresponding main server or virtual host.
If it does not match, then the URI remains untouched and the request is
taken to be a proxy request.


<H3>Observations</H3>

<UL>

<LI>A name-based vhost can never interfere with an IP-base vhost and
    vice versa. IP-based vhosts can only be reached through an IP address
    of its own address set and never through any other address.
    The same applies to name-based vhosts, they can only be reached
    through an IP address of the corresponding address set which must
    be defined with a <CODE>NameVirtualHost</CODE> directive.
    <P>

<LI><CODE>ServerAlias</CODE> and <CODE>ServerPath</CODE> checks are never
    performed for an IP-based vhost.
    <P>
    
<LI>The order of name-/IP-based, the <SAMP>_default_</SAMP>
    vhost and the <CODE>NameVirtualHost</CODE> directive within the config
    file is not important. Only the ordering
    of name-based vhosts for a specific address set is significant. The one
    name-based vhosts that comes first in the configuration file has
    the highest priority for its corresponding address set.
    <P>

<LI>For security reasons the port number given in a <CODE>Host:</CODE>
    header field is never used during the matching process. Apache always
    uses the real port to which the client sent the request.
    <P>

<LI>If a <CODE>ServerPath</CODE> directive exists which is a prefix of
    another <CODE>ServerPath</CODE> directive that appears later in
    the configuration file, then the former will always be matched
    and the latter will never be matched.  (That is assuming that no
    <CODE>Host:</CODE> header field was available to disambiguate the two.)
    <P>

<LI>If two IP-based vhosts have an address in common, the vhost appearing
    first in the config file is always matched.  Such a thing might happen
    inadvertently. The server will give a warning in the error
    logfile when it detects this.
    <P>
    
<LI>A <CODE>_default_</CODE> vhost catches a request only if there is no
    other vhost with a matching IP address <EM>and</EM> a matching port
    number for the request. The request is only caught if the port number
    to which the client sent the request matches the port number of your
    <CODE>_default_</CODE> vhost which is your standard <CODE>Port</CODE>
    by default. A wildcard port can be specified (<EM>i.e.</EM>,
    <CODE>_default_:*</CODE>) to catch requests to any available port.
    <P>
    
<LI>The main_server is only used to serve a request if the IP address
    and port number to which the client connected is unspecified
    and does not match any other vhost (including a <CODE>_default_</CODE>
    vhost). In other words the main_server only catches a request for an
    unspecified address/port combination (unless there is a
    <CODE>_default_</CODE> vhost which matches that port).
    <P>
    
<LI>A <CODE>_default_</CODE> vhost or the main_server is <EM>never</EM>
    matched for a request with an unknown or missing <CODE>Host:</CODE> header
    field if the client connected to an address (and port) which is used
    for name-based vhosts, <EM>e.g.</EM>, in a <CODE>NameVirtualHost</CODE>
    directive.
    <P>
    
<LI>You should never specify DNS names in <CODE>VirtualHost</CODE>
    directives because it will force your server to rely on DNS to boot.
    Furthermore it poses a security threat if you do not control the
    DNS for all the domains listed.
    There's <A HREF="../dns-caveats.html">more information</A>
    available on this and the next two topics.
    <P>

<LI><CODE>ServerName</CODE> should always be set for each vhost.  Otherwise
    A DNS lookup is required for each vhost.
    <P>

</UL>

<H3>Tips</H3>

<P>In addition to the tips on the <A HREF="../dns-caveats.html#tips">DNS
Issues</A> page, here are some further tips:

<UL>

<LI>Place all main_server definitions before any <CODE>VirtualHost</CODE>
    definitions. (This is to aid the readability of the configuration --
    the post-config merging process makes it non-obvious that definitions 
    mixed in around virtual hosts might affect all virtual hosts.)
    <P>

<LI>Group corresponding <CODE>NameVirtualHost</CODE> and
    <CODE>VirtualHost</CODE> definitions in your configuration to ensure
    better readability.
    <P>

<LI>Avoid <CODE>ServerPaths</CODE> which are prefixes of other
    <CODE>ServerPaths</CODE>.  If you cannot avoid this then you have to
    ensure that the longer (more specific) prefix vhost appears earlier in
    the configuration file than the shorter (less specific) prefix
    (<EM>i.e.</EM>, &quot;ServerPath /abc&quot; should appear after
    &quot;ServerPath /abc/def&quot;).
    <P>

</UL>

<!--#include virtual="footer.html" -->
</BODY>
</HTML>