diff options
Diffstat (limited to 'docs/manual/misc/howto.html')
-rw-r--r-- | docs/manual/misc/howto.html | 208 |
1 files changed, 0 insertions, 208 deletions
diff --git a/docs/manual/misc/howto.html b/docs/manual/misc/howto.html deleted file mode 100644 index 88c182355e..0000000000 --- a/docs/manual/misc/howto.html +++ /dev/null @@ -1,208 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> -<HTML> -<HEAD> -<META NAME="description" - CONTENT="Some 'how to' tips for the Apache httpd server"> -<META NAME="keywords" CONTENT="apache,redirect,robots,rotate,logfiles"> -<TITLE>Apache HOWTO documentation</TITLE> -</HEAD> - -<!-- Background white, links blue (unvisited), navy (visited), red (active) --> -<BODY - BGCOLOR="#FFFFFF" - TEXT="#000000" - LINK="#0000FF" - VLINK="#000080" - ALINK="#FF0000" -> -<!--#include virtual="header.html" --> -<H1 ALIGN="CENTER">Apache HOWTO documentation</H1> - -How to: -<UL> -<LI><A HREF="#redirect">redirect an entire server or directory to a single - URL</A> -<LI><A HREF="#logreset">reset your log files</A> -<LI><A HREF="#stoprob">stop/restrict robots</A> -<LI><A HREF="#proxyssl">proxy SSL requests <EM>through</EM> your non-SSL - server</A> -</UL> - -<HR> -<H2><A NAME="redirect">How to redirect an entire server or directory to a -single URL</A></H2> - -<P>There are two chief ways to redirect all requests for an entire -server to a single location: one which requires the use of -<CODE>mod_rewrite</CODE>, and another which uses a CGI script. - -<P>First: if all you need to do is migrate a server from one name to -another, simply use the <CODE>Redirect</CODE> directive, as supplied -by <CODE>mod_alias</CODE>: - -<BLOCKQUOTE><PRE> - Redirect / http://www.apache.org/ -</PRE></BLOCKQUOTE> - -<P>Since <CODE>Redirect</CODE> will forward along the complete path, -however, it may not be appropriate - for example, when the directory -structure has changed after the move, and you simply want to direct people -to the home page. - -<P>The best option is to use the standard Apache module -<CODE>mod_rewrite</CODE>. -If that module is compiled in, the following lines - -<BLOCKQUOTE><PRE>RewriteEngine On -RewriteRule /.* http://www.apache.org/ [R] -</PRE></BLOCKQUOTE> - -This will send an HTTP 302 Redirect back to the client, and no matter -what they gave in the original URL, they'll be sent to -"http://www.apache.org". - -The second option is to set up a <CODE>ScriptAlias</CODE> pointing to -a <STRONG>CGI script</STRONG> which outputs a 301 or 302 status and the -location -of the other server.</P> - -<P>By using a <STRONG>CGI script</STRONG> you can intercept various requests -and -treat them specially, <EM>e.g.</EM>, you might want to intercept -<STRONG>POST</STRONG> -requests, so that the client isn't redirected to a script on the other -server which expects POST information (a redirect will lose the POST -information.) You might also want to use a CGI script if you don't -want to compile mod_rewrite into your server. - -<P>Here's how to redirect all requests to a script... In the server -configuration file, -<BLOCKQUOTE><PRE>ScriptAlias / /usr/local/httpd/cgi-bin/redirect_script/</PRE> -</BLOCKQUOTE> - -and here's a simple perl script to redirect requests: - -<BLOCKQUOTE><PRE> -#!/usr/local/bin/perl - -print "Status: 302 Moved Temporarily\r -Location: http://www.some.where.else.com/\r\n\r\n"; - -</PRE></BLOCKQUOTE></P> - -<HR> - -<H2><A NAME="logreset">How to reset your log files</A></H2> - -<P>Sooner or later, you'll want to reset your log files (access_log and -error_log) because they are too big, or full of old information you don't -need.</P> - -<P><CODE>access.log</CODE> typically grows by 1Mb for each 10,000 requests.</P> - -<P>Most people's first attempt at replacing the logfile is to just move the -logfile or remove the logfile. This doesn't work.</P> - -<P>Apache will continue writing to the logfile at the same offset as before the -logfile moved. This results in a new logfile being created which is just -as big as the old one, but it now contains thousands (or millions) of null -characters.</P> - -<P>The correct procedure is to move the logfile, then signal Apache to tell -it to reopen the logfiles.</P> - -<P>Apache is signaled using the <STRONG>SIGHUP</STRONG> (-1) signal. -<EM>e.g.</EM> -<BLOCKQUOTE><CODE> -mv access_log access_log.old<BR> -kill -1 `cat httpd.pid` -</CODE></BLOCKQUOTE> -</P> - -<P>Note: <CODE>httpd.pid</CODE> is a file containing the -<STRONG>p</STRONG>rocess <STRONG>id</STRONG> -of the Apache httpd daemon, Apache saves this in the same directory as the log -files.</P> - -<P>Many people use this method to replace (and backup) their logfiles on a -nightly or weekly basis.</P> -<HR> - -<H2><A NAME="stoprob">How to stop or restrict robots</A></H2> - -<P>Ever wondered why so many clients are interested in a file called -<CODE>robots.txt</CODE> which you don't have, and never did have?</P> - -<P>These clients are called <STRONG>robots</STRONG> (also known as crawlers, -spiders and other cute name) - special automated clients which -wander around the web looking for interesting resources.</P> - -<P>Most robots are used to generate some kind of <EM>web index</EM> which -is then used by a <EM>search engine</EM> to help locate information.</P> - -<P><CODE>robots.txt</CODE> provides a means to request that robots limit their -activities at the site, or more often than not, to leave the site alone.</P> - -<P>When the first robots were developed, they had a bad reputation for -sending hundreds/thousands of requests to each site, often resulting -in the site being overloaded. Things have improved dramatically since -then, thanks to <A -HREF="http://info.webcrawler.com/mak/projects/robots/guidelines.html"> -Guidelines for Robot Writers</A>, but even so, some robots may exhibit -unfriendly behavior which the webmaster isn't willing to tolerate, and -will want to stop.</P> - -<P>Another reason some webmasters want to block access to robots, is to -stop them indexing dynamic information. Many search engines will use the -data collected from your pages for months to come - not much use if your -serving stock quotes, news, weather reports or anything else that will be -stale by the time people find it in a search engine.</P> - -<P>If you decide to exclude robots completely, or just limit the areas -in which they can roam, create a <CODE>robots.txt</CODE> file; refer -to the <A HREF="http://info.webcrawler.com/mak/projects/robots/robots.html" ->robot information pages</A> provided by Martijn Koster for the syntax.</P> - -<HR> -<H2><A NAME="proxyssl">How to proxy SSL requests <EM>through</EM> - your non-SSL Apache server</A> - <BR> - <SMALL>(<EM>submitted by David Sedlock</EM>)</SMALL> -</H2> -<P> -SSL uses port 443 for requests for secure pages. If your browser just -sits there for a long time when you attempt to access a secure page -over your Apache proxy, then the proxy may not be configured to handle -SSL. You need to instruct Apache to listen on port 443 in addition to -any of the ports on which it is already listening: -</P> -<PRE> - Listen 80 - Listen 443 -</PRE> -<P> -Then set the security proxy in your browser to 443. That might be it! -</P> -<P> -If your proxy is sending requests to another proxy, then you may have -to set the directive ProxyRemote differently. Here are my settings: -</P> -<PRE> - ProxyRemote http://nicklas:80/ http://proxy.mayn.franken.de:8080 - ProxyRemote http://nicklas:443/ http://proxy.mayn.franken.de:443 -</PRE> -<P> -Requests on port 80 of my proxy <SAMP>nicklas</SAMP> are forwarded to -proxy<SAMP>.mayn.franken.de:8080</SAMP>, while requests on port 443 are -forwarded to <SAMP>proxy.mayn.franken.de:443</SAMP>. -If the remote proxy is not set up to -handle port 443, then the last directive can be left out. SSL requests -will only go over the first proxy. -</P> -<P> -Note that your Apache does NOT have to be set up to serve secure pages -with SSL. Proxying SSL is a different thing from using it. -</P> -<!--#include virtual="footer.html" --> -</BODY> -</HTML> |