summaryrefslogtreecommitdiff
path: root/pod/perlfaq9.pod
diff options
context:
space:
mode:
Diffstat (limited to 'pod/perlfaq9.pod')
-rw-r--r--pod/perlfaq9.pod277
1 files changed, 277 insertions, 0 deletions
diff --git a/pod/perlfaq9.pod b/pod/perlfaq9.pod
new file mode 100644
index 0000000000..e62dac4bd3
--- /dev/null
+++ b/pod/perlfaq9.pod
@@ -0,0 +1,277 @@
+=head1 NAME
+
+perlfaq9 - Networking ($Revision: 1.13 $)
+
+=head1 DESCRIPTION
+
+This section deals with questions related to networking, the internet,
+and a few on the web.
+
+=head2 My CGI script runs from the command line but not the browser. Can you help me fix it?
+
+Sure, but you probably can't afford our contracting rates :-)
+
+Seriously, if you can demonstrate that you've read the following FAQs
+and that your problem isn't something simple that can be easily
+answered, you'll probably receive a courteous and useful reply to your
+question if you post it on comp.infosystems.www.authoring.cgi (if it's
+something to do with HTTP, HTML, or the CGI protocols). Questions that
+appear to be Perl questions but are really CGI ones that are posted to
+comp.lang.perl.misc may not be so well received.
+
+The useful FAQs are:
+
+ http://www.perl.com/perl/faq/idiots-guide.html
+ http://www3.pair.com/webthing/docs/cgi/faqs/cgifaq.shtml
+ http://www.perl.com/perl/faq/perl-cgi-faq.html
+ http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html
+ http://www.boutell.com/faq/
+
+=head2 How do I remove HTML from a string?
+
+The most correct way (albeit not the fastest) is to use HTML::Parse
+from CPAN (part of the libwww-perl distribution, which is a must-have
+module for all web hackers).
+
+Many folks attempt a simple-minded regular expression approach, like
+C<s/E<lt>.*?E<gt>//g>, but that fails in many cases because the tags
+may continue over line breaks, they may contain quoted angle-brackets,
+or HTML comment may be present. Plus folks forget to convert
+entities, like C<&lt;> for example.
+
+Here's one "simple-minded" approach, that works for most files:
+
+ #!/usr/bin/perl -p0777
+ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs
+
+If you want a more complete solution, see the 3-stage striphtml
+program in
+http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/striphtml.gz
+.
+
+=head2 How do I extract URLs?
+
+A quick but imperfect approach is
+
+ #!/usr/bin/perl -n00
+ # qxurl - tchrist@perl.com
+ print "$2\n" while m{
+ < \s*
+ A \s+ HREF \s* = \s* (["']) (.*?) \1
+ \s* >
+ }gsix;
+
+This version does not adjust relative URLs, understand alternate
+bases, deal with HTML comments, or accept URLs themselves as
+arguments. It also runs about 100x faster than a more "complete"
+solution using the LWP suite of modules, such as the
+http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/xurl.gz
+program.
+
+=head2 How do I download a file from the user's machine? How do I open a file on another machine?
+
+In the context of an HTML form, you can use what's known as
+B<multipart/form-data> encoding. The CGI.pm module (available from
+CPAN) supports this in the start_multipart_form() method, which isn't
+the same as the startform() method.
+
+=head2 How do I make a pop-up menu in HTML?
+
+Use the B<E<lt>SELECTE<gt>> and B<E<lt>OPTIONE<gt>> tags. The CGI.pm
+module (available from CPAN) supports this widget, as well as many
+others, including some that it cleverly synthesizes on its own.
+
+=head2 How do I fetch an HTML file?
+
+Use the LWP::Simple module available from CPAN, part of the excellent
+libwww-perl (LWP) package. On the other hand, and if you have the
+lynx text-based HTML browser installed on your system, this isn't too
+bad:
+
+ $html_code = `lynx -source $url`;
+ $text_data = `lynx -dump $url`;
+
+=head2 how do I decode or create those %-encodings on the web?
+
+Here's an example of decoding:
+
+ $string = "http://altavista.digital.com/cgi-bin/query?pg=q&what=news&fmt=.&q=%2Bcgi-bin+%2Bperl.exe";
+ $string =~ s/%([a-fA-F0-9]{2})/chr(hex($1))/ge;
+
+Encoding is a bit harder, because you can't just blindly change
+all the non-alphanumunder character (C<\W>) into their hex escapes.
+It's important that characters with special meaning like C</> and C<?>
+I<not> be translated. Probably the easiest way to get this right is
+to avoid reinventing the wheel and just use the URI::Escape module,
+which is part of the libwww-perl package (LWP) available from CPAN.
+
+=head2 How do I redirect to another page?
+
+Instead of sending back a C<Content-Type> as the headers of your
+reply, send back a C<Location:> header. Officially this should be a
+C<URI:> header, so the CGI.pm module (available from CPAN) sends back
+both:
+
+ Location: http://www.domain.com/newpage
+ URI: http://www.domain.com/newpage
+
+Note that relative URLs in these headers can cause strange effects
+because of "optimizations" that servers do.
+
+=head2 How do I put a password on my web pages?
+
+That depends. You'll need to read the documentation for your web
+server, or perhaps check some of the other FAQs referenced above.
+
+=head2 How do I edit my .htpasswd and .htgroup files with Perl?
+
+The HTTPD::UserAdmin and HTTPD::GroupAdmin modules provide a
+consistent OO interface to these files, regardless of how they're
+stored. Databases may be text, dbm, Berkley DB or any database with a
+DBI compatible driver. HTTPD::UserAdmin supports files used by the
+`Basic' and `Digest' authentication schemes. Here's an example:
+
+ use HTTPD::UserAdmin ();
+ HTTPD::UserAdmin
+ ->new(DB => "/foo/.htpasswd")
+ ->add($username => $password);
+
+=head2 How do I parse an email header?
+
+For a quick-and-dirty solution, try this solution derived
+from page 222 of the 2nd edition of "Programming Perl":
+
+ $/ = '';
+ $header = <MSG>;
+ $header =~ s/\n\s+/ /g; # merge continuation lines
+ %head = ( UNIX_FROM_LINE, split /^([-\w]+):\s*/m, $header );
+
+That solution doesn't do well if, for example, you're trying to
+maintain all the Received lines. A more complete approach is to use
+the Mail::Header module from CPAN (part of the MailTools package).
+
+=head2 How do I decode a CGI form?
+
+A lot of people are tempted to code this up themselves, so you've
+probably all seen a lot of code involving C<$ENV{CONTENT_LENGTH}> and
+C<$ENV{QUERY_STRING}>. It's true that this can work, but there are
+also a lot of versions of this floating around that are quite simply
+broken!
+
+Please do not be tempted to reinvent the wheel. Instead, use the
+CGI.pm or CGI_Lite.pm (available from CPAN), or if you're trapped in
+the module-free land of perl1 .. perl4, you might look into cgi-lib.pl
+(available from http://www.bio.cam.ac.uk/web/form.html).
+
+=head2 How do I check a valid email address?
+
+You can't.
+
+Without sending mail to the address and seeing whether it bounces (and
+even then you face the halting problem), you cannot determine whether
+an email address is valid. Even if you apply the email header
+standard, you can have problems, because there are deliverable
+addresses that aren't RFC-822 (the mail header standard) compliant,
+and addresses that aren't deliverable which are compliant.
+
+Many are tempted to try to eliminate many frequently-invalid email
+addresses with a simple regexp, such as
+C</^[\w.-]+\@([\w.-]\.)+\w+$/>. However, this also throws out many
+valid ones, and says nothing about potential deliverability, so is not
+suggested. Instead, see
+http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/ckaddr.gz ,
+which actually checks against the full RFC spec (except for nested
+comments), looks for addresses you may not wish to accept email to
+(say, Bill Clinton or your postmaster), and then makes sure that the
+hostname given can be looked up in DNS. It's not fast, but it works.
+
+=head2 How do I decode a MIME/BASE64 string?
+
+The MIME-tools package (available from CPAN) handles this and a lot
+more. Decoding BASE64 becomes as simple as:
+
+ use MIME::base64;
+ $decoded = decode_base64($encoded);
+
+A more direct approach is to use the unpack() function's "u"
+format after minor transliterations:
+
+ tr#A-Za-z0-9+/##cd; # remove non-base64 chars
+ tr#A-Za-z0-9+/# -_#; # convert to uuencoded format
+ $len = pack("c", 32 + 0.75*length); # compute length byte
+ print unpack("u", $len . $_); # uudecode and print
+
+=head2 How do I return the user's email address?
+
+On systems that support getpwuid, the $E<lt> variable and the
+Sys::Hostname module (which is part of the standard perl distribution),
+you can probably try using something like this:
+
+ use Sys::Hostname;
+ $address = sprintf('%s@%s', getpwuid($<), hostname);
+
+Company policies on email address can mean that this generates addresses
+that the company's email system will not accept, so you should ask for
+users' email addresses when this matters. Furthermore, not all systems
+on which Perl runs are so forthcoming with this information as is Unix.
+
+The Mail::Util module from CPAN (part of the MailTools package) provides a
+mailaddress() function that tries to guess the mail address of the user.
+It makes a more intelligent guess than the code above, using information
+given when the module was installed, but it could still be incorrect.
+Again, the best way is often just to ask the user.
+
+=head2 How do I send/read mail?
+
+Sending mail: the Mail::Mailer module from CPAN (part of the MailTools
+package) is UNIX-centric, while Mail::Internet uses Net::SMTP which is
+not UNIX-centric. Reading mail: use the Mail::Folder module from CPAN
+(part of the MailFolder package) or the Mail::Internet module from
+CPAN (also part of the MailTools package).
+
+=head2 How do I find out my hostname/domainname/IP address?
+
+A lot of code has historically cavalierly called the C<`hostname`>
+program. While sometimes expedient, this isn't very portable. It's
+one of those tradeoffs of convenience versus portability.
+
+The Sys::Hostname module (part of the standard perl distribution) will
+give you the hostname after which you can find out the IP address
+(assuming you have working DNS) with a gethostbyname() call.
+
+ use Socket;
+ use Sys::Hostname;
+ my $host = hostname();
+ my $addr = inet_ntoa(scalar(gethostbyname($name)) || 'localhost');
+
+Probably the simplest way to learn your DNS domain name is to grok
+it out of /etc/resolv.conf, at least under Unix. Of course, this
+assumes several things about your resolv.conf configuration, including
+that it exists.
+
+(We still need a good DNS domain name-learning method for non-Unix
+systems.)
+
+=head2 How do I fetch a news article or the active newsgroups?
+
+Use the Net::NNTP or News::NNTPClient modules, both available from CPAN.
+This can make tasks like fetching the newsgroup list as simple as:
+
+ perl -MNews::NNTPClient
+ -e 'print News::NNTPClient->new->list("newsgroups")'
+
+=head2 How do I fetch/put an FTP file?
+
+LWP::Simple (available from CPAN) can fetch but not put. Net::FTP (also
+available from CPAN) is more complex but can put as well as fetch.
+
+=head2 How can I do RPC in Perl?
+
+A DCE::RPC module is being developed (but is not yet available), and
+will be released as part of the DCE-Perl package (available from
+CPAN). No ONC::RPC module is known.
+
+=head1 AUTHOR AND COPYRIGHT
+
+Copyright (c) 1997 Tom Christiansen and Nathan Torkington.
+All rights reserved. See L<perlfaq> for distribution information.