diff options
Diffstat (limited to 'pod/perlfaq8.pod')
-rw-r--r-- | pod/perlfaq8.pod | 334 |
1 files changed, 0 insertions, 334 deletions
diff --git a/pod/perlfaq8.pod b/pod/perlfaq8.pod index 4fabce6f36..f4d3c12f6f 100644 --- a/pod/perlfaq8.pod +++ b/pod/perlfaq8.pod @@ -849,337 +849,3 @@ included with the 5.002 release of Perl. Copyright (c) 1997 Tom Christiansen and Nathan Torkington. All rights reserved. See L<perlfaq> for distribution information. - END-of-perlfaq8.pod -echo x - perlfaq9.pod -sed 's/^X//' >perlfaq9.pod << 'END-of-perlfaq9.pod' -=head1 NAME - -perlfaq9 - Networking ($Revision: 1.17 $, $Date: 1997/04/24 22:44:29 $) - -=head1 DESCRIPTION - -This section deals with questions related to networking, the internet, -and a few on the web. - -=head2 My CGI script runs from the command line but not the browser. Can you help me fix it? - -Sure, but you probably can't afford our contracting rates :-) - -Seriously, if you can demonstrate that you've read the following FAQs -and that your problem isn't something simple that can be easily -answered, you'll probably receive a courteous and useful reply to your -question if you post it on comp.infosystems.www.authoring.cgi (if it's -something to do with HTTP, HTML, or the CGI protocols). Questions that -appear to be Perl questions but are really CGI ones that are posted to -comp.lang.perl.misc may not be so well received. - -The useful FAQs are: - - http://www.perl.com/perl/faq/idiots-guide.html - http://www3.pair.com/webthing/docs/cgi/faqs/cgifaq.shtml - http://www.perl.com/perl/faq/perl-cgi-faq.html - http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html - http://www.boutell.com/faq/ - -=head2 How do I remove HTML from a string? - -The most correct way (albeit not the fastest) is to use HTML::Parse -from CPAN (part of the libwww-perl distribution, which is a must-have -module for all web hackers). - -Many folks attempt a simple-minded regular expression approach, like -C<s/E<lt>.*?E<gt>//g>, but that fails in many cases because the tags -may continue over line breaks, they may contain quoted angle-brackets, -or HTML comment may be present. Plus folks forget to convert -entities, like C<<> for example. - -Here's one "simple-minded" approach, that works for most files: - - #!/usr/bin/perl -p0777 - s/<(?:[^>'"]*|(['"]).*?\1)*>//gs - -If you want a more complete solution, see the 3-stage striphtml -program in -http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/striphtml.gz -. - -=head2 How do I extract URLs? - -A quick but imperfect approach is - - #!/usr/bin/perl -n00 - # qxurl - tchrist@perl.com - print "$2\n" while m{ - < \s* - A \s+ HREF \s* = \s* (["']) (.*?) \1 - \s* > - }gsix; - -This version does not adjust relative URLs, understand alternate -bases, deal with HTML comments, deal with HREF and NAME attributes in -the same tag, or accept URLs themselves as arguments. It also runs -about 100x faster than a more "complete" solution using the LWP suite -of modules, such as the -http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/xurl.gz -program. - -=head2 How do I download a file from the user's machine? How do I open a file on another machine? - -In the context of an HTML form, you can use what's known as -B<multipart/form-data> encoding. The CGI.pm module (available from -CPAN) supports this in the start_multipart_form() method, which isn't -the same as the startform() method. - -=head2 How do I make a pop-up menu in HTML? - -Use the B<E<lt>SELECTE<gt>> and B<E<lt>OPTIONE<gt>> tags. The CGI.pm -module (available from CPAN) supports this widget, as well as many -others, including some that it cleverly synthesizes on its own. - -=head2 How do I fetch an HTML file? - -One approach, if you have the lynx text-based HTML browser installed -on your system, is this: - - $html_code = `lynx -source $url`; - $text_data = `lynx -dump $url`; - -The libwww-perl (LWP) modules from CPAN provide a more powerful way to -do this. They work through proxies, and don't require lynx: - - # print HTML from a URL - use LWP::Simple; - getprint "http://www.sn.no/libwww-perl/"; - - # print ASCII from HTML from a URL - use LWP::Simple; - use HTML::Parse; - use HTML::FormatText; - my ($html, $ascii); - $html = get("http://www.perl.com/"); - defined $html - or die "Can't fetch HTML from http://www.perl.com/"; - $ascii = HTML::FormatText->new->format(parse_html($html)); - print $ascii; - -=head2 how do I decode or create those %-encodings on the web? - -Here's an example of decoding: - - $string = "http://altavista.digital.com/cgi-bin/query?pg=q&what=news&fmt=.&q=%2Bcgi-bin+%2Bperl.exe"; - $string =~ s/%([a-fA-F0-9]{2})/chr(hex($1))/ge; - -Encoding is a bit harder, because you can't just blindly change -all the non-alphanumunder character (C<\W>) into their hex escapes. -It's important that characters with special meaning like C</> and C<?> -I<not> be translated. Probably the easiest way to get this right is -to avoid reinventing the wheel and just use the URI::Escape module, -which is part of the libwww-perl package (LWP) available from CPAN. - -=head2 How do I redirect to another page? - -Instead of sending back a C<Content-Type> as the headers of your -reply, send back a C<Location:> header. Officially this should be a -C<URI:> header, so the CGI.pm module (available from CPAN) sends back -both: - - Location: http://www.domain.com/newpage - URI: http://www.domain.com/newpage - -Note that relative URLs in these headers can cause strange effects -because of "optimizations" that servers do. - -=head2 How do I put a password on my web pages? - -That depends. You'll need to read the documentation for your web -server, or perhaps check some of the other FAQs referenced above. - -=head2 How do I edit my .htpasswd and .htgroup files with Perl? - -The HTTPD::UserAdmin and HTTPD::GroupAdmin modules provide a -consistent OO interface to these files, regardless of how they're -stored. Databases may be text, dbm, Berkley DB or any database with a -DBI compatible driver. HTTPD::UserAdmin supports files used by the -`Basic' and `Digest' authentication schemes. Here's an example: - - use HTTPD::UserAdmin (); - HTTPD::UserAdmin - ->new(DB => "/foo/.htpasswd") - ->add($username => $password); - -=head2 How do I make sure users can't enter values into a form that cause my CGI script to do bad things? - -Read the CGI security FAQ, at -http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html, and the -Perl/CGI FAQ at -http://www.perl.com/CPAN/doc/FAQs/cgi/perl-cgi-faq.html. - -In brief: use tainting (see L<perlsec>), which makes sure that data -from outside your script (eg, CGI parameters) are never used in -C<eval> or C<system> calls. In addition to tainting, never use the -single-argument form of system() or exec(). Instead, supply the -command and arguments as a list, which prevents shell globbing. - -=head2 How do I parse an email header? - -For a quick-and-dirty solution, try this solution derived -from page 222 of the 2nd edition of "Programming Perl": - - $/ = ''; - $header = <MSG>; - $header =~ s/\n\s+/ /g; # merge continuation lines - %head = ( UNIX_FROM_LINE, split /^([-\w]+):\s*/m, $header ); - -That solution doesn't do well if, for example, you're trying to -maintain all the Received lines. A more complete approach is to use -the Mail::Header module from CPAN (part of the MailTools package). - -=head2 How do I decode a CGI form? - -A lot of people are tempted to code this up themselves, so you've -probably all seen a lot of code involving C<$ENV{CONTENT_LENGTH}> and -C<$ENV{QUERY_STRING}>. It's true that this can work, but there are -also a lot of versions of this floating around that are quite simply -broken! - -Please do not be tempted to reinvent the wheel. Instead, use the -CGI.pm or CGI_Lite.pm (available from CPAN), or if you're trapped in -the module-free land of perl1 .. perl4, you might look into cgi-lib.pl -(available from http://www.bio.cam.ac.uk/web/form.html). - -=head2 How do I check a valid email address? - -You can't. - -Without sending mail to the address and seeing whether it bounces (and -even then you face the halting problem), you cannot determine whether -an email address is valid. Even if you apply the email header -standard, you can have problems, because there are deliverable -addresses that aren't RFC-822 (the mail header standard) compliant, -and addresses that aren't deliverable which are compliant. - -Many are tempted to try to eliminate many frequently-invalid email -addresses with a simple regexp, such as -C</^[\w.-]+\@([\w.-]\.)+\w+$/>. However, this also throws out many -valid ones, and says nothing about potential deliverability, so is not -suggested. Instead, see -http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/ckaddr.gz , -which actually checks against the full RFC spec (except for nested -comments), looks for addresses you may not wish to accept email to -(say, Bill Clinton or your postmaster), and then makes sure that the -hostname given can be looked up in DNS. It's not fast, but it works. - -Here's an alternative strategy used by many CGI script authors: Check -the email address with a simple regexp (such as the one above). If -the regexp matched the address, accept the address. If the regexp -didn't match the address, request confirmation from the user that the -email address they entered was correct. - -=head2 How do I decode a MIME/BASE64 string? - -The MIME-tools package (available from CPAN) handles this and a lot -more. Decoding BASE64 becomes as simple as: - - use MIME::base64; - $decoded = decode_base64($encoded); - -A more direct approach is to use the unpack() function's "u" -format after minor transliterations: - - tr#A-Za-z0-9+/##cd; # remove non-base64 chars - tr#A-Za-z0-9+/# -_#; # convert to uuencoded format - $len = pack("c", 32 + 0.75*length); # compute length byte - print unpack("u", $len . $_); # uudecode and print - -=head2 How do I return the user's email address? - -On systems that support getpwuid, the $E<lt> variable and the -Sys::Hostname module (which is part of the standard perl distribution), -you can probably try using something like this: - - use Sys::Hostname; - $address = sprintf('%s@%s', getpwuid($<), hostname); - -Company policies on email address can mean that this generates addresses -that the company's email system will not accept, so you should ask for -users' email addresses when this matters. Furthermore, not all systems -on which Perl runs are so forthcoming with this information as is Unix. - -The Mail::Util module from CPAN (part of the MailTools package) provides a -mailaddress() function that tries to guess the mail address of the user. -It makes a more intelligent guess than the code above, using information -given when the module was installed, but it could still be incorrect. -Again, the best way is often just to ask the user. - -=head2 How do I send/read mail? - -Sending mail: the Mail::Mailer module from CPAN (part of the MailTools -package) is UNIX-centric, while Mail::Internet uses Net::SMTP which is -not UNIX-centric. Reading mail: use the Mail::Folder module from CPAN -(part of the MailFolder package) or the Mail::Internet module from -CPAN (also part of the MailTools package). - - # sending mail - use Mail::Internet; - use Mail::Header; - # say which mail host to use - $ENV{SMTPHOSTS} = 'mail.frii.com'; - # create headers - $header = new Mail::Header; - $header->add('From', 'gnat@frii.com'); - $header->add('Subject', 'Testing'); - $header->add('To', 'gnat@frii.com'); - # create body - $body = 'This is a test, ignore'; - # create mail object - $mail = new Mail::Internet(undef, Header => $header, Body => \[$body]); - # send it - $mail->smtpsend or die; - -=head2 How do I find out my hostname/domainname/IP address? - -A lot of code has historically cavalierly called the C<`hostname`> -program. While sometimes expedient, this isn't very portable. It's -one of those tradeoffs of convenience versus portability. - -The Sys::Hostname module (part of the standard perl distribution) will -give you the hostname after which you can find out the IP address -(assuming you have working DNS) with a gethostbyname() call. - - use Socket; - use Sys::Hostname; - my $host = hostname(); - my $addr = inet_ntoa(scalar(gethostbyname($name)) || 'localhost'); - -Probably the simplest way to learn your DNS domain name is to grok -it out of /etc/resolv.conf, at least under Unix. Of course, this -assumes several things about your resolv.conf configuration, including -that it exists. - -(We still need a good DNS domain name-learning method for non-Unix -systems.) - -=head2 How do I fetch a news article or the active newsgroups? - -Use the Net::NNTP or News::NNTPClient modules, both available from CPAN. -This can make tasks like fetching the newsgroup list as simple as: - - perl -MNews::NNTPClient - -e 'print News::NNTPClient->new->list("newsgroups")' - -=head2 How do I fetch/put an FTP file? - -LWP::Simple (available from CPAN) can fetch but not put. Net::FTP (also -available from CPAN) is more complex but can put as well as fetch. - -=head2 How can I do RPC in Perl? - -A DCE::RPC module is being developed (but is not yet available), and -will be released as part of the DCE-Perl package (available from -CPAN). No ONC::RPC module is known. - -=head1 AUTHOR AND COPYRIGHT - -Copyright (c) 1997 Tom Christiansen and Nathan Torkington. -All rights reserved. See L<perlfaq> for distribution information. - |