@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 2010, 2011, 2012, 2013, 2015, 2018, 2019, 2020 Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node Web
@section @acronym{HTTP}, the Web, and All That
@cindex Web
@cindex WWW
@cindex HTTP

It has always been possible to connect computers together and share
information between them, but the rise of the World Wide Web over the
last couple of decades has made it much easier to do so.  The result is
a richly connected network of computation, of which Guile forms a part.

By ``the web'', we mean the HTTP protocol@footnote{Yes, the P is for
protocol, but this phrase appears repeatedly in RFC 2616.} as handled by
servers, clients, proxies, caches, and the various kinds of messages and
message components that can be sent and received by that protocol,
notably HTML.

On one level, the web is text in motion: the protocols themselves are
textual (though the payload may be binary), and it's possible to create
a socket and speak text to the web.  But such an approach is obviously
primitive.  This section details the higher-level data types and
operations provided by Guile: URIs, HTTP request and response records,
and a conventional web server implementation.

The material in this section is arranged in ascending order, in which
later concepts build on previous ones.  If you prefer to start with the
highest-level perspective, @pxref{Web Examples}, and work your way
back.

@menu
* Types and the Web::           Types prevent bugs and security problems.
* URIs::                        Uniform Resource Identifiers.
* HTTP::                        The Hypertext Transfer Protocol.
* HTTP Headers::                How Guile represents specific header values.
* Transfer Codings::            HTTP Transfer Codings.
* Requests::                    HTTP requests.
* Responses::                   HTTP responses.
* Web Client::                  Accessing web resources over HTTP.
* Web Server::                  Serving HTTP to the internet.
* Web Examples::                How to use this thing.
@end menu

@node Types and the Web
@subsection Types and the Web

It is a truth universally acknowledged, that a program with good use of
data types, will be free from many common bugs.  Unfortunately, the
common practice in web programming seems to ignore this maxim.  This
subsection makes the case for expressive data types in web programming.

By ``expressive data types'', we mean that the data types @emph{say}
something about how a program solves a problem.  For example, if we
choose to represent dates using SRFI 19 date records (@pxref{SRFI-19}),
this indicates that there is a part of the program that will always have
valid dates.  Error handling for a number of basic cases, like invalid
dates, occurs at the boundary where we produce a SRFI 19 date record
from other types, like strings.

With regard to the web, data types are helpful in the two broad phases
of HTTP messages: parsing and generation.

Consider a server, which has to parse a request, and produce a response.
Guile will parse the request into an HTTP request object
(@pxref{Requests}), with each header parsed into an appropriate Scheme
data type.  This transition from an incoming stream of characters to
typed data is a state change in a program---the strings might parse, or
they might not, and something has to happen if they do not.  (Guile
throws an error in this case.)  But after you have the parsed request,
``client'' code (code built on top of the Guile web framework) will not
have to check for syntactic validity.  The types already make this
information manifest.

This state change at the parsing boundary makes programs more robust, as
they themselves are freed from the need to do a number of common error
checks, and they can use normal Scheme procedures to handle a request
instead of ad-hoc string parsers.

The need for types on the response generation side (in a server) is more
subtle, though not less important.
Consider the example of a POST handler, which prints out the text that a
user submits from a form.  Such a handler might include a procedure like
this:

@example
;; First, a helper procedure
(define (para . contents)
  (string-append "<p>" (string-concatenate contents) "</p>"))

;; Now the meat of our simple web application
(define (you-said text)
  (para "You said: " text))

(display (you-said "Hi!"))
@print{} <p>You said: Hi!</p>
@end example

This is a perfectly valid implementation, provided that the incoming
text does not contain the special HTML characters @samp{<}, @samp{>}, or
@samp{&}.  But this provision of a restricted character set is not
reflected anywhere in the program itself: we must @emph{assume} that the
programmer understands this, and performs the check elsewhere.

Unfortunately, the short history of the practice of programming does not
bear out this assumption.  A @dfn{cross-site scripting} (@acronym{XSS})
vulnerability is just such a common error, in which unfiltered user
input is allowed into the output.  A user could submit a crafted comment
to your web site which results in visitors running malicious JavaScript,
within the security context of your domain:

@example
(display (you-said "