Turning the Brochure into a Form
Things are going so well here at Butterthlies, Inc., that we are hard put to keep up with the flood of demand. Everyone, even the cat, is hard at work typing in orders that arrive incessantly by mail and telephone.
Then someone has a brainstorm: "Hey," she cries, "let's use the Internet to take the orders!" The essence of her scheme is simplicity itself. Instead of letting customers read our catalog pages on the Web and then, drunk with excitement, phone in their orders, we provide them with a form they can fill out on their screens. At our end we get a chunk of data back from the Web, which we then pass to a script or program we have written.
Creating the form is a simple matter of editing our original brochure to turn it into a form. We have to resist the temptation to fool around, making our script more and more beautiful. We just want to add four fields to capture the number of copies of each card the customer wants and, at the bottom, a field for the credit card number. Before we get embroiled in artistry, let's look briefly at a bit of theory.
To recapitulate amidst a sea of initials: HTTP (HyperText Transfer Protocol) is the standard way of sending documents over the Web. HTTP runs over TCP. The client (which is normally a browser such as Netscape) establishes a TCP connection to the server (which in our case is Apache) and then sends a request in HTTP format down that channel. The server examines the request and responds in whatever way its webmaster has told it to. The webmaster does this by configuring the Apache server and the files or scripts he or she provides on the system.
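To make the exchange concrete, here is a minimal sketch in Python of what a client writes to the TCP connection and how a server might pick apart the first line. The host name and path are hypothetical, chosen only for illustration, and a real server such as Apache does far more validation than this.

```python
def build_request(method, path, host):
    """Return the bytes a client would send down the TCP connection."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, URL path, version
        f"Host: {host}",              # HTTP/1.1 requires a Host header
        "Connection: close",
        "",                           # an empty line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_request_line(raw):
    """Split the first line of a request into method, path, and version."""
    first_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    method, path, version = first_line.split(" ")
    return method, path, version


# Hypothetical host and path, for illustration only.
request = build_request("GET", "/catalog_summer.html", "www.butterthlies.com")
print(request.decode("ascii"))
print(parse_request_line(request))
```

The method is the first word on the request line, which is why the server can decide how to treat the request before it has read anything else.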
The machine's response may be in HTML, graphics, audio, VRML, Java, or whatever new fad the web fanatics have dreamed up since we went to press. Whatever it is, it consists of bytes of data that are made into packets by the server's TCP/IP stack and transmitted. The server labels the data with a MIME type so that the client knows how to interpret it. You can find a list of MIME types in the file mime.types or at http://www.isi.edu/in-notes/iana/assignments/media-types/media-types. The meanings are pretty obvious: text/html is HTML, text/plain is plain text, image/jpeg is a JPEG, and so on.
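As a rough illustration of the mapping from filename to MIME type, the sketch below uses Python's standard mimetypes module, which keeps a table much like Apache's mime.types file. It is not how Apache itself does the lookup; the filenames index.html and notes.txt are hypothetical, and the fallback type is an assumption made for this example.

```python
import mimetypes


def content_type_for(filename):
    """Guess a MIME type from the file extension, with a generic fallback."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    # Fallback chosen for this sketch when the extension is unknown.
    return mime_type or "application/octet-stream"


for name in ("index.html", "notes.txt", "bench.jpg"):
    print(name, "->", content_type_for(name))
```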
One of the more important fields in a request is METHOD. This tells the server how to handle the incoming data. For a complete account, see the HTTP/1.1 specification. Briefly, however, the methods are as follows:
GET
Returns the data asked for. To save network traffic, a "conditional GET" only generates a return if the condition is satisfied. For instance, a page that alters frequently may be transmitted. The client asks for it again: if it hasn't changed since last time, the conditional GET generates a response telling the client to get it from its local cache.
HEAD
Returns the headers that a GET would have included, but without data. They can be used to test the freshness of the client's cache.
POST
Tells the server to accept the data and do something with it, using the CGI[33] specified by the URL[34] in the ACTION field. For instance, when you buy a book across the Web, you fill in a form with the book's title, your credit card number, and so on. Your browser will then tell the server to POST this data.
[33]Typically, although the URL could specify a module or even something more exotic.
[34]Often this will be the ACTION field from an HTML form, but in principle, it could be generated in any way a browser sees fit.
PUT
Tells the server to store the data.
DELETE
Tells the server to delete the data.
TRACE
Tells the server to return a diagnostic trace of the actions it takes.
CONNECT
Used to ask a proxy to make a connection to another host and simply relay the content, rather than attempting to parse or cache it. This is often used to make SSL connections through a proxy.
Note that servers do not have to implement all these methods. See RFC 2068 for more detail.
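The conditional GET described above can be sketched as a small decision function. This is an illustration in Python, not Apache's implementation: it assumes the client sends the date of its cached copy in an If-Modified-Since header, and the function name and sample dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get_status(last_modified, if_modified_since=None):
    """Return 304 if the client's cached copy is still current, else 200."""
    if if_modified_since is None:
        return 200  # ordinary GET: always send the data
    try:
        cached_at = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unreadable date: ignore the condition
    if cached_at.tzinfo is None:
        cached_at = cached_at.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= cached_at:
        return 304  # not modified: client should use its local cache
    return 200


# Hypothetical timestamps, for illustration only.
page_changed = datetime(2001, 3, 1, 12, 0, 0, tzinfo=timezone.utc)
header = format_datetime(page_changed, usegmt=True)

print(conditional_get_status(page_changed, header))
print(conditional_get_status(page_changed + timedelta(hours=1), header))
```

The first call corresponds to a page that has not changed since the client fetched it; the second to a page edited an hour later, which must be sent again in full.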
The catalog, now a form with the new lines marked:
<!-- NEW LINE - CREATES A FORM FIELD -->
is shown here. As we'll see, the Unix and Win32 versions are slightly different in the extensions they will tolerate for CGI scripts.
Unix doesn't mind what a script is called, provided it is made executable with:
chmod +x <scriptname>
Win32 has a default shell -- COMMAND.COM -- that will execute batch files with the extension .bat. If you want to use it, you don't have to specify it (see later in this chapter):
<html>
<body>
<!-- UNIX -->
<!--TWO VERSIONS - see text above -->
<FORM METHOD=GET ACTION="mycgi.cgi">
<!-- OR -->
<FORM METHOD=GET ACTION="cgi-bin/mycgi.cgi">
<!-- WIN32 -->
<!--TWO VERSIONS - see text above -->
<FORM METHOD=GET ACTION="mycgi.bat">
<!-- OR -->
<FORM METHOD=GET ACTION="cgi-bin/mycgi.bat">
<h1> Welcome to Butterthlies Inc</h1>
<h2>Summer Catalog</h2>
<p> All our cards are available in packs of 20 at $2 a pack.
There is a 10% discount if you order more than 100.
</p>
<hr>
<p>
Style 2315
<p align=center>
<img src="bench.jpg" alt="Picture of a bench">
<p align=center>
Be BOLD on the bench
<!-- NEW LINE - CREATES A FORM FIELD -->
<p>How many packs of 20 do you want? <INPUT NAME="2315_order" TYPE=text>
<hr>
<p>
Style 2316
<p align=center>
<img src="hen.jpg" ALT="Picture of a hencoop like a pagoda">
<p align=center>
Get SCRAMBLED in the henhouse
<!-- NEW LINE - CREATES A FORM FIELD -->
<p>How many packs of 20 do you want? <INPUT NAME="2316_order" TYPE=text>
<HR>
<p>
Style 2317
<p align=center>
<img src="tree.jpg" alt="Very nice picture of tree">
<p align=center>
Get HIGH in the treehouse
<!-- NEW LINE - CREATES A FORM FIELD -->
<p>How many packs of 20 do you want? <INPUT NAME="2317_order" TYPE=text>
<hr>
<p>
Style 2318
<p align=center>
<img src="bath.jpg" alt="Rather puzzling picture of a bathtub">
<p align=center>
Get DIRTY in the bath
<!-- NEW LINE - CREATES A FORM FIELD -->
<p>How many packs of 20 do you want? <INPUT NAME="2318_order" TYPE=text>
<hr>
<!-- NEW LINES - CREATE FORM FIELDS -->
<p>Which Credit Card are you using?
<ol><li>Access <INPUT NAME="card_type" TYPE=checkbox VALUE="Access">
<li>Amex <INPUT NAME="card_type" TYPE=checkbox VALUE="Amex">
<li>MasterCard <INPUT NAME="card_type" TYPE=checkbox VALUE="MasterCard">
</ol>
<p>Your card number? <INPUT NAME="card_num" SIZE=20>
<hr>
<p align=right>
Postcards designed by [email protected]
<hr>
<br>
Butterthlies Inc, Hopeful City, Nevada 99999
</br>
<!-- NEW LINE - CREATES A FORM FIELD -->
<p><INPUT TYPE=submit><INPUT TYPE=reset>
</FORM>
</body>
</html>
This is all pretty straightforward stuff, except perhaps for the line:
<FORM METHOD=GET ACTION="/cgi-bin/mycgi.cgi">
or:
<FORM METHOD=GET ACTION="mycgi.bat">
The tag <FORM> introduces the form; at the bottom, </FORM> ends it. The METHOD attribute tells the browser which HTTP method to use when it sends the form data, and that in turn determines how Apache passes the data to the CGI script we are going to write. For the moment it is irrelevant because the simple script mycgi.cgi ignores the returned data.
On Unix, the ACTION specification tells Apache to use the URL /cgi-bin/mycgi.cgi (amplified to /usr/www/cgi-bin/mycgi) to do something about it all:
ACTION="/cgi-bin/mycgi.cgi"
Or, if we are using the second method, where we keep the CGI script in the htdocs directory:
ACTION="/mycgi.cgi"
On Win32, the ACTION specification tells Apache to use the URL /cgi-bin/mycgi.bat (amplified to \usr\www\cgi-bin\mycgi) to do something about it all:
ACTION="/cgi-bin/mycgi.bat"
Or, if we are using the second method, where we keep the CGI script in the htdocs directory:
ACTION="/mycgi.bat"
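To see what METHOD=GET means for the script named in ACTION, here is a sketch in Python of the round trip. With GET, the browser appends the form fields to the URL as a query string, and under CGI the server hands that string to the script in the QUERY_STRING environment variable. The function order_summary and the sample field values are hypothetical, and this is not the mycgi script itself.

```python
from urllib.parse import parse_qs, urlencode


def order_summary(environ):
    """Build a CGI-style response from the QUERY_STRING of a GET request."""
    fields = parse_qs(environ.get("QUERY_STRING", ""))  # blank fields are dropped
    lines = []
    for name in sorted(fields):
        if name.endswith("_order"):
            lines.append(f"{name}: {fields[name][0]} pack(s)")
    body = "\n".join(lines) or "No cards ordered."
    # A CGI script writes headers, a blank line, then the body.
    return "Content-Type: text/plain\r\n\r\n" + body + "\n"


# What a browser might send for the form above (sample values are made up).
form_data = {
    "2315_order": "2",
    "2316_order": "",
    "card_type": "Amex",
    "card_num": "0000",
}
query_string = urlencode(form_data)
print(query_string)
print(order_summary({"QUERY_STRING": query_string}))
```

A real CGI script would read os.environ rather than a dictionary passed in by hand; passing the environment as an argument simply makes the sketch easy to try without a server.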
Copyright © 2001 O'Reilly & Associates. All rights reserved.