
Parameters to a CGI program are transferred either:
The method used to pass parameters is determined by the METHOD attribute of the <FORM> tag. Before data supplied on a form can be sent to a CGI program, each Form element's name (specified by the NAME attribute) is equated with the VALUE entered by the user to create a NAME=VALUE pair. For example, if the user entered "30" when asked for his or her age, the NAME=VALUE pair would be "age=30".
NOTE: In the transferred data, NAME=VALUE pairs are separated by the ampersand (&) character.
Since under the GET method the form information is sent as part of the URL, Form information can't include any spaces or other special characters that are not allowed in URLs, or characters that have other meanings in URLs, like slashes (/). (For the sake of consistency, this constraint also exists when the POST method is being used.) Therefore, the Web browser performs some special encoding on user-supplied information -- this process is known as URL Encoding.
The GET Method transfers the data within the URL itself. Under the GET Method, the browser might initiate the HTTP transaction as follows:
This request carries the "first" and "last" variable names that were defined in the HTML <FORM>, coupled with the values entered by the user. An ampersand (&) is used to separate the NAME=VALUE pairs.
The POST Method uses the body portion of the HTTP request to pass parameters. The same transaction with the POST method would read as follows:
To summarize, a client browser can make a CGI request to a server by either of two methods:
The client initiates a CGI process by clicking any of the following on an HTML page:
The URL that the client browser sends to the server contains the name of the CGI script or application to be run. The server compares the file name’s extension to the server's Script Mapping registry key to determine which executable to launch. The NT server, for example, has Script Map entries for .cmd and .bat files, which launch Cmd.exe; and for .idc files, which launch the Internet Database Connector, just to name a few.
The server passes information to the CGI application by means of Environment Variables, then launches the "CGI" application. Some of these variables are server-related; the majority come from the client browser and relate either to the client browser or to the request it is sending.
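As a rough illustration of how a CGI application might pick up this information, here is a minimal Python sketch. It assumes the standard CGI variables REQUEST_METHOD and QUERY_STRING; the function name describe_request is hypothetical and not part of any server API.

```python
import os
from urllib.parse import parse_qs


def describe_request(environ):
    """Summarize a CGI request from its environment variables (hypothetical helper)."""
    # REQUEST_METHOD and QUERY_STRING are set by the server before launch.
    method = environ.get("REQUEST_METHOD", "GET")
    query = environ.get("QUERY_STRING", "")
    # keep_blank_values=True preserves fields that arrive as "fieldname=".
    fields = parse_qs(query, keep_blank_values=True)
    return method, fields


if __name__ == "__main__":
    # Under a real server, os.environ holds the variables for this request.
    print(describe_request(os.environ))
```

Passing the environment in as a parameter, rather than reading os.environ directly inside the function, keeps the sketch easy to try with a plain dictionary.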
The application performs its processing. If it is appropriate, the application then writes data in a format the client can receive to the Standard Output stream (STDOUT). The application must follow a specific format in returning data:
The server takes the data it receives from STDOUT and adds standard HTTP headers. It then passes the HTTP message back to the client.
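The output step described above can be sketched as follows. This is a minimal Python illustration, assuming the common case in which the application emits a Content-Type header, a blank line, and then the body; the function name build_cgi_response is hypothetical.

```python
import sys


def build_cgi_response(body, content_type="text/html"):
    """Return the text a CGI program writes to STDOUT (hypothetical helper)."""
    # One header line, then an empty line that ends the header section.
    header = f"Content-Type: {content_type}\n"
    return header + "\n" + body


if __name__ == "__main__":
    page = "<HTML><BODY><H1>Hello</H1></BODY></HTML>\n"
    # The server reads this from STDOUT and adds its own HTTP headers.
    sys.stdout.write(build_cgi_response(page))
```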
When the user presses the "Submit" button, the data entered into the <INPUT> text fields is passed to the CGI program specified by the ACTION attribute of the <FORM> tag. What is sent to the server are the NAME=VALUE pairs:
Notice that every ampersand (&) delimits a single piece of data in the query string, and that the spaces have been translated into ‘+’ signs. Other characters are referenced by their ASCII code in hexadecimal, preceded by a % character. For example:
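A minimal sketch of these rules, using Python's standard urllib.parse module. The sample strings other than "John Smith" are made up for illustration, and a browser's exact output may differ in small details.

```python
from urllib.parse import quote_plus, unquote_plus, urlencode

# Spaces become '+' signs.
encoded = quote_plus("John Smith")        # "John+Smith"

# Reserved characters become % followed by two hexadecimal digits.
special = quote_plus("a/b&c=d")           # "a%2Fb%26c%3Dd"

# urlencode joins the encoded NAME=VALUE pairs with ampersands.
pairs = urlencode({"first": "John", "last": "Smith"})  # "first=John&last=Smith"

# Decoding reverses both translations.
decoded = unquote_plus("a%2Fb%26c%3Dd+e")  # "a/b&c=d e"

if __name__ == "__main__":
    print(encoded, special, pairs, decoded, sep="\n")
```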
This is an example of URL Encoding.
The GET method is used to pass user input from an HTML form to a CGI program for processing. Although GET is the default value for the METHOD attribute, it is considered the less-preferred method for forms-input handling because of limitations on the amount of data that can be passed using GET (a limitation not incurred by using the POST method). Because writing CGI programs that obtain their input from the QUERY_STRING Environment Variable is very straightforward, GET is still useful for simpler forms with only a few input objects. A fair number of older HTML and CGI programs still exist that use GET.
Forms which use the GET method submit their data as part of the URL. The data is sent as a series of NAME=VALUE pairs, separated by ampersands. The listing below shows a forms-based version of the address book.
<HTML>
<BODY>
<H1>Address Book</H1>
<FORM METHOD="GET" ACTION="http://www.mycompany.com/cgi-bin/addrform.pl">
</FORM>
</BODY>
After a user clicks the Submit button on a form, the client browser URL Encodes and assembles user-input data into a query string that is appended to the ACTION URL specified in the <FORM> tag in the HTML document. If "John Smith" were entered in this form, the client would generate the URL with the following NAME=VALUE pairs:
The ? demarcates the boundary between the name of the script, addrform.pl, and the query information. The server parses this address and runs addrform.pl while providing first=John&last=Smith in the QUERY_STRING Environment Variable.
NOTE: If the user fails to enter anything in text and password-entry fields, the field value is empty, but the field name is still appended to the URL query string as "fieldname=". Also note that disabled checkboxes are ignored entirely and are not appended as part of the query string.
Ordinary hyperlinks may also specify URLs which include a query string. This makes it possible to create hyperlinks whose action is equivalent to a user's filling out a form:
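One hedged way to build such a hyperlink, sketched in Python. The link text and the helper name build_link are hypothetical; the URL and field names come from the address book form above. The sketch also escapes the ampersand for use inside HTML.

```python
from html import escape
from urllib.parse import urlencode

# ACTION URL taken from the address book form.
ACTION_URL = "http://www.mycompany.com/cgi-bin/addrform.pl"


def build_link(fields, text):
    """Return an HTML anchor whose URL carries a query string (hypothetical helper)."""
    url = ACTION_URL + "?" + urlencode(fields)
    # escape() turns each & into &amp; so the URL is safe inside an HTML attribute.
    return f'<A HREF="{escape(url, quote=True)}">{escape(text)}</A>'


if __name__ == "__main__":
    print(build_link({"first": "John", "last": "Smith"}, "John Smith"))
```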
To the address book script, input from a user's click on this hyperlink would be essentially indistinguishable from data which was manually entered into the form and submitted.
NOTE: When this URL is placed in HTML, each & must be escaped as &amp;, since the ampersand symbol has special meaning in HTML text.
CAUTION: As mentioned previously, GET is useful for very simple forms. However, GET has serious limitations on the amount of user-input data that can be transmitted from the browser to the server and subsequently to the CGI program. The amount of data that can be transferred is typically limited to about 1,000 characters. This limitation can be especially constrictive for Forms with multiple fields and Forms with <TEXTAREA> Objects. The amount of data, along with URL Encoding, can easily surpass the limitations of GET, resulting in data being truncated while being passed. For this reason, POST is the preferred METHOD for Forms processing.
After a user clicks the Submit button on a form, the client browser URL Encodes user input in the same manner it does for GET. However, the data is not appended to the specified Action URL. The POST method uses the message body to send additional information from the user, rather than encoding it as part of the URL. The data is sent in a data block to the server as part of the POST operation. A data block is simply a stream of data, of arbitrary length, passed to the CGI program. In this case, the Action URL is the URL to which the data block is POSTed. A particular request to the server might look like this:
The server now passes the encoded user data to the CGI program through Standard Input. Additionally, the CONTENT_LENGTH and CONTENT_TYPE Environment Variables are set for use by the CGI program. The Content-length of the NAME=VALUE pairs first=John&last=Smith is 21.
NOTE: You should be aware that when sending data to the CGI program using POST, the server is not required to send an End-of-File (EOF) character at the end of the data. You should use the Environment Variable CONTENT_LENGTH to determine how much data your CGI program needs to read from the Standard Input (STDIN) file descriptor.
A script gets its stream of data from the transmission received through the POST method. This stream is a collection of NAME=VALUE pairs, one for each Form variable on the Form. Here’s your first look at the CGI source code (it reads Standard Input (STDIN) into a memory buffer for the number of characters passed to the server from the client):
This will be covered in the next example that follows this page.
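As a separate illustration of the CONTENT_LENGTH rule described above (not the tutorial's own source code), a minimal Python sketch might look like the following. The function name read_post_fields is hypothetical, and the demo simulates the request with an in-memory stream; under a real server one would pass os.environ and sys.stdin.buffer instead.

```python
import io
from urllib.parse import parse_qs


def read_post_fields(environ, stream):
    """Read exactly CONTENT_LENGTH bytes and decode the pairs (hypothetical helper)."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    # Read only the announced number of bytes; do not wait for EOF.
    data = stream.read(length)
    return parse_qs(data.decode("ascii"), keep_blank_values=True)


if __name__ == "__main__":
    # Simulated POST body; its length is 21, matching the example in the text.
    body = b"first=John&last=Smith"
    env = {"CONTENT_LENGTH": str(len(body))}
    print(read_post_fields(env, io.BytesIO(body)))
```

Reading a fixed number of bytes keeps the program from blocking if the server leaves the input stream open after the data block.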