The Hypertext Transfer Protocol (HTTP) is the language
that Web clients and Web servers use to communicate with each other. It
is essentially the backbone of the Web.
While HTTP is largely the realm of server and client programming, a
firm understanding of HTTP is also important for CGI programming.
HTTP Basics
All HTTP transactions follow the same general format. Each client
request and server response has three parts: the request or response
line, a header section, and the entity body. The client initiates a
transaction as follows:
- The client contacts the server at a designated port number (by
default, 80). Then it sends a document request by specifying an HTTP
command called a method, followed by a document address, and
an HTTP version number. For example:
GET /index.html HTTP/1.1
uses the GET method to request the document index.html
using version 1.1 of HTTP.
- Next, the client sends optional header information to inform the
server of its configuration and the document formats it will accept.
All header information is given line by line, each with a header
name and value. For example, this header information sent by the
client indicates its name and version number and specifies several
document preferences:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)
Accept: image/gif, image/jpg, image/jpeg, */*
The client sends a blank line to end the header.
- After sending the request and headers, the client may send
additional data. This data is mostly used by CGI programs that are
invoked with the POST method.
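The three client steps above can be sketched in a few lines of Python. This is only an illustration of the message layout, not a full client: the function name `build_request` and the host `www.example.com` are assumptions made for the example, and the `Host` header is added because HTTP 1.1 requires it.

```python
def build_request(method, path, headers, body=b"", version="HTTP/1.1"):
    """Assemble a raw HTTP request: request line, headers, blank line, body."""
    lines = [f"{method} {path} {version}"]
    for name, value in headers.items():
        lines.append(f"{name}: {value}")
    # Lines end in CRLF; an empty line marks the end of the header section.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


request = build_request(
    "GET",
    "/index.html",
    {
        "Host": "www.example.com",  # assumed host name, for illustration only
        "User-Agent": "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)",
        "Accept": "image/gif, image/jpg, image/jpeg, */*",
    },
)
print(request.decode("ascii"))
```

A POST request would pass the form data as `body` and would normally also carry a `Content-Length` header giving the size of that data.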
The server responds in the following way to the client's request:
- The server replies with a status line containing three fields:
HTTP version, status code, and description. The HTTP version
indicates the version of HTTP that the server is using to respond.
The status code is a three-digit number that indicates the outcome
of the client's request. The description following the status
code is just human-readable text that describes the status code. For
example, this status line:
HTTP/1.1 200 OK
indicates that the server uses version 1.1 of HTTP in its response.
A status code of 200 means that the client's request was successful
and the requested data will be supplied after the headers. Server
Response Codes contains a listing of the status codes and their
descriptions.
- After the status line, the server sends header information to the
client about itself and the requested document. For example:
Date: Sun, 21 Jan 2001 08:17:58 GMT
Server: Microsoft-IIS/4.0
Last-modified: Mon, 08 Jan 2001 21:53:08 GMT
Content-type: text/html
Content-length: 2482
A blank line ends the header.
- If the client's request is successful, the requested data is sent.
This data may be a copy of a file, or the response from a CGI
program. If the client's request could not be fulfilled, additional
data may be a human-readable explanation of why the server could not
fulfill the request.
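The response layout described above (status line, headers, blank line, data) can be taken apart with a short parser. The sketch below is a simplification under stated assumptions: `parse_response` is a hypothetical helper, the sample response is made up for the example, and the code assumes the whole response is already in memory and that the status line includes a description.

```python
def parse_response(raw):
    """Split a raw HTTP response into (version, status, reason, headers, body)."""
    # The first blank line separates the header section from the data.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: HTTP version, three-digit status code, description.
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Date: Sun, 21 Jan 2001 08:17:58 GMT\r\n"
    b"Server: Microsoft-IIS/4.0\r\n"
    b"Content-type: text/html\r\n"
    b"Content-length: 13\r\n"
    b"\r\n"
    b"<p>Hello</p>\n"
)
version, status, reason, headers, body = parse_response(raw)
print(version, status, reason)
print(headers["content-type"], len(body))
```

Real clients read from a socket incrementally and use `Content-length` to know when the data ends; that bookkeeping is left out here.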
In HTTP 1.1, the server keeps the connection open by default
and allows the client to make additional requests. Since
many documents embed other documents as inline images, frames,
applets, etc., this saves the overhead of the client having to
repeatedly connect to the same server just to draw a single page.
Under HTTP 1.1, therefore, the transaction might cycle back to the
beginning, until either the client or server explicitly closes the
connection.
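One common way for either side to close the connection explicitly is to send a `Connection: close` header. The following sketch shows how a program might decide whether a connection stays open after a message. The function name `connection_persists` is hypothetical, and the code assumes header names have already been lowercased into a dictionary.

```python
def connection_persists(version, headers):
    """Return True if the connection should stay open after this message."""
    # The Connection header holds a comma-separated list of tokens.
    raw = headers.get("connection", "")
    tokens = {token.strip().lower() for token in raw.split(",")}
    if "close" in tokens:
        return False  # either side asked for the connection to be closed
    if version == "HTTP/1.1":
        return True  # persistent by default under HTTP 1.1
    # Older HTTP 1.0 peers had to opt in with "Connection: keep-alive".
    return "keep-alive" in tokens


print(connection_persists("HTTP/1.1", {}))
print(connection_persists("HTTP/1.1", {"connection": "close"}))
print(connection_persists("HTTP/1.0", {}))
print(connection_persists("HTTP/1.0", {"connection": "Keep-Alive"}))
```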
Being a stateless protocol, HTTP does not maintain any information
from one transaction to the next, so the next transaction needs to start
all over again. The advantage is that an HTTP server can serve a lot
more clients in a given period of time, since there's no additional
overhead for tracking sessions from one connection to the next. The
disadvantage is that more elaborate CGI programs need to use hidden
input fields or external tools such as cookies
to maintain information from one transaction to the next.
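As a rough illustration of carrying state across transactions with a cookie, the sketch below counts visits using Python's standard `http.cookies` module. The cookie name `visits` and both function names are assumptions made for this example. In a CGI program the incoming cookie string would typically come from the `HTTP_COOKIE` environment variable; here it is passed in directly.

```python
from http.cookies import SimpleCookie


def next_visit_count(cookie_header):
    """Read the 'visits' cookie sent by the client and return the next count."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    try:
        visits = int(cookie["visits"].value)
    except (KeyError, ValueError):
        visits = 0  # no cookie yet, or a value that is not a number
    return visits + 1


def set_cookie_header(visits):
    """Build the response header that stores the count on the client."""
    cookie = SimpleCookie()
    cookie["visits"] = str(visits)
    return cookie.output()


# First transaction: the client has no cookie yet.
first = next_visit_count(None)
print(set_cookie_header(first))

# Next transaction: the client sends back what it was given.
second = next_visit_count(f"visits={first}")
print(set_cookie_header(second))
```

The server itself remembers nothing between the two calls; all of the state travels in the headers exchanged with the client.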