In the last installment, we saw how to program a network client by writing a simple tool to get pages from remote Web servers. In this issue, we will explore how to write a simple network server. As an example project, we will actually write a simpleminded Web server (the complete code is presented at the end of this article in case you find it easier to follow along that way). Reread the previous issue if you think you have forgotten any of the basic networking concepts I presented there.
    use Socket;

    $this_host = 'my-server.netmarket.com';
    $port = 8080;

    $server_addr = (gethostbyname($this_host))[4];
    $server_struct = pack("S n a4 x8", AF_INET, $port, $server_addr);

    $proto = (getprotobyname('tcp'))[2];
    socket(SOCK, PF_INET, SOCK_STREAM, $proto) ||
        die "Failed to initialize socket: $!\n";

First, the program has to pull in the Perl Socket.pm module. The hostname of the machine upon which the server will run and the port upon which it will accept requests are specified on the next two lines (you can imagine getting these parameters out of a configuration file or on the command line). The program then calls gethostbyname() to get the IP address of the server machine and uses that information to create a C structure which we will use later. Finally, we call socket() to create a file handle for the socket.
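The hand-written pack() template above assumes one particular layout for the C structure, and that layout is not the same on every system. As a hedged alternative, reasonably recent versions of Socket.pm provide helper functions that build and take apart the structure for you. The sketch below uses the loopback address 127.0.0.1 purely as an illustration, so that it runs without any name lookup:

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa pack_sockaddr_in unpack_sockaddr_in);

my $port = 8080;

# inet_aton() turns a dotted address (or hostname) into the packed
# 4-byte form that gethostbyname() also returns.
my $packed_ip = inet_aton('127.0.0.1');

# Build the address/port structure without a hand-written pack() template.
my $server_struct = pack_sockaddr_in($port, $packed_ip);

# Take the structure apart again to confirm what went in.
my ($got_port, $got_ip) = unpack_sockaddr_in($server_struct);
my $summary = inet_ntoa($got_ip) . ":$got_port";
print "$summary\n";
```

The resulting $server_struct can be handed to bind() or connect() in the same way as the hand-packed version.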
Remember from the last article that Web servers usually wait for connections on port 80. Why does the code above specify the port as 8080? As a security feature, only the superuser is allowed to run servers that accept connections on ports below 1024. The thinking behind this policy is that users can trust the services they connect to on unknown machines (Telnet, FTP, gopher, and so on) as long as those services listen at low port numbers, because the system manager at the remote site must "approve" any service run on those ports. This reasoning probably no longer holds in this age of workstations on every desk, where anyone may be the superuser of his or her own machine, but the rule remains.
Returning to our example, the server now needs to prepare to receive connections at the given address and port combination:
    setsockopt(SOCK, SOL_SOCKET, SO_REUSEADDR, 1) ||
        die "setsockopt() failed: $!\n";
    bind(SOCK, $server_struct) || die "bind() failed: $!\n";
    listen(SOCK, SOMAXCONN) || die "listen() failed: $!\n";

The setsockopt() function allows the program to change various parameters associated with the socket: more on SO_REUSEADDR in a moment. The bind() call is what actually associates the SOCK file handle with the address and port number pair specified at the top of the program. As long as any program has bound itself to a particular address and port, no other program can bind to the same location. This is useful and prevents confusion. However, even after a given server program has exited, its address/port combination may not become available for reuse right away: the system can hold on to the address for a while (typically a few minutes) while old connections linger in the TCP TIME_WAIT state, even if you rerun the exact same program. This is annoying and creates bad feelings. Use setsockopt() to set the SO_REUSEADDR option to 1 (true), BEFORE the call to bind(), so the same port can be bound again as soon as the server program has exited. Both the SOL_SOCKET and SO_REUSEADDR constants are defined in Socket.pm.
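To see the "no other program can bind to the same location" rule in action, here is a small sketch that runs entirely on the local machine. It is an illustration under assumptions, not part of the server: it binds to the loopback address, asks the system to pick a free port by requesting port 0, and then tries to bind a second socket to that same port while the first is still listening.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM SOL_SOCKET SO_REUSEADDR SOMAXCONN
              INADDR_LOOPBACK pack_sockaddr_in unpack_sockaddr_in);

my $proto = getprotobyname('tcp');   # scalar context gives the protocol number

# First socket: set SO_REUSEADDR before bind(), as the article advises.
socket(my $first, PF_INET, SOCK_STREAM, $proto) or die "socket() failed: $!\n";
setsockopt($first, SOL_SOCKET, SO_REUSEADDR, 1) or die "setsockopt() failed: $!\n";
bind($first, pack_sockaddr_in(0, INADDR_LOOPBACK)) or die "bind() failed: $!\n";
listen($first, SOMAXCONN) or die "listen() failed: $!\n";

# Port 0 means "any free port"; getsockname() reveals which one we got.
my ($port) = unpack_sockaddr_in(getsockname($first));

# Second socket: try to claim the same address and port while the first
# socket is still open and listening.
socket(my $second, PF_INET, SOCK_STREAM, $proto) or die "socket() failed: $!\n";
my $second_bound = bind($second, pack_sockaddr_in($port, INADDR_LOOPBACK)) ? 1 : 0;

print "port $port: second bind ",
      ($second_bound ? "succeeded" : "failed: $!"), "\n";
```

On the Unix-like systems I would expect this to run on, the second bind() fails with an "address already in use" error. SO_REUSEADDR does not change that; it only affects addresses left over from sockets that are no longer in use.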
The listen() call is probably misnamed. All this function does is specify how long a queue of pending connection attempts the server is willing to deal with. If the server queue is full, further connection attempts will be rejected. On almost every socket implementation in existence, the maximum queue length that you can set is 5 (so handle incoming connection requests quickly!), and SOMAXCONN (another helpful constant from Socket.pm) is usually set to 5. If you try to set the queue length to a value above 5, the operating system silently throttles the queue length back to the maximum value. Solaris 2.x is the only modern operating system that I am aware of where you can meaningfully specify queue length values that are greater than 5 (though interestingly SOMAXCONN is still given as 5 in the Solaris 2.x system header files).
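Since the value of SOMAXCONN comes from the system header files, it differs from one platform to the next. A quick way to check what your own Perl was built with is simply to print the constant:

```perl
use strict;
use warnings;
use Socket qw(SOMAXCONN);

# SOMAXCONN is a constant supplied by Socket.pm; its value is platform-dependent.
my $backlog = SOMAXCONN;
print "SOMAXCONN on this system: $backlog\n";
```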
    for (;;) {
        $remote_host = accept(NEWSOCK, SOCK);
        die "accept() error: $!\n" unless ($remote_host);

        # do some work here

        close(NEWSOCK);
    }

The accept() call grabs the next connection request off the pending queue for SOCK. (If there are no pending connections, accept() pauses until one comes in.) A new socket that is the local endpoint of this new communications channel is created. If you print to NEWSOCK you are sending data to the remote machine making the connection, and you can read data from NEWSOCK just like any other file handle to get data from the remote machine. Always remember to close NEWSOCK when it is no longer needed.
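The following sketch shows one full accept() cycle inside a single process, so that it can be tried without a second machine. It is a toy under stated assumptions: the "client" is just another socket in the same script, the listener uses the loopback address with a system-chosen port, and the message text is made up. It works because the system queues the connection attempt until accept() picks it up.

```perl
use strict;
use warnings;
use IO::Handle;
use Socket qw(PF_INET SOCK_STREAM SOMAXCONN INADDR_LOOPBACK
              pack_sockaddr_in unpack_sockaddr_in);

my $proto = getprotobyname('tcp');

# Listening socket on the loopback address; port 0 lets the system choose.
socket(my $listener, PF_INET, SOCK_STREAM, $proto) or die "socket() failed: $!\n";
bind($listener, pack_sockaddr_in(0, INADDR_LOOPBACK)) or die "bind() failed: $!\n";
listen($listener, SOMAXCONN) or die "listen() failed: $!\n";
my ($port) = unpack_sockaddr_in(getsockname($listener));

# Stand-in client: the connection waits in the pending queue.
socket(my $client, PF_INET, SOCK_STREAM, $proto) or die "socket() failed: $!\n";
connect($client, pack_sockaddr_in($port, INADDR_LOOPBACK))
    or die "connect() failed: $!\n";

# Server side: take the connection off the queue and talk to it.
my $remote_host = accept(my $conn, $listener) or die "accept() error: $!\n";
$conn->autoflush(1);               # do not leave the reply sitting in a buffer
print $conn "hello from server\n";
close($conn);                      # always close the per-connection socket

# Client side: read what the server sent.
my $reply = <$client>;
close($client);
close($listener);
print "client received: $reply";
```

A real server would of course sit in the for (;;) loop and talk to clients on other machines; the point here is only that the socket returned by accept() behaves like an ordinary file handle.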
The accept() function returns a C structure containing the address of the remote machine (or undef if the accept() fails for any reason). This structure is the same as the one passed to bind() and connect(), and you can extract the IP address of the remote machine as follows:

    $raw_addr = (unpack("S n a4 x8", $remote_host))[2];
    @octets = unpack("C4", $raw_addr);
    $address = join(".", @octets);

You can also obtain the hostname of the remote host (usually) with the gethostbyaddr() function:

    $hostname = (gethostbyaddr($raw_addr, AF_INET))[0];

This can be useful for logging purposes. Note the reappearance of AF_INET - gethostbyaddr() needs to be told what type of network address it is being given.
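As a cross-check on the unpack() approach, the sketch below decodes a packed address both by hand and with the inet_ntoa() helper from Socket.pm, then attempts the reverse lookup. The address 127.0.0.1 is only an example, and the fallback to the dotted address when no name comes back is my own suggestion for logging, not something the server above does.

```perl
use strict;
use warnings;
use Socket qw(AF_INET inet_aton inet_ntoa);

# Packed 4-byte address, the same form found inside the accept() structure.
my $raw_addr = inet_aton('127.0.0.1');

# Decode by hand, as in the article ...
my $manual = join('.', unpack('C4', $raw_addr));

# ... and with the Socket.pm helper; the two should agree.
my $via_helper = inet_ntoa($raw_addr);

# Reverse lookup can fail, so fall back to the dotted address for logging.
my $name = gethostbyaddr($raw_addr, AF_INET) || $manual;

print "manual=$manual helper=$via_helper name=$name\n";
```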
HTTP is an incredibly simpleminded protocol. Requests sent by the Web browser are simply lines of ASCII text, terminated by a blank line. After seeing the blank line, the server sends back the requested data and shuts down the connection. Although the client typically sends over a great deal of useful information in its request, a simple Web server can ignore everything except the line that looks like:
    GET /some/path/to/file.html ...

Here's some code that reads the client request and extracts the path to the information that the user is requesting:

    while (<NEWSOCK>) {
        last if (/^\s*$/);
        next unless (/^GET /);
        $path = (split(/\s+/))[1];
    }

Now the server has to respond. Typically $path is relative to the top of some directory hierarchy where your Web documentation lives - your $docroot in Web-speak. This directory can be defined in a config file or on the command line. Assuming that $docroot has been defined elsewhere we can simply:
    if (open(FILE, "< $docroot$path")) {
        @lines = <FILE>;
        print NEWSOCK @lines;
        close(FILE);
    }
    else {
        print NEWSOCK <<"EOErrMsg";
    <TITLE>Error</TITLE><H1>Error</H1>
    The following error occurred while trying to retrieve your information:
    $!
    EOErrMsg
    }

If we are able to open the requested file, we simply dump its contents down NEWSOCK. Note that the server sends back an error message if the open() fails. Never forget that there is somebody on the other end of that connection who is waiting to hear something back as a result of his or her request.
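The request-reading loop can be tried out without any network at all by reading from an in-memory file handle instead of NEWSOCK. In the sketch below, the request text (including the User-Agent line) is invented for the example; the loop itself is the same one the server uses.

```perl
use strict;
use warnings;

# A made-up client request: request line, one header, then a blank line.
my $request = "GET /index.html HTTP/1.0\r\n"
            . "User-Agent: example\r\n"
            . "\r\n";

# Open a read-only file handle on the string, standing in for NEWSOCK.
open(my $fake_sock, '<', \$request) or die "in-memory open failed: $!\n";

my $path;
while (<$fake_sock>) {
    last if /^\s*$/;              # blank line marks the end of the request
    next unless /^GET /;          # ignore everything but the GET line
    $path = (split(/\s+/))[1];    # second whitespace-separated field
}
close($fake_sock);

print "requested path: $path\n";
```

Run as is, this prints "requested path: /index.html".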
Congratulations. If you glue together all the code fragments in this article, you will have a bare-bones Web server. You will find all of the code in proper order at the end of this article to make it easier to review all the concepts presented here.
As written, though, the server will happily let a remote user request a path like

    /../../../../../../../etc/passwd

and get a copy of your password file. Obviously, a better access control mechanism is needed.
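To make the problem concrete, here is a minimal sketch of one possible check, offered as an assumption about how you might start rather than as the fix this series will present. The helper name is_safe_path() is hypothetical. It only rejects requests that do not start with a slash or that contain a ".." component; it does nothing about symbolic links or other ways of escaping $docroot, so it is not a complete access control mechanism.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the requested path looks safe to
# append to $docroot, 0 otherwise.
sub is_safe_path {
    my ($path) = @_;
    return 0 unless defined $path && $path =~ m{^/};         # must start with "/"
    return 0 if grep { $_ eq '..' } split m{/}, $path;       # no ".." components
    return 1;
}

my @examples = ('/index.html', '/../../../../../../../etc/passwd', '/docs/../secret');
for my $path (@examples) {
    printf "%-40s %s\n", $path, is_safe_path($path) ? 'ok' : 'rejected';
}
```

A server using such a check would send its error message instead of calling open() whenever the helper returns 0.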
In the third and final installment of this series, we will look at ways to solve these (and other) problems with our mini Web server.
    #!/packages/misc/bin/perl

    use Socket;

    $docroot = '/home/hal/public_html';
    $this_host = 'my-server.netmarket.com';
    $port = 8080;

    # Initialize C structure
    $server_addr = (gethostbyname($this_host))[4];
    $server_struct = pack("S n a4 x8", AF_INET, $port, $server_addr);

    # Set up socket
    $proto = (getprotobyname('tcp'))[2];
    socket(SOCK, PF_INET, SOCK_STREAM, $proto) ||
        die "Failed to initialize socket: $!\n";

    # Bind to address/port and set up pending queue
    setsockopt(SOCK, SOL_SOCKET, SO_REUSEADDR, 1) ||
        die "setsockopt() failed: $!\n";
    bind(SOCK, $server_struct) || die "bind() failed: $!\n";
    listen(SOCK, SOMAXCONN) || die "listen() failed: $!\n";

    # Deal with requests
    for (;;) {
        # Grab next pending request
        #
        $remote_host = accept(NEWSOCK, SOCK);
        die "accept() error: $!\n" unless ($remote_host);

        # Read client request and get $path
        while (<NEWSOCK>) {
            last if (/^\s*$/);
            next unless (/^GET /);
            $path = (split(/\s+/))[1];
        }

        # Print a line of logging info to STDOUT
        $raw_addr = (unpack("S n a4 x8", $remote_host))[2];
        $dot_addr = join(".", unpack("C4", $raw_addr));
        $name = (gethostbyaddr($raw_addr, AF_INET))[0];
        print "$dot_addr\t$name\t$path\n";

        # Respond with info or error message
        if (open(FILE, "< $docroot$path")) {
            @lines = <FILE>;
            print NEWSOCK @lines;
            close(FILE);
        }
        else {
            print NEWSOCK <<"EOErrMsg";
    <TITLE>Error</TITLE><H1>Error</H1>
    The following error occurred while trying to retrieve your information:
    $!
    EOErrMsg
        }

        # All done
        close(NEWSOCK);
    }
Reproduced from ;login: Vol. 21 No. 5, October 1996.
12/5/96ah