Sockets
A socket is a communication mechanism. A socket is normally identified by a small integer which may be called the socket descriptor. The socket mechanism was first introduced in the 4.2 BSD Unix system in 1983 in conjunction with the TCP/IP protocols that first appeared in the 4.1 BSD Unix system in late 1981.
Formally a socket is defined by a group of four numbers:
- The remote host identification number or address
- The remote host port number
- The local host identification number or address
- The local host port number
Users of Internet applications are normally aware of all except the local port number. This is allocated when the connection is established and is almost entirely arbitrary, unlike the well known port numbers associated with popular applications.
To the application programmer the sockets mechanism is accessed via a number of functions. These are:
socket() | create a socket |
bind() | associate a socket with a network address |
connect() | connect a socket to a remote network address |
listen() | wait for incoming connection attempts |
accept() | accept incoming connection attempts |
In addition the functions setsockopt(), getsockopt(), fcntl() and ioctl() may be used to manipulate the properties of a socket. The function select() may be used to identify sockets with particular communication statuses. The function close() may be used to close a socket liaison.
Data can be written to a socket using any of the functions write(), writev(), send(), sendto() and sendmsg(). Data can be read from a socket using any of the functions read(), readv(), recv(), recvfrom() and recvmsg().
These notes have been written in the context of SUN's Solaris 2 operating system, in particular version 2.5. Other operating systems and environments will provide similar facilities and functions but reference to the documentation is advised. In particular many of the functions described here may well be system calls, i.e. direct entries to the operating system.
Daemon Processes
The sockets mechanism is usually used to implement client-server applications. The client process is directly or indirectly user driven whereas the server process sits on a host waiting for incoming connections. A server process will run unattended and continuously. In the Unix environment such processes are called daemon processes.
A daemon process can be initiated as part of the system boot up sequence. Alternatively a daemon process can be initiated by a user in such a way that it carries on running after the user logs out. The latter can be achieved using the nohup(1) command in conjunction with an ampersand (&) at the end of the command line, but special coding techniques can also be used.
To create a daemon observe the following steps.
- Ensure that the return value from functions is always checked.
- Remember that the daemon inherits various things from the shell from which it was invoked.
- Close all open files. This includes stdin, stdout and stderr. Open suitable special files for stdin, stdout and stderr if necessary.
- Change the working directory. Remember that if the daemon crashes it will drop core in the current working directory. Also the working directory, and under some systems the executable file, are open when the daemon is running. This may cause difficulties if the system manager wants to unmount the relevant partition whilst the daemon is running.
- Reset the file creation mask using umask().
- To run a daemon in background after interaction to start it up, which may be useful during development, use the fork() function.
- Run the daemon in a separate process group using the setpgrp() function. This avoids unexpected signals.
- Catch all likely signals. The SIGTERM signal is sent to all processes when the system closes down; arrange to catch this for a graceful shut down of the daemon.
- Ensure that there is no control terminal.
- For development you may wish to log events. To avoid information being lost in buffers use unbuffered output, either using sprintf() and write() or using fopen() and fclose() before and after every logged event. ANSI C compilers offer further options for unbuffered output. Log every signal.
- Use lock files to avoid multiple instances of daemons, i.e. when a daemon starts up it checks for the existence of a lock file. If the file does not exist the daemon creates it, and if it does exist the daemon refuses to run.
A socket is created using the function socket(). The prototype is
    #include <sys/types.h>
    #include <sys/socket.h>
    int socket(int domain, int type, int protocol);
domain is either AF_UNIX or AF_INET. This parameter specifies whether the socket is to be used for communicating between Unix file system like objects or Internet objects. Actually this parameter is intended to allow sockets to be used with a wide variety of networking protocols and products. Examination of socket.h will show that there are a large number of possible values for networking domains such as CCITT/X.25, Novell and many others. The information in these notes is only intended to cover the Internet domain.
type specifies the communications semantics. There are a number of possible values.
SOCK_STREAM | stream based full-duplex communication |
SOCK_DGRAM | datagram based communication |
SOCK_RAW | use raw IP sockets (must be super-user) |
SOCK_SEQPACKET | sequenced reliable datagrams |
SOCK_RDM | reliably delivered messages |
protocol is normally set to zero.
The return value from socket() is a small integer that may be used to refer to a socket in subsequent calls. It may be called the socket descriptor or handle and is analogous to a file descriptor.
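As a small illustration, creating an Internet domain stream socket and checking the return value might look like the sketch below. The helper name make_stream_socket() is hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: create an Internet domain stream socket.
 * Returns the socket descriptor, or -1 after reporting the error. */
int make_stream_socket(void)
{
    int sd = socket(AF_INET, SOCK_STREAM, 0);   /* protocol 0: default */

    if (sd == -1)
        perror("socket");
    return sd;
}
```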
The behaviour of a socket can be modified using the functions setsockopt(), fcntl() and ioctl(). The function getsockopt() can be used to determine the current values of certain aspects of the behaviour of a socket. The default options are satisfactory for most applications.
With fcntl() the FNDELAY option can be associated with the command F_SETFL to make reads non-blocking.
The prototype of setsockopt() is
    #include <sys/types.h>
    #include <sys/socket.h>
    int setsockopt(int s, int level, int optname, char *optval, int optlen);
s is the socket number.
level is set to SOL_SOCKET.
optname indicates which option to modify. There are several options, see the manual for details.
optval/optlen are used to provide new values for the various options.
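As an example of the calling convention, the sketch below turns on one socket level option. The choice of SO_REUSEADDR is an assumption made for illustration (it is commonly set on server sockets so that a restarted server can bind to its port again), and the helper name allow_address_reuse() is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: enable SO_REUSEADDR on socket sd.
 * Returns 0 on success, -1 on error. */
int allow_address_reuse(int sd)
{
    int on = 1;     /* non-zero switches a boolean option on */

    return setsockopt(sd, SOL_SOCKET, SO_REUSEADDR,
                      (char *)&on, sizeof on);
}
```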
The select() function
If an application is using several sockets then the select() function may be used to find out which ones are active, i.e. which ones have outstanding incoming data, which can now accept further data for transmission or which have outstanding exceptional conditions. This function would probably only be used with server daemon programs using multiplexing to handle multiple clients.
select() is normally used in conjunction with the macros FD_SET, FD_CLR, FD_ISSET and FD_ZERO. The prototype is
    #include <sys/types.h>
    #include <sys/time.h>
    int select(int width, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
Before calling select(), FD_ZERO, FD_SET and FD_CLR should be used to put the relevant descriptors into objects of type fd_set. After the call the return value gives the number of "active" sockets and the FD_ISSET macro may be used to determine whether a particular socket is active. width specifies how many descriptors are to be checked, i.e. one more than the largest descriptor of interest. The information in timeout specifies how long to wait to complete the selection.
The connect() function
The connect() function is used by a client program to establish communication with a remote entity. The prototype is
    #include <sys/types.h>
    #include <sys/socket.h>
    int connect(int s, struct sockaddr *name, int namelen);
s specifies the socket.
name points to an area of memory containing the address information and namelen gives the length of the address information block. The length is passed explicitly because addresses in some addressing domains are far longer than in others. connect() is normally only used for SOCK_STREAM sockets.
The form of a struct sockaddr is
    struct sockaddr {
        u_short sa_family;      /* address family */
        char    sa_data[14];    /* actual address */
    };
IP Addresses for connect()
If the AF_INET domain is being used the name parameter will be the address of an object of type struct sockaddr_in whose internal structure is
    struct sockaddr_in {
        short          sin_family;
        u_short        sin_port;
        struct in_addr sin_addr;
        char           sin_zero[8];
    };
A struct in_addr in turn consists of
    struct in_addr {
        union {
            struct { u_char s_b1, s_b2, s_b3, s_b4; } S_un_b;
            struct { u_short s_w1, s_w2; } S_un_w;
            u_long S_addr;
        } S_un;
    };
As a convenience the following definition can be used
    #define s_addr S_un.S_addr
The use of the other "views" of the internal structure of struct in_addr is obsolete. However, the function connect() requires the address of an area of memory containing a data object of type struct sockaddr. The conflict between an IP address stored in an object of type struct sockaddr_in and the object of type struct sockaddr required by connect() and related routines can be resolved by the use of a union thus.
    union sock {
        struct sockaddr    s;
        struct sockaddr_in i;
    } sock;
The usual way of opening a connection is by the user supplying a fully qualified DNS name (e.g. scitsc.wlv.ac.uk) and using the library call gethostbyname() to determine the associated Internet address. The prototype is
    #include <sys/socket.h>
    #include <netdb.h>
    struct hostent *gethostbyname(char *);
This returns the address of an object of type struct hostent, or a null pointer if the name cannot be resolved. The relevant fields need to be copied into an object of type struct sockaddr_in as shown below
    union sock {
        struct sockaddr    s;
        struct sockaddr_in i;
    } sock;
    struct in_addr internet_address;
    struct hostent *hp;

    hp = gethostbyname(remote address);
    memcpy(&internet_address, *(hp->h_addr_list), sizeof(struct in_addr));
The relevant components of the struct sockaddr_in can now be filled in. Some older references on sockets programming may use the function bcopy() rather than memcpy().
    sock.i.sin_family = AF_INET;
    sock.i.sin_port = htons(required port number);  /* network byte order */
    sock.i.sin_addr = internet_address;
In practice the final line of code above may be written thus
    memcpy(&(sock.i.sin_addr), *(hp->h_addr_list), sizeof(struct in_addr));
connect() can now be called
    connect(socket number, &sock.s, sizeof(struct sockaddr));
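Putting the pieces together, a client connection might be opened as in the sketch below. The helper name connect_to_host() is hypothetical. A cast to struct sockaddr * is used instead of the union, which is the more common idiom in later code, and the return values of gethostbyname(), socket() and connect() are all checked.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect to the named host and port.
 * Returns a connected socket descriptor, or -1 on failure. */
int connect_to_host(const char *host, unsigned short port)
{
    struct sockaddr_in addr;
    struct hostent *hp;
    int sd;

    hp = gethostbyname(host);
    if (hp == NULL || hp->h_addrtype != AF_INET)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);            /* network byte order */
    memcpy(&addr.sin_addr, hp->h_addr_list[0], sizeof addr.sin_addr);

    sd = socket(AF_INET, SOCK_STREAM, 0);
    if (sd == -1)
        return -1;
    if (connect(sd, (struct sockaddr *)&addr, sizeof addr) == -1) {
        close(sd);
        return -1;
    }
    return sd;
}
```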
Domain Name Service
Domain name service (DNS) is a mechanism that provides easily remembered names for Internet hosts rather than using the dotted decimal addresses.
It can be accessed via the program /usr/etc/nslookup thus
    /usr/etc/nslookup
    Default Server: ccub.wlv.ac.uk
    Address: 134.220.1.20

    > unix.hensa.ac.uk
    Server: ccub.wlv.ac.uk
    Address: 134.220.1.20

    Non-authoritative answer:
    Name: unix.hensa.ac.uk
    Address: 129.12.21.7

    > Ctrl-D
Incoming connections bind()
To accept incoming connection requests a server process must first create a socket using socket() and then use bind() to associate a port number with the socket. The prototype of bind() is
    #include <sys/types.h>
    #include <sys/socket.h>
    int bind(int s, struct sockaddr *name, int namelen);
This is similar to the connect() function except that, when binding in the AF_INET domain, the components of the struct sockaddr_in are filled in differently
    union sock {
        struct sockaddr    s;
        struct sockaddr_in i;
    } sock;

    sock.i.sin_family = AF_INET;
    sock.i.sin_port = htons(port number);   /* network byte order */
    sock.i.sin_addr.s_addr = INADDR_ANY;
The final value simply means that connections will be accepted on any of the local host's network addresses, whichever remote host they come from.
Incoming Connections listen()
Once an address has been bound to a socket it is then necessary to indicate that the socket is to be listened to for incoming connection requests. This is done using the listen() function. Its prototype is
    int listen(int s, int backlog);
s specifies the socket.
backlog specifies the maximum number of outstanding connection requests in listen()'s input queue. listen() can only be associated with SOCK_STREAM or SOCK_SEQPACKET type sockets.
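The socket(), bind() and listen() steps are often wrapped up together. The sketch below is one way of doing so. The helper name make_listener() is hypothetical, and passing port 0, which asks the system to choose a free port, is a convenience assumed here for experimentation rather than something a real server would normally do.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a stream socket, bind it to the given
 * port on any local address and start listening.
 * Returns the listening descriptor, or -1 on failure. */
int make_listener(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int sd = socket(AF_INET, SOCK_STREAM, 0);

    if (sd == -1)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);                /* network byte order */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(sd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(sd, backlog) == -1) {
        close(sd);
        return -1;
    }
    return sd;
}
```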
Incoming connections accept()
Once the listen() call has returned the accept() call should be issued. This will block until a connection request is received from a remote host. The prototype is
    #include <sys/types.h>
    #include <sys/socket.h>
    int accept(int s, struct sockaddr *addr, int *addrlen);
s is the socket number.
addr points to a struct sockaddr that will be filled in with the address of the remote (calling) system.
addrlen points to a location that will be filled in with the amount of significant information in the remote address. Initially it should specify the size of the space set aside for the incoming address.
The return value of the call is the number of a new socket descriptor that should be used for all subsequent communication with the remote host. You can, and should, carry on listening on the original socket number. There is no way of rejecting a connection request; you must accept it and then close() it.
Receiving Data
There are a variety of functions that may be used to receive incoming messages.
read() may be used in exactly the same way as for reading from files. There are some complications with non-blocking reads. This call may only be used with SOCK_STREAM type sockets.
readv() may be used in the same way as read() to read into several separate buffers. This is called a scatter read. This call may only be used with SOCK_STREAM type sockets.
recv() may be used in the same way as read(). The prototype is
    #include <sys/types.h>
    #include <sys/socket.h>
    int recv(int s, char *buff, int len, int flags);
buff is the address of a buffer area.
len is the size of the buffer.
flags is formed by ORing MSG_OOB and MSG_PEEK, which allow receipt of out of band data and peeking at the incoming data respectively. This call may only be used with SOCK_STREAM type sockets.
recvfrom() and recvmsg() may be used with all types of socket. recvfrom() is similar to recv() with extra parameters specifying the remote system address and recvmsg() receives into a message structure.
In all cases the return value is the number of bytes received or -1. The value -1 indicates an error.
It is tempting, and a common beginner's error, to think that data sent using a single write() to a socket can be read at the other end using a single read(). There is no guarantee that this will happen, data may well arrive in "dribs and drabs". If the application is a network server, it is not possible to impose any constraint on clients to write the data using a single call to write().
The following code shows how to receive a message whose termination is indicated by the character pair CR+LF (a common internet convention).
    char buff[BUFSIZ+1];    /* +1 for string terminator */
    char *bptr;
    char *cp;
    int i;

    bptr = buff;
    do {
        i = read(sd, bptr, BUFSIZ-(bptr-buff));
        if(i <= 0) break;
        bptr += i;
        *bptr = '\0';       /* make what we've got into a string */
        if((cp = strstr(buff, "\r\n")) != NULL) {
            *cp = '\0';
            break;
        }
        if(bptr >= buff+BUFSIZ) {
            /* message too long */
        }
    } while(1);
The code also converts the received message into a string. If part of the next message is received in the same "chunk" of data as the end of the current message, this code is likely to lose the initial part of the next message.
Another common requirement of network programming is to implement a time out if a read() call does not return in a defined time. [An alternative technique is to make the socket non-blocking.] This is normally done using the alarm() system call in conjunction with a signal catching routine. Unfortunately on many operating systems (such as Solaris), after catching the signal, the system call is re-issued. If this happens with time-outs the program will loop indefinitely. The problem can be overcome either by using the setjmp() and longjmp() routines or by using sigaction() and sigsetops to modify the behaviour associated with the signal. Here's some sample Solaris code
    void (*func)() = timeout;
    struct sigaction action;

    action.sa_handler = func;
    action.sa_flags = 0;                /* ensures that SA_RESTART is NOT set */
    sigemptyset(&(action.sa_mask));     /* block no extra signals in the handler */
    sigaction(SIGALRM, &action, NULL);
    alarm(TIMEOUT);
The sigaction() call uses the information in action to define the handling of the SIGALRM signal.
Sending Data
There are a variety of functions that may be used to send outgoing messages.
write() may be used in exactly the same way as it is used to write to files. This call may only be used with SOCK_STREAM type sockets.
writev() is similar to write() except that it writes from a set of buffers. This is called a gather write. This call may only be used with SOCK_STREAM type sockets.
send() may be used in the same way as write(). The prototype is
    #include <sys/types.h>
    #include <sys/socket.h>
    int send(int s, char *msg, int len, int flags);
s is the socket number.
msg and len specify the buffer holding the text of the message.
flags may be formed by ORing MSG_OOB and MSG_DONTROUTE. The latter is only useful for debugging. This call may only be used with SOCK_STREAM type sockets.
sendto() can be used to send messages to sockets of any type and sendmsg() sends a message held in a message structure.
In all cases the return value is the number of bytes sent or -1.
Multiple Server Sessions
If sockets programming is being used to provide a server facility it is important to ensure that multiple simultaneous sessions are handled correctly. Once a connection request has been accepted the program will be engaged in handling the associated dialogue. Further connection requests will be held until the listener program gets round to issuing the accept() call again, so only one client can be handled at a time.
There are several ways of handling this issue. Under Unix the simplest solution is to fork a separate process to handle the client/server dialogue once a connection request has been received.
Here's an outline of the basic code.
    listen(sd, 2);      /* at most 2 pending connections */
    do {
        nsd = accept(sd, &(work.s), &addrlen);
        pid = fork();
        if(pid == 0) {
            /* Child process handles the dialogue using descriptor 'nsd' */
            close(nsd);
            exit(0);    /* end of child process */
        }
        else
            close(nsd); /* parent won't be using 'nsd' */
    } while(1);
The overheads associated with forking a separate process every time a client connects can be significant. Alternative approaches use either multiplexing or multithreading.
Multiplexing requires that all the code be capable of talking on several different sockets simultaneously. Typically there will be an array of descriptors and it is the responsibility of the program to keep track of the state of the dialogue associated with each active socket. The poll() function can be used to check for new connection requests. Multiplexing is the most complex and most efficient way of handling multiple dialogues.
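The core of a multiplexing server is a call that reports which descriptors need attention. The sketch below uses select(), described earlier in these notes, rather than poll(). The helper name wait_readable() is hypothetical. A listening socket shows up as readable when a connection request is waiting to be accepted.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: wait up to `seconds` for any of the `count`
 * descriptors in sds[] to become readable. Readable descriptors are
 * copied to ready[]. The return value is how many there are,
 * 0 on time-out, or -1 on error. */
int wait_readable(const int *sds, int count, int *ready, long seconds)
{
    fd_set readfds;
    struct timeval timeout;
    int i, n, found = 0, maxfd = -1;

    FD_ZERO(&readfds);
    for (i = 0; i < count; i++) {
        FD_SET(sds[i], &readfds);
        if (sds[i] > maxfd)
            maxfd = sds[i];
    }
    timeout.tv_sec = seconds;
    timeout.tv_usec = 0;

    /* First argument is one more than the largest descriptor. */
    n = select(maxfd + 1, &readfds, NULL, NULL, &timeout);
    if (n <= 0)
        return n;
    for (i = 0; i < count; i++)
        if (FD_ISSET(sds[i], &readfds))
            ready[found++] = sds[i];
    return found;
}
```

A server loop would call this repeatedly, issuing accept() when the listening descriptor is reported and read() for each client descriptor that is reported.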
Multithreading is in many ways similar to forking a separate process for each dialogue with the exception that thread creation overheads are much lower than process creation overheads. Synchronisation between threads is usually easier than synchronisation between processes.
Miscellaneous Routines
There are various useful miscellaneous routines.
bcmp(), bcopy() and bzero() (see bstring(3)) manipulate blocks of memory. They are equivalent to the ANSI functions memcmp(), memcpy() and memset().
htonl(), htons(), ntohl() and ntohs() convert 32 bit and 16 bit integers to/from host byte order and network byte order. They should always be used in a heterogeneous environment. On SUN systems they are null (do nothing) macros but on systems such as VAXes and PCs they convert local byte order to/from network order. See byteorder(3N).
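The effect of the byte order routines can be seen by looking at the bytes of a converted port number. In the sketch below the helper name port_to_wire() is hypothetical. Whatever the byte order of the host, the most significant byte comes first in network order.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a port number in out[] exactly as it
 * would appear on the network (most significant byte first). */
void port_to_wire(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);     /* host order to network order */

    memcpy(out, &net, sizeof net);
}
```

For port 8080 (hexadecimal 1F90) the two bytes are 0x1F followed by 0x90.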
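The routines inet_addr(), inet_network(), inet_makeaddr(), inet_lnaof() and inet_ntoa() manipulate Internet addresses. For example inet_addr() converts a dotted decimal string to an Internet address and inet_ntoa() converts an Internet address to a string. See inet(3N).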
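A short sketch of converting in both directions is given below. The helper name roundtrip_address() is hypothetical and outsz is assumed to be at least 1. Note that inet_addr() signals failure with INADDR_NONE, which is also the value of the valid address 255.255.255.255, so that address cannot be handled this way.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: parse dotted decimal text and format it back.
 * Returns 0 on success, -1 if text is not a valid address. */
int roundtrip_address(const char *text, char *out, size_t outsz)
{
    struct in_addr addr;

    addr.s_addr = inet_addr(text);
    if (addr.s_addr == INADDR_NONE)
        return -1;
    /* inet_ntoa() returns a static buffer: copy it before reuse. */
    strncpy(out, inet_ntoa(addr), outsz - 1);
    out[outsz - 1] = '\0';
    return 0;
}
```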
The library routines gethostbyaddr() and gethostbyname() can be used for conversions between Internet addresses and DNS names. See gethostent(3N).
The function getpeername() determines the address of the remote system on a socket. This should be used in conjunction with gethostbyaddr() to determine the DNS name of the remote system.
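The getpeername() step might be sketched as follows. The helper name peer_address() is hypothetical and outsz is assumed to be at least 1. The sketch stops at the dotted decimal form of the address. Turning that into a DNS name would be a further library call, which may block while the name service is consulted.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: report the remote end of connected socket sd
 * as a dotted decimal string and a port number in host byte order.
 * Returns 0 on success, -1 on error or for a non-Internet socket. */
int peer_address(int sd, char *out, size_t outsz, unsigned short *port)
{
    struct sockaddr_in peer;
    socklen_t len = sizeof peer;

    if (getpeername(sd, (struct sockaddr *)&peer, &len) == -1)
        return -1;
    if (peer.sin_family != AF_INET)
        return -1;
    strncpy(out, inet_ntoa(peer.sin_addr), outsz - 1);
    out[outsz - 1] = '\0';
    *port = ntohs(peer.sin_port);
    return 0;
}
```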