Network Sockets
Network Sockets
- Details
- Category: Network Technology
- Published on Wednesday, 17 December 2008 09:26
- Written by Administrator
- Hits: 2750
Sockets represent a single connection between two network applications. These two applications nominally run on different computers, but sockets can also be used for inter process communication on a single computer. Applications can create multiple sockets for communicating with each other. Sockets are bidirectional, meaning that either side of the connection is capable of both sending and receiving data. Programming libraries like Winsock hide many of the low-level details of socket programming. Sockets are widely used in network programming.
It is an end-point of a bidirectional process-to-process communication flow across an IP based network, such as the Internet. Each socket is mapped to an application process or thread. A socket is an interface between an application process or thread and the TCP/IP protocol stack provided by the operating system.
An Internet socket is identified by the operating system as a unique combination of the following:
Ø Protocol (TCP, UDP or raw IP)
Ø Local IP address
Ø Local port number
Ø Remote IP address (Only for established TCP sockets)
Ø Remote port number (Only for established TCP sockets)
The operating system forwards incoming IP data packets to the corresponding application process by extracting the above socket address information from the IP, UDP and TCP headers.
A somewhat simplified definition occurring in the literature follows: "The combination of an IP address and a port number is referred to as a socket." Communicating local and remote sockets are called socket pairs.
On UNIX-like operating systems the netstat -an command-line tool lists all currently established sockets. On some system, netstat -b lists sockets created by application programs.
There are three Internet socket types:
1. Datagram sockets, also known as connectionless sockets, which use UDP
2. Stream sockets, also known as connection-oriented sockets, which use TCP or SCTP.
3. Raw sockets (or Raw IP sockets), typically available in routers and other network equipment. Here the transport layer is bypassed, and the packet headers are not stripped off, but are accessible to the application. Application examples are ICMP, IGMP and OSPF.
(There are also non-Internet sockets, implemented over other transport protocols, such as SNA. See also UNIX domain sockets (UDS), for internal inter-process communication.)
Network Equipment and Sockets
Network equipment such as routers and switches traditionally do not deal with the socket identifiers of the routed or switched data. However, stateful network firewalls and Network Address Translation proxy servers automatically keep track of all active socket pairs, UDP as well as TCP, based on certain time-out settings. Also in fair queuing, layer 3 switching and Quality of Service support in routers, packet flows may be identified by extracting information about the socket pairs.
Raw sockets are typically available in network equipment, and used for routing protocols such as IGMP and OSPF, and in ICMP.
Windows Sockets
The Windows Sockets specification defines a network programming interface for Microsoft Windows which is based on the "socket" paradigm popularized in the Berkeley Software Distribution (BSD) from the University of California at Berkeley. It encompasses both familiar Berkeley socket style routines and a set of Windows-specific extensions designed to allow the programmer to take advantage of the message-driven nature of Windows.
The Windows Sockets Specification is intended to provide a single API to which application developers can program and multiple network software vendors can conform. Furthermore, in the context of a particular version of Microsoft Windows, it defines a binary interface (ABI) such that an application written to the Windows Sockets API can work with a conformant protocol implementation from any network software vendor. This specification thus defines the library calls and associated semantics to which an application developer can program and which a network software vendor can implement.
Network software which conforms to this Windows Sockets specification will be considered "Windows Sockets Compliant". Suppliers of interfaces which are "Windows Sockets Compliant" shall be referred to as "Windows Sockets Suppliers". To be Windows Sockets Compliant, a vendor must implement 100% of this Windows Sockets specification.
Applications which are capable of operating with any "Windows Sockets Compliant" protocol implementation will be considered as having a "Windows Sockets Interface" and will be referred to as "Windows Sockets Applications".
This version of the Windows Sockets specification defines and documents the use of the API in conjunction with the Internet Protocol Suite (IPS, generally referred to as TCP/IP). Specifically, all Windows Sockets implementations support both stream (TCP) and datagram (UDP) sockets.
While the use of this API with alternative protocol stacks is not precluded (and is expected to be the subject of future revisions of the specification), such usage is beyond the scope of this version of the specification.
Berkeley Sockets
The Windows Sockets Specification has been built upon the Berkeley Sockets programming model which is the de facto standard for TCP/IP networking. It is intended to provide a high degree of familiarity for programmers who are used to programming with sockets in UNIX and other environments, and to simplify the task of porting existing sockets-based source code. The Windows Sockets API is consistent with release 4.3 of the Berkeley Software Distribution (4.3BSD).
Portions of the Windows Sockets specification are derived from material which is Copyright (c) 1982-1986 by the Regents of the University of California. All rights are reserved. The Berkeley Software License Agreement specifies the terms and conditions for redistribution.
Basic concepts
The basic building block for communication is the socket. A socket is an endpoint of communication to which a name may be bound. Each socket in use has a type and an associated process. Sockets exist within communication domains. A communication domain is an abstraction introduced to bundle common properties of threads communicating through sockets. Sockets normally exchange data only with sockets in the same domain (it may be possible to cross domain boundaries, but only if some translation process is performed). The Windows Sockets facilities support a single communication domain: the Internet domain, which is used by processes which communicate using the Internet Protocol Suite. (Future versions of this specification may include additional domains.)
Sockets are typed according to the communication properties visible to a user. Applications are presumed to communicate only between sockets of the same type, although there is nothing that prevents communication between sockets of different types should the underlying communication protocols support this.
Two types of sockets currently are available to a user. A stream socket provides for the bi-directional, reliable, sequenced, and unduplicated flow of data without record boundaries.
A datagram socket supports bi-directional flow of data which is not promised to be sequenced, reliable, or unduplicated. That is, a process receiving messages on a datagram socket may find messages duplicated, and, possibly, in an order different from the order in which it was sent. An important characteristic of a datagram socket is that record boundaries in data are preserved. Datagram sockets closely model the facilities found in many contemporary packet switched networks such as Ethernet.
Client-server model
The most commonly used paradigm in constructing distributed applications is the client/server model. In this scheme client applications request services from a server application. This implies an asymmetry in establishing communication between the client and server.
The client and server require a well-known set of conventions before service may be rendered (and accepted). This set of conventions comprises a protocol which must be implemented at both ends of a connection. Depending on the situation, the protocol may be symmetric or asymmetric. In a symmetric protocol, either side may play the master or slave roles. In an asymmetric protocol, one side is immutably recognized as the master, with the other as the slave. An example of a symmetric protocol is the TELNET protocol used in the Internet for remote terminal emulation. An example of an asymmetric protocol is the Internet file transfer protocol, FTP. No matter whether the specific protocol used in obtaining a service is symmetric or asymmetric, when accessing a service there is a ``client process'' and a ``server process''.
A server application normally listens at a well-known address for service requests. That is, the server process remains dormant until a connection is requested by a client's connection to the server's address. At such a time the server process ``wakes up'' and services the client, performing whatever appropriate actions the client requests of it. While connection-based services are the norm, some services are based on the use of datagram sockets.
Out-of-band data
Note: The following discussion of out-of-band data, also referred to as TCP Urgent data, follows the model used in the Berkeley software distribution. Users and implementors should be aware of the fact that there are at present two conflicting interpretations of RFC 793 (in which the concept is introduced), and that the implementation of out-of-band data in the Berkeley Software Distribution does not conform to the Host Requirements laid down in RFC 1122. To minimize interoperability problems, applications writers are advised not to use out-of-band data unless this is required in order to interoperate with an existing service. Windows Sockets suppliers are urged to document the out-of-band semantics (BSD or RFC 1122) which their product implements. It is beyond the scope of this specification to mandate a particular set of semantics for out-of-band data handling.
The stream socket abstraction includes the notion of ``out of band'' data. Out-of-band data is a logically independent transmission channel associated with each pair of connected stream sockets. Out-of-band data is delivered to the user independently of normal data. The abstraction defines that the out-of-band data facilities must support the reliable delivery of at least one out-of-band message at a time. This message may contain at least one byte of data, and at least one message may be pending delivery to the user at any one time. For communications protocols which support only in-band signaling (i.e. the urgent data is delivered in sequence with the normal data), the system normally extracts the data from the normal data stream and stores it separately. This allows users to choose between receiving the urgent data in order and receiving it out of sequence without having to buffer all the intervening data. It is possible to ``peek'' at out-of-band data.
An application may prefer to process out-of-band data "in-line", as part of the normal data stream. This is achieved by setting the socket option SO_OOBINLINE (see set sockopt()). In this case, the application may wish to determine whether any of the unread data is "urgent" (the term usually applied to in-line out-of-band data). To facilitate this, the Windows Sockets implementation will maintain a logical "mark" in the data stream to indicate the point at which the out-of-band data was sent. An application can use the SIOCATMARK ioctlsocket() command to determine whether there is any unread data preceding the mark. For example, it might use this to resynchronize with its peer by ensuring that all data up to the mark in the data stream is discarded when appropriate.
The WSAAsyncSelect() routine is particularly well suited to handling notification of the presence of out-of-band-data.
Broadcasting
By using a datagram socket, it is possible to send broadcast packets on many networks supported by the system. The network itself must support broadcast: the system provides no simulation of broadcast in software. Broadcast messages can place a high load on a network, since they force every host on the network to service them. Consequently, the ability to send broadcast packets has been limited to sockets which are explicitly marked as allowing broadcasting. Broadcast is typically used for one of two reasons: it is desired to find a resource on a local network without prior knowledge of its address, or important functions such as routing require that information be sent to all accessible neighbors.
The destination address of the message to be broadcast depends on the network(s) on which the message is to be broadcast. The Internet domain supports a shorthand notation for broadcast on the local network, the address INADDR_BROADCAST. Received broadcast messages contain the senders address and port, as datagram sockets must be bound before use.
Some types of network support the notion of different types of broadcast. For example, the IEEE 802.5 token ring architecture supports the use of link-level broadcast indicators, which control whether broadcasts are forwarded by bridges. The Windows Sockets specification does not provide any mechanism whereby an application can determine the type of underlying network, nor any way to control the semantics of broadcasting.
Byte Ordering
The Intel byte ordering is like that of the DEC VAX, and therefore differs from the Internet and 68000-type processor byte ordering. Thus care must be taken to ensure correct orientation.
Any reference to IP addresses or port numbers passed to or from a Windows Sockets routine must be in network order. This includes the IP address and port fields of a struct sockaddr_in (but not the sin_family field).
Consider an application which normally contacts a server on the TCP port corresponding to the "time" service, but which provides a mechanism for the user to specify that an alternative port is to be used. The port number returned by getservbyname() is already in network order, which is the format required constructing an address, so no translation is required. However if the user elects to use a different port, entered as an integer, the application must convert this from host to network order (using the htons() function) before using it to construct an address. Conversely, if the application wishes to display the number of the port within an address (returned via, e.g., getpeername()), the port number must be converted from network to host order (using ntohs()) before it can be displayed.
Since the Intel and Internet byte orders are different, the conversions described above are unavoidable. Application writers are cautioned that they should use the standard conversion functions provided as part of the Windows Sockets API rather than writing their own conversion code, since future implementations of Windows Sockets are likely to run on systems for which the host order is identical to the network byte order. Only applications which use the standard conversion functions are likely to be portable.
Sample Code for Socket Programming using C
UDP Server Side
/* udpServer.c */ #include <sys/types.h>#include <sys/socket.h>#include <netinet/in.h>#include <arpa/inet.h>#include <netdb.h>#include <stdio.h>#include <unistd.h> /* close() */#include <string.h> /* memset() */ #define LOCAL_SERVER_PORT 1500#define MAX_MSG 100 int main(int argc, char *argv[]) { int sd, rc, n, cliLen; struct sockaddr_in cliAddr, servAddr; char msg[MAX_MSG]; /* socket creation */ sd=socket(AF_INET, SOCK_DGRAM, 0); if(sd<0) { printf("%s: cannot open socket \n",argv[0]); exit(1); } /* bind local server port */ servAddr.sin_family = AF_INET; servAddr.sin_addr.s_addr = htonl(INADDR_ANY); servAddr.sin_port = htons(LOCAL_SERVER_PORT); rc = bind (sd, (struct sockaddr *) &servAddr,sizeof(servAddr)); if(rc<0) { printf("%s: cannot bind port number %d \n", argv[0], LOCAL_SERVER_PORT); exit(1); } printf("%s: waiting for data on port UDP %u\n", argv[0],LOCAL_SERVER_PORT); /* server infinite loop */ while(1) { /* init buffer */ memset(msg,0x0,MAX_MSG); /* receive message */ cliLen = sizeof(cliAddr); n = recvfrom(sd, msg, MAX_MSG, 0, (struct sockaddr *) &cliAddr, &cliLen); if(n<0) { printf("%s: cannot receive data \n",argv[0]); continue; } /* print received message */ printf("%s: from %s:UDP%u : %s \n", argv[0],inet_ntoa(cliAddr.sin_addr), ntohs(cliAddr.sin_port),msg); }/* end of server infinite loop */ return 0; } - Prev
- Next >>

