IRC Server

This repo was made for the ft_irc project in the 42 cursus.

Overview

In a typical client-server scenario, applications communicate using sockets as follows:

Each application creates a socket. A socket is the "apparatus" that allows communication, and both applications require one.
The server binds its socket to a well-known address (name) so that clients can locate it.

A socket is created using the socket() system call, which returns a file descriptor used to refer to the socket in subsequent system calls:

fd = socket(domain, type, protocol);

Communication domains

Sockets exist in a communication domain, which determines:

the method of identifying a socket (i.e., the format of a socket "address"); and
the range of communication (i.e., either between applications on the same host or between applications on different hosts connected via a network).

Modern operating systems support at least the following domains:

The UNIX (AF_UNIX) domain allows communication between applications on the same host. (POSIX.1g used the name AF_LOCAL as a synonym for AF_UNIX, but this is not used in SUSv3.)
The IPv4 (AF_INET) domain allows communication between applications running on hosts connected via an Internet Protocol version 4 (IPv4) network.
The IPv6 (AF_INET6) domain allows communication between applications running on hosts connected via an Internet Protocol version 6 (IPv6) network. Although IPv6 is designed as the successor to IPv4, the latter protocol is currently still the most widely used.

Socket domains

Domain	Communication performed	Communication between applications	Address format	Address structure
AF_UNIX	within kernel	on same host	pathname	sockaddr_un
AF_INET	via IPv4	on hosts connected via an IPv4 network	32-bit IPv4 address + 16-bit port number	sockaddr_in
AF_INET6	via IPv6	on hosts connected via an IPv6 network	128-bit IPv6 address + 16-bit port number	sockaddr_in6

Socket types

Every sockets implementation provides at least two types of sockets: stream and datagram. These socket types are supported in both the UNIX and the Internet domains. The next table summarizes the properties of these socket types.

Socket types and their properties

Property	Stream	Datagram
Reliable delivery?	Y	N
Message boundaries preserved?	N	Y
Connection-oriented?	Y	N

Stream sockets (SOCK_STREAM) provide a reliable, bidirectional, byte-stream communication channel. By the terms in this description, we mean the following:

Reliable means that we are guaranteed that either the transmitted data will arrive intact at the receiving application, exactly as it was transmitted by the sender (assuming that neither the network link nor the receiver crashes), or that we’ll receive notification of a probable failure in transmission.
Bidirectional means that data may be transmitted in either direction between two sockets.
Byte-stream means that, as with pipes, there is no concept of message boundaries.

A stream socket is similar to using a pair of pipes to allow bidirectional communication between two applications, with the difference that (Internet domain) sockets permit communication over a network.

Stream sockets operate in connected pairs. For this reason, stream sockets are described as connection-oriented.

Datagram sockets (SOCK_DGRAM) allow data to be exchanged in the form of messages called `datagrams`.

With datagram sockets, message boundaries are preserved, but data transmission is not reliable. Messages may arrive out of order, be duplicated, or not arrive at all. Datagram sockets are an example of the more generic concept of a connectionless socket. Unlike a stream socket, a datagram socket doesn't need to be connected to another socket in order to be used. In the Internet domain, datagram sockets employ the User Datagram Protocol (UDP), and stream sockets (usually) employ the Transmission Control Protocol (TCP).
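The difference in message boundaries can be observed with a small sketch. It assumes a UNIX domain socket pair created with socketpair(), a call that is not covered in these notes; the helper name first_read_size() is made up for this illustration.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sends "abc" and then "def" over a connected AF_UNIX
 * socket pair of the given type (SOCK_STREAM or SOCK_DGRAM) and returns how
 * many bytes the first read() on the other end delivers, or -1 on error. */
ssize_t first_read_size(int type)
{
    int sv[2];
    char buf[64];
    ssize_t n;

    if (socketpair(AF_UNIX, type, 0, sv) == -1)
        return -1;

    /* Two separate 3-byte writes on the same end. */
    if (write(sv[0], "abc", 3) != 3 || write(sv[0], "def", 3) != 3) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* One read with a buffer large enough for both messages. */
    n = read(sv[1], buf, sizeof(buf));

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With SOCK_DGRAM the read returns exactly one 3-byte message. With SOCK_STREAM the two writes are just bytes in a stream, so the same read may return anything from 3 to 6 bytes.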

Socket system calls:

The key socket system calls are the following:

socket() : creates a new socket.
bind() : binds a socket to an address. Usually, a server employs this call to bind its socket to a well-known address so that clients can locate the socket.
listen() : allows a stream socket to accept incoming connections from other sockets.
accept() : accepts a connection from a peer application on a listening stream socket, and optionally returns the address of the peer socket.
connect() : establishes a connection with another socket.

Socket I/O can be performed using the conventional read() and write() system calls, or using a range of socket-specific system calls (e.g., send(), recv(), sendto(), and recvfrom()). By default, these system calls block if the I/O operation can't be completed immediately. Nonblocking I/O is also possible, by using the fcntl() F_SETFL operation to enable the O_NONBLOCK open file status flag.
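As a minimal sketch of the fcntl() approach mentioned above, the following helper (the name set_nonblocking() is an assumption, not part of any API) reads the current open file status flags and adds O_NONBLOCK to them:

```c
#include <fcntl.h>

/* Hypothetical helper: enables O_NONBLOCK on fd while preserving the other
 * open file status flags. Returns 0 on success, or -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);     /* fetch the current flags */

    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return 0;
}
```

Fetching the flags first matters: passing only O_NONBLOCK to F_SETFL would silently clear flags such as O_APPEND.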

Creating a Socket: socket()

The socket() system call creates a new socket.

int socket(int domain, int type, int protocol);

- domain : specifies the communication domain for the socket (e.g., AF_UNIX, AF_INET, AF_INET6).
- type : specifies the socket type (e.g., SOCK_STREAM, SOCK_DGRAM).
- protocol : is normally specified as 0, which selects the default protocol for the given domain and type. A nonzero value is needed only for some socket types; for example, IPPROTO_RAW for raw sockets (SOCK_RAW).
return : returns a file descriptor used to refer to the newly created socket, or -1 on error.

Starting with kernel 2.6.27, Linux provides a second use for the type argument, by allowing two nonstandard flags to be ORed with the socket type. The SOCK_CLOEXEC flag causes the kernel to enable the close-on-exec flag (FD_CLOEXEC) for the new file descriptor. This flag is useful for the same reasons as the open() O_CLOEXEC flag. The SOCK_NONBLOCK flag causes the kernel to set the O_NONBLOCK flag on the underlying open file description, so that future I/O operations on the socket will be nonblocking. This saves additional calls to fcntl() to achieve the same result.
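A possible way to use these flags while staying portable is sketched below. The function name is hypothetical, and the fallback branch is an assumption about how one might handle systems that lack the two Linux flags.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns a nonblocking, close-on-exec TCP/IPv4
 * socket, or -1 on error. */
int open_tcp_socket_nonblocking(void)
{
#if defined(SOCK_NONBLOCK) && defined(SOCK_CLOEXEC)
    /* Linux 2.6.27 and later: socket() itself applies both flags. */
    return socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
#else
    /* Fallback: create the socket, then set the same flags with fcntl(). */
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int flags;

    if (fd == -1)
        return -1;
    flags = fcntl(fd, F_GETFL);
    if (flags == -1 ||
        fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ||
        fcntl(fd, F_SETFD, FD_CLOEXEC) == -1) {
        close(fd);
        return -1;
    }
    return fd;
#endif
}
```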

Binding a Socket to an Address: bind()

The bind() system call binds a socket to an address.

int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);

- sockfd : is a file descriptor obtained from a previous call to socket().
- addr : is a pointer to a structure specifying the address to which this socket is to be bound. The type of structure passed in this argument depends on the socket domain.
- addrlen : specifies the size of the address structure. The socklen_t data type used for the addrlen argument is an integer type specified by SUSv3.
return : returns 0 on success, or -1 on error.

Typically, we bind a server's socket to a well-known address, that is, a fixed address that is known in advance to client applications that need to communicate with that server.

Generic Socket Address Structures: struct sockaddr

The addr and addrlen arguments to bind() require some further explanation. Looking at the socket domains table above, we see that each socket domain uses a different address format. For example, UNIX domain sockets use pathnames, while Internet domain sockets use the combination of an IP address plus a port number. For each socket domain, a different structure type is defined to store a socket address. However, because system calls such as bind() are generic to all socket domains, they must be able to accept address structures of any type. In order to permit this, the socket API defines a generic address structure, struct sockaddr. The only purpose for this type is to cast the various domain-specific address structures to a single type for use as arguments in the socket system calls. The sockaddr structure is typically defined as follows:

struct sockaddr {
	sa_family_t sa_family;   /*Address family (AF_* constant)*/
	char        sa_data[14]; /*Socket address (size varies according to socket domain) */
};

This structure serves as a template for all of the domain-specific address structures. Each of these address structures begins with a family field corresponding to the sa_family field of the sockaddr structure. (The sa_family_t data type is an integer type specified in SUSv3.) The value in the family field is sufficient to determine the size and format of the address stored in the remainder of the structure.

Some UNIX implementations also define an additional field in the sockaddr structure, sa_len, that specifies the total size of the structure. SUSv3 doesn't require this field, and it is not present in the Linux implementation of the sockets API.
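To make the cast concrete, here is a minimal sketch for the IPv4 domain. The helper name bind_ipv4_any() is an assumption. The fields of sockaddr_in, the INADDR_ANY wildcard address, and the htons()/htonl() byte-order conversions are standard, although these notes do not describe them.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: binds sockfd to the given port on all local IPv4
 * addresses. Port 0 asks the kernel to choose a free port.
 * Returns 0 on success, or -1 on error (as bind() does). */
int bind_ipv4_any(int sockfd, unsigned short port)
{
    struct sockaddr_in addr;            /* domain-specific structure */

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;          /* corresponds to sa_family */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);        /* network byte order */

    /* Cast to the generic type; addrlen is the size of the real structure. */
    return bind(sockfd, (struct sockaddr *)&addr, sizeof(addr));
}
```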

Stream Sockets

The operation of stream sockets can be explained by analogy with the telephone system:

The socket() system call, which creates a socket, is the equivalent of installing a telephone. In order for two applications to communicate, each of them must create a socket.
Communication via a stream socket is analogous to a telephone call. One application must connect its socket to another application's socket before communication can take place. Two sockets are connected as follows:
1. One application calls bind() in order to bind the socket to a well-known address, and then calls listen() to notify the kernel of its willingness to accept incoming connections. This step is analogous to having a known telephone number and ensuring that our telephone is turned on so that people can call us.
2. The other application establishes the connection by calling connect(), specifying the address of the socket to which the connection is to be made. This is analogous to dialing someone's telephone number.
3. The application that called listen() then accepts the connection using accept(). This is analogous to picking up the telephone when it rings. If the accept() is performed before the peer application calls connect(), then the accept() blocks (waiting by the telephone).
Once a connection has been established, data can be transmitted in both directions between the applications (analogous to a two-way telephone conversation) until one of them closes the connection using close(). Communication is performed using the conventional read() and write() system calls or via a number of socket-specific system calls (such as send() and recv()) that provide additional functionality.

Active sockets (client) and passive sockets (server):

Stream sockets are often distinguished as being either active or passive:

By default, a socket that has been created using socket() is active. An active socket can be used in a connect() call to establish a connection to a passive socket. This is referred to as performing an active open.
A passive socket (also called a listening socket) is one that has been marked to allow incoming connections by calling listen(). Accepting an incoming connection is referred to as performing a passive open.

In most applications that employ stream sockets, the server performs the passive open, and the client performs the active open.

Listening for Incoming Connections: listen()

The listen() system call marks the stream socket referred to by the file descriptor sockfd as passive. The socket will subsequently be used to accept connections from other (active) sockets.

int listen(int sockfd, int backlog);

- sockfd : is a file descriptor obtained from a previous call to socket().
- backlog : allows us to limit the number of pending connections. Connection requests up to this limit succeed immediately.
return : returns 0 on success, or -1 on error.

We can’t apply listen() to a connected socket—that is, a socket on which a connect() has been successfully performed or a socket returned by a call to accept().

To understand the purpose of the backlog argument, we first observe that the client may call connect() before the server calls accept(). This could happen, for example, because the server is busy handling some other client(s). This results in a pending connection.

The kernel must record some information about each pending connection request so that a subsequent accept() can be processed. The backlog argument allows us to limit the number of such pending connections. Connection requests up to this limit succeed immediately. (For TCP sockets, the story is a little different.) Further connection requests block until a pending connection is accepted (via accept()), and thus removed from the queue of pending connections.
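The idea of a pending connection can be tried out inside a single process. The sketch below is only an illustration under stated assumptions: it uses the IPv4 loopback address, lets the kernel pick the port, and recovers that port with getsockname(), a call not covered in these notes. The function name is hypothetical.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if a client connect() completed before the
 * server called accept(), 0 if it did not, or -1 on a setup error. */
int connect_before_accept(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd, cfd, sfd, connected;

    lfd = socket(AF_INET, SOCK_STREAM, 0);      /* listening socket */
    if (lfd == -1)
        return -1;
    cfd = socket(AF_INET, SOCK_STREAM, 0);      /* client socket */
    if (cfd == -1) {
        close(lfd);
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                          /* kernel picks a free port */

    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(lfd, 5) == -1 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) == -1) {
        close(lfd);
        close(cfd);
        return -1;
    }

    /* accept() has not been called yet, so this connection is queued. */
    connected = (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) == 0);

    /* Only now remove the pending connection from the queue. */
    if (connected) {
        sfd = accept(lfd, NULL, NULL);
        if (sfd != -1)
            close(sfd);
    }
    close(cfd);
    close(lfd);
    return connected;
}
```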

Accepting a Connection: accept()

The accept() system call accepts an incoming connection on the listening stream socket referred to by the file descriptor sockfd. If there are no pending connections when accept() is called, the call blocks until a connection request arrives.

int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);

The key point to understand about accept() is that it creates a new socket, and it is this new socket that is connected to the peer socket that performed the connect(). A file descriptor for the connected socket is returned as the function result of the accept() call. The listening socket (sockfd) remains open, and can be used to accept further connections. A typical server application creates one listening socket, binds it to a well-known address, and then handles all client requests by accepting connections via that socket.

The remaining arguments to accept() return the address of the peer socket (client socket). The addr argument points to a structure that is used to return the socket address. The type of this argument depends on the socket domain (as for bind()). The addrlen argument is a value-result argument. It points to an integer that, prior to the call, must be initialized to the size of the buffer pointed to by addr, so that the kernel knows how much space is available to return the socket address.

Upon return from accept(), this integer is set to indicate the number of bytes of data actually copied into the buffer.

If we are not interested in the address of the peer socket, then addr and addrlen should both be specified as NULL. (If desired, we can retrieve the peer's address later using the getpeername() system call.)

Connecting to a Peer Socket: connect()

The connect() system call connects the active socket referred to by the file descriptor sockfd to the listening socket whose address is specified by addr and addrlen.

int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
Returns 0 on success, or -1 on error.

The addr and addrlen arguments are specified in the same way as the corresponding arguments to bind().

If connect() fails and we wish to reattempt the connection, then SUSv3 specifies that the portable method of doing so is to close the socket, create a new socket, and reattempt the connection with the new socket.
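The portable retry described above might look like the sketch below. The function name and the max_attempts parameter are assumptions, and the example is limited to IPv4 stream sockets for brevity.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: tries up to max_attempts times to connect a new TCP
 * socket to addr. Returns the connected socket, or -1 if every attempt
 * failed. */
int connect_ipv4_with_retry(const struct sockaddr_in *addr, int max_attempts)
{
    int attempt;

    for (attempt = 0; attempt < max_attempts; attempt++) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        if (fd == -1)
            return -1;
        if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) == 0)
            return fd;                  /* connected */

        /* Do not reuse the failed socket: close it and start afresh. */
        close(fd);
    }
    return -1;
}
```

A real client would probably wait between attempts; the delay is left out here to keep the sketch short.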

I/O on Stream Sockets

A pair of connected stream sockets provides a bidirectional communication channel between the two endpoints.

The next figure shows what this looks like in the UNIX domain.

The semantics of I/O on connected stream sockets are similar to those for pipes:

To perform I/O, we use the read() and write() system calls (or the socket-specific send() and recv()). Since sockets are bidirectional, both calls may be used on each end of the connection.
A socket may be closed using the close() system call or as a consequence of the application terminating. Afterward, when the peer application attempts to read from the other end of the connection, it receives end-of-file (once all buffered data has been read). If the peer application attempts to write to its socket, it receives a SIGPIPE signal, and the system call fails with the error EPIPE. The usual way of dealing with this possibility is to ignore the SIGPIPE signal and find out about the closed connection via the EPIPE error.
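Both behaviours can be reproduced with a short sketch. It assumes a UNIX domain stream socket pair created with socketpair() (not covered in these notes) so that no network setup is needed; the function name is hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: closes one end of a connected stream socket pair, then
 * reads from and writes to the remaining end. Stores the read() result in
 * *nread and the errno of the failed write() (0 if it did not fail) in
 * *write_errno. Returns 0 on success, or -1 on a setup error. */
int peer_closed_demo(ssize_t *nread, int *write_errno)
{
    int sv[2];
    char c;

    /* Ignore SIGPIPE so that the write() below fails with EPIPE instead of
     * terminating the process. */
    if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    close(sv[1]);                       /* one end of the connection closes */

    *nread = read(sv[0], &c, 1);        /* no buffered data: end-of-file */

    *write_errno = 0;
    if (write(sv[0], "x", 1) == -1)
        *write_errno = errno;           /* expected to be EPIPE */

    close(sv[0]);
    return 0;
}
```

After a successful call, *nread should be 0 (end-of-file) and *write_errno should be EPIPE.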

Connection Termination: close()

The usual way of terminating a stream socket connection is to call close(). If multiple file descriptors refer to the same socket, then the connection is terminated when all of the descriptors are closed.

Suppose that, after we close a connection, the peer application crashes or otherwise fails to read or correctly process the data that we previously sent to it. In this case, we have no way of knowing that an error occurred. If we need to ensure that the data was successfully read and processed, then we must build some type of acknowledgement protocol into our application. This normally consists of an explicit acknowledgement message passed back to us from the peer.

I/O Multiplexing

I/O multiplexing allows us to simultaneously monitor multiple file descriptors to see if I/O is possible on any of them. We can perform I/O multiplexing using either of two system calls with essentially the same functionality. The first of these, select(), appeared along with the sockets API in BSD. This was historically the more widespread of the two system calls. The other system call, poll(), appeared in System V. Both select() and poll() are nowadays required by SUSv3.

We can use select() and poll() to monitor file descriptors for regular files, terminals, pseudoterminals, pipes, FIFOs, sockets, and some types of character devices.
Both system calls allow a process either to block indefinitely waiting for the file descriptors to become ready or to specify a timeout on the call.

The select() System Call

The select() system call blocks until one or more of a set of file descriptors becomes ready.

	#include <sys/time.h>
	#include <sys/select.h>
	int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
	Returns number of ready file descriptors, 0 on timeout, or -1 on error

The nfds, readfds, writefds, and exceptfds arguments specify the file descriptors that select() is to monitor. The timeout argument can be used to set an upper limit on the time for which select() will block.

File descriptor sets

The readfds, writefds, and exceptfds arguments are pointers to file descriptor sets, represented using the data type fd_set. These arguments are used as follows:

readfds : is the set of file descriptors to be tested to see if input is possible;
writefds : is the set of file descriptors to be tested to see if output is possible;
exceptfds : is the set of file descriptors to be tested to see if an exceptional condition has occurred.

The term exceptional condition is often misunderstood to mean that some sort of error condition has arisen on the file descriptor. This is not the case. An exceptional condition occurs in just two circumstances on Linux (other UNIX implementations are similar):

A state change occurs on a pseudoterminal slave connected to a master that is in packet mode
Out-of-band data is received on a stream socket

Typically, the fd_set data type is implemented as a bit mask. However, we don't need to know the details, since all manipulation of file descriptor sets is done via four macros: FD_ZERO(), FD_SET(), FD_CLR(), and FD_ISSET().

	#include <sys/select.h>
	void FD_ZERO(fd_set *fdset);
	void FD_SET(int fd, fd_set *fdset);
	void FD_CLR(int fd, fd_set *fdset);

	int FD_ISSET(int fd, fd_set *fdset);
		Returns true (1) if fd is in fdset, or false (0) otherwise

These macros operate as follows:

FD_ZERO() initializes the set pointed to by fdset to be empty.
FD_SET() adds the file descriptor fd to the set pointed to by fdset.
FD_CLR() removes the file descriptor fd from the set pointed to by fdset.
FD_ISSET() returns true if the file descriptor fd is a member of the set pointed to by fdset.

A file descriptor set has a maximum size, defined by the constant FD_SETSIZE. On Linux, this constant has the value 1024.

The readfds, writefds, and exceptfds arguments are all value-result. Before the call to select(), the fd_set structures pointed to by these arguments must be initialized (using FD_ZERO() and FD_SET()) to contain the set of file descriptors of interest. The select() call modifies each of these structures so that, on return, they contain the set of file descriptors that are ready. (Since these structures are modified by the call, we must ensure that we reinitialize them if we are repeatedly calling select() from within a loop.) The structures can then be examined using FD_ISSET().

If we are not interested in a particular class of events, then the corresponding fd_set argument can be specified as NULL.
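Because the sets have to be rebuilt before every call, it can help to keep the descriptors of interest in a plain array and fill the fd_set from it. The helper below is a sketch with a made-up name. It also computes the nfds argument, which must be one greater than the highest descriptor number in any of the sets.

```c
#include <sys/select.h>

/* Hypothetical helper: empties *set and adds every descriptor in
 * fds[0..count-1] to it. Returns the nfds value to pass to select()
 * (highest descriptor plus one), or -1 if a descriptor does not fit in an
 * fd_set. */
int build_fd_set(const int *fds, int count, fd_set *set)
{
    int i;
    int max_fd = -1;

    FD_ZERO(set);                       /* start from an empty set */
    for (i = 0; i < count; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;                  /* outside the range of fd_set */
        FD_SET(fds[i], set);
        if (fds[i] > max_fd)
            max_fd = fds[i];
    }
    return max_fd + 1;
}
```

Calling this helper at the top of each loop iteration restores the set that the previous select() call overwrote.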

The timeout argument

The timeout argument controls the blocking behavior of select(). It can be specified either as NULL, in which case select() blocks indefinitely, or as a pointer to a timeval structure:

	struct timeval {
		time_t		tv_sec;					/* Seconds */
		suseconds_t	tv_usec;				/* Microseconds (long int) */
	};

If both fields of timeout are 0, then select() doesn't block; it simply polls the specified file descriptors to see which ones are ready and returns immediately. Otherwise, timeout specifies an upper limit on the time for which select() is to wait.

Although the timeval structure affords microsecond precision, the accuracy of the call is limited by the granularity of the software clock.

When timeout is NULL, or points to a structure containing nonzero fields, select() blocks until one of the following occurs:

at least one of the file descriptors specified in readfds, writefds, or exceptfds becomes ready;
the call is interrupted by a signal handler;
the amount of time specified by timeout has passed.

Return value from select()

As its function result, select() returns one of the following:

A return value of -1 indicates that an error occurred. Possible errors include EBADF and EINTR.
- EBADF indicates that one of the file descriptors in readfds, writefds, or exceptfds is invalid (e.g., not currently open).
- EINTR indicates that the call was interrupted by a signal handler.
A return value of 0 means that the call timed out before any file descriptor became ready. In this case, each of the returned file descriptor sets will be empty.
A positive return value indicates that one or more file descriptors are ready. The return value is the number of ready descriptors. In this case, each of the returned file descriptor sets must be examined (using FD_ISSET()) in order to find out which I/O events occurred. If the same file descriptor is specified in more than one of readfds, writefds, and exceptfds, it is counted multiple times if it is ready for more than one event. In other words, select() returns the total number of file descriptors marked as ready in all three returned sets.
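The three kinds of return value can be handled as in the sketch below. The helper read_with_timeout() is hypothetical, and reporting a timeout through errno as ETIMEDOUT is a design choice made for this example, not something select() does itself.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: waits up to timeout_ms milliseconds for fd to become
 * readable, then reads up to len bytes into buf. Returns the read() result,
 * or -1 with errno set (ETIMEDOUT if nothing arrived in time). */
ssize_t read_with_timeout(int fd, void *buf, size_t len, long timeout_ms)
{
    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EBADF;
        return -1;
    }

    for (;;) {
        fd_set readfds;
        struct timeval tv;
        int ready;

        /* select() may modify its arguments, so rebuild them every pass.
         * Note that a retry therefore restarts the full timeout. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        ready = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (ready == -1) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            return -1;                  /* for example EBADF */
        }
        if (ready == 0) {
            errno = ETIMEDOUT;          /* timed out: the set is empty */
            return -1;
        }
        if (FD_ISSET(fd, &readfds))
            return read(fd, buf, len);  /* input is available */
    }
}
```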

The poll() System Call

The poll() system call performs a similar task to select(). The major difference between the two system calls lies in how we specify the file descriptors to be monitored. With select(), we provide three sets, each marked to indicate the file descriptors of interest. With poll(), we provide a list of file descriptors, each marked with the set of events of interest.

	#include <poll.h>
	int poll(struct pollfd fds[], nfds_t nfds, int timeout);
			Returns number of ready file descriptors, 0 on timeout, or -1 on error

The pollfd array (fds) and the nfds argument specify the file descriptors that poll() is to monitor. The timeout argument can be used to set an upper limit on the time for which poll() will block. We describe each of these arguments in detail below.

The pollfd array

The fds argument lists the file descriptors to be monitored by poll(). This argument is an array of pollfd structures, defined as follows:

	struct pollfd {
        int   fd;				/* File descriptor */
        short events;			/* Requested events bit mask */
        short revents;			/* Returned events bit mask */
	};

The nfds argument specifies the number of items in the fds array. The nfds_t data type used to type the nfds argument is an unsigned integer type.

The events and revents fields of the pollfd structure are bit masks. The caller initializes events to specify the events to be monitored for the file descriptor fd. Upon return from poll(), revents is set to indicate which of those events actually occurred for this file descriptor.

The table below lists the bits that may appear in the events and revents fields.

The first group of bits in this table (POLLIN, POLLRDNORM, POLLRDBAND, POLLPRI, and POLLRDHUP) are concerned with input events.

The next group of bits (POLLOUT, POLLWRNORM, and POLLWRBAND) are concerned with output events.

The third group of bits (POLLERR, POLLHUP, and POLLNVAL) are set in the revents field to return additional information about the file descriptor. If specified in the events field, these three bits are ignored.

The final bit (POLLMSG) is unused by poll() on Linux.

Bit-mask values for events and revents fields of the pollfd structure

Bit	Input in events?	Returned in revents?	Description
POLLIN	*	*	Data other than high-priority data can be read
POLLRDNORM	*	*	Equivalent to POLLIN
POLLRDBAND	*	*	Priority data can be read (unused on Linux)
POLLPRI	*	*	High-priority data can be read
POLLRDHUP	*	*	Shutdown on peer socket
POLLOUT	*	*	Normal data can be written
POLLWRNORM	*	*	Equivalent to POLLOUT
POLLWRBAND	*	*	Priority data can be written
POLLERR		*	An error has occurred
POLLHUP		*	A hangup has occurred
POLLNVAL		*	File descriptor is not open
POLLMSG			Unused on Linux (and unspecified in SUSv3)

The timeout argument

The timeout argument determines the blocking behavior of poll() as follows:

case 1: timeout = -1 : block until one of the file descriptors listed in the fds array is ready (as defined by the corresponding events field) or a signal is caught.
case 2: timeout = 0 : do not block - just perform a check to see which file descriptors are ready.
case 3: timeout > 0 : block for up to timeout milliseconds, until one of the file descriptors in fds is ready, or until a signal is caught.

As with select(), the accuracy of timeout is limited by the granularity of the software clock.
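A minimal sketch of a poll() call that honours these timeout cases is shown below. The helper drain_ready() is an assumption made for illustration; it simply discards whatever input it finds.

```c
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: polls fds[0..nfds-1] for up to timeout_ms
 * milliseconds (-1 blocks indefinitely, 0 does not block), then reads and
 * discards pending input from every descriptor reported as readable.
 * Returns the total number of bytes read, 0 if the call timed out (or only
 * end-of-file was seen), or -1 on error. */
ssize_t drain_ready(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    char scratch[512];
    ssize_t total = 0;
    nfds_t i;
    int ready = poll(fds, nfds, timeout_ms);

    if (ready <= 0)
        return ready;                   /* 0: timed out, -1: error */

    for (i = 0; i < nfds; i++) {
        if (fds[i].revents & POLLIN) {  /* input can be read without blocking */
            ssize_t n = read(fds[i].fd, scratch, sizeof(scratch));

            if (n == -1)
                return -1;
            total += n;
        }
    }
    return total;
}
```

The caller fills in fd and events (for example POLLIN) for each array entry; poll() only writes to revents, so the array does not need to be rebuilt between calls.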

Return value from poll()

As its function result, poll() returns one of the following :

-1 : indicates that an error occurred. One possible error is EINTR, indicating that the call was interrupted by a signal handler.
0 : means that the call timed out before any file descriptor became ready.
> 0 : indicates that one or more file descriptors are ready. The returned value is the number of pollfd structures in the fds array that have a nonzero revents field.

Note the slightly different meaning of a positive return value from select() and poll(). The select() system call counts a file descriptor multiple times if it occurs in more than one returned file descriptor set. The poll() system call returns a count of ready file descriptors, and a file descriptor is counted only once, even if multiple bits are set in the corresponding revents field.
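This difference in counting can be checked with a single descriptor that is ready for both reading and writing. The sketch below assumes a UNIX domain socket pair created with socketpair() (not covered in these notes) and uses a made-up function name.

```c
#include <poll.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: makes one socket both readable and writable, then asks
 * select() and poll() how many descriptors are ready. Stores the two return
 * values in *select_count and *poll_count.
 * Returns 0 on success, or -1 on a setup error. */
int compare_ready_counts(int *select_count, int *poll_count)
{
    int sv[2];
    fd_set readfds, writefds;
    struct timeval tv = {0, 0};         /* both fields 0: do not block */
    struct pollfd pfd;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (sv[1] >= FD_SETSIZE || write(sv[0], "x", 1) != 1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    /* sv[1] now has input pending and still has room for output. */

    FD_ZERO(&readfds);
    FD_ZERO(&writefds);
    FD_SET(sv[1], &readfds);
    FD_SET(sv[1], &writefds);
    *select_count = select(sv[1] + 1, &readfds, &writefds, NULL, &tv);

    pfd.fd = sv[1];
    pfd.events = POLLIN | POLLOUT;
    pfd.revents = 0;
    *poll_count = poll(&pfd, 1, 0);

    close(sv[0]);
    close(sv[1]);
    return 0;
}
```

select() should report 2 here, because the same descriptor is marked in two returned sets, while poll() should report 1, because only one pollfd entry has a nonzero revents field.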
