SIP is an application-layer control protocol that allows users to create, modify, and terminate sessions with one or more participants. It can be used to create two-party, multiparty, or multicast sessions that include Internet telephone calls, multimedia distribution, and multimedia conferences.

The Session Initiation Protocol (SIP) was first drafted within the IETF in 1996, but the first recognized standard, RFC 2543, was not published until 1999. SIP was revised over the following years and republished in 2002 as RFC 3261, which remains the core specification for SIP. These delays in the standards process slowed market adoption of SIP, which is one reason H.323 came to be regarded as the established VoIP connectivity standard.

Today, H.323 still accounts for the bulk of VoIP deployments in the service provider market for voice transit, especially for carrying voice calls internationally. H.323 is also widely used in room-based video conferencing systems and is the preferred protocol for IP-based video systems. More recently, SIP has become popular for use in instant messaging systems.

How it Works

Like HTTP or SMTP, SIP works at the application layer of the Open Systems Interconnection (OSI) communications model, the layer that provides communication services directly to applications. SIP can establish multimedia sessions or Internet telephony calls, and modify or terminate them. The protocol can also invite participants to unicast or multicast sessions that do not necessarily involve the initiator. Because SIP supports name mapping and redirection services, users can initiate and receive communications and services from any location, and networks can identify users wherever they are.
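The name mapping and redirection described above can be illustrated with a small sketch. The LocationService class and the redirect function are hypothetical names invented for this example, and the addresses are placeholders. A real deployment would fill the bindings from REGISTER requests and expire them over time, which this sketch leaves out. The 302 and 404 status codes are the ones RFC 3261 defines for "Moved Temporarily" and "Not Found".

```python
class LocationService:
    """Hypothetical in-memory map from a user's public address to current contacts."""

    def __init__(self):
        self._bindings = {}

    def register(self, address_of_record, contact):
        # Remember one more place where this user can currently be reached.
        contacts = self._bindings.setdefault(address_of_record, [])
        if contact not in contacts:
            contacts.append(contact)

    def lookup(self, address_of_record):
        # Return a copy so callers cannot mutate the stored bindings.
        return list(self._bindings.get(address_of_record, []))


def redirect(location, address_of_record):
    """Answer the way a redirect server might: point the caller at current contacts."""
    contacts = location.lookup(address_of_record)
    if not contacts:
        return 404, "Not Found", []
    return 302, "Moved Temporarily", contacts


location = LocationService()
location.register("sip:bob@example.com", "sip:bob@192.0.2.4")
print(redirect(location, "sip:bob@example.com"))
```

The caller keeps dialing the same public address while the stored contact changes, which is what lets the network find a user at any location.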

SIP is a request-response protocol, dealing with requests from clients and responses from servers. Participants are identified by SIP URIs (called SIP URLs in older documents), and requests can be sent over any transport protocol, such as UDP, TCP, or SCTP. SIP determines the end system that will be used for a given session, the communication media and their parameters, and the recipient’s response to the call. Once this has been done, SIP establishes the call parameters at both the caller’s and the recipient’s end and handles call transfer and termination.
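To make the request-response exchange concrete, here is a minimal sketch of what a request and a response look like on the wire. The function names build_invite and parse_status_line are invented for this example, the host names are placeholders, and the request carries no media description. The header set follows the mandatory fields listed in RFC 3261, but this is an illustration rather than a complete user agent.

```python
import uuid


def build_invite(caller, callee, local_host, transport="UDP"):
    """Return a minimal SIP INVITE request as text (no session description body)."""
    user = caller[len("sip:"):].split("@")[0]
    # RFC 3261 requires Via branch values to start with this magic cookie.
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]
    lines = [
        f"INVITE {callee} SIP/2.0",
        f"Via: SIP/2.0/{transport} {local_host};branch={branch}",
        "Max-Forwards: 70",
        f"To: <{callee}>",
        f"From: <{caller}>;tag={uuid.uuid4().hex[:8]}",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{user}@{local_host}>",
        "Content-Length: 0",
    ]
    # Header lines end with CRLF; an empty line closes the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_status_line(response):
    """Split the first line of a response into (version, status code, reason phrase)."""
    first_line = response.split("\r\n", 1)[0]
    version, code, reason = first_line.split(" ", 2)
    return version, int(code), reason


request = build_invite("sip:alice@atlanta.example.com",
                       "sip:bob@biloxi.example.com",
                       "pc33.atlanta.example.com")
print(request)
print(parse_status_line("SIP/2.0 180 Ringing\r\n\r\n"))
```

The text format is deliberately close to HTTP: a start line, header fields, and an optional body, with the server's answer carrying a numeric status code.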

Although SIP is roughly as old as H.323 as a session initiation protocol, it was not designed to address many of the problems found in legacy communication systems. In addition, because H.323 has long been the industry standard, many more people are familiar with it. SIP has been marketed as easy to use and to debug, but in practice it involves as much complexity as any other VoIP standard.

SIP does appear to be easier to develop for and to troubleshoot, but these attributes do not make the protocol easier to use. Instead, they have led to a number of non-standard SIP variations and non-standard extensions.

Basic Usage

SIP can run on TCP, UDP, or SCTP, and it supports five facets of establishing and terminating multimedia communications:

  • It determines the end system that will be used for communication;
  • It determines the willingness of the called party to engage in communications;
  • It determines the media and its parameters;
  • It sets up the session, ‘ringing’ the called party and establishing session parameters on both ends;
  • It manages sessions, including transferring and terminating them, modifying session parameters, and invoking services.
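The facets above can be traced through the life of a single call as seen by the caller. The sketch below is a deliberately simplified state machine: the state names, event strings, and the run function are assumptions made for this example, and it is not the transaction state machine defined in RFC 3261. The methods (INVITE, ACK, BYE) and status codes (180 Ringing, 200 OK, 486 Busy Here) are standard.

```python
from enum import Enum


class CallState(Enum):
    IDLE = "idle"
    CALLING = "calling"          # INVITE sent, end system being located
    RINGING = "ringing"          # called party is being alerted
    ESTABLISHED = "established"  # session parameters agreed on both ends
    TERMINATED = "terminated"


# (current state, event) -> next state, from the caller's point of view.
TRANSITIONS = {
    (CallState.IDLE, "send INVITE"): CallState.CALLING,
    (CallState.CALLING, "recv 180"): CallState.RINGING,
    (CallState.CALLING, "recv 200"): CallState.ESTABLISHED,
    (CallState.RINGING, "recv 200"): CallState.ESTABLISHED,
    (CallState.CALLING, "recv 486"): CallState.TERMINATED,   # callee unwilling: busy
    (CallState.RINGING, "recv 486"): CallState.TERMINATED,
    (CallState.ESTABLISHED, "send ACK"): CallState.ESTABLISHED,
    (CallState.ESTABLISHED, "send re-INVITE"): CallState.ESTABLISHED,  # modify session
    (CallState.ESTABLISHED, "send BYE"): CallState.TERMINATED,
    (CallState.ESTABLISHED, "recv BYE"): CallState.TERMINATED,
}


def next_state(state, event):
    # Events that are not listed for the current state leave it unchanged.
    return TRANSITIONS.get((state, event), state)


def run(events):
    """Feed a sequence of events through the machine and return the final state."""
    state = CallState.IDLE
    for event in events:
        state = next_state(state, event)
    return state


print(run(["send INVITE", "recv 180", "recv 200", "send ACK", "send BYE"]))
```

A re-INVITE inside an established session is how session parameters are modified without tearing the call down, which is why it maps back to the same state.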

SIP provides a suite of security services, which include denial-of-service prevention, authentication (both user to user and proxy to user), integrity protection, and encryption and privacy services.
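The text does not say which mechanisms deliver these services. For user authentication, RFC 3261 builds on HTTP Digest authentication, in which the client proves knowledge of a password without sending it. The sketch below computes the digest response using the MD5 formulas from RFC 2617; the function name digest_response and all credential values are placeholders invented for this example, and only the no-qop and qop=auth variants are covered.

```python
import hashlib


def _md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce,
                    qop=None, nc=None, cnonce=None):
    """Compute an HTTP Digest response value (MD5; no qop or qop=auth only)."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")  # who is asking
    ha2 = _md5_hex(f"{method}:{uri}")                 # what is being requested
    if qop is None:
        return _md5_hex(f"{ha1}:{nonce}:{ha2}")
    if qop != "auth" or nc is None or cnonce is None:
        raise ValueError("this sketch supports only qop='auth' with nc and cnonce")
    return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


# Placeholder values: a server would supply realm and nonce in its challenge.
print(digest_response("alice", "atlanta.example.com", "secret",
                      "REGISTER", "sip:atlanta.example.com",
                      nonce="84a4cc6f3082121f32b42a2187831a9e"))
```

Because the nonce comes from the server's challenge, a captured response cannot simply be replayed against a fresh challenge. Encryption and integrity protection rely on other mechanisms that this sketch does not cover.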