How a SIP Phone (User Agent) Works and Communicates with SIP Servers

1. There is a SIP server and a SIP registrar.
The SIP server and the SIP registrar can be one server or two different servers. The registrar is the server that records which IP address and port a phone number is currently reachable at.

When someone wants to call that number, their phone sends an INVITE request containing the phone number to its SIP server.

Through the registrar, the SIP server finds the corresponding IP address and forwards the INVITE request there.
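The lookup described above can be sketched in Python roughly as follows. This is only a toy model: the Registrar class, the route_invite function and the example.com addresses are names made up for the illustration, and answering 404 for an unknown number is one common choice, not the only one.

```python
import time


class Registrar:
    """Toy location service: maps a SIP address to its current contact."""

    def __init__(self):
        self._bindings = {}  # address -> (host, port, absolute expiry time)

    def register(self, aor, host, port, expires, now=None):
        now = time.time() if now is None else now
        if expires == 0:
            # An expiry of 0 removes the binding.
            self._bindings.pop(aor, None)
        else:
            self._bindings[aor] = (host, port, now + expires)

    def lookup(self, aor, now=None):
        now = time.time() if now is None else now
        binding = self._bindings.get(aor)
        if binding is None or binding[2] <= now:
            return None  # never registered, or the registration has expired
        return binding[0], binding[1]


def route_invite(registrar, request_uri, now=None):
    """Decide what the SIP server does with an INVITE for request_uri."""
    target = registrar.lookup(request_uri, now)
    if target is None:
        return "404 Not Found"  # assumption: reject unknown numbers with 404
    return "forward INVITE to %s:%d" % target
```

For example, after reg.register("sip:1001@example.com", "192.0.2.10", 5060, 3600), an INVITE for that address is forwarded to 192.0.2.10:5060 until the hour has passed.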

The phone makes itself known by sending a REGISTER request to the registrar. According to RFC 3261 (section 8.3): The "expires" parameter of a Contact header field value indicates how long the URI is valid. The value of the parameter is a number indicating seconds. If this parameter is not provided, the value of the Expires header field determines how long the URI is valid.

The registrar then answers the REGISTER request with a 2xx response (200 OK). It carries either a Contact header with an expires parameter or an Expires header, whose value is the number of seconds until the registration expires. The device must send a new REGISTER request before this time elapses. Here is an example of REGISTER and its response:
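As an illustration only, the Python sketch below builds a minimal REGISTER request and reads the granted lifetime from a 2xx response. The function names, the branch, tag and Call-ID values and the example.com addresses are assumptions made for the example; a real phone generates unique values and usually also has to answer an authentication challenge. The parser assumes a single Contact header in the response.

```python
import re


def build_register(user, domain, local_ip, local_port, expires=3600):
    """Build a minimal REGISTER request (illustrative header values)."""
    lines = [
        "REGISTER sip:%s SIP/2.0" % domain,
        "Via: SIP/2.0/UDP %s:%d;branch=z9hG4bK-example-1" % (local_ip, local_port),
        "Max-Forwards: 70",
        "From: <sip:%s@%s>;tag=example-tag" % (user, domain),
        "To: <sip:%s@%s>" % (user, domain),
        "Call-ID: example-call-id@%s" % local_ip,
        "CSeq: 1 REGISTER",
        "Contact: <sip:%s@%s:%d>" % (user, local_ip, local_port),
        "Expires: %d" % expires,
        "Content-Length: 0",
        "",
        "",
    ]
    return "\r\n".join(lines)


def granted_expiry(response):
    """Return the registration lifetime in seconds from a 2xx response.

    The Contact "expires" parameter wins; the Expires header is the fallback.
    Returns None for non-2xx responses or when no value is present.
    """
    head = response.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    if not re.match(r"SIP/2\.0 2\d\d ", lines[0]):
        return None
    contact_expires = None
    header_expires = None
    for line in lines[1:]:
        name, _, value = line.partition(":")
        name = name.strip().lower()
        if name in ("contact", "m"):  # "m" is the compact form of Contact
            found = re.search(r";\s*expires\s*=\s*(\d+)", value, re.IGNORECASE)
            if found:
                contact_expires = int(found.group(1))
        elif name == "expires":
            header_expires = int(value.strip())
    return contact_expires if contact_expires is not None else header_expires
```

The device would then schedule its next REGISTER some time before the returned number of seconds has elapsed.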

2. Next is SDP (Session Description Protocol). SDP is sent in the body of the INVITE request and the 200 OK response (or of the 200 OK response and the ACK request, when the INVITE carries no offer). Through it the SIP user agents negotiate the media: voice, video.

The formats negotiated include G.711, G.729, T.38 and others. The user agents must have at least one codec in common to be able to communicate.

As far as I can remember, every user agent must support the G.711 codec in both variants, G711A (A-law) and G711U (μ-law). There are also servers, such as the Asterisk SIP server, that can use a different media format toward each of the two communicating user agents.
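To make the "at least one common codec" rule concrete, here is a small Python sketch that reads the audio line of an SDP offer and intersects it with the codecs a phone supports. The function names and the sample SDP are assumptions for the illustration; the payload numbers 0 (PCMU, i.e. G711U), 8 (PCMA, i.e. G711A) and 18 (G729) are the static RTP payload types. Only the first audio line is considered.

```python
# Static RTP payload types for the codecs mentioned in this article.
STATIC_PAYLOADS = {0: "PCMU", 8: "PCMA", 18: "G729"}


def audio_codecs(sdp):
    """Return the codec names on the first m=audio line, in offered order."""
    payloads = []
    rtpmap = {}
    for line in sdp.splitlines():
        if line.startswith("m=audio ") and not payloads:
            # m=audio <port> <proto> <payload type> <payload type> ...
            payloads = [int(p) for p in line.split()[3:]]
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <encoding name>/<clock rate>
            pt, _, encoding = line[len("a=rtpmap:"):].partition(" ")
            rtpmap[int(pt)] = encoding.split("/")[0].upper()
    return [rtpmap.get(pt) or STATIC_PAYLOADS.get(pt, str(pt)) for pt in payloads]


def common_codecs(offered, supported):
    """Codecs both sides can use, keeping the offerer's preference order."""
    return [codec for codec in offered if codec in supported]
```

If common_codecs returns an empty list, the two user agents cannot exchange media directly and the call fails unless a server in between converts between the codecs.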

3. Dial plan.
With a dial plan the user can usually direct a call to one SIP server or another (if the device allows more than one SIP account to be configured), directly to another line on the device, or over MGCP or H.323.
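Dial plan syntax differs from device to device, so the following Python sketch only shows the general idea: the dialed digits are matched against an ordered list of patterns and the first match selects the destination. The patterns, the destination names and the route function are all hypothetical.

```python
import re

# Hypothetical dial plan: (pattern, destination). The first match wins.
DIAL_PLAN = [
    (r"^9(\d+)$", "sip-account-2"),  # prefix 9: second SIP account, prefix stripped
    (r"^\*1$", "fxs-line-2"),        # *1: the other line on the device
    (r"^(\d+)$", "sip-account-1"),   # any other number: default SIP account
]


def route(dialed):
    """Return (destination, number to send) or None if nothing matches."""
    for pattern, destination in DIAL_PLAN:
        match = re.match(pattern, dialed)
        if match:
            # Use the captured digits when the pattern has a group.
            number = match.group(1) if match.groups() else dialed
            return destination, number
    return None
```

With this plan, dialing 90123 sends 0123 through the second SIP account, while 1001 goes out unchanged through the first one.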

4. Next, the user should correctly configure the FXS, E1 and T1 ports.

5. The next issue is when the device is behind NAT.

6. And last, the Cisco call flow examples. You can monitor (sniff) your device on its uplink by putting the outgoing interface and a computer in the same network and using Wireshark with the filter ether host mac, where mac is the MAC address of the outgoing interface, to see what is actually going on.
You should be able to see the call flows for your SIP traffic. You can use the Wireshark->Statistics->SIP and Wireshark->Statistics->VoIP Calls menus to analyse the traffic (in newer Wireshark versions these are under the Telephony menu).

7. T.38 is usually used for sending faxes.

8. Quality of service (QoS) is needed for good speech quality.

9. DNS can provide the IP addresses of the servers that support SIP in the network.
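One way this is done is with DNS SRV records published under names such as _sip._udp.example.com. The Python sketch below only builds that name and orders a list of already fetched records; it performs no DNS query. The function names and the record tuples are assumptions for the illustration, and ordering by weight is a simplification of the weighted random selection that SRV actually specifies.

```python
def srv_name(domain, transport="udp"):
    """Name of the SRV record for SIP over the given transport."""
    return "_sip._%s.%s" % (transport, domain)


def order_srv(records):
    """Order (priority, weight, port, target) tuples for trying servers.

    A lower priority value is tried first. Within the same priority this
    sketch simply prefers the higher weight (a simplification).
    """
    return sorted(records, key=lambda record: (record[0], -record[1]))
```

The phone would then resolve each target host name to an IP address and try the servers in that order.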

10. The Real-time Transport Protocol (RTP) defines a standardized packet format for delivering audio and video over IP networks. RTP is used extensively in communication and entertainment systems that involve streaming media, such as telephony, video teleconference applications, television services and web-based push-to-talk features.
RTP is used in conjunction with the RTP Control Protocol (RTCP). While RTP carries the media streams (e.g., audio and video), RTCP is used to monitor transmission statistics and quality of service (QoS) and aids synchronization of multiple streams. RTP is originated and received on even port numbers and the associated RTCP communication uses the next higher odd port number.
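To show what the standardized packet format looks like in practice, here is a minimal Python sketch that decodes the fixed 12-byte RTP header and derives the RTCP port from an even RTP port. The function names are made up for the example, and header extensions and CSRC entries are not decoded.

```python
import struct


def parse_rtp_header(packet):
    """Decode the fixed 12-byte RTP header at the start of a packet."""
    if len(packet) < 12:
        raise ValueError("RTP header needs at least 12 bytes")
    # Two flag bytes, 16-bit sequence number, 32-bit timestamp, 32-bit SSRC.
    b0, b1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


def rtcp_port(rtp_port):
    """RTCP uses the next higher odd port above an even RTP port."""
    if rtp_port % 2:
        raise ValueError("expected an even RTP port")
    return rtp_port + 1
```

A G711A packet, for instance, shows version 2 and payload type 8, and an RTP stream on port 40000 has its RTCP on port 40001.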

In VoIP, speech is first encoded with a codec and then carried over the Internet in RTP packets.

Comments (2)

Most Valuable Expert 2011

Hi user_n,

A very interesting and useful discussion here. Thanks very much for putting this article out there for us all. The use of SIP technology is growing as the world moves in leaps and bounds towards IP-based PBXs, so this article could probably not come at a better time! I have hit the "Yes" button to vote this one up.

It is really quite interesting that you say "every user agent must support G711 G711A, G711U..." - I wish my SIP provider supported both of them because the transcoding I have to do between codecs from my SIP device to the provider has caused its own slew of issues on more than one occasion!!

I thought it might be worth mentioning here and pointing out that there are also two types of network element in the SIP protocol - the traditional proxies as laid down by the RFCs but also the back-to-back user agents (B2BUAs) like Asterisk and all its various derivatives.

In my experience, the commercial user with an IP PBX in their office will typically be using a B2BUA of some description while the providers are typically running a complex network of SIP proxies to route any calls they send/receive. Useful information to know. The B2BUAs sit in the middle of the call path, receiving the SIP invitations from the caller and then initiating a new call leg to the callee (which matches the usual definition of a "PBX"). Because the B2BUA is interpreting the SIP session and appears to both endpoints as the place the packets originate from, it's possible for additional features to be introduced to a call which may only be supported at the server-side or by one party. The PBX can just filter out SIP traffic which is not supported on both ends, or it can handle the server-side commands locally. Plus, the actual media traffic which carries the voice can still be the subject of re-invites, so one phone can still send RTP media packets direct to another phone and take the PBX out of the routing path of the voice itself.

That's different to a SIP proxy, which is roughly the same role as routers perform in an IP network. They don't actually do anything with the SIP packets they are given beyond finding the next SIP proxy which gets it a hop closer to the packet's destination, so if an endpoint wants to use a particular feature in the SIP session, both endpoints need to support it.

But, as always, a very nice article and just thought I would point the above out as I thought it was a useful addition to the discussion :-)



You are right. The article is written from the user agent's point of view, and I do not have enough experience with PBXs to write about them.
