Routing VoIP/Real-Time Data Using QoS

David Vincent

Johns Hopkins University

Rockville, Maryland

E-mail: david.vincent@uspto.gov

Posted: January, 2002

Abstract
    Three methods used in assuring QoS for VoIP are: using an all-ATM network, using IP over ATM, and using an all-IP or packet switched network.  Real-time data cannot be allowed to experience excessive delays or severe packet loss.  When routing Voice over Internet Protocol (VoIP), isochronous, or real-time data, there needs to be a guaranteed Quality of Service (QoS).  By implementing QoS, the necessary delay and other parameters can be met.  The three different networks have various methods of guaranteeing QoS.  Signaling protocols can pre-allocate network resources, while transport protocols such as UDP can be used to minimize delay.
 
1.  Introduction
    The Internet was designed to handle non-real-time data.  In order to route real-time data over the Internet, new protocols are being implemented and existing protocols are being adapted.  When transmitting VoIP data, the total delay and packet loss have to be kept to a minimum.  Therefore, protocols which can guarantee the QoS have to be used.  The purpose of this paper is to explain three methods of routing real-time data using QoS.  The paper will discuss what QoS is and why it is needed when transmitting real-time data such as VoIP.  Comparisons will be made between using routers in packet switched networks and switches in cell switched networks.  The paper will also cover various signaling protocols such as ATM/UNI, RSVP and SIP.  The paper is broken into three main sections: guaranteeing QoS using ATM, using IP over ATM, and using an all-IP or packet switched network.  There is also an Appendix which covers background material such as routing basics and lists two other methods for guaranteeing QoS (MPLS and DOCSIS).
    At the present time, it is possible to transmit voice over the Internet.  Software products such as I-phone™, Cool Talk™, and Net Meeting™ are able to provide a basic VoIP service.  Web sites such as Net2Phone.com, Dialpad.com, and iBasis.com were providing between 50 and 110 million VoIP minutes per month in 2000.  By 2005, IDC forecasts that IP telephony will account for more than 47% of total U.S. long distance and international voice traffic [5].  Real-time voice data is delay sensitive.  Since the present Internet only offers a best-effort QoS, the necessary delay parameters cannot be guaranteed.  As a result, people who presently use VoIP over the Internet are, for the most part, not comfortable with the quality of the speech they experience.  People will notice when voice packets experience delays exceeding 0.15 seconds, and at 0.3 seconds, the call can no longer be considered to be “Toll Quality” (see Appendix, section 3).  This means that people who pay, e.g., Dialpad.com $9.99 per month for unlimited anytime minutes may not want to use the service when the network is congested.  Even those who have broadband connections to an Internet service provider (ISP) will still not be able to control the congestion, delay or packet loss which might be occurring throughout the Internet.  The first generation voice conferencing systems were able to deliver good quality of speech when the network load conditions were light.  The second generation voice tools included low bit rate redundancy packet sending mechanisms for overcoming packet loss, adaptive play-out mechanisms for reducing network jitter, and inter-media synchronization mechanisms [1].  As the routers being used in the Internet are upgraded, these services will become more practical.  Using VoIP will equate to lower costs for equipment and for management of the network.  By using VoIP with silence suppression, there can be a bandwidth gain of a factor of four compared to 64 Kbps PCM connections [18].  Therefore, there is a great deal of interest in perfecting VoIP.
2.  Delay Parameters & QoS
    Real-time data is delay sensitive and, in most cases, slightly loss tolerant.  When referring to VoIP, the QoS parameters can be used to guarantee that voice does not experience excessive delays.  People expect full duplex voice communication.  In other words, both users expect to be able to speak at the same time.  If the voice data arrives late, it will cause problems such as talker overlap.  When people use CB radios or walkie-talkies they say “over” so that the second person will know when it is okay to begin talking.  Anyone who has been watching CNN recently will have seen how a 0.5 second satellite hop affects voice data.  As mentioned above, people will notice when voice packets experience delays exceeding 0.15 seconds, and at 0.3 seconds, the call can no longer be considered to be toll quality.  If the QoS parameters are guaranteed, then the voice communication may be able to approximate what users currently experience using the PSTN/analog phones.
    In ATM networks, delay (D) is defined as the time difference between the sending of the information (first block/cell) at the source (t0) and the receiving of the information (last block/cell) at the receiver (t1).  The delay is different for every block/cell and is statistically variable, with a minimum delay (Dmin) and a maximum delay (Dmax).  The difference between Dmax and Dmin is sometimes referred to as the jitter.  Basically, jitter is the variation in the delay.  Delay of the network includes transfer delay and processing delay (caused by switching, routing, queuing, and/or multiplexing).  Queuing delay is commonly referred to as the most important delay element since it is often used in determining traffic conditions.  Transfer delay (Dt) is determined by the physical speed of the medium and the distance to be bridged.  The processing delay (Dp) is determined by the physical implementation of each of the nodes on the path, but also by the way the information is handled.  Therefore, there are many factors which contribute to the delay, and by guaranteeing a certain level of QoS, the network is essentially granting the users various levels of priority.  In other words, it may not always be possible to perform all the various operations, such as fragmenting and routing, any faster, but it is possible to service higher priority traffic first.
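The delay definitions above can be illustrated with a short Python sketch.  It computes D = t1 - t0 per block/cell, takes the jitter as Dmax - Dmin, and compares a delay with the 0.15 s and 0.3 s thresholds quoted earlier.  The class and function names, and the sample timings, are assumptions made only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class CellTiming:
    sent_at: float      # t0: time the first block/cell left the source (s)
    received_at: float  # t1: time the last block/cell reached the receiver (s)

    @property
    def delay(self) -> float:
        """D = t1 - t0."""
        return self.received_at - self.sent_at


def delay_stats(timings):
    """Return Dmin, Dmax and the jitter (Dmax - Dmin) for a list of timings."""
    delays = [t.delay for t in timings]
    d_min, d_max = min(delays), max(delays)
    return {"d_min": d_min, "d_max": d_max, "jitter": d_max - d_min}


def perceived_quality(delay_s):
    """Classify a one-way delay against the 0.15 s and 0.3 s thresholds."""
    if delay_s >= 0.3:
        return "below toll quality"
    if delay_s > 0.15:
        return "noticeable"
    return "not noticeable"


if __name__ == "__main__":
    # Hypothetical measurements, chosen only to exercise the functions.
    samples = [CellTiming(0.0, 0.10), CellTiming(1.0, 1.12), CellTiming(2.0, 2.18)]
    stats = delay_stats(samples)
    print(stats, perceived_quality(stats["d_max"]))
```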

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

3.  Using All ATM Networks

Figure 1.  ATM Protocol Stack [14]
 
 






    In an end-to-end ATM network, QoS is very well defined and the ATM adaptation layers (AALs) can adapt many different types of traffic into ATM cells.  The ATM switches honor the traffic contract and the QoS parameters.  Voice data and T1 lines (see Appendix, section 1) are considered to be constant bit rate (CBR) traffic.  Usually the CBR data is the highest priority stream or connection coming into a switch and must have its delay parameters met.  When admitting bursty or variable bit rate (VBR) traffic into an ATM network, congestion may result, due to either an over-commitment of network bandwidth or a mismatch in the statistical multiplexing of multiple traffic streams.  The congestion may be recognized via a degradation of the subscriber's QoS.  Using call admission control (CAC), cell loss and intolerable delays may be avoided by blocking calls when the QoS parameters cannot be met.  When the connection is established, the source can use ATM signaling such as Q.2931 and the UNI to create a Setup message comprising the destination end-system address, the desired traffic and QoS parameters, and various information elements (IEs) defining particular desired higher layer protocol bindings [13].  Rather than using Q.931 (which is used in narrowband ISDN), ATM (which is broadband ISDN) uses Q.2931.  The switched virtual circuits (SVCs) are set up by the ATM Control Plane (see above figure) using VPI=0 and VCI=5.  When the call is being set up, the network determines if the resources are available, how to connect the SVC, and which VPI/VCI to use.  The ATM Private Network Node Interface (P-NNI) routing protocol routes the connection request based on the QoS.  After the VC is set up, all the traffic will travel along the same path.  Although the VPI/VCI changes at every node, the switches never have to read the layer-3 addresses, and the QoS along the path is already guaranteed (assuming severe congestion or failures do not arise).
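As a rough illustration of call admission control and per-node VPI/VCI swapping, the following Python sketch models a single switch.  It is a simplification under stated assumptions: admission is decided only on the sum of peak cell rates, and the class name, the method names, and the choice to hand out VCIs starting at 32 are hypothetical and not taken from any ATM specification.

```python
class AtmSwitch:
    """Toy switch: admits calls by peak cell rate and swaps VPI/VCI labels."""

    def __init__(self, link_capacity_cps):
        self.capacity = link_capacity_cps   # cells per second on the output link
        self.reserved = 0                   # cells per second already promised
        self.table = {}                     # (in_port, vpi, vci) -> (out_port, vpi, vci)
        self._next_vci = 32                 # assumption: stay clear of low VCIs such as 5

    def admit(self, in_port, in_vpi, in_vci, out_port, peak_cell_rate):
        """Call admission control: block the call if the rate cannot be met."""
        if self.reserved + peak_cell_rate > self.capacity:
            return None                     # call blocked, existing calls unaffected
        self.reserved += peak_cell_rate
        out_label = (out_port, 0, self._next_vci)
        self._next_vci += 1
        self.table[(in_port, in_vpi, in_vci)] = out_label
        return out_label

    def forward(self, in_port, vpi, vci):
        """Per-cell work: one table lookup, no layer-3 address is read."""
        return self.table[(in_port, vpi, vci)]


if __name__ == "__main__":
    switch = AtmSwitch(link_capacity_cps=1000)
    print(switch.admit(1, 0, 40, 2, 600))   # admitted, returns the outgoing label
    print(switch.admit(1, 0, 41, 2, 600))   # would exceed capacity, so it is blocked
    print(switch.forward(1, 0, 40))
```

Blocking the second call rather than degrading the first mirrors the point made above: once a connection is admitted, its traffic contract is honored.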

    In ATM, the QoS parameters are: peak-to-peak cell delay variation (CDV), maximum cell transfer delay (maxCTD), and cell loss ratio (CLR).  Although ATM networks do guarantee the QoS, existing network layer protocols, such as IP, cannot request a specific QoS from the network, partly because the main functions of network layer protocols were to offer universal connectivity, and a uniform service interface, to higher layers, independent of the underlying physical layer protocols.  However, the main reason that people will not be using ATM end-to-end is that it is not widely available.  Ethernet dominates the LAN environment, and users operating out of a residence usually connect to their ISP using phone lines.  Even if ATM were to be offered to the residential user, it would most likely be costly.  Although ATM may never become a dominant protocol in the LAN or Internet environment, ATM may possibly be used in the wireless arena [15].  For the most part, though, ATM will most likely remain in the backbone networks, where fiber optic cables and data rates in increments of 155 Mbps are common.
4.  Routing IP Packets Over ATM
    When routing IP packets over an ATM network, certain issues arise.  Since IP is a connectionless protocol and ATM is a connection oriented protocol, multicasting and/or broadcasting become more complicated.  As will be shown below, AAL5 along with various servers (which can duplicate and broadcast received packets) will be used.  Although ATM offers QoS, IPv4 does not.  However, IPv4 includes a Type of Service (TOS) field which may be used to differentiate the types of data belonging to various flows or streams, and in IPv6 its role is taken over by the Traffic Class field, alongside a separate Flow Label field.  At the present time, several different methods and protocols are being used to transmit IP packets over ATM networks while providing QoS.
 







    Although the figure above indicates only LLC and SNAP at layer-2, the figure below is provided to show a more complete picture of the encapsulation.  The SNAP protocol data unit (PDU) comprises an Organizationally Unique Identifier (OUI) and a Protocol ID (PID).



    IP is a connectionless service which does not offer any QoS.  IP packets/datagrams can be routed over several different paths.  This means that the packets may arrive at the destination out of order.  In reality, most if not all packets do follow the same path and do not arrive out of order.  It is faster to transmit packets when no connections need to be set up, but if the packets arrive out of order, it will take time to buffer and reorder them.  ATM is a connection oriented service wherein the fixed length packets, referred to as cells, are all transmitted over the same virtual circuit.  One of the main differences between IP and ATM is that IP datagrams are routed, while ATM cells are switched.  Routing is performed in software while switching is performed in hardware [4].  Even though forwarding between nodes is based on the layer-2/MAC addresses (see Appendix, section 4), routers still have to read the layer-3/IP addresses.  Any processing of a packet will add to the overall transfer delay.  Therefore, switching cells over an ATM network is faster than routing packets through an internet.

    In classical IP over ATM, RFC 1483 specifies using an LLC/SNAP header which identifies what protocol is being used in the payload.  In the IP over ATM RFCs 1577 and 2225, there is no mention of supporting QoS and delay parameters.  These RFCs only support the best-effort model.  ATM is mainly used for the scalable bandwidth that it provides.  Since almost all modern computers run IP, it would make sense to run IP from the desktop while trying to take advantage of ATM’s QoS capabilities.  Although ATM does not extend to the end user, the edge router(s) in a building can easily run IP over ATM.
    The Integrated Services Architecture (ISA) has an objective to support and deliver different service levels (see figure below).  This includes enhancements to IP that allow it to support integrated or multimedia services.  The Internet Engineering Task Force (IETF) is now trying to integrate RSVP and ATM.  Protocols such as RSVP allow resource reservation across an IP network, similar to the way ATM allows for QoS within an ATM network.  RFC 1932 specifies combining both routing and switching functions and allowing for cut-through switching.  The decision to route or switch can be made by using a signaling protocol like RSVP.  Another way to decide would be to have the network inspect/monitor the traffic.  Upon determining that the traffic is, e.g., CBR data, the network can then select cut-through switching.  Using RSVP, it is possible to provide packet flows with a traffic contract similar to the one that the ATM UNI and network-node interface (NNI) offer to cell flows.

 

[13]

 
 
 
 
 
 

    Since IP is connectionless, multicasting and broadcasting are very easy to perform.  However, in a connection oriented environment like ATM, multicasting and broadcasting are not as easy.  In order to perform multicasting, a multicast address resolution server (MARS) must be used.  The LAN Emulation (LE or LANE) protocol stack is very similar to the IP over ATM protocol stack, and their operations are also quite similar.  LANE version 1.0 provides what appears to be a connectionless service over ATM, and broadcasting and multicasting are supported.  The host can have the higher layer protocols mapped into the AAL5 and ATM layers within the LAN environment.  The Ethernet/legacy LANs have their MAC addresses mapped to ATM addresses by using a LAN Emulation Server (LES or LE server).  The end system, which is referred to as a LAN Emulation Client (LEC), is identified by an ATM address.  An LE Configuration Server (LECS) is used to initialize a LEC with information specific to the emulated LAN (ELAN) that the LEC will be joining.  The LECS will provide a LEC with the ATM address of the LES and enable the LEC to auto-configure itself [6].  A Broadcast/Unknown Server (BUS) handles data addressed to the MAC broadcast address (all ones), all multicast traffic, and unicast frames sent by a LEC before the ATM address of the destination has been resolved.  All LECs maintain a connection to the BUS and are considered to be leaves on a point-to-multipoint VC with the BUS as the root.  This enables LECs to send data frames without first setting up a connection, thus maintaining the appearance of a connectionless data transfer service to the higher layers present in the LEC.  In order to run IP over ATM, the IP addresses must be resolved to ATM addresses.  RFC 1577 specifies ATMARP, which only handles resolving IP to ATM addresses in a single subnetwork referred to as a Logical IP subnet (LIS).  In order to operate IP over ATM in an internet comprising many subnets, other methods must be used.  IP over ATM and LANE v1.0 both call for using servers in order to send multicast traffic.  In most cases this creates a single point of failure.  Neither RFC 1577 nor LANE v1.0 specifies routing data through an internet using QoS.  However, using RSVP and/or the ISA model (shown above), QoS can be supported.
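The division of labor between the LES and the BUS described above can be sketched in a few lines of Python.  This is only a model of the forwarding decision a LEC makes.  The class, the function names, and the ATM address strings are hypothetical, and the real LANE control frames are not represented.

```python
def is_group_address(mac):
    """True for multicast/broadcast MACs (low bit of the first octet set)."""
    return bool(int(mac.split(":")[0], 16) & 1)


class LanEmulationServer:
    """Stand-in for the LES: maps registered MAC addresses to ATM addresses."""

    def __init__(self):
        self._mac_to_atm = {}

    def register(self, mac, atm_address):
        self._mac_to_atm[mac.lower()] = atm_address

    def resolve(self, mac):
        return self._mac_to_atm.get(mac.lower())


def choose_path(les, dst_mac):
    """Decide whether a frame goes over a direct VC or through the BUS."""
    if is_group_address(dst_mac):
        return ("BUS", None)            # broadcast and multicast always use the BUS
    atm_address = les.resolve(dst_mac)
    if atm_address is None:
        return ("BUS", None)            # unknown unicast: flood until resolved
    return ("direct VC", atm_address)   # resolved unicast: data-direct connection


if __name__ == "__main__":
    les = LanEmulationServer()
    les.register("00:a0:c9:11:22:33", "atm-address-of-host-b")  # placeholder value
    for mac in ("00:a0:c9:11:22:33", "00:a0:c9:44:55:66", "ff:ff:ff:ff:ff:ff"):
        print(mac, choose_path(les, mac))
```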
 

4.1  Next Hop Resolution Protocol (NHRP)
    The NHRP resolves IP addresses to ATM addresses in a multiple LIS environment [6].  It provides a host (which can be an edge router) with the ATM address of the destination so that one or more layer-3 hops can be bypassed.  This is done using a switched virtual circuit (SVC) over an ATM network.  NHRP is essentially an extension of ATMARP, and it does not exchange or advertise reachability information (see Appendix, section 6) with peer entities like RIP and OSPF do.  It also doesn’t assist in building routes from a source to a destination [6].  Rather than use ARP servers, NHRP uses NHRP servers which maintain next-hop-resolution cache tables with IP to ATM addresses for all the attached/local nodes.  NHRP can support any layer-3 protocol and allows IP to run over ATM in an internet environment.
 
4.2  Multiprotocol over ATM (MPOA)
    MPOA is a service which allows layer-3 QoS, such as the QoS specified in RSVP, to take advantage of ATM’s QoS.  MPOA supports layer-3 internetworking for hosts attached to ELANs, hosts attached to ATM networks, and hosts attached to legacy LANs [6].  MPOA is based on LANE v2.0, and it provides seamless network layer routing using NHRP and multicast address resolution servers (MARS).  It delivers the function of a router and takes advantage of the underlying ATM network as much as possible.  True internet functionality is achieved.  MPOA uses virtual routers which compute routes and forward packets across an ATM network.  Although LANE v1.0 was really meant for TCP/IP, it did not provide for VoIP and QoS.  LANE v2.0 handles protocols such as TCP/IP, IPX, DECnet, NetBIOS, and SNA.  It also allows for QoS end-to-end, and associates QoS with routing.  MPOA combines bridging and routing functions, so multiple protocols can be routed and bridged over ATM networks.  Therefore, in order to provide QoS for IP over ATM networks, LANE v2.0 and/or MPOA can be used, because classical IP over ATM does not allow for QoS unless RSVP is used.

5.  Routing VoIP through IP/Packet Switched Networks

    Today IPv4 is being used throughout the Internet.  Although much work has been done on IPv6, it has yet to be implemented.  In IPv4, the 4-bit TOS field can be used to indicate priority/QoS.  The TOS can indicate minimize delay, maximize throughput, maximize reliability, minimize monetary cost (see Appendix, section 6.2), or normal service [17].  In IPv6, the role of the TOS field is taken over by the Traffic Class field, and a separate Flow Label field is added.  IPv4 also has an Options field in which timestamps can be carried.  The preferred value for the timestamps is the number of milliseconds past midnight, UTC, similar to the ICMP timestamp request and reply [16].  Although the OSPF routing protocol (see Appendix, section 6) can dynamically monitor the TOS field in IP packets, this option is generally not used in the Internet today.  The IETF has proposed two different models for guaranteeing QoS in the Internet.  They are the Integrated Services and Differentiated Services models [19].
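The five TOS meanings listed above map onto single bits of the 4-bit field.  The sketch below decodes the second octet of an IPv4 header, assuming the layout of RFC 1349 (three precedence bits, four TOS bits, and one unused bit).  The function and dictionary names are chosen only for this example.

```python
# 4-bit TOS values and their meanings, as listed in the text.
TOS_MEANINGS = {
    0b1000: "minimize delay",
    0b0100: "maximize throughput",
    0b0010: "maximize reliability",
    0b0001: "minimize monetary cost",
    0b0000: "normal service",
}


def decode_tos_octet(octet):
    """Split the IPv4 TOS octet into (precedence, TOS meaning)."""
    precedence = (octet >> 5) & 0b111   # top three bits
    tos = (octet >> 1) & 0b1111         # next four bits; the lowest bit is unused
    return precedence, TOS_MEANINGS.get(tos, "unspecified combination")


if __name__ == "__main__":
    for value in (0x00, 0x10, 0x08, 0x04, 0x02):
        print(f"0x{value:02x}", decode_tos_octet(value))
```

A voice application would typically request the "minimize delay" value, although, as noted above, routers in the Internet today generally do not act on it.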






5.1  RSVP

    RSVP is a signaling protocol which reserves and releases bandwidth or other resources in the routers.  Both IPv4 and IPv6 are supported.  RSVP uses the TOS field in IPv4 and the Flow Label in IPv6.  It can reserve resources for both unicast and multicast traffic, and can do so based on the individual requirements of multicast members.  The RSVP process communicates with IP to exchange RSVP control messages with peer routers.  After a reservation has been established, each router must classify each incoming IP packet to determine whether it belongs to a QoS flow or not.  The reservation of resources is done in only one direction (simplex operation), so there need to be separate reservations for each direction of data flow.  Since the reservations are independent of the routing protocols, not all routers have to honor the reservations.  If traffic which requires a high level of QoS arrives at a non-RSVP enabled router, the router handles the traffic using a best-effort delivery method.  This feature will be important while the routers throughout the Internet are being upgraded/replaced, because packets belonging to RSVP flows do not have to travel through only RSVP enabled routers.  Therefore, it is possible to route IP packets through routers and still receive a certain level of QoS.
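The simplex, hop-by-hop character of a reservation, and the way non-RSVP routers are simply passed through, can be modeled as follows.  This Python sketch deliberately ignores the actual RSVP message exchange and only tracks bandwidth per router.  The `Router` class, the `reserve_simplex` function, and the capacities used are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Router:
    name: str
    rsvp_enabled: bool
    free_kbps: int = 0   # bandwidth still available for reservations


def reserve_simplex(path, kbps):
    """Reserve kbps along one direction of a path.

    Returns the per-hop treatment, or None if an RSVP router lacks capacity
    (in which case any partial reservations are released again).
    """
    granted = []
    for router in path:
        if not router.rsvp_enabled:
            continue                      # non-RSVP hop: best effort only
        if router.free_kbps < kbps:
            for done in granted:          # roll back what was reserved so far
                done.free_kbps += kbps
            return None
        router.free_kbps -= kbps
        granted.append(router)
    reserved_names = {r.name for r in granted}
    return [(r.name, "reserved" if r.name in reserved_names else "best-effort")
            for r in path]


if __name__ == "__main__":
    path = [Router("A", True, 100), Router("B", False), Router("C", True, 50)]
    print(reserve_simplex(path, 40))                   # forward direction
    print(reserve_simplex(list(reversed(path)), 40))   # reverse needs its own call
```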
 
5.2  IntServ
    RFC 1633 outlines the IntServ architecture, which defines the QoS services and reservation parameters used to obtain the required QoS for an Internet flow.  RFC 2205 outlines the RSVP signaling protocol, which is used to convey the parameters from one or multiple users towards one or more destinations.  IntServ uses a classifier which is based on a multi-field classification.  It checks five parameters in each IP packet: the source IP address, the destination IP address, the protocol ID, the source transport port, and the destination transport port.  IntServ-level QoS can be provided only within peripheral areas of the Internet, which prevents its extension inside core areas and the implementation of end-to-end QoS.
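A multi-field classifier of the kind described above amounts to a lookup keyed on the five header values.  The following Python sketch shows the idea; the `FlowKey` and `Classifier` names, the service labels, and the addresses are assumptions for illustration only.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The five fields an IntServ classifier checks in each IP packet."""
    src_ip: str
    dst_ip: str
    protocol: int    # IP protocol ID, e.g. 17 for UDP
    src_port: int
    dst_port: int


class Classifier:
    def __init__(self):
        self._flows = {}   # FlowKey -> service label for reserved flows

    def install(self, key, service):
        """Record a flow for which a reservation has been set up."""
        self._flows[key] = service

    def classify(self, key):
        """Return the service for a packet, defaulting to best effort."""
        return self._flows.get(key, "best-effort")


if __name__ == "__main__":
    classifier = Classifier()
    voice = FlowKey("192.0.2.10", "198.51.100.20", 17, 49170, 49172)
    classifier.install(voice, "reserved voice flow")
    print(classifier.classify(voice))
    print(classifier.classify(voice._replace(dst_port=80)))
```

Because every reserved flow needs its own entry in every router on the path, the state grows with the number of flows, which is one way to see why this approach fits the edges of the network better than the core.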
 
5.3  DiffServ
    DiffServ is a scalable service model that was designed to provide end-to-end QoS support for multimedia applications.  This model was proposed by the IETF and was intended to be deployed in the core of the Internet.  DiffServ operates in the heart of the network while IntServ operates at the edges of the network.  An important part of DiffServ is the definition of the different per-hop behaviors (PHBs), like Expedited Forwarding (EF) and Assured Forwarding (AF) [9].  The PHBs are a way of allocating network resources to different traffic aggregates.  A higher quality, low delay route can be built using EF.  Classification, marking, policing, shaping, and different PHBs are all part of differentiated services.  The traffic entering a DiffServ network is classified and conditioned at the edges, then marked and assigned to different behavior aggregates (BAs) by setting the value of the Differentiated Services Code Point (DSCP) [10].  DiffServ discriminates each packet flow by using the TOS field in the IP header, which DiffServ redefines to carry the DSCP [11].  The packets are marked differently to create the various packet classes.  The routers can classify the packets by using the DSCP.  The routers process the DSCP and provide a corresponding PHB to the flow.  An EF PHB requires that the departure rate of the aggregate’s packets must equal or exceed the aggregate’s maximum arrival rate.  The forwarding parameters are provided to the aggregate, not to the individual flows.  There are no real limits on the amount of delay or jitter of individual flows.  When a router or switch is servicing a number of different sources or connections, the various packets from those connections will compete to be output first.  There are many different scheduling algorithms which are used to resolve the contention for output.  Some of the more popular queue scheduling algorithms are priority queuing (PQ), weighted round robin (WRR), and class based queuing (CBQ).  These algorithms can be used to support EF.
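As a sketch of how a router might read the DSCP and then resolve output contention, the Python code below extracts the six DSCP bits from the former TOS octet and drains per-class queues with a simple weighted round robin.  The class names, weights, and packet labels are assumptions for this example, and a production scheduler would work on bytes and time rather than packet counts.

```python
from collections import deque

EF_DSCP = 0b101110   # the recommended code point for Expedited Forwarding (46)


def dscp_of(tos_octet):
    """The DSCP occupies the upper six bits of the former TOS octet."""
    return (tos_octet >> 2) & 0x3F


class WrrScheduler:
    """Weighted round robin over per-class queues (weights in packets per round)."""

    def __init__(self, weights):
        self.weights = dict(weights)                    # class name -> weight
        self.queues = {name: deque() for name in self.weights}

    def enqueue(self, traffic_class, packet):
        self.queues[traffic_class].append(packet)

    def drain(self):
        """Return all queued packets in the order they would be transmitted."""
        sent = []
        while any(self.queues.values()):
            for name, weight in self.weights.items():
                queue = self.queues[name]
                for _ in range(min(weight, len(queue))):
                    sent.append(queue.popleft())
        return sent


if __name__ == "__main__":
    scheduler = WrrScheduler({"EF": 2, "BE": 1})
    for packet in ("e1", "e2", "e3"):
        scheduler.enqueue("EF", packet)
    for packet in ("b1", "b2"):
        scheduler.enqueue("BE", packet)
    print(dscp_of(0xB8) == EF_DSCP, scheduler.drain())
```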
 
5.4  Real-time Transport Protocol (RTP) and RTP Control Protocol (RTCP)
    Another protocol which is used to route data using QoS is RTP.  This protocol operates on top of UDP (see Appendix, section 5), which provides the port addressing of the transport layer.  RTP supports the transfer of real-time data among a number of users in a session.  RTCP provides feedback to RTP data sources and all session participants.  Multicasting is supported, and timestamps are used so that accurate timing can be regenerated at the destination by using a delay buffer.  RTP identifies the payload format and allows the use of two kinds of RTP relays: translators and mixers.  When a mixer is being used, it can combine data from several sources, provide the timing data in the combined stream, and identify itself as the source of synchronization.  In RTP, a 32-bit timestamp marks the first byte of data in the payload, and its values are generated from a local clock.  A translator is able to transmit one or more RTP packets for every packet it receives.  It can also modify the format of the data or use a different lower-level protocol to transmit from one domain to another.  For example, one user in a multicast group may not be able to view high quality video on their terminal, so the translator could convert it to a lower quality format which would require a lower transfer rate.  RFC 1890 specifies many different payload types for audio and video encoders.  Some of the more popular ones are PCMU audio, GSM audio, PCMA audio, JPEG video, and H.261 video.  RFC 1889 specifies the four functions performed by RTCP, including RTCP’s role in providing QoS.  RTCP provides feedback pertaining to QoS so that all session members will know how well the others are performing.  Sender reports allow the receivers to estimate the data rates and the quality of transmissions.  Receiver reports specify any problems in the receivers, including missing packets and excessive jitter.  It is clear that RTP will enable real-time data to be routed using QoS.  As shown below, RTP and RTCP can be used effectively in the H.323 protocol.
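The fields mentioned above (payload type, sequence number, 32-bit timestamp, and synchronization source) all sit in the 12-byte fixed RTP header.  The Python sketch below packs and parses that header using the standard `struct` module.  The function names are chosen for this example, and the payload type table lists only the five encodings named in the text.

```python
import struct

# Static payload type numbers for the encodings named in the text (RFC 1890).
PAYLOAD_TYPES = {0: "PCMU", 3: "GSM", 8: "PCMA", 26: "JPEG", 31: "H261"}


def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Build a 12-byte fixed header: version 2, no padding/extension/CSRCs."""
    byte0 = 2 << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, sequence, timestamp, ssrc)


def parse_rtp_header(data):
    """Decode the fixed RTP header at the start of a UDP payload."""
    if len(data) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    payload_type = byte1 & 0x7F
    return {
        "version": byte0 >> 6,
        "padding": bool(byte0 & 0x20),
        "extension": bool(byte0 & 0x10),
        "csrc_count": byte0 & 0x0F,
        "marker": bool(byte1 & 0x80),
        "payload_type": payload_type,
        "payload_name": PAYLOAD_TYPES.get(payload_type, "other"),
        "sequence": sequence,
        "timestamp": timestamp,   # used to regenerate timing at the receiver
        "ssrc": ssrc,             # synchronization source identifier
    }


if __name__ == "__main__":
    header = pack_rtp_header(payload_type=0, sequence=7, timestamp=160, ssrc=0x1234)
    print(parse_rtp_header(header + b"voice-samples"))
```

A receiver can detect missing packets from gaps in the sequence numbers and schedule play-out from the timestamps, which is the information RTCP receiver reports summarize.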
 
5.5  Using H.323 Terminal Components





Voice Terminal [8]

 
    H.323 is an established protocol that has been widely used to transfer real-time data because of its manageability, reliability and interoperability with the PSTN.  The International Telecommunication Union (ITU) started working on H.323 in 1995.  The H.323 standard specifies three types of equipment: gatekeepers, gateways, and terminals.  The gatekeeper is optional, but it can provide call control services such as address translation, admission control, call authorization, and directory services to the terminals.  There is a Q.931 signaling channel between the gateway and the terminal, but the call setup messages are processed at the gateway and not forwarded.  There is a separate Q.931 channel used between the gateway and the PSTN.  The H.323 protocol was one of the first to be used for VoIP and/or real-time services, so it is not likely that it will simply be replaced.  However, it may merge with other newer protocols in the future.
 
5.6  Session Initiation Protocol (SIP)
SIP [12]
 
 
 
 
 
 
 

    In about 1997, the IETF started working on its own VoIP protocol, SIP.  This new signaling protocol is used for establishing real-time calls and conferences over IP networks.  It is an open standard and is scalable.  SIP was designed to be a general-purpose protocol.  It can be used for VoIP, conferencing, telephony, multimedia, instant messaging and application level mobility across various networks, including wireless, and across user devices.  SIP can interact with other areas, such as the next generation wireless internets, QoS, payments and security.  The SIP Forum is currently working on the integration of SIP with such other areas of work on the Internet.  The Third Generation (3G) Partnership Project (3GPP) recently decided to use only SIP as the call control protocol between terminals and the mobile network [18].
    SIP is an end-to-end, client-server session signaling protocol.  It provides presence and mobility, and is used in session setup, termination, and changes.  SIP allows third party entities to create and terminate calls using both the Common Gateway Interface (CGI) and the Call Processing Language (CPL).  Users are provided with globally reachable addresses.  A called party is contacted by using a single, location-independent address.  Each session may include different types of data, although currently most of the SIP extensions are directed towards audio communication.  As a traditional text-based Internet protocol, it resembles the hypertext transfer protocol (HTTP) and the simple mail transfer protocol (SMTP).  SIP uses the Session Description Protocol (SDP) for describing the media.  Although SIP cannot reserve resources, it can provide information to end systems regarding what the desired QoS is.  SIP is independent of the packet layer, and extensions to SIP are needed in order to make the protocol truly functional in terms of interoperability.
    SIP operates using User Agents, SIP Proxy Servers, SIP Redirect Servers and Registrars.  The User Agents, or SIP endpoints, function as clients (UACs) when initiating requests and as servers (UASs) when responding to requests.  User Agents communicate with other User Agents directly or via an intermediate server.  The User Agent also stores and manages call states.  The SIP intermediate servers have the capability to operate as proxy or redirect servers.  SIP Proxy Servers are used to, e.g., forward requests from the User Agent to the next SIP server or User Agent within the network, and also to retain information for billing/accounting purposes.  SIP Redirect Servers respond to client requests and inform them of the requested server’s address.  Numerous hops can take place before the final destination is reached.  Servers can contact external location servers to determine user or routing policies, and thus do not bind the user to only one method of locating users.  In order to maintain scalability, the SIP servers can either maintain state information or forward requests in a stateless fashion.  Since SIP enables personal mobility by providing the capability to reach a called party at a single, location-independent address, it may turn out to be the best protocol for wireless devices.
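Because SIP is text-based, like HTTP and SMTP, a request can be assembled as ordinary lines of text with an SDP body describing the media.  The Python sketch below builds a deliberately simplified INVITE offering one PCMU audio stream.  The function name, addresses, and port are hypothetical, and a real User Agent would need further header fields and parameters that are omitted here.

```python
def build_invite(caller, callee, call_id, local_ip, rtp_port):
    """Return a simplified SIP INVITE with an SDP body offering PCMU audio."""
    sdp_lines = [
        "v=0",
        f"o=- 0 0 IN IP4 {local_ip}",
        "s=call",
        f"c=IN IP4 {local_ip}",
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP 0",   # payload type 0 is PCMU
    ]
    sdp = "\r\n".join(sdp_lines) + "\r\n"
    header_lines = [
        f"INVITE sip:{callee} SIP/2.0",    # request line, much like HTTP
        f"Via: SIP/2.0/UDP {local_ip}",
        f"From: <sip:{caller}>",
        f"To: <sip:{callee}>",             # single, location-independent address
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        "Content-Type: application/sdp",
        f"Content-Length: {len(sdp.encode())}",
    ]
    return "\r\n".join(header_lines) + "\r\n\r\n" + sdp


if __name__ == "__main__":
    print(build_invite("alice@example.com", "bob@example.org",
                       "abc123", "192.0.2.10", 49170))
```

The SDP body is where the desired media, and hence the basis for any QoS request made by other protocols, is communicated; SIP itself reserves nothing.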

 
Summary
    It is common for large corporations and phone companies to use ATM in the backbone while most end users are implementing IP.  In VoIP networks, voice data does not get directly encapsulated into IP.  Protocols such as RTP and UDP are commonly used to encapsulate the voice data.  Since it is common to use a LAN as an interface to the Internet, VoIP data gets encapsulated into a variety of DLL protocols, such as the SNAP and the LLC protocol.  Signaling protocols which assist in guaranteeing QoS are: RSVP, SIP, and ATM signaling/UNI.

Conclusion
    When = placing VoIP phone calls, people will notice if the voice packets experience = excessive delays.  A guaranteed QoS is needed in order to transmit and = receive toll quality phone calls.  Most ATM networks operate over fiber = optic cables and can in fact guarantee a very high level of QoS.  Using classical IP over ATM will not offer the user any QoS but if RSVP = and/or MPOA and LANE v2.0 are used, the QoS can be assured.  When sending data from an average desktop computer, IP data is encapsulated into the LLC and MAC layers.  If the end user connects to the Internet, = only best-effort service is provided.  Protocols such as RSVP, H.323, = SIP and DiffServ are being developed but very few routers can process these newer protocols.  As times goes by, there will constantly be new = methods and protocols which will support QoS and real-time data.  If end = users are willing to accept low quality voice and video streams, the methods are already in place to achieve this.  Although most people will = not be eager to use VoIP over the Internet today, they may wish to receive small aspect ratio broadcast video clips and even real-time = audio/music.  Currently, an all ATM network would provide the highest quality = service.  However, although ATM will continue to play a dominant role in the = backbone networks, it is not practical to assume that ATM will be brought to the desktop or end users any time soon.  While large corporations, and service providers will continue to use ATM and fiber optic networks, = phone companies are beginning to migrate voice traffic onto packet switched = networks and the Internet.  Almost all desktop computers currently use = TCP/IP and are able to send low quality VoIP data through the Internet.  However, UDP and RTP/UDP allow the data to be transmitted faster.  SIP and H.323 are two protocols which assist IP enabled devices to = transfer real-time voice and video data.  
In the future, SIP and H.323 may interwork seamlessly with one another.  As protocols such as RSVP, RTP, SIP, and H.323 develop, the routers in the Internet will have to be upgraded.  When enough routers get upgraded, routing real-time data using QoS will be possible.

 
Acronyms and Abbreviations

API       application programming interface
ARP       address resolution protocol
AS        autonomous system
ATM       asynchronous transfer mode
CAC       call admission control
CGI       Common Gateway Interface
CLS       controlled load service
CMTS      cable modem termination system
COPS      Common Open Policy Service
CPL       Call Processing Language
CSMA      carrier sense multiple access
DiffServ  Differentiated Services
DMTF      Distributed Management Task Force
DTMF      dual tone multifrequency
ETSI      European Telecommunications Standards Institute
IETF      Internet Engineering Task Force
IMTC      International Multimedia Teleconferencing Consortium
IN        intelligent network
IntServ   Integrated Services
IP        Internet Protocol
ISDN      Integrated Services Digital Network
ISP       Internet service provider
ITU-T     International Telecommunication Union, Telecommunication Standardization Sector
LIS       Logical IP subnet
LL layer  Logical Link layer
MAC       medium access control
MARS      multicast address resolution server
MGCP      Media Gateway Control Protocol
MPOA      Multiprotocol over ATM
NHRP      Next Hop Resolution Protocol
OSPF      Open Shortest Path First
PHB       per-hop behavior
PSTN      public switched telephone network
QoS       Quality of Service
RAS       Registration, Admission, and Status
RSVP      Resource Reservation Protocol
RTP       Real-time Transport Protocol
RTCP      RTP Control Protocol
SIP       Session Initiation Protocol
SNMP      Simple Network Management Protocol
TCP       Transmission Control Protocol
3GPP      Third Generation (3G) Partnership Project
TIPHON    Telecommunications and Internet Protocol Harmonization over Networks
UDP       User Datagram Protocol
VoIP      Voice over IP

 
 
References
 
[1]  Kuo, Chin-Hwa et al., "Delivering Voice over The Internet", IEEE, pp. 628-632, September 2000.
[2]  Ha, E.J. & Park, J.T., "Voice Traffic Multiplexing Scheme With Guaranteed QoS Between VOIP Access Routers Using DiffServ", Electronics Letters, Vol. 37, No. 13, 21 June 2001.
[3]  Hamdi, M. et al., "Voice Service Internetworking for PSTN and IP Networks", 3Com technical paper, IEEE Communication Magazine, Vol. 35, No. 5, May 1999.
[4]  Black, Uyless, "ATM Volume III Internetworking with ATM", 1998.
[5]  Kubzansky, Mike, "Global Crossing VoIP Overview", 11 July 2001.
[6]  Sackett, George, "ATM and Multiprotocol Networking", 1997.
[7]  Keiser, Bernhard E. et al., "Digital Telephony and Network Integration", second edition, 1995.
[8]  Hamdi, M. et al., "Voice Service Interworking for PSTN and IP Networks", IEEE Communications Magazine, Vol. 35, No. 5, May 1999.
[9]  Tyagi, Anurag et al., "VoIP Support on Differentiated Services Using Expedited Forwarding", IEEE, August 2000.
[10]  Nasser, H. et al., "Voice over Differentiated Services", Internet draft, draft-nasser-voice-diffserv-eval-99.txt, December 1998.
[11]  Loren, Paul, "Quality of Service And New Architectures For Future Telecommunications Networks", IEEE, 2000.
[12]  Schulzrinne, "The Session Initiation Protocol (SIP)", Dept. of Computer Science, May 2001.
[13]  Alles, Anthony, "ATM Internetworking", May 1995.
[14]  ATM Forum.org, Asynchronous Transfer Mode Switching, chapter 20.
[15]  Freeburg et al., "Asynchronous Transfer Mode Radio Communications System With Handoff And Method of Operation", U.S. Patent 5,940,381, August 17, 1999.
[16]  Stevens, W. Richard, "TCP/IP Illustrated, Volume 1", Addison-Wesley, 1994.
[17]  Stallings, William, "High-Speed Networks: TCP/IP and ATM Design Principles", Prentice-Hall, Inc., 1998.
[18]  Bos, Lieve et al., "Toward an All-IP-Based UMTS System Architecture", IEEE Network, February 2001.
[19]  Moon, Bongkyo et al., "RSVP Extensions for Real-Time Services in Wireless Mobile Networks", IEEE Communications Magazine, December 2001.
 

Appendix

1.  Transmission Mediums

    The majority of the telephones in use today are still analog sets served by 64 kbps (kilobits per second) DS0 channels, even though most phone companies are using digital packet switched networks to route the voice data.  Even in the analog form, voice data is only allowed to occupy from 300 Hz to, at most, 3000-4000 Hz.  When using an analog phone, as opposed to, e.g., an ISDN phone, it appears that the user is using circuit switching (CS) technology, but when the data gets to the first hop/node, the local exchange carrier (LEC) converts the CS voice data into packet switched data.  In order to run fewer cables, 24 of the DS0s can be time division multiplexed (TDM) into a single digital 1.544 Mbps T1 line.  Allowing eight bits for each channel and one bit for framing overhead yields 24 x 8 + 1 = 193 bits per frame.  Sampling at the Nyquist rate (at least twice the highest frequency, i.e., 8000 samples per second for a 4000 Hz channel) yields 8000 x 193 = 1.544 Mbps.  Phone companies are now using fiber optic transmission mediums in the main, or backbone, network.  A protocol such as Asynchronous Transfer Mode (ATM) can run over fiber optic cables and has the ability to convert the T1 data into 53 byte fixed length packets called cells.  ATM generally operates in increments of 155.52 Mbps.  Synchronous Digital Hierarchy (SDH) is used in order to have a common denominator, meaning that it relates the North American 1.544 Mbps T1 lines to the European 2.048 Mbps E1 rates.  The SONET STS-3 and the SDH STM-1 both run at 155.52 Mbps.  The 155.52 Mbps equates to multiplying 270 columns (length of SDH frame) times nine rows times 64 kbps (each byte is a 64 kbps channel).  The SDH-based physical layer can carry ATM-based or STM-based (synchronous transfer mode) payloads, making it possible to deploy a high-capacity fiber-based transmission infrastructure for a variety of circuit-switched (CS) dedicated applications, and then migrate to the support of ATM.
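
The T1 and STM-1 rates quoted above can be checked with a few lines of arithmetic.  The sketch below uses only the figures given in the text; the variable names are chosen here for illustration.

```python
# Rate arithmetic for a T1 line and an SDH STM-1 frame.
# All numeric values come from the paragraph above; names are illustrative.

CHANNELS_PER_T1 = 24       # DS0 channels multiplexed into one T1
BITS_PER_CHANNEL = 8       # one 8-bit sample per channel per frame
FRAMING_BITS = 1           # one overhead bit per frame
FRAMES_PER_SECOND = 8000   # one frame per sampling period

bits_per_frame = CHANNELS_PER_T1 * BITS_PER_CHANNEL + FRAMING_BITS
t1_rate_bps = bits_per_frame * FRAMES_PER_SECOND

SDH_COLUMNS = 270          # bytes per row of the SDH frame
SDH_ROWS = 9               # rows per SDH frame
DS0_RATE_BPS = 64_000      # each byte position is a 64 kbps channel

stm1_rate_bps = SDH_COLUMNS * SDH_ROWS * DS0_RATE_BPS

print(f"T1 frame: {bits_per_frame} bits, rate {t1_rate_bps / 1e6} Mbps")
print(f"STM-1 rate: {stm1_rate_bps / 1e6} Mbps")
```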
 
2.  Office Networks
    In an office environment, there are usually two separate networks: one network for the packet switched computer data and a second network which carries the voice data.  The computers are generally attached to a LAN, e.g., a 100 Mbps Fast Ethernet or a 1 Gbps Gigabit Ethernet, using inexpensive network interface cards (NICs) and copper unshielded twisted pair (UTP) wires, e.g., category five (CAT5) cabling.  The LAN interfaces an ISP through an edge router using either T1 lines or fibers.  The phone lines are attached to a private branch exchange (PBX) and the PBX interfaces with the phone company using T1 lines.  At the present time, these two networks are merging together.  In the future, voice, video, and computer data will all occupy the same network.  Many companies are now trying different combinations of protocols to see how to best route real time voice data along with computer data.  Once the QoS issues are resolved, people will be able to place high quality video conferencing calls from their desktop computers.

3.  Voice Encoders (Vocoders)
    Low bit rate coders such as ones that use G.723.1 and G.729 are applied to the baseline codec of VoIP [2].  Other encoding schemes that can be used to encode voice data before encapsulating the data into RTP are G.711, G.722, and G.728 [3].  Speech is categorized as being either toll quality, communication quality or synthetic quality.  Toll quality generally implies that the signal-to-noise ratio (SNR) exceeds 30 dB, the frequency response is 200-3200 Hz, and total harmonic distortion (THD) is less than 3%.  The rate of toll quality ranges from around 11 kbps to over 64 kbps [7].  Communication quality speech has been labeled as being fair or good, whereas synthetic quality is thought of as being less than fair, or even fake or unacceptable.  The newer codecs use smaller packets and lower transmission rates.  The VoIP systems which use the newer standardized encoding algorithms are experiencing less delay and jitter.
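
To illustrate why lower rate codecs produce smaller packets, the following sketch estimates the payload size and the on-the-wire IP bandwidth of one voice stream.  It is only a rough calculation under stated assumptions: the 20 ms packetization interval is chosen here and does not come from the text, the codec rates are the nominal rates of G.711 (64 kbps) and G.729 (8 kbps), and only the uncompressed IPv4, UDP and RTP headers are counted (no layer-2 overhead).

```python
# Per-stream bandwidth estimate for RTP/UDP/IP voice packets.
# Assumptions (not from the paper): 20 ms of speech per packet,
# IPv4 without options, no header compression, no layer-2 overhead.

IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP fixed headers


def payload_bytes(codec_rate_bps: int, packet_interval_ms: int) -> int:
    """Bytes of encoded speech carried in one packet."""
    return codec_rate_bps * packet_interval_ms // 8000


def ip_bandwidth_bps(codec_rate_bps: int, packet_interval_ms: int) -> int:
    """Bit rate of the stream including IP, UDP and RTP headers."""
    packet_bytes = payload_bytes(codec_rate_bps, packet_interval_ms) + IP_UDP_RTP_HEADER_BYTES
    return packet_bytes * 8 * 1000 // packet_interval_ms


for name, rate in (("G.711", 64_000), ("G.729", 8_000)):
    print(name, payload_bytes(rate, 20), "payload bytes,",
          ip_bandwidth_bps(rate, 20), "bps with headers")
```

Under these assumptions the headers are a small fraction of a G.711 packet but are twice the size of a G.729 payload, which is one reason header overhead matters more for low bit rate coders.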

4.  Protocol Stack Basics
    The Open Systems Interconnection (OSI) model comprises seven layers.  Some of the more popular terms will be reviewed here.  The layers are numbered from the bottom/physical layer up.  Layer number two is known as the Data Link Layer (DLL) or the Logical Link (LL) layer.  At the LL layer, IP datagrams are encapsulated into frames.  The term datagram implies that a connectionless service will be used.  Layer-2 is the only layer which receives data/protocol data units (PDUs) from the next higher layer and appends both a header and a trailer to them.  The bottom part of the DLL comprises the medium access control (MAC) sublayer.  This is where the MAC/hardware/Ethernet address is appended.  The MAC address is derived from the network interface card (NIC), and comprises a 24 bit vendor (manufacturer of the NIC) ID and a 24 bit serial number.  Although terms such as segment (layer-4 term), packet/datagram (layer-3), and frame (layer-2) are often used interchangeably, the correct term to use when referring to data at the LL layer is frame.
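
As a small illustration of the MAC address layout described above, the sketch below splits a 48 bit address into its vendor ID and serial number halves.  The function name and the sample address are hypothetical and are used only for illustration.

```python
from typing import Tuple

HEX_DIGITS = "0123456789abcdef"


def split_mac(mac: str) -> Tuple[str, str]:
    """Split a MAC address into (vendor ID, serial number), three octets each."""
    octets = mac.replace("-", ":").lower().split(":")
    if len(octets) != 6:
        raise ValueError("a MAC address has six octets")
    for octet in octets:
        if len(octet) != 2 or any(ch not in HEX_DIGITS for ch in octet):
            raise ValueError(f"invalid octet: {octet!r}")
    return ":".join(octets[:3]), ":".join(octets[3:])


# Hypothetical address, written with either ':' or '-' separators.
vendor_id, serial = split_mac("00-A0-C9-14-C8-29")
print("vendor ID:", vendor_id, "serial number:", serial)
```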

5.  User Datagram Protocol (UDP)
    It would be easier to provide QoS if the protocols operated faster or if there was less congestion.  UDP is one such transport protocol which moves data from source to destination faster than, e.g., TCP does.  UDP and TCP both operate at the OSI transport level (layer-4, see Appendix, section 4) but unlike TCP, UDP uses neither acknowledgements nor retransmissions.  VoIP packets can be encapsulated directly into UDP.  If packets are received at the destination in error, they are simply discarded.  Waiting for retransmissions would add excess delay.  Like IP, UDP is a datagram-oriented, connectionless protocol.  Unlike a stream-oriented protocol such as TCP, each output operation by a process produces exactly one UDP datagram, which causes one IP datagram to be sent.  UDP's primary purpose is to send and receive datagrams between application programs.  UDP provides protocol ports to distinguish among multiple programs on the same machine.  The two main services that UDP provides beyond IP are port numbering and an optional checksum.  The UDP header includes both 16-bit source and destination port numbers.  The destination port number is used to demultiplex incoming data from the IP layer.  In TCP, the checksum is mandatory.  The checksum in UDP covers the UDP header and user data.  The IP checksum only covers the IP header, not the data.  UDP provides no reliability, acknowledgements (ACKs), sequencing, or flow control.  The UDP length field indicates the length of the UDP header and data in bytes.  The minimum value for this field is eight bytes.  Since there is a length field in IP, the UDP length field is redundant. 
The checksum also covers additional information in the form of a pseudo-header.  The pseudo-header is not transmitted and is not included in the length computation.  The purpose of the pseudo-header is to provide independent confirmation that the datagram reached the correct destination, in addition to the correct port and protocol.  The pseudo-header consists of 12 octets and includes the source and destination IP addresses, a field labeled Proto which identifies the upper layer protocol, the UDP length (excluding the 12 octets of the pseudo-header), and one octet of zero padding.  With UDP it is easy to generate IP fragmentation, whereas TCP tries to avoid fragmentation.  Since UDP transfers data faster than TCP, it is a better transport protocol for transmitting VoIP and/or real time data.
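
The header fields and the pseudo-header described above can be sketched as follows.  This is a minimal illustration for IPv4, not a replacement for the operating system's UDP implementation; the function names, addresses and port numbers are chosen here for illustration.

```python
import socket
import struct

UDP_PROTOCOL_NUMBER = 17  # value carried in the pseudo-header Proto field


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """Build the 12-octet pseudo-header: src IP, dst IP, zero, Proto, UDP length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, UDP_PROTOCOL_NUMBER, udp_length))


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum; odd-length input is padded with a zero octet."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def build_udp_datagram(src_ip: str, dst_ip: str, src_port: int,
                       dst_port: int, payload: bytes) -> bytes:
    """Return the 8-byte UDP header (checksum filled in) followed by the payload."""
    length = 8 + len(payload)  # covers header and data, not the pseudo-header
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    covered = pseudo_header(src_ip, dst_ip, length) + header + payload
    checksum = ~ones_complement_sum(covered) & 0xFFFF
    if checksum == 0:
        checksum = 0xFFFF  # a transmitted zero means "no checksum computed"
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def verify_udp_datagram(src_ip: str, dst_ip: str, datagram: bytes) -> bool:
    """Receiver check: pseudo-header plus datagram must sum to 0xFFFF."""
    covered = pseudo_header(src_ip, dst_ip, len(datagram)) + datagram
    return ones_complement_sum(covered) == 0xFFFF


datagram = build_udp_datagram("192.0.2.1", "192.0.2.2", 5004, 5004, b"voice")
print(verify_udp_datagram("192.0.2.1", "192.0.2.2", datagram))
```

Because the destination IP address is part of the checksummed data, a datagram delivered to the wrong host fails verification even though that address never appears in the UDP header itself.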
 

6.  Ethernet as an Interface
    The most popular local area network (LAN) is Ethernet.  In the past, Ethernet LANs were providing data rates at 10 Mbps and then 100 Mbps while ATM was operating at 155 Mbps.  Since ATM was mainly using fiber as the transmission medium and Ethernet was running on the twisted pair wiring which was already installed in almost every commercial building, Ethernet was not going to be replaced by ATM in the LAN environment.  Just to make sure Ethernet did not lose its dominance, the IEEE quickly standardized Gigabit (1 Gbps) Ethernet.  This new Ethernet (802.3z) came out in 1998 and it only ran over fiber.  In June 1999, a Gigabit Ethernet which could run over copper twisted pair wiring (802.3ab) was approved.  From that point on, ATM basically had no chance of breaking into the LAN networking business.  However, ATM emulated LANs (ELANs) and virtual LANs (VLANs) are being used.  They are complicated to install, and ELANs usually have a single point of failure.  However, these fiber based ELANs offer a very high level of security.  The VLANs allow users from different LANs to share resources, e.g., printers or servers, while not allowing users from the same physical/real LAN, but a different VLAN, to communicate with one another.  LAN emulation enables traditional/legacy LANs to run on top of and coexist with ATM switch based networks.  The ELANs interwork with legacy LANs through a bridge function [6].  Ethernet runs at layer-2 and comprises the LL layer and the medium access control (MAC) layer.  It is a carrier sense multiple access (CSMA) protocol.  Before transmitting, users first monitor the bus (for voltage levels) to see if other users are currently transmitting.  If there is a collision, the users go into a binary exponential backoff waiting period before attempting to transmit again.  A LAN is connected to an edge router, and all frames that are going outside the LAN or building onto the Internet pass through the edge router first.  
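
The binary exponential backoff mentioned above can be sketched as follows.  The constants (a 51.2 microsecond slot time for 10 Mbps Ethernet, a backoff cap of 10 doublings, and a limit of 16 attempts) are the commonly cited IEEE 802.3 values rather than figures from this paper, and the function name is illustrative.

```python
import random

SLOT_TIME_US = 51.2   # slot time for 10 Mbps Ethernet (512 bit times)
BACKOFF_LIMIT = 10    # the random range stops doubling after 10 collisions
MAX_ATTEMPTS = 16     # the frame is abandoned after 16 attempts


def backoff_slots(collision_count: int, rng=random) -> int:
    """Pick a random wait, in slot times, after the n-th successive collision."""
    if not 1 <= collision_count <= MAX_ATTEMPTS:
        raise ValueError("frame is dropped after too many collisions")
    exponent = min(collision_count, BACKOFF_LIMIT)
    return rng.randrange(2 ** exponent)  # uniform in 0 .. 2^k - 1


for n in (1, 2, 3, 10, 16):
    slots = backoff_slots(n)
    print(f"collision {n}: wait {slots} slots ({slots * SLOT_TIME_US:.1f} us)")
```

Each successive collision doubles the range from which the wait is drawn, so stations that keep colliding spread their retries over a longer period; this randomness is also a source of delay variation for voice frames on a busy shared segment.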
When using IP over Ethernet, the 32 bit IP addresses must be mapped to 48 bit hardware (MAC) addresses.  The address resolution protocol (ARP) binds addresses dynamically by making use of the broadcast nature of Ethernet.  If host A wants to know the physical/Ethernet/MAC address of host B, then host A broadcasts an ARP packet with the IP address of host B and asks anyone with that IP address to respond with its physical address.  Only host B recognizes the IP address as its own and sends a reply with its physical address.  Also, when host A sends the ARP request, it includes its own IP-to-physical address binding for host B to cache (store locally).
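
The ARP exchange between host A and host B can be modeled with a short simulation.  This is a toy model of the request/reply logic only, with no real network traffic; the class, function, and address values are hypothetical.

```python
from typing import Dict, List, Optional


class Host:
    """A LAN host with an IP address, a MAC address and an ARP cache."""

    def __init__(self, ip: str, mac: str) -> None:
        self.ip = ip
        self.mac = mac
        self.arp_cache: Dict[str, str] = {}

    def handle_arp_request(self, sender_ip: str, sender_mac: str,
                           target_ip: str) -> Optional[str]:
        """Reply with our MAC only if the request asks for our IP address."""
        if target_ip != self.ip:
            return None  # every other host ignores the broadcast
        self.arp_cache[sender_ip] = sender_mac  # cache the requester's binding
        return self.mac


def arp_resolve(sender: Host, target_ip: str, lan: List[Host]) -> Optional[str]:
    """Broadcast an ARP request on the LAN and cache the reply, if any."""
    if target_ip in sender.arp_cache:
        return sender.arp_cache[target_ip]
    for host in lan:
        if host is sender:
            continue
        reply = host.handle_arp_request(sender.ip, sender.mac, target_ip)
        if reply is not None:
            sender.arp_cache[target_ip] = reply
            return reply
    return None


host_a = Host("192.0.2.10", "00:a0:c9:00:00:0a")
host_b = Host("192.0.2.20", "00:a0:c9:00:00:0b")
host_c = Host("192.0.2.30", "00:a0:c9:00:00:0c")
print(arp_resolve(host_a, "192.0.2.20", [host_a, host_b, host_c]))
```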
 
7.  Routing Basics
    Currently, the millions of routers that are installed throughout the Internet and the world essentially do not provide any real QoS.  IP runs at the network layer (layer-3) and it is up to the higher layers, such as the Transmission Control Protocol (TCP), to request retransmissions when packets are lost or received in error.  Routing is performed in software while switching is performed in hardware [4].  Any processing of a packet will add to the overall transfer delay.  One of the main differences between IP and ATM is that IP datagrams are routed, while ATM cells are switched.  In networks such as the Internet, many different routing protocols are currently being used.  The process on the router that runs the routing protocol and communicates with its neighbor routers is usually called a routing daemon.  The term daemon means the process is running in the background, carrying out operations on behalf of the whole system.  Unix systems often run the routing daemon named Routed.  It is provided with almost every implementation of TCP/IP and it communicates using only RIP.  There is also another daemon called Gated which supports both interior gateway protocols (IGPs) and exterior gateway protocols (EGPs).  Dynamic routing implies that the routing tables are constantly being updated.  A corporation or campus often defines an autonomous system (AS) as all the routers in its network that are under a single administrative control.
 
7.1  Interior Gateway Protocols (IGP)
    Each AS can select its own routing protocol to communicate between its own routers.  A protocol which is used for routing within an AS is referred to as an IGP or an intradomain routing protocol.  One of the first IGPs is the HELLO protocol.  It was the IGP used on the original NSFNET backbone (1986) but is not used much now.  The most widely used IGP is the Routing Information Protocol (RIP).  The Intermediate System to Intermediate System (IS-IS) protocol came out in 1986, and the Open Shortest Path First (OSPF) protocol came out in 1987.  OSPF is now intended to replace RIP, and in 1993 an RFC was published which stated that a router that implements any dynamic routing must support at least OSPF and RIP.  When routing an IP datagram/packet, some of the questions that the router will have to ask itself are as follows.  Is this packet for me, meaning, is the layer-2 destination address my address?  Is the checksum okay?  Is the Time to Live (TTL) okay, or should this packet be automatically discarded because the TTL has expired?  Do I have a route for this destination?  Is the destination local or remote?  One reason that routing is more time consuming than switching is that, e.g., the TTL has to be processed.  Very often, the terms router and gateway are used as synonyms, even though technically they are completely different devices.  In order to route a packet, the layer-2 hardware/MAC address must be known.  Bridges forward data using layer-2 addresses, routers read both layer-2 and layer-3 addresses, and gateways can read all seven layers (of the OSI model) and perform various types of protocol conversions.  In order to route a packet through the Internet, the packet will have to pass through many different routers before it reaches its final destination.  This is referred to as indirect routing because there are intermediate nodes.  
When a packet is received at a router or gateway, the router will refer to its routing table to see what is the best way to send the packet closer to its destination.  Typically a router reads the destination IP address and compares it to the entries in its routing table.  Then the router will know which port to send the packet out on and which next hop to send it to.  Each router has its own MAC address, and as a packet gets routed from one router to another, the individual router MAC addresses will be used for the layer-2 destination address.  When a packet is sent out on an indirect route, the layer-3 IP address, which specifies the final destination, does not change.  However, the layer-2 destination address will change at each and every router/hop.  Up to five flags in each routing table entry describe the route, e.g., if the G flag is set, then the route is to a gateway and the packet is on an indirect route.  In direct routing, a packet is addressed from one host to another.  In this case, the layer-2 destination address will not be changed.  Two hosts on the same LAN would direct route packets between themselves.  Currently, routers do not prioritize data.  If a router were to grant voice traffic a higher priority than other data, the lower priority data may end up getting dropped or discarded.  If there are going to be levels of priority for various types of traffic or connections, service providers will end up wanting to charge for them.  At the current time, it does not seem feasible for routers all around the world to agree on priority issues and billing schemes.  One solution is for a carrier to install its own network because then it can control the traffic loads and block calls which cannot be guaranteed the proper QoS parameters.  Global Crossing has such a network.  They are deploying one of the world's most extensive VoIP networks over the core private Global backbone [5].  However, since Global Crossing is in the minority, and true VoIP over the Internet requires using all possible networks, protocols which allow real time data to traverse internets with minimal delay and corruption must be implemented.
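
The per-packet questions a router asks, and the way the layer-2 destination changes at each hop while the layer-3 destination stays fixed, can be sketched as follows.  This is a simplified model under stated assumptions: the data structures, addresses and the longest-prefix-match rule are illustrative choices made here, and the header checksum check is omitted.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    prefix: ipaddress.IPv4Network  # destinations covered by this entry
    next_hop_mac: str              # layer-2 address of the next router or host
    port: int                      # outgoing interface
    indirect: bool                 # True when the G (gateway) flag is set


@dataclass
class Packet:
    dst_ip: ipaddress.IPv4Address  # final destination, never rewritten
    dst_mac: str                   # rewritten at every hop
    ttl: int


def forward(packet: Packet, my_mac: str, table: List[Route]) -> Optional[int]:
    """Return the output port, or None if the packet is ignored or discarded."""
    if packet.dst_mac != my_mac:
        return None                # "Is this packet for me?"
    if packet.ttl <= 1:
        return None                # TTL expired: discard
    matches = [r for r in table if packet.dst_ip in r.prefix]
    if not matches:
        return None                # "Do I have a route for this destination?"
    best = max(matches, key=lambda r: r.prefix.prefixlen)
    packet.ttl -= 1
    packet.dst_mac = best.next_hop_mac  # new layer-2 destination for this hop
    return best.port


ROUTER_MAC = "00:a0:c9:00:00:01"
table = [
    Route(ipaddress.ip_network("0.0.0.0/0"), "00:a0:c9:00:00:fe", 0, True),
    Route(ipaddress.ip_network("10.1.0.0/16"), "00:a0:c9:00:00:22", 1, True),
]
pkt = Packet(ipaddress.ip_address("10.1.2.3"), ROUTER_MAC, ttl=64)
print(forward(pkt, ROUTER_MAC, table), pkt)
```

Every one of these checks is performed per packet, which is part of the processing delay the text attributes to routing as opposed to switching.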
 
7.2  Distance-Vector (Bellman-Ford) Routing
    The term distance-vector comes from the type of information which is sent from router to router periodically.  In D-V routing, routers send out copies of their entire routing table to adjacent routers.  The messages sent between the routers comprise a list of pairs (D, V), where V identifies a destination (called the vector) and D is the distance to that destination/vector.  This process amounts to what is known as topology discovery.  In D-V routing, each message is reported in the first person, meaning a router says "I can reach destination V at distance D".  Usually D is synonymous with the number of hops it takes to reach the destination (V).  It is very common to refer to the number of hops as the cost.  This is where the term least cost routing (LCR) comes from.  However, in LCR algorithms, costs may comprise a plurality of other factors besides the number of hops.  Costs can literally equate to expense/money since, e.g., routing through certain gateways, such as a satellite, can be known to be more expensive than routing through a terrestrial (land based) router.  The application will dictate what "cost" equates to.  In the OSPF protocol, costs can be assigned based on throughput, round trip time, reliability, or other metrics.
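
The way a router merges a neighbor's (D, V) pairs into its own table can be sketched with the Bellman-Ford update below.  This is a minimal model under stated assumptions: the table layout, router names and link costs are illustrative, and refinements such as split horizon or an "infinity" metric are left out.

```python
from typing import Dict, Tuple

# Routing table: destination V -> (distance D, next hop)
RoutingTable = Dict[str, Tuple[int, str]]


def dv_update(table: RoutingTable, neighbor: str, link_cost: int,
              advertised: Dict[str, int]) -> bool:
    """Merge one neighbor's advertisement; return True if the table changed."""
    changed = False
    for dest, distance in advertised.items():
        candidate = distance + link_cost
        current = table.get(dest)
        if (current is None                      # new destination learned
                or candidate < current[0]        # cheaper path found
                or (current[1] == neighbor and candidate != current[0])):
            # The last case: our current next hop reports a different distance,
            # so its new value must be accepted even if it is worse.
            table[dest] = (candidate, neighbor)
            changed = True
    return changed


table: RoutingTable = {"R": (0, "R")}            # router R knows only itself
dv_update(table, "N1", 1, {"N1": 0, "X": 2})     # N1: "I can reach X at distance 2"
dv_update(table, "N2", 1, {"N2": 0, "X": 1, "R": 1})
print(table)
```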
 
7.3  Link-State (L-S) Protocols
    In an L-S protocol a router does not exchange distances with its neighbors.  Instead, each router actively tests the status of its link to each of its neighbors.  Then it sends this data to its other neighbors, which then propagate it throughout the autonomous system.  Each router takes this L-S information and builds a complete routing table.  This is how topology discovery is performed when using an L-S protocol.  The L-S protocols will always converge/stabilize faster than D-V protocols after something changes.  When using an L-S algorithm, all routers will have identical routing databases (DBs), each DB will describe the complete topology, and each router can use its DB to derive the shortest paths to all destinations, possibly using the Dijkstra least-cost algorithm.  Two protocols which use an L-S algorithm are OSPF and IS-IS (see below).
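
As an illustration of how each router can derive least-cost paths from its copy of the link-state database, the sketch below runs Dijkstra's algorithm over a small topology.  The four-router topology and its link costs are made up for the example.

```python
import heapq
from typing import Dict

# Link-state database: router -> {neighbor: link cost}
LinkStateDB = Dict[str, Dict[str, int]]


def shortest_paths(lsdb: LinkStateDB, source: str) -> Dict[str, int]:
    """Dijkstra's algorithm: least cost from source to every reachable router."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry; a cheaper path was already found
        for neighbor, link_cost in lsdb.get(node, {}).items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist


# Hypothetical topology; every router holds an identical copy of this DB.
lsdb: LinkStateDB = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
print(shortest_paths(lsdb, "A"))
```

Since all routers run the same computation over identical databases, each one arrives at routes that are consistent with its neighbors' routes without exchanging distances.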
 
Routing Protocol Comparisons
 
Algorithm: Link-State (L-S) vs. Distance-Vector (D-V) vs. Path Vector (P-V)
Type: Interior Gateway Protocol (IGP, routing within an autonomous system) or Exterior Gateway Protocol (EGP, routing between autonomous systems)

Name           Type  Algorithm     Daemon
HELLO          IGP   -             Gated v2/v3
RIP version 1  IGP   D-V           Routed/Gated v2
RIP v2         IGP   D-V           Gated v3
OSPF           IGP   L-S           Gated v3
IS-IS          IGP   L-S           -
EGP            EGP   Reachability  Gated v2/v3
BGP v1         EGP   P-V           Gated v2
BGP v2/v3      EGP   P-V           Gated v3
 
8.  Multiprotocol Label Switching (MPLS)
    Another method of supporting QoS is MPLS.  When using MPLS, the end station is essentially connected to a switch, not a router, and a label is added to each IP packet at the ingress of the network.  An ARP server is connected to the switch and will give the end station(s) the physical addresses.  The packets carrying a "label" can be identified and switched as opposed to being routed.  Edge devices can monitor traffic patterns, determine that certain data is not bursty and is instead CBR (constant bit rate) data, and then switch the CBR data in order to keep the delay (commonly associated with routers) low.  The labels and the label switching router (LSR) determine how packets are forwarded through a network.  The label distribution protocol (LDP) builds a table and delivers the labels.  The CR-LDP protocol allows the definition of a label switched path (LSP) with QoS constraints.  The QoS classification uses a single valued label, not a multi-field one.  MPLS only calculates the best route one time for a given communication.
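
The difference between classifying a packet once at the ingress and then switching on a single label can be sketched as follows.  This is a toy model: the tables, label values, router names and port numbers are hypothetical, and in a real network they would be populated by a label distribution protocol such as LDP or CR-LDP.

```python
import ipaddress
from typing import Dict, List, Optional, Tuple

# Ingress table: destination prefix -> label pushed at the edge of the network.
INGRESS_TABLE: List[Tuple[ipaddress.IPv4Network, int]] = [
    (ipaddress.ip_network("10.2.0.0/16"), 17),
]

# Per-LSR table: incoming label -> (outgoing label, output port).
# An outgoing label of None means the label is removed at the egress.
LSR_TABLES: Dict[str, Dict[int, Tuple[Optional[int], int]]] = {
    "lsr1": {17: (22, 2)},
    "lsr2": {22: (None, 1)},
}


def ingress_label(dst_ip: str) -> Optional[int]:
    """Classify the packet once, at the ingress, by longest matching prefix."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [(net, label) for net, label in INGRESS_TABLE if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def switch(lsr: str, in_label: int) -> Tuple[Optional[int], int]:
    """Inside the network, forwarding is a single exact-match lookup on the label."""
    return LSR_TABLES[lsr][in_label]


label = ingress_label("10.2.3.4")
for lsr in ("lsr1", "lsr2"):
    label, port = switch(lsr, label)
    print(lsr, "-> out label", label, "port", port)
```

The route computation happens only in `ingress_label`; every later hop performs one dictionary lookup, which mirrors the statement that MPLS calculates the best route one time for a given communication.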

9.  Data-Over-Cable Service Interface Specification (DOCSIS)
    DOCSIS is designed to handle latency sensitive services, especially voice, so that a guaranteed QoS is provided.  First generation VoIP for cable networks will be partial access bypass systems, where the cable network is used to bypass the twisted pair network and deliver voice directly to a local Class 5 switch at the head end.  The existing PSTN is used from the connection point at the Class 5 switch.  In this environment there will be a limited number of hops, so the call quality is very good.  Eventually the VoIP equipment will perform end to end IP transport, where the voice packets are carried all the way to their destination without converting the packets to the public phone network.  In this environment, managing delay throughout the entire network is critical.  In a typical 2 MHz DOCSIS 1.1 upstream bandwidth, a typical cable network can simultaneously handle only 17 toll quality voice calls and 22 cable modem users who are accessing the Internet at 350 kbps.  In this example, about 64% of the upstream is used for the voice data, about 20% is used for other data such as Internet browsing, and about 16% is left for headroom.