Quality of Service Models
There are three basic quality of service models. The first is best effort, which is essentially no quality of service: you've done nothing to give certain traffic priority over other traffic, so it gets there when it gets there. Integrated Services is a reservation model. You make a reservation through the network for X amount of bandwidth to ensure that a particular flow will have enough, and because you hold the reservation you're guaranteed the service or the availability. Then there's the Differentiated Services model, or DiffServ. This is the model that most of us choose to use. It uses something called MQC, the Modular QoS Command-Line Interface, which is what we're going to be working with. With MQC we're able to classify and mark traffic and then say what we want done to it; it enforces what we call QoS policies on how that traffic should be treated. We can support a number of different levels of service, and we can use this in a very large enterprise network. It gets a little tougher if you use something like IntServ, where you have to make reservations through the network in a large enterprise; in other words, it doesn't scale as well as the Differentiated Services model does.
- Best-effort model
- No QoS applied to packets
- Default model for all traffic
- Integrated Services model (IntServ)
- Offers absolute QoS guarantees by explicitly reserving bandwidth
- Uses RSVP to reserve network resources
- Differentiated services model (DiffServ)
- Allows classification of network traffic
- QoS policy enforces differentiated treatment of traffic classes
- Many levels of quality possible
- Commonly used in enterprise networks
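To make the DiffServ model concrete, here is a minimal MQC sketch of the class/policy/interface structure. The class, policy, and interface names are hypothetical, and the syntax follows classic Cisco IOS; your platform's exact keywords may vary.

```
! Classify traffic already marked AF31 into a class (hypothetical names)
class-map match-all CRITICAL-DATA
 match dscp af31
! Define the differentiated treatment per class
policy-map WAN-EDGE
 class CRITICAL-DATA
  bandwidth percent 30
 class class-default
  fair-queue
! Enforce the policy in a direction on an interface
interface Serial0/0
 service-policy output WAN-EDGE
```

The three-step pattern (class-map to classify, policy-map to define treatment, service-policy to apply) is the core of MQC regardless of which QoS tool the policy invokes.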
Quality of Service Mechanisms
There are several quality of service tools we can use, but first we need to classify the traffic. By classifying the traffic you are identifying the different types of traffic in your network. Now, if everybody's special, nobody is special. So what we try to do is take like-minded applications that we want to treat the same way and put them into a traffic class. We really shouldn't go above 11 classes of traffic at this point in time. We might dedicate one traffic class to voice, because we're going to treat that differently than any other traffic class, and we can even have a scavenger traffic class that says, "Hey, everything else going through the network that doesn't need any kind of special treatment, you get to go out the interface last if you come across a queuing mechanism." So we classify, and then we mark the traffic.
We mark the traffic with a Differentiated Services Code Point (DSCP) value. We've now said, okay, if you are a certain type of traffic, this is the number that's going to be associated with you. Cisco calls it coloring: based on that number, this is the class of traffic you are. Now we can go out and say, "Based on your traffic class, this is how much bandwidth you get, or this is how you're going to be treated as you go through the network." We want to do our classification and marking early on, as close to the source as possible. That way, as a packet travels through the network, it doesn't have to be reclassified or remarked everywhere; it maintains those values, and we set up what treatment we want to happen accordingly.
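A classification-and-marking policy applied inbound at the access edge might look like the sketch below. The names are hypothetical, and `match protocol rtp` assumes NBAR is available on the platform; an access-list match could substitute for it.

```
! Identify voice media (assumes NBAR support for protocol matching)
class-map match-all VOICE-RTP
 match protocol rtp
! Color voice EF; everything else gets re-marked to default
policy-map MARK-INGRESS
 class VOICE-RTP
  set dscp ef
 class class-default
  set dscp default
! Apply inbound, as close to the source as possible
interface GigabitEthernet0/1
 service-policy input MARK-INGRESS
```

Once packets carry the DSCP value, downstream devices can match on `dscp ef` directly instead of re-running deeper classification.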
There's also congestion management, which is the queuing mechanisms that prioritize the transmission of these packets, and they do this based on the classification and marking that we put in place.
Congestion avoidance is a little bit different: we use this to drop packets early to avoid congestion later on in the network. Thinking about that, I would not want to give my voice traffic a high drop probability; maybe that data packet or that scavenger packet, but I would try to avoid having to drop any of my voice packets. The nice thing about congestion avoidance tools is that we can say this particular packet has a high drop probability and this particular packet has a much lower one. In other words, it's not that we can say "never drop a voice packet," but we can put it way down the list of what's likely to get dropped at an interface.
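One common congestion avoidance tool is DSCP-based WRED, which starts dropping low-priority packets earlier than high-priority ones as a queue fills. A hedged sketch, with hypothetical policy names and illustrative thresholds:

```
policy-map WAN-OUT
 class class-default
  fair-queue
  ! Enable WRED keyed on DSCP rather than IP precedence
  random-detect dscp-based
  ! AF13 (higher drop precedence) starts dropping at a shallower queue depth
  random-detect dscp af13 10 30 10
  ! AF11 (lower drop precedence) is protected until the queue is deeper
  random-detect dscp af11 20 40 10
```

The per-DSCP values are minimum threshold, maximum threshold, and mark-probability denominator; voice would normally sit in a separate priority class so it never faces WRED at all.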
Policing and shaping are what allow us to control and rate-limit our traffic. We can do this by remarking a packet, saying, "You came in with this value, but guess what, we're going to mark you down to a different value, and hopefully you'll still get transmitted out that interface in a timely fashion," or we can even drop packets based upon their markings.
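A policer that marks down exceeding traffic rather than simply dropping it might be sketched as follows (hypothetical names and rates; classic IOS one-line `police` syntax):

```
class-map match-all BULK-DATA
 match dscp af11
policy-map INGRESS-POLICE
 class BULK-DATA
  ! Conforming traffic passes; excess is marked down to AF12; violations drop
  police 8000000 conform-action transmit exceed-action set-dscp-transmit af12 violate-action drop
interface GigabitEthernet0/1
 service-policy input INGRESS-POLICE
```

Marking down instead of dropping lets downstream congestion avoidance (for example, WRED) discard the excess traffic first only if congestion actually occurs.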
And then finally, link efficiency mechanisms. We saw this with link fragmentation and interleaving: this is how we make more effective use of the bandwidth. We really didn't do any type of queuing here; we just made the link more efficient by chopping up those larger data packets and interleaving the voice traffic among the fragments.
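On a slow multilink PPP circuit, link fragmentation and interleaving can be sketched like this (interface and timer values are illustrative; this assumes the platform supports MLP interleaving):

```
interface Multilink1
 ppp multilink
 ! Fragment large packets so no fragment takes more than ~10 ms to serialize
 ppp multilink fragment delay 10
 ! Allow small voice packets to be interleaved between the fragments
 ppp multilink interleave
```

The fragment delay value drives the fragment size from the link speed, keeping a voice packet from waiting behind one large data frame.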
Applying Quality of Service to Input and Output Interfaces
Now, depending upon where in our network we're applying these quality of service tools, certain tools can only be applied in certain directions. On input interfaces we can use classification, marking, and policing. On output interfaces we can use congestion management, marking, congestion avoidance, shaping, policing, and compression, fragmentation, and interleaving. So depending upon the directionality, your link speeds, and what you need in your network, you can choose these tools to enable quality of service in your environment.
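That directionality shows up directly in where `service-policy` statements land. A sketch with hypothetical policy names:

```
! Inbound at the access edge: classification, marking, policing
interface GigabitEthernet0/1
 service-policy input MARK-INGRESS
! Outbound toward the WAN: queuing, shaping, congestion avoidance
interface Serial0/0
 service-policy output WAN-EDGE
```

Trying to attach a queuing policy inbound, or apply certain markings in the wrong direction, is rejected by IOS, which is a useful sanity check on the tool-versus-direction table above.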
Let’s just take a look at a high level at the different queuing algorithms and what they can do for you.
- First-in, first-out (FIFO)
- First packet in is first packet out; only one queue
- Priority queuing (PQ)
- Empties queue 1 first; if queue 1 is empty, then dispatches packets from queue 2, and so on
- Weighted fair queuing (WFQ)
- Flow-based algorithm that schedules interactive traffic to the front of a queue while fairly sharing the remaining bandwidth among high-bandwidth flows
- Class-based weighted fair queueing (CBWFQ)
- Extends WFQ functionality to provide support for user-defined traffic classes
- Low-latency queueing (LLQ)
- Brings strict PQ to CBWFQ; allows delay-sensitive data like voice to be dequeued and transmitted before packets in other queues are dequeued
Basically, first-in first-out is not really a queuing mechanism (I guess it is, because you're queuing up first come, first served), but what we typically do is use FIFO within another queuing algorithm, such as priority queuing. If we dedicate a priority queue, we're using first-in, first-out within it. Let's take voice traffic as an example. Say we take this voice traffic and put it into the priority queue; it's all the same traffic, so we have to service it first-in, first-out. We can't really service it any other way, and that's how FIFO can tie into another type of queuing mechanism.
Weighted fair queuing is a flow-based algorithm that can schedule traffic to the front of the queue and share bandwidth among high-bandwidth flows in a fair manner, and it allows us to give weight to certain flows over others, hence the "weighted" in weighted fair queuing. So instead of just servicing each queue one packet at a time in a round-robin fashion, we can say: service three packets out of this queue, two packets out of the next queue, one packet out of the bottom queue, and back to the top again. What that really does for us is give us a way to not starve out the other queues; we don't want to starve out the data traffic. The goal here is not just to send voice traffic and forget the rest of it. So that's a very basic definition of what weighted fair queuing resembles: it gives us a fairer distribution of packets out of each one of the queues.
Class-based weighted fair queuing is really an extension of weighted fair queuing. We go in and define a class according to match criteria, and a certain amount of bandwidth, a weight, and a maximum packet limit can all be assigned to that individual queue. To characterize a class, the queue limit must also be defined for that class; that's again the maximum number of packets allowed to accumulate in that particular queue.
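A CBWFQ class with its bandwidth guarantee and queue limit might be sketched like this (hypothetical names; `bandwidth` here is in kbps and `queue-limit` in packets):

```
class-map match-all CRITICAL-DATA
 match dscp af31
policy-map CBWFQ-OUT
 class CRITICAL-DATA
  ! Guarantee 2000 kbps to this class during congestion
  bandwidth 2000
  ! At most 96 packets may accumulate in this class's queue
  queue-limit 96
 class class-default
  fair-queue
```

The `bandwidth` value is what drives the class's weight in the scheduler, and `queue-limit` is the per-class depth described above.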
Low-latency queuing, though, is the one we want for voice. If we have voice traffic running in our network, we are probably going to be choosing low-latency queuing. What does that do? It gives us a priority queue for our voice and class-based weighted fair queuing for the rest of the traffic, which gives us the ability to service all of the traffic in the network without starving anybody out, making sure that mission-critical applications get serviced ahead of something like scavenger traffic; maybe that's just web surfing sitting in the scavenger queue. With class-based weighted fair queuing, each packet has a weight, and the weight for a packet belonging to a specific class is derived from the bandwidth assigned to the class when we configure it, so the bandwidth assigned to the packets of a class determines the order in which packets are sent out.
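Putting the pieces together, an LLQ policy is just CBWFQ with one class using `priority` instead of `bandwidth`. A sketch with hypothetical names and rates:

```
class-map match-all VOICE
 match dscp ef
class-map match-all CRITICAL-DATA
 match dscp af31
policy-map LLQ-OUT
 class VOICE
  ! Strict priority queue, policed to 384 kbps during congestion
  priority 384
 class CRITICAL-DATA
  ! CBWFQ guarantee of 1000 kbps for mission-critical data
  bandwidth 1000
 class class-default
  fair-queue
interface Serial0/0
 service-policy output LLQ-OUT
```

The implicit policer on the `priority` class is what keeps voice from starving the CBWFQ classes below it, which is exactly the "service everyone without starving anybody" behavior described above.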