Quality of Service (QoS)
Networks have finite bandwidth. QoS ensures critical traffic (voice/video) gets through even when the link is congested. It does not create bandwidth; it decides which traffic is delayed or dropped when demand exceeds capacity.
1. Classification & Marking
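Before prioritizing traffic, we must identify it. We "mark" packets by setting bits in the header so downstream devices know how to treat them. The point in the network where existing markings are first accepted as-is is the "Trust Boundary"; markings arriving from outside it are typically overwritten.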
Layer 2: CoS (Class of Service)
Uses the 3-bit Priority Code Point (PCP) field inside the 802.1Q VLAN Tag. Values 0-7.
- CoS 5: Voice (conventionally mapped to DSCP EF, Expedited Forwarding).
- CoS 3: Critical Data / Signaling (Call Control).
- CoS 0: Best Effort (Default).
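Limitation: The PCP field exists only in 802.1Q-tagged frames, so typically only on trunk links. The marking is lost when the packet is routed (Layer 3), because the Layer 2 header is stripped and rebuilt at each routed hop.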
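To make the bit layout concrete, here is a minimal sketch of how the 3-bit PCP value sits inside the 16-bit Tag Control Information (TCI) field of an 802.1Q tag, alongside the 1-bit DEI and 12-bit VLAN ID. The function names `parse_tci` and `build_tci` are illustrative, not part of any library.

```python
def build_tci(pcp: int, vid: int, dei: int = 0) -> int:
    """Pack PCP (3 bits), DEI (1 bit) and VLAN ID (12 bits) into a 16-bit TCI."""
    if not 0 <= pcp <= 7:
        raise ValueError("PCP (CoS) must be 0-7")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    if not 0 <= vid <= 0x0FFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    return (pcp << 13) | (dei << 12) | vid


def parse_tci(tci: int) -> dict:
    """Split a 16-bit TCI value back into its three fields."""
    if not 0 <= tci <= 0xFFFF:
        raise ValueError("TCI must fit in 16 bits")
    return {
        "pcp": (tci >> 13) & 0x7,  # top 3 bits: the CoS value
        "dei": (tci >> 12) & 0x1,  # next bit: Drop Eligible Indicator
        "vid": tci & 0x0FFF,       # low 12 bits: VLAN ID
    }


# Voice frame (CoS 5) on VLAN 100, an arbitrary example VLAN.
tci = build_tci(pcp=5, vid=100)
print(hex(tci), parse_tci(tci))
```

Because the PCP lives in the VLAN tag itself, an untagged frame has nowhere to carry it, which is why the marking disappears at a routed hop.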
Layer 3: DSCP (Differentiated Services Code Point)
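Uses the upper 6 bits of the IPv4 ToS byte (the DS field; IPv6 carries it in the Traffic Class byte). More granular than CoS (64 possible values) and, because it lives in the IP header, it survives routing end-to-end unless a device re-marks it.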
| Class | DSCP Value | Per-Hop Behavior (PHB) | Use Case |
|---|---|---|---|
| EF | 46 (101110) | Expedited Forwarding | Voice (Real-time, Low Latency). Strict Priority. |
| AF41 | 34 (100010) | Assured Forwarding | Video Conferencing (Video needs bandwidth, but can handle some drop). |
| CS3 | 24 (011000) | Class Selector | Signaling (SIP/H.323). |
| BE | 0 (000000) | Best Effort | Email, Web, File Transfer. (Typically the bulk of traffic.) |
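The decimal values in the table follow from simple bit arithmetic, which the sketch below works through. An AF class AFxy has DSCP 8x + 2y (class x, drop precedence y), a Class Selector CSn has DSCP 8n, and the DSCP occupies the top 6 bits of the ToS byte. The helper names are illustrative only.

```python
def af_dscp(af_class: int, drop_precedence: int) -> int:
    """DSCP for AFxy: class x in 1-4, drop precedence y in 1-3."""
    if not 1 <= af_class <= 4 or not 1 <= drop_precedence <= 3:
        raise ValueError("AF class must be 1-4 and drop precedence 1-3")
    return 8 * af_class + 2 * drop_precedence


def cs_dscp(n: int) -> int:
    """DSCP for Class Selector CSn (n in 0-7)."""
    if not 0 <= n <= 7:
        raise ValueError("CS value must be 0-7")
    return 8 * n


def tos_from_dscp(dscp: int, ecn: int = 0) -> int:
    """Build the full ToS byte: DSCP in the top 6 bits, ECN in the low 2."""
    return ((dscp & 0x3F) << 2) | (ecn & 0x3)


def dscp_from_tos(tos: int) -> int:
    """Extract the DSCP from a ToS byte."""
    return (tos >> 2) & 0x3F


EF = 46
for name, dscp in [("EF", EF), ("AF41", af_dscp(4, 1)), ("CS3", cs_dscp(3)), ("BE", 0)]:
    print(f"{name:5} dscp={dscp:2} bits={dscp:06b} tos=0x{tos_from_dscp(dscp):02X}")
```

This also explains why packet captures often show voice as ToS 0xB8 rather than 46: the capture is displaying the whole byte, not just the 6-bit DSCP.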
2. Queuing: Managing the Line
When an interface is congested (Output Buffer Full), packets must wait. The queuing algorithm decides which packet goes next.
The Evolution of Queuing
- FIFO (First-In, First-Out): Default. Simple. Bad for voice (stuck behind a large FTP download).
- PQ (Priority Queuing): 4 Queues (High, Med, Normal, Low). High is ALWAYS serviced first. Risk: starvation of the lower queues. As long as the High queue has packets waiting, the lower queues never get sent.
- WFQ (Weighted Fair Queuing): Dynamically separates flows. High-bandwidth flows are penalized; low-bandwidth flows (like Telnet) get priority.
- CBWFQ (Class-Based WFQ): User-defined classes. "Give Voice 20%, Video 30%, Default 50%". Guarantees bandwidth but doesn't guarantee low latency.
- LLQ (Low Latency Queuing): The Gold Standard. CBWFQ + a Strict Priority Queue (PQ). Voice goes first, always. The priority queue is policed (rate-limited) so that it cannot starve the other queues.
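The LLQ idea can be sketched as a toy scheduler. This is a simplified model under stated assumptions, not how any vendor implements it: time is divided into rounds, `priority_budget` stands in for the policer on the priority queue, and a per-class packet count per round stands in for CBWFQ's bandwidth shares. A real policer would drop excess priority traffic rather than hold it for the next round.

```python
from collections import deque


def llq_schedule(priority: deque, classes: dict, weights: dict,
                 priority_budget: int, rounds: int) -> list:
    """Return the order in which packets leave the interface.

    Each round serves the priority queue first (up to priority_budget
    packets), then each class queue up to its weight in packets.
    """
    sent = []
    for _ in range(rounds):
        # Strict priority: voice is always served before anything else,
        # but only up to its budget, so other classes still get a turn.
        for _ in range(priority_budget):
            if not priority:
                break
            sent.append(priority.popleft())
        # Weighted round robin over the remaining classes.
        for name, queue in classes.items():
            for _ in range(weights[name]):
                if not queue:
                    break
                sent.append(queue.popleft())
    return sent


voice = deque(["v1", "v2", "v3"])
classes = {
    "video": deque(["vid1", "vid2", "vid3"]),
    "default": deque(["d1", "d2", "d3"]),
}
order = llq_schedule(voice, classes, {"video": 2, "default": 1},
                     priority_budget=1, rounds=2)
print(order)  # ['v1', 'vid1', 'vid2', 'd1', 'v2', 'vid3', 'd2']
```

Setting `priority_budget` very high reproduces plain PQ behavior: a busy voice queue would then consume every round and the other classes would starve.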
3. Policing vs Shaping
Both limit bandwidth, but they handle excess traffic differently.
Policing (Hard Limit)
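Drops excess traffic immediately (or re-marks it to a lower class). There is no buffering, so no delay is added. Dropped TCP segments trigger retransmissions and the characteristic "sawtooth" throughput pattern. Best for ingress (e.g. limiting a customer's rate).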
Shaping (Soft Limit)
Buffers excess traffic and sends it later. Smoother flow, but the buffering adds delay and can add jitter. Best for egress toward a slower link (e.g. a 1 Gbps interface feeding a 100 Mbps WAN circuit).
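The difference is easiest to see when the same burst goes through both mechanisms. The sketch below uses a token bucket, a common way to describe both; the rate, burst size, and function names are illustrative assumptions. The shaper here has an unbounded buffer, whereas a real shaper's queue is finite, and each packet is assumed to be no larger than the burst size.

```python
def police(arrivals, rate, burst):
    """Token-bucket policer. arrivals: list of (time_s, size_bytes).

    Returns one verdict per packet: "sent" or "dropped".
    """
    tokens, last = burst, 0.0
    verdicts = []
    for t, size in arrivals:
        tokens = min(burst, tokens + (t - last) * rate)  # refill since last packet
        last = t
        if size <= tokens:
            tokens -= size
            verdicts.append("sent")
        else:
            verdicts.append("dropped")  # excess is discarded, never delayed
    return verdicts


def shape(arrivals, rate, burst):
    """Token-bucket shaper. Returns the departure time of each packet."""
    tokens, last = burst, 0.0
    departures = []
    for t, size in arrivals:
        now = max(t, last)  # cannot leave before the previous packet did
        tokens = min(burst, tokens + (now - last) * rate)
        if size > tokens:
            now += (size - tokens) / rate  # wait until enough tokens accrue
            tokens = size
        tokens -= size
        last = now
        departures.append(now)
    return departures


# Three 1000-byte packets arrive at once; rate 1000 bytes/s, burst 1000 bytes.
burst_of_three = [(0.0, 1000), (0.0, 1000), (0.0, 1000)]
print(police(burst_of_three, rate=1000, burst=1000))  # ['sent', 'dropped', 'dropped']
print(shape(burst_of_three, rate=1000, burst=1000))   # [0.0, 1.0, 2.0]
```

The policer delivers one packet with no delay and loses two; the shaper delivers all three but the last one waits two seconds.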
4. Congestion Avoidance (WRED)
Tail Drop: When the queue is full, every newly arriving packet is dropped. This causes TCP Global Synchronization: many TCP streams lose packets at the same moment, back off at once, then ramp up at once, so link utilization oscillates.
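WRED (Weighted Random Early Detection): Randomly drops a few packets before the queue is full, with the drop probability rising as the average queue depth grows. "Weighted" means traffic with a less important marking (e.g. a higher AF drop precedence) starts being dropped earlier. This signals TCP senders to slow down gracefully, keeping link utilization high and stable.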
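A minimal sketch of the drop decision, assuming the classic RED formulation: no drops below a minimum threshold, a linear ramp up to a maximum probability, and tail drop above the maximum threshold. The thresholds, the per-marking profiles, and the function names are illustrative values, not defaults from any platform.

```python
def update_average(avg: float, current_depth: int, weight: float = 0.002) -> float:
    """Exponentially weighted moving average of the queue depth."""
    return (1 - weight) * avg + weight * current_depth


def wred_drop_probability(avg_queue: float, min_th: float,
                          max_th: float, max_p: float) -> float:
    """Probability of dropping an arriving packet at this average depth."""
    if avg_queue < min_th:
        return 0.0  # queue is short: never drop
    if avg_queue >= max_th:
        return 1.0  # queue is too long: behaves like tail drop
    # Linear ramp from 0 at min_th up to max_p at max_th.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Hypothetical per-marking profiles: (min_th, max_th, max_p) in packets.
# AF43 has a lower minimum threshold, so it is dropped earlier than AF41.
PROFILES = {"AF41": (30, 40, 0.1), "AF43": (20, 40, 0.1)}

for marking, (min_th, max_th, max_p) in PROFILES.items():
    p = wred_drop_probability(32, min_th, max_th, max_p)
    print(f"{marking}: drop probability at avg depth 32 = {p:.3f}")
```

Using the averaged depth rather than the instantaneous one keeps short bursts from triggering drops, so only sustained congestion pushes senders to back off.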
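Modern WRED can use ECN (Explicit Congestion Notification) instead of dropping. If the packet is marked ECN-capable, the router sets the Congestion Experienced codepoint in the IP header (the last 2 bits of the ToS byte) instead of discarding it. The receiver echoes this back to the sender (TCP ECE flag), and the sender slows down. Congestion is signaled without losing the packet, provided both endpoints negotiated ECN; packets that are not ECN-capable are still dropped, as is any packet arriving once the queue exceeds its maximum threshold.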
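The mark-or-drop choice can be sketched as follows, using the four ECN codepoints from RFC 3168. The function `mark_or_drop` is a hypothetical helper showing what happens to a packet that WRED has already selected; it returns the rewritten ToS byte, or `None` when the packet has to be dropped.

```python
# ECN codepoints: the low 2 bits of the ToS / Traffic Class byte (RFC 3168).
ECN_NAMES = {
    0b00: "Not-ECT",  # sender is not ECN-capable
    0b01: "ECT(1)",   # ECN-capable transport
    0b10: "ECT(0)",   # ECN-capable transport
    0b11: "CE",       # Congestion Experienced, set by a router
}


def ecn_of(tos: int) -> int:
    """Extract the 2-bit ECN field from a ToS byte."""
    return tos & 0x3


def mark_or_drop(tos: int):
    """Return the new ToS byte with CE set, or None if the packet must be dropped."""
    if ecn_of(tos) == 0b00:
        return None        # not ECN-capable: dropping is the only signal
    return tos | 0b11      # set CE; the DSCP bits are left untouched


ef_not_ect = 0xB8  # DSCP 46 (EF), ECN 00
ef_ect0 = 0xBA     # DSCP 46 (EF), ECN 10
print(mark_or_drop(ef_not_ect))  # None
print(hex(mark_or_drop(ef_ect0)), ECN_NAMES[ecn_of(mark_or_drop(ef_ect0))])  # 0xbb CE
```

Only the low two bits change, so the packet keeps its DSCP marking and its place in the queuing policy.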
References
- RFC 2474: Definition of the Differentiated Services Field (DS Field) - The standard for DSCP.
- RFC 2597: Assured Forwarding PHB Group - Defines AF classes (AF11, AF21, etc.).
- RFC 3246: An Expedited Forwarding PHB - Defines EF (Voice) behavior.