HOWTO for multiqueue network device support
===========================================

Intro: Kernel support for multiqueue devices
--------------------------------------------

Kernel support for multiqueue devices is always present.

Section 1: Base driver requirements for implementing multiqueue support
-----------------------------------------------------------------------

Base drivers must use the alloc_etherdev_mq() or alloc_netdev_mq() functions
to allocate the subqueues for the device.  The underlying kernel API takes
care of allocating and freeing the subqueue memory, and of configuring the
netdev with the location of the queues in memory.

The base driver also needs to manage the individual queues, as it manages the
global netdev->queue_lock today.  Base drivers should therefore use the
netif_{start|stop|wake}_subqueue() functions to manage each queue while the
device is operational.  netdev->queue_lock is still used when the device
comes online or when it is completely shut down (unregister_netdev(), etc.).

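As a rough illustration of per-queue management, the following user-space
Python model (not kernel code) mimics a driver that stops a single subqueue
when its descriptor ring fills and wakes it again once descriptors are
reclaimed.  The class, its methods, and the ring size are hypothetical; only
the stop/wake idea corresponds to netif_stop_subqueue() and
netif_wake_subqueue().

```python
class ModelDevice:
    """User-space model of a multiqueue device; not kernel code."""

    def __init__(self, num_queues, ring_size):
        self.ring_size = ring_size
        self.rings = [[] for _ in range(num_queues)]
        # One stopped flag per subqueue; other queues are unaffected.
        self.stopped = [False] * num_queues

    def xmit(self, queue, pkt):
        """Place pkt on one ring; stop that subqueue when the ring fills."""
        if self.stopped[queue]:
            return False  # caller must hold the packet and retry later
        self.rings[queue].append(pkt)
        if len(self.rings[queue]) == self.ring_size:
            self.stopped[queue] = True   # models netif_stop_subqueue()
        return True

    def tx_complete(self, queue, count):
        """Reclaim up to count descriptors, then wake the subqueue."""
        del self.rings[queue][:count]
        if self.stopped[queue] and len(self.rings[queue]) < self.ring_size:
            self.stopped[queue] = False  # models netif_wake_subqueue()


dev = ModelDevice(num_queues=2, ring_size=2)
accepted = [dev.xmit(0, "a"), dev.xmit(0, "b"),
            dev.xmit(0, "c"), dev.xmit(1, "d")]
state_when_full = list(dev.stopped)
dev.tx_complete(0, 1)
state_after_reclaim = list(dev.stopped)
```

In this model, filling queue 0 rejects further packets for queue 0 only;
queue 1 keeps accepting traffic, which is the point of managing each subqueue
separately.
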
Section 2: Qdisc support for multiqueue devices
-----------------------------------------------

Currently two qdiscs are optimized for multiqueue devices.  The first is the
default pfifo_fast qdisc.  This qdisc supports one qdisc per hardware queue.
A round-robin qdisc, sch_multiq, also supports multiple hardware queues.  The
qdisc is responsible for classifying the skbs and then directing the skbs to
bands and queues based on the value in skb->queue_mapping.  Use this field in
the base driver to determine which queue to send the skb to.

sch_multiq has been added for hardware that wishes to avoid head-of-line
blocking.  It will cycle through the bands and verify that the hardware queue
associated with the band is not stopped prior to dequeuing a packet.

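The dequeue behavior described above can be sketched as a small user-space
Python model.  This is an illustration only, not the sch_multiq source: the
class and attribute names are hypothetical, and the exact order in which the
kernel advances its band pointer may differ.

```python
from collections import deque


class ModelMultiq:
    """User-space model of round-robin band dequeue; not the kernel qdisc."""

    def __init__(self, num_queues):
        # One band per hardware queue.
        self.bands = [deque() for _ in range(num_queues)]
        self.stopped = [False] * num_queues
        self.curband = 0

    def enqueue(self, queue_mapping, pkt):
        # The band index is taken directly from the packet's queue mapping.
        self.bands[queue_mapping].append(pkt)

    def dequeue(self):
        """Return (band, pkt) from the next eligible band, or None."""
        n = len(self.bands)
        for _ in range(n):
            band = self.curband
            self.curband = (self.curband + 1) % n
            # Skip bands whose hardware queue is stopped or that are empty.
            if not self.stopped[band] and self.bands[band]:
                return band, self.bands[band].popleft()
        return None


q = ModelMultiq(3)
for band, pkt in [(0, "a0"), (0, "a1"), (1, "b0"), (2, "c0")]:
    q.enqueue(band, pkt)
q.stopped[1] = True  # pretend the driver stopped hardware queue 1

order = []
item = q.dequeue()
while item is not None:
    order.append(item)
    item = q.dequeue()
```

With queue 1 stopped, the model keeps draining bands 0 and 2 and leaves the
packet for band 1 queued, so a stalled hardware queue does not block traffic
destined for the others.
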
On qdisc load, the number of bands is based on the number of queues on the
hardware.  Once the association is made, any skb with skb->queue_mapping set
will be queued to the band associated with the hardware queue.

Section 3: Brief howto using MULTIQ for multiqueue devices
----------------------------------------------------------

The userspace command 'tc', part of the iproute2 package, is used to configure
qdiscs.  To add the MULTIQ qdisc to your network device, assuming the device
is called eth0, run the following command:

# tc qdisc add dev eth0 root handle 1: multiq

The qdisc allocates one band for each queue that the device reports and then
comes online.  Assuming eth0 has 4 Tx queues, the band mapping would look
like:

band 0 => queue 0
band 1 => queue 1
band 2 => queue 2
band 3 => queue 3

Traffic will begin flowing through each queue, with the queue selected either
by the simple_tx_hash function or by netdev->select_queue() if the driver
defines it.

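To give a feel for hash-based queue selection, here is a minimal user-space
Python sketch.  It is not the kernel's simple_tx_hash: the function name, the
choice of flow fields, and the use of CRC32 as the hash are all assumptions
made for illustration.  It only shows the general idea that packets of one
flow consistently map to one Tx queue.

```python
import zlib


def model_tx_hash(src_ip, dst_ip, src_port, dst_port, num_queues):
    """Map a flow to a Tx queue index in the range [0, num_queues)."""
    key = "{}|{}|{}|{}".format(src_ip, dst_ip, src_port, dst_port).encode()
    h = zlib.crc32(key) & 0xFFFFFFFF  # stand-in 32-bit hash, not the kernel's
    # Scale the 32-bit hash into the queue range without a modulo.
    return (h * num_queues) >> 32


flow = ("192.168.0.2", "192.168.0.3", 40000, 80)
first = model_tx_hash(*flow, num_queues=4)
second = model_tx_hash(*flow, num_queues=4)
```

Both calls return the same index because the result depends only on the flow
fields and the queue count, so a given flow stays on one queue and its packets
are not reordered across queues.
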
The behavior of tc filters remains the same.  However, a new tc action,
skbedit, has been added.  To route all traffic to a specific host, for
example 192.168.0.3, through a specific queue, you could use this action and
establish a filter such as:

tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
	match ip dst 192.168.0.3 \
	action skbedit queue_mapping 3

Author: Alexander Duyck <alexander.h.duyck@intel.com>
Original Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>