Prerequisites:
CentOS 6.5
OVS 2.5.0
DPDK 16.04
For the build procedure, see reference 1.
If OVS fails to compile, apply the following patch:
https://github.com/tfherbert/ovs-snap/blob/dpdk-stable/openvswitch-2.5.90-dpdk-ethdev-speed.patch
Amazon
- ena(Elastic Network Adapter)
Chelsio
- cxgbe(Terminator 5)
Cesnet
- szedata2(COMBO-80G,COMBO-100G)
Cisco
- enic(UCS Virtual Interface Card)
Emulex
- oce(OneConnect OCe14000 family)
Intel
- e1000(82540,82545,82546)
- e1000e(82571..82574,82583,ICH8..ICH10,PCH..PCH2,I217,I218)
- igb(82575..82576,82580,I210,I211,I350,I354,DH89xx)
- ixgbe(82598..82599,X540,X550)
- i40e(X710,XL710,X722)
- fm10k(FM10420)
Note: The drivers e1000 and e1000e are also called em. The drivers em and igb are sometimes grouped in the e1000 family.
Mellanox
Netronome
- nfp(NFP-6xxx)
QLogic
- bnx2x(QLogic 578xx)
Paravirtualization
- virtio-net(QEMU)
- xenvirt(Xen)
- vmxnet3 usermap or vmxnet3 + uio(VMware ESXi)
- memnic
Others
Build procedure
https://github.com/openvswitch/ovs/blob/master/INSTALL.DPDK.md
1. Overview
Open vSwitch can use the DPDK library to operate entirely in userspace. This file provides information on installing and using Open vSwitch with the DPDK datapath. This version of Open vSwitch should be built manually with configure and make.
The DPDK support of Open vSwitch is considered 'experimental'.
Prerequisites
- Required: DPDK 16.04, libnuma
- Hardware: DPDK Supported NICs when physical ports are in use
2. Building and Installation
2.1 Configure & build the Linux kernel
On Linux distros running kernel version >= 3.0, a kernel rebuild is not required and only the grub cmdline needs to be updated to enable IOMMU [VFIO support - 3.2]. For older kernels, check whether the kernel is built with UIO, HUGETLBFS, PROC_PAGE_MONITOR, HPET and HPET_MMAP support.
Detailed system requirements can be found at DPDK requirements; also refer to the advanced install guide INSTALL.DPDK-ADVANCED.md
2.2 Install DPDK
-
Download DPDK and extract the file, for example into /usr/src, and set DPDK_DIR
    cd /usr/src/
    wget http://dpdk.org/browse/dpdk/snapshot/dpdk-16.04.zip
    unzip dpdk-16.04.zip
    export DPDK_DIR=/usr/src/dpdk-16.04
    cd $DPDK_DIR
-
Configure and Install DPDK
Build and install the DPDK library.
    export DPDK_TARGET=x86_64-native-linuxapp-gcc
    export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
    make install T=$DPDK_TARGET DESTDIR=install
Note: For IVSHMEM, set
    export DPDK_TARGET=x86_64-ivshmem-linuxapp-gcc
2.3 Install OVS
OVS can be installed using different methods. For OVS to use the DPDK datapath, it has to be configured with DPDK support, which is done with './configure --with-dpdk'. This section focuses on a generic recipe that suits most cases; for distribution-specific instructions, refer to INSTALL.Fedora.md, INSTALL.RHEL.md and INSTALL.Debian.md.
The OVS sources can be downloaded in different ways; skip this section if you already have the correct sources. Otherwise, download the correct version using one of the methods suggested below and follow the documentation of that specific version.
-
OVS stable releases can be downloaded in compressed format from Download OVS
    cd /usr/src
    wget http://openvswitch.org/releases/openvswitch-<version>.tar.gz
    tar -zxvf openvswitch-<version>.tar.gz
    export OVS_DIR=/usr/src/openvswitch-<version>
-
The current OVS development tree can be cloned using the 'git' tool
    cd /usr/src/
    git clone https://github.com/openvswitch/ovs.git
    export OVS_DIR=/usr/src/ovs
-
Install OVS dependencies
Mandatory: GNU make, GCC 4.x (or) Clang 3.4, libnuma
Optional: libssl, libcap-ng, Python 2.7
More information can be found at Build Requirements
-
Configure, Install OVS
    cd $OVS_DIR
    ./boot.sh
    ./configure --with-dpdk=$DPDK_BUILD
    make install
Note: Passing DPDK_BUILD can be skipped if the DPDK library is installed in a standard location, i.e. ./configure --with-dpdk should suffice. Additional information can be found in INSTALL.md.
3. Setup OVS with DPDK datapath
3.1 Setup Hugepages
Allocate and mount 2M Huge pages:
-
For persistent allocation of huge pages, write to the hugepages.conf file in /etc/sysctl.d
echo 'vm.nr_hugepages=2048' > /etc/sysctl.d/hugepages.conf
-
For run-time allocation of huge pages
sysctl -w vm.nr_hugepages=N
where N = number of 2M huge pages to allocate
-
To verify hugepage configuration
grep HugePages_ /proc/meminfo
-
Mount hugepages
mount -t hugetlbfs none /dev/hugepages
Note: Mount hugepages if not already mounted by default.
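As a rough aid for choosing N above, the following sketch converts a memory size into a count of 2 MB hugepages and summarizes the hugepage lines of /proc/meminfo. The function names pages_for_mb and hugepage_total_mb are illustrative helpers, not part of OVS or DPDK.

```shell
# Number of 2 MB hugepages needed to back a given amount of memory (in MB).
# Rounds up so that odd sizes are still fully covered.
pages_for_mb() (
    mb=$1
    page_mb=2
    echo $(( (mb + page_mb - 1) / page_mb ))
)

# Total hugepage memory in MB, computed from /proc/meminfo-style text on stdin
# (uses the HugePages_Total and Hugepagesize lines).
hugepage_total_mb() {
    awk '/^HugePages_Total:/ { total = $2 }
         /^Hugepagesize:/    { size_kb = $2 }
         END { print total * size_kb / 1024 }'
}

pages_for_mb 4096    # pages needed for 4 GB
```

On a live system the second helper would be fed with `hugepage_total_mb < /proc/meminfo` after running the sysctl command.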
3.2 Setup DPDK devices using VFIO
- Supported with kernel version >= 3.6
- VFIO needs support from BIOS and kernel.
-
BIOS changes:
Enable VT-d; this can be verified from the output of
dmesg | grep -e DMAR -e IOMMU
-
GRUB bootline:
Add iommu=pt intel_iommu=on to the kernel command line; this can be verified from the output of
cat /proc/cmdline
-
Load modules and bind the NIC to VFIO driver
    modprobe vfio-pci
    sudo /usr/bin/chmod a+x /dev/vfio
    sudo /usr/bin/chmod 0666 /dev/vfio/*
    $DPDK_DIR/tools/dpdk_nic_bind.py --bind=vfio-pci eth1
    $DPDK_DIR/tools/dpdk_nic_bind.py --status
Note: On kernels older than 3.6, UIO drivers have to be used instead; see the "DPDK devices using UIO" step in the DPDK in the VM section.
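The choice between VFIO and UIO described in this section can be sketched as a small preflight check. This is only an illustration under the stated rules (kernel >= 3.6 plus both IOMMU flags on the kernel command line); the helper names are hypothetical and it does not check the BIOS setting.

```shell
# Succeeds if the kernel release string (e.g. "3.10.0-327.el7.x86_64")
# is at least 3.6, the minimum this guide gives for VFIO.
kernel_supports_vfio() (
    ver=${1%%-*}          # drop the distro suffix after the first '-'
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 6 ]; }
)

# Succeeds if a kernel command line contains both IOMMU flags.
has_iommu_flags() {
    case " $1 " in
        *" intel_iommu=on "*) ;;
        *) return 1 ;;
    esac
    case " $1 " in
        *" iommu=pt "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the driver name to pass to dpdk_nic_bind.py --bind=...
pick_bind_driver() {
    if kernel_supports_vfio "$1" && has_iommu_flags "$2"; then
        echo vfio-pci
    else
        echo igb_uio
    fi
}

pick_bind_driver "$(uname -r)" "$(cat /proc/cmdline 2>/dev/null)"
```

With the stock CentOS 6.5 kernel (2.6.32) from the prerequisites, this sketch would select igb_uio.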
3.3 Setup OVS
-
DB creation (One time step)
    mkdir -p /usr/local/etc/openvswitch
    mkdir -p /usr/local/var/run/openvswitch
    rm /usr/local/etc/openvswitch/conf.db
    ovsdb-tool create /usr/local/etc/openvswitch/conf.db \
        /usr/local/share/openvswitch/vswitch.ovsschema
-
Start ovsdb-server
No SSL support
    ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock \
        --remote=db:Open_vSwitch,Open_vSwitch,manager_options \
        --pidfile --detach
SSL support
Initialize DB (One time step)
ovs-vsctl --no-wait init
-
Start vswitchd
DPDK configuration arguments can be passed to vswitchd via the Open_vSwitch 'other_config' column. The important configuration options are listed below. Defaults will be provided for all values not explicitly set. Refer to ovs-vswitchd.conf.db(5) for additional information on configuration options.
-
dpdk-init Specifies whether OVS should initialize and support DPDK ports. This is a boolean, and defaults to false.
-
dpdk-lcore-mask Specifies the CPU cores on which dpdk lcore threads should be spawned; expects a hex string (e.g. '0x123').
-
dpdk-socket-mem Comma separated list of memory to pre-allocate from hugepages on specific sockets.
-
dpdk-hugepage-dir Directory where hugetlbfs is mounted
-
vhost-sock-dir Option to set the path to the vhost_user unix socket files.
NOTE: Changing any of these options requires restarting the ovs-vswitchd application.
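To illustrate the comma-separated dpdk-socket-mem format described above, here is a minimal sketch that builds the value from per-socket amounts in MB and reports how many 2 MB hugepages they add up to. Both helper names are made up for this example.

```shell
# Join per-socket amounts (in MB) into the comma-separated form
# used by dpdk-socket-mem, one entry per NUMA socket.
socket_mem_value() (
    out=
    for mb in "$@"; do
        out=${out:+$out,}$mb
    done
    echo "$out"
)

# Number of 2 MB hugepages those amounts add up to (rounded up).
socket_mem_pages() (
    total=0
    for mb in "$@"; do
        total=$(( total + mb ))
    done
    echo $(( (total + 1) / 2 ))
)

socket_mem_value 1024 0   # 1024 MB on socket 0, nothing on socket 1
socket_mem_pages 1024 0
```

The resulting string could then be set with `ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,0"`.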
Open vSwitch can be started as normal. DPDK will be initialized as long as the dpdk-init option has been set to 'true'.
    export DB_SOCK=/usr/local/var/run/openvswitch/db.sock
    ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true
    ovs-vswitchd unix:$DB_SOCK --pidfile --detach
If more than one GB of hugepages has been allocated (as for IVSHMEM), set the amount explicitly and use NUMA node 0 memory. For details on using ivshmem with DPDK, refer to OVS Testcases.
    ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,0"
    ovs-vswitchd unix:$DB_SOCK --pidfile --detach
To better scale the workloads across cores, multiple pmd threads can be created and pinned to CPU cores by explicitly specifying pmd-cpu-mask. E.g., to spawn 2 pmd threads and pin them to cores 1 and 2:
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6
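The mask value 6 is simply the bitmask with bits 1 and 2 set. A small sketch of that arithmetic follows; cores_to_mask is an illustrative helper, not an OVS tool.

```shell
# Build a hex CPU mask from a list of core numbers:
# each core N contributes bit (1 << N).
cores_to_mask() (
    mask=0
    for core in "$@"; do
        mask=$(( mask | (1 << core) ))
    done
    printf '%x\n' "$mask"
)

cores_to_mask 1 2   # cores 1 and 2, as in the example above
```

The same helper reproduces the '0x123' example given for dpdk-lcore-mask when called with cores 0, 1, 5 and 8.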
-
Create bridge & add DPDK devices
Create a bridge with datapath_type "netdev" in the configuration database
    ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
Now you can add DPDK devices. OVS expects DPDK device names to start with "dpdk" and end with a portid. vswitchd should print (in the log file) the number of dpdk devices found.
    ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
    ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
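The naming rule above ("dpdk" followed by a numeric portid) can be checked before calling ovs-vsctl. The following is a sketch of that rule only; is_dpdk_port_name is a hypothetical helper and does not query OVS.

```shell
# Succeeds only for names of the form dpdk<digits>, e.g. dpdk0 or dpdk12.
is_dpdk_port_name() (
    id=${1#dpdk}
    [ "$id" != "$1" ] || exit 1      # "dpdk" prefix missing
    case "$id" in
        ''|*[!0-9]*) exit 1 ;;       # empty or non-numeric portid
    esac
    exit 0
)

for name in dpdk0 dpdk1 eth1 dpdkvhostuser0; do
    if is_dpdk_port_name "$name"; then
        echo "$name: ok for type=dpdk"
    else
        echo "$name: not a dpdk<portid> name"
    fi
done
```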
After the DPDK ports get added to the switch, a polling thread continuously polls DPDK devices and consumes 100% of the core, as can be checked with the 'top' and 'ps' commands.
    top -H
    ps -eLo pid,psr,comm | grep pmd
Note: Creating bonds of DPDK interfaces is slightly different from creating bonds of system interfaces. For DPDK, the interface type must be explicitly set, for example:
ovs-vsctl add-bond br0 dpdkbond dpdk0 dpdk1 -- set Interface dpdk0 type=dpdk -- set Interface dpdk1 type=dpdk
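The 'ps' check for polling threads shown earlier in this step can be condensed into a list of the cores that pmd threads are currently running on. This sketch assumes the thread names begin with "pmd", which is what the grep in that command relies on; pmd_cores is an illustrative name.

```shell
# Read "pid psr comm" lines (the output of: ps -eLo pid,psr,comm)
# and print the distinct processor numbers used by pmd threads.
pmd_cores() {
    awk '$3 ~ /^pmd/ { print $2 }' | sort -nu | paste -s -d ' ' -
}

# Hypothetical sample input; on a live host use:
#   ps -eLo pid,psr,comm | pmd_cores
printf '%s\n' \
    '  PID PSR COMMAND' \
    ' 4242   0 ovs-vswitchd' \
    ' 4242   1 pmd10' \
    ' 4242   2 pmd11' | pmd_cores
```

With pmd-cpu-mask=6 the expectation would be cores 1 and 2.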
-
PMD thread statistics
    # Check current stats
    ovs-appctl dpif-netdev/pmd-stats-show
    # Show port/rxq assignment
    ovs-appctl dpif-netdev/pmd-rxq-show
    # Clear previous stats
    ovs-appctl dpif-netdev/pmd-stats-clear
-
Stop vswitchd & Delete bridge
    # Delete the bridge first, while ovsdb-server is still reachable
    ovs-vsctl del-br br0
    ovs-appctl -t ovs-vswitchd exit
    ovs-appctl -t ovsdb-server exit
4. DPDK in the VM
The DPDK 'testpmd' application can be run in the guest VM for high-speed packet forwarding between vhostuser ports. DPDK and the testpmd application have to be compiled on the guest VM. Below are the steps for setting up the testpmd application in the VM. More information on the vhostuser ports can be found in Vhost Walkthrough.
-
Instantiate the Guest
Qemu version >= 2.2.0
    export VM_NAME=Centos-vm
    export GUEST_MEM=3072M
    export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
    export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch
    qemu-system-x86_64 -name $VM_NAME -cpu host -enable-kvm -m $GUEST_MEM \
        -object memory-backend-file,id=mem,size=$GUEST_MEM,mem-path=/dev/hugepages,share=on \
        -numa node,memdev=mem -mem-prealloc -smp sockets=1,cores=2 \
        -drive file=$QCOW2_IMAGE \
        -chardev socket,id=char0,path=$VHOST_SOCK_DIR/dpdkvhostuser0 \
        -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
        -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off \
        -chardev socket,id=char1,path=$VHOST_SOCK_DIR/dpdkvhostuser1 \
        -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
        -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off \
        --nographic -snapshot
-
Download the DPDK sources to the VM and build DPDK
    cd /root/dpdk/
    wget http://dpdk.org/browse/dpdk/snapshot/dpdk-16.04.zip
    unzip dpdk-16.04.zip
    export DPDK_DIR=/root/dpdk/dpdk-16.04
    export DPDK_TARGET=x86_64-native-linuxapp-gcc
    export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
    cd $DPDK_DIR
    make install T=$DPDK_TARGET DESTDIR=install
-
Build the test-pmd application
    cd app/test-pmd
    export RTE_SDK=$DPDK_DIR
    export RTE_TARGET=$DPDK_TARGET
    make
-
Setup Huge pages and DPDK devices using UIO
    sysctl vm.nr_hugepages=1024
    mkdir -p /dev/hugepages
    mount -t hugetlbfs hugetlbfs /dev/hugepages   # only if not already mounted
    modprobe uio
    insmod $DPDK_BUILD/kmod/igb_uio.ko
    $DPDK_DIR/tools/dpdk_nic_bind.py --status
    $DPDK_DIR/tools/dpdk_nic_bind.py -b igb_uio 00:03.0 00:04.0
The PCI IDs of the vhost ports can be retrieved using the lspci | grep Ethernet command.
5. OVS Testcases
Below are a few testcases and the steps to follow for each.
5.1 PHY-PHY
Steps 1-5 in section 3.3 create and initialize the DB, start vswitchd and add DPDK devices to bridge 'br0'.
-
Add test flows to forward packets between DPDK port 0 and port 1
    # Clear current flows
    ovs-ofctl del-flows br0
    # Add flows between port 1 (dpdk0) and port 2 (dpdk1)
    ovs-ofctl add-flow br0 in_port=1,action=output:2
    ovs-ofctl add-flow br0 in_port=2,action=output:1
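The two add-flow commands above are a symmetric pair, so they can be generated for any two OpenFlow port numbers. The helper below only prints the commands so they can be reviewed first; bidir_flow_cmds is a made-up name, and the port numbers are assumed to match what `ovs-ofctl show br0` reports.

```shell
# Print the pair of ovs-ofctl commands that forward traffic
# between two OpenFlow ports of a bridge in both directions.
bidir_flow_cmds() (
    bridge=$1
    a=$2
    b=$3
    echo "ovs-ofctl add-flow $bridge in_port=$a,action=output:$b"
    echo "ovs-ofctl add-flow $bridge in_port=$b,action=output:$a"
)

bidir_flow_cmds br0 1 2
```

Piping the output to `sh` would apply the flows once the printed commands look right.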
5.2 PHY-VM-PHY [VHOST LOOPBACK]
Add dpdkvhostuser ports to bridge 'br0'. More information on the dpdkvhostuser ports can be found in Vhost Walkthrough.
    ovs-vsctl add-port br0 dpdkvhostuser0 -- set Interface dpdkvhostuser0 type=dpdkvhostuser
    ovs-vsctl add-port br0 dpdkvhostuser1 -- set Interface dpdkvhostuser1 type=dpdkvhostuser
Add test flows to forward packets between DPDK devices and VM ports
    # Clear current flows
    ovs-ofctl del-flows br0
    # Add flows
    ovs-ofctl add-flow br0 in_port=1,action=output:3
    ovs-ofctl add-flow br0 in_port=3,action=output:1
    ovs-ofctl add-flow br0 in_port=4,action=output:2
    ovs-ofctl add-flow br0 in_port=2,action=output:4
    # Dump flows
    ovs-ofctl dump-flows br0
Instantiate Guest VM using Qemu cmdline
Guest Configuration
| configuration        | values  | comments     |
|----------------------|---------|--------------|
| qemu version         | 2.2.0   |              |
| qemu thread affinity | core 5  | taskset 0x20 |
| memory               | 4GB     | -            |
| cores                | 2       | -            |
| Qcow2 image          | CentOS7 | -            |
| mrg_rxbuf            | off     | -            |
Instantiate Guest
    export VM_NAME=vhost-vm
    export GUEST_MEM=3072M
    export QCOW2_IMAGE=/root/CentOS7_x86_64.qcow2
    export VHOST_SOCK_DIR=/usr/local/var/run/openvswitch
    taskset 0x20 qemu-system-x86_64 -name $VM_NAME -cpu host -enable-kvm -m $GUEST_MEM -object memory-backend-file,mrg_rxbuf=off --nographic -snapshot
Guest VM using libvirt
Below is a simple XML configuration of the 'demovm' guest that can be instantiated using 'virsh'. The guest uses a pair of vhostuser ports and boots with 4GB RAM and 2 cores.
<domain type='kvm'>
  <name>demovm</name>
  <uuid>4a9b3f53-fa2a-47f3-a757-dd87720d9d1d</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='2' unit='M' nodeset='0'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>2</vcpu>
  <cputune>
    <shares>4096</shares>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='5'/>
    <emulatorpin cpuset='4,5'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-model'>
    <model fallback='allow'/>
    <topology sockets='2' cores='1' threads='1'/>
    <numa>
      <cell id='0' cpus='0-1' memory='4194304' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/root/CentOS7_x86_64.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='dir' device='disk'>
      <driver name='qemu' type='fat'/>
      <source dir='/usr/src/dpdk-16.04'/>
      <target dev='vdb' bus='virtio'/>
      <readonly/>
    </disk>
    <interface type='vhostuser'>
      <mac address='00:00:00:00:00:01'/>
      <source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostuser0' mode='client'/>
      <model type='virtio'/>
      <driver queues='2'>
        <host mrg_rxbuf='off'/>
      </driver>
    </interface>
    <interface type='vhostuser'>
      <mac address='00:00:00:00:00:02'/>
      <source type='unix' path='/usr/local/var/run/openvswitch/dpdkvhostuser1' mode='client'/>
      <model type='virtio'/>
      <driver queues='2'>
        <host mrg_rxbuf='off'/>
      </driver>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
  </devices>
</domain>
DPDK Packet forwarding in Guest VM
To accomplish this, DPDK and the testpmd application first have to be compiled on the VM; the steps are listed in DPDK in the VM.
-
Run test-pmd application
    cd $DPDK_DIR/app/test-pmd
    ./testpmd -c 0x3 -n 4 --socket-mem 1024 -- --burst=64 -i --txqflags=0xf00 --disable-hw-vlan
    # then, at the interactive testpmd prompt:
    set fwd mac_retry
    start
-
Bind vNIC back to kernel once the test is completed.
    $DPDK_DIR/tools/dpdk_nic_bind.py --bind=virtio-pci 0000:00:03.0
    $DPDK_DIR/tools/dpdk_nic_bind.py --bind=virtio-pci 0000:00:04.0
Note: Pass the PCI IDs appropriate for your system in the above example. The PCI IDs can be retrieved using the '$DPDK_DIR/tools/dpdk_nic_bind.py --status' command.
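This guide writes PCI IDs both in short form (00:03.0, as printed by lspci) and in full form with the domain prefix (0000:00:03.0). A minimal sketch for normalizing to the full form, assuming the device lives in PCI domain 0000; full_pci_addr is an illustrative helper.

```shell
# Normalize a PCI address to domain:bus:device.function form.
# Short addresses are assumed to be in domain 0000.
full_pci_addr() {
    case "$1" in
        ????:??:??.?) echo "$1" ;;
        ??:??.?)      echo "0000:$1" ;;
        *)            echo "unrecognized PCI address: $1" >&2; return 1 ;;
    esac
}

full_pci_addr 00:03.0
full_pci_addr 0000:00:04.0
```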
5.3 PHY-VM-PHY [IVSHMEM]
The steps for setting up IVSHMEM are covered in section 5.2 (PVP - IVSHMEM) of OVS Testcases in the ADVANCED install guide.
6. Limitations
- Only an MTU size of 1500 is supported; MTU configuration for DPDK netdevs is planned for a future OVS release.
- Currently DPDK ports do not use HW offload functionality.
-
Network Interface Firmware requirements: Each release of DPDK is validated against a specific firmware version for a supported Network Interface. New firmware versions introduce bug fixes, performance improvements and new functionality that DPDK leverages. The validated firmware versions are available as part of the release notes for DPDK. It is recommended that users update Network Interface firmware to match what has been validated for the DPDK release.
For DPDK 16.04, the list of validated firmware versions can be found at: