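VLAN Receive and Transmit Processing in the Linux Kernel (by lvyilong316)

VLAN frame format

An 802.1Q VLAN frame carries a 4-byte tag between the source MAC address and the original Length/Type field. The tag fields are:

Type: 2 bytes, value 0x8100, marking the frame as an 802.1Q tagged frame.
PRI: 3 bits, values 0 to 7, the frame priority; a larger value means a higher priority. It mainly serves as the class-of-service (CoS) input for QoS differentiated services.
VLAN Identifier (VID): 12 bits. Configurable VLAN IDs range from 1 to 4094. VLAN 0 and VLAN 4095 are normally reserved, and VLAN 1 is the default VLAN, generally used for network management.

PRI and VID, together with the 1-bit CFI that sits between them, form the 16-bit TCI.

Note that the frame contains two Type fields. The Type inside the 802.1Q tag identifies the frame as a VLAN frame and has the value 0x8100. The Type in the Length/Type field that follows gives the protocol carried inside the Ethernet frame, such as IP or ARP.

Related data structures

1.1 struct vlan_ethhdr
The layer-2 header structure that includes the VLAN header.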
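For reference, the field layout can be sketched in userspace. The block below mirrors the fields of the kernel's struct vlan_ethhdr (declared in include/linux/if_vlan.h) using fixed-width types. The _sketch suffix, the tci_* helpers and the self-check function are assumptions made for this illustration and are not kernel code.

```c
#include <arpa/inet.h>  /* htons, ntohs */
#include <assert.h>
#include <stddef.h>     /* offsetof */
#include <stdint.h>

#define ETH_ALEN    6
#define ETH_P_8021Q 0x8100

/* Userspace mirror of struct vlan_ethhdr: 6 + 6 + 2 + 2 + 2 = 18 bytes. */
struct vlan_ethhdr_sketch {
    uint8_t  h_dest[ETH_ALEN];
    uint8_t  h_source[ETH_ALEN];
    uint16_t h_vlan_proto;               /* 0x8100, network byte order */
    uint16_t h_vlan_TCI;                 /* PRI(3) | CFI(1) | VID(12)  */
    uint16_t h_vlan_encapsulated_proto;  /* inner type, e.g. 0x0800    */
};

/* Hypothetical helpers: split a host-order TCI into its three fields. */
static unsigned tci_pri(uint16_t tci) { return (tci >> 13) & 0x7; }
static unsigned tci_cfi(uint16_t tci) { return (tci >> 12) & 0x1; }
static unsigned tci_vid(uint16_t tci) { return tci & 0x0fff; }

/* Build the TCI for priority 5 on VLAN 100, then read the fields back. */
static void vlan_ethhdr_self_check(void)
{
    struct vlan_ethhdr_sketch h = {0};
    uint16_t tci;

    h.h_vlan_proto = htons(ETH_P_8021Q);
    h.h_vlan_TCI = htons((uint16_t)((5u << 13) | 100u));

    tci = ntohs(h.h_vlan_TCI);
    assert(sizeof(h) == 18);
    assert(offsetof(struct vlan_ethhdr_sketch, h_vlan_proto) == 12);
    assert(tci_pri(tci) == 5 && tci_cfi(tci) == 0 && tci_vid(tci) == 100);
}
```

The offset of h_vlan_proto is 12, which is why a driver that reads the protocol from bytes 13 and 14 of the frame sees 0x8100 for a tagged frame.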
1.2 struct vlan_hdr
The structure that describes the VLAN header itself.
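A small userspace sketch may help place this structure in a frame. It mirrors the two fields of the kernel's struct vlan_hdr and reads them from the bytes that follow the first 14 bytes (destination MAC, source MAC, 0x8100) of a tagged frame. The _sketch suffix, read_vlan_hdr and the self-check are hypothetical names introduced for this example only.

```c
#include <arpa/inet.h>  /* ntohs */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_HLEN      14
#define VLAN_VID_MASK 0x0fff

/* Userspace mirror of struct vlan_hdr: 4 bytes, both in network byte order. */
struct vlan_hdr_sketch {
    uint16_t h_vlan_TCI;
    uint16_t h_vlan_encapsulated_proto;
};

/* Hypothetical helper: copy out the VLAN header found right after the
 * 14-byte Ethernet header of a tagged frame. */
static struct vlan_hdr_sketch read_vlan_hdr(const uint8_t *frame)
{
    struct vlan_hdr_sketch vh;

    memcpy(&vh, frame + ETH_HLEN, sizeof(vh));  /* avoids unaligned access */
    return vh;
}

static void vlan_hdr_self_check(void)
{
    uint8_t frame[18] = {0};
    struct vlan_hdr_sketch vh;

    frame[12] = 0x81; frame[13] = 0x00;   /* outer type: 802.1Q    */
    frame[14] = 0x00; frame[15] = 0x64;   /* TCI: PRI 0, VID 100   */
    frame[16] = 0x08; frame[17] = 0x00;   /* inner type: IPv4      */

    vh = read_vlan_hdr(frame);
    assert(sizeof(vh) == 4);
    assert((ntohs(vh.h_vlan_TCI) & VLAN_VID_MASK) == 100);
    assert(ntohs(vh.h_vlan_encapsulated_proto) == 0x0800);
}
```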
NICs without VLAN support

A NIC that does not support VLAN sees nothing special in a Type value of 0x8100, so its driver passes the frame up as an ordinary MAC frame. For a normal (untagged) MAC frame, skb->protocol is set from bytes 13 and 14 of the frame, that is, the Type in Length/Type. For a VLAN-tagged frame it is likewise set from bytes 13 and 14, but those bytes are now the Type of the 802.1Q tag (the frame format above shows why).

So when a NIC without VLAN support receives a tagged frame, skb->protocol equals 0x8100. With that background, consider the processing below.

Every packet that leaves the NIC driver enters netif_receive_skb. The function is shown here with the logic unrelated to VLAN reception removed.

    int netif_receive_skb(struct sk_buff *skb)
    {
        struct packet_type *ptype, *pt_prev;

        /* Key point, but skb->vlan_tci is only set when the NIC supports VLAN. */
        if (skb->vlan_tci && vlan_hwaccel_do_receive(skb))
            return NET_RX_SUCCESS;
        /* ... */

        /* Walk the ptype_all list; its entries have packet_type.type == ETH_P_ALL. */
        list_for_each_entry_rcu(ptype, &ptype_all, list) {
            if (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
                ptype->dev == orig_dev) {
                /* orig_dev is the physical device here, e.g. eth0.
                 * deliver_skb() ends up calling packet_type.func(). */
                if (pt_prev)
                    ret = deliver_skb(skb, pt_prev, orig_dev);
                pt_prev = ptype;
            }
        }

        /* Bridge logic (note that it runs before VLAN processing). */
        skb = handle_bridge(skb, &pt_prev, &ret, orig_dev);
        /* Unrelated to 802.1Q: this is the macvlan feature, which only runs
         * when the kernel is built with the MAC-VLAN module. */
        skb = handle_macvlan(skb, &pt_prev, &ret, orig_dev);

        /* type is now the VLAN protocol, 0x8100. */
        type = skb->protocol;
        /* Run packet_type->func() for the matching entries on
         * ptype_base[ntohs(type) & 15]. */
        list_for_each_entry_rcu(ptype,
                &ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
            if (ptype->type == type &&
                (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
                 ptype->dev == orig_dev)) {
                /* deliver_skb() ends up calling packet_type.func(). Because
                 * type is 802.1Q, the 802.1Q handler is the one called. */
                if (pt_prev)
                    ret = deliver_skb(skb, pt_prev, orig_dev);
                pt_prev = ptype;
            }
        }
        /* ... */
    }

When the 8021q module is loaded it registers the corresponding packet_type and sets its handler func.

    static struct packet_type vlan_packet_type __read_mostly = {
        .type = cpu_to_be16(ETH_P_8021Q),
        .func = vlan_skb_recv, /* VLAN receive method */
    };

So vlan_skb_recv (net/8021q/vlan_dev.c) is called next.

- vlan_skb_recv

    int vlan_skb_recv(struct sk_buff *skb, struct net_device *dev,
                      struct packet_type *ptype, struct net_device *orig_dev)
    {
        struct vlan_hdr *vhdr;
        struct net_device_stats *stats;
        u16 vlan_id;
        u16 vlan_tci;

        /* skb_share_check() relies on three functions: skb_shared(),
         * skb_clone() and kfree_skb(). skb_shared() checks whether
         * skb->users is 1; if not, several protocol modules hold the skb
         * and skb_clone() is used to make a copy. kfree_skb() does not
         * always free the skb: it frees it only when skb->users is 1 and
         * otherwise just decrements skb->users. */
        skb = skb_share_check(skb, GFP_ATOMIC);
        if (skb == NULL)
            goto err_free;

        /* VLAN_HLEN is 4. */
        if (unlikely(!pskb_may_pull(skb, VLAN_HLEN)))
            goto err_free;

        /* Extract the VLAN ID from the skb. */
        vhdr = (struct vlan_hdr *)skb->data;
        vlan_tci = ntohs(vhdr->h_vlan_TCI);
        vlan_id = vlan_tci & VLAN_VID_MASK;

        rcu_read_lock();
        /* The core step. skb->dev is the real device at this point. After
         * VLAN processing the upper layers should see the packet as
         * received by the virtual VLAN device, so skb->dev is switched to
         * it. How the device is found is analysed later. */
        skb->dev = __find_vlan_dev(dev, vlan_id);
        if (!skb->dev)
            goto err_unlock;    /* no VLAN device with this ID on dev */

        /* Update the device statistics. */
        stats = &skb->dev->stats;
        stats->rx_packets++;
        stats->rx_bytes += skb->len;

        /* Pull the VLAN header and fix up the checksum; data now points
         * at the real payload header, e.g. IP or ARP. */
        skb_pull_rcsum(skb, VLAN_HLEN);
        skb->priority = vlan_get_ingress_priority(skb->dev, vlan_tci);

        vlan_set_encap_proto(skb, vhdr);        /* reset skb->protocol */
        skb = vlan_check_reorder_header(skb);   /* remove the tag from the frame */
        netif_rx(skb);                          /* hand back to the stack */
        rcu_read_unlock();
        return NET_RX_SUCCESS;

    err_unlock:
        rcu_read_unlock();
    err_free:
        kfree_skb(skb);
        return NET_RX_DROP;
    }

- vlan_set_encap_proto

    static inline void vlan_set_encap_proto(struct sk_buff *skb,
                                            struct vlan_hdr *vhdr)
    {
        __be16 proto;
        unsigned char *rawp;

        /* Per the VLAN frame format, h_vlan_encapsulated_proto is the real
         * Ethernet type of the frame, e.g. IP or ARP. */
        proto = vhdr->h_vlan_encapsulated_proto;
        if (ntohs(proto) >= 1536) {
            skb->protocol = proto;
            return;
        }

        /* Values below 1536 are a length, not a type. */
        rawp = skb->data;
        if (*(unsigned short *)rawp == 0xFFFF)
            skb->protocol = htons(ETH_P_802_3);
        else
            skb->protocol = htons(ETH_P_802_2);
    }

- vlan_check_reorder_header

    static inline struct sk_buff *vlan_check_reorder_header(struct sk_buff *skb)
    {
        if (vlan_dev_info(skb->dev)->flags & VLAN_FLAG_REORDER_HDR) {
            if (skb_cow(skb, skb_headroom(skb)) < 0)
                skb = NULL;
            if (skb) {
                /* Key point: ETH_HLEN = 14, VLAN_ETH_HLEN = 18. */
                memmove(skb->data - ETH_HLEN,
                        skb->data - VLAN_ETH_HLEN, 12);
                skb->mac_header += VLAN_HLEN;   /* VLAN_HLEN = 4 */
            }
        }
        return skb;
    }

Before memmove(skb->data - ETH_HLEN, skb->data - VLAN_ETH_HLEN, 12) runs, the 18 bytes in front of skb->data are laid out as:

    dst MAC (6) | src MAC (6) | 0x8100 (2) | TCI (2) | inner type (2) | skb->data

Afterwards the two MAC addresses have moved 4 bytes toward skb->data:

    4 stale bytes | dst MAC (6) | src MAC (6) | inner type (2) | skb->data

The copy overwrites the VLAN tag and so removes it from the frame. skb->mac_header += VLAN_HLEN then moves the MAC header pointer to the new start.
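The effect of that memmove can be reproduced on a plain byte buffer. The following is a minimal sketch, assuming a frame held in an ordinary array rather than an sk_buff; strip_vlan_tag and the self-check are hypothetical names for this example and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_HLEN      14
#define VLAN_HLEN     4
#define VLAN_ETH_HLEN 18

/* Hypothetical helper mirroring the memmove in vlan_check_reorder_header.
 * `data` points just past the 18-byte tagged header, as skb->data does
 * after skb_pull_rcsum(skb, VLAN_HLEN). Returns the new MAC header start. */
static uint8_t *strip_vlan_tag(uint8_t *data)
{
    /* Slide both MAC addresses (12 bytes) 4 bytes toward the payload,
     * overwriting the 0x8100 type and the TCI. */
    memmove(data - ETH_HLEN, data - VLAN_ETH_HLEN, 12);
    return data - ETH_HLEN;
}

static void strip_self_check(void)
{
    uint8_t frame[20] = {
        1, 1, 1, 1, 1, 1,          /* dst MAC                        */
        2, 2, 2, 2, 2, 2,          /* src MAC                        */
        0x81, 0x00, 0x00, 0x64,    /* 802.1Q tag: type, TCI (VID 100) */
        0x08, 0x00,                /* inner type: IPv4               */
        0xAA, 0xBB                 /* payload                        */
    };
    static const uint8_t expect[ETH_HLEN] = {
        1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 0x08, 0x00
    };
    uint8_t *data = frame + VLAN_ETH_HLEN;
    uint8_t *mac = strip_vlan_tag(data);

    assert(mac == frame + VLAN_HLEN);            /* header moved by 4 bytes */
    assert(memcmp(mac, expect, ETH_HLEN) == 0);  /* plain 14-byte header    */
    assert(data[0] == 0xAA);                     /* payload untouched       */
}
```

The inner type field is never copied: it already sits directly in front of the payload, so moving the addresses alone yields a valid untagged header.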
Continuing along the receive path, vlan_skb_recv ends by calling netif_rx(), which leads back into netif_receive_skb. After the earlier analysis of the bridge logic it is no surprise that the packet goes around once and comes back: skb->dev has changed from the physical device (e.g. eth0) to the virtual device (e.g. eth0.100), and the VLAN tag has been erased from the frame. The same skb therefore takes a different path on its second pass through netif_receive_skb.

Note: netif_receive_skb() is entered several times while a packet is received: when the NIC driver receives it, again after bridge processing, and again after VLAN processing. Once the bridge has handled a packet it sets a flag saying so, and the bridge module does not process the packet again on re-entry.

The receive logic without VLAN support can be summarized as:

    driver
      -> netif_receive_skb        (skb->protocol == 0x8100)
      -> vlan_skb_recv            (switch skb->dev to the VLAN device,
                                   reset skb->protocol, strip the tag)
      -> netif_rx
      -> netif_receive_skb        (upper-layer protocol handler)

With reception covered, turn to the VLAN transmit logic. When a packet is sent to a VLAN device, the device's .ndo_start_xmit function is called. Which function does that pointer refer to? It is set up in vlan_dev_init.

- vlan_dev_init (net/8021q/vlan_dev.c)

    static int vlan_dev_init(struct net_device *dev)
    {
        /* ... */
        /* Depending on whether the real device supports
         * NETIF_F_HW_VLAN_TX, point the VLAN device's netdev_ops at a
         * different set of functions. */
        if (real_dev->features & NETIF_F_HW_VLAN_TX) {
            dev->header_ops      = real_dev->header_ops;
            dev->hard_header_len = real_dev->hard_header_len;
            dev->netdev_ops      = &vlan_netdev_accel_ops;
        } else {
            dev->header_ops      = &vlan_header_ops;
            dev->hard_header_len = real_dev->hard_header_len + VLAN_HLEN;
            dev->netdev_ops      = &vlan_netdev_ops;
        }
        /* ... */
    }

    static const struct net_device_ops vlan_netdev_ops = {
        /* ... */
        .ndo_start_xmit = vlan_dev_hard_start_xmit,
        /* ... */
    };

So when the real device does not support VLAN, transmission goes through vlan_dev_hard_start_xmit.

- vlan_dev_hard_start_xmit

    static netdev_tx_t vlan_dev_hard_start_xmit(struct sk_buff *skb,
                                                struct net_device *dev)
    {
        int i = skb_get_queue_mapping(skb);
        struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
        struct vlan_ethhdr *veth = (struct vlan_ethhdr *)(skb->data);
        unsigned int len;
        int ret;

        /* If the MAC protocol type is not VLAN, no tag has been added yet
         * (the same branch is taken when REORDER_HDR is set), so insert
         * the 4-byte VLAN tag here. */
        if (veth->h_vlan_proto != htons(ETH_P_8021Q) ||
            (vlan_dev_info(dev)->flags & VLAN_FLAG_REORDER_HDR)) {
            unsigned int orig_headroom = skb_headroom(skb);
            u16 vlan_tci;

            vlan_dev_info(dev)->cnt_encap_on_xmit++;

            vlan_tci = vlan_dev_info(dev)->vlan_id;  /* VLAN ID of the VLAN device */
            vlan_tci |= vlan_dev_get_egress_qos_mask(dev, skb);
            skb = __vlan_put_tag(skb, vlan_tci);     /* insert the tag into the frame */
            if (!skb) {
                txq->tx_dropped++;
                return NETDEV_TX_OK;
            }

            if (orig_headroom < VLAN_HLEN)
                vlan_dev_info(dev)->cnt_inc_headroom_on_tx++;
        }

        /* Key point: skb->dev is set to the real device. */
        skb->dev = vlan_dev_info(dev)->real_dev;
        len = skb->len;
        ret = dev_queue_xmit(skb);   /* call dev_queue_xmit again */

        if (likely(ret == NET_XMIT_SUCCESS)) {
            txq->tx_packets++;
            txq->tx_bytes += len;
        } else
            txq->tx_dropped++;

        return NETDEV_TX_OK;
    }

dev_queue_xmit eventually calls the .ndo_start_xmit of skb->dev. At first skb->dev points to the virtual VLAN device, so the virtual device's .ndo_start_xmit, vlan_dev_hard_start_xmit, is called. It then sets skb->dev to the real physical device, so the second pass through dev_queue_xmit calls the physical device's .ndo_start_xmit, which sends the packet out.

How VLAN virtual devices are organized

vlan_skb_recv contains this line:

    skb->dev = __find_vlan_dev(dev, vlan_id);

Before looking at how this function finds the VLAN device, consider how VLAN devices are organized. The central structure for storing and linking VLAN virtual NICs is vlan_group_hash:

    static struct hlist_head vlan_group_hash[VLAN_GRP_HASH_SIZE]; /* net/8021q/vlan.c */

After vconfig has created the three virtual NICs eth1.1, eth1.2 and eth1.100, the overall picture is: one bucket of vlan_group_hash holds the vlan_group for eth1, and that group's vlan_devices_arrays holds pointers to eth1.1, eth1.2 and eth1.100.

vlan_group_hash is a hash table with 32 buckets, and its hash function is:

    static inline unsigned int vlan_grp_hashfn(unsigned int idx)
    {
        return ((idx >> VLAN_GRP_HASH_SHIFT) ^ idx) & VLAN_GRP_HASH_MASK;
    }

The argument idx is dev->ifindex, the interface index of the real device (eth1 in this example). In other words, what is inserted into vlan_group_hash is information about the real NIC. An ordinary host has few NICs, so a 32-bucket hash table is more than enough.
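To see how interface indexes map to buckets, the hash can be run in userspace. This is a sketch under one assumption: VLAN_GRP_HASH_SHIFT is taken to be 5, which is consistent with the 32-bucket table described above but is not shown in the excerpt. The self-check function is a hypothetical name for this example.

```c
#include <assert.h>

/* Assumed values: a shift of 5 gives the 32 buckets mentioned in the text. */
#define VLAN_GRP_HASH_SHIFT 5
#define VLAN_GRP_HASH_SIZE  (1 << VLAN_GRP_HASH_SHIFT)
#define VLAN_GRP_HASH_MASK  (VLAN_GRP_HASH_SIZE - 1)

/* Same expression as the kernel function quoted above. */
static unsigned int vlan_grp_hashfn(unsigned int idx)
{
    return ((idx >> VLAN_GRP_HASH_SHIFT) ^ idx) & VLAN_GRP_HASH_MASK;
}

static void hash_self_check(void)
{
    assert(vlan_grp_hashfn(1) == 1);    /* (0 ^ 1)  & 31 */
    assert(vlan_grp_hashfn(2) == 2);    /* (0 ^ 2)  & 31 */
    assert(vlan_grp_hashfn(32) == 1);   /* (1 ^ 32) & 31 = 33 & 31 */
    assert(vlan_grp_hashfn(33) == 0);   /* (1 ^ 33) & 31 = 32 & 31 */
}
```

Interface indexes 1 and 32 land in the same bucket, so a lookup cannot stop at the bucket: each entry on the chain still has to be compared against the real device.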
When a VLAN device is created, the code first checks whether its NIC already has an entry. real_dev here is normally a real NIC such as eth1. real_dev->ifindex is hashed to select a bucket of vlan_group_hash; because several NICs may hash to the same bucket, each entry's real_dev must also be compared with real_dev.

    grp = __vlan_find_group(real_dev);

If no entry exists, a struct vlan_group is allocated and added to vlan_group_hash:

    ngrp = grp = vlan_group_alloc(real_dev);

The structure is defined as follows and represents a real NIC that has VLANs on it. real_dev points to the real NIC, such as eth1; nr_vlans is the number of VLANs created on that NIC; vlan_devices_arrays stores the VLAN virtual NICs that were created:

    struct vlan_group {
        struct net_device *real_dev;
        unsigned int nr_vlans;
        int killall;
        struct hlist_node hlist;    /* linked list */
        struct net_device **vlan_devices_arrays[VLAN_GROUP_ARRAY_SPLIT_PARTS];
        struct rcu_head rcu;
    };

Once the vlan_group exists, the part of vlan_devices_arrays that covers the new VLAN ID is initialized:

    err = vlan_group_prealloc_vid(grp, vlan_id);

Finally, the matching element of vlan_devices_arrays is pointed at the struct net_device of the new VLAN virtual NIC (e.g. eth1.1):

    vlan_group_set_device(grp, vlan_id, dev);

Note that vlan_devices_arrays is effectively a two-dimensional array (strictly, a one-dimensional array whose elements are double pointers). The kernel supports at most 4096 VLAN IDs, and instead of one flat 4096-entry array it uses a two-level directory whose parts are set up by vlan_group_prealloc_vid only when needed. In the 2.6 headers VLAN_GROUP_ARRAY_SPLIT_PARTS is 8 and VLAN_GROUP_ARRAY_PART_LEN is 4096 / 8 = 512, so vlan_devices_arrays has 8 entries, each pointing to an array of 512 device pointers. A VLAN ID maps to vlan_devices_arrays[vlan_id / 512][vlan_id % 512], which puts eth1.100 at vlan_devices_arrays[0][100].

As an example, when the host receives a frame and hands it to the VLAN protocol module (vlan_skb_recv), the device that skb->dev points to must be replaced so that upper layers believe the packet came from the virtual NIC (say eth1.1) and are unaware of eth1. To replace the device, the target must be known, and it is determined by two things: skb->dev and vlan_id. skb->dev says which NIC of the host the frame arrived on; for eth1, skb->dev->name is "eth1". vlan_id is the VLAN number, extracted from the VLAN header of the frame. With these two values the lookup starts from vlan_group_hash: skb->dev->ifindex selects the entry for eth1 and yields its vlan_group, and vlan_id then selects the virtual NIC (eth1.1) in vlan_devices_arrays.

With that background, __find_vlan_dev is easy to read:

- __find_vlan_dev

    struct net_device *__find_vlan_dev(struct net_device *real_dev, u16 vlan_id)
    {
        /* Find the vlan_group from the real device's ifindex. */
        struct vlan_group *grp = __vlan_find_group(real_dev);

        /* Within the vlan_group, find the VLAN device by vlan_id. */
        if (grp)
            return vlan_group_get_device(grp, vlan_id);
        return NULL;
    }

NICs with VLAN support

For a NIC that supports VLAN (802.1Q), the work done by vlan_skb_recv is in effect pushed down to the NIC driver. When the NIC receives a frame and finds that the protocol number in bytes 13 and 14 is the VLAN protocol, the following happens:

1. The VLAN ID is extracted from skb->data and stored in skb->vlan_tci.
2. The 4-byte VLAN tag is removed from skb->data.
3. skb->dev is set to the virtual VLAN device that corresponds to vlan_tci (the VLAN ID).

In this case skb->dev already points to the virtual VLAN device when the packet first enters netif_receive_skb from the driver. The processing in netif_receive_skb is then:

    int netif_receive_skb(struct sk_buff *skb)
    {
        /* ... */
        if (skb->vlan_tci && vlan_hwaccel_do_receive(skb))
            return NET_RX_SUCCESS;
        /* ... */
    }

Because skb->vlan_tci holds the VLAN ID and is non-zero, vlan_hwaccel_do_receive is entered.

- vlan_hwaccel_do_receive

    int vlan_hwaccel_do_receive(struct sk_buff *skb)
    {
        struct net_device *dev = skb->dev;
        struct net_device_stats *stats;

        /* Point skb->dev at the real device behind the VLAN device. */
        skb->dev = vlan_dev_info(dev)->real_dev;
        netif_nit_deliver(skb);

        /* Set skb->dev back to the VLAN virtual device. */
        skb->dev = dev;
        skb->priority = vlan_get_ingress_priority(dev, skb->vlan_tci);
        /* Ensures that a later pass through netif_receive_skb does not
         * enter the VLAN logic again. */
        skb->vlan_tci = 0;

        stats = &dev->stats;    /* update the VLAN device's statistics */
        stats->rx_packets++;
        stats->rx_bytes += skb->len;

        switch (skb->pkt_type) {
        case PACKET_BROADCAST:
            break;
        case PACKET_MULTICAST:
            stats->multicast++;
            break;
        case PACKET_OTHERHOST:
            if (!compare_ether_addr(eth_hdr(skb)->h_dest, dev->dev_addr))
                skb->pkt_type = PACKET_HOST;
            break;
        }
        return 0;   /* returns 0, so netif_receive_skb carries on */
    }

- netif_nit_deliver

    void netif_nit_deliver(struct sk_buff *skb)
    {
        struct packet_type *ptype;

        if (list_empty(&ptype_all))
            return;

        skb_reset_network_header(skb);
        skb_reset_transport_header(skb);
        skb->mac_len = skb->network_header - skb->mac_header;

        rcu_read_lock();
        list_for_each_entry_rcu(ptype, &ptype_all, list) {
            if (!ptype->dev || ptype->dev == skb->dev)
                deliver_skb(skb, ptype, skb->dev);
        }
        rcu_read_unlock();
    }

netif_nit_deliver walks the ptype_all list and delivers the skb to every ptype_all handler. At that moment skb->dev has been replaced by the real device. So whether or not the NIC supports VLAN, if a VLAN device eth0.100 has been created on eth0, tcpdump on eth0 can capture the VLAN traffic; capturing is not limited to eth0.100.

From here on the packet follows the ordinary receive logic in netif_receive_skb.

With reception covered, look at the transmit logic for devices that support VLAN. As shown earlier, when the physical device supports NETIF_F_HW_VLAN_TX:

    dev->netdev_ops = &vlan_netdev_accel_ops;   /* accel = acceleration */

    static const struct net_device_ops vlan_netdev_accel_ops = {
        /* ... */
        .ndo_start_xmit = vlan_dev_hwaccel_hard_start_xmit,
        /* ... */
    };

So vlan_dev_hwaccel_hard_start_xmit is called.

- vlan_dev_hwaccel_hard_start_xmit

    static netdev_tx_t vlan_dev_hwaccel_hard_start_xmit(struct sk_buff *skb,
                                                        struct net_device *dev)
    {
        int i = skb_get_queue_mapping(skb);
        struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
        u16 vlan_tci;
        unsigned int len;
        int ret;

        vlan_tci = vlan_dev_info(dev)->vlan_id;
        vlan_tci |= vlan_dev_get_egress_qos_mask(dev, skb);
        skb = __vlan_hwaccel_put_tag(skb, vlan_tci);    /* set the VLAN ID */

        /* Key point: switch skb->dev from the virtual VLAN device to the
         * corresponding real device. */
        skb->dev = vlan_dev_info(dev)->real_dev;
        len = skb->len;
        ret = dev_queue_xmit(skb);

        if (likely(ret == NET_XMIT_SUCCESS)) {
            txq->tx_packets++;
            txq->tx_bytes += len;
        } else
            txq->tx_dropped++;

        return NETDEV_TX_OK;
    }

Compared with vlan_dev_hard_start_xmit, the transmit function for devices without VLAN support, the logic looks the same: both add the VLAN ID and change skb->dev. Why have two sets of functions, then? They do differ, and the difference is in how the VLAN ID is set here.

- __vlan_hwaccel_put_tag

    static inline struct sk_buff *__vlan_hwaccel_put_tag(struct sk_buff *skb,
                                                         u16 vlan_tci)
    {
        skb->vlan_tci = vlan_tci;
        return skb;
    }
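The contrast between the two tagging helpers can be modelled on a toy buffer. This is only a sketch: struct toy_skb and the toy_* functions are hypothetical stand-ins invented for this example, the software path imitates the push-and-memmove style of __vlan_put_tag without its headroom checks, and the accelerated path does what __vlan_hwaccel_put_tag does above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN  6
#define VLAN_HLEN 4

/* Toy stand-in for the few sk_buff fields that matter here. */
struct toy_skb {
    uint8_t  buf[64];
    uint8_t *data;      /* start of the frame inside buf         */
    size_t   len;       /* frame length in bytes                 */
    uint16_t vlan_tci;  /* out-of-band tag, like skb->vlan_tci   */
};

/* Software path: open 4 bytes in front, move both MAC addresses back,
 * then write the tag into the gap. Assumes 4 bytes of headroom. */
static void toy_put_tag(struct toy_skb *skb, uint16_t tci)
{
    skb->data -= VLAN_HLEN;
    skb->len  += VLAN_HLEN;
    memmove(skb->data, skb->data + VLAN_HLEN, 2 * ETH_ALEN);
    skb->data[12] = 0x81;                    /* ETH_P_8021Q, high byte */
    skb->data[13] = 0x00;
    skb->data[14] = (uint8_t)(tci >> 8);
    skb->data[15] = (uint8_t)(tci & 0xff);
}

/* Accelerated path: record the tag and leave the frame bytes alone. */
static void toy_hwaccel_put_tag(struct toy_skb *skb, uint16_t tci)
{
    skb->vlan_tci = tci;
}

static void tag_self_check(void)
{
    static const uint8_t untagged[16] = {
        1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 0x08, 0x00, 0xAA, 0xBB
    };
    struct toy_skb a = {0}, b = {0};

    a.data = a.buf + 8; a.len = sizeof(untagged);
    memcpy(a.data, untagged, sizeof(untagged));
    b.data = b.buf + 8; b.len = sizeof(untagged);
    memcpy(b.data, untagged, sizeof(untagged));

    toy_put_tag(&a, 100);
    assert(a.len == 20 && a.data == a.buf + 4);
    assert(a.data[12] == 0x81 && a.data[15] == 100);
    assert(a.data[16] == 0x08 && a.data[18] == 0xAA);

    toy_hwaccel_put_tag(&b, 100);
    assert(b.len == 16 && b.vlan_tci == 100);
    assert(memcmp(b.data, untagged, sizeof(untagged)) == 0);
}
```

In the software path the frame grows by 4 bytes and 12 bytes are copied; in the accelerated path the frame is byte-for-byte unchanged and only one field is written.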
As this shows, setting the VLAN ID only sets skb->vlan_tci. It does not modify skb->data to insert the 4-byte VLAN tag; that step is left to the NIC driver. This is the biggest difference from devices without VLAN support: because the tag is not written into skb->data, no bytes need to be copied, which saves CPU time. So for a device with VLAN support, the VLAN tag has already been removed by the time a VLAN MAC frame arrives from the driver, and on transmit the stack does not add the tag either but leaves that to the driver.

Supplement

VLAN receive processing differs between Linux 2.x and 3.x; this article covers only 2.x.

In the Linux 2.6 kernel the receive function is registered through dev_add_pack on the list of layer-3 protocol receive functions. That is, the VLAN receive function sits on the same list as the receive functions of IP, IPv6 and other protocols.

VLAN is, however, a layer-2 matter, so Linux 3.x changed how the tag is stripped. netif_receive_skb calls vlan_untag to strip the VLAN tag from the packet, then calls vlan_do_receive to change skb->dev, and then jumps back to the point where vlan_untag was first called, which completes the switch from real_dev to vlan_dev. This separates tag stripping from the layer-3 receive functions and also avoids the call to netif_rx.