The production server 172.18.30.67 has four network ports.

Of these, eno1 and eno2 are two 10-gigabit fiber ports, and enp26s0f0 and enp26s0f1 are two gigabit Ethernet ports.

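Before bonding was configured, eno2 obtained an address over DHCP through NetworkManager. eno1 and eno2 were then bonded, but eno2 kept working on its own as well: its address was not removed after bonding, so the host ended up with two default gateways. The routing table looks like this: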

[root@renhe-18-30-67 ~]# ip r
default via 172.18.29.254 dev eno2 
default via 172.18.31.254 dev br0.199 proto static metric 426 
172.18.28.0/23 dev eno2 proto kernel scope link src 172.18.28.67 
172.18.30.0/23 dev br0.199 proto kernel scope link src 172.18.30.67 metric 426 
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1
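
To spot this condition without reading the table by eye, a small read-only helper can list the default routes. This is a minimal sketch, not part of the original procedure; the function names `list_default_routes` and `has_multiple_defaults` are made up for illustration, and the parsing assumes the `ip route` text format shown above.

```shell
# Print "<gateway> <device>" for every default route in `ip route` text on stdin.
list_default_routes() {
    awk '$1 == "default" {
        gw = ""; dev = ""
        for (i = 2; i < NF; i++) {
            if ($i == "via") gw = $(i + 1)
            if ($i == "dev") dev = $(i + 1)
        }
        print gw, dev
    }'
}

# Succeed (exit status 0) only when more than one default route is present.
has_multiple_defaults() {
    list_default_routes | awk 'END { exit !(NR > 1) }'
}

# Usage on a live host (read-only):
#   ip route show | list_default_routes
#   ip route show | has_multiple_defaults && echo "more than one default gateway"
```

Because the helpers only read text from stdin, they can be tried against a saved copy of the routing table before being pointed at the live host.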

eno2 obtained the address 172.18.28.67 from DHCP:

ifconfig eno2
eno2: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500
        inet 172.18.28.67  netmask 255.255.254.0  broadcast 172.18.29.255
        ether b4:05:5d:08:e0:d8  txqueuelen 1000  (Ethernet)
        RX packets 81768780  bytes 5398659503 (5.0 GiB)
        RX errors 0  dropped 7  overruns 0  frame 0
        TX packets 10044620  bytes 467566304 (445.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

In this situation, the fix goes as follows.

The bonding layout is: physical NIC -> bond -> bond.xxx -> br.xxx

First check the bonding mode:

cat /sys/class/net/bond0/bonding/mode
active-backup 1

Then check the state of the bonded NICs:

cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: eno1 (primary_reselect always)
Currently Active Slave: eno1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eno1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: b4:05:5d:08:e0:d8
Slave queue ID: 0

Slave Interface: eno2
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: b4:05:5d:08:e0:d9
Slave queue ID: 0

Confirm the active slave once more:

cat /sys/class/net/bond0/bonding/active_slave
eno1
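
The bonding checks above can be condensed into two small parsers. This is only a sketch under the assumption that `/proc/net/bonding/bond0` keeps the layout shown above; `bond_active_slave` and `bond_slave_status` are hypothetical helper names, not existing tools.

```shell
# Print the currently active slave from /proc/net/bonding/<bond> text on stdin.
bond_active_slave() {
    awk -F': ' '/^Currently Active Slave:/ { print $2 }'
}

# Print "<slave> <mii-status>" for each slave in the same text.
# The bond-level "MII Status" line is skipped because no slave has been seen yet.
bond_slave_status() {
    awk -F': ' '
        /^Slave Interface:/ { slave = $2; next }
        /^MII Status:/ && slave != "" { print slave, $2; slave = "" }
    '
}

# Usage on a live host (read-only):
#   bond_active_slave < /proc/net/bonding/bond0
#   bond_slave_status < /proc/net/bonding/bond0
```

If the active slave is eno1 and both slaves report `up`, that matches the state this procedure relies on before touching eno2.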

Then confirm that the second default gateway is reachable:

ping -I br0.199 172.18.31.254

From the above we can conclude that eno1 is the primary (and currently active) NIC, so shutting down eno2 will not affect anything.

Next steps:

First find the PID of dhclient; here it is 2252:

    1   2252   2252   2252 ?            -1 Ss       0   8:50 /sbin/dhclient -1 -q -lf /var/lib/dhclient/dhclient--eno2.lease -pf /var/run/dhclient-eno2.pid -H renhe-18-30-67 eno2
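
Rather than reading the PID off the process list by hand, it could be extracted with a filter like the one below. This is a sketch only: `dhclient_pids_for` is a hypothetical helper, and it assumes the interface name is the last argument on the dhclient command line, as in the process entry above.

```shell
# Print PIDs of dhclient processes whose last argument is the given interface.
# Expects "<pid> <command line>" lines on stdin, e.g. from `ps -e -o pid= -o args=`.
dhclient_pids_for() {
    awk -v iface="$1" '$2 ~ /(^|\/)dhclient$/ && $NF == iface { print $1 }'
}

# Usage on a live host (read-only):
#   ps -e -o pid= -o args= | dhclient_pids_for eno2
```

The PID file named by the `-pf` option on that command line (`/var/run/dhclient-eno2.pid`) is another place to cross-check the number.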

Then deal with it:

systemctl stop NetworkManager
systemctl disable NetworkManager
# kill dhclient
kill -9 2252

After this, the IP address on eno2 disappears after a while.

If it does not disappear:

ip link set eno2 down

Then wait a moment and bring the interface back up:

ip link set eno2 up

That should be enough.
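
As a more targeted alternative to bouncing the link, the stray default route and address could be removed explicitly. The sketch below only prints the commands so they can be reviewed first; it is not the procedure used here. The helper names are hypothetical, and it rests on an assumption worth verifying on the host: on Linux, taking a link down usually drops the routes through it but may leave the IPv4 address configured.

```shell
# Print (do not run) a command removing each default route bound to the interface.
# Reads `ip route show` text on stdin.
print_route_cleanup() {
    awk -v iface="$1" '$1 == "default" {
        gw = ""; dev = ""
        for (i = 2; i < NF; i++) {
            if ($i == "via") gw = $(i + 1)
            if ($i == "dev") dev = $(i + 1)
        }
        if (dev == iface && gw != "") print "ip route del default via " gw " dev " dev
    }'
}

# Print (do not run) a command removing each IPv4 address on the interface.
# Reads `ip -o -4 addr show` text on stdin.
print_addr_cleanup() {
    awk -v iface="$1" '$2 == iface && $3 == "inet" { print "ip addr del " $4 " dev " $2 }'
}

# Usage on a live host (prints commands only):
#   ip route show       | print_route_cleanup eno2
#   ip -o -4 addr show  | print_addr_cleanup eno2
```

Printing instead of executing keeps a human in the loop, which suits a server that cannot tolerate downtime.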

Because this is a production server that cannot be taken offline, the whole operation was carried out this cautiously.