Dual NIC clients
-
OK - did some thinking. It times out because there is no default gateway set on the secondary link. With a gateway set, it connects. The problem is, how do I know which network it chooses? I’m getting inconsistent transfer speeds now, averaging 5GB/min on some runs versus 225MB/min on others - apparently depending on which NIC it connects through.
I should mention that inter-VLAN routing is enabled on the layer 3 switch of the primary network. Removing the secondary network from the static route list or pulling the physical link kills it again - this time at trying to send an inventory before deploying.
If I pull the power on the secondary network they will all deploy at high speeds.
With the secondary network on (and inter-VLAN routing), some will deploy normally, others slowly - apparently arbitrarily, as the same machines will act differently from task to task.
It would seem the kernel arbitrarily sets which NIC is eth0 from boot to boot? That would perhaps explain why it appears to use different NICs. If I pull the plug on the secondary network while deploying, it stops deploying until plugged back in. So it’s using the “wrong” NIC…
Anyone ever see something like this?
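One way to see which physical port ended up as eth0 on a given boot is to print each interface's MAC address and negotiated link speed from sysfs. This is only a sketch: it assumes a Linux client with /sys/class/net mounted, and the helper name list_nics and its optional directory argument are made up for illustration.

```shell
#!/bin/sh
# list_nics [SYSFS_NET_DIR]: print "name mac state speed" for each interface.
# The directory argument is an illustration aid; it defaults to /sys/class/net.
list_nics() (
    net_dir=${1:-/sys/class/net}
    for path in "$net_dir"/*; do
        # Skip entries that are not interfaces (no address file).
        [ -e "$path/address" ] || continue
        name=${path##*/}
        [ "$name" = lo ] && continue
        mac=$(cat "$path/address")
        state=$(cat "$path/operstate" 2>/dev/null || echo unknown)
        # speed is reported in Mbit/s; reading it can fail while the link is down.
        speed=$(cat "$path/speed" 2>/dev/null || echo unknown)
        echo "$name $mac $state $speed"
    done
)

list_nics
```

Comparing the MAC shown next to eth0 across a few boots would show whether the naming really changes from boot to boot.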
-
@tag Is it arbitrary? If you try the same PC over and over, does it report different speeds?
I think it has to do with the order your system reports the NICs in. If it reports the slower NIC first it will use that. Perhaps that’s something you can alter on your end?
-
@Quazz
Yes, sometimes an image will deploy at 5GB/min - a few minutes later, when trying again with the same client, it will only deploy at a fraction of that speed… Otherwise I agree; I too believe it has to do with the order of the NICs.
I tried swapping the cables at first to see if it just chose one specific hardware ID first, but that was not the case.
Since then I’ve done quite a few test deployments on the same eight machines. Mostly they’ll deploy slowly, but every now and again one will run on the faster NIC, which can be verified by pulling the cables and seeing which one makes it pause.
I’m not saying it’s arbitrary, but I don’t see a pattern so it seems arbitrary to me.
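A less hands-on check than pulling cables would be to ask the kernel which interface it routes through to reach the server. The sketch below assumes iproute2's ip command is available in the boot environment; the helper name route_dev and the 10.0.0.5 address are placeholders, not values from this thread.

```shell
#!/bin/sh
# route_dev: read `ip route get <address>` output on stdin and print the
# interface name that follows the "dev" keyword.
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Typical use on the client (10.0.0.5 stands in for the FOG server's IP):
#   ip route get 10.0.0.5 | route_dev
# Demonstration with a captured sample line:
echo "10.0.0.5 dev eth1 src 10.0.0.57 uid 0" | route_dev
```

Running this at the start of several tasks on the same client would show whether the chosen interface really varies.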
-
@tag Are you using quick image?
Might be possible to tell it to use the faster NIC by registering them, assigning the faster NIC as primary NIC and deploying in that manner, but I really can’t be certain on that. I was kind of hoping someone else would chime in on this, heh.
-
@Quazz
No, they are registered and deployed through tasks. Actually it’s not a question of one NIC being faster than the other - they’re more or less identical mobo dual NICs - so much as the link speed. The secondary network has 100Mbps switch ports on a 100Mbps trunk to the layer 3 switch. It was never intended for large data transfers - just remote access and so on. The primary MAC in the host registration is the one on the faster link, so that has no effect, I’m afraid. The primary network has 1Gbps switch ports for the clients and two ports in EtherChannel for the server.
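As a rough sanity check on those link speeds, the theoretical ceiling of each link can be worked out in MB/min. This is a back-of-the-envelope sketch that ignores protocol overhead and image compression; the function name ceiling_mb_per_min is made up for illustration.

```shell
#!/bin/sh
# ceiling_mb_per_min LINK_MBPS: theoretical maximum in MB/min for a link
# speed given in Mbit/s (8 bits per byte, 60 seconds per minute, no overhead).
ceiling_mb_per_min() {
    echo $(( $1 * 60 / 8 ))
}

ceiling_mb_per_min 100    # 100Mbps port  -> 750
ceiling_mb_per_min 1000   # 1Gbps port    -> 7500
```

The observed 225MB/min sits under the 750MB/min ceiling of a 100Mbps link, and 5GB/min sits under the 7500MB/min ceiling of a 1Gbps link, which is consistent with the slow deployments running over the secondary network.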
-
@tag said in Dual NIC clients:
If i understand you correctly, your suggestion of trunking would enable the client to connect to the TFTP server on either link?
No, the developmental version of fog is called “fog trunk”.
-
@Quazz do we know of any linux kernel arguments that specify only using a particular nic, or disabling a particular nic?
Maybe even a custom kernel could be an answer? Or custom init?
-
@Wayne-Workman The problem is, what arguments can we use to differentiate? It sounds like they’re basically identical NICs connecting to different network outlets, probably getting inconsistent names as well. (based on his results)
https://www.kernel.org/doc/Documentation/kernel-parameters.txt
Scroll down to grcan.enable0
Looks like those options are useful?
-
@Quazz very nice. I read its description, but the one below it caught my eye:
grcan.select= [HW] Select which physical interface to use. Format: 0 | 1 Default: 0
So perhaps try this for the host’s kernel arguments (web gui -> host management -> desired host -> kernel arguments)
grcan.select=1
See what happens?
-
@Quazz and @Wayne-Workman
Thanks for the suggestions.
I tried playing around with grcan.enable0=[0|1] and grcan.enable1=[0|1] as well as grcan.select=[0|1] but none had any effect. The kernel continues to choose the slower link in most cases. Seemed promising, though…
What I do notice for the first time, though, is that whatever NIC is not chosen is disabled. I hadn’t noticed as I can’t see the backs of the boxes very well. Here it is also obvious that the active NIC changes on occasion, as the LEDs die on the disabled NIC.
-
@tag I think you’re going to have to build a custom init. You can change the fog.upload and fog.download scripts. The idea would be to use a shell script to determine which interface is on the right network, and then disable the other interface (or enable the right one). It should be pretty simple.
How experienced are you with shell scripting?
Also, here’s a link on how to unpack and re-pack the inits:
https://wiki.fogproject.org/wiki/index.php?title=Build_FOG_file_system_with_BuildRoot
I’m willing to help do this - but I wouldn’t have time until tonight to mess with it.
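The idea described in this post (find the interface on the right network, take the others down) could be sketched roughly as follows. It assumes a list of "name network" pairs has already been gathered for however many NICs the machine has. The helper names pick_iface and others are hypothetical, the addresses are made up, and the sketch only prints the commands it would run rather than touching any interface.

```shell
#!/bin/sh
# pick_iface DESIRED_NETWORK: read "name network" pairs on stdin and print the
# first interface whose network ID matches. Exits non-zero when none matches.
pick_iface() {
    awk -v want="$1" '$2 == want { print $1; found = 1; exit } END { exit !found }'
}

# others KEEP: read the same pairs and print every interface name except KEEP.
others() {
    awk -v keep="$1" '$1 != keep { print $1 }'
}

# Made-up example table for a machine with three NICs.
table='eth0 192.168.10.0
eth1 10.0.0.0
eth2 172.16.0.0'

keep=$(printf '%s\n' "$table" | pick_iface 10.0.0.0)
echo "keep: $keep"
printf '%s\n' "$table" | others "$keep" | while read -r nic; do
    echo "would run: ip link set $nic down"
done
```

Because the table is read line by line, the same logic works for two, three, or more interfaces without hard-coding a count.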
-
Thanks for the reply.
Seems kind of inflexible, though… The same init is used for all, right? We even have some clients with three NICs at other locations… If it has to take various hw scenarios into account, it might take some fancy scripting.
I know some basic scripting but nothing really fancy.
The only way to determine the correct interface would be to filter on IP, as I see it.
So maybe a list of interfaces and then for each ethX in the list:
#!/bin/csh
set nwid = X.X.X
set list = (eth0 eth1)
foreach eth ($list)
    set ip = `ifconfig $eth | grep inet | awk '{print $2}' | sed 's/addr://' | cut -c-10`
    if ($ip == $nwid) then
        ifup $eth
    else
        ifdown $eth
    endif
end
That would in my case get the network ID of the correct network and other dissimilar outputs from the other interfaces in the list, which could then be compared to a set value of the correct network ID. Based on that comparison you could then turn the interfaces on or off.
I’m sure someone else could do something a lot niftier.
I haven’t tested any of this and it might screw up if the number of interfaces actually present is different from the number in the list.
-
@tag You’ve got the right idea - but that specific code is inflexible.
Tonight I’ll put something together that will take the IPs, and the subnet mask, and calculate the subnet ID and use that for comparison.
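To illustrate the calculation being proposed here: the subnet (network) ID is the IP address ANDed with the subnet mask. The sketch below does that in plain sh arithmetic from an IP and a CIDR prefix length. The function name network_id and the sample addresses are assumptions for illustration, not the code that was later written for the init.

```shell
#!/bin/sh
# network_id IP PREFIX: print the network address for a dotted-quad IPv4
# address and a CIDR prefix length (0 to 32).
network_id() (
    ip=$1
    prefix=$2
    # Split the address into its four octets.
    IFS=.
    set -- $ip
    addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    # Build the 32-bit mask; a /0 prefix is handled separately to avoid
    # shifting by the full word width.
    if [ "$prefix" -eq 0 ]; then
        mask=0
    else
        mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
    fi
    net=$(( addr & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))"
)

network_id 192.168.1.57 24   # prints 192.168.1.0
network_id 10.20.30.40 20    # prints 10.20.16.0
```

Two interfaces are on the same subnet exactly when this value matches, which avoids the fixed-width string comparison in the csh draft.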
-
@Wayne-Workman
Thanks again. Yes, that code will only work on the specific network defined in $nwid, only if the kernel names the interfaces ethX, and probably only if the number of interfaces matches that particular piece of hw…
Mighty nice of you to help me out here… Appreciate it.
Thanks.
-
@tag Well the way I write it, it’ll work with however many NICs a system has. We will need to use the new feature that @Tom-Elliott so kindly implemented maybe a month ago, the host’s Host Init field. Basically we will build an init for each of your subnets, then use groups to assign the right inits to the right computers - so that those computers in those subnets use the correct interface. Sounds like a lot - but I really don’t think it is. I think this is going to be very easy.
-
I’ve been working on this.
This is my first go-round with a custom init so I’m asking @Sebastian-Roth, @Tom-Elliott and @george1421 to take a look, too.
I’ve not tested, as I don’t readily have available a machine with multiple interfaces, but I think I’ve got a universal init that you can pass a custom kernel argument to, which will ensure the correct interface is up, and others are down. So far, I’ve coded for three possible interfaces. I had many of these functions already written in another project I’ve been working on.
In the init, I’ve edited the file /usr/share/fog/lib/funcs.sh to include these functions:

cidr2mask() {
    # Expects CIDR notation (a single integer between 0 and 32).
    local i=""
    local mask=""
    local full_octets=$(($1/8))
    local partial_octet=$(($1%8))
    for ((i=0;i<4;i+=1)); do
        if [[ $i -lt $full_octets ]]; then
            mask+=255
        elif [[ $i -eq $full_octets ]]; then
            mask+=$((256 - 2**(8-$partial_octet)))
        else
            mask+=0
        fi
        test $i -lt 3 && mask+=.
    done
    echo $mask
}

getCidr() {
    # Expects an interface name to be passed.
    local cidr
    cidr=$(ip -f inet -o addr | grep $1 | awk -F'[ /]+' '/global/ {print $5}' | head -n2 | tail -n1)
    # Fall back to /32 when the interface has no global address.
    echo ${cidr:-32}
}

mask2network() {
    # Expects IP address passed 1st, and Subnet Mask passed 2nd.
    OIFS=$IFS
    IFS='.'
    read -r i1 i2 i3 i4 <<< "$1"
    read -r m1 m2 m3 m4 <<< "$2"
    IFS=$OIFS
    printf "%d.%d.%d.%d\n" "$((i1 & m1))" "$((i2 & m2))" "$((i3 & m3))" "$((i4 & m4))"
}

GetInterfaceInfo() {
    DIR="/"
    # Lines 3, 5 and 7 of "ip link show" are the first three interfaces after lo.
    ip link show > $DIR/interfaces.txt
    interface1name="$(sed -n '3p' $DIR/interfaces.txt)"
    interface2name="$(sed -n '5p' $DIR/interfaces.txt)"
    interface3name="$(sed -n '7p' $DIR/interfaces.txt)"
    rm -f $DIR/interfaces.txt
    # Keep only the interface name (the second colon-separated field).
    echo $interface1name | cut -d \: -f2 | cut -c2- > $DIR/interface1name.txt
    echo $interface2name | cut -d \: -f2 | cut -c2- > $DIR/interface2name.txt
    echo $interface3name | cut -d \: -f2 | cut -c2- > $DIR/interface3name.txt
    interface1name="$(cat $DIR/interface1name.txt)"
    interface2name="$(cat $DIR/interface2name.txt)"
    interface3name="$(cat $DIR/interface3name.txt)"
    rm -f $DIR/interface1name.txt
    rm -f $DIR/interface2name.txt
    rm -f $DIR/interface3name.txt
    # Bring up interfaces.
    echo "iface $interface1name inet dhcp" >>/etc/network/interfaces
    echo "iface $interface2name inet dhcp" >>/etc/network/interfaces
    echo "iface $interface3name inet dhcp" >>/etc/network/interfaces
    ip link set $interface1name up
    ip link set $interface2name up
    ip link set $interface3name up
    sleep 4
    interface1ip="$(/sbin/ip addr show | grep $interface1name | grep -o "inet [0-9]*\.[0-9]*\.[0-9]*\.[0-9]*" | grep -o "[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*")"
    interface2ip="$(/sbin/ip addr show | grep $interface2name | grep -o "inet [0-9]*\.[0-9]*\.[0-9]*\.[0-9]*" | grep -o "[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*")"
    interface3ip="$(/sbin/ip addr show | grep $interface3name | grep -o "inet [0-9]*\.[0-9]*\.[0-9]*\.[0-9]*" | grep -o "[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*")"
    # Interfaces without an address fall back to loopback so they never match.
    if [[ -z $interface1ip ]]; then
        interface1ip=127.0.0.1
    fi
    if [[ -z $interface2ip ]]; then
        interface2ip=127.0.0.1
    fi
    if [[ -z $interface3ip ]]; then
        interface3ip=127.0.0.1
    fi
    interface1network=$(mask2network $interface1ip $(cidr2mask $(getCidr $interface1name)))
    interface2network=$(mask2network $interface2ip $(cidr2mask $(getCidr $interface2name)))
    interface3network=$(mask2network $interface3ip $(cidr2mask $(getCidr $interface3name)))
}

setCorrectInterface() {
    # Read the desired network ID from the USE_NETWORK kernel argument.
    for arg in $(cat /proc/cmdline); do
        echo $arg | grep -q USE_NETWORK
        if [ $? == 0 ]; then
            val=$(echo $arg | cut -d= -f2)
            desiredNetwork=$val
        fi
    done
    GetInterfaceInfo
    # [[ a == b ]] needs spaces around ==, otherwise it only tests for a non-empty string.
    if [[ $interface1network == "$desiredNetwork" ]]; then
        ip link set $interface1name up
        ip link set $interface2name down
        ip link set $interface3name down
    elif [[ $interface2network == "$desiredNetwork" ]]; then
        ip link set $interface1name down
        ip link set $interface2name up
        ip link set $interface3name down
    elif [[ $interface3network == "$desiredNetwork" ]]; then
        ip link set $interface1name down
        ip link set $interface2name down
        ip link set $interface3name up
    fi
}
And I’ve added a call to the main function in the main fog script file /bin/fog between the usb part and the task calling, around line 10, like this:

#!/bin/bash
. /usr/share/fog/lib/funcs.sh
### If USB Boot device we need a way to get the kernel args properly
if [[ $boottype == usb && ! -z $web ]]; then
    mac=$(getMACAddresses)
    wget -q -O /tmp/hinfo.txt "http://${web}service/hostinfo.php?mac=$mac"
    [[ -f /tmp/hinfo.txt ]] && . /tmp/hinfo.txt
fi
setCorrectInterface
if [[ -n $mode && $mode != +(*debug*) ]]; then
    case $mode in
        wipe) fog.wipe ;;
        checkdisk) fog.testdisk ;;
        photorec) fog.photorec ;;
        badblocks) fog.surfacetest ;;
        clamav) fog.av ;;
        autoreg) fog.auto.reg ;;
        manreg) fog.man.reg ;;
        inventory) fog.inventory ;;
        capone) fog.capone ;;
        winpassreset) fog.chntpw ;;
        quickimage) fog.quickimage ;;
        sysinfo) fog.sysinfo ;;
        "donate.full") fog.donatefull ;;
        *) handleError "Fatal Error: Unknown mode :: $mode ($0)\n Args Passed: $*" ;;
    esac
else
    case $type in
        down) fog.download ;;
        up) fog.upload ;;
        *)
            [[ -z $type ]] && type="Null"
            handleError "Fatal Error: Unknown request type :: $type"
            ;;
    esac
fi
With modifications to the init like this (and using fog trunk), you’d simply specify this custom init, and pass the kernel argument for the network you want to use. For example:
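As a hedged illustration only: the host's kernel arguments field would carry a USE_NETWORK=<network ID> entry, which the init then reads from the kernel command line. The sketch below shows that lookup against a sample string. The 192.168.10.0 network, the other arguments in the sample, and the helper name get_use_network are all made up for the example.

```shell
#!/bin/sh
# get_use_network CMDLINE: print the value of the USE_NETWORK= argument
# found in a kernel command line string (prints nothing if it is absent).
get_use_network() {
    for arg in $1; do
        case $arg in
            USE_NETWORK=*) echo "${arg#USE_NETWORK=}" ;;
        esac
    done
}

# On the client this would be: get_use_network "$(cat /proc/cmdline)"
# Demonstration with a made-up command line:
get_use_network "loglevel=4 web=192.168.10.5/fog/ USE_NETWORK=192.168.10.0"
```

Hosts on a different subnet would get a different USE_NETWORK value, assigned per host or per group.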
-
That would work nicely, seeing you can use different inits. I didn’t know that as it is not possible in 1.2.0.
The caveat is that I would have to redo the server as trunk requires a newer Ubuntu according to @Quazz.
-
@tag correct. No getting around that.