Massive packet loss/NIC issues with new Dell 7070 Ultra in FOG


  • @george1421

    Thanks for the feedback.
    We switched to Dell from another brand, so we don’t have any adapters lying around, but I have a Dell DA200 USB-C dongle with Ethernet at home. Good idea to try that; I’ll bring it tomorrow just to test! 🙂

    I quickly tried some version of Ubuntu live (I think 18.04) on it last week and it worked that time. But I have now downloaded Debian Buster and tried that as well, to be as sure as possible, and the same problem shows up there! Only on one of the PCs, though: Debian Buster on “the first” PC of the same model continues to work fine with the same cables & BIOS… Thanks for pushing me to actually double-check that again!

    I’ll have to submit this PC to Dell as DOA soon, but I’m unsure how they’ll see this since it works in Windows. I hope they have enough goodwill toward us as a new Dell customer with a big order going in.
    Before doing that, I’m reinstalling Windows 10 manually from USB (Win 10 Media Creation Tool) just to make sure that still works.

    Btw, we began our environment change focused on the 3060, then the 3070 came out and we were preparing to purchase that model, but we gradually shifted toward optimizing our physical environment with the 7070 Ultra instead and settled on this model. It’s quite different from the 3060/3070 since it’s built on laptop parts to begin with. It’s still quite expandable and configurable, but there’s no full-size PCIe etc. 😉


    UPDATE

    After talking with a Dell tech who helped me brainstorm a few things, we double-checked the revision of the NIC, which is the same on both/all machines (Rev 11 of the Intel I219-LM Ethernet NIC). We didn’t get any wiser, so I’m basically still at the same spot.

    What I’m concluding is that it seems to work on the newest kernels even on this troublesome PC, so I’m leaning toward building my own kernel, which should sort it out (I’d probably have done that by now if it weren’t for all the other PCs already working…)


  • Moderator

    @no0NE I’d still recommend trying a different cable/switch port just to rule it out completely, even if the first device seems to work properly on it.

    I also found the following on this topic that’s worth checking out: https://wiki.hetzner.de/index.php/Low_performance_with_Intel_i218/i219_NIC/en

  • Senior Developer

    @no0NE So that leads us to the suggestion George made earlier. Boot some kind of Linux Live OS and see if it causes the same problems on this hardware (but not on the other one).


  • @Sebastian-Roth said in Massive packet loss/NIC issues with new Dell 7070 Ultra in FOG:

    @no0NE Have you had the first device (working fine) on the same switch port and with the same cable that you now see the problem with the second “faulty” device?

    Absolutely sure the firmware is the exact same version on both?

    Yes & Yes, sadly. Other than that, I’ve tried several other straight & crossover cables to be fully sure. I’m running out of ideas to troubleshoot, hence this post… 🙂 Good suggestions if I hadn’t already checked, thanks! 🙂

  • Moderator

    @no0NE said in Massive packet loss/NIC issues with new Dell 7070 Ultra in FOG:

    The Dell 7070 Ultra is a desktop

    OK, it’s similar to the 3060s we use then, just the 7000 series. (Off topic, but Dell has kind of made a mess of its series numbers; now it’s hard to tell what is what.)

    So do you have a Dell USB-C dongle from a 7400 or 3530 you could test with? It needs to be a Dell USB-C dongle (Ethernet adapter) so it’s supported by the firmware for PXE booting. I guess you could use a WD15 or WD19 dock. The idea is to try a network adapter other than the built-in one.

    I can tell you the Intel I219 network adapter has been supported since the 4.12.x Linux kernel.

    Something else to try is to boot a Linux live OS and see if you still have networking issues. FOS Linux runs on a current Linux kernel, so if the adapter is supported by Linux then FOS Linux should be able to communicate with it.
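
When comparing live OS images, it can help to check each one's kernel against the 4.12 threshold mentioned above. The following is only a minimal sketch: the 4.12 figure is taken from this post rather than verified independently, and the function names are hypothetical.

```python
import platform
import re

# Threshold quoted in the post above; an assumption, not a verified fact.
MIN_KERNEL = (4, 12)


def parse_kernel_release(release):
    """Return (major, minor) from a release string such as '4.19.0-6-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release, minimum=MIN_KERNEL):
    """True if the release string is at or above the minimum (major, minor)."""
    return parse_kernel_release(release) >= minimum


def running_kernel_ok():
    """Check the kernel of the machine this script runs on."""
    return kernel_at_least(platform.release())
```

Running `running_kernel_ok()` inside each live environment gives a quick yes/no. A kernel that passes can still misbehave on specific hardware, as this thread suggests, so treat it only as a first filter.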

  • Senior Developer

    @no0NE Have you had the first device (working fine) on the same switch port and with the same cable that you now see the problem with the second “faulty” device?

    Absolutely sure the firmware is the exact same version on both?


  • @george1421
    Thanks for the reply!

    As you say, the issue is not on the server side, but rather on the client.

    The Dell 7070 Ultra is a desktop built with laptop components (google it, it’s quite cool 😉). It’s powered through USB-C from our monitors (65 W), which also provides DP & USB hub functionality, but the Ethernet is connected separately, directly to the Ethernet jack of the PC (Intel I219-LM network card).

    I ping and scan the traffic going to the PC from my admin PC, the FOG server, and the other PC of the same model. I can start the ping in Windows and all’s fine, but as soon as it boots into FOG the packet loss starts to occur.

    To isolate as many problem sources as possible, I’ve also connected Ethernet directly between the FOG server and this PC, and the packet loss is still the same: massive!
    I’ve also tried dumb & smart/L3 switches and the results are the same…

    Thanks for trying to help brainstorm how to find the error source, since I might’ve missed something, but from these points I think I’ve eliminated your possible fault points.

  • Moderator

    @no0NE My personal idea is that it’s a target system firmware issue.

    Since you are talking about a 7070, is that a portable or a desktop?

    I really don’t think it’s the FOG server at this point, but instead something on the endpoint side.

    If this device is a portable, are you using a USB-C to Ethernet dongle?

    Where are you seeing the packet loss?

    If you have these devices connected to an enterprise managed switch, are you seeing errors on the port counters?

    If this is a managed enterprise switch, see if Green Ethernet (802.3az) is enabled on the switch port. If it is, disable it to see if that resolves the issue.


  • Just to add: most of the time, trying to start the imaging process on “PC #2” fails after it tries to check in for a while, with an error stating “no route to host” or similar connection-related errors (due to all the packet loss…) and an automated reboot after 1 min. Sometimes, though, it slowly gets to the Partclone stage and is then stuck there at ~0.5-1 MB/s with continued massive packet loss.

    I’d like to try it out on further PCs of this model. It feels like a NIC/hardware issue if it’s only this PC, given that the first PC works every time (20+ times). And the problem PC itself works without issues in Windows/UEFI…

    Packet loss to this PC is the same when it is pinged from any PC on the network, including the FOG server itself.
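
To put a number on the loss from each vantage point, one option is to parse the summary line that `ping` prints. The sketch below is a rough illustration, not what was actually used in this thread: it assumes a Linux or macOS `ping` that accepts `-c` and prints an "N packets transmitted, M received" summary, and both function names are hypothetical.

```python
import re
import subprocess

# Matches Linux ("7 received") and macOS ("7 packets received") summaries.
_SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def packet_loss_percent(ping_output):
    """Compute the loss percentage from the text printed by ping."""
    match = _SUMMARY.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("ping reported zero packets transmitted")
    return 100.0 * (sent - received) / sent


def measure_loss(host, count=20):
    """Run ping against host and return the measured loss percentage."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
    )
    return packet_loss_percent(result.stdout)
```

Calling `measure_loss()` with the client's address from the admin PC, the FOG server and the working PC gives comparable figures for the Windows and FOG boot states.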

    Latest available UEFI/BIOS for this model (v 1.0.2)
