Our company recently purchased a few Dell 7070 Ultras so we can prepare for switching to this PC/setup in our production environment.
So far I've successfully captured and deployed a test image from one of my two test PCs of this model. My problem is that on the second PC, of the exact same model and batch, as soon as the bzImage/kernel file has loaded (and throughout the rest of the imaging process), there is massive packet loss and response times range from ~500 to 5000+ ms, with many packets dropped altogether. As a result, imaging takes a weekend instead of ~5 min. Imaging also works correctly on our legacy hardware running undionly.kkpxe/BIOS.
Once in Windows/UEFI/anywhere other than FOG, the PC has normal response times and everything works perfectly, which suggests the NIC itself is fine…
The problem starts before any imaging/capturing begins, as soon as the kernel is loaded, which probably points to a driver issue. What confuses me, in that case, is why the first PC works like a charm every time…
I've manually upgraded the kernel to Kernel.TomElliott.18.104.22.168 (from the included .48 kernel), with no difference.
At this stage I'd like to test more PCs of this model, but it will be a couple of months before that's possible.
So my question is: do you have anything else to point me in a direction for further troubleshooting, or is there a newer kernel/driver that might simply work better? It is a brand new model, and even a completely new series from Dell, after all…
I've tried swapping ports and cables between the PC that works and the one that doesn't; it's always this specific PC that fails, with any combination of cables etc. Once, about 99% of the imaging process suddenly seemed to work and I managed to deploy the image to the PC, but that's once in 50+ tries. At random points during imaging it might start working with ~1 ms response times for 5-10 seconds and then stop working again. Sometimes (maybe 10% of the time), if I pull the Ethernet cable out for a couple of seconds and plug it back in, it also works for the first 5-10 seconds… It really feels like a driver issue.
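To put numbers on the difference between the two PCs, ping replies captured while each one sits in the FOG environment could be summarized roughly like this. It is only a minimal sketch: it assumes Linux iputils-style reply lines containing `time=... ms`, and the function name `summarize_ping` and the sample address are hypothetical, not part of FOG.

```python
import re
import statistics

# Assumption: reply lines look like Linux iputils ping output,
# e.g. "64 bytes from 192.168.1.50: icmp_seq=1 ttl=64 time=0.912 ms".
RTT_PATTERN = re.compile(r"time[=<]([\d.]+)\s*ms")


def summarize_ping(lines, sent):
    """Return packet loss (%) and min/median/max round-trip time in ms.

    lines: iterable of ping output lines (non-reply lines are ignored)
    sent:  number of echo requests that were sent
    """
    rtts = []
    for line in lines:
        match = RTT_PATTERN.search(line)
        if match:
            rtts.append(float(match.group(1)))

    received = len(rtts)
    loss_pct = 100.0 * (sent - received) / sent if sent else 0.0
    summary = {"sent": sent, "received": received, "loss_pct": loss_pct}
    if rtts:
        summary["min_ms"] = min(rtts)
        summary["median_ms"] = statistics.median(rtts)
        summary["max_ms"] = max(rtts)
    return summary


if __name__ == "__main__":
    # Hypothetical sample: 4 requests sent, 3 replies, icmp_seq=3 lost.
    sample = [
        "64 bytes from 192.168.1.50: icmp_seq=1 ttl=64 time=0.912 ms",
        "64 bytes from 192.168.1.50: icmp_seq=2 ttl=64 time=1840 ms",
        "64 bytes from 192.168.1.50: icmp_seq=4 ttl=64 time=4210 ms",
    ]
    print(summarize_ping(sample, sent=4))
```

Running the same summary against the working PC and the failing PC would make it easier to show how far apart the two really are.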
Do you think a new/different kernel version should solve this, or a newer FOG version altogether, or something else?
FOG 1.5.7 stable, ARM (FOG test environment on Raspberry Pi 4, 4 GB, Raspbian Buster, latest updates as of last week)
Imaging/capturing: Dell 7070 Ultra, i5-8365U, 8 GB RAM, UEFI, with default ipxe.efi and the .48 x64 and .64 x64 FOG/Tom kernels
NIC: Intel I219-LM
UPDATE: a new BIOS, 1.1.2, was released on 21 October. I updated to this version and the same issue remains on this PC. Fresh Windows 10 install from Microsoft: network works perfectly. Debian Buster, FOG 1.5.7 etc.: same issue. Everything still works perfectly on the first PC.
Today (22 October) our branch office told me that, with our help, they've managed to run FOG on the two Dell 7070 Ultras there, and those work fine as well! Very strange that this single PC would behave like this, and only in a Linux environment.
Any help greatly appreciated, thanks!
Robin, IT Specialist. (With client PC environment responsibility among other things)