Surface 3 Fails to Image
-
@Sebastian-Roth Yes, sorry I didn’t make that clear, I tried 4.2.3 yesterday. I too am confused, as I didn’t have the network problem when I was on SVN 5473, only when I upgraded to 5590.
I tried 4.2.3 again today (still on SVN 5666): I manually deleted bzImage and bzImage32, used the Kernel Update page to download 4.2.3, and I get the same error I posted yesterday.
I booted back into debug mode and verified that the ID of the network adapter is still 045e:07ab.
If I have time today I’ll see if I can do a fresh FOG install using the current stable release and then download kernel 4.2.3 and see if I get the same thing.
@Imperilled Are you using the Microsoft 10/100 network adapter (1552) or the Microsoft gigabit adapter (1663)?
-
@wwarsin said:
@Tom-Elliott Hi Tom, I just tried that and received the following error (I’m on SVN 5666 now)
FOGFTP: Failed to rename file. Remote Path: //var/www/html/fog/service/ipxe/bzImage/backup/bzImage_20151207_214012, Local Path: /var/www/html/fog/service/ipxe/bzImage, Error: ftp_rename(): Rename failed.
You’ll find the FTP credentials and host used for the Kernel updater here:
FOG Configuration -> FOG Settings -> TFTP Server ->
- FOG_TFTP_HOST #This is the IP that FTP uses for putting the kernel in place.
- FOG_TFTP_FTP_USERNAME #This is the username FTP uses.
- FOG_TFTP_FTP_PASSWORD #This is the password FTP uses.
- FOG_TFTP_PXE_KERNEL_DIR #This is where FTP tries to put the kernels.
Also, @Tom-Elliott I’ve recently discovered a bug. I don’t know how long it’s existed but I was able to figure it out today. On both our Administration FOG setup, and my building’s FOG setup (which is separate), the FOG_TFTP_PXE_KERNEL_DIR is missing a slash in the path:
/var/www/html/fog/service/ipxe/
The one colored red. I manually corrected it on both and it works, but we didn’t manually remove that slash. Just pointing that out to maybe help others with the issue.
-
@wwarsin @Imperilled I just found out that all the RTL8152/8153 based connectors are blacklisted in the cdc_ether driver (see here: http://lxr.free-electrons.com/source/drivers/net/usb/cdc_ether.c#L686). Your Microsoft NIC isn’t, I guess just because they don’t know/care about those yet.
What does this mean? To my understanding the cdc_ether driver is not happy with those Realtek USB NICs and tries to stay out of the way. But because your NIC’s IDs are not blacklisted, the driver still jumps in and might possibly work.
But there is a “real” RTL8152/8153 driver in Linux as well (http://lxr.free-electrons.com/source/drivers/net/usb/r8152.c), but this one is not finding your USB NICs as it is not aware of those Microsoft IDs either…
I am compiling a patched kernel (4.3.0) right now. Would you please give this a try? Just move your current bzImage out of the way (backup) and put this test kernel into place.
Update: Just after having finished compiling the kernel I stumbled upon a patch which someone else came up with. Guess what. It’s pretty much exactly what I just did: http://svn.exactcode.de/t2/trunk/package/base/linux/surface-dock-eth.patch
So here is the bzImage: https://drive.google.com/folderview?id=0B-bOeHjoUmyMV095YVpsR3U5VFk&usp=sharing
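Moving the current kernel out of the way and putting a test kernel into place could look roughly like the sketch below. It is not part of FOG: the function name swap_kernel and the backup/ subdirectory are assumptions made for illustration, and the kernel directory is whatever FOG_TFTP_PXE_KERNEL_DIR points to on your server.

```shell
#!/bin/sh
# Hypothetical helper (not a FOG command): back up the current bzImage,
# then copy a test kernel into its place.
# Usage: swap_kernel KERNEL_DIR NEW_IMAGE
swap_kernel() {
    kernel_dir=$1
    new_image=$2
    stamp=$(date +%Y%m%d_%H%M%S)

    # Refuse to touch anything if the test kernel is missing.
    if [ ! -f "$new_image" ]; then
        echo "no such file: $new_image" >&2
        return 1
    fi

    # Assumed layout: backups go into KERNEL_DIR/backup.
    mkdir -p "$kernel_dir/backup" || return 1

    # Keep the old kernel under a timestamped name so it can be restored.
    if [ -f "$kernel_dir/bzImage" ]; then
        mv "$kernel_dir/bzImage" "$kernel_dir/backup/bzImage_$stamp" || return 1
    fi

    cp "$new_image" "$kernel_dir/bzImage"
}
```

For example, `swap_kernel /var/www/html/fog/service/ipxe ./bzImage` would use the kernel directory that appears in the error message earlier in this thread; adjust it if your install lives elsewhere.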
-
Kernels have been updated without CDC_ETHER but with the patch from this thread. Please update and let me know whether things are (or are not) functional now.
-
I installed the latest SVN (5686) and attempted to boot with the bzImage it installed, and also replaced the bzImage file with the one from the link you provided, and I still receive the no network error…
ifconfig in debug mode also still only shows lo
Error ident-mapping new memmap (0x13ac72000)!
Starting logging:* OK
Populating /dev using udev: udevd[2950]: error creating epoll fd: Function not implemented
done
Initializing random number generator... done.
Starting eth0 interface
ip: SIOCSIFFLAGS: No such device
cat: /sys/class/net/eth0/carrier: No such file or directory
cat: /sys/class/net/eth0/carrier: Invalid argument
cat: /sys/class/net/eth0/carrier: Invalid argument
cat: /sys/class/net/eth0/carrier: Invalid argument
cat: /sys/class/net/eth0/carrier: Invalid argument
cat: /sys/class/net/eth0/carrier: Invalid argument
cat: /sys/class/net/eth0/carrier: Invalid argument
cat: /sys/class/net/eth0/carrier: Invalid argument
cat: /sys/class/net/eth0/carrier: Invalid argument
cat: /sys/class/net/eth0/carrier: Invalid argument
ssh-keygen: generating ew boot keys: RSA DSA ECDSA ED25519
Starting sshd: OK
I may not be able to do any testing until mid/late next week (or even the week after next) after today.
-
@wwarsin Can you please try
ifconfig -a
and as well
ls -al /sys/class/net/eth0/device/driver/module
(to see which driver is actually used)… And could you also please check dmesg:
dmesg | grep eth
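The "which driver is actually used" check can also be done by resolving the driver symlink that sysfs exposes next to the module link. A minimal sketch, assuming the usual /sys/class/net/IFACE/device/driver layout; the function name nic_driver and the optional second argument (an alternative sysfs root, handy for trying it outside a real boot) are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the driver bound to an interface,
# or "none" if the interface (or its driver link) does not exist.
# Usage: nic_driver IFACE [SYSFS_ROOT]
nic_driver() {
    iface=$1
    sysfs=${2:-/sys}
    link="$sysfs/class/net/$iface/device/driver"

    if [ ! -e "$link" ]; then
        echo "none"
        return 1
    fi

    # The link points at the driver's directory; its last component is the name.
    basename "$(readlink -f "$link")"
}
```

Running `nic_driver eth0` in the debug shell should then show whether cdc_ether or r8152 has claimed the adapter, provided eth0 exists at all.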
-
I will try your kernel tomorrow. I need to find a solution because my images are still in raw… 124 GB × 15 Surfaces, that’s a day to deploy with only one adapter.
-
@Imperilled The kernel won’t make a difference in your particular case! I would really like to see your issue fixed as well. Could you please open a new topic on this? It makes it a lot easier for everyone to follow if we don’t discuss two different topics in one thread! Please let us know what error you see when trying to upload a Multiple Partition - Single Disk image and we should be able to help you on this.
-
I upgraded to SVN 5762 and the Surface booted directly into debug mode (instead of my having to manually enter the IP address of the FOG server), but the network still isn’t working. I ran lsusb again because I’ve acquired the Microsoft Model 1663 (USB to Ethernet gigabit adapter):
045e:07c6
If you prefer I test with the Model 1552 (10/100 adapter), let me know, but we’ll probably use the 1663 once we get this working.
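Since two different Microsoft adapters come up in this thread, a small filter over lsusb output can help tell them apart. This is only a sketch: the function name surface_nic_model is made up, it assumes lsusb prints the vendor:product pair in lowercase on each device line, and the ID-to-model mapping is the one reported in this thread (045e:07ab for Model 1552, 045e:07c6 for Model 1663).

```shell
#!/bin/sh
# Hypothetical helper: read lsusb-style lines on stdin and print the
# Microsoft adapter model for each known ID found.
# Usage: lsusb | surface_nic_model
surface_nic_model() {
    while IFS= read -r line; do
        case $line in
            *"045e:07ab"*) echo "1552" ;;  # 10/100 adapter, per this thread
            *"045e:07c6"*) echo "1663" ;;  # gigabit adapter, per this thread
        esac
    done
}
```

If it prints nothing, no adapter with either ID is visible on the USB bus.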
-
@wwarsin Yes, we changed the iPXE script which is probably why you don’t see it asking for the IP address anymore. Upgrading to the latest version means that the latest kernel has been downloaded as well. I am not sure if the patch is still part of the kernel. @Tom-Elliott??
Re-reading all the posts I saw that you had a full dmesg output posted with your very first question already (thank god you did!):
cdc_ether 1-2.4:2.0 eth0: register 'cdc_ether' at usb-0000:00:14.0-2.4, CDC Ethernet Device, 60:45:bd:f9:62:b6
...
cdc_ether 1-2.4:2.0 eth0: kevent 12 may have been dropped
So to me this means that an older kernel version was magically able to run your USB NIC device with the cdc_ether driver. I am not sure why this is not working anymore even if we add this driver back to the kernel. But I am not comfortable with the cdc_ether driver anyway and would hope that we don’t need it at all (RTL8152 chips being blacklisted is just one thing I don’t like about it).
So we are back to the question: Are we able to make this USB NIC work with the r8152 driver and possibly how??
In your last post I see
cat: /sys/class/net/eth0/carrier: No such file or directory
cat: /sys/class/net/eth0/carrier: Invalid argument
Looks like eth0 is not available on the first try but pops up at some point. My guess is that Tom removed the patch after your last try as it didn’t seem to work. So upgrading to the latest version might have installed a kernel without the patch. But from what I see in your posts I have a feeling that we are pretty close to getting this to work with the patch.
Feel free to try this kernel again: https://drive.google.com/folderview?id=0B-bOeHjoUmyMV095YVpsR3U5VFk&usp=sharing
Boot into debug mode and wait for a few seconds. Then see what you get from
ls -al /sys/class/net/eth0/device/driver/module
and
dmesg | grep 8152
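Because eth0 seems to show up only after a delay, the "wait for a few seconds" step can be made explicit by polling sysfs until the interface appears. A rough sketch under assumptions: the function name wait_for_iface, the default 10 second timeout, and the optional third argument (an alternative sysfs root for trying it out) are all made up for illustration and are not part of the FOG init scripts.

```shell
#!/bin/sh
# Hypothetical helper: wait until a network interface shows up in sysfs.
# Returns 0 as soon as it exists, 1 if it is still missing after TIMEOUT seconds.
# Usage: wait_for_iface IFACE [TIMEOUT_SECONDS] [SYSFS_ROOT]
wait_for_iface() {
    iface=$1
    timeout=${2:-10}
    sysfs=${3:-/sys}
    waited=0

    while [ "$waited" -lt "$timeout" ]; do
        [ -e "$sysfs/class/net/$iface" ] && return 0
        sleep 1
        waited=$((waited + 1))
    done

    # One last check after the final sleep.
    [ -e "$sysfs/class/net/$iface" ]
}
```

In the debug shell, something like `wait_for_iface eth0 15 && ls -al /sys/class/net/eth0/device/driver/module` would run the check only once the interface is actually there.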
-
@Sebastian-Roth What patch are you referring to?
I removed my edits yes, cause you are absolutely correct that they didn’t matter anyway.
So my custom patches are gone. I have not built a 4.3.3 kernel yet though I am aware it was released.
I have not added CDC_ETHER either as I really don’t think it would matter either.
I am still doing the mmc patch (which, along with working on the init scripts, is part of why I’m slow to update to the latest all the time), but I know that part has no relevance to the issues in this thread.
-
@Tom-Elliott Talking about this patch: http://svn.exactcode.de/t2/trunk/package/base/linux/surface-dock-eth.patch as I feel like this might be going down the right lane with this issue. Should work with any kernel version I reckon. Keep cdc_ether disabled, please.
-
@Sebastian-Roth I downloaded the bzImage you linked to, replaced the one in /var/www/fog/service/ipxe and booted to debug mode.
Here are the results of the two commands:
[root@fogclient /]# ls -al /sys/class/net/eth0/device/driver/module
ls: cannot access /sys/class/net/eth0/device/driver/module: No such file or directory
[root@fogclient /]# cd /sys/class/net
[root@fogclient /]# ls
lo@
[root@fogclient /]# dmesg | grep 8152
[ 1.070704] usbcore: registered new interface driver r8152
[root@fogclient /]#
-
@wwarsin Thanks for trying and reporting. Does not look very good to me. But as I can see in one of your earlier posts it has kind of worked (at least not “No such file or directory”) in the past. What messages did you see this time when booting up into debug mode??
-
@wwarsin I am wondering if the device is actually recognized. Could you please boot the device into debug mode? When you see the shell, unplug the USB NIC. Wait for a few seconds and plug it back in. Then run
dmesg | tail -n 20
Would be great if you could take a picture of what you see on the screen. Hopefully we might see something similar to this: https://bugzilla.redhat.com/show_bug.cgi?id=1236679 (we don’t have the same issue in FOG, just posting this to show what the output might look like).
I stumbled upon a newer driver (kernel source code) on realtek.com.tw: version 2.05.0 (2015/8/13) instead of the 1.08.2 (2014) included in kernel 4.3.3. Will try to build a kernel with that Realtek code.
Edit (read this first): I just remembered that I only added ID 045e:07ab (Microsoft Model 1552) to the kernel I compiled last because this was the device you reported using at first. So this means that using my kernel (the one I compiled 10 days ago) would not work with Microsoft Model 1663 (045e:07c6). I compiled two new kernel images that you can find here: https://drive.google.com/folderview?id=0B-bOeHjoUmyMV095YVpsR3U5VFk&usp=sharing
Both have the Microsoft device IDs for 1552 AND 1663 added! bzImage.vanilla is a plain kernel, and bzImage.realtek I compiled with the earlier mentioned driver code from the Realtek website. Try vanilla first, as I hope that we don’t need the newest Realtek driver. Again, boot into debug mode and try the commands. Also try re-plugging the device to see if it is recognized (see dmesg | tail -n 20).
-
@Sebastian-Roth Hi Sebastian, I will try to get this to you next week or the week after due to the Holidays…
Have a great Christmas and New Years!
-
@Sebastian-Roth I just tested with the vanilla bzImage and in debug mode the network works! However, when I tried to capture an image the network fails and reports
I upgraded to SVN 5798 and am still using the Microsoft Model 1663 network adapter.
Edit:
I seem to be getting mixed results now… I created the capture task and it first booted to the no network screen. Then it booted to the white FOG splash screen, and another attempt looked like it loaded correctly to start capturing the image; however, the screen went by so fast that I couldn’t catch any errors (the Surface just restarted).
-
@wwarsin said:
I upgraded to SVN 5798 and am still using the Microsoft Model 1663 network adapter.
Whenever you upgrade you are using the current official FOG trunk kernel which does not have the patch included (to add the mentioned IDs).
It would probably be good if you could take a video of the screen so we have a chance to see if there is an error and when exactly things go wrong. As I don’t have a Surface device I need your assistance to figure this out. But it looks like we are making progress. At least we seem to have networking up again, maybe only part of the time, not sure why?!
-
@Sebastian-Roth I assume bzImage is the kernel? I downloaded this file after I upgraded to the newest SVN. I’ll hold off on upgrading FOG until this is resolved.
I’m out the rest of this week (food poisoning) but will take a video of the Surface next week.
-
@wwarsin Sorry for my late reply. Have been without internet for some days. Hope you are better again!
Yes, bzImage is the kernel. Re-download it and put it in the correct path (/var/www …) if you have upgraded. But there shouldn’t be a need to upgrade right now if you don’t see other issues. Probably best if you don’t, so that we hopefully get more stable test results.
Looking forward to the video.