    Clients imaging despite receiving "Read ERROR: No such file or directory" and "ata1.00: failed command" errors

    • D
      danieln
      last edited by danieln

      OS: Debian
      FOG Version: 1.5.9-RC2

      I added one additional node to my FOG server. Sometimes (only with certain images), a client PXE boots normally and then throws the following error right before Partclone starts:

      [screenshot]

      However, once it continues after about a minute, it begins to image.

      Then, usually during imaging it will display the following message:

      [screenshot]

      But FOG will still image the device correctly once Partclone is completed. It’s bizarre to me that it would say that it could not write the image due to “no such file or directory” and then image it. I also double checked on the node to make sure the image file was in /images and it’s definitely there.

      Perhaps one of my drives on the node is failing? It’s brand new though.

      I’m also curious as to what the ata1.00: failed command error could possibly mean, especially given that the client still images successfully.

      Any ideas on what may be going on?

      Thanks in advance, this community has been a lifesaver and is much appreciated!

      • S
        Sebastian Roth Moderator
        last edited by Sebastian Roth

        @danieln Please run ls -al /images/DellE5450-80-Non-Office/ on your FOG server console and post output here.

        It’s strange that you get that many ATA error messages and it still finishes. I would never expect that! Do you have another device of the exact same model? Does it show the same error messages when deploying to that?

        Web GUI issue? Please check apache error (debian/ubuntu: /var/log/apache2/error.log, centos/fedora/rhel: /var/log/httpd/error_log) and php-fpm log (/var/log/php*-fpm.log)

        Please support FOG if you like it: https://wiki.fogproject.org/wiki/index.php/Support_FOG

        • D
          danieln @Sebastian Roth
          last edited by

          @sebastian-roth Weird, right?

          Here is the output of ls -al /images/DellE5450-80-Non-Office/ on the Master Node:

          [screenshot]

          and here is that same output on the Node that was throwing those errors:

          [screenshot]

          • S
            Sebastian Roth Moderator
            last edited by

            @danieln Have you tried imaging that exact same machine from both servers, and do you only get the “No such file or directory” error on the latter one?

            And asking again, do you have another notebook - exact same model - that you can deploy to, just to see if you get the same ATA errors?!

            • D
              danieln @Sebastian Roth
              last edited by danieln

              @sebastian-roth said in Clients imaging despite receiving "Read ERROR: No such file or directory" and "ata1.00: failed command" errors:

              @danieln Have you tried imaging that exact same machine from both servers, and do you only get the “No such file or directory” error on the latter one?

              And asking again, do you have another notebook - exact same model - that you can deploy to, just to see if you get the same ATA errors?!

              I only appear to get the “No such file or directory” error on the node I set up yesterday. However, I am now getting ATA errors on the other nodes too, with multiple Dell E5480s. Here’s a screenshot of what I’m seeing on one that I am currently imaging from a different node:

              [screenshot]

              And again, it finishes correctly. This is a picture of the same screen moments later:

              [screenshot]

              Do you think it’s maybe isolated to the image? I’d assume the ATA errors have something to do with the hard drive but I’m not sure what.

              • S
                Sebastian Roth Moderator
                last edited by Sebastian Roth

                @danieln said in Clients imaging despite receiving "Read ERROR: No such file or directory" and "ata1.00: failed command" errors:

                I only appear to get the “No such file or directory” error on the node I set up yesterday.

                I should have explained this in more depth earlier. FOS (the Linux OS doing all the work) reads the image file (e.g. d1p1.img) and pipes it through a decompression fifo. So if partclone says “No such file”, it’s very likely that the decompression process feeding the fifo died for some reason (corrupted file, RAM issue, …) and partclone is no longer able to read from it.
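
To make the mechanism concrete, here is a minimal sketch of reading a compressed image through a named pipe. It is not the actual FOS script: the function name `restore_via_fifo` is hypothetical, `gzip` stands in for whichever decompressor the image uses, and `cat` stands in for partclone. The point is only that the reader depends on a background writer, so a writer that dies surfaces as a failure even though the image file itself exists.

```shell
# Illustrative only: not the real FOS code. gzip stands in for the
# decompressor and cat stands in for partclone.
restore_via_fifo() {
    image=$1      # compressed image file, e.g. d1p1.img
    target=$2     # where the decompressed data is written
    workdir=$(mktemp -d) || return 1
    fifo="$workdir/decompress.fifo"
    mkfifo "$fifo" || { rm -rf "$workdir"; return 1; }

    # Writer: decompress the image into the fifo in the background.
    gzip -dc "$image" > "$fifo" &
    writer=$!

    # Reader: consume the fifo (partclone would do this in FOS).
    reader_status=0
    cat "$fifo" > "$target" || reader_status=$?

    # If the reader never ran, stop the writer so it does not block forever.
    [ "$reader_status" -eq 0 ] || kill "$writer" 2>/dev/null

    # Collect the writer's exit status; non-zero means decompression failed.
    writer_status=0
    wait "$writer" || writer_status=$?

    rm -rf "$workdir"
    [ "$reader_status" -eq 0 ] && [ "$writer_status" -eq 0 ]
}
```

Calling `restore_via_fifo image.gz /tmp/out` on a corrupted file returns a non-zero status because the background decompressor fails, which is the situation described above.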

                Please run file /images/DellE5450-80-Non-Office/d1p1.img and md5sum /images/DellE5450-80-Non-Office/d1p1.img on both your nodes and compare the output. Which compression do you use, Gzip or Zstd?
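
One way to make that comparison less error-prone is a small helper that prints only the hash, so the values from both nodes can be compared directly. This is a sketch: `image_md5` and `same_md5` are hypothetical helper names, and how the command is run on the other node (for example over ssh) is left open.

```shell
# Print only the MD5 hash of a file (md5sum prints "<hash>  <path>").
image_md5() {
    md5sum "$1" | cut -d ' ' -f 1
}

# Succeed only if two hash strings are non-empty and identical.
same_md5() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}
```

For example, run `image_md5 /images/DellE5450-80-Non-Office/d1p1.img` on each node and pass the two results to `same_md5`; a non-zero exit status means the copies differ.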

                Do you think it’s maybe isolated to the image? I’d assume the ATA errors have something to do with the hard drive but I’m not sure what.

                The ATA errors come from the same FOS (FOG Linux OS), and I would read them as an issue between the Linux kernel and those particular notebooks. It is possible the deploy is fine despite the messages, but I am not sure. When you search the web for those ATA messages, people very often say that the SATA cable or even the power supply (in PCs) can cause them. Windows is often less picky about this kind of thing, so I can imagine Linux complaining (while still trying hard) where Windows does not.

                • D
                  danieln @Sebastian Roth
                  last edited by

                  @sebastian-roth Thanks very much for the response and for the very helpful info!

                  I should have explained this in more depth earlier. FOS (the Linux OS doing all the work) reads the image file (e.g. d1p1.img) and pipes it through a decompression fifo. So if partclone says “No such file”, it’s very likely that the decompression process feeding the fifo died for some reason (corrupted file, RAM issue, …) and partclone is no longer able to read from it.

                  Please run file /images/DellE5450-80-Non-Office/d1p1.img and md5sum /images/DellE5450-80-Non-Office/d1p1.img on both your nodes and compare the output. Which compression do you use, Gzip or Zstd?

                  That makes sense. It’s just weird to me that it would die on this node when it’s brand new, but I suppose it’s possible. Perhaps I’ll just try a recapture. Or, if I go into the node and manually delete the DellE5450-80 directory, will the Master know to re-propagate it? If not, I could try a recapture and see if that works.

                  The output of file /images/DellE5450-80-Non-Office/d1p1.img on both the Master node and the node I was having issues with was the following :

                  /images/DellE5450-80-Non-Office/d1p1.img: Zstandard compressed data (v0.8+), Dictionary ID: None
                  

                  The output of md5sum /images/DellE5450-80-Non-Office/d1p1.img on the Master node was:

                  e929a14a17c60b2b9a7dfdf18f526232  /images/DellE5450-80-Non-Office/d1p1.img
                  

                  The output of md5sum /images/DellE5450-80-Non-Office/d1p1.img on the problematic node was:

                  1d4bf4ac2bcef83013fe4589149b0e30  /images/DellE5450-80-Non-Office/d1p1.img
                  

                  I am using Zstd for compression. Do you recommend Gzip? What are the pros/cons of both?

                  Do you think it’s maybe isolated to the image? I’d assume the ATA errors have something to do with the hard drive but I’m not sure what.

                  The ATA errors come from the same FOS (FOG Linux OS), and I would read them as an issue between the Linux kernel and those particular notebooks. It is possible the deploy is fine despite the messages, but I am not sure. When you search the web for those ATA messages, people very often say that the SATA cable or even the power supply (in PCs) can cause them. Windows is often less picky about this kind of thing, so I can imagine Linux complaining (while still trying hard) where Windows does not.

                  I will say that I replaced the hard drive on one of the client laptops that was having that issue and it was resolved, but I attempted a hard drive replacement on a separate client and it was still throwing the ATA errors, so maybe it was something else. But you’re thinking it’s more along the lines of hardware issues with the laptop and not with FOS or the Node itself? I feel like it only throws those ATA errors when connecting to that one node, but I could be wrong. Maybe that’s the next thing I’ll test.

                  • S
                    Sebastian Roth Moderator
                    last edited by Sebastian Roth

                    @danieln said in Clients imaging despite receiving "Read ERROR: No such file or directory" and "ata1.00: failed command" errors:

                    The output of md5sum /images/DellE5450-80-Non-Office/d1p1.img on the Master node was:
                    e929a14a17c60b2b9a7dfdf18f526232 /images/DellE5450-80-Non-Office/d1p1.img

                    The output of md5sum /images/DellE5450-80-Non-Office/d1p1.img on the problematic node was:
                    1d4bf4ac2bcef83013fe4589149b0e30 /images/DellE5450-80-Non-Office/d1p1.img

                    That’s very interesting. I did not expect the checksums to be different, but it’s good that I asked. To me that means the file was not replicated properly from the master to the storage node. So please delete /images/DellE5450-80-Non-Office/d1p1.img on the storage node and wait until it has been replicated from the master again. Then check the md5sums again. The FOG replication service checks file size and checksums (the checksum check only happens for smaller files, because calculating checksums for large files on every run puts too much load on the server), but this seems to be a rare case where the file size matches and the checksum doesn’t.
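
The difference between a size check and a checksum check can be shown with a short sketch. It is not the replication service's code; `replica_state` is a hypothetical helper that reports whether two local copies differ in size, have the same size but different content, or are identical.

```shell
# Classify two copies of a file by size first, then by MD5 checksum.
replica_state() {
    size_a=$(wc -c < "$1") || return 2
    size_b=$(wc -c < "$2") || return 2
    if [ "$size_a" -ne "$size_b" ]; then
        echo "size-mismatch"
    elif [ "$(md5sum < "$1")" = "$(md5sum < "$2")" ]; then
        echo "identical"
    else
        # The case seen in this thread: a size-only check would pass.
        echo "same-size-different-content"
    fi
}
```

A result of `same-size-different-content` corresponds to the situation here, where a size comparison alone would not have flagged the bad copy.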

                    I am using Zstd for compression. Do you recommend Gzip? What are the pros/cons of both?

                    Both are fine. I tend to use Zstd more and more.

                    you’re thinking it’s more along the lines of hardware issues with the laptop and not with FOS or the Node itself? I feel like it only throws those ATA errors when connecting to that one node, but I could be wrong. Maybe that’s the next thing I’ll test.

                    Yes I would say it’s very unlikely to be caused by FOS or the node unless you have different FOG/kernel versions installed.

                    • D
                      danieln @Sebastian Roth
                      last edited by

                      @sebastian-roth

                      @danieln said in Clients imaging despite receiving "Read ERROR: No such file or directory" and "ata1.00: failed command" errors:

                      The output of md5sum /images/DellE5450-80-Non-Office/d1p1.img on the Master node was:
                      e929a14a17c60b2b9a7dfdf18f526232 /images/DellE5450-80-Non-Office/d1p1.img

                      The output of md5sum /images/DellE5450-80-Non-Office/d1p1.img on the problematic node was:
                      1d4bf4ac2bcef83013fe4589149b0e30 /images/DellE5450-80-Non-Office/d1p1.img

                      That’s very interesting. I did not expect the checksums to be different, but it’s good that I asked. To me that means the file was not replicated properly from the master to the storage node. So please delete /images/DellE5450-80-Non-Office/d1p1.img on the storage node and wait until it has been replicated from the master again. Then check the md5sums again. The FOG replication service checks file size and checksums (the checksum check only happens for smaller files, because calculating checksums for large files on every run puts too much load on the server), but this seems to be a rare case where the file size matches and the checksum doesn’t.

                      I cannot exaggerate how useful this information is for me to know for the future. So thanks a million! I will try that and report back. However, just for my own clarification, those two checksum outputs should be the same if it replicated correctly?

                      I am using Zstd for compression. Do you recommend Gzip? What are the pros/cons of both?

                      Both are fine. I tend to use Zstd more and more.

                      Good to know.

                      • S
                        Sebastian Roth Moderator
                        last edited by

                        @danieln said in Clients imaging despite receiving "Read ERROR: No such file or directory" and "ata1.00: failed command" errors:

                        However, just for my own clarification, those two checksum outputs should be the same if it replicated correctly?

                        Exactly!

                        • D
                          danieln @Sebastian Roth
                          last edited by

                          @sebastian-roth said

                          you’re thinking it’s more along the lines of hardware issues with the laptop and not with FOS or the Node itself? I feel like it only throws those ATA errors when connecting to that one node, but I could be wrong. Maybe that’s the next thing I’ll test.

                          Yes I would say it’s very unlikely to be caused by FOS or the node unless you have different FOG/kernel versions installed.

                          I’d like to circle back around to this to get some more clarification. My master node has version 1.5.9-RC2 and this particular node (as well as all the other nodes) has version 1.5.9. I’m unsure what the differences between RC2 and the 1.5.9 release are, but do you think this is something worth investigating?

                          • S
                            Sebastian Roth Moderator
                            last edited by

                            @danieln said in Clients imaging despite receiving "Read ERROR: No such file or directory" and "ata1.00: failed command" errors:

                            I’d like to circle back around to this to get some more clarification. My master node has version 1.5.9-RC2 and this particular node (as well as all the other nodes) has version 1.5.9. I’m unsure what the differences between RC2 and the 1.5.9 release are, but do you think this is something worth investigating?

                            Sure, taking a quick look doesn’t hurt. The 1.5.9-RC2 was a release candidate not long before 1.5.9 was released. Possibly that uses a different kernel - depends on when it was installed.

                            You should be able to get the kernel version by running the following command on all your nodes: file /var/www/html/fog/service/ipxe/bzImage* (the * will include the 64 bit and 32 bit kernel - the output should show kernel version 4.19.x or possibly 5.6.x.)
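
If you want to compare several nodes quickly, the version can be pulled out of the `file` output with a small filter. This is a sketch under the assumption that the output contains the text `bzImage, version <number>`, as it does in the output posted later in this thread; other versions of `file` may format the line differently. The helper name `kernel_version_from_file_output` is hypothetical.

```shell
# Extract the kernel version from one line of `file` output.
# Prints nothing if the line does not look like a bzImage description.
kernel_version_from_file_output() {
    printf '%s\n' "$1" |
        sed -n 's/.*bzImage, version \([0-9][0-9.]*\).*/\1/p'
}
```

A possible use is `for k in /var/www/html/fog/service/ipxe/bzImage*; do kernel_version_from_file_output "$(file "$k")"; done`, run on each node.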

                            • D
                              danieln @Sebastian Roth
                              last edited by

                              @sebastian-roth said

                              Sure, taking a quick look doesn’t hurt. The 1.5.9-RC2 was a release candidate not long before 1.5.9 was released. Possibly that uses a different kernel - depends on when it was installed.

                              You should be able to get the kernel version by running the following command on all your nodes: file /var/www/html/fog/service/ipxe/bzImage* (the * will include the 64 bit and 32 bit kernel - the output should show kernel version 4.19.x or possibly 5.6.x.)

                              Well I’ll be damned. The kernel on the master and all of the working nodes is version 4.19.123 and the problematic node’s kernel is version 4.19.145. You think that may be what’s causing the issue? But why would some of the other images be working fine then?

                              At any rate, is there an easy way to update this kernel without having to do a full reinstall?

                              • george1421G
                                george1421 Moderator @danieln
                                last edited by

                                @danieln said in Clients imaging despite receiving "Read ERROR: No such file or directory" and "ata1.00: failed command" errors:

                                is there an easy way to update this kernel

                                FOG (FOS) kernels can be downloaded from here: https://fogproject.org/kernels/ (download both the 64-bit and 32-bit kernels). Save the 64-bit kernel as bzImage and the 32-bit kernel as bzImage32 (case is important). Then you can just move the files to the /var/www/html/fog/service/ipxe directory on the FOG server. It probably wouldn’t hurt to rename the existing ones before you move the new kernels in. You can confirm the version of the bzImage files with file /var/www/html/fog/service/ipxe/bzImage, which should print out the version of the kernel.
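
The rename-then-replace step can be sketched as a small function. It assumes the kernel has already been downloaded to a local file; the download file names are not given here, so none are assumed. The helper name `install_kernel` and the `.bak` suffix are hypothetical choices, and file ownership or permissions may still need adjusting afterwards.

```shell
# Copy a downloaded kernel into place, keeping the old one as <name>.bak.
install_kernel() {
    new_kernel=$1   # path to the downloaded kernel file
    ipxe_dir=$2     # e.g. /var/www/html/fog/service/ipxe
    name=$3         # bzImage or bzImage32 (case matters)

    [ -f "$new_kernel" ] || { echo "missing: $new_kernel" >&2; return 1; }

    # Keep the existing kernel so the change can be rolled back.
    if [ -f "$ipxe_dir/$name" ]; then
        mv "$ipxe_dir/$name" "$ipxe_dir/$name.bak" || return 1
    fi
    cp "$new_kernel" "$ipxe_dir/$name"
}
```

Called once for bzImage and once for bzImage32, this leaves the previous kernels next to the new ones so they can be restored by renaming them back.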

                                Please help us build the FOG community with everyone involved. It's not just about coding - way more we need people to test things, update documentation and most importantly work on uniting the community of people enjoying and working on FOG!

                                • D
                                  danieln @george1421
                                  last edited by

                                  @george1421 said

                                  FOG (FOS) kernels can be downloaded from here: https://fogproject.org/kernels/ (download both the 64-bit and 32-bit kernels). Save the 64-bit kernel as bzImage and the 32-bit kernel as bzImage32 (case is important). Then you can just move the files to the /var/www/html/fog/service/ipxe directory on the FOG server. It probably wouldn’t hurt to rename the existing ones before you move the new kernels in. You can confirm the version of the bzImage files with file /var/www/html/fog/service/ipxe/bzImage, which should print out the version of the kernel.

                                  Thank you for this info! I downloaded those files, renamed them, and moved them to /var/www/html/fog/service/ipxe. I kept the old files and renamed them bzImageOLD and bzImage32OLD respectively. The new output of file /var/www/html/fog/service/ipxe/bzImage* is this:

                                  /var/www/html/fog/service/ipxe/bzImage:      Linux kernel x86 boot executable bzImage, version 4.19.123 (jenkins-agent@Tollana) #1 SMP Sun May 17 01:04:09 CDT 2020, RO-rootFS, swap_dev 0x8, Normal VGA
                                  /var/www/html/fog/service/ipxe/bzImage32:    Linux kernel x86 boot executable bzImage, version 4.19.123 (jenkins-agent@Tollana) #1 SMP Sat May 16 23:59:01 CDT 2020, RO-rootFS, swap_dev 0x7, Normal VGA
                                  /var/www/html/fog/service/ipxe/bzImage32OLD: Linux kernel x86 boot executable bzImage, version 4.19.145 (sebastian@Tollana) #1 SMP Sun Sep 13 05:43:10 CDT 2020, RO-rootFS, swap_dev 0x7, Normal VGA
                                  /var/www/html/fog/service/ipxe/bzImageOLD:   Linux kernel x86 boot executable bzImage, version 4.19.145 (sebastian@Tollana) #1 SMP Sun Sep 13 05:35:01 CDT 2020, RO-rootFS, swap_dev 0x8, Normal VGA

                                  Version 4.19.123 is what is on the master as well as all the other nodes. I trust that FOG will boot the new kernels since they have the proper names, even though the renamed old files are still in the same directory. I will run a recapture/deploy test and report back with findings.

                                  Thank you both again!

                                  • george1421G
                                    george1421 Moderator @danieln
                                    last edited by

                                    @danieln Correct. As long as bzImage and bzImage32 are the kernels you want, the proper kernels will boot.
