cmcgonag

George,

Thanks for the ideas. I am going to try to build my own customized LIVECD version and see if I can get it to run. Will report back (will most likely take me a while!).

Thanks for the insight.

Colin

cmcgonag

@george1421

Sorry I was away on vacation. Just now getting back to this.

All nodes currently run 64-bit Intel processors. We mostly use AMD GPUs for compute, with some NVIDIA cards for testing. 105 nodes are currently deployed (roughly 100 kW).
We are “trying” to design them to run various compute-level code: mining, blockchain test code, AI, whatever. If it runs on Ubuntu it should run [I totally get that there are a myriad of ways this can fail]. I have tested a ton of different packages with more or less success on my current configuration.
I essentially want someone to give me an image, which I then push to the cluster, and it runs. So far that has worked great with FOG (minus my SSD failures).
I use Ubuntu because it is what I am most familiar with and it seems to have the best driver-level support all around. And the use case is mostly Ubuntu.
I have been running them in GUI mode. If something fails I have a KVM I can link over to see what happened. That isn't to say I couldn't run character mode, but I don't know why I would.
My goal is to scale to 1000 nodes and then lease out the capacity; think a kind of bare-metal AWS, but way more ghetto.

That being said, a LUN may be the way to go, but I don't see why I need that. I should be able to do all this in RAM, but then again, I am just the engineer trying to figure out how to cool all this stuff down (hence the engineering part). I also don't want my network getting bogged down with iSCSI traffic. I am only running 1 GbE in a pretty limited spanning tree config. I want to be like a refinery for processing data: maybe Monday I do some compute, Tuesday AI, Wednesday blockchain, all for way cheaper than anybody else (at least that is my goal lol).

cmcgonag

@george1421 said in How to Add Boot to MEMDISK Option - Syntax Question:

character based computing nodes

They will only run Ubuntu as a node.

cmcgonag

@george1421

I started using FOG “to create a master image and deploy it to the hard drive of many computers.” But now I want to transition to “a diskless (in reference to the target nodes) netboot system that loads and executes everything out of RAM (actually the hard drive on the target computer is not needed at all).”

So whatever I need to do to make that transition, I want to do, even if it means I shouldn't be using FOG. I just know FOG the best, which is why I am trying to use it.

Ultimate goal is to be diskless.

Thanks.

cmcgonag

@george1421 said in How to Add Boot to MEMDISK Option - Syntax Question:

Thanks George! That is super helpful. I am trying to stay away from iSCSI (I don't know much about it either); if I am not mistaken, each node would need its own iSCSI instance to run? I was worried that multiple clients connected to the same iSCSI target would have issues.

So in your netbooting example:

kernel tftp://${fog-ip}/os/ubuntu/Desk17.10/vmlinuz.efi
initrd tftp://${fog-ip}/os/ubuntu/Desk17.10/initrd.lz

“This right here is telling… vmlinuz.efi is… what you might think in Windows is the operating system or kernel. initrd.lz think of it as a virtual hard drive. To get Linux to boot you need a kernel (OS) and a hard drive (initrd). From there… that’s called the boot loader, that tiny OS has enough brains to reach back out to the NFS server (FOG) to get the rest of the operating system and to load it into memory. That is netbooting.”

I guess what I was confused about is how the calls differ. If I PXE boot my node now, it completely wipes the drive and loads the OS over the top, versus booting to memory. How does it know to store the image on sda (currently) versus in RAM (in your example)? My assumption was that the base call (as you listed it) would AUTOMATICALLY overwrite sda (or whatever drive was specified in the host primary disk parameter) and move on. But maybe that is not the case?

cmcgonag

Also I apologize if this question is stupid. I am a mechanical engineer by training and have had to teach myself all this stuff.

cmcgonag

Thanks George. I have read that tutorial. What I was seeing though was this (Ubuntu 17 section):

“In the fog WebGUI go to FOG Configuration->iPXE New Menu Entry
Set the following fields
Menu Item: os.Ubuntu.Desktop.17.10
Description: Ubuntu Desktop 17.10
Parameters:
kernel tftp://${fog-ip}/os/ubuntu/Desk17.10/vmlinuz.efi
initrd tftp://${fog-ip}/os/ubuntu/Desk17.10/initrd.lz
imgargs vmlinuz.efi root=/dev/nfs boot=casper netboot=nfs nfsroot=${fog-ip}:/images/os/ubuntu/Desk17.10/ locale=en_US.UTF-8 keyboard-configuration/layoutcode=us quiet splash ip=dhcp rw
boot || goto MENU
Menu Show with: All Hosts”

None of that code specifies a memdisk or a ramdisk. I assumed I needed to chain “memdisk iso raw” to that tutorial. As it stands, I assume it would load to sda.
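
For what it is worth, the tutorial entry quoted above can be annotated line by line to show where each piece ends up. This is only a sketch: the comments describe general iPXE and casper behaviour and have not been tested on this cluster, and the `toram` option mentioned in the last comment is an assumption that should be checked against the casper documentation for the Ubuntu release in use.

```
# kernel: iPXE downloads the Linux kernel into RAM and selects it for booting.
kernel tftp://${fog-ip}/os/ubuntu/Desk17.10/vmlinuz.efi
# initrd: iPXE downloads the initial ramdisk into RAM alongside the kernel.
initrd tftp://${fog-ip}/os/ubuntu/Desk17.10/initrd.lz
# imgargs: sets the kernel command line. root=/dev/nfs, netboot=nfs and nfsroot=...
# tell the live (casper) system to mount its root filesystem from the NFS share.
# No argument here names sda or any other local disk.
# Assumption: appending the casper option "toram" would ask the live system to copy
# its image into RAM instead of reading it over NFS for the whole session.
imgargs vmlinuz.efi root=/dev/nfs boot=casper netboot=nfs nfsroot=${fog-ip}:/images/os/ubuntu/Desk17.10/ locale=en_US.UTF-8 keyboard-configuration/layoutcode=us quiet splash ip=dhcp rw
# boot: starts the selected kernel; on failure, return to the FOG menu label.
boot || goto MENU
```

Read this way, the entry is already a boot into memory plus NFS. A write to sda would only happen if something in the booted OS were told to do it, which is what a FOG deploy task does and what this menu entry does not.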

cmcgonag

Agreed, it is diskless boot, and I have pored over the iPXE forums. I have seen various results. Most guys are running a very rudimentary PXE solution that seems dated. I also don't want an iSCSI target.

Mostly we use Ubuntu. That is where I would like to start.

I appreciate any help you can give.

cmcgonag

All,

We have a large compute cluster that uses FOG to load various Linux distros onto the nodes to do tasks. Our SSDs have recently started failing at a pretty high rate. I realized that our distros are small (4 GB) and we could just load them into a ramdisk/memdisk and operate that way, as we don't need storage (compute cluster only).

Is the appropriate line to add to the iPXE menu the below:

kernel memdisk
initrd “path to file”
boot
goto start

or is it:

initrd “path to file”
chain memdisk iso raw ||

or do I need to add something more complex like listed here?

https://forums.fogproject.org/topic/2845/ipxe-advanced-menu-or-memdisk-problem/8

Ideally we would use the SSDs only for larger distros and the memdisk for smaller ones. My hope is that I could change a couple of lines to tell iPXE where to store the image when it loads. My goal is to get away from SSDs entirely and operate 100% out of RAM.

Also, is there an option to add memdisk to the task or image, so that when I push to the cluster the nodes boot to a memdisk instead of sda, rather than me trying to trick the menu?
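
To make the two candidate forms concrete, here they are written out as complete iPXE menu parameters. This is a sketch under assumptions, not a tested configuration: `iso-url` and the path in it are hypothetical placeholders, `memdisk` is assumed to be reachable at a path relative to where iPXE was loaded from, and `MENU` is assumed to be a label defined by the surrounding FOG menu. `iso` and `raw` are MEMDISK options.

```
# Hypothetical location of the ISO; replace with the real path.
set iso-url http://${fog-ip}/os/example/example.iso

# Form 1: select memdisk as the "kernel" and attach the ISO as its initrd.
kernel memdisk iso raw
initrd ${iso-url}
boot || goto MENU

# Form 2 (alternative to Form 1, not run in addition to it):
# download the ISO first, then chain to memdisk with the same options.
# initrd ${iso-url}
# chain memdisk iso raw || goto MENU
```

Both forms hand the same ISO to MEMDISK, so the choice between them is mostly stylistic. One general MEMDISK limitation to keep in mind: it presents the image as a disk through BIOS calls, so a Linux live ISO may not find its own media once the Linux kernel takes over. Whether that affects the images used here is not something this sketch can confirm.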

Thanks for your help.

Colin

cmcgonag

@cmcgonag

Got it to run by upping my memory limit to 1280M. Testing now. Currently only getting 200Mb/s transfer speed on eno1.
