What can we do when we don't trust UUID?
@Wayne-Workman as I said,
Obviously the fields I picked were completely arbitrary and some testing may need to go into it to pick the best identifiers.
and as Tom said, this is just brainstorming how to best identify hosts; nothing is set in stone.
Also, for the reasons you stated, HDD and RAM have a lower weight in the fuzzy search, but they are still important metrics.
As for the motherboard being changed out, that’s essentially saying you have a brand new computer, and there is not much we can do about that. But even then, the fuzzy search would be able to suggest a couple of hosts that are more likely than the rest. That would allow FOG to cooperate with FOG admins: if a host has been changed greatly, FOG can prompt the user to select which host it is and then update its metrics. The point of a fuzzy search is to handle most changes in a host gracefully. There will always be extremes.
This is just a discussion. Nothing here is set in stone, the idea is to come up with a better solution than relying on a single point.
Wayne Workman last edited by
Trying to keep this short but - I don’t think HDD serial numbers should be used because disks fail and get replaced. Also, a technician may trade HDDs in two boxes to see if a problem stays with the original box or moves to the new box.
I don’t think RAM information is a good one either, since RAM can be added. CPU would be better but still not great.
If the motherboard is changed out, that is effectively a new host - because it’ll come with a new MAC and new motherboard serial number.
@george1421 @Sebastian-Roth this is a very good conversation to be having. FOG 2.0 was looking at system UUIDs to identify computers, but for the reasons you stated, that wouldn’t work very well. Thinking out loud here, maybe these points are worth considering:
- There is no single value we can rely on, as you showed in your original post, and some derived value based on client information may be the best route to go.
- A static key derived from values may not be the best idea. Instead it should be a weighted component comprised of several fields. For example, one could think about it like so:
- UUID: .4
- Primary MAC: .2
- Motherboard asset #: .2
- Hard drive asset #: .1
- RAM/CPU information: .1
For each field that matches, a score gets incremented by that amount; the host with the highest score, provided it is above some threshold, gets selected/matched (essentially a fuzzy search). This provides some tolerance against a machine’s hardware being upgraded, or portable network adapters being used. Obviously the fields I picked were completely arbitrary and some testing may need to go into it to pick the best identifiers.
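To make the weighted idea above a bit more concrete, here is a rough Python sketch. It is only an illustration, not how FOG does or will do it: the field names, the `THRESHOLD` of 0.5, and the rule that blank values never count as a match are all my own assumptions. The weights are the arbitrary ones from the list.

```python
# Weights taken from the list above; they are arbitrary placeholders.
WEIGHTS = {
    "uuid": 0.4,
    "primary_mac": 0.2,
    "motherboard_asset": 0.2,
    "hdd_asset": 0.1,
    "ram_cpu": 0.1,
}

# Hypothetical cut-off: a host scoring below this is not considered a match.
THRESHOLD = 0.5


def score(reported, stored):
    """Sum the weights of every field where the reported value equals the stored one."""
    total = 0.0
    for field, weight in WEIGHTS.items():
        value = reported.get(field)
        # Assumption: empty or missing values never count as a match.
        if value and value == stored.get(field):
            total += weight
    return total


def match_host(reported, hosts):
    """Return (host_name, score) for the best-scoring host, or None if below THRESHOLD."""
    best_name, best_score = None, 0.0
    for name, stored in hosts.items():
        current = score(reported, stored)
        if current > best_score:
            best_name, best_score = name, current
    if best_name is not None and best_score >= THRESHOLD:
        return best_name, best_score
    return None
```

With something like this, a client that boots with a USB network adapter and a replaced disk would still match its stored record on UUID, motherboard asset, and RAM/CPU, while a client that only matches on RAM/CPU would fall below the threshold and need an admin to pick the host.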
@sebastian-roth I wonder if we should see if the testers can provide more examples of what I posted. We only have Dells at my work, and from what I found almost all Dells are exactly the same in what they report. It may be of some value to get the inventory of the MSI board that started this quest too. The more data we have to start with, the better the decision will be even before any code is crafted.
As for iPXE, that is going to be a difficult one.
These four look interesting:
- manufacturer (string): Manufacturer
- product (string): Product name
- serial (string): Serial number
- asset (string): Asset tag
You can’t/shouldn’t use the hard drive serial number, since that is the one device that may change more frequently than the other values. A hard drive crashes, is replaced, and now we have a new identity.
Looking through the specs you posted I get the impression that we should be able to reliably identify unique systems checking those five:
- MAC address
- System UUID
- System Serial Number
- Motherboard Serial Number
- Hard Disk Serial Number
Though this is where the problems start. We need iPXE to report those figures right on boot-up so we can identify each and every client properly. So I am afraid we cannot wait for Linux to come up and read the DMI information. We might need to add some code to iPXE to be able to make this work.
Check out this list: http://ipxe.org/cfg (serial, asset, … sounds good - and we should be able to add more in case we need)
As described in the github issue I still think we should not add another field to the DB for that.
I’m not suggesting one way or the other here, but just opening the discussion.
Adding a new field would be possible, especially if the fluid value can be calculated based on values already in the inventory database. This could be a simple (or complex) SQL script delivered with a FOG update just like we do today. If the value can be calculated then it can be populated.
You do raise an interesting point, what are the impacts to the FOG Client if we move away from mac address being the key identifier? The mac address will need to be involved somewhere still with booting since iPXE isn’t as smart as the FOG programmer’s PHP skills.
@george1421 Thanks for opening this thread. I’ve been onto that and already opened an issue on GitHub to work on it, but it’s still a great idea to discuss this important issue here in the forums as not all of you are on GitHub, I suspect.
Good to start with collecting system information as you did. So we see what’s available before we actually get into deciding about what to use.
As described in the GitHub issue, I still think we should not add another field to the DB for that. With a new field, all FOG users would have to go through some kind of process to fill it with proper information for each and every client they have. Though we could do all this with the fog-client and other means of updating the DB field when clients run tasks, I don’t think we need to go there.
What I propose is adding logic to the PHP code that kind of concatenates (though not in the sense of string concat…) all the information we already have in the DB to find that one unique host currently talking…
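One possible reading of that proposal, sketched in Python rather than PHP purely for illustration: treat the fields already stored for each host as a composite key and accept a host only if every field that both sides know agrees, and only if exactly one host passes. The field names and the `inventory` layout are hypothetical, not the real FOG schema.

```python
# Hypothetical field names standing in for values already kept in the inventory.
FIELDS = ("mac", "system_uuid", "system_serial", "motherboard_serial")


def find_unique_host(reported, inventory):
    """Return the one host id consistent with all reported fields, else None."""
    candidates = []
    for host_id, stored in inventory.items():
        # Only compare fields that are non-empty on both sides.
        shared = [f for f in FIELDS if reported.get(f) and stored.get(f)]
        if shared and all(reported[f] == stored[f] for f in shared):
            candidates.append(host_id)
    # Zero or several candidates means the client cannot be identified uniquely.
    return candidates[0] if len(candidates) == 1 else None
```

In this sketch two boards that ship with the same UUID are still told apart as soon as the client also reports its MAC or serial number, and no new DB field is needed. A client reporting only the shared UUID stays ambiguous and returns `None`.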