FOG resizes all partitions even System Reserved!
-
@Quazz A way around it:
I believe the problem is dealing with multiple different characters within the same pattern of matching.
Maybe something like:
.*[Rr][Ee|È|è][Ss][Ee|È|è][Rr][Vv][Ee|È|è][Dd|].*
-
I like that idea, I will test it in a minute, but I did edit it a little bit (correcting accents and changing the last character possibilities given the label format that gets checked + added support for Dutch)
[Rr][Ee|É|é][Ss][Ee][Rr][Vv][Ee|É|é][Dd|_|Ee]
Will edit with results
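A side note on the bracket syntax used in these patterns: inside a bracket expression, | is an ordinary character rather than an alternation operator, so [Dd|] also accepts a literal | and [Ee|É|é] simply lists | twice. The sketch below uses grep -E with ASCII-only classes so that it does not depend on the locale problems discussed in this thread; the helper name class_accepts is made up for illustration.

```bash
# class_accepts CLASS STRING: succeeds if STRING consists of exactly one
# character accepted by the bracket expression CLASS (hypothetical helper).
class_accepts() {
    printf '%s\n' "$2" | grep -Eq -- "^$1\$"
}

class_accepts '[Dd|_]' 'd' && echo "d is accepted"
class_accepts '[Dd|_]' '|' && echo "a literal | is accepted too"
class_accepts '[Dd_]'  '|' || echo "without | in the class it is rejected"
```

If alternation is what is wanted, it has to go outside the brackets, for example (D|d|_); otherwise the class can just list the characters, as in [Dd_].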
-
Running some testing:
I created a labels.txt file
I then pulled the information from labels.txt into a variable.
Then I ran inline testing. Regex pattern used:
*[Rr][Ee|É|é][Ss][Ee|É|é][Rr][Vv][Ee|É|é|][Dd|_|]*
Then I ran inline testing again. Regex pattern used:
.*[Rr][Ee|É|é][Ss][Ee|É|é][Rr][Vv][Ee|É|é|][Dd|_|].*
Hopefully this can help shed light. The .* matters more than we might expect.
-
@Quazz Are you sure it’s both that fail now or could it just be one of them? I could imagine this being a very dirty character encoding issue. Could be wrong though.
-
@Sebastian-Roth Talking with @Quazz in chat, he says “you’ll never guess what works, storing the regex into a variable”
-
@Tom-Elliott
[Rr][Ee|É|é][Ss][Ee][Rr][Vv][Ee|É|é][Dd|]
This one is confirmed to work in my test case.
Feel free to update github as I can’t create a PR atm
-
@Quazz Great work. Could you please make sure to test this in the FOS client as well, as the bash version there might differ from the one you have on other systems. If you have already done the tests in the client, ignore my comment.
-
@Sebastian-Roth Yes, I have tested in FOS, every other variation we tried worked on regular systems already
edit: I may have messed up the test case, I’ll try again in a bit though
I dun goofed, I’m pretty embarrassed. I messed up my sed, which made the variable empty, and of course it matches then…
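For anyone repeating these tests: an empty pattern is easy to miss, because with GNU grep (and, as far as I know, bash's =~ on glibc systems) an empty regex matches every input. A minimal sketch of a guard, assuming the label and the pattern arrive as plain strings; label_matches is a hypothetical helper, not something from the FOG scripts.

```bash
# label_matches LABEL PATTERN (hypothetical helper)
# Returns 0 on match, 1 on no match, 2 if either argument is empty.
label_matches() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "label_matches: empty label or pattern, refusing to test" >&2
        return 2
    fi
    printf '%s\n' "$1" | grep -Eq -- "$2"
}

label_matches "System Reserved" "[Rr][Ee][Ss][Ee][Rr][Vv][Ee][Dd]" && echo "match"
label_matches "Data" "" || echo "empty pattern rejected"
```

Failing loudly on an empty variable makes a broken sed step show up as an error instead of as a suspiciously successful match.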
-
@Quazz So what does this mean?
Are the test cases working properly now?
-
@Tom-Elliott Unfortunately not, it turns out it was working because the variable was empty.
Back to square one.
Would
[Rr]*[Ss][Ee][Rr][Vv]*
be dangerous to use in production? I don’t expect many false positives, and arguably not resizing certain partitions that should be resized is less harmful than the reverse.
edit: Even stuff like grep fails, the accents are blocking everything so far
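To get a feel for how broad that short pattern is: it reads differently depending on whether it is used as a shell glob or as a regex. As a glob, * means any run of characters; as a regex, [Rr]* means zero or more R and the match is unanchored, so any label containing serv (in any case) passes. A rough sketch with made-up ASCII labels; glob_hit and regex_hit are hypothetical helpers, and which of the two forms the FOG scripts would use is not stated in this thread.

```bash
pattern='[Rr]*[Ss][Ee][Rr][Vv]*'

# Glob semantics: the whole label must fit the pattern.
glob_hit() {
    case "$1" in
        $pattern) return 0 ;;
        *)        return 1 ;;
    esac
}

# Regex semantics (ERE): an unanchored match anywhere in the label.
regex_hit() {
    printf '%s\n' "$1" | grep -Eq -- "$pattern"
}

# Made-up labels, purely for illustration.
for label in "Reserved" "Reservoir_backup" "Server_Data" "Recovery"; do
    glob_hit "$label"  && g=yes || g=no
    regex_hit "$label" && r=yes || r=no
    echo "$label: glob=$g regex=$r"
done
```

In the regex reading, a data partition labelled Server_Data would be skipped as well, which is the kind of false positive to weigh up.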
-
@Quazz said in FOG resizes all partitions even System Reserved!:
Even stuff like grep fails, the accents are blocking everything so far
Possibly a character encoding thing. Sorry but have no great idea on how to work around this.
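One way to check the encoding theory is to look at the raw bytes of a label rather than at how the terminal renders it. A small sketch, assuming od and tr are available on the system (I have not checked the FOS image); label_hex is a made-up helper. In UTF-8, é is the two bytes c3 a9, while in ISO-8859-1 it is the single byte e9.

```bash
# label_hex STRING: print the bytes of STRING as one hex string (hypothetical helper).
label_hex() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

utf8_label=$(printf 'R\303\251serv\303\251')    # "Réservé" encoded as UTF-8
latin1_label=$(printf 'R\351serv\351')          # the same text as ISO-8859-1

label_hex "$utf8_label"     # 52c3a973657276c3a9
label_hex "$latin1_label"   # 52e973657276e9
```

Two labels that look identical on screen but give different hex strings here will never match the same byte-for-byte pattern.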
-
@Sebastian-Roth I suppose we could try enforcing all of the systems to load UTF-8 locale?
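Before changing the build, it may help to confirm what the running system actually reports. A minimal sketch, assuming the locale utility exists on the system being checked (it may well be missing from a stripped-down image); is_utf8_charmap is a hypothetical helper.

```bash
# is_utf8_charmap NAME: succeeds if NAME looks like a UTF-8 charmap name.
is_utf8_charmap() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *)                     return 1 ;;
    esac
}

# "locale charmap" prints the character encoding of the current locale.
# "|| true" keeps the script going if the utility is not installed.
charmap=$(locale charmap 2>/dev/null || true)
if is_utf8_charmap "$charmap"; then
    echo "current locale uses UTF-8"
else
    echo "current locale is not UTF-8 (reported: ${charmap:-unknown})"
fi
```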
-
@Tom-Elliott
BR2_ENABLE_LOCALE_WHITELIST="en_US"
to
BR2_ENABLE_LOCALE_WHITELIST="en_US.UTF-8"
?
edit: Is it possible the difference is down to glibc vs uClibc?
-
Haven’t forgotten about this, just been setting up a build environment so I can try out some of the locale options and see if anything helps.
-
I did not really understand what was going on here… Have you made some progress?
-
@maxcarpone Basically, unicode support (required for letters such as é) is broken/missing in current builds, causing the check to fail.
I think we’re on the right track, but each build takes time to compile, so for each idea it takes hours before I can test.
Currently looking into busybox locale support.
-
Still no improvement, I did notice /usr/share/locale being empty, not sure if that’s expected on buildroot or not… (according to the buildroot code, locale support only works if the target file system has locales in /usr/share/locale)
I have a week off next week, so this seems like it will be on hold for a while.
-
@Quazz Thanks for looking into this!
Usually you don’t have to recompile from scratch all the time. Try changing some code or a setting and run make again. Depending on what you changed, this will only take a couple of minutes.
-
@Sebastian-Roth That’s usually true, but for some reason my changes weren’t being reflected, I was probably doing something wrong, but couldn’t figure it out.
So far, the only thing I’ve been able to find is that everything should work, sigh.
-
@Quazz @Tom-Elliott I have done some testing on my own now and this is definitely unicode hell! Short story is I think we should not rely on those label checks anymore as they can go wrong so easily with non ASCII characters. Here you go with a bit of unicode fun in the client shell:
[Mon Dec 03 root@fogclient ~]# e=$(echo -ne '\xC3\xA9')
[Mon Dec 03 root@fogclient ~]# E=$(echo -ne '\xC3\x89')
[Mon Dec 03 root@fogclient ~]# label=$(echo -ne 'R\xC3\xA9serv\xC3\xA9_au_syst')
[Mon Dec 03 root@fogclient ~]# echo $e
é
[Mon Dec 03 root@fogclient ~]# echo $E
É
[Mon Dec 03 root@fogclient ~]# echo $label
Réservé_au_syst
Ok, that’s for starters, just to get the right characters set in variables, as I can’t seem to enter those using my keyboard in an SSH session on a client (nor can I in the VM terminal). So I suppose bash and the underlying libs are able to display unicode characters, but it’s not fully supported.
Important: This is using the UTF-8 codes for é, but there are other encoding standards like ISO-8859-1 through ISO-8859-15 and many more that may encode the very same character with different codes. Or, to put it the other way round: if we read that label, the returned string might use a different encoding than the one we used in the scripts, so although the characters look identical to our eyes, they would still not match. So here comes the fun part:
[Mon Dec 03 root@fogclient ~]# if [[ $label =~ [Rr][Ee$E$e] ]]; then echo "JA"; fi
JA
So using the variables in the bash regex actually does work. But…
[Mon Dec 03 root@fogclient ~]# if [[ $label =~ [Rr][Ee$E$e][Ss] ]]; then echo "JA"; fi
What?!? I simply added [Ss], which should match, shouldn’t it? Ok, let’s try to skip the special character for now.
[Mon Dec 03 root@fogclient ~]# if [[ $label =~ [Rr].[Ss] ]]; then echo "JA"; fi
[Mon Dec 03 root@fogclient ~]# if [[ $label =~ [Rr]..[Ss] ]]; then echo "JA"; fi
JA
[Mon Dec 03 root@fogclient ~]# if [[ $label =~ [Rr][Ee$E$e].[Ss] ]]; then echo "JA"; fi
JA
Crazy stuff. So this special character ends up being two characters when doing bash regex. I still have no idea what that extra character might be or how to match it other than using . (any character). I guess it stems from our buildroot bash only partially supporting UTF-8 unicode. Anyhow, this is what my new regex looks like:
[Mon Dec 03 root@fogclient ~]# if [[ $label =~ [Rr][Ee$E$e].[Ss][Ee][Rr][Vv][Ee$E$e].[Dd]? ]]; then echo "JA"; fi
JA
And exactly the same if we use grep instead:
[Mon Dec 03 root@fogclient ~]# echo $label | grep "[Rr][Ee$E$e][Ss][Ee][Rr][Vv][Ee$E$e]"
[Mon Dec 03 root@fogclient ~]# echo $label | grep "[Rr][Ee$E$e].[Ss][Ee][Rr][Vv][Ee$E$e].[Dd]*"
Réservé_au_syst
Same ugly hack, I reckon. And please keep in mind that this could fail if some Windows installations were made using ISO-8859-1 code pages. So, to sum it all up: let’s move forward and not waste any more time trying to find the perfect regex matching all the labels out there.
We have started to gather information on that stuff and I think we should tackle it now and see if it works any better: https://github.com/FOGProject/fos/issues/18
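A possible explanation for the mysterious extra character, offered as a guess rather than a confirmed diagnosis: in UTF-8, é is two bytes (c3 a9). If the shell runs without a working UTF-8 locale, the regex engine sees bytes instead of characters, so [Ee$E$e] consumes only the first byte and the . picks up the second. The sketch below forces the C locale in an ordinary bash to reproduce the same pattern of results; match_bytewise is a hypothetical helper, and that the client behaves like a C-locale bash is my assumption.

```bash
e=$(printf '\303\251')     # é as UTF-8 bytes (c3 a9)
E=$(printf '\303\211')     # É as UTF-8 bytes (c3 89)
label=$(printf 'R\303\251serv\303\251_au_syst')

# match_bytewise STRING REGEX: run the bash regex match under LC_ALL=C.
# The subshell keeps the locale change from leaking into the caller.
match_bytewise() {
    ( LC_ALL=C; [[ $1 =~ $2 ]] )
}

match_bytewise "$label" "[Rr][Ee$E$e]"       && echo "class alone: match"
match_bytewise "$label" "[Rr][Ee$E$e][Ss]"   || echo "class + [Ss]: no match"
match_bytewise "$label" "[Rr][Ee$E$e].[Ss]"  && echo "class + . + [Ss]: match"
```

That lines up with the three results in the transcript above, but it does not prove the client is in the same situation.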