Message boards : Questions and problems : 64 bit system suddenly thinks it is 32 bit.
Author | Message |
---|---|
Joined: 8 Nov 05 Posts: 24 |
I have 3 64 bit Linux systems all running Fedora 22 with the latest patches. Several weeks ago they started thinking they are 32 bit. They started using the 32 bit executables for the projects I crunch. I found lines like this in the sched_request files for each project: <platform_name>x86_64-pc-linux-gnu</platform_name> <alt_platform> <name>i686-pc-linux-gnu</name> </alt_platform> This past weekend one of those systems had a failed disk drive and I had to replace it and rebuild the system from scratch. Still Fedora 22 with all the latest patches. All of a sudden it no longer believes it is 32 bit. The alt_platform stanza no longer appears in any of its sched_request files. I'm guessing that some package or patch that got installed when this started is confusing whatever it is in the boinc client that decides if it is 64 bit capable or not. Perhaps some 64 bit library was not updated. When I reinstalled the system this past weekend, I grabbed the KDE LiveCD and used that to reinstall. Other than the boinc-client, I installed nothing extra. I've googled around and found nothing appropriate. Next step is to see if I can grab the source and take a look, although I'm not a developer, only a UNIX/Linux admin, so I may or may not have any luck. Oh, and the boinc client and manager are installed from the packages supplied in the Fedora repository. It is version 7.2.42. Any hints or suggestions would be appreciated. If more info is needed, let me know. |
Joined: 5 Oct 06 Posts: 5149 |
Is this related to your question "64 bit systems now using 32 bit application for Linux" at FiND? Do you have the same observations for other projects? I'll attempt an explanation there, but both you and Ben may find it a little surprising. |
Joined: 8 Nov 05 Posts: 24 |
Yes, it is. The other projects I have also have the same lines in their sched_request files on the 2 systems that have been running for who knows how long. They no longer appear in the one I rebuilt this weekend. However, before that, they did. Since it is apparently happening on other projects, I thought it might be better to ask here. I realize it's not a critical problem, but it just bothers me and I want to figure out why. |
Joined: 8 Nov 05 Posts: 24 |
One other thing I just noticed. For the other project I crunch, it has the alt_platform lines in its sched_request file saying it is 32 bit but uses the 64 bit executable. That project supplies both 32 and 64 bit Linux executables. |
Joined: 5 Oct 06 Posts: 5149 |
My suggested explanation is available for anyone to read at FiNDAH message 2584. It applies to most projects, but the evidence is particularly stark at FiND. |
Joined: 20 Nov 12 Posts: 801 |
I think your explanation covers pretty well why the BOINC server may send either 32-bit or 64-bit apps when both are runnable. I'll cover this part: "This past weekend one of those systems had a failed disk drive and I had to replace it and rebuild the system from scratch. Still Fedora 22 with all the latest patches. All of a sudden it no longer believes it is 32 bit. The alt_platform stanza no longer appears in any of its sched_request files." The 64-bit Linux version of the BOINC client decides whether it can run 32-bit apps by checking files in the system library directories, and if it finds any 32-bit libraries it concludes that this system must support 32-bit apps. (Or so, can't remember the details.) You probably installed some 32-bit program in August which then pulled in 32-bit libraries. |
Joined: 8 Nov 05 Posts: 24 |
There have always been 32 bit and 64 bit libraries on these systems. It's nothing new. I went back through the logs and see where some 32 bit libraries were upgraded (meaning newer versions replaced older versions) but I didn't see where anything was installed. I'll be trying one more thing - a complete removal and installation of boinc. If that doesn't fix it, and I'm not confident it will, I'll live with it unless I get a new idea. Thanks for the input. Charlie |
Joined: 5 Oct 06 Posts: 5149 |
The decision of which application to use (if both are available choices for the system) is made by the project server, on the basis of the application_version history for the HostID of the computer. If you remove and replace BOINC, but leave the rest of the computer unchanged, a BOINC server will try to recycle any available previous HostID - using your account, computer name, IP address, and probably other things I've forgotten to make the association between hardware and HostID. It does this to avoid unnecessary bloating of the Host table in the database. So, rather than removing and replacing the BOINC client, what you probably want to do is to force a new HostID, and thus start clean with a new host_app_version table. (make sure you don't immediately fill it with zero-runtime-estimate tasks...) The best way of doing that is by keeping the old BOINC installation, and forcing an apparent 'cheat' by tweaking the <rpc_seqno> for the project in client_state.xml downwards, so it appears that two separate computers are trying to contact the project scheduler using the same HostID. Someone may need to remind me whether it's necessary to set <allow_multiple_clients> at the same time for this to work with recent server code - or you could experiment. Flush out all running tasks with NNT before you try this, and update the project 'dry', and inspect for a successful HostID change, before allowing new work. |
Joined: 8 Nov 05 Posts: 24 |
When I rebuilt the system this past weekend and reinstalled boinc, it took over the old hostid. That's what made me think a complete removal and reinstall might work. But, before I do that I'll give what you suggest a shot and see what happens. |
Joined: 8 Nov 05 Posts: 24 |
No luck. I first changed the rpc_seqno number to 0. Started boinc. It reregistered the host with a new id number and used the old host entry but still used the 32 bit binary. Then I stopped everything after the few tasks had completed. I removed everything boinc and reinstalled. I attached to the project. It registered the host as new but still used the old host id and still used the 32 bit binary. Someone suggested perhaps a 32 bit library somewhere is fooling boinc into thinking it can only run 32 bit. I've had both 32 and 64 bit libraries on this system forever so I'm not sure what is causing the confusion. I guess I'll live with it for now. Thanks everyone for the suggestions. I really appreciate it. Charlie |
Joined: 20 Nov 12 Posts: 801 |
Ok, checked what the client does. It scans the files in /lib, /lib32, /lib/32, /usr/lib, /usr/lib32 and /usr/lib/32. For all files in those directories and following symlinks it runs either /usr/bin/file or /bin/file, whichever you have. If the output from the file command contains both "ELF" and "32-bit" then 32-bit apps are supported. There is <no_alt_platform> but I don't know why it would be set in your re-installed host. |
Joined: 8 Nov 05 Posts: 24 |
On a Linux system the 64 bit libraries are in /lib64 and /usr/lib64. 32 bit libraries are in /usr/lib and /lib. There is no directory named lib32 anywhere that I could find. I'll give that no_alt_platform a shot and see what happens. Charlie |
Joined: 8 Nov 05 Posts: 24 |
Added no_alt_paltform to my cc_config.xml file. Didn't seem to make a difference. |
Joined: 20 Nov 12 Posts: 801 |
That's weird. Do you have the file in the right place? Mistyped the tag name? |
Joined: 5 Oct 06 Posts: 5149 |
Didn't work for me either, in Windows64. And I'm using BOINC v7.6.9, with cc_config.xml pre-populated with default tags from the GUI - just set the value in the place provided. Didn't work with a simple 'Read config files', didn't work after a full client restart. Edit - I did get a huge number of error messages on startup, including 16/09/2015 21:32:44 | FiND@Home | [error] App version has unsupported platform windows_intelx86; changing to windows_x86_64 and 16/09/2015 21:32:44 | LHC@home 1.0 | [error] App version has unsupported platform windows_intelx86; changing to windows_x86_64 16/09/2015 21:32:44 | LHC@home 1.0 | [error] State file error: duplicate app version: sixtrack windows_x86_64 44401 sse3 Edit 2: So now I have <app_version> <app_name>vina</app_name> <version_num>102</version_num> <platform>windows_x86_64</platform> <avg_ncpus>1.000000</avg_ncpus> <max_ncpus>1.000000</max_ncpus> <flops>618682904.035871</flops> <api_version>7.5.0</api_version> <file_ref> <file_name>vina_1.2_windows_intelx86.exe</file_name> <main_program/> </file_ref> </app_version> - totally muddled, but it's going on fetching new work. Unfortunately, FiND doesn't distinguish the platform work is issued for by plan class, or otherwise display platform or alt_platform in the web lists of tasks issued. Edit 3: but that <flops> value looks like the APR=0.62 from the 64-bit app_ver in Application details for host 105172. Next step is probably a 'flush all tasks' (which will take far less time than is being estimated with that grotty APR), reset project, and see what comes down the pipe next time. |
Joined: 20 Nov 12 Posts: 801 |
But... it did work, you wouldn't have had those errors otherwise. There doesn't seem to be any log messages though. |
Joined: 5 Oct 06 Posts: 5149 |
"But... it did work, you wouldn't have had those errors otherwise. There doesn't seem to be any log messages though." My initial quick test of whether it was working was "Did it download the 64-bit version of the app at the next work request?", and the answer was "no". It was only as I worked through the logs and other indicators that I saw that it had converted the existing tasks and application_version to run the current workload (including the 32-bit binary) as if it was the 64-bit platform. |
Joined: 20 Nov 12 Posts: 801 |
ok |
Joined: 5 Oct 06 Posts: 5149 |
OK, it begins to make more sense. After that batch had completed and reported, I reset the project, and fetched a new batch. This time I got the 64-bit app, as <app_version> <app_name>vina</app_name> <version_num>102</version_num> <platform>windows_x86_64</platform> <avg_ncpus>1.000000</avg_ncpus> <max_ncpus>1.000000</max_ncpus> <flops>661977170.668584</flops> <api_version>7.5.0</api_version> <file_ref> <file_name>vina_1.2_windows_x86_64.exe</file_name> <main_program/> </file_ref> </app_version> Note that the APR has already risen slightly. The sched_request also showed <platform_name>windows_x86_64</platform_name>, and no alternates. But I'm also getting messages like 16/09/2015 22:20:29 | GPUGRID | Message from server: This project doesn't support computers of type windows_x86_64 so I'd better start putting this computer back together without the <no_alt_platform>, and resetting the projects which were 'adjusted'. |
Joined: 5 Oct 06 Posts: 5149 |
'Read config files' isn't enough to re-activate <alt_platform>, it needs a full client restart. But after that, it asked with <alt_platform>, and got the 32-bit application back as before. So the tag works, but use with care - it operates globally, across all projects, and not all of them necessarily supply work flagged as 64-bit. |
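
For anyone wanting to check what their own client is advertising, the platform lines discussed in this thread can be pulled out of a sched_request file with a few lines of Python. This is only a rough sketch, not anything taken from the BOINC source: the function name `advertised_platforms` and the `sample` text are made up for illustration, it uses a regular expression rather than a real XML parser, and it assumes the file contains the `<platform_name>` and `<alt_platform>` elements laid out as quoted in the first post.

```python
import re


def advertised_platforms(request_text):
    """Return (primary, alternates) found in sched_request text.

    primary is the <platform_name> value, or None if it is absent.
    alternates is the list of <name> values inside <alt_platform> blocks.
    """
    primary_match = re.search(
        r"<platform_name>\s*([^<]+?)\s*</platform_name>", request_text
    )
    primary = primary_match.group(1) if primary_match else None
    alternates = re.findall(
        r"<alt_platform>\s*<name>\s*([^<]+?)\s*</name>\s*</alt_platform>",
        request_text,
    )
    return primary, alternates


# Sample text copied from the first post in this thread (hypothetical layout).
sample = """
<platform_name>x86_64-pc-linux-gnu</platform_name>
<alt_platform>
    <name>i686-pc-linux-gnu</name>
</alt_platform>
"""

primary, alternates = advertised_platforms(sample)
print("primary:", primary)
print("alternates:", alternates or "none")
```

To try it on a real host, read the contents of one of the sched_request files into a string and pass that string in place of `sample`. An empty list of alternates would correspond to the "no alternates" case reported above after `<no_alt_platform>` took effect.
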
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.