BOINC caused small catastrophe
Author | Message |
---|---|
Send message Joined: 29 Aug 05 Posts: 68 |
I just had BOINC cause a small catastrophe. I'll describe what happened and hope that the behavior can be changed. The main machine on which I run BOINC has two problems that have required workarounds. The machine has an AMD Phenom II CPU and runs 64-bit Windows 10. 1. The machine is the main machine for backups. Due to some hardware glitch, it is not possible to run GPU computation while doing a backup. If you do, the SCSI controller will crash, requiring a power-off reboot and destroying the data on whatever tape is in use. The workaround, of course, is to set BOINC to suspend GPU use when the backup program is running. Problem solved. 2. I'm in Phoenix, where summer temperatures of 115F or so are not uncommon. The air conditioning cannot keep up. So in the summer I want to limit the hours that GPU computations run, to exclude the hottest part of the day. The workaround here is to suspend GPU in BOINC and activate a Windows scheduled task that at the right time of day runs the command "C:\Program Files\BOINC\boinccmd.exe --set_gpu_mode always 43200" to restart GPU computation for a time. This has also been working fine. This evening, my backup hung and, as I discovered, ruined the tape set being updated. A quick look at BOINC showed that it was running GPU tasks, despite the running backup program being set as a GPU exclusive program. I can only speculate about what happened, but here goes. Clearly the set_gpu_mode command overrides the menu setting to suspend GPU computations, exactly as documented. It appears that this command ALSO overrides the specification of a GPU exclusive application. Otherwise, why was BOINC running GPU tasks while the backup program was running? If I'm right, I think this should be fixed. If nothing else, set_gpu_mode is not documented to override "suspend GPU when..." settings, and I think it should not. Thanks for listening, ++PLS |
Send message Joined: 5 Oct 06 Posts: 5149 |
It would be helpful to post the segments of the event log that cover the significant events during that process: the switches between GPU use suspended and permitted, and the time when the backup program started to run. That might help to narrow down which event went wrong, and either caused GPU processing to restart, or didn't stop it when it should have. Since you've probably rebooted the machine since then, you would have to recover the logs from the file 'stdoutdae.txt' in your BOINC data directory. I have to say that the combination of a SCSI tape backup server, Windows 10, and BOINC GPU computation sounds ambitious to me, and is possibly pushing the boundaries a bit hard. |
Send message Joined: 29 Aug 05 Posts: 15634 |
> The workaround here is to suspend GPU in BOINC and activate a Windows scheduled task that at the right time of day runs the command... "Always" in this case is akin to "Run Always", not "Run based on preferences". So this setting ignores any other preferences you have set, including the exclusive_gpu_app option you have set in cc_config.xml. Set it to "auto" to follow preferences. |
Send message Joined: 29 Aug 05 Posts: 68 |
Right now I'm rerunning the backup set that had its data corrupted. When that is done, I can reproduce the situation. I've copied the log file you mention, so we'll see. Ambitious? Perhaps, but this machine is idle when not running backups. And the problem with the SCSI controller and GPU is not unique to BOINC. Windows 10 uses the GPU enough to cause problems if the machine is being actively used during a backup. |
Send message Joined: 29 Aug 05 Posts: 68 |
> Set it to "auto"... Thanks, I'll try that. May I suggest updating the documentation to make this difference clear? Thanks, ++PLS |
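
For anyone wanting to script the fix suggested in this thread, here is a minimal sketch of building the scheduled-task command with "auto" in place of "always". It assumes the boinccmd path and the 43200-second duration quoted in the first post. The helper name build_gpu_mode_command is hypothetical and is not part of BOINC. The script only prints the command line, so it can be checked before it goes into a scheduled task.

```python
import subprocess

# Path as quoted in the first post; adjust it for your own installation.
BOINCCMD = r"C:\Program Files\BOINC\boinccmd.exe"

# Modes accepted by boinccmd --set_gpu_mode.
VALID_MODES = ("always", "auto", "never")


def build_gpu_mode_command(mode, duration_seconds=0, boinccmd=BOINCCMD):
    """Return the argument list for: boinccmd --set_gpu_mode <mode> <duration>.

    Hypothetical helper for illustration only; it is not part of BOINC.
    A duration of 0 asks for a permanent mode change.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if duration_seconds < 0:
        raise ValueError("duration_seconds must not be negative")
    return [boinccmd, "--set_gpu_mode", mode, str(int(duration_seconds))]


if __name__ == "__main__":
    # "auto" follows preferences (including exclusive GPU applications),
    # whereas "always" ignores them. 43200 seconds is 12 hours.
    command = build_gpu_mode_command("auto", 43200)
    # Print the quoted command line instead of running it.
    print(subprocess.list2cmdline(command))
```

To actually issue the command from Python, the returned list could be passed to subprocess.run(command, check=True) on the machine where BOINC is installed.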
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.