Message boards : BOINC client : Spurious Suspends
Author | Message |
---|---|
Joined: 17 Aug 09 Posts: 19 |
Hi guys, I have an application that is deployed across a large private DG (1600 nodes), running an old 5.10.45 version of the client. The main worker thread of my application must always run, as it's launching other processes (similarly to the wrapper) and needs to control them accordingly. Hence I've disabled the worker thread suspend function, and instead poll for suspend flags via a call to boinc_get_status() from my worker thread and handle suspend requests accordingly. The problem is that on most nodes, including my own test node (running the same core client version), I'm seeing the suspend flags being set at random, as signalled by the core client. This happens even without explicit GUI-based suspend requests. I heard somewhere that BOINC uses suspend to "nice" applications, and was wondering if this is the case on all client versions? Cheers Chris |
Joined: 26 Aug 05 Posts: 164 |
BOINC can suspend an app for a number of reasons. Benchmarks and CPU throttling (what you are referring to as 'nice', I think) are the two most common causes. However, I've seen your problem crop up in a few applications due to memory corruption. The last time was due to a static array declared on the heap: the app wrote past the end of the array. Oddly enough, the value being written into the array was a timestamp, so it was 'randomly' switching the suspend/resume state of the app based on the time of day. Is there any chance you can run your app under a debugger and break on memory access? ----- Rom BOINC Development Team, U.C. Berkeley My Blog |
Joined: 17 Aug 09 Posts: 19 |
Well, I never explicitly allocate heap memory using malloc, but that's not to say the compiler won't allocate it. The app is developed using VC++ 2008 Express, so sure, I could get it to dump as soon as a suspend flag is found. However, CPU throttling is more likely to be the problem, as other processes could well be running on the node (Novell-based updates, virus scans etc). This could well be causing the seemingly random suspend (and also resume) messages. This would be strange, however, as it's the child processes launched by my app (via win32 CreateProcess() calls) that are consuming CPU. Will the core client recognise these as BOINC related, and call suspend/resume to throttle back the CPU? Also, why is suspend/resume used, and not Windows or Unix thread/process priority nicing instead? |
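
For readers following the polling approach from the first post, here is a minimal sketch of turning polled flags into one action per state change, with a counter that makes frequent toggling (as throttling would produce) easy to spot in a log. `StatusSnapshot`, `Action` and `SuspendTracker` are hypothetical names invented for illustration; the two fields are only modeled on the suspend and quit flags that boinc_get_status() reports, and a real worker would fill the snapshot from that call and then suspend or resume the processes it started with CreateProcess().

```cpp
// Stand-in for the flags polled from the client (assumption: a real app
// would copy these from the structure filled in by boinc_get_status()).
struct StatusSnapshot {
    bool suspended;
    bool quit_request;
};

// What the worker thread should do with its child processes.
enum class Action { None, SuspendChildren, ResumeChildren, QuitChildren };

// Edge detector: reports an action only when the polled state changes,
// so repeated polls of an unchanged flag do nothing.
class SuspendTracker {
public:
    Action update(const StatusSnapshot& s) {
        if (s.quit_request) {
            return Action::QuitChildren;      // quit always wins
        }
        if (s.suspended && !was_suspended_) {
            was_suspended_ = true;
            ++transitions_;
            return Action::SuspendChildren;   // running -> suspended
        }
        if (!s.suspended && was_suspended_) {
            was_suspended_ = false;
            ++transitions_;
            return Action::ResumeChildren;    // suspended -> running
        }
        return Action::None;                  // no change since last poll
    }

    // Number of suspend/resume flips seen so far; worth logging with a
    // timestamp when hunting for "random" suspends.
    unsigned transitions() const { return transitions_; }

private:
    bool was_suspended_ = false;
    unsigned transitions_ = 0;
};
```

Logging `transitions()` alongside the wall-clock time on each change should show whether the flips follow a regular pattern, which would point at throttling, or land at arbitrary moments, which would fit the memory-corruption theory.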
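
On the memory-corruption theory raised in the second post, one cheap way to rule out an array overrun is to route writes through a bounds-checked container, so an out-of-range index is rejected and counted rather than landing on whatever sits next in memory. The sketch below is only an illustration of that idea: `TimestampLog` and its slot count are hypothetical, and it assumes a C++11 compiler (VC++ 2008 would need `std::tr1::array` or a hand-written check instead).

```cpp
#include <array>
#include <cstddef>
#include <ctime>
#include <stdexcept>

// Hypothetical fixed-size timestamp buffer, similar in spirit to the
// array described in the thread.
class TimestampLog {
public:
    static constexpr std::size_t kSlots = 8;  // illustrative size

    // Stores a value; returns false instead of writing out of bounds.
    bool store(std::size_t index, std::time_t value) {
        try {
            slots_.at(index) = value;  // at() throws rather than overrunning
            return true;
        } catch (const std::out_of_range&) {
            ++rejected_;               // count bad writes for later logging
            return false;
        }
    }

    std::time_t read(std::size_t index) const { return slots_.at(index); }

    // Number of out-of-range writes that were refused.
    std::size_t rejected() const { return rejected_; }

private:
    std::array<std::time_t, kSlots> slots_{};
    std::size_t rejected_ = 0;
};
```

If `rejected()` ever becomes non-zero in the field, an overrun was attempted, which is exactly the kind of bug that could scribble over unrelated state such as a suspend flag.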
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.