Message boards : Questions and problems : Mageia 4 & 5 Linux - suspend and local pref issues
Author | Message |
---|---|
Joined: 23 Oct 15 Posts: 4 |
Hi, I have Boinc installed on various Windows machines running WCG without issue. I have deployed to a single Mac using the same configuration, also without issue. Now I'm attempting to deploy to a few Linux boxes as a test before deploying wider. I'm running into the issue that I cannot get the Boinc jobs on the Linux machines to resume. They constantly detect the computer as active (even with no one logged in). I know the activity detection on Linux has been an issue now and then, so I thought I would work around it by allowing work even when the computer is busy but suspending if non-Boinc CPU usage exceeds a low percentage like 10%. However, if I set the local config to run always but to suspend when the CPU is over X%, it never suspends. In a last-ditch attempt I tried to use exclusive apps to suspend computing, but I cannot get that to work either. I added bash and top to the exclusive app list and Boinc still does not suspend. After reading various forums, I'm still at a loss.
Okay, here is the setup and configuration: 2 computers running Mageia5 and 1 running Mageia4.
- 1 is a dual core with 4G and a GT310
- 2 are quad core with 16GB and a GT730
- all three installs are x64
- all three are configured with network file systems and user accounts (they all have the same configuration)
Boinc clients installed (the latest stable available):
- boinc-client-7.2.42-1.2.mga4
- boinc-client-7.2.42-5.mga5
Boinc was installed as a service. Boinc manager is installed on only 2 of the machines, for troubleshooting. BoincTasks (v1.6.7) is used to monitor all the machines. The Boinc screen saver and GUI manager are not used.
Note: Windows machines are running 7.2.47 and Mac 7.2.42
Projects: only World Community Grid
Method of install:
# install client
urpmi boinc-client
# create the config folder
# copy the default config files into /var/lib/boinc
# gui_rpc_auth.cfg
# remote_hosts.cfg
# account_www.worldcommunitygrid.org.xml
# set perms on these files to 640
# chown the files to boinc
# check if enabled on boot
systemctl is-enabled boinc-client.service
# if not enabled then enable it
systemctl enable boinc-client.service
# open port 31416 tcp and udp for incoming
# start the client
systemctl start boinc-client.service
Configurations: 1. This is the basic config where everything is controlled from the settings on the WCG website. This is the config that is seen by all three operating systems. These files are taken from one of the Mageia5 boxes. cc_config.xml <cc_config> <log_flags> <file_xfer>1</file_xfer> <sched_ops>1</sched_ops> <task>1</task> <android_debug>0</android_debug> <app_msg_receive>0</app_msg_receive> <app_msg_send>0</app_msg_send> <async_file_debug>0</async_file_debug> <benchmark_debug>0</benchmark_debug> <checkpoint_debug>0</checkpoint_debug> <coproc_debug>0</coproc_debug> <cpu_sched>0</cpu_sched> <cpu_sched_debug>0</cpu_sched_debug> <cpu_sched_status>0</cpu_sched_status> <dcf_debug>0</dcf_debug> <disk_usage_debug>0</disk_usage_debug> <file_xfer_debug>0</file_xfer_debug> <gui_rpc_debug>0</gui_rpc_debug> <heartbeat_debug>0</heartbeat_debug> <http_debug>0</http_debug> <http_xfer_debug>0</http_xfer_debug> <mem_usage_debug>0</mem_usage_debug> <network_status_debug>0</network_status_debug> <notice_debug>0</notice_debug> <poll_debug>0</poll_debug> <priority_debug>0</priority_debug> <proxy_debug>0</proxy_debug> <rr_simulation>0</rr_simulation> <rrsim_detail>0</rrsim_detail> <sched_op_debug>0</sched_op_debug> <scrsave_debug>0</scrsave_debug> <slot_debug>0</slot_debug> <state_debug>0</state_debug> <statefile_debug>0</statefile_debug> <suspend_debug>0</suspend_debug> 
<task_debug>0</task_debug> <time_debug>0</time_debug> <trickle_debug>0</trickle_debug> <unparsed_xml>0</unparsed_xml> <work_fetch_debug>0</work_fetch_debug> </log_flags> <options> <abort_jobs_on_exit>0</abort_jobs_on_exit> <allow_multiple_clients>0</allow_multiple_clients> <allow_remote_gui_rpc>0</allow_remote_gui_rpc> <client_version_check_url>http://boinc.berkeley.edu/download.php?xml=1</client_version_check_url> <client_new_version_text></client_new_version_text> <client_download_url>http://boinc.berkeley.edu/download.php</client_download_url> <disallow_attach>0</disallow_attach> <dont_check_file_sizes>0</dont_check_file_sizes> <dont_contact_ref_site>0</dont_contact_ref_site> <exit_after_finish>0</exit_after_finish> <exit_before_start>0</exit_before_start> <exit_when_idle>0</exit_when_idle> <fetch_minimal_work>0</fetch_minimal_work> <fetch_on_update>0</fetch_on_update> <force_auth>default</force_auth> <http_1_0>0</http_1_0> <http_transfer_timeout>300</http_transfer_timeout> <http_transfer_timeout_bps>10</http_transfer_timeout_bps> <max_event_log_lines>2000</max_event_log_lines> <max_file_xfers>8</max_file_xfers> <max_file_xfers_per_project>2</max_file_xfers_per_project> <max_stderr_file_size>0</max_stderr_file_size> <max_stdout_file_size>0</max_stdout_file_size> <max_tasks_reported>0</max_tasks_reported> <ncpus>-1</ncpus> <network_test_url>http://www.google.com/</network_test_url> <no_alt_platform>0</no_alt_platform> <no_gpus>0</no_gpus> <no_info_fetch>0</no_info_fetch> <no_priority_change>0</no_priority_change> <os_random_only>0</os_random_only> <proxy_info> <socks_server_name></socks_server_name> <socks_server_port>80</socks_server_port> <http_server_name></http_server_name> <http_server_port>80</http_server_port> <socks5_user_name></socks5_user_name> <socks5_user_passwd></socks5_user_passwd> <http_user_name></http_user_name> <http_user_passwd></http_user_passwd> <no_proxy></no_proxy> </proxy_info> <rec_half_life_days>10.000000</rec_half_life_days> 
<report_results_immediately>0</report_results_immediately> <run_apps_manually>0</run_apps_manually> <save_stats_days>30</save_stats_days> <skip_cpu_benchmarks>0</skip_cpu_benchmarks> <simple_gui_only>0</simple_gui_only> <start_delay>0.000000</start_delay> <stderr_head>0</stderr_head> <suppress_net_info>0</suppress_net_info> <unsigned_apps_ok>0</unsigned_apps_ok> <use_all_gpus>0</use_all_gpus> <use_certs>0</use_certs> <use_certs_only>0</use_certs_only> <vbox_window>0</vbox_window> </options> </cc_config> global_prefs.xml <global_preferences> <source_project>http://www.worldcommunitygrid.org/</source_project> <source_scheduler>https://scheduler.worldcommunitygrid.org/boinc/wcg_cgi/fcgi</source_scheduler> <mod_time>1445617860</mod_time> <cpu_scheduling_period_minutes>120</cpu_scheduling_period_minutes> <disk_interval>60.0</disk_interval> <disk_max_used_gb>10.0</disk_max_used_gb> <disk_max_used_pct>50.0</disk_max_used_pct> <disk_min_free_gb>2.0</disk_min_free_gb> <end_hour>0</end_hour> <idle_time_to_run>3.0</idle_time_to_run> <max_bytes_sec_down>0.0</max_bytes_sec_down> <max_bytes_sec_up>0.0</max_bytes_sec_up> <max_cpus>16</max_cpus> <max_ncpus_pct>50.0</max_ncpus_pct> <net_end_hour>0</net_end_hour> <net_start_hour>0</net_start_hour> <start_hour>0</start_hour> <cpu_usage_limit>60.0</cpu_usage_limit> <ram_max_used_busy_pct>25.0</ram_max_used_busy_pct> <ram_max_used_idle_pct>75.0</ram_max_used_idle_pct> <vm_max_used_pct>50.0</vm_max_used_pct> <work_buf_min_days>0.2</work_buf_min_days> <work_buf_additional_days>0.3</work_buf_additional_days> <suspend_if_no_recent_input>0.0</suspend_if_no_recent_input> <daily_xfer_period_days>0</daily_xfer_period_days> <daily_xfer_limit_mb>0.0</daily_xfer_limit_mb> <suspend_cpu_usage>30.0</suspend_cpu_usage> <venue name="linux"> <cpu_scheduling_period_minutes>120</cpu_scheduling_period_minutes> <disk_interval>60.0</disk_interval> <disk_max_used_gb>10.0</disk_max_used_gb> <disk_max_used_pct>50.0</disk_max_used_pct> 
<disk_min_free_gb>2.0</disk_min_free_gb> <end_hour>0</end_hour> <idle_time_to_run>3.0</idle_time_to_run> <max_bytes_sec_down>0.0</max_bytes_sec_down> <max_bytes_sec_up>0.0</max_bytes_sec_up> <daily_xfer_period_days>0</daily_xfer_period_days> <daily_xfer_limit_mb>0.0</daily_xfer_limit_mb> <max_cpus>16</max_cpus> <max_ncpus_pct>50.0</max_ncpus_pct> <suspend_cpu_usage>30.0</suspend_cpu_usage> <net_end_hour>0</net_end_hour> <net_start_hour>0</net_start_hour> <start_hour>0</start_hour> <cpu_usage_limit>60.0</cpu_usage_limit> <ram_max_used_busy_pct>25.0</ram_max_used_busy_pct> <ram_max_used_idle_pct>75.0</ram_max_used_idle_pct> <vm_max_used_pct>50.0</vm_max_used_pct> <work_buf_min_days>0.2</work_buf_min_days> <work_buf_additional_days>0.3</work_buf_additional_days> <suspend_if_no_recent_input>0.0</suspend_if_no_recent_input> </venue> </global_preferences> global_prefs_override.xml does not exist. Event log from this machine after stopping and starting the boinc client using systemctl [stop|start] boinc-client.service command.
After grabbing the files I logged off the computer and waited several minutes (previously I have left the machine overnight, restarted it, etc.). Using the boinc-manager on server01 I verified that all three activity settings are on "Run based on preferences". Before playing with the configurations and trying different things I left the 3 machines overnight (after restarting) to see if they would kick in. BoincTasks showed that all three were in download mode but suspended due to the computer being in use. On the mageia4 machine I was logged in (as it is my desktop) but was not active. The other two machines had no one logged in and were on the graphical login screen (set to blank after a period of time). When I started trying other configurations I experimented with my desktop, setting exclusive apps and setting local prefs to run when busy but suspend when there is too much CPU load. Here are the current relevant files: global_prefs_override.xml [code] <global_preferences> <run_on_batteries>1</run_on_batteries> <run_if_user_active>1</run_if_user_active> <run_gpu_if_user_active>1</run_gpu_if_user_active> <suspend_cpu_usage>5.000000</suspend_cpu_usage> <start_hour>0.000000</start_hour> <end_hour>0.000000</end_hour> <net_start_hour>0.000000</net_start_hour> <net_end_hour>0.000000</net_end_hour> <leave_apps_in_memory>0</leave_apps_in_memory> <confirm_before_connecting>0</confirm_before_connecting> <hangup_if_dialed>0</hangup_if_dialed> <dont_verify_images>0</dont_verify_images> <work_buf_min_days>0.200000</work_buf_min_days> <work_buf_additional_days>0.300000</work_buf_additional_days> <max_ncpus_pct>50.000000</max_ncpus_pct> <cpu_scheduling_period_minutes>120.000000</cpu_scheduling_period_minutes> <disk_interval>60.000000</disk_interval> <disk_max_used_gb>10.000000</disk_max_used_gb> <disk_max_used_pct>50.000000</disk_max_used_pct> <disk_min_free_gb>2.000000</disk_min_free_gb> <vm_max_used_pct>50.000000</vm_max_used_pct> <ram_max_used_busy_pct>25.000000</ram_max_used_busy_pct> 
<ram_max_used_idle_pct>75.000000</ram_max_used_idle_pct> <max_bytes_sec_up>0.000000</max_bytes_sec_up> <max_bytes_sec_down>0.000000</max_bytes_sec_down> <cpu_usage_limit>60.000000</cpu_usage_limit> <daily_xfer_limit_mb>0.000000</daily_xfer_limit_mb> <daily_xfer_period_days>0</daily_xfer_period_days> </global_preferences> [/code] cc_config.xml [code] <cc_config> <log_flags> <file_xfer>1</file_xfer> <sched_ops>1</sched_ops> <task>1</task> <android_debug>0</android_debug> <app_msg_receive>0</app_msg_receive> <app_msg_send>0</app_msg_send> <async_file_debug>0</async_file_debug> <benchmark_debug>0</benchmark_debug> <checkpoint_debug>0</checkpoint_debug> <coproc_debug>0</coproc_debug> <cpu_sched>0</cpu_sched> <cpu_sched_debug>0</cpu_sched_debug> <cpu_sched_status>0</cpu_sched_status> <dcf_debug>0</dcf_debug> <disk_usage_debug>0</disk_usage_debug> <file_xfer_debug>0</file_xfer_debug> <gui_rpc_debug>0</gui_rpc_debug> <heartbeat_debug>0</heartbeat_debug> <http_debug>0</http_debug> <http_xfer_debug>0</http_xfer_debug> <mem_usage_debug>0</mem_usage_debug> <network_status_debug>0</network_status_debug> <notice_debug>0</notice_debug> <poll_debug>0</poll_debug> <priority_debug>0</priority_debug> <proxy_debug>0</proxy_debug> <rr_simulation>0</rr_simulation> <rrsim_detail>0</rrsim_detail> <sched_op_debug>0</sched_op_debug> <scrsave_debug>0</scrsave_debug> <slot_debug>0</slot_debug> <state_debug>0</state_debug> <statefile_debug>0</statefile_debug> <suspend_debug>0</suspend_debug> <task_debug>0</task_debug> <time_debug>0</time_debug> <trickle_debug>0</trickle_debug> <unparsed_xml>0</unparsed_xml> <work_fetch_debug>0</work_fetch_debug> </log_flags> <options> <abort_jobs_on_exit>0</abort_jobs_on_exit> <allow_multiple_clients>0</allow_multiple_clients> <allow_remote_gui_rpc>0</allow_remote_gui_rpc> <client_version_check_url>http://boinc.berkeley.edu/download.php?xml=1</client_version_check_url> <client_new_version_text></client_new_version_text> 
<client_download_url>http://boinc.berkeley.edu/download.php</client_download_url> <disallow_attach>0</disallow_attach> <dont_check_file_sizes>0</dont_check_file_sizes> <dont_contact_ref_site>0</dont_contact_ref_site> <exclusive_app>bash</exclusive_app> <exclusive_app>top</exclusive_app> <exit_after_finish>0</exit_after_finish> <exit_before_start>0</exit_before_start> <exit_when_idle>0</exit_when_idle> <fetch_minimal_work>0</fetch_minimal_work> <fetch_on_update>0</fetch_on_update> <force_auth>default</force_auth> <http_1_0>0</http_1_0> <http_transfer_timeout>300</http_transfer_timeout> <http_transfer_timeout_bps>10</http_transfer_timeout_bps> <max_event_log_lines>2000</max_event_log_lines> <max_file_xfers>8</max_file_xfers> <max_file_xfers_per_project>2</max_file_xfers_per_project> <max_stderr_file_size>0</max_stderr_file_size> <max_stdout_file_size>0</max_stdout_file_size> <max_tasks_reported>0</max_tasks_reported> <ncpus>-1</ncpus> <network_test_url>http://www.google.com/</network_test_url> <no_alt_platform>0</no_alt_platform> <no_gpus>0</no_gpus> <no_info_fetch>0</no_info_fetch> <no_priority_change>0</no_priority_change> <os_random_only>0</os_random_only> <proxy_info> <socks_server_name></socks_server_name> <socks_server_port>80</socks_server_port> <http_server_name></http_server_name> <http_server_port>80</http_server_port> <socks5_user_name></socks5_user_name> <socks5_user_passwd></socks5_user_passwd> <http_user_name></http_user_name> <http_user_passwd></http_user_passwd> <no_proxy></no_proxy> </proxy_info> <rec_half_life_days>10.000000</rec_half_life_days> <report_results_immediately>0</report_results_immediately> <run_apps_manually>0</run_apps_manually> <save_stats_days>30</save_stats_days> <skip_cpu_benchmarks>0</skip_cpu_benchmarks> <simple_gui_only>0</simple_gui_only> <start_delay>0</start_delay> <stderr_head>0</stderr_head> <suppress_net_info>0</suppress_net_info> <unsigned_apps_ok>0</unsigned_apps_ok> <use_all_gpus>0</use_all_gpus> 
<use_certs>0</use_certs> <use_certs_only>0</use_certs_only> <vbox_window>0</vbox_window> </options> </cc_config> [/code] Event log after stopping and starting
Of course bash is running (several instances) and top is running, and yet the client will not suspend. Running a browser load test in Firefox to bring the CPU up to 40+% does not cause a suspend either. I even tried running top as the local boinc user to see if this was some strange perms issue, but it makes no difference. I'm likely missing something but I can't seem to find it, so any hints would be greatly appreciated. Also, I was not sure which flags would be of use so I did not turn any on for the postings here. Lastly, I posted to the Mageia forums earlier today in case this is a bug related to Mageia, and to request that a later version get built for Mageia. Cheers. |
Joined: 20 Nov 12 Posts: 801 |
Some thoughts.
"<venue name="linux">" These don't match.
"Using the boinc-manager on server01 I verified that all three activity settings are on Run based on preferences." I've had Manager lie to me a few times. Try toggling the activity settings.
"setting exclusive apps and local prefs set to run when busy but suspend when too much CPU load" Works here, although I use 7.7 client. Try setting <mem_usage_debug> and <time_debug> log flags. |
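For anyone following along, the suggestion above amounts to setting two entries inside <log_flags> in cc_config.xml to 1. A minimal sketch in Python, assuming the file has the <cc_config>/<log_flags> layout posted earlier in this thread; the function name `enable_log_flags` is made up for illustration and is not part of BOINC:

```python
import xml.etree.ElementTree as ET
from typing import Iterable


def enable_log_flags(config_xml: str, flags: Iterable[str]) -> str:
    """Return config_xml with each named <log_flags> entry set to 1.

    Hypothetical helper for illustration only; it edits the XML text
    and does not talk to the BOINC client.
    """
    root = ET.fromstring(config_xml)
    if root.tag != "cc_config":
        raise ValueError("expected a <cc_config> root element")
    log_flags = root.find("log_flags")
    if log_flags is None:
        # No <log_flags> section yet: create one.
        log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        elem = log_flags.find(name)
        if elem is None:
            # Flag not listed yet: add it.
            elem = ET.SubElement(log_flags, name)
        elem.text = "1"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<cc_config><log_flags>"
        "<task>1</task><mem_usage_debug>0</mem_usage_debug>"
        "</log_flags><options/></cc_config>"
    )
    print(enable_log_flags(sample, ["mem_usage_debug", "time_debug"]))
```

The client has to re-read cc_config.xml before new flags take effect; restarting the service does that, and `boinccmd --read_cc_config` should as well.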
Joined: 23 Oct 15 Posts: 4 |
Hi Juha, thanks for the info and suggestions...
So I was wondering about the "venue name". When attempting to figure out how to get things working, I defined two device profiles on the WCG website: Default and linux. Although I did have all the devices, including this one, set to use the "Default" profile, I would still see "linux" listed in the global_prefs.xml file. I then deleted the "linux" device profile on the website and eventually the clients removed this entry from their configuration. The WCG Host location is still "none"; I suspect this means it is using the "Default" device profile but I don't know for sure. The test clients no longer mention the "linux" device profile and there is no change in the clients' behavior.
I had read about this issue and did and have again tried toggling the activity settings. When set to suspend the clients suspend, when set to run always the clients run, and when set to preferences, the clients suspend reporting that the machines are busy.
I take it that the 7.7 client is a development version as I do not see it listed on the boinc site. Do you know where I might find a more recent build for linux x64? Also I tried the two flags without much success (as in I'm not sure what the information is telling me). Here is the event log: Note: the unrecognised tags seem to be due to using a boinc manager on Windows to access a boinc client running on Linux. Removing the tags from cc_config.xml manually does not appear to improve the situation any. The boinc manager on Windows likes to re-add the two flags whenever you adjust the computer preferences. 15/10/27 15:17:47 | | Unrecognized tag in cc_config.xml: <dont_suspend_nci> 15/10/27 15:17:47 | | Unrecognized tag in cc_config.xml: <dont_use_vbox> 15/10/27 15:17:47 | | Starting BOINC client version 7.2.42 for x86_64-pc-linux-gnu 15/10/27 15:17:47 | | log flags: file_xfer, sched_ops, task, mem_usage_debug, time_debug 15/10/27 15:17:47 | | Libraries: libcurl/7.40.0 OpenSSL/1.0.2d zlib/1.2.8 libidn/1.32 libssh2/1.4.3 15/10/27 15:17:47 | | Data directory: /var/lib/boinc 15/10/27 15:17:47 | | No usable GPUs found 15/10/27 15:17:47 | | Host name: host01 15/10/27 15:17:47 | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz [Family 6 Model 42 Stepping 7] 15/10/27 15:17:47 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid xsaveopt 15/10/27 15:17:47 | | OS: Linux: 4.1.8-desktop-1.mga5 15/10/27 15:17:47 | | Memory: 15.63 GB physical, 15.99 GB virtual 15/10/27 15:17:47 | | Disk: 163.72 GB total, 102.27 GB free 15/10/27 15:17:47 | | Local time is UTC -6 hours 
15/10/27 15:17:47 | | VirtualBox version: 5.0.8_OSEr103449 15/10/27 15:17:47 | | Config: GUI RPCs allowed from: 15/10/27 15:17:47 | | server01.domain.com 15/10/27 15:17:47 | | server01 15/10/27 15:17:47 | World Community Grid | URL http://www.worldcommunitygrid.org/; Computer ID [blanked]; resource share 100 15/10/27 15:17:47 | World Community Grid | General prefs: from World Community Grid (last modified 23-Oct-2015 17:13:34) 15/10/27 15:17:47 | World Community Grid | Host location: none 15/10/27 15:17:47 | World Community Grid | General prefs: using your defaults 15/10/27 15:17:47 | | Preferences: 15/10/27 15:17:47 | | max memory usage when active: 4001.26MB 15/10/27 15:17:47 | | max memory usage when idle: 12003.79MB 15/10/27 15:17:47 | | max disk usage: 10.00GB 15/10/27 15:17:47 | | max CPUs used: 4 15/10/27 15:17:47 | | don't compute while active 15/10/27 15:17:47 | | don't use GPU while active 15/10/27 15:17:47 | | suspend work if non-BOINC CPU load exceeds 30% 15/10/27 15:17:47 | | (to change preferences, visit a project web site or select Preferences in the Manager) 15/10/27 15:17:47 | | Not using a proxy 15/10/27 15:17:47 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:17:47 | | Suspending computation - computer is in use 15/10/27 15:17:47 | | Suspending network activity - computer is in use 15/10/27 15:17:47 | | [time] dt 13.724262 w2 0.999984 on 0.999984; active 0.930795; gpu_active 0.930795; conn 0.000000, cpu_and_net_avail 0.930795 15/10/27 15:17:48 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:17:49 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:17:50 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:17:51 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:17:52 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:17:53 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:17:54 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:17:55 | | 
[error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:17:56 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:17:57 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:17:57 | | [time] dt 10.125921 w2 0.999988 on 0.999984; active 0.930784; gpu_active 0.930784; conn 0.000000, cpu_and_net_avail 0.930784 15/10/27 15:17:58 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:17:59 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:18:00 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:18:01 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:18:02 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:18:03 | | [error] [mem_usage] procinfo_setup() returned 1 And the pattern continues this way. I turned off the mem_usage flag: 15/10/27 15:23:48 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:23:49 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:23:49 | | [time] dt 10.050012 w2 0.999988 on 0.999984; active 0.930405; gpu_active 0.930405; conn 0.000000, cpu_and_net_avail 0.930405 15/10/27 15:23:50 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:23:51 | | [error] [mem_usage] procinfo_setup() returned 1 15/10/27 15:23:51 | | Re-reading cc_config.xml 15/10/27 15:23:51 | | Unrecognized tag in cc_config.xml: <dont_suspend_nci> 15/10/27 15:23:51 | | Unrecognized tag in cc_config.xml: <dont_use_vbox> 15/10/27 15:23:51 | | Not using a proxy 15/10/27 15:23:51 | | Config: GUI RPCs allowed from: 15/10/27 15:23:51 | | server01.domain.com 15/10/27 15:23:51 | | server01 15/10/27 15:23:51 | | log flags: file_xfer, sched_ops, task, time_debug 15/10/27 15:23:59 | | [time] dt 10.047127 w2 0.999988 on 0.999984; active 0.930395; gpu_active 0.930395; conn 0.000000, cpu_and_net_avail 0.930395 15/10/27 15:24:09 | | [time] dt 10.043709 w2 0.999988 on 0.999984; active 0.930384; gpu_active 0.930384; conn 0.000000, cpu_and_net_avail 0.930384 15/10/27 15:24:19 | 
| [time] dt 10.041398 w2 0.999988 on 0.999984; active 0.930373; gpu_active 0.930373; conn 0.000000, cpu_and_net_avail 0.930373 15/10/27 15:24:29 | | [time] dt 10.036447 w2 0.999988 on 0.999984; active 0.930362; gpu_active 0.930362; conn 0.000000, cpu_and_net_avail 0.930362 15/10/27 15:24:39 | | [time] dt 10.031088 w2 0.999988 on 0.999984; active 0.930351; gpu_active 0.930351; conn 0.000000, cpu_and_net_avail 0.930351 15/10/27 15:24:49 | | [time] dt 10.025552 w2 0.999988 on 0.999984; active 0.930341; gpu_active 0.930341; conn 0.000000, cpu_and_net_avail 0.930341 15/10/27 15:24:59 | | [time] dt 10.033298 w2 0.999988 on 0.999984; active 0.930330; gpu_active 0.930330; conn 0.000000, cpu_and_net_avail 0.930330 Then turned on a few more flags to see if anything would show up: 15/10/27 15:25:00 | | Re-reading cc_config.xml 15/10/27 15:25:00 | | Unrecognized tag in cc_config.xml: <dont_suspend_nci> 15/10/27 15:25:00 | | Unrecognized tag in cc_config.xml: <dont_use_vbox> 15/10/27 15:25:00 | | Not using a proxy 15/10/27 15:25:00 | | Config: GUI RPCs allowed from: 15/10/27 15:25:00 | | server01.domain.com 15/10/27 15:25:00 | | server01 15/10/27 15:25:00 | | log flags: file_xfer, sched_ops, task, cpu_sched, cpu_sched_debug, cpu_sched_status 15/10/27 15:25:00 | | log flags: sched_op_debug, state_debug, time_debug 15/10/27 15:25:00 | | [cpu_sched_debug] Request CPU reschedule: Core client configuration 15/10/27 15:25:00 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:01 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:02 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:03 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:04 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:05 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:06 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:07 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:08 | | [suspend] 
net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:09 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:09 | | [time] dt 10.051890 w2 0.999988 on 0.999984; active 0.930319; gpu_active 0.930319; conn 0.000000, cpu_and_net_avail 0.930319 15/10/27 15:25:10 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:11 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:12 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:13 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:14 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:15 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:16 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:17 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:18 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:19 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:19 | | [time] dt 10.059023 w2 0.999988 on 0.999984; active 0.930308; gpu_active 0.930308; conn 0.000000, cpu_and_net_avail 0.930308 15/10/27 15:25:20 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:21 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:22 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:23 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:24 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:25 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:26 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:27 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:28 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:29 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 15/10/27 15:25:29 | | [time] dt 10.054603 w2 0.999988 on 0.999984; active 0.930297; gpu_active 0.930297; conn 0.000000, cpu_and_net_avail 0.930297 15/10/27 15:25:30 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2 
15/10/27 15:25:31 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2
15/10/27 15:25:32 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2
15/10/27 15:25:33 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2
15/10/27 15:25:34 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2
Then I finally set them back to default.
Once again on this client, after running with just the default preferences as defined on the WCG website, I set the local preferences to run even when the computer is busy and added "top" and "bash" as exclusive apps. I then ssh'ed into the client and ran both programs with no change; computing did not suspend.
I'm still open to any suggestions on how to troubleshoot this further, or to a workaround. This is stopping me from deploying to about 125 machines unfortunately.
Cheers. |
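With this many near-identical entries, it can help to tally the [suspend] lines rather than read them one by one. The following is only a minimal Python sketch, not part of the BOINC client: the function name tally_suspend_reasons and the sample text are made up for illustration, and no meaning is assumed for the numeric reason codes.

```python
import re
from collections import Counter

# Matches the tail of entries such as:
#   15/10/27 15:25:00 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2
SUSPEND_RE = re.compile(
    r"\[suspend\]\s+net_susp\s+(\d+)\s+file_xfer_susp\s+(\d+)\s+reason\s+(\d+)"
)


def tally_suspend_reasons(log_text):
    """Count [suspend] entries per (net_susp, file_xfer_susp, reason) triple."""
    counts = Counter()
    for match in SUSPEND_RE.finditer(log_text):
        counts[tuple(int(group) for group in match.groups())] += 1
    return counts


# Illustrative sample in the same format as the pasted log.
sample = (
    "15/10/27 15:25:00 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2\n"
    "15/10/27 15:25:01 | | [suspend] net_susp 1 file_xfer_susp 1 reason 2\n"
    "15/10/27 15:25:09 | | [time] dt 10.051890 w2 0.999988 on 0.999984\n"
)

for key, count in tally_suspend_reasons(sample).items():
    print(key, count)
```

Because the pattern is searched across the whole text rather than line by line, it also works on a log that has been pasted as one long line.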
Joined: 20 Nov 12 Posts: 801 |
"Do you know where I might find a more recent build for linux x64?"
Debian testing/unstable and the latest Ubuntu are probably at 7.6.x. I have no idea whether there is any hope of those packages running on Mageia. Fedora and RHEL/EPEL might be more compatible, but I don't know what versions they have. You could compile the client yourself. It's not all that hard.
"The boinc manager on Windows likes to re-add the two flags whenever you adjust the computer preferences."
The Manager doesn't edit the config files; it always rewrites the entire file and adds every flag and option it knows of. Those tags were added after 7.2.42. Nothing to worry about.
"15/10/27 15:17:48 | | [error] [mem_usage] procinfo_setup() returned 1"
That helps, sort of. Do you have any programs running that have ')' in their name? Anything older than 7.4.0 can't handle such names, which then breaks both suspend by CPU usage and exclusive apps.
"This is stopping me from deploying to about 125 machines unfortunately."
You have permission to do that, right? |
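One plausible way a ')' in a process name can trip up a parser is when reading /proc/<pid>/stat on Linux, where the name is itself wrapped in parentheses. That is an assumption for illustration; the thread does not show the client's actual parsing code. The Python sketch below contrasts a fragile parse with a robust one. The function names and the sample line are hypothetical.

```python
def parse_stat_line(line):
    """Split the start of a /proc/<pid>/stat line into (pid, name, state).

    The name field is wrapped in parentheses but may itself contain
    parentheses or spaces, so the closing delimiter is the LAST ')'.
    """
    start = line.index("(")
    end = line.rindex(")")
    pid = int(line[:start])
    name = line[start + 1:end]
    state = line[end + 1:].split()[0]
    return pid, name, state


def naive_parse_stat_line(line):
    """Fragile variant: stops at the FIRST ')', as a simple scanner might."""
    start = line.index("(")
    end = line.index(")")
    return int(line[:start]), line[start + 1:end], line[end + 1:].split()[0]


# Made-up stat line for a process whose name is "(sd-pam)".
line = "1234 ((sd-pam)) S 1 1234 1234 0 -1 4194624"

print(parse_stat_line(line))
print(naive_parse_stat_line(line))
```

With the made-up line, the robust version recovers the full name and the state, while the naive version truncates the name and reads a stray ')' where the state should be, which would throw off every field after it.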
Joined: 23 Oct 15 Posts: 4 |
Yep. I'm one of the sysadmins for a few computer labs (dual-boot Windows/Linux PCs, and Macs). I have deployed to all the Windows installs; Linux is what I'm working on now, Macs are next, and then some of our RDS servers. Eventually I want to get a project server up and running, as we have a few researchers interested in running their own projects across all of our machines. |
Joined: 23 Oct 15 Posts: 4 |
Do you know where I might find a more recent build for linux x64? Looks like I may try this and see what happens.
As a matter of fact, running ps auxw | grep ')' shows three instances of "(sd-pam)". I'm guessing this bug is what is causing my issues. Thank you! I'll try hunting down a newer rpm or compiling the client from source. I'll post back the results. Cheers. |
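For checking many lab machines, the same test can be scripted. This is a rough Python sketch, assuming a Linux /proc filesystem; the function name find_paren_processes and its proc_root parameter are illustrative. It looks only at the short process name in /proc/<pid>/comm, so it is narrower than grepping the full ps output, which also matches command-line arguments.

```python
import os


def find_paren_processes(proc_root="/proc"):
    """Return sorted (pid, name) pairs whose process name contains '(' or ')'."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux-style /proc, nothing to scan
    hits = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "cpuinfo"
        comm_path = os.path.join(proc_root, entry, "comm")
        try:
            with open(comm_path, errors="replace") as handle:
                name = handle.read().strip()
        except OSError:
            continue  # process exited or is not readable
        if "(" in name or ")" in name:
            hits.append((int(entry), name))
    return sorted(hits)


if __name__ == "__main__":
    for pid, name in find_paren_processes():
        print(pid, name)
```

An empty result on a machine would suggest that process names with parentheses are not the cause there.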
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.