Changes to the multicore app configuration



Message boards : News : Changes to the multicore app configuration

David Cameron
Project administrator
Project developer
Project tester
Project scientist
Joined: 13 May 14
Posts: 252
Credit: 2,028,556
RAC: 1
    
Message 5277 - Posted: 13 Sep 2016, 8:09:56 UTC
Last modified: 13 Sep 2016, 8:54:21 UTC

After analysis of results from the multicore app (see this thread) we have made the following changes

- Minimum VirtualBox version required is 5.0.0
- Maximum cores reduced from 12 to 8 (this can be overridden in app_config.xml but we have seen severe performance issues above 8 cores)
- Enforced that hardware acceleration is enabled on the host. This should stop the -148 ERR_EXEC errors because WU will not be sent to these hosts.

Please continue to provide your feedback on changes we make.
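
As an illustration of the app_config.xml override mentioned above, here is a minimal sketch. The post does not say which elements to use, so this assumes the standard BOINC <app_version> override with <avg_ncpus>. The app name ATLAS_MCORE and the plan class vbox_64_mt_mcore are assumptions and should be checked against the entries in your own client_state.xml.

```xml
<!-- Sketch only: app name and plan class are assumed, not confirmed in this thread -->
<app_config>
  <app_version>
    <app_name>ATLAS_MCORE</app_name>           <!-- assumed app name -->
    <plan_class>vbox_64_mt_mcore</plan_class>  <!-- assumed plan class -->
    <avg_ncpus>4</avg_ncpus>                   <!-- cores budgeted per task; example value -->
  </app_version>
</app_config>
```

The file belongs in the project's folder under the BOINC data directory and is picked up after the client re-reads its config files or restarts. Whether the VM then really uses that many cores may also depend on the project's own wrapper settings.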

David Cameron
Project administrator
Project developer
Project tester
Project scientist
Joined: 13 May 14
Posts: 252
Credit: 2,028,556
RAC: 1
    
Message 5278 - Posted: 13 Sep 2016, 8:54:09 UTC - in response to Message 5277.

... and just after I wrote this we ran out of work for multicore :( We'll try to have some new work by the end of the day.

computezrmle
Joined: 29 Oct 14
Posts: 54
Credit: 1,137,404
RAC: 0
    
Message 5280 - Posted: 13 Sep 2016, 9:01:36 UTC - in response to Message 5277.
Last modified: 13 Sep 2016, 9:02:34 UTC

After analysis of results from the multicore app (see this thread) we have made the following changes

- Minimum VirtualBox version required is 5.0.0
- Maximum cores reduced from 12 to 8 (this can be overridden in app_config.xml but we have seen severe performance issues above 8 cores)
- Enforced that hardware acceleration is enabled on the host. This should stop the -148 ERR_EXEC errors because WU will not be sent to these hosts.

Please continue to provide your feedback on changes we make.


You forgot to mention:
- # of unsent WUs is set to 0 (according to the server status page)

;-)

Oops: too slow

47an
Joined: 14 Jan 15
Posts: 24
Credit: 10,732,116
RAC: 0
    
Message 5357 - Posted: 21 Sep 2016, 19:06:52 UTC

A short request from me to ATLAS about the number of events per work unit.

Over the last 2 days we have received very short units. Fair enough, this might be good for those who choose a low number of cores or have slow cores. I read that new tasks have about 50 events per unit, and I think this would be very inefficient if that is the setting for the last days' units.

Looking at some of the last days' shorter runs, with a total run time of 1500-2000 sec (mt_mcore set to 8 cores) and default settings, they could end with a CPU time of 5000-7000. I checked, and if I remember correctly, single-core units could start after 5-6 minutes, whereas in the last days it seems to take 9-11 minutes until a unit starts using the cores. The units had a total run time of 30-40 minutes, so 1/4 to 1/3 of the time went to creating the VM, booting and setup before the start.
This would be fine for a task with a 3-4 hour run, but for a half-hour run it is not as efficient as it could be.
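
The figures above can be put into a rough worked estimate. This is only a sketch: it assumes the 9-11 minute startup and the 30-40 minute total run time can be paired freely, and that the CPU time is given in seconds like the run time. The symbols are introduced here for illustration.

```latex
% Startup overhead as a fraction of the total run time
\[
f_{\text{startup}} = \frac{t_{\text{startup}}}{t_{\text{total}}}, \qquad
\frac{9}{40} \approx 0.23 \;\le\; f_{\text{startup}} \;\le\; \frac{11}{30} \approx 0.37
\]

% CPU efficiency with N = 8 cores
\[
\eta = \frac{t_{\text{CPU}}}{N \, t_{\text{run}}}, \qquad
\frac{5000}{8 \times 2000} \approx 0.31 \;\le\; \eta \;\le\; \frac{7000}{8 \times 1500} \approx 0.58
\]
```

So roughly a quarter to a third of each short task goes to startup, and on these assumptions the eight cores are busy for well under two thirds of the run.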

I do like the new mt_mcore app, which saves a lot of RAM and disk, but to keep that benefit we would all gain from a higher number of events in these work units.

How about 50 events for short units, 100 as the default and 150 for long ones, and then 200-400 events for those who would like even longer units?

I was used to 10 000 sec units and thought that was fine at the time. There are still some of these today, but more and more short units reach my host.

David Cameron
Project administrator
Project developer
Project tester
Project scientist
Joined: 13 May 14
Posts: 252
Credit: 2,028,556
RAC: 1
    
Message 5358 - Posted: 21 Sep 2016, 19:14:50 UTC - in response to Message 5357.

Unfortunately it was not our choice to have these short WUs; the tasks are defined upstream with a fixed number of events and we cannot change it. The ideal number for multi-core is probably 200 events but we do not have any of those tasks at the moment.

More technical details: the simulation uses an input file of event data and simulates a certain number of events from that file. We are currently limited to one input file per WU, and the current WUs have input files with only 50 events. This means the maximum number of events that each WU can simulate is 50.

We are working on a solution to allow multiple input files per WU but it may be a few weeks before this is ready. In the meantime we are putting pressure upstream to get tasks with input files containing more events.

Tom*
Joined: 28 Jun 14
Posts: 118
Credit: 8,761,428
RAC: 0
    
Message 5363 - Posted: 21 Sep 2016, 20:09:47 UTC
Last modified: 21 Sep 2016, 20:11:58 UTC

I looked at the elapsed time versus the CPU time and concluded that, for my systems, three CPUs per task is the best usage for the current work units. By doing this I can squeeze in an extra ATLAS task on my 12-CPU host and free up a core for an extra vLHC task without losing much in the way of credits.

With 4 cpu's per task one cpu was hardly utilized.

It will likely change as we get 200 event tasks again.

Tom*
Joined: 28 Jun 14
Posts: 118
Credit: 8,761,428
RAC: 0
    
Message 5369 - Posted: 21 Sep 2016, 21:29:12 UTC

Forget my last remark about CPU usage: after setting three CPUs, ATLAS only uses two of them. I will still need to see whether an extra task does better credit-wise than extra CPUs per task.

Laughingass
Joined: 8 Jun 15
Posts: 3
Credit: 46,682
RAC: 0
    
Message 5422 - Posted: 1 Oct 2016, 0:37:41 UTC - in response to Message 5369.

My Windows laptop with i5-6300HQ Quad-core and 16GB RAM was sluggish when more than one Atlas task was running.

Setting <max_concurrent>1</max_concurrent> didn't limit tasks to one.

I looked in the config doc at
https://boinc.berkeley.edu/wiki/Client_configuration
Found the setting project_max_concurrent

I added this setting to the Atlas app_config.xml next to the existing entries.
C:\ProgramData\BOINC\projects\atlasathome.cern.ch\app_config.xml:
<max_concurrent>1</max_concurrent>
<project_max_concurrent>1</project_max_concurrent>

Now Atlas is limited to one task and my laptop is running normally again.

Jacob Klein
Joined: 21 Jun 14
Posts: 48
Credit: 27,798
RAC: 0
    
Message 5423 - Posted: 1 Oct 2016, 13:31:06 UTC - in response to Message 5422.
Last modified: 1 Oct 2016, 13:31:28 UTC

Per documentation here:
https://boinc.berkeley.edu/wiki/Client_configuration

<max_concurrent> is an app-level element, that must be within an <app> element. Are you sure you set that up correctly?
<project_max_concurrent> is a project-level element, that does not have to be within an <app> element, which probably explains why it is working for you.

If you still have your <max_concurrent> app_config.xml attempt that didn't work, I'd be interested in inspecting it, to make sure BOINC doesn't have a bug.
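
For reference, a minimal sketch of the two placements described above, following the layout in the BOINC client configuration documentation. The app name ATLAS_MCORE is an assumption; use the name listed in your own client_state.xml.

```xml
<!-- Sketch only: the app name is assumed, not confirmed in this thread -->
<app_config>
  <app>
    <name>ATLAS_MCORE</name>                 <!-- assumed app name -->
    <max_concurrent>1</max_concurrent>       <!-- app-level: must sit inside <app> -->
  </app>
  <!-- project-level: sits directly under <app_config> -->
  <project_max_concurrent>1</project_max_concurrent>
</app_config>
```

A bare <max_concurrent> line outside any <app> element would not match this layout, which could be one reason for it having no effect.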

Regards,
Jacob Klein

maeax
Joined: 25 Jun 14
Posts: 50
Credit: 1,700,662
RAC: 0
    
Message 5613 - Posted: 12 Nov 2016, 12:32:03 UTC
Last modified: 12 Nov 2016, 12:33:51 UTC

When more logical CPUs were defined than physical ones,
the following info appears under the BOINC VM settings of the ATLAS multicore VM.

System Processor page
More virtual CPUs are assigned to the virtual machine than the number of physical CPUs on the host system (4).
This is likely to degrade the performance of your virtual machine.
Please consider reducing the number of virtual CPUs.


I have PCs with 4/8, 2/4 and 4/4 (physical/virtual) CPUs.

Virtualbox 5.1.8.

tullio
Joined: 27 Jun 14
Posts: 256
Credit: 289,886
RAC: 0
    
Message 5615 - Posted: 12 Nov 2016, 13:38:46 UTC
Last modified: 12 Nov 2016, 13:41:25 UTC

My AMD A10-6700 has 2 cores and 4 logical processors, according to the Windows 10 Task Manager. I believe this to be right, since it runs only 2 BOINC tasks at a time, or one if it is multicore, like ATLAS. GPU tasks, both SETI@home and Einstein@home, run alongside.
Tullio
