
Message boards : Number Crunching : Memory usage

Author Message
Alexander
Send message
Joined: 11 Aug 14
Posts: 41
Combined Credit: 23,861,254
DNA@Home: 428,269
SubsetSum@Home: 1,125,177
Wildlife@Home: 22,307,809
Wildlife@Home Watched: 0s
Wildlife@Home Events: 0
Climate Tweets: 0
Images Observed: 0

      
Message 4930 - Posted: 29 Dec 2014, 19:53:33 UTC

I've noticed on 2 systems, both with 2 cores/4 threads, that one Gibbs sampler task has paused with the message 'Waiting for memory', leaving one thread unused.
Windows Task Manager shows that these processes take more than 900 MB of RAM. Sorry, but this makes it impractical on 4 GB systems, since Windows also needs some memory and 4 tasks of this kind are too much. All other apps find 4 GB of RAM enough, and I don't want to upgrade to 8 GB just for the Gibbs sampler.

Is there a way to reduce the RAM usage a little bit? Otherwise I could try to make an app_info.xml that limits the Gibbs sampler to a maximum of 2 simultaneous tasks, but this might require some maintenance whenever the task name changes.

Alexander

Profile Charles Dennett
Avatar
Send message
Joined: 10 Aug 14
Posts: 39
Combined Credit: 354,259
DNA@Home: 219,783
SubsetSum@Home: 134,472
Wildlife@Home: 4
Wildlife@Home Watched: 0s
Wildlife@Home Events: 0
Climate Tweets: 0
Images Observed: 0

    
Message 4931 - Posted: 29 Dec 2014, 20:58:54 UTC

I'm running into the same problem. There is one type of task that I've found has this rather large memory footprint; the task names are of the form gibbs_snail_hg19_1000fa_3motifs. On my 4-core/4 GB Linux system I've had to create an app_config.xml file in /var/lib/boinc/projects/volunteer.cs.und.edu_csg to limit the client to at most 2 CSG tasks. The file looks like this:

<app_config>
<app>
<name>gibbs</name>
<max_concurrent>2</max_concurrent>
</app>
</app_config>


I also have an old HP Pavilion notebook with two cores and only 1 GB of memory. It can handle one of these tasks at a time, but not two. I usually have to manually abort these tasks if I see them; otherwise the thing locks up. I have a start delay on tasks so I can reboot, start BOINC, and get in there to abort a task before it actually starts if needed. The delay is a line I added in cc_config.xml:

<cc_config>
<options>
<start_delay>180</start_delay>
</options>
</cc_config>


The 4-core system is also crunching for POEM@Home. POEM is 64-bit Linux only, so my other crunchers are doing only CSG for now. The other project I currently crunch, FiND@Home, is down at the moment.
____________

Alexander
Send message
Joined: 11 Aug 14
Posts: 41
Combined Credit: 23,861,254
DNA@Home: 428,269
SubsetSum@Home: 1,125,177
Wildlife@Home: 22,307,809
Wildlife@Home Watched: 0s
Wildlife@Home Events: 0
Climate Tweets: 0
Images Observed: 0

      
Message 4933 - Posted: 30 Dec 2014, 8:45:48 UTC - in response to Message 4931.

Hi Charles,
thanks for your reply.
On one system (with an on-chip GPU) I had to reduce max_concurrent even to one; the usable RAM there is limited to 3.4 GB.

@Travis:
Now it's clear that this project is not a good fit for Android devices.
But:
These days I installed a low-budget system: a motherboard with an onboard Intel Celeron J1900, a 4-core 2 GHz chip. It runs with an SSD, at 100% CPU load plus one Einstein Intel GPU WU, at 27 watts from the wall. If anyone is interested in the performance, check my systems; it's Rocky1. It will run 24/7.

yo2013
Send message
Joined: 27 Sep 14
Posts: 19
Combined Credit: 963,744
DNA@Home: 118,965
SubsetSum@Home: 122,043
Wildlife@Home: 722,736
Wildlife@Home Watched: 0s
Wildlife@Home Events: 0
Climate Tweets: 0
Images Observed: 0

      
Message 4934 - Posted: 30 Dec 2014, 20:14:46 UTC - in response to Message 4933.

I experience this sometimes on one of my Linux boxes. It gets fixed if I suspend the WU and resume it again.
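
In case it helps on a headless box, the same suspend/resume cycle should be possible with boinccmd. This is just a sketch: the task name is a placeholder, and I'm assuming the project URL matches the volunteer.cs.und.edu_csg project directory:

boinccmd --task http://volunteer.cs.und.edu/csg/ <task_name> suspend
boinccmd --task http://volunteer.cs.und.edu/csg/ <task_name> resume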

Robert H
Send message
Joined: 27 Oct 14
Posts: 38
Combined Credit: 251,816
DNA@Home: 31,083
SubsetSum@Home: 220,734
Wildlife@Home: 0
Wildlife@Home Watched: 0s
Wildlife@Home Events: 0
Climate Tweets: 0
Images Observed: 0

    
Message 4935 - Posted: 2 Jan 2015, 23:18:02 UTC
Last modified: 2 Jan 2015, 23:18:52 UTC

What is the recommended minimum amount of RAM? I have 4 GB, but maybe I should buy 4 GB more?

Jozef J
Send message
Joined: 19 Apr 14
Posts: 15
Combined Credit: 21,881,667
DNA@Home: 1,006,826
SubsetSum@Home: 115,384
Wildlife@Home: 20,759,457
Wildlife@Home Watched: 0s
Wildlife@Home Events: 0
Climate Tweets: 4,150
Images Observed: 37,019

          
Message 4936 - Posted: 4 Jan 2015, 12:25:11 UTC

I have one computer with a 12-core Xeon E5 and 16 gigabytes of DDR3 RAM, and that is scarcely sufficient for the computation. LoooL, so for me the minimum for this project is about 32 gigabytes of RAM.

Alexander
Send message
Joined: 11 Aug 14
Posts: 41
Combined Credit: 23,861,254
DNA@Home: 428,269
SubsetSum@Home: 1,125,177
Wildlife@Home: 22,307,809
Wildlife@Home Watched: 0s
Wildlife@Home Events: 0
Climate Tweets: 0
Images Observed: 0

      
Message 4938 - Posted: 4 Jan 2015, 20:44:20 UTC

As a very coarse calculation I would say:
1 GB for the operating system plus 1 GB per workunit.
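
For example, on a 4 GB machine that rule gives at most 4 - 1 = 3 workunits, so an app_config.xml like the one Charles posted, with the limit set to 3 (assuming the application is still named gibbs), should keep things safe:

<app_config>
<app>
<name>gibbs</name>
<max_concurrent>3</max_concurrent>
</app>
</app_config>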

yo2013
Send message
Joined: 27 Sep 14
Posts: 19
Combined Credit: 963,744
DNA@Home: 118,965
SubsetSum@Home: 122,043
Wildlife@Home: 722,736
Wildlife@Home Watched: 0s
Wildlife@Home Events: 0
Climate Tweets: 0
Images Observed: 0

      
Message 4940 - Posted: 6 Jan 2015, 9:43:39 UTC
Last modified: 6 Jan 2015, 9:48:32 UTC

I'm experiencing this problem more often now, and on all my machines. From time to time (almost daily) half of my cores stop computing tasks for BOINC and I have to fix the problem manually.

As a workaround, I will set the client to switch between tasks every 10 minutes instead of 1 hour.
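
If you'd rather set that in a file than through the Manager's computing preferences, a minimal global_prefs_override.xml sketch would be the following (assuming the standard BOINC preference tag; note this overrides your web preferences):

<global_preferences>
<cpu_scheduling_period_minutes>10</cpu_scheduling_period_minutes>
</global_preferences>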

Jozef J
Send message
Joined: 19 Apr 14
Posts: 15
Combined Credit: 21,881,667
DNA@Home: 1,006,826
SubsetSum@Home: 115,384
Wildlife@Home: 20,759,457
Wildlife@Home Watched: 0s
Wildlife@Home Events: 0
Climate Tweets: 4,150
Images Observed: 37,019

          
Message 4941 - Posted: 7 Jan 2015, 2:15:53 UTC - in response to Message 4938.

Alexander, you must be a computer specialist. Respect to you, man.

gtippitt
Send message
Joined: 26 Mar 14
Posts: 4
Combined Credit: 10,688,594
DNA@Home: 950,688
SubsetSum@Home: 0
Wildlife@Home: 9,737,906
Wildlife@Home Watched: 0s
Wildlife@Home Events: 0
Climate Tweets: 0
Images Observed: 0

    
Message 4952 - Posted: 12 Jan 2015, 21:05:57 UTC - in response to Message 4938.

Having 1 GB per job is a good rule of thumb for this project. I've got 7 motherboards with 24 cores each (quad AMD 8431 hex-cores) running DNA jobs. The RAM in these systems varies from 16 GB up. Only the ones with at least 32 GB are able to run 20 DNA jobs without memory problems. Setting a limit on concurrent DNA jobs, so the system can run other jobs on the remaining cores, is the best solution. POEM@Home is a good choice, as they are working on similar research and use less RAM.

You should also check your BOINC preferences and turn off the option to leave jobs in memory when interrupted. This applies not only when a job is suspended because of a user's non-BOINC activity, but also when a BOINC job is waiting for another with higher priority to finish.
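
In the Manager that's the "Leave applications in memory while suspended" computing preference; if you manage preferences through global_prefs_override.xml instead, the tag (assuming the standard preference name) would be:

<global_preferences>
<leave_apps_in_memory>0</leave_apps_in_memory>
</global_preferences>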

My Ubuntu systems are diskless and use iSCSI drives from my disk server. For the 24-core systems, I limit the number of cores for BOINC to 23, leaving 1 core free to take care of system overhead and LAN I/O. With this setting I get overall CPU utilization of at least 95%. If BOINC jobs are running on all CPU cores, the CPU utilization tops out at about 85% due to jobs waiting on resources. These systems are also running jobs on 1 or more GPUs for POEM and GPUGrid, so keeping resources free to prevent bottlenecks to the GPUs is important.
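
For reference, that 23-core limit can be set with the ncpus option in cc_config.xml (a sketch, assuming the standard client option; the "use at most X% of the CPUs" preference, 23/24 being about 95.8%, works too):

<cc_config>
<options>
<ncpus>23</ncpus>
</options>
</cc_config>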

On multicore systems I would recommend not running DNA jobs on all cores, by limiting the number of concurrent DNA jobs. I would leave 1 or more cores free to run smaller jobs like POEM and to take care of system overhead. This way the DNA jobs get everything they need to run as fast as possible. For example, on dual- and quad-core systems I have a 1-core limit for projects like CSG and WCG, and let smaller-RAM projects like SETI and POEM use whatever is left; see the sketch below.
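
A whole-project cap like that can go in the project's app_config.xml using project_max_concurrent (assuming your client is new enough to support that tag):

<app_config>
<project_max_concurrent>1</project_max_concurrent>
</app_config>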

Every project is a bit different. Some run best using the fast, but limited instruction set of the GPU, while other projects need a CPU, which is slower but has a more robust instruction set. Some projects are doing only a few operations against data that can be chopped in small chunks and run with few resources. Other projects with more complex logic will require more resources, even with small chunks of data.

For a simple (and silly) example:
I have a bunch of small animals that need to be sorted into different holding areas. These animals are either wild sewer rats or tame pet chinchillas. No matter how many animals I need to sort, or how many people are sorting them, the instructions are fairly short. Is the animal smooth-haired or fuzzy? Simple logic that requires few instructions to be written.

Another project has lots of pictures of cats, guinea pigs, rabbits, and dogs to be sorted and counted. The instructions for this project would be much more complex. How do you tell the difference between a long-haired guinea pig, a Persian kitten, and a Yorkshire terrier? How do you tell a beagle from a rabbit? Both have long ears. You can't hear the beagle's bay announcing she's smelled a rabbit. Both are standing still and alert, smelling the air. Some of the rabbits are white, while others are brown. Some of the beagles are white, while others are tri-color. Both are in the same tall grass, so fur length and size are difficult to determine from the two photos. A long list of instructions will be needed for the sorters. It is not as simple as "use less paper, so the sorters don't need to read as many instructions." Using a smaller font size won't help.

It isn't that some projects are "better"; they are all trying to determine different things, even when their goals and subjects of research are similar.

Greg

yo2013
Send message
Joined: 27 Sep 14
Posts: 19
Combined Credit: 963,744
DNA@Home: 118,965
SubsetSum@Home: 122,043
Wildlife@Home: 722,736
Wildlife@Home Watched: 0s
Wildlife@Home Events: 0
Climate Tweets: 0
Images Observed: 0

      
Message 4957 - Posted: 17 Jan 2015, 11:09:55 UTC - in response to Message 4952.


On multicore systems I would recommend not running DNA jobs on all cores, by limiting the number of concurrent DNA jobs. I would leave 1 or more cores free to run smaller jobs like POEM and to take care of system overhead. This way the DNA jobs get everything they need to run as fast as possible. For example, on dual- and quad-core systems I have a 1-core limit for projects like CSG and WCG, and let smaller-RAM projects like SETI and POEM use whatever is left.


How do you do that? AFAIK, the number of cores used can only be set for BOINC as a whole.

Trotador
Send message
Joined: 14 Oct 14
Posts: 4
Combined Credit: 14,003,824
DNA@Home: 1,465,524
SubsetSum@Home: 448,628
Wildlife@Home: 12,089,671
Wildlife@Home Watched: 0s
Wildlife@Home Events: 0
Climate Tweets: 0
Images Observed: 0

      
Message 4958 - Posted: 17 Jan 2015, 11:39:00 UTC - in response to Message 4957.
Last modified: 17 Jan 2015, 11:44:13 UTC


On multicore systems I would recommend not running DNA jobs on all cores, by limiting the number of concurrent DNA jobs. I would leave 1 or more cores free to run smaller jobs like POEM and to take care of system overhead. This way the DNA jobs get everything they need to run as fast as possible. For example, on dual- and quad-core systems I have a 1-core limit for projects like CSG and WCG, and let smaller-RAM projects like SETI and POEM use whatever is left.


How do you do that? AFAIK, the number of cores used can only be set for BOINC as a whole.


You have to create an app_config.xml file (with a text editor) in the project folder and fill it in with:

<app_config>
<app>
<name>gibbs</name>
<max_concurrent>9</max_concurrent>
</app>
</app_config>

Substitute "9" with the value you want for your system, depending on the total RAM it has available and taking into account whether it is a dedicated cruncher or your daily system.

To make the client aware of the file, just use "Read config files" in the Advanced menu of BOINC Manager, or restart it.
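
On a headless machine the same re-read can be triggered from the command line (assuming the stock boinccmd tool that ships with the client):

boinccmd --read_cc_config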

Edit: oops! already explained in the second post of this thread...

yo2013
Send message
Joined: 27 Sep 14
Posts: 19
Combined Credit: 963,744
DNA@Home: 118,965
SubsetSum@Home: 122,043
Wildlife@Home: 722,736
Wildlife@Home Watched: 0s
Wildlife@Home Events: 0
Climate Tweets: 0
Images Observed: 0

      
Message 4961 - Posted: 18 Jan 2015, 13:36:40 UTC - in response to Message 4958.

Oh, thanks! Didn't see the second post...


