Parallel Garbage Collector
One of the core components of Maple's engine is the garbage collector. The garbage collector is responsible for finding and reclaiming memory that is no longer needed by the evaluation engine. In Maple 17, the garbage collector can take advantage of multiple processors to perform its job more quickly. Because the garbage collector runs continuously while Maple runs, this speedup helps all users, not just those running parallel algorithms.
The Maple 17 collector does this by running multiple threads across the available cores, which can speed up collection significantly. Although the collector accounts for only a fraction of Maple's total running time, overall running times can still be reduced by up to 10%. These speedups require no changes to user code.
Two kernelopts settings control the parallel collector.
gcmaxthreads controls the maximum number of threads that the parallel garbage collector will use.
gcthreadmemorysize controls how many threads Maple will use for a collection. The number of bytes allocated is divided by the value assigned to gcthreadmemorysize to determine the number of threads.
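As a minimal sketch, both settings can be inspected and changed through kernelopts. The cap of 2 and the byte counts below are purely illustrative, oldMax is a hypothetical name, and the rounding and capping in the last line are assumptions about how the quotient is turned into a thread count rather than documented behavior. The sketch also relies on the general kernelopts convention that setting an option returns its previous value.

```maple
# Query the current settings (values differ between machines and versions).
kernelopts(gcmaxthreads);
kernelopts(gcthreadmemorysize);

# Cap the collector at two threads; the call returns the previous value.
oldMax := kernelopts(gcmaxthreads = 2):

# ... run a computation here ...

# Restore the earlier cap.
kernelopts(gcmaxthreads = oldMax):

# Illustrative arithmetic only, with hypothetical figures:
# 512 MiB allocated, 64 MiB per thread, and a cap of four threads.
bytesAllocated := 512 * 2^20:
threadMemorySize := 64 * 2^20:
min(4, ceil(bytesAllocated / threadMemorySize));
```

Saving and restoring the previous value keeps an experiment from changing collector behavior for the rest of the session.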
The following example shows the speedup in the garbage collector from using multiple threads. Depending on your machine, it may take a few minutes to run.
restart;
if kernelopts(wordsize) = 32 then upperBound := 2*10^7; else upperBound := 10^8; end if:
data := [seq(i, i = 1 .. upperBound)]:
p := proc() local i; for i to 100 do gc() end do end proc:
kernelopts(gcmaxthreads = 1): t1[1] := time[real](p()):
kernelopts(gcmaxthreads = 2): t1[2] := time[real](p()):
kernelopts(gcmaxthreads = 4): t1[4] := time[real](p()):
Statistics:-ColumnGraph([t1[1], t1[2], t1[4]], datasetlabels = ["One Thread", "Two Threads", "Four Threads"], labels = ["", "Seconds"], gridlines);
Garbage collection accounts for only a small part of Maple's total running time, so the speedup will be less pronounced in real Maple computations. The following example shows a more realistic situation.
d := [seq(randpoly([x, y, z], dense, degree = 12), i = 1 .. 10000)]:
f := proc() local i; map(i -> int(i, x), d) end proc:
kernelopts(gcmaxthreads = 1):
t2[1] := time[real](f()):
kernelopts(gcmaxthreads = 2):
t2[2] := time[real](f()):
kernelopts(gcmaxthreads = 4):
t2[4] := time[real](f()):
Statistics:-ColumnGraph([t2[1], t2[2], t2[4]], datasetlabels = ["One Thread", "Two Threads", "Four Threads"], labels = ["", "Seconds"], gridlines);
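The set-then-time pattern used in both examples can be wrapped in a small helper. This is only a sketch: timeWithThreads, work, tOne, and tFour are hypothetical names that do not appear elsewhere on this page, and the helper assumes that setting gcmaxthreads through kernelopts returns the previous value.

```maple
# Hypothetical helper: time a zero-argument procedure under a given
# cap on garbage collector threads, then restore the previous cap.
timeWithThreads := proc(n::posint, work::procedure)
    local old, t;
    old := kernelopts(gcmaxthreads = n);   # assumed to return the previous value
    try
        t := time[real](work());           # wall-clock seconds for one run
    finally
        kernelopts(gcmaxthreads = old);    # restore even if work() raises an error
    end try;
    return t;
end proc:

# Usage with a small, self-contained workload (sizes are illustrative).
work := proc() local i; add(nops([seq(i, i = 1 .. 10^5)]), i = 1 .. 50) end proc:
tOne := timeWithThreads(1, work):
tFour := timeWithThreads(4, work):
tOne / tFour;   # ratio above 1 means the four-thread run was faster
```

Measured ratios vary from run to run and between machines, so repeating each measurement a few times gives a more reliable comparison.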
See Also
kernelopts