Below are a few speed-up ratios between GFARGO and the standard double-precision FARGO code running on one CPU core. These ratios are obtained in the limit of a large mesh. For the comparison problem of de Val-Borro et al. (2006), the small mesh size (384*128) yields somewhat smaller ratios. For these benchmarks, the CPU code is always compiled with
gcc, with the
|GPU||CPU||Speed-up||Platform|
|8600M GT||Intel(R) Core(TM) 2 Duo @ 2.4 GHz||9.5x||MacBook Pro (2007)|
|Quadro FX5800||AMD Opteron @ … GHz||90x||Clusters at CEA|
|Tesla C1070||Intel(R) Xeon(TM) @ … GHz||30x||Cluster at CEA|
|GeForce GTX 285||Intel(R) Core(TM) 2 Duo E6750 @ 2.66 GHz||61x||Linux workstation|
The standard test of a Neptune-mass planet embedded in an inviscid disk (see de Val-Borro et al. (2006)) runs in 4 minutes and 30 seconds on a MacBook Pro (2007) equipped with the NVIDIA GeForce 8600M GT chip. It runs in 2 minutes and 50 seconds on a more modern MacBook Pro equipped with the 330M GT chip, and in 92 seconds on a GTX 285. The double-precision version for CUDA 5.0 runs in 2 minutes and 39 seconds on a 650M chip (such as those found in 2012-2013 MacBook Pros), and in 52 seconds on a Tesla C2050.
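To make these run times easier to compare, the short Python sketch below converts them to seconds and computes each GPU's speed-up relative to the slowest chip in its group. It is only an illustration: the dictionary names and the `relative_speedups` helper are hypothetical and are not part of GFARGO. The two groups are kept separate because the text distinguishes the double-precision CUDA 5.0 version from the runs quoted before it, so ratios across groups would mix two code versions.

```python
# Run times quoted in the text, converted to seconds.
# Runs quoted before the double-precision version is mentioned:
original_runs = {
    "8600M GT": 4 * 60 + 30,   # 4 min 30 s
    "330M GT": 2 * 60 + 50,    # 2 min 50 s
    "GTX 285": 92,
}

# Runs of the double-precision version for CUDA 5.0:
double_precision_runs = {
    "650M": 2 * 60 + 39,       # 2 min 39 s
    "Tesla C2050": 52,
}


def relative_speedups(times, baseline):
    """Return baseline_time / time for every entry (hypothetical helper)."""
    reference = times[baseline]
    return {name: reference / t for name, t in times.items()}


original_ratios = relative_speedups(original_runs, "8600M GT")
double_ratios = relative_speedups(double_precision_runs, "650M")

for group in (original_ratios, double_ratios):
    for name, ratio in group.items():
        print(f"{name}: {ratio:.2f}x")
```

These ratios compare GPUs with one another on the small 384*128 mesh. They are not the GPU-versus-CPU speed-ups of the table, which are obtained in the large-mesh limit.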