Below are a few speed-up ratios between GFARGO and the standard, double-precision FARGO code running on one CPU core. These ratios are obtained in the limit of a large mesh. For the comparison problem of de Val-Borro et al. (2006), the small mesh size (384×128) yields somewhat smaller ratios. For these benchmarks, the CPU code is always compiled with gcc, using the -O3 and -ffast-math options.
Graphics card | CPU | Speed-up ratio | Platform
8600M GT | Intel(R) Core(TM) 2 Duo@2.4 GHz | 9.5x | MacBook Pro (2007)
Quadro FX5800 | Opteron AMD 2380@2.5 GHz | 90x | Clusters at CEA
Tesla C1070 | Intel(R) Xeon(TM) 5570@2.93GHz | 30x | Cluster at CEA
GeForce GTX 285 | Intel(R) Core(TM)2 Duo E6750@2.66GHz | 61x | Linux workstation
The standard test of a Neptune-mass planet embedded in an inviscid disk (see de Val-Borro et al. (2006)) runs in 4 minutes and 30 seconds on a MacBook Pro (2007) equipped with the NVIDIA GeForce 8600M GT chip. It runs in 2 minutes and 50 seconds on a more recent MacBook Pro equipped with the 330M GT chip, and in 92 seconds on a GTX 285. The double-precision version for CUDA 5.0 runs in 2 minutes and 39 seconds on a 650M chip (such as those found in 2012-2013 MacBook Pros), and in 52 seconds on a Tesla C2050.
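
To compare the run times quoted above, they can be converted to seconds and expressed relative to a chosen baseline. The following is only a minimal sketch of that arithmetic: the dictionary names and the helper `relative_speedups` are hypothetical and are not part of GFARGO. The single-precision and double-precision runs are kept separate, since their timings are not directly comparable.

```python
# Wall-clock times quoted in the text, converted to seconds.
# Chip names are copied from the text; the variable names are illustrative.
single_precision = {
    "8600M GT": 4 * 60 + 30,   # 4 minutes 30 seconds
    "330M GT": 2 * 60 + 50,    # 2 minutes 50 seconds
    "GTX 285": 92,
}
double_precision = {
    "650M": 2 * 60 + 39,       # 2 minutes 39 seconds
    "Tesla C2050": 52,
}


def relative_speedups(times, baseline):
    """Return, for each entry, how many times faster it ran than the baseline."""
    ref = times[baseline]
    return {name: ref / t for name, t in times.items()}


# Ratios relative to the slowest chip in each precision group.
sp = relative_speedups(single_precision, "8600M GT")
dp = relative_speedups(double_precision, "650M")

for name, ratio in {**sp, **dp}.items():
    print(f"{name}: {ratio:.2f}x")
```

These ratios compare GPUs with one another on the small 384×128 test problem. They are distinct from the GPU-to-CPU speed-up ratios in the table, which are measured in the limit of a large mesh.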