Currently, a number of people are using GRAPE systems for SPH simulations (see, e.g., [Ste96]). In these simulations, GRAPE is used to calculate gravity and to construct the list of neighbor particles. The host handles the SPH interaction between neighbors. The calculation of the SPH interaction consumes a fairly large fraction of the total CPU time.
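As a rough illustration of the host-side work described above, the following sketch evaluates the SPH density sum over a neighbor list. It is not the code used in the cited simulations: the cubic spline kernel, the fixed smoothing length `h`, and the function names are assumptions made here, and the brute-force `find_neighbors` merely stands in for the neighbor list that GRAPE would return. The density sum is only one part of the SPH interaction; pressure forces are evaluated over the same neighbor lists.

```python
import math

def cubic_spline_kernel(r, h):
    """Standard 3D cubic spline SPH kernel with support radius 2h (assumed here)."""
    q = r / h
    norm = 1.0 / (math.pi * h**3)
    if q <= 1.0:
        return norm * (1.0 - 1.5 * q**2 + 0.75 * q**3)
    if q <= 2.0:
        return norm * 0.25 * (2.0 - q)**3
    return 0.0

def find_neighbors(positions, h):
    """Brute-force stand-in for the neighbor list that GRAPE would supply."""
    n = len(positions)
    return [[j for j in range(n)
             if math.dist(positions[i], positions[j]) <= 2.0 * h]
            for i in range(n)]

def sph_density(positions, masses, neighbors, h):
    """Host-side step: sum kernel-weighted masses over each neighbor list."""
    return [sum(masses[j] * cubic_spline_kernel(
                    math.dist(positions[i], positions[j]), h)
                for j in neighbors[i])
            for i in range(len(positions))]

# Hypothetical three-particle example: two close particles and one isolated.
positions = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
masses = [1.0, 1.0, 1.0]
h = 1.0
neighbors = find_neighbors(positions, h)
density = sph_density(positions, masses, neighbors, h)
```

Each particle appears in its own neighbor list, so the isolated particle still receives its self-contribution. The cost per particle scales with the length of its neighbor list, which is why this part is cheaper than the gravity calculation but still significant on the host.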
If SPH interaction can also be handled by some specialized
hardware like GRAPE, we can achieve further speedup. The speedup
we can achieve is not very large, typically around a factor of 10
or less. This is because the calculation cost of SPH interaction
is still not as large as that of gravity. On the other
hand, this fact implies that we do not need very fast hardware.
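The bound on the achievable speedup can be illustrated with Amdahl's law. The sketch below assumes, purely for illustration, that a fraction `sph_fraction` of the run time is spent on the SPH interaction and that only this part is accelerated. The value 0.9 and the function name are hypothetical and are not taken from any measurement.

```python
def overall_speedup(sph_fraction, hardware_factor):
    """Amdahl's law: only the SPH part of the run time is accelerated."""
    remaining = (1.0 - sph_fraction) + sph_fraction / hardware_factor
    return 1.0 / remaining

# Hypothetical example: 90% of the run time is spent in the SPH interaction.
for factor in (10, 20, 100, 1000):
    print(factor, round(overall_speedup(0.9, factor), 2))
```

Under this assumption the overall gain stays below 10 however fast the specialized hardware is, and most of that gain is already reached with a modest hardware factor. This is the sense in which very fast hardware is not required.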
In the GRAPE-6 project, we will try a relatively new approach,
so-called ``reconfigurable computing'' [BA96], to
accelerate SPH and similar applications. An alternative is to
develop hardware specialized for SPH
[YOT96]. However, whether the high initial cost of
a custom LSI can be justified by a relatively modest speedup is not
clear. The basic idea of ``reconfigurable computing'' is to use a
programmable LSI chip (a field-programmable gate array, or FPGA) to
implement (part of) an application. Currently, FPGAs with a nominal
gate count of 100,000 are available. This number is about a
factor of 100 smaller than that for a full custom LSI, but may be
sufficient to implement a single pipeline for SPH calculation.
Readers interested in FPGAs and reconfigurable computing are referred to [BA96]. The bottom line is that we may be able to use FPGAs as pipeline processors that are more flexible than hardwired GRAPE pipelines and, at the same time, achieve better price-performance than programmable general-purpose computers. Of course, it is also true that reconfigurable computing is neither as flexible as a general-purpose computer nor as efficient as GRAPE. Thus, it cannot directly compete with either of them. However, for the part of the computation that is relatively time consuming, but much less so than the gravitational force calculation, reconfigurable computing would offer an ideal solution.
Thus, GRAPE-6 might become a heterogeneous computer with three, not two, components (figure 4). We may be able to use the reconfigurable part for various applications, such as the calculation of the van der Waals force in molecular dynamics and the evaluation and shifting of spherical harmonics in the fast multipole method.
Figure 4: Extended GRAPE architecture