Earth Simulator Project, a Major Initiative of High Performance Super Computer Development in Japan
Keiji Tani, Earth Simulator Research and Development Center


The Science and Technology Agency of Japan has proposed a project to promote studies of global change prediction through an integrated three-in-one research and development approach: earth observation, basic research, and computer simulation. Basic process studies and observation of global change are clearly very important. Most of these basic processes, however, are tightly coupled and form a typical complex system. A large-scale simulation that takes the coupling between these basic processes into account is the only way to understand such complicated phenomena completely. As part of the project, we are developing an ultra-fast computer named the "Earth Simulator". The Earth Simulator has two important target areas: applications in atmospheric and oceanographic science, and applications in solid earth science. For the first, high-resolution global, regional, and local models will be developed; for the second, a global dynamic model describing the entire solid earth as a system, a simulation model of the earthquake generation process, and similar models will be developed.
Taking a global AGCM (atmospheric general circulation model) as an example, we consider here the computational resources required of the Earth Simulator. A typical present-day global AGCM uses a mesh of about 100 km in both the longitudinal and latitudinal directions. The mesh size will be reduced to 10 km in the high-resolution global AGCM on the Earth Simulator. The number of layers will also be increased to between several and 10 times that of the present model. The time integration step must be reduced in accordance with the finer resolution. Taking all these conditions into account, both the CPU performance and the main memory of the Earth Simulator must be at least 1000 times larger than those of present computers. The effective performance of typical present computers is about 4-6 GFLOPS. We therefore set the target sustained performance of the Earth Simulator for a high-resolution global AGCM at more than 5 TFLOPS.
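The scaling argument above can be illustrated with a rough count of grid points and time steps. The following Python sketch is not the derivation used for the project: the function name is hypothetical, and the assumption that the time step shrinks in proportion to the mesh size is introduced here only for illustration.

```python
def refinement_factors(old_mesh_km, new_mesh_km, layer_factor):
    """Return (memory_factor, compute_factor) for a refined AGCM grid.

    Assumption (not stated in the text): the time step shrinks in
    proportion to the horizontal mesh size.
    """
    ratio = old_mesh_km / new_mesh_km
    horizontal = ratio ** 2              # finer mesh in longitude and latitude
    memory = horizontal * layer_factor   # growth in grid points to store
    compute = memory * ratio             # grid points times extra time steps
    return memory, compute


# 3 is an illustrative stand-in for "several"; 10 is the upper end quoted above.
for layer_factor in (3, 10):
    memory, compute = refinement_factors(100, 10, layer_factor)
    print(f"layers x{layer_factor}: memory x{memory:.0f}, compute x{compute:.0f}")
```

Under these assumptions, ten times as many layers gives roughly 1000 times as many grid points, in line with the order of magnitude quoted above, and the computational work grows by a further factor from the shorter time step.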
Reviewing the trends in commercial parallel computers, we can consider two types of parallel architecture for the Earth Simulator: a distributed parallel system with cache-based microprocessors, and a system with vector processors. A performance evaluation of a well-known AGCM (CCM2) shows that the efficiency, defined as the ratio of sustained performance to theoretical peak, is less than 7% on cache-based parallel systems. On parallel systems with vector processors, by contrast, an efficiency of about 30% was obtained. For this reason, we decided to employ a distributed parallel system with vector processors.
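Since efficiency is the ratio of sustained to peak performance, the peak performance needed for a given sustained target follows by division. The short Python sketch below applies this to the 5 TFLOPS target and the two CCM2 efficiency figures quoted above; the function name is hypothetical, and the 7% value is used as the upper bound for cache-based systems.

```python
def required_peak_tflops(sustained_tflops, efficiency):
    """Peak performance needed to sustain a target at a given efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return sustained_tflops / efficiency


TARGET_TFLOPS = 5.0  # sustained target for the high-resolution global AGCM

cases = (
    ("cache-based, at most 7%", 0.07),
    ("vector, about 30%", 0.30),
)
for label, efficiency in cases:
    peak = required_peak_tflops(TARGET_TFLOPS, efficiency)
    print(f"{label}: peak of {peak:.1f} TFLOPS needed")
```

At these efficiencies, a cache-based system would need a peak of more than about 71 TFLOPS to sustain 5 TFLOPS, whereas a vector system would need about 17 TFLOPS.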
Another key issue for a parallel system is the interconnection network. As mentioned above, many different types of applications will run on the Earth Simulator. To keep the parallelism flexible across these applications, we employ a single-stage crossbar network, which makes the system completely flat.
An outline of the hardware system of the Earth Simulator can be summarized as follows:
o Architecture : MIMD-type distributed-memory parallel system consisting of computing nodes, each a shared-memory vector multiprocessor.
o Performance : Peak performance of 40 TFLOPS; assuming an efficiency of 12.5%, the effective performance for an AGCM is more than 5 TFLOPS.
o Total number of processor nodes : 640
o Number of PEs for each node : 8
o Total number of PEs : 5120
o Peak performance of each PE : 8 GFLOPS
o Peak performance of each node : 64 GFLOPS
o Main memory : 10 TB (total).
o Shared memory / node : 16 GB
o Interconnection network : Single-Stage Crossbar Network.
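The figures in this outline are mutually consistent, as the following Python sketch checks. The constant names are hypothetical, and the conversion of 1 TB = 1024 GB is an assumption made so that the total matches the rounded 10 TB figure.

```python
NODES = 640            # processor nodes
PES_PER_NODE = 8       # PEs in each node
PE_PEAK_GFLOPS = 8     # peak performance of one PE
NODE_MEMORY_GB = 16    # shared memory per node
EFFICIENCY = 0.125     # assumed ratio of sustained to peak performance

total_pes = NODES * PES_PER_NODE
node_peak_gflops = PES_PER_NODE * PE_PEAK_GFLOPS
system_peak_tflops = NODES * node_peak_gflops / 1000
sustained_tflops = system_peak_tflops * EFFICIENCY
total_memory_tb = NODES * NODE_MEMORY_GB / 1024  # assumes 1 TB = 1024 GB

print(f"PEs: {total_pes}, node peak: {node_peak_gflops} GFLOPS")
print(f"system peak: {system_peak_tflops:.2f} TFLOPS, "
      f"sustained: {sustained_tflops:.2f} TFLOPS")
print(f"main memory: {total_memory_tb:.0f} TB")
```

The exact peak is 40.96 TFLOPS, quoted above as 40 TFLOPS, and the corresponding sustained figure at 12.5% efficiency is 5.12 TFLOPS, which meets the target of more than 5 TFLOPS.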
The conceptual and basic designs for the Earth Simulator were completed during the last fiscal year. R&D on parts and packaging is under way and will be completed at the end of March, followed by the detailed design. Manufacture of the system will start at the beginning of 2000, and the system will be completed in the spring of 2002.