SUPERCOMPUTERS, NETWORKS, AND STORAGE

DAN Project Dive Safety (PDS) is collecting dive profiles and biomedical information for reliably estimating decompression illness (DCI) within statistical models. Some million dives will be entered into a recreational database. High performance computing environments with supercomputers are necessary to process, transfer, fit, and store data efficiently and in a timely fashion. Maximum likelihood treatment of DCI incidence with multivariate analysis borders the Grand Challenge problem category for supercomputers today, overtaxing less powerful platforms. Within present and available very high performance computing environments, the computational demands of PDS can be met in an efficient and timely manner.

Computing technology has made incredible progress in the past 50 years. In 1945, there were no stored program computers. Today, a few thousand dollars will purchase a desktop personal computer with more performance, more memory, and more disk storage than a million dollar computer in 1965. This rapid rate of improvement has come from advances in technology used to build the computer and from innovation in computer design. Performance increase is sketched in Figure 1, in terms of a nominal 1965 minicomputer. Performance growth rates for supercomputers, minicomputers, and mainframes are near 20\% per year, while the performance growth rate for microcomputers is closer to 35\% per year. Supercomputers are the most expensive, ranging from one to tens of millions of dollars, and microprocessors are the least expensive, ranging from a few thousand to tens of thousands of dollars. Supercomputers and mainframes are usually employed in high end, general purpose, compute intensive applications. Minicomputers and microprocessors address the same functionality, but often in more diverse roles and applications. The latter class of computers is usually more portable, because they are generally smaller in size. They are on your desktop.

The label supercomputer usually refers to the fastest, biggest, and most powerful computer in existence at any time. In the 1940s, supercomputers were employed in the design of nuclear weapons (as still today). In the 1950s, supercomputers were first used in weather forecasting, while in the 1960s, computational fluid dynamics problems in the aerospace industry were solved on supercomputers. In the 1970s, 1980s, and 1990s seismological data processing, oil reservoir simulation, structural analysis of buildings and vehicles, quantum field theory, circuit layout, econometric modeling, materials and drug design, brain tomography and imaging, molecular dynamics, global climate and ocean circulation modeling, and semiconductor fabrication joined the supercomputing revolution. Very few areas in science and engineering have not been impacted by supercomputers. Diving is still on the fringes of supercomputing, but applications are growing, particularly in the areas of dive profile analysis, statistics, data management, and biomodeling. Smaller and less powerful computers are now employed for monitoring, controlling, directing, and analyzing dives, divers, equipment, and environments. Wrist computers perform rudimentary decompression calculations and stage ascents with mostly Haldane models.

Operational supercomputers today process data and perform calculations at rates of 10^9 floating point operations per second (gigaflops), that is, 10^9 adds, subtracts, multiplies, or divides per second. At the leading edge today, and in the marketplace, are shared memory processors (SMPs) providing users with 10^12 floating point operations per second (teraflops), impressively opening yet another age in computational science. These machines are massively parallel processors (MPPs), involving thousands of computing nodes processing trillions of data points. To support these raw computing speeds, networks transmitting data at gigabits/sec, and fast storage exchanging terabytes of information over simulation times, are also requisite. Ultrafast, high resolution graphics servers, able to process voluminous amounts of information, offer an expeditious means to assess output data. Differences in raw processing speeds between various components in a high performance computing environment can degrade overall throughput, a condition termed latency, or simply, manifest time delay in processing data. Latencies are parasitic to sustained computing performance. Latencies develop at the nodes connecting various computer, storage, network, terminal, and graphics devices, simply because of impedance mismatch in data handling capabilities.

Obviously, computers work on processing information, doing calculations, and fetching and storing data in steps. A set of operations, performed in sequential fashion by one processor, is termed serial. A set of operations performed in any fashion, by any number of processors, is roughly termed parallel. Serial computing architectures, once the standard, are now being replaced by parallel computing architectures, with anywhere from tens to thousands of central processing units (CPUs). Processors themselves can be scalar, or vector, that is, operating on a single entity, or group of entities (numbers).
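The distinction is easy to see in practice. As a rough illustration, and only by analogy, the sketch below (Python with the numpy library) contrasts a serial loop, which consumes one number at a time, with a vectorized sum handed to optimized array code; on vector or parallel hardware, the same array operation would be spread across vector units or many processors. The array size and timing harness are arbitrary choices for the illustration.

```python
import numpy as np
import time

# One million operands, as a stand-in for a large data array.
x = np.random.rand(1_000_000)

# Serial: one scalar operation at a time, in sequence.
t0 = time.time()
total_serial = 0.0
for value in x:
    total_serial += value
serial_time = time.time() - t0

# Vectorized: the whole array is handed to optimized code that can
# exploit vector units (and, on parallel hardware, many processors).
t0 = time.time()
total_vector = x.sum()
vector_time = time.time() - t0

print(f"serial  sum {total_serial:.3f} in {serial_time:.4f} s")
print(f"vector  sum {total_vector:.3f} in {vector_time:.4f} s")
```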

The architectural feature associated with supercomputers in the 1970s was vector processing. Vector processing allowed large groups of numbers, or vectors, to be processed in parallel, resulting in performance speedups by factors of ten or more (compared to generational improvements on the order of 2 or 3). In the early 1980s, parallel supercomputing was introduced, allowing multiple processors to work concurrently on a single problem. By the end of the century, significantly greater computing parallelism (combining tens of thousands of processing units perhaps), and architectures that integrate modalities, such as numeric and symbolic processing, may be possible. As in the past, software developments on future state of the art supercomputers will probably trail hardware advances, perhaps with increasing distance due to increasingly more complex superparallel systems.

Networks are the backbone of modern computer systems. Supercomputers without high speed communications links and network interfaces are degraded in application processing speed, limited by the slowest component in the computing platform. Gigaflop computers need gigabit/sec network transmission speeds to expedite the flow of information. Data, voice, image, and full motion video can be digitally encoded, and sent across a variety of physical media, including wire, fiber optics, microwaves, and satellites. The assumption is that all information transmitted will be digital. The greater the number of systems, people, and processes that need to transmit information to one another, the greater the speeds and bandwidths required. Like water in a pipe, to get more information through a network, one can increase the rate of flow (speed), and/or increase the amount that can flow through cross sectional area (bandwidth). Applications under development today presage the need to transfer data very quickly tomorrow. To perform as a utility, that is, usefully communicate anything, anytime, anywhere, a network must possess four attributes

1. connectivity -- ability to move information regardless of the diversity of the media;

2. interoperability -- ability of diverse intelligent devices to communicate with one another;

3. manageability -- ability to be monitored, and to change with applications and devices;

4. distributed applications and connective services -- ability to provide easy access to tools, data, and resources across different computing platforms, or organizations.


Commercial telecommunications links (modem connections to the Internet) are extremely slow, in the vicinity of 10 kilobits/sec to 56 kilobits/sec. Even dedicated communications lines are low speed, that is, T1 and T3 links (roughly 1.5 megabits/sec and 45 megabits/sec, respectively), and cannot feed supercomputers with information fast enough to support economical processing. The 4 terabytes from a seismic map of an oil field in the Gulf (8 square miles) would take about 3 - 4 days to transmit from one site to another for processing. The 1 million dive profiles projected in DAN Project Dive Safety stack up to hundreds of gigabytes, depending on resolution.
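As a back of the envelope check on such transfer times, the sketch below simply divides data volume by line rate. The 500 gigabyte file size is a placeholder of the order of the projected PDS profile data, the link rates are the nominal figures quoted above, and real transfers would be further slowed by protocol overhead and contention.

```python
def transfer_time_days(size_gigabytes, rate_megabits_per_sec):
    """Idealized transfer time: data volume divided by line rate."""
    bits = size_gigabytes * 8.0e9                 # gigabytes -> bits
    seconds = bits / (rate_megabits_per_sec * 1.0e6)
    return seconds / 86400.0                      # seconds -> days

# Nominal link speeds quoted in the text (approximate).
links = {"56 kbit/s modem": 0.056, "T1": 1.5, "T3": 45.0}

# A few hundred gigabytes, the order of the projected PDS profile data.
for name, rate in links.items():
    print(f"{name:>16s}: {transfer_time_days(500.0, rate):8.1f} days")
```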

Advances in massively parallel, large memory computers and high speed networks have created computing platforms, depicted in Figure 2, which allow researchers to execute supercodes that generate enormous data files. Supercomputing environments like that depicted in Figure 2 can be found in large Universities, National and Regional Laboratories, dedicated Commercial Computing Centers, and various Governmental Agencies. The one in Figure 2 is the superplatform at the Los Alamos National Laboratory. These facilities are available to the commercial user, and computing costs range from \$100 - \$300 per hour on vector supercomputers (YMP, T90, J90) to \$1 - \$4 per node per hour on massively parallel supercomputers (CM5, T3D, SP2 Cluster).

Supercodes generate enormous amounts of data, and a typical large application will generate from tens of gigabytes up to several terabytes of data. Such requirements are one to two orders of magnitude greater than the comfortable capacities of present generation storage devices. New high performance data systems (HPDS) are online to meet these very large data storage and handling requirements. Systems consist of fast, large capacity storage devices that are directly connected to a high speed network, and managed by software distributed across workstations. Disk devices are used to meet high speed and fast access requirements, while tape devices are employed to meet high speed and high capacity requirements. Storage devices usually have a dedicated workstation for storage and device management, and to oversee data transfer. Put simply, computer systems use a hierarchy to manage information storage

1. primary storage -- fast, solid state memory contained in the processor;

2. direct access storage -- magnetic or optical disks, connected to the processor, providing fast access;

3. sequential access storage -- magnetic tape cassettes or microfilm, providing large capacity.

Transfer rates in fast HPDS systems are presently near 800 megabits/sec. Moving down the hierarchy, access time goes up, storage capacity increases, and costs decrease. Today, of all computing components, the cost of storage is decreasing the most rapidly. A few hundred dollars will buy gigabyte hard drives for your PC. Renting storage commercially is also cheap (\$20 per gigabyte per month).

GRAND CHALLENGE PROBLEMS
Grand Challenge problems are computational problems requiring the fastest computers, networks, and storage devices in existence, and problems whose solutions will have tremendous impact on the economic well being of the United States. Vortices in combustion engines, porous flow in oil bearing substrates, fluid turbulence, three dimensional seismic imaging, ductile and brittle fracture of materials under stress, materials by computer design, global convection of tectonic plates, geomagnetic field generation, ocean and atmospheric circulation, high impact deformation and flow, air and groundwater pollution, global climate modeling, elastic-plastic flow, brain tomography, HIV correlations, bubble generation and cavitating flow, and many others are just such problems. Statistical modeling coupled to maximum likelihood for millions of trials, as employed to estimate DCI incidence in DAN Project Dive Safety, borders and pushes the Grand Challenge computational problem category, particularly as the number of model fit parameters increases beyond 5.

The scale of computational effort for nominal Grand Challenge problems can be gleaned from Table 1, listing floating point operations, computer memory, and data storage requirements. As a reference point, the 6 million volumes in the Library of Congress represent 24 terabytes of information. The simulations listed in Table 1 run for many hours on the CM5, the Thinking Machines Corporation (TMC) massively parallel supercomputer. The CM5 is a 1024 node (Sparc processors) MPP supercomputer, with 32 gigabytes of fast memory, access to 450 gigabytes of disk storage, and a peak operational speed of 128 gigaflops. On the next (teraflops) generation supercomputers, simulation times are expected to drop to many minutes.

 

Table 1. Grand Challenge problem requirements: floating point operations, memory, and storage.

Problem                       Description               Operations   Memory        Storage
                                                        (number)     (terabytes)   (terabytes)
probabilistic decompression   DCI maximum likelihood    10^14        .030          .450
porous media                  3D immiscible flow        10^18        1             4
ductile material              3D molecular dynamics     10^18        .30           3
ductile material              3D material hydro         10^18        1             20
plasma physics                numerical tokamak         10^18        1             100
global ocean                  century circulation       10^17        4             20
brain tomography              3D rendering              10^15        .015          .001
quantum dynamics              lattice QCD               10^18        .008          .008

Scientific advance rests on the interplay between theory and experiment. Computation closes the loop between theory and experiment in quantitative measure. Theory provides the framework for understanding. Experiment and data provide the means to verify and delineate that understanding. Although many disciplines rely on observational data (astronomy, geology, and paleontology, for instance), the hallmark of scientific endeavor is experiment. Clearly, the power of experimental science is its ability to design the environment in which data is gathered. And it is in the design process that modern computers play an important role.

While many believe that good experimentation depends on the skill and imagination of the designer, this is not entirely true. Insight and experience are certainly desirable to determine and optimize measurable response and procedures, but once this has been determined, it is the mathematics that dictates experimental structure, as detailed by Fisher some 70 years ago in noting that the real world is

1. noisy -- repeating an experiment under identical conditions yields different results;

2. multivariate -- many factors potentially affect phenomena under investigation;

3. interactive -- the effect of one factor may depend on the level of involvement of other factors.


Computers permit extension and analysis of experimental design methodology to problems for which only crude prescriptions have been hitherto available. Computer software is now widely and economically available to automate the basic and most useful procedures. This allows the user without extensive statistical background to routinely employ methods to optimize design.

Certainly, performing numerical experiments on computers, that is, leveraging model predictions to gain insight into phenomena under study, can often provide results that give the best possible estimate of overall experimental response and behavior. The approach here is to use the smallest possible subsets of inputs to run the simulation model, thereby narrowing the focus. In designing experiments, Monte Carlo simulations are used in high energy and accelerator physics, semiconductor fabrication, material damage, neutron and photon shielding, and biomedical dose. Large deterministic modules, in excess of 100,000 lines of code, on the other hand, have been applied to the design of laser fusion target experiments. Similarly, atomistic simulations with millions and, in the future, billions of test atoms provide the opportunity for both fundamental and technological advances in material science. Nonequilibrium molecular dynamics calculations address basic scientific issues, such as interaction potentials and plastic flow. The interaction potentials developed in the last decade for metals, alloys, and ceramics can be used to model prototypical hardness experiments, such as crystal indentation. The underlying mechanisms for plastic flow are microscopic crystal defect motions, and molecular dynamics calculations yield quantitative estimates for hardness experiments. Linkages between experiment and supercomputer modeling are growing in scope and number.

Monte Carlo Bubble Simulations
Monte Carlo calculations explicitly employ random variates, coupled to statistical sampling, to simulate physical processes and perform numerical integrations. In computational science, Monte Carlo methods play a special role because of their combination of immediacy, power, and breadth of application. The computational speed and memory capacity of supercomputers have expedited solutions of difficult physical and mathematical problems with Monte Carlo statistical trials. Although Monte Carlo is typically used to simulate a random process, it is frequently applied to problems without immediate probabilistic interpretation, thus serving as a useful computation tool in all areas of scientific endeavor. Applied to bubble formation and tissue-blood interactions, Monte Carlo methods are truly powerful supercomputing techniques.

The Monte Carlo method is different than other techniques in numerical analysis, because of the use of random sampling to obtain solutions to mathematical and physical problems. A stochastic model, which may or may not be immediately obvious, is constructed. By sampling from appropriate probability distributions, numerical solution estimates are obtained. Monte Carlo calculations simulate the physical processes at each point in an event sequence. All that is required for the simulation of the cumulative history is a probabilistic description of what happens at each point in the history. This generally includes a description of the geometrical boundaries of regions, a description of material composition within each region, and the relative probability (functional) for an event. With high speed computers, millions of events can be generated rapidly to provide simulation of the processes defined by the probability function. Statistically, the accuracy of the simulation increases with number of events generated.
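A minimal example of the method, in the spirit of the numerical integration application mentioned above, is sketched below in Python. The integrand and sample counts are arbitrary choices for illustration; the point is only that the estimate converges statistically, with error falling roughly as one over the square root of the number of trials.

```python
import random
import math

def monte_carlo_integral(f, a, b, n_events):
    """Estimate the integral of f on [a, b] by uniform random sampling."""
    total = sum(f(random.uniform(a, b)) for _ in range(n_events))
    return (b - a) * total / n_events

# The integral of sin(x) on [0, pi] is exactly 2; watch the error shrink
# roughly as 1/sqrt(N) as the number of statistical trials grows.
for n in (10**3, 10**4, 10**5, 10**6):
    estimate = monte_carlo_integral(math.sin, 0.0, math.pi, n)
    print(f"N = {n:>8d}   estimate = {estimate:.4f}   error = {abs(estimate - 2.0):.4f}")
```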

The generation of cavitation nuclei in tissue can be effected with Monte Carlo techniques, using the Gibbs potential (bubble formation energy) across liquid-vapor interfaces as a probability function, with bubble radius as the random variable. Surrounded by dissolved gas at higher tension for any ambient pressure, bubbles generated can be tracked through growth and collapse cycles in time, allowed to move with surrounding material, coalesced with each other, and removed at external boundaries. Cavitation simulations are applied to multiphase flow in nuclear reactor vessels, cavitation around ship propellers, bubbles in gels, cloud and ice condensation processes in the atmosphere, cosmic ray tracking in chambers, and boiling processes in general.
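A toy version of such a nucleation sampler is sketched below. It uses the classical surface-minus-volume form of the bubble formation energy as a relative probability weight for radius, sampled by simple rejection; the dimensionless coefficients are placeholders chosen only to keep the energy barrier a couple of kT high, and are not fitted cavitation constants from any DAN or tissue model.

```python
import math
import random

# Dimensionless Gibbs formation energy for a bubble of radius r (arbitrary
# units): surface term minus volume term, expressed in units of kT.  The
# coefficients are placeholders giving a barrier of about 2 kT at r = 1.
A_SURF, B_VOL = 6.0, 4.0

def gibbs(r):
    return A_SURF * r**2 - B_VOL * r**3

def sample_radius(r_max=1.4, n_trials=200_000):
    """Rejection-sample bubble radii with relative weight exp(-gibbs(r))."""
    radii = []
    for _ in range(n_trials):
        r = random.uniform(0.0, r_max)
        if random.random() < math.exp(-gibbs(r)):   # weights stay <= 1 on [0, r_max]
            radii.append(r)
    return radii

radii = sample_radius()
print(f"accepted {len(radii)} nuclei, mean radius {sum(radii) / len(radii):.3f}")
```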

Two Phase Porous Flow
Numerical simulation of oil-water fluid flows is a challenging problem, due in part to the complexities of interfacial dynamics, and also because of the complexities of geometry. Rather than testing in the field, many oil companies have turned their efforts to the numerical study of pore spaces in oil bearing rock, with high resolution, three dimensional, X-ray scans. Traditional numerical methods have been applied to problems with simple boundaries, but none of the methods apply successfully to the arbitrary geometries found in porous media. Recent emergent numerical techniques on supercomputers, such as derivatives of cellular automata, have demonstrated such capability. Using such cellular methods, it is now possible to study the interactions between oil-water systems and porous rock media.

HIV Analysis
Research directed at either finding a cure or vaccine for AIDS is hampered by the extreme variability of the viral genome. Because of this variability, it is difficult to identify targets for drug and vaccine design. Exploiting the speed of modern supercomputers, methods have been developed to test for potentially distant regions in viral proteins that interact. Identification of interacting sites can be used by experimentalists in finding a vaccine or drug preventing infection or death. Linked positions imply biological correlation of functionality, and are important sites within the virus. A map of interaction zones can be used by experimentalists trying to track and define functional regions of the virus. Such maps can be generated rapidly, and in three dimensions, on modern computing platforms with graphics capabilities.
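One common covariation measure used for this kind of screening is the mutual information between pairs of alignment columns, with positions that mutate together scoring high. The sketch below applies it to a tiny synthetic alignment and is only illustrative; the alignment, the reporting threshold, and the choice of measure are stand-ins, not the specific supercomputer analysis described above.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    pa, pb = Counter(col_a), Counter(col_b)
    pab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), nab in pab.items():
        p_ab = nab / n
        mi += p_ab * math.log2(p_ab / ((pa[a] / n) * (pb[b] / n)))
    return mi

# Toy alignment: 8 sequences by 5 positions.  Columns 1 and 3 co-vary
# (placeholder data, not real HIV sequences).
alignment = ["ARNDC", "AKNEC", "GRNDC", "GKNEC",
             "ARNDC", "AKNEC", "GRQDC", "GKQEC"]
columns = list(zip(*alignment))

for i, j in combinations(range(len(columns)), 2):
    mi = mutual_information(columns[i], columns[j])
    if mi > 0.5:                              # arbitrary reporting threshold
        print(f"positions {i} and {j}: MI = {mi:.2f} bits")
```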

Groundwater Remediation
Groundwater contamination occurs commonly throughout the world. According to recent estimates, cleanup costs in the US alone approach \$1 trillion. Hence, any information or analysis that provides even minor cost savings for a single site can have significant impact overall if the information is transferable to disparate sites. Computational experiments performed on modern supercomputers are useful for understanding the complex chemical migration and transformation processes that occur when hazardous substances are released into heterogeneous groundwater systems in a variety of quiescent states. Simulations of this sort provide an alternative basis to study detailed behavior under natural and engineered conditions.

Combustion Vortex Interactions
Experiments have shown that inducing rotational motions (vortices) in the gases of internal combustion engines enhances both turbulence and combustion efficiency. Combustion efficiency is improved because the rotational kinetic energy breaks down into fluid turbulence when the piston approaches the cylinder head. Although a qualitative understanding of the dynamics of vortices has already been obtained, supercomputing power provides the precision and speed to determine when and where vortices develop in combustion engines, questions hitherto obscure to the engine designers.

Molecular Dynamics
Material phenomena, such as fracture, dislocation, plasticity, ablation, stress response, and spall are important to the development and manufacture of novel materials. Molecular dynamics simulations on supercomputers, providing resolution on the micron scale, employ millions of interacting molecules to represent states of matter. In such calculations, each molecule moves in the collective force field of all other molecules, and molecular motions of all particles are tracked. This is atomistic physics at the most basic level of interaction.
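A stripped down version of such a calculation is sketched below: a handful of Lennard-Jones particles in reduced units, each feeling the summed forces of all the others, advanced with the velocity Verlet integrator. The particle count, lattice spacing, and time step are arbitrary toy values; production molecular dynamics codes differ mainly in scale, force fields, and parallel decomposition.

```python
import numpy as np

# Minimal molecular dynamics in reduced Lennard-Jones units: every particle
# moves in the collective force field of all the others, integrated with
# the velocity Verlet scheme.
rng = np.random.default_rng(seed=7)
side, spacing = 4, 2.0                       # 4 x 4 x 4 = 64 particles on a cubic lattice
box = side * spacing
grid = (np.arange(side) + 0.5) * spacing
pos = np.array([[x, y, z] for x in grid for y in grid for z in grid])
vel = rng.normal(0.0, 0.5, size=pos.shape)
vel -= vel.mean(axis=0)                      # remove center of mass drift
n, dt, steps = len(pos), 0.005, 1000

def forces(pos):
    """Pairwise Lennard-Jones forces and potential, minimum image convention."""
    f, pot = np.zeros_like(pos), 0.0
    for i in range(n - 1):
        d = pos[i] - pos[i + 1:]             # vectors to all later particles
        d -= box * np.round(d / box)         # minimum image in the periodic box
        r2 = np.maximum((d * d).sum(axis=1), 0.64)   # guard rare close approaches
        inv6 = 1.0 / r2**3
        pot += np.sum(4.0 * (inv6**2 - inv6))
        pair = (24.0 * (2.0 * inv6**2 - inv6) / r2)[:, None] * d
        f[i] += pair.sum(axis=0)
        f[i + 1:] -= pair                    # Newton's third law
    return f, pot

f, pot = forces(pos)
for step in range(steps):
    vel += 0.5 * dt * f                      # first half kick
    pos = (pos + dt * vel) % box             # drift, wrapped into the periodic box
    f, pot = forces(pos)
    vel += 0.5 * dt * f                      # second half kick
    if step % 200 == 0:
        print(f"step {step:4d}  kinetic {0.5 * np.sum(vel * vel):8.2f}  potential {pot:8.2f}")
```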

Supercomputers open up new realms for investigation and enable greater problem domains to be considered. Researchers can develop solutions that treat entire problems from first principles, building from the interactions at the atomic level all the way up to the macroscopic. As tools of researcher imagination, supercomputers leave new insights and approaches to problem solving unconstrained.

PROBABILISTIC DECOMPRESSION MODELING AND MAXIMUM LIKELIHOOD
Maximum likelihood is a statistical technique used to fit model equations to a sample, given relative probabilities for occurrence and non-occurrence. We can never measure any physical variable exactly, that is, without error. Progressively more elaborate experiments or theoretical representations only reduce the error in the determination. In extracting parameter estimates from data sets, it is also necessary to minimize the error (data scatter) in the extraction process. Maximum likelihood is one such technique applied to probabilistic decompression modeling.

DCI is a hit, or (hopefully) no-hit situation, and statistics are binary, as in coin tossing. As a random variable, DCI incidence is a complicated function of many physical variables, such as inert gas buildup, VGE counts, pressure reduction on decompression, volume of separated gas, number of bubble seeds, gas solubility in tissue and blood, ascent rate, nucleation rate, distribution of growing bubble sizes, and combinations thereof. Any, and all of these, can be assigned as risk functions in probabilistic decompression modeling, and associated constants deduced in the maximum likelihood fit process.

Project Dive Safety is a DAN program to collect and analyze data on real dives in real time for profiles, behavioral, and health aspects associated with recreational diving. The study focuses on actual dives and profiles recorded by depth/time computers, and verifies the general condition of the diver up to 48 hours after exiting the water, regarding health problems. Upwards of a million dive profiles are anticipated for this study, mainly because DCI incidence is low probability and many trials are necessary for meaningful modeling, statistics, correlations, and estimates. Multivariate model equations are fitted to the dive profiles and observed DCI incidence rate using maximum likelihood, a technique which minimizes the variance in fitting equations to a recreational diving sample. The recreational data file amounts to hundreds of gigabytes, and requires gigaflop supercomputing resources for processing. A 10 parameter risk function fit to 1 million dive profiles would take about an hour on the 256 node CRI T3D, an MPP with 16 gigabytes of memory, 65 gigabytes of fast disk, and a peak speed near 38 gigaflops. Run times scale as the number of events times the number of risk function parameters squared.
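A minimal sketch of the fitting machinery is given below for a single parameter risk function, using synthetic stand-in data rather than PDS profiles. Each dive is reduced to one integrated risk number, the DCI probability is taken as p = 1 - exp(-kappa r), and the binomial log likelihood is maximized numerically; the functional form, the constant kappa, and the data are assumptions for illustration only.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(seed=1)

# Synthetic stand-in data: one "integrated risk" number per dive and a binary
# DCI outcome (True = hit, False = no hit).  Real PDS profiles would supply
# the risk integral from a decompression model.
n_dives = 100_000
risk = rng.uniform(0.0, 2.0, n_dives)
true_kappa = 0.005
hits = rng.random(n_dives) < 1.0 - np.exp(-true_kappa * risk)

def neg_log_likelihood(kappa):
    """Binomial log likelihood for p(DCI) = 1 - exp(-kappa * risk)."""
    p = 1.0 - np.exp(-kappa * risk)
    p = np.clip(p, 1e-12, 1.0 - 1e-12)        # guard the logarithms
    return -(np.sum(np.log(p[hits])) + np.sum(np.log(1.0 - p[~hits])))

fit = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 1.0), method="bounded")
print(f"fitted kappa = {fit.x:.5f}  (true value {true_kappa})")
```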

MULTILEVEL DIVE PROFILE ANALYSIS
Schemes for multilevel diving are employed in the commercial, scientific, and sport sectors. In addition to validation, questions arise as to method consistency with the formulation of the US Navy Tables on critical tension principles. One approach employs back to back repetitive sequencing, assigning groups at the start of each multilevel dive segment based on the total bottom time (actual plus residual nitrogen) of the previous segment. At times, the method allows critical tensions, other than the controlling (repetitive) 120 minute compartment tension, to be exceeded upon surfacing. In the context of the US Navy Tables, such circumstance is to be avoided. But, by tightening the exposure window and accounting for ascent and descent rates, such a multilevel technique can be made consistent with the permissible tension formulation of the US Navy Tables.

To adequately evaluate multilevel diving within any set of Tables, it is necessary to account for ascent and descent rates. While ascent and descent rates have small effect on ingassing and outgassing in slow tissue compartments, they considerably impact fast tissue compartments. Model impact is measured in nitrogen buildup and elimination in hypothetical compartments, whose halftimes denote the time to double, or halve, existing levels of nitrogen. Buildup and elimination of nitrogen is computed with Haldane tissue equations (exponential rate expressions), and critical tensions are assigned to each compartment to control diving activity and exposure time. In multilevel diving, computed tissue tensions in any and all compartments must be maintained below their critical values. This is a more stringent constraint than just controlling the 120 minute compartment tension, the approach used in the US Navy Tables for repetitive diving.
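For reference, the Haldane tissue equation alluded to above takes the standard exponential form, with $p$ the compartment tension, $p_a$ the ambient inert gas partial pressure, $p_0$ the initial tension, and $\tau$ the compartment halftime,

$$
p(t) = p_a + (p_0 - p_a)\, e^{-\lambda t} , \qquad \lambda = \frac{\ln 2}{\tau} ,
$$

and the critical tension condition simply requires $p(t) \leq M$ in every compartment throughout the dive.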

In the context of the US Navy Tables, from which many Tables with reduced nonstop time limits derive, six compartments with 5, 10, 20, 40, 80, and 120 minute halftimes limit diving through maximum tensions (M-values) of 104, 88, 72, 58, 52, and 51 fsw, respectively. The 5 and 10 minute compartments are fast, the 80 and 120 minute compartments are slow, and the others fall in between, depending on exposure profile. Dive exposure times, depths, ascent, and descent rates, affecting slow and fast compartments in a complicated manner, are virtually infinite in number, thus suggesting the need for both a supercomputer and meaningful representation of the results. A CRAY YMP supercomputer addressed the first concern, while the US Navy Tables provided a simple vehicle for representation of results.

Calculations were performed in roughly 1 minute time intervals, and 10 fsw depth increments, for all possible multilevel dives up to, and including, the standard US Navy nonstop time limits, and down to a maximum depth of 130 fsw. Ascent and descent rates of 60 fsw/min were employed. Tissue tensions in all six compartments were computed and compared against their M-values. Dives for which the M-values were not violated were stored until the end of the multilevel calculations, for further processing. Dives violating any M-value, at any point in the simulation, were terminated, and the next dive sequence was initiated. The extremes in times for permissible multilevel dives form the envelope of calculations at each depth. The envelope turns out to be very close to the NAUI nonstop limits for the US Navy Tables, that is, the Tables shown in Figure 3. Within a minute, on the conservative side, the envelope tracks the reduced nonstop limits. Approximately 16 million multilevel dives were analyzed on a CRAY YMP in about 8 minutes CPU time, including construction of the envelope, with 10 fsw and 1 minute resolution. The CRAY YMP has raw speed near 320 megaflops per CPU.
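The kernel of such a scan is the per-profile check sketched below, which steps a candidate multilevel profile through descent, level, and ascent phases at 60 fsw/min, updates the six Haldane compartments, and rejects the dive the moment any M-value is exceeded. The envelope calculation described above simply repeats this check over all candidate segment combinations; the surface pressure convention, nitrogen fraction, and time step here are illustrative assumptions, not the exact YMP code.

```python
import math

# US Navy style compartments quoted in the text: halftimes (min) and M-values (fsw).
HALFTIMES = [5.0, 10.0, 20.0, 40.0, 80.0, 120.0]
M_VALUES  = [104.0, 88.0, 72.0, 58.0, 52.0, 51.0]

FN2    = 0.79        # nitrogen fraction of air (alveolar corrections ignored)
P_SURF = 33.0        # surface ambient pressure, fsw absolute
RATE   = 60.0        # ascent/descent rate, fsw/min
DT     = 0.1         # integration step, min

def profile_ok(segments):
    """Return True if a multilevel profile never violates an M-value.

    segments -- list of (depth_fsw, bottom_time_min) levels; descents, level
    changes, and the final ascent proceed at RATE fsw/min.  This is only the
    per-profile check; the envelope scan repeats it over all candidate
    segment combinations.
    """
    tensions = [FN2 * P_SURF] * len(HALFTIMES)        # start equilibrated at the surface
    depth = 0.0
    for target, stay in list(segments) + [(0.0, 0.0)]:   # finish by surfacing
        travel = abs(target - depth) / RATE
        for phase_time, moving in ((travel, True), (stay, False)):
            t = 0.0
            while t < phase_time:
                step = min(DT, phase_time - t)
                if moving:
                    depth += math.copysign(step * RATE, target - depth)
                ambient = FN2 * (P_SURF + depth)
                for i, tau in enumerate(HALFTIMES):
                    lam = math.log(2.0) / tau
                    tensions[i] = ambient + (tensions[i] - ambient) * math.exp(-lam * step)
                    if tensions[i] > M_VALUES[i]:
                        return False                  # M-value violated: dive rejected
                t += step
    return True

# Example: 80 fsw for 15 min, then 50 fsw for 15 min, then 30 fsw for 20 min.
print(profile_ok([(80.0, 15.0), (50.0, 15.0), (30.0, 20.0)]))
```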

As an adjunct to Figure 3, one can summarize with regard to the YMP calculations

1. the deeper the initial depth, the shorter the total multilevel dive time;

2. maximum permissible multilevel dive times (total) vary between 100 and 60 minutes, depending on initial depths;

3. minimum permissible multilevel increments vary from 30 fsw to 10 fsw as the depth decreases from 130 fsw to 40 fsw;

4. multilevel US Navy Table dives falling within the envelope never exceed critical values, below or at the surface, in any compartment;

5. the multilevel envelope is the set of reduced nonstop limits.

In terms of the modified Tables (Figure 3), multilevel dives that stay to the left of the nonstop time limits never violate critical tensions, and are (hypothetically) sanctioned. Dive computers, of course, perform the same exercise underwater, comparing instantaneous values of computed tissue tensions in all compartments, throughout the duration of the dive, against stored M-values to estimate time remaining and time at a stop.