Navy DSRC Introduction and Policy Guide
Table of Contents
- 1. Introduction
- 1.1. Purpose
- 1.2. Overview of Supported CTAs
- 1.3. Requesting Assistance
- 1.4. Obtaining an Account
- 1.5. Visitor Information
- 2. Hardware, Network, and Software
- 2.1. High Performance Computing
- 2.1.1. IBM iDataPlex (Haise)
- 2.1.2. IBM iDataPlex (Kilrain)
- 2.2. Mass Storage Archive Server (Newton)
- 2.3. Network Connectivity
- 2.4. Software Environment
- 3. Data Storage
- 3.1. Permanent File Storage
- 3.2. Temporary File Storage
- 3.3. Archival File Storage
- 4. Processing Environment
- 4.1. Determining the Correct HPC System
- 4.1.1. Software Availability
- 4.1.2. Hardware Requirements
- 4.1.3. Queue Limits
- 4.2. Processing Environment Overview and Philosophy
- 4.3. Job Scheduling/Queuing Environment and Policies
- 4.3.1. IBM iDataPlex Queue Usage Policies
- 4.4. Interactive CPU-time Limits
- 5. Navy DSRC Specific Documentation
1. Introduction
1.1. Purpose
This document provides an overview of the Navy DSRC. This guide is intended to offer assistance to users and their S/AAAs in determining which systems will best meet specific computational needs.
To contact us with questions, comments, or suggestions about this guide, please visit the Contact Us page for complete contact information.
1.2. Overview of Supported CTAs
The Navy Department of Defense (DoD) Supercomputing Resource Center (Navy DSRC) is organizationally located with the Naval Meteorology and Oceanography Command (NAVMETOCCOM) and is collocated with the headquarters (Commander, Naval Meteorology and Oceanography Command - CNMOC) at the John C. Stennis Space Center, MS. NAVMETOCCOM/CNMOC provides oceanographic support to the Department of Defense through a wide range of oceanographic modeling, prediction and data collection techniques.
The Navy DSRC, formerly the NAVO MSRC, was the second of the four major shared DoD High Performance Computing (HPC) centers to be formed under the auspices of the DoD HPC Modernization Program. Now one of five such centers, the Navy DSRC provides specialized support in the following critical defense computational technology areas (CTAs):
| CTA | Description |
|---|---|
| CWO | Climate/Weather/Ocean Modeling and Simulation |
| CFD | Computational Fluid Dynamics |
| CSM | Computational Structural Mechanics |
| CCM | Computational Chemistry, Biology, and Materials Science |
| CEA | Computational Electromagnetics and Acoustics |
| ENS | Electronics, Networking, and Systems/C4I |
| SIP | Signal/Image Processing |
| FMS | Forces Modeling and Simulation |
| EQM | Environmental Quality Modeling and Simulation |
| IMT | Integrated Modeling and Test Environments |
| SAS | Space and Astrophysical Science |
DoD Supercomputing Resource Centers provide DoD scientists and engineers with most of the program's computational resources. Each center supports a full range of centralized systems and services, including vector machines, scalable parallel systems, clustered workstations, DoD scientific visualization resources, and training.
1.3. Requesting Assistance
The Consolidated Customer Assistance Center (CCAC) is available to help users with unclassified problems, issues, or questions. Analysts are on duty 8:00 a.m. - 11:00 p.m. Eastern, Monday - Friday (excluding Federal holidays).
- Web: https://help.ccac.hpc.mil/
- E-mail: help@ccac.hpc.mil
- Phone: 1-877-CCAC-039 (1-877-222-2039) or (937) 255-0679
- Fax: (937) 656-9538
For after-hours support and for support services not provided by CCAC, you can contact the Navy DSRC in any of the following ways:
- E-mail: dsrchelp@navo.hpc.mil
- Phone: 1-800-993-7677 or (228) 688-7677
- Fax: (228) 688-4356
- U.S. Mail:
Navy DoD Supercomputing Resource Center
1002 Balch Boulevard
Stennis Space Center, MS 39522-5001
For more detailed contact information, please see the Contact Us page.
1.4. Obtaining an Account
The process of getting an account on the HPC systems at any of the DSRCs begins with getting an account on the HPCMP Portal to the Information Environment, commonly called a "pIE User Account". If you do not yet have a pIE User Account, please visit the Consolidated Customer Assistance Center (CCAC) Accounts page and follow the instructions there. Once you have an active pIE User Account, visit the Navy DSRC Accounts page for instructions on how to request accounts on the Navy DSRC HPC systems. If you need assistance with any part of this process, please contact CCAC at accounts@ccac.hpc.mil.
1.5. Visitor Information
If you are planning to visit the Navy DSRC, it is important that you review the instructions on the Planning a Visit page. This page contains important information including pre-trip and on-arrival instructions that you will need to know to ensure that your visit to our center goes smoothly.
2. Hardware, Network, and Software
All HPC systems currently in operation at the Navy DSRC are seamlessly integrated with the Mass Storage Archive Server and the Defense Research and Engineering Network (DREN) via many high-speed networking technologies.
2.1. High Performance Computing
2.1.1. IBM iDataPlex (Haise)
Haise is an IBM iDataPlex. The login and compute nodes are populated with 2.6-GHz Intel Xeon Sandy Bridge E5-2670 8-core processors. Haise uses the FDR-10 InfiniBand interconnect in a Fat Tree configuration as its high-speed network for MPI messages and IO traffic. Haise uses IBM's General Parallel File System (GPFS) to manage its parallel file system that targets IBM's IS4600 (Infinite Storage) RAID arrays. Haise has 1,176 compute nodes that share memory only on the node; memory is not shared across the nodes. Each login node has two 8-core processors (16 cores) with its own Red Hat Enterprise Linux operating system, sharing 64 GBytes of memory, with no user-accessible swap space. Each compute node has two 8-core processors (16 cores) with its own Red Hat Enterprise Linux operating system, sharing 32 GBytes of memory, with no user-accessible swap space. Haise is rated at 391 peak TFLOPS and has 2.8 PBytes (formatted) of disk storage.
Haise is intended to be used as a batch-scheduled HPC system. Its login nodes are not to be used for large computational (e.g., memory, IO, long executions) work. All executions that require large amounts of system resources must be sent to the compute nodes by batch job submission.
| | Login Nodes | Compute Nodes |
|---|---|---|
| Total Nodes | 8 | 1176 |
| Operating System | RedHat Enterprise Linux | RedHat Enterprise Linux |
| Cores/Node | 16 | 16 |
| Core Type | Intel Xeon Sandy Bridge E5-2670 | Intel Xeon Sandy Bridge E5-2670 |
| Core Speed | 2.6 GHz | 2.6 GHz |
| Memory/Node | 64 GBytes | 32 GBytes |
| Accessible Memory/Node | 8 GBytes | 27 GBytes |
| Memory Model | Shared on node. | Shared on node. Distributed across cluster. |
| Interconnect Type | 10 GigEthernet | FDR-10 InfiniBand |
| Path | Capacity | Type |
|---|---|---|
| /scr | 2.4 PBytes | GPFS |
| /u/home | 16 TBytes | GPFS |
| /p/cwfs | 800 TBytes | PanFS |
For detailed information on using Haise, see the Haise User Guide.
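The rated peak can be reproduced from the node count, cores per node, and clock speed given above. The Python sketch below shows the arithmetic; it assumes 8 double-precision floating-point operations per core per cycle, which is the usual figure for Sandy Bridge processors with AVX but is not stated in this guide.

```python
# Rough peak-performance estimate from the hardware figures in this section.
COMPUTE_NODES = 1176
CORES_PER_NODE = 16
CLOCK_GHZ = 2.6
FLOPS_PER_CYCLE = 8  # ASSUMPTION: double-precision AVX rate, not given in this guide


def peak_tflops(nodes, cores_per_node, clock_ghz, flops_per_cycle):
    """Return theoretical peak performance in TFLOPS."""
    gflops = nodes * cores_per_node * clock_ghz * flops_per_cycle
    return gflops / 1000.0


if __name__ == "__main__":
    estimate = peak_tflops(COMPUTE_NODES, CORES_PER_NODE, CLOCK_GHZ, FLOPS_PER_CYCLE)
    print(f"Estimated peak: {estimate:.0f} TFLOPS")
```

Under that assumption the estimate rounds to the 391 peak TFLOPS quoted for the system. Sustained application performance is normally well below the theoretical peak.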
2.1.2. IBM iDataPlex (Kilrain)
Kilrain is an IBM iDataPlex. The login and compute nodes are populated with 2.6-GHz Intel Xeon Sandy Bridge E5-2670 8-core processors. Kilrain uses the FDR-10 InfiniBand interconnect in a Fat Tree configuration as its high-speed network for MPI messages and IO traffic. Kilrain uses IBM's General Parallel File System (GPFS) to manage its parallel file system that targets IBM's IS4600 (Infinite Storage) RAID arrays. Kilrain has 1,176 compute nodes that share memory only on the node; memory is not shared across the nodes. Each login node has two 8-core processors (16 cores) with its own Red Hat Enterprise Linux operating system, sharing 64 GBytes of memory, with no user-accessible swap space. Each compute node has two 8-core processors (16 cores) with its own Red Hat Enterprise Linux operating system, sharing 32 GBytes of memory, with no user-accessible swap space. Kilrain is rated at 391 peak TFLOPS and has 2.8 PBytes (formatted) of disk storage.
Kilrain is intended to be used as a batch-scheduled HPC system. Its login nodes are not to be used for large computational (e.g., memory, IO, long executions) work. All executions that require large amounts of system resources must be sent to the compute nodes by batch job submission.
| | Login Nodes | Compute Nodes |
|---|---|---|
| Total Nodes | 8 | 1176 |
| Operating System | RedHat Enterprise Linux | RedHat Enterprise Linux |
| Cores/Node | 16 | 16 |
| Core Type | Intel Xeon Sandy Bridge E5-2670 | Intel Xeon Sandy Bridge E5-2670 |
| Core Speed | 2.6 GHz | 2.6 GHz |
| Memory/Node | 64 GBytes | 32 GBytes |
| Accessible Memory/Node | 8 GBytes | 27 GBytes |
| Memory Model | Shared on node. | Shared on node. Distributed across cluster. |
| Interconnect Type | 10 GigEthernet | FDR-10 InfiniBand |
| Path | Capacity | Type |
|---|---|---|
| /scr | 2.4 PBytes | GPFS |
| /u/home | 16 TBytes | GPFS |
| /p/cwfs | 800 TBytes | PanFS |
For detailed information on using Kilrain, see the Kilrain User Guide.
2.2. Mass Storage Archive Server (Newton)
There is one Oracle T4-4 system, Newton, which makes up the Resilient Mass Storage Server (RMSS). The system is configured with two 8-core 3.0-GHz processors, 256 GBytes of main memory, and over 70 TBytes of hard disk storage. For information on using the archive system, see the Archive User Guide.
2.3. Network Connectivity
Our site is a primary node of the Defense Research and Engineering Network, or DREN. DREN is a robust, high-speed network providing connectivity to user sites and centers nationwide. We connect to the DREN Wide Area Network (WAN) via an OC-48 circuit capable of data transfers up to 2.48 Gbits/sec and a secondary OC-12 circuit capable of data transfers up to 622 Mbits/sec to provide fault tolerance and additional bandwidth.
Our Local Area Network (LAN), a 10-Gigabit Ethernet connection, provides primary connectivity to the Navy DSRC infrastructure, HPCs, and mass storage assets. The users of the Navy DSRC are able to use this high-performance connectivity for interactive and data transfer functions.
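To get a feel for what the WAN circuit rates mean for data movement, the Python sketch below computes a lower bound on transfer time. It assumes the full circuit rate is available to a single transfer and uses decimal units (1 GByte = 8 Gbits); real transfers share the link and carry protocol overhead, so actual times will be longer.

```python
# Best-case transfer-time estimate over the DREN WAN circuits described above.
OC48_GBITS_PER_SEC = 2.48   # primary circuit rate from this section
OC12_GBITS_PER_SEC = 0.622  # secondary circuit rate (622 Mbits/sec)


def min_transfer_seconds(size_gbytes, link_gbits_per_sec):
    """Lower bound on transfer time, ignoring overhead and link sharing."""
    if link_gbits_per_sec <= 0:
        raise ValueError("link rate must be positive")
    return size_gbytes * 8 / link_gbits_per_sec


if __name__ == "__main__":
    size = 100  # hypothetical 100-GByte data set
    for name, rate in (("OC-48", OC48_GBITS_PER_SEC), ("OC-12", OC12_GBITS_PER_SEC)):
        minutes = min_transfer_seconds(size, rate) / 60
        print(f"{name}: at least {minutes:.1f} minutes for {size} GBytes")
```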
2.4. Software Environment
All Navy DSRC systems run derivatives of the UNIX System V operating system with vendor-specific enhancements. A large variety of compiler environments, math libraries, programming tools and third-party analysis applications are available on the DSRC systems.
| System | Software Listing |
|---|---|
| IBM iDataPlex (Haise) | http://www.navo.hpc.mil/software/index.html?sys=Haise |
| IBM iDataPlex (Kilrain) | http://www.navo.hpc.mil/software/index.html?sys=Kilrain |
3. Data Storage
The Navy DSRC data storage consists of local home directories on each system, temporary disk storage on each system and long-term storage on the Resilient Mass Storage Server (RMSS). Files stored on the RMSS are subject to migration to off-line status that is controlled by Sun's Storage and Archive Manager/Quick File System (SAM/QFS) software.
3.1. Permanent File Storage
Users are allocated a home directory (referenced locally with the $HOME environment variable) on each Navy DSRC system with 1 GByte of non-migrated storage. $HOME is not backed up by the Center; therefore users are responsible for maintaining backup copies of any files in this directory.
3.2. Temporary File Storage
Each Navy DSRC system is configured with a large quantity of high-speed disk storage configured as the /scr file system. /scr is the globally accessible, high-speed working storage primarily for interactive and batch processing. Batch jobs use large amounts of temporary space. There are no limits on the size of individual files. Users are responsible for managing their own files in the /scr areas. The /scr file system is not backed up by the Center. Users are responsible for maintaining backup copies of any files in the temporary file system. Users can access their temporary storage by using the $WORKDIR environment variable. The table below lists the /scr allocations for each system.
| System | /scr |
|---|---|
| IBM iDataPlex (Haise) | 20 TBytes |
| IBM iDataPlex (Kilrain) | 20 TBytes |
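As an illustration of working through $WORKDIR rather than a hard-coded path, the Python sketch below creates a per-job directory under the temporary file system. The function name and the directory layout are hypothetical; only the $WORKDIR variable comes from this guide.

```python
import os
from pathlib import Path


def job_scratch_dir(job_name):
    """Create and return a per-job directory under $WORKDIR.

    The one-directory-per-job layout is an illustrative convention,
    not a Navy DSRC requirement.
    """
    workdir = os.environ.get("WORKDIR")
    if not workdir:
        raise RuntimeError("WORKDIR is not set; are you on an HPC system?")
    path = Path(workdir) / job_name
    path.mkdir(parents=True, exist_ok=True)
    return path


if __name__ == "__main__":
    print(job_scratch_dir("demo_job"))
```

Because /scr is not backed up, results worth keeping should be copied elsewhere once a job completes.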
3.3. Archival File Storage
All of our HPC systems have access to an online archival mass storage system that provides long-term storage for users' files. This petascale archive resides on a robotic tape library system. A 70-TByte disk cache fronts the tape file system and temporarily holds files while they are being transferred to or from tape.
The environment variables $ARCHIVE_HOST and $ARCHIVE_HOME are automatically set for you. $ARCHIVE_HOST can be used to reference the archive server, and $ARCHIVE_HOME can be used to reference your archive directory on the server. These can be used when transferring files to/from archive. For information on using the archive system, see the Archive User Guide.
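The Python sketch below shows one way the two variables might be combined into a remote destination for a file copy. It only builds the command and does not run it. The use of scp here is an assumption made for illustration; the Archive User Guide is the authority on which transfer commands are supported.

```python
import os


def archive_target(filename):
    """Build a host:path destination from $ARCHIVE_HOST and $ARCHIVE_HOME."""
    host = os.environ["ARCHIVE_HOST"]
    home = os.environ["ARCHIVE_HOME"]
    return f"{host}:{home}/{filename}"


def scp_to_archive_command(local_path):
    """Return an argument list for copying local_path to the archive.

    ASSUMPTION: scp is an accepted transfer method; check the
    Archive User Guide before relying on this.
    """
    return ["scp", local_path, archive_target(os.path.basename(local_path))]


if __name__ == "__main__":
    print(" ".join(scp_to_archive_command("results.tar")))
```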
4. Processing Environment
4.1. Determining the Correct HPC System
Determining the correct HPC System for your needs can be a complex task. The following are just a few of the factors that might influence your choice:
4.1.1. Software Availability
If your work depends upon a specific Commercial Off-The-Shelf (COTS) application, you can verify its availability on any system in the HPCMP by checking the Consolidated Software List. Software information for Navy DSRC systems is also available on our local software page. If you can't find the application that you need, contact CCAC for assistance.
4.1.2. Hardware Requirements
To ensure that your jobs will have access to sufficient cores and memory to run as needed, you can review the hardware specifications on our Hardware page. Additional details are available in each of the HPC User Guides, available from the Documentation page.
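As a rough sizing aid, the Python sketch below estimates how many compute nodes a job needs from its core count and total memory footprint. It uses the iDataPlex figures given earlier in this guide (16 cores and 27 GBytes of accessible memory per compute node) and assumes memory is spread evenly across nodes, which may not hold for every application.

```python
import math

CORES_PER_NODE = 16              # iDataPlex compute node, from this guide
ACCESSIBLE_GBYTES_PER_NODE = 27  # accessible memory per compute node


def nodes_needed(cores, total_memory_gbytes):
    """Return the node count that satisfies both core and memory needs.

    ASSUMPTION: memory use is distributed evenly across nodes.
    """
    by_cores = math.ceil(cores / CORES_PER_NODE)
    by_memory = math.ceil(total_memory_gbytes / ACCESSIBLE_GBYTES_PER_NODE)
    return max(by_cores, by_memory, 1)


if __name__ == "__main__":
    # Hypothetical job: 64 cores but 500 GBytes of data in memory.
    print(nodes_needed(64, 500))
```

A memory-bound job like the hypothetical one above needs more nodes than its core count alone suggests, so some cores on each node would be left idle.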
4.1.3. Queue Limits
If your jobs require exceptionally long run times or if you need an exceptionally large number of cores, you should verify that queue limits on the system you choose allow both the number of cores and run time that you need. To check this, see our Queue Summary page.
4.2. Processing Environment Overview and Philosophy
Navy DSRC provides both an interactive and a batch submission environment. Batch queue environments are available on all of the systems. The batch environment is the primary environment for most user work. All of the HPC systems at the Navy DSRC use the PBS batch queue system.
The batch queue environments allow users to submit, monitor and terminate their own batch jobs. This capability is intended for jobs requiring large amounts of memory and/or CPU time that generally run for many hours. Through the batch queue environments, the user submits a job either from the command line or through a shell script. Resource requirements (e.g., CPU time and number of processors) or runtime parameters (e.g., output file redirection) can be issued on the command line or embedded in the shell script for the batch job to be executed.
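The Python sketch below illustrates embedding resource requirements and runtime parameters in a job script. It renders a minimal PBS script using only the widely supported `-N`, `-q`, `-l walltime`, and `-o` directives. The node and core request is deliberately left out because its syntax depends on the PBS version in use; the job name, output file, and program name are hypothetical.

```python
def render_pbs_script(job_name, queue, walltime, output_file, command):
    """Return the text of a minimal PBS batch script.

    The node/core request is omitted on purpose: its syntax varies
    between PBS versions, so consult the system user guide.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",           # job name
        f"#PBS -q {queue}",              # target queue
        f"#PBS -l walltime={walltime}",  # wall clock limit, HH:MM:SS
        f"#PBS -o {output_file}",        # standard output redirection
        "cd $PBS_O_WORKDIR",             # directory the job was submitted from
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_pbs_script("demo", "debug", "00:30:00", "demo.out", "./my_program"))
```

A script rendered this way would be saved to a file and submitted with `qsub`. The same options can instead be passed on the `qsub` command line.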
4.3. Job Scheduling/Queuing Environment and Policies
4.3.1. IBM iDataPlex Queue Usage Policies
| Priority | Queue Name | Job Class | Max Wall Clock Time | Max Cores Per Job | Comments |
|---|---|---|---|---|---|
| Highest | urgent | Urgent | 24 Hours | 4096 | Designated urgent projects by DoD HPCMP |
| | high | High | 168 Hours | 6144 | Designated high-priority projects by service/agency |
| | challenge | Challenge | 168 Hours | 6144 | Challenge projects only |
| | special | N/A | 24 Hours | 4096 | Access available by request |
| | debug | Debug | 30 Minutes | 1024 | User diagnostic jobs |
| | standard | Standard | 168 Hours | 4096 | Non-challenge user jobs |
| | transfer | N/A | 12 Hours | 1 | Data transfer jobs |
| Lowest | background | Background | 4 Hours | 512 | User jobs that will not be charged against the project allocation |
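The Python sketch below encodes the limits from the table above and lists the queues whose wall clock and core limits a job would fit within. It checks numeric limits only; it does not know whether a project is entitled to use a restricted queue such as urgent, high, challenge, or special.

```python
# (queue name, max wall clock hours, max cores per job), in priority order.
QUEUES = [
    ("urgent", 24, 4096),
    ("high", 168, 6144),
    ("challenge", 168, 6144),
    ("special", 24, 4096),
    ("debug", 0.5, 1024),
    ("standard", 168, 4096),
    ("transfer", 12, 1),
    ("background", 4, 512),
]


def queues_that_fit(cores, wall_hours):
    """Return names of queues whose limits accommodate the request."""
    return [
        name
        for name, max_hours, max_cores in QUEUES
        if cores <= max_cores and wall_hours <= max_hours
    ]


if __name__ == "__main__":
    # Hypothetical job: 5000 cores for 100 hours.
    print(queues_that_fit(5000, 100))
```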
4.4. Interactive CPU-time Limits
The Navy DSRC has implemented a 15-minute (900-second) interactive processing limit on login nodes for processes running outside of the batch scheduler. This also applies to systems that do not have a batch scheduler installed. If you were to run an application on a login node, the application would be allowed to accrue 900 seconds' worth of CPU time, not real time, before being terminated. This policy has been put in place in order to protect interactive access for all users.
| System | CPU Time |
|---|---|
| IBM iDataPlex (Haise) | 15 Minutes |
| IBM iDataPlex (Kilrain) | 15 Minutes |
| Oracle T4-4 (Newton) | 15 Minutes |
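The distinction between CPU time and real (wall clock) time can be seen with a short Python sketch. It measures both for a task that only sleeps: the elapsed time advances while almost no CPU time is accrued. This illustrates the concept only and says nothing about how the limit is enforced on the systems.

```python
import time


def cpu_and_wall_seconds(work):
    """Run work() and return (cpu_seconds, wall_seconds) it consumed."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu, wall


if __name__ == "__main__":
    # Sleeping uses wall clock time but almost no CPU time.
    cpu, wall = cpu_and_wall_seconds(lambda: time.sleep(0.2))
    print(f"CPU: {cpu:.3f} s, wall: {wall:.3f} s")
```

An idle editor or shell session left open on a login node therefore counts very little toward the limit, while a compute-bound process reaches it in about 15 minutes of real time on one core.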
5. Navy DSRC Specific Documentation
On-line documentation and information can be found through the Navy DSRC Web site, the message of the day (MOTD) that is displayed when logging on any system, and manual pages via the man command.
