Technical Requirements
Q1. Please clarify the overlapping
responsibilities for system administration of the CMRS.
A1. The selected Offeror shall have primary
responsibility for administering the system, and full
responsibility for meeting the prescribed system metrics. ORNL
will provide additional administrative support, primarily with
higher-level administrative functions including scheduling
policy, integration with other infrastructure services
including accounts management and reporting responsibilities.
Reference Section 11 for additional information.
Q2. From Section 5.3, what Scale Factor, Fsw,
from potential software improvements should an Offeror assume?
A2. Assume that Fsw = 2.0.
Q3. For on-site hardware responses, shall we
assume that vendor field services engineers hold active DOE Q
clearances?
A3. No. Staff who access the CMRS shall be US
citizens and may require a background check, but there is no
specific requirement for an active security clearance.
Q4. What guidance should Offerors use to
calculate the I/O scale factor, Fio?
A4. The Offeror shall assume that Fio = 3.0
for the FS and Fio = 1.5 for the LTFS.
Reference Section 5.3 for additional information.
Q5. Please define the FSB and LTFSB.
A5. The FS and LTFS Benchmarks are described
in the Benchmark Instructions, Section 1.7.
Q6. Section 5.4 describes a preference for IB
and SRP as the disk attachment protocol. If external server
hardware is used to access the storage, is IB and SRP the
preferred model for attaching external storage to the servers?
Is 8Gb FC acceptable?
A6. IB + SRP is the preferred model for
attaching external storage to servers. Other solutions,
including FC, may be acceptable and will be evaluated.
Q7. How is run time variability measured?
A7. The CMRS shall deliver job run-time
variability of not more than +/- 5% for its benchmark codes
under the following condition: at least 90% of CMRS resources
are in use, running multiple copies of the benchmark codes in
a random manner, using fixed input datasets.
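The +/- 5% criterion in A7 could be checked along the following lines. This is a sketch only: it assumes variability is measured as each run's deviation from the mean run time of the same benchmark code, which the answer does not specify, and the function names are illustrative.

```python
from statistics import mean


def max_relative_deviation(run_times_s):
    """Largest |t - mean| / mean across repeated runs of one benchmark code."""
    if not run_times_s:
        raise ValueError("need at least one run time")
    avg = mean(run_times_s)
    return max(abs(t - avg) / avg for t in run_times_s)


def within_variability(run_times_s, limit=0.05):
    """True if every run is within +/- limit of the mean run time.

    Assumption: the mean is the reference point; the solicitation may
    define a different baseline.
    """
    return max_relative_deviation(run_times_s) <= limit
```

The run times passed in would be those collected while at least 90% of CMRS resources are in use, as the condition above requires.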
Q8. As described in Section 8.3, Local Area
Network Connectivity, will ORNL provide the cables and
switches?
A8. The intent of the description in Section
8.3 is to define a demarcation point between the Offeror equipment and
the ORNL infrastructure. ORNL will provide a modest number of
connections from the CMRS to the infrastructure to support
services delivery (LDAP, NAT, and similar). ORNL will also
provide the network infrastructure from the LTFS DTNs out to
external/wide area networks. The Offeror is responsible for
any switches, cables, or similar equipment that interconnects
the elements of its solution (compute, FS, LTFS).
Q9. What level of support is required outside
of business hours?
A9. Reference Section 11 for specific details
for warranty, maintenance, and support services requirements.
Q10. Does the application support requirement
need a dedicated on-site person?
A10. An on-site application analyst is
preferred. The Offeror is responsible for describing their
strategy for meeting the requirement for application support.
Q11. Can training requirements be provided via
online training?
A11. Online training materials may supplement
more traditional training methods. Reference Section 11.8 for
additional information.
Q12. Is the LTFS Effectiveness Level for the
disk subsystem only?
A12. No. The EL is calculated for the entire
file system.
Q13. Is QDR required for the connection from
the disk to the OSSs?
A13. No. QDR or 10Gigabit Ethernet is required
for the connection from the LTFS to the RDTNs.
Q14. Section 5.2 describes a file size
distribution, and asks Offerors to provide performance targets
for data transfers between the LTFS and FS using a parallel
copy mechanism. Will transfer rates need to be benchmarked at
each of the data sizes, and averaged?
A14. No. The Offeror should generate a
synthetic data stream that corresponds to the distribution,
and use that synthetic data stream to generate performance
targets between the LTFS and FS. As the Offeror is creating
the synthetic data stream with the prescribed mix of
files/sizes, the details of the contents of the data stream
must also be provided.
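One way to build such a synthetic data stream is sketched below. The distribution shown is a made-up placeholder, since the actual file size mix is defined in Section 5.2 and is not reproduced here; the function names are likewise illustrative.

```python
import random
from collections import Counter

# Hypothetical distribution: (file size in bytes, fraction of files).
# Replace with the mix prescribed in Section 5.2.
EXAMPLE_DISTRIBUTION = [
    (64 * 1024, 0.50),
    (16 * 1024**2, 0.25),
    (1024**3, 0.25),
]


def synthetic_file_sizes(distribution, n_files, seed=0):
    """Return a shuffled list of n_files sizes that follows the fractions."""
    total = sum(frac for _, frac in distribution)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    counts = [int(frac * n_files) for _, frac in distribution]
    # Hand out files lost to truncation, largest fractional remainder first.
    shortfall = n_files - sum(counts)
    by_remainder = sorted(
        range(len(distribution)),
        key=lambda i: distribution[i][1] * n_files - counts[i],
        reverse=True,
    )
    for i in by_remainder[:shortfall]:
        counts[i] += 1
    sizes = []
    for (size, _), count in zip(distribution, counts):
        sizes.extend([size] * count)
    random.Random(seed).shuffle(sizes)  # fixed seed keeps the stream reproducible
    return sizes


def describe_stream(sizes):
    """Summarize the stream contents (file count per size) for reporting."""
    return dict(Counter(sizes))
```

A fixed seed makes the stream reproducible, and `describe_stream` gives the per-size file counts that could accompany the reported performance targets.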
Q15. As described in Section 5.4, the storage
system shall support hot swapping of all components. Will
testing of this capability be used as a term of acceptance?
A15. All requirements are subject to
acceptance testing.
Q16. In Section 5.4, there is a requirement
that the storage system shall support and be configured with
tiers that are protected by RAID 6 or an equivalent data
protection and recovery mechanism. Please define a
"tier".
A16. For the purposes of this acquisition, a
tier is a RAID set or collection of disks configured in a
single redundancy group.
Q17. Is there a MTTDL requirement for the
filesystems?
A17. As the file systems are protected by a
double parity scheme, the Offeror shall calculate the MTTDL of
their solution as
MTTDL = MTBF^3 / (N * (N-1) * (N-2) * MTTR^2)
where MTBF is the Mean Time Between Failures
and MTTR is the Mean Time to Recover.
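A minimal sketch of the A17 calculation follows. It assumes N is the number of drives in a single redundancy group (the answer does not define N) and that MTBF and MTTR are given in the same unit, here hours. The input values in the example are illustrative, not figures from the solicitation.

```python
def mttdl_double_parity(mtbf_hours: float, mttr_hours: float, n_drives: int) -> float:
    """MTTDL = MTBF^3 / (N * (N-1) * (N-2) * MTTR^2), in the unit of the inputs."""
    if n_drives < 3:
        raise ValueError("the formula needs at least 3 drives in the group")
    if mtbf_hours <= 0 or mttr_hours <= 0:
        raise ValueError("MTBF and MTTR must be positive")
    denominator = n_drives * (n_drives - 1) * (n_drives - 2) * mttr_hours ** 2
    return mtbf_hours ** 3 / denominator


if __name__ == "__main__":
    # Hypothetical inputs: 10-drive group, 1,000,000 h MTBF, 24 h MTTR.
    hours = mttdl_double_parity(1_000_000, 24, 10)
    print(f"MTTDL: {hours:.3e} hours ({hours / 8760:.3e} years)")
```

Because MTTR is squared in the denominator, halving the rebuild time raises the computed MTTDL by a factor of four.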
Q18. What is the anticipated I/O load during
drive rebuilds?
A18. There is no expectation that I/O load
will decrease during rebuilds.
Q19. There is a statement in Section 5.4
relative to the maximum rebuild time for a drive, limiting it
to less than 24 hours. Is this true for emerging technologies
where the rebuild time for large-capacity disk drives may not
yet be known?
A19. This statement is marked as Significant,
but not Critical. Offerors may take limited exception to
Significant requirements where the reasons and benefits for
those exceptions are clearly stated.
Q20. For 480VAC power to the compute racks,
will ORNL allow only 4 wire, 3-phase delta wiring or will you
also allow 5 wire, 3-phase wye wiring with 277 VAC single
phase power wiring of individual components?
A20. ORNL will allow either 480V delta
connected (without neutral) or 480Y/277V wye connected (with
neutral) power supplies. Offerors must ensure that harmonic
currents for such a solution meet the specifications described
in IEEE 519A.
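The 277 V figure in the 480Y/277V designation follows from the standard relationship for a balanced three-phase wye system: line-to-neutral voltage equals line-to-line voltage divided by the square root of 3. A quick check (the function name is illustrative):

```python
import math


def line_to_neutral(line_to_line_volts: float) -> float:
    """Line-to-neutral voltage of a balanced three-phase wye system."""
    return line_to_line_volts / math.sqrt(3)


print(f"{line_to_neutral(480):.1f} V")  # prints "277.1 V"
```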
Benchmarks
A lower-case (b) is appended to the numbering
for the Benchmark Questions and Answers to distinguish them
from Questions and Answers for the Technical Specification.
Q1b. May Offerors see the Benchmark Results
spreadsheet template?
A1b. That spreadsheet, Benchmark_Results.xls,
is now posted.
Q2b. Are there updates available to the CM-HR-tput
throughput benchmark script?
A2b. Yes. That update, CM-HR-tput.csh, is now
posted. Minor modifications to suit your benchmarking
environment will be required. As the download file has a .csh
extension, you may need to right-click and save the file to
prevent your OS from attempting to run the script.
Q3b. Can ORNL provide as reference a
standard output file for a CM2-HR run that completes in less
than 3.5 hours?
A3b. Yes. An example is now posted. A CM2-HR
throughput example generating output CM-HR-tput.693405
completed on a Cray XT5 using 2150 cores in 03:21. The
download has an artificial .txt extension.
Q4b. Are the current benchmarks expected to
scale to the full partition size (tens of thousands of cores
or more) of the proposed subsystems? Are the current
benchmarks intended to provide both capacity and capability
results?
A4b. No. However, there will be significant
work on the applications over the life of the project to
substantially increase the effective core counts at which
these and other NOAA applications can run. This is reflected
in the requirement that a subsystem or partition shall be
capable of running applications at the full size of that
subsystem or partition. It is understood that the current
benchmarks do not have a capability component.
Q5b. Can ORNL provide additional reference
output for CM-CHEM and CM2-HR? Specifically, fms.out log files
from CM-CHEM and CM2-HR for the short verification run and a
full scaling run and diag_integral.out from both CM-CHEM and
CM2-HR for the short verification run and a full scaling run.
A5b. Yes, these reference files are now posted
in the NOAA_benchmark_output.tar.gz file.
Q6b. What version of ESMF should I be using
for the GFS benchmark? The top level README says
"emsf-2.2.2rp2" but the README file under gfs
directory seems to indicate a different version, "ESMF
version v2.2.2 release date 03/16/06".
A6b. ESMF v2.2.2, release date 03/16/06 is the
official release. Please use this version if possible.
Q7b. Section 4.1.2 'Model Reproducibility'
refers to a 'CM2-Chem verification directory'. Does this refer
to the tar files produced by the CM-CHEM run scripts at the
end? Or to the ardiff script?
A7b. Neither. The ardiff script may be
disregarded. The corrected instructions state "The
reproducibility of the atmospheric and ocean components of the
model may be verified through a series of checksums and global
integrals written to stdout at the end of the run."
Q8b. Can ORNL provide the benchmark job work
stream run time on the existing system or element (tE)?
A8b. The value for tE is referenced
in the Technical Specification (Attachment A, Section 5.3,
Figure 4).
Q9b. From the Benchmark Instructions: "GFS
is a global spectral weather model developed and used at NOAA
NCEP. To build the GFS executable you will need to download
and build ESMF version 2.2.2 release date 03/16/06 from
http://www.esmf.ucar.edu/download/releases.shtml"
Are we allowed to use a more recent version of
the library? For instance, there is Version 4.0 (dated
10/30/09).
A9b. The ESMF API has changed from v2.2.2 to
v4.0. It may be difficult to ensure that the GFS code will
work correctly using v4.0. The necessary modifications are
allowed, but this approach is not recommended.
Q10b. It appears that there is an error in the
CM-CHEM-verification job script with respect to the specified
output directories. Can you confirm?
A10b. There is an incorrect path in the CM-CHEM-verification
job script that refers to CM-CHEM-repro that Offerors should
change to CM-CHEM-verification.
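The correction in A10b amounts to a string substitution over the job script. The sketch below assumes that CM-CHEM-repro appears in the script only as the incorrect path component; the helper names are illustrative, and the script path is left to the caller because the file name is not given here.

```python
from pathlib import Path

OLD = "CM-CHEM-repro"
NEW = "CM-CHEM-verification"


def fix_output_path(script_text: str) -> str:
    """Return the script text with the incorrect directory name replaced."""
    return script_text.replace(OLD, NEW)


def fix_script_file(path: str) -> int:
    """Rewrite the script in place; return how many occurrences were replaced."""
    script = Path(path)
    text = script.read_text()
    count = text.count(OLD)
    if count:
        script.write_text(fix_output_path(text))
    return count
```

Reviewing the replaced occurrences before rerunning the job is advisable, since the assumption above may not hold for a locally modified script.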