next up previous contents
Home: Berkeley Seismological Laboratory
Next: Outreach and Educational Activities Up: Operations Previous: Seismic Data Analysis   Contents


Data Archive and Distribution: Northern California Earthquake Data Center

Subsections

Introduction

The Northern California Earthquake Data Center, a joint project of the Berkeley Seismological Laboratory and the U.S. Geological Survey at Menlo Park, serves as an "on-line" archive for various types of digital data relating to earthquakes in central and northern California. The NCEDC is located at the Berkeley Seismological Laboratory, and has been accessible to users via the Internet since mid-1992.

The primary goal of the NCEDC is to provide a stable and permanent archival and distribution center of digital geophysical data for northern and central California such as seismic waveforms, electromagnetic data, GPS data, and earthquake parametric data. The principal networks contributing seismic data to the data center are the Berkeley Digital Seismic Network (BDSN) operated by the Seismological Laboratory, the Northern California Seismic Network (NCSN) operated by the USGS, and the Bay Area Regional Deformation (BARD) GPS network. The collection of NCSN digital waveforms date from 1984 to the present, the BDSN digital waveforms date from 1987 to the present, and the BARD GPS data date from 1993 to the present.

The NCEDC continues to use the World Wide Web as a principal interface for users to request, search, and receive data from the NCEDC. The NCEDC has implemented a number of useful and original mechanisms of data search and retrieval using the World Wide Web, which are available to anyone on the Internet. All of the documentation about the NCEDC, including the research users' guide, is available via the Web. Users can perform catalog searches and retrieve hypocentral information and phase readings from the various earthquake catalogs at the NCEDC via easy-to-use forms on the Web. In addition, users can peruse the index of available broadband data at the NCEDC, and can request and retrieve broadband data in standard SEED format via the Web. Access to all datasets is available via research accounts at the NCEDC. The NCEDC's home page address is http://quake.geo.berkeley.edu/

Datasets

The bulk of the data at the NCEDC consist of waveform and GPS data from northern California. The total size of the datasets archived at the NCEDC is shown in Table 10.1.


Table 10.1: Volume of Data Archived at the NCEDC by Data Type
Data Type MBytes
BDSN broadband, electric field, and magnetic field waveforms 713,977
NCSN event seismograms 276,233
GPS data (RINEX and raw data) 235,870
Calpine/Unocal Geysers region seismograms 38,658
Parkfield HRSN seismograms 77,319
Nevada broadband seismograms 47,827
USGS Low Frequency geophysical data 907
Misc data 37,178
Total size of archived data 1,427,969


BDSN/NHFN Seismic Data

The archival of current BDSN (Chapter 2) and NHFN (Chapter 3 seismic data is an ongoing task. BDSN and NHFN data are telemetered from 30 seismic data loggers in real-time to the BSL, where they are written to disk files. Each day, an extraction process creates a daily archive by retrieving all continuous and event-triggered data for the previous day. The daily archive is run through quality control procedures to correct any timing errors, triggered data is reselected based on the REDI, NCSN, and UCB earthquake catalogs, and the resulting daily collection of data is archived at the NCEDC.

All of the data acquired from the BDSN and NHFN Quanterra data loggers are archived at the NCEDC. Most of the 16-bit BDSN digital broadband data from 1987-1991 have been converted to MiniSEED and are now online. However, there are still data from the 16-bit BDSN data acquisition systems from MHC, SAO, and PKD1 from late 1991 through mid-1992 that need to be converted to MiniSEED. Likewise, data acquired by portable 24-bit RefTek recorders before the installation of Quanterra data loggers at NHFN sites has not yet been converted to MiniSEED and archived at the NCEDC.

Figure 10.1: Chart showing the availability of BDSN and NHFN data at the NCEDC for the 1 Hz and 20 Hz channels from 01/01/1996 - 06/30/2001. The "uptime" of the BDSN and NHFN is better than 95% at nearly all stations. Exceptions are BDM (damaged by a lightning strike in May 2000), FARB (power problems when the USFWS generator failed in 1998), MOD (the newest BDSN station which suffered some delays during installation), WENL (flooded during the winter of 1997), and YBIB (damaged during a lightning strike in 1997, and without AC power for the past year). In general, a difference between the 1 and 20 Hz data is indicative of one or more significant telemetry problems. Following a major telemetry outage, BSL staff will recover 1 Hz continuous data but only event data for the 20 Hz channels.
\begin{figure*}\begin{center}
\epsfig{file=bdsn.avail.eps,width=17cm}\end{center}\end{figure*}

NCSN/SHFN Seismic Data

NCSN and SHFN waveform data are now being sent to the NCEDC via the Internet. The NCSN event waveform files are automatically transferred from the Menlo Park to the NCEDC as part of the routine analysis procedure by the USGS, and are automatically verified and archived by the NCEDC. Parametric information, such as event catalogs and phase readings from both the BDSN and NCSN are automatically updated on the NCEDC on an hourly basis.

A few corrupt NCSN event files were discovered at the NCEDC several years ago, and were eventually traced down to suspected flaws in the 12-inch WORM media and/or firmware problems on the Sony WDA-600 series jukeboxes used by the NCEDC. When we transcribed the data from the 12-inch WORM media to the current 5.25 inch magneto-optical media, we verified that all files were transcribed accurately. This year, using software developed at the NCEDC to detect possibly corrupt NCSN files, we identified 4704 possibly corrupted NCSN waveform event files. We re-read the original NCSN tapes for all of these events, discovered that only 71 of the files were actually corrupt, and replaced the corrupted event waveform files.

The NCEDC maintains a list of teleseismic events recorded by the NCSN, which is updated automatically whenever a new NCSN event file is received at the NCEDC, since these events do not appear in the NCSN catalog.

Electro-Magnetic Data

The NCEDC continues to archive and process electric and magnetic field data acquired from data loggers at two sites (SAO and PKD). At PKD and SAO, 3 components of magnetic field and 2 or 4 components of electric field are digitized and telemetered in real-time along with seismic data to the Seismological Laboratory, where they are processed and archived at the NCEDC in a similar fashion to the seismic data (Chapter 5). The system generates continuous data channels at 40 Hz, 1 Hz, and .1 Hz for each component of data. All of these data are archived and remain available online at the NCEDC. Using programs developed by Dr. Martin Fullerkrug at the Stanford University STAR Laboratory (now at the Institute for Meteorology and Geophysics at the Univerity of Frankfurt), the NCEDC is computing and archiving magnetic activity and Schumann resonance analysis using the 40 Hz data from this dataset. The magnetic activity and Schumann resonance data can be accessed from the Web.

In addition to the electro-magnetic data from PKD and SAO, the NCEDC archives data from a low-frequency, long-baseline electric field project operated by Dr. Steve Park of UC Riverside at site PKD2. This experiment (which is separate from the equipment at PKD1 described in Chapter 5), uses an 8-channel Quanterra data logger to record the data, which are transmitted to the BSL using the same circuit as the BDSN seismic data. These data is acquired and archived in an identical manner to the other electric field data at the NCEDC.

Parkfield High Resolution Seismic Network Data

Event seismograms from the Parkfield High Resolution Seismic Network (HRSN) from 1987 through June 1998 are available in their raw SEGY format via NCEDC research accounts. A number of events have faulty timing due to the lack or failure of a precision timesource for the network. Due to funding limitations, there is currently no ongoing work to correct the timing problems in the older events or to create MiniSEED volumes for these events. However, a preliminary catalog for a significant number of these events has been constructed, and the catalog is available via the web at the NCEDC.

The original HRSN acquisition system died in late 1998, and an interim system of portable RefTek recorders were installed at some of the sites. Over the past year, 3 new borehole sites were installed, the system has been upgraded to operate with Quanterra Q733 data loggers and digital telemetry to new central processing site in Parkfield. Event files from the HRSN network are automatically transferred to the NCSN, and are made available to the research community via anonymous ftp until they are reviewed and permanently archived.

As an interim measure, continuous data from the HRSN has been temporarily archived on the NCEDC in order to help researchers retrieve events that were not detected during the network upgrade.

GPS Data

The NCEDC continues to expand its archive of GPS data through the BARD (Bay Area Regional Deformation) network of continuously monitored GPS receivers in northern California (Chapter 6). The NCEDC GPS archive now includes 40 continuous sites in northern California. There are 24 core BARD sites owned and operated by UC Berkeley, LLNL, USGS, UC Davis, Trimble Navigation, and Stanford. Data from the other northern California sites are collected from sites operated by JPL, the U.S. Coast Guard, and Scripps Institute of Oceanography.

Most of the Berkeley BARD sites are collocated with seismic stations, and data from these sites are acquired in real-time using shared frame relay telemetry link. The remaining Berkeley BARD stations use dedicated frame relay and/or spread spectrum radio to provide data in real-time to UC Berkeley, and are automatically processed and archived at the NCEDC on a daily basis. Data from the USGS sites are downloaded by the USGS and transferred to the NCEDC on a daily basis, and is automatically archived by the NCEDC. The other sites are automatically acquired from their respective operators on an hourly or daily basis, and are archived by the NCEDC.

This year the NCEDC continued to archive non-continuous survey GPS data. The initial dataset to be archived is the survey GPS data collected by the USGS Menlo Park for northern California and other locations. The NCEDC will be the principal archive for this dataset. Significant quality control efforts were implemented by the NCEDC to ensure that the raw data, scanned site log sheets, and RINEX data are archived for each survey. All of the USGS Menlo Park GPS data has been transferred to the NCEDC and the majority of the data has been archived and is available for distribution.

Calpine/Unocal Seismic Data

The Calpine Geysers seismic network, consisting of up to 49 different borehole monitoring sites, was initially deployed and operated by the Unocal Geothermal Division on behalf of the geothermal energy producers in the California Geysers geothermal field. In 1999, the Unocal geothermal fields at the Geysers were acquired by the Calpine Corporation. Calpine has continued Unocal's collaboration with the NCEDC and has made data available for public distribution. The Calpine/Unocal Geysers dataset is a collection of digital microearthquake seismograms recorded by this network. 12 years of digital microearthquake seismograms (1987 through 1998) have been released for archiving and distribution through the NCEDC.

Calpine has also released an unedited earthquake catalog for events in 1995-1998 with approximate times and very rough hypocenters for some of the event represented in the time series, primarily as an index aid for the waveforms. Calpine makes no claims that this catalog is complete in any manner, and assumes no responsibility for the accuracy or usefulness of the catalog data.

Through an updated agreement with the NCEDC last year, Calpine released triggered event waveform data and a preliminary hypocenter catalog for an additional two years of data from 1999-2000. The total dataset represents over 248,000 events that were recorded by the Calpine/Unocal Geysers network, and is available via research accounts at the NCEDC.

USGS Low Frequency Data

Over the last 25 years, the USGS at Menlo Park, in collaboration with other principal investigators, has collected an extensive low-frequency geophysical data set that contains over 1300 channels of tilt, tensor strain, dilatational strain, creep, magnetic field, water level, and auxiliary channels such as temperature, pore pressure, rain and snow accumulation, and wind speed. We are actively working with the USGS to assemble the requisite information for the hardware representation of the stations and the instrument responses for this diverse dataset.

During the past year, we installed the necessary programs to archive the raw data in standard MiniSEED and to populate the database with the necessary parameters to create SEED instrument responses. We have currently archived timeseries data from 887 data channels from 167 sites, and have instrument response information for 542 channels at 139 sites. The waveform archive is updated on a daily basis with data from 486 currently operating data channels.

There is also considerable interest in having the NCEDC archive and distribute a "processed" version of the principal data channels from this data set. We will augment the raw data archive as additional instrument response information is assembled for the channels, and will work with the USGS to clearly define the attributes of the "processed" data channels.


Table 10.2: Table of standard Quanterra State-of-Health channels archived at the NCEDC.
State-of-Health Channels
Channel Rate (Hz) Description
LEP 1.00 Battery Voltage
UCD 0.01 Clock drift
UCQ 0.01 Clock quality
UFC 0.01 Clock VCO frequency control
UEP 0.01 Battery voltage
UF1 0.01 Free telemetry packets (comlink 1)
UF2 0.01 Free telemetry packets (comlink 2)
UF3 0.01 Free telemetry packets (comlink 3)
UME 0.01 Broadband sensor mass position (E)
UMN 0.01 Broadband sensor mass position (N)
UMZ 0.01 Broadband sensor mass position (Z)
UMU 0.01 Broadband sensor triaxial mass position (U)
UMV 0.01 Broadband sensor triaxial mass position (V)
UMW 0.01 Broadband sensor triaxial mass position (W)


CNSS Catalog

The NCEDC, in conjunction with the Council of the National Seismic System (CNSS), is producing and distributing a world-wide composite catalog of earthquakes based on the catalogs of the national and various U.S. regional networks. Each network updates their earthquake catalog on a daily basis at the NCEDC, and the NCEDC constructs a composite world-wide earthquake catalog by combining the data, removing duplicate entries that may occur from multiple networks recording an event, and giving priority to the data from each network's authoritative region. The catalog, which includes data from 14 regional and national networks, is searchable using a Web interface at the NCEDC. The catalog is also freely available to anyone via ftp over the Internet.

Notes on Quanterra data loggers

State-of-Health Channels

The BDSN and NHFN Quanterra data loggers create a variety of low frequency data channels that provide state-of-health information of the acquisition and telemetry systems. These channels report information such as battery voltage, internal data logger temperature, broadband sensor mass positions, clock quality, clock frequency, clock drift, and telemetry buffer utilization. We use these channels to monitor the systems and diagnose problems, but until recently, have not archived or distributed these data channels to other users.

On June 1, 2001, we started to archive the continuous state-of-health channels at the NCEDC as part of the continuous BDSN/NHFN data archive in order to provide both the BSL and external users with as much information as possible on variables that may affect network data quality. We developed the required SEED instrument responses for all of the state-of-health channels so that we could distribute this data in the standard SEED format along with the seismic and other geophysical data channels.

Due to differences in the specific data logger configurations, not all state-of-health channels are available at all sites. Table 10.2 lists the basic state-of-health channels archived at the NCEDC. In situations where we have multiple data loggers at a site, the state-of-health channels from the second data logger have channel names beginning with A instead of U.

Timing Corrections

In late 1998, Quanterra provided the first release of MultiSHEAR, an enhanced version of its data acquisition software. During November and December 1999, all of the BSL Quanterra data loggers were updated to MultiSHEAR with the corresponding OS/9 modifications. The main feature of MultiSHEAR that affects the BDSN and NHFN data archived at the NCEDC is the correction of a systematic timing error of the decimated channels from SHEAR and UltraSHEAR software in the Quanterra data logger.

The timing description in the following section is provided courtesy of the USGS Albuquerque Laboratory, which provides data collection and quality control procedures for the portion of the GSN that uses Quanterra data loggers.

The Quanterra digitizers initially sample at very high rates. In firmware the data are introduced to a filter cascade of a various number of stages where they are low-pass FIR filtered and decimated multiple times. Depending on the specific system the data are further FIR filtered and decimated by configurable software. Each applied FIR filter introduces to the data a nominal delay of half the FIR filter width which then requires subsequent corrections to the data time tags.

For these Quanterra systems the calculation of the time tag applied to the data is more complicated than the first order correction associated with the half widths of the FIR filters. There is a very small correction term associated with data buffering and a more substantive subjective correction to account for the delay in reading the 'first break' during signal onset. This second term attempts to bridge the gap between impulsive and steady state signals. The size of this term has been a function of each filter's transition band and is generally 1.5-2.0 output samples. The cumulative effect of these corrections has mistimed most of the seismic channels. The mistiming is obvious when a sine wave is input into the Quanterra system and time series from the various channels are overplotted.

The following tables provide the timing corrections required for the various Quanterra data logger configurations used by the BSL.

Table 10.3: Time tag corrections for Quanterra Q680, Q980, and Q935 systems running SHEAR and UltraSHEAR software. The time corrections in the table should be added to the original timeseries timetags.
BDSN Quanterra
Q680, Q980, Q935 data loggers
Channel Sample Rate Correction
  (Hz) (seconds)
H?? 80 0.000
B?? 40 0.0025
B?? 20 -0.005
L?? 1.0 -0.688
V?? 0.1 -0.189



Table 10.4: Time tag corrections for Quanterra Q4120 seismic systems running UltraSHEAR software. The time corrections in the table should be added to the original timeseries timetags.
BDSN Quanterra
Q4120 seismic data loggers
Channel Sample Rate Correction
  (Hz) (seconds)
H?? 100 -0.00587
B?? 20 +0.03913
L?? 1.0 +1.06413
V?? 0.1 ???



Table 10.5: Time tag corrections for Quanterra Q4120 electro-magnetic systems running UltraSHEAR software. The time corrections in the table should be added to the original timeseries timetags.
BDSN Quanterra
Q4120 electro-magnetic data loggers
Channel Sample Rate Correction
  (Hz) (seconds)
B?? 40 +0.02277
L?? 1.0 +1.08527



Table 10.6: Time tag corrections for Quanterra NHFN Q4120 systems running UltraSHEAR software. The time corrections in the table should be added to the original timeseries timetags.
NHFN Quanterra
Q4120 data loggers
Channel Sample Rate Correction
  (Hz) (seconds)
CL?,DP? 500 -0.00633
HL?,EP? 100 -0.02161
B?? 20 +0.02339
L?? 1.0 +1.04839


As of this time, the BSL has not applied time corrections to either the waveform data or derived parameters such as phase picks. These timing issues do not apply to any of the stations with Q730 data loggers.


Table 10.7: Date of the MultiSHEAR upgrade at BDSN/NHFN sites and data logger type. Data recorded after this date do not require the time corrections in Tables 10.3 - 10.6. Data recorded before this date do.
Quanterra Data loggers
Date of MultiSHEAR Upgrade
Station Date Data logger
ARC 1999.346,18:19 Q980
BDM 1999.337,19:26 Q4120 (BDSN)
BKS 1999.338,07:11 Q980
BRIB 1999.309,07:47 Q4120 (NHFN)
BRIB 1999.337,02:00 Q980
BRK 1999.356,02:32 Q680
CMB 1999.346,02:35 Q980
CMSB 1999.307,06:53 Q4120 (NHFN)
CRQB 1999.309,02:00 Q4120 (NHFN)
CVS 1999.349,18:17 Q4120 (BDSN)
FARB 1999.351,01:39 Q4120 (BDSN)
HOPS 1999.349,18:19 Q980
JRSC 1999.351,02:17 Q680
KCC 1999.351,22:28 Q980
MHC 1999.351,03:07 Q980
MIN Closed before upgrade Q980
MOD 1999.289,00:00 Q980
ORV 1999.339,04:56 Q980
PKD 1999.354,21:52 Q980
PKD 1999.354,21:51 Q4120 (EM)
PKD1 1999.355,03:04 Q980
PKD1 1999.355,21:55 Q4120 (EM)
POTR 1999.350,00:58 Q4120 (BDSN)
RFSB 1999.309,02:48 Q4120 (NHFN)
RSRB 1999.309,03:24 Q4120 (NHFN)
SAO 1999.341,22:30 Q980
SAO 1999.341,22:58 Q4120 (EM)
SMCB 1999.309,03:52 Q4120 (NHFN)
STAN Closed before upgrade Q680
WDC 1999.284,00:00 Q980
WENL 1999.350,00:53 Q4120 (BDSN)
YBH 1999.346,18:47 Q980
YBIB 1999.308,23:07 Q4120 (NHFN)


Causal and Acausal Filters

The BSL employs a mixture of causal and acausal FIR filters on the Quanterra data loggers of the BDSN, NHFN, and HRSN. All the Q680/Q935/Q980 filters are firmware controlled and employ acausal filters. In contrast, the filter tree in the Q4120 data loggers is software controlled and can be configured for either causal or acausal performance.

In configuring the data loggers, the BSL has tried to maintain a balance between consistency and research goals. In general, the BDSN Q4120s have been configured for acausal filters for consistency with the Q680/Q935/Q980 systems. In contrast, the NHFN Q4120s have been configured for casual filters at 500 and 100 sps (CL?, HL?, DP?, and EP? channels) while the lower data rate channels (BL?, LL?, BP?, and LP?) have been configured for acausal filters - again for consistency with the BDSN.

Hardware and Software system

The NCEDC is housed with the computing facilities at the Berkeley Seismological Laboratory in McCone Hall. The BSL and NCEDC computers share a high-speed 100 MBit switched network.

In 1998 partial funding for a new mass storage system was made available from the USGS in mid-year, but the purchase of the new mass storage system had been deferred in an attempt to acquire higher density drives, which ultimately were not available. The new mass system was purchased in late spring 1998, and is comprised of a Sun Ultra 450 computer, a 1.3 Tbyte DISC 517 slot jukebox with two 2.6 GByte Magneto Optical (MO) drives, an 11-slot AIT tape jukebox which holds 25 GBytes per tape, and the SAM-FS hierarchical filesystem management software. Only two MO drives and minimal media were purchased at that time.

In 1999, the mass storage system was upgraded from its initial 1.3 TByte capacity to 2.5 TByte capacity by the replacement of its 2.6 GByte MO drives by four 5.2 GByte MO drives and 5.2 GB MO media. The mass storage system can be upgraded to a total of 1000 slots (5.2 TByte) capacity with the addition of a second media picker, drives, and media cells.

The new hardware and software system can be configured to automatically create multiple copies of each data file. The NCEDC is using this feature to create an online copy of each data file on MO media, and another copy on AIT tape which will be stored offline.

During the past year, the NCEDC updated memory in its two Sun servers.

Joint Northern California
Earthquake Catalog

Currently both the USGS and BSL construct and maintain earthquake catalogs for northern and central California. The "official" UC Berkeley earthquake catalog begins in 1910, and the USGS "official" catalog begins in 1966. Both of these catalogs are archived and available through the NCEDC, but the existence of 2 catalogs has caused confusion among both researchers and the public. The BSL and the USGS have spent considerable effort over the past year to define procedures for merging the data from the two catalogs into a single northern and central California earthquake catalog in order to present a unified view of northern California seismicity. The differences in time period, variations in data availability, and mismatches in regions of coverage all complicate the task.

From 1910 through 1967, the BSL catalog is the primary source of northern California earthquake information. Only limited phase data are available for this time period, although location and magnitude information is provided. The NCSN began to come online in 1966, and observations from this network are available beginning in July of that year.

Starting with data from 1996, the BSL and USGS are working to generate a "joint" catalog by merging phase data and relocating the earthquakes. One of the initial complications in this project is matching up events between the two catalogs. Due to the sparse nature of the BSL instrumentation over the years, the BSL catalog is only complete at the magnitude 3 level while the USGS catalog is generally complete at the magnitude 2 level. However, the BSL catalog includes regional events from southern California, Nevada, Utah, Oregon, and Washington. Thus neither catalog is a subset of the other. Other complications include foreshock/aftershock sequences, where one organization might read a foreshock and the mainshock and the other might read the mainshock and an immediate aftershock. Since limited phase data are available for the BDSN until 1976, most events during this period will combine the USGS location with the BSL magnitude. Where BSL phase data are available, an event that appears in both catalogs will have its phase and amplitude data merged, the event will be relocated with the combined phase data, and magnitudes will be recomputed using the available amplitude readings and new location.

The process of consolidating the BSL data to be merged into the joint catalog has uncovered many details in the original catalog which were ambiguous or poorly documented or inconsistent over time, such as the use of channel names for phase and amplitude readings. This year, new problems over uncertainties in NCSN timing for older events have arisen.

The USGS and BSL performed an initial joint catalog from the USGS and BSL catalogs in the spring of 1999. The BSL spent considerable time analyzing the resulting catalog, and has identified problems with specific earthquake associations and other related problems. Most of the remaining problems are associated with events outside the network, especially in the Cape Mendocino/Gorda plate region. We are working to finalize this effort as well as to establish proceedures to creating an on-going joint catalog.

User Interface Development

During the past year, the NCEDC has developed a generalized database query system to support the development of portable database query applications among data centers with different internal database schemas. The initial goal was to modify the IRIS SeismiQuery web interface program to make installation easier at the NCEDC and other data centers, as well as to introduce a new query language that would be schema independent.

In order to support SeismiQuery and other future database query applications, we defined a set of Generic Data Views (GDV) for the database that encompassed the basic objects that we expect most data centers to support. We introduced a new language we call MSQL (Meta SeismiQuery Language), which is based on generic SQL, and uses the GDV's for its core schema. MSQL queries are converted to Data Center specific SQL queries by the parsing program MSQL2SQL. This parser stores the MSQL parsing tree in a data structure and API's were implemented to browse and modify elements in the parsing tree. These API's are the only datacenter or database specific source codes. We finally modified the SeismiQuery web interface to uniformly generate MSQL requests and to process these requests in a consistent fashion.

We have installed SeismiQuery at the NCEDC, where it provides a common interface for querying attributes and available data for both the BDSN and the USGS Low Frequency networks.

We have provided both IRIS and the SCEC Data Center with our modified version of SeismiQuery. We envision using this approach to support other database query programs in the future.

Database Development

Most of the parametric data archived at the NCEDC, such as earthquake catalogs, phase and amplitude readings, waveform inventory, and instrument responses have been stored in flat text files. Flat file are easily stored and viewed, but are not efficiently searched. Over the last year, the NCEDC, in collaboration with the USGS/SCEC Data Center, and TriNet, has continued development of database schemas to store the parametric data from the joint earthquake catalog, station history, complete instrument response for all data channels, and waveform inventory.

The parametric schema supports tables and associations for the joint earthquake catalog. It allows for multiple hypocenters per event, multiple magnitudes per hypocenter, and association of phases and amplitudes with multiple versions of hypocenters and magnitudes respectively. The instrument response schema represents full multi-stage instrument responses (including filter coefficients) for the broadband data loggers. The hardware tracking schema will represent the interconnection of instruments, amplifiers, filters, and data loggers over time. This schema will be used to store the joint northern California earthquake catalog and the CNSS composite catalog.

The entire description for the BDSN and USGS Low Frequency Geophysical networks and data archive has been entered into the hardware tracking, SEED instrument response, and waveform tables. Programs have been developed to perform queries of waveform inventory and instrument responses, and the NCEDC can now generate full SEED volumes from the BDSN network based on information from the database and the waveforms on the mass storage system. The second stage of development will include the NCSN waveform inventory and later the NCSN instrument response data as they are made available. We distributed all of our programs and procedures to populate the hardware tracking and instrument response tables to the SCECDC in order to help them populate their database.

Additional details on the joint catalog effort and database schema development may be found at http://quake.geo.berkeley.edu/db

Data Distribution

In a collaborative project with the IRIS DMC and other worldwide datacenters, the NCEDC has helped develop and implement NETDC, a protocol which will provide a seamless user interface to multiple datacenters for geophysical network and station inventory, instrument responses, and data retrieval requests. The NETDC builds upon the foundation and concepts of the IRIS BREQ_FAST data request system. The NETDC system was put into production in January 2000, and is currently operational at three datacenters worldwide - the NCEDC, IRIS DMC, and Geoscope. The NETDC system receives user requests via email, automatically routes the appropriate portion of the requests to the appropriate datacenter, optionally aggregates the responses from the various datacenters, and delivers the data (or ftp pointers to the data) to the users via email.

The NCEDC hosts a web page that allows users to easily query the NCEDC waveform inventory, generate and submit NETDC requests to the NCEDC. The NCEDC currently supports both the BREQ_FAST and NETDC request formats. As part of our collaboration with SCECDC, the NCEDC provided its BREQ_FAST interface code to SCECDC, have worked closely with them to implement BREQ_FAST requests at the SCECDC.

The various earthquake catalogs, phase, and earthquake mechanism can be searched using NCEDC web interfaces that allow users to select the catalog, attributes such as geographical region, time and magnitude. The GPS data is available to all users via anonymous ftp. Research accounts are available to any qualified researcher who needs access to the other datasets that currently are not available via the Web.

The NCEDC is participating in the UNAVCO-sponsored GPS Seamless Archive Centers (GSAC) initiative, which is developing common protocols and interfaces for the exchange and distribution of continuous and survey-mode GPS data. During this year, the NCEDC developed and implemented procedures to generate the appropriate GSAC inventory records for previously archived GPS data at the NCEDC, and now automatically generates the GSAC inventory records on a routine basis as part of its archiving procedures. The GSAC inventory records are available via anonymous ftp for other GSAC data wholesalers and retailers.

Acknowledgements

The NCEDC is a joint project of the BSL and the USGS Menlo Park and is partially funded by the USGS.

Under Barbara Romanowicz's general supervision, and with Doug Neuhauser as NCEDC manager, Stephane Zuzlewski, Rick McKenzie, Mark Murray, André Bassett, and Lind Gee of the BSL and David Oppenheimer, and Hal Macbeth of the USGS Menlo Park contribute to the operation of the NCEDC. Will Prescott of the USGS Menlo Park has been coordinating the contribution of the low-frequency data set. Doug Neuhauser, Stephane Zuzlewski, and Lind Gee contributed to the preparation of this chapter.

Rick Lester retired from the USGS in the spring of 2000 and the BSL would like to acknowledge the many contributions he made toward archiving NCSN data at the NCEDC over the years.


next up previous contents
Next: Outreach and Educational Activities Up: Operations Previous: Seismic Data Analysis   Contents



Berkeley Seismological Laboratory, 202 McCone Hall, UC Berkeley, Berkeley, CA 94720-4760
Questions or comments? Send e-mail: www@seismo.berkeley.edu
© 2001, The Regents of the University of California.