Northern California Earthquake Data Center

Introduction

The Northern California Earthquake Data Center, a joint project of the Berkeley Seismological Laboratory (BSL) and the U.S. Geological Survey at Menlo Park, serves as an online archive for various types of digital data relating to earthquakes in central and northern California. The NCEDC is located at the Berkeley Seismological Laboratory, and has been accessible to users via the Internet since mid-1992.

The primary goal of the NCEDC is to provide a stable and permanent archival and distribution center of digital geophysical data for networks in northern and central California. These data include seismic waveforms, electromagnetic data, GPS data, strain, creep, and earthquake parameters. The principal networks contributing seismic data to the data center are the Berkeley Digital Seismic Network (BDSN) operated by the Seismological Laboratory, the Northern California Seismic Network (NCSN) operated by the USGS, and the Bay Area Regional Deformation (BARD) GPS network. The collection of NCSN digital waveforms dates from 1984 to the present, the BDSN digital waveforms date from 1987 to the present, and the BARD GPS data date from 1993 to the present. The BDSN includes stations that form the specialized Northern Hayward Fault Network (NHFN) and the MiniPBO (MPBO) borehole seismic and strain stations in the SF Bay Region.

NCEDC Overview

The NCEDC is located within the computing facilities at the Berkeley Seismological Laboratory in McCone Hall. The BSL facility provides the NCEDC with air conditioning, a 100 Mbit switched network, and reliable power from a UPS with generator backup.

The currently installed NCEDC facilities consist of a Sun 280R host computer with a 15-slot AIT tape library which holds 25 GBytes per tape, two SCSI/ATA RAID systems with a total capacity of 4.6 TBytes, and the SAM-FS hierarchical storage management (HSM) software. A dual processor Sun Ultra 60 provides Web services and research account access to the NCEDC, and a Sun Ultra 450 computer is used for quality control procedures.

In order to increase capacity and reliability, the NCEDC has ordered a Sun L100 tape library with a capacity of 100 LTO-2 tapes, two LTO-2 tape drives, an additional 7 TBytes of RAID, and a fiber-channel switch to support a local SAN. With this additional hardware, we plan to implement a full online copy of the NCEDC data at an alternate location.

The hardware and software system can be configured to automatically create multiple copies of each data file. The NCEDC uses this feature to keep one copy of each data file on online RAID and another copy on AIT tape, which is stored offline. All waveform and GPS data are currently stored on magnetic disk, with backup copies on tape media.

The NCEDC acquired a single processor unlimited user Oracle database license from other funding sources to provide public access to the NCEDC earthquake catalog and waveform inventory, and has ordered a single processor computer on which to run this database.

2003-2004 Activities

By its nature, data archiving is an ongoing activity. In 2003-2004, the NCEDC continued to expand its data holdings and enhance access to the data. Projects and activities of particular note include:

These activities and projects are described in detail below.

Data Collections

The bulk of the data at the NCEDC consists of waveform and GPS data from northern California. Figure 10.1 shows the relative proportion of each data set at the NCEDC. The total size of the datasets archived at the NCEDC is shown in Table 10.1. Figure 10.2 shows the geographic distribution of data archived by the NCEDC.

Figure 10.1: Chart showing the relative proportion of each data set at the NCEDC.


Table 10.1: Volume of Data Archived at the NCEDC by network

Data Type                                                                    GBytes
BDSN/NHFN/MPBO (broadband, electric and magnetic field, strain) waveforms     1,414
NCSN seismograms                                                                870
Parkfield HRSN seismograms                                                      705
BARD GPS (RINEX and raw data)                                                   402
UNR Nevada seismograms                                                          135
SCSN seismograms                                                                 39
Calpine/Unocal Geysers region seismograms                                        38
USGS Low frequency geophysical waveforms                                          1
Misc data                                                                        14
Total size of archived data                                                   3,618
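
The relative proportions charted in Figure 10.1 can presumably be derived from the volumes in Table 10.1. The following is a minimal Python sketch of that arithmetic. The numbers are copied from the table; the shortened dictionary keys and the function name are illustrative labels only, not official data set names or NCEDC software.

```python
# Volumes in GBytes, copied from Table 10.1 (keys are shortened, illustrative labels).
VOLUMES_GB = {
    "BDSN/NHFN/MPBO waveforms": 1414,
    "NCSN seismograms": 870,
    "Parkfield HRSN seismograms": 705,
    "BARD GPS": 402,
    "UNR Nevada seismograms": 135,
    "SCSN seismograms": 39,
    "Calpine/Unocal Geysers seismograms": 38,
    "USGS low frequency waveforms": 1,
    "Misc data": 14,
}


def proportions(volumes):
    """Return (total, {name: percent of total}) for a dict of volumes."""
    total = sum(volumes.values())
    return total, {name: 100.0 * gb / total for name, gb in volumes.items()}


total_gb, percent = proportions(VOLUMES_GB)
for name, pct in sorted(percent.items(), key=lambda item: -item[1]):
    print(f"{name:40s} {pct:5.1f}%")
print(f"Total: {total_gb} GBytes")
```

The sum of the individual rows reproduces the total of 3,618 GBytes given in the last row of the table.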


Figure 10.2: Map showing the location of stations whose data are archived at the NCEDC. Circles are seismic sites, squares are GPS sites, and diamonds are the locations of USGS low-frequency experiments.

BDSN/NHFN/MPBO Seismic Data

The archival of current BDSN (Chapter 3), NHFN (Chapter 4), and Mini-PBO (Chapter 7) seismic data (all stations using the network code BK) is an ongoing task. These data are telemetered in real time from more than 30 seismic data loggers to the BSL, where they are written to disk files. Each day, an extraction process creates a daily archive by retrieving all continuous and event-triggered data for the previous day. The daily archive is run through quality control procedures to correct any timing errors, triggered data are reselected based on the REDI, NCSN, and BSL earthquake catalogs, and the resulting daily collection of data is archived at the NCEDC.

All of the data acquired from the BDSN/NHFN/MPBO Quanterra data loggers are archived at the NCEDC. The NCEDC has made an effort to archive older digital data, and the 16-bit BDSN digital broadband data from 1987-1991 have been converted to MiniSEED and are now online. In late June 2002, the NCEDC initiated a project to convert the remaining 16-bit BDSN data (MHC, SAO, and PKD1) from late 1991 through mid-1992 to MiniSEED. An undergraduate student was hired to read the old tapes and to work on the conversion. All remaining 20 Hz 16-bit BDSN data have been converted to MiniSEED, and we are working on the decimation procedures to create the 1 Hz data channels. Data acquired by portable 24-bit RefTek recorders before the installation of Quanterra data loggers at NHFN sites have not yet been converted to MiniSEED and archived.

Figure 10.3: Chart showing the availability of BDSN/NHFN/MPBO (BK network) data at the NCEDC for the 1 Hz and 20 Hz channels from 01/01/1996 - 06/30/2004. The data availability from these networks is better than 95% at nearly all stations. Notable exceptions are MNRC (operated for the first year with only dialup telemetry before the installation of continuous telemetry), YBIB (lost AC power before decommissioning the site), and W02B (experienced significant radio problems). In general, a difference between the 1 Hz and 20 Hz data is indicative of one or more significant telemetry problems. Following a major telemetry outage, BSL staff will recover 1 Hz continuous data but only event data for the 20 Hz channels.
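
The availability percentages quoted for Figure 10.3 can be read as the fraction of a time window that is covered by archived data. The following Python sketch shows one way such a figure could be computed from a list of archived time segments for a single channel. The function name and the input layout are assumptions made for illustration; this is not the procedure used to produce the figure.

```python
from datetime import datetime


def availability(segments, window_start, window_end):
    """Percent of [window_start, window_end) covered by (start, end) data segments.

    Overlapping or duplicated segments are counted only once.
    """
    window = (window_end - window_start).total_seconds()
    if window <= 0:
        raise ValueError("window must have positive length")
    covered = 0.0
    cursor = window_start  # everything before the cursor is already accounted for
    for start, end in sorted(segments):
        start = max(start, cursor)
        end = min(end, window_end)
        if end > start:
            covered += (end - start).total_seconds()
            cursor = end
    return 100.0 * covered / window


# Illustrative example: one day with a 6-hour gap at the end and overlapping segments.
day_start = datetime(2004, 1, 1)
day_end = datetime(2004, 1, 2)
example_segments = [
    (datetime(2004, 1, 1, 0), datetime(2004, 1, 1, 12)),
    (datetime(2004, 1, 1, 11), datetime(2004, 1, 1, 18)),
]
print(availability(example_segments, day_start, day_end))
```

Computing this value separately for the 1 Hz and 20 Hz channels of a station would expose the kind of difference that the caption attributes to telemetry outages.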

NCSN/SHFN Seismic Data

NCSN and SHFN waveform data are sent to the NCEDC via the Internet. The NCSN event waveform files are automatically transferred from the USGS Menlo Park to the NCEDC as part of the routine analysis procedure by the USGS, and are automatically verified and archived by the NCEDC.

The NCEDC maintains a list of teleseismic events recorded by the NCSN, which is updated automatically whenever a new NCSN event file is received at the NCEDC, since these events do not appear in the NCSN catalog.

The NCSN operates a total of 11 continuously telemetered digital broadband stations in northwest California and southwest Oregon in support of the USGS/NOAA Consolidated Reporting of EarthquakeS and Tsunamis (CREST) system, two digital broadband stations in the Mammoth region, and one digital broadband station in the Parkfield region. The NCEDC established procedures to create an archive of continuous data from these stations, in addition to the event waveform files. These data initially included channels at 50 and 100 Hz, but all are now sampled at 100 Hz and archived continuously. At the USGS's request, the three-component 500 Hz data from the Mammoth Deep Hole are now continuously archived.

Parkfield High Resolution Seismic Network Data

Event seismograms from the Parkfield High Resolution Seismic Network (HRSN) from 1987 through June 1998 are available in their raw SEGY format via NCEDC research accounts. A number of events have faulty timing due to the lack or failure of a precision time source for the network. Due to funding limitations, there is currently no ongoing work to correct the timing problems in the older events or to create MiniSEED volumes for these events. However, a preliminary catalog for a significant number of these events has been constructed, and the catalog is available via the Web at the NCEDC.

As described in Chapter 5, the original HRSN acquisition system failed in late 1998, and an interim system of portable RefTek recorders was installed at some of the sites. Data from this interim system are not currently available online.

In 2000 and 2001, 3 new borehole sites were installed, and the network was upgraded to operate with Quanterra Q730 data loggers and digital telemetry. The upgraded acquisition system detects events using the HRSN stations and extracts waveforms from both the HRSN and the PASO stations. The event waveform files are automatically transferred to the NCEDC, where they are made available to the research community via anonymous ftp until they are reviewed and permanently archived. In 2000-2003 the PASO array, a temporary IRIS PASSCAL broadband network with real-time telemetry, was installed in the Parkfield area and its recording system was housed at the HRSN recording site in Parkfield. During this time, the HRSN collected event data from both the HRSN and PASO array and provided this integrated data set to researchers in near-real-time.

The HRSN 20 Hz (BP) and state-of-health channels are being archived continuously at the NCEDC. As an interim measure, the NCEDC also archived the continuous 250 Hz (DP) data channels through late 2002 in order to help researchers retrieve events that were not detected during the network upgrade.

The increased seismic activity related to the magnitude 6.5 earthquake in nearby San Simeon on December 22, 2003 drastically increased the number of triggers by the HRSN network. From December 2003 through August 2004, the HRSN had over 70,000 triggers. The 56 Kb frame relay connection from Parkfield to UC Berkeley, which was installed to transmit continuous 20 Hz data, selected 250 Hz channels, and event-triggered 250 Hz waveforms from the network, was saturated by the increased activity. The HRSN stopped telemetering the event-triggered waveforms, and the NCEDC started to archive continuous 20 and 250 Hz data from the entire network from tapes created at the HRSN operations center in Parkfield in order to preserve this unique dataset. The NCEDC plans to continue archiving both the continuous 250 Hz and 20 Hz data streams for the foreseeable future.

SAFOD

In July 2002, scientists from Duke University successfully installed a three-component, 32-level downhole seismic array in the pilot hole at the EarthScope SAFOD site in collaboration with Steve Hickman (USGS), Mark Zoback (Stanford University) and the Oyo Geospace Engineering Resources International (GERI) Corporation. High frequency event recordings from this array have been provided by Duke University for archiving at the NCEDC. We are currently converting the original SEG-2 format data files to MiniSEED, and developing the SEED instrument responses for this data set.

UNR Broadband Data

The University of Nevada, Reno (UNR) operates several broadband stations in western Nevada and eastern California that are important for northern California earthquake processing and analysis. Starting in August 2000, the NCEDC has been receiving and archiving continuous broadband data from four UNR stations. The data are transmitted in real time from UNR to UC Berkeley, where they are made available for real-time earthquake processing and for archiving. Initially, some of the stations were sampled at 20 Hz, but all stations are now sampled and archived continuously at 100 Hz.

The NCEDC installed Simple Wave Server (SWS) software at UNR, which provides an interface to UNR's recent collection of waveforms. The SWS is used by the NCEDC to retrieve waveforms from UNR that were missing at the NCEDC due to real-time telemetry outages between UNR and UC Berkeley.

Electro-Magnetic Data

The NCEDC continues to archive and process electric and magnetic field data acquired from data loggers at two sites (SAO and PKD). At PKD and SAO, 3 components of magnetic field and 2 or 4 components of electric field are digitized and telemetered in real time along with seismic data to the Berkeley Seismological Laboratory, where they are processed and archived at the NCEDC in a similar fashion to the seismic data. The system generates continuous data channels at 40 Hz, 1 Hz, and 0.1 Hz for each component of data. All of these data are archived and remain available online at the NCEDC. Using programs developed by Dr. Martin Fullerkrug at the Stanford University STAR Laboratory (now at the Institute for Meteorology and Geophysics at the University of Frankfurt), the NCEDC is computing and archiving magnetic activity and Schumann resonance analysis using the 40 Hz data from this dataset. The magnetic activity and Schumann resonance data can be accessed from the Web.

In addition to the electro-magnetic data from PKD and SAO, the NCEDC archives data from a low-frequency, long-baseline electric field project operated by Dr. Steve Park of UC Riverside at site PKD2. This experiment, which is separate from the original equipment at PKD1 described in Chapter 3, uses an 8-channel Quanterra data logger to record the data, which are transmitted to the BSL using the same circuit as the BDSN seismic data. These data are acquired and archived in an identical manner to the other electric field data at the NCEDC.

GPS Data

The NCEDC continues to expand its archive of GPS data through the BARD (Bay Area Regional Deformation) network of continuously monitored GPS receivers in northern California (Chapter 6). The NCEDC GPS archive now includes 67 continuous sites in northern California. There are approximately 50 core BARD sites owned and operated by UC Berkeley, USGS (Menlo Park and Cascade Volcano Observatory), LLNL, UC Davis, UC Santa Cruz, Trimble Navigation, and Stanford. Data are also archived from sites operated by other agencies including the East Bay Municipal Utility District, the City of Modesto, the National Geodetic Survey, and the Jet Propulsion Laboratory.

The NCEDC continues to archive non-continuous survey GPS data. The initial dataset archived is the survey GPS data collected by the USGS Menlo Park for northern California and other locations. The NCEDC is the principal archive for this dataset. Significant quality control efforts were implemented by the NCEDC to ensure that the raw data, scanned site log sheets, and RINEX data are archived for each survey. All of the USGS MP GPS data has been transferred to the NCEDC and virtually all of the data from 1992 to the present has been archived and is available for distribution.

Geysers Seismic Data

The Calpine Corporation currently operates a micro-seismic monitoring network in the Geysers region of northern California. Prior to 1999 this network was operated by Unocal. Through various agreements, both Unocal and Calpine have released triggered event waveform data from 1989 through 2000, along with preliminary event catalogs for the same time period, for archiving and distribution through the NCEDC. This dataset represents over 296,000 events that were recorded by the Calpine/Unocal Geysers network, and it is available via research accounts at the NCEDC.

The Lawrence Berkeley Laboratory (LBL), with funding from the California Energy Commission, operates a 22 station network in the Geysers region with an emphasis on monitoring seismicity related to well water injection. The earthquake locations and waveforms from this network are sent to the NCEDC, and the locations are forwarded to the NCSN so that they can be merged into the NCSN earthquake catalog. The LBL Geysers waveforms will be available at the NCEDC once the events have been merged into the NCSN catalog.

USGS Low Frequency Data

Over the last 26 years, the USGS at Menlo Park, in collaboration with other principal investigators, has collected an extensive low-frequency geophysical data set that contains over 1300 channels of tilt, tensor strain, dilatational strain, creep, magnetic field, water level, and auxiliary channels such as temperature, pore pressure, rain and snow accumulation, and wind speed. In collaboration with the USGS, we assembled the requisite information for the hardware representation of the stations and the instrument responses for many channels of this diverse dataset, and developed the required programs to populate and update the hardware database and generate the instrument responses. We developed the programs and procedures to automate the process of importing the raw waveform data and converting it to MiniSEED format.

We have currently archived timeseries data from 887 data channels from 167 sites, and have instrument response information for 542 channels at 139 sites. The waveform archive is updated on a daily basis with data from 350 currently operating data channels. We will augment the raw data archive as additional instrument response information is assembled by the USGS for the channels, and will work with the USGS to clearly define the attributes of the "processed" data channels.

SCSN/Statewide Seismic Data

In 2004, the NCEDC started to archive broadband and strong motion data from 15 SCSN (network CI) stations that are telemetered to the Northern California Management Center (NCMC) of the California Integrated Seismic Network (CISN). These data are used in the prototype real-time state-wide earthquake processing system and also provide increased coverage for northern California events. Since the data are telemetered directly from the stations in real-time to both the SCSN and to the NCMC, the NCEDC archives the NCMC's copy of the data to ensure that at least one copy of the data will be preserved.

Northern California Seismicity Project

The objective of the Northern California Seismicity Project (NCSP), which commenced in fiscal year 2001, is to transcribe the pre-1984 data for $M_L \geq 2.8$ earthquakes which have occurred in Northern and Central California (NCC) outside of the San Francisco Bay region (SFBR), from the original reading/analysis sheets of the Berkeley Seismological Archives, into a computer readable format. This work complements the ongoing Historical Earthquake Relocation Project (HERP) of the Berkeley Seismological Laboratory, which concentrates solely on the San Francisco Bay Region.

The long-term goal of this project is to characterize the spatial and temporal evolution of northern California seismicity during the initial part of the earthquake cycle as the region emerges from the stress shadow of the great 1906 San Francisco earthquake. The problem is that the existing BSL seismicity catalog for the SFBR, which spans most of the past century (1910-present), is inherently inhomogeneous because the location and magnitude determination methodologies have changed, as seismic instrumentation and computational capabilities have improved over time. As a result, NCC seismicity since 1906 is poorly understood.

Creation of a NCC seismicity catalog that is homogeneous, that spans as many years as possible, and that includes formal estimates of the parameters and their uncertainty is a fundamental prerequisite for probabilistic studies of the NCC seismicity. The existence of the invaluable BSL seismological archive containing the original seismograms as well as the original reading/analysis sheets, coupled with the recently acquired BSL capability to scan and digitize historical seismograms at high resolution, allows the application of modern analytical algorithms towards the problem of determining the source parameters of historical SFBR earthquakes.

The funding level for this project has not allowed us to transcribe all of the pre-1984 reading/analysis sheets from the Berkeley Seismological Archive. However, limiting our work to earthquakes of $M_L \geq 3.0$ provides a significant contribution to the uniformity of the NCC seismicity catalog. Although some funding was provided this year, we were unable to hire staff to work on this project and will complete it in 2004.

Earthquake Catalogs

Northern California

Currently both the USGS and BSL construct and maintain earthquake catalogs for northern and central California. The "official" UC Berkeley earthquake catalog begins in 1910, and the USGS "official" catalog begins in 1966. Both of these catalogs are archived and available through the NCEDC, but the existence of two catalogs has caused confusion among both researchers and the public. The BSL and the USGS have spent considerable effort over the past years to define procedures for merging the data from the two catalogs into a single northern and central California earthquake catalog in order to present a unified view of northern California seismicity. The differences in time period, variations in data availability, and mismatches in regions of coverage all complicate the task.

Worldwide

The NCEDC, in conjunction with the Council of the National Seismic System (CNSS), produced and distributed a world-wide composite catalog of earthquakes based on the catalogs of the national and various U.S. regional networks for several years. Each network updates its earthquake catalog on a daily basis at the NCEDC, and the NCEDC constructs a composite world-wide earthquake catalog by combining the data, removing duplicate entries that may occur when multiple networks record the same event, and giving priority to the data from each network's authoritative region. The catalog, which includes data from 14 regional and national networks, is searchable using a Web interface at the NCEDC. The catalog is also freely available to anyone via ftp over the Internet.
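
The three steps named above (combine, remove duplicates, prefer the authoritative network) can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not give the actual duplicate criteria or region definitions, so the network codes `NET_A` and `NET_B`, the rectangular regions, and the time and distance thresholds are all hypothetical.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Event:
    net: str     # contributing network code
    time: float  # origin time, epoch seconds
    lat: float
    lon: float
    mag: float


# Hypothetical authoritative regions: (min_lat, max_lat, min_lon, max_lon).
REGIONS = {
    "NET_A": (36.0, 42.0, -125.0, -119.0),
    "NET_B": (32.0, 36.0, -121.0, -114.0),
}


def distance_km(a, b):
    """Great-circle distance between two epicenters (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def is_authoritative(ev):
    """True if the event lies inside its own network's region."""
    box = REGIONS.get(ev.net)
    return box is not None and box[0] <= ev.lat <= box[1] and box[2] <= ev.lon <= box[3]


def merge_catalogs(events, max_dt=16.0, max_km=100.0):
    """Combine events, keeping one entry per earthquake (thresholds are assumed)."""
    merged = []
    for ev in sorted(events, key=lambda e: e.time):
        for i, kept in enumerate(merged):
            if abs(ev.time - kept.time) <= max_dt and distance_km(ev, kept) <= max_km:
                # Duplicate report: prefer the solution from the authoritative network.
                if is_authoritative(ev) and not is_authoritative(kept):
                    merged[i] = ev
                break
        else:
            merged.append(ev)
    return merged
```

With two reports of the same earthquake inside the `NET_A` region, one from each network, `merge_catalogs` keeps only the `NET_A` entry.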

With the demise of the CNSS and the development of the Advanced National Seismic System (ANSS), the NCEDC was asked to update its Web pages to present the composite catalog as a product of the ANSS. This conversion was completed in the fall of 2002.

Data Quality Control

The NCEDC developed a GUI-based, state-driven system, CalQC, to facilitate the quality control processing that is applied to the continuously archived data sets at the NCEDC.

The quality control procedures for these datasets include the following tasks:

CalQC uses previously developed programs to perform each function, but it provides a graphical point-and-click interface to automate these procedures, and to provide the analyst with a record of when each process was started, whether it executed correctly, and whether the analyst has indicated that a step has been completed. CalQC is used to process all data from the BDSN network, and all continuous data from the NCSN, UNR, SCSN, and HRSN networks that are archived by the NCEDC.

User Interface Development

SeismiQuery

During 2000 and 2001, the NCEDC developed a generalized database query system to support the development of portable database query applications among data centers with different internal database schemas. The initial goal was to modify the IRIS SeismiQuery web interface program to make installation easier at the NCEDC and other data centers, as well as to introduce a new query language that would be schema independent.

In order to support SeismiQuery and other future database query applications, we defined a set of Generic Data Views (GDVs) for the database that encompass the basic objects we expect most data centers to support. We introduced a new language we call MSQL (Meta SeismiQuery Language), which is based on generic SQL and uses the GDVs for its core schema. MSQL queries are converted to data-center-specific SQL queries by the parsing program MSQL2SQL. This parser stores the MSQL parsing tree in a data structure, and APIs were implemented to browse and modify elements in the parsing tree. These APIs are the only source code that is specific to a data center or database. Finally, we modified the SeismiQuery web interface to uniformly generate MSQL requests and to process these requests in a consistent fashion.
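
The idea of rewriting a schema-independent query into a data-center-specific one can be illustrated with a short Python sketch. It does not reproduce MSQL2SQL: the real program builds and edits a parse tree, whereas this sketch substitutes names with regular expressions, and the MSQL grammar is not given in the text. The view name `gdv_channel`, the table `channel_data`, and all column names are hypothetical.

```python
import re

# Hypothetical mapping from a generic view to one data center's table.
VIEW_MAP = {"gdv_channel": "channel_data"}

# Hypothetical mapping from generic column names to that table's columns.
COLUMN_MAP = {
    ("gdv_channel", "station"): "sta",
    ("gdv_channel", "network"): "net",
    ("gdv_channel", "channel"): "seedchan",
}


def translate(msql):
    """Rewrite generic `view.column` references and view names to local names."""

    def column(match):
        view, col = match.group(1), match.group(2)
        if view not in VIEW_MAP:
            return match.group(0)  # not a generic view: leave untouched
        return f"{VIEW_MAP[view]}.{COLUMN_MAP.get((view, col), col)}"

    # First pass: qualified column references such as gdv_channel.station.
    sql = re.sub(r"\b(\w+)\.(\w+)\b", column, msql)
    # Second pass: remaining bare view names, for example in the FROM clause.
    return re.sub(r"\b(\w+)\b", lambda m: VIEW_MAP.get(m.group(1), m.group(1)), sql)


query = (
    "SELECT gdv_channel.station, gdv_channel.channel "
    "FROM gdv_channel WHERE gdv_channel.network = 'BK'"
)
print(translate(query))
```

In this arrangement only the two mapping tables would differ between data centers, which mirrors the statement that only the tree-editing APIs are site-specific.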

We have installed SeismiQuery at the NCEDC, where it provides a common interface for querying attributes and available data for SEED format data, and have provided both IRIS and the SCEC Data Center with our modified version of SeismiQuery. We envision using this approach to support other database query programs in the future.

NetDC

In a collaborative project with the IRIS DMC and other worldwide datacenters, the NCEDC helped develop and implement NetDC, a protocol which will provide a seamless user interface to multiple datacenters for geophysical network and station inventory, instrument responses, and data retrieval requests. The NetDC builds upon the foundation and concepts of the IRIS BREQ_FAST data request system. The NetDC system was put into production in January 2000, and is currently operational at three datacenters worldwide - the NCEDC, IRIS DMC, and Geoscope. The NetDC system receives user requests via email, automatically routes the appropriate portion of the requests to the appropriate datacenter, optionally aggregates the responses from the various datacenters, and delivers the data (or ftp pointers to the data) to the users via email.
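
The routing step described above, in which each portion of a request goes to the data center that holds the data, might look like the following Python sketch. It is an illustration only: the one-line request layout (`NET STA CHAN START END`) is an assumption and is not the NetDC request format, and the routing table is a small illustrative example rather than the routing information the system actually uses.

```python
from collections import defaultdict

# Illustrative routing table: network code -> data center holding that network.
ROUTES = {"BK": "NCEDC", "NC": "NCEDC", "IU": "IRIS DMC", "G": "Geoscope"}


def route_request(lines, routes=ROUTES):
    """Split request lines by data center, keyed on the leading network code.

    Returns (per_center, unroutable), where per_center maps a data center name
    to the request lines it should receive.
    """
    per_center = defaultdict(list)
    unroutable = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        center = routes.get(fields[0])
        if center is None:
            unroutable.append(line)
        else:
            per_center[center].append(line)
    return dict(per_center), unroutable


request = [
    "BK CMB BHZ 2004-01-01 2004-01-02",
    "G SSB BHZ 2004-01-01 2004-01-02",
    "XX FOO BHZ 2004-01-01 2004-01-02",
]
routed, rejected = route_request(request)
print(routed)
print(rejected)
```

Aggregating the responses would then amount to collecting the per-center results before a single reply is sent to the user.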

STP

In 2002, the NCEDC wrote a collaborative proposal with the SCEDC to the Southern California Earthquake Center, with the goal of unifying data access between the two data centers. As part of this project, the NCEDC and SCEDC are working to support a common set of 3 tools for accessing waveform and parametric data: SeismiQuery, NetDC, and STP.

The Seismogram Transfer Program or STP is a simple client-server program, developed at the SCEDC. Access to STP is either through a simple direct interface that is available for Sun or Linux platforms, or through a GUI Web interface. With the direct interface, the data are placed directly on a user's computer in several possible formats, with the byte-swap conversion performed automatically. With the Web interface, the selected and converted data are retrieved with a single ftp command. The STP interface also allows rapid access to parametric data such as hypocenters and phases.

The NCEDC has continued work on STP, working with the SCEDC on extensions and needed additions. We added support for the full SEED channel name (Station, Network, Channel, and Location), and are now able to return event-associated waveforms from the NCSN waveform archive.

EVT_FAST

In order to provide Web access to the NCSN waveforms before the SEED conversion and instrument responses for the NCSN have been completed, the NCEDC implemented EVT_FAST, an interim email-based waveform request system similar to the BREQ_FAST email request system. Users can email EVT_FAST requests to the NCEDC and request NCSN waveform data based on the NCSN event id. The NCSN waveform data are converted to either SAC ASCII, SAC binary, or AH format, and placed in the anonymous ftp directory so that users can retrieve the data. The EVT_FAST waveforms are currently named with the USGS's native NCSN channel names. We have just begun the work to provide EVT_FAST waveform data in SEED format with SEED channel names.

FISSURES

The FISSURES project developed from an initiative by IRIS to improve earth scientists' efficiency by developing a unified environment that can provide interactive or programmatic access to waveform data and the corresponding metadata for instrument response, as well as station and channel inventory information. FISSURES was developed using CORBA (Common Object Request Broker Architecture) as the architecture to implement a system-independent method for the exchange of this binary data. The IRIS DMC developed a series of services, referred to as the Data Handling Interface (DHI), using the FISSURES architecture to provide waveform and metadata from the IRIS DMC.

The NCEDC has implemented the FISSURES Data Handling Interface (DHI) services, which involves interfacing the DHI servers with the NCEDC database schema. We started with the source code for the IRIS DMC's DHI servers, which significantly reduced the implementation time. We now have the waveform and event FISSURES services running in demonstration mode at the NCEDC. These services interact with the NCEDC database and data storage system, and can deliver NCEDC event and channel metadata as well as waveforms using the FISSURES interfaces. We have installed the FISSURES DHI servers, and worked with the IRIS DMC in 2003-2004 to register with the FISSURES naming services which are run at both the IRIS DMC and the NCEDC.

GSAC

Since 1997, the NCEDC has collaborated with UNAVCO and other members of the GPS community on the development of the GPS Seamless Archive Centers (GSAC) project. This project allows a user to access the most current version of GPS data and metadata from distributed archive locations. The NCEDC is participating at several levels in the GSAC project: as a primary provider of data collected from core BARD stations and USGS MP surveys, and as a wholesale collection point for other data collected in northern California. We helped to define database schema and file formats for the GSAC project, and have produced complete and incremental monumentation and data holdings files describing the data sets that are produced by the BARD project or archived at the NCEDC so that other members of the GSAC community can provide up-to-date information about our holdings. Currently, the NCEDC is the primary provider for over 120,000 data files from over 1400 continuous and survey-mode monuments. The data holdings records for these data have been incorporated into the GSAC retailer system, which became publicly available in late 2002.

Database Development

Most of the parametric data archived at the NCEDC, such as earthquake catalogs, phase and amplitude readings, waveform inventory, and instrument responses, have been stored in flat text files. Flat files are easily stored and viewed, but are not efficiently searched. Over the last year, in collaboration with the Southern California Earthquake Data Center (SCEDC) and the California Integrated Seismic Network (CISN), the NCEDC has continued development of database schemas to store the parametric data from the joint earthquake catalog, station history, complete instrument response for all data channels, and waveform inventory.

The parametric schema supports tables and associations for the joint earthquake catalog. It allows for multiple hypocenters per event, multiple magnitudes per hypocenter, and association of phases and amplitudes with multiple versions of hypocenters and magnitudes respectively. The instrument response schema represents full multi-stage instrument responses (including filter coefficients) for the broadband data loggers. The hardware tracking schema will represent the interconnection of instruments, amplifiers, filters, and data loggers over time. The parametric schema will be used to store the joint northern California earthquake catalog and the ANSS composite catalog.
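
The one-to-many relationships in the parametric schema (event to hypocenters, hypocenter to magnitudes, phases tied to hypocenters, amplitudes tied to magnitudes) can be pictured with a small Python data model. This is a sketch of the relationships as described in the text, not the database schema itself; all class names, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Amplitude:
    station: str
    value: float


@dataclass
class Magnitude:
    mag_type: str
    value: float
    # Amplitudes are associated with a specific magnitude solution.
    amplitudes: List[Amplitude] = field(default_factory=list)


@dataclass
class Phase:
    station: str
    phase: str
    time: float  # arrival time, epoch seconds


@dataclass
class Hypocenter:
    lat: float
    lon: float
    depth_km: float
    # Phases are associated with a specific hypocenter solution.
    phases: List[Phase] = field(default_factory=list)
    magnitudes: List[Magnitude] = field(default_factory=list)


@dataclass
class Event:
    event_id: int
    hypocenters: List[Hypocenter] = field(default_factory=list)


def count_solutions(event: Event) -> Tuple[int, int]:
    """Return (number of hypocenters, total number of magnitudes) for an event."""
    return len(event.hypocenters), sum(len(h.magnitudes) for h in event.hypocenters)


# Illustrative values only: one event with two hypocenter solutions.
example = Event(
    event_id=1,
    hypocenters=[
        Hypocenter(37.9, -122.3, 8.0, magnitudes=[Magnitude("ML", 3.4), Magnitude("Md", 3.2)]),
        Hypocenter(37.91, -122.29, 7.5, magnitudes=[Magnitude("ML", 3.5)]),
    ],
)
print(count_solutions(example))
```

In a relational database each of these classes would correspond to one or more tables, with association tables linking phases to hypocenters and amplitudes to magnitudes.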

The entire description of the BDSN/NHFN/MPBO, HRSN, and USGS Low Frequency Geophysical networks and data archive has been entered into the hardware tracking, SEED instrument response, and waveform tables. Using programs developed to perform queries of waveform inventory and instrument responses, the NCEDC can now generate full SEED volumes for these networks based on information from the database and the waveforms on the mass storage system.

During 2002-2003, the NCEDC and NCSN jointly developed a system consisting of an extensive spreadsheet containing per-channel information that describes the hardware of each NCSN data channel and provides each channel with a SEED-compliant channel name. This spreadsheet, combined with a limited number of files that describe the central-site analog digitizer, FIR decimation filters, and general characteristics of digital acquisition systems, allows the NCSN to assemble its station history in a format that the NCEDC can use to populate the hardware tracking and instrument response database tables for the NCSN.

During 2003-2004, the NCEDC and NCSN finalized the CUSP-to-SEED channel mapping for the NCSN waveforms and entered all of the hardware tracking and response information for the sites operated by the NCSN into the NCEDC database. The NCEDC can now generate complete SEED responses for all of those data channels. However, additional work remains to be done in conjunction with contributing networks such as CA DWR, UNR, and SCSN to provide responses for shared stations.

The second part of this project is the conversion of the NCSN waveforms from their native CUSP format into MiniSEED, the standard NCEDC waveform format. Multiple problems needed to be addressed, such as ambiguous or erroneously labeled CUSP data channels, sensors that were recorded on multiple data channels, and the need to ensure that each distinct data channel is mapped to a distinct SEED channel name. The NCEDC developed programs that use the time-dependent NCSN instrument response spreadsheet and NCSN-supplied channel name transformation rules to determine the SEED channel naming, and to provide feedback to the NCSN on channel naming problems. In 2004, the NCEDC converted all of the NCSN waveform data from the period 1984 through 2003 from CUSP format into MiniSEED format. We entered the waveform descriptors into the NCEDC database, and provided association information between the NCSN event ids and the corresponding waveform data. We are currently developing procedures to convert new NCSN waveforms into MiniSEED format and archive them as they are received by the NCEDC, and to convert the remaining 2004 CUSP waveforms.
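The requirement that each distinct data channel map to a distinct SEED channel name amounts to checking that the mapping is one-to-one. The sketch below shows one way such a check could produce feedback on naming problems; the native channel identifiers and the station code are invented for the example, and this is not the NCEDC's conversion code.

```python
from collections import defaultdict


def find_name_collisions(mapping):
    """Find SEED names that more than one native channel maps to.

    mapping: dict of native channel id -> (net, sta, chan, loc) SEED name.
    Returns {seed_name: sorted list of native ids} for every SEED name
    that is shared by two or more native channels.
    """
    by_seed = defaultdict(list)
    for native_id, seed_name in mapping.items():
        by_seed[seed_name].append(native_id)
    return {seed: sorted(ids) for seed, ids in by_seed.items() if len(ids) > 1}


if __name__ == "__main__":
    # Hypothetical native channel ids and SEED names.
    demo = {
        "cusp-0001": ("NC", "XAAA", "EHZ", ""),
        "cusp-0002": ("NC", "XAAA", "EHZ", ""),   # collides with cusp-0001
        "cusp-0003": ("NC", "XAAA", "ELZ", ""),
    }
    for seed, ids in find_name_collisions(demo).items():
        print(".".join(seed), "is assigned to", ", ".join(ids))
```

In practice the mapping is time-dependent, so a fuller check would apply the same test separately to each time interval in the station history.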

The NCEDC has developed XML import and export procedures to provide better maintenance of the hardware tracking information and resulting instrument responses for stations in our database. When changes are made to either existing hardware or to station configurations, we export the current view in XML format, use a GUI-based XML editor to easily update the information, and import the changes back into the database. When adding new stations or hardware, we can easily use information from existing hardware or stations as templates for the new information. This allows us to treat the database as the authoritative source of information, and to use off-the-shelf tools such as the XML editor and XML differencing programs as part of our database maintenance procedures.
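The export, edit, and import cycle can be sketched with Python's standard XML library. The element names, attribute names, and hardware labels below are hypothetical; the actual NCEDC XML layout is not given in this report.

```python
import xml.etree.ElementTree as ET


def export_station(station):
    """Serialize a station description (a plain dict) to an XML string."""
    root = ET.Element("station", code=station["code"])
    for ch in station["channels"]:
        ET.SubElement(root, "channel", name=ch["name"],
                      sensor=ch["sensor"], datalogger=ch["datalogger"])
    return ET.tostring(root, encoding="unicode")


def import_station(xml_text):
    """Parse XML produced by export_station back into a dict."""
    root = ET.fromstring(xml_text)
    return {"code": root.get("code"),
            "channels": [dict(c.attrib) for c in root.findall("channel")]}


def changed_channels(old, new):
    """List names of channels in `new` that are added or differ from `old`."""
    old_by_name = {c["name"]: c for c in old["channels"]}
    return [c["name"] for c in new["channels"]
            if old_by_name.get(c["name"]) != c]


if __name__ == "__main__":
    current = {"code": "STA1", "channels": [
        {"name": "BHZ", "sensor": "SENSOR-A", "datalogger": "LOGGER-1"},
        {"name": "BHN", "sensor": "SENSOR-A", "datalogger": "LOGGER-1"},
    ]}
    xml_text = export_station(current)
    # Stand-in for an edit made in an XML editor: swap the first sensor.
    edited = xml_text.replace('sensor="SENSOR-A"', 'sensor="SENSOR-B"', 1)
    print(changed_channels(current, import_station(edited)))
```

Comparing the re-imported description against the exported one before writing to the database plays the same role as running an XML differencing program on the two files.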

We distributed all of our programs and procedures for populating the hardware tracking and instrument response tables to the SCEDC in order to help them populate their database.

During 2002-2003, the BSL processed events detected by the HRSN (BP) network. The waveform data and event parameters (picks and hypocenters) are stored in separate HRSN database tables, and will be merged with events from the NCSN when the NCSN catalog is migrated to the database. However, human event processing stopped after the San Simeon earthquake due to the rapid increase in seismicity related to that event.

Additional details on the joint catalog effort and database schema development may be found at http://quake.geo.berkeley.edu/db

Data Distribution

The NCEDC continues to use the World Wide Web as a principal interface for users to request, search, and receive data from the NCEDC. The NCEDC has implemented a number of useful and original mechanisms of data search and retrieval using the World Wide Web, which are available to anyone on the Internet. All of the documentation about the NCEDC, including the research users' guide, is available via the Web. Users can perform catalog searches and retrieve hypocentral information and phase readings from the various earthquake catalogs at the NCEDC via easy-to-use forms on the Web. In addition, users can peruse the index of available broadband data at the NCEDC, and can request and retrieve broadband data in standard SEED format via the Web. Access to all datasets is available via research accounts at the NCEDC. The NCEDC's Web address is http://quake.geo.berkeley.edu/

The NCEDC hosts a web page that allows users to easily query the NCEDC waveform inventory, and generate and submit NetDC requests to the NCEDC. The NCEDC currently supports both the BREQ_FAST and NetDC request formats. As part of our collaboration with SCEDC, the NCEDC provided its BREQ_FAST interface code to SCEDC, and has worked with them to implement BREQ_FAST requests at the SCEDC.

The various earthquake catalogs (including phase and earthquake mechanism data) can be searched using NCEDC web interfaces that allow users to select the catalog and attributes such as geographical region, time, and magnitude. The GPS data is available to all users via anonymous FTP. Research accounts are available to any qualified researcher who needs access to the other datasets that currently are not available via the Web.
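A catalog search of this kind is essentially a filter on time, a latitude/longitude box, and a magnitude threshold. The sketch below shows the selection logic only; the `Hypocenter` record and the parameter names are assumptions for this example and do not reflect the NCEDC web forms or their implementation.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Hypocenter:
    """Hypothetical catalog entry."""
    time: datetime
    lat: float
    lon: float
    depth_km: float
    mag: float


def search(catalog, *, start, end, min_lat, max_lat, min_lon, max_lon,
           min_mag=0.0):
    """Return entries inside the time window, bounding box, and magnitude cut.

    The time window is half-open: start <= time < end.
    """
    return [
        h for h in catalog
        if start <= h.time < end
        and min_lat <= h.lat <= max_lat
        and min_lon <= h.lon <= max_lon
        and h.mag >= min_mag
    ]
```

A simple bounding box like this does not handle regions that cross the 180-degree meridian, which is not a concern for a northern California catalog.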

The GPS data archived at the NCEDC is available over the Internet through the GSAC retailer system, which became publicly available in late 2002, as well as by anonymous FTP.

Web Pages

The NCEDC developed its Web pages in the early days of the Web. Unfortunately, time constraints have kept the pages somewhat static and limited in their use. In June of 2002, the NCEDC began a project to update and expand its Web offerings. This project was completed in October 2002, and provides the NCEDC with a uniform look-and-feel for all web pages.

Acknowledgements

The NCEDC is a joint project of the BSL and the USGS Menlo Park and is partially funded by the USGS.

Doug Neuhauser is the manager of the NCEDC. Stephane Zuzlewski, Rick McKenzie, Mark Murray, André Basset, and Lind Gee of the BSL and David Oppenheimer, Hal Macbeth, and Fred Klein of the USGS Menlo Park contribute to the operation of the NCEDC. Doug Neuhauser, Lind Gee, and Stephane Zuzlewski contributed to the preparation of this chapter.

Berkeley Seismological Laboratory
215 McCone Hall, UC Berkeley, Berkeley, CA 94720-4760
Questions or comments? Send e-mail: www@seismo.berkeley.edu
© 2004, The Regents of the University of California