Database Development

Most of the parametric data archived at the NCEDC, such as earthquake catalogs, phase and amplitude readings, waveform inventory, and instrument responses, have been stored in flat text files. Flat files are easily stored and viewed, but are not efficiently searched. Over the last year, the NCEDC, in collaboration with the Southern California Earthquake Data Center (SCEDC) and the California Integrated Seismic Network (CISN), has continued development of database schemas to store the parametric data from the joint earthquake catalog, station history, complete instrument response for all data channels, and waveform inventory.

The parametric schema supports tables and associations for the joint earthquake catalog. It allows for multiple hypocenters per event, multiple magnitudes per hypocenter, and association of phases and amplitudes with multiple versions of hypocenters and magnitudes, respectively. The instrument response schema represents full multi-stage instrument responses (including filter coefficients) for the broadband data loggers. The hardware tracking schema will represent the interconnection of instruments, amplifiers, filters, and data loggers over time. The parametric schema will be used to store the joint northern California earthquake catalog and the ANSS composite catalog.
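The one-to-many relationships described above can be illustrated with a small relational sketch. The table names, column names, and sample values below are assumptions chosen for illustration and are not the actual NCEDC schema; the sketch only shows how an event can carry several hypocenters, each hypocenter several magnitudes, and one phase reading several hypocenter associations.

```python
import sqlite3

# Illustrative tables only: names, columns, and values are assumptions,
# not the actual NCEDC parametric schema.
SCHEMA = """
CREATE TABLE event (evid INTEGER PRIMARY KEY);
CREATE TABLE hypocenter (
    hypo_id  INTEGER PRIMARY KEY,
    evid     INTEGER NOT NULL REFERENCES event(evid),
    lat REAL, lon REAL, depth_km REAL
);
CREATE TABLE magnitude (
    mag_id   INTEGER PRIMARY KEY,
    hypo_id  INTEGER NOT NULL REFERENCES hypocenter(hypo_id),
    mag_type TEXT, value REAL
);
CREATE TABLE phase (
    phase_id INTEGER PRIMARY KEY, station TEXT, phase_type TEXT
);
-- Association table: one phase reading may be tied to several
-- versions of the hypocenter for the same event.
CREATE TABLE phase_assoc (
    phase_id INTEGER REFERENCES phase(phase_id),
    hypo_id  INTEGER REFERENCES hypocenter(hypo_id),
    PRIMARY KEY (phase_id, hypo_id)
);
"""


def build_demo():
    """Create an in-memory database holding one made-up event."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.execute("INSERT INTO event VALUES (1)")
    # Two hypocenter versions for the same event.
    db.executemany(
        "INSERT INTO hypocenter VALUES (?, ?, ?, ?, ?)",
        [(10, 1, 37.87, -122.26, 8.0), (11, 1, 37.88, -122.25, 7.5)],
    )
    # One magnitude for the first hypocenter, two for the second.
    db.executemany(
        "INSERT INTO magnitude VALUES (?, ?, ?, ?)",
        [(100, 10, "Md", 3.1), (101, 11, "Md", 3.2), (102, 11, "ML", 3.4)],
    )
    db.execute("INSERT INTO phase VALUES (1000, 'STA1', 'P')")
    db.executemany(
        "INSERT INTO phase_assoc VALUES (?, ?)", [(1000, 10), (1000, 11)]
    )
    return db


def magnitudes_for_event(db, evid):
    """Return (hypo_id, mag_type, value) for every magnitude of an event."""
    rows = db.execute(
        "SELECT h.hypo_id, m.mag_type, m.value "
        "FROM hypocenter h JOIN magnitude m ON m.hypo_id = h.hypo_id "
        "WHERE h.evid = ? ORDER BY h.hypo_id, m.mag_id",
        (evid,),
    )
    return rows.fetchall()
```

Calling magnitudes_for_event(build_demo(), 1) returns one row per magnitude, grouped by hypocenter version. A flat file would need a full scan to answer the same question.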

The entire description for the BDSN/NHFN/MPBO, HRSN, and USGS Low Frequency Geophysical networks and data archive has been entered into the hardware tracking, SEED instrument response, and waveform tables. Using programs developed to perform queries of waveform inventory and instrument responses, the NCEDC can now generate full SEED volumes for these networks based on information from the database and the waveforms on the mass storage system.

During 2002-2003, the NCEDC and NCSN jointly developed a system consisting of an extensive spreadsheet containing per-channel information that describes the hardware of each NCSN data channel and provides each channel with a SEED-compliant channel name. This spreadsheet, combined with a limited number of files that describe the central-site analog digitizer, FIR decimation filters, and general characteristics of digital acquisition systems, allows the NCSN to assemble its station history in a format that the NCEDC can use to populate the hardware tracking and instrument response database tables for the NCSN.
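A time-dependent per-channel spreadsheet of this kind can be queried for the hardware in effect at a given moment. The following is a minimal sketch that assumes a CSV export with hypothetical column names (station, seed_channel, start, end, sensor) and invented rows; the real NCSN spreadsheet layout is not described here.

```python
import csv
import io
from datetime import datetime

# Hypothetical per-channel rows; columns and values are invented for illustration.
SHEET = """station,seed_channel,start,end,sensor
ABC,EHZ,1984-01-01,1995-06-01,sensor-A
ABC,EHZ,1995-06-01,,sensor-B
"""


def parse_day(text):
    """Parse YYYY-MM-DD, or return None for an empty (open-ended) cell."""
    return datetime.strptime(text, "%Y-%m-%d") if text else None


def load_epochs(text):
    """Read the sheet into a list of dicts with parsed start and end times."""
    epochs = []
    for row in csv.DictReader(io.StringIO(text)):
        row["start"] = parse_day(row["start"])
        row["end"] = parse_day(row["end"])  # None means the epoch is still open
        epochs.append(row)
    return epochs


def hardware_at(epochs, station, channel, when):
    """Return the row whose half-open interval [start, end) contains `when`."""
    for epoch in epochs:
        if (epoch["station"], epoch["seed_channel"]) != (station, channel):
            continue
        if epoch["start"] <= when and (epoch["end"] is None or when < epoch["end"]):
            return epoch
    return None
```

Treating each epoch as a half-open interval means that a timestamp falling exactly on a hardware change resolves to the newer configuration rather than matching two rows.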

During 2003-2004, the NCEDC and NCSN finalized the CUSP-to-SEED channel mapping for the NCSN waveforms and entered all of the hardware tracking and response information for the sites operated by the NCSN into the NCEDC database. The NCEDC can now generate complete SEED responses for all of those data channels. Additional work remains, however, in conjunction with contributing networks such as CA DWR, UNR, and SCSN, to provide responses for shared stations.

The second part of this project is the conversion of the NCSN waveforms from their native CUSP format into MiniSEED, the standard NCEDC waveform format. Multiple problems needed to be addressed, such as ambiguous or erroneously labeled CUSP data channels, sensors that were recorded on multiple data channels, and ensuring that each distinct data channel is mapped to a distinct SEED channel name. The NCEDC developed programs that use the time-dependent NCSN instrument response spreadsheet and NCSN-supplied channel name transformation rules to determine the SEED channel naming, and to provide feedback to the NCSN on channel naming problems. In 2004, the NCEDC converted all the NCSN waveform data from the period 1984 through 2003 from CUSP format into MiniSEED format. We entered the waveform descriptors into the NCEDC database, and provided association information between the NCSN event ids and the corresponding waveform data. All new incoming CUSP waveforms are now converted to MiniSEED as they are received at the NCEDC.
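The requirement that each distinct data channel map to a distinct SEED channel name can be checked mechanically. The sketch below is an illustration under assumed inputs, not the NCEDC conversion programs: it takes a hypothetical dictionary from native channel identifiers to SEED names, written as (network, station, location, channel) tuples, and reports every SEED name claimed by more than one native channel.

```python
from collections import defaultdict


def find_name_collisions(mapping):
    """Return SEED names that more than one native channel maps to.

    mapping: dict of native channel id -> (network, station, location, channel).
    The native ids are hypothetical placeholders for CUSP channel labels.
    """
    by_seed = defaultdict(list)
    for native_id, seed_name in mapping.items():
        by_seed[seed_name].append(native_id)
    # Keep only the SEED names shared by two or more native channels.
    return {
        seed_name: sorted(native_ids)
        for seed_name, native_ids in by_seed.items()
        if len(native_ids) > 1
    }


# Invented example: two native channels collapse onto the same SEED name.
EXAMPLE = {
    "chan-001": ("NC", "ABC", "", "EHZ"),
    "chan-002": ("NC", "ABC", "", "EHZ"),
    "chan-003": ("NC", "ABC", "", "EHN"),
}
```

For the invented EXAMPLE mapping, find_name_collisions reports the EHZ name together with the two native identifiers that share it, which is the kind of feedback that could be returned to the network operator for correction.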

The NCEDC has developed XML import and export procedures to provide better maintenance of the hardware tracking information and resulting instrument responses for stations in our database. When changes are made to either existing hardware or to station configurations, we export the current view in XML format, use a GUI-based XML editor to easily update the information, and import the changes back into the database. When adding new stations or hardware, we can easily use information from existing hardware or stations as templates for the new information. This allows us to treat the database as the authoritative source of information, and to use off-the-shelf tools such as the XML editor and XML differencing programs as part of our database maintenance procedures.
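The export, edit, and import cycle described above can be sketched with standard-library tools. The element names, attribute names, and the in-memory dictionary standing in for the database are all assumptions made for illustration; the NCEDC XML format itself is not specified here. The sketch uses xml.etree.ElementTree for the round trip and difflib as a stand-in for an XML differencing program, and it assumes Python 3.9 or later for ET.indent.

```python
import difflib
import xml.etree.ElementTree as ET


def export_station(station):
    """Serialize a station record (a plain dict here) to indented XML text."""
    root = ET.Element("station", name=station["name"])
    for channel in station["channels"]:
        ET.SubElement(root, "channel", code=channel["code"], sensor=channel["sensor"])
    ET.indent(root)  # one element per line, so line-based diffs are readable
    return ET.tostring(root, encoding="unicode")


def import_station(xml_text):
    """Parse edited XML text back into the same dict layout."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.get("name"),
        "channels": [
            {"code": c.get("code"), "sensor": c.get("sensor")}
            for c in root.findall("channel")
        ],
    }


def changed_lines(before, after):
    """Return only the added and removed lines between two XML exports."""
    diff = difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="", n=0
    )
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]
```

Comparing the export taken before an edit with the text about to be imported gives a short list of changed lines that can be reviewed before the database is updated.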

We distributed all our programs and procedures for populating the hardware tracking and instrument response tables to the SCEDC in order to help them populate their database.

During 2002-2003, the BSL processed events detected by the HRSN (BP) network. The waveform data and event parameters (picks and hypocenters) are stored in separate HRSN database tables, and will be merged with events from the NCSN when the NCSN catalog is migrated to the database. However, human event processing stopped after the San Simeon earthquake because of the rapid increase in seismicity related to that event.

Additional details on the joint catalog effort and database schema development may be found at http://www.ncedc.org/db

Berkeley Seismological Laboratory
215 McCone Hall, UC Berkeley, Berkeley, CA 94720-4760
© 2005, The Regents of the University of California