Local miniSEED Tutorial

NoisePy is a Python software package for processing ambient seismic noise cross-correlations.

Publication about this software: Chengxin Jiang, Marine A. Denolle; NoisePy: A New High‐Performance Python Tool for Ambient‐Noise Seismology. Seismological Research Letters 2020; 91 (3): 1853–1866. doi: https://doi.org/10.1785/0220190364

This tutorial walks you through the basic steps of using NoisePy to compute ambient noise cross-correlation functions with a single-instance workflow.

# Uncomment and run this line if the environment doesn't have noisepy already installed:
# ! pip install noisepy-seis

Warning: NoisePy uses ObsPy as a core Python module to manipulate seismic data. On Colab, restart the runtime now so that ObsPy is installed properly.

This tutorial should be run after installing the noisepy package.

Import necessary modules

Next, we import the basic modules:

import os
from noisepy.seis.io.mseedstore import MiniSeedDataStore
from noisepy.seis.io.channelcatalog import XMLStationChannelCatalog
from noisepy.seis.io.channel_filter_store import channel_filter
from datetime import datetime, timezone
from datetimerange import DateTimeRange

from noisepy.seis.io.datatypes import ConfigParameters, CCMethod, FreqNorm, RmResp, StackMethod, TimeNorm

Assume that you have some miniSEED files on your local file system. To use MiniSeedDataStore, the files must be organized in a specific way. Here, we use the SCEDC convention to organize and name miniSEED files. See Continuous Waveforms on https://scedc.caltech.edu/data/cloud.html for full details of this naming convention. ⚠️ You may also modify the MiniSeedDataStore class to fit the naming strategy of your own data. See source.
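To make the naming convention more concrete, here is a minimal sketch of how a file name in this style could be split into its parts. It assumes a fixed-width layout (2-character network, 5-character station padded with underscores, 3-character channel, a 3-character location field, then the year and the day of year); the SCEDC page linked above is the authoritative description. The helper `parse_scedc_name` and the file name used below are illustrative and are not part of NoisePy.

```python
from datetime import datetime, timedelta, timezone


def parse_scedc_name(filename):
    """Split a fixed-width SCEDC-style file name into its parts (hypothetical helper)."""
    stem = filename.rsplit(".", 1)[0]        # drop the ".ms" extension
    network = stem[0:2]
    station = stem[2:7].rstrip("_")          # assumed: station padded with "_" to 5 characters
    channel = stem[7:10]
    location = stem[10:13].strip("_")        # empty string when there is no location code
    year = int(stem[13:17])
    day_of_year = int(stem[17:20])
    # Convert year + day of year into a UTC date
    day = datetime(year, 1, 1, tzinfo=timezone.utc) + timedelta(days=day_of_year - 1)
    return network, station, channel, location, day


# Illustrative file name; day 244 of 2019 is 1 September 2019
print(parse_scedc_name("CIGMR__LHN___2019244.ms"))
```

If your own files follow a different pattern, a small parser like this is the piece you would adapt.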

Below we show an example of three days of data organized in the SCEDC convention.

!tree waveforms/
!tree stations/
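The `tree` command may not be installed on every system. As a fallback, the sketch below prints a similar listing using only the Python standard library. The helper name `list_tree` is hypothetical, and the folder names are the ones used in this tutorial.

```python
from pathlib import Path


def list_tree(root, indent=""):
    """Return one line per file or directory under root, indented by depth (hypothetical helper)."""
    lines = []
    for entry in sorted(Path(root).iterdir()):
        # Mark directories with a trailing slash
        lines.append(f"{indent}{entry.name}{'/' if entry.is_dir() else ''}")
        if entry.is_dir():
            lines.extend(list_tree(entry, indent + "    "))
    return lines


for folder in ("waveforms", "stations"):
    if Path(folder).is_dir():  # skip folders that do not exist
        print(folder + "/")
        print("\n".join(list_tree(folder, indent="    ")))
```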
STATION_XML = "./stations/"
DATA = "./waveforms/"

# timeframe for analysis
start = datetime(2019, 9, 1, tzinfo=timezone.utc)
end = datetime(2019, 9, 4, tzinfo=timezone.utc)
timerange = DateTimeRange(start, end)
print(timerange)
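The time range above runs from 1 September to 4 September 2019. If the end date is treated as exclusive, it covers three full days. The sketch below shows one plausible way to split such a range into 24-hour chunks using only the standard library. The helper `split_into_chunks` is hypothetical and is not how NoisePy necessarily does it internally.

```python
from datetime import datetime, timedelta, timezone


def split_into_chunks(first, last_exclusive, inc_hours):
    """Return (chunk_start, chunk_end) pairs covering [first, last_exclusive) (hypothetical helper)."""
    step = timedelta(hours=inc_hours)
    chunks = []
    t = first
    while t < last_exclusive:
        # The last chunk is clipped so it never extends past the end of the range
        chunks.append((t, min(t + step, last_exclusive)))
        t += step
    return chunks


first_day = datetime(2019, 9, 1, tzinfo=timezone.utc)
last_day_exclusive = datetime(2019, 9, 4, tzinfo=timezone.utc)
for chunk_start, chunk_end in split_into_chunks(first_day, last_day_exclusive, inc_hours=24):
    print(chunk_start.date(), "->", chunk_end.date())
```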
# Initialize ambient noise workflow configuration
config = ConfigParameters() # default config parameters which can be customized

config.start_date = start
config.end_date = end
config.acorr_only = False # only perform auto-correlation or not
config.xcorr_only = True # only perform cross-correlation or not

config.inc_hours = 24
config.sampling_rate = 20  # (int) Sampling rate in Hz of desired processing (it can be different than the data sampling rate)
config.cc_len = 3600  # (float) basic unit of data length for fft (sec)
config.step = 1800.0  # (float) overlapping between each cc_len (sec)

config.ncomp = 1  # 1 or 3 component data (needed to decide whether do rotation)

config.stationxml = False  # station.XML file used to remove instrument response for SAC/miniseed data
# If True, the stationXML file is assumed to be provided.
config.rm_resp = RmResp.INV  # select 'no' to not remove response and use 'inv' if you use the stationXML,'spectrum',

config.freqmin, config.freqmax = 0.05, 2.0  # broad band filtering of the data before cross correlation
config.max_over_std = 10  # threshold to remove windows of bad signals: set it to 10**9 if you prefer not to remove them

config.freq_norm = FreqNorm.RMA  # choose between "rma" for a soft whitening or "no" for no whitening. Pure whitening is not implemented correctly at this point.
config.smoothspect_N = 10  # moving window length to smooth spectrum amplitude (points)

config.time_norm = TimeNorm.ONE_BIT # 'no' for no normalization, or 'rma', 'one_bit' for normalization in time domain,
config.smooth_N = 10  # moving window length for time domain normalization if selected (points)

config.cc_method = CCMethod.XCORR # 'xcorr' for pure cross correlation OR 'deconv' for deconvolution;
config.substack = True  # True = smaller stacks within the time chunk. False: it will stack over inc_hours
config.substack_windows = 1  # how long to stack over (for monitoring purpose)
config.maxlag = 200  # lags of cross-correlation to save (sec)
config.networks = ["TX"]
config.stations = ["*"]
config.channels = ["HH?"]
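Two of the settings above are easy to sanity-check by hand. The sketch below counts how many `cc_len`-second windows fit in one `inc_hours` chunk when consecutive windows start `step` seconds apart, and verifies that the filter band lies below the Nyquist frequency of the target sampling rate. The window count assumes that only complete windows are used, which is an assumption about the processing rather than a statement about NoisePy internals. Both helper names are hypothetical.

```python
def count_windows(inc_hours, cc_len, step):
    """Number of complete cc_len-second windows, started every step seconds, in one chunk."""
    chunk_seconds = inc_hours * 3600
    if cc_len > chunk_seconds:
        return 0
    return int((chunk_seconds - cc_len) // step) + 1


def check_band(freqmin, freqmax, sampling_rate):
    """True when 0 < freqmin < freqmax < Nyquist frequency (half the sampling rate)."""
    return 0 < freqmin < freqmax < sampling_rate / 2


# Values taken from the configuration in this tutorial
print(count_windows(inc_hours=24, cc_len=3600, step=1800.0))  # 47
print(check_band(0.05, 2.0, sampling_rate=20))                # True
```

With a one-hour window and a 30-minute step, each day yields 47 overlapping windows under this assumption, and the 0.05 to 2.0 Hz band sits well below the 10 Hz Nyquist frequency.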

catalog = XMLStationChannelCatalog(STATION_XML, path_format='{network}.{name}.xml')
raw_store = MiniSeedDataStore(DATA, catalog,
                              channel_filter(config.networks, config.stations, config.channels), 
                              date_range=timerange)
span = raw_store.get_timespans()
print(span)
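The network, station, and channel lists passed to `channel_filter` use wildcards such as `*` and `HH?`. As an illustration of how this kind of matching behaves, the sketch below implements a comparable filter with the standard-library `fnmatch` module. It is not NoisePy's implementation. `make_filter` is a hypothetical helper, and the station codes are made up.

```python
from fnmatch import fnmatchcase


def matches(value, patterns):
    """True when value matches at least one shell-style pattern (case-sensitive)."""
    return any(fnmatchcase(value, pattern) for pattern in patterns)


def make_filter(networks, stations, channels):
    """Build a predicate that keeps (network, station, channel) triples matching all three lists."""
    def keep(network, station, channel):
        return (
            matches(network, networks)
            and matches(station, stations)
            and matches(channel, channels)
        )
    return keep


keep = make_filter(["TX"], ["*"], ["HH?"])
print(keep("TX", "STA01", "HHZ"))  # True: network, station, and channel all match
print(keep("TX", "STA01", "BHZ"))  # False: the channel does not match "HH?"
print(keep("CI", "STA01", "HHZ"))  # False: the network is not "TX"
```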
channels = raw_store.get_channels(span[0])
print(channels)
raw_store.read_data(span[0], channels[0]).stream