PNWstore Tutorial#

NoisePy is a Python software package for processing ambient seismic noise cross-correlations.

Publication about this software: Chengxin Jiang, Marine A. Denolle; NoisePy: A New High‐Performance Python Tool for Ambient‐Noise Seismology. Seismological Research Letters 2020; 91 (3): 1853–1866. doi: https://doi.org/10.1785/0220190364

This tutorial will walk you through the basic steps of using NoisePy to compute ambient noise cross-correlation functions with the single-instance workflow.

First, we install the noisepy-seis package.

# Uncomment and run this line if the environment doesn't have noisepy already installed:
# ! pip install noisepy-seis 

Warning: NoisePy uses ObsPy as a core Python module to manipulate seismic data. If you are on Colab, restart the runtime now so that ObsPy is installed properly.

This tutorial should be run after installing the noisepy package.

Import necessary modules#

Then we import the basic modules

from noisepy.seis import cross_correlate, stack_cross_correlations   # noisepy core functions
from noisepy.seis.io import plotting_modules
from noisepy.seis.io.asdfstore import ASDFCCStore, ASDFStackStore    # Object to store ASDF data within noisepy
from noisepy.seis.io.channel_filter_store import channel_filter
from noisepy.seis.io.pnwstore import PNWDataStore
from noisepy.seis.io.datatypes import CCMethod, ConfigParameters, Channel, ChannelData, ChannelType, FreqNorm, RmResp, Station, TimeNorm
from noisepy.seis.io.channelcatalog import XMLStationChannelCatalog  # Required stationXML handling object
import os
from datetime import datetime
from datetimerange import DateTimeRange

path = "./pnw_data" 

os.makedirs(path, exist_ok=True)
cc_data_path = os.path.join(path, "CCF")
stack_data_path = os.path.join(path, "STACK")
# storage of stationXML
STATION_XML = "/1-fnp/pnwstore1/p-wd11/PNWStationXML/"           

# "__" is a wildcard for any two characters (the network code); note that using it will slow down the query
DATA = "/1-fnp/pnwstore1/p-wd00/PNW2020/__/"
DB_PATH ="/1-fnp/pnwstore1/p-wd00/PNW2020/timeseries.sqlite"
# timeframe for analysis
start = datetime(2020, 1, 2)
end = datetime(2020, 1, 4)
timerange = DateTimeRange(start, end)
print(timerange)

Ambient Noise Project Configuration#

We store the metadata information about the ambient noise cross correlation workflow in a ConfigParameters() object. We first initialize it, then we tune the parameters for this cross correlation.

# Initialize ambient noise workflow configuration
config = ConfigParameters() # default config parameters which can be customized

Customize the job parameters below:

config.sampling_rate = 20  # (int) Sampling rate in Hz of desired processing (it can be different than the data sampling rate)
config.cc_len = 3600  # (float) basic unit of data length for fft (sec)
# criteria for data selection
config.ncomp = 3  # 1 or 3 component data (needed to decide whether to do rotation)

config.acorr_only = False  # only perform auto-correlation or not
config.xcorr_only = True  # only perform cross-correlation or not

# config.inc_hours = 24 # if the data is first 

# pre-processing parameters
config.step = 1800.0  # (float) overlapping between each cc_len (sec)
config.stationxml = False  # station.XML file used to remove instrument response for SAC/miniseed data
config.rm_resp = RmResp.INV  # use 'inv' to remove the instrument response with the stationXML, or 'no' to skip response removal
config.freqmin = 0.05
config.freqmax = 2.0
config.max_over_std = 10  # threshold to remove windows of bad signals: set it to a very large value (e.g., 10**9) if you prefer not to remove them

# TEMPORAL and SPECTRAL NORMALISATION
config.freq_norm = FreqNorm.RMA  # choose between "rma" for a soft whitening or "no" for no whitening. Pure whitening is not implemented correctly at this point.
config.smoothspect_N = 10  # moving window length to smooth spectrum amplitude (points)
    # here, choose smoothspect_N for the case of a strict whitening (e.g., phase_only)

config.time_norm = TimeNorm.NO  # 'no' for no normalization, or 'rma', 'one_bit' for normalization in time domain,
    # TODO: change time_norm option from "no" to "None"
config.smooth_N = 10  # moving window length for time domain normalization if selected (points)

config.cc_method = CCMethod.XCORR  # 'xcorr' for pure cross correlation OR 'deconv' for deconvolution;
    # FOR "COHERENCY" PLEASE set freq_norm to "rma", time_norm to "no" and cc_method to "xcorr"

# OUTPUTS:
config.substack = True  # True: save smaller substacks within each time chunk; False: stack over inc_hours
config.substack_windows = 1  # length of each substack, in number of correlation windows (useful for monitoring)
    # e.g., substack=True with substack_windows=2 pre-stacks every 2 correlation windows;
    # substack=True with substack_windows=1 keeps ALL of the individual correlations

config.maxlag = 200  # lags of cross-correlation to save (sec)
# For this tutorial make sure the previous run is empty
os.system(f"rm -rf {cc_data_path}")
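With cc_len = 3600 s windows sliding by step = 1800 s, one day of data yields 47 overlapping correlation windows. A small sketch of that arithmetic (illustrative only; NoisePy computes this internally):

```python
cc_len = 3600.0   # window length (sec), matching config.cc_len
step = 1800.0     # slide between window starts (sec), matching config.step
chunk = 86400.0   # one day of data (sec)

# number of windows whose full cc_len fits inside the chunk
n_windows = int((chunk - cc_len) // step) + 1
print(n_windows)  # 47
```

With substack=True and substack_windows=1, each of these windows produces one saved correlation.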

Step 1: Cross-correlation#

config.networks = ["UW", "UO", "PB", "CC"]
config.stations = ["BBO", "BABR", "SHUK", "PANH"]
config.channels = ["BH?", "HH?"]

catalog = XMLStationChannelCatalog(STATION_XML, path_format="{network}/{network}.{name}.xml")
raw_store = PNWDataStore(DATA, DB_PATH, catalog,
                         channel_filter(config.networks, config.stations, config.channels), 
                         date_range=timerange) # Store for reading raw data
cc_store = ASDFCCStore(cc_data_path) # Store for writing CC data
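The channel patterns above follow SEED wildcard conventions, where `?` matches a single character. Python's standard fnmatch module illustrates what "BH?" and "HH?" select (an illustration; channel_filter's exact matching rules may differ in detail):

```python
from fnmatch import fnmatch

patterns = ["BH?", "HH?"]
channels = ["BHZ", "BHN", "HHE", "EHZ", "BH1"]
# keep any channel code matching at least one pattern
selected = [c for c in channels if any(fnmatch(c, p) for p in patterns)]
print(selected)  # ['BHZ', 'BHN', 'HHE', 'BH1']
```

So broadband (BH) and high-sample-rate (HH) channels are kept for all three components, while other band codes such as EH are filtered out.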

Perform the cross-correlation. Here, removing the instrumental response is the slow step; the interpolation can also take time.

cross_correlate(raw_store, config, cc_store)

Plot a single set of the cross-correlation functions.

timespans = cc_store.get_timespans()
plotting_modules.plot_substack_cc(cc_store, timespans[0], 0.1, 1, None, False)

Step 2: Stack the cross correlation#

Provide a path to where the data is.

# open a new cc store in read-only mode since we will be doing parallel access for stacking
cc_store = ASDFCCStore(cc_data_path, mode="r")
stack_store = ASDFStackStore(stack_data_path)
stack_cross_correlations(cc_store, stack_store, config)

Plot the stacks

print(os.listdir(cc_data_path))
print(os.listdir(stack_data_path))
pairs = stack_store.get_station_pairs()
print(f"Found {len(pairs)} station pairs")
sta_stacks = stack_store.read_bulk(timerange, pairs) # no timestamp used in ASDFStackStore
plotting_modules.plot_all_moveout(sta_stacks, 'Allstack_linear', 0.1, 0.2, 'ZZ', 1)