Flattening with a Savitzky-Golay filter

gully
October 15, 2021

Often we want a mere quicklook of a flattened spectrum, and can’t be bothered to mask the spectrum, or conduct other laborious pre-processing steps. One way to achieve that goal is with the heuristic Savitzky-Golay filter (often called “Savgol” for short), which acts as a high-pass filter– removing any smooth trends to reveal just the sharp spectral lines.

Here we demonstrate the how to apply.flatten() method, mirrored and abridged from the lightkurve framework. Behind-the-scenes the lightkurve-based .flatten() method applies the Savgol filter from scipy. This method will usually require some tuning by human pattern recognition, since the underlying smoothness of the data depends on what spectra you have.

[1]:
from muler.hpf import HPFSpectrum
%config InlineBackend.figure_format='retina'
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[1], line 1
----> 1 from muler.hpf import HPFSpectrum
      2 get_ipython().run_line_magic('config', "InlineBackend.figure_format='retina'")

File ~/checkouts/readthedocs.org/user_builds/muler/envs/latest/lib/python3.8/site-packages/muler/hpf.py:14
     12 import warnings
     13 import logging
---> 14 from muler.echelle import EchelleSpectrum, EchelleSpectrumList
     15 from muler.utilities import resample_list
     16 import numpy as np

File ~/checkouts/readthedocs.org/user_builds/muler/envs/latest/lib/python3.8/site-packages/muler/echelle.py:31
     29 from scipy.ndimage import median_filter, gaussian_filter1d
     30 import specutils
---> 31 from muler.utilities import apply_numpy_mask, is_list, resample_list
     34 # from barycorrpy import get_BC_vel
     35 from astropy.coordinates import SkyCoord, EarthLocation

File ~/checkouts/readthedocs.org/user_builds/muler/envs/latest/lib/python3.8/site-packages/muler/utilities.py:15
     13 from astropy.convolution import convolve, Gaussian1DKernel
     14 from scipy.ndimage import binary_dilation
---> 15 from astroquery.simbad import Simbad
     16 Simbad.add_votable_fields('V', 'B', 'J', 'H', 'K', 'parallax')
     17 LinInterpResampler = LinearInterpolatedResampler()

ModuleNotFoundError: No module named 'astroquery'
[2]:
path = 'https://github.com/OttoStruve/muler_example_data/raw/main/HPF/01_A0V_standards/'
filename = 'Goldilocks_20210801T083618_v1.0_0036.spectra.fits'
raw_spectrum = HPFSpectrum(file = path + filename, order=6)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[2], line 3
      1 path = 'https://github.com/OttoStruve/muler_example_data/raw/main/HPF/01_A0V_standards/'
      2 filename = 'Goldilocks_20210801T083618_v1.0_0036.spectra.fits'
----> 3 raw_spectrum = HPFSpectrum(file = path + filename, order=6)

NameError: name 'HPFSpectrum' is not defined
[3]:
spectrum = raw_spectrum.sky_subtract(method='vector').remove_nans().deblaze().normalize()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[3], line 1
----> 1 spectrum = raw_spectrum.sky_subtract(method='vector').remove_nans().deblaze().normalize()

NameError: name 'raw_spectrum' is not defined
[4]:
flattened_spectrum = spectrum.flatten()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 flattened_spectrum = spectrum.flatten()

NameError: name 'spectrum' is not defined
[5]:
ax = spectrum.plot(label='Original');
flattened_spectrum.plot(ax=ax, label='Flattened');
ax.legend();
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[5], line 1
----> 1 ax = spectrum.plot(label='Original');
      2 flattened_spectrum.plot(ax=ax, label='Flattened');
      3 ax.legend();

NameError: name 'spectrum' is not defined

And voilà! We have a coarsely flattened spectrum without all the effort of manually masking lines. You can see an artifact at \(\lambda = 8860\;Å\). You can tune the flatten method with some of its keyword argments:

[6]:
spectrum.flatten(window_length=61).plot();
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[6], line 1
----> 1 spectrum.flatten(window_length=61).plot();

NameError: name 'spectrum' is not defined

That’s better! :) You can see some ringing artifacts near the window size of 61 pixels. It’s useful to inspect the trend by using the return_trend=True kwarg, which will return two spectra: the flattened spectrum and the fitted trend spectrum.

[7]:
flattened_spectrum, trend = spectrum.flatten(window_length=81, niters=13, sigma=3.5,
                                             return_trend=True)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[7], line 1
----> 1 flattened_spectrum, trend = spectrum.flatten(window_length=81, niters=13, sigma=3.5,
      2                                              return_trend=True)

NameError: name 'spectrum' is not defined
[8]:
ax = spectrum.plot(label='Original Spectrum')
trend.plot(ax=ax, label='Continuum trend')
ax.legend();
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[8], line 1
----> 1 ax = spectrum.plot(label='Original Spectrum')
      2 trend.plot(ax=ax, label='Continuum trend')
      3 ax.legend();

NameError: name 'spectrum' is not defined

You can see some signal self-substraction in the line at \(\lambda = 8860\;Å\), and the continuum fit drives through weak lines near 8785 Å. There’s also a way to hand-in a mask= kwarg to preserve features you want to keep like telluric lines in this case.