Masking and flattening HPF data with TelFit and Gaussian Processes

gully
October 14, 2021

In a previous tutorial we masked spectral lines and coarsely flattened the spectrum. That approach was only able to identify the strongest telluric lines and therefore caused some residual weak telluric lines to distort the inferred continuum shape. Also the third order polynomial was not flexible enough to account for high-frequency ringing artifacts leftover from the A0V’s deep Hydrogen lines.

In this tutorial we use a pristine telluric template and advanced Gaussian Process smoothing techniques to get an even more refined continuum fit.

[1]:
from muler.hpf import HPFSpectrum
from gollum.phoenix import PHOENIXSpectrum
import astropy.units as u
import numpy as np
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[1], line 1
----> 1 from muler.hpf import HPFSpectrum
      2 from gollum.phoenix import PHOENIXSpectrum
      3 import astropy.units as u

File ~/checkouts/readthedocs.org/user_builds/muler/envs/latest/lib/python3.8/site-packages/muler/hpf.py:14
     12 import warnings
     13 import logging
---> 14 from muler.echelle import EchelleSpectrum, EchelleSpectrumList
     15 from muler.utilities import resample_list
     16 import numpy as np

File ~/checkouts/readthedocs.org/user_builds/muler/envs/latest/lib/python3.8/site-packages/muler/echelle.py:31
     29 from scipy.ndimage import median_filter, gaussian_filter1d
     30 import specutils
---> 31 from muler.utilities import apply_numpy_mask, is_list, resample_list
     34 # from barycorrpy import get_BC_vel
     35 from astropy.coordinates import SkyCoord, EarthLocation

File ~/checkouts/readthedocs.org/user_builds/muler/envs/latest/lib/python3.8/site-packages/muler/utilities.py:15
     13 from astropy.convolution import convolve, Gaussian1DKernel
     14 from scipy.ndimage import binary_dilation
---> 15 from astroquery.simbad import Simbad
     16 Simbad.add_votable_fields('V', 'B', 'J', 'H', 'K', 'parallax')
     17 LinInterpResampler = LinearInterpolatedResampler()

ModuleNotFoundError: No module named 'astroquery'
[2]:
import matplotlib.pyplot as plt
%config InlineBackend.figure_format='retina'

We’ll conduct the same steps as we did in our previous tutorial on masking.

[3]:
path = 'https://github.com/OttoStruve/muler_example_data/raw/main/HPF/01_A0V_standards/'
filename = 'Goldilocks_20210801T083618_v1.0_0036.spectra.fits'
raw_spectrum = HPFSpectrum(file = path+filename, order=6)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[3], line 3
      1 path = 'https://github.com/OttoStruve/muler_example_data/raw/main/HPF/01_A0V_standards/'
      2 filename = 'Goldilocks_20210801T083618_v1.0_0036.spectra.fits'
----> 3 raw_spectrum = HPFSpectrum(file = path+filename, order=6)

NameError: name 'HPFSpectrum' is not defined
[4]:
spectrum = raw_spectrum.sky_subtract(method='vector').remove_nans().deblaze().normalize()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 spectrum = raw_spectrum.sky_subtract(method='vector').remove_nans().deblaze().normalize()

NameError: name 'raw_spectrum' is not defined
[5]:
native_template = PHOENIXSpectrum(teff=9600, logg=5.0, download=True).normalize()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[5], line 1
----> 1 native_template = PHOENIXSpectrum(teff=9600, logg=5.0, download=True).normalize()

NameError: name 'PHOENIXSpectrum' is not defined
[6]:
doctored_template = native_template\
                        .rotationally_broaden(105.0, u1=0.3)\
                        .rv_shift(-40.0)\
                        .instrumental_broaden(resolving_power=55_000)\
                        .resample(spectrum)\
                        .normalize()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[6], line 1
----> 1 doctored_template = native_template\
      2                         .rotationally_broaden(105.0, u1=0.3)\
      3                         .rv_shift(-40.0)\
      4                         .instrumental_broaden(resolving_power=55_000)\
      5                         .resample(spectrum)\
      6                         .normalize()

NameError: name 'native_template' is not defined
[7]:
ax = spectrum.plot(ylo=0, label='A0V Star');
doctored_template.plot(ax=ax, label='Doctored A0V Template'); ax.legend();
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[7], line 1
----> 1 ax = spectrum.plot(ylo=0, label='A0V Star');
      2 doctored_template.plot(ax=ax, label='Doctored A0V Template'); ax.legend();

NameError: name 'spectrum' is not defined
[8]:
telluric_response = spectrum.divide(doctored_template, handle_meta='ff')
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[8], line 1
----> 1 telluric_response = spectrum.divide(doctored_template, handle_meta='ff')

NameError: name 'spectrum' is not defined

Normalize by the top 5\(^{th}\) percentile, rather than the median.

[9]:
spectrum_peak = np.percentile(telluric_response.flux, 95)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[9], line 1
----> 1 spectrum_peak = np.percentile(telluric_response.flux, 95)

NameError: name 'np' is not defined
[10]:
telluric_response = telluric_response.divide(spectrum_peak, handle_meta='ff')
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[10], line 1
----> 1 telluric_response = telluric_response.divide(spectrum_peak, handle_meta='ff')

NameError: name 'telluric_response' is not defined

Masking the lines with a template

muler now packages a static telluric template with its source code. The template is not tunable and is inadequate for directly dividing out the telluric absorption. Instead, we use this template for a coarse estimate of which spectral regions are likely to have the most severe telluric absorption.

[11]:
telfit_template = telluric_response.get_static_TelFit_template()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[11], line 1
----> 1 telfit_template = telluric_response.get_static_TelFit_template()

NameError: name 'telluric_response' is not defined
[12]:
ax = telfit_template.plot(ylo=0.98, yhi=1.005)
ax.set_xlim(telluric_response.wavelength.value.min(),
            telluric_response.wavelength.value.max());
ax.axhline(0.999, linestyle='dashed', color='salmon', label='$99.9\%$ Threshold');
plt.title('TelFit template')
ax.legend();
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[12], line 1
----> 1 ax = telfit_template.plot(ylo=0.98, yhi=1.005)
      2 ax.set_xlim(telluric_response.wavelength.value.min(),
      3             telluric_response.wavelength.value.max());
      4 ax.axhline(0.999, linestyle='dashed', color='salmon', label='$99.9\%$ Threshold');

NameError: name 'telfit_template' is not defined

In the next cell we will mask the pixels in our telluric response spectrum that TelFit would expect to have greater than 0.1% telluric absorption. You can tune the threshold to a higher or lower number depending on your needs. The dilation=N parameter also masks \(N\) adjacent pixels and accounts for the possibility of wavelength miscalibration in the data or model.

[13]:
continuum = telluric_response.mask_tellurics(threshold=0.999, dilation=7)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[13], line 1
----> 1 continuum = telluric_response.mask_tellurics(threshold=0.999, dilation=7)

NameError: name 'telluric_response' is not defined
[14]:
smoothed_continuum, gp_model = continuum.smooth_spectrum(return_model=True)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[14], line 1
----> 1 smoothed_continuum, gp_model = continuum.smooth_spectrum(return_model=True)

NameError: name 'continuum' is not defined
[15]:
inferred_continuum_flux = gp_model(telluric_response.wavelength.value)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[15], line 1
----> 1 inferred_continuum_flux = gp_model(telluric_response.wavelength.value)

NameError: name 'gp_model' is not defined
[16]:
inferred_continuum = telluric_response._copy(flux=inferred_continuum_flux
                                            * u.dimensionless_unscaled,
                                             uncertainty=None)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[16], line 1
----> 1 inferred_continuum = telluric_response._copy(flux=inferred_continuum_flux
      2                                             * u.dimensionless_unscaled,
      3                                              uncertainty=None)

NameError: name 'telluric_response' is not defined
[17]:
ax = telluric_response.plot(ylo=0.90, yhi=1.02, alpha=0.3, label='Masked telluric lines');
ax.axhline(1.0, color='k', linestyle='dotted');
continuum.plot(ax=ax, label='Continuum pixels')
inferred_continuum.plot(ax=ax, color='k', label='Inferred continuum')
ax.legend();
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[17], line 1
----> 1 ax = telluric_response.plot(ylo=0.90, yhi=1.02, alpha=0.3, label='Masked telluric lines');
      2 ax.axhline(1.0, color='k', linestyle='dotted');
      3 continuum.plot(ax=ax, label='Continuum pixels')

NameError: name 'telluric_response' is not defined

Neat! That continuum fit looks plausible. You can see that we missed some telluric lines, such as the one at \(\lambda \sim 8825\). But overall it does an OK job. Let’s flatten the spectrum by this inferred continuum.

[18]:
flattened_telluric_response = telluric_response.divide(inferred_continuum, handle_meta='ff')
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[18], line 1
----> 1 flattened_telluric_response = telluric_response.divide(inferred_continuum, handle_meta='ff')

NameError: name 'telluric_response' is not defined
[19]:
ax = flattened_telluric_response.plot(ylo=0)
ax.axhline(1.0, linestyle='dotted', color='k');
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[19], line 1
----> 1 ax = flattened_telluric_response.plot(ylo=0)
      2 ax.axhline(1.0, linestyle='dotted', color='k');

NameError: name 'flattened_telluric_response' is not defined

Nice! Much flatter than before! The Hydrogen ringing artifacts are gone. We can now measure precision equivalent widths by summing pixels without having to worry about where to place the continuum.

[20]:
ax = flattened_telluric_response.plot(ylo=0.90, yhi=1.02)
ax.axhline(1.0, linestyle='dotted', color='k');
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[20], line 1
----> 1 ax = flattened_telluric_response.plot(ylo=0.90, yhi=1.02)
      2 ax.axhline(1.0, linestyle='dotted', color='k');

NameError: name 'flattened_telluric_response' is not defined

Limitations

The GP smoothing process is inherently an inference procedure, and may cause some undetermined amount of self-subtraction or artifical boosting. An even-more-precise approach could be to fit directly for the lines with their real line shapes, based on measured or inferred atmospheric conditions at the time of the measurement. TelFit and other frameworks offer such methods for telluric lines, and spectral synthesis for stars is also possible.