Masking and flattening HPF data with TelFit and Gaussian Processes¶
In a previous tutorial we masked spectral lines and coarsely flattened the spectrum. That approach was only able to identify the strongest telluric lines and therefore caused some residual weak telluric lines to distort the inferred continuum shape. Also the third order polynomial was not flexible enough to account for high-frequency ringing artifacts leftover from the A0V’s deep Hydrogen lines.
In this tutorial we use a pristine telluric template and advanced Gaussian Process smoothing techniques to get an even more refined continuum fit.
[1]:
from muler.hpf import HPFSpectrum
from gollum.phoenix import PHOENIXSpectrum
import astropy.units as u
import numpy as np
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In[1], line 1
----> 1 from muler.hpf import HPFSpectrum
2 from gollum.phoenix import PHOENIXSpectrum
3 import astropy.units as u
File ~/checkouts/readthedocs.org/user_builds/muler/envs/latest/lib/python3.8/site-packages/muler/hpf.py:14
12 import warnings
13 import logging
---> 14 from muler.echelle import EchelleSpectrum, EchelleSpectrumList
15 from muler.utilities import resample_list
16 import numpy as np
File ~/checkouts/readthedocs.org/user_builds/muler/envs/latest/lib/python3.8/site-packages/muler/echelle.py:31
29 from scipy.ndimage import median_filter, gaussian_filter1d
30 import specutils
---> 31 from muler.utilities import apply_numpy_mask, is_list, resample_list
34 # from barycorrpy import get_BC_vel
35 from astropy.coordinates import SkyCoord, EarthLocation
File ~/checkouts/readthedocs.org/user_builds/muler/envs/latest/lib/python3.8/site-packages/muler/utilities.py:15
13 from astropy.convolution import convolve, Gaussian1DKernel
14 from scipy.ndimage import binary_dilation
---> 15 from astroquery.simbad import Simbad
16 Simbad.add_votable_fields('V', 'B', 'J', 'H', 'K', 'parallax')
17 LinInterpResampler = LinearInterpolatedResampler()
ModuleNotFoundError: No module named 'astroquery'
[2]:
import matplotlib.pyplot as plt
%config InlineBackend.figure_format='retina'
We’ll conduct the same steps as we did in our previous tutorial on masking.
[3]:
path = 'https://github.com/OttoStruve/muler_example_data/raw/main/HPF/01_A0V_standards/'
filename = 'Goldilocks_20210801T083618_v1.0_0036.spectra.fits'
raw_spectrum = HPFSpectrum(file = path+filename, order=6)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[3], line 3
1 path = 'https://github.com/OttoStruve/muler_example_data/raw/main/HPF/01_A0V_standards/'
2 filename = 'Goldilocks_20210801T083618_v1.0_0036.spectra.fits'
----> 3 raw_spectrum = HPFSpectrum(file = path+filename, order=6)
NameError: name 'HPFSpectrum' is not defined
[4]:
spectrum = raw_spectrum.sky_subtract(method='vector').remove_nans().deblaze().normalize()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[4], line 1
----> 1 spectrum = raw_spectrum.sky_subtract(method='vector').remove_nans().deblaze().normalize()
NameError: name 'raw_spectrum' is not defined
[5]:
native_template = PHOENIXSpectrum(teff=9600, logg=5.0, download=True).normalize()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[5], line 1
----> 1 native_template = PHOENIXSpectrum(teff=9600, logg=5.0, download=True).normalize()
NameError: name 'PHOENIXSpectrum' is not defined
[6]:
doctored_template = native_template\
.rotationally_broaden(105.0, u1=0.3)\
.rv_shift(-40.0)\
.instrumental_broaden(resolving_power=55_000)\
.resample(spectrum)\
.normalize()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[6], line 1
----> 1 doctored_template = native_template\
2 .rotationally_broaden(105.0, u1=0.3)\
3 .rv_shift(-40.0)\
4 .instrumental_broaden(resolving_power=55_000)\
5 .resample(spectrum)\
6 .normalize()
NameError: name 'native_template' is not defined
[7]:
ax = spectrum.plot(ylo=0, label='A0V Star');
doctored_template.plot(ax=ax, label='Doctored A0V Template'); ax.legend();
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[7], line 1
----> 1 ax = spectrum.plot(ylo=0, label='A0V Star');
2 doctored_template.plot(ax=ax, label='Doctored A0V Template'); ax.legend();
NameError: name 'spectrum' is not defined
[8]:
telluric_response = spectrum.divide(doctored_template, handle_meta='ff')
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[8], line 1
----> 1 telluric_response = spectrum.divide(doctored_template, handle_meta='ff')
NameError: name 'spectrum' is not defined
Normalize by the top 5\(^{th}\) percentile, rather than the median.
[9]:
spectrum_peak = np.percentile(telluric_response.flux, 95)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[9], line 1
----> 1 spectrum_peak = np.percentile(telluric_response.flux, 95)
NameError: name 'np' is not defined
[10]:
telluric_response = telluric_response.divide(spectrum_peak, handle_meta='ff')
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[10], line 1
----> 1 telluric_response = telluric_response.divide(spectrum_peak, handle_meta='ff')
NameError: name 'telluric_response' is not defined
Masking the lines with a template¶
muler now packages a static telluric template with its source code. The template is not tunable and is inadequate for directly dividing out the telluric absorption. Instead, we use this template for a coarse estimate of which spectral regions are likely to have the most severe telluric absorption.
[11]:
telfit_template = telluric_response.get_static_TelFit_template()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[11], line 1
----> 1 telfit_template = telluric_response.get_static_TelFit_template()
NameError: name 'telluric_response' is not defined
[12]:
ax = telfit_template.plot(ylo=0.98, yhi=1.005)
ax.set_xlim(telluric_response.wavelength.value.min(),
telluric_response.wavelength.value.max());
ax.axhline(0.999, linestyle='dashed', color='salmon', label='$99.9\%$ Threshold');
plt.title('TelFit template')
ax.legend();
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[12], line 1
----> 1 ax = telfit_template.plot(ylo=0.98, yhi=1.005)
2 ax.set_xlim(telluric_response.wavelength.value.min(),
3 telluric_response.wavelength.value.max());
4 ax.axhline(0.999, linestyle='dashed', color='salmon', label='$99.9\%$ Threshold');
NameError: name 'telfit_template' is not defined
In the next cell we will mask the pixels in our telluric response spectrum that TelFit would expect to have greater than 0.1% telluric absorption. You can tune the threshold to a higher or lower number depending on your needs. The dilation=N parameter also masks \(N\) adjacent pixels and accounts for the possibility of wavelength miscalibration in the data or model.
[13]:
continuum = telluric_response.mask_tellurics(threshold=0.999, dilation=7)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[13], line 1
----> 1 continuum = telluric_response.mask_tellurics(threshold=0.999, dilation=7)
NameError: name 'telluric_response' is not defined
[14]:
smoothed_continuum, gp_model = continuum.smooth_spectrum(return_model=True)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[14], line 1
----> 1 smoothed_continuum, gp_model = continuum.smooth_spectrum(return_model=True)
NameError: name 'continuum' is not defined
[15]:
inferred_continuum_flux = gp_model(telluric_response.wavelength.value)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[15], line 1
----> 1 inferred_continuum_flux = gp_model(telluric_response.wavelength.value)
NameError: name 'gp_model' is not defined
[16]:
inferred_continuum = telluric_response._copy(flux=inferred_continuum_flux
* u.dimensionless_unscaled,
uncertainty=None)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[16], line 1
----> 1 inferred_continuum = telluric_response._copy(flux=inferred_continuum_flux
2 * u.dimensionless_unscaled,
3 uncertainty=None)
NameError: name 'telluric_response' is not defined
[17]:
ax = telluric_response.plot(ylo=0.90, yhi=1.02, alpha=0.3, label='Masked telluric lines');
ax.axhline(1.0, color='k', linestyle='dotted');
continuum.plot(ax=ax, label='Continuum pixels')
inferred_continuum.plot(ax=ax, color='k', label='Inferred continuum')
ax.legend();
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[17], line 1
----> 1 ax = telluric_response.plot(ylo=0.90, yhi=1.02, alpha=0.3, label='Masked telluric lines');
2 ax.axhline(1.0, color='k', linestyle='dotted');
3 continuum.plot(ax=ax, label='Continuum pixels')
NameError: name 'telluric_response' is not defined
Neat! That continuum fit looks plausible. You can see that we missed some telluric lines, such as the one at \(\lambda \sim 8825\). But overall it does an OK job. Let’s flatten the spectrum by this inferred continuum.
[18]:
flattened_telluric_response = telluric_response.divide(inferred_continuum, handle_meta='ff')
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[18], line 1
----> 1 flattened_telluric_response = telluric_response.divide(inferred_continuum, handle_meta='ff')
NameError: name 'telluric_response' is not defined
[19]:
ax = flattened_telluric_response.plot(ylo=0)
ax.axhline(1.0, linestyle='dotted', color='k');
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[19], line 1
----> 1 ax = flattened_telluric_response.plot(ylo=0)
2 ax.axhline(1.0, linestyle='dotted', color='k');
NameError: name 'flattened_telluric_response' is not defined
Nice! Much flatter than before! The Hydrogen ringing artifacts are gone. We can now measure precision equivalent widths by summing pixels without having to worry about where to place the continuum.
[20]:
ax = flattened_telluric_response.plot(ylo=0.90, yhi=1.02)
ax.axhline(1.0, linestyle='dotted', color='k');
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[20], line 1
----> 1 ax = flattened_telluric_response.plot(ylo=0.90, yhi=1.02)
2 ax.axhline(1.0, linestyle='dotted', color='k');
NameError: name 'flattened_telluric_response' is not defined
Limitations¶
The GP smoothing process is inherently an inference procedure, and may cause some undetermined amount of self-subtraction or artifical boosting. An even-more-precise approach could be to fit directly for the lines with their real line shapes, based on measured or inferred atmospheric conditions at the time of the measurement. TelFit and other frameworks offer such methods for telluric lines, and spectral synthesis for stars is also possible.