Combining misaligned spectra, with error propagation

gully, Aishwarya, and Jessica November 12-30, 2021

In a previous tutorial we saw how easy it is to combine spectra that are aligned pixel-by-pixel: you simply + them together Here we show a new way to combine “misaligned spectra”—spectra that have some but not entire overlap in the extents of their wavelength axes.

[1]:
from muler.hpf import HPFSpectrum
import astropy.units as u
import numpy as np

import matplotlib.pyplot as plt
%config InlineBackend.figure_format='retina'
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[1], line 1
----> 1 from muler.hpf import HPFSpectrum
      2 import astropy.units as u
      3 import numpy as np

File ~/checkouts/readthedocs.org/user_builds/muler/envs/latest/lib/python3.8/site-packages/muler/hpf.py:14
     12 import warnings
     13 import logging
---> 14 from muler.echelle import EchelleSpectrum, EchelleSpectrumList
     15 from muler.utilities import resample_list
     16 import numpy as np

File ~/checkouts/readthedocs.org/user_builds/muler/envs/latest/lib/python3.8/site-packages/muler/echelle.py:31
     29 from scipy.ndimage import median_filter, gaussian_filter1d
     30 import specutils
---> 31 from muler.utilities import apply_numpy_mask, is_list, resample_list
     34 # from barycorrpy import get_BC_vel
     35 from astropy.coordinates import SkyCoord, EarthLocation

File ~/checkouts/readthedocs.org/user_builds/muler/envs/latest/lib/python3.8/site-packages/muler/utilities.py:15
     13 from astropy.convolution import convolve, Gaussian1DKernel
     14 from scipy.ndimage import binary_dilation
---> 15 from astroquery.simbad import Simbad
     16 Simbad.add_votable_fields('V', 'B', 'J', 'H', 'K', 'parallax')
     17 LinInterpResampler = LinearInterpolatedResampler()

ModuleNotFoundError: No module named 'astroquery'

We will read in three files of similar A0V spectral type stars, combined here just for demonstration purposes.

[2]:
path = 'https://github.com/OttoStruve/muler_example_data/raw/main/HPF/01_A0V_standards/'
filenames = ['Goldilocks_20210212T072837_v1.0_0037.spectra.fits',
             'Goldilocks_20210517T054403_v1.0_0060.spectra.fits',
             'Goldilocks_20210801T083618_v1.0_0036.spectra.fits']
fns = [path + filename for filename in filenames]
[3]:
n_spectra = len(fns)
n_spectra
[3]:
3

We populate a list of these spectra so that we can handle them all in a moment…

[4]:
multi_spectra = []
for fn in fns:
    raw_spectrum = HPFSpectrum(file = fn, order=6)
    multi_spectra.append(raw_spectrum)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[4], line 3
      1 multi_spectra = []
      2 for fn in fns:
----> 3     raw_spectrum = HPFSpectrum(file = fn, order=6)
      4     multi_spectra.append(raw_spectrum)

NameError: name 'HPFSpectrum' is not defined

Now we deblaze, trim the edges, normalize, mask the tellurics, and barycentric correct each spectrum:

[5]:
fig, ax = plt.subplots(1, figsize=(10, 5))
clean_multi_spec = []
for i in range(n_spectra):
    clean_spectrum = multi_spectra[i].deblaze()\
        .trim_edges((4, 2042))\
        .normalize()\
        .mask_tellurics()\
        .barycentric_correct()
    clean_spectrum.plot(ax, color='k',alpha=0.1) #plot singular spectra
    clean_multi_spec.append(clean_spectrum) #save all clean spectra in list
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[5], line 1
----> 1 fig, ax = plt.subplots(1, figsize=(10, 5))
      2 clean_multi_spec = []
      3 for i in range(n_spectra):

NameError: name 'plt' is not defined

Again, these three spectra come from different objects, so they’re not the best example, but they are misaligned, which is the central theme we are trying to illustrate.

Let’s use a new muler combine method on these spectra just to see how it works.

[6]:
from muler.utilities import combine_spectra_misaligned
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[6], line 1
----> 1 from muler.utilities import combine_spectra_misaligned

File ~/checkouts/readthedocs.org/user_builds/muler/envs/latest/lib/python3.8/site-packages/muler/utilities.py:15
     13 from astropy.convolution import convolve, Gaussian1DKernel
     14 from scipy.ndimage import binary_dilation
---> 15 from astroquery.simbad import Simbad
     16 Simbad.add_votable_fields('V', 'B', 'J', 'H', 'K', 'parallax')
     17 LinInterpResampler = LinearInterpolatedResampler()

ModuleNotFoundError: No module named 'astroquery'
[7]:
combined_spec = combine_spectra_misaligned(clean_multi_spec)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[7], line 1
----> 1 combined_spec = combine_spectra_misaligned(clean_multi_spec)

NameError: name 'combine_spectra_misaligned' is not defined
[8]:
clean_spectrum.shape, combined_spec.shape
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[8], line 1
----> 1 clean_spectrum.shape, combined_spec.shape

NameError: name 'clean_spectrum' is not defined

The combined spectrum may have a different shape than the typical input spectrum. We recommend explicitly providing where you’d like the combined spectrum to be computed. This step is optional, but it can be convenient if you have a certain wavelength grid you want to preserve.

[9]:
pixel_midpoints = multi_spectra[0].wavelength.value
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[9], line 1
----> 1 pixel_midpoints = multi_spectra[0].wavelength.value

IndexError: list index out of range

The uncertainty is determined by computing the sample standard deviation of the mean.

[10]:
combined_spec_sampled_unc = combine_spectra_misaligned(clean_multi_spec,
                                           pixel_midpoints=pixel_midpoints)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[10], line 1
----> 1 combined_spec_sampled_unc = combine_spectra_misaligned(clean_multi_spec,
      2                                            pixel_midpoints=pixel_midpoints)

NameError: name 'combine_spectra_misaligned' is not defined

The combined spectrum will have the same shape as the pixel midpoints you pass in.

[11]:
pixel_midpoints.shape, combined_spec_sampled_unc.shape
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[11], line 1
----> 1 pixel_midpoints.shape, combined_spec_sampled_unc.shape

NameError: name 'pixel_midpoints' is not defined

You can optionally request to propagate the uncertainties, if your spectra have them:

[12]:
combined_spec_propagated_unc = combine_spectra_misaligned(clean_multi_spec,
                                           pixel_midpoints=pixel_midpoints,
                                           propagate_uncertainty=True)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[12], line 1
----> 1 combined_spec_propagated_unc = combine_spectra_misaligned(clean_multi_spec,
      2                                            pixel_midpoints=pixel_midpoints,
      3                                            propagate_uncertainty=True)

NameError: name 'combine_spectra_misaligned' is not defined

You can also ask for propagate_uncertainty='max' which returns whatever uncertainty is higher.

[13]:
plt.plot(clean_spectrum.wavelength, clean_spectrum.snr, label='Single Spectrum')
plt.plot(combined_spec_sampled_unc.wavelength, combined_spec_sampled_unc.snr, label='Combined, Sampled $\sigma_i$')
plt.plot(combined_spec_propagated_unc.wavelength, combined_spec_propagated_unc.snr, label='Combined, Propagated $\sigma_i$')
plt.ylim(0, 1000)
plt.ylabel('$S/N$')
plt.xlabel('$\lambda \;(\AA)$')
plt.title('Signal-to-noise Ratio')
plt.legend();
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[13], line 1
----> 1 plt.plot(clean_spectrum.wavelength, clean_spectrum.snr, label='Single Spectrum')
      2 plt.plot(combined_spec_sampled_unc.wavelength, combined_spec_sampled_unc.snr, label='Combined, Sampled $\sigma_i$')
      3 plt.plot(combined_spec_propagated_unc.wavelength, combined_spec_propagated_unc.snr, label='Combined, Propagated $\sigma_i$')

NameError: name 'plt' is not defined

All of the usual metadata keywords are preserved:

[14]:
combined_spec.instrumental_resolution
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[14], line 1
----> 1 combined_spec.instrumental_resolution

NameError: name 'combined_spec' is not defined
[15]:
combined_spec.provenance
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[15], line 1
----> 1 combined_spec.provenance

NameError: name 'combined_spec' is not defined
[16]:
combined_spec.spectrographname
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[16], line 1
----> 1 combined_spec.spectrographname

NameError: name 'combined_spec' is not defined
[17]:
combined_spec.site_name
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[17], line 1
----> 1 combined_spec.site_name

NameError: name 'combined_spec' is not defined

But watch out since some metadata such as date or target name may be ill-defined for a combined spectrum, or simply not implemented yet.

[18]:
combined_spec.astropy_time.isot, clean_spectrum.astropy_time.isot
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[18], line 1
----> 1 combined_spec.astropy_time.isot, clean_spectrum.astropy_time.isot

NameError: name 'combined_spec' is not defined

For example, data acquired on different times and different nights will need a user-defined way to assign a notion of “time”. We do not attempt to assign these times. We also do not yet support accessing the .sky reference fiber for HPF. Please file a GitHub Issue if you are interested in support for these features.

This method is probably not suitable for precision radial velocity applications, and in general many applications may wish to preserve the native sampling of their spectra during inference procedures.