Download IGRINS data from Google Drive

IGRINS stores all of its data on Google Drive. Here we show how to programmatically fetch that data with Python.

[1]:
from muler.igrins import IGRINSSpectrum
import requests
%matplotlib inline
%config InlineBackend.figure_format='retina'

For now, you need the filename and the Google Drive file ID for each spectrum. Eventually this information may come from an observation log provided by the IGRINS team. At the moment, I retrieved it by hand by navigating the Google Drive website.
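
If you copy a share link from the Google Drive website, the file ID is embedded in the URL. The helper below is a minimal sketch of pulling it out. The function name `drive_file_id` is hypothetical, the share URL is assembled here for illustration from a file ID used later in this notebook, and only the two common URL layouts (`/file/d/<ID>/...` and `?id=<ID>`) are handled.

```python
import re
from urllib.parse import urlparse, parse_qs


def drive_file_id(url):
    """Return the Google Drive file ID found in `url`, or None if none is found.

    Hypothetical helper: handles only the `/file/d/<ID>/...` path layout
    and the `?id=<ID>` query-string layout.
    """
    parsed = urlparse(url)
    # Layout 1: https://drive.google.com/file/d/<ID>/view?usp=sharing
    match = re.search(r"/file/d/([A-Za-z0-9_-]+)", parsed.path)
    if match:
        return match.group(1)
    # Layout 2: https://docs.google.com/uc?export=download&id=<ID>
    ids = parse_qs(parsed.query).get("id")
    return ids[0] if ids else None


# Illustrative share link, built from an ID that appears later in this notebook
share_url = "https://drive.google.com/file/d/1tBY0NCcTnnCkvXXvFNOiqd4e10S6W2RB/view?usp=sharing"
print(drive_file_id(share_url))
```

The returned string is what goes into the values of the download dictionary.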

These data are already public in the muler_example_data repo, but I uploaded them to a public Google Drive folder for the purpose of this demo. Eventually the entire IGRINS archive will be in a public Google Drive like this.

[2]:
download_dictionary = {'SDCH_20201202_0059.spec_a0v.fits':'1tBY0NCcTnnCkvXXvFNOiqd4e10S6W2RB',
                       'SDCH_20201202_0059.sn.fits':'1NlIUWPxiN_nkZ83JsTzXby_5ihSSZ34o'}

We need two files because the IGRINS pipeline houses the uncertainty values in a separate file from the flux values.
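
Since the two files share a prefix, the name of the uncertainty file can be derived from the name of the flux file. This is only a sketch: the function name `companion_sn_filename` is hypothetical, and the naming pattern (`<prefix>.spec_a0v.fits` paired with `<prefix>.sn.fits`) is inferred from the two filenames in this demo rather than from a documented convention.

```python
def companion_sn_filename(spec_filename):
    """Guess the uncertainty-file name that pairs with a flux-file name.

    Assumes everything before the first "." is the shared prefix,
    as in the two example filenames used in this notebook.
    """
    prefix = spec_filename.split(".", 1)[0]
    return prefix + ".sn.fits"


print(companion_sn_filename("SDCH_20201202_0059.spec_a0v.fits"))
```

Note that the Google Drive file ID of the second file still has to be looked up separately; only the filename can be guessed this way.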

[3]:
def download_file(filename, file_id):
    """Download a publicly shared file from Google Drive and save it as `filename`."""
    URL = "https://docs.google.com/uc?export=download"
    CHUNK_SIZE = 32768  # number of bytes requested per chunk

    with requests.Session() as session:
        response = session.get(URL, params={"id": file_id}, stream=True)
        # Raise an HTTPError rather than silently skipping a failed request
        response.raise_for_status()

        with open(filename, "wb") as f:
            for chunk in response.iter_content(CHUNK_SIZE):
                if chunk:  # skip empty keep-alive chunks
                    f.write(chunk)
    print("Downloaded {}".format(filename))
[4]:
for key, value in download_dictionary.items():
    download_file(key, value)
Hooray, we downloaded the files! They are saved in the local directory.
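
A successful HTTP response does not guarantee that the saved bytes are a FITS file: Google Drive can sometimes return an HTML page (for example a confirmation or sign-in page) in place of the file. As a hedged sanity check, the sketch below tests whether a file starts with the `SIMPLE  =` keyword that opens a standard FITS primary header. The helper name `looks_like_fits` is hypothetical, and the check looks only at the first few bytes, so it does not validate the rest of the file.

```python
from pathlib import Path

# A standard FITS file begins with the SIMPLE keyword in its primary header
FITS_MAGIC = b"SIMPLE  ="


def looks_like_fits(path):
    """Return True if the file at `path` starts with the FITS header keyword."""
    with open(path, "rb") as f:
        return f.read(len(FITS_MAGIC)) == FITS_MAGIC


# Check the two files from this demo, if they are present locally
for name in ["SDCH_20201202_0059.spec_a0v.fits", "SDCH_20201202_0059.sn.fits"]:
    if Path(name).exists():
        print(name, looks_like_fits(name))
```

If a file fails this check, opening it in a text editor usually reveals what the server sent instead.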
We can simply read them in by passing the filename:
[5]:
spectrum = IGRINSSpectrum(file='SDCH_20201202_0059.spec_a0v.fits', order=11).normalize()
[6]:
spectrum = spectrum.trim_edges().remove_nans()
spectrum.plot();

Great, we fetched the spectrum from Google Drive, did some light post-processing, and plotted it!