Image download module

Download images from the PDS archive.

marsimage.download.msl_index_products(camera, sol_start, sol_end, product_filter=['DRXX', 'RAD_', 'MXY_'], remove_thumbnails=True, find_best=True)

Index products from the PDS archive.

Warning: This function may fail for very large sol folders (such as MSL Navcam), because the PDS takes a long time to reply when many of these requests are made.

Parameters:
  • camera (str) – The camera to download images from. Options are ‘mastcam’, ‘mahli’, ‘mardi’, ‘navcam’, ‘hazcam’.

  • sol_start (int) – The starting sol number.

  • sol_end (int) – The ending sol number.

  • product_filter (list, optional) – A whitelist of strings to filter the product ids, by default [‘DRXX’, ‘RAD_’, ‘MXY_’]. Only products containing these strings will be downloaded. If None, all products will be downloaded. TODO: implement regex filtering.

  • find_best (bool, optional) – If True, only download the highest quality version of each product, by default True. This will also remove non-unique thumbnails.
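
The whitelist behaviour of product_filter can be illustrated with a small standalone sketch. This is not the library's implementation: the helper name filter_product_ids and the sample ids are hypothetical, and the sketch assumes that an id passes when it contains at least one of the whitelisted strings.

```python
def filter_product_ids(product_ids, product_filter=("DRXX", "RAD_", "MXY_")):
    """Keep only ids that contain at least one whitelisted substring.

    Hypothetical helper for illustration; not part of marsimage.
    """
    if product_filter is None:
        # No whitelist given: every product id passes.
        return list(product_ids)
    return [
        pid for pid in product_ids
        if any(token in pid for token in product_filter)
    ]


# Made-up ids, used only to show the matching rule.
ids = ["AAAA_DRXX", "BBBB_RAD_", "CCCC_XXXX"]
kept = filter_product_ids(ids)
```

With the default whitelist, the third id is dropped because it contains none of the filter strings; passing None keeps all three.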

marsimage.download.download(url, path, skip_existing=True, session=None)

Download a file from a URL.

Parameters:
  • url (str) – The URL to download the file from.

  • path (str) – The path to save the file to. If a directory is provided, the file will be saved with the same name as the URL.

  • skip_existing (bool, optional) – If True, skip downloading if the file already exists, by default True.

  • session (requests.Session, optional) – A requests session to use for the download. This allows for reusing the same connection for multiple downloads, by default None.

Returns:

The path to the downloaded file.

Return type:

pathlib.Path
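
The path and skip_existing rules described above can be sketched without any network access. The helpers resolve_target and should_skip are hypothetical names introduced here, and the sketch assumes that "the same name as the URL" means the last component of the URL path.

```python
from pathlib import Path
from urllib.parse import urlparse


def resolve_target(url, path):
    """Return the file path a download of `url` would be written to.

    Hypothetical helper for illustration; not part of marsimage.
    """
    target = Path(path)
    if target.is_dir():
        # Directory given: reuse the last component of the URL path as the file name.
        target = target / Path(urlparse(url).path).name
    return target


def should_skip(target, skip_existing=True):
    """Return True when the file already exists and skipping is enabled."""
    return skip_existing and Path(target).exists()
```

A caller would check should_skip(resolve_target(url, path)) before issuing the request, so files that are already present cost no network traffic.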

marsimage.download.download_pds3(product_url, path, skip_existing=True, session=None)

Download a PDS3 product and label from a URL.

Parameters:
  • product_url (str) – The URL to download the product from.

  • path (str) – The path to save the product to.

  • skip_existing (bool, optional) – If True, skip downloading if the file already exists, by default True.

  • session (requests.Session, optional) – A requests session to use for the download. This allows for reusing the same connection for multiple downloads, by default None.

Returns:

The path to the downloaded product.

Return type:

pathlib.Path

marsimage.download.download_products(products_df, output_dir='.', groupby='sol/camera', skip_existing=True, pbar=None, num_threads=None)

Download products from the DataFrame.

Parameters:
  • products_df (pd.DataFrame) – The DataFrame containing the product index created by msl_index_products or msl_index_pds_folders.

  • output_dir (str, optional) – The output directory to save the products to, by default ‘.’.

  • groupby (str, optional) – The column to group by, by default ‘sol/camera’.

  • num_threads (int, optional) – The number of threads to use for downloading, by default the number of CPUs.

Returns:

A DataFrame containing the paths to the downloaded files.

Return type:

pd.DataFrame
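
The num_threads default can be illustrated with a generic thread-pool sketch. It shows one common way to fan work out over threads with the standard library and is not taken from the marsimage source; run_in_threads and fake_fetch are hypothetical names, and fake_fetch stands in for a real download so that no request is sent.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_in_threads(fetch, urls, num_threads=None):
    """Apply `fetch` to every URL using a pool of worker threads.

    Hypothetical helper for illustration; not part of marsimage.
    """
    if num_threads is None:
        # Fall back to the CPU count; os.cpu_count() may return None.
        num_threads = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map() yields results in the order of the inputs.
        return list(pool.map(fetch, urls))


def fake_fetch(url):
    """Stand-in for a download call: returns the file name, no network access."""
    return url.rsplit("/", 1)[-1]


results = run_in_threads(
    fake_fetch,
    ["https://example.invalid/a.IMG", "https://example.invalid/b.IMG"],
    num_threads=2,
)
```

Threads suit this kind of work because downloads spend most of their time waiting on the network rather than using the CPU.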

marsimage.download.download_msl(cameras, sol_start, sol_end, output_dir='.', groupby='sol/camera', product_filter=['DRXX', 'RAD_', 'MXY_'], skip_existing=True, **kwargs)

Download images from the PDS archive.

Parameters:
  • cameras (str | list of str) – The cameras to download images from. Options are ‘mastcam’, ‘mahli’, ‘mardi’, ‘navcam’, ‘hazcam’. If a list is provided, images from all cameras in the list will be downloaded. If ‘all’ is provided, images from all cameras will be downloaded.

  • sol_start (int) – The starting sol number.

  • sol_end (int) – The ending sol number.

  • output_dir (str, optional) – The output directory to save the products to, by default ‘.’.

  • product_filter (list, optional) – A whitelist of strings to filter the product ids, by default [‘DRXX’, ‘RAD_’, ‘MXY_’]. Only products containing these strings will be downloaded. If None, all products will be downloaded. TODO: implement regex filtering.

  • find_best (bool, optional) – If True, only download the highest quality version of each product, by default True. This will also remove most thumbnails.

  • num_threads (int, optional) – The number of threads to use for downloading, by default the number of CPUs.

Returns:

A DataFrame containing the paths to the downloaded files.

Return type:

pd.DataFrame
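
The three accepted forms of the cameras argument (a single name, a list of names, or ‘all’) can be reduced to one list before any indexing starts. The sketch below is an assumption about how such normalization might look, not the library's code: normalize_cameras and MSL_CAMERAS are hypothetical names, and raising ValueError for unknown names is a design choice made here for illustration.

```python
# Camera names as listed in the parameter description above.
MSL_CAMERAS = ("mastcam", "mahli", "mardi", "navcam", "hazcam")


def normalize_cameras(cameras):
    """Turn a camera name, a list of names, or 'all' into a list of names.

    Hypothetical helper for illustration; not part of marsimage.
    """
    if isinstance(cameras, str):
        # A bare string is either the 'all' shortcut or a single camera.
        cameras = list(MSL_CAMERAS) if cameras == "all" else [cameras]
    unknown = [name for name in cameras if name not in MSL_CAMERAS]
    if unknown:
        raise ValueError(f"unknown camera(s): {unknown}")
    return list(cameras)
```

Validating names up front means a misspelled camera fails immediately instead of after a long indexing run.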