Image download module
Download images from the PDS archive.
- marsimage.download.msl_index_products(camera, sol_start, sol_end, product_filter=['DRXX', 'RAD_', 'MXY_'], remove_thumbnails=True, find_best=True)
Index products from the PDS archive.
Warning: This function may fail for very large sol folders (such as MSL Navcam), because the PDS can take a very long time to respond to many of these requests.
- Parameters:
camera (str) – The camera to download images from. Options are ‘mastcam’, ‘mahli’, ‘mardi’, ‘navcam’, ‘hazcam’.
sol_start (int) – The starting sol number.
sol_end (int) – The ending sol number.
product_filter (list, optional) – A whitelist of substrings used to filter the product IDs, by default [‘DRXX’, ‘RAD_’, ‘MXY_’]. Only products whose IDs contain one of these strings will be downloaded. If None, all products will be downloaded. TODO: implement regex filtering.
find_best (bool, optional) – If True, only download the highest-quality version of each product, by default True. This also removes non-unique thumbnails.
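The whitelist behaviour described for product_filter can be illustrated with a minimal sketch. The helper filter_product_ids is hypothetical and not part of marsimage. It assumes a product is kept when its ID contains any one of the whitelist strings, and the product IDs are made-up examples.

```python
def filter_product_ids(product_ids, product_filter=("DRXX", "RAD_", "MXY_")):
    """Keep IDs containing at least one whitelist substring (hypothetical helper)."""
    if product_filter is None:
        # No whitelist given: every product passes through.
        return list(product_ids)
    return [
        pid for pid in product_ids
        if any(token in pid for token in product_filter)
    ]


# Made-up product IDs, for illustration only.
example_ids = [
    "0003ML0000001000I1_DRXX",
    "0003ML0000001000I1_XXXX",
    "NLA_0001_RAD_A",
]
kept = filter_product_ids(example_ids)
```

Passing product_filter=None to this sketch returns the list unchanged, which mirrors the documented "all products" behaviour.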
- marsimage.download.download(url, path, skip_existing=True, session=None)
Download a file from a URL.
- Parameters:
url (str) – The URL to download the file from.
path (str) – The path to save the file to. If a directory is provided, the file is saved there under the file name taken from the URL.
skip_existing (bool, optional) – If True, skip downloading if the file already exists, by default True.
session (requests.Session, optional) – A requests session to use for the download, by default None. Passing a session allows the same connection to be reused across multiple downloads.
- Returns:
The path to the downloaded file.
- Return type:
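The path and skip_existing rules for download can be sketched as follows. Both helpers are hypothetical and only mirror the documented behaviour; they are not taken from the marsimage source, and the example.com URL in the usage note is a placeholder.

```python
import os
from urllib.parse import urlparse


def resolve_target(url, path):
    """Return the file path a download would be written to (hypothetical helper)."""
    if os.path.isdir(path):
        # Directory given: reuse the last component of the URL path as the file name.
        filename = os.path.basename(urlparse(url).path)
        return os.path.join(path, filename)
    return path


def should_skip(target, skip_existing=True):
    """Skip the download when the target already exists and skipping is enabled."""
    return skip_existing and os.path.exists(target)
```

For example, resolve_target("https://example.com/data/image_0001.IMG", some_dir) yields image_0001.IMG inside some_dir, provided some_dir is an existing directory.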
- marsimage.download.download_pds3(product_url, path, skip_existing=True, session=None)
Download a PDS3 product and label from a URL.
- Parameters:
product_url (str) – The URL to download the product from.
path (str) – The path to save the product to.
skip_existing (bool, optional) – If True, skip downloading if the file already exists, by default True.
session (requests.Session, optional) – A requests session to use for the download, by default None. Passing a session allows the same connection to be reused across multiple downloads.
- Returns:
The path to the downloaded product.
- Return type:
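A PDS3 product is often accompanied by a detached label file. The sketch below shows one way a label URL might be derived from a product URL. It assumes the label shares the product's file name with an .LBL extension, which is a common PDS3 convention but is not confirmed here as what download_pds3 does. The helper label_url_for and the example.com URL are hypothetical.

```python
import posixpath
from urllib.parse import urlparse, urlunparse


def label_url_for(product_url, label_ext=".LBL"):
    """Guess the detached label URL by swapping the file extension (assumption)."""
    parts = urlparse(product_url)
    stem, ext = posixpath.splitext(parts.path)
    # Follow the case of the original extension, e.g. .img -> .lbl
    new_ext = label_ext.lower() if ext.islower() else label_ext
    return urlunparse(parts._replace(path=stem + new_ext))


label_url = label_url_for("https://example.com/sol0003/PRODUCT_0001.IMG")
```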
- marsimage.download.download_products(products_df, output_dir='.', groupby='sol/camera', skip_existing=True, pbar=None, num_threads=None)
Download products from the DataFrame.
- Parameters:
products_df (pd.DataFrame) – The DataFrame containing the product index created by msl_index_products or msl_index_pds_folders.
output_dir (str, optional) – The output directory to save the products to, by default ‘.’.
groupby (str, optional) – The column to group by, by default ‘sol/camera’.
num_threads (int, optional) – The number of threads to use for downloading, by default None, which uses the number of CPUs.
- Returns:
A DataFrame containing the paths to the downloaded files.
- Return type:
pd.DataFrame
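The groupby and num_threads parameters can be illustrated with a standard-library sketch. It assumes that a pattern such as 'sol/camera' maps to nested sub-directories named after the values in those columns, and that num_threads=None falls back to the CPU count. The helpers target_dir and run_threaded, the column names, and the rows are hypothetical and not taken from marsimage.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def target_dir(output_dir, groupby, row):
    """Build a sub-directory from a 'sol/camera'-style pattern (assumed semantics)."""
    # Each '/'-separated name in groupby is looked up in the row.
    parts = [str(row[key]) for key in groupby.split("/")]
    return os.path.join(output_dir, *parts)


def run_threaded(rows, worker, num_threads=None):
    """Apply worker to every row with a thread pool; results keep input order."""
    # None falls back to the CPU count, mirroring the documented default.
    num_threads = num_threads or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(worker, rows))


# Made-up index rows; a real index would come from msl_index_products.
rows = [{"sol": 3, "camera": "mastcam"}, {"sol": 4, "camera": "mahli"}]
dirs = run_threaded(
    rows, lambda row: target_dir("out", "sol/camera", row), num_threads=2
)
```

In a real download the worker would fetch a file instead of returning a directory name. Threads suit this workload because each task spends most of its time waiting on the network.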
- marsimage.download.download_msl(cameras, sol_start, sol_end, output_dir='.', groupby='sol/camera', product_filter=['DRXX', 'RAD_', 'MXY_'], skip_existing=True, **kwargs)
Download images from the PDS archive.
- Parameters:
cameras (str | list of str) – The cameras to download images from. Options are ‘mastcam’, ‘mahli’, ‘mardi’, ‘navcam’, ‘hazcam’. If a list is provided, images from all cameras in the list will be downloaded. If ‘all’ is provided, images from all cameras will be downloaded.
sol_start (int) – The starting sol number.
sol_end (int) – The ending sol number.
output_dir (str, optional) – The output directory to save the products to, by default ‘.’.
product_filter (list, optional) – A whitelist of substrings used to filter the product IDs, by default [‘DRXX’, ‘RAD_’, ‘MXY_’]. Only products whose IDs contain one of these strings will be downloaded. If None, all products will be downloaded. TODO: implement regex filtering.
find_best (bool, optional) – If True, only download the highest-quality version of each product, by default True. This also removes most thumbnails.
num_threads (int, optional) – The number of threads to use for downloading, by default None, which uses the number of CPUs.
- Returns:
A DataFrame containing the paths to the downloaded files.
- Return type:
pd.DataFrame
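The accepted forms of the cameras argument (a single name, a list of names, or 'all') can be normalised as in the sketch below. The helper normalize_cameras and the ValueError on unknown names are assumptions made for illustration; only the camera names come from the parameter description above.

```python
ALL_CAMERAS = ["mastcam", "mahli", "mardi", "navcam", "hazcam"]


def normalize_cameras(cameras):
    """Turn a str, 'all', or a list of str into a validated list (hypothetical helper)."""
    if cameras == "all":
        return list(ALL_CAMERAS)
    if isinstance(cameras, str):
        # A single camera name becomes a one-element list.
        cameras = [cameras]
    unknown = [c for c in cameras if c not in ALL_CAMERAS]
    if unknown:
        raise ValueError(f"Unknown camera(s): {unknown}")
    return list(cameras)


selected = normalize_cameras(["mastcam", "mahli"])
```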