Download Files from GitHub (lexi.download_files_from_github)

lexi_xray.lexi.download_files_from_github(file_name_list, repo, folder_path, branch='main', save_dir='downloaded_data', verbose=False)

Download files from a GitHub repository. This function is a temporary placeholder: until the real data are hosted on the appropriate website, it fetches the files from GitHub. Eventually it will be removed, and the get_lexi_data function will download the files directly from the CDAWeb website.

Note

This function downloads from two folders in the repository. The first folder holds the first 950 files, and the second folder holds the remaining files. The files are split this way because the GitHub API returns at most 1000 files per request, so a folder with more than 1000 files must be divided into multiple folders. The folder names, files_0_to_950 and files_950_to_1917, are hard-coded in the function.
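The split described in the note can be pictured with a short sketch. It is only an illustration, not the package's code: the boundary list is inferred from the two folder names above, the file names are placeholders, and the helper split_into_folders is hypothetical. How the real function decides which folder a given file lives in is not documented here.

```python
# Illustrative only: boundaries inferred from the folder names in the note.
FOLDER_BOUNDS = [0, 950, 1917]


def split_into_folders(sorted_names, bounds):
    """Map each folder name to the slice of file names it would hold.

    Hypothetical helper: consecutive bounds (lo, hi) give a folder named
    'files_<lo>_to_<hi>' that holds sorted_names[lo:hi].
    """
    return {
        f"files_{lo}_to_{hi}": sorted_names[lo:hi]
        for lo, hi in zip(bounds, bounds[1:])
    }


# Placeholder file names standing in for the real data files.
names = [f"file_{i:04d}.dat" for i in range(1917)]
layout = split_into_folders(names, FOLDER_BOUNDS)

for folder, members in layout.items():
    print(folder, len(members))
```

Under these assumptions each folder stays below the 1000-file limit mentioned in the note.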

Parameters

file_name_list : list

List of file names to download

repo : str

Name of the GitHub repository

folder_path : str

Path to the folder in the GitHub repository

branch : str, optional

Name of the branch in the GitHub repository. Default is “main”

save_dir : str, optional

Directory to save the downloaded files. Default is “downloaded_data”

verbose : bool, optional

If True, print messages. Default is False

Returns

local_file_list : list

List of local file paths

Raises

ValueError

If the status code of the response is not 200
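The documented contract (parameters, return value, and the ValueError on a non-200 response) can be sketched as follows. This is a minimal sketch under stated assumptions, not the lexi_xray implementation: it assumes repo is given as "owner/name", that files are fetched from raw.githubusercontent.com, and it ignores the two hard-coded folders described in the note. The names build_raw_url, fetch, download_files_sketch, and the fetcher argument are hypothetical.

```python
import os
import urllib.error
import urllib.request


def build_raw_url(repo, folder_path, file_name, branch="main"):
    """Build a raw-content URL; assumes repo has the form 'owner/name'."""
    return (
        f"https://raw.githubusercontent.com/"
        f"{repo}/{branch}/{folder_path}/{file_name}"
    )


def fetch(url):
    """Return (status_code, body_bytes) for a GET request."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # urllib raises on HTTP errors; report the status code instead.
        return err.code, b""


def download_files_sketch(file_name_list, repo, folder_path, branch="main",
                          save_dir="downloaded_data", verbose=False,
                          fetcher=fetch):
    """Download each file, save it under save_dir, return the local paths."""
    os.makedirs(save_dir, exist_ok=True)
    local_file_list = []
    for file_name in file_name_list:
        url = build_raw_url(repo, folder_path, file_name, branch)
        status, body = fetcher(url)
        if status != 200:
            # Mirrors the documented behaviour: any non-200 status is an error.
            raise ValueError(f"Download failed for {url}: status code {status}")
        local_path = os.path.join(save_dir, file_name)
        with open(local_path, "wb") as handle:
            handle.write(body)
        if verbose:
            print(f"Saved {url} to {local_path}")
        local_file_list.append(local_path)
    return local_file_list
```

The fetcher argument exists only so the sketch can be exercised without network access, for example by passing a function that returns (200, b"...") for every URL.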