Download Files from GitHub (lexi.download_files_from_github)
- lexi_xray.lexi.download_files_from_github(file_name_list, repo, folder_path, branch='main', save_dir='downloaded_data', verbose=False)
Download files from a GitHub repository. This function is a temporary placeholder: until the real data are hosted on the CDAweb website and can be downloaded directly with the get_lexi_data function, the files are fetched from GitHub instead. It will eventually be removed.
Note
This function downloads files from two folders. The first folder contains the first 950 files, and the second contains the remaining files. The split exists because the GitHub API returns at most 1000 files per request, so a folder holding more than 1000 files has to be divided into several folders. The folder names, files_0_to_950 and files_950_to_1917, are hard-coded in the function.
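The two-folder layout described in the note can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name candidate_urls is hypothetical, and it assumes that repo is given as "owner/name", that folder_path is the parent of the two split folders, and that files are read through raw.githubusercontent.com. Only the two folder names come from the note itself.

```python
from urllib.parse import quote

# Folder names quoted from the note above; everything else here is an assumption.
SPLIT_FOLDERS = ("files_0_to_950", "files_950_to_1917")


def candidate_urls(file_name, repo, folder_path, branch="main"):
    """Return one raw-content URL per split folder for ``file_name``.

    Hypothetical helper: the real function may locate files differently.
    """
    base = f"https://raw.githubusercontent.com/{repo}/{branch}"
    # Normalise the parent folder so leading/trailing slashes do not matter.
    parts = [p for p in folder_path.strip("/").split("/") if p]
    urls = []
    for folder in SPLIT_FOLDERS:
        path = "/".join(parts + [folder, file_name])
        # quote() keeps "/" intact and escapes characters such as spaces.
        urls.append(f"{base}/{quote(path)}")
    return urls
```

A caller that does not know which folder holds a given file could try each candidate URL in turn and keep the first one that responds with status code 200.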
Parameters
- file_name_list : list
List of file names to download
- repo : str
Name of the GitHub repository
- folder_path : str
Path to the folder in the GitHub repository
- branch : str, optional
Name of the branch in the GitHub repository. Default is “main”
- save_dir : str, optional
Directory to save the downloaded files. Default is “downloaded_data”
- verbose : bool, optional
If True, print messages. Default is False
Returns
- local_file_list : list
List of local file paths
Raises
- ValueError
If the status code of the response is not 200
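The contract documented above (a list of file names in, a list of local paths out, ValueError on any non-200 status) can be illustrated with a small self-contained sketch. The names save_downloads, fetch, and fake_fetch are hypothetical and are not part of lexi_xray. The network request is replaced by a callable returning (status_code, content) so that the example runs offline.

```python
import os
import tempfile


def save_downloads(file_name_list, fetch, save_dir="downloaded_data", verbose=False):
    """Hypothetical helper mirroring the documented parameters and return value.

    ``fetch(name)`` is assumed to return a ``(status_code, content_bytes)`` tuple.
    """
    os.makedirs(save_dir, exist_ok=True)
    local_file_list = []
    for name in file_name_list:
        status, content = fetch(name)
        if status != 200:
            # Same failure mode as documented under "Raises".
            raise ValueError(f"Download of {name} failed with status code {status}")
        local_path = os.path.join(save_dir, name)
        with open(local_path, "wb") as fh:
            fh.write(content)
        if verbose:
            print(f"Saved {local_path}")
        local_file_list.append(local_path)
    return local_file_list


def fake_fetch(name):
    """Stand-in for an HTTP request: every file exists except 'missing.cdf'."""
    return (404, b"") if name == "missing.cdf" else (200, b"data")
```

Code calling the real download_files_from_github would typically wrap the call in a try/except ValueError block in the same way, since a single failed response aborts the download.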