FTP (File Transfer Protocol) Documentation¶
The ftp
module is geared towards making it easier to interface with
NCBI’s FTP repository.
More specifically, we provide a way to easily find and list directories and their respective contents as well as to download blast databases and other databases for use with the Orthologs package. We have implemented database downloading with threading which is the safest way to implement this cross-platform.
We also provide a parallel module which can be used in conjunction with
the NcbiFTPClient
to download files or databases much quicker if
your system can handle that.
If you’re using Linux or a supercomputer and do not want to use threading to download ftp databases, you can look at this cli script.
Examples¶
Blastdb Download Example¶
This is a simple example of using some of the modules.
from OrthoEvol.Tools import NcbiFTPClient
ncbiftp = NcbiFTPClient(email='somebody@gmail.com')
ncbiftp.getblastdb(database_name='refseq_rna')
Windowmasker files Download Example¶
from OrthoEvol.Tools import NcbiFTPClient
import os
ids = ['9544', '9606']
ncbiftp = NcbiFTPClient(email='somebody@gmail.com')
ncbiftp.getwindowmaskerfiles(taxonomy_ids=ids, download_path=os.getcwd())
Refseq Release Download Example¶
from OrthoEvol.Tools import NcbiFTPClient
import os
ncbiftp = NcbiFTPClient(email='somebody@gmail.com')
ncbiftp.getrefseqrelease(taxon_group='vertebrate_mammalian', seqtype='rna', seqformat='gbff', download_path=os.getcwd())
List all directories in a path¶
ncbiftp.listdirectories(path='/blast/db/')
Out[54]: ['FASTA', 'cloud']
List all files in a path¶
ncbiftp.listfiles(path='/blast/db/')
List all files in the current working directory¶
# The default path is ftp.pwd() or the current directory
ncbiftp.listfiles()
Notes¶
Check the NCBI README for information about the preformatted blast databases that we use and suggest you use. We also provide an easy way to download them which is referenced in the above example.