Recently I had to download a lot of files from a remote FTP server. The best solution in cases like this is to log in to the remote server and create a compressed archive of all the files (for this, use tar -zcvf archivename.tgz /path/to/archive/). That way you only have to download one file, which is also compressed, and FTP handles that perfectly.
But this time I had no shell on the remote server, just an FTP account. So what's the best way to download a large number of files recursively?
The first thing I did was look at the ftp manual page. I forgot to mention that the machine I had to download all these files to is a headless server, so there is no graphical interface and no handy graphical FTP client. In the ftp man page, the closest thing to what I needed was the mget command:
Expand the remote-files on the remote machine and do a get
for each file name thus produced. See glob for details on
the filename expansion. Resulting file names will then be
processed according to case, ntrans, and nmap settings.
Files are transferred into the local working directory, which
can be changed with ‘lcd directory’; new local directories
can be created with ‘! mkdir directory’.
Useful, but not in my case, where I have multiple subdirectories. A quick Google search showed that the FTP protocol itself doesn't support recursive download, so you have to rely on the client's options to do it. Let's see how to do it with Wget.
GNU Wget is a free software package for retrieving files using HTTP, HTTPS and FTP, the most widely-used Internet protocols. It is a non-interactive commandline tool, so it may easily be called from scripts, cron jobs, terminals without X-Windows support, etc.
So this seems like the perfect tool to use on a server. As a plus, wget is available in the repositories of virtually every Linux distribution, which makes installing it trivial.
The basic syntax for wget is:
wget ftp://myusername:mypassword@ftp.yoursite.com/yourfile
With a command like this one, you use the FTP protocol with the account myusername and the password mypassword to download the file yourfile from ftp.yoursite.com.
But we need some extra options to get a recursive download from that FTP site.
-r, --recursive Turn on recursive retrieving.
-l depth, --level=depth Specify recursion maximum depth level depth. The default maximum depth is 5.
So our command becomes:
wget -r --level=99 ftp://myusername:mypassword@ftp.yoursite.com/
This way, starting from the root directory, wget downloads recursively down to 99 levels (or you can use inf for infinite depth).
Or you can use the -m option (which stands for mirror).
The -m option turns on mirroring, i.e. it turns on recursion and time-stamping, sets infinite recursion depth and keeps FTP directory listings:
wget -m ftp://myusername:mypassword@ftp.yoursite.com/
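If you run these commands often, the two variants can be wrapped in a small shell function. This is only a sketch: the function name ftp_fetch_tree and the environment variables FTP_USER and FTP_PASS are names I made up for the example, not something wget defines. It assumes the credentials contain no characters that would need escaping inside a URL.

```shell
#!/bin/sh
# Sketch: one helper for both recursive download variants.
# ftp_fetch_tree, FTP_USER and FTP_PASS are hypothetical names.

ftp_fetch_tree() {
    # $1 = FTP host
    # $2 = remote path, starting with /
    # $3 = recursion depth (a number or "inf"), or the word "mirror"
    host=$1
    path=$2
    depth=${3:-inf}
    url="ftp://${FTP_USER}:${FTP_PASS}@${host}${path}"

    if [ "$depth" = "mirror" ]; then
        # -m: recursion, time-stamping, infinite depth, keep FTP listings
        wget -m "$url"
    else
        # -r: recursive retrieval, -l: maximum recursion depth
        wget -r -l "$depth" "$url"
    fi
}
```

With FTP_USER and FTP_PASS set, calling ftp_fetch_tree ftp.yoursite.com / 99 runs wget -r -l 99 against the root of that host, and passing mirror as the third argument switches to wget -m.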
If, like me, you have a really big site, I suggest putting nohup in front of the command and running it in the background.
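A minimal sketch of the nohup approach could look like this. The function name mirror_in_background and the default log file name wget-mirror.log are my own choices for the example. Output is redirected to a log file so you can follow the progress later with tail -f.

```shell
#!/bin/sh
# Sketch: start a long wget mirror that keeps running after logout.
# mirror_in_background and wget-mirror.log are hypothetical names.

mirror_in_background() {
    # $1 = FTP URL to mirror
    # $2 = log file (optional, defaults to wget-mirror.log)
    url=$1
    log=${2:-wget-mirror.log}

    # nohup makes the job ignore the hangup signal sent at logout;
    # stdout and stderr both go to the log; & puts the job in the background
    nohup wget -m "$url" > "$log" 2>&1 &

    # $! holds the PID of the background job just started
    echo "$!"
}
```

The function prints the PID of the background job, which is handy if you later want to check on it with ps or stop it with kill.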
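A final tip for wget: if you have to re-run it against the same site, you can also use the -nc option, so that files you already have are not downloaded a second time. Use it with the -r form of the command; -m already turns on time-stamping (-N), which wget does not accept together with -nc.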
If a file is downloaded more than once in the same directory, Wget’s behavior depends on a few options, including -nc. In certain cases, the local file will be clobbered, or overwritten, upon repeated download. In other cases it will be preserved.
When running Wget with -r or -p, but without -N, -nd, or -nc, re-downloading a file will result in the new copy simply overwriting the old. Adding -nc will prevent this behavior, instead causing the original version to be preserved and any newer copies on the server to be ignored.
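To make the difference between the two re-run strategies concrete, here is a small sketch. The function name rerun_download and its keep/update modes are invented for this example. The keep mode uses -nc to leave existing local files alone, while the update mode uses -N (time-stamping) so that wget fetches a file again only when the server copy looks newer.

```shell
#!/bin/sh
# Sketch: re-run a recursive download without fetching everything again.
# rerun_download and its "keep"/"update" modes are hypothetical names.

rerun_download() {
    # $1 = mode: "keep" or "update"
    # $2 = FTP URL
    mode=$1
    url=$2

    case $mode in
        keep)
            # -nc: skip any file that already exists locally
            wget -r -l inf -nc "$url"
            ;;
        update)
            # -N: time-stamping, re-fetch only files that changed on the server
            wget -r -l inf -N "$url"
            ;;
        *)
            echo "usage: rerun_download keep|update URL" >&2
            return 2
            ;;
    esac
}
```

Which mode fits depends on whether the files on the server can change: keep is enough to resume an interrupted download, update is the better choice if you want the local copy to follow the server.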