Frequently Asked Questions
- Q1: How do I download large files?
A: Large files are files with sizes greater than 2GB (i.e., 2^31 or 2,147,483,648 bytes). While most modern systems and browsers can handle large files without complaint, older 32-bit systems may still require some special attention. For the details on why, consult the Wikipedia article: Large File Support.
So, if you are downloading, say, a 5GB file and find that it suddenly stops after 2GB or (more rarely) 4GB, first check that you're not downloading the file onto a FAT32-formatted disk drive (most USB drives come formatted as FAT32), which limits individual files to just under 4GB. Then try updating your browser. If you're copying to a drive with a large-file-friendly format (Windows NTFS, Mac HFS+, Linux ext3 and similar) and updating the browser doesn't help or isn't possible, try one of these utilities:
- The wget command seems to handle large files in all cases we've tried. Versions for Linux and binaries for Windows are available from Wget's Website. Mac users may find that wget is already among the tools available from the Unix command line. Wget will run in the background or as a batch job. The typical command looks like this:
% wget URL
You can copy and paste the URL from your browser into the command line. The output file is given the same name as the file on the server. If you are downloading a file from a password-protected area, you'll need to use the --user and --password options (note the leading double dash) to supply the user name and password you've been given. For example, if your user name is "steve" and the password is "send2ME", the command would look like this:
% wget --user="steve" --password="send2ME" URL
Please read your local documentation for extensive variations on the theme.
- cURL is an increasingly popular option for downloading large files. This utility will also run in the background and is available for all major operating systems from the cURL web site. Contemporary Linux and Mac users may find it already installed and waiting for them at the command line.
A typical cURL command line looks like this:
% curl -O URL
where -O (that's an uppercase letter "O") tells curl to use the same name for the output file as the file name found at the end of the URL (omit this and curl will dump the file directly to your screen, which is probably not what you want); and URL can be copied and pasted from your browser. If the URL you are trying to access is password protected, then you will need to use the -u option to supply the user name and password. For example, if your user name is "steve" and the password is "send2ME", the command would look like this:
% curl -O -u "steve:send2ME" URL
Once again, see your local documentation for extensive options, including how to supply your own output file name and how to resume a transfer that was interrupted in midstream.
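As a rough sketch of the two variations just mentioned, the functions below wrap the resume and output-name options of both tools. They assume a POSIX-style shell such as bash; the function names are made up for illustration, and the URL and output file name are whatever you supply.

```shell
# Resume an interrupted download, keeping the server's file name.
# wget: -c (--continue) picks up where the partial file left off.
resume_wget() { wget -c "$1"; }

# curl: "-C -" asks curl to work out the resume offset from the partial file.
resume_curl() { curl -C - -O "$1"; }

# Save the download under a name of your own choosing (second argument).
# wget uses -O (uppercase letter "O") for this; curl uses -o (lowercase).
save_as_wget() { wget -O "$2" "$1"; }
save_as_curl() { curl -o "$2" "$1"; }
```

For example, running resume_curl URL in the directory that holds the partial file should continue the transfer rather than start it over, provided the server supports resuming.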
- Q2: What do I do with an ISO file?
A: ISO files are files that have been formatted to the ISO 9660 (generally level 2) standard for CDROM data. This file format can be burned directly to CD and generally to DVD (in the UDF/ISO bridge form), or you can examine the file contents directly, depending on what resources you have available.
Burning to Disk
Many commercial software packages for burning CDs will be able to burn an ISO image directly onto the disk, though you may have to hunt for the right option. Look in the help/manual for information on creating and burning "disk images".
Note: If you just add the .iso file to the list of files to burn, you won't get a usable result - you'll get a CD which will just show the .iso file in its root directory. You may have to change the extension, depending on your software, to get it to recognize the .iso file as a pre-mastered disk image instead of a regular input file.
On Linux systems, you can use the cdrecord command-line utility to burn .iso files onto CDs. For DVDs, use the growisofs command. Alternatively, a GUI interface like X-CD-Roast may also be available.
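A minimal sketch of the two Linux commands follows, assuming a POSIX-style shell. The function names and the default device paths (/dev/cdrom and /dev/dvd) are assumptions for illustration; substitute the device your own system reports, and check your cdrecord and growisofs documentation before burning.

```shell
# Burn a pre-mastered ISO image to a blank CD.
# Usage: burn_cd file.iso [device]
burn_cd() { cdrecord -v dev="${2:-/dev/cdrom}" "$1"; }

# Burn a pre-mastered ISO image to a blank DVD.
# The "-Z device=image" form writes the image as it stands instead of
# building a new file system around the .iso file.
# Usage: burn_dvd file.iso [device]
burn_dvd() { growisofs -dvd-compat -Z "${2:-/dev/dvd}=$1"; }
```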
Reading the File Directly
Sometimes you can inspect the contents of the .iso file without having to burn it onto a disk. This depends heavily on your OS:
- On Windows, MagicISO seems to be a popular shareware program for performing various tasks with ISOs. There are commercial options as well. Note, however, that we have no information on Vista systems - everything we could find was for XP, etc. If you know something that we don't, please do tell us!
- Users of reasonably contemporary Macs (OS X, OS 9) can put the .iso file on the desktop and double-click it to have the file opened as a virtual disk drive (a drive icon appears) that can then be browsed. In some circumstances this might even be done for you automatically.
- On Linux, if you have root privileges, you can mount the .iso file on a loop-back device and browse its contents. The simplest form of this command looks like this:
% mount file.iso /mnt/readiso -t iso9660 -o loop
where "file.iso" is the name of the ISO image to be read and "/mnt/readiso" is a mount point created for the image. The option details depend on your specific flavor of Linux (the above works in Red Hat r4), so consult your mount command documentation before attempting this.
There are also some shareware and freeware ISO file manipulators out there for Linux systems, but none seems to stand out in the small crowd.
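The full Linux sequence (create the mount point, mount, browse, unmount) might look something like the sketch below. It assumes a POSIX-style shell run as root; the function names are hypothetical, and /mnt/readiso is simply the example mount point used above.

```shell
# Mount an ISO image read-only on a loop-back device and list its top level.
# Usage (as root): browse_iso file.iso [mount_point]
browse_iso() {
    mnt="${2:-/mnt/readiso}"
    # Create the mount point if it does not exist yet.
    mkdir -p "$mnt" || return 1
    # Same options as the mount command above, plus "ro" for read-only.
    mount -t iso9660 -o loop,ro "$1" "$mnt" || return 1
    ls -l "$mnt"
}

# Detach the image when you are finished browsing.
# Usage (as root): release_iso [mount_point]
release_iso() { umount "${1:-/mnt/readiso}"; }
```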
- Q3: What do I do with a checksum?
A: Checksums are a way of ensuring that data has not been corrupted, either accidentally or maliciously. There are dozens of different types of checksums out there - MD5 is one of the most widespread and generally supported at the moment, and thus the one we're using on our website for our download files.
The checksum is calculated by applying an algorithm to every byte in the target file. The result is a string of hexadecimal digits which is (to a very high probability) unique to the file. By calculating the checksum on a copy of a file you've downloaded and comparing it to the checksum calculated by us prior to the transfer, you can check to see if a file has been corrupted on download before, say, writing it to DVD or trying to unpack it.
How Do I Calculate It?
Which software you use to calculate the MD5 checksum doesn't matter. Here are some commonly available routines we know about:
- The md5 utility is available for Unix/Linux and Windows/MS-DOS from www.fourmilab.ch/md5/.
This site offers source code for Unix/Linux users and a 32-bit executable for Windows users. Our office Macs appear to have this utility available from the command line as well. To generate the MD5 checksum for a file:
% md5 file_name
- Unix/Linux users may find the md5sum utility already available on their systems. A quick search of the net will turn up programs of this name that will run on a variety of platforms. To generate an MD5 checksum:
% md5sum file_name
- The OpenSSL suite includes utilities for computing a number of different checksums, including MD5 sums. Unix/Linux and Solaris users may find a version of this already installed on their systems. The command to calculate an MD5 checksum with OpenSSL looks like this:
% openssl md5 file_name
What Do I Do with It?
We will collect the checksums for all the large files and ISO images associated with a data set into a single file, for convenience. If you download one or more of these large/ISO files, you should run your favorite MD5 routine on the file you receive and compare the resulting string to the checksum listed in the dataset checksum list on our website. If the strings are the same, you can unpack or burn the data knowing that at least the source file is clean and uncorrupted.
If the MD5 string is different, try downloading the file again. If you still can't match the checksum in our file, please let us know as soon as possible - it may be that our file has somehow been corrupted.
- Q4: How do I open zip and tar files?
A: Zip and tar are two utilities we use to package entire directory trees for download. Zip compresses the files as it packages them; tar only copies the files and directory structure into a single file that can then itself be compressed - usually via the GNU gzip (not the same as zip) routine. The ".zip" extension means the files have been packaged and compressed using Zip alone. The ".tar" extension means the files have been packaged by tar but are not compressed. The extensions ".tar.gz" and ".tgz" are used to indicate .tar files that have been compressed by gzip or the "-z" option of the tar command itself.
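On Unix/Linux systems, the extensions described above map onto extraction commands roughly as in the sketch below. It assumes a POSIX-style shell with unzip and tar installed; the function name unpack is made up for illustration.

```shell
# Unpack an archive into the current directory, choosing the command
# from the file extension.
# Usage: unpack archive_name
unpack() {
    case "$1" in
        *.zip)          unzip "$1" ;;       # packaged and compressed by Zip
        *.tar.gz|*.tgz) tar -xzf "$1" ;;    # tar archive compressed by gzip
        *.tar)          tar -xf "$1" ;;     # plain, uncompressed tar archive
        *)  echo "unpack: unrecognized extension: $1" >&2
            return 1 ;;
    esac
}
```

To see what an archive contains before extracting it, tar -tzf file.tar.gz and unzip -l file.zip list the contents without unpacking anything.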
To unpack the files, Unix/Linux users can run "unzip" and the extract form of the tar command; Mac users can generally click on any of these file types and get the right results; and Windows users can use the WinZip utility. Here are some representative links in case your system doesn't have the unpacking utility you want or need: