I wanted to download a bunch of files from a webpage. Fortunately we have Python. Watch how ridiculously easy this is:
Python 2.6 (r26:66721, Oct 2 2008, 11:35:03) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from BeautifulSoup import *
>>> import urllib
>>> import urllib2
>>> baseurl = "http://whatever.com/files/here/"
>>> soup = BeautifulSoup(urllib2.urlopen(baseurl))
>>> links = soup.findAll("a")
>>> for link in links[5:]:
...     print link.text
...     urllib.urlretrieve(baseurl + link.text, link.text)
--- Now watch the fun, your downloads have begun. ---
So now a little explanation. BeautifulSoup is an HTML parser, and a damn good one at that. It handles really badly formed HTML gracefully and makes screen-scraping dead easy. Really cool stuff.
Basically my variable 'soup' holds the parsed contents of the entire webpage. That object has a lot of capabilities; you'll want to check out the BeautifulSoup docs to learn everything it can do. How about this:
soup.findAll("a") #Boom. This will return a Python list of all "a" tags
Now all I do is loop through them all. I skip the first few because, after inspecting the page, those weren't files and I didn't care about them. Then I just call urllib.urlretrieve(url, filename); link.text is the actual text of the link.
In retrospect I probably should've done urlretrieve(baseurl + link['href'], link.text), since a tag's attributes are read like dictionary keys (link.href won't work), but you can figure that out for yourself. This is meant to be inspiration, and to apply it you might have to make some changes to these nine lines.
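For the record, here's roughly what the cleaned-up version looks like with that fix applied. Still Python 2 and BeautifulSoup 3; the URL is the same placeholder as above, and the [5:] slice was specific to my page, so adjust both for yours:

from BeautifulSoup import BeautifulSoup
import urllib
import urllib2

baseurl = "http://whatever.com/files/here/"  # placeholder, same as above
soup = BeautifulSoup(urllib2.urlopen(baseurl))

# Only keep anchors that actually have an href, then skip the first few
# links, which on my page weren't files.
for link in soup.findAll("a", href=True)[5:]:
    print link.text
    # Download using the href as the URL and the link text as the filename
    urllib.urlretrieve(baseurl + link["href"], link.text)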