Batch download Library of Congress images


darkbluesky
12-07-08, 07:35 AM
Hello,

I hope someone here can help me. I want to download a lot of images from the Library of Congress (they are royalty-free and downloadable): http://lcweb2.loc.gov/cgi-bin/query/p?pp/fsaall:@filreq(@field(COLLID+fsac)+@field(COLLID+f sac))::SortBy=CALL

But I want to download the full-size TIFF files (44-190 MB each), whose links are available on each photo's page, reached by clicking its thumbnail (e.g. http://lcweb2.loc.gov/cgi-bin/query/I?fsaall:1:./temp/~pp_gddM::displayType=1:m856sd=fsac:m856sf=1a33849 :@@@mdb=fsaall )

At the top of that page there is also a 'Bibliographic Information' link that shows the photo's data (title, author, year, etc.), which I need to download too.

So the problem is to batch download these TIFF files AND the bibliographic info, in a way that keeps each pair related (same name, same folder, etc.). I have tried a lot of batch image downloaders and "site crawlers", but I have had no success.

Does anyone have an idea or advice on how to do this, or on how to use something like a site crawler to do what I want?

Thank you very much!
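
For what it's worth, the "same name, same folder" pairing described above could be sketched in Python roughly as follows. This is only a minimal sketch under assumptions: it expects that you have already collected, for each photo, an ID plus the TIFF link and the 'Bibliographic Information' link (the function names and the (item_id, tiff_url, info_url) layout are hypothetical, not anything the Library of Congress site provides). It does not crawl the thumbnail pages itself, and links under ./temp/ may be session-specific.

```python
import shutil
import urllib.request
from pathlib import Path


def save_url(url, dest, chunk_size=1024 * 1024):
    """Stream url to dest in chunks so a large TIFF never sits fully in memory."""
    dest = Path(dest)
    # Write to a temporary ".part" file first, so an interrupted download
    # is never mistaken for a finished one on the next run.
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
    tmp.replace(dest)


def download_pairs(items, out_dir):
    """Save each photo's TIFF and bibliographic page under one shared base name.

    items: iterable of (item_id, tiff_url, info_url) tuples (hypothetical layout).
    Returns the list of (tiff_path, info_path) pairs.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for item_id, tiff_url, info_url in items:
        tiff_path = out_dir / f"{item_id}.tif"
        info_path = out_dir / f"{item_id}.html"
        if not tiff_path.exists():  # skip files finished in an earlier run
            save_url(tiff_url, tiff_path)
        if not info_path.exists():
            save_url(info_url, info_path)
        saved.append((tiff_path, info_path))
    return saved
```

Calling download_pairs with a list such as [("1a33849", tiff_link, info_link)] and a folder name would leave 1a33849.tif and 1a33849.html side by side, where tiff_link and info_link stand for the two links copied from that photo's page. Re-running the script skips files that are already complete, which matters with files of this size.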