Gallery Thief Image Download Scripts

Download Here  >>> Click to download   
This is a collection of several scripts, each of which will download every available gallery picture at a web site. Which script you use depends on how the web site is organized.

OVERVIEW

When the scripts run, they create their own folder to store files in. The folder is named after the script. The downloaded picture files are generally given unique names based on the original URL, so there won't be any name clobbering. If a file already exists, it will not be downloaded again. In practical terms, this means you can shut your computer down while the script is running and the script will pick up where it left off the next time you start it.
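
The skip-if-present behavior described above can be sketched in a few lines of Python. This is only an illustration, not the VBScript the scripts actually contain. The function names and the exact naming rule (host plus path, with unsafe characters replaced by underscores) are assumptions.

```python
import os
from urllib.parse import urlsplit

def local_name(url):
    """Turn a picture URL into a flat, filesystem-safe file name (assumed rule)."""
    parts = urlsplit(url)
    raw = parts.netloc + parts.path
    # Keep letters, digits, dot, underscore, and dash; replace everything else.
    return "".join(c if c.isalnum() or c in "._-" else "_" for c in raw)

def needs_download(url, folder):
    """True only when the picture is not already sitting in the folder."""
    return not os.path.exists(os.path.join(folder, local_name(url)))
```

Because the name is derived from the URL alone, a restarted run computes the same names and skips whatever is already on disk.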

To run the scripts, I suggest you right-click them and select "Open" (if you want the script to run invisibly) or "Open with Command Prompt" (if you want to see the script run). If you just double-click the script, Windows will pick one of the two for you (the default is to run invisibly, but maybe you changed the default).

There are other download programs, plugins, and extensions out there. These scripts are BETTER because:

  1. Unlike a compiled program, you can see the insides of a script. To quote the late RR, "Trust, but verify". And with a script, you CAN verify.
  2. A script runs totally invisibly by default. Nobody is going to walk up and ask you about the downloader you have running.

Because these are simple scripts, nothing is hidden from you. You can right-click any of the scripts and select "Edit" to see every line of what the scripts will do. If you don't want to run the script straight through, you can single-step it by installing either Visual Studio:
http://lab.msdn.microsoft.com/express/vbasic/default.aspx
or the Scripting Debugger:
http://www.microsoft.com/downloads/details.aspx?familyid=2F465BE0-94FD-4569-B3C4-DFFDF19CCD99

I've released these scripts into the "Public Domain", so feel free to pass them around or modify them. Just don't expect me to support them if you modify them. Actually, don't expect me to support them at all. Just because they work for me doesn't mean I can guarantee they'll work for you. Software is like that.


USAGE

The scripts must be modified for each web site whose gallery you wish to download. I suggest you give a modified script a name which will remind you what it is for. After all, the script is going to save the pictures in a folder named after the script name.

To edit a script, right-click it (I said RIGHT click it, not left click, not double click, but RIGHT click) and select "Edit" from the resulting list of choices. The "Notepad" program will open and show you the source code of the script. Scroll down a few lines until you come to a section marked off as the "USER EDITABLE AREA". It will be surrounded by asterisks (stars). You should only modify the code in that section, and then only things on the right side of an equals sign. Each of the scripts comes filled out with actual working sample entries to help you.


CONTACTING ME FOR HELP

If modifying the script is beyond you, or you seem to have something that the script can't quite handle, feel free to contact me at the email address given in the "readme.txt" file inside the zip file (the "Download Here" link with the WinZip icon near the top right of this web page).

Give me a couple of picture URLs or URLs to some gallery pages and I'll send you back a script. Don't worry about what kind of URLs you're asking about. If a script solution is possible, I'll send you what you need. Just make sure I can send you back a script! Many email systems block scripts. Try sending yourself a script and see if it gets through. Let me know if you want me to send it in some special way (encrypted or encoded, for example).

If you ask me to send an encrypted or encoded response, I'll assume some level of privacy is wanted and will NOT give a description of the script or directions on how to decode or decrypt it.


MODIFYING THE "UnrelatedNumbers.vbs" SCRIPT

The "Unrelated Numbers" script assumes you can get the URL of individual pictures and that those pictures fall in a numeric sequence. For example, when I visited the "starwarsgalaxies.station.sony.com" web site and looked for user screen captures, I found that the pictures all seemed to follow a simple pattern: each was named for the year, month, and day. In the "USER EDITABLE AREA" shown below, the first number runs from "2004" through "2005" (the year), the second from "01" to "12" (the month), and the third from "01" through "31" (the day). Sure, some months don't have 31 days, so I won't get pictures for those missing days. So what? Since there is no text between the year and month, I leave the "SECOND_TEXT" entry blank (nothing inside the quotes). If Sony had chosen to put a dash in their URL (like "2004-01-01.jpg"), then I'd have shown the dash between the year and month like this:
Const SECOND_TEXT = "-"
The ".jpg" that follows the day goes in "FOURTH_TEXT". Since there is no fourth number or fifth text, I just leave those entries empty.

'********************* USER EDITABLE AREA *********************************
'**************************************************************************
'http://starwarsgalaxies.station.sony.com/images/player_screenshots/20040101.jpg
Const FIRST_TEXT = "http://starwarsgalaxies.station.sony.com/images/player_screenshots/"
Const FIRST_NUMBER_LOW = "2004"
Const FIRST_NUMBER_HIGH = "2005"
Const SECOND_TEXT = ""
Const SECOND_NUMBER_LOW = "01"
Const SECOND_NUMBER_HIGH = "12"
Const THIRD_TEXT = ""
Const THIRD_NUMBER_LOW = "01"
Const THIRD_NUMBER_HIGH = "31"
Const FOURTH_TEXT = ".jpg"
Const FOURTH_NUMBER_LOW = ""
Const FOURTH_NUMBER_HIGH = ""
Const FIFTH_TEXT = ""
gintSelectedFilePrefix = FILE_PREFIX_PICTURE
'**************************************************************************
'********************* END OF USER EDITABLE AREA **************************
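
To see which URLs the settings above describe, here is a rough Python sketch of the enumeration. It is not the script's actual VBScript. The helper names are made up, and both the zero-padding rule (pad to the width of the low value) and the loop order (last number changes fastest) are assumptions.

```python
from itertools import product

def number_range(low, high):
    """Return every number from low to high as text, padded to the width of low."""
    if low == "":
        return [""]  # an empty entry contributes nothing to the URL
    width = len(low)
    return [str(n).zfill(width) for n in range(int(low), int(high) + 1)]

def build_urls(texts, ranges):
    """texts has one more entry than ranges: text, number, text, number, ..., text."""
    columns = [number_range(low, high) for low, high in ranges]
    for combo in product(*columns):
        url = texts[0]
        for number, text in zip(combo, texts[1:]):
            url += number + text
        yield url

# Same values as the USER EDITABLE AREA above.
texts = ["http://starwarsgalaxies.station.sony.com/images/player_screenshots/",
         "", "", ".jpg", ""]
ranges = [("2004", "2005"), ("01", "12"), ("01", "31"), ("", "")]
urls = list(build_urls(texts, ranges))
```

With these values the list holds 2 x 12 x 31 = 744 candidate URLs, from "...20040101.jpg" through "...20051231.jpg", including impossible dates such as February 31 that simply won't download.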


MODIFYING THE "IndexPage.vbs" SCRIPT

The "Index Page" script assumes a multi-level site where picture URLs can't be guessed and gallery pages also have dynamic URLs. The only thing you know for sure is an initial "home" page which contains links to several "gallery" pages. Each gallery page contains links to several "pictures". You supply the "home" URL, then supply unique "link text" to identify the gallery links on the home page. You'll need to look at the HTML source to do this. Then you look at the gallery pages and supply unique link text to identify picture links. In the example below, the home page links to dozens of other pages with darned near random names. However, when I examined the source code of the home page, I saw that all the links were HTM pages tied to images like this:
<A HREF="jan01/k9188-1.htm"><IMG SRC="jan01/k9188-1x.jpg"
To make it generic, I only used the part that never changed:
.htm"><IMG SRC=
And of course, since inside quotes have to be escaped with another quote, it became this:
.htm""><IMG SRC=
And that's what I used as my GALLERY_LINK_TEXT. When I go to the gallery page, well... this is a lousy example. It's not really a gallery. It only has one link to a picture. Good enough! It's a raw HTML link like this:
640 pixels wide: (<A HREF="k9188-1.jpg">k9188-1.jpg</A>)
Again to make it generic across all "gallery" pages, I only use the part that won't change:
.jpg</A>)
And that's what I use as the PICTURE_LINK_TEXT. In most cases (where a thumbnail picture links to a larger picture), you'd use something like this:
Const PICTURE_LINK_TEXT = ".jpg""><IMG"

'********************** USER EDITABLE AREA ********************************
'**************************************************************************
Const HOME_PAGE = "http://www.ars.usda.gov/is/graphics/photos/fruitsimages.new.htm"
Const GALLERY_LINK_TEXT = ".htm""><IMG SRC="
Const PICTURE_LINK_TEXT = ".jpg</A>)"
gintSelectedFilePrefix = FILE_PREFIX_PICTURE
'**************************************************************************
'********************* END OF USER EDITABLE AREA **************************
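
One way to picture what "link text" matching does is the Python sketch below. It is a guess at the idea, not the script's real VBScript: it finds each occurrence of the link text, then widens outward to the nearest quote, angle bracket, or whitespace to recover the whole URL. The function name, the delimiter set, and the case-insensitive match are all assumptions. Note that Python does not need the doubled quotes that VBScript does.

```python
import re
from urllib.parse import urljoin

DELIMITERS = set("\"'<> \t\r\n")  # characters assumed to end a URL in HTML source

def find_links(html, link_text, page_url):
    """Return the absolute URL surrounding every occurrence of link_text."""
    links = []
    for match in re.finditer(re.escape(link_text), html, re.IGNORECASE):
        start = end = match.start()
        # Walk left to the delimiter that precedes the URL.
        while start > 0 and html[start - 1] not in DELIMITERS:
            start -= 1
        # Walk right to the delimiter that follows the URL.
        while end < len(html) and html[end] not in DELIMITERS:
            end += 1
        url = urljoin(page_url, html[start:end])  # resolve relative links
        if url not in links:
            links.append(url)
    return links

home_url = "http://www.ars.usda.gov/is/graphics/photos/fruitsimages.new.htm"
home_html = '<A HREF="jan01/k9188-1.htm"><IMG SRC="jan01/k9188-1x.jpg"'
galleries = find_links(home_html, '.htm"><IMG SRC=', home_url)

gallery_html = '640 pixels wide: (<A HREF="k9188-1.jpg">k9188-1.jpg</A>)'
pictures = find_links(gallery_html, '.jpg</A>)', galleries[0])
```

The same two-step walk (home page to gallery pages, gallery pages to pictures) works with either style of PICTURE_LINK_TEXT, since the URL is taken from whatever sits directly around the match.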


MODIFYING THE "SearchEngineIndex.vbs" SCRIPT

The "Search Engine Index" script is just a modification of the above "Index Page" script. I noticed that for many (an awful lot of) galleries, there is no single place you can go. You just have to let a search engine find them. The results page from the search engine is basically like a home page with links to the gallery pages. In the example below, I decided to use Google (highly recommended) to search for pages at the eaa38.com web site. To do that at Google, you'd use a search term like this:
site:eaa38.com
When you escape that to make it safe to use as a real URL, it becomes
site%3Aeaa38.com
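
If you'd rather not escape a search term by hand, any URL-encoding routine will do it. As a small illustration (in Python rather than VBScript, and not part of the scripts), the standard library's quote function gives the same result:

```python
from urllib.parse import quote

# safe="" means no reserved characters are left alone, so ":" is escaped too.
escaped = quote("site:eaa38.com", safe="")
print(escaped)  # site%3Aeaa38.com
```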

Now, one of the added features in this script over an ordinary index page script is that this script will crawl through Google's maximum of 1000 results. To do that, you need to set it up to return 100 results per page and to start at result zero. That's done with a URL query like this:
&start=0&num=100
The script will modify the URL automatically to change the start value from 0 to 100, 200, 300, and so on up to 900. Really, the only thing you have to do in most cases is replace "eaa38.com" with whatever web site you're interested in, for example "www.kitties.com". You'll need to replace it in both the SEARCH_ENGINE_URL line and the RESULTS_PAGE_LINK_TEXT line. In most cases (where a thumbnail picture links to a larger picture), you won't need to do anything with the PICTURE_LINK_TEXT.

'********************** USER EDITABLE AREA ********************************
'**************************************************************************
Const SEARCH_ENGINE_URL = "http://www.google.com/search?q=site%3Aeaa38.com&start=0&num=100"
Const RESULTS_PAGE_LINK_TEXT = "http://EAA38.com"
Const PICTURE_LINK_TEXT = ".jpg""><IMG"
gintSelectedFilePrefix = FILE_PREFIX_PICTURE
'**************************************************************************
'********************* END OF USER EDITABLE AREA **************************
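
The paging described above amounts to rewriting the "start" value once per results page. A minimal Python sketch of that step follows. It is not the script's own code, and the function name and parameters are hypothetical.

```python
import re

def result_page_urls(search_url, page_size=100, max_results=1000):
    """Return search_url once per results page, with start=0, 100, ... 900."""
    pages = []
    for start in range(0, max_results, page_size):
        # Swap only the digits that follow "start=", leaving the rest alone.
        pages.append(re.sub(r"([?&]start=)\d+", rf"\g<1>{start}", search_url))
    return pages

pages = result_page_urls(
    "http://www.google.com/search?q=site%3Aeaa38.com&start=0&num=100")
```

This yields ten URLs. The first is identical to SEARCH_ENGINE_URL and the last contains "&start=900&num=100".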



