Downloading Multiple PEOPLE Images
mgpw4me@yahoo.com:
--- Quote from: nostra on January 18, 2010, 01:06:31 am ---You are probably returning the wrong value in ParsePage. You should return prList (= 2) if you have a list of Movies or Persons and prListImage (= 3) if you have a list of images...
--- End quote ---
--- Quote from: nostra on January 19, 2010, 08:20:57 pm ---People scripts and movie scripts are not meant to get multiple images, only the poster type is meant to.
You can still add multiple images in movie/person scripts using
--- Code: ---AddImageURL(ImageType : Integer; url : String)
--- End code ---
but you will not get a selection dialog in this case.
--- End quote ---
Well, that's "it" for me. I don't see any point in spending further time on scripts.
nostra:
So will you switch to plugins, wait for improved scripts, or abandon the whole idea?
mgpw4me@yahoo.com:
I don't see the scripting environment as a reasonable way to do what I want. The script engine works great for what it was designed to do, but even with improvements it would still be slow and cumbersome for image collection. What I've found is that it's faster to invoke a script (saves typing), but once you've found images it's better to switch to a browser, download the images to a directory, and then load them into PVD from there. That process lets me control the selection, manipulation, and order of the images. It's also much faster, and multiple images aren't an issue.
DLLs are too much trouble. In the time it takes to build a couple of DLLs, I can write a whole environment in PHP.
- PHP provides HTTP access via fopen(), so I can grab exactly the files I want with very little overhead. For example, many sites have the person's name in the file path, so I can go directly to a file and start parsing. If the file doesn't exist, fopen() returns FALSE (see the fetch sketch after this list).
- The HTML and string functions needed to duplicate the PVD scripting environment either already exist or would be easy to write
- I can manipulate the images via GD or ImageMagick. For example, I can get the filesize of what's being downloaded, run a colour histogram on it, and determine the image dimensions from the data blocks. Image size plus histogram would be a reasonable way to determine whether I already have an image for a particular person, and whether a larger image is available. Converting to grayscale before building the histogram usually makes the process very reliable with 'altered' images. A histogram with a very limited colour range is, to some extent, a measure of image quality...not enough contrast = too dark or too bright. Controlling the images also lets me set an exact image size for inclusion into PVD...2000 x 3000 just slows the database down too much when displaying a person. A histogram sketch follows this list.
- I can write subroutine modules and include them as necessary to reduce the complexity of processing multiple websites
- I can build a nice checkbox selection / navigation routine in html
- I can connect directly to the database via the Interbase API
- There is also a language collation class, so I can convert Élodie Bouchez to Elodie Bouchez, which makes searching more reliable on most sites (a transliteration sketch follows this list). Given that there is a 'translated name' field in the database, I could easily (?) populate it and use it when viewing people in my database.
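As a minimal sketch of the fopen() point above (the URL pattern and person name are made up, and allow_url_fopen must be enabled in php.ini):
--- Code: ---<?php
// Sketch: fetch a page whose URL embeds the person's name.
// The URL pattern is hypothetical; real sites differ.
$name = 'elodie-bouchez';
$url  = "http://www.example.com/people/$name.html";

// fopen() returns FALSE (plus a warning) when the page doesn't
// exist, so suppress the warning and test the result directly.
$fp = @fopen($url, 'r');
if ($fp === false) {
    die("No page for $name\n");
}
$html = stream_get_contents($fp);
fclose($fp);
echo strlen($html) . " bytes fetched\n";
--- End code ---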
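And a rough GD sketch of the grayscale-histogram idea. The file name is an example, and counting the gray levels used is only one possible contrast measure:
--- Code: ---<?php
// Sketch with GD: grayscale histogram as a rough quality check.
// A very narrow range of gray levels suggests poor contrast.
$img = imagecreatefromjpeg('photo.jpg');   // example file name
$w = imagesx($img);
$h = imagesy($img);

$hist = array_fill(0, 256, 0);
for ($y = 0; $y < $h; $y++) {
    for ($x = 0; $x < $w; $x++) {
        $rgb = imagecolorat($img, $x, $y);
        $r = ($rgb >> 16) & 0xFF;
        $g = ($rgb >> 8) & 0xFF;
        $b = $rgb & 0xFF;
        // Standard luminance conversion to grayscale
        $gray = (int)(0.299 * $r + 0.587 * $g + 0.114 * $b);
        $hist[$gray]++;
    }
}
// Count how many of the 256 gray levels are actually used
$used = count(array_filter($hist));
echo "$w x $h image, $used of 256 gray levels used\n";
--- End code ---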
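For the accent conversion, PHP's iconv() with the //TRANSLIT flag is one option; the result is locale-dependent, so treat this as a sketch to test rather than a guarantee:
--- Code: ---<?php
// Accent-stripping via iconv's //TRANSLIT flag. The output depends
// on the system locale, so verify it on your platform.
setlocale(LC_CTYPE, 'en_US.UTF-8');
$name  = 'Élodie Bouchez';
$plain = iconv('UTF-8', 'ASCII//TRANSLIT', $name);
echo "$plain\n";   // Elodie Bouchez
--- End code ---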
It's a rich environment that would be hugely difficult to duplicate in the existing PVD scripting environment. I think PHP would make a great scripting engine, but given that it has direct access to the database and to the user's file system, I can see how it could cause significant problems with database integrity. To me, those are good things...I have images on my hard drive that I'd like to put into PVD in an automated fashion. I guess it would be possible to write a high-level DLL to invoke PHP from PVD and eliminate unwanted accesses.
At any rate, I'm not done with my idea of collecting images. I'm simply changing the tools to something that makes an unreasonable task more reasonable.
If I can make the work I do reasonable for others to use, I'll post my code.
*** addendum ***
I've just tested PHP's command-line mode against a website, and HTTP works without a web server installed. The database access test is next.
rick.ca:
--- Quote ---What I've found is that it's faster to invoke a script (saves typing), but once you find images, it's better to switch to a browser, download images to a directory, then load images into PVD from the directory.
--- End quote ---
If I were serious about collecting multiple photos for some people, this is what I'd do with the tools I have now:
1. Use a web search to find the selected person on a particular website or group of websites, the group perhaps being a subset of those most appropriate for the type of person and the type of photos sought.
2. Use whatever selection/display facilities the host sites provide to select and download the desired photos. This seems obvious, but my point is that whatever means a website provides for doing this will be superior to what a script or plugin can do.
3. Download the photos to a temporary directory where they can be resized, edited, culled, and sorted using a file and/or image manager.
4. Import the files into PVD.
I suppose that whole process could be automated (assuming that's what you mean by "a whole environment in PHP"), but is the result just going to be something less versatile and not significantly more efficient?
mgpw4me@yahoo.com:
Assuming that sites are static, have site searches, AND that I want to process each person in my database one at a time, you'd be right. I plan on doing batches, not individual processing, and I plan to do many of my searches in reverse...parse a site and compare it to my database, rather than search for every person in my database on 50-70 sites.
Site Updates
I plan to store the image links from site pages in text files, one per person. When I run an update on my database, I can re-parse a site, compare the links against my text file, mark any new ones for future processing, and replace the old links file. The code to parse links is about 10 lines, including loading the page, and will work for any site...no more parsing on a per-site basis (except for multiple pages, and for processing the image links themselves, where I can use common routines to filter the results). A sketch of that parser follows.
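Something close to the "10 lines" described above, assuming file_get_contents() and a simple regex are acceptable (the URL and file names are examples):
--- Code: ---<?php
// Load a page and pull out every link, site-independent.
$html = file_get_contents('http://www.example.com/gallery.html');
preg_match_all('/<a\s[^>]*href=["\']([^"\']+)["\']/i', $html, $m);
$links = array_unique($m[1]);

// Compare against the stored per-person link file, flag new links
// for later processing, then replace the old file.
$old = file_exists('person.txt')
     ? file('person.txt', FILE_IGNORE_NEW_LINES) : array();
$new = array_diff($links, $old);
file_put_contents('person.txt', implode("\n", $links));
print_r($new);   // the links to process later
--- End code ---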
Navigation
Some sites don't have search facilities...just lists of links to people. Rather than make 50,000 internet connections, I'll parse the list of people on the site and run however many database searches it takes to see who I have. Any people not in my database will be added to an archive, which will be checked when I add new people to my database. Only sites with search facilities will need to be checked when I add a new person. Some sites also have galleries with broken image links...I can catch and throw those away with PHP (see the sketch below)...in the PVD script environment they're as valid as a working link.
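One way to reject dead image links without downloading anything is get_headers(), which fetches only the HTTP response headers; the URLs are examples, and this assumes the site returns sensible status codes:
--- Code: ---<?php
// Keep only links whose server answers with a 200 status.
function link_ok($url) {
    $headers = @get_headers($url);
    return $headers && strpos($headers[0], '200') !== false;
}

$links = array(
    'http://www.example.com/img/good.jpg',
    'http://www.example.com/img/missing.jpg',
);
$live = array_filter($links, 'link_ok');
--- End code ---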
Processing
I'll run updates overnight, then go through the image list at my leisure. The only point where I'll actually have to be at the computer will be during image selection, and since I'll be pre-processing the list against my database, only people in my database will appear in the selection routine. The selected images will be stored in a directory, then uploaded to the database en masse once I've selected all the images I want during that session. The actual image selection program will group links from the various sites by person name, so I'll have all the images for a person in "one spot" (a grouping sketch follows). For update processing, I'll just be working with one huge list of images comprising all the new links found in an update...no selecting a person to update...just "I want the picture" or "I don't".
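A sketch of that grouping step, assuming the per-person link files from the "Site Updates" section live in a links/ directory (the directory name is an assumption):
--- Code: ---<?php
// Gather each person's stored links into one array so all of a
// person's images appear in one spot during selection.
$byPerson = array();
foreach (glob('links/*.txt') as $file) {
    $person = basename($file, '.txt');
    $byPerson[$person] = file($file, FILE_IGNORE_NEW_LINES);
}
ksort($byPerson);   // alphabetical by person name
foreach ($byPerson as $person => $links) {
    echo $person . ': ' . count($links) . " images\n";
}
--- End code ---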
I haven't decided on the user interface yet...I'm looking at PHP-GTK for Windows...another out-of-the-box solution where I can use modular programming methods.