New IMDb People v3 (Selenium) script
afrocuban:
Completely new IMDb People script with an integrated Python + Selenium script. Put the Selenium script in the Scripts folder of your PVD as well.
Everything works now except (for me, at least) adding the photo to the database. Please check and report whether it works for you.
From the change log:
--- Quote ---CHANGE LOG :
V 3.0.0.1 (04/01/2025) afrocuban (THANKS TO IVEK'S HUGE HELP):
- Selenium integration to PVD introduced. Check http://www.videodb.info/forum_en/index.php/topic,4368.0.html and
http://www.videodb.info/forum_en/index.php?topic=4367.0 for more
- Awards properly parsed now (HUGE THANKS TO IVEK HERE!!!).
- Birthplace, Filmography, Bio and Genre fields fixed.
- DownloadPage and ParsePage functions modified so that, instead of a single downpage-UTF8_NO_BOM.htm, a separate file is downloaded for each function: Awards, Bio, Credit and Genre. The main (principal) page is still downloaded with PVdBDownPage.exe to downpage-UTF8_NO_BOM.htm.
- New HandlePhoto function to parse the photo separately and add it to the record.
--- End quote ---
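For anyone trying to picture the one-file-per-section layout from the change log, here is a minimal Python sketch. It is not taken from the actual scripts: only downpage-UTF8_NO_BOM.htm is named in the change log, and the other file names, as well as the `page_file` helper, are made up for illustration.

```python
from pathlib import Path

# Main page file name as given in the change log.
MAIN_PAGE_FILE = "downpage-UTF8_NO_BOM.htm"

# Hypothetical per-section file names; the real scripts may name them differently.
SECTION_FILES = {
    "awards": "downpage-awards.htm",
    "bio": "downpage-bio.htm",
    "credit": "downpage-credit.htm",
    "genre": "downpage-genre.htm",
}


def page_file(scripts_dir, section="main"):
    """Return the path of the saved page for a section ('main' by default)."""
    if section == "main":
        return Path(scripts_dir) / MAIN_PAGE_FILE
    try:
        return Path(scripts_dir) / SECTION_FILES[section]
    except KeyError:
        raise ValueError(f"unknown section: {section!r}") from None
```

Each parsing step would then read only its own file, for example `page_file("Scripts", "awards")` for the awards parser, so one failed download does not overwrite the others.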
afrocuban:
If Ivek is willing to rewrite it to scrape everything from a single downpage-UTF8_NO_BOM.htm, that would be great. I couldn't manage to do that.
afrocuban:
Here's an optimized Selenium script that significantly reduces the waiting time. Just replace it in your Scripts folder.
Ivek23:
--- Quote from: afrocuban on January 04, 2025, 07:20:20 am ---If Ivek is willing to rewrite it to scrape everything from a single downpage-UTF8_NO_BOM.htm, that would be great. I couldn't manage to do that.
--- End quote ---
I found a solution: the script now also uploads people's photos to the database. I will add it to the forum tomorrow at the latest.
afrocuban:
This is my final IMDb People script, which now adds people's photos to the database, thanks to Ivek, who resolved it.
I am introducing a new naming convention for my scripts, with the word "Selenium" and a version in the name, since https is now fully covered by the Selenium scripts.
Delete any of my previous scripts, .psf or .py, and replace them with these.
The first Selenium script downloads the person's Base/Main page, and the other Selenium script downloads the other four: Bio, Credit, Genres and Awards. In the .psf script I have also introduced dynamic waiting for the Selenium script's download to finish, to optimize timings for smaller and bigger datasets, so that the user has to click the "Retry" button as rarely as possible.
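The actual waiting lives in the .psf script, so the following is only a rough Python illustration of the idea of dynamic waiting, not the code from the script. It assumes "finished" can be detected by the output file existing and its size no longer changing; the function name `wait_for_download` and all of its parameters are hypothetical.

```python
import os
import time


def wait_for_download(path, timeout=60.0, poll=0.5, stable_checks=2):
    """Return True once `path` exists, is non-empty and has stopped growing.

    Returns False if that does not happen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # file not there yet
        if size > 0 and size == last_size:
            # Same non-zero size as on the previous poll.
            stable += 1
            if stable >= stable_checks:
                return True
        else:
            stable = 0
        last_size = size
        time.sleep(poll)
    return False
```

With this kind of polling a small page is accepted after a few short checks, while a large one simply gets more polls, instead of every record waiting for one fixed worst-case delay.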
The Selenium scripts are now optimized with multithreaded downloading, so downloading for an average person now takes only 25-30 seconds!
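As a hedged sketch of what multithreaded downloading of the four extra pages can look like in Python, here is a version built on the standard `concurrent.futures` module. It is an assumption about the general approach, not the posted script: `download_all` and the `fetch` callback are hypothetical names, and `fetch` stands in for whatever Selenium code actually loads and saves one page.

```python
from concurrent.futures import ThreadPoolExecutor

# The four extra pages mentioned in the post.
SECTIONS = ("bio", "credit", "genres", "awards")


def download_all(fetch, person_id, sections=SECTIONS):
    """Call fetch(person_id, section) for every section in parallel.

    `fetch` is a caller-supplied function (hypothetical here).
    Returns a dict mapping each section to what fetch returned.
    """
    with ThreadPoolExecutor(max_workers=len(sections)) as pool:
        futures = {s: pool.submit(fetch, person_id, s) for s in sections}
        # result() re-raises any exception that happened inside a worker.
        return {s: f.result() for s, f in futures.items()}
```

In a real Selenium setup each worker would normally create and close its own browser driver inside `fetch`, since a single WebDriver instance is generally not meant to be shared between threads.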
TO DO: incorporate searching for people, going back to an all-in-one script.