Topics - afrocuban

1
I must have made a mistake, I just didn't notice it (I still have a lot of things to sort out for my mother's passing, so some details are missing and I don't notice them). I'll fix that and see if it works.

I fixed it now and it works perfectly.

Great! Enjoy it!

May I ask you a question? Can you describe the flow of adding series, seasons and episode links? Which script does which task? I'm almost done integrating the series script into the movie script, but at the moment I am stuck on what generates the seasons, what generates the episodes in each season, what provides the links to the episodes, and so on. Thank you in advance.

2

IF YOU DON'T READ THIS POST CAREFULLY AND FOLLOW EVERYTHING WRITTEN HERE, BUT JUST DOWNLOAD FILES, I COULD BET IT WILL NOT WORK FOR YOU AND YOU WILL COME BACK HERE ASKING QUESTIONS ALREADY ANSWERED IN THIS POST.


Almost 4 months after I first heard the word "Selenium", knowing nothing about programming, I am finally releasing what is practically a new PVD MOD, considering the number of files and programs included. It consists of the scripts and program as described here.

You now need only one script for IMDb movies, one for IMDb people and one for FilmAffinity movies for everything: search and download. Selenium scripts do all the "external" work in the background, so your PVD stays clean: 2 scripts and a configurator for movies (plus a batch file for these 2 if you wish), and one script and a configurator for people. Check the screenshots below.

I strongly suggest renaming your current "Scripts" folder to, for instance, "Scripts-Original", then putting this Scripts.7z in your PVD folder and extracting it there. It will create a "Scripts" folder with all the scripts and files needed for PVD to work (as a bonus, I'm contributing the source code for the Scripts Configurator program, as well as an updated and polished UDL file for PVD scripts that just needs to be imported into Notepad++). If you want to, after testing you can merge the two folders, Selenium and non-Selenium scripts and files, into the "Scripts" folder.

Before that....
As stated here
ensure that:


Quote
A. You installed python
B. You installed selenium via cmd, with
Quote
pip install selenium

C. You installed requests via cmd, with
Quote
pip install requests

D. You have your Chrome bin on the PATH. To test this, open cmd, type "chrome" and check that Chrome opens.
E. You have the Python folder on your PATH. To test this, open cmd, type "python --version" and check that you get proper feedback, for instance:
Quote
C:\Users\user>python --version
Python 3.12.6
F. pythonw.exe is not missing, or its containing folder is on the PATH. To test this, open cmd, type "pythonw" and check that you get proper feedback, for instance:
Quote
C:\Users\user>pythonw

C:\Users\user> (empty output)
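
The checks above can also be run in one go from Python. This is only a minimal sketch, not part of the package: the function name check_prerequisites is made up for illustration, and it assumes that "chrome", "python" and "pythonw" are the executable names you expect to find on the PATH.

```python
import importlib.util
import shutil


def check_prerequisites(executables=("python", "pythonw", "chrome"),
                        modules=("selenium", "requests")):
    """Return {name: bool} telling whether each prerequisite was found."""
    report = {}
    for exe in executables:
        # shutil.which looks the name up on the PATH
        report[exe] = shutil.which(exe) is not None
    for mod in modules:
        # find_spec returns None when a top-level package is not installed
        report[mod] = importlib.util.find_spec(mod) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name:10} {'OK' if ok else 'MISSING'}")
```

Anything reported as MISSING points to the step in the list above that still needs attention.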


These scripts:


Quote
1. Use Chrome browser instead of Firefox
2. Use chromedriver.exe instead of geckodriver
3. Start chromedriver.exe silently
4. Silently invoke the browser in headless mode (no pop-up browser windows)
5. Scrape the .htm pages of the given URLs
6. No path needs to be set manually inside the script - it is set relative to the path of the selenium script!
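
To illustrate points 3, 4 and 6 of the list above, here is a minimal sketch of how a headless Chrome session with a chromedriver.exe located next to the script could be set up with Selenium 4. It is not the code from the package: the function names and the exact Chrome switches are my assumptions for illustration.

```python
from pathlib import Path


def chromedriver_path(script_file):
    """chromedriver.exe is expected in the same folder as the Selenium script."""
    return Path(script_file).resolve().parent / "chromedriver.exe"


def chrome_arguments():
    """Chrome switches for a windowless, quiet browser (assumed selection)."""
    return ["--headless=new", "--disable-gpu", "--log-level=3"]


def fetch_html(url, script_file):
    """Open url in headless Chrome and return the rendered page source."""
    # Imported here so the two helpers above work even without selenium installed
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    for arg in chrome_arguments():
        options.add_argument(arg)
    service = Service(executable_path=str(chromedriver_path(script_file)))
    driver = webdriver.Chrome(service=service, options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even when loading the page fails
        driver.quit()
```

Inside a script you would call fetch_html(url, __file__), so the driver is found wherever the "Scripts" folder happens to live.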


You just use your PVD as ever, just be sure to extract as instructed above.

For using relative path, ensure:

Quote
6B. You put the appropriate chromedriver.exe into the "Scripts" folder, too. There is no installation for chromedriver; just extract it from the .zip file into your "Scripts" folder described above. IMPORTANT!!!! You need to download the chromedriver.exe of the same version as your Chrome browser. At the moment of this post, the stable version is v134. You can find the Chrome browser download and the matching chromedriver here. For example, for v134, the Stable links are:
Chrome browser:

1. chrome   win32   https://storage.googleapis.com/chrome-for-testing-public/134.0.6998.88/win32/chrome-win32.zip
2. chrome   win64   https://storage.googleapis.com/chrome-for-testing-public/134.0.6998.88/win64/chrome-win64.zip
Chromedriver:

1. chromedriver   win32   https://storage.googleapis.com/chrome-for-testing-public/134.0.6998.88/win32/chromedriver-win32.zip
2. chromedriver   win64   https://storage.googleapis.com/chrome-for-testing-public/134.0.6998.88/win64/chromedriver-win64.zip
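
Since a version mismatch is the most likely reason for a failure, here is a small sketch of how the check could be automated. It only compares version strings that you paste in (for example the output of "chromedriver.exe --version") and rebuilds the download link in the same pattern as the links listed above. The function names are hypothetical and not part of the package.

```python
import re

BASE_URL = "https://storage.googleapis.com/chrome-for-testing-public"


def major_version(version_text):
    """Extract the major number from text such as 'ChromeDriver 134.0.6998.88 (...)'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_text)
    if match is None:
        raise ValueError(f"no version number in {version_text!r}")
    return int(match.group(1))


def versions_match(chrome_text, driver_text):
    """True when Chrome and chromedriver share the same major version."""
    return major_version(chrome_text) == major_version(driver_text)


def chromedriver_url(version, platform="win64"):
    """Build the download link in the same pattern as the links listed above."""
    return f"{BASE_URL}/{version}/{platform}/chromedriver-{platform}.zip"
```

For example, chromedriver_url("134.0.6998.88") gives the win64 chromedriver link shown above.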

From this point on, everything is automated and headless, silent as never before.

Amount of data imported is huge! I have included dozens of new custom fields. IF YOU ARE DATA HUNGRY AS I AM, THE MOST DATA CAN BE COLLECTED IF YOU CHECK AND UNCHECK OPTIONS AS IN THE SCRIPTS CONFIGURATOR SCREENSHOTS BELOW. The updated table with all the fields in these 3 scripts can be found here. To see them all, you have to use Classic skin, or you have to add custom fields to your PVD and from there to create your own custom skin, or to use one of my skins from here, once I complete them and adjust them for the final Selenium v4 scripts.


Examine the table. That is the only way you will learn which fields come from which movie/person page and whether you want them or not. The fewer you want, the faster PVD will be.

Please feel free to test the scripts and give me feedback if something doesn't work. When I say "it doesn't work": everything works as of now; the only issues that can happen are that the sites have changed their HTML layout so that not all fields are available, or that Chrome updated itself automatically to a higher version and you didn't download and extract the corresponding chromedriver version. The best indicator of this is that no .log file is created in the "\Scripts\Tmp\" folder. I will fix whatever you report within a month at most of the first report, to give us all time to collect and report as many issues as possible.

What I have learned


On this long journey I learned how hard coding is. I also had to learn quite a lot about Pascal/Delphi, about Python and, most frustratingly, AHK! I had to revise all the scripts from scratch several times, either because of the concepts I was developing along the way or because IMDb and FilmAffinity were changing their layout. For example, just yesterday a new Chrome version was released and my chromedriver didn't work anymore, so I had to download a new version of it too. Also, just 2 days ago I learned that FilmAffinity had introduced AKA for some movies, so I had to update the FA script again. And so on and on for 4 months. Thus, I learned to appreciate it. Most importantly, I now appreciate even more the work EasyVVV, and especially Ivek, have done for more than a decade (!!!) to keep PVD alive!

So I humbly dedicate this hard work to EasyVVV, but most of all, and before anyone else, I dedicate it to IVEK and to the memory of his late mother! I HOPE IVEK CAN IMAGINE HOW GRATEFUL TO HIM I PERSONALLY AM, NOW THAT I REALIZE HOW HARD THIS ALL WAS! GOD BLESS YOU IVEK!

3
PVD Python Scripts / Downloading with Selenium experiences and tips
« on: February 08, 2025, 03:58:26 am »
Share your tips here...

4
This is a temporary topic so you can track where I am at the moment with the scripts.
Most probably I will keep only this message and update it with the most recent changes.


IMDb People script:
CHANGE LOG :

            V 4.0.0.1 (23/01/2025) afrocuban:
         - Full and complete transition to Selenium. No more instances of or references to PVdBDownPage. Huge thanks to VVVEasy and Ivek for maintaining, for a decade, an option that kept PVD alive.
         - Search function brought back to the script with thumbnails in the search window.
         - Full implementation of PvdConfigOptions. Extremely important for optimization and especially useful when refreshing only certain set of data
         - Job Title, Career and other personal data now moved from comment to bio field.
         - Starting positions provided for all the fields in section "Field Overwrite Options position in pvdconf.ini".
         - No more pop-up windows stealing focus! In earlier versions, the downpage-UTF8_NO_BOM.htm file was repeatedly downloaded and deleted to parse different pages. Each time this occurred, a pop-up window would steal the focus. Thanks to a different approach in this script, which now downloads separate pages in parallel using Selenium, as well as the special PurgeTmpFiles procedure and vbs script instead of cmd.exe, this focus-stealing issue is now resolved.
TO DO:

SCRIPT COMPLETE
Script and instructions available in a package here.
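
For those curious how "downloading separate pages in parallel" can look on the Python side, here is a minimal sketch using a thread pool. It is not the code from the package: download_all is a made-up name, and the function that actually fetches one page (for instance with its own headless Chrome instance) is passed in as fetch, so the sketch makes no assumption about how a single page is downloaded.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(pages, fetch, max_workers=4):
    """pages maps a page name to its URL; fetch(url) returns that page's HTML.

    All pages are requested concurrently; the result maps name to HTML.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every download first, then collect the results
        futures = {name: pool.submit(fetch, url) for name, url in pages.items()}
        return {name: future.result() for name, future in futures.items()}
```

With pages such as {"awards": ..., "bio": ..., "credit": ...}, each result can then be written to its own .htm file, so no single file has to be downloaded and deleted repeatedly.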


IMDb Movie script:
CHANGE LOG :

            V 4.0.0.1 (18/01/2025) afrocuban:
         - Transition to Selenium started.
         - Various DownloadPageXXX functions for each page to be downloaded constructed.
         - ParsePage function adapted accordingly.
         - DownloadPage and DownloadImage functions set up for Selenium scripts.
         - ParsePage_IMDBSearchTitle function fully transitioned to Selenium which means that
         - Search function is brought back to the script with thumbnails in the search window.
         - Starting positions provided for all the fields in section "Field Overwrite Options position in pvdconf.ini".
         - No more pop-up windows stealing focus! In earlier versions, the downpage-UTF8_NO_BOM.htm file was repeatedly downloaded and deleted to parse different pages. Each time this occurred, a pop-up window would steal the focus. Thanks to a different approach in this script, which now downloads separate pages in parallel using Selenium, as well as the special PurgeTmpFiles procedure and vbs script instead of cmd.exe, this focus-stealing issue is now resolved.
---------------------------------
News as of February 1st, 2025:
----------------------------------
- Script fully transitioned to Selenium.
- Script Options, Script Data and Global Vars upgraded.
- New GetPvdConfigOptions Function introduced, so now
- Whole script successfully set to rely on PVDConfigOptions.
- GetDownloadURL Function completely set and functional.
- DownloadPage Function completely set and functional.
- DownloadImage Function completely set and functional.
- ParsePage_IMDBSearchTitle Function completely set and functional.
- ParsePage_IMDBMovieBASE Function completely set and functional.
- ParsePage_IMDBMovieAWARDS Function completely set and functional. (check the screenshot here)
- ParsePage_IMDBMoviePLOTSUMMARY Function completely set and functional.
- ParsePage Function completely set and functional.


---------------------------------
News as of February 6th, 2025:
----------------------------------
- ParsePage_IMDBMovieCREDIT Function redesigned and completely set and functional.
- ParsePage_IMDBMovieAKA Function completely set and functional.
- ParsePage_IMDBMovieCONNECTIONS Function brought back to the script with additional connection type not existed in the script so far (check the screenshot here)


---------------------------------
News as of February 16th, 2025:
----------------------------------
SCRIPT BASICALLY COMPLETED

---------------------------------
News as of March 13th, 2025:
----------------------------------
SCRIPT COMPLETE
Script and instructions available in a package here.

IMDb Series Script:
Nothing yet. When I finish the movie and people scripts, I will assess the possibility of merging it with the episodes script.


IMDb Episodes Script:
Nothing yet. When I finish the movie and people scripts, I will assess the possibility of merging it with the series script.


FilmAffinity Script:
I will strip the version down to v4, and for now, for testing purposes, I have successfully transitioned only DownloadImage (poster) to Selenium. The plan is to fully transition it to Selenium too, but since it works perfectly at the moment, it is not a priority. The transition should be fast, though, because most probably all that is needed is to set up the Selenium scripts and switch the functions to them (there are not as many functions for FA as for the IMDb scripts). A trailer page is intended to be added too.
TO DO:
SCRIPT COMPLETE
Script and instructions available in a package here.

Selenium Scripts:
Single Download base page set for IMDb Movie, People and FilmAffinity Movie. ("3 in 1")
Single Search script set for IMDb Movie, People and FilmAffinity Movie. ("3 in 1")
Single Download image script set for IMDb Movie, People and FilmAffinity Movie. ("3 in 1")

Single Download additional pages scripts set for IMDb Movie and FilmAffinity Movie. ("2 in 1")
Single Download additional pages scripts set for IMDb People.
Scripts and instructions available in a package here.


TO DO:
- To set download base page and additional pages scripts for Series and Episodes script.
- To adapt Search script for Series and Episodes.

OTHER:
- vbs script set to use instead of cmd.exe in order to avoid annoying pop-up windows stealing focus.
- SCRIPTS CONFIGURATOR completely rewritten from scratch (yes, I had to learn about ahk, too :-\ ) so now we have a resizable window with scrollbars and all the options I could think of.
Scripts, configurator exe and source code, screenshots and instructions available in a package here.

WISHFUL THINKING:
- Bringing back Allmovie and Rottentomatoes scripts too.

5
PVD Python Scripts / New IMDb People v4 Script Discussion
« on: January 20, 2025, 02:14:19 am »
Hello Ivek.

Do you have an idea how I could force "BioList" from this function to populate the field from the snippet in the attached file when, on the last exit from PVD, ShouldParseBio was set to 0, or was set to "Set If Empty" while the bio field is not empty?

BioList parses properly but it won't populate. The only trick I found to populate BioList is to set "Bio" to "Overwrite" regardless of the state of the bio field, or to set it to "Set If Empty" if the field was empty when PVD started with opBio=0 in pvdconf.ini (from the last PVD exit).

6
While working on v4 of IMDb Movie and People scripts, I have defined and we now finally have all the starting positions for IMDb scripts:


Quote
//Field Overwrite Options position in pvdconf.ini------------------------------------------------------------------------
//0=Do nothing,1=Set if Empty,2=Overwrite. The Length of 'IMDB_People_[EN][HTTPS].psf=' is 28
   opName            =   29-28;
   opTransName      =   30-28;
   opAltNames      =   31-28;
   opBirthday         =   32-28;
   opBirthplace      =   34-28;
   opGenre            =   43-28;
   opFilmography   =   45-28;
   opAwards         =   47-28;
   opDeath            =   33-28;
   opPhoto            =   48-28;
   opBio               =   39-28;   
   opCareer            =   46-28;   


Quote
//Field Overwrite Options position in pvdconf.ini------------------------------------------------------------------------
//0=Do nothing,1=Set if Empty,2=Overwrite. The Length of 'IMDB_[EN][HTTPS].psf=' is 21.
   opPoster                  =   87-21;
   opTitle                     =   23-21;
   opOrigTitle               =   24-21;
   opAKA                     =   25-21;
   opYear                     =   26-21;
   opGenre                  =   68-21;
   opCategory               =   70-21;
   opDirector               =   76-21;
   opProducer               =   79-21;
   opWriter                  =   77-21;
   opComposer            =   78-21;
   opActors                  =   75-21;
   opCountry               =   69-21;
   opOrigLang               =   44-21;
   opStudio                  =   73-21;
   opMPAA                  =   27-21;
   opORating               =   32-21;
   opTags                     =   74-21;
   opRDate                  =   40-21;
   opBudget               =   41-21;
   opMoney                  =   42-21;
   opTagline                  =   35-21;
   opDescription            =   36-21;
   opLength                  =   46-21;
   opAwards               =   85-21;
   opConnections         =   84-21;
   opFeatures               =   55-21;
   opEpisodes                =   92-21;


and I also corrected part of the code that it relates to in ParsePage function:


Quote
//Parse with the Person URL 'smNormal'------------------------------------------------------------------------------------
    If (Mode=smNormal) Then Begin
        //Get the script Overwrite Options saved in pvdconf.ini (Remember that PVD only saves the options on exit)
        //0=Do nothing,1=Set if Empty,2=Overwrite
        PVDConfigOptions := TextBetWeenFirst(FileToString(GetAppPath+'pvdconf.ini'), SCRIPT_NAME + '.psf=', Chr(13));
        LogMessage('PVDConfigOptions: ' + PVDConfigOptions);


so now using PVDConfigOptions makes sense.
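
To make the position arithmetic easier to follow, here is a minimal Python sketch of the same lookup. It is only an illustration of the idea, not code from the scripts: the two function names are made up, the behaviour for a missing key or a missing line end is my assumption, and the sample option string in the usage note is invented.

```python
def read_config_options(ini_text, script_name):
    """Mimic TextBetWeenFirst(ini, script_name + '.psf=', Chr(13)).

    Returns the option string of the script, or '' when there is no entry
    (assumed behaviour for the missing-key case).
    """
    key = script_name + ".psf="
    start = ini_text.find(key)
    if start == -1:
        return ""
    start += len(key)
    end = ini_text.find("\r", start)
    return ini_text[start:] if end == -1 else ini_text[start:end]


def option_at(options, position):
    """Mimic Copy(options, position, 1); positions are 1-based."""
    return options[position - 1:position]


# Positions from the People script: column in the ini line minus the key length (28)
OP_NAME = 29 - 28
OP_BIRTHDAY = 32 - 28
```

With an invented entry "IMDB_People_[EN][HTTPS].psf=2101", option_at(options, OP_NAME) gives "2" (Overwrite) and option_at(options, OP_BIRTHDAY) gives "1" (Set if Empty).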


Some of those opXXX were already applied in existing scripts, but...


I learned that, after you change them, they will not take effect as one might expect until PVD is restarted!

I could bet that very few of the very few users who are probably left are aware of this.


Since we now download multiple pages, which probably takes most of the time needed to process a record, I'm looking for ways to optimize the process. Using these options would be perfect: when a field is set to "None" (blank checkbox in PVD) we wouldn't even download its page, for example skipping the "awards" page entirely if "Awards" is set to "None".
That would dramatically speed up the process.
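
The idea can be sketched in a few lines of Python. This is only an illustration under assumptions: pages_to_download is a made-up name, and the mapping of a page to the option that controls it is hypothetical (the positions 85-21 for Awards and 84-21 for Connections are the ones from the movie list above).

```python
def pages_to_download(options, page_positions):
    """Return the pages whose controlling option is not '0' (Do nothing).

    options is the saved option string; page_positions maps a page name
    to the 1-based position of its option in that string.
    """
    selected = []
    for page, position in page_positions.items():
        flag = options[position - 1:position]  # 1-based, like Pascal Copy
        if flag != "0":
            selected.append(page)
    return selected


# Hypothetical page-to-option mapping, positions taken from the movie script
MOVIE_PAGES = {"awards": 85 - 21, "connections": 84 - 21}
```

Only the pages returned by pages_to_download(options, MOVIE_PAGES) would then be handed to the Selenium script.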



But it has its cons, and huge ones I'd say. Imagine a user who doesn't want to refresh or add Awards at all: the user sets Awards to "None" in PVD, downloads and populates the rest of the data for the record(s), and exits PVD. A few days later the user returns to add a new record, opens PVD and downloads the data, only to see that awards weren't added. After probably retrying the download several times and visiting the site to confirm the record has awards, the user finally checks whether the field is set to "None", and yes, it is. The user changes it to "Overwrite" or "Set If Empty" and tries to download the data again.


But nothing happens again, because PARSING WILL NOT HAPPEN! Will the user recall that PVD now needs a restart for "Overwrite" or "Set If Empty" to take effect, or will the user rather think that the script "isn't working once again"? I would bet on the latter!


So this is a very useful tool but at the same time a very dangerous one WHEN IT IS SET TO "NONE", and I'm not sure whether to use it for optimizing the process. Even the already applied PvdConfigOptions (for photo in both scripts, and AKA, Credit, MPAA and Features in the movie script) are questionable in this respect... I didn't try, but most probably, as in the current working scripts, none of those would work if they were set to "None" on the last exit, regardless of the fact that we set them to "Overwrite" after the most recent PVD start!!!


Luckily enough in the current state PvdConfigOptions didn't work at all since we had different script names, and in the script this line remained in its initial state:
Quote
PVDConfigOptions:=TextBetWeenFirst(FileToString(GetAppPath+'pvdconf.ini'),'IMDB_[EN][HTTPS].psf=',Chr(13));


so using it while the script name differs from IMDB_[EN][HTTPS].psf was wrong and misleading whenever pvdconf.ini still had an entry for IMDB_[EN][HTTPS].psf: the script with the different name would read into PVDConfigOptions the values saved for IMDB_[EN][HTTPS].psf God knows when. At best it was useless, namely when pvdconf.ini has no entry for IMDB_[EN][HTTPS].psf: PVDConfigOptions is then always empty, so the test is always <>'0' in If (GET_FULL_AKA and Not(USE_SAVED_PVDCONFIG and (Copy(PVDConfigOptions,opAKA,1)='0'))) Then Begin...
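
The "always <>0" case is easy to verify with a direct Python transliteration of that condition. This is just a sketch: should_get_aka is a made-up name, and the option strings in the usage note are invented.

```python
def should_get_aka(get_full_aka, use_saved_config, options, op_aka=25 - 21):
    """Python version of:
    GET_FULL_AKA and Not(USE_SAVED_PVDCONFIG and (Copy(PVDConfigOptions,opAKA,1)='0'))
    """
    flag = options[op_aka - 1:op_aka]  # '' when the option string is empty
    return get_full_aka and not (use_saved_config and flag == "0")
```

With an empty option string the flag is '' and never equals '0', so should_get_aka(True, True, "") is True no matter what the user selected, while should_get_aka(True, True, "2220") is False because position 4 holds '0'.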

Or were people aware of all this and I'm just now discovering "warm water"? If that's the case, which I hope, then I will be more than happy to apply it immediately and without question!


Your thoughts?

7
PVD Python Scripts / New IMDb Movie v4 Script Discussion
« on: January 16, 2025, 03:16:40 am »
I have brought the search function back to the Movie script too, using Selenium, and will publish it together with the People v4.0 script and the corresponding Selenium scripts after I finish the full transition to Selenium.

All People scripts are ready, but I need to finish Movie scripts.


It'll probably take me a month.

8
Scripts and Templates / MOVED: New IMDb People v3 (Selenium) script
« on: January 07, 2025, 03:05:58 am »
This topic has been moved to PVD Python Scripts.

http://www.videodb.info/forum_en/index.php?topic=4367.0
according to Ivek's suggestion.

9
Thanks Ivek! Maybe we should agree to add to the script's names some suffix like "Chrome" and "FF" or similar, so people could easier recognize which ones they use. Or, we can agree that I will put "Chrome" to all of my scripts and the default is Firefox.

To all users: nothing prevents you from putting both geckodriver and chromedriver in their proper places when you have both Chrome and Firefox installed (on the PATH), and using both scripts concurrently!

10
This is Selenium script specific for the People script here:


Script Description: This script automates the download of various IMDb pages using Selenium and ChromeDriver, including handling cookies and popups, and saving the resulting HTML to local files.
It automatically detects your location by querying the http://ipinfo.io API for your country code, then uses a dictionary to map that country code to a language. If you don't want this, comment out the first part of the script and uncomment the part at the end of the script. Open the script in a text editor and read about this.
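
As a rough illustration of this localization step (not the actual code from the script), the lookup could look like the sketch below. The dictionary content and the function names are my assumptions; the real script ships its own mapping. The sketch relies on ipinfo.io returning a JSON document with a "country" key.

```python
# Hypothetical mapping; the real script has its own dictionary
COUNTRY_TO_LANGUAGE = {"US": "en-US", "ES": "es-ES", "DE": "de-DE"}


def language_for(country_code, default="en-US"):
    """Map a two-letter country code to a language, with a fallback."""
    return COUNTRY_TO_LANGUAGE.get(country_code.upper(), default)


def detect_country(timeout=5):
    """Ask ipinfo.io for the caller's country code, for example 'US'."""
    # Imported here so language_for works even without requests installed
    import requests

    response = requests.get("https://ipinfo.io/json", timeout=timeout)
    response.raise_for_status()
    return response.json().get("country", "")
```

A script would then call language_for(detect_country()), and the fallback covers both unknown countries and an empty answer.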
For this to work ensure that:


Quote
A. You installed python
B. You installed selenium and requests by


Quote
pip install selenium requests


C. You have your Chrome bin on a PATH
D. You have Python folder on your PATH
E. pythonw.exe is not missing, or its containing folder is on the PATH


This script:


Quote
1. Uses Chrome browser instead of Firefox
2. Uses chromedriver.exe instead of geckodriver
3. Starts chromedriver.exe silently
4. Silently invokes browser in a headless mode (no pop-up windows of browser)
5. Scrapes .htm page of a given url
6. No path is needed to set manually inside the script - it is set to be relative to the path of selenium script!

For using relative path, ensure:

Quote
6A. You put this script into "Scripts" folder of your PVD instance.
6B. You put the appropriate chromedriver.exe into the "Scripts" folder, too.

To silently invoke selenium script itself by PVD's .psf script (no pop-up windows of selenium script's cmd window), be sure to use pythonw.exe instead of python.exe, like this for example:

Quote
FileExecute('pythonw.exe', '"' + ScriptPath + 'selenium_script-Chrome_People.py" "' + URL + '" "' + ScriptPath + BASE_DOWNLOAD_FILE_NO_BOM + '"');

Now, the last one will probably be ensured by those who maintain corresponding scripts if interested in, and for now, those are Ivek and me, but be sure to check if it's there anyway.
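
On the Python side, the two quoted arguments passed by FileExecute arrive as sys.argv[1] (the URL) and sys.argv[2] (the output file). Here is a minimal sketch of how a script could pick them up and save the page without a BOM. The function names are made up, and the fallback to a default output file when only the URL is given is my assumption, not necessarily what the real script does.

```python
from pathlib import Path


def parse_arguments(argv, default_out):
    """argv[1] is the page URL; argv[2], when present, is the output file."""
    if len(argv) < 2:
        raise SystemExit("usage: script.py <url> [output file]")
    out_file = Path(argv[2]) if len(argv) > 2 else Path(default_out)
    return argv[1], out_file


def save_html(html, out_file):
    """Write the page as UTF-8 without a byte-order mark."""
    # "utf-8" (unlike "utf-8-sig") does not prepend a BOM
    Path(out_file).write_text(html, encoding="utf-8")
```

In the script itself this would be called as parse_arguments(sys.argv, some_default_file) after importing sys.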

You may want first to test the script manually, from cmd, for example like this:

Quote
C:\Users\user\selenium_script-Chrome_People.py "https://www.imdb.com/name/nm0000017"

From this point on, everything is automated and headless.

11
PVD Python Scripts / New IMDb People v3 (Selenium) script
« on: January 04, 2025, 04:31:58 am »
Completely new IMDb People script with integrated Python+Selenium script. Put selenium script to the Scripts folder of your PVD too.


Everything works now, except (for me at least) populating photo to database. Please check and report if it works for you.


From the Change log -


Quote
CHANGE LOG :
            V 3.0.0.1 (04/01/2025) afrocuban (THANKS TO IVEK'S HUGE HELP):
         - Selenium integration to PVD introduced. Check http://www.videodb.info/forum_en/index.php/topic,4368.0.html and
         http://www.videodb.info/forum_en/index.php?topic=4367.0 for more
         - Awards properly parsed now (HUGE THANKS TO IVEK HERE!!!).
         - Birthplace, Filmography, Bio and Genre fields fixed.
         - DownloadPage and ParsePage functions modified to split downpage-UTF8_NO_BOM.htm into a separate downloaded file for each function: Awards, Bio, Credit and Genre. The main (principal) page is still downloaded with PVdBDownPage.exe to downpage-UTF8_NO_BOM.htm
         - New HandlePhoto function to separately parse and add photo to record.

12
Support / MOVED: Re: New FilmAffinity Script
« on: January 04, 2025, 04:29:27 am »

13
PVD Python Scripts / Python (+Selenium) Chrome general script
« on: December 30, 2024, 03:44:34 pm »
This is fork and upgrade of Ivek's selenium script found here:

Quote
http://www.videodb.info/forum_en/index.php/topic,4362.msg22691.html#msg22691


For this to work ensure that:


Quote
A. You installed python
B. You installed selenium by


Quote
pip install selenium

C. You have your Chrome bin on a PATH
D. You have Python folder on your PATH
E. pythonw.exe is not missing, or its containing folder is on the PATH


This script:


Quote
1. Uses Chrome browser instead of Firefox
2. Uses chromedriver.exe instead of geckodriver
3. Starts chromedriver.exe silently
4. Silently invokes browser in a headless mode (no pop-up windows of browser)
5. Scrapes .htm page of a given url
6. No path is needed to set manually inside the script - it is set to be relative to the path of selenium script!

For using relative path, ensure:

Quote
6A. You put this script into "Scripts" folder of your PVD instance.
6B. You put the appropriate chromedriver.exe into the "Scripts" folder, too.

To silently invoke selenium script itself by PVD's .psf script (no pop-up windows of selenium script's cmd window), be sure to use pythonw.exe instead of python.exe, like this for example:

Quote
FileExecute('pythonw.exe', '"' + ScriptPath + 'selenium_script-Chrome.py" "' + URL + '" "' + ScriptPath + BASE_DOWNLOAD_FILE_NO_BOM + '"');

Now, the last one will probably be ensured by those who maintain corresponding scripts if interested in, and for now, those are Ivek and me, but be sure to check if it's there anyway.

From this point on, everything is automated and headless.

14
Other Topics / MOVED: Integrating Selenium to PVD
« on: December 15, 2024, 09:36:46 pm »
This topic has been moved to Development.

http://www.videodb.info/forum_en/index.php?topic=4357.0 from Other Topics, because that is the natural place for it.

15
Scripts and Templates / New FilmAffinity Script v6.1
« on: December 15, 2024, 08:58:48 pm »
Here's my completely new FilmAffinity_[EN][HTTPS_Poster]_v6.1.psf script.


It includes major changes and additions.


CHANGE LOG:

           V 6.1.0.1-afrocuban (12/15/2024) afrocuban: MAJOR CHANGES INTRODUCED.

   Custom field types need to be strictly followed in order to be presented properly.

   Following custom/original fields added/changed respectively:

   • ~Studio~ section completely re-written:
1. Original "producer (human, category 4)" field now properly imports only when there is a Producer explicitly stated on FA.
2. Original "studio" (AField, value 8 ), populated now with studios and distributors concatenated. Co-productions (countries) and producer persons not included.
3. "FA Co-productions" Multiselect list custom field introduced for the first time.
4. "FA Producers"  Multiselect list custom field introduced for the first time.
5. "FA Studio" Multiselect list custom field introduced for the first time. This field includes only studios, not Co-productions (countries), producer persons, or Distributors.
6. "FA Distributors" Multiselect list custom field introduced for the first time.

   • ~Writers~ section completely re-written:
1. Original "writers (human, category 2)" field now properly imports only when there is a screenwriter explicitly stated on FA.
2. "FA Writers" Multiselect list custom field added for the same value as under 1.
3. "FA Script" Memo custom field introduced for the first time filled with everything else found in the Writers block ("novel:...", "book:...", etc).

   • New custom Memo field "FA Related". Related movies to the current one.
   • New custom Memo field "FA SimMov". Similar movies with % of similarity.
   • New custom Memo field "FA Awards". Years linked to FA pages with all awards for that competition for that year.
   • New custom Memo field "FA Misc URLs". Trailers, Image gallery and Pro reviews links with the counts present for each three, for the current movie.
   • New custom Multiselect list field "FA Cinematography".
   • New custom Memo field "FA Ranking Position", underlinked to external top lists.
   • New custom Memo field "FA Ranking Lists Position", underlinked to external users' lists.
   • New custom Memo field "FA Critics" with Review, Authors and Magazines, each underlinked to external sources, when there is one.
   • New custom Memo field "FA Released By" added.
   • New custom Memo field "FA Release Date" added.
   • New custom Memo field "FA Lists" added.
   • New custom Multiselect list field "FA Genre" added.
   • New custom Multiselect list field "FA Category" added.
   • New custom Memo field "FA OrigTitle" added.
   • New custom Multiselect list field "FA Year" added.
   • New custom Long Text field "FA length" added.
   • New custom Long Text field "FA features" added.
   • New custom Multiselect List field "FA Country" added.
   • New custom Multiselect List field "FA Directors" added.
   • New custom Multiselect List field "FA Writers" added.
   • New custom Multiselect List field "FA Composers" added.
   • New custom Memo field "FA Description" added.
   • New custom Memo field "FA Actors" added.


   • Two new functions introduced and two existing functions modified in the script:
1. "DownloadAndParseTrailerPage" Function added, which constructs the URL for the trailer page, uses the DownloadPage Function to download it, and then calls a parsing function to process it.
2. "ParseTrailerPage" Function added to parse the downloaded trailer page HTML content.
3. "DownloadPage" Function modified to accept OutFile as a parameter specifying the output file name, so that the same function can download both the main movie page and the trailer page.
4. "ParsePage" Main Function modified to include the call to the "DownloadAndParseTrailerPage" function.
An additional page, "downpage_trailer-UTF8_NO_BOM_FA.htm", is downloaded beside "downpage-UTF8_NO_BOM_FA.htm", but nothing from it is parsed yet, since it contains only dynamic content that needs Selenium integration: PVdBDownPage.exe can download only the static content of the HTML.

TO DO:
- Selenium integration to PVD.
- Logging in to the movie site through the script.

           V 5.0.0.1-afrocuban (12/05/2024) afrocuban: Poster now available via HTTPS.

You can view how imported data looks in my new skins on this topic, and what is imported actually.

You can look for most recent custom fields from this topic's message and later, and use it for easier tracking which custom fields to add to your PVD

16
Scripts and Templates / Dark Skin - V4-1.1
« on: December 15, 2024, 08:47:52 pm »
I'm attaching 4 versions of a new skin I created. The only difference is the design of a front section.


They contain all custom fields I found in all scripts, and as far as I know this is the only custom skin that includes custom fields, at least in such a huge number. Because of this, Ivek might consider making this topic sticky and adding the skins to the download section as a base for future customizations by other users, who would have all current custom fields at their disposal, outside the custom section.

I created these skins with the deprecated, but still fully functioning, Serna Free ML Editor, so editing them with any other non-WYSIWYG editor will give you a headache because of the repetitive comments, while in Serna those comments were crucial for easier navigation and manipulation.

You will need to add new custom fields from this topic's message and later, in order to be able to see all data. In that table look in column H for the field name, and column K for the PVD Database Field Type.


If you use my new FilmAffinity_[EN][HTTPS_Poster]_v6.1.psf script or above, you will need this skin to view all the data imported, beside default Classic skins, of course.


I added a completely new "FilmAffinity" tab, since the amount of data imported now deserves its own skin section.


Here's my PVD IMDb Full HD v4-4.1 - Dark.xml skin.

In addition, there are screenshots of how the imported data looks in my skin.

17
Development / Integrating Selenium to PVD
« on: December 15, 2024, 03:09:47 am »

Sorry for joining the conversation uninvited and without knowing programming. But I learned a bit along the way while trying to download the FA page with trailers locally. I succeeded in downloading it as downpage_trailer-UTF8_NO_BOM_FA.htm beside downpage-UTF8_NO_BOM.htm in order to try to parse them both, but there is no good news. I mean, I can parse them both, but the dynamic content isn't downloaded.

I don't think we can do it with PVdBDownPage.exe:

Quote
// Download the URL page.
  LogMessage(Chr(9) + Chr(9) + '      Download with PVdBDownPage in file:|' + ScriptPath + OutFile + ' the information of:|' + URL + '||');
  FileExecute(ScriptPath + 'PVdBDownPage.exe', '"' + URL + '" "' + ScriptPath + OutFile + '"');

because it cannot download dynamic content since it
Quote
is likely due to the JavaScript that loads the video elements dynamically on the original page. When you download the page using DownloadPage, the JavaScript doesn’t execute, so the dynamic content (like embedded trailers) isn’t included in the downloaded HTML file.
To work around this, we have a couple of options:

Use a Web Browser Automation Tool: Tools like Selenium can control a web browser to visit the page and let all JavaScript execute before saving the fully rendered page’s HTML. This way, the dynamic content will be included.

Find the Video URL Manually: Check the original HTML of the page in your browser (right-click and “View Page Source” or inspect elements with Developer Tools). Look for the actual URLs of the video files. You can then use those URLs directly in your script.

Using a web browser automation tool like Selenium is a bit more involved and requires additional setup, but it is a more robust solution for handling dynamic content. If you prefer to go this route, I can guide you through the setup process.

Alternatively, manually inspecting the page’s source for video URLs is quicker but less convenient for repeated tasks.
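One quick way to confirm that a saved page is missing the dynamic part is to count the elements that carry the class the trailers are expected to have. This is only a minimal sketch: it assumes the trailers sit in elements with class "divvid" (the name comes from the wait condition in the Selenium script quoted below, not from a verified look at FilmAffinity's markup), and the helper names `ClassCounter` and `count_class` and the two sample strings are made up for illustration.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a value may be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in the HTML text carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    return parser.count


# Hypothetical samples: a static download versus a fully rendered page
static_html = '<div id="videos"></div>'
rendered_html = '<div id="videos"><div class="divvid big"></div></div>'

print(count_class(static_html, "divvid"))
print(count_class(rendered_html, "divvid"))
```

With a real file you would read the saved .htm as UTF-8 text and pass it to `count_class`; a count of zero suggests the trailers were never rendered into that file.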



Quote
Here's the rest of the conversation:
Invoking Selenium directly from within a PVD script (written in Pascal) is not straightforward because Selenium typically operates via Python, Java, or another high-level language. However, you can use a workaround by invoking an external Python script from your PVD Pascal script.

Here’s how you can integrate Selenium into your workflow:

Create a Python Script Using Selenium:

This script will navigate to the URL, wait for the content to load, and save the fully rendered HTML.

python
Quote
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Setup the WebDriver (Selenium 4 takes the driver path through a Service object)
driver = webdriver.Chrome(service=Service('path_to_chromedriver'))

# Navigate to the URL
driver.get("https://www.filmaffinity.com/en/evideos.php?movie_id=xxxx")

# Wait for the trailers to load (adjust the waiting condition as needed)
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CLASS_NAME, "divvid"))
)

# Get the fully rendered HTML
html = driver.page_source

# Save the HTML to a file
with open('downpage_trailer-UTF8_NO_BOM_FA.htm', 'w', encoding='utf-8') as file:
    file.write(html)

# Close the browser
driver.quit()

Invoke the Python Script from Your PVD Script:

Use the FileExecute method to call the Python script from within your Pascal script.

pascal
Quote
// Log the action
LogMessage(Chr(9) + Chr(9) + '      Download with Selenium in file:|' + ScriptPath + OutFile + ' the information of:|' + URL + '||');

// Construct the command to run the Python script
FileExecute('python.exe', '"' + ScriptPath + 'selenium_script.py"');
In this setup:

The Python script handles the browsing and downloading of the fully rendered HTML using Selenium.

Your PVD script uses the FileExecute method to call the Python script, passing any necessary parameters if needed.

After the Python script finishes and the HTML is saved, your PVD script can continue processing the saved HTML as usual.

Example Complete Workflow
1. Python Script (selenium_script.py)

python
Quote
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import sys

# Get the URL and output file from the command line arguments
url = sys.argv[1]
output_file = sys.argv[2]

# Setup the WebDriver (Selenium 4 takes the driver path through a Service object)
driver = webdriver.Chrome(service=Service('path_to_chromedriver'))

# Navigate to the URL
driver.get(url)

# Wait for the trailers to load (adjust the waiting condition as needed)
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CLASS_NAME, "divvid"))
)

# Get the fully rendered HTML
html = driver.page_source

# Save the HTML to a file
with open(output_file, 'w', encoding='utf-8') as file:
    file.write(html)

# Close the browser
driver.quit()
2. Pascal Script (Invoke Selenium Script)

pascal
Quote
// Log the action
LogMessage(Chr(9) + Chr(9) + '      Download with Selenium in file:|' + ScriptPath + OutFile + ' the information of:|' + URL + '||');

// Construct the command to run the Python script
FileExecute('python.exe', '"' + ScriptPath + 'selenium_script.py" "' + URL + '" "' + ScriptPath + OutFile + '"');
By following this workflow, you can effectively integrate Selenium into your PVD script to handle pages with dynamic content. This way, you can ensure that all necessary content, like embedded trailers, is included in the HTML that your PVD script processes.

How would selenium know which url to open?
Great question. You can pass the URL to Selenium through command line arguments when you invoke the Python script. This way, your PVD script can dynamically supply the URL to Selenium. Here's how you can do it:

Step-by-Step Implementation
Modify Your Python Script to Accept Command Line Arguments:

You’ll adjust the Python script to accept the URL as an argument. This ensures that Selenium knows which URL to open.

Example Python Script (selenium_script.py):

python
Quote
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import sys

# Get the URL and output file from the command line arguments
url = sys.argv[1]
output_file = sys.argv[2]

# Setup the WebDriver (Selenium 4 takes the driver path through a Service object)
driver = webdriver.Chrome(service=Service('path_to_chromedriver'))

# Navigate to the URL
driver.get(url)

# Wait for the trailers to load (adjust the waiting condition as needed)
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CLASS_NAME, "divvid"))
)

# Get the fully rendered HTML
html = driver.page_source

# Save the HTML to a file
with open(output_file, 'w', encoding='utf-8') as file:
    file.write(html)

# Close the browser
driver.quit()
Invoke the Python Script from Your PVD Script:

Use the FileExecute method to call the Python script, passing the URL and the output file path as arguments.

Example Pascal Script:

pascal
Quote
// Log the action
LogMessage(Chr(9) + Chr(9) + '      Download with Selenium in file:|' + ScriptPath + OutFile + ' the information of:|' + URL + '||');

// Construct the command to run the Python script
FileExecute('python.exe', '"' + ScriptPath + 'selenium_script.py" "' + URL + '" "' + ScriptPath + OutFile + '"');
How It Works
Python Script: The Python script (selenium_script.py) receives the URL and the output file path as command line arguments. It then uses Selenium to open the URL, waits for the trailers to load, and saves the fully rendered HTML to the specified output file.

Pascal Script: Your PVD script constructs the command to run the Python script with the necessary arguments (the URL and the output file path) and then executes it. This ensures that the correct URL is passed to Selenium dynamically.

This approach allows you to integrate Selenium seamlessly into your PVD script, enabling it to handle pages with dynamic content effectively. If you need further assistance with any part of the setup, feel free to ask!
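If the PVD script ever calls the Python script with an argument missing, reading sys.argv[2] directly fails with an IndexError. A possible variation, shown only as a sketch, is to read the same two positional arguments with argparse, which prints a usage message instead. The `parse_args` helper and the optional `--timeout` flag are assumptions for illustration and are not part of the script above.

```python
import argparse


def parse_args(argv):
    """Read the URL and output file the same way the PVD script passes them."""
    parser = argparse.ArgumentParser(description="Save a rendered page to a file.")
    parser.add_argument("url", help="page to open in the browser")
    parser.add_argument("output_file", help="where to write the rendered HTML")
    # Hypothetical extra: lets the caller change the 10-second wait
    parser.add_argument("--timeout", type=int, default=10,
                        help="seconds to wait for the trailers")
    return parser.parse_args(argv)


# Example call with the same two values the Pascal script would pass
args = parse_args([
    "https://www.filmaffinity.com/en/evideos.php?movie_id=xxxx",
    "downpage_trailer-UTF8_NO_BOM_FA.htm",
])
print(args.url, args.output_file, args.timeout)
```

In the real script you would call `parse_args(sys.argv[1:])` and then use `args.url` and `args.output_file` in place of the two sys.argv lookups.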

How slow is this process?
The process of using Selenium to render and download the HTML can be slower than simply downloading a static page because Selenium has to:

Open a Web Browser: Launching a browser instance (e.g., Chrome or Firefox) takes a few seconds.

Navigate to the URL: Just like a human user, Selenium will navigate to the specified URL, which involves network latency.

Wait for JavaScript Execution: Selenium waits for the page to fully load, including the execution of any JavaScript that dynamically loads content. This can vary in time depending on the complexity of the page and the speed of your internet connection.

Save the HTML: Once the page is fully loaded, Selenium will extract and save the HTML content, which is relatively quick.

Estimated Time
Browser Launch: 1-3 seconds

Page Navigation: 1-5 seconds (depends on network speed)

JavaScript Execution & Content Load: 3-10 seconds (depends on page complexity)

Saving HTML: <1 second

In total, it can take anywhere from 5 to 20 seconds or more per page, depending on various factors.
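To see where that total comes from, the per-step estimates can simply be added up. This is just arithmetic on the figures above, with "<1 second" taken as 0 to 1; the batch size of 100 movies is an assumption for illustration.

```python
# (low, high) seconds per step, copied from the estimates above;
# "<1 second" is taken as 0 to 1
steps = {
    "browser launch": (1, 3),
    "page navigation": (1, 5),
    "javascript and content load": (3, 10),
    "saving html": (0, 1),
}

low = sum(lo for lo, _ in steps.values())
high = sum(hi for _, hi in steps.values())
print(f"{low} to {high} seconds per page")

# Hypothetical batch: pages fetched one after another
movies = 100
print(f"{movies} pages: {low * movies / 60:.1f} to {high * movies / 60:.1f} minutes")
```

The steps add up to 5 to 19 seconds per page, which matches the rough "5 to 20 seconds" figure; for a large batch the per-page cost is what matters most.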

Performance Considerations
Network Speed: Faster internet connections will reduce the navigation and loading time.

Page Complexity: Pages with heavy JavaScript and multiple dynamic elements will take longer to fully load.

Browser & System Performance: Using a faster browser and a system with more resources (CPU, RAM) will speed up the process.

While this is slower compared to a direct HTML download using PVdBDownPage.exe, it ensures that all dynamic content is fully loaded and accessible. This trade-off can be worthwhile if you need to scrape content that is loaded dynamically by JavaScript.

If speed becomes a critical issue, you might look into optimizing the Selenium script further, such as by reducing the wait times where possible or running multiple instances in parallel if applicable.

18
Support / Custom offline fields
« on: November 24, 2024, 11:18:42 am »
I guess the answer is "No", but asking doesn't cost anything, so here I am, haha. Is it possible to create custom fields for local, internal, offline fields like Subtitles, for example? I'm asking because my goal is to avoid all the fields that are one-liners, meaning those of the "Short text" and "Long text" types. This is because, as I already mentioned earlier, when the values in these fields are wider than the section, or even wider than the display screen, they cannot wrap to the next line (only labels can), so they overflow to "the next dimension", haha. If I could create custom fields for such fields, I'd create "memo" type fields and thus avoid the overflow.

19
Other Topics / PVD Skins and Serna Free
« on: November 23, 2024, 08:15:58 pm »
For creating skins I'm using the Serna Free XML editor. It's obsolete, its code is open-sourced on GitHub, and I couldn't have made anything without it. It's WYSIWYG, so you don't deal with any notepad-like coding. You can move elements (PVD fields and other elements) through the tree and refresh the skin in PVD to check the results immediately.
Serna Free has been obsolete for 12 years but still works properly. It's not easy to find today, so you will probably have to use a torrent downloader if you want to find, download and use it. I didn't find any other XML editor that comes near Serna, just like there is no movie database program that comes near PVD.
Once you try it, you will never use anything else, at least for skinning PVD.

20
Other Topics / PVD Scripts in Notepad++
« on: November 23, 2024, 07:46:26 pm »
I don't know how and where I got the original PVD UserDefinedLanguage script for Notepad++, but I have now tweaked it, together with the Default Styler in Notepad++, so it's easier for me to navigate through PVD scripts. Here are the PVD UDL for .psf files and the tweaked default styler. You can import the UDL via Language->User Defined Language->Define Your Language->Import... and paste stylers.xml into the appropriate Notepad++ folder (the stylers.xml file for Notepad++ is typically located in the following directory for a normal installation: %AppData%\Notepad++\themes\stylers.xml and for a portable version: <installation_directory>\Data\Config\stylers.xml). You might want to back up your original stylers.xml so you can bring it back later if you're not satisfied with the new Notepad++ look. You can tweak the keys and colors of the UDL to your liking in the place where you imported it.


Choose the language PVD_UDL_final and choose Default(stylers.xml) in Settings->Style Configurator..., as in picture 1, then restart Notepad++ and your PVD script should look like picture 2.
