Messages - afrocuban

Pages: [1] 2 3 4 5 6 ... 33
1
This just sleeps for between 8 and 12 seconds, which could be very fragile. Also, doesn't it make the whole process 1-2 minutes longer per title?

2
This will most probably work, but I think it is fragile too... I am sure it will soon stop being possible again, but maybe something new will come up...



Quote
This will most probably work, but I think it is fragile too... I am sure it will soon stop being possible again, but maybe something new will come up...

Yes, what you currently have is a classic Selenium workflow:
- It spins up a new ChromeDriver instance for each URL.
- It applies stealth tweaks, sets cookies, navigates, clicks “See more” buttons, saves HTML, and then quits.
- You’re running this in parallel threads (ThreadPoolExecutor(max_workers=4)), so four fresh Chrome sessions at a time.
That’s exactly the pattern IMDb’s AWS WAF is now blocking: fresh, headless, parallel, automation‑fingerprinted sessions.

Why this fails against IMDb now
- Headless mode is fingerprinted. Even with stealth, AWS WAF detects it.
- Fresh profiles per run (no browsing history, no persistent cookies) scream “bot.”
- Parallel sessions look like automation, not human browsing.
- Adding fake cookies doesn’t help: WAF requires valid tokens generated by a real browser session.

How to adapt this code to “attach to Chrome”
Instead of creating a new webdriver.Chrome(...) each time, you’d:

1. Start Chrome manually with debugging enabled

chrome.exe --remote-debugging-port=9222 --user-data-dir="C:\ChromeIMDb"

- This opens Chrome with a persistent profile (C:\ChromeIMDb).
- You log in once, build up cookies/history naturally.

2. Change your Selenium init code
Replace:

service = Service(chrome_path)
chrome_options = build_chrome_options(headed=False)
driver = webdriver.Chrome(service=service, options=chrome_options)

With:

service = Service(chrome_path)
chrome_options = webdriver.ChromeOptions()
chrome_options.add_experimental_option("debuggerAddress", "127.0.0.1:9222")
driver = webdriver.Chrome(service=service, options=chrome_options)

- Now Selenium attaches to the Chrome you already opened.
- It inherits your cookies, extensions, and fingerprint.

3. Run serially, not in parallel
- IMDb WAF is sensitive to multiple simultaneous sessions.
- Change ThreadPoolExecutor(max_workers=4) → max_workers=1.

Important adjustments
- Don’t quit the browser (driver.quit()) after each run, because that would kill your attached Chrome. Instead, just close tabs (driver.close()) or reuse the same driver.
- Remove fake cookie injection. You don’t need it if you’re using your real Chrome profile.
- Headed mode only. You’ll see the browser window, but that’s what passes WAF.

In short: your current script is fine for FilmAffinity, but IMDb now requires you to either:

- Attach to Chrome (reuse your real session), or
- Switch to IMDb datasets / APIs for long‑term stability.
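
The attach-and-run-serially approach described above could be sketched roughly as follows. This is only a minimal sketch, not the actual script: the helper names (is_debug_port_open, attach_to_chrome, fetch_serially) and the chromedriver path argument are hypothetical, it assumes Selenium 4 is installed, and the 8 to 12 second pause simply mirrors the delay mentioned earlier in this thread.

```python
import random
import socket
import time


def is_debug_port_open(host="127.0.0.1", port=9222, timeout=1.0):
    """Return True if something is listening on host:port, for example a
    Chrome started with --remote-debugging-port=9222."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def attach_to_chrome(chromedriver_path, host="127.0.0.1", port=9222):
    """Attach Selenium to an already running Chrome instead of starting a new one."""
    # Imported lazily so the other helpers work even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    if not is_debug_port_open(host, port):
        raise RuntimeError(f"No Chrome debugger listening on {host}:{port}")
    options = webdriver.ChromeOptions()
    options.add_experimental_option("debuggerAddress", f"{host}:{port}")
    return webdriver.Chrome(service=Service(chromedriver_path), options=options)


def fetch_serially(driver, urls, delay_range=(8.0, 12.0), sleep=time.sleep):
    """Load the URLs one at a time in the same browser, pausing between them."""
    pages = {}
    for index, url in enumerate(urls):
        driver.get(url)
        pages[url] = driver.page_source
        if index < len(urls) - 1:
            # Random pause between page loads; no pause after the last one.
            sleep(random.uniform(*delay_range))
    return pages
```

The driver returned by attach_to_chrome would be created once and passed to fetch_serially with the whole list of page URLs, which replaces the thread pool entirely.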

3
144.0.7559.31 not 144.0.7559.59
Also, you don't need the external sites. Nothing is parsed from external sites so far. They were placed there for possible use in the future.

4
Thanks, Ivek! Wishing you great health!

5
PVD Python Scripts / AllMovie and Rottentomatoes Selenium v4.3 Scripts
« on: January 07, 2026, 12:08:02 am »
I have finally revived them. They are written from scratch and have practically nothing in common with the old ones.


Read more and download, starting from this message:

https://www.videodb.info/forum_en/index.php/topic,4379.msg23022.html#msg23022


Both are very delicate and sophisticated: they will preserve all of your old URLs and old custom-field data that no longer exists, and will import and process only new ones.

6
Scripts and Templates / PVD IMDb Full HD v4-3.1 - Dark skin afrocuban
« on: January 07, 2026, 12:02:20 am »
Here's my skin, updated for the final scripts from
https://www.videodb.info/forum_en/index.php/topic,4379.msg23024.html#msg23024

Put both DPG4-3.png and PVD IMDb Full HD v4-3.1 - Dark.xml into the Skin/Movies folder.

If you want your skin to look like this, you have to import all the custom fields from the attached .csv files. There is a 2-minute way to do this using DBeaver instead of manually adding them one by one in PVD, so consult an AI on how to do that. Back up your database first, of course!


No further updates will be made.

7
Finally, here are all the scripts.


Never forget to read the first message in the topic. All the answers and solutions needed for the scripts and PVD to work flawlessly are there.


As usual, back up and empty the Scripts folder and extract Scripts_2026-01-06.7z there. Extract the other file into the PVD root folder.

If you want to use the scripts with my skin, you can download it with the list of custom fields here:

https://www.videodb.info/forum_en/index.php/topic,4388.msg23025.html#msg23025

Important note: Since I haven't seen even a "thanks" or any kind of feedback (except from Ivek, and I haven't seen him recently either) for more than a year of hard work, I guess there is no interest in these, so I will not update the scripts anymore. In any case, the given files are a firm base for someone else to take over and continue where I left off. If I could do it with AI, anyone can.


Best regards.

8
PVD Python Scripts / PVD Selenium v4.3 All Scripts
« on: January 06, 2026, 11:44:47 pm »
In this message I'm attaching UDL files for Notepad++, which is now a perfect fit for PVD scripting.


Most important: folding and unfolding is now seamless, as in the screenshot.


As usual, replace stylers.xml with the given one and import PVD v4.3_2026-01-06.xml, and it should look as in the screenshot.

9
PVD Python Scripts / PVD Selenium v4.3 All Scripts
« on: January 06, 2026, 11:37:58 pm »

Merry Christmas and a Happy New Year to everyone.

I am announcing the definitive v4.3 scripts. This message contains only the description and screenshots because of the attachment limit.



Tons of improvements, bug fixes, stabilization and other things.


New Search window, with 30 seconds to choose now.


Separate Python scripts for the IMDb People script.


Fully stabilized and normalized code, now finally easy to navigate, with as many comments as possible left in the scripts.


New AllMovie and Rottentomatoes scripts as promised to finish in a year:

WISHFUL THINKING:
- Bringing back Allmovie and Rottentomatoes scripts too.

Tons of custom fields for AllMovie and RottenTomatoes.
Also, Rottentomatoes all-in-one script for movies, series and episodes.
Search window for Rottentomatoes to choose Movies or TV Shows to search for.

10

To be clear, these parts work correctly, my fixes to some of the code are only cosmetic in nature. Below are examples of what I had in mind.



Oh, OK then. Unfortunately, the cosmetics are very important to my custom skin design, to visually separate fields and sections (screenshot below), so it would be a huge overhead for me to keep two versions when updating.

Regarding cleaning up FullInfo, it is a very important section for many reasons, and I admit it was always too clumsy for me to clean, so I was primarily focused on making it work. I will clean it up in the next update release.


Thanks for reviewing though!

11
Hey, Ivek. Thanks. Can you please post examples or IMDb links where those matter and my code doesn't work? I just can't grasp it by looking at the code alone. Thanks.

12
Here are v4.3 scripts.

1. Unpack the first 7z into the "PersonalVideoDB" folder (3 files; overwrite the existing ones, but back them up first if you want).
2. Unpack the second 7z into your "Scripts" folder. It is safe to move everything out of it before extracting. These are all you need to safely run PVD with Python.


You need all of these in order for PVD to run as intended. The People script is especially complex, since I have integrated options to make it easier to update people dynamically and not to wait at all for deceased people or people who have only a name and URL. Test it.

If something doesn't work, first check:
1. That you are running the same version of Chrome and chromedriver.
2. That you have installed whatever Python needs in order to work, as described so far in this topic.

If that doesn't help, please post screenshots and logs, so that I can reproduce the issue and fix it.

Please test and let me know whether everything works.


Enjoy!

13
New v4.3 Files

The most comprehensive and stable release I have done so far. Two main things:

1. New PVD Scripts Configurator built from scratch in Python Tkinter. Much better GUI than AHK.
    A nice extra: I have introduced a dark/light theme for it, as in the first photo. I reordered the tabs, so the default tab is the IMDb Movie tab. You can still use both configurators (second photo; the new Configurator has the prefix "py"), all updated to v4.3, with the same options and functionality. I'm providing both for now, for continuity, but in the future I will most probably discontinue the AHK one. In any case, I'm including the .ahk file so anyone can maintain it in the future. I will stick to the Python Tkinter GUI.


2. As announced earlier, a new **UPDATE DYNAMIC VALUES ONLY** switch for a fast update of only certain dynamic fields.

From the Change logs:

IMDb Movie Script
Quote

CHANGE LOG :
V 4.3.0.1 (11/15/2025) afrocuban: Script Configurator Enhancements
-------------------------------------------------------------------------------
- Built a new Python Tkinter Script Configurator from scratch:
    • It has all the functionality of the AHK one.
    • Plus a newly developed light/dark theme.
    • Unlike the AHK one, it can be used as a standalone application with the same effect as when invoked in PVD.
   
- Added new feature in Script Configurator to enable and manage saved settings from `pvdconf.ini`:
    • **USE SAVED PVDCONFIG** now needs to be enabled to unlock configuration options below. 
    • This allows users to apply settings that require a restart of Personal Video Database upon saving. 
    • **Use this setting carefully!** Any changes will take effect only after clicking "Save All Script Configurations" (which will restart the application). 
- **UPDATE DYNAMIC VALUES ONLY**:
    • Allows users to update only **dynamic values** like Rating, Top 250, Metascore, and Number of votes.
    • Updates the **Awards summary** for movies released within the last two years, capturing recent wins for fresh releases.
    • When disabled, additional configuration options become available for comprehensive updates.
    • The poster can now be downloaded from any page; a separate procedure is provided for it in Script Configurator.
    • Single instance in Script Configurator. No more flooding with multiple instances by mistake.
    • Redesigned the whole script so that it now properly accepts the UPDATE DYNAMIC VALUES ONLY switch.




FilmAffinity Script
Quote
CHANGE LOG :

V 4.3.0.1 (11/27/2025) afrocuban: Script Configurator Enhancements
-------------------------------------------------------------------------------
- Built a new Python Tkinter Script Configurator from scratch:
    • It has all the functionality of the AHK one.
    • Plus a newly developed light/dark theme.
    • Unlike the AHK one, it can be used as a standalone application with the same effect as when invoked in PVD.


- Added new feature in Script Configurator to enable and manage saved settings from `pvdconf.ini`:
    • **USE SAVED PVDCONFIG** now needs to be enabled to unlock configuration options below. 
    • This allows users to apply settings that require a restart of Personal Video Database upon saving. 
    • **Use this setting carefully!** Any changes will take effect only after clicking "Save All Script Configurations" (which will restart the application). 
- **UPDATE DYNAMIC VALUES ONLY**:
    • Allows users to update only **dynamic values** like Rating, Number of votes, and Awards for movies made in the last 2 years.
    • When disabled, additional configuration options become available for comprehensive updates.




IMDb People Script
Quote


CHANGE LOG :
V 4.3.0.1 (11/27/2025) afrocuban: Script Configurator Enhancements
-------------------------------------------------------------------------------
- Built a new Python Tkinter Script Configurator from scratch:
    • It has all the functionality of the AHK one.
    • Plus a newly developed light/dark theme.
    • Unlike the AHK one, it can be used as a standalone application with the same effect as when invoked in PVD.
   
- Added new feature in Script Configurator to enable and manage saved settings from `pvdconf.ini`:
    • **USE SAVED PVDCONFIG** now needs to be enabled to unlock configuration options below. 
    • This allows users to apply settings that require a restart of Personal Video Database upon saving. 
    • **Use this setting carefully!** Any changes will take effect only after clicking "Save" (which will restart the application). 
- **UPDATE DYNAMIC VALUES ONLY**:
    • Allows users to update only **dynamic values** for persons who were alive when they were added to PVD or when they were last updated. Updates come only from the Main page.
    • When disabled, additional configuration options become available for comprehensive updates.

14

My next goal is to add a new switch to the Script Configurator, UPDATE_DYNAMIC_VALUES_ONLY, by adding a few dozen lines to the movie Selenium script that would call only the main page and update only dynamic values such as Rating, Top 250, Bottom 100 and Number of votes, plus the Awards summary when the movie is no more than 2 years old as of the current date, to catch fresh wins for recent releases.


Well, I finished it earlier than expected for the IMDb Movie script. Also, after a lot of challenges, I have redesigned the Script Configurator once again, adding new functionality:

Quote
      //Retrieve Data Config
  USE_SAVED_PVDCONFIG  = True ; // ***PVDCONFIG*** - Turn this ON to unlock and change the options below (from pvdconf.ini). Settings are applied when you click "Save All Script Configurations (Personal Video Database will automatically restart)" button below. Use carefully!


//############################################
//#  All options below require USE_SAVED_PVDCONFIG
//#  to be enabled so the Script Configurator
//#  can apply your settings correctly.
//############################################

  UPDATE_DYNAMIC_VALUES_ONLY  = True ;   //Update only dynamic values such as: Rating, Top 250, Metascore, Number of votes. Also update the Awards summary when the movie is less than 2 years old, to capture fresh wins for recent releases. Deselect to enable the options

//################################################
//#  All options below require UPDATE_DYNAMIC_VALUES_ONLY
//#  to be disabled so the Script Configurator
//#  can apply your settings correctly.
//###############################################


So you can see in the screenshots that now checking specific boxes disables or enables other options. Which means that....



My plan is for the versions to stay at v4.2 for a long time unless something significant changes in their design.


My plan will not last long, since these changes are huge for users, making it easy to navigate and choose the proper options without too much contemplating, so with this I will soon go to 4.3.

But I still will not publish anything, because I want to finish the People and FA scripts in terms of UPDATE_DYNAMIC_VALUES_ONLY, and I also want to tweak the Script Configurator GUI further. I just hate that the "Save" button does not autosize to the last option in each tab, so in the People and FA tabs, for example, we have to scroll all the way down. That is not just a visual thing: I would rather implement an "Apply" button that applies to each tab independently, while keeping overall "Cancel" and "Save & Restart PVD" buttons. That is a tremendous challenge in AHK, which even made me start creating a GUI with Python last year, but at that moment it looked even more difficult with Python, so I abandoned it. Now looks like exactly the right time to try again.

15
My next goal is to add a new switch to the Script Configurator, UPDATE_DYNAMIC_VALUES_ONLY, by adding a few dozen lines to the movie Selenium script that would call only the main page and update only dynamic values such as Rating, Top 250, Bottom 100 and Number of votes, plus the Awards summary when the movie is no more than 2 years old as of the current date, to catch fresh wins for recent releases.
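
The selection rule behind that switch can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the scripts: the function name fields_to_update and its parameters are hypothetical, and the field names are taken from the description above.

```python
from datetime import date

# Dynamic fields named in the post; always refreshed in "dynamic values only" mode.
DYNAMIC_FIELDS = ["Rating", "Top 250", "Bottom 100", "Number of votes"]


def fields_to_update(release_date, today=None, awards_window_years=2):
    """Return the fields to refresh for a title in 'dynamic values only' mode."""
    today = today or date.today()
    fields = list(DYNAMIC_FIELDS)
    try:
        cutoff = today.replace(year=today.year - awards_window_years)
    except ValueError:
        # today is Feb 29 and the cutoff year is not a leap year
        cutoff = today.replace(year=today.year - awards_window_years, day=28)
    # Only recent releases can still collect fresh award wins.
    if release_date >= cutoff:
        fields.append("Awards summary")
    return fields
```

For a title released in June 2024 and checked in November 2025 the Awards summary is included; for a title from 2020 only the four dynamic fields are returned.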

16

Here are all the scripts and files, fully updated, fixed and polished less than a month after I started fixing all 16 of them, and I was so happy that I got back into it easily and quickly. I have tested all the scripts and files against many edge-case titles and persons, and for me everything worked more than smoothly and satisfactorily.

They are now faster and more stable, and I no longer face internet interruptions, because I heavily redesigned the most problematic Python Selenium scripts.

If the Selenium_Chrome_Movie_Additional_pages_v4.py script in particular is demanding on your CPU when downloading movies and you experience lag of any kind, open the file in Notepad++ and, on line 375:

Quote
with ThreadPoolExecutor(max_workers=4) as executor:


reduce the number 4 to 3, 2 or 1 and test it. Whenever you lower the number, downloading the files will take longer, so find your balance. If you have a good CPU and a lot of RAM, you can even increase the number above 4.
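
To make that trade-off easier to experiment with, the worker count can be turned into a parameter. The following is a minimal sketch, not the code from Selenium_Chrome_Movie_Additional_pages_v4.py: download_pages and the fetch callback are hypothetical names standing in for whatever the script does per page.

```python
from concurrent.futures import ThreadPoolExecutor


def download_pages(urls, fetch, max_workers=4):
    """Call fetch(url) for every URL using at most max_workers threads.

    Results are returned in the same order as the input URLs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fetch, urls))
```

Calling download_pages(urls, fetch, max_workers=1) processes the pages strictly one after another, while higher values trade CPU and RAM for speed.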

I'd be happy to fine-tune and fix it further, so please give me the details of each case so that I can reproduce it and then fix it. If you have any further suggestions, I'd be happy to hear them as well, before I forget it all again, but please explain why and how with specific examples, because I am not a programmer, just someone using common sense and AI, and that is the only way I can understand the problem.

My plan is for the versions to stay at v4.2 for a long time unless something significant changes in their design.


Enjoy!
;) :)

17
Oh, and please remind me to upload purge_tmp_files.vbs if I forget, because I changed it so that it now also deletes the fake UserData folders.

18
As of now, per title, I'm getting these timings:
Main page downloading: ~18-22 sec
Other 9 pages downloading: ~48-55 sec
Image downloading: ~6-8 sec
PVD script processing: ~5-10 sec

So in total, for now, getting the largest possible amount of data per title, especially for titles with many awards, connections, etc., takes ~77-95 sec, which is pretty acceptable for the amount of data.
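
The total is just the sum of the lower and upper bounds of the four stages, which can be checked quickly in Python (the stage labels are shortened versions of the ones listed above):

```python
# Per-title timing ranges in seconds, copied from the figures above.
stages = {
    "Main page download": (18, 22),
    "Other 9 pages download": (48, 55),
    "Image download": (6, 8),
    "PVD script processing": (5, 10),
}

total_min = sum(low for low, _ in stages.values())
total_max = sum(high for _, high in stages.values())
print(f"Total per title: ~{total_min}-{total_max} sec")
```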

19
Yes, I just commented out the reference code in the script, and I already intended to leave it there, and everywhere it is mentioned, to preserve the logic for the future. I was actually talking about commenting it out in the Script Configurator too, to keep inexperienced users from clicking on those options and expecting to get data from the /reference page. I will move the function before the ParsePage function, though; that is a nice tip.

Now, meanwhile, I introduced another function:

Quote
function WaitForPageFile(const FileName, PageLabel: string;
  InitialWait: Integer; StabilizeMs: Integer): string; //BlockOpen
var
  i, currentWait: Integer;
  tryResult: string;
begin
  Result := '';
  i := 0;

  // provide defaults manually
  if InitialWait = 0 then
    InitialWait := 2000;
  if StabilizeMs = 0 then
    StabilizeMs := 1000;

  currentWait := InitialWait;

  while not FileExists(FileName) do
  begin
    LogMessage('Function DownloadPage - Waiting ' + IntToStr(currentWait div 1000) +
               's for the presence of: ' + FileName + '|');
    Wait(currentWait);

    // escalate wait times
    case i of
      0: currentWait := 5000;
      1: currentWait := 15000;
      2: currentWait := 10000;
      3: currentWait := 10000;
      4: currentWait := 15000;
      5: currentWait := 15000;
    end;

    Inc(i);
    if i = INTERNET_TEST_ITERATIONS then
    begin
      case MessageBox('IMDb Movie Function DownloadPage - Too many faulty attempts to internet connection for ' + PageLabel +
                      '. Cancel, Retry, or Continue (Ignore)? NOTE: IF YOU PRESS IGNORE YOU WILL NOT GET DATA FROM THAT PAGE, SO CONSIDER TO RETRY OR TO CANCEL AND START DOWNLOAD AGAIN! IMDb really makes it harder and harder to get the data.',
                      SCRIPT_FILE_NAME, 2) of
        3: // Cancel
        begin
          LogMessage('Function DownloadPage for ' + PageLabel +
                     ' ended with NO INTERNET connection ===============|');
          Result := '';
          Exit;
        end;
        4: // Retry
        begin
          i := 0;
          currentWait := InitialWait;
        end;
        5: // Ignore
        begin
          LogMessage('Function DownloadPage - Creating dummy ' + PageLabel +
                     ' HTML file due to Ignore selection|');
          with TStringList.Create do
          try
            Add('<html><body>Dummy ' + PageLabel + ' due to user Ignore.</body></html>');
            SaveToFile(FileName);
          finally
            Free;
          end;
          Break;
        end;
      end;
    end;
  end;

  // stabilization wait
  LogMessage('Function DownloadPage - ' + PageLabel +
             ' file detected, waiting extra ' + IntToStr(StabilizeMs) + 'ms to stabilize...');
  Wait(StabilizeMs);

  // manual error handling instead of try/except
  if not FileExists(FileName) then
  begin
    LogMessage(ProcException('FileError', 'NOT_FOUND: ' + FileName));
    Exit;
  end;

  tryResult := FileToString(FileName); // if your environment throws, replace with safe read
  if tryResult = '' then
    LogMessage(ProcException('FileError', 'FAILED to read ' + FileName))
  else
  begin
    Result := ConvertEncoding(tryResult, 65001);
    LogMessage(ProcException('FileRead', 'SUCCESS: ' + FileName));
  end;
end; //BlockClose

So now, instead of this snippet for every page:

Quote

     
   // Initialize currentWait for the FullCredits file
      currentWait := 5000;  // Start with 5 seconds
      //Wait for the FullCredits file to finish downloading
      If Not(((USE_SAVED_PVDCONFIG) And ((ConfigOptions[8] = '0') And (ConfigOptions[10] = '0'))) Or
         (GET_FULL_CREDIT_FROM_REFERENCE And ((Pos('Series', MediaType) = 0) And (Pos('Series', GetFieldValueXML('category')) = 0)))) Then Begin // Also Known As (FullCredits)
      i := 0;
      currentWait := 2000;  // Initialize wait time
      while not FileExists(FilePath + FileTitleFullCredits) do begin
         LogMessage('Function DownloadPage - Waiting ' + IntToStr(currentWait div 1000) + 's (because of the people like Alfonso Cuaron with 268 wins and 208 nominations at the moment of writing this script) for the presence of: ' + FilePath + FileTitleFullCredits);
         wait(currentWait);
          // Increment the wait time for the next iteration
         case i of
            0: currentWait := 5000;  // 5 seconds
            1: currentWait := 15000;  // 15 seconds
            2: currentWait := 10000;  // 10 seconds
            3: currentWait := 10000;  // 10 seconds
            4: currentWait := 15000;  // 15 seconds
            5: currentWait := 15000;  // 15 seconds
         end;
         i := i + 1;
         if i = INTERNET_TEST_ITERATIONS then
         begin
            case MessageBox('IMDb Movie Function DownloadPage - Too many faulty attempts to internet connection for FullCredits. Cancel, Retry, or Continue (Ignore)? NOTE: IF YOU PRESS IGNORE YOU WILL NOT GET DATA FROM THAT PAGE, SO CONSIDER TO RETRY OR TO CANCEL AND START DOWNLOAD AGAIN! IMDb really makes it harder and harder to get the data.', SCRIPT_FILE_NAME, 2) of
               3: // Abort -> treat as Cancel
               begin
               LogMessage('Function DownloadPage for FullCredits END with NO INTERNET connection =============== |');
                  Result := '';
                  Exit;
               end;
               4: // Retry
               begin
                  i := 0;
                  currentWait := 2000;  // Reset wait time
               end;
               5: // Ignore->create dummy file
               begin
                  LogMessage('Creating dummy FullCredits HTML file due to Ignore selection...');
                  with TStringList.Create do
                  try
                     Add('<html><body>Dummy FullCredits due to user Ignore.</body></html>');
                     SaveToFile(FilePath + FileTitleFullCredits);
                  finally
                     Free;
                  end;
                  Result := FilePath + FileTitleFullCredits;
               end;
            end;
         end;
      end;


    // Add a short stabilization wait after the file is recognized
    LogMessage('Function DownloadPage - CHANGETHISWITHPROPERFILENAME file detected, waiting extra 1s to stabilize...');
    Wait(2000); // wait 2 second (adjust as needed)


      WebText := FileToString(FilePath + FileTitleFullCredits);
      WebText := ConvertEncoding(WebText, 65001); // UTF-8
      FullCreditsPageDownloaded := True;
      LogMessage('Function DownloadPage - FullCredits file found: ' + FilePath + FileTitleFullCredits);
      LogMessage('Value of FullCreditsPageDownloaded: ' + BoolToStr(FullCreditsPageDownloaded));
   end;


we will have only this:


Quote



if not (USE_SAVED_PVDCONFIG and (ConfigOptions[15] = '0')) then
begin
  WebText := WaitForPageFile(FilePath + FileTitleParentalGuide, 'ParentalGuide', 5000, 2000);
  ParentalGuidePageDownloaded := WebText <> '';
  LogMessage('Function DownloadPage - Value of ParentalGuidePageDownloaded: ' +
             BoolToStr(ParentalGuidePageDownloaded));
end;


That way we will speed up the script and have hundreds, if not thousands, fewer lines in the script.
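
For anyone who wants the same pattern on the Python side, the escalating wait could look roughly like this. It is a sketch under assumptions, not a port of the function above: wait_for_page_file, max_attempts and the injectable exists/sleep parameters are hypothetical, the wait schedule copies the values from the Pascal case statement, and instead of showing a Cancel/Retry/Ignore dialog it simply gives up and returns an empty string.

```python
import os
import time

# Wait times in milliseconds used after the initial wait, as in the Pascal case statement.
ESCALATION_MS = [5000, 15000, 10000, 10000, 15000, 15000]


def wait_for_page_file(path, initial_wait_ms=2000, stabilize_ms=1000,
                       max_attempts=6, exists=os.path.exists, sleep=time.sleep):
    """Wait until path exists, then read it as UTF-8.

    Returns the file content, or '' if the file never appears
    within max_attempts waits.
    """
    current_wait = initial_wait_ms
    attempt = 0
    while not exists(path):
        if attempt >= max_attempts:
            return ""
        sleep(current_wait / 1000)
        # Escalate the wait time for the next round, keeping the last value afterwards.
        if attempt < len(ESCALATION_MS):
            current_wait = ESCALATION_MS[attempt]
        attempt += 1
    # Give the writer a moment to finish the file before reading it.
    sleep(stabilize_ms / 1000)
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```

The exists and sleep parameters are there only so the function can be exercised without real waiting; in normal use they keep their defaults.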


I will do that and test for all repetitive tasks.

Also, I'm exploring ways to keep IMDb from refusing connections from Selenium/Python. At the moment I'm testing creating fake UserData folders and rotating user agents, and it is going well so far.
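
As a rough idea of what per-session profile folders and user-agent rotation can look like, here is a minimal sketch. Everything in it is an assumption for illustration: the names new_session_args and purge_profile, the folder prefix, and the user-agent strings are placeholders, not what the scripts actually use. The two Chrome command-line switches, --user-data-dir and --user-agent, are standard ones.

```python
import random
import shutil
import tempfile

# Illustrative placeholders only; a real list would hold current, genuine browser strings.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/144.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/144.0.0.0 Safari/537.36",
]


def new_session_args(user_agents=USER_AGENTS, prefix="pvd_userdata_"):
    """Create a throwaway profile folder and pick a user agent for one session.

    Returns (profile_dir, chrome_args); the args can be added to ChromeOptions
    one by one with add_argument().
    """
    profile_dir = tempfile.mkdtemp(prefix=prefix)
    user_agent = random.choice(user_agents)
    args = [f"--user-data-dir={profile_dir}", f"--user-agent={user_agent}"]
    return profile_dir, args


def purge_profile(profile_dir):
    """Delete a throwaway profile folder once the session is finished."""
    shutil.rmtree(profile_dir, ignore_errors=True)
```

Calling purge_profile after each session keeps the temporary folders from piling up, which is the same job the updated purge_tmp_files.vbs is described as doing.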

20
I have completed the FullCredits function, so now all the data can be imported into PVD again. Before I publish it, I want to check 2 things:
1. What to do with the Reference page code, and whether I should exclude its options from the Script Configurator, since it is now not needed at all.
2. The People script: if I can fix it quickly, I will publish all the scripts and files again in one final package for this IMDb HTML layout change.

After that, please check the scripts and let me know what doesn't work.
