PVD Selenium MOD v4 IMDb Movie, People and FilmAffinity Scripts

afrocuban:
I forgot to upload the .ahk source for the Script Configurator .exe.


SeleniumPVDbScriptsConfig-v4.exe is compiled with ahk2exe for AutoHotkey v1.1.37.02, using the "U32 (default) bin" option, without compression.

afrocuban:
I have finished the CompanyCredits function, so the Reference page is no longer needed at all. Consequently, I have updated, redesigned and recompiled the Script Configurator.

Next I need to update the FullCredits function, that is, to get the full cast & crew, directors for series, producers and composers. After that, everything will be done. I will not remove the Reference page code from the script. I will only comment it out, so that it can be reused if IMDb changes its pages again in the future.

Most probably I will not post again until everything is finished.

afrocuban:
I have completed the FullCredits function, so all the data can now be imported into PVD again. Before I publish it, I want to check two things:
1. What to do with the Reference page code, and whether I should exclude its options from the Script Configurator, since it is no longer needed at all.
2. The People script. If I can fix it quickly, I will publish all the scripts and files again in one final package for this IMDb HTML layout change.

After that, please check the scripts and let me know what doesn't work.

Ivek23:

--- Quote from: afrocuban on November 10, 2025, 12:13:09 am ---I have completed FullCredits function, so now all the data can be imported again to PVD. Before I publish it I want to check 2 things:
1. What to do with Reference page code, and should I exclude its options from Script Configurator, since now it is completely not needed.
2. To check People script and if I can fix it quickly, then I will publish all the scripts and files again in one, final package for this IMDb html layout change.

After that please check the scripts and let me know what doesn't work
--- End quote ---

First of all, Function ParsePage_IMDBMovieREFERENCE should be moved toward the end of the script, just before Function ParsePage.

Secondly, the entire Reference page code should be kept, either before Function ParsePage as mentioned above or, even better, at the very end of the script next to the existing History of changes, so that it can be re-enabled in the future if necessary or left permanently disabled. All of its options should also be removed from both the script code and the Script Configurator because, as mentioned, they are no longer needed.

As for the People script, I don't know what works and what doesn't, because I don't use it.

afrocuban:
Yes, I have just commented out the Reference code in the script, and I already intended to leave it in place wherever it is mentioned, to preserve the logic for the future. What I was actually asking about was commenting it out in the Script Configurator too, so that inexperienced users do not click on those options expecting to get data from the /reference page. I will move the function before the ParsePage function, though; that is a nice tip.

Now, meanwhile, I introduced another function:


--- Quote ---function WaitForPageFile(const FileName, PageLabel: string;
  InitialWait: Integer; StabilizeMs: Integer): string; //BlockOpen
var
  i, currentWait: Integer;
  tryResult: string;
begin
  Result := '';
  i := 0;

  // provide defaults manually
  if InitialWait = 0 then
    InitialWait := 2000;
  if StabilizeMs = 0 then
    StabilizeMs := 1000;

  currentWait := InitialWait;

  while not FileExists(FileName) do
  begin
    LogMessage('Function DownloadPage - Waiting ' + IntToStr(currentWait div 1000) +
               's for the presence of: ' + FileName + '|');
    Wait(currentWait);

    // escalate wait times
    case i of
      0: currentWait := 5000;
      1: currentWait := 15000;
      2: currentWait := 10000;
      3: currentWait := 10000;
      4: currentWait := 15000;
      5: currentWait := 15000;
    end;

    Inc(i);
    if i = INTERNET_TEST_ITERATIONS then
    begin
      case MessageBox('IMDb Movie Function DownloadPage - Too many faulty attempts to internet connection for ' + PageLabel +
                      '. Cancel, Retry, or Continue (Ignore)? NOTE: IF YOU PRESS IGNORE YOU WILL NOT GET DATA FROM THAT PAGE, SO CONSIDER TO RETRY OR TO CANCEL AND START DOWNLOAD AGAIN! IMDb really makes it harder and harder to get the data.',
                      SCRIPT_FILE_NAME, 2) of
        3: // Abort button, treated as Cancel
        begin
          LogMessage('Function DownloadPage for ' + PageLabel +
                     ' ended with NO INTERNET connection ===============|');
          Result := '';
          Exit;
        end;
        4: // Retry
        begin
          i := 0;
          currentWait := InitialWait;
        end;
        5: // Ignore
        begin
          LogMessage('Function DownloadPage - Creating dummy ' + PageLabel +
                     ' HTML file due to Ignore selection|');
          with TStringList.Create do
          try
            Add('<html><body>Dummy ' + PageLabel + ' due to user Ignore.</body></html>');
            SaveToFile(FileName);
          finally
            Free;
          end;
          Break;
        end;
      end;
    end;
  end;

  // stabilization wait
  LogMessage('Function DownloadPage - ' + PageLabel +
             ' file detected, waiting extra ' + IntToStr(StabilizeMs) + 'ms to stabilize...');
  Wait(StabilizeMs);

  // manual error handling instead of try/except
  if not FileExists(FileName) then
  begin
    LogMessage(ProcException('FileError', 'NOT_FOUND: ' + FileName));
    Exit;
  end;

  tryResult := FileToString(FileName); // if your environment throws, replace with safe read
  if tryResult = '' then
    LogMessage(ProcException('FileError', 'FAILED to read ' + FileName))
  else
  begin
    Result := ConvertEncoding(tryResult, 65001);
    LogMessage(ProcException('FileRead', 'SUCCESS: ' + FileName));
  end;
end; //BlockClose

--- End quote ---

So now, instead of this snippet repeated for every page:


--- Quote ---
     
   // Initialize currentWait for the FullCredits file
      currentWait := 5000;  // Start with 5 seconds
      //Wait for the FullCredits file to finish downloading
      If Not(((USE_SAVED_PVDCONFIG) And ((ConfigOptions[8] = '0') And (ConfigOptions[10] = '0'))) Or
         (GET_FULL_CREDIT_FROM_REFERENCE And ((Pos('Series', MediaType) = 0) And (Pos('Series', GetFieldValueXML('category')) = 0)))) Then Begin // Also Known As (FullCredits)
      i := 0;
      currentWait := 2000;  // Initialize wait time
      while not FileExists(FilePath + FileTitleFullCredits) do begin
         LogMessage('Function DownloadPage - Waiting ' + IntToStr(currentWait div 1000) + 's (because of the people like Alfonso Cuaron with 268 wins and 208 nominations at the moment of writing this script) for the presence of: ' + FilePath + FileTitleFullCredits);
         wait(currentWait);
          // Increment the wait time for the next iteration
         case i of
            0: currentWait := 5000;  // 5 seconds
            1: currentWait := 15000;  // 15 seconds
            2: currentWait := 10000;  // 10 seconds
            3: currentWait := 10000;  // 10 seconds
            4: currentWait := 15000;  // 15 seconds
            5: currentWait := 15000;  // 15 seconds
         end;
         i := i + 1;
         if i = INTERNET_TEST_ITERATIONS then
         begin
            case MessageBox('IMDb Movie Function DownloadPage - Too many faulty attempts to internet connection for FullCredits. Cancel, Retry, or Continue (Ignore)? NOTE: IF YOU PRESS IGNORE YOU WILL NOT GET DATA FROM THAT PAGE, SO CONSIDER TO RETRY OR TO CANCEL AND START DOWNLOAD AGAIN! IMDb really makes it harder and harder to get the data.', SCRIPT_FILE_NAME, 2) of
               3: // Abort -> treat as Cancel
               begin
               LogMessage('Function DownloadPage for FullCredits END with NO INTERNET connection =============== |');
                  Result := '';
                  Exit;
               end;
               4: // Retry
               begin
                  i := 0;
                  currentWait := 2000;  // Reset wait time
               end;
               5: // Ignore->create dummy file
               begin
                  LogMessage('Creating dummy FullCredits HTML file due to Ignore selection...');
                  with TStringList.Create do
                  try
                     Add('<html><body>Dummy FullCredits due to user Ignore.</body></html>');
                     SaveToFile(FilePath + FileTitleFullCredits);
                  finally
                     Free;
                  end;
                  Result := FilePath + FileTitleFullCredits;
               end;
            end;
         end;
      end;


    // Add a short stabilization wait after the file is recognized
    LogMessage('Function DownloadPage - FullCredits file detected, waiting extra 2s to stabilize...');
    Wait(2000); // wait 2 seconds (adjust as needed)


      WebText := FileToString(FilePath + FileTitleFullCredits);
      WebText := ConvertEncoding(WebText, 65001); // UTF-8
      FullCreditsPageDownloaded := True;
      LogMessage('Function DownloadPage - FullCredits file found: ' + FilePath + FileTitleFullCredits);
      LogMessage('Value of FullCreditsPageDownloaded: ' + BoolToStr(FullCreditsPageDownloaded));
   end;

--- End quote ---


we will have only this:



--- Quote ---


if not (USE_SAVED_PVDCONFIG and (ConfigOptions[15] = '0')) then
begin
  WebText := WaitForPageFile(FilePath + FileTitleParentalGuide, 'ParentalGuide', 5000, 2000);
  ParentalGuidePageDownloaded := WebText <> '';
  LogMessage('Function DownloadPage - Value of ParentalGuidePageDownloaded: ' +
             BoolToStr(ParentalGuidePageDownloaded));
end;

--- End quote ---


That way the script will be faster and hundreds, if not thousands, of lines shorter.


I will do the same for all the repetitive tasks and test it.
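To make the timing of WaitForPageFile easier to reason about, here is a small Python model of its escalating wait schedule. It is only a sketch of the loop logic quoted above, not part of the scripts: the name `wait_schedule` and the iteration counts used in the examples are assumptions, since the real value of INTERNET_TEST_ITERATIONS is not shown in this thread.

```python
# Delays (ms) assigned after each of the first six waits, copied from the
# "case i of" table in WaitForPageFile. After that the delay stays at 15000.
ESCALATION_MS = [5000, 15000, 10000, 10000, 15000, 15000]


def wait_schedule(initial_ms, iterations):
    """Return the list of waits (ms) performed before the Abort/Retry/Ignore
    dialog appears, assuming the file never shows up.

    `iterations` stands in for INTERNET_TEST_ITERATIONS (hypothetical value).
    """
    if initial_ms == 0:
        initial_ms = 2000  # same manual default as in the Pascal function
    waits = []
    current = initial_ms
    for i in range(iterations):
        waits.append(current)          # Wait(currentWait)
        if i < len(ESCALATION_MS):     # case i of 0..5
            current = ESCALATION_MS[i]
    return waits


if __name__ == "__main__":
    schedule = wait_schedule(5000, 4)
    print(schedule, "total ms:", sum(schedule))
```

With an initial wait of 5000 ms and four iterations, the model gives waits of 5000, 5000, 15000 and 10000 ms, so the dialog would appear after about 35 seconds. Pressing Retry resets the counter and the same schedule starts again.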

Also, I'm exploring ways to keep IMDb from refusing connections from Selenium/Python. At the moment I'm testing fake userData folders and rotating user agents, and it is going well so far.
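For readers curious what rotating user agents and throwaway userData folders could look like, here is a minimal Python sketch. It is not the code from the scripts. It assumes Chrome is the browser (the thread does not say), the helper names `build_chrome_arguments` and `make_driver` are hypothetical, and the user-agent strings are placeholders that go stale quickly.

```python
import itertools
import tempfile

# Placeholder user-agent strings, for illustration only (assumption).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36",
]

# Endless round-robin iterator over the list above.
_ua_cycle = itertools.cycle(USER_AGENTS)


def build_chrome_arguments(user_agent=None, profile_dir=None):
    """Return Chrome command-line switches for one browser session.

    Each call without arguments takes the next user agent in the rotation
    and creates a fresh, empty profile folder in the temp directory.
    """
    if user_agent is None:
        user_agent = next(_ua_cycle)
    if profile_dir is None:
        profile_dir = tempfile.mkdtemp(prefix="pvd_profile_")
    return [
        f"--user-agent={user_agent}",
        f"--user-data-dir={profile_dir}",
    ]


def make_driver(arguments):
    """Start Chrome through Selenium with the given switches."""
    from selenium import webdriver  # imported lazily; needs selenium installed

    options = webdriver.ChromeOptions()
    for argument in arguments:
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

A session would then be started with `make_driver(build_chrome_arguments())`. The temporary profile folders are not deleted automatically, so a real script would need to clean them up after the browser is closed.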
