Recent Posts

31
Yes, I just commented out the reference code in the script, and I already intended to leave it there, and everywhere else it is mentioned, to preserve the logic for the future. I was actually talking about commenting it out in the Script Configurator too, so that inexperienced users don't click those options expecting to get data from the /reference page. I will move the function before the ParsePage function, though; that is a nice tip.

Now, meanwhile, I introduced another function:

Quote
function WaitForPageFile(const FileName, PageLabel: string;
  InitialWait: Integer; StabilizeMs: Integer): string; //BlockOpen
var
  i, currentWait: Integer;
  tryResult: string;
begin
  Result := '';
  i := 0;

  // provide defaults manually
  if InitialWait = 0 then
    InitialWait := 2000;
  if StabilizeMs = 0 then
    StabilizeMs := 1000;

  currentWait := InitialWait;

  while not FileExists(FileName) do
  begin
    LogMessage('Function DownloadPage - Waiting ' + IntToStr(currentWait div 1000) +
               's for the presence of: ' + FileName + '|');
    Wait(currentWait);

    // escalate wait times
    case i of
      0: currentWait := 5000;
      1: currentWait := 15000;
      2: currentWait := 10000;
      3: currentWait := 10000;
      4: currentWait := 15000;
      5: currentWait := 15000;
    end;

    Inc(i);
    if i = INTERNET_TEST_ITERATIONS then
    begin
      case MessageBox('IMDb Movie Function DownloadPage - Too many faulty attempts to internet connection for ' + PageLabel +
                      '. Cancel, Retry, or Continue (Ignore)? NOTE: IF YOU PRESS IGNORE YOU WILL NOT GET DATA FROM THAT PAGE, SO CONSIDER TO RETRY OR TO CANCEL AND START DOWNLOAD AGAIN! IMDb really makes it harder and harder to get the data.',
                      SCRIPT_FILE_NAME, 2) of
        3: // Cancel
        begin
          LogMessage('Function DownloadPage for ' + PageLabel +
                     ' ended with NO INTERNET connection ===============|');
          Result := '';
          Exit;
        end;
        4: // Retry
        begin
          i := 0;
          currentWait := InitialWait;
        end;
        5: // Ignore
        begin
          LogMessage('Function DownloadPage - Creating dummy ' + PageLabel +
                     ' HTML file due to Ignore selection|');
          with TStringList.Create do
          try
            Add('<html><body>Dummy ' + PageLabel + ' due to user Ignore.</body></html>');
            SaveToFile(FileName);
          finally
            Free;
          end;
          Break;
        end;
      end;
    end;
  end;

  // stabilization wait
  LogMessage('Function DownloadPage - ' + PageLabel +
             ' file detected, waiting extra ' + IntToStr(StabilizeMs) + 'ms to stabilize...');
  Wait(StabilizeMs);

  // manual error handling instead of try/except
  if not FileExists(FileName) then
  begin
    LogMessage(ProcException('FileError', 'NOT_FOUND: ' + FileName));
    Exit;
  end;

  tryResult := FileToString(FileName); // if your environment throws, replace with safe read
  if tryResult = '' then
    LogMessage(ProcException('FileError', 'FAILED to read ' + FileName))
  else
  begin
    Result := ConvertEncoding(tryResult, 65001);
    LogMessage(ProcException('FileRead', 'SUCCESS: ' + FileName));
  end;
end; //BlockClose

So now, instead of this snippet for every page:

Quote

     
   // Initialize currentWait for the FullCredits file
      currentWait := 5000;  // Start with 5 seconds
      //Wait for the FullCredits file to finish downloading
      If Not(((USE_SAVED_PVDCONFIG) And ((ConfigOptions[8] = '0') And (ConfigOptions[10] = '0'))) Or
         (GET_FULL_CREDIT_FROM_REFERENCE And ((Pos('Series', MediaType) = 0) And (Pos('Series', GetFieldValueXML('category')) = 0)))) Then Begin // Also Known As (FullCredits)
      i := 0;
      currentWait := 2000;  // Initialize wait time
      while not FileExists(FilePath + FileTitleFullCredits) do begin
         LogMessage('Function DownloadPage - Waiting ' + IntToStr(currentWait div 1000) + 's (because of the people like Alfonso Cuaron with 268 wins and 208 nominations at the moment of writing this script) for the presence of: ' + FilePath + FileTitleFullCredits);
         wait(currentWait);
          // Increment the wait time for the next iteration
         case i of
            0: currentWait := 5000;  // 5 seconds
            1: currentWait := 15000;  // 15 seconds
            2: currentWait := 10000;  // 10 seconds
            3: currentWait := 10000;  // 10 seconds
            4: currentWait := 15000;  // 15 seconds
            5: currentWait := 15000;  // 15 seconds
         end;
         i := i + 1;
         if i = INTERNET_TEST_ITERATIONS then
         begin
            case MessageBox('IMDb Movie Function DownloadPage - Too many faulty attempts to internet connection for FullCredits. Cancel, Retry, or Continue (Ignore)? NOTE: IF YOU PRESS IGNORE YOU WILL NOT GET DATA FROM THAT PAGE, SO CONSIDER TO RETRY OR TO CANCEL AND START DOWNLOAD AGAIN! IMDb really makes it harder and harder to get the data.', SCRIPT_FILE_NAME, 2) of
               3: // Abort -> treat as Cancel
               begin
               LogMessage('Function DownloadPage for FullCredits END with NO INTERNET connection =============== |');
                  Result := '';
                  Exit;
               end;
               4: // Retry
               begin
                  i := 0;
                  currentWait := 2000;  // Reset wait time
               end;
               5: // Ignore->create dummy file
               begin
                  LogMessage('Creating dummy FullCredits HTML file due to Ignore selection...');
                  with TStringList.Create do
                  try
                     Add('<html><body>Dummy FullCredits due to user Ignore.</body></html>');
                     SaveToFile(FilePath + FileTitleFullCredits);
                  finally
                     Free;
                  end;
                  Result := FilePath + FileTitleFullCredits;
               end;
            end;
         end;
      end;


    // Add a short stabilization wait after the file is recognized
    LogMessage('Function DownloadPage - CHANGETHISWITHPROPERFILENAME file detected, waiting extra 1s to stabilize...');
    Wait(2000); // wait 2 second (adjust as needed)


      WebText := FileToString(FilePath + FileTitleFullCredits);
      WebText := ConvertEncoding(WebText, 65001); // UTF-8
      FullCreditsPageDownloaded := True;
      LogMessage('Function DownloadPage - FullCredits file found: ' + FilePath + FileTitleFullCredits);
      LogMessage('Value of FullCreditsPageDownloaded: ' + BoolToStr(FullCreditsPageDownloaded));
   end;


we will have only this:


Quote



if not (USE_SAVED_PVDCONFIG and (ConfigOptions[15] = '0')) then
begin
  WebText := WaitForPageFile(FilePath + FileTitleParentalGuide, 'ParentalGuide', 5000, 2000);
  ParentalGuidePageDownloaded := WebText <> '';
  LogMessage('Function DownloadPage - Value of ParentalGuidePageDownloaded: ' +
             BoolToStr(ParentalGuidePageDownloaded));
end;


That way we will speed up the script and have hundreds, if not thousands, fewer lines in it.


I will do that and test for all repetitive tasks.
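
For anyone who wants to try the waiting logic outside PVD, here is a minimal Python sketch of the same idea as WaitForPageFile: poll for the file, escalate the wait between polls, then pause briefly before reading. It is only an illustration, not the script's code. The name `wait_for_page_file`, the `max_attempts` parameter (standing in for INTERNET_TEST_ITERATIONS, whose value is not shown here) and the injectable `sleep` are my assumptions, and the Cancel/Retry/Ignore dialog is left to the caller.

```python
import os
import time

# Wait schedule taken from the Pascal function above (milliseconds).
ESCALATION_MS = [5000, 15000, 10000, 10000, 15000, 15000]


def wait_for_page_file(path, initial_wait_ms=2000, stabilize_ms=1000,
                       max_attempts=6, sleep=time.sleep):
    """Return the file's text once it exists, or '' after max_attempts polls."""
    wait_ms = initial_wait_ms
    attempt = 0
    while not os.path.exists(path):
        if attempt >= max_attempts:
            return ""  # hypothetical: the caller decides to cancel, retry or ignore
        sleep(wait_ms / 1000)
        # Escalate the wait; reuse the last value once the table runs out.
        wait_ms = ESCALATION_MS[min(attempt, len(ESCALATION_MS) - 1)]
        attempt += 1
    sleep(stabilize_ms / 1000)  # give the writer time to finish the file
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```

Passing `sleep` as a parameter makes it possible to test the wait schedule without actually waiting.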

Also, I'm exploring ways to keep IMDb from refusing connections from Selenium/Python. At the moment I'm testing fake userData folders and rotating user agents, and it is going well so far.
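
A rough Python sketch of what rotating user agents with throwaway userData folders could look like. This is not the code from the Selenium scripts; the function name `chrome_argument_sets`, the `pvd_profile_` prefix and the sample user-agent strings are assumptions for illustration. `--user-agent` and `--user-data-dir` are standard Chrome command-line switches.

```python
import itertools
import tempfile

# Example user-agent strings; placeholders for illustration, not a vetted list.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]


def chrome_argument_sets(user_agents, profile_root=None):
    """Yield one list of Chrome command-line arguments per browser session."""
    for agent in itertools.cycle(user_agents):
        # A fresh, empty profile folder for every session.
        profile_dir = tempfile.mkdtemp(prefix="pvd_profile_", dir=profile_root)
        yield [f"--user-agent={agent}", f"--user-data-dir={profile_dir}"]
```

Each argument from one set would then be passed to `ChromeOptions().add_argument(...)` before creating `webdriver.Chrome(options=options)`. The temporary folders are not cleaned up here, so a real script would want to delete them after the session.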
32
PVD Python Scripts / Re: PVD Selenium MOD v4 IMDb Movie, People and FilmAffinity Scripts
« Last post by Ivek23 on November 10, 2025, 07:46:03 am »
I have completed FullCredits function, so now all the data can be imported again to PVD. Before I publish it I want to check 2 things:
1. What to do with Reference page code, and should I exclude its options from Script Configurator, since now it is completely not needed.
2. To check People script and if I can fix it quickly, then I will publish all the scripts and files again in one, final package for this IMDb html layout change.

After that please check the scripts and let me know what doesn't work

First of all, Function ParsePage_IMDBMovieREFERENCE should be moved to the end of the script before Function ParsePage.

Secondly, the entire reference page code should be kept, placed before Function ParsePage as mentioned above or, even better, moved to the very end of the script, where the History of changes already is, so that it can be re-included in the future if necessary or blocked completely. All its options should also be removed from both the script code and the Script Configurator because, as mentioned, they are no longer needed.

As for the People script, I don't know what works or doesn't work, because I don't use it.
33
I have completed FullCredits function, so now all the data can be imported again to PVD. Before I publish it I want to check 2 things:
1. What to do with Reference page code, and should I exclude its options from Script Configurator, since now it is completely not needed.
2. To check People script and if I can fix it quickly, then I will publish all the scripts and files again in one, final package for this IMDb html layout change.

After that please check the scripts and let me know what doesn't work
34
I have finished the CompanyCredits function, so there is no longer any need for the Reference page at all. Consequently, I have updated, redesigned and recompiled the Script Configurator.

Now I need to update the FullCredits function to get the full cast and crew, directors for series, producers and composers, and then everything will be done. I will not remove the Reference page code. I will only comment it out, so that it can be used again if IMDb changes its pages in the future.

Most probably I will not post again until I have finished everything.
35
I forgot to upload the .ahk source for the Script Configurator .exe.


SeleniumPVDbScriptsConfig-v4.exe is compiled with ahk2exe for AutoHotkey v1.1.37.02 option "U32 (default) bin", without compression.
36
Scripts and Templates / Re: Dark Skin - V4-1.1
« Last post by afrocuban on November 06, 2025, 01:32:57 am »
Updated Movie skins v4-1.2 and v4-3.2 as of 2025-11-05.


All the custom fields added, and tabs redesigned.
37

Meanwhile, I have made huge fixes and improvements to all the other files/scripts, especially the Main page, so a lot of new data is now pulled from the Main page, which is also downloaded much faster. I encourage you to try it: select the fields in the Script Configurator as in the screenshots, wait for PVD to restart, then choose to overwrite all fields as in the other screenshot, and start the download immediately.

One of the biggest changes is a new procedure that allows the poster to be downloaded from any page. For example, if you choose to download only the "AKA" page (to update only the AKAs), you can now download the poster as well. Just be sure to select "Download Posters" in the Script Configurator.

I have also updated and recompiled the Script Configurator with minor changes. You need to replace all the files with the ones in the attachment; they all go into the /Scripts folder as usual.

Here is what I did, taken from the CHANGE LOGs of the IMDb and FilmAffinity scripts:

IMDb Script
Quote
---------------------------------------------
CHANGE LOG :
V 4.1.0.1 (5/11/2025) afrocuban
- Procedure EnsurePosterDownloaded is introduced in order to be able to download poster from any page.
- Fixed pages due to layout change.
- Improved ParsePage_IMDBMovieBASE function.
- Introduced a revised MessageBox that now offers Cancel, Retry, or Continue (Ignore). NOTE: IF YOU PRESS IGNORE YOU WILL NOT GET DATA FROM THAT PAGE, SO CONSIDER TO RETRY OR TO CANCEL AND START DOWNLOAD AGAIN! IMDb really makes it harder and harder to get the data.
- Script Configurator descriptions adjusted to reflect actual processes.
- In corresponding Selenium scripts significantly improved downloading pages speed.
- In corresponding Selenium scripts fixed searching titles.

FilmAffinity Script
Quote

CHANGE LOG :
V 4.1.0.1-afrocuban (11/5/2025) afrocuban:
- Backup CHEAT_PREFIX_URLs introduced, since httpbin.org now almost always CREATES "503 Service Temporarily Unavailable" PROBLEMS FOR FILMAFFINITY RECENTLY. At the moment, the one that works is 'http://httpbingo.org/response-headers?key='


(*@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

// IF NONE OF THESE WORK THEN THERE ARE NO OTHER SERVICES AVAILABLE AND THE ONLY OPTION LEFT IS TO INSTALL LOCAL HTTPBIN LIKE THIS

Install and run httpbin locally like this:

```bash
pip install httpbin
python -m httpbin.core
```

It will start a local server on port `5000`, and you can absolutely use:

```
http://localhost:5000/response-headers?key=https://www.filmaffinity.com/en/film699169.html/
```

Just like you would with the public httpbin.org — and it will behave the same way: echoing back the `key` header in the response.

---

### What to expect
When you visit that URL in your browser or send a request via code, you’ll get a JSON response like:

```json
{
  "key": "https://www.filmaffinity.com/en/film699169.html/"
}
```

No HTTPS redirection, no 503 errors, and full control — because it’s running locally.

---


@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

- Fixed "FA Critics" field due to layout changes.
- Fixed "FAMovieTrailers" field to now properly include source name (youtube, dailymotion, etc).
- Other layout changes fixed.
- In corresponding Selenium scripts significantly improved page download speed, especially for the Trailers page.
- In corresponding Selenium scripts fixed title search after the FilmAffinity Search page layout changed (the movie title now has to be retrieved from the section for mobile devices).
- Other minor layout changes updated, and the script cleaned additionally.
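
The backup CHEAT_PREFIX_URLs idea from the FilmAffinity change log can be illustrated with a small Python sketch: try each prefix in order and keep the first one that answers with HTTP 200. How the PVD script actually chooses between the prefixes is not shown here, so the function names and the probe logic are assumptions. The httpbingo.org and localhost prefixes come from the change log; the httpbin.org prefix is assumed to have the same form.

```python
import urllib.error
import urllib.request

CHEAT_PREFIX_URLS = [
    "http://httpbin.org/response-headers?key=",    # assumed form of the original
    "http://httpbingo.org/response-headers?key=",  # reported as working
    "http://localhost:5000/response-headers?key=", # local httpbin, if installed
]


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrives."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # e.g. 503 Service Temporarily Unavailable
    except (urllib.error.URLError, OSError):
        return None


def first_working_prefix(prefixes, target_url, probe=http_status):
    """Return the first prefix whose probe answers 200, else None."""
    for prefix in prefixes:
        if probe(prefix + target_url) == 200:
            return prefix
    return None
```

The `probe` parameter is there so the selection logic can be checked with fake status codes, without touching the network.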


So with these scripts, you can get all possible data for now except:
- Full Cast (for now you can get only cast found on the main page)
- Producers
- Composers
- Full Production Companies
- Distributors
- All Directors for Series (only one director for now if you download /fullcredits page, because I have just started to fix it).

I will not provide support for these scripts until I finish everything, because I know the IMDb script still doesn't fully work. I would like you to test edge-case titles, which I will fix once I finish. For that, please provide the link and the log for the specific field (or what you get in the field and what you expect). I can only fix things I can reproduce, and for that I need the data above.
38
Great. I couldn't tell what it could be, but if I had to guess I'd say it was that new chrome.options still weren't implemented in geckodriver.

Where am I at this point?
Now I'm left with the /fullcredits and /reference pages. But after reviewing the new /reference design, it has become totally pointless to download and parse it. It simply no longer has anything more than the other pages already being downloaded, except the full "Production Companies" and "Distributors". So I will probably implement the CompanyCredits page, which is tiny and much faster to download, to get the full "Production Companies" and "Distributors". That will also drastically simplify the code.

Another important change is a new error MessageBox, as seen in the last screenshot, because IMDb makes it harder and harder to fetch data in a non-human way, with Selenium or similar tools. Read the message on it to learn what each button does.



More on everything in the next message.
39
Maybe Selenium_Chrome_Movie_Additional_pages_v4.py will be loaded to see how it works.

I have optimized Selenium_Chrome_Base_page_v4.py script so now downloading of the IMDb main page should be dramatically faster - around 18 seconds on my computer.

The Selenium_Chrome_Base_page_v4.py script works fine. However, it doesn't work at all with Firefox and Geckodriver options.

I managed to edit the Selenium_Chrome_Base_page_v4.py script with Firefox and Geckodriver options using AI and it works now.

I would then do the same in the Selenium_Chrome_Movie_Additional_pages_v4.py script.
40
Oh, sorry to hear that. I didn't even know it had worked earlier, or how it could have, since I never published a geckodriver version. I probably don't understand the context of your message.

I changed every place in the code that sets Chrome and Chromedriver options to the Firefox and Geckodriver equivalents, and the Selenium_Chrome_Base_page_v4.py script then also worked with the Firefox browser.

As I mentioned before, after the latest Selenium_Chrome_Base_page_v4.py update I again changed the Chrome and Chromedriver options to Firefox and Geckodriver options, but the script no longer works at all with the Firefox browser.
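
One reason such a port can break is that not every Chrome option has a Firefox counterpart. The Python sketch below shows the kind of translation involved. The options actually used in Selenium_Chrome_Base_page_v4.py are not shown in this thread, so the three handled here are assumed examples, and the function name `chrome_args_to_firefox` is hypothetical.

```python
def chrome_args_to_firefox(chrome_args):
    """Split Chrome-style arguments into Firefox arguments, preferences
    and a list of arguments that have no translation in this sketch."""
    firefox_args, firefox_prefs, untranslated = [], {}, []
    for arg in chrome_args:
        name, _, value = arg.partition("=")
        if name == "--headless":
            firefox_args.append("-headless")
        elif name == "--user-agent":
            # Firefox takes the user agent as a preference, not a switch.
            firefox_prefs["general.useragent.override"] = value
        elif name == "--user-data-dir":
            # Firefox takes the profile folder as two separate arguments.
            firefox_args.extend(["-profile", value])
        else:
            untranslated.append(arg)
    return firefox_args, firefox_prefs, untranslated
```

With Selenium, the results would be applied to `selenium.webdriver.firefox.options.Options()` through `add_argument(...)` and `set_preference(name, value)` before calling `webdriver.Firefox(options=options)`. Anything that ends up in the third list needs to be checked by hand.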


We already talked about this a while ago in another PVD Python Scripts topic.