Messages - jondak

1
Did you start 4 Chrome instances manually prior to running PVD?
start chrome.exe --remote-debugging-port=9222 --user-data-dir="C:\PVD\1"
start chrome.exe --remote-debugging-port=9223 --user-data-dir="C:\PVD\2"
start chrome.exe --remote-debugging-port=9224 --user-data-dir="C:\PVD\3"
start chrome.exe --remote-debugging-port=9225 --user-data-dir="C:\PVD\4"
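
A minimal Python sketch of the same launch step, assuming chrome.exe is on PATH and the profile folders live under C:\PVD as in the commands above. The function names (build_chrome_command, build_all_commands, launch_instances) are hypothetical and are not part of the PVD scripts.

```python
import subprocess


def build_chrome_command(port, profile_dir, chrome="chrome.exe"):
    """Return the argument list for one Chrome instance with remote debugging enabled."""
    return [
        chrome,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={profile_dir}",
    ]


def build_all_commands(first_port=9222, count=4, base_dir=r"C:\PVD"):
    """Build one command per instance: consecutive ports, numbered profile folders."""
    return [
        build_chrome_command(first_port + i, f"{base_dir}\\{i + 1}")
        for i in range(count)
    ]


def launch_instances(commands):
    """Start each instance without waiting for it to exit."""
    return [subprocess.Popen(cmd) for cmd in commands]
```

Calling launch_instances(build_all_commands()) would start the four instances on ports 9222 to 9225, each with its own profile folder.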

This I didn't do. I will try again next week when time permits.

Thank you

2
Hello,

In Selenium_Chrome_Movie_Additional_pages_v4.py I replaced everything from "Function to download a page and handle "See more" clicks for specific pages" to the end with the script in the quote, but I still get all the errors.

Maybe I got the script wrong, or maybe my port is being blocked.

Regards.

3
Hello,

These are my scripts. I will try the modification on the weekend.

4
This just sleeps between 8 and 12 seconds and it could be very fragile. Also, it makes the whole process 1-2 minutes longer per title?

I use Moviedb to get the picture and IMDB Selenium to get the data
Tests:
Witches' Well 2024 https://www.imdb.com/title/tt29793692/ - it took 1 min 55 sec
The Matrix 1999 https://www.imdb.com/title/tt0133093/ - it took 2 min 10 sec

I limited the tags to 300, as above 500 it crashed the database and I had to manually edit it with DBeaver. The rest is in the pictures attached.

I run PVD in a Win10 VM, as in Win11 I can't get it to download any data.

When I first got the AWS pages instead of the data pages, I thought I had been IP banned by IMDb, so I tried to proxy and VPN my connection with no success. I even copied the VM to my computer at work to test, with the same result.
Then I looked into why I was getting those pages, and the results pointed to the fact that I appeared to be a bot, fetching page after page with no "human" pause between them, so I added the sleep.

I found other solutions but have not tested them:

change: chrome_options = build_chrome_options(headed=False)
to this:
headed_mode = "keywords" in download_url
chrome_options = build_chrome_options(headed=headed_mode)

Also this, but it seemed to take longer:

add this after page load:

if "challenge.js" in driver.page_source or "AwsWafIntegration" in driver.page_source:
    logging.warning("AWS WAF detected — retrying with longer delay")
    time.sleep(15)
    driver.refresh()
    time.sleep(8)
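
A hedged sketch of how the check above could be turned into a bounded retry loop instead of a single refresh. It only relies on driver.page_source and driver.refresh() from Selenium; the names is_waf_challenge and wait_out_waf, the retry count and the doubling delay are my own assumptions and I have not tested them against IMDb.

```python
import logging
import time

# Markers taken from the challenge page; both appear in its HTML.
WAF_MARKERS = ("challenge.js", "AwsWafIntegration")


def is_waf_challenge(page_source):
    """True if the HTML contains any of the AWS WAF challenge markers."""
    return any(marker in page_source for marker in WAF_MARKERS)


def wait_out_waf(driver, max_retries=3, first_delay=15, settle_delay=8, sleep=time.sleep):
    """Refresh the page with growing delays until the challenge is gone.

    Returns True if real content was reached, False if every retry failed.
    """
    delay = first_delay
    for attempt in range(1, max_retries + 1):
        if not is_waf_challenge(driver.page_source):
            return True
        logging.warning("AWS WAF detected, retry %d after %d s", attempt, delay)
        sleep(delay)
        driver.refresh()
        sleep(settle_delay)
        delay *= 2  # back off further on each failed attempt
    return not is_waf_challenge(driver.page_source)
```

It would be called right after page load, e.g. if not wait_out_waf(driver): skip saving that page. The sleep parameter exists only so the waiting can be replaced in a dry run.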

Regards

5
Just in case other people get this, here is how to fix it in Selenium_Chrome_Movie_Additional_pages_v4:

after driver.get(download_url)

I added:

time.sleep(random.uniform(8, 12))

This change does not work because it blocks the download of Additional pages.


Hello,

Without the modification I was getting AWS pages on the additional pages. I attached one example, renamed to txt.

6
Hello,

Thank you for your epic work on keeping the scripts and PVD alive.


After working very well for 3-4 days, today (22.01.2026) I keep getting this when the keywords and reviews pages are downloaded:

<html lang="en"><head>
           "context":"
};
    </script>
    <script src="https://1c5c1ecf7303.8b78215a.eu-north-1.token.awswaf.com/1c5c1ecf7303/e231f0619a5e/0319a8d4ae69/challenge.js"></script>
</head>
<body>
    <div id="challenge-container"></div>
    <script type="text/javascript">
        AwsWafIntegration.saveReferrer();
        AwsWafIntegration.checkForceRefresh().then((forceRefresh) => {
            if (forceRefresh) {
                AwsWafIntegration.forceRefreshToken().then(() => {
                    window.location.reload(true);
                });
            } else {
                AwsWafIntegration.getToken().then(() => {
                    window.location.reload(true);
                });
            }
        });
    </script>
    <noscript>
        <h1>JavaScript is disabled</h1>
        In order to continue, we need to verify that you're not a robot.
        This requires JavaScript. Enable JavaScript and then reload the page.
    </noscript>

</body></html>

After some searching I got this from ChatGPT:

What the error actually is: the file you're saving is not the keywords page. It's an AWS WAF (Web Application Firewall) challenge page returned by IMDb.

Key signs from the HTML:

challenge.js
AwsWafIntegration
“verify that you're not a robot”
JavaScript-based token refresh

This means: IMDb detected automation and served a bot-check page instead of real content.
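
The key signs above can also be used to check pages that were already saved to disk. This is only a sketch: the function name find_challenge_files, the folder argument and the "*.htm*" file pattern are assumptions of mine, not something the PVD scripts define.

```python
from pathlib import Path

# Strings that appear in the AWS WAF challenge page but not in a real IMDb page.
WAF_MARKERS = ("challenge.js", "AwsWafIntegration", "verify that you're not a robot")


def find_challenge_files(folder, pattern="*.htm*"):
    """Return the names of saved pages in `folder` that are WAF challenge pages."""
    hits = []
    for path in sorted(Path(folder).glob(pattern)):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if any(marker in text for marker in WAF_MARKERS):
            hits.append(path.name)
    return hits
```

Running it on the download folder lists the files that need to be fetched again.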

Just in case other people get this, here is how to fix it in Selenium_Chrome_Movie_Additional_pages_v4:

after driver.get(download_url)

I added:

time.sleep(random.uniform(8, 12))
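
As a rough sketch, the two lines can be wrapped in one helper so the pause range is easy to change in a single place. The name get_with_pause and its parameters are hypothetical; only driver.get, random.uniform and time.sleep come from the lines above.

```python
import random
import time


def get_with_pause(driver, url, low=8.0, high=12.0, sleep=time.sleep):
    """Load `url`, then pause for a random time between `low` and `high` seconds."""
    driver.get(url)
    pause = random.uniform(low, high)
    sleep(pause)
    return pause  # returned so the caller can log how long it waited
```

Usage would be get_with_pause(driver, download_url) in place of the plain driver.get(download_url) call.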


7
PVD Python Scripts / Re: PVD Selenium MOD v4 IMDb Movie Script Confusion
« on: September 06, 2025, 09:29:45 am »
Hello,

I was using the Selenium Chrome scripts until they changed the page and the scripts stopped pulling data.

In a previous post you said:


I don't know what's going on, because I'm not using any IMDB_Movies_[EN][Selenium]-v4 script versions at the moment. And I won't be using them for quite some time. afrocuban user has been away for quite some time, so he can fix these scripts as best he can, because he knows the design of these scripts. He'll probably be away for quite some time, and the question here is how long it will be before he returns.

I was asking what scripts you are using to pull data now.


Thank you.

8
PVD Python Scripts / Re: PVD Selenium MOD v4 IMDb Movie Script Confusion
« on: September 05, 2025, 03:37:01 pm »
Hello Ivek23,

Can you tell us what combination of scripts you are using?

9
Other Topics / Re: IMDb test 1b sctipt
« on: July 26, 2023, 06:49:41 pm »
Hello,

Can anyone upload the fix for AKA (50 entries) here? I saw it on the other forum, but it went offline while I was waiting for account approval.


10
Support / Re: Personal Video Database 1.0.2.7 MOD
« on: July 20, 2021, 10:19:40 am »
I would like, for example, Title1 to be shown in the left pane/tree view. Is that possible?




I don't know if this is what you are asking, but you can add the Original title to the left list:



You put something like this:

%N. %O

This is for 0.9.9.21, but I suspect 1.0.2.7 is the same.

Hope this helps.

11
Support / Re: What happened with my topic?
« on: July 20, 2021, 08:42:06 am »
Thank you,

I didn't know they use different labels if there are multiple countries. I thought they changed it in the site redesign.

Testing RC now

Cheers.


12
Support / Re: What happened with my topic?
« on: July 19, 2021, 09:29:24 pm »
Hello,

For the Country Field:

    curPos:=Pos('<span class="ipc-metadata-list-item__label">Countries of origin</span>',HTML);                                      //WEB_SPECIFIC.
    If 0<curPos Then Begin
      EndPos:=curPos;
      ItemValue:=HTMLValues(HTML,'<span class="ipc-metadata-list-item__label">Countries of origin</span>','</ul>','<li role="presentation" class="ipc-inline-list__item">','</li>',', ',endPos);

becomes:

    curPos:=Pos('<span class="ipc-metadata-list-item__label">Country of origin</span>',HTML);                                      //WEB_SPECIFIC.
    If 0<curPos Then Begin
      EndPos:=curPos;
      ItemValue:=HTMLValues(HTML,'<span class="ipc-metadata-list-item__label">Country of origin</span>','</ul>','<li role="presentation" class="ipc-inline-list__item">','</li>',', ',endPos);


On the page they replaced "Countries of origin" with "Country of origin" (both test and RC versions).

I've tested with only 2 movies, as it's late here, and it worked.
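
Since the label apparently differs between single-country and multi-country titles, matching both forms may be safer than swapping one for the other. The sketch below shows the idea in Python rather than in the PascalScript of the .psf file; it does not reproduce HTMLValues, and the function name countries_of_origin is hypothetical. The HTML strings are the ones from the snippet above.

```python
import re

# Accept both "Country of origin" and "Countries of origin", then grab up to </ul>.
LABEL_RE = re.compile(
    r'<span class="ipc-metadata-list-item__label">Countr(?:y|ies) of origin</span>'
    r"(.*?)</ul>",
    re.DOTALL,
)
ITEM_RE = re.compile(
    r'<li role="presentation" class="ipc-inline-list__item">(.*?)</li>',
    re.DOTALL,
)
TAG_RE = re.compile(r"<[^>]+>")


def countries_of_origin(html):
    """Return the countries as a ', '-joined string, or '' if the label is missing."""
    block = LABEL_RE.search(html)
    if block is None:
        return ""
    items = ITEM_RE.findall(block.group(1))
    # Strip any inner tags (links) from each list item.
    return ", ".join(TAG_RE.sub("", item).strip() for item in items)
```

In the script itself the same effect could be had by trying the second label when Pos returns 0 for the first one.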

13
Support / Re: Personal Video Database 1.0.2.7 MOD
« on: July 02, 2019, 11:08:53 am »
Thank you.

14
Support / Re: Personal Video Database 1.0.2.7 MOD
« on: June 19, 2019, 06:46:27 pm »
I've been having this for the last two weeks.

I'm scanning with IMDB.

Same error:
IMDB_[EN][HTTPS].psf V 1.4.1.0 (10/02/2019)

For now 2 movies triggered the error:
https://www.imdb.com/title/tt1210059/ Flying Lessons (2010)

https://www.imdb.com/title/tt1877647/ Ghoul (2012)

Seems the movies with this error have no Plot Keywords


15
Support / Re: Personal Video Database 1.0.2.7 MOD
« on: November 22, 2018, 07:56:17 am »
Thank you.

The fix worked for the titles I linked. But it seems not all titles are made the same  :-\

Dai juk hei kek (2012) https://www.imdb.com/title/tt2266938/ also crashes.

Maybe it is caused by the Cyrillic alphabet on the AKA page?

Cheers.

16
Support / Re: Personal Video Database 1.0.2.7 MOD
« on: November 17, 2018, 08:44:27 pm »
Found a bug in the latest version of the IMDB_ [EN] [HTTPS] script in both 0.9.9.21 and 1.0.2.7

The error occurs if this Script configuration box is ticked: Download "Also know as' provider page for retrieve the info...
If the box is not checked the script works and saves the info.

I attached the bug report.

I got the error when I tried to update the following movies:

Calendar Girl (2011) https://www.imdb.com/title/tt1611816/
The Brazen Bull (2010) https://www.imdb.com/title/tt1415284/

The script worked until 16.11.2018, so probably something changed on the IMDb site.

17
Support / Re: Personal Video Database 1.0.2.7 MOD
« on: October 25, 2018, 06:31:19 pm »
Hello,

Are the movie connections parsed by the MOD version?

example:
https://www.imdb.com/title/tt0071275/movieconnections/?tab=mc&ref_=tt_trv_cnn

Cheers

18
Support / Re: Personal Video Database 1.0.2.7 MOD
« on: August 14, 2018, 08:06:03 am »
Yes, with that modification it worked.

Thank you.

20
Support / Re: Personal Video Database 1.0.2.7 MOD
« on: July 17, 2018, 05:40:13 pm »
Thank you Ivek23. Modified and works well.

Found a minor bug:

Movie:  Ratatouille (2007) https://www.imdb.com/title/tt0382932/

MPAA retrieved: Rated PG for mild action                                                                        Edit     

It adds "Edit" at the end of the retrieved MPAA rating. Other movies work well. Will continue testing.
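
For illustration only, this is the kind of cleanup that would remove it, written in Python rather than in the script's own language. The function name clean_mpaa is hypothetical, and it assumes the stray "Edit" is always the last word and is separated from the rating by whitespace.

```python
import re


def clean_mpaa(raw):
    """Collapse runs of whitespace and drop a trailing 'Edit' link label, if present."""
    text = re.sub(r"\s+", " ", raw).strip()
    # Require whitespace before "Edit" so words that merely end in "Edit" are kept.
    return re.sub(r"\s+Edit$", "", text)
```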

Cheers.
