Personal Video Database

English => Support => Topic started by: afrocuban on November 08, 2024, 05:09:10 pm

Title: Imdb People script issues
Post by: afrocuban on November 08, 2024, 05:09:10 pm
Does anyone else have this issue with this skin? While a person's photo is being downloaded, PVD freezes. I cannot cancel the import, and the only way to continue is to kill the viddb.exe process and restart PVD.


Thanks a bunch for any feedback.
Title: Re: Imdb People script issues
Post by: afrocuban on December 04, 2024, 12:09:39 am
Can someone please try to import any person with the People IMDb script, so I can find out whether I am the only one having this issue? After restarting PVD, if I click on the name of the actor on which PVD froze, it freezes again, and I have to improvise to delete the actor's record from the database.
Title: Re: Imdb People script issues
Post by: Ivek23 on December 04, 2024, 02:26:48 pm
What is the error message in the log file?
Title: Re: Imdb People script issues
Post by: afrocuban on December 08, 2024, 05:03:46 pm
None. It starts parsing and hangs. In PVD's status bar I can see that it hangs while importing the person's photo jpg.
Title: Re: Imdb People script issues
Post by: Ivek23 on December 08, 2024, 08:42:16 pm
Quote
None. It starts parsing and hangs. In PVD's status bar I can see that it hangs while importing the person's photo jpg.

Then comment out the entire block of code that downloads the person's photo jpg:

Quote
(*   
    //Get ~Photo~ . Remember that the PVdB ~transname~ Translated Name is not stored in TheMovieDB. Can be used for PhotoURL
    ItemValue:=TextBetWeenFirst(ItemList,'"image":"','",');                  // WEB_SPECIFIC.
    If (Length(ItemValue)>0) and (Pos('nopicture',ItemValue)=0)Then Begin            //"https://m.media-amazon.com/images/G/01/imdb/images/nopicture/...' NOT exists working httpS
        PhotoURL:=TextBetWeenFirst(ItemValue,BASE_URL_IMAGE_PRE_TRUE,'.');       //Get poster code. Strings which opens/closes the data. WEB_SPECIFIC       
        If ((Length(PhotoURL)>0) and Not(USE_SAVED_PVDCONFIG and (Copy(PVDConfigOptions,opPhoto,1)='0'))) then begin  //The Poster will be saved in PVD
            PhotoURL:=BASE_URL_IMAGE_PRE_TRUE + PhotoURL;                             //Base poster URL without '.jpg'. WEB_SPECIFIC
            ImageFile:=GetAppPath+'Scripts\'+BASE_DOWNLOAD_FILE_IMAGE_NAME+'-Photo.jpg';
            // Avoid HTTPS redirection: Download https image to file
            If (1=DownloadImage(PhotoURL + '._V1_UY' + IntToStr(MAX_IMAGE_HEIGTH) + '_.jpg',ImageFile)) then begin  //Download in the user-selected max size. WEB_SPECIFIC
                AddImageURL(itPoster,ImageFile);    //Add the photo to the database. I don't know why, but it doesn't work: the photo is not retrieved the way a movie poster is
                AddSearchResult(GetFieldValueXML('name'), '', '', ImageFile, ImageFile); //It's not possible to avoid GetFieldValueXML because the name can't be the same.
                if PHOTO_URL_IN_TRANSNAME then AddFieldValueXML('transname',PhotoURL + '._V1_UY' + IntToStr(MAX_IMAGE_HEIGTH) + '_.jpg'); //Stores the URL of the person's photo, to send to KODI in a template
                LogMessage('      Get result PhotoURL:'+PhotoURL + '._V1_UY' + IntToStr(MAX_IMAGE_HEIGTH) + '_.jpg'+'||');
                LogMessage('Script end. After, PVdB will retreive from ListImage and info of person in order get the photo');
                Result:=prListImage;
            end else if (1=DownloadImage(ItemValue +'.jpg',ImageFile)) then begin  //Download in the web base size. WEB_SPECIFIC
                AddImageURL(itPoster,ImageFile);    //Add the photo to the database. I don't know why, but it doesn't work: the photo is not retrieved the way a movie poster is
                AddSearchResult(GetFieldValueXML('name'), '', '', ImageFile, ImageFile); //It's not possible to avoid GetFieldValueXML because the name can't be the same.
                if PHOTO_URL_IN_TRANSNAME then AddFieldValueXML('transname',PhotoURL+'.jpg'); //Stores the URL of the person's photo, to send to KODI in a template
                LogMessage('      Get result PhotoURL:'+PhotoURL+'.jpg'+'||');
                LogMessage('Script end. After, PVdB will retreive from ListImage and info of person in order get the photo');
                Result:=prListImage;
            end;       
        End;       
    End Else Begin
        PhotoURL:='';
    End;
*)
Title: Re: Imdb People script issues
Post by: afrocuban on December 08, 2024, 09:09:23 pm
Thanks. Are you saying I should try commenting out the photo download to check whether that is the culprit?
Title: Re: Imdb People script issues
Post by: Ivek23 on December 09, 2024, 08:17:17 am
Quote
Thanks. Are you saying I should try commenting out the photo download to check whether that is the culprit?

Yes, I hope that the culprit of the problem is only this code for downloading the photo of people, and not something else, because I haven't tested it myself.
Title: Re: Imdb People script issues
Post by: Ivek23 on December 09, 2024, 10:44:06 am
Quote
Thanks. Are you saying I should try commenting out the photo download to check whether that is the culprit?

Quote
Yes, I hope that the culprit of the problem is only this code for downloading the photo of people, and not something else, because I haven't tested it myself.

The code in Function ParsePage_IMDBPeopleBIO is to blame for PVD freezing.

Code: [Select]
Function ParsePage_IMDBPeopleBIO(HTML:String):Cardinal; //BlockOpen
    //Returns:
    //     Result:=prFinished; Script has finished gathering data
    //     Result:=prError; If ¿any big problem? with exit;
    //Retrieve: ~bio~ Biography from "Mini Bio" IMDB section
// (* *)
  Var
    curPos,endPos,debug_pos1:Integer;
    ItemValue:String;
    PersonID,ItemValue0,ItemValue10,ItemValue1,ItemValue11:String;
ItemList,ItemList00,ItemList0,ItemList1,ItemList11,ItemList12:String;
  Begin
    LogMessage('Function ParsePage_IMDBPeopleBIO BEGIN=====================||');       
    Result:=prFinished;  //It will change to prError if any big problem with exit;
(*
    //Get "Biography" info
    curPos:=Pos('<h1 class="ipc-title__text">Biography</h1>',HTML);      //Strings start which opens the block content data. WEB_SPECIFIC
    if (curPos=0) then Exit;
ItemList0:=TextBetWeenFirst(HTML,'<h1 class="ipc-title__text','<h3 class="ipc-title__text"><span>Contribute to this page</span></h3>');
//LogMessage('  ** Parse Biography '+#13+ItemList0+' **');

If (Length(ItemList0)>0) Then Begin
ItemValue1:=TextBetWeenFirst(ItemList0,'<span class="ipc-metadata-list-item__label" aria-disabled="false">Birth name</span>','</div>');
if BIRTH_NAME_IN_TRANSNAME then
if ItemValue1 <> '' then AddFieldValueXML('transname',ItemValue1);
If ItemValue1 <> '' then LogMessage('      Get result from Birth Name01:'+ItemValue1+'||');
End;

ItemList:='';
ItemList11:='';
//Get PersonID
PersonID:=TextBetWeenFirst(HTML,'<meta property="imdb:pageConst" content="','"/>');   //WEB_SPECIFIC.
    if (2<Length(PersonID)) then begin
ItemList:='<link url="http://www.imdb.com/name/'+PersonID+'/bio/#overview">Biography Info</link>';
        LogMessage('      Get result PersonID:'+PersonID+'||');
    end;
    //Get "Mini bio" Biography text
If Pos('<h1 class="ipc-title__text">Biography</h1>',HTML)>0 Then Begin
curPos:=Pos('<h3 class="ipc-title__text"><span id="mini_bio">Mini Bio</span>',HTML);       //WEB_SPECIFIC.
If 0<curPos Then Begin
curPos:=PosFrom('<li role="presentation" class="ipc-metadata-list__item" id="mini_bio_0" data-testid="list-item"><div class="ipc-metadata-list-item__content-container"><ul class="ipc-inline-list ipc-inline-list--show-dividers ipc-inline-list--inline ipc-metadata-list-item__list-content base" role="presentation"><div class="ipc-html-content ipc-html-content--base ipc-metadata-list-item-html-item" role="presentation"><div class="ipc-html-content-inner-div">',HTML,EndPos)+Length('<li role="presentation" class="ipc-metadata-list__item" id="mini_bio_0" data-testid="list-item"><div class="ipc-metadata-list-item__content-container"><ul class="ipc-inline-list ipc-inline-list--show-dividers ipc-inline-list--inline ipc-metadata-list-item__list-content base" role="presentation"><div class="ipc-html-content ipc-html-content--base ipc-metadata-list-item-html-item" role="presentation"><div class="ipc-html-content-inner-div">');
EndPos:=PosFrom('</div>',HTML,curPos);
//ItemValue:=Copy(HTML,curPos,endPos-curPos);
ItemValue:=Trim(Copy(HTML,curPos,endPos-curPos)); //ItemValue:=Copy(HTML,curPos+425,endPos-curPos-425);
//LogMessage('      Get result bio (from Mini bio)1:'+ItemValue+'||');
ItemValue:=StringReplace(ItemValue,#10,#160,True,False,True);
//LogMessage('      Get result bio (from Mini bio)2:'+ItemValue+'||');
ItemValue:=StringReplace(ItemValue,'<a class="ipc-md-link ipc-md-link--entity" href="','<link url="http://www.imdb.com' ,True,False,True);
ItemValue:=StringReplace(ItemValue,'/?ref_=nmbio_mbio">','/">',True,False,True);
ItemValue:=StringReplace(ItemValue,'</a>','</link>',True,False,True);
//LogMessage('      Get result bio (from Mini bio)0:'+ItemValue+'||');
//curPos:=Pos('###',ItemValue);
//If 0<curPos then ItemValue:=Copy(ItemValue,0,curPos-2);
//curPos:=Pos('</p>',ItemValue);      //WEB_SPECIFIC. Chr(13)
//If 0<curPos then Delete(ItemValue,curPos,Length(ItemValue)-curPos);
//if Pos('~  ', ItemValue) = 0 then Delete(ItemValue,1,2);
If BIO_URL_IN_BIO then ItemValue:=RemoveTags(ItemValue, False);
LogMessage('      Get result bio (from Mini bio):'+ItemValue+'||');
If ItemValue <> '' then ItemList11:=ItemList11+ItemValue;
//if ItemValue <> '' then AddFieldValueXML('bio',ItemValue);
End;
End;

//If (ItemList11 = '') AND (ItemList <> '') Then
ItemList12:=ItemList;
If (ItemList11 <> '') AND (ItemList <> '') Then 
ItemList12:=ItemList11;

//ItemList12:=ItemList11+#13+'--------------------------------------------------------------------------'+#13+ItemList+#32#32#32+'<link url="http://www.imdb.com/name/'+PersonID+'/bio/#mini_bio">Mini bio Biography</link>';

///If BIO_INFO_IN_BIO then AddFieldValueXML('bio',ItemList12);

  ///If Not(BIO_INFO_IN_BIO) Then AddFieldValueXML('bio',ItemList11);

//Get "Birth name" Biography text
ItemList00:='';
ItemList00:=TextBetWeenFirst(HTML,'<h1 class="ipc-title__text','<h3 class="ipc-title__text"><span>Contribute to this page</span></h3>'); 
//LogMessage('  *** Parse Biography '+#13+ItemList00+' ***');
If (Length(ItemList00)>0) Then Begin
ItemValue0:=TextBetWeenFirst(ItemList00,'<span class="ipc-metadata-list-item__label" aria-disabled="false">Birth name</span>','</div></div></div>');
if BIRTH_NAME_IN_TRANSNAME then
//if ItemValue0 <> '' then AddFieldValueXML('transname',ItemValue0);
If ItemValue0 <> '' then LogMessage('      Get result from Birth Name02:'+ItemValue0+'||');
If ItemValue0 <> '' then ItemValue0:='BirthName:  '+ItemValue0;
If ItemValue0 <> '' then ItemList12:=ItemList12+#13+'--------------------------------------------------------------------------'+#13+ItemValue0;
End;

If BIO_INFO_IN_BIO then AddFieldValueXML('bio',ItemList12);

  If Not(BIO_INFO_IN_BIO) Then AddFieldValueXML('bio',ItemList11);
         
*)
    LogMessage('Function ParsePage_IMDBPeopleBIO END=====================||');
  End; //BlockClose



Below is the IMDB_People_[EN][HTTPS] (2) script with the added code, in which this function is commented out. The script needs extensive changes because the source code of the website has changed substantially.
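
Because the parsing depends on exact HTML markers that stop matching whenever the site changes, it can help to make each extraction fail safe. The following is only a minimal sketch in standalone Free Pascal, not PVD script code: SafeTextBetween is a hypothetical helper, the sample HTML is made up, and PosEx comes from Free Pascal's StrUtils unit (the script above uses its own PosFrom instead). It is not confirmed anywhere in this thread that a missing marker is what causes the freeze.

```pascal
program SafeExtract;
{$mode objfpc}{$H+}
uses SysUtils, StrUtils;

// Hypothetical helper (not part of PVD): returns the text between OpenTag and
// CloseTag, or an empty string when either marker is missing.
function SafeTextBetween(const S, OpenTag, CloseTag: string): string;
var
  StartPos, EndPos: Integer;
begin
  Result := '';
  StartPos := Pos(OpenTag, S);
  if StartPos = 0 then Exit;                 // opening marker not found
  StartPos := StartPos + Length(OpenTag);    // first character after the marker
  EndPos := PosEx(CloseTag, S, StartPos);
  if EndPos = 0 then Exit;                   // closing marker not found
  Result := Trim(Copy(S, StartPos, EndPos - StartPos));
end;

begin
  // Marker present: the inner text is printed between the brackets.
  WriteLn('[', SafeTextBetween('<div id="bio">Born in 1961.</div>', '<div id="bio">', '</div>'), ']');
  // Marker missing (layout changed): an empty result instead of a bad Copy.
  WriteLn('[', SafeTextBetween('<p>layout changed</p>', '<div id="bio">', '</div>'), ']');
end.
```

With a helper like this, every position is checked before it is passed to Copy, so a changed page layout produces an empty field rather than a calculation based on a position of 0.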
Title: Re: Imdb People script issues
Post by: afrocuban on December 21, 2024, 10:57:39 pm
As I said somewhere else, I fixed the bio and genre fields, and now I'm working on integrating Selenium into PVD for downloading dynamic HTML content.
Reference from this point forward (http://www.videodb.info/forum_en/index.php/topic,4357.msg22661.html#msg22661)
I have now reached the phase of parsing the Awards page downloaded manually with Selenium, and I'm having a hard time with it. I have fixed the code that parses the page, and it parses it successfully, as you can see here:
Quote
(12/21/2024 10:24:56 PM) Parsed Event: Ariel Awards, Mexico
(12/21/2024 10:24:56 PM) Parsed Award:  Golden Ariel
(12/21/2024 10:24:56 PM) Parsed Category: Best Picture (Mejor Película)
(12/21/2024 10:24:56 PM) Parsed Recipient: Roma
(12/21/2024 10:24:56 PM) Parsed Year: 2019
(12/21/2024 10:24:56 PM) Parsed Won: True
(12/21/2024 10:24:56 PM) Before calling AddAward with parameters:
(12/21/2024 10:24:56 PM) Event: Ariel Awards, Mexico
(12/21/2024 10:24:56 PM) Award:  Golden Ariel
(12/21/2024 10:24:56 PM) Category: Best Picture (Mejor Película)
(12/21/2024 10:24:56 PM) Recipient: Roma
(12/21/2024 10:24:56 PM) Year: 2019
(12/21/2024 10:24:56 PM) Won: True
(12/21/2024 10:24:56 PM) AddAward executed successfully.
(12/21/2024 10:24:56 PM) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won=True
(12/21/2024 10:24:56 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True


But for some reason the value is not populated/displayed in PVD. So I thought I would create a custom memo field "IMDb People Awards" and populate the value there to check what it looks like, only to realize that no custom field is visible in PVD's People section.
Is it possible at all to add custom fields in the People section?
I manually put the value in the field (added to my dark people skin), but when I exit edit mode, it's not displayed. When I enter edit mode, it's there. If I restart PVD, the value disappears from the custom field.


Anyway, looking at the log, does anyone know why this properly parsed award wouldn't populate the field, although the log reports that it did?

Here's the whole function (I even added some extra logging around adding the value to the field in order to see what is going on, but to no avail: everything looks perfect, yet the value is not there):

Quote
Function CustomBoolToStr(Value: Boolean): String;
Begin
  If Value Then
    Result := 'True'
  Else
    Result := 'False';
End;


Function ParsePage_IMDBPeopleAWARDS(HTML: String): Cardinal;
Var
  curPos, endPos: Integer;
  ItemList, Event, Award, Category, Recipient, Year: String;
  AValue: String; // Declaring AValue as a String
  Won: Boolean;
  FailSafe: Integer;  // To prevent infinite loops


Begin
  LogMessage('Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||');


  try
    Result := prFinished;


    // Log the initial HTML snippet being parsed
    LogMessage('Initial HTML snippet: ' + Copy(HTML, 1, 500));


    // Find the position of the Awards title
    curPos := Pos('<h1 class="ipc-title__text">Awards</h1>', HTML);
    If curPos > 0 Then Begin
      // Find the position of the Awards section
      curPos := PosFrom('<section class="ipc-page-section ipc-page-section--base">', HTML, curPos);
    End;


    If curPos > 0 Then Begin
      // Find the end position of the Awards section
      endPos := PosFrom('</section>', HTML, curPos);
      If endPos = 0 Then endPos := Length(HTML);


      If (curPos > 0) AND (endPos > curPos) Then Begin
        // Extract the Awards block
        ItemList := Copy(HTML, curPos, endPos - curPos);


        // Extract and log the event name
        curPos := PosFrom('<h3 class="ipc-title__text">', ItemList, 1);
        If curPos > 0 Then Begin
          curPos := PosFrom('>', ItemList, curPos) + 1;
          endPos := PosFrom('</span>', ItemList, curPos);
          Event := Copy(ItemList, curPos, endPos - curPos);
          Event := Trim(Event);


          // Remove the <span> tag
          Event := Copy(Event, Pos('>', Event) + 1, Length(Event));
          LogMessage('Parsed Event: ' + Event);
        End Else LogMessage('Error: Event title div not found.');


        // Parse each award item manually
        curPos := PosFrom('<li class="ipc-metadata-list-summary-item sc-15fc9ae6-1 gQbMPJ" data-testid="list-item"', ItemList, 1);
        FailSafe := 0;  // Initialize fail-safe counter


        While (curPos > 0) And (FailSafe < 10) Do Begin
          // Extract and log the award name
          curPos := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', ItemList, curPos);
          If curPos > 0 Then Begin
            curPos := PosFrom('>', ItemList, curPos) + 1;
            endPos := PosFrom('</span>', ItemList, curPos);
            Award := Copy(ItemList, curPos, endPos - curPos);
            LogMessage('Parsed Award: ' + Award);


            // Extract and log the category name
            curPos := PosFrom('<span class="ipc-metadata-list-summary-item__li awardCategoryName"', ItemList, curPos);
            If curPos > 0 Then Begin
              curPos := PosFrom('>', ItemList, curPos) + 1;
              endPos := PosFrom('</span>', ItemList, curPos);
              Category := Copy(ItemList, curPos, endPos - curPos);
              LogMessage('Parsed Category: ' + Category);


              // Extract and log the recipient name
              curPos := PosFrom('<a class="ipc-metadata-list-summary-item__li ipc-metadata-list-summary-item__li--link"', ItemList, curPos);
              If curPos > 0 Then Begin
                curPos := PosFrom('>', ItemList, curPos) + 1;
                endPos := PosFrom('</a>', ItemList, curPos);
                Recipient := Copy(ItemList, curPos, endPos - curPos);
                LogMessage('Parsed Recipient: ' + Recipient);


                // Extract and log the year
                curPos := PosFrom('<a class="ipc-metadata-list-summary-item__t"', ItemList, curPos);
                If curPos > 0 Then Begin
                  curPos := PosFrom('>', ItemList, curPos) + 1;
                  endPos := PosFrom(' ', ItemList, curPos);  // Find the space after the year
                  Year := Copy(ItemList, curPos, endPos - curPos);
                  Year := Trim(Year);
                  LogMessage('Parsed Year: ' + Year);
                End Else LogMessage('Error: Year not found.');


                // Determine if the award was won
                Won := Pos('Winner', ItemList) > 0;
                If Won Then
                  LogMessage('Parsed Won: True')
                Else
                  LogMessage('Parsed Won: False');


                // Construct the AValue string
                AValue := 'Event=' + Event + ', Award=' + Award + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won=' + CustomBoolToStr(Won);


                // Log the parameters before calling AddAward
                LogMessage('Before calling AddAward with parameters:');
                LogMessage('Event: ' + Event);
                LogMessage('Award: ' + Award);
                LogMessage('Category: ' + Category);
                LogMessage('Recipient: ' + Recipient);
                LogMessage('Year: ' + Year);
                LogMessage('Won: ' + CustomBoolToStr(Won));


                // Add the award to the database with error handling
                try
                  AddAward(Event, Award, Category, Recipient, Year, Won);
                  LogMessage('AddAward executed successfully.');
                except
                  Begin
                    LogMessage('Exception encountered in AddAward');
                    Result := prError;
                  End;
                end;


                // Populate the custom field with AValue
                AddCustomFieldValueByName('IMDb People Awards', AValue);
                LogMessage('IMDb People Awards added ' + AValue);


                // Log the action of adding the award
                If Won Then
                  LogMessage('Added Award to Database: Event=' + Event + ', Award=' + Award + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won: True')
                Else
                  LogMessage('Added Award to Database: Event=' + Event + ', Award=' + Award + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won: False');
              End Else LogMessage('Error: Recipient not found.');
            End Else LogMessage('Error: Category not found.');
          End Else LogMessage('Error: Award not found.');


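          // Move to the next item and advance the fail-safe counter,
          // so the loop stops after at most 10 passes
          FailSafe := FailSafe + 1;
          curPos := PosFrom('<li class="ipc-metadata-list-summary-item sc-15fc9ae6-1 gQbMPJ" data-testid="list-item"', ItemList, curPos + 1);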
        End;
      End Else LogMessage('Error: Invalid endPos or curPos for Awards section');
    End Else LogMessage('Error: Awards section not found');


  except
    Begin
      LogMessage('Exception encountered');
      Result := prError;
    End;
  end;


  // Result was already set above: prFinished, or prError if an exception occurred
  LogMessage('Function ParsePage_IMDBPeopleAWARDS END=====================||');
End;


//BlockClose
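
Two details of the loop above may be worth checking: `Won` is computed with Pos('Winner', ItemList), which searches the whole Awards block rather than the current item, and a guard counter only bounds a loop if it is advanced on every pass. The following is a minimal sketch in standalone Free Pascal (not PVD script code) of walking the items one by one under a hard cap. The ITEM_OPEN marker and the sample list are made up for illustration and are not IMDb's real markup, and PosEx comes from Free Pascal's StrUtils unit.

```pascal
program AwardItems;
{$mode objfpc}{$H+}
uses SysUtils, StrUtils;

const
  ITEM_OPEN = '<li class="item">';   // hypothetical marker, not IMDb's real markup
  MAX_ITEMS = 10;                    // hard cap so the loop always terminates

var
  List, Item: string;
  CurPos, NextPos, Count: Integer;
begin
  // Made-up sample data: one won award and one nomination.
  List := '<li class="item">Winner: Golden Ariel</li>' +
          '<li class="item">Nominee: Silver Ariel</li>';
  Count := 0;
  CurPos := Pos(ITEM_OPEN, List);
  while (CurPos > 0) and (Count < MAX_ITEMS) do
  begin
    // The current item ends where the next one starts, or at the end of the list.
    NextPos := PosEx(ITEM_OPEN, List, CurPos + 1);
    if NextPos = 0 then
      Item := Copy(List, CurPos, Length(List) - CurPos + 1)
    else
      Item := Copy(List, CurPos, NextPos - CurPos);
    // Look for 'Winner' inside this item only, not inside the whole list.
    WriteLn(Count + 1, ': won=', Pos('Winner', Item) > 0);
    Inc(Count);          // advance the guard on every pass
    CurPos := NextPos;   // becomes 0 after the last item, which ends the loop
  end;
end.
```

Cutting each item out first also keeps a search for the next field from running past the end of the current item into the following one.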
Title: Re: Imdb People script issues
Post by: Ivek23 on December 22, 2024, 10:05:26 am
Quote
But for some reason the value is not populated/displayed in PVD. So I thought I would create a custom memo field "IMDb People Awards" and populate the value there to check what it looks like, only to realize that no custom field is visible in PVD's People section.
Is it possible at all to add custom fields in the People section?
I manually put the value in the field (added to my dark people skin), but when I exit edit mode, it's not displayed. When I enter edit mode, it's there. If I restart PVD, the value disappears from the custom field.

For people, you can only use the comment box to see what is happening.


The code you have now does not complete the process for this call:

Quote
AddAward(EventName, AwardName, AwardCategory, AwardRecipient, EventYear, AwardWon);

Therefore, it cannot write the award data, so the awards are not visible in the awards field in the database.

Here is the awards code to help you.

Quote
Function ParsePage_IMDBPeopleAWARDS(HTML:String):Cardinal; //BlockOpen
    //Returns:
    //     Result:=prFinished; Script has finished gathering data
    //     Result:=prError; If ¿any big problem? with exit
    //Retrieve: AddAward(Event, Award, Category, Recipient, Year, Won)
  Var
    curPos,endPos,endPosAux,index,curPos0,curPos1,curPos2,curPos3,curPos4,endPos0,endPos1,endPos2:Integer;
    ItemList:String;
    ItemArray: TWideArray;
    MovieURL,MovieYear,EventBlock,EventName,EventYear,YearBlock,AwardBlock,AwardName,AwardCategory,AwardRecipient:String;
    AwardWon: Boolean;
  Begin
    LogMessage('Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||');
    Result:=prFinished;  //It will change to prError if any big problem with exit   
//Get award (several values save in PVD with AddAward(Event, Award, Category, Recipient, Year, Won)
        // Parameters: Example Al Pacino
        //Event (Academy Awards, USA): Name of the event 
        //Year (1993) = EventYear
        //Won (True,Winner/Nominee) set to true if the recipient won the award and to false otherwise       
        //Award (Oscar): Best award name
        //Category (Best Actor in a Leading Role): award category
        //Recipient (Scent of a Woman): for people records the variable should contain the title of a movie for which the person won the award
        //          for movie records this variable should contain the name of a specific person who won the award
        //Year (1973): release year of a movie (only applicable when adding award to a person record) -> NO: Use EventYear always, in movie and in people
        //Won (True,Winner/Nominee) set to true if the recipient won the award and to false otherwise
//Go to "Awards" There is 4 levels: 1) Event (name) 2) Year (not saved) 3) Award (with outcome-Winner and name) 4) Recipient (award_description and Movie(name and year)) 
    curPos:=Pos('<h1 class="header">Awards',HTML);                                     //Strings start which opens the block content data. WEB_SPECIFIC
    curPos:=PosFrom('</h1>',HTML,curPos);                                              //Strings end which opens the block content data.  WEB_SPECIFIC
    curPos:=curPos+Length('</h1>');                                                    //Strings end which opens the block content data.  WEB_SPECIFIC
    //Event Level
    curPos:=PosFrom('<table class="awards"',HTML,curPos);                              //String which opens/closes the Event close but not the name. Search directly '<h3>' is very inconsistent. WEB_SPECIFIC
    index:=1;
    While curPos>0 Do Begin
        If (index>EVENTS_LIMIT) Then break;     //Limited depassed (Remember index begin in 0).
        //Go back for get the EventName and EventYear (Get all "raw" list data for create good values separators)
        curPos:=PrevPos('<h3>',HTML,curPos);                                             //String which opens the EventName and EventYear list data. WEB_SPECIFIC
        endPos:=PosFrom('</h3>',HTML,curPos)+Length('</h3>');                            //Strings which opens/closes the data. WEB_SPECIFIC
        ItemList:=Copy(HTML,curPos,endPos-curPos);
        EventName:=RemoveTags(ItemList, False);
        //LogMessage('           Parse results ('+IntToStr(curPos)+','+IntToStr(endPos)+') complex ItemList:'+ItemList+'||');         
        //Get all "raw" Event data for create good values separators
        curPos:=PosFrom('<table class="awards"',HTML,endPos);                               //String which opens/closes the Event table data but not the name. WEB_SPECIFIC
        endPos:=PosFrom('</table>',HTML,curPos);
        //Strings which opens/closes the data. WEB_SPECIFIC
        EventBlock:=Copy(HTML,curPos,endPos-curPos);
        //LogMessage('           Parse results ('+IntToStr(curPos)+','+IntToStr(endPos)+') complex EventBlock:'+EventBlock+'||');
        //Year Level
        curPos0:=Pos('<td class="award_year"',EventBlock);                                  //String which opens the AwardYear list data. WEB_SPECIFIC
        While curPos0>0 Do Begin
            //Get EventYear
            endPos0:=PosFrom('</td>',EventBlock,curPos0)+Length('</td>');                   //Strings which opens/closes the data. WEB_SPECIFIC
            ItemList:=Copy(EventBlock,curPos0,endPos0-curPos0);
            EventYear:=Trim(RemoveTags(ItemList, False));
            //Get all "raw" Year data for create good values separators
            endPosAux:=PosFrom('<td class="award_year"',EventBlock,endPos0);                 //Strings which opens/closes the next block data. WEB_SPECIFIC
            If (endPosAux=0) Then endPosAux:=Length(EventBlock);                           //If no more blocks, set endPosAux at the last character.
            YearBlock:=Copy(EventBlock,curPos0,endPosAux-curPos0);
            //LogMessage('           Parse results ('+IntToStr(curPos0)+','+IntToStr(endPosAux)+') complex YearBlock:'+YearBlock+'||');
            //Award Level     
            curPos1:=Pos('<td class="award_outcome"',YearBlock);                         //String which opens the AwardName and Won list data. WEB_SPECIFIC
            While curPos1>0 Do Begin
                //Get AwardWon and AwardName
                endPos1:=PosFrom('</td>',YearBlock,curPos1)+Length('</td>');                   //Strings which opens/closes the data. WEB_SPECIFIC
                ItemList:=Copy(YearBlock,curPos1,endPos1-curPos1);
                ItemList:=StringReplace(ItemList,'category','>;<',True,True,False);              //WEB_SPECIFIC
                ItemList:=RemoveTags(ItemList, False);
                //LogMessage('           Parse results ('+IntToStr(curPos1)+','+IntToStr(endPos1)+') complex ItemList:'+ItemList+'||');
                ExplodeString(ItemList,ItemArray,';');
                AwardWon:= False;                                                           //Normaly in 'Nominee' case. WEB_SPECIFIC
                If Pos('Winner',ItemArray[0])>0 Then AwardWon:= True;                          //WEB_SPECIFIC
                AwardName:=ItemArray[1];
                //Get all "raw" Award data for create good values separators
                endPosAux:=PosFrom('<td class="award_outcome"',YearBlock,endPos1);         //Strings which opens/closes the next block data. WEB_SPECIFIC
                If (endPosAux=0) Then endPosAux:=Length(YearBlock);                           //If no more blocks, set endPosAux at the last character.
                AwardBlock:=Copy(YearBlock,curPos1,endPosAux-curPos1);
                //LogMessage('           Parse results ('+IntToStr(curPos1)+','+IntToStr(endPosAux)+') complex AwardBlock:'+AwardBlock+'||');
                //Recipient Level     
                curPos2:=Pos('<td class="award_description">',AwardBlock);                       //String which opens the AwardCategory and AwardRecipient list data. WEB_SPECIFIC
                While curPos2>0 Do Begin
                    //Get all "raw" list data for create good values separators (not use TextBetWeen)
                    endPos2:=PosFrom('</td>',AwardBlock,curPos2)+Length('</td>');                 //Strings which opens/closes the data. WEB_SPECIFIC
                    ItemList:=Copy(AwardBlock,curPos2,endPos2-curPos2);
                    //LogMessage('           Parse results ('+IntToStr(curPos2)+','+IntToStr(curPos2)+') complex ItemList:'+ItemList+'||');
                    //The Receipt awards ItemList may have:  1) empty description or not have name (not interesting) and break ItemArray[]. 2) Several titles with year 3) Detail o full Notes
                    //So is better search sequentily by token in a block than with ItemArray
                    endPosAux:=PosFrom(#13,ItemList,2);                                        //Strings which opens/closes the data. WEB_SPECIFIC
                    curPos3:=PosFrom('title',ItemList,2);                                       //Strings which opens/closes the data. WEB_SPECIFIC
                    If (endPosAux<curPos3) Or (curPos3=0) Then Begin                            //There is Awardcategory because #13 is befor name or there isn't name. WEB_SPECIFIC
                        curPos4:=1;
                        AwardCategory:=TextBetWeen(ItemList,'<td class="award_description">',#13,false,curPos4);   //Strings which opens/closes the data. WEB_SPECIFIC
                        LogMessage('     Parse Results in AwardCategory:'+AwardCategory+'||');
                        curPos4:=Pos('Shared with:',AwardCategory);                          //WEB_SPECIFIC.
                        If 0<curPos4 then AwardCategory:=Copy(AwardCategory,0,curPos4-1);
                        LogMessage('     Parse Results in AwardCategory0:'+AwardCategory+'||');
                    End Else Begin
                        AwardCategory:='';
                    End;
                    If curPos3=0 Then Begin //Award without Recipient
                        AddAward(EventName, AwardName, AwardCategory, '', EventYear, AwardWon);
                        LogMessage('      Get results Awards:#'+IntToStr(index)+'|'+EventName+'|'+AwardName+'|'+AwardCategory+'|'+''+'|'+EventYear+'|'); //+BoolToStr(AwardWon)+'||');
                    End;
                    While curPos3>0 Do Begin
                        MovieURL:='http://www.imdb.com/title'+TextBetWeen(ItemList,'<a href="/title','?ref_=nmawd_awd_',true,curPos4)+'/';                                      //Strings which opens/closes the data. WEB_SPECIFIC
                        LogMessage('  **  Parse Results in MovieURL: '+MovieURL);
                        AwardRecipient:=TextBetWeen(ItemList,'>','<',false,curPos3);              //Strings which opens/closes the data. WEB_SPECIFIC
                        LogMessage('      Parse Results in AwardRecipient:'+AwardRecipient+'||');
                        MovieYear:=TextBetWeen(ItemList,'(',')',false,curPos3);                  //Strings which opens/closes the data. WEB_SPECIFIC
                        LogMessage('  **  Parse Results in MovieYear:'+MovieYear);
                        AddAward(EventName, AwardName, AwardCategory, AwardRecipient, EventYear, AwardWon);
                        LogMessage('      Get results Awards:#'+IntToStr(index)+'|'+EventName+'|'+AwardName+'|'+AwardCategory+'|'+AwardRecipient+'|'+EventYear+'|'); //+BoolToStr(AwardWon)+'||');                   
                        endPosAux:=PosFrom('truncated-note',ItemList,curPos3);                   //Strings which opens/closes the data. WEB_SPECIFIC
                        curPos3:=PosFrom('title',ItemList,curPos3);                             //Strings which opens/closes the data. WEB_SPECIFIC
                        If curPos3>endPosAux Then curPos3:=0                                   //Avoid Names in notes. WEB_SPECIFIC                                                                                                                                     
                    End;
                    curPos2:=PosFrom('<td class="award_description">',AwardBlock,endPos2);        //String which opens the AwardCategory and AwardRecipient list data. WEB_SPECIFIC
                End;
                curPos1:=PosFrom('<td class="award_outcome"',YearBlock,endPos1);               //String which opens the AwardName and Won list data. WEB_SPECIFIC
            End;
            curPos0:=PosFrom('<td class="award_year"',EventBlock,endPos0);                      //String which opens the AwardYearlist data. WEB_SPECIFIC
        End;
        curPos:=PosFrom('<table class="awards"',HTML,endPos);                               //String which detects the Event. Searching directly for '<h3>' is very inconsistent. WEB_SPECIFIC
        index:=index+1;
    End;
    LogMessage('Function ParsePage_IMDBMovieAWARDS END=====================||');
  End; //BlockClose
Title: Re: Imdb People script issues
Post by: afrocuban on December 22, 2024, 06:26:56 pm

Thanks Ivek!


I never knew about custom fields.


I also didn't know how the database would know that there are more awards on the page, so that it does not populate what has already been offered to it:

Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True
The log clearly says that the award is added to the database, and the entry contains all the parameters defined by the procedure.


But I'll try to parse all awards and events to check if it works then...


Regarding the function you sent: I actually started from it, but with no luck. I couldn't revise it in a meaningful way to get the desired result (mainly because of TextBetWeen), so I had to start from scratch and parse the HTML content manually.
Title: Re: Imdb People script issues
Post by: Ivek23 on December 22, 2024, 09:36:31 pm
Quote
(22.12.2024 21:12:36) Compiling script: IMDB_People_[EN][HTTPS].psf
(22.12.2024 21:12:36) Script compiled successfully: IMDB_People_[EN][HTTPS].psf
[Hint] (492:7): Variable 'CURPOS' never used
[Hint] (493:7): Variable 'ITEMVALUE' never used
[Hint] (493:7): Variable 'IMAGEFILE' never used
[Hint] (494:7): Variable 'NAME' never used
[Hint] (494:7): Variable 'PREVIEWURL' never used
[Hint] (582:5): Variable 'CURPOS' never used
[Hint] (582:5): Variable 'ENDPOS' never used
[Hint] (582:5): Variable 'DEBUG_POS1' never used
[Hint] (582:5): Variable 'INDEX' never used
[Hint] (583:5): Variable 'PHOTOURL' never used
[Hint] (583:5): Variable 'ITEMVALUE' never used
[Hint] (583:5): Variable 'ITEMLIST' never used
[Hint] (583:5): Variable 'IMAGEFILE' never used
[Hint] (584:2): Variable 'PERSONID' never used
[Hint] (584:2): Variable 'ITEMVALUE0' never used
[Hint] (584:2): Variable 'ITEMVALUE1' never used
[Hint] (584:2): Variable 'ITEMVALUE2' never used
[Hint] (584:2): Variable 'ITEMVALUE3' never used
[Hint] (585:2): Variable 'JOBTITLE' never used
[Hint] (585:2): Variable 'ALTNAMES' never used
[Hint] (585:2): Variable 'ALTNAMES1' never used
[Hint] (585:2): Variable 'DEATHAGE' never used
[Hint] (586:2): Variable 'ITEMLIST0' never used
[Hint] (586:2): Variable 'ITEMLIST1' never used
[Hint] (586:2): Variable 'ITEMLIST2' never used
[Hint] (586:2): Variable 'ITEMLIST4' never used
[Hint] (587:2): Variable 'TITLE' never used
[Hint] (587:2): Variable 'ROLE' never used
[Hint] (587:2): Variable 'YEAR' never used
[Hint] (587:2): Variable 'MOVIEURL' never used
[Warning] (852:57): "True and" is not needed
[Warning] (852:29): "True and" is not needed
(22.12.2024 21:12:36) Executing script binary
(22.12.2024 21:12:36) Script loaded: IMDB_People_[EN][HTTPS].psf 1.4.3.5
(22.12.2024 21:12:37) Loading database: D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\MOVIES33B1B22.pvd
(22.12.2024 21:12:38) Person -> LoadStatic -> 0ms
(22.12.2024 21:12:38) Person -> LoadMultivalues -> 0ms
(22.12.2024 21:12:38) Person -> LoadFilms -> 0ms
(22.12.2024 21:12:38) Person -> LoadAwards -> 0ms
(22.12.2024 21:12:38) Person -> LoadImages -> 0ms
(22.12.2024 21:12:40) Compiling script: IMDB_People_[EN][HTTPS].psf
(22.12.2024 21:12:40) Script compiled successfully: IMDB_People_[EN][HTTPS].psf
[Hint] (492:7): Variable 'CURPOS' never used
[Hint] (493:7): Variable 'ITEMVALUE' never used
[Hint] (493:7): Variable 'IMAGEFILE' never used
[Hint] (494:7): Variable 'NAME' never used
[Hint] (494:7): Variable 'PREVIEWURL' never used
[Hint] (582:5): Variable 'CURPOS' never used
[Hint] (582:5): Variable 'ENDPOS' never used
[Hint] (582:5): Variable 'DEBUG_POS1' never used
[Hint] (582:5): Variable 'INDEX' never used
[Hint] (583:5): Variable 'PHOTOURL' never used
[Hint] (583:5): Variable 'ITEMVALUE' never used
[Hint] (583:5): Variable 'ITEMLIST' never used
[Hint] (583:5): Variable 'IMAGEFILE' never used
[Hint] (584:2): Variable 'PERSONID' never used
[Hint] (584:2): Variable 'ITEMVALUE0' never used
[Hint] (584:2): Variable 'ITEMVALUE1' never used
[Hint] (584:2): Variable 'ITEMVALUE2' never used
[Hint] (584:2): Variable 'ITEMVALUE3' never used
[Hint] (585:2): Variable 'JOBTITLE' never used
[Hint] (585:2): Variable 'ALTNAMES' never used
[Hint] (585:2): Variable 'ALTNAMES1' never used
[Hint] (585:2): Variable 'DEATHAGE' never used
[Hint] (586:2): Variable 'ITEMLIST0' never used
[Hint] (586:2): Variable 'ITEMLIST1' never used
[Hint] (586:2): Variable 'ITEMLIST2' never used
[Hint] (586:2): Variable 'ITEMLIST4' never used
[Hint] (587:2): Variable 'TITLE' never used
[Hint] (587:2): Variable 'ROLE' never used
[Hint] (587:2): Variable 'YEAR' never used
[Hint] (587:2): Variable 'MOVIEURL' never used
[Warning] (852:57): "True and" is not needed
[Warning] (852:29): "True and" is not needed
(22.12.2024 21:12:40) Executing script binary
(22.12.2024 21:12:40) Logging in to...
(22.12.2024 21:12:40) Person -> LoadStatic -> 0ms
(22.12.2024 21:12:40) Person -> LoadMultivalues -> 0ms
(22.12.2024 21:12:40) Person -> LoadFilms -> 0ms
(22.12.2024 21:12:40) Person -> LoadAwards -> 0ms
(22.12.2024 21:12:40) Person -> LoadImages -> 0ms
(22.12.2024 21:12:40) Function GetDownloadURL BEGIN======================|
(22.12.2024 21:12:40) Global Var-Mode|0|
(22.12.2024 21:12:40) Global Var-DownloadURL||
(22.12.2024 21:12:40) Person -> LoadStatic -> 0ms
(22.12.2024 21:12:40) Person -> LoadMultivalues -> 0ms
(22.12.2024 21:12:40) Person -> LoadFilms -> 0ms
(22.12.2024 21:12:40) Person -> LoadAwards -> 0ms
(22.12.2024 21:12:40) Person -> LoadImages -> 0ms
(22.12.2024 21:12:40) Stored URL is:http://www.imdb.com/name/nm0190859/awards/||
(22.12.2024 21:12:40) * Stored URL is:http://www.imdb.com/name/nm0190859/awards//||
(22.12.2024 21:12:40)       IMDB URL.
(22.12.2024 21:12:40)       Parse stored information DownloadURL:https://www.imdb.com/name/nm0190859/||
(22.12.2024 21:12:40) Function GetDownloadURL END====================== with Mode=1 Result=D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\portable.bat|
(22.12.2024 21:12:40) Searching people information for: Alfonso Cuarón
(22.12.2024 21:12:40) Function ParsePage BEGIN======================|
(22.12.2024 21:12:40) Global Var-Mode|1|
(22.12.2024 21:12:40) Global Var-DownloadURL|https://www.imdb.com/name/nm0190859/|
(22.12.2024 21:12:40) Local Var-URL||
(22.12.2024 21:12:40)   ParsePage mode smNormal|1|. Getting provider data for PersonID|nm0190859|
(22.12.2024 21:12:40)       Function DownloadPage BEGIN======================|
(22.12.2024 21:12:40)       Global Var-DownloadURL|https://www.imdb.com/name/nm0190859/|
(22.12.2024 21:12:40)             Local Var-URL|https://www.imdb.com/name/nm0190859/|
(22.12.2024 21:12:40)             Waiting 1s for delete:D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:41)             Download with PVdBDownPage in file:|D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm the information of:|https://www.imdb.com/name/nm0190859/||
(22.12.2024 21:12:41)             Waiting 2s for exists of:D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:43)             Waiting 2s for exists of:D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:45)             Now present complete page file: D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:45)       Function DownloadPage END======================|
(22.12.2024 21:12:45) Function ParsePage_IMDBPersonBASE BEGIN======================|
(22.12.2024 21:12:45) Function ParsePage_IMDBPersonBASE END=====================||
(22.12.2024 21:12:45)       Function DownloadPage BEGIN======================|
(22.12.2024 21:12:45)       Global Var-DownloadURL|https://www.imdb.com/name/nm0190859/awards/|
(22.12.2024 21:12:45)             Local Var-URL|https://www.imdb.com/name/nm0190859/awards/|
(22.12.2024 21:12:46)             Waiting 1s for delete:D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:47)             Download with PVdBDownPage in file:|D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm the information of:|https://www.imdb.com/name/nm0190859/awards/||
(22.12.2024 21:12:47)             Waiting 2s for exists of:D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:49)             Waiting 2s for exists of:D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:51)             Waiting 2s for exists of:D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:53)             Now present complete page file: D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:53)       Function DownloadPage END======================|
(22.12.2024 21:12:53) Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||
(22.12.2024 21:12:53) Initial HTML snippet: <!DOCTYPE html><html lang="en-US" xmlns:og="http://opengraphprotocol.org/schema/" xmlns:fb="http://www.facebook.com/2008/fbml"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width"/><script>if(typeof uet === 'function'){ uet('bb', 'LoadTitle', {wb: 1}); }</script><script>window.addEventListener('load', (event) => {
        if (typeof window.csa !== 'undefined' && typeof window.csa === 'function') {
            var csaLatencyPlugin = window.csa('Content', {
             
(22.12.2024 21:12:53) Parsed Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Parsed Award:  Golden Ariel
(22.12.2024 21:12:53) Parsed Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Parsed Year: 2019
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award:  Golden Ariel
(22.12.2024 21:12:53) Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award:  Silver Ariel
(22.12.2024 21:12:53) Parsed Category: Best Editing (Mejor Edición)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Parsed Year: 2019
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award:  Silver Ariel
(22.12.2024 21:12:53) Category: Best Editing (Mejor Edición)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Editing (Mejor Edición), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Editing (Mejor Edición), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award:  Silver Ariel
(22.12.2024 21:12:53) Parsed Category: Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Error: Year not found.
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award:  Silver Ariel
(22.12.2024 21:12:53) Category: Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award:  Golden Ariel
(22.12.2024 21:12:53) Parsed Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Parsed Year: 2019
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award:  Golden Ariel
(22.12.2024 21:12:53) Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award:  Silver Ariel
(22.12.2024 21:12:53) Parsed Category: Best Editing (Mejor Edición)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Parsed Year: 2019
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award:  Silver Ariel
(22.12.2024 21:12:53) Category: Best Editing (Mejor Edición)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Editing (Mejor Edición), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Editing (Mejor Edición), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award:  Silver Ariel
(22.12.2024 21:12:53) Parsed Category: Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Error: Year not found.
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award:  Silver Ariel
(22.12.2024 21:12:53) Category: Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award:  Golden Ariel
(22.12.2024 21:12:53) Parsed Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Parsed Year: 2019
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award:  Golden Ariel
(22.12.2024 21:12:53) Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award:  Silver Ariel
(22.12.2024 21:12:53) Parsed Category: Best Editing (Mejor Edición)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Parsed Year: 2019
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award:  Silver Ariel
(22.12.2024 21:12:53) Category: Best Editing (Mejor Edición)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Editing (Mejor Edición), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Editing (Mejor Edición), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award:  Silver Ariel
(22.12.2024 21:12:53) Parsed Category: Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Error: Year not found.
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award:  Silver Ariel
(22.12.2024 21:12:53) Category: Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award:  Golden Ariel
(22.12.2024 21:12:53) Parsed Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Parsed Year: 2019
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award:  Golden Ariel
(22.12.2024 21:12:53) Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True

Above is part of the log output. It shows that Function ParsePage_IMDBPeopleAWARDS never closes: the log has a BEGIN line for it but no END line, and the same awards are parsed and added again and again. I suspected this earlier: the part of the code that should end Function ParsePage_IMDBPeopleAWARDS is missing.
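
A minimal sketch of one way a scanning loop like this can be made to terminate, written in Python rather than PascalScript so it can be run outside PVD. The function name `iter_blocks` and the `<li>` marker are hypothetical and not taken from the script. The only point is that the search position has to move past each match, and the loop has to stop once the marker is no longer found.

```python
def iter_blocks(html: str, marker: str):
    """Yield the text of each block that starts with `marker`.

    `marker` is a hypothetical block-opening string; the real page
    markup may differ.
    """
    if not marker:
        raise ValueError("marker must not be empty")
    pos = html.find(marker)
    while pos != -1:
        # Search strictly after the current match, so the cursor always
        # advances and the same block is never visited twice.
        next_pos = html.find(marker, pos + len(marker))
        end = next_pos if next_pos != -1 else len(html)
        yield html[pos:end]
        pos = next_pos  # -1 ends the loop when no further marker exists


# Made-up sample data, only to exercise the loop.
sample = "<li>Golden Ariel</li><li>Silver Ariel</li><li>Oscar</li>"
blocks = list(iter_blocks(sample, "<li>"))
```

In the PascalScript version the same idea would mean that every PosFrom call starts from a position beyond the previous match, and that the While condition becomes false when PosFrom returns 0.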
Title: Re: Imdb People script issues
Post by: Ivek23 on December 23, 2024, 01:29:13 pm
Here is the IMDB_People_[EN][HTTPS]_Awards script. It now correctly transfers the awards data to the awards field for the person 'Chico' Hernandez from the URL below, using a Python Selenium script.

https://www.imdb.com/name/nm0379491/awards/

I corrected some parts of your code and added others, and it works now.

The Python Selenium script instructions and code will probably be published by the new year in the Integrating Selenium to PVD topic.

http://www.videodb.info/forum_en/index.php/topic,4357.0.html
Title: Re: Imdb People script issues
Post by: afrocuban on December 23, 2024, 09:05:40 pm

Above is a part of the log output where it is visible that the Function ParsePage_IMDBPeopleAWARDS does not close. I had this in mind before, that the part of the code that would end the Function ParsePage_IMDBPeopleAWARDS is missing.

Wow! Strange things happen! Now I realize what you meant. It never occurred to me because it didn't loop in my case, and that's why I didn't understand!

Here is the IMDB_People_[EN][HTTPS]_Awards script, which now correctly transfers Awards data to the awards field for the 'Chico' Hernandez person from the url added below using a Python Selenium script

https://www.imdb.com/name/nm0379491/awards/

I have corrected or added some parts of the code to your code and it works.

Python Selenium script instructions and code will be published probably by the new year in the Integrating Selenium to PVD topic.

http://www.videodb.info/forum_en/index.php/topic,4357.0.html

Thanks! It's so great that you are willing to look at the code I provide AND HELP! I'm still testing it. It looks like it properly parses the awards inside events, but it always takes the name of the first event. In your case the person had only one award, but in my case there are many. The first event is "Ariel Awards, Mexico", and the log shows that Oscars, ALMA Awards, and others (further down, in the part I didn't post) are all added to the "Ariel Awards, Mexico" event:

Quote
(12/23/2024 8:34:38 PM) Parsed Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Parsed Award:  Golden Ariel
(12/23/2024 8:34:38 PM) Parsed Category: Best Picture (Mejor Película)
(12/23/2024 8:34:38 PM) Parsed Recipient: Roma
(12/23/2024 8:34:38 PM) Parsed Year: 2019
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award:  Golden Ariel
(12/23/2024 8:34:38 PM) Category: Best Picture (Mejor Película)
(12/23/2024 8:34:38 PM) Recipient: Roma
(12/23/2024 8:34:38 PM) Year: 2019
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True
(12/23/2024 8:34:38 PM) Parsed Award:  Silver Ariel
(12/23/2024 8:34:38 PM) Parsed Category: Best Original Story (Mejor Argumento Original)
(12/23/2024 8:34:38 PM) Parsed Recipient: Love in the Time of Hysteria
(12/23/2024 8:34:38 PM) Parsed Year: 1992
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award:  Silver Ariel
(12/23/2024 8:34:38 PM) Category: Best Original Story (Mejor Argumento Original)
(12/23/2024 8:34:38 PM) Recipient: Love in the Time of Hysteria
(12/23/2024 8:34:38 PM) Year: 1992
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Original Story (Mejor Argumento Original), Recipient=Love in the Time of Hysteria, Year=1992, Won: True
(12/23/2024 8:34:38 PM) Parsed Award:  Oscar
(12/23/2024 8:34:38 PM) Parsed Category: Best Achievement in Cinematography
(12/23/2024 8:34:38 PM) Parsed Recipient: Roma
(12/23/2024 8:34:38 PM) Parsed Year: 2019
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award:  Oscar
(12/23/2024 8:34:38 PM) Category: Best Achievement in Cinematography
(12/23/2024 8:34:38 PM) Recipient: Roma
(12/23/2024 8:34:38 PM) Year: 2019
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Oscar, Category=Best Achievement in Cinematography, Recipient=Roma, Year=2019, Won: True
(12/23/2024 8:34:38 PM) Parsed Award:  Oscar
(12/23/2024 8:34:38 PM) Parsed Category: Best Achievement in Film Editing
(12/23/2024 8:34:38 PM) Parsed Recipient: Gravity
(12/23/2024 8:34:38 PM) Parsed Year: 2007
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award:  Oscar
(12/23/2024 8:34:38 PM) Category: Best Achievement in Film Editing
(12/23/2024 8:34:38 PM) Recipient: Gravity
(12/23/2024 8:34:38 PM) Year: 2007
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Oscar, Category=Best Achievement in Film Editing, Recipient=Gravity, Year=2007, Won: True
(12/23/2024 8:34:38 PM) Parsed Award:  Saturn Award
(12/23/2024 8:34:38 PM) Parsed Category: Best Writing
(12/23/2024 8:34:38 PM) Parsed Recipient: Gravity
(12/23/2024 8:34:38 PM) Parsed Year: 2014
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award:  Saturn Award
(12/23/2024 8:34:38 PM) Category: Best Writing
(12/23/2024 8:34:38 PM) Recipient: Gravity
(12/23/2024 8:34:38 PM) Year: 2014
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Saturn Award, Category=Best Writing, Recipient=Gravity, Year=2014, Won: True
(12/23/2024 8:34:38 PM) Parsed Award:  ALMA Award
(12/23/2024 8:34:38 PM) Parsed Category: Outstanding Screenplay - Motion Picture
(12/23/2024 8:34:38 PM) Parsed Recipient: Children of Men
(12/23/2024 8:34:38 PM) Parsed Year: 1999
(12/23/2024 8:34:38 PM) Parsed Won: True
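
A log like the one above is what one would expect if the event name is read once, before the loop, and never refreshed. The following is a minimal sketch, in Python for easy testing, of reading the event name again inside every event block. The tag names (`<section>`, `<h3>`, `<li>`), the function name `parse_awards`, and the sample data are placeholders and assumptions, not IMDb's actual markup or the script's real code.

```python
import re


def parse_awards(html: str):
    """Return (event, award) pairs, taking the event name from each block."""
    results = []
    # Hypothetical layout: one <section> per event, an <h3> holding the
    # event name, and one <li> per award inside that section.
    for section in re.findall(r"<section>(.*?)</section>", html, flags=re.S):
        heading = re.search(r"<h3>(.*?)</h3>", section, flags=re.S)
        # Reset the event name for every block instead of reusing the first.
        event = heading.group(1).strip() if heading else ""
        for award in re.findall(r"<li>(.*?)</li>", section, flags=re.S):
            results.append((event, award.strip()))
    return results


# Made-up sample with two events, only to show the per-block reset.
sample = (
    "<section><h3>Ariel Awards, Mexico</h3>"
    "<li>Golden Ariel</li><li>Silver Ariel</li></section>"
    "<section><h3>Some Other Event</h3><li>Oscar</li></section>"
)
pairs = parse_awards(sample)
```

In the .psf script the equivalent change would be to assign the event name at the top of the outer loop, from the current event block, rather than once before the loop starts.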


Fortunately, or unfortunately, I'm testing with Alfonso Cuarón, https://www.imdb.com/name/nm0190859/, who has 152 events and several hundred awards, so it should cover all the cases to be tested.


ONE MORE IMPORTANT THING TO NOTE:
For some reason, PVD and the script won't work (at least for me) if I manually set the page to be parsed by Function ParsePage_IMDBPeopleAWARDS, for example like this:

Quote
// Parse Awards provider page = BASE_URL_AWARD_PERSON
If GET_FULL_AWARDS Then Begin
  LogMessage('Starting to parse awards page.');
  HTML := ('Tmp\UTF8_NO_BOM-Awards.mhtml');
  LogMessage('Read awards page from file: ' + Copy(HTML, 1, 500)); // Log the file content
(I can't remember if this is proper syntax, but I set it properly at the time of the testing, whatever it was)

It wouldn't work without downloading, so I had to fake the download with a completely new function:


Quote
Function DownloadPage1(URL:AnsiString; FileName:AnsiString):String;
Var
  ScriptPath, WebText: String;
Begin
  LogMessage(Chr(9)+Chr(9)+'Function DownloadPage1 BEGIN======================|');
  LogMessage(Chr(9)+Chr(9)+'Global Var-DownloadURL|'+URL+'|');
  LogMessage(Chr(9)+Chr(9)+'      Local Var-URL|'+URL+'|');
  ScriptPath := GetAppPath + 'Scripts\';


  // Directly read the existing file instead of downloading
  If FileExists(ScriptPath + FileName) Then Begin
    LogMessage(Chr(9)+Chr(9)+'      File already exists: '+ScriptPath + FileName);
    WebText := FileToString(ScriptPath + FileName);
    WebText := ConvertEncoding(WebText, 65001);  // Convert to UTF-8
    Result := WebText;
    LogMessage(Chr(9)+Chr(9)+'      Read file content successfully.');
  End Else Begin
    LogMessage(Chr(9)+Chr(9)+'      File does not exist: '+ScriptPath + FileName);
    Result := '';
  End;


  LogMessage(Chr(9)+Chr(9)+'Function DownloadPage1 END======================|');
End;




and then "call the download":

Quote

// Parse Awards provider page = BASE_URL_AWARD_PERSON
If GET_FULL_AWARDS Then Begin
  LogMessage('Starting to parse awards page.');
  HTML := DownloadPage1(DownloadURL, 'Tmp\UTF8_NO_BOM-Awards.mhtml');
  LogMessage('Read awards page from file: ' + Copy(HTML, 1, 500)); // Log the file content
 


When I reach the phase of PASSING THE URL TO SELENIUM TO DOWNLOAD THE PAGE, I'm still not sure how it will work in the .psf: will I have to fake the download after Selenium passes the page back, or something else? For someone who doesn't know how to code, this is too much to comprehend without actual trials.
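
One possible arrangement, sketched here in Python and only as an assumption about how the hand-off could work: the Selenium side saves the page to an agreed file, and the reading side polls until that file exists, much like the "Waiting 2s for exists of" lines in the log, and gives up after a timeout. The function name `read_when_ready`, the file names, and the timeout values are hypothetical.

```python
import os
import tempfile
import time


def read_when_ready(path, timeout=10.0, interval=0.5):
    """Return the file's text once it exists and is non-empty, else ''."""
    deadline = time.monotonic() + timeout
    while True:
        # A non-empty file is taken as "ready" here; this does not prove
        # that the writer has finished, it is only a simple heuristic.
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            with open(path, encoding="utf-8") as handle:
                return handle.read()
        if time.monotonic() >= deadline:
            return ""  # give up instead of waiting forever
        time.sleep(interval)


# Usage with a throwaway file standing in for the page saved by Selenium.
with tempfile.TemporaryDirectory() as tmp:
    saved = os.path.join(tmp, "awards.htm")
    with open(saved, "w", encoding="utf-8") as handle:
        handle.write("<html>awards</html>")
    page = read_when_ready(saved, timeout=1.0, interval=0.1)
    missing = read_when_ready(
        os.path.join(tmp, "absent.htm"), timeout=0.2, interval=0.05
    )
```

The timeout matters: without it, a page that Selenium never delivers would leave the script waiting indefinitely.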
Title: Re: Imdb People script issues
Post by: afrocuban on December 23, 2024, 10:01:42 pm
Your code also loops:


Quote
   Line   1372: (12/23/2024 9:53:57 PM) Parsed Award:  Dorian Award
   Line   1379: (12/23/2024 9:53:57 PM) Award:  Dorian Award
   Line   1385: (12/23/2024 9:53:57 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line   2646: (12/23/2024 9:53:59 PM) Parsed Award:  Dorian Award
   Line   2653: (12/23/2024 9:53:59 PM) Award:  Dorian Award
   Line   2659: (12/23/2024 9:53:59 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line   3920: (12/23/2024 9:54:01 PM) Parsed Award:  Dorian Award
   Line   3927: (12/23/2024 9:54:01 PM) Award:  Dorian Award
   Line   3933: (12/23/2024 9:54:01 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line   5194: (12/23/2024 9:54:02 PM) Parsed Award:  Dorian Award
   Line   5201: (12/23/2024 9:54:02 PM) Award:  Dorian Award
   Line   5207: (12/23/2024 9:54:02 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line   6468: (12/23/2024 9:54:04 PM) Parsed Award:  Dorian Award
   Line   6475: (12/23/2024 9:54:04 PM) Award:  Dorian Award
   Line   6481: (12/23/2024 9:54:04 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line   7742: (12/23/2024 9:54:06 PM) Parsed Award:  Dorian Award
   Line   7749: (12/23/2024 9:54:06 PM) Award:  Dorian Award
   Line   7755: (12/23/2024 9:54:06 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line   9016: (12/23/2024 9:54:08 PM) Parsed Award:  Dorian Award
   Line   9023: (12/23/2024 9:54:08 PM) Award:  Dorian Award
   Line   9029: (12/23/2024 9:54:08 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  10290: (12/23/2024 9:54:09 PM) Parsed Award:  Dorian Award
   Line  10297: (12/23/2024 9:54:09 PM) Award:  Dorian Award
   Line  10303: (12/23/2024 9:54:09 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  11564: (12/23/2024 9:54:11 PM) Parsed Award:  Dorian Award
   Line  11571: (12/23/2024 9:54:11 PM) Award:  Dorian Award
   Line  11577: (12/23/2024 9:54:11 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  12838: (12/23/2024 9:54:13 PM) Parsed Award:  Dorian Award
   Line  12845: (12/23/2024 9:54:13 PM) Award:  Dorian Award
   Line  12851: (12/23/2024 9:54:13 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  14112: (12/23/2024 9:54:15 PM) Parsed Award:  Dorian Award
   Line  14119: (12/23/2024 9:54:15 PM) Award:  Dorian Award
   Line  14125: (12/23/2024 9:54:15 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  15386: (12/23/2024 9:54:17 PM) Parsed Award:  Dorian Award
   Line  15393: (12/23/2024 9:54:17 PM) Award:  Dorian Award
   Line  15399: (12/23/2024 9:54:17 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  16660: (12/23/2024 9:54:18 PM) Parsed Award:  Dorian Award
   Line  16667: (12/23/2024 9:54:18 PM) Award:  Dorian Award
   Line  16673: (12/23/2024 9:54:18 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  17934: (12/23/2024 9:54:20 PM) Parsed Award:  Dorian Award
   Line  17941: (12/23/2024 9:54:20 PM) Award:  Dorian Award
   Line  17947: (12/23/2024 9:54:20 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  19208: (12/23/2024 9:54:21 PM) Parsed Award:  Dorian Award
   Line  19215: (12/23/2024 9:54:21 PM) Award:  Dorian Award
   Line  19221: (12/23/2024 9:54:21 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  20482: (12/23/2024 9:54:23 PM) Parsed Award:  Dorian Award
   Line  20489: (12/23/2024 9:54:23 PM) Award:  Dorian Award
   Line  20495: (12/23/2024 9:54:23 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  21756: (12/23/2024 9:54:25 PM) Parsed Award:  Dorian Award
   Line  21763: (12/23/2024 9:54:25 PM) Award:  Dorian Award
   Line  21769: (12/23/2024 9:54:25 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  23030: (12/23/2024 9:54:26 PM) Parsed Award:  Dorian Award
   Line  23037: (12/23/2024 9:54:27 PM) Award:  Dorian Award
   Line  23043: (12/23/2024 9:54:27 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  24304: (12/23/2024 9:54:28 PM) Parsed Award:  Dorian Award
   Line  24311: (12/23/2024 9:54:28 PM) Award:  Dorian Award
   Line  24317: (12/23/2024 9:54:28 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  25578: (12/23/2024 9:54:30 PM) Parsed Award:  Dorian Award
   Line  25585: (12/23/2024 9:54:30 PM) Award:  Dorian Award
   Line  25591: (12/23/2024 9:54:30 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  26852: (12/23/2024 9:54:32 PM) Parsed Award:  Dorian Award
   Line  26859: (12/23/2024 9:54:32 PM) Award:  Dorian Award
   Line  26865: (12/23/2024 9:54:32 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  28126: (12/23/2024 9:54:33 PM) Parsed Award:  Dorian Award
   Line  28133: (12/23/2024 9:54:33 PM) Award:  Dorian Award
   Line  28139: (12/23/2024 9:54:33 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  29400: (12/23/2024 9:54:35 PM) Parsed Award:  Dorian Award
   Line  29407: (12/23/2024 9:54:35 PM) Award:  Dorian Award
   Line  29413: (12/23/2024 9:54:35 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  30674: (12/23/2024 9:54:37 PM) Parsed Award:  Dorian Award
   Line  30681: (12/23/2024 9:54:37 PM) Award:  Dorian Award
   Line  30687: (12/23/2024 9:54:37 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  31948: (12/23/2024 9:54:39 PM) Parsed Award:  Dorian Award
   Line  31955: (12/23/2024 9:54:39 PM) Award:  Dorian Award
   Line  31961: (12/23/2024 9:54:39 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  33222: (12/23/2024 9:54:41 PM) Parsed Award:  Dorian Award
   Line  33229: (12/23/2024 9:54:41 PM) Award:  Dorian Award
   Line  33235: (12/23/2024 9:54:41 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  34496: (12/23/2024 9:54:43 PM) Parsed Award:  Dorian Award
   Line  34503: (12/23/2024 9:54:43 PM) Award:  Dorian Award
   Line  34509: (12/23/2024 9:54:43 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  35770: (12/23/2024 9:54:44 PM) Parsed Award:  Dorian Award
   Line  35777: (12/23/2024 9:54:44 PM) Award:  Dorian Award
   Line  35783: (12/23/2024 9:54:44 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  37044: (12/23/2024 9:54:46 PM) Parsed Award:  Dorian Award
   Line  37051: (12/23/2024 9:54:46 PM) Award:  Dorian Award
   Line  37057: (12/23/2024 9:54:46 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  38318: (12/23/2024 9:54:48 PM) Parsed Award:  Dorian Award
   Line  38325: (12/23/2024 9:54:48 PM) Award:  Dorian Award
   Line  38331: (12/23/2024 9:54:48 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  39592: (12/23/2024 9:54:50 PM) Parsed Award:  Dorian Award
   Line  39599: (12/23/2024 9:54:50 PM) Award:  Dorian Award
   Line  39605: (12/23/2024 9:54:50 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  40866: (12/23/2024 9:54:52 PM) Parsed Award:  Dorian Award
   Line  40873: (12/23/2024 9:54:52 PM) Award:  Dorian Award
   Line  40879: (12/23/2024 9:54:52 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
   Line  42140: (12/23/2024 9:54:54 PM) Parsed Award:  Dorian Award
   Line  42147: (12/23/2024 9:54:54 PM) Award:  Dorian Award


so that's one more thing to resolve
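
The repeated entries above are what a scan produces when the search position never moves past the item it has just parsed. A minimal Python sketch of one guard against that, assuming a plain substring scan; the names find_all_guarded and max_items are made up for illustration and are not part of the PVD script:

```python
from typing import List


def find_all_guarded(text: str, marker: str, max_items: int = 200) -> List[int]:
    """Collect marker positions, always resuming the search after the last hit."""
    positions: List[int] = []
    pos = text.find(marker)
    # max_items is a hard cap, so even a logic error cannot loop forever.
    while pos != -1 and len(positions) < max_items:
        positions.append(pos)
        # Resume strictly after the current hit; searching again from `pos`
        # (or from the start of the text) would return the same hit every time.
        pos = text.find(marker, pos + len(marker))
    return positions
```

The same two ideas carry over to a PosFrom loop: start the next search after the end of the previous match, and count iterations so that the fail-safe limit is actually reached.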
Title: Re: Imdb People script issues
Post by: Ivek23 on December 24, 2024, 02:34:14 pm
Your code also loops:


Quote
1372: (12/23/2024 9:53:57 PM) Parsed Award:  Dorian Award
   Line   1379: (12/23/2024 9:53:57 PM) Award:  Dorian Award
   Line   1385: (12/23/2024 9:53:57 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True


so that's one more thing to resolve

Yes, I know about this issue in the log file.

Title: Re: Imdb People script issues
Post by: Ivek23 on December 24, 2024, 02:42:40 pm
Yes, I know about this problem. I have looked into the log files and found a partial solution that should help.

Code that is not complete:
Quote
Function ParsePage_IMDBPeopleAWARDS(HTML: String): Cardinal;
Var
  curPos, endPos: Integer;
  ItemList, Event, Award, Category, Recipient, Year: String;
  AValue: String; // Declaring AValue as a String
  Won: Boolean;
  FailSafe: Integer;  // To prevent infinite loops
  curPos1,curPos2,curPos3,curPos4,endPos1,endPos2:Integer;

Begin
  LogMessage('Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||');


  try
    Result := prFinished;


    // Log the initial HTML snippet being parsed
    LogMessage('Initial HTML snippet: ' + Copy(HTML, 1, 500));


    // Find the position of the Awards title
    curPos := Pos('<h1 class="ipc-title__text">Awards</h1>', HTML);
    If curPos > 0 Then Begin
      // Find the position of the Awards section
      curPos := PosFrom('<section class="ipc-page-section ipc-page-section--base">', HTML, curPos);
    End;


    If curPos > 0 Then Begin
      // Find the end position of the Awards section
      endPos := PosFrom('<h3 class="ipc-title__text"><span id="contribute">Contribute to this page</span>', HTML, curPos);
      If endPos = 0 Then endPos := Length(HTML);


      If (curPos > 0) AND (endPos > curPos) Then Begin
        // Extract the Awards block
        ItemList := Copy(HTML, curPos, endPos - curPos);
      //LogMessage(ItemList);

       //While curPos > 0 Do Begin
          // Extract and log the award name
       
        // Extract and log the event name
      //curPos := PosFrom('?ref_=nmawd" class="ipc-title-link-wrapper" tabindex="0"><h3 class="ipc-title__text">', ItemList, curPos);
      //curPos := PosFrom('?ref_=nmawd" class="ipc-title-link-wrapper" tabindex="0"><h3 class="ipc-title__text">', ItemList, 1);
      curPos := PosFrom('?ref_=nmawd" class="ipc-title-link-wrapper" tabindex="0"><h3 class="ipc-title__text"><span id="ev', ItemList, 0);
      FailSafe := 0;  // Initialize fail-safe counter
      //While curPos > 0 Do Begin
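      While (curPos > 0) And (FailSafe < 10) Do Begin
        FailSafe := FailSafe + 1;  // Count iterations so the fail-safe can actually stop the loop (caps parsing at 10 events)
        // Extract and log the event name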
         If curPos > 0 Then Begin
          curPos := PosFrom('>', ItemList, curPos) + 29;
          endPos := PosFrom('</span>', ItemList, curPos);
          Event := Copy(ItemList, curPos, endPos - curPos);
        LogMessage('** Parsed Event: ' + Event);
        Event := RemoveTagsEx0(Event);
          Event := Trim(Event);
        LogMessage('* Parsed Event: ' + Event);
        //Event := RemoveTagsEx1(Trim(Event));


          // Remove the <span> tag
          Event := Copy(Event, Pos('>', Event)  + 1 , Length(Event));
          LogMessage('Parsed Event: ' + Event);
       
      //(*
          // Parse each award item manually
        curPos := PosFrom('<li class="ipc-metadata-list-summary-item sc-15fc9ae6-1 gQbMPJ" data-testid="list-item"', ItemList, 1);
        //curPos := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', ItemList, curPos + 1);
          If curPos > 0 Then Begin
      //*) 
      //(*
          curPos1 := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', ItemList, curPos1  + 16);
          If curPos1 > 0 Then Begin
            curPos1 := PosFrom('>', ItemList, curPos1) + 1;
            endPos1 := PosFrom('</span>', ItemList, curPos1);
            Award := Copy(ItemList, curPos1, endPos1 - curPos1);
            LogMessage('Parsed Award: ' + Award);
      //*) 
 
 
 
      (*
                // Log the parameters before calling AddAward
                //LogMessage('Before calling AddAward with parameters:');
                //LogMessage('Event: ' + Event);
                //LogMessage('Award: ' + Award);
                //LogMessage('Category: ' + Category);
                //LogMessage('Recipient: ' + Recipient);
                //LogMessage('Year: ' + Year);
                //LogMessage('Won: ' + CustomBoolToStr(Won));
      *) 
 
                // Populate the custom field with AValue
                //AddCustomFieldValueByName('IMDb People Awards', AValue);
                //    LogMessage('IMDb People Awards added ' + AValue)
 
      //(*         
         //curPos1 := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', ItemList, EndPos1 + 10);
         curPos1 := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', ItemList, curPos1 + 0);
         //curPos1 := PosFrom('<span/class="ipc-metadata-list-summary-item__tst">', ItemList, curPos1);
         //curPos1 := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', ItemList, EndPos1);
          End Else LogMessage('Error: Award not found.');   
      //*) 
      //(*
        // Move to the next item
          curPos := PosFrom('<li class="ipc-metadata-list-summary-item sc-15fc9ae6-1 gQbMPJ" data-testid="list-item"', ItemList, curPos + 0);
          End Else LogMessage('Error: Awards not found.');   
      //*) 
       
       //curPos := PosFrom('?ref_=nmawd" class="ipc-title-link-wrapper" tabindex="0"><h3 class="ipc-title__text"><span id="ev', ItemList, curPos + 1);
       //curPos := PosFrom('?ref_=nmawd" class="ipc-title-link-wrapper" tabindex="0"><h3 class="ipc-title__text">', ItemList, 1);
       //curPos := PosFrom('?ref_=nmawd" class="ipc-title-link-wrapper" tabindex="0"><h3 class="ipc-title__text">', ItemList, curPos);
       curPos := PosFrom('?ref_=nmawd" class="ipc-title-link-wrapper" tabindex="0"><h3 class="ipc-title__text">', ItemList, EndPos);
        End Else LogMessage('Error: Event title div not found.');
      
      End;   

      End Else LogMessage('Error: Invalid endPos or curPos for Awards section');
    End Else LogMessage('Error: Awards section not found');
   
  except
    Begin
      LogMessage('Exception encountered');
      Result := prError;
    End;
  end;


  // Result is already set inside the try block; setting it again here would overwrite prError
  LogMessage('Function ParsePage_IMDBPeopleAWARDS END=====================||');
End;

Log details:
Quote
(24.12.2024 14:28:23)       Function DownloadPage END======================|
(24.12.2024 14:28:23) Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||
(24.12.2024 14:28:23) Initial HTML snippet: <!DOCTYPE html><html lang="en-US" xmlns:og="http://opengraphprotocol.org/schema/" xmlns:fb="http://www.facebook.com/2008/fbml"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width"/><script>if(typeof uet === 'function'){ uet('bb', 'LoadTitle', {wb: 1}); }</script><script>window.addEventListener('load', (event) => {
        if (typeof window.csa !== 'undefined' && typeof window.csa === 'function') {
            var csaLatencyPlugin = window.csa('Content', {
             
(24.12.2024 14:28:23) ** Parsed Event: <span id="ev0000386">Kids' Choice Awards, USA
(24.12.2024 14:28:23) * Parsed Event: Kids' Choice Awards, USA
(24.12.2024 14:28:23) Parsed Event: Kids' Choice Awards, USA
(24.12.2024 14:28:23) Parsed Award:  Blimp Award
(24.12.2024 14:28:23) ** Parsed Event: <span id="ev0000616">Soap Opera Digest Awards
(24.12.2024 14:28:23) * Parsed Event: Soap Opera Digest Awards
(24.12.2024 14:28:23) Parsed Event: Soap Opera Digest Awards
(24.12.2024 14:28:23) Parsed Award:  Soap Opera Digest Award
(24.12.2024 14:28:23) ** Parsed Event: <span id="ev0000716">Young Artist Awards
(24.12.2024 14:28:23) * Parsed Event: Young Artist Awards
(24.12.2024 14:28:23) Parsed Event: Young Artist Awards
(24.12.2024 14:28:23) Parsed Award:  Young Artist Award
(24.12.2024 14:28:23) ** Parsed Event: <span id="ev0000718">YoungStar Awards
(24.12.2024 14:28:23) * Parsed Event: YoungStar Awards
(24.12.2024 14:28:23) Parsed Event: YoungStar Awards
(24.12.2024 14:28:23) Parsed Award:  Young Artist Award
(24.12.2024 14:28:23) Function ParsePage_IMDBPeopleAWARDS END=====================||
(24.12.2024 14:28:23)     Provider data info retreived Ok in 2024-12-24 14:28:23|
(24.12.2024 14:28:23) Function ParsePage smNormal END======================|
(24.12.2024 14:28:23) Person -> LoadStatic -> 0ms
(24.12.2024 14:28:23) Person -> LoadMultivalues -> 0ms
(24.12.2024 14:28:23) Person -> LoadFilms -> 0ms
(24.12.2024 14:28:23) Person -> LoadAwards -> 0ms
(24.12.2024 14:28:23) Person -> LoadImages -> 0ms

The <span id="ev0000718">YoungStar Awards marker is helpful for working out which event the awards refer to.
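
As a rough illustration of why that marker helps, here is a Python sketch that cuts the page into one block per event marker and collects the award names inside each block. It assumes the event title appears before its awards in the page source and that award names sit in plain ipc-metadata-list-summary-item__tst spans, as in the code above; the function name awards_by_event is made up, and real pages may nest further tags that this simple pattern would miss.

```python
import re
from typing import Dict, List

# Event marker as seen in the log: <span id="ev0000718">YoungStar Awards
EVENT_RE = re.compile(r'<span id="(ev\d+)">([^<]*)')
# Award name span, using the class name from the script above.
AWARD_RE = re.compile(
    r'<span class="ipc-metadata-list-summary-item__tst">([^<]*)</span>'
)


def awards_by_event(html: str) -> Dict[str, List[str]]:
    """Map each event name to the award names found before the next event marker."""
    events = list(EVENT_RE.finditer(html))
    result: Dict[str, List[str]] = {}
    for i, match in enumerate(events):
        # An event's block runs up to the next event marker (or the end of the page).
        block_end = events[i + 1].start() if i + 1 < len(events) else len(html)
        block = html[match.end():block_end]
        name = match.group(2).strip()
        awards = [award.strip() for award in AWARD_RE.findall(block)]
        result.setdefault(name, []).extend(awards)
    return result
```

Because every award is searched for only inside its own event block, an award can no longer be attached to the wrong event, and the scan cannot run back over earlier events.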
Title: Re: Imdb People script issues
Post by: Ivek23 on December 26, 2024, 06:56:49 pm
Here is the IMDB_People_[EN][HTTPS]_Awards script, which now correctly transfers the Awards data into the awards field for the person 'Chico' Hernandez, using the URL below and a Python Selenium script.

https://www.imdb.com/name/nm0379491/awards/

I corrected some parts of your code and added others, and it works.

Instructions and code for the Python Selenium script will probably be published by the new year in the Integrating Selenium to PVD topic.

http://www.videodb.info/forum_en/index.php/topic,4357.0.html

The Python Selenium script is at the link below.

http://www.videodb.info/forum_en/index.php/topic,4362.msg22691.html#msg22691

The IMDB_[EN][HTTPS]_TEST_Aka script is at the link below.

http://www.videodb.info/forum_en/index.php/topic,4363.0.html
Title: Re: Imdb People script issues
Post by: Ivek23 on December 28, 2024, 04:45:16 pm
Unfortunately, I don't plan to work on the IMDb Awards section anymore, whether updates or fixes, for either the movie or the people code (Function ParsePage_IMDBMovieAWARDS). The layout and markup of the Awards page source are too complicated and too poorly suited to parsing for me to edit the code so that it records the Awards data properly.
Title: Re: Imdb People script issues
Post by: afrocuban on December 28, 2024, 08:20:17 pm
I completely understand. It is so complicated that even AI hasn't been able to do anything about it so far.
The best I could do was to get two functions.


The first parses all events, but none of the awards:
Quote
Function ParsePage_IMDBPeopleAWARDS(HTML: String): Cardinal;
Var
  curPos, endPos: Integer;
  Event, Award, Category, Recipient, Year: String;
  Won: Boolean;
  FailSafe: Integer;  // To prevent infinite loops
Begin
  LogMessage('Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||');
  Result := prFinished;

  // Log the first 500 characters of the initial HTML snippet
  LogMessage('Initial HTML snippet (first 500 chars): ' + Copy(HTML, 1, 500));
  // Log the last 500 characters of the initial HTML snippet
  LogMessage('Initial HTML snippet (last 500 chars): ' + Copy(HTML, Length(HTML) - 499, 500));

  // Initialize the search for the first event section
  curPos := Pos('<section class="ipc-page-section ipc-page-section--base">', HTML);
  LogMessage('curPos after finding first event section: ' + IntToStr(curPos));

  If curPos > 0 Then Begin
    FailSafe := 0;  // Initialize fail-safe counter
    While (curPos > 0) And (FailSafe < 200) Do Begin
      // Ensure we don't exceed the HTML length
      If curPos >= Length(HTML) Then Break;

      // Extract the Event Name
      curPos := PosFrom('<span id="ev', HTML, curPos);
      If curPos > 0 Then Begin
        curPos := PosFrom('>', HTML, curPos) + 1;
        endPos := PosFrom('</span>', HTML, curPos);
        Event := Trim(Copy(HTML, curPos, endPos - curPos));
        LogMessage('Parsed Event: ' + Event);

        // Process each award within the event
        curPos := endPos;
        While (curPos > 0) And (curPos < Length(HTML)) And (PosFrom('</section><section class="ipc-page-section ipc-page-section--base">', HTML, curPos) = 0) And (PosFrom('</section><div class="nas-slot">', HTML, curPos) = 0) Do Begin
          curPos := PosFrom('<div data-testid="sub-section-', HTML, curPos);
          If curPos > 0 Then Begin
            curPos := PosFrom('>', HTML, curPos) + 1;
            endPos := PosFrom('<>', HTML, curPos);
            Award := Copy(HTML, curPos, endPos - curPos);
            // LogMessage('Extracted Award Content: ' + Award);
         
            // Parse award details from the Award block
            // Extract Award Name
            curPos := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', HTML, curPos);
            If curPos > 0 Then Begin
              curPos := PosFrom('>', HTML, curPos) + 1;
              endPos := PosFrom('</span>', HTML, curPos);
              Award := Copy(HTML, curPos, endPos - curPos);
           
            End;
            // Extract Category
            curPos := PosFrom('<span class="ipc-metadata-list-summary-item__li awardCategoryName"', Award, 1);
            If curPos > 0 Then Begin
              curPos := PosFrom('>', Award, curPos) + 1;
              endPos := PosFrom('</span>', Award, curPos);
              Category := Copy(Award, curPos, endPos - curPos);
            End;

            // Extract Recipient
            curPos := PosFrom('<a class="ipc-metadata-list-summary-item__li ipc-metadata-list-summary-item__li--link"', Award, 1);
            If curPos > 0 Then Begin
              curPos := PosFrom('>', Award, curPos) + 1;
              endPos := PosFrom('</a>', Award, curPos);
              Recipient := Copy(Award, curPos, endPos - curPos);
            End;

            // Extract Year
            curPos := PosFrom('<a class="ipc-metadata-list-summary-item__t"', Award, 1);
            If curPos > 0 Then Begin
              curPos := PosFrom('>', Award, curPos) + 1;
              endPos := PosFrom(' ', Award, curPos);  // Find the space after the year
              Year := Copy(Award, curPos, endPos - curPos);
              Year := Trim(Year);
            End;

            // Determine if the award was won
            Won := PosFrom('Winner', Award, 1) > 0;

            // Add award to the database
            AddAward(Event, Award, Category, Recipient, Year, Won);
            If Won Then
              LogMessage('AddAward executed successfully: Event=' + Event + ', Award=' + Award + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won=True')
            Else
              LogMessage('AddAward executed successfully: Event=' + Event + ', Award=' + Award + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won=False');
          End;
        End;
      End;

      // Move to the next event or end of awards block
      If PosFrom('</section><section class="ipc-page-section ipc-page-section--base">', HTML, curPos) > 0 Then
        curPos := PosFrom('</section><section class="ipc-page-section ipc-page-section--base">', HTML, curPos) + Length('</section><section class="ipc-page-section ipc-page-section--base">')
      Else If PosFrom('<div class="nas-slot">', HTML, curPos) > 0 Then Begin
        LogMessage('End of awards block detected.');
        Break;
      End Else Begin
        LogMessage('Error: Unable to identify next event or end of awards block.');
        Break;
      End;
      Inc(FailSafe);
    End;
  End Else LogMessage('Error: First event section not found');

  LogMessage('Function ParsePage_IMDBPeopleAWARDS END=====================||');
  Result := prFinished;
End;
//BlockClose


The second one parses all the awards but only the first event, and assigns all the awards to that event:


Quote
Function ParsePage_IMDBPeopleAWARDS(HTML: String): Cardinal;
Var
  curPos, endPos, awardPos, categoryPos, recipientPos: Integer;
  Event, Award, Category, Recipient, Year: String;
  Won: Boolean;
Begin
  LogMessage('Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||');
  Result := prFinished;


  // Locate the start of the specific event section
  curPos := Pos('<section class="ipc-page-section ipc-page-section--base">', HTML);
  LogMessage('curPos after finding event section: ' + IntToStr(curPos));


  If curPos > 0 Then Begin
    // Extract event name
    curPos := PosFrom('<span id="ev', HTML, curPos);
    If curPos = 0 Then Begin
      LogMessage('Event name not found');
      Exit;
    End;
    curPos := PosFrom('>', HTML, curPos) + 1;
    endPos := PosFrom('</span>', HTML, curPos);
    Event := Trim(Copy(HTML, curPos, endPos - curPos));
    LogMessage('Parsed Event: ' + Event);
   
    curPos := endPos;


    // Process awards within this event
    While curPos > 0 Do Begin
      // Find next award div
      curPos := PosFrom('<li class="ipc-metadata-list-summary-item sc-15fc9ae6-1 gQbMPJ" data-testid="list-item">', HTML, curPos);
      If curPos = 0 Then Begin
        LogMessage('No more awards found in this event');
        Break;
      End;
      LogMessage('curPos after finding award div: ' + IntToStr(curPos));


      awardPos := curPos;  // Save the starting position of the award
      curPos := PosFrom('>', HTML, curPos) + 1;
      endPos := PosFrom('</div></section>', HTML, curPos);  // Adjusted to the correct closing tag
      If endPos = 0 Then Begin
        LogMessage('No closing tag for award div found');
        Break;  // No more awards
      End;


      Award := Copy(HTML, awardPos, endPos - awardPos);
      curPos := endPos + Length('</div></section>');
      // LogMessage('Award Content Extracted Successfully: ' + Award);


      // Extract year
      awardPos := PosFrom('<a class="ipc-metadata-list-summary-item__t"', Award, 1);
      If awardPos = 0 Then Begin
        LogMessage('Year not found');
        Continue;
      End;
      awardPos := PosFrom('>', Award, awardPos) + 1;
      endPos := PosFrom(' ', Award, awardPos);  // Find the space after the year
      Year := Copy(Award, awardPos, endPos - awardPos);
      Year := Trim(Year);
      LogMessage('Parsed Year: ' + Year);


      // Determine if the award was won
      Won := PosFrom('Winner', Award, 1) > 0;
      If Won Then
        LogMessage('Parsed Won: True')
      Else
        LogMessage('Parsed Won: False');


      // Extract Category
      categoryPos := PosFrom('<span class="ipc-metadata-list-summary-item__li awardCategoryName"', Award, awardPos);
      LogMessage('EVE GA CAT: ' + IntToStr(categoryPos));
      If categoryPos > 0 Then Begin
        categoryPos := PosFrom('>', Award, categoryPos) + 1;
        endPos := PosFrom('</span>', Award, categoryPos);
        Category := Copy(Award, categoryPos, endPos - categoryPos);
      End;
      LogMessage('Parsed Category Name: ' + Category);


      // Extract recipient
      recipientPos := PosFrom('<a class="ipc-metadata-list-summary-item__li ipc-metadata-list-summary-item__li--link"', Award, categoryPos);
      If recipientPos = 0 Then Begin
        LogMessage('Recipient tag not found');
        Continue;
      End;
      recipientPos := PosFrom('>', Award, recipientPos) + 1;
      endPos := PosFrom('</a>', Award, recipientPos);
      Recipient := Copy(Award, recipientPos, endPos - recipientPos);
      LogMessage('Parsed Recipient: ' + Recipient);


      // Extract award name
      awardPos := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', Award, awardPos);
      If awardPos = 0 Then Begin
        LogMessage('Award Name not found');
        Continue;
      End;
      awardPos := PosFrom('>', Award, awardPos) + 1;
      LogMessage('Parsed awardPos ' + IntToStr(awardPos));
      endPos := PosFrom('</span>', Award, awardPos);
      LogMessage('Parsed endPos ' + IntToStr(endPos));
      Award := Copy(Award, awardPos, endPos - awardPos);
      LogMessage('Parsed Award Name: ' + Award);


      // Add award to the database
      AddAward(Event, Award, Category, Recipient, Year, Won);
      If Won Then
        LogMessage('AddAward executed successfully: Event=' + Event + ', Award=' + Award + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won=True')
      Else
        LogMessage('AddAward executed successfully: Event=' + Event + ', Award=' + Award + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won=False');


      // Advance curPos to ensure moving to the next award
      curPos := PosFrom('<li class="ipc-metadata-list-summary-item sc-15fc9ae6-1 gQbMPJ" data-testid="list-item">', HTML, curPos);
    End;
  End Else LogMessage('Error: Event section not found');


  LogMessage('Function ParsePage_IMDBPeopleAWARDS END=====================||');
  Result := prFinished;
End;
//BlockClose


For the life of me, I cannot combine them, no matter what I try. Not even close... :o :'(


How's that even possible?


The page I'm trying to parse is attached, as well as the script, which contains the fixed genres and bio.
Title: Re: Imdb People script issues
Post by: afrocuban on December 28, 2024, 08:38:43 pm
And probably FINALLY this one is a winner (I still have to tweak Recipient):





Quote
Function ParsePage_IMDBPeopleAWARDS(HTML: String): Cardinal;
Var
  curPos, endPos, awardPos, categoryPos, recipientPos, yearPos, eventEndPos, namePos: Integer;
  Event, Award, AwardName, Category, Recipient, Year: String;
  Won: Boolean;
Begin
  LogMessage('Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||');
  Result := prFinished;


  // Locate the start of the first event section
  curPos := Pos('<section class="ipc-page-section ipc-page-section--base">', HTML);
  While curPos > 0 Do Begin
    LogMessage('curPos after finding event section: ' + IntToStr(curPos));


    // Extract event name
    curPos := PosFrom('<span id="ev', HTML, curPos);
    If curPos = 0 Then Begin
      LogMessage('Event name not found');
      Break;
    End;
    curPos := PosFrom('>', HTML, curPos) + 1;
    endPos := PosFrom('</span>', HTML, curPos);
    Event := Trim(Copy(HTML, curPos, endPos - curPos));
    LogMessage('Parsed Event: ' + Event);


    // Move cursor to start processing awards within the event
    curPos := endPos;
    eventEndPos := PosFrom('<section class="ipc-page-section ipc-page-section--base">', HTML, curPos);
    If eventEndPos = 0 Then
      eventEndPos := Length(HTML);  // Set to the end of HTML if no more events


    // Process awards within the event
    While curPos < eventEndPos Do Begin
      // Find next award div within the current event
      awardPos := PosFrom('<li class="ipc-metadata-list-summary-item sc-15fc9ae6-1 gQbMPJ" data-testid="list-item">', HTML, curPos);
      If (awardPos = 0) Or (awardPos >= eventEndPos) Then Begin
        LogMessage('No more awards found in this event');
        Break;
      End;
      LogMessage('curPos after finding award div: ' + IntToStr(awardPos));


      // Extract entire award block
      curPos := awardPos;
      endPos := PosFrom('</li>', HTML, curPos);
      If endPos = 0 Then Begin
        LogMessage('No closing tag for award div found');
        Break;
      End;


      Award := Copy(HTML, curPos, endPos - curPos);
      curPos := endPos + Length('</li>');
      LogMessage('Award Content Extracted Successfully: ' + Award);


      // Extract year
      yearPos := PosFrom('<a class="ipc-metadata-list-summary-item__t"', Award, 1);
      If yearPos = 0 Then Begin
        LogMessage('Year not found');
        Continue;
      End;
      yearPos := PosFrom('>', Award, yearPos) + 1;
      endPos := PosFrom(' ', Award, yearPos);
      Year := Copy(Award, yearPos, endPos - yearPos);
      Year := Trim(Year);
      LogMessage('Parsed Year: ' + Year);


      // Determine if the award was won
      Won := PosFrom('Winner', Award, 1) > 0;
      If Won Then
        LogMessage('Parsed Won: True')
      Else
        LogMessage('Parsed Won: False');


      // Extract award name
      namePos := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', Award, 1);
      If namePos > 0 Then Begin
        namePos := PosFrom('>', Award, namePos) + 1;
        endPos := PosFrom('</span>', Award, namePos);
        AwardName := Copy(Award, namePos, endPos - namePos);
        LogMessage('Parsed Award Name: ' + AwardName);
      End Else Begin
        LogMessage('Award Name not found');
        AwardName := '';
      End;


      // Extract category
      categoryPos := PosFrom('<span class="ipc-metadata-list-summary-item__li awardCategoryName" aria-disabled="false">', Award, 1);
      If categoryPos > 0 Then Begin
        categoryPos := PosFrom('>', Award, categoryPos) + 1;
        endPos := PosFrom('</span>', Award, categoryPos);
        Category := Copy(Award, categoryPos, endPos - categoryPos);
        LogMessage('Parsed Category: ' + Category);
      End Else Begin
        LogMessage('Category tag not found');
        Category := '';
      End;


      // Extract recipient
      recipientPos := PosFrom('<a class="ipc-metadata-list-summary-item__li ipc-metadata-list-summary-item__li--link"', Award, endPos + Length('</span>') + 1);
      If recipientPos > 0 Then Begin
        recipientPos := PosFrom('>', Award, recipientPos) + 1;
        endPos := PosFrom('</a>', Award, recipientPos);
        Recipient := Copy(Award, recipientPos, endPos - recipientPos);
        LogMessage('Parsed Recipient: ' + Recipient);
      End Else Begin
        LogMessage('Recipient tag not found');
        Recipient := '';
      End;


      // Add award to the database
      AddAward(Event, AwardName, Category, Recipient, Year, Won);
      If Won Then
        LogMessage('AddAward executed successfully: Event=' + Event + ', Award=' + AwardName + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won=True')
      Else
        LogMessage('AddAward executed successfully: Event=' + Event + ', Award=' + AwardName + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won=False');
    End;


    // Move to the next event section
    curPos := eventEndPos;
    curPos := PosFrom('<section class="ipc-page-section ipc-page-section--base">', HTML, curPos);
  End;


  LogMessage('Function ParsePage_IMDBPeopleAWARDS END=====================||');
  Result := prFinished;
End;
//BlockClose


Here's the beginning of the log (first event is "Ariel Awards, Mexico" and the first award in it is "Golden Ariel") and the end of the log ("BOFA" is the last award of the last event of this page - "Brazil Online Film Award"):


Quote
(12/28/2024 8:35:07 PM) Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||
(12/28/2024 8:35:07 PM) curPos after finding event section: 148924
(12/28/2024 8:35:07 PM) Parsed Event: Ariel Awards, Mexico
(12/28/2024 8:35:07 PM) curPos after finding award div: 149982
(12/28/2024 8:35:07 PM) Parsed Year: 2019
(12/28/2024 8:35:07 PM) Parsed Won: True
(12/28/2024 8:35:07 PM) Parsed Award Name:  Golden Ariel
(12/28/2024 8:35:07 PM) Parsed Category: Best Picture (Mejor Película)
(12/28/2024 8:35:07 PM) Recipient tag not found
(12/28/2024 8:35:07 PM) AddAward executed successfully: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=, Year=2019, Won=True
============= intermediate logs here
(12/28/2024 8:35:10 PM) AddAward executed successfully: Event=Premios Eres, Award= Premio Eres, Category=Best Picture (Mejor Película), Recipient=, Year=1993, Won=False
(12/28/2024 8:35:10 PM) No more awards found in this event
(12/28/2024 8:35:10 PM) curPos after finding event section: 1574223
(12/28/2024 8:35:10 PM) Parsed Event: Brazil Online Film Award
(12/28/2024 8:35:10 PM) curPos after finding award div: 1575285
(12/28/2024 8:35:10 PM) Parsed Year: 2019
(12/28/2024 8:35:10 PM) Parsed Won: True
(12/28/2024 8:35:10 PM) Parsed Award Name:  BOFA
(12/28/2024 8:35:10 PM) Parsed Category: Best Director
(12/28/2024 8:35:10 PM) Recipient tag not found
(12/28/2024 8:35:10 PM) AddAward executed successfully: Event=Brazil Online Film Award, Award= BOFA, Category=Best Director, Recipient=, Year=2019, Won=True
(12/28/2024 8:35:10 PM) No more awards found in this event
(12/28/2024 8:35:10 PM) Function ParsePage_IMDBPeopleAWARDS END=====================||
(12/28/2024 8:35:10 PM) After calling ParsePage_IMDBPeopleAWARDS
(12/28/2024 8:35:10 PM) Parsed awards page.
(12/28/2024 8:35:10 PM) Parsing awards page finished successfully.
(12/28/2024 8:35:10 PM)     Provider data info retrieved Ok on 2024-12-28 20:35:10|
(12/28/2024 8:35:10 PM) Function ParsePage smNormal END======================|
(12/28/2024 8:35:10 PM) Person -> LoadStatic -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadMultivalues -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadFilms -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadAwards -> 15ms
(12/28/2024 8:35:10 PM) Person -> LoadImages -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadStatic -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadMultivalues -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadFilms -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadAwards -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadImages -> 16ms

Title: Re: Imdb People script issues
Post by: Ivek23 on December 29, 2024, 09:46:28 am
Quote
      // Extract entire award block
      curPos := awardPos;
      endPos := PosFrom('</li>', HTML, curPos);
      If endPos = 0 Then Begin
        LogMessage('No closing tag for award div found');
        Break;
      End;


      Award := Copy(HTML, curPos, endPos - curPos);
      curPos := endPos + Length('</li>');
      LogMessage('Award Content Extracted Successfully: ' + Award);

Just replace the part of the code above with the part of the code below and Recipient will work.

Quote
      // Extract entire award block
      curPos := awardPos;
      endPos := PosFrom('</div></div></li>', HTML, curPos);
      If endPos = 0 Then Begin
        LogMessage('No closing tag for award div found');
        Break;
      End;


      Award := Copy(HTML, curPos, endPos - curPos);
      curPos := endPos + Length('</div></div></li>');
      LogMessage('Award Content Extracted Successfully: ' + Award);

Title: Re: Imdb People script issues
Post by: afrocuban on December 29, 2024, 10:43:11 am
Quote
      // Extract entire award block
      curPos := awardPos;
      endPos := PosFrom('</li>', HTML, curPos);
      If endPos = 0 Then Begin
        LogMessage('No closing tag for award div found');
        Break;
      End;


      Award := Copy(HTML, curPos, endPos - curPos);
      curPos := endPos + Length('</li>');
      LogMessage('Award Content Extracted Successfully: ' + Award);


Just replace the part of the code above with the part of the code below and Recipient will work.

Quote
      // Extract entire award block
      curPos := awardPos;
      endPos := PosFrom('</div></div></li>', HTML, curPos);
      If endPos = 0 Then Begin
        LogMessage('No closing tag for award div found');
        Break;
      End;


      Award := Copy(HTML, curPos, endPos - curPos);
      curPos := endPos + Length('</div></div></li>');
      LogMessage('Award Content Extracted Successfully: ' + Award);


Is that for Recipient? Because everything else works except Recipient.
Title: Re: Imdb People script issues
Post by: afrocuban on December 29, 2024, 10:57:30 am
Ohhhhh, I see now!!!! The extracted Award block didn't contain the Recipient!!! Thank you, I will try it later!
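
A minimal sketch of why the shorter closing tag loses the recipient, written as a standalone Free Pascal program. The `Sample` markup is made up and heavily simplified (the real IMDb markup is much longer), `PosFrom` is a stand-in for the PVD built-in built on `PosEx`, and `CutBlock` is a hypothetical helper. The only point is that an award item contains inner list items, so the first `</li>` is found before the recipient link.

```pascal
program AwardBlockDemo;
{$mode objfpc}{$H+}
uses
  StrUtils;

const
  // Made-up, simplified award item: the category sits in the first inner <li>,
  // the recipient link in a later one.
  Sample =
    '<li class="item"><div><div>' +
    '<a class="t">2019 Winner</a>' +
    '<ul><li><span class="cat">Best Picture</span></li></ul>' +
    '<ul><li><a class="link">Some Recipient</a></li></ul>' +
    '</div></div></li>';

// Stand-in for PVD's PosFrom(Sub, S, From); here it simply wraps PosEx.
function PosFrom(const Sub, S: String; From: Integer): Integer;
begin
  Result := PosEx(Sub, S, From);
end;

// Hypothetical helper: text from StartPos up to (not including) the first Closer.
function CutBlock(const HTML, Closer: String; StartPos: Integer): String;
var
  EndPos: Integer;
begin
  EndPos := PosFrom(Closer, HTML, StartPos);
  if EndPos = 0 then
    Result := ''
  else
    Result := Copy(HTML, StartPos, EndPos - StartPos);
end;

var
  ShortBlock, LongBlock: String;
begin
  ShortBlock := CutBlock(Sample, '</li>', 1);
  LongBlock  := CutBlock(Sample, '</div></div></li>', 1);

  // Cutting at the first </li> stops right after the category.
  if Pos('class="link"', ShortBlock) > 0 then
    WriteLn('Cut at </li>: recipient link present')
  else
    WriteLn('Cut at </li>: recipient link missing');

  // Cutting at the outer closing sequence keeps the whole item.
  if Pos('class="link"', LongBlock) > 0 then
    WriteLn('Cut at </div></div></li>: recipient link present')
  else
    WriteLn('Cut at </div></div></li>: recipient link missing');
end.
```

This also matches the log above, where Category was parsed but the Recipient tag was not found.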
Title: Re: Imdb People script issues
Post by: afrocuban on December 29, 2024, 11:49:44 am
I can now confirm that parsing awards works completely.

What doesn't work is populating the database, at least for me. No award or event is populated, although everything is properly parsed. Here's the log for the person and page given above.

What could that be???
Title: Re: Imdb People script issues
Post by: Ivek23 on December 29, 2024, 02:41:24 pm
This solo IMDB_People_[EN][HTTPS]_Awards 1 script with selenium transfers the awards data to the awards field without any problems. The regular IMDB_People_[EN][HTTPS]_Awards 2 script for the PVD MOD version also transfers the awards data to the awards field without any problems.

The problem is in your IMDB_People_[EN][HTTPS]-Awards script, because this script did not add any awards data to the awards field for me either.

I already know where the problem is. For me, it transfers downpage-UTF8_NO_BOM.htm for the awards from the website, but for you, the problem is probably in the awards parsing at the end of the script.
Title: Re: Imdb People script issues
Post by: afrocuban on December 29, 2024, 08:48:33 pm
Thanks for the quick feedback, Ivek. I have 2 questions I'm puzzled with now.
1. What selenium script do you use to download person's page? Is it the same one for aka?
2. I tried to put the Awards function at the beginning of the script, at the same place where it is in your scripts too: just after the Function ParsePage_IMDBSearchName, but nothing changed, so it seems that's not the cause, or did I misunderstand your remark?


Now, I tried regular IMDB_People_[EN][HTTPS]_Awards 2 script for the PVD MOD version you posted, but even that one didn't populate any award to my database, unlike for you. So, it's probably something with my database then, what do you think?
Title: Re: Imdb People script issues
Post by: Ivek23 on December 29, 2024, 09:33:22 pm
Thanks for the quick feedback, Ivek. I have 2 questions I'm puzzled with now.
1. What selenium script do you use to download person's page? Is it the same one for aka?

It is the same selenium script as for aka and I can download both the movies and people awards data with your awards code.

2. I tried to put the Awards function at the beginning of the script, at the same place where it is in your scripts too: just after the Function ParsePage_IMDBSearchName, but nothing changed, so it seems that's not the cause, or did I misunderstand your remark?

The problem is in the name of the file extension Tmp\UTF8_NO_BOM-Awards.mhtml. Change it to .htm and it should work.

Now, I tried regular IMDB_People_[EN][HTTPS]_Awards 2 script for the PVD MOD version you posted, but even that one didn't populate any award to my database, unlike for you. So, it's probably something with my database then, what do you think?

It's probably not your database, maybe you don't have the awards field marked, because it was the same for me until I checked the awards field in the settings, after that it transferred the awards data to the awards fields without any problems.
Title: Re: Imdb People script issues
Post by: afrocuban on December 29, 2024, 11:43:21 pm

It is the same selenium script as for aka and I can download both the movies and people awards data with your awards code.
Thanks! Do you get all awards with it, or only "static" ones?
The problem is in the name of the file extension Tmp\UTF8_NO_BOM-Awards.mhtml. Change it to .htm and it should work.
Wait, WHAT? It works THIS WAY, thanks a lot(!), but where on Earth does that come from, and why? What does htm or mhtml have to do with the database at all?
It's probably not your database, maybe you don't have the awards field marked, because it was the same for me until I checked the awards field in the settings, after that it transferred the awards data to the awards fields without any problems.
You were right. The scripts were set to "- Set if empty". It works now, of course.


But, I'm still puzzled, not to say shocked about htm and mhtml...
Title: Re: Imdb People script issues
Post by: Ivek23 on December 30, 2024, 08:04:32 am

It is the same selenium script as for aka and I can download both the movies and people awards data with your awards code.
Thanks! Do you get all awards with it, or only "static" ones?

A regular script only transfers "static" awards data, while with selenium it transfers all awards data.

The problem is in the name of the file extension Tmp\UTF8_NO_BOM-Awards.mhtml. Change it to .htm and it should work.

Wait, WHAT? It works THIS WAY, thanks a lot(!), but where on Earth does that come from, and why? What does htm or mhtml have to do with the database at all?

Your IMDB_People_[EN][HTTPS]-Awards script uses the .htm extension for transferring all other data; only the awards data uses a different extension, and that file is not transferred to the Tmp folder by the script. This is what I meant when I mentioned the change at the end of the script; alternatively, an additional change is needed at the beginning of the script. For example, this needs to be added there.
Quote
BASE_DOWNLOAD_FILE_NO_BOM-AWARDS  = 'Tmp\UTF8_NO_BOM-Awards.mhtml';

The selenium script will also transfer the same if you do not change the extension in it.

It's probably not your database, maybe you don't have the awards field marked, because it was the same for me until I checked the awards field in the settings, after that it transferred the awards data to the awards fields without any problems.
You were right. The scripts were set to "- Set if empty". It works now, of course.


But, I'm still puzzled, not to say shocked about htm and mhtml...

Great that it works now.

About htm and mhtml it has already been mentioned above.
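
A minimal sketch of one way to make this kind of mismatch visible, assuming the cause is that the awards page is saved under one file name and looked up under another. Everything here is hypothetical: the constant name `AWARDS_FILE`, the helper `AwardsFileReady`, and the `LogMessage` stand-in are for illustration only and are not taken from the real script. The idea is to define the file name once and to log loudly when the file is missing instead of silently parsing nothing.

```pascal
program AwardsFileCheck;
{$mode objfpc}{$H+}
uses
  SysUtils;

const
  // Hypothetical constant: a single definition shared by the code that saves
  // the awards page and the code that parses it, so the two cannot drift apart.
  AWARDS_FILE = 'UTF8_NO_BOM-Awards.htm';

// Stand-in for PVD's LogMessage; here it only prints to the console.
procedure LogMessage(const Msg: String);
begin
  WriteLn(Msg);
end;

// Returns True only when the expected awards file is actually on disk.
function AwardsFileReady(const Dir: String): Boolean;
var
  FullPath: String;
begin
  FullPath := IncludeTrailingPathDelimiter(Dir) + AWARDS_FILE;
  Result := FileExists(FullPath);
  if not Result then
    LogMessage('Awards file not found, skipping awards parsing: ' + FullPath);
end;

begin
  if AwardsFileReady('Tmp') then
    LogMessage('Awards file present, safe to parse.');
end.
```

With a check like this, a wrong extension would show up in the log as a missing file rather than as correctly parsed data that never reaches the database.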
Title: Re: Imdb People script issues
Post by: afrocuban on December 30, 2024, 03:16:50 pm

Unfortunately, I don't plan on working on any Imdb Awards section anymore for any updates or fixes to the movies or people code in Function ParsePage_IMDBMovieAWARDS. It's too complicated and completely inappropriate layout or notation of the Awards page source code to be able to edit it to properly record the Awards data.


Thanks, Ivek. I will now continue to integrate everything in order to get a fully revised and functional IMDB_People_[EN][HTTPS].psf script and will do my best to maintain it in the future...


For that, I prepared a Chrome Selenium script at:
Quote
http://www.videodb.info/forum_en/index.php?topic=4364.msg22706
Title: Re: Imdb People script issues
Post by: afrocuban on January 03, 2025, 03:50:23 am
Hello and Happy New Year!

I'm close to finishing the script, and at the moment I'm stuck: all of a sudden people's photos aren't populated in PVD, although they are properly downloaded and passed to AddImageURL. Here's the log about it:
Quote
(1/3/2025 3:36:26 AM) ImageFile path in Get ~Photo~: X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/3/2025 3:36:26 AM)       Function DownloadImage BEGIN======================|
(1/3/2025 3:36:26 AM)       Global Var-DownloadURL|https://www.imdb.com/name/nm0001833/|
(1/3/2025 3:36:26 AM)             Local Var-URL|https://m.media-amazon.com/images/M/MV5BMjAwNzc5MjE0N15BMl5BanBnXkFtZTcwMzUyNTMzNw@@._V1_UY12000_.jpg|
(1/3/2025 3:36:26 AM)             Local Var-OutPutFile|X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg|
(1/3/2025 3:36:26 AM)             Download with PVdBDownPage in file:|X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg the information of:|https://m.media-amazon.com/images/M/MV5BMjAwNzc5MjE0N15BMl5BanBnXkFtZTcwMzUyNTMzNw@@._V1_UY12000_.jpg||
(1/3/2025 3:36:26 AM)             Waiting 2s for exists of:X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/3/2025 3:36:28 AM)             Now present complete page file: X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/3/2025 3:36:28 AM)       Function DownloadImage END======================|
(1/3/2025 3:36:28 AM) Image successfully downloaded to: X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/3/2025 3:36:28 AM) Adding image with URL: X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg and type: itPoster
(1/3/2025 3:36:28 AM) AddImageURL has been called with ImageType: 0 and ImageFile: X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg


Do you have any idea why that is? I also tried with itPhoto (4) but no luck... The "Overwrite" checkbox is checked for the script, so it's not about that either, and it doesn't work even if there was no photo at all.
Title: Re: Imdb People script issues
Post by: Ivek23 on January 03, 2025, 10:16:39 am
Also try the IMDB_People_[EN][HTTPS]-Awards script, to see if it uploads people's photos.

Reduce this value too, maybe it will help:

MAX_IMAGE_HEIGHT = 1200;

The definition of the people's photo import path could also be a problem, check it and you'll see what happens.

In the awards code, correct this part of the code and it will download all visible awards.

Quote
      // Extract entire award block
      curPos := awardPos;
      endPos := PosFrom('</div></div></li>', HTML, curPos);
      If endPos = 0 Then Begin
        LogMessage('No closing tag for award div found');
        Break;
      End;


      Award := Copy(HTML, curPos, endPos - curPos);
      curPos := endPos + Length('</div></div></li>');
      LogMessage('Award Content Extracted Successfully: ' + Award);
      Award := StringReplace(Award, '"winner"', '', False, True, False);
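
On the MAX_IMAGE_HEIGHT suggestion: the photo URLs in the logs end in `._V1_UY12000_.jpg`, which looks like a requested height of 12000. A minimal sketch, assuming the script builds that suffix from the height constant; the helper `SizedImageURL` is hypothetical, and only the URL shape is taken from the logs.

```pascal
program ImageUrlDemo;
{$mode objfpc}{$H+}
uses
  SysUtils;

const
  MAX_IMAGE_HEIGHT = 1200;  // the reduced value suggested above

// Hypothetical helper: appends the size suffix to the base image URL
// (everything before "._V1_"). The "._V1_UY<height>_.jpg" shape is copied
// from the logged URLs; how the real script assembles it is an assumption.
function SizedImageURL(const BaseURL: String; Height: Integer): String;
begin
  Result := BaseURL + '._V1_UY' + IntToStr(Height) + '_.jpg';
end;

begin
  // Base URL taken from the log in this thread.
  WriteLn(SizedImageURL(
    'https://m.media-amazon.com/images/M/MV5BMTI0NTcyMTM5OF5BMl5BanBnXkFtZTYwOTkzMDU2',
    MAX_IMAGE_HEIGHT));
end.
```

If the logged URL still shows `UY12000` after changing the constant, the suffix is probably being set somewhere else in the script.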
Title: Re: Imdb People script issues
Post by: afrocuban on January 04, 2025, 02:50:38 am
I tried everything you said, and everything else possible and impossible; I even created a separate, dedicated function for AddImageURL, but to no avail... Here's the log for that. It claims the photo is added, but it's not:
Quote
(1/4/2025 2:30:45 AM) Function DownloadPage END======================|
(1/4/2025 2:30:45 AM) All pages downloaded successfully.
(1/4/2025 2:30:45 AM)       Function DownloadPageMain BEGIN======================|
(1/4/2025 2:30:45 AM)       Global Var-DownloadURL|downpage-UTF8_NO_BOM.htm|
(1/4/2025 2:30:45 AM) Reading main file: X:PATH\To_PVD\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(1/4/2025 2:30:45 AM) Function DownloadPageMain END======================|
(1/4/2025 2:30:45 AM) HandlePhoto function called.
(1/4/2025 2:30:45 AM) ImageFile path in Get ~Photo~: X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/4/2025 2:30:45 AM)       Function DownloadImage BEGIN======================|
(1/4/2025 2:30:45 AM)       Global Var-DownloadURL|https://www.imdb.com/name/nm0000017/|
(1/4/2025 2:30:45 AM)             Local Var-URL|https://m.media-amazon.com/images/M/MV5BMTI0NTcyMTM5OF5BMl5BanBnXkFtZTYwOTkzMDU2._V1_UY12000_.jpg|
(1/4/2025 2:30:45 AM)             Local Var-OutPutFile|X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg|
(1/4/2025 2:30:45 AM)             Download with PVdBDownPage in file:|X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg the information of:|https://m.media-amazon.com/images/M/MV5BMTI0NTcyMTM5OF5BMl5BanBnXkFtZTYwOTkzMDU2._V1_UY12000_.jpg||
(1/4/2025 2:30:45 AM)             Waiting 2s for exists of:X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/4/2025 2:30:47 AM)             Now present complete page file: X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/4/2025 2:30:47 AM)       Function DownloadImage END======================|
(1/4/2025 2:30:47 AM) Image successfully downloaded to: X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/4/2025 2:30:47 AM) Adding image with URL: X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg and type: itPoster
(1/4/2025 2:30:47 AM) AddImageURL has been called with ImageType: 0 and ImageFile: X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/4/2025 2:30:47 AM) Person -> LoadStatic -> 0ms
(1/4/2025 2:30:47 AM) Person -> LoadMultivalues -> 0ms
(1/4/2025 2:30:47 AM) Person -> LoadFilms -> 16ms
(1/4/2025 2:30:47 AM) Person -> LoadAwards -> 0ms
(1/4/2025 2:30:47 AM) Person -> LoadImages -> 0ms
(1/4/2025 2:30:47 AM) Get result PhotoURL: https://m.media-amazon.com/images/M/MV5BMTI0NTcyMTM5OF5BMl5BanBnXkFtZTYwOTkzMDU2._V1_UY12000_.jpg ||
(1/4/2025 2:30:47 AM) Script end. After, PVdB will retrieve from ListImage and info of person in order get the photo
(1/4/2025 2:30:47 AM) Photo processed and added successfully.

I have replaced the real path of my PVD with "X:PATH\To_PVD\" in this post for privacy reasons...
Title: Re: Imdb People script issues
Post by: afrocuban on January 04, 2025, 04:17:42 am
I give up. I will post the script even though populating the photo doesn't work for me. Maybe someone will check it...


Here it is, check:

Quote
http://www.videodb.info/forum_en/index.php/topic,4367.0.html
and
Quote
http://www.videodb.info/forum_en/index.php?topic=4368.0
for more
Title: Re: Imdb People script issues
Post by: Ivek23 on January 04, 2025, 04:21:34 pm
I found a solution, the script now also uploads people's photos to the database. I will add it to the forum tomorrow at the latest.
Title: Re: Imdb People script issues
Post by: Ivek23 on January 04, 2025, 06:20:11 pm
Here is the IMDB_People_[EN][HTTPS]_TEST_2_full script, which can be found at the link below.

http://www.videodb.info/forum_en/index.php/topic,4369.0.html
Title: Re: Imdb People script issues
Post by: afrocuban on January 04, 2025, 09:29:10 pm

I found a solution, the script now also uploads people's photos to the database. I will add it to the forum tomorrow at the latest.


I knew it!
Thanks once again!
Title: Re: Imdb People script issues
Post by: Ivek23 on January 06, 2025, 12:05:38 pm

I found a solution, the script now also uploads people's photos to the database. I will add it to the forum tomorrow at the latest.


I knew it!
Thanks once again!

Thanks
Title: Re: Imdb People script issues
Post by: afrocuban on January 06, 2025, 01:48:49 pm
This would have been impossible without you and your willingness to help.

I am now thinking about the search function.

Maybe we could move the search to Selenium too (if storedURL isn't found)? That would make the PVD script seamless: it would call everything external from a single PVD script and parse whatever comes back. We would have only one PVD script for each PVD segment: movies and people. Even a single Selenium script for search, depending on the argument (movie or people) passed to it. What do you think?
Title: Re: Imdb People script issues
Post by: Ivek23 on January 06, 2025, 06:36:08 pm
This would have been impossible without you and your willingness to help.

I am now thinking about the search function.

Maybe we could move the search to Selenium too (if storedURL isn't found)? That would make the PVD script seamless: it would call everything external from a single PVD script and parse whatever comes back. We would have only one PVD script for each PVD segment: movies and people. Even a single Selenium script for search, depending on the argument (movie or people) passed to it. What do you think?

Good idea.
Title: Re: Imdb People script issues
Post by: afrocuban on January 07, 2025, 03:03:15 am
Great. I already had a first interaction with an AI about this and it gave me the concept. What worries me is presenting the search results obtained by Selenium in PVD, but let's go step by step...
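
A rough sketch of the "one search entry point, switched by argument" idea discussed above. This is only an illustration of the dispatch, not the planned implementation: the type `TSearchKind` and the functions `ParseKind` and `SearchURL` are hypothetical, and the URL shape (`s=tt` for titles, `s=nm` for names) is an assumption about the IMDb find page that should be checked against the live site. Query encoding is reduced to replacing spaces.

```pascal
program SearchUrlDemo;
{$mode objfpc}{$H+}
uses
  SysUtils;

type
  TSearchKind = (skMovie, skPeople);

// Maps the argument text to a search kind; anything but 'people' means movie.
function ParseKind(const Arg: String): TSearchKind;
begin
  if SameText(Arg, 'people') then
    Result := skPeople
  else
    Result := skMovie;
end;

// Builds the search URL for the given kind (assumed URL shape, see above).
// Only spaces are encoded here; a real script would need full URL encoding.
function SearchURL(const Query: String; Kind: TSearchKind): String;
var
  Scope: String;
begin
  if Kind = skPeople then
    Scope := 'nm'
  else
    Scope := 'tt';
  Result := 'https://www.imdb.com/find/?q=' +
            StringReplace(Trim(Query), ' ', '+', [rfReplaceAll]) +
            '&s=' + Scope;
end;

begin
  WriteLn(SearchURL('Some Person', ParseKind('people')));
  WriteLn(SearchURL('Some Movie', ParseKind('movie')));
end.
```

The same kind argument could then be handed to the external Selenium script, so that movies and people share one search path.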