Personal Video Database
English => Support => Topic started by: afrocuban on November 08, 2024, 05:09:10 pm
-
Does anyone else have the same issue with this skin as I do? While a person's photo is being downloaded, PVD freezes. I cannot cancel the import, and the only way to continue is to kill the viddb.exe process and restart PVD.
Thanks a bunch for any feedback.
-
Can someone please try to import any person with the People IMDb script, so I can find out whether I am the only one having this issue? After restarting PVD, if I click on the name of the actor on which PVD froze, it freezes again, and I have to improvise to delete the actor's record from the database.
-
What is the error message in the log file?
-
None. It starts parsing and hangs. In the PVD status bar I can see that it hangs while importing the person's photo jpg.
-
Then block (comment out) the entire code for downloading the person's photo jpg:
(*
//Get ~Photo~ . Remember that the PVdB ~transname~ Translated Name is not stored in TheMovieDB. Can be used for PhotoURL
ItemValue:=TextBetWeenFirst(ItemList,'"image":"','",'); // WEB_SPECIFIC.
If (Length(ItemValue)>0) and (Pos('nopicture',ItemValue)=0) Then Begin //"https://m.media-amazon.com/images/G/01/imdb/images/nopicture/...' NOT exists working httpS
  PhotoURL:=TextBetWeenFirst(ItemValue,BASE_URL_IMAGE_PRE_TRUE,'.'); //Get poster code. Strings which open/close the data. WEB_SPECIFIC
  If ((Length(PhotoURL)>0) and Not(USE_SAVED_PVDCONFIG and (Copy(PVDConfigOptions,opPhoto,1)='0'))) then begin //The Poster will be saved in PVD
    PhotoURL:=BASE_URL_IMAGE_PRE_TRUE + PhotoURL; //Base poster URL without '.jpg'. WEB_SPECIFIC
    ImageFile:=GetAppPath+'Scripts\'+BASE_DOWNLOAD_FILE_IMAGE_NAME+'-Photo.jpg'; //The terminating semicolon was missing here
    // Avoid HTTPS redirection: Download https image to file
    If (1=DownloadImage(PhotoURL + '._V1_UY' + IntToStr(MAX_IMAGE_HEIGTH) + '_.jpg',ImageFile)) then begin //Download in the user-selected max size. WEB_SPECIFIC
      AddImageURL(itPoster,ImageFile); //Add the photo to the database. I don't know why, but it doesn't work: it does not retrieve the photo as it does for a movie poster
      AddSearchResult(GetFieldValueXML('name'), '', '', ImageFile, ImageFile); //It's not possible to avoid GetFieldValueXML because the name can't be the same.
      if PHOTO_URL_IN_TRANSNAME then AddFieldValueXML('transname',PhotoURL + '._V1_UY' + IntToStr(MAX_IMAGE_HEIGTH) + '_.jpg'); //Stores the URL of the person photo, to send to KODI in a Template
      LogMessage(' Get result PhotoURL:'+PhotoURL + '._V1_UY' + IntToStr(MAX_IMAGE_HEIGTH) + '_.jpg'+'||');
      LogMessage('Script end. After, PVdB will retreive from ListImage and info of person in order get the photo');
      Result:=prListImage;
    end else if (1=DownloadImage(ItemValue +'.jpg',ImageFile)) then begin //Download in the web base size. WEB_SPECIFIC
      AddImageURL(itPoster,ImageFile); //Add the photo to the database. I don't know why, but it doesn't work: it does not retrieve the photo as it does for a movie poster
      AddSearchResult(GetFieldValueXML('name'), '', '', ImageFile, ImageFile); //It's not possible to avoid GetFieldValueXML because the name can't be the same.
      if PHOTO_URL_IN_TRANSNAME then AddFieldValueXML('transname',PhotoURL+'.jpg'); //Stores the URL of the person photo, to send to KODI in a Template
      LogMessage(' Get result PhotoURL:'+PhotoURL+'.jpg'+'||');
      LogMessage('Script end. After, PVdB will retreive from ListImage and info of person in order get the photo');
      Result:=prListImage;
    end;
  End;
End Else Begin
  PhotoURL:='';
End;
*)
-
Thanks. Are you saying I should try to comment out downloading photo to check if that is the culprit?
-
Yes. I hope the only culprit is this code for downloading people's photos and not something else; I haven't tested it myself.
-
The code in Function ParsePage_IMDBPeopleBIO is to blame for PVD freezing.
Function ParsePage_IMDBPeopleBIO(HTML:String):Cardinal; //BlockOpen
//Returns:
// Result:=prFinished; Script has finished gathering data
// Result:=prError; If any big problem, with exit;
//Retrieve: ~bio~ Biography from "Mini Bio" IMDB section
// (* *)
Var
curPos,endPos,debug_pos1:Integer;
ItemValue:String;
PersonID,ItemValue0,ItemValue10,ItemValue1,ItemValue11:String;
ItemList,ItemList00,ItemList0,ItemList1,ItemList11,ItemList12:String;
Begin
LogMessage('Function ParsePage_IMDBPeopleBIO BEGIN=====================||');
Result:=prFinished; //It will change to prError if any big problem with exit;
(*
//Get "Biography" info
curPos:=Pos('<h1 class="ipc-title__text">Biography</h1>',HTML); //Strings start which opens the block content data. WEB_SPECIFIC
if (curPos=0) then Exit;
ItemList0:=TextBetWeenFirst(HTML,'<h1 class="ipc-title__text','<h3 class="ipc-title__text"><span>Contribute to this page</span></h3>');
//LogMessage(' ** Parse Biography '+#13+ItemList0+' **');
If (Length(ItemList0)>0) Then Begin
ItemValue1:=TextBetWeenFirst(ItemList0,'<span class="ipc-metadata-list-item__label" aria-disabled="false">Birth name</span>','</div>');
if BIRTH_NAME_IN_TRANSNAME then
if ItemValue1 <> '' then AddFieldValueXML('transname',ItemValue1);
If ItemValue1 <> '' then LogMessage(' Get result from Birth Name01:'+ItemValue1+'||');
End;
ItemList:='';
ItemList11:='';
//Get PersonID
PersonID:=TextBetWeenFirst(HTML,'<meta property="imdb:pageConst" content="','"/>'); //WEB_SPECIFIC.
if (2<Length(PersonID)) then begin
ItemList:='<link url="http://www.imdb.com/name/'+PersonID+'/bio/#overview">Biography Info</link>';
LogMessage(' Get result PersonID:'+PersonID+'||');
end;
//Get "Mini bio" Biography text
If Pos('<h1 class="ipc-title__text">Biography</h1>',HTML)>0 Then Begin
curPos:=Pos('<h3 class="ipc-title__text"><span id="mini_bio">Mini Bio</span>',HTML); //WEB_SPECIFIC.
If 0<curPos Then Begin
curPos:=PosFrom('<li role="presentation" class="ipc-metadata-list__item" id="mini_bio_0" data-testid="list-item"><div class="ipc-metadata-list-item__content-container"><ul class="ipc-inline-list ipc-inline-list--show-dividers ipc-inline-list--inline ipc-metadata-list-item__list-content base" role="presentation"><div class="ipc-html-content ipc-html-content--base ipc-metadata-list-item-html-item" role="presentation"><div class="ipc-html-content-inner-div">',HTML,curPos)+Length('<li role="presentation" class="ipc-metadata-list__item" id="mini_bio_0" data-testid="list-item"><div class="ipc-metadata-list-item__content-container"><ul class="ipc-inline-list ipc-inline-list--show-dividers ipc-inline-list--inline ipc-metadata-list-item__list-content base" role="presentation"><div class="ipc-html-content ipc-html-content--base ipc-metadata-list-item-html-item" role="presentation"><div class="ipc-html-content-inner-div">'); //Search forward from the Mini Bio heading (curPos); EndPos is not assigned yet at this point
EndPos:=PosFrom('</div>',HTML,curPos);
//ItemValue:=Copy(HTML,curPos,endPos-curPos);
ItemValue:=Trim(Copy(HTML,curPos,endPos-curPos)); //ItemValue:=Copy(HTML,curPos+425,endPos-curPos-425);
//LogMessage(' Get result bio (from Mini bio)1:'+ItemValue+'||');
ItemValue:=StringReplace(ItemValue,#10,#160,True,False,True);
//LogMessage(' Get result bio (from Mini bio)2:'+ItemValue+'||');
ItemValue:=StringReplace(ItemValue,'<a class="ipc-md-link ipc-md-link--entity" href="','<link url="http://www.imdb.com' ,True,False,True);
ItemValue:=StringReplace(ItemValue,'/?ref_=nmbio_mbio">','/">',True,False,True);
ItemValue:=StringReplace(ItemValue,'</a>','</link>',True,False,True);
//LogMessage(' Get result bio (from Mini bio)0:'+ItemValue+'||');
//curPos:=Pos('###',ItemValue);
//If 0<curPos then ItemValue:=Copy(ItemValue,0,curPos-2);
//curPos:=Pos('</p>',ItemValue); //WEB_SPECIFIC. Chr(13)
//If 0<curPos then Delete(ItemValue,curPos,Length(ItemValue)-curPos);
//if Pos('~ ', ItemValue) = 0 then Delete(ItemValue,1,2);
If BIO_URL_IN_BIO then ItemValue:=RemoveTags(ItemValue, False);
LogMessage(' Get result bio (from Mini bio):'+ItemValue+'||');
If ItemValue <> '' then ItemList11:=ItemList11+ItemValue;
//if ItemValue <> '' then AddFieldValueXML('bio',ItemValue);
End;
End;
//If (ItemList11 = '') AND (ItemList <> '') Then
ItemList12:=ItemList;
If (ItemList11 <> '') AND (ItemList <> '') Then
ItemList12:=ItemList11;
//ItemList12:=ItemList11+#13+'--------------------------------------------------------------------------'+#13+ItemList+#32#32#32+'<link url="http://www.imdb.com/name/'+PersonID+'/bio/#mini_bio">Mini bio Biography</link>';
///If BIO_INFO_IN_BIO then AddFieldValueXML('bio',ItemList12);
///If Not(BIO_INFO_IN_BIO) Then AddFieldValueXML('bio',ItemList11);
//Get "Birth name" Biography text
ItemList00:='';
ItemList00:=TextBetWeenFirst(HTML,'<h1 class="ipc-title__text','<h3 class="ipc-title__text"><span>Contribute to this page</span></h3>');
//LogMessage(' *** Parse Biography '+#13+ItemList00+' ***');
If (Length(ItemList00)>0) Then Begin
ItemValue0:=TextBetWeenFirst(ItemList00,'<span class="ipc-metadata-list-item__label" aria-disabled="false">Birth name</span>','</div></div></div>');
//if BIRTH_NAME_IN_TRANSNAME then
//if ItemValue0 <> '' then AddFieldValueXML('transname',ItemValue0);
If ItemValue0 <> '' then LogMessage(' Get result from Birth Name02:'+ItemValue0+'||');
If ItemValue0 <> '' then ItemValue0:='BirthName: '+ItemValue0;
If ItemValue0 <> '' then ItemList12:=ItemList12+#13+'--------------------------------------------------------------------------'+#13+ItemValue0;
End;
If BIO_INFO_IN_BIO then AddFieldValueXML('bio',ItemList12);
If Not(BIO_INFO_IN_BIO) Then AddFieldValueXML('bio',ItemList11);
*)
LogMessage('Function ParsePage_IMDBPeopleBIO END=====================||');
End; //BlockClose
Attached below are the added code and the IMDB_People_[EN][HTTPS] (2) script, in which this function is blocked (commented out). The script needs massive changes because of major changes in the website's source code.
-
As I said elsewhere, I fixed the bio and genre fields, and now I'm dealing with integrating Selenium into PVD for downloading dynamic HTML content.
Reference from this point forward (http://www.videodb.info/forum_en/index.php/topic,4357.msg22661.html#msg22661)
Now I have reached the phase of parsing the Awards page downloaded manually with Selenium, and I'm having a hard time with it. I have fixed the code that parses the page, and it parses successfully, as you can see here:
(12/21/2024 10:24:56 PM) Parsed Event: Ariel Awards, Mexico
(12/21/2024 10:24:56 PM) Parsed Award: Golden Ariel
(12/21/2024 10:24:56 PM) Parsed Category: Best Picture (Mejor Película)
(12/21/2024 10:24:56 PM) Parsed Recipient: Roma
(12/21/2024 10:24:56 PM) Parsed Year: 2019
(12/21/2024 10:24:56 PM) Parsed Won: True
(12/21/2024 10:24:56 PM) Before calling AddAward with parameters:
(12/21/2024 10:24:56 PM) Event: Ariel Awards, Mexico
(12/21/2024 10:24:56 PM) Award: Golden Ariel
(12/21/2024 10:24:56 PM) Category: Best Picture (Mejor Película)
(12/21/2024 10:24:56 PM) Recipient: Roma
(12/21/2024 10:24:56 PM) Year: 2019
(12/21/2024 10:24:56 PM) Won: True
(12/21/2024 10:24:56 PM) AddAward executed successfully.
(12/21/2024 10:24:56 PM) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won=True
(12/21/2024 10:24:56 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True
But for some reason the value is not populated/displayed in PVD. So I thought I would create a custom memo field, "IMDb People Awards", and populate it with the value to check what it looks like, only to realize that no custom field is visible in PVD's People section.
Is it possible at all to add custom fields in the People section?
I manually put the value in the field (added to my dark people skin), but when I exit edit mode, it is not displayed. When I enter edit mode, it is there. If I restart PVD, the value disappears from the custom field.
Anyway, does anyone know, looking at the log, why this properly parsed award wouldn't populate the field although the log reports that it did?
Here's the whole function (I even added some extra logging around adding the value to the field to see what is going on, but to no avail: everything looks perfect, yet the value is not there):
Function CustomBoolToStr(Value: Boolean): String;
Begin
  If Value Then
    Result := 'True'
  Else
    Result := 'False';
End;

Function ParsePage_IMDBPeopleAWARDS(HTML: String): Cardinal;
Var
  curPos, endPos: Integer;
  ItemList, Event, Award, Category, Recipient, Year: String;
  AValue: String; // Declaring AValue as a String
  Won: Boolean;
  FailSafe: Integer; // To prevent infinite loops
Begin
  LogMessage('Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||');
  try
    Result := prFinished;
    // Log the initial HTML snippet being parsed
    LogMessage('Initial HTML snippet: ' + Copy(HTML, 1, 500));
    // Find the position of the Awards title
    curPos := Pos('<h1 class="ipc-title__text">Awards</h1>', HTML);
    If curPos > 0 Then Begin
      // Find the position of the Awards section
      curPos := PosFrom('<section class="ipc-page-section ipc-page-section--base">', HTML, curPos);
    End;
    If curPos > 0 Then Begin
      // Find the end position of the Awards section
      endPos := PosFrom('</section>', HTML, curPos);
      If endPos = 0 Then endPos := Length(HTML);
      If (curPos > 0) AND (endPos > curPos) Then Begin
        // Extract the Awards block
        ItemList := Copy(HTML, curPos, endPos - curPos);
        // Extract and log the event name
        curPos := PosFrom('<h3 class="ipc-title__text">', ItemList, 1);
        If curPos > 0 Then Begin
          curPos := PosFrom('>', ItemList, curPos) + 1;
          endPos := PosFrom('</span>', ItemList, curPos);
          Event := Copy(ItemList, curPos, endPos - curPos);
          Event := Trim(Event);
          // Remove the <span> tag
          Event := Copy(Event, Pos('>', Event) + 1, Length(Event));
          LogMessage('Parsed Event: ' + Event);
        End Else LogMessage('Error: Event title div not found.');
        // Parse each award item manually
        curPos := PosFrom('<li class="ipc-metadata-list-summary-item sc-15fc9ae6-1 gQbMPJ" data-testid="list-item"', ItemList, 1);
        FailSafe := 0; // Initialize fail-safe counter
        While (curPos > 0) And (FailSafe < 10) Do Begin
          // Extract and log the award name
          curPos := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', ItemList, curPos);
          If curPos > 0 Then Begin
            curPos := PosFrom('>', ItemList, curPos) + 1;
            endPos := PosFrom('</span>', ItemList, curPos);
            Award := Copy(ItemList, curPos, endPos - curPos);
            LogMessage('Parsed Award: ' + Award);
            // Extract and log the category name
            curPos := PosFrom('<span class="ipc-metadata-list-summary-item__li awardCategoryName"', ItemList, curPos);
            If curPos > 0 Then Begin
              curPos := PosFrom('>', ItemList, curPos) + 1;
              endPos := PosFrom('</span>', ItemList, curPos);
              Category := Copy(ItemList, curPos, endPos - curPos);
              LogMessage('Parsed Category: ' + Category);
              // Extract and log the recipient name
              curPos := PosFrom('<a class="ipc-metadata-list-summary-item__li ipc-metadata-list-summary-item__li--link"', ItemList, curPos);
              If curPos > 0 Then Begin
                curPos := PosFrom('>', ItemList, curPos) + 1;
                endPos := PosFrom('</a>', ItemList, curPos);
                Recipient := Copy(ItemList, curPos, endPos - curPos);
                LogMessage('Parsed Recipient: ' + Recipient);
                // Extract and log the year
                curPos := PosFrom('<a class="ipc-metadata-list-summary-item__t"', ItemList, curPos);
                If curPos > 0 Then Begin
                  curPos := PosFrom('>', ItemList, curPos) + 1;
                  endPos := PosFrom(' ', ItemList, curPos); // Find the space after the year
                  Year := Copy(ItemList, curPos, endPos - curPos);
                  Year := Trim(Year);
                  LogMessage('Parsed Year: ' + Year);
                End Else LogMessage('Error: Year not found.');
                // Determine if the award was won
                // (note: this searches the whole Awards block, not only the current item)
                Won := Pos('Winner', ItemList) > 0;
                If Won Then
                  LogMessage('Parsed Won: True')
                Else
                  LogMessage('Parsed Won: False');
                // Construct the AValue string
                AValue := 'Event=' + Event + ', Award=' + Award + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won=' + CustomBoolToStr(Won);
                // Log the parameters before calling AddAward
                LogMessage('Before calling AddAward with parameters:');
                LogMessage('Event: ' + Event);
                LogMessage('Award: ' + Award);
                LogMessage('Category: ' + Category);
                LogMessage('Recipient: ' + Recipient);
                LogMessage('Year: ' + Year);
                LogMessage('Won: ' + CustomBoolToStr(Won));
                // Add the award to the database with error handling
                try
                  AddAward(Event, Award, Category, Recipient, Year, Won);
                  LogMessage('AddAward executed successfully.');
                except
                  Begin
                    LogMessage('Exception encountered in AddAward');
                    Result := prError;
                  End;
                end;
                // Populate the custom field with AValue
                AddCustomFieldValueByName('IMDb People Awards', AValue);
                LogMessage('IMDb People Awards added ' + AValue); // The terminating semicolon was missing here
                // Log the action of adding the award
                If Won Then
                  LogMessage('Added Award to Database: Event=' + Event + ', Award=' + Award + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won: True')
                Else
                  LogMessage('Added Award to Database: Event=' + Event + ', Award=' + Award + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won: False');
              End Else LogMessage('Error: Recipient not found.');
            End Else LogMessage('Error: Category not found.');
          End Else LogMessage('Error: Award not found.');
          // Count this pass; without the increment the fail-safe never triggers,
          // and a curPos reset to 0 above would restart the search from the top forever
          FailSafe := FailSafe + 1;
          // Move to the next item
          curPos := PosFrom('<li class="ipc-metadata-list-summary-item sc-15fc9ae6-1 gQbMPJ" data-testid="list-item"', ItemList, curPos + 1);
        End;
      End Else LogMessage('Error: Invalid endPos or curPos for Awards section');
    End Else LogMessage('Error: Awards section not found');
  except
    Begin
      LogMessage('Exception encountered');
      Result := prError;
    End;
  end;
  LogMessage('Function ParsePage_IMDBPeopleAWARDS END=====================||');
  // Result keeps the value set above (prFinished, or prError after an exception)
End;
//BlockClose
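A fail-safe counter only helps if it is incremented on every pass and the scan position always moves forward. Below is a minimal sketch of that idea in plain Free Pascal rather than PVD script, so PosEx from StrUtils stands in for PosFrom. The names ScanItems, ITEM_OPEN, ITEM_CLOSE and MAX_ITEMS, and the sample markup, are made up for illustration and are not IMDb's real markup. Each item is cut out of the block first, so the 'Winner' test applies to that item only.

```pascal
program SafeScan;
{$mode objfpc}{$H+}
uses
  StrUtils;

const
  ITEM_OPEN  = '<li class="item">';  // hypothetical marker, not IMDb's real markup
  ITEM_CLOSE = '</li>';
  MAX_ITEMS  = 200;                  // arbitrary fail-safe cap

// Walks over every item in Block and returns how many were visited.
function ScanItems(const Block: string): Integer;
var
  itemStart, itemEnd, guard: Integer;
  Item: string;
  Won: Boolean;
begin
  Result := 0;
  guard := 0;
  itemStart := PosEx(ITEM_OPEN, Block, 1);
  while (itemStart > 0) and (guard < MAX_ITEMS) do
  begin
    Inc(guard);                        // the guard has to count every pass
    itemEnd := PosEx(ITEM_CLOSE, Block, itemStart);
    if itemEnd = 0 then
      itemEnd := Length(Block) + 1;    // unterminated last item: take the rest
    Item := Copy(Block, itemStart, itemEnd - itemStart);
    Won := Pos('Winner', Item) > 0;    // tested per item, not per page
    WriteLn('Item ', guard, ': won=', Won);
    Inc(Result);
    // Continue from the end of this item, never from a position reset to 0
    itemStart := PosEx(ITEM_OPEN, Block, itemEnd);
  end;
end;

begin
  WriteLn(ScanItems('<li class="item">Winner A</li><li class="item">Nominee B</li>'));
end.
```

Because the next search always starts at itemEnd, a field missing inside one item cannot send the scan back to the beginning of the block.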
-
But for some reason the value is not populated/displayed in PVD. So I thought I would create a custom memo field, "IMDb People Awards", and populate it with the value to check what it looks like, only to realize that no custom field is visible in PVD's People section.
Is it possible at all to add custom fields in the People section?
I manually put the value in the field (added to my dark people skin), but when I exit edit mode, it is not displayed. When I enter edit mode, it is there. If I restart PVD, the value disappears from the custom field.
For people, you can only use the comment box to see what is happening.
The code you have now does not complete the process, namely this call:
AddAward(EventName, AwardName, AwardCategory, AwardRecipient, EventYear, AwardWon);
Therefore it cannot write the award data, and the awards do not become visible in the awards field in the database.
Here is the awards code to help you.
Function ParsePage_IMDBPeopleAWARDS(HTML:String):Cardinal; //BlockOpen
//Returns:
// Result:=prFinished; Script has finished gathering data
// Result:=prError; If any big problem, with exit
//Retrieve: AddAward(Event, Award, Category, Recipient, Year, Won)
Var
curPos,endPos,endPosAux,index,curPos0,curPos1,curPos2,curPos3,curPos4,endPos0,endPos1,endPos2:Integer;
ItemList:String;
ItemArray: TWideArray;
MovieURL,MovieYear,EventBlock,EventName,EventYear,YearBlock,AwardBlock,AwardName,AwardCategory,AwardRecipient:String;
AwardWon: Boolean;
Begin
LogMessage('Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||');
Result:=prFinished; //It will change to prError if any big problem with exit
//Get award (several values save in PVD with AddAward(Event, Award, Category, Recipient, Year, Won)
// Parameters: Example Al Pacino
//Event (Academy Awards, USA): Name of the event
//Year (1993) = EventYear
//Won (True,Winner/Nominee) set to true if the recipient won the award and to false otherwise
//Award (Oscar): Best award name
//Category (Best Actor in a Leading Role): award category
//Recipient (Scent of a Woman): for people records the variable should contain the title of a movie for which the person won the award
// for movie records this variable should contain the name of a specific person who won the award
//Year (1973): release year of a movie (only applicable when adding award to a person record) -> NO: Use EventYear always, in movie and in people
//Won (True,Winner/Nominee) set to true if the recipient won the award and to false otherwise
//Go to "Awards" There is 4 levels: 1) Event (name) 2) Year (not saved) 3) Award (with outcome-Winner and name) 4) Recipient (award_description and Movie(name and year))
curPos:=Pos('<h1 class="header">Awards',HTML); //Strings start which opens the block content data. WEB_SPECIFIC
curPos:=PosFrom('</h1>',HTML,curPos); //Strings end which opens the block content data. WEB_SPECIFIC
curPos:=curPos+Length('</h1>'); //Strings end which opens the block content data. WEB_SPECIFIC
//Event Level
curPos:=PosFrom('<table class="awards"',HTML,curPos); //String which opens/closes the Event close but not the name. Search directly '<h3>' is very inconsistent. WEB_SPECIFIC
index:=1;
While curPos>0 Do Begin
If (index>EVENTS_LIMIT) Then break; //Limit exceeded (Remember index begins at 1).
//Go back for get the EventName and EventYear (Get all "raw" list data for create good values separators)
curPos:=PrevPos('<h3>',HTML,curPos); //String which opens the EventName and EventYear list data. WEB_SPECIFIC
endPos:=PosFrom('</h3>',HTML,curPos)+Length('</h3>'); //Strings which opens/closes the data. WEB_SPECIFIC
ItemList:=Copy(HTML,curPos,endPos-curPos);
EventName:=RemoveTags(ItemList, False);
//LogMessage(' Parse results ('+IntToStr(curPos)+','+IntToStr(endPos)+') complex ItemList:'+ItemList+'||');
//Get all "raw" Event data for create good values separators
curPos:=PosFrom('<table class="awards"',HTML,endPos); //String which opens/closes the Event table data but not the name. WEB_SPECIFIC
endPos:=PosFrom('</table>',HTML,curPos);
//Strings which opens/closes the data. WEB_SPECIFIC
EventBlock:=Copy(HTML,curPos,endPos-curPos);
//LogMessage(' Parse results ('+IntToStr(curPos)+','+IntToStr(endPos)+') complex EventBlock:'+EventBlock+'||');
//Year Level
curPos0:=Pos('<td class="award_year"',EventBlock); //String which opens the AwardYear list data. WEB_SPECIFIC
While curPos0>0 Do Begin
//Get EventYear
endPos0:=PosFrom('</td>',EventBlock,curPos0)+Length('</td>'); //Strings which opens/closes the data. WEB_SPECIFIC
ItemList:=Copy(EventBlock,curPos0,endPos0-curPos0);
EventYear:=Trim(RemoveTags(ItemList, False));
//Get all "raw" Year data for create good values separators
endPosAux:=PosFrom('<td class="award_year"',EventBlock,endPos0); //Strings which opens/closes the next block data. WEB_SPECIFIC
If (endPosAux=0) Then endPosAux:=Length(EventBlock); //If no more blocks, set endPosAux at the last character.
YearBlock:=Copy(EventBlock,curPos0,endPosAux-curPos0);
//LogMessage(' Parse results ('+IntToStr(curPos0)+','+IntToStr(endPosAux)+') complex YearBlock:'+YearBlock+'||');
//Award Level
curPos1:=Pos('<td class="award_outcome"',YearBlock); //String which opens the AwardName and Won list data. WEB_SPECIFIC
While curPos1>0 Do Begin
//Get AwardWon and AwardName
endPos1:=PosFrom('</td>',YearBlock,curPos1)+Length('</td>'); //Strings which opens/closes the data. WEB_SPECIFIC
ItemList:=Copy(YearBlock,curPos1,endPos1-curPos1);
ItemList:=StringReplace(ItemList,'category','>;<',True,True,False); //WEB_SPECIFIC
ItemList:=RemoveTags(ItemList, False);
//LogMessage(' Parse results ('+IntToStr(curPos1)+','+IntToStr(endPos1)+') complex ItemList:'+ItemList+'||');
ExplodeString(ItemList,ItemArray,';');
AwardWon:= False; //Normally the 'Nominee' case. WEB_SPECIFIC
If Pos('Winner',ItemArray[0])>0 Then AwardWon:= True; //WEB_SPECIFIC
AwardName:=ItemArray[1];
//Get all "raw" Award data for create good values separators
endPosAux:=PosFrom('<td class="award_outcome"',YearBlock,endPos1); //Strings which opens/closes the next block data. WEB_SPECIFIC
If (endPosAux=0) Then endPosAux:=Length(YearBlock); //If no more blocks, set endPosAux at the last character.
AwardBlock:=Copy(YearBlock,curPos1,endPosAux-curPos1);
//LogMessage(' Parse results ('+IntToStr(curPos1)+','+IntToStr(endPosAux)+') complex AwardBlock:'+AwardBlock+'||');
//Recipient Level
curPos2:=Pos('<td class="award_description">',AwardBlock); //String which opens the AwardCategory and AwardRecipient list data. WEB_SPECIFIC
While curPos2>0 Do Begin
//Get all "raw" list data for create good values separators (not use TextBetWeen)
endPos2:=PosFrom('</td>',AwardBlock,curPos2)+Length('</td>'); //Strings which opens/closes the data. WEB_SPECIFIC
ItemList:=Copy(AwardBlock,curPos2,endPos2-curPos2);
//LogMessage(' Parse results ('+IntToStr(curPos2)+','+IntToStr(curPos2)+') complex ItemList:'+ItemList+'||');
//The Recipient awards ItemList may have: 1) an empty description or no name (not interesting), which breaks ItemArray[]. 2) Several titles with year 3) Detail or full Notes
//So it is better to search sequentially by token in a block than with ItemArray
endPosAux:=PosFrom(#13,ItemList,2); //Strings which opens/closes the data. WEB_SPECIFIC
curPos3:=PosFrom('title',ItemList,2); //Strings which opens/closes the data. WEB_SPECIFIC
If (endPosAux<curPos3) Or (curPos3=0) Then Begin //There is an AwardCategory because #13 comes before the name, or there is no name. WEB_SPECIFIC
curPos4:=1;
AwardCategory:=TextBetWeen(ItemList,'<td class="award_description">',#13,false,curPos4); //Strings which opens/closes the data. WEB_SPECIFIC
LogMessage(' Parse Results in AwardCategory:'+AwardCategory+'||');
curPos4:=Pos('Shared with:',AwardCategory); //WEB_SPECIFIC.
If 0<curPos4 then AwardCategory:=Copy(AwardCategory,0,curPos4-1);
LogMessage(' Parse Results in AwardCategory0:'+AwardCategory+'||');
End Else Begin
AwardCategory:='';
End;
If curPos3=0 Then Begin //Award without Recipient
AddAward(EventName, AwardName, AwardCategory, '', EventYear, AwardWon);
LogMessage(' Get results Awards:#'+IntToStr(index)+'|'+EventName+'|'+AwardName+'|'+AwardCategory+'|'+''+'|'+EventYear+'|'); //+BoolToStr(AwardWon)+'||');
End;
While curPos3>0 Do Begin
MovieURL:='http://www.imdb.com/title'+TextBetWeen(ItemList,'<a href="/title','?ref_=nmawd_awd_',true,curPos4)+'/'; //Strings which opens/closes the data. WEB_SPECIFIC
LogMessage(' ** Parse Results in MovieURL: '+MovieURL);
AwardRecipient:=TextBetWeen(ItemList,'>','<',false,curPos3); //Strings which opens/closes the data. WEB_SPECIFIC
LogMessage(' Parse Results in AwardRecipient:'+AwardRecipient+'||');
MovieYear:=TextBetWeen(ItemList,'(',')',false,curPos3); //Strings which opens/closes the data. WEB_SPECIFIC
LogMessage(' ** Parse Results in MovieYear:'+MovieYear);
AddAward(EventName, AwardName, AwardCategory, AwardRecipient, EventYear, AwardWon);
LogMessage(' Get results Awards:#'+IntToStr(index)+'|'+EventName+'|'+AwardName+'|'+AwardCategory+'|'+AwardRecipient+'|'+EventYear+'|'); //+BoolToStr(AwardWon)+'||');
endPosAux:=PosFrom('truncated-note',ItemList,curPos3); //Strings which opens/closes the data. WEB_SPECIFIC
curPos3:=PosFrom('title',ItemList,curPos3); //Strings which opens/closes the data. WEB_SPECIFIC
If curPos3>endPosAux Then curPos3:=0 //Avoid Names in notes. WEB_SPECIFIC
End;
curPos2:=PosFrom('<td class="award_description">',AwardBlock,endPos2); //String which opens the AwardCategory and AwardRecipient list data. WEB_SPECIFIC
End;
curPos1:=PosFrom('<td class="award_outcome"',YearBlock,endPos1); //String which opens the AwardName and Won list data. WEB_SPECIFIC
End;
curPos0:=PosFrom('<td class="award_year"',EventBlock,endPos0); //String which opens the AwardYearlist data. WEB_SPECIFIC
End;
curPos:=PosFrom('<table class="awards"',HTML,endPos); //String which detects the Event. Searching directly for '<h3>' is very inconsistent. WEB_SPECIFIC
index:=index+1;
End;
LogMessage('Function ParsePage_IMDBPeopleAWARDS END=====================||');
End; //BlockClose
-
Thanks Ivek!
I never knew about custom fields.
I also don't understand how the database would know that there are more awards on the page, such that it would refuse to populate what has already been offered to it:
Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True
It clearly says that it is added to database and it contains all the parameters defined by the procedure.
But I'll try to parse all awards and events to check whether it works then...
Regarding the function you sent, I actually started from it, but with no luck. I couldn't revise it in a meaningful way to get the desired result (mainly because of TextBetWeen), so I had to start from scratch, parsing the HTML content manually.
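For readers following along, the manual approach boils down to repeated "find the opening marker, find the closing marker, copy what lies between, move the cursor" steps. Here is a minimal sketch in plain Free Pascal, not PVD script. BetweenFrom is a hypothetical helper and is not PVD's TextBetWeen; the markup in the example is invented, and only the award and category names are taken from the log above.

```pascal
program BetweenDemo;
{$mode objfpc}{$H+}
uses
  StrUtils;

// Hypothetical helper: returns the text between OpenTag and CloseTag,
// searching from CurPos, and moves CurPos just past CloseTag.
// Returns '' and leaves CurPos unchanged when either marker is missing.
function BetweenFrom(const S, OpenTag, CloseTag: string; var CurPos: Integer): string;
var
  fromPos, startPos, endPos: Integer;
begin
  Result := '';
  fromPos := CurPos;
  if fromPos < 1 then
    fromPos := 1;
  startPos := PosEx(OpenTag, S, fromPos);
  if startPos = 0 then
    Exit;
  startPos := startPos + Length(OpenTag);
  endPos := PosEx(CloseTag, S, startPos);
  if endPos = 0 then
    Exit;
  Result := Copy(S, startPos, endPos - startPos);
  CurPos := endPos + Length(CloseTag);
end;

var
  Html, Award, Category: string;
  P: Integer;
begin
  // Invented markup for illustration only
  Html := '<span class="award">Golden Ariel</span><span class="cat">Best Picture</span>';
  P := 1;
  Award := BetweenFrom(Html, '<span class="award">', '</span>', P);
  Category := BetweenFrom(Html, '<span class="cat">', '</span>', P);
  WriteLn(Award, ' | ', Category);
end.
```

Leaving the cursor unchanged on a miss means a missing field can simply be skipped while the scan carries on from the same place.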
-
(22.12.2024 21:12:36) Compiling script: IMDB_People_[EN][HTTPS].psf
(22.12.2024 21:12:36) Script compiled successfully: IMDB_People_[EN][HTTPS].psf
[Hint] (492:7): Variable 'CURPOS' never used
[Hint] (493:7): Variable 'ITEMVALUE' never used
[Hint] (493:7): Variable 'IMAGEFILE' never used
[Hint] (494:7): Variable 'NAME' never used
[Hint] (494:7): Variable 'PREVIEWURL' never used
[Hint] (582:5): Variable 'CURPOS' never used
[Hint] (582:5): Variable 'ENDPOS' never used
[Hint] (582:5): Variable 'DEBUG_POS1' never used
[Hint] (582:5): Variable 'INDEX' never used
[Hint] (583:5): Variable 'PHOTOURL' never used
[Hint] (583:5): Variable 'ITEMVALUE' never used
[Hint] (583:5): Variable 'ITEMLIST' never used
[Hint] (583:5): Variable 'IMAGEFILE' never used
[Hint] (584:2): Variable 'PERSONID' never used
[Hint] (584:2): Variable 'ITEMVALUE0' never used
[Hint] (584:2): Variable 'ITEMVALUE1' never used
[Hint] (584:2): Variable 'ITEMVALUE2' never used
[Hint] (584:2): Variable 'ITEMVALUE3' never used
[Hint] (585:2): Variable 'JOBTITLE' never used
[Hint] (585:2): Variable 'ALTNAMES' never used
[Hint] (585:2): Variable 'ALTNAMES1' never used
[Hint] (585:2): Variable 'DEATHAGE' never used
[Hint] (586:2): Variable 'ITEMLIST0' never used
[Hint] (586:2): Variable 'ITEMLIST1' never used
[Hint] (586:2): Variable 'ITEMLIST2' never used
[Hint] (586:2): Variable 'ITEMLIST4' never used
[Hint] (587:2): Variable 'TITLE' never used
[Hint] (587:2): Variable 'ROLE' never used
[Hint] (587:2): Variable 'YEAR' never used
[Hint] (587:2): Variable 'MOVIEURL' never used
[Warning] (852:57): "True and" is not needed
[Warning] (852:29): "True and" is not needed
(22.12.2024 21:12:36) Executing script binary
(22.12.2024 21:12:36) Script loaded: IMDB_People_[EN][HTTPS].psf 1.4.3.5
(22.12.2024 21:12:37) Loading database: D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\MOVIES33B1B22.pvd
(22.12.2024 21:12:38) Person -> LoadStatic -> 0ms
(22.12.2024 21:12:38) Person -> LoadMultivalues -> 0ms
(22.12.2024 21:12:38) Person -> LoadFilms -> 0ms
(22.12.2024 21:12:38) Person -> LoadAwards -> 0ms
(22.12.2024 21:12:38) Person -> LoadImages -> 0ms
(22.12.2024 21:12:40) Compiling script: IMDB_People_[EN][HTTPS].psf
(22.12.2024 21:12:40) Script compiled successfully: IMDB_People_[EN][HTTPS].psf
[Hint] (492:7): Variable 'CURPOS' never used
[Hint] (493:7): Variable 'ITEMVALUE' never used
[Hint] (493:7): Variable 'IMAGEFILE' never used
[Hint] (494:7): Variable 'NAME' never used
[Hint] (494:7): Variable 'PREVIEWURL' never used
[Hint] (582:5): Variable 'CURPOS' never used
[Hint] (582:5): Variable 'ENDPOS' never used
[Hint] (582:5): Variable 'DEBUG_POS1' never used
[Hint] (582:5): Variable 'INDEX' never used
[Hint] (583:5): Variable 'PHOTOURL' never used
[Hint] (583:5): Variable 'ITEMVALUE' never used
[Hint] (583:5): Variable 'ITEMLIST' never used
[Hint] (583:5): Variable 'IMAGEFILE' never used
[Hint] (584:2): Variable 'PERSONID' never used
[Hint] (584:2): Variable 'ITEMVALUE0' never used
[Hint] (584:2): Variable 'ITEMVALUE1' never used
[Hint] (584:2): Variable 'ITEMVALUE2' never used
[Hint] (584:2): Variable 'ITEMVALUE3' never used
[Hint] (585:2): Variable 'JOBTITLE' never used
[Hint] (585:2): Variable 'ALTNAMES' never used
[Hint] (585:2): Variable 'ALTNAMES1' never used
[Hint] (585:2): Variable 'DEATHAGE' never used
[Hint] (586:2): Variable 'ITEMLIST0' never used
[Hint] (586:2): Variable 'ITEMLIST1' never used
[Hint] (586:2): Variable 'ITEMLIST2' never used
[Hint] (586:2): Variable 'ITEMLIST4' never used
[Hint] (587:2): Variable 'TITLE' never used
[Hint] (587:2): Variable 'ROLE' never used
[Hint] (587:2): Variable 'YEAR' never used
[Hint] (587:2): Variable 'MOVIEURL' never used
[Warning] (852:57): "True and" is not needed
[Warning] (852:29): "True and" is not needed
(22.12.2024 21:12:40) Executing script binary
(22.12.2024 21:12:40) Logging in to...
(22.12.2024 21:12:40) Person -> LoadStatic -> 0ms
(22.12.2024 21:12:40) Person -> LoadMultivalues -> 0ms
(22.12.2024 21:12:40) Person -> LoadFilms -> 0ms
(22.12.2024 21:12:40) Person -> LoadAwards -> 0ms
(22.12.2024 21:12:40) Person -> LoadImages -> 0ms
(22.12.2024 21:12:40) Function GetDownloadURL BEGIN======================|
(22.12.2024 21:12:40) Global Var-Mode|0|
(22.12.2024 21:12:40) Global Var-DownloadURL||
(22.12.2024 21:12:40) Person -> LoadStatic -> 0ms
(22.12.2024 21:12:40) Person -> LoadMultivalues -> 0ms
(22.12.2024 21:12:40) Person -> LoadFilms -> 0ms
(22.12.2024 21:12:40) Person -> LoadAwards -> 0ms
(22.12.2024 21:12:40) Person -> LoadImages -> 0ms
(22.12.2024 21:12:40) Stored URL is:http://www.imdb.com/name/nm0190859/awards/||
(22.12.2024 21:12:40) * Stored URL is:http://www.imdb.com/name/nm0190859/awards//||
(22.12.2024 21:12:40) IMDB URL.
(22.12.2024 21:12:40) Parse stored information DownloadURL:https://www.imdb.com/name/nm0190859/||
(22.12.2024 21:12:40) Function GetDownloadURL END====================== with Mode=1 Result=D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\portable.bat|
(22.12.2024 21:12:40) Searching for people information for: Alfonso Cuarón
(22.12.2024 21:12:40) Function ParsePage BEGIN======================|
(22.12.2024 21:12:40) Global Var-Mode|1|
(22.12.2024 21:12:40) Global Var-DownloadURL|https://www.imdb.com/name/nm0190859/|
(22.12.2024 21:12:40) Local Var-URL||
(22.12.2024 21:12:40) ParsePage mode smNormal|1|. Getting provider data for PersonID|nm0190859|
(22.12.2024 21:12:40) Function DownloadPage BEGIN======================|
(22.12.2024 21:12:40) Global Var-DownloadURL|https://www.imdb.com/name/nm0190859/|
(22.12.2024 21:12:40) Local Var-URL|https://www.imdb.com/name/nm0190859/|
(22.12.2024 21:12:40) Waiting 1s for delete:D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:41) Download with PVdBDownPage in file:|D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm the information of:|https://www.imdb.com/name/nm0190859/||
(22.12.2024 21:12:41) Waiting 2s for exists of:D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:43) Waiting 2s for exists of:D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:45) Now present complete page file: D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:45) Function DownloadPage END======================|
(22.12.2024 21:12:45) Function ParsePage_IMDBPersonBASE BEGIN======================|
(22.12.2024 21:12:45) Function ParsePage_IMDBPersonBASE END=====================||
(22.12.2024 21:12:45) Function DownloadPage BEGIN======================|
(22.12.2024 21:12:45) Global Var-DownloadURL|https://www.imdb.com/name/nm0190859/awards/|
(22.12.2024 21:12:45) Local Var-URL|https://www.imdb.com/name/nm0190859/awards/|
(22.12.2024 21:12:46) Waiting 1s for delete:D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:47) Download with PVdBDownPage in file:|D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm the information of:|https://www.imdb.com/name/nm0190859/awards/||
(22.12.2024 21:12:47) Waiting 2s for exists of:D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:49) Waiting 2s for exists of:D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:51) Waiting 2s for exists of:D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:53) Now present complete page file: D:\MyTest-PVD\Nova mapa (5)\PersonalVideoDBP11\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(22.12.2024 21:12:53) Function DownloadPage END======================|
(22.12.2024 21:12:53) Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||
(22.12.2024 21:12:53) Initial HTML snippet: <!DOCTYPE html><html lang="en-US" xmlns:og="http://opengraphprotocol.org/schema/" xmlns:fb="http://www.facebook.com/2008/fbml"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width"/><script>if(typeof uet === 'function'){ uet('bb', 'LoadTitle', {wb: 1}); }</script><script>window.addEventListener('load', (event) => {
if (typeof window.csa !== 'undefined' && typeof window.csa === 'function') {
var csaLatencyPlugin = window.csa('Content', {
(22.12.2024 21:12:53) Parsed Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Parsed Award: Golden Ariel
(22.12.2024 21:12:53) Parsed Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Parsed Year: 2019
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award: Golden Ariel
(22.12.2024 21:12:53) Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award: Silver Ariel
(22.12.2024 21:12:53) Parsed Category: Best Editing (Mejor Edición)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Parsed Year: 2019
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award: Silver Ariel
(22.12.2024 21:12:53) Category: Best Editing (Mejor Edición)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Editing (Mejor Edición), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Editing (Mejor Edición), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award: Silver Ariel
(22.12.2024 21:12:53) Parsed Category: Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Error: Year not found.
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award: Silver Ariel
(22.12.2024 21:12:53) Category: Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award: Golden Ariel
(22.12.2024 21:12:53) Parsed Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Parsed Year: 2019
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award: Golden Ariel
(22.12.2024 21:12:53) Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award: Silver Ariel
(22.12.2024 21:12:53) Parsed Category: Best Editing (Mejor Edición)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Parsed Year: 2019
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award: Silver Ariel
(22.12.2024 21:12:53) Category: Best Editing (Mejor Edición)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Editing (Mejor Edición), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Editing (Mejor Edición), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award: Silver Ariel
(22.12.2024 21:12:53) Parsed Category: Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Error: Year not found.
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award: Silver Ariel
(22.12.2024 21:12:53) Category: Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award: Golden Ariel
(22.12.2024 21:12:53) Parsed Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Parsed Year: 2019
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award: Golden Ariel
(22.12.2024 21:12:53) Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award: Silver Ariel
(22.12.2024 21:12:53) Parsed Category: Best Editing (Mejor Edición)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Parsed Year: 2019
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award: Silver Ariel
(22.12.2024 21:12:53) Category: Best Editing (Mejor Edición)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Editing (Mejor Edición), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Editing (Mejor Edición), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award: Silver Ariel
(22.12.2024 21:12:53) Parsed Category: Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Error: Year not found.
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award: Silver Ariel
(22.12.2024 21:12:53) Category: Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Screenplay Written Directly for the Screen (Mejor Guión Cinematográfico Original), Recipient=Roma, Year=2019, Won: True
(22.12.2024 21:12:53) Parsed Award: Golden Ariel
(22.12.2024 21:12:53) Parsed Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Parsed Recipient: Roma
(22.12.2024 21:12:53) Parsed Year: 2019
(22.12.2024 21:12:53) Parsed Won: True
(22.12.2024 21:12:53) Before calling AddAward with parameters:
(22.12.2024 21:12:53) Event: Ariel Awards, Mexico
(22.12.2024 21:12:53) Award: Golden Ariel
(22.12.2024 21:12:53) Category: Best Picture (Mejor Película)
(22.12.2024 21:12:53) Recipient: Roma
(22.12.2024 21:12:53) Year: 2019
(22.12.2024 21:12:53) Won: True
(22.12.2024 21:12:53) AddAward executed successfully.
(22.12.2024 21:12:53) IMDb People Awards added Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won=True
(22.12.2024 21:12:53) Added Award to Database: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True
Above is part of the log output. It shows that Function ParsePage_IMDBPeopleAWARDS never closes: after the BEGIN line, the same awards are parsed over and over and no END line is written. This is what I had in mind earlier: the part of the code that would end Function ParsePage_IMDBPeopleAWARDS is missing.
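For illustration only, here is a minimal Python sketch (not the PVD Pascal Script itself) of a marker-based scan that cannot parse the same block twice. The repeated awards in the log are consistent with a scan position that is not moved forward after each match, but that is an assumption; the actual cause inside ParsePage_IMDBPeopleAWARDS is not confirmed. The function name extract_blocks and the markers are hypothetical.

```python
def extract_blocks(html: str, start_marker: str, end_marker: str) -> list:
    """Collect every text block found between start_marker and end_marker.

    The scan position only ever moves forward, so the loop must end once
    the markers no longer occur in the remaining text.
    """
    blocks = []
    pos = 0
    while True:
        start = html.find(start_marker, pos)
        if start == -1:
            break  # no further block: leave the loop
        start += len(start_marker)
        end = html.find(end_marker, start)
        if end == -1:
            break  # unterminated block: stop instead of retrying
        blocks.append(html[start:end])
        pos = end + len(end_marker)  # continue after the block just read
    return blocks


if __name__ == "__main__":
    sample = "<li>Golden Ariel</li><li>Silver Ariel</li>"
    print(extract_blocks(sample, "<li>", "</li>"))
```

The same idea carries over to a position-based loop in the .psf: each pass has to start searching after the end of the previous match, and a "not found" result has to exit the loop.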
-
Here is the IMDB_People_[EN][HTTPS]_Awards script, which now correctly transfers the awards data to the awards field for the person 'Chico' Hernandez, taken from the URL below using a Python Selenium script.
https://www.imdb.com/name/nm0379491/awards/
I have corrected or added some parts of the code to your code and it works.
The Python Selenium script instructions and code will probably be published by the new year in the Integrating Selenium to PVD topic.
http://www.videodb.info/forum_en/index.php/topic,4357.0.html
-
Above is part of the log output. It shows that Function ParsePage_IMDBPeopleAWARDS never closes: after the BEGIN line, the same awards are parsed over and over and no END line is written. This is what I had in mind earlier: the part of the code that would end Function ParsePage_IMDBPeopleAWARDS is missing.
Wow! Strange things happen! Now I realize what you meant, but it never occurred to me since it didn't loop in my case; that's why I didn't understand!
Here is the IMDB_People_[EN][HTTPS]_Awards script, which now correctly transfers the awards data to the awards field for the person 'Chico' Hernandez, taken from the URL below using a Python Selenium script.
https://www.imdb.com/name/nm0379491/awards/
I have corrected or added some parts of the code to your code and it works.
The Python Selenium script instructions and code will probably be published by the new year in the Integrating Selenium to PVD topic.
http://www.videodb.info/forum_en/index.php/topic,4357.0.html
Thanks! It's so great that you are willing to look at the code I provide AND HELP! I'm still testing it, and it looks like it properly parses the awards inside events, but it always uses the name of the first event. In your case the person had only one award, but in my case there are multiple events. The first is "Ariel Awards, Mexico", and the log below shows Oscars, ALMA Awards, and others (more follow that I didn't post), all added under the "Ariel Awards, Mexico" event:
(12/23/2024 8:34:38 PM) Parsed Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Parsed Award: Golden Ariel
(12/23/2024 8:34:38 PM) Parsed Category: Best Picture (Mejor Película)
(12/23/2024 8:34:38 PM) Parsed Recipient: Roma
(12/23/2024 8:34:38 PM) Parsed Year: 2019
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award: Golden Ariel
(12/23/2024 8:34:38 PM) Category: Best Picture (Mejor Película)
(12/23/2024 8:34:38 PM) Recipient: Roma
(12/23/2024 8:34:38 PM) Year: 2019
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True
(12/23/2024 8:34:38 PM) Parsed Award: Silver Ariel
(12/23/2024 8:34:38 PM) Parsed Category: Best Original Story (Mejor Argumento Original)
(12/23/2024 8:34:38 PM) Parsed Recipient: Love in the Time of Hysteria
(12/23/2024 8:34:38 PM) Parsed Year: 1992
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award: Silver Ariel
(12/23/2024 8:34:38 PM) Category: Best Original Story (Mejor Argumento Original)
(12/23/2024 8:34:38 PM) Recipient: Love in the Time of Hysteria
(12/23/2024 8:34:38 PM) Year: 1992
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Original Story (Mejor Argumento Original), Recipient=Love in the Time of Hysteria, Year=1992, Won: True
(12/23/2024 8:34:38 PM) Parsed Award: Oscar
(12/23/2024 8:34:38 PM) Parsed Category: Best Achievement in Cinematography
(12/23/2024 8:34:38 PM) Parsed Recipient: Roma
(12/23/2024 8:34:38 PM) Parsed Year: 2019
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award: Oscar
(12/23/2024 8:34:38 PM) Category: Best Achievement in Cinematography
(12/23/2024 8:34:38 PM) Recipient: Roma
(12/23/2024 8:34:38 PM) Year: 2019
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Oscar, Category=Best Achievement in Cinematography, Recipient=Roma, Year=2019, Won: True
(12/23/2024 8:34:38 PM) Parsed Award: Oscar
(12/23/2024 8:34:38 PM) Parsed Category: Best Achievement in Film Editing
(12/23/2024 8:34:38 PM) Parsed Recipient: Gravity
(12/23/2024 8:34:38 PM) Parsed Year: 2007
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award: Oscar
(12/23/2024 8:34:38 PM) Category: Best Achievement in Film Editing
(12/23/2024 8:34:38 PM) Recipient: Gravity
(12/23/2024 8:34:38 PM) Year: 2007
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Oscar, Category=Best Achievement in Film Editing, Recipient=Gravity, Year=2007, Won: True
(12/23/2024 8:34:38 PM) Parsed Award: Saturn Award
(12/23/2024 8:34:38 PM) Parsed Category: Best Writing
(12/23/2024 8:34:38 PM) Parsed Recipient: Gravity
(12/23/2024 8:34:38 PM) Parsed Year: 2014
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award: Saturn Award
(12/23/2024 8:34:38 PM) Category: Best Writing
(12/23/2024 8:34:38 PM) Recipient: Gravity
(12/23/2024 8:34:38 PM) Year: 2014
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Saturn Award, Category=Best Writing, Recipient=Gravity, Year=2014, Won: True
(12/23/2024 8:34:38 PM) Parsed Award: ALMA Award
(12/23/2024 8:34:38 PM) Parsed Category: Outstanding Screenplay - Motion Picture
(12/23/2024 8:34:38 PM) Parsed Recipient: Children of Men
(12/23/2024 8:34:38 PM) Parsed Year: 1999
(12/23/2024 8:34:38 PM) Parsed Won: True
Fortunately, or unfortunately, I'm testing with Alfonso Cuarón, https://www.imdb.com/name/nm0190859/, who has 152 events and several hundred awards, so it should cover all the cases to be tested.
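To make the "always the first event" symptom concrete, here is a small Python sketch (again, not PVD Pascal Script) of the bookkeeping that avoids it: the current event name is updated every time a new event heading is met, and each award is stored with whatever event is current at that moment. The token list, the function name attach_events and the second event name are made up for the example; the real awards page structure is not reproduced here.

```python
from typing import List, Optional, Tuple


def attach_events(tokens: List[Tuple[str, str]]) -> List[Tuple[Optional[str], str]]:
    """Pair every award with the most recently seen event name.

    tokens is a flat, in-order list of ("event", name) and ("award", name)
    entries, as a parser would meet them while walking the page.
    """
    current_event: Optional[str] = None
    awards = []
    for kind, value in tokens:
        if kind == "event":
            current_event = value  # a new event section starts here
        elif kind == "award":
            awards.append((current_event, value))
    return awards


if __name__ == "__main__":
    page = [
        ("event", "Ariel Awards, Mexico"),
        ("award", "Golden Ariel"),
        ("award", "Silver Ariel"),
        ("event", "Second Event (hypothetical)"),
        ("award", "Oscar"),
    ]
    for event, award in attach_events(page):
        print(event, "|", award)
```

If the event name is read once before the loop over awards, every award inherits that first value, which matches what the log shows.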
ONE MORE IMPORTANT THING TO NOTE:
For some reason, PVD and the script won't work (at least for me) if I manually set the page to be parsed by Function ParsePage_IMDBPeopleAWARDS, like this for example:
// Parse Awards provider page = BASE_URL_AWARD_PERSON
If GET_FULL_AWARDS Then Begin
LogMessage('Starting to parse awards page.');
HTML := ('Tmp\UTF8_NO_BOM-Awards.mhtml');
LogMessage('Read awards page from file: ' + Copy(HTML, 1, 500)); // Log the file content
(I can't remember if this is the proper syntax, but I set it properly at the time of testing, whatever it was.)
It wouldn't work without downloading, so I had to fake the download with a completely new function:
Function DownloadPage1(URL: AnsiString; FileName: AnsiString): String;
Var
  ScriptPath, WebText: String;
Begin
  LogMessage(Chr(9)+Chr(9)+'Function DownloadPage1 BEGIN======================|');
  LogMessage(Chr(9)+Chr(9)+'Global Var-DownloadURL|'+URL+'|');
  LogMessage(Chr(9)+Chr(9)+' Local Var-URL|'+URL+'|');
  ScriptPath := GetAppPath + 'Scripts\';
  // Directly read the existing file instead of downloading
  If FileExists(ScriptPath + FileName) Then Begin
    LogMessage(Chr(9)+Chr(9)+' File already exists: '+ScriptPath + FileName);
    WebText := FileToString(ScriptPath + FileName);
    WebText := ConvertEncoding(WebText, 65001); // Convert to UTF-8
    Result := WebText;
    LogMessage(Chr(9)+Chr(9)+' Read file content successfully.');
  End Else Begin
    LogMessage(Chr(9)+Chr(9)+' File does not exist: '+ScriptPath + FileName);
    Result := '';
  End;
  LogMessage(Chr(9)+Chr(9)+'Function DownloadPage1 END======================|');
End;
and then "call the download":
// Parse Awards provider page = BASE_URL_AWARD_PERSON
If GET_FULL_AWARDS Then Begin
LogMessage('Starting to parse awards page.');
HTML := DownloadPage1(DownloadURL, 'Tmp\UTF8_NO_BOM-Awards.mhtml');
LogMessage('Read awards page from file: ' + Copy(HTML, 1, 500)); // Log the file content
When I reach the phase of passing the URL to Selenium to download the page, I'm still not sure how it will work in the .psf: will I have to fake the download after Selenium passes the page back, or something else? For someone who doesn't know how to code, this is too much to comprehend without actual trials.
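One possible shape for the hand-off, sketched in Python under assumptions: the Selenium side saves the page to an agreed file, and the reading side waits for that file to appear before using it, much like the "Waiting 2s for exists of" lines in the log. The function name read_when_ready, the timeout values and the idea of polling are illustrative; how the .psf would actually receive the page is still open.

```python
import time
from pathlib import Path
from typing import Optional


def read_when_ready(path: str, timeout_s: float = 10.0, poll_s: float = 0.5) -> Optional[str]:
    """Return the file's text once it exists and is non-empty, else None.

    Polls the path every poll_s seconds and gives up after timeout_s, so a
    missing file cannot block the caller forever.
    """
    target = Path(path)
    deadline = time.monotonic() + timeout_s
    while True:
        if target.is_file() and target.stat().st_size > 0:
            return target.read_text(encoding="utf-8", errors="replace")
        if time.monotonic() >= deadline:
            return None  # caller decides what to do when nothing arrived
        time.sleep(poll_s)


if __name__ == "__main__":
    # Hypothetical file name, modelled on the one used in this thread.
    text = read_when_ready("UTF8_NO_BOM-Awards.mhtml", timeout_s=2.0)
    print("no file yet" if text is None else text[:500])
```

A non-empty file may still be only partly written. If that turns out to matter, the writing side could save under a temporary name and rename the file when it is complete.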
-
Your code also loops:
Line 1372: (12/23/2024 9:53:57 PM) Parsed Award: Dorian Award
Line 1379: (12/23/2024 9:53:57 PM) Award: Dorian Award
Line 1385: (12/23/2024 9:53:57 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 2646: (12/23/2024 9:53:59 PM) Parsed Award: Dorian Award
Line 2653: (12/23/2024 9:53:59 PM) Award: Dorian Award
Line 2659: (12/23/2024 9:53:59 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 3920: (12/23/2024 9:54:01 PM) Parsed Award: Dorian Award
Line 3927: (12/23/2024 9:54:01 PM) Award: Dorian Award
Line 3933: (12/23/2024 9:54:01 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 5194: (12/23/2024 9:54:02 PM) Parsed Award: Dorian Award
Line 5201: (12/23/2024 9:54:02 PM) Award: Dorian Award
Line 5207: (12/23/2024 9:54:02 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 6468: (12/23/2024 9:54:04 PM) Parsed Award: Dorian Award
Line 6475: (12/23/2024 9:54:04 PM) Award: Dorian Award
Line 6481: (12/23/2024 9:54:04 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 7742: (12/23/2024 9:54:06 PM) Parsed Award: Dorian Award
Line 7749: (12/23/2024 9:54:06 PM) Award: Dorian Award
Line 7755: (12/23/2024 9:54:06 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 9016: (12/23/2024 9:54:08 PM) Parsed Award: Dorian Award
Line 9023: (12/23/2024 9:54:08 PM) Award: Dorian Award
Line 9029: (12/23/2024 9:54:08 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 10290: (12/23/2024 9:54:09 PM) Parsed Award: Dorian Award
Line 10297: (12/23/2024 9:54:09 PM) Award: Dorian Award
Line 10303: (12/23/2024 9:54:09 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 11564: (12/23/2024 9:54:11 PM) Parsed Award: Dorian Award
Line 11571: (12/23/2024 9:54:11 PM) Award: Dorian Award
Line 11577: (12/23/2024 9:54:11 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 12838: (12/23/2024 9:54:13 PM) Parsed Award: Dorian Award
Line 12845: (12/23/2024 9:54:13 PM) Award: Dorian Award
Line 12851: (12/23/2024 9:54:13 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 14112: (12/23/2024 9:54:15 PM) Parsed Award: Dorian Award
Line 14119: (12/23/2024 9:54:15 PM) Award: Dorian Award
Line 14125: (12/23/2024 9:54:15 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 15386: (12/23/2024 9:54:17 PM) Parsed Award: Dorian Award
Line 15393: (12/23/2024 9:54:17 PM) Award: Dorian Award
Line 15399: (12/23/2024 9:54:17 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 16660: (12/23/2024 9:54:18 PM) Parsed Award: Dorian Award
Line 16667: (12/23/2024 9:54:18 PM) Award: Dorian Award
Line 16673: (12/23/2024 9:54:18 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 17934: (12/23/2024 9:54:20 PM) Parsed Award: Dorian Award
Line 17941: (12/23/2024 9:54:20 PM) Award: Dorian Award
Line 17947: (12/23/2024 9:54:20 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 19208: (12/23/2024 9:54:21 PM) Parsed Award: Dorian Award
Line 19215: (12/23/2024 9:54:21 PM) Award: Dorian Award
Line 19221: (12/23/2024 9:54:21 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 20482: (12/23/2024 9:54:23 PM) Parsed Award: Dorian Award
Line 20489: (12/23/2024 9:54:23 PM) Award: Dorian Award
Line 20495: (12/23/2024 9:54:23 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 21756: (12/23/2024 9:54:25 PM) Parsed Award: Dorian Award
Line 21763: (12/23/2024 9:54:25 PM) Award: Dorian Award
Line 21769: (12/23/2024 9:54:25 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 23030: (12/23/2024 9:54:26 PM) Parsed Award: Dorian Award
Line 23037: (12/23/2024 9:54:27 PM) Award: Dorian Award
Line 23043: (12/23/2024 9:54:27 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 24304: (12/23/2024 9:54:28 PM) Parsed Award: Dorian Award
Line 24311: (12/23/2024 9:54:28 PM) Award: Dorian Award
Line 24317: (12/23/2024 9:54:28 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 25578: (12/23/2024 9:54:30 PM) Parsed Award: Dorian Award
Line 25585: (12/23/2024 9:54:30 PM) Award: Dorian Award
Line 25591: (12/23/2024 9:54:30 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 26852: (12/23/2024 9:54:32 PM) Parsed Award: Dorian Award
Line 26859: (12/23/2024 9:54:32 PM) Award: Dorian Award
Line 26865: (12/23/2024 9:54:32 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 28126: (12/23/2024 9:54:33 PM) Parsed Award: Dorian Award
Line 28133: (12/23/2024 9:54:33 PM) Award: Dorian Award
Line 28139: (12/23/2024 9:54:33 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 29400: (12/23/2024 9:54:35 PM) Parsed Award: Dorian Award
Line 29407: (12/23/2024 9:54:35 PM) Award: Dorian Award
Line 29413: (12/23/2024 9:54:35 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 30674: (12/23/2024 9:54:37 PM) Parsed Award: Dorian Award
Line 30681: (12/23/2024 9:54:37 PM) Award: Dorian Award
Line 30687: (12/23/2024 9:54:37 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 31948: (12/23/2024 9:54:39 PM) Parsed Award: Dorian Award
Line 31955: (12/23/2024 9:54:39 PM) Award: Dorian Award
Line 31961: (12/23/2024 9:54:39 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 33222: (12/23/2024 9:54:41 PM) Parsed Award: Dorian Award
Line 33229: (12/23/2024 9:54:41 PM) Award: Dorian Award
Line 33235: (12/23/2024 9:54:41 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 34496: (12/23/2024 9:54:43 PM) Parsed Award: Dorian Award
Line 34503: (12/23/2024 9:54:43 PM) Award: Dorian Award
Line 34509: (12/23/2024 9:54:43 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 35770: (12/23/2024 9:54:44 PM) Parsed Award: Dorian Award
Line 35777: (12/23/2024 9:54:44 PM) Award: Dorian Award
Line 35783: (12/23/2024 9:54:44 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 37044: (12/23/2024 9:54:46 PM) Parsed Award: Dorian Award
Line 37051: (12/23/2024 9:54:46 PM) Award: Dorian Award
Line 37057: (12/23/2024 9:54:46 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 38318: (12/23/2024 9:54:48 PM) Parsed Award: Dorian Award
Line 38325: (12/23/2024 9:54:48 PM) Award: Dorian Award
Line 38331: (12/23/2024 9:54:48 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 39592: (12/23/2024 9:54:50 PM) Parsed Award: Dorian Award
Line 39599: (12/23/2024 9:54:50 PM) Award: Dorian Award
Line 39605: (12/23/2024 9:54:50 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 40866: (12/23/2024 9:54:52 PM) Parsed Award: Dorian Award
Line 40873: (12/23/2024 9:54:52 PM) Award: Dorian Award
Line 40879: (12/23/2024 9:54:52 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Dorian Award, Category=Screenplay of the Year, Recipient=Roma, Year=2014, Won: True
Line 42140: (12/23/2024 9:54:54 PM) Parsed Award: Dorian Award
Line 42147: (12/23/2024 9:54:54 PM) Award: Dorian Award
so that's one more thing to resolve
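The log above adds the same Dorian Award entry over and over. One possible cause (an assumption, since the log alone does not confirm it) is a search that restarts at the position of the previous match instead of after it. A minimal Python sketch of the idea follows; `find_all` is a hypothetical helper, not part of PVD, and Python's `str.find` returns -1 where the script's PosFrom returns 0.

```python
def find_all(text, marker):
    """Return the start offset of every occurrence of marker in text."""
    positions = []
    cur = text.find(marker)  # -1 when the marker is absent
    while cur != -1:
        positions.append(cur)
        # Search again strictly after the current match. Restarting at
        # `cur` itself would return the same offset on every pass.
        cur = text.find(marker, cur + len(marker))
    return positions


# Made-up sample markup with two list items.
sample = '<li data-testid="list-item">A</li><li data-testid="list-item">B</li>'
print(len(find_all(sample, '<li data-testid="list-item"')))
```

In the PVD script the equivalent step would be to pass a position past the previous match (for example the position of its closing tag) as the start argument of the next search.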
-
Yes, I know about this issue in the log file.
-
Yes, I know about this problem. I have looked into the log files and found a partial solution that should help.
The code is not yet complete:
Function ParsePage_IMDBPeopleAWARDS(HTML: String): Cardinal;
Var
curPos, endPos: Integer;
ItemList, Event, Award, Category, Recipient, Year: String;
AValue: String; // Declaring AValue as a String
Won: Boolean;
FailSafe: Integer; // To prevent infinite loops
curPos1,curPos2,curPos3,curPos4,endPos1,endPos2:Integer;
Begin
LogMessage('Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||');
try
Result := prFinished;
// Log the initial HTML snippet being parsed
LogMessage('Initial HTML snippet: ' + Copy(HTML, 1, 500));
// Find the position of the Awards title
curPos := Pos('<h1 class="ipc-title__text">Awards</h1>', HTML);
If curPos > 0 Then Begin
// Find the position of the Awards section
curPos := PosFrom('<section class="ipc-page-section ipc-page-section--base">', HTML, curPos);
End;
If curPos > 0 Then Begin
// Find the end position of the Awards section
endPos := PosFrom('<h3 class="ipc-title__text"><span id="contribute">Contribute to this page</span>', HTML, curPos);
If endPos = 0 Then endPos := Length(HTML);
If (curPos > 0) AND (endPos > curPos) Then Begin
// Extract the Awards block
ItemList := Copy(HTML, curPos, endPos - curPos);
//LogMessage(ItemList);
//While curPos > 0 Do Begin
// Extract and log the award name
// Extract and log the event name
//curPos := PosFrom('?ref_=nmawd" class="ipc-title-link-wrapper" tabindex="0"><h3 class="ipc-title__text">', ItemList, curPos);
//curPos := PosFrom('?ref_=nmawd" class="ipc-title-link-wrapper" tabindex="0"><h3 class="ipc-title__text">', ItemList, 1);
curPos := PosFrom('?ref_=nmawd" class="ipc-title-link-wrapper" tabindex="0"><h3 class="ipc-title__text"><span id="ev', ItemList, 0);
FailSafe := 0; // Initialize fail-safe counter
//While curPos > 0 Do Begin
While (curPos > 0) And (FailSafe < 10) Do Begin
// Extract and log the award name
If curPos > 0 Then Begin
curPos := PosFrom('>', ItemList, curPos) + 29;
endPos := PosFrom('</span>', ItemList, curPos);
Event := Copy(ItemList, curPos, endPos - curPos);
LogMessage('** Parsed Event: ' + Event);
Event := RemoveTagsEx0(Event);
Event := Trim(Event);
LogMessage('* Parsed Event: ' + Event);
//Event := RemoveTagsEx1(Trim(Event));
// Remove the <span> tag
Event := Copy(Event, Pos('>', Event) + 1 , Length(Event));
LogMessage('Parsed Event: ' + Event);
//(*
// Parse each award item manually
curPos := PosFrom('<li class="ipc-metadata-list-summary-item sc-15fc9ae6-1 gQbMPJ" data-testid="list-item"', ItemList, 1);
//curPos := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', ItemList, curPos + 1);
If curPos > 0 Then Begin
//*)
//(*
curPos1 := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', ItemList, curPos1 + 16);
If curPos1 > 0 Then Begin
curPos1 := PosFrom('>', ItemList, curPos1) + 1;
endPos1 := PosFrom('</span>', ItemList, curPos1);
Award := Copy(ItemList, curPos1, endPos1 - curPos1);
LogMessage('Parsed Award: ' + Award);
//*)
(*
// Log the parameters before calling AddAward
//LogMessage('Before calling AddAward with parameters:');
//LogMessage('Event: ' + Event);
//LogMessage('Award: ' + Award);
//LogMessage('Category: ' + Category);
//LogMessage('Recipient: ' + Recipient);
//LogMessage('Year: ' + Year);
//LogMessage('Won: ' + CustomBoolToStr(Won));
*)
// Populate the custom field with AValue
//AddCustomFieldValueByName('IMDb People Awards', AValue);
// LogMessage('IMDb People Awards added ' + AValue)
//(*
//curPos1 := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', ItemList, EndPos1 + 10);
curPos1 := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', ItemList, curPos1 + 0);
//curPos1 := PosFrom('<span/class="ipc-metadata-list-summary-item__tst">', ItemList, curPos1);
//curPos1 := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', ItemList, EndPos1);
End Else LogMessage('Error: Award not found.');
//*)
//(*
// Move to the next item
curPos := PosFrom('<li class="ipc-metadata-list-summary-item sc-15fc9ae6-1 gQbMPJ" data-testid="list-item"', ItemList, curPos + 0);
End Else LogMessage('Error: Awards not found.');
//*)
//curPos := PosFrom('?ref_=nmawd" class="ipc-title-link-wrapper" tabindex="0"><h3 class="ipc-title__text"><span id="ev', ItemList, curPos + 1);
//curPos := PosFrom('?ref_=nmawd" class="ipc-title-link-wrapper" tabindex="0"><h3 class="ipc-title__text">', ItemList, 1);
//curPos := PosFrom('?ref_=nmawd" class="ipc-title-link-wrapper" tabindex="0"><h3 class="ipc-title__text">', ItemList, curPos);
curPos := PosFrom('?ref_=nmawd" class="ipc-title-link-wrapper" tabindex="0"><h3 class="ipc-title__text">', ItemList, EndPos);
End Else LogMessage('Error: Event title div not found.');
End;
End Else LogMessage('Error: Invalid endPos or curPos for Awards section');
End Else LogMessage('Error: Awards section not found');
except
Begin
LogMessage('Exception encountered');
Result := prError;
End;
end;
LogMessage('Function ParsePage_IMDBPeopleAWARDS END=====================||');
// Result is already set above: prFinished at the start of the try block, prError in the except block.
End;
Log details:
(24.12.2024 14:28:23) Function DownloadPage END======================|
(24.12.2024 14:28:23) Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||
(24.12.2024 14:28:23) Initial HTML snippet: <!DOCTYPE html><html lang="en-US" xmlns:og="http://opengraphprotocol.org/schema/" xmlns:fb="http://www.facebook.com/2008/fbml"><head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width"/><script>if(typeof uet === 'function'){ uet('bb', 'LoadTitle', {wb: 1}); }</script><script>window.addEventListener('load', (event) => {
if (typeof window.csa !== 'undefined' && typeof window.csa === 'function') {
var csaLatencyPlugin = window.csa('Content', {
(24.12.2024 14:28:23) ** Parsed Event: <span id="ev0000386">Kids' Choice Awards, USA
(24.12.2024 14:28:23) * Parsed Event: Kids' Choice Awards, USA
(24.12.2024 14:28:23) Parsed Event: Kids' Choice Awards, USA
(24.12.2024 14:28:23) Parsed Award: Blimp Award
(24.12.2024 14:28:23) ** Parsed Event: <span id="ev0000616">Soap Opera Digest Awards
(24.12.2024 14:28:23) * Parsed Event: Soap Opera Digest Awards
(24.12.2024 14:28:23) Parsed Event: Soap Opera Digest Awards
(24.12.2024 14:28:23) Parsed Award: Soap Opera Digest Award
(24.12.2024 14:28:23) ** Parsed Event: <span id="ev0000716">Young Artist Awards
(24.12.2024 14:28:23) * Parsed Event: Young Artist Awards
(24.12.2024 14:28:23) Parsed Event: Young Artist Awards
(24.12.2024 14:28:23) Parsed Award: Young Artist Award
(24.12.2024 14:28:23) ** Parsed Event: <span id="ev0000718">YoungStar Awards
(24.12.2024 14:28:23) * Parsed Event: YoungStar Awards
(24.12.2024 14:28:23) Parsed Event: YoungStar Awards
(24.12.2024 14:28:23) Parsed Award: Young Artist Award
(24.12.2024 14:28:23) Function ParsePage_IMDBPeopleAWARDS END=====================||
(24.12.2024 14:28:23) Provider data info retreived Ok in 2024-12-24 14:28:23|
(24.12.2024 14:28:23) Function ParsePage smNormal END======================|
(24.12.2024 14:28:23) Person -> LoadStatic -> 0ms
(24.12.2024 14:28:23) Person -> LoadMultivalues -> 0ms
(24.12.2024 14:28:23) Person -> LoadFilms -> 0ms
(24.12.2024 14:28:23) Person -> LoadAwards -> 0ms
(24.12.2024 14:28:23) Person -> LoadImages -> 0ms
<span id="ev0000718">YoungStar Awards
helps identify which event the awards refer to.
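As a rough illustration of using that marker, here is a small Python sketch that pulls the event id and event name out of each such span. The surrounding h3 markup in the sample is an assumption based on the strings searched for in the script above, and `parse_events` is a hypothetical helper, not something from PVD or IMDb.

```python
import re

# Matches the opening span of an event title and captures its id and text.
EVENT_RE = re.compile(r'<span id="(ev\d+)">([^<]*)')


def parse_events(html):
    """Return (event_id, event_name) pairs in document order."""
    return [(m.group(1), m.group(2).strip()) for m in EVENT_RE.finditer(html)]


# Sample built from two event ids that appear in the log above.
snippet = (
    '<h3 class="ipc-title__text"><span id="ev0000386">'
    "Kids' Choice Awards, USA</span></h3>"
    '<h3 class="ipc-title__text"><span id="ev0000718">'
    'YoungStar Awards</span></h3>'
)
print(parse_events(snippet))
```

The id could serve as a stable key for grouping the awards that follow each event title, even if two events happen to share a display name.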
-
Here is the IMDB_People_[EN][HTTPS]_Awards script. Using a Python Selenium script, it now correctly transfers the Awards data to the awards field for the person 'Chico' Hernandez from the URL below:
https://www.imdb.com/name/nm0379491/awards/
I corrected or added some parts of your code, and it works.
Instructions and code for the Python Selenium script will probably be published by the new year in the Integrating Selenium to PVD topic:
http://www.videodb.info/forum_en/index.php/topic,4357.0.html
The Python Selenium script is at the link below.
http://www.videodb.info/forum_en/index.php/topic,4362.msg22691.html#msg22691
The IMDB_[EN][HTTPS]_TEST_Aka script is at the link below.
http://www.videodb.info/forum_en/index.php/topic,4363.0.html
-
Unfortunately, I don't plan to work on any IMDb Awards section anymore, whether updates or fixes to the movie or people code in Function ParsePage_IMDBMovieAWARDS. The layout and notation of the Awards page source code are too complicated and too poorly suited to parsing for me to edit the code so that it records the Awards data properly.
-
I completely understand. It is so complicated that even AI hasn't been able to do anything about it so far.
The best I could do is to get 2 functions.
The first parses all events, but none of the awards:
Function ParsePage_IMDBPeopleAWARDS(HTML: String): Cardinal;
Var
curPos, endPos: Integer;
Event, Award, Category, Recipient, Year: String;
Won: Boolean;
FailSafe: Integer; // To prevent infinite loops
Begin
LogMessage('Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||');
Result := prFinished;
// Log the first 500 characters of the initial HTML snippet
LogMessage('Initial HTML snippet (first 500 chars): ' + Copy(HTML, 1, 500));
// Log the last 500 characters of the initial HTML snippet
LogMessage('Initial HTML snippet (last 500 chars): ' + Copy(HTML, Length(HTML) - 499, 500));
// Initialize the search for the first event section
curPos := Pos('<section class="ipc-page-section ipc-page-section--base">', HTML);
LogMessage('curPos after finding first event section: ' + IntToStr(curPos));
If curPos > 0 Then Begin
FailSafe := 0; // Initialize fail-safe counter
While (curPos > 0) And (FailSafe < 200) Do Begin
// Ensure we don't exceed the HTML length
If curPos >= Length(HTML) Then Break;
// Extract the Event Name
curPos := PosFrom('<span id="ev', HTML, curPos);
If curPos > 0 Then Begin
curPos := PosFrom('>', HTML, curPos) + 1;
endPos := PosFrom('</span>', HTML, curPos);
Event := Trim(Copy(HTML, curPos, endPos - curPos));
LogMessage('Parsed Event: ' + Event);
// Process each award within the event
curPos := endPos;
While (curPos > 0) And (curPos < Length(HTML)) And (PosFrom('</section><section class="ipc-page-section ipc-page-section--base">', HTML, curPos) = 0) And (PosFrom('</section><div class="nas-slot">', HTML, curPos) = 0) Do Begin
curPos := PosFrom('<div data-testid="sub-section-', HTML, curPos);
If curPos > 0 Then Begin
curPos := PosFrom('>', HTML, curPos) + 1;
endPos := PosFrom('<>', HTML, curPos);
Award := Copy(HTML, curPos, endPos - curPos);
// LogMessage('Extracted Award Content: ' + Award);
// Parse award details from the Award block
// Extract Award Name
curPos := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', HTML, curPos);
If curPos > 0 Then Begin
curPos := PosFrom('>', HTML, curPos) + 1;
endPos := PosFrom('</span>', HTML, curPos);
Award := Copy(HTML, curPos, endPos - curPos);
End;
// Extract Category
curPos := PosFrom('<span class="ipc-metadata-list-summary-item__li awardCategoryName"', Award, 1);
If curPos > 0 Then Begin
curPos := PosFrom('>', Award, curPos) + 1;
endPos := PosFrom('</span>', Award, curPos);
Category := Copy(Award, curPos, endPos - curPos);
End;
// Extract Recipient
curPos := PosFrom('<a class="ipc-metadata-list-summary-item__li ipc-metadata-list-summary-item__li--link"', Award, 1);
If curPos > 0 Then Begin
curPos := PosFrom('>', Award, curPos) + 1;
endPos := PosFrom('</a>', Award, curPos);
Recipient := Copy(Award, curPos, endPos - curPos);
End;
// Extract Year
curPos := PosFrom('<a class="ipc-metadata-list-summary-item__t"', Award, 1);
If curPos > 0 Then Begin
curPos := PosFrom('>', Award, curPos) + 1;
endPos := PosFrom(' ', Award, curPos); // Find the space after the year
Year := Copy(Award, curPos, endPos - curPos);
Year := Trim(Year);
End;
// Determine if the award was won
Won := PosFrom('Winner', Award, 1) > 0;
// Add award to the database
AddAward(Event, Award, Category, Recipient, Year, Won);
If Won Then
LogMessage('AddAward executed successfully: Event=' + Event + ', Award=' + Award + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won=True')
Else
LogMessage('AddAward executed successfully: Event=' + Event + ', Award=' + Award + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won=False');
End;
End;
End;
// Move to the next event or end of awards block
If PosFrom('</section><section class="ipc-page-section ipc-page-section--base">', HTML, curPos) > 0 Then
curPos := PosFrom('</section><section class="ipc-page-section ipc-page-section--base">', HTML, curPos) + Length('</section><section class="ipc-page-section ipc-page-section--base">')
Else If PosFrom('<div class="nas-slot">', HTML, curPos) > 0 Then Begin
LogMessage('End of awards block detected.');
Break;
End Else Begin
LogMessage('Error: Unable to identify next event or end of awards block.');
Break;
End;
Inc(FailSafe);
End;
End Else LogMessage('Error: First event section not found');
LogMessage('Function ParsePage_IMDBPeopleAWARDS END=====================||');
Result := prFinished;
End;
//BlockClose
The second one parses all awards but only the first event, and assigns all the awards to that event:
Function ParsePage_IMDBPeopleAWARDS(HTML: String): Cardinal;
Var
curPos, endPos, awardPos, categoryPos, recipientPos: Integer;
Event, Award, Category, Recipient, Year: String;
Won: Boolean;
Begin
LogMessage('Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||');
Result := prFinished;
// Locate the start of the specific event section
curPos := Pos('<section class="ipc-page-section ipc-page-section--base">', HTML);
LogMessage('curPos after finding event section: ' + IntToStr(curPos));
If curPos > 0 Then Begin
// Extract event name
curPos := PosFrom('<span id="ev', HTML, curPos);
If curPos = 0 Then Begin
LogMessage('Event name not found');
Exit;
End;
curPos := PosFrom('>', HTML, curPos) + 1;
endPos := PosFrom('</span>', HTML, curPos);
Event := Trim(Copy(HTML, curPos, endPos - curPos));
LogMessage('Parsed Event: ' + Event);
curPos := endPos;
// Process awards within this event
While curPos > 0 Do Begin
// Find next award div
curPos := PosFrom('<li class="ipc-metadata-list-summary-item sc-15fc9ae6-1 gQbMPJ" data-testid="list-item">', HTML, curPos);
If curPos = 0 Then Begin
LogMessage('No more awards found in this event');
Break;
End;
LogMessage('curPos after finding award div: ' + IntToStr(curPos));
awardPos := curPos; // Save the starting position of the award
curPos := PosFrom('>', HTML, curPos) + 1;
endPos := PosFrom('<></section>', HTML, curPos); // Adjusted to the correct closing tag
If endPos = 0 Then Begin
LogMessage('No closing tag for award div found');
Break; // No more awards
End;
Award := Copy(HTML, awardPos, endPos - awardPos);
curPos := endPos + Length('<></section>');
// LogMessage('Award Content Extracted Successfully: ' + Award);
// Extract year
awardPos := PosFrom('<a class="ipc-metadata-list-summary-item__t"', Award, 1);
If awardPos = 0 Then Begin
LogMessage('Year not found');
Continue;
End;
awardPos := PosFrom('>', Award, awardPos) + 1;
endPos := PosFrom(' ', Award, awardPos); // Find the space after the year
Year := Copy(Award, awardPos, endPos - awardPos);
Year := Trim(Year);
LogMessage('Parsed Year: ' + Year);
// Determine if the award was won
Won := PosFrom('Winner', Award, 1) > 0;
If Won Then
LogMessage('Parsed Won: True')
Else
LogMessage('Parsed Won: False');
// Extract Category
categoryPos := PosFrom('<span class="ipc-metadata-list-summary-item__li awardCategoryName"', Award, awardPos);
LogMessage('EVE GA CAT: ' + IntToStr(categoryPos));
If categoryPos > 0 Then Begin
categoryPos := PosFrom('>', Award, categoryPos) + 1;
endPos := PosFrom('</span>', Award, categoryPos);
Category := Copy(Award, categoryPos, endPos - categoryPos);
End;
LogMessage('Parsed Category Name: ' + Category);
// Extract recipient
recipientPos := PosFrom('<a class="ipc-metadata-list-summary-item__li ipc-metadata-list-summary-item__li--link"', Award, categoryPos);
If recipientPos = 0 Then Begin
LogMessage('Recipient tag not found');
Continue;
End;
recipientPos := PosFrom('>', Award, recipientPos) + 1;
endPos := PosFrom('</a>', Award, recipientPos);
Recipient := Copy(Award, recipientPos, endPos - recipientPos);
LogMessage('Parsed Recipient: ' + Recipient);
// Extract award name
awardPos := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', Award, awardPos);
If awardPos = 0 Then Begin
LogMessage('Award Name not found');
Continue;
End;
awardPos := PosFrom('>', Award, awardPos) + 1;
LogMessage('Parsed awardPos ' + IntToStr(awardPos));
endPos := PosFrom('</span>', Award, awardPos);
LogMessage('Parsed endPos ' + IntToStr(endPos));
Award := Copy(Award, awardPos, endPos - awardPos);
LogMessage('Parsed Award Name: ' + Award);
// Add award to the database
AddAward(Event, Award, Category, Recipient, Year, Won);
If Won Then
LogMessage('AddAward executed successfully: Event=' + Event + ', Award=' + Award + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won=True')
Else
LogMessage('AddAward executed successfully: Event=' + Event + ', Award=' + Award + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won=False');
// Advance curPos to ensure moving to the next award
curPos := PosFrom('<li class="ipc-metadata-list-summary-item sc-15fc9ae6-1 gQbMPJ" data-testid="list-item">', HTML, curPos);
End;
End Else LogMessage('Error: Event section not found');
LogMessage('Function ParsePage_IMDBPeopleAWARDS END=====================||');
Result := prFinished;
End;
//BlockClose
For the life of me, I cannot combine them, no matter what I try. Not even close... :o :'(
How is that even possible?
The page I'm trying to parse is attached, as well as the script, which contains the fixed genres and bio.
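One way to think about combining the two functions is an outer scan over events and an inner scan over award items that stops at the start of the next event. The Python sketch below shows only that control flow. The markers are shortened versions of the strings used in the functions above, the sample markup and the award name "YoungStar Award" are made up, and `parse_awards` is a hypothetical name; it is not a drop-in replacement for the PVD function.

```python
EVENT_MARK = '<span id="ev'
ITEM_MARK = '<li class="ipc-metadata-list-summary-item'
NAME_MARK = '<span class="ipc-metadata-list-summary-item__tst">'


def parse_awards(html):
    """Return (event, award_name) pairs, each award tied to its own event."""
    results = []
    ev_pos = html.find(EVENT_MARK)
    while ev_pos != -1:
        name_start = html.find('>', ev_pos) + 1
        name_end = html.find('</span>', name_start)
        event = html[name_start:name_end].strip()
        # The current event's block ends where the next event begins.
        next_ev = html.find(EVENT_MARK, name_end)
        block_end = next_ev if next_ev != -1 else len(html)
        item = html.find(ITEM_MARK, name_end)
        while item != -1 and item < block_end:
            item_end = html.find('</li>', item)
            if item_end == -1 or item_end > block_end:
                break
            tst = html.find(NAME_MARK, item, item_end)
            if tst != -1:
                a_start = tst + len(NAME_MARK)
                a_end = html.find('</span>', a_start)
                results.append((event, html[a_start:a_end].strip()))
            # Continue after this item so the same award is not read twice.
            item = html.find(ITEM_MARK, item_end)
        ev_pos = next_ev
    return results


sample = (
    '<span id="ev0000716">Young Artist Awards</span>'
    '<li class="ipc-metadata-list-summary-item" data-testid="list-item">'
    '<span class="ipc-metadata-list-summary-item__tst">Young Artist Award</span></li>'
    '<span id="ev0000718">YoungStar Awards</span>'
    '<li class="ipc-metadata-list-summary-item" data-testid="list-item">'
    '<span class="ipc-metadata-list-summary-item__tst">YoungStar Award</span></li>'
)
print(parse_awards(sample))
```

The key design choice is computing the end of the current event's block before entering the inner loop, so an award item can never be attributed to the wrong event.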
-
And this one is probably FINALLY a winner (I still have to tweak Recipient):
Function ParsePage_IMDBPeopleAWARDS(HTML: String): Cardinal;
Var
curPos, endPos, awardPos, categoryPos, recipientPos, yearPos, eventEndPos, namePos: Integer;
Event, Award, AwardName, Category, Recipient, Year: String;
Won: Boolean;
Begin
LogMessage('Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||');
Result := prFinished;
// Locate the start of the first event section
curPos := Pos('<section class="ipc-page-section ipc-page-section--base">', HTML);
While curPos > 0 Do Begin
LogMessage('curPos after finding event section: ' + IntToStr(curPos));
// Extract event name
curPos := PosFrom('<span id="ev', HTML, curPos);
If curPos = 0 Then Begin
LogMessage('Event name not found');
Break;
End;
curPos := PosFrom('>', HTML, curPos) + 1;
endPos := PosFrom('</span>', HTML, curPos);
Event := Trim(Copy(HTML, curPos, endPos - curPos));
LogMessage('Parsed Event: ' + Event);
// Move cursor to start processing awards within the event
curPos := endPos;
eventEndPos := PosFrom('<section class="ipc-page-section ipc-page-section--base">', HTML, curPos);
If eventEndPos = 0 Then
eventEndPos := Length(HTML); // Set to the end of HTML if no more events
// Process awards within the event
While curPos < eventEndPos Do Begin
// Find next award div within the current event
awardPos := PosFrom('<li class="ipc-metadata-list-summary-item sc-15fc9ae6-1 gQbMPJ" data-testid="list-item">', HTML, curPos);
If (awardPos = 0) Or (awardPos >= eventEndPos) Then Begin
LogMessage('No more awards found in this event');
Break;
End;
LogMessage('curPos after finding award div: ' + IntToStr(awardPos));
// Extract entire award block
curPos := awardPos;
endPos := PosFrom('</li>', HTML, curPos);
If endPos = 0 Then Begin
LogMessage('No closing tag for award div found');
Break;
End;
Award := Copy(HTML, curPos, endPos - curPos);
curPos := endPos + Length('</li>');
LogMessage('Award Content Extracted Successfully: ' + Award);
// Extract year
yearPos := PosFrom('<a class="ipc-metadata-list-summary-item__t"', Award, 1);
If yearPos = 0 Then Begin
LogMessage('Year not found');
Continue;
End;
yearPos := PosFrom('>', Award, yearPos) + 1;
endPos := PosFrom(' ', Award, yearPos);
Year := Copy(Award, yearPos, endPos - yearPos);
Year := Trim(Year);
LogMessage('Parsed Year: ' + Year);
// Determine if the award was won
Won := PosFrom('Winner', Award, 1) > 0;
If Won Then
LogMessage('Parsed Won: True')
Else
LogMessage('Parsed Won: False');
// Extract award name
namePos := PosFrom('<span class="ipc-metadata-list-summary-item__tst">', Award, 1);
If namePos > 0 Then Begin
namePos := PosFrom('>', Award, namePos) + 1;
endPos := PosFrom('</span>', Award, namePos);
AwardName := Copy(Award, namePos, endPos - namePos);
LogMessage('Parsed Award Name: ' + AwardName);
End Else Begin
LogMessage('Award Name not found');
AwardName := '';
End;
// Extract category
categoryPos := PosFrom('<span class="ipc-metadata-list-summary-item__li awardCategoryName" aria-disabled="false">', Award, 1);
If categoryPos > 0 Then Begin
categoryPos := PosFrom('>', Award, categoryPos) + 1;
endPos := PosFrom('</span>', Award, categoryPos);
Category := Copy(Award, categoryPos, endPos - categoryPos);
LogMessage('Parsed Category: ' + Category);
End Else Begin
LogMessage('Category tag not found');
Category := '';
End;
// Extract recipient
recipientPos := PosFrom('<a class="ipc-metadata-list-summary-item__li ipc-metadata-list-summary-item__li--link"', Award, endPos + Length('</span>') + 1);
If recipientPos > 0 Then Begin
recipientPos := PosFrom('>', Award, recipientPos) + 1;
endPos := PosFrom('</a>', Award, recipientPos);
Recipient := Copy(Award, recipientPos, endPos - recipientPos);
LogMessage('Parsed Recipient: ' + Recipient);
End Else Begin
LogMessage('Recipient tag not found');
Recipient := '';
End;
// Add award to the database
AddAward(Event, AwardName, Category, Recipient, Year, Won);
If Won Then
LogMessage('AddAward executed successfully: Event=' + Event + ', Award=' + AwardName + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won=True')
Else
LogMessage('AddAward executed successfully: Event=' + Event + ', Award=' + AwardName + ', Category=' + Category + ', Recipient=' + Recipient + ', Year=' + Year + ', Won=False');
End;
// Move to the next event section
curPos := eventEndPos;
curPos := PosFrom('<section class="ipc-page-section ipc-page-section--base">', HTML, curPos);
End;
LogMessage('Function ParsePage_IMDBPeopleAWARDS END=====================||');
Result := prFinished;
End;
//BlockClose
Here's the beginning of the log (first event is "Ariel Awards, Mexico" and the first award in it is "Golden Ariel") and the end of the log ("BOFA" is the last award of the last event of this page - "Brazil Online Film Award"):
(12/28/2024 8:35:07 PM) Function ParsePage_IMDBPeopleAWARDS BEGIN=====================||
(12/28/2024 8:35:07 PM) curPos after finding event section: 148924
(12/28/2024 8:35:07 PM) Parsed Event: Ariel Awards, Mexico
(12/28/2024 8:35:07 PM) curPos after finding award div: 149982
(12/28/2024 8:35:07 PM) Parsed Year: 2019
(12/28/2024 8:35:07 PM) Parsed Won: True
(12/28/2024 8:35:07 PM) Parsed Award Name: Golden Ariel
(12/28/2024 8:35:07 PM) Parsed Category: Best Picture (Mejor Película)
(12/28/2024 8:35:07 PM) Recipient tag not found
(12/28/2024 8:35:07 PM) AddAward executed successfully: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=, Year=2019, Won=True
============= intermediate logs here
(12/28/2024 8:35:10 PM) AddAward executed successfully: Event=Premios Eres, Award= Premio Eres, Category=Best Picture (Mejor Película), Recipient=, Year=1993, Won=False
(12/28/2024 8:35:10 PM) No more awards found in this event
(12/28/2024 8:35:10 PM) curPos after finding event section: 1574223
(12/28/2024 8:35:10 PM) Parsed Event: Brazil Online Film Award
(12/28/2024 8:35:10 PM) curPos after finding award div: 1575285
(12/28/2024 8:35:10 PM) Parsed Year: 2019
(12/28/2024 8:35:10 PM) Parsed Won: True
(12/28/2024 8:35:10 PM) Parsed Award Name: BOFA
(12/28/2024 8:35:10 PM) Parsed Category: Best Director
(12/28/2024 8:35:10 PM) Recipient tag not found
(12/28/2024 8:35:10 PM) AddAward executed successfully: Event=Brazil Online Film Award, Award= BOFA, Category=Best Director, Recipient=, Year=2019, Won=True
(12/28/2024 8:35:10 PM) No more awards found in this event
(12/28/2024 8:35:10 PM) Function ParsePage_IMDBPeopleAWARDS END=====================||
(12/28/2024 8:35:10 PM) After calling ParsePage_IMDBPeopleAWARDS
(12/28/2024 8:35:10 PM) Parsed awards page.
(12/28/2024 8:35:10 PM) Parsing awards page finished successfully.
(12/28/2024 8:35:10 PM) Provider data info retrieved Ok on 2024-12-28 20:35:10|
(12/28/2024 8:35:10 PM) Function ParsePage smNormal END======================|
(12/28/2024 8:35:10 PM) Person -> LoadStatic -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadMultivalues -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadFilms -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadAwards -> 15ms
(12/28/2024 8:35:10 PM) Person -> LoadImages -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadStatic -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadMultivalues -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadFilms -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadAwards -> 0ms
(12/28/2024 8:35:10 PM) Person -> LoadImages -> 16ms
-
// Extract entire award block
curPos := awardPos;
endPos := PosFrom('</li>', HTML, curPos);
If endPos = 0 Then Begin
LogMessage('No closing tag for award div found');
Break;
End;
Award := Copy(HTML, curPos, endPos - curPos);
curPos := endPos + Length('</li>');
LogMessage('Award Content Extracted Successfully: ' + Award);
Just change this part of the code above with this part of the code below and Recipient will work.
// Extract entire award block
curPos := awardPos;
endPos := PosFrom('</div></div></li>', HTML, curPos);
If endPos = 0 Then Begin
LogMessage('No closing tag for award div found');
Break;
End;
Award := Copy(HTML, curPos, endPos - curPos);
curPos := endPos + Length('</div></div></li>');
LogMessage('Award Content Extracted Successfully: ' + Award);
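To see why the closing marker matters, here is a minimal sketch in Python (outside PVD, so it is easy to run). The markup is a simplified, made-up stand-in for one award item, assuming the category sits in a nested list item that closes before the recipient link; `extract_block` is a hypothetical helper, not a PVD function.

```python
# Simplified, hypothetical award markup (not copied from IMDb):
# the category is wrapped in a nested <li> that closes before the recipient link.
AWARD_HTML = (
    '<li class="award"><div><div>'
    '<a class="year">2019 Winner</a>'
    '<ul><li><span class="category">Best Picture</span></li></ul>'
    '<ul><li><a class="recipient">Jane Doe</a></li></ul>'
    '</div></div></li>'
)


def extract_block(html, start, closing):
    """Return html[start:end], where end is the first `closing` at or after start.

    Returns None when the closing marker is not found.
    """
    end = html.find(closing, start)
    if end == -1:
        return None
    return html[start:end]


# Cutting at the first '</li>' stops right after the category,
# so the recipient link never makes it into the block.
short_block = extract_block(AWARD_HTML, 0, '</li>')

# Cutting at the longer marker keeps the whole award item.
full_block = extract_block(AWARD_HTML, 0, '</div></div></li>')

print('Jane Doe' in short_block)
print('Jane Doe' in full_block)
```

With the short marker the block ends inside the item, which matches the "Recipient tag not found" lines in the log even though the other fields parse fine.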
-
// Extract entire award block
curPos := awardPos;
endPos := PosFrom('</li>', HTML, curPos);
If endPos = 0 Then Begin
LogMessage('No closing tag for award div found');
Break;
End;
Award := Copy(HTML, curPos, endPos - curPos);
curPos := endPos + Length('</li>');
LogMessage('Award Content Extracted Successfully: ' + Award);
Just change this part of the code above with this part of the code below and Recipient will work.
// Extract entire award block
curPos := awardPos;
endPos := PosFrom('</div></div></li>', HTML, curPos);
If endPos = 0 Then Begin
LogMessage('No closing tag for award div found');
Break;
End;
Award := Copy(HTML, curPos, endPos - curPos);
curPos := endPos + Length('</div></div></li>');
LogMessage('Award Content Extracted Successfully: ' + Award);
Is that for Recipient? Because everything else works except Recipient.
-
Ohhhhh, I seee now!!!! Award extracted didn't contain Recipient!!! Thank you I will try it later!
-
I can now confirm that parsing awards works completely.
What doesn't work is populating to database, at least for me. No award or event is populated, although everything is properly parsed. Here's the log for the person and page given above.
What could that be?
-
This solo IMDB_People_[EN][HTTPS]_Awards 1 script with selenium transfers the awards data to the awards field without any problems. The regular IMDB_People_[EN][HTTPS]_Awards 2 script for the PVD MOD version also transfers the awards data to the awards field without any problems
The problem is in your IMDB_People_[EN][HTTPS]-Awards script, because this script did not add any awards data to the awards field for me either.
I already know where the problem is. For me, it transfers downpage-UTF8_NO_BOM.htm for the awards from the website, but for you, the problem is probably the parsing for the awards at the end of the script
-
Thanks for the quick feedback, Ivek. I have 2 questions I'm puzzled with now.
1. What selenium script do you use to download person's page? Is it the same one for aka?
2. I tried moving the Awards function to the beginning of the script, to the same place where it is in your scripts: just after the Function ParsePage_IMDBSearchName. Nothing changed, so it seems that's not the cause, or did I misunderstand your remark?
Now, I tried regular IMDB_People_[EN][HTTPS]_Awards 2 script for the PVD MOD version you posted, but even that one didn't populate any award to my database, unlike for you. So, it's probably something with my database then, what do you think?
-
Thanks for the quick feedback, Ivek. I have 2 questions I'm puzzled with now.
1. What selenium script do you use to download person's page? Is it the same one for aka?
It is the same selenium script as for aka and I can download both the movies and people awards data with your awards code.
2. I tried moving the Awards function to the beginning of the script, to the same place where it is in your scripts: just after the Function ParsePage_IMDBSearchName. Nothing changed, so it seems that's not the cause, or did I misunderstand your remark?
The problem is in the name of the file extension Tmp\UTF8_NO_BOM-Awards.mhtml. Change it to .htm and it should work.
Now, I tried regular IMDB_People_[EN][HTTPS]_Awards 2 script for the PVD MOD version you posted, but even that one didn't populate any award to my database, unlike for you. So, it's probably something with my database then, what do you think?
It's probably not your database, maybe you don't have the awards field marked, because it was the same for me until I checked the awards field in the settings, after that it transferred the awards data to the awards fields without any problems.
-
It is the same selenium script as for aka and I can download both the movies and people awards data with your awards code.
Thanks! Do you get all awards with it, or only "static" ones?
The problem is in the name of the file extension Tmp\UTF8_NO_BOM-Awards.mhtml. Change it to .htm and it should work.
Wait WHAT? It works THIS WAY, thanks a lot(!) but where and why on Earth that comes from? What htm or mhtml has to do with database at all?
It's probably not your database, maybe you don't have the awards field marked, because it was the same for me until I checked the awards field in the settings, after that it transferred the awards data to the awards fields without any problems.
You were right. Scripts were set to "- Set if empty" It works now, of course.
But, I'm still puzzled, not to say shocked about htm and mhtml...
-
It is the same selenium script as for aka and I can download both the movies and people awards data with your awards code.
Thanks! Do you get all awards with it, or only "static" ones?
A regular script only transfers "static" awards data, while with selenium it transfers all awards data.
The problem is in the name of the file extension Tmp\UTF8_NO_BOM-Awards.mhtml. Change it to .htm and it should work.
Wait WHAT? It works THIS WAY, thanks a lot(!) but where and why on Earth that comes from? What htm or mhtml has to do with database at all?
Your IMDB_People_[EN][HTTPS]-Awards script also uses the .htm extension for transferring all other data, except for the awards data, whose file the script does not transfer to the Tmp folder. This is what I meant when I mentioned the change at the end of the script, or an additional change is needed at the beginning of the script. For example, this needs to be added there:
BASE_DOWNLOAD_FILE_NO_BOM-AWARDS = 'Tmp\UTF8_NO_BOM-Awards.mhtml';
The selenium script will also transfer the same if you do not change the extension in it.
It's probably not your database, maybe you don't have the awards field marked, because it was the same for me until I checked the awards field in the settings, after that it transferred the awards data to the awards fields without any problems.
You were right. Scripts were set to "- Set if empty" It works now, of course.
But, I'm still puzzled, not to say shocked about htm and mhtml...
Great that it works now.
As for htm and mhtml, that was already covered above.
-
Unfortunately, I don't plan to work on the Imdb Awards section anymore, whether for updates or fixes to the movies or people code in Function ParsePage_IMDBMovieAWARDS. The layout and markup of the Awards page source code are too complicated and unsuitable for editing the function so that it records the Awards data properly.
Thanks Ivek. I will now continue to integrate everything in order to get fully revised and functional IMDB_People_[EN][HTTPS].psf script and will do my best to maintain it in the future...
For that, I prepared Chrome selenium script at:
http://www.videodb.info/forum_en/index.php?topic=4364.msg22706
-
Hello and Happy New Year!
I'm close to finishing the script, and at the moment I'm stuck because, all of a sudden, people's photos aren't populated to PVD although they are properly downloaded and sent to AddImageURL. Here's the log about it:
(1/3/2025 3:36:26 AM) ImageFile path in Get ~Photo~: X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/3/2025 3:36:26 AM) Function DownloadImage BEGIN======================|
(1/3/2025 3:36:26 AM) Global Var-DownloadURL|https://www.imdb.com/name/nm0001833/|
(1/3/2025 3:36:26 AM) Local Var-URL|https://m.media-amazon.com/images/M/MV5BMjAwNzc5MjE0N15BMl5BanBnXkFtZTcwMzUyNTMzNw@@._V1_UY12000_.jpg|
(1/3/2025 3:36:26 AM) Local Var-OutPutFile|X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg|
(1/3/2025 3:36:26 AM) Download with PVdBDownPage in file:|X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg the information of:|https://m.media-amazon.com/images/M/MV5BMjAwNzc5MjE0N15BMl5BanBnXkFtZTcwMzUyNTMzNw@@._V1_UY12000_.jpg||
(1/3/2025 3:36:26 AM) Waiting 2s for exists of:X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/3/2025 3:36:28 AM) Now present complete page file: X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/3/2025 3:36:28 AM) Function DownloadImage END======================|
(1/3/2025 3:36:28 AM) Image successfully downloaded to: X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/3/2025 3:36:28 AM) Adding image with URL: X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg and type: itPoster
(1/3/2025 3:36:28 AM) AddImageURL has been called with ImageType: 0 and ImageFile: X:\PATH\To\PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
Do you have any idea why that is? I also tried with itPhoto (4) but no luck... The "Overwrite" checkbox is checked for the script, so that's not it either, and it doesn't work even when there was no photo at all.
-
Also try the IMDB_People_[EN][HTTPS]-Awards script to see whether it uploads people's photos.
Reduce this value too, maybe it will help:
MAX_IMAGE_HEIGHT = 1200;
The definition of the people's photo import path could also be a problem; check it and see what happens.
In the awards code, correct this part so that it downloads all visible awards.
// Extract entire award block
curPos := awardPos;
endPos := PosFrom('</div></div></li>', HTML, curPos);
If endPos = 0 Then Begin
LogMessage('No closing tag for award div found');
Break;
End;
Award := Copy(HTML, curPos, endPos - curPos);
curPos := endPos + Length('</div></div></li>');
LogMessage('Award Content Extracted Successfully: ' + Award);
Award:=StringReplace(Award,'"winner"','',False,True,False);
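One possible reading of the added StringReplace line, sketched in Python: if the award block can contain a literal `"winner"` token that is unrelated to the visible label, a plain text search for the win marker could report a win for a nominee. This is an assumption. The `data-kind` attribute and the `is_win` helper below are made up for illustration, and the sketch assumes a case-insensitive match; whether PosFrom in PVD behaves that way is not confirmed in this thread.

```python
def is_win(award_block):
    """Decide whether an award block marks a win.

    The quoted token '"winner"' is stripped first so that only the
    visible label text can trigger the match (case-insensitive here).
    """
    cleaned = award_block.replace('"winner"', '')
    return 'winner' in cleaned.lower()


# Hypothetical nominee block that still carries a quoted "winner" token.
nominee = '<a data-kind="winner" class="t">2019 Nominee</a>'
# Hypothetical winner block with the visible label.
winner = '<a class="t">2019 Winner</a>'

print(is_win(nominee))
print(is_win(winner))
```

Without the cleanup step, a case-insensitive search on the nominee block above would find "winner" inside the attribute value.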
-
I tried everything you said, and everything else possible and impossible. I even created a separate, dedicated function for AddImageURL, but to no avail... Here's the log for that. It claims the photo is added, but it's not:
(1/4/2025 2:30:45 AM) Function DownloadPage END======================|
(1/4/2025 2:30:45 AM) All pages downloaded successfully.
(1/4/2025 2:30:45 AM) Function DownloadPageMain BEGIN======================|
(1/4/2025 2:30:45 AM) Global Var-DownloadURL|downpage-UTF8_NO_BOM.htm|
(1/4/2025 2:30:45 AM) Reading main file: X:PATH\To_PVD\Scripts\Tmp\downpage-UTF8_NO_BOM.htm
(1/4/2025 2:30:45 AM) Function DownloadPageMain END======================|
(1/4/2025 2:30:45 AM) HandlePhoto function called.
(1/4/2025 2:30:45 AM) ImageFile path in Get ~Photo~: X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/4/2025 2:30:45 AM) Function DownloadImage BEGIN======================|
(1/4/2025 2:30:45 AM) Global Var-DownloadURL|https://www.imdb.com/name/nm0000017/|
(1/4/2025 2:30:45 AM) Local Var-URL|https://m.media-amazon.com/images/M/MV5BMTI0NTcyMTM5OF5BMl5BanBnXkFtZTYwOTkzMDU2._V1_UY12000_.jpg|
(1/4/2025 2:30:45 AM) Local Var-OutPutFile|X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg|
(1/4/2025 2:30:45 AM) Download with PVdBDownPage in file:|X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg the information of:|https://m.media-amazon.com/images/M/MV5BMTI0NTcyMTM5OF5BMl5BanBnXkFtZTYwOTkzMDU2._V1_UY12000_.jpg||
(1/4/2025 2:30:45 AM) Waiting 2s for exists of:X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/4/2025 2:30:47 AM) Now present complete page file: X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/4/2025 2:30:47 AM) Function DownloadImage END======================|
(1/4/2025 2:30:47 AM) Image successfully downloaded to: X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/4/2025 2:30:47 AM) Adding image with URL: X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg and type: itPoster
(1/4/2025 2:30:47 AM) AddImageURL has been called with ImageType: 0 and ImageFile: X:PATH\To_PVD\Scripts\Tmp\downimage-BIN-Photo.jpg
(1/4/2025 2:30:47 AM) Person -> LoadStatic -> 0ms
(1/4/2025 2:30:47 AM) Person -> LoadMultivalues -> 0ms
(1/4/2025 2:30:47 AM) Person -> LoadFilms -> 16ms
(1/4/2025 2:30:47 AM) Person -> LoadAwards -> 0ms
(1/4/2025 2:30:47 AM) Person -> LoadImages -> 0ms
(1/4/2025 2:30:47 AM) Get result PhotoURL: https://m.media-amazon.com/images/M/MV5BMTI0NTcyMTM5OF5BMl5BanBnXkFtZTYwOTkzMDU2._V1_UY12000_.jpg ||
(1/4/2025 2:30:47 AM) Script end. After, PVdB will retrieve from ListImage and info of person in order get the photo
(1/4/2025 2:30:47 AM) Photo processed and added successfully.
I replaced the real path of my PVD with "X:PATH\To_PVD\" before posting it here, for privacy reasons...
-
I give up. I will post the script even though populating photos doesn't work for me. Maybe someone will check it...
Here it is, check:
http://www.videodb.info/forum_en/index.php/topic,4367.0.html
and
http://www.videodb.info/forum_en/index.php?topic=4368.0
for more
-
I found a solution, the script now also uploads people's photos to the database. I will add it to the forum tomorrow at the latest.
-
Here is the IMDB_People_[EN][HTTPS]_TEST_2_full script, which can be found at the link below.
http://www.videodb.info/forum_en/index.php/topic,4369.0.html
-
I found a solution, the script now also uploads people's photos to the database. I will add it to the forum tomorrow at the latest.
I knew it!
Thanks once again!
-
I found a solution, the script now also uploads people's photos to the database. I will add it to the forum tomorrow at the latest.
I knew it!
Thanks once again!
Thanks
-
This was impossible without you and your willingness to help.
I am now thinking of search function.
Maybe to transfer search to Selenium too (if storedURL isn't found)? That would make PVD script seamless, calling everything external from a single PVD script, and parsing what is gotten back? We would have only one PVD script for each PVD segment: movies and people. Even single Selenium script for search, depending on argument (movie or people) passed to it. What do you think?
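A rough Python sketch of the "one search script, switched by argument" idea. It only builds the search URL and leaves the Selenium part out. The `s=nm` / `s=tt` query values are my assumption about IMDb's find page and should be checked against the live site; `build_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed mapping from the argument passed by the PVD script
# to the IMDb find-page section code (to be verified).
SECTION_BY_KIND = {
    'people': 'nm',
    'movie': 'tt',
}


def build_search_url(kind, query):
    """Build the search URL for either a movie or a person lookup."""
    if kind not in SECTION_BY_KIND:
        raise ValueError('kind must be "movie" or "people", got: ' + repr(kind))
    params = urlencode({'q': query, 's': SECTION_BY_KIND[kind]})
    return 'https://www.imdb.com/find/?' + params


print(build_search_url('people', 'Al Pacino'))
print(build_search_url('movie', 'Heat'))
```

The single script would take the kind and the search text from the PVD script, open the resulting URL, and save the result page for the PVD script to parse, the same way the other downloaded pages are handled.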
-
This was impossible without you and your willingness to help.
I am now thinking of search function.
Maybe to transfer search to Selenium too (if storedURL isn't found)? That would make PVD script seamless, calling everything external from a single PVD script, and parsing what is gotten back? We would have only one PVD script for each PVD segment: movies and people. Even single Selenium script for search, depending on argument (movie or people) passed to it. What do you think?
Good idea.
-
Great. I already had a first session with AI about this and it gave me the concept. What worries me is presenting the search results obtained by Selenium in PVD, but let's go step by step...