Author Topic: Script for egafd.com  (Read 32996 times)

Ivek23
Re: Script for egafd.com
« Reply #40 on: December 04, 2011, 07:45:14 pm »
I found the problem: tmpYear has the wrong type. I would declare it as tmpYear : Integer; rather than tmpYear : String;

More tomorrow.
Ivek23
Win 7 32bit, 64bit   PVD v0.9.9.21


pra15
« Reply #41 on: December 04, 2011, 09:15:08 pm »
I think tmpYear must be a string. tmpYear is just the first two characters of "Notes", used to tell which style of date is written on the page.
We are searching text for data (in cases such as 1980s, 1980? or c.1980),
and the problem already existed before the modification.

We can convert Year to an integer before AddPersonMovie, but for searching I think the value must stay a string.

I think the cause is the way the script is built.
If there is no "Role" after the current movie line, PosFrom finds the "Role" of the movie on the next line.
We must add a condition before the search under // Get Role.

As a test case for the script I have been using this URL: http://www.egafd.com/actresses/details.php/id/n00008
It contains many different date cases.
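The date handling described above can be sketched outside PVD. This is a minimal Python illustration, not PVD script code; the function name `split_year_note` and the sample strings in the usage note are made up for the example, and the exact spacing the site uses after "c." is an assumption.

```python
from typing import Tuple


def split_year_note(notes: str) -> Tuple[str, str]:
    """Split a notes string into (year, note), keeping the year as text."""
    notes = notes.strip()
    prefix = notes[:2]
    if prefix == "c.":
        # Approximate year ("c.1980" or "c. 1980"): skip the marker and any space.
        rest = notes[2:].lstrip()
        return rest[:4], rest[4:].strip()
    if prefix in ("19", "20"):
        # "1980s" and "1980?" keep their fifth character as part of the year.
        width = 5 if notes[4:5] in ("s", "?") else 4
        return notes[:width], notes[width:].strip()
    # No recognizable year: everything is a note.
    return "", notes
```

For example, `split_year_note("1980s Video")` gives `("1980s", "Video")`. The year stays a string throughout, so values such as "1980s" survive; a conversion to an integer would only be possible for the plain four-digit case.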

« Last Edit: December 04, 2011, 09:57:05 pm by pra15 »

pra15
« Reply #42 on: December 04, 2011, 10:35:50 pm »
I found this solution. It's not very elegant, but it works.

Code: [Select]
// Get Role (Not yet defined)
difpos := (PosFrom('<i>', HTML, (actposEnd-1))+4) - actposend;
logmessage('DIFFERENCE : ' + intToStr(difpos));

If difpos < 200 then begin

         actPosStart := PosFrom('<i>', HTML, (actposend-1)) + 4;
         {actPosStart := PosFrom('> <i>', HTML, actPosStart) + 6;}
         actPosEnd:=PosFrom('</i></li>', HTML, actPosStart) - 1;
         Role := Trim(Copy(HTML, actposStart, (actPosEnd - actPosStart)));
         LogMessage('Role: ' + Role);
     
         debug_pos1:=Pos('(',Role);
         if debug_pos1 >0 then
         Role:= Copy(Role,0,debug_pos1-1);
         LogMessage(Role);

end;

There may be exceptions; in that case we must adjust the maximum distance (here 200).
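A possible alternative to the fixed distance, shown here as a Python sketch rather than PVD script code: stop the search at the closing `</li>` of the current movie, so a role from the next line can never be picked up. This swaps in a different technique from the 200-character check above. It assumes each movie sits in its own `<li>...</li>` item (as the `'</i></li>'` search suggests); the function name `role_in_item` and the sample HTML are invented.

```python
def role_in_item(html: str, pos: int) -> str:
    """Return the role found between pos and the end of the current <li>, or ''."""
    item_end = html.find("</li>", pos)
    if item_end == -1:
        item_end = len(html)
    # Only look for <i>...</i> inside the current list item.
    start = html.find("<i>", pos, item_end)
    if start == -1:
        return ""
    end = html.find("</i>", start, item_end)
    if end == -1:
        return ""
    role = html[start + len("<i>"):end].strip()
    # Drop a trailing remark in parentheses, as the script does.
    paren = role.find("(")
    if paren != -1:
        role = role[:paren].strip()
    return role


SAMPLE = ('<li><a href="/a">Film A</a></li>'
          '<li><a href="/b">Film B</a> <i>Nurse (uncredited)</i></li>')
after_a = SAMPLE.find("</a>") + 4            # just after the first link
after_b = SAMPLE.find("</a>", after_a) + 4   # just after the second link
```

In this sample the next `<i>` is only 33 characters after the first link, so a 200-character limit would still attach "Nurse" to Film A; bounding the search at `</li>` returns an empty role for Film A and "Nurse" for Film B.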

pra15
« Reply #43 on: December 05, 2011, 02:29:48 pm »
Just a small modification for the case where the last movie in the list has no "role".

Code: [Select]
// Get Role (Not yet defined)
difpos := (PosFrom('<i>', HTML, (actposEnd-1))+4) - actposend;
logmessage('DIFFERENCE : ' + intToStr(difpos));

If difpos > 0 then begin
If difpos < 200 then begin

          actPosStart := PosFrom('<i>', HTML, (actposend-1)) + 4;
          actPosEnd:=PosFrom('</i></li>', HTML, actPosStart) - 1;
        Role := Trim(Copy(HTML, actposStart, (actPosEnd - actPosStart)));
          LogMessage('Role: ' + Role);
     
          debug_pos1:=Pos('(',Role);
          if debug_pos1 >0 then
          Role:= Copy(Role,0,debug_pos1-1);
          LogMessage(Role);

end;
end;

pra15
« Reply #44 on: December 05, 2011, 03:11:52 pm »
As a further suggestion,

I added this:

Code: [Select]
//If Original:
actposstart := actposEnd + 5;
actposstart := PosFrom('">', HTML, actposstart) + 2;
actPosEnd := PosFrom('</', HTML, actPosstart) - 1;
If copy(HTML, actposstart, 3) = 'alt' then
OrigT := Copy(HTML, (actPosstart + 22),(actPosEnd-actPosStart-21))
else
OrigT := Title;


and :

Code: [Select]
// Total Line
If Lien <> '' then
Lien := Lien + #13;
If URL1 <> '' then begin
If OrigT <> Title then
Lien := Lien + Name
else
Lien := Lien + '<link url="' + URL1 + '">' + Name + '</link>';
end;
If Year <> '' then
Lien := Lien + ' • ' + Year;
If Note <> '' then
Lien := Lien + ' • ' + Note;
If Role <> '' then
Lien := Lien + ' • ' + Role;

LogMessage('LIEN :' + Lien);

So in the movie list in the Bio field, only original titles are linked!
I think it's clearer this way.
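The "Total Line" logic can be illustrated with a short Python sketch (not PVD script code). The function name `filmography_line`, its parameters, and the placeholder URL are assumptions for the example. One deliberate difference: this sketch still emits the name when the URL is empty, whereas the posted code adds nothing in that case.

```python
def filmography_line(name: str, url: str, year: str, note: str,
                     role: str, is_original: bool) -> str:
    """Build one filmography line; only original titles get a link."""
    if url and is_original:
        text = '<link url="' + url + '">' + name + '</link>'
    else:
        text = name
    # Append year, note and role only when they are non-empty.
    parts = [text] + [part for part in (year, note, role) if part]
    return " • ".join(parts)


def filmography(lines: list) -> str:
    """Join the lines with a carriage return, like Lien := Lien + #13."""
    return "\r".join(lines)
```

For instance, `filmography_line("Film A", "http://example.invalid/a", "1980", "", "Nurse", True)` produces a linked title followed by " • 1980 • Nurse", while the same call with `is_original=False` produces plain text for the title.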
« Last Edit: December 05, 2011, 03:13:36 pm by pra15 »

Ivek23
« Reply #45 on: December 05, 2011, 03:12:56 pm »
Quote
Just a small modification for the case where the last movie in the list has no "role".

Code: [Select]
// Get Role (Not yet defined)
difpos := (PosFrom('<i>', HTML, (actposEnd-1))+4) - actposend;
logmessage('DIFFERENCE : ' + intToStr(difpos));

If difpos > 0 then begin
If difpos < 200 then begin

          actPosStart := PosFrom('<i>', HTML, (actposend-1)) + 4;
          actPosEnd:=PosFrom('</i></li>', HTML, actPosStart) - 1;
        Role := Trim(Copy(HTML, actposStart, (actPosEnd - actPosStart)));
          LogMessage('Role: ' + Role);
     
          debug_pos1:=Pos('(',Role);
          if debug_pos1 >0 then
          Role:= Copy(Role,0,debug_pos1-1);
          LogMessage(Role);

end;
end;

THANK YOU.
This code is excellent now; it works.

Quote
As a further suggestion,

I added this:

Code: [Select]
//If Original:
actposstart := actposEnd + 5;
actposstart := PosFrom('">', HTML, actposstart) + 2;
actPosEnd := PosFrom('</', HTML, actPosstart) - 1;
If copy(HTML, actposstart, 3) = 'alt' then
OrigT := Copy(HTML, (actPosstart + 22),(actPosEnd-actPosStart-21))
else
OrigT := Title;


and :

Code: [Select]
// Total Line
If Lien <> '' then
Lien := Lien + #13;
If URL1 <> '' then begin
If OrigT <> Title then
Lien := Lien + Name
else
Lien := Lien + '<link url="' + URL1 + '">' + Name + '</link>';
end;
If Year <> '' then
Lien := Lien + ' • ' + Year;
If Note <> '' then
Lien := Lien + ' • ' + Note;
If Role <> '' then
Lien := Lien + ' • ' + Role;

LogMessage('LIEN :' + Lien);

So in the movie list in the Bio field, only original titles are linked!
I think it's clearer this way.

True, it is clearer to see the links for the original movies, which is very nice.
Thank you again.

BTW:

In the EGAFD MOVIE SCRIPT, I tried to add Role to the Cast section with this code fragment:
Code: [Select]

        // Get Role
             actPosStart := PosFrom('href="', HTML, EndPos);   // search for url start;
             actPosStart2 := PosFrom('</a> <', HTML, actPosStart)
             actPosEnd:=PosFrom('</', HTML, actPosStart2);    // search for url end
             Role := Trim(Copy(HTML, (actPosStart2 + 7), (actPosEnd - actPosStart2 - 7) ));
             LogMessage(Role);
     
             debug_pos1:=Pos('(',Role);
                if debug_pos1 >0 then
                Role:= Copy(Role,0,debug_pos1-1);
                LogMessage(Role);

             AddMoviePerson(Trim(Name), '', Role, LowerCase(URL), ctActors);
                 

which should look like this in the whole code:
Code: [Select]
   //Cast
   curPos:= Pos('<th>Actresses</th>', HTML);
   LogMessage('Cast readout');
   if curPos > 0 then    begin
   EndPos := curPos;
      while (curPos > 0) AND (curPos < PosFrom('</ul>', HTML, EndPos)) do begin
     
         EndPos := curPos; // Set last position to actual position
         // get url
         UrlPosStart := PosFrom('href="', HTML, EndPos);  // search for url start
         UrlPosEnd := PosFrom('>', HTML, UrlPosStart);  // search for url end
     
         URL := BASE_URL + Trim(Copy(HTML, UrlPosStart + 6, (UrlPosEnd - UrlPosStart - 7) ));
   
         LogMessage(URL);
   
         // Get Name
         actPosStart := PosFrom('href="', HTML, EndPos);   // search for url start;
         actPosStart2 := PosFrom('">', HTML, actPosStart)
         actPosEnd:=PosFrom('</a>', HTML, actPosStart2);    // search for url end
         Name := Trim(Copy(HTML, (actPosStart2 + 2), (actPosEnd - actPosStart2 - 2) ));       
LogMessage(Name);

         debug_pos1:=Pos('(',Name);
            if debug_pos1 >0 then
            Name := Copy(Name,0,debug_pos1-1);
            LogMessage(Name);

        // Get Role
             actPosStart := PosFrom('href="', HTML, EndPos);   // search for url start;
             actPosStart2 := PosFrom('</a> <', HTML, actPosStart)
             actPosEnd:=PosFrom('</', HTML, actPosStart2);    // search for url end
             Role := Trim(Copy(HTML, (actPosStart2 + 7), (actPosEnd - actPosStart2 - 7) ));
             LogMessage(Role);
     
             debug_pos1:=Pos('(',Role);
                if debug_pos1 >0 then
                Role:= Copy(Role,0,debug_pos1-1);
                LogMessage(Role);

             AddMoviePerson(Trim(Name), '', Role, LowerCase(URL), ctActors);
                 
            curPos := PosFrom('href="', HTML, actPosEnd);
             end;
      end;
but it does not work.

That is what I changed.
« Last Edit: December 05, 2011, 09:01:59 pm by Ivek23 »


pra15
« Reply #46 on: December 05, 2011, 10:38:12 pm »
It's a good idea, if we can add the role.

I ran some tests with egafd_movie and saw that there are many bugs!
They are the same ones we first had with egafd_people: if one piece of info fails, the script returns no data.

I'll try to look at that tomorrow!

I made two minor modifications in egafd_people, for display only:

Code: [Select]
Born := Trim(Copy(HTML, (PosStart + 25), (PosEnd - PosStart - 25)));
Born := Uppercase(Copy(Born,0,1)) + Copy(Born,2, length(Born)-1) + #13;

Code: [Select]
// BIO: //

curpos := Pos('<th>Films</th>', HTML);
    LogMessage('Films readout');
    if curPos > 0 then    begin

Lien := '----- Filmography (EGAFD) -----' + #09;
    EndPos := curPos;


« Last Edit: December 05, 2011, 10:47:35 pm by pra15 »

Ivek23
« Reply #47 on: December 06, 2011, 11:47:31 am »

Quote
I made two minor modifications in egafd_people, for display only:

Code: [Select]
Born := Trim(Copy(HTML, (PosStart + 25), (PosEnd - PosStart - 25)));
Born := Uppercase(Copy(Born,0,1)) + Copy(Born,2, length(Born)-1) + #13;

Code: [Select]
// BIO: //

curpos := Pos('<th>Films</th>', HTML);
    LogMessage('Films readout');
    if curPos > 0 then    begin

Lien := '----- Filmography (EGAFD) -----' + #09;
    EndPos := curPos;

I have a better and simpler solution; the effect is the same as before.
Code: [Select]
// Total Line
        If Lien <> '' then
        Lien := Lien + #13;
       ...
               ...
               ...
        LogMessage('LIEN :' + Lien);
               

         curPos := PosFrom('<a href="', HTML, actPosEnd);
       end;
     
   
        if (Lien <> '') AND (Born = '') then
AddFieldValue(pfBio, Lien);
if (Lien  <> '') AND (Born <> '') then
AddFieldValue(pfBio, Born + #13 + #13 + Lien);
end;

I modified this part of the Bio code:
Code: [Select]
//If Title:
        actposstart := actposEnd + 5;
        actposstart := PosFrom('">', HTML, actposstart) + 2;
        actPosEnd := PosFrom('</', HTML, actPosstart) - 1;
        If copy(HTML, actposstart, 11) = 'alternative' then
        Title := Copy(HTML, (actPosstart + 22),(actPosEnd-actPosStart-21))
        else
        Title := OrigT;


//If Original:
        actposstart := actposEnd + 5;
        actposstart := PosFrom('">', HTML, actposstart) + 2;
        actPosEnd := PosFrom('</', HTML, actPosstart) - 1;
        If copy(HTML, actposstart, 3) = 'alt' then
        OrigT := Copy(HTML, (actPosstart + 22),(actPosEnd-actPosStart-21))
        else
        OrigT := Title;
and
Code: [Select]
// Total Line
        If Lien <> '' then
        Lien := Lien + #13;
        If URL1 <> '' then begin
    If Title <> OrigT then
        Lien := Lien + Name
else
If OrigT <> Title then
        Lien := Lien + Name
             else
        Lien := Lien + '<link url="' + URL1 + '">' + Name + '</link>';
        end;
        If Year <> '' then
        Lien := Lien + ' • ' + Year;
        If Note <> '' then
        Lien := Lien + ' • ' + Note;
        If Role <> '' then
        Lien := Lien + ' • ' + Role;

        LogMessage('LIEN :' + Lien);
and now, in addition to the Original Title, the list also includes those Titles that are not an Alternative Title.
It's not ideal.
The movie list in Bio is not as clear as with your version, which I like. Of course, if that is the change you want, it is OK too.

Could BORN stay as it has been until now, with the Birthplace field arranged so that data like the following is visible, for example:
Czech, b. 1985
Hungarian. b. 1978


pra15
« Reply #48 on: December 06, 2011, 03:41:13 pm »
Thanks,
I'll look at that later, because I live in the middle of nowhere and the connection is very, very slow during the day!

For "role" in egafd_movie:

Code: [Select]
// -------------- Procedure ParseMovie --------------//

//  - after searchlist selection this function is called with the new html content

procedure ParseMovie(MovieURL : String; HTML : String);

var
 curPos, EndPos, P, P2, L: Integer;
 actPosStart,actPosStart2,
actPosEnd, UrlPosStart,UrlPosEnd,debug_pos1:Integer;
 Tmp, URL, Name,dbgstrg,tmpstrg, Role : String;
 ActorNames: TWideArray;
 ActorNumber,I,J: Integer;

/////
begin
    AddFieldValue(mfURL, MovieURL);
    LogMessage('Page parsing started');
    EndPos := 1;

   //Check for title. No orig. title info present, so duplicate..
   dbgstrg:= TextBetween(HTML, '<title>', '</title>', False, EndPos);
   LogMessage('Title: ' + dbgstrg);

If dbgstrg <> '' then begin
    AddFieldValue(mfOrigTitle,dbgstrg);
    AddFieldValue(mfTitle,dbgstrg);
end;
   
   //Year
   dbgstrg := '';
   CurPos := Pos('Released: ', HTML);
   endpos := CurPos;
         LogMessage('getting year');
         dbgstrg := TextBetween(HTML, '">', '</td>', False, CurPos);
         LogMessage('YEAR:' + dbgstrg);
If dbgstrg <> '' then
          AddFieldValue(mfYear,dbgstrg);
   
   //Director
   dbgstrg := '';
   curPos := Pos('Director: ', HTML);
   EndPos := curPos;
         LogMessage('getting Director');
         dbgstrg:= TextBetween(HTML, '">', '</td>', False, CurPos);
         LogMessage('DIRECTOR:' + dbgstrg);
          If dbgstrg <> '' then
AddMoviePerson(dbgstrg, '', '', '', ctDirectors);


   //Notes
   dbgstrg := '';
   curpos := Pos('Notes: ' , HTML);
   EndPos := curPos;
         LogMessage('Notes');
         dbgstrg := TextBetween(HTML, '">', '</td>', False, CurPos);
         LogMessage('Notes :' + dbgstrg);
If dbgstrg <> '' then
          AddFieldValue(mfDescription, dbgstrg);
   
   //AKA - Titles...
   dbgstrg := '';
   LogMessage('getting all titles');
   Curpos := Pos('<th>Alternate Titles</th>' ,HTML);
   EndPos := curPos;

////
while (curPos > 0) AND (curPos < PosFrom('<th>Actresses</th>', HTML, EndPos)) do begin
         EndPos := curPos;
         actPosStart := PosFrom('class="flma"', HTML, EndPos);
         actPosEnd := PosFrom('</span>', HTML, actPosStart);
         dbgstrg := Trim(Copy(HTML, (actPosstart + 13), (actPosEnd - actPosStart - 13) ));
         LogMessage('AKA: ' + dbgstrg);
If dbgstrg <> '' then
          AddFieldValue(mfAka, dbgstrg);

         curpos := PosFrom('class="flma"', HTML, actPosEnd);
////
end;
   
   //Cast
   curPos:= Pos('<th>Actresses</th>', HTML);
   LogMessage('Cast readout');
/////
if curPos > 0 then    begin
    EndPos := curPos;
///
      while (curPos > 0) AND (curPos < PosFrom('<th>Notes</th>', HTML, EndPos)) do begin
          EndPos := curPos;

        // get url
          UrlPosStart := PosFrom('href="', HTML, EndPos);
          UrlPosEnd := PosFrom('">', HTML, UrlPosStart);
          URL := BASE_URL + Trim(Copy(HTML, UrlPosStart + 6, (UrlPosEnd - UrlPosStart - 7) ));
          LogMessage(URL);
   
          // Get Name
          //actPosStart := PosFrom('href="', HTML, EndPos);   // search for url start;
//actPosStart2 := PosFrom('">', HTML, actPosStart)
//actPosEnd:=PosFrom('</a>', HTML, actPosStart2);    // search for url end
//Name := Trim(Copy(HTML, (actPosStart2 + 2), (actPosEnd - actPosStart2 - 2) ));
//LogMessage(Name);
{Name := TextBetween(HTML, '">', '</a>', false, urlpostart);}

// Get Name
actposstart := urlposend + 2;
actposend := Posfrom('</a>', HTML, actposstart);
Name := Trim(Copy(HTML, actposstart, (actposend-actposstart)));
Logmessage('Name: ' + Name);
 
          debug_pos1:=Pos('(',Name);
            if debug_pos1 >0 then
            Name := Copy(Name,0,debug_pos1-1);
            LogMessage('Name:' + Name);

// Role
Role := '';
actposStart := actposend;

If (copy(HTML, actposstart + 4, 1)) <> '<' then begin
actposStart := actposstart -6;
Role := TextBetween(HTML, '<i>', '</i>', false, actposstart);
end;

debug_pos1:=Pos('(',Role);
            if debug_pos1 >0 then
            Role := Copy(Role,0,debug_pos1-1);
            LogMessage('Role:'+ Role);


If URL <> '' then
If Name <> '' then     
            AddMoviePerson(Trim(Name), '', Role, LowerCase(URL), ctActors);


               
            curPos := PosFrom('href="', HTML, actPosEnd);

///
end;
/////
end;

/////
end;

It seems to be OK!
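For comparison, the cast loop above can be condensed into a regular-expression sketch in Python (not PVD script code). It assumes each cast entry looks like `<a href="...">Name</a>` optionally followed by `<i>Role</i>`, which is what the position arithmetic in the procedure implies; the pattern, the function name `parse_cast`, and the sample markup are invented for illustration. Unlike the script, it does not lowercase the URL.

```python
import re

# One cast entry: a link, optionally followed by a role in <i>...</i>.
ENTRY = re.compile(r'<a href="([^"]+)">([^<]*)</a>(?:\s*<i>([^<]*)</i>)?')


def parse_cast(html: str, base_url: str) -> list:
    """Return (name, role, url) tuples; role is '' when the entry has none."""
    cast = []
    for url, name, role in ENTRY.findall(html):
        # Cut remarks in parentheses from both name and role, as the script does.
        name = name.split("(")[0].strip()
        role = role.split("(")[0].strip()
        if name:
            cast.append((name, role, base_url + url))
    return cast


SAMPLE_CAST = ('<li><a href="/actresses/a.php">Anna (I)</a> <i>Maid (as Ann)</i></li>'
               '<li><a href="/actresses/b.php">Bea</a></li>')
```

Because the role group is optional and must follow the link directly, an entry without a role yields an empty string instead of borrowing the role of the next entry.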
« Last Edit: December 06, 2011, 04:22:06 pm by pra15 »

pra15
« Reply #49 on: December 06, 2011, 04:42:12 pm »
Sorry, I don't understand your modification!
I certainly made mistakes when modifying my code!
Could you post your complete ParsePeople code?

I noticed two errors in the code above:

If there is no "Notes" section in the web page:

Code: [Select]
//Cast
   curPos:= Pos('<th>Actresses</th>', HTML);
   LogMessage('Cast readout');
// endListStr : String; must also be declared in the var section of ParseMovie
If pos('<th>Notes</th>', HTML) > 0 then
endListStr := '<th>Notes</th>'
else
endListStr := '>Suppliers</th>';

/////
if curPos > 0 then    begin
    EndPos := curPos;
///
      while (curPos > 0) AND (curPos < PosFrom(endlistStr, HTML, EndPos)) do begin

and an error in getting the actress URL:

Code: [Select]
// get url
          UrlPosStart := PosFrom('href="', HTML, EndPos) + 6;
          UrlPosEnd := PosFrom('">', HTML, UrlPosStart);
          URL := BASE_URL + Trim(Copy(HTML, UrlPosStart, UrlPosEnd - UrlPosStart));
          LogMessage(URL);
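Both fixes can be checked in isolation with a small Python sketch (not PVD script code). The helper names `href_value` and `list_end` and the sample markup are assumptions made for the example. The first helper shows why advancing the start past `href="` before measuring the length avoids the off-by-one; the second picks the first end marker that is present.

```python
def href_value(html: str, pos: int) -> str:
    """Return the href attribute value of the first link at or after pos."""
    start = html.find('href="', pos)
    if start == -1:
        return ""
    start += len('href="')        # move past the marker first ...
    end = html.find('"', start)
    if end == -1:
        return ""
    return html[start:end]        # ... so the slice needs no extra correction


def list_end(html: str, pos: int,
             markers=("<th>Notes</th>", ">Suppliers</th>")) -> int:
    """Return the position of the first marker found, or the end of the page."""
    for marker in markers:
        index = html.find(marker, pos)
        if index != -1:
            return index
    return len(html)


PAGE = '<th>Actresses</th><a href="/x/y.php">N</a><th class="s">Suppliers</th>'
```

With `PAGE`, which has no Notes section, `list_end(PAGE, 0)` falls back to the Suppliers marker, and `href_value(PAGE, 0)` returns `/x/y.php` without a trailing quote.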
« Last Edit: December 06, 2011, 05:17:57 pm by pra15 »

Ivek23
« Reply #50 on: December 06, 2011, 05:50:13 pm »
Please wait two or three days: my computer has half collapsed and I cannot get to certain files. Thank you for understanding.


pra15
« Reply #51 on: December 06, 2011, 06:08:01 pm »
No problem, good luck!

Ivek23
« Reply #52 on: December 08, 2011, 09:47:18 am »
I'm back.

Quote
I certainly made mistakes when modifying my code!
Could you post your complete ParsePeople code?
Here is the complete ParsePeople procedure:
Code: [Select]
//--------------------Procedure parse people-----------------
procedure ParsePeople(URL : String; HTML : String);

var curpos, endpos, PosStart, PosEnd, debug_Pos1, difpos : Integer;
    actPosstart, actposstart2, actPosStart3, actPosStart4, actposend, UrlposStart, UrlposEnd : Integer;
    Pseudo, Born, Lien, URL1, Name, OrigT, Title, Year, Av, Role, Notes, Note, tmpYear : String;


begin

//URL
Addfieldvalue(pfURL,URL);


//BORN:
curpos := Pos('<th>Notes</th>', HTML);
endpos := curpos;

While (curpos > 0) AND (curpos < Posfrom('</tr>', HTML, EndPos)) do begin
endpos := curpos;
PosStart := PosFrom('<td><ul class="list"><li>', HTML, endpos);
PosEnd := PosFrom('</li></ul></td>', HTML, PosStart);
Born := Trim(Copy(HTML, (PosStart + 25), (PosEnd - PosStart - 25)));
//LogMessage('BORN :' + Born);
curpos := posfrom('<td><ul class="list"><li>', HTML, posend);
if Born <> '' then
//AddFieldValue(pfBirthplace, Born);
LogMessage('BORN :' + Born);
end;


//AKA:
curpos := Pos('<th>Pseudonyms</th>', HTML);
endpos := curpos;

While (curpos > 0) AND (curpos < Posfrom('<th>Films</th>', HTML, EndPos)) do begin
endpos := curpos;
PosStart := PosFrom('class="acta">', HTML, endpos);
PosEnd := PosFrom('</span>', HTML, PosStart);
Pseudo := Trim(Copy(HTML, (PosStart + 13), (PosEnd - PosStart - 13)));
LogMessage('AKA :' + Pseudo);
curpos := posfrom('class="acta">', HTML, posend);
if Pseudo <> '' then
AddFieldValue(pfAltnames, Pseudo);
end;


//BIO:
curpos := Pos('<th>Films</th>', HTML);
    LogMessage('Films readout');
    if curPos > 0 then    begin

Lien := '';

    EndPos := curPos;
       while (curPos > 0) AND (curPos < PosFrom('</ul>', HTML, EndPos)) do begin
     
         EndPos := curPos; // Set last position to actual position
         // get url
         UrlPosStart := PosFrom('<a href="', HTML, EndPos);  // search for url start
         UrlPosEnd := PosFrom('" class="', HTML, UrlPosStart);  // search for url end     
         URL1 := BASE_URL + Copy(HTML, UrlPosStart + 9, (UrlPosEnd - UrlPosStart - 9) );
         LogMessage(URL1);

   
(*       // get url (for example)
UrlPosStart := PosFrom('<a href="', HTML, EndPos);  // search for url start
         UrlPosEnd := PosFrom('" class="', HTML, UrlPosStart);  // search for url end
         URL1 := BASE_URL + Trim(Copy(HTML, UrlPosStart + 9, (UrlPosEnd - UrlPosStart - 9) ));
         LogMessage(URL1);
*)


         // Get Name
         actPosStart := PosFrom('<a href="', HTML, EndPos);   // search for url start;
         actPosStart2 := PosFrom('">', HTML, actPosStart);
         actPosEnd:=PosFrom('</a>', HTML, actPosStart2);    // search for url end
         Name := Trim(Copy(HTML, (actPosStart2 + 2), (actPosEnd - actPosStart2 - 2) ));
         LogMessage(Name);
     
         debug_pos1:=Pos('(',Name);
            if debug_pos1 >0 then
            Name := Copy(Name,0,debug_pos1-1);
            LogMessage(Name);


//If Original:
        actposstart := actposEnd + 5;
        actposstart := PosFrom('">', HTML, actposstart) + 2;
        actPosEnd := PosFrom('</', HTML, actPosstart) - 1;
        If copy(HTML, actposstart, 3) = 'alt' then
        OrigT := Copy(HTML, (actPosstart + 22),(actPosEnd-actPosStart-21))
        else
        OrigT := Title;


(* // Get Title  (for movies)
         actPosStart := PosFrom('<a href="', HTML, EndPos);   // search for url start;
         actPosStart2 := PosFrom('">', HTML, actPosStart)
         actPosEnd:=PosFrom('</a>', HTML, actPosStart2);    // search for url end
         Title := Trim(Copy(HTML, (actPosStart2 + 2), (actPosEnd - actPosStart2 - 2) ));
         LogMessage(Title);
     
         debug_pos1:=Pos('(',Title);
            if debug_pos1 >0 then
            Title := Copy(Title,0,debug_pos1-1);
            LogMessage(Title);
*)


         //Notes :
     actPosStart := PosFrom('<a href="', HTML, EndPos);
     actPosStart2 := PosFrom('</a>', HTML, actPosStart);
     Av := Trim(Copy(HTML, (actposstart2 + 5), 1));
            logmessage('AV : ' + Av);
            If Av = '<' then begin
            actposStart2 := (actposstart2 + 5);           //Step to go after "</a>"
            actposstart3 := PosFrom('>', HTML, actposStart2);
            actposend := PosFrom('<', HTML, actposstart3);
            Notes := Trim(Copy(HTML, (actposstart3 +1), (actposend - actposStart3 - 1)));
            logmessage('Notes :' + Notes);
            end;   

            debug_pos1:=Pos('(',Notes);
            if debug_pos1 >0 then
            Year:= Copy(Notes,0,debug_pos1-1);
            LogMessage(Notes);

           
            ///Get Year & Note :
        tmpYear := Copy(Notes, 0, 2);
        logMessage('tmpYear :' + tmpYear);

        Case tmpYear of
            'c.' : begin
               // Approximate year (circa)
               Year := Copy(Notes,4,4);
               Note := '';
            end;

            '19', '20' : begin
               // "1980s" or "1980?": the 5th character belongs to the year
               If (Copy(Notes,5,1) = 's') OR (Copy(Notes,5,1) = '?') then begin
                  Year := Copy(Notes,0,5);
                  Note := Copy(Notes,7, Length(Notes)-6);
               end
               else begin
                  Year := Copy(Notes,0,4);
                  Note := Copy(Notes,6, Length(Notes)-5);
               end;
            end;

            else begin
               // No recognizable year: keep everything as the note
               Year := '';
               Note := Notes;
            end;
        end;

logmessage('Year :' + Year);
        logmessage('Note :' + Note);


(*       // Get Year
     actPosStart := PosFrom('<a href="', HTML, EndPos);
     actPosStart2 := PosFrom('</a>', HTML, actPosStart);
     Av := Trim(Copy(HTML, (actposstart2 + 5), 1));
     logmessage('AV : ' + Av);
     If Av = '<' then begin
     actposStart2 := (actposstart2 + 5);           //Step to go after "</a>"
     actposstart3 := PosFrom('>', HTML, actposStart2);
     actposend := PosFrom('<', HTML, actposstart3);
     Year := Trim(Copy(HTML, (actposstart3 +1), (actposend - actposStart3 - 1)));
     logmessage(Year);
     end;   

         debug_pos1:=Pos('(',Year);
         if debug_pos1 >0 then
         Year:= Copy(Year,0,debug_pos1-1);
         LogMessage(Year);
*)


(* // Get Role (Now defined)
         actPosStart4 := PosFrom('<i>', HTML, (actposend-1)) + 4;
         actPosEnd:=PosFrom('</i>', HTML, actPosStart4) - 1;
         Role := Trim(Copy(HTML, actposStart4, (actPosEnd - actPosStart4)));
         LogMessage('Role: ' + Role);
     
         debug_pos1:=Pos('(',Role);
         if debug_pos1 >0 then
         Role:= Copy(Role,0,debug_pos1-1);
         LogMessage(Role);
*)


        // Get Role (Now defined)
     difpos := (PosFrom('<i>', HTML, (actposEnd-1))+4) - actposend;
     logmessage('DIFFERENCE : ' + intToStr(difpos));

     If difpos > 0 then begin
    If difpos < 200 then begin

            actPosStart := PosFrom('<i>', HTML, (actposend-1)) + 4;
            {actPosStart := PosFrom('> <i>', HTML, actPosStart) + 6;}
            actPosEnd:=PosFrom('</i></li>', HTML, actPosStart) - 1;
            Role := Trim(Copy(HTML, actposStart, (actPosEnd - actPosStart)));
            LogMessage('Role: ' + Role);
     
            debug_pos1:=Pos('(',Role);
            if debug_pos1 >0 then
            Role:= Copy(Role,0,debug_pos1-1);
            LogMessage(Role);

        end;
     end;


//AddPersonMovie(Trim(Title), '', Role, Year, LowerCase(URL1), ctActors);
AddPersonMovie(Trim(OrigT), '', Role, Year, LowerCase(URL1), ctActors);


(*    // Total Line
if Lien <> '' then
         Lien := Lien + #13;
if URL1 <> '' then
         Lien := Lien + '<link url="' + URL1 + '">';
            Lien := Lien + Name + '</link>';
if Year <> '' then
         Lien := Lien + ' • ' + Year;
If Note <> '' then
         Lien := Lien + ' • ' + Note;
if Role <> '' then
         Lien := Lien + ' • ' + Role;
*)
 
           
            // Total Line
        If Lien <> '' then
        Lien := Lien + #13;
        If URL1 <> '' then begin
        If OrigT <> Title then
        Lien := Lien + Name
        else
        Lien := Lien + '<link url="' + URL1 + '">' + Name + '</link>';
        end;
        If Year <> '' then
        Lien := Lien + ' • ' + Year;
        If Note <> '' then
        Lien := Lien + ' • ' + Note;
        If Role <> '' then
        Lien := Lien + ' • ' + Role;

        LogMessage('LIEN :' + Lien);
               

         curPos := PosFrom('<a href="', HTML, actPosEnd);
       end;
     
   
        if (Lien <> '') AND (Born = '') then
AddFieldValue(pfBio, Lien);
if (Lien  <> '') AND (Born <> '') then
AddFieldValue(pfBio, Born + #13+#13 + Lien);
end;


//Foto
curPos :=Pos('src="/actresses/id/',HTML);
if curPos > 0 then begin
EndPos := PosFrom('" width', HTML, curPos);
PhotoURL := BASE_URL + Copy(HTML, curPos + 5, EndPos - curPos - 5);
LogMessage('URL de la photo: '+ PhotoURL);
{PhotoURL := HTMLToText (PhotoURL);}
AddImageURL(4, PhotoURL);
end
else begin
PhotoURL := '';
end;

end;

Quote
Sorry, I don't understand your modification!
I modified this part of the Bio code.


Here is the complete modified Bio code:

Code: [Select]
//BIO:
curpos := Pos('<th>Films</th>', HTML);
    LogMessage('Films readout');
    if curPos > 0 then    begin

Lien := '';

    EndPos := curPos;
       while (curPos > 0) AND (curPos < PosFrom('</ul>', HTML, EndPos)) do begin
     
         EndPos := curPos; // Set last position to actual position
         // get url
         UrlPosStart := PosFrom('<a href="', HTML, EndPos);  // search for url start
         UrlPosEnd := PosFrom('" class="', HTML, UrlPosStart);  // search for url end     
         URL1 := BASE_URL + Copy(HTML, UrlPosStart + 9, (UrlPosEnd - UrlPosStart - 9) );
         LogMessage(URL1);

   
(*       // get url (for example)
UrlPosStart := PosFrom('<a href="', HTML, EndPos);  // search for url start
         UrlPosEnd := PosFrom('" class="', HTML, UrlPosStart);  // search for url end
         URL1 := BASE_URL + Trim(Copy(HTML, UrlPosStart + 9, (UrlPosEnd - UrlPosStart - 9) ));
         LogMessage(URL1);
*)


         // Get Name
         actPosStart := PosFrom('<a href="', HTML, EndPos);   // search for url start;
         actPosStart2 := PosFrom('">', HTML, actPosStart);
         actPosEnd:=PosFrom('</a>', HTML, actPosStart2);    // search for url end
         Name := Trim(Copy(HTML, (actPosStart2 + 2), (actPosEnd - actPosStart2 - 2) ));
         LogMessage(Name);
     
         debug_pos1:=Pos('(',Name);
            if debug_pos1 >0 then
            Name := Copy(Name,0,debug_pos1-1);
            LogMessage(Name);


//If Title:
        actposstart := actposEnd + 5;
        actposstart := PosFrom('">', HTML, actposstart) + 2;
        actPosEnd := PosFrom('</', HTML, actPosstart) - 1;
        If copy(HTML, actposstart, 11) = 'alternative' then
        Title := Copy(HTML, (actPosstart + 22),(actPosEnd-actPosStart-21))
        else
        Title := OrigT;


//If Original:
        actposstart := actposEnd + 5;
        actposstart := PosFrom('">', HTML, actposstart) + 2;
        actPosEnd := PosFrom('</', HTML, actPosstart) - 1;
        If copy(HTML, actposstart, 3) = 'alt' then
        OrigT := Copy(HTML, (actPosstart + 22),(actPosEnd-actPosStart-21))
        else
        OrigT := Title;


(* // Get Title  (for movies)
         actPosStart := PosFrom('<a href="', HTML, EndPos);   // search for url start;
         actPosStart2 := PosFrom('">', HTML, actPosStart)
         actPosEnd:=PosFrom('</a>', HTML, actPosStart2);    // search for url end
         Title := Trim(Copy(HTML, (actPosStart2 + 2), (actPosEnd - actPosStart2 - 2) ));
         LogMessage(Title);
     
         debug_pos1:=Pos('(',Title);
            if debug_pos1 >0 then
            Title := Copy(Title,0,debug_pos1-1);
            LogMessage(Title);
*)


         //Notes :
     actPosStart := PosFrom('<a href="', HTML, EndPos);
     actPosStart2 := PosFrom('</a>', HTML, actPosStart);
     Av := Trim(Copy(HTML, (actposstart2 + 5), 1));
            logmessage('AV : ' + Av);
            If Av = '<' then begin
            actposStart2 := (actposstart2 + 5);           //Step to go after "</a>"
            actposstart3 := PosFrom('>', HTML, actposStart2);
            actposend := PosFrom('<', HTML, actposstart3);
            Notes := Trim(Copy(HTML, (actposstart3 +1), (actposend - actposStart3 - 1)));
            logmessage('Notes :' + Notes);
            end;   

            debug_pos1:=Pos('(',Notes);
            if debug_pos1 >0 then
            Year:= Copy(Notes,0,debug_pos1-1);
            LogMessage(Notes);

           
            ///Get Year & Note :
        tmpYear := Copy(Notes, 0, 2);
        logMessage('tmpYear :' + tmpYear);

        Case tmpYear of
            'c.' : begin
               // Approximate year (circa)
               Year := Copy(Notes,4,4);
               Note := '';
            end;

            '19', '20' : begin
               // "1980s" or "1980?": the 5th character belongs to the year
               If (Copy(Notes,5,1) = 's') OR (Copy(Notes,5,1) = '?') then begin
                  Year := Copy(Notes,0,5);
                  Note := Copy(Notes,7, Length(Notes)-6);
               end
               else begin
                  Year := Copy(Notes,0,4);
                  Note := Copy(Notes,6, Length(Notes)-5);
               end;
            end;

            else begin
               // No recognizable year: keep everything as the note
               Year := '';
               Note := Notes;
            end;
        end;

logmessage('Year :' + Year);
        logmessage('Note :' + Note);


(*       // Get Year
     actPosStart := PosFrom('<a href="', HTML, EndPos);
     actPosStart2 := PosFrom('</a>', HTML, actPosStart);
     Av := Trim(Copy(HTML, (actposstart2 + 5), 1));
     logmessage('AV : ' + Av);
     If Av = '<' then begin
     actposStart2 := (actposstart2 + 5);           //Step to go after "</a>"
     actposstart3 := PosFrom('>', HTML, actposStart2);
     actposend := PosFrom('<', HTML, actposstart3);
     Year := Trim(Copy(HTML, (actposstart3 +1), (actposend - actposStart3 - 1)));
     logmessage(Year);
     end;   

         debug_pos1:=Pos('(',Year);
         if debug_pos1 >0 then
         Year:= Copy(Year,0,debug_pos1-1);
         LogMessage(Year);
*)


(* // Get Role (Now defined)
         actPosStart4 := PosFrom('<i>', HTML, (actposend-1)) + 4;
         actPosEnd:=PosFrom('</i>', HTML, actPosStart4) - 1;
         Role := Trim(Copy(HTML, actposStart4, (actPosEnd - actPosStart4)));
         LogMessage('Role: ' + Role);
     
         debug_pos1:=Pos('(',Role);
         if debug_pos1 >0 then
         Role:= Copy(Role,0,debug_pos1-1);
         LogMessage(Role);
*)


        // Get Role (Now defined)
     difpos := (PosFrom('<i>', HTML, (actposEnd-1))+4) - actposend;
     logmessage('DIFFERENCE : ' + intToStr(difpos));

     If difpos > 0 then begin
    If difpos < 200 then begin

            actPosStart := PosFrom('<i>', HTML, (actposend-1)) + 4;
            {actPosStart := PosFrom('> <i>', HTML, actPosStart) + 6;}
            actPosEnd:=PosFrom('</i></li>', HTML, actPosStart) - 1;
            Role := Trim(Copy(HTML, actposStart, (actPosEnd - actPosStart)));
            LogMessage('Role: ' + Role);
     
            debug_pos1:=Pos('(',Role);
            if debug_pos1 >0 then
            Role:= Copy(Role,0,debug_pos1-1);
            LogMessage(Role);

        end;
     end;


//AddPersonMovie(Trim(Title), '', Role, Year, LowerCase(URL1), ctActors);
AddPersonMovie(Trim(OrigT), '', Role, Year, LowerCase(URL1), ctActors);


(*    // Total Line
if Lien <> '' then
         Lien := Lien + #13;
if URL1 <> '' then
         Lien := Lien + '<link url="' + URL1 + '">';
            Lien := Lien + Name + '</link>';
if Year <> '' then
         Lien := Lien + ' • ' + Year;
If Note <> '' then
         Lien := Lien + ' • ' + Note;
if Role <> '' then
         Lien := Lien + ' • ' + Role;
*)
 
           
            // Total Line
        If Lien <> '' then
        Lien := Lien + #13;
        If URL1 <> '' then begin
    If Title <> OrigT then
        Lien := Lien + Name
else
If OrigT <> Title then
        Lien := Lien + Name
        else
        Lien := Lien + '<link url="' + URL1 + '">' + Name + '</link>';
        end;
        If Year <> '' then
        Lien := Lien + ' • ' + Year;
        If Note <> '' then
        Lien := Lien + ' • ' + Note;
        If Role <> '' then
        Lien := Lien + ' • ' + Role;

        LogMessage('LIEN :' + Lien);
               

         curPos := PosFrom('<a href="', HTML, actPosEnd);
       end;
     
   
        if (Lien <> '') AND (Born = '') then
            AddFieldValue(pfBio, Lien);
        if (Lien <> '') AND (Born <> '') then
            AddFieldValue(pfBio, Born + #13 + #13 + Lien);
end;

And now, in addition to the Original Title, the Title is also handled for entries that have no Alternative Title.
It is not the best solution.
The movie list in the Bio is not as clear as yours, which I like. Of course, if something needs to change, that is OK too.

If your version of this code is still better, perhaps there is something in it that could be fixed.
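
As a side note on the "Get Year & Note" step in the code above, here is a minimal standalone sketch of the same idea. It assumes the note text starts with one of the forms mentioned earlier in this topic (c. 1980, 1980s, 1980? or a plain 1980). SplitYearNote is a hypothetical helper written for Free Pascal with 1-based Copy indexes; it is not part of the PVD script API and, unlike the script, it keeps any text after a "c." year as the note.

```pascal
program YearNoteSketch;
{$mode objfpc}{$H+}
uses SysUtils;

// Hypothetical helper: splits a notes string into a year part and the remaining note.
procedure SplitYearNote(const Notes: string; out Year, Note: string);
var
  S: string;
  Len: Integer;
begin
  Year := '';
  Note := Trim(Notes);
  S := Note;
  // Assumed "circa" form: drop the "c." prefix and any spaces after it.
  if Copy(S, 1, 2) = 'c.' then
    S := Trim(Copy(S, 3, Length(S)));
  // A year is assumed to be four digits starting with 19 or 20.
  if (Length(S) >= 4)
     and ((Copy(S, 1, 2) = '19') or (Copy(S, 1, 2) = '20'))
     and (S[3] in ['0'..'9']) and (S[4] in ['0'..'9']) then
  begin
    Len := 4;
    // Keep a trailing "s" (decade) or "?" (uncertain year) with the year.
    if (Length(S) >= 5) and (S[5] in ['s', '?']) then
      Len := 5;
    Year := Copy(S, 1, Len);
    Note := Trim(Copy(S, Len + 1, Length(S)));
  end;
end;

var
  Y, N: string;
begin
  SplitYearNote('1980s (uncredited)', Y, N);
  WriteLn('Year: ', Y, ' | Note: ', N);
  SplitYearNote('c. 1975', Y, N);
  WriteLn('Year: ', Y, ' | Note: ', N);
  SplitYearNote('uncredited', Y, N);
  WriteLn('Year: ', Y, ' | Note: ', N);
end.
```

When no year is recognised, the whole trimmed text stays in Note and Year is left empty, which matches the else branch of the Case statement.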
Ivek23
Win 7 32bit, 64bit   PVD v0.9.9.21


Offline pra15

  • Power User
  • ****
  • Posts: 164
    • View Profile
Re: Script for egafd.com
« Reply #53 on: December 08, 2011, 01:52:20 pm »
Thanks, but I still don't understand!

The result is the same as the version before my modification.

The //If Title and //If Original parts have the same code (and the same effect)!

The same goes for the Lien part.


I can't find a page of an actress with birth info on egafd; if you have links, please post them!
« Last Edit: December 08, 2011, 02:19:20 pm by pra15 »

Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 1899
    • View Profile
Re: Script for egafd.com
« Reply #54 on: December 08, 2011, 02:48:50 pm »
Thanks, but I still don't understand!

The result is the same as the version before my modification.

The //If Title and //If Original parts have the same code (and the same effect)!

The same goes for the Lien part.
OK. Let's not deal with this any more and leave it as in your modification; if a better solution comes up later, we can change it. I would prefer that we try to find a solution for this:
Code: [Select]
//BORN:
curpos := Pos('<th>Notes</th>', HTML);
endpos := curpos;

While (curpos > 0) AND (curpos < Posfrom('</tr>', HTML, EndPos)) do begin
endpos := curpos;
PosStart := PosFrom('<td><ul class="list"><li>', HTML, endpos);
PosEnd := PosFrom('</li></ul></td>', HTML, PosStart);
Born := Trim(Copy(HTML, (PosStart + 25), (PosEnd - PosStart - 25)));
//LogMessage('BORN :' + Born);
curpos := posfrom('<td><ul class="list"><li>', HTML, posend);
if Born <> '' then
AddFieldValue(pfBirthplace, Born);
LogMessage('BORN :' + Born);
end;
Could BORN be changed so that everything else stays as it is now, but the Birthplace field is arranged to show data like these, for example:
Czech, b. 1985
Hungarian. b. 1978

if possible.
Ivek23
Win 7 32bit, 64bit   PVD v0.9.9.21


Offline pra15

  • Power User
  • ****
  • Posts: 164
    • View Profile
Re: Script for egafd.com
« Reply #55 on: December 08, 2011, 06:13:33 pm »
For Born,

first declare these variables:
PartBorn : TWideArray;
I : Integer;
Country, BirthDay : String;

and then, after //BORN:

Code: [Select]
////////BirthDay & Birthplace:

ExplodeString(Born, PartBorn, #46);
For I := Low(partBorn) to High(partBorn) do
Begin
PartBorn[I] := Trim(partBorn[I]);
End;

For I := Low(partBorn) to High(partBorn) do
Begin
If partBorn[I] = 'b' then begin
Country := partBorn[I-1];
BirthDay := '01/01/' + Copy(partBorn[I+1], 0, 4);
end;
end;

If Country <> '' then
AddFieldValue(pfBirthPlace, Country);
If BirthDay <> '' then
AddFieldValue(pfBirthday, BirthDay);


I don't have enough examples of pages with "Notes", but it works where I tried it.

We don't have the day and month of birth, so the script uses 01/01!
« Last Edit: December 08, 2011, 06:24:00 pm by pra15 »

Offline pra15

  • Power User
  • ****
  • Posts: 164
    • View Profile
Re: Script for egafd.com
« Reply #56 on: December 08, 2011, 06:32:10 pm »
For more safety, in case there is text before the country:

Also declare:
PartBorn2 : TWideArray;
J : Integer;

Code: [Select]
//BirthDay & Birthplace:

ExplodeString(Born, PartBorn, #46);
For I := Low(partBorn) to High(partBorn) do
Begin
PartBorn[I] := Trim(partBorn[I]);
End;

For I := Low(partBorn) to High(partBorn) do
Begin
If partBorn[I] = 'b' then begin
ExplodeString(partBorn[I-1], PartBorn2, ' ');
J := High(partBorn2);
Country := partBorn2[J];
BirthDay := '01/01/' + Copy(partBorn[I+1], 0, 4);
end;
end;

If Country <> '' then
AddFieldValue(pfBirthPlace, Country);
If BirthDay <> '' then
AddFieldValue(pfBirthday, BirthDay);

This works only with the form "Hungarian. b. 1980"!
« Last Edit: December 08, 2011, 06:39:45 pm by pra15 »

Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 1899
    • View Profile
Re: Script for egafd.com
« Reply #57 on: December 08, 2011, 07:24:05 pm »
If you do not mind, I will get to this tomorrow. Right now I am adjusting and testing the egafd_movie script to see how it works; the result will come tomorrow.
Ivek23
Win 7 32bit, 64bit   PVD v0.9.9.21


Offline pra15

  • Power User
  • ****
  • Posts: 164
    • View Profile
Re: Script for egafd.com
« Reply #58 on: December 08, 2011, 11:26:10 pm »
I have tested it and fixed two little errors (partBorn3 also has to be declared as a TWideArray):

Code: [Select]
//BirthDay & Birthplace:

ExplodeString(Born, PartBorn, #46);
For I := Low(partBorn) to High(partBorn) do
Begin
PartBorn[I] := Trim(partBorn[I]);
End;

For I := Low(partBorn) to High(partBorn) do
Begin
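// Guard the I-1 and I+1 accesses below against the first and last element
If (partBorn[I] = 'b') AND (I > Low(partBorn)) AND (I < High(partBorn)) then begin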
ExplodeString(partBorn[I-1], PartBorn2, ' ');
J := High(partBorn2);
Country := partBorn2[J];
BirthDay := '01/01/' + Copy(partBorn[I+1], 0, 4);
end;

// Guard the I+1 access below against the last element
If (Copy(partBorn[I], length(partBorn[I])-2,3) = ', b') AND (I < High(partBorn)) then begin
ExplodeString(partBorn[I], partBorn2, ',');
J := High(partBorn2);
ExplodeString(partBorn2[J-1], partBorn3, ' ');
J := High(partBorn3);
Country := partBorn3[J];
Birthday := '01/01/' + copy(partborn[I+1],0,4);
end;
end;

If Country <> '' then
AddFieldValue(pfBirthPlace, Country);
If BirthDay <> '' then
AddFieldValue(pfBirthday, BirthDay);

It now works for both forms ("Hungarian. b. 1978" and "Czech, b. 1985")!
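
For anyone who wants to try the same parsing outside PVD, here is a minimal standalone sketch, assuming only the two forms shown in this topic ("Czech, b. 1985" and "Hungarian. b. 1978"). ParseBorn is a hypothetical helper written for Free Pascal; it does not use ExplodeString or any other PVD function. It searches for the "b." marker directly instead of splitting on the dot, so it never indexes outside an array.

```pascal
program BornSketch;
{$mode objfpc}{$H+}
uses SysUtils;

// Hypothetical helper: takes the last word before "b." as the country
// and the four digits after "b." as the birth year.
procedure ParseBorn(const Born: string; out Country, BirthYear: string);
var
  P, I: Integer;
  Before, After: string;
begin
  Country := '';
  BirthYear := '';
  // Pad with a space so that "b." at the very start is found too.
  // P is then the index of the letter "b" in the unpadded string.
  P := Pos(' b.', ' ' + Born);
  if P = 0 then
    Exit;

  // Text before the marker, without trailing spaces, commas or dots.
  Before := Trim(Copy(Born, 1, P - 1));
  while (Length(Before) > 0) and (Before[Length(Before)] in [',', '.', ' ']) do
    SetLength(Before, Length(Before) - 1);

  // The country is assumed to be the last word of what remains.
  I := Length(Before);
  while (I > 0) and (Before[I] <> ' ') do
    Dec(I);
  Country := Copy(Before, I + 1, Length(Before));

  // Text after the marker: accept it only if it starts with four digits.
  After := Trim(Copy(Born, P + 2, Length(Born)));
  if Length(After) >= 4 then
  begin
    BirthYear := Copy(After, 1, 4);
    for I := 1 to 4 do
      if not (After[I] in ['0'..'9']) then
        BirthYear := '';
  end;
end;

var
  C, Y: string;
begin
  ParseBorn('Czech, b. 1985', C, Y);
  WriteLn(C, ' / ', Y);
  ParseBorn('Hungarian. b. 1978', C, Y);
  WriteLn(C, ' / ', Y);
end.
```

In the script itself the year would still have to be turned into a date such as '01/01/' + BirthYear before it goes into pfBirthday, as in the code above.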

Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 1899
    • View Profile
Re: Script for egafd.com
« Reply #59 on: December 09, 2011, 04:36:39 pm »
If you do not mind, I will get to this tomorrow. Right now I am adjusting and testing the egafd_movie script to see how it works; the result will come tomorrow.

I ran tests on about 100 movies, added a few extras and made some corrections so far; with a little delay, the egafd_mod7 script for movies is attached here.

I'm sorry, my mistake: I added the wrong URL code to the Cast.
This is the right URL code.

Code: [Select]
// get url
          UrlPosStart := PosFrom('href="', HTML, EndPos) + 6;
          UrlPosEnd := PosFrom('">', HTML, UrlPosStart);
          URL := BASE_URL + Trim(Copy(HTML, UrlPosStart, UrlPosEnd - UrlPosStart));
          LogMessage(URL);

The corrected egafd_mod7 script for movies is now attached as well.

[attachment deleted by admin]
« Last Edit: December 09, 2011, 05:20:05 pm by Ivek23 »
Ivek23
Win 7 32bit, 64bit   PVD v0.9.9.21