Personal Video Database
Topic started by: meriator on December 24, 2011, 03:32:03 am
-
The script I am currently trying to write involves URLs that need POST and others that need GET, and the site (http://www.filmportal.de) does not accept POST instead of GET or vice versa.
I need to get the URL of the search result that was chosen/clicked before the download starts, so that I can parse it and return the correct method.
I have tried everything without success: there is no way to get this URL before GetDownloadMethod is executed. The earliest point at which the URL is available is after the download, but by then the request has of course already failed, either with
HTTP/1.1 400 Bad Request
(GET instead of POST)
or
HTTP/1.1 405 Method Not Allowed
(POST instead of GET)
Is there a way?
Thanks
cu meriator
-
By the way, Happy Christ.... ;) to all here
cu meriator
-
The following link has a script that MAY help you...
http://www.videodb.info/forum_en/index.php/topic,1665.20.html
If you follow the smPage global variable, you'll see how I implemented multiple pages within the search results. In your case,
function ParsePage(HTML : String; URL : AnsiString): Cardinal;
var
  RVal : Integer;
begin
  if Pos('/mediaindex', URL) > 0 then begin // this is a valid IMDB image location
    if (Pos('&ipage=load', URL) > 0) and (Mode <> smPage) then begin // user selected a page, not an image
      PageUrl := URL;       // set the url to retrieve another search page
      Result := prDownLoad; // download the page
      Mode := smPage;       // parse search results
      // SET POST METHOD HERE
    end
    else begin
      // SET GET METHOD HERE
    end;
  end;
  // (rest of ParsePage omitted)
end;
If you have trouble, post the script and I'll take a look at it.
-
Thanks, mgpw4me, for trying, but this does not really solve the problem.
As I already explained above, the script parses the search results. At this point I already know which of the URLs needs which method, but I can't set a method here, because I do not know yet which of the result items will be chosen/clicked, and I can't attach an individual method to each item of the search list.
After an item has been chosen/clicked, the function GetDownloadMethod gets called, but at that point I can't get the URL that was clicked in order to parse it and return the right method.
Try it yourself: go to http://www.filmportal.de and search for "Fluch".
You will get a long list of results, but only 10 per page, so the script has to add buttons to the search results, at least
one for "go to next page" (if curPage < lastPage)
and
one for "go to previous page" (if curPage > 1)
The results themselves need the GET method; the navigation buttons need the POST method.
But how can I know which one will be clicked?
I hope this is more understandable now.
cu meriator
-
Understood. My explanation was not complete enough (or won't work). I don't see the problem, so I'll have to try it myself.
Now that I have a sample title to work with, I'll see what I can do with it over the next day or two.
I've been wanting to play with "post" anyway...aveleyman.com has content I want, but it requires "post" and I've been too lazy to investigate the process.
-
Finally looking at this....
I'm researching the exact process required by HTTP 1.1 standards for the POST method. I've noted the Impawards poster plugin isn't working (for me), and it requires POST for search and GET for data...and it requires minimal post variable passing. I didn't count, but filmportal looks like it requires at least 6 variables to be passed in the POST, so I'll do an example with Impawards, and expand on it (maybe tomorrow) for dealing with multiple search pages.
-
My apologies for being so slow.
I've been sidetracked by making the script as flexible and easy to use as possible, while providing a "good" solution to the problem. I'd prefer that the script be a good example for others trying to do the same thing (note: Pascal isn't my favorite language so it's taking longer than I'd like), and have been trying to make it portable...within the Windows and Wine environments (Wine support will be very limited).
I might post a zip file tomorrow. If not, then the next day for sure.
-
My script is working, except for the navigation buttons, which need the POST method instead of GET. I have to add them whenever more than 10 items are found, because in that case the search result page contains navigation buttons too.
The navigation URLs (POST) must contain ALL hidden form inputs. These vary from search to search, depending on the current page and the number of result pages.
In addition:
This field value must be looked up and set:
FF___FF__DF4425363084591DE0340003BA5CE267_0___Value=???
This field value
EncodingCheckString=ue
must be set manually and should be excluded from the parsed hidden inputs
(do NOT parse or set the original value "ü", to avoid URL-encoding errors).
This field value
FF___FF__DF4425363084591DE0340003BA5CE267_1___Value=2
must be set manually and should be excluded from the parsed hidden inputs.
It should be "2" (title search).
Do not parse the form action, because it sometimes contains ";jsessionid=value[a-z|0-9]{10}" before the "?"
(which also leads to URL-encoding errors).
Use this as the default action URL instead:
http://www.filmportal.de/dif/?FormName=PublicSearchForm&NavID=SimpleSearch&
The submit field for navigation should be set like this:
FF___ResultPager_%s___Value=%s
Everything else works perfectly so far.
My script is multilingual (en/de) for most of the data results,
except for this POST/GET thing :(
I'm very excited about a solution
(though I still believe it is currently impossible for scripts).
cu meriator
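A rough sketch of how the requirements in the post above might be turned into a POST body. It is written as a standalone Free Pascal program so it can be tried outside PVD; the helper names (AttrValue, CollectHidden, UrlEncode, BuildNavBody) and the sample HTML are made up for illustration. It assumes attribute values are double-quoted and does no HTML entity decoding. The two %s placeholders of the pager field are simply passed in by the caller, since the thread does not say what they stand for, and the ..._0___Value field gets no special treatment here.

```pascal
program NavPostSketch;
{$mode objfpc}{$H+}
uses
  SysUtils, Classes;

const
  // Both values are quoted from the post above.
  DefaultAction = 'http://www.filmportal.de/dif/?FormName=PublicSearchForm&NavID=SimpleSearch&';
  SearchTypeField = 'FF___FF__DF4425363084591DE0340003BA5CE267_1___Value';

// Hypothetical helper: value of a double-quoted attribute (Attr in lower case).
function AttrValue(const Tag, Attr: string): string;
var
  P, Q: Integer;
begin
  Result := '';
  P := Pos(' ' + Attr + '="', LowerCase(Tag));
  if P = 0 then Exit;
  P := P + Length(Attr) + 3; // first character after the opening quote
  Q := P;
  while (Q <= Length(Tag)) and (Tag[Q] <> '"') do Inc(Q);
  Result := Copy(Tag, P, Q - P);
end;

// Hypothetical helper: add Name=Value for every <input type="hidden"> in Html.
procedure CollectHidden(const Html: string; Fields: TStrings);
var
  Rest, Tag: string;
  P, Q: Integer;
begin
  Rest := Html;
  P := Pos('<input', LowerCase(Rest));
  while P > 0 do
  begin
    Delete(Rest, 1, P - 1);   // Rest now starts at '<input'
    Q := Pos('>', Rest);
    if Q = 0 then Break;      // unterminated tag: stop
    Tag := Copy(Rest, 1, Q);
    if LowerCase(AttrValue(Tag, 'type')) = 'hidden' then
      Fields.Add(AttrValue(Tag, 'name') + '=' + AttrValue(Tag, 'value'));
    Delete(Rest, 1, Q);
    P := Pos('<input', LowerCase(Rest));
  end;
end;

// Hypothetical helper: percent-encode everything except unreserved characters.
function UrlEncode(const S: string): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
    if S[I] in ['A'..'Z', 'a'..'z', '0'..'9', '-', '_', '.', '~'] then
      Result := Result + S[I]
    else
      Result := Result + '%' + IntToHex(Ord(S[I]), 2);
end;

// Builds the body: the two fixed fields first, then the remaining parsed
// hidden fields, then the pager field.
function BuildNavBody(Fields: TStrings; const PagerId, PagerValue: string): string;
var
  I: Integer;
  FieldName: string;
begin
  Result := 'EncodingCheckString=ue&' + SearchTypeField + '=2';
  for I := 0 to Fields.Count - 1 do
  begin
    FieldName := Fields.Names[I];
    // Skip unnamed inputs and the two fields that are set manually above.
    if (FieldName = '') or (FieldName = 'EncodingCheckString') or
       (FieldName = SearchTypeField) then
      Continue;
    Result := Result + '&' + UrlEncode(FieldName) + '=' +
              UrlEncode(Fields.ValueFromIndex[I]);
  end;
  Result := Result + '&' +
            Format('FF___ResultPager_%s___Value=%s', [PagerId, UrlEncode(PagerValue)]);
end;

var
  Fields: TStringList;
begin
  Fields := TStringList.Create;
  try
    // Made-up HTML fragment, only for demonstration.
    CollectHidden('<form><input type="hidden" name="EncodingCheckString" value="x">' +
                  '<input type="hidden" name="Demo Field" value="a b">' +
                  '<input type="text" name="q" value="Fluch"></form>', Fields);
    WriteLn('POST ', DefaultAction);
    WriteLn(BuildNavBody(Fields, 'next', '2'));
  finally
    Fields.Free;
  end;
end.
```

The form action is never read from the page, so the ";jsessionid=..." problem does not come up. The string helpers use only Pos, Copy, Delete and Length and would probably carry over to a PVD script, but whether TStrings is available there is not something this thread confirms.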
-
Quote from meriator: "I'm very excited about a solution (though I still believe it is currently impossible for scripts)"
procedure FileExecute(const AFileName : String; const Params : String);
Opens a specified file with the application associated with its file type or starts another application if an executable file is specified.
Parameters:
AFileName - file to open or execute
Params - command line parameters to pass with the file
I started writing code this afternoon.
My plan for POST / GET is to load a web page to the hard drive, remove <script> tags (I hate pop-ups) and load it into a browser IFRAME under script control, do the initial search, have the user click a "save" button when they get to the right page, then pass the html from the IFRAME back to the .PSF file via a text file. No passing post variables, no parsing of search results...the web site html takes care of the navigation. Since the html will be accessible via javascript, the possibility exists to use the Document Object Model to process the html file for some information, rather than parsing it in the PSF file. For example, with posters, I'll be able to look at the document.images collection without any messy string processing.
-
I've been way too busy, so the following code is not even close to 'ready', but it does provide proof the method will work.
Some data about the activex component used is here:
http://www.the-art-of-web.com/javascript/ajax/
At the moment, what it does:
- The PSF file invokes the HTA (HTML Application)
- HTA uses GET to load the source code for the impawards.com index page into a textarea
- HTA can save the textarea to a file
- the activex component is 'standard' on windows xp and up...no install required
Known issues:
ActiveXObject('msxml2.xmlhttp') works for VISTA, other objects are used by other O/S's. The loadXMLDoc(url, params) subroutine shows how to connect to the correct object.
POST isn't supported by this code...at the moment...but it looks simple enough...in fact it could be possible to write a generic "read the <input> tags" routine to build most of the post variables on-the-fly.
Very unlikely this code will work on anything but a "genuine windows" install.
The FileExecute routine will NOT wait for the executable to finish before continuing. You can either create a file and wait for it to be deleted by the HTA (you'll probably want a TIMEOUT value on the loop), or use the TASKLIST command (maybe in a batch file...TASKLIST > activetasks.txt) and check to see if the hta is active.
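The "create a file and wait for it to be deleted" idea from the last point might look roughly like this. It is a standalone Free Pascal sketch, not PVD script code: WaitForDeletion and the flag file name are made up, and it assumes FileExists and Sleep (or equivalents) are available in the script environment. Since there is no HTA to run here, the program deletes the flag file itself where the HTA would normally do it.

```pascal
program WaitForFlagSketch;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Hypothetical helper: poll until FlagFile disappears or TimeoutMs has
// elapsed. Returns True if the file was gone before the timeout.
function WaitForDeletion(const FlagFile: string; TimeoutMs: Integer): Boolean;
const
  PollMs = 200; // polling interval in milliseconds
var
  Waited: Integer;
begin
  Waited := 0;
  while FileExists(FlagFile) and (Waited < TimeoutMs) do
  begin
    Sleep(PollMs);
    Inc(Waited, PollMs);
  end;
  Result := not FileExists(FlagFile);
end;

var
  Flag: string;
  F: TextFile;
begin
  Flag := GetTempDir + 'pvd_hta_running.flag'; // hypothetical file name

  // 1. Create the flag file before starting the HTA.
  AssignFile(F, Flag);
  Rewrite(F);
  CloseFile(F);

  // 2. In a PVD script, the FileExecute call that starts the HTA would go
  //    here. The HTA would delete the flag file when the user is done.
  //    Stand-in for the HTA: delete the flag file ourselves.
  DeleteFile(Flag);

  // 3. Wait, but never longer than the timeout.
  if WaitForDeletion(Flag, 3000) then
    WriteLn('HTA finished')
  else
    WriteLn('timed out');
end.
```

The timeout matters because the user may close the HTA window without it ever deleting the file; without the limit the script would wait forever.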
----------------------
I'll update this code as I have time, but meanwhile I hope it gives you a starting point.
BTW, the second file, posterconfig.zip contains a user interface configuration dialog. Not very refined, but it works.
[attachment deleted by admin]