[SOLVED] Multi threaded IMDB fetching?

Happy2k:

--- Quote from: rick.ca on August 28, 2010, 12:08:15 am ---Now there's an idea! Videos for every possible help topic could be produced, and a PVD database created to catalogue them—along with any textual help that might be available (e.g., a wiki or forum topic). Users could add the videos to their collection and import the catalogue. Help topics could be found using Search or Advanced search, and the videos played directly from PVD. 8)

Do we have any volunteers for this project? ;)

--- End quote ---

I would, if I weren't starting a new semester on Monday. An up-to-date wiki could help a lot. It shouldn't be too hard to fix some topics, so hopefully I'll get some time to do that.

patch:

--- Quote from: rick.ca on August 28, 2010, 03:09:37 am ---The idea of running in parallel is fine—assuming each field is filled using only one source.

--- End quote ---
In option 1) above, the plugins for an individual movie are run in series, ensuring all the current data dependencies are preserved, i.e. with time on the horizontal axis:

movie 1 imdb -> movie 1 allmovie -> movie 1 amazon
--------------> movie 2 imdb      -> movie 2 allmovie  -> movie 2 amazon
----------------------------------> movie 3 imdb      -> movie 3 allmovie -> movie 3 amazon

As a result, all sites are accessed in parallel. How to do this is what the separate tasks and queues in the earlier post were for.
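
A minimal sketch of that staggered schedule, assuming one worker thread per site with a queue between consecutive sites. The names `SITES`, `fetch`, `stage` and `run_pipeline` are made up for illustration and are not part of PVD or its plugins, and `fetch` is a stand-in that does no network access:

```python
import queue
import threading

# Assumed plugin order for every movie (hypothetical, taken from the diagram).
SITES = ["imdb", "allmovie", "amazon"]


def fetch(site, movie):
    # Stand-in for a real plugin call; returns a dummy value, no network I/O.
    return f"{site}:{movie}"


def stage(site, inbox, outbox, log, lock):
    # One worker per site: takes movies in order, runs this site's plugin,
    # then hands the movie on to the next site's queue.
    while True:
        item = inbox.get()
        if item is None:           # sentinel: pass shutdown downstream, stop
            outbox.put(None)
            break
        movie, results = item
        results.append(fetch(site, movie))
        with lock:
            log.append((site, movie))
        outbox.put((movie, results))


def run_pipeline(movies):
    # queues[i] feeds SITES[i]; the last queue collects finished movies.
    queues = [queue.Queue() for _ in range(len(SITES) + 1)]
    log, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=stage,
                         args=(site, queues[i], queues[i + 1], log, lock))
        for i, site in enumerate(SITES)
    ]
    for t in threads:
        t.start()
    for movie in movies:
        queues[0].put((movie, []))
    queues[0].put(None)
    for t in threads:
        t.join()

    done = {}
    while True:
        item = queues[-1].get()
        if item is None:
            break
        movie, results = item
        done[movie] = results
    return done, log


if __name__ == "__main__":
    done, log = run_pipeline(["movie 1", "movie 2", "movie 3"])
    for movie, results in done.items():
        print(movie, results)
```

Each movie still passes through the sites strictly in series, so a later plugin can rely on what an earlier one stored, while the three workers can be busy with three different movies at the same time.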


--- Quote ---Even more problematic is the idea of getting only data for fields that need it. (I'm not sure this is what you meant, but it's implied by "maybe a lot faster for later incremental updates to an established PVD database.") Data changes. The only way to determine whether data needs to be updated is to compare it to what's currently available. So it's faster just to download all the data.

It would be helpful if fields set to "ignore" were omitted.

--- End quote ---
Currently, for each field we can specify:
a) Solid tick -> get the value and overwrite the existing value.
b) Grey -> store the value only if there is no prior data (so it only really needs to be fetched if the field is currently empty).
c) White / blank check box -> do not use this value (so no need to get it).

For a populated field, b and c do not need to be downloaded; for an unpopulated field, c does not need to be downloaded.
The potential saving depends on the granularity used by each site's web interface and how slow a particular page is to download, for example images, full cast lists and deeper technical pages.
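
The rule above can be sketched as a small filter. This is only an illustration: the `Mode` names and the `fields_to_fetch` function are hypothetical, not PVD's actual settings or API, and a field counts as "populated" here simply when its stored value is non-empty:

```python
from enum import Enum


class Mode(Enum):
    OVERWRITE = "solid tick"   # a) always fetch and overwrite
    IF_EMPTY = "grey"          # b) store only if the field has no data yet
    IGNORE = "blank"           # c) never use this value


def fields_to_fetch(modes, record):
    """Return the fields that actually need data for this record."""
    needed = []
    for field, mode in modes.items():
        populated = bool(record.get(field))
        if mode is Mode.OVERWRITE:
            needed.append(field)
        elif mode is Mode.IF_EMPTY and not populated:
            needed.append(field)
        # Mode.IGNORE, or IF_EMPTY on a populated field: nothing to fetch.
    return needed


if __name__ == "__main__":
    modes = {
        "title": Mode.OVERWRITE,
        "rating": Mode.IF_EMPTY,
        "cast": Mode.IF_EMPTY,
        "poster": Mode.IGNORE,
    }
    record = {"title": "Some Movie", "rating": "7.1", "cast": ""}
    print(fields_to_fetch(modes, record))
```

With these example settings only `title` and `cast` are needed: `rating` is grey but already filled, and `poster` is ignored.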

patch:

--- Quote from: nostra on August 27, 2010, 08:48:10 pm ---
--- Quote ---But I really like PVD. It was hard getting it to work properly in the beginning, but the possibilities are endless.
--- End quote ---

Unfortunately, it seems many users have difficulties when they start using PVD. If you have suggestions on how to make the process easier for beginners, feel free to post in the Feature Suggestions board.

--- End quote ---

I suspect the problem is that there is a trade-off between a program's ease of use and its flexibility.
What we could do is focus on new users' initial experience, because if this can be made positive then many will see reason to put in enough time to see the benefit of the more advanced features.

Some ideas to help achieve this:
1) The wiki should open on a contents and search page focused on PVD, not the wiki engine.
2) Clear, simple tutorials for basic initial setup tasks.
3) Let advanced users, rather than beginners, be the ones who have to burrow through multiple pages to get to what is relevant to them. As such, documenting program usage instructions in a forum is the opposite of what we need.

rick.ca:

--- Quote ---In option 1) above, the plugins for an individual movie are run in series, ensuring all the current data dependencies are preserved...
--- End quote ---

OIC. Yes, that would work.

So do you think running multiple instances of the program, each simultaneously running the plugin, would be a valid test of whether IMDb would tolerate this kind of hammering?


--- Quote ---For a populated field, b and c do not need to be downloaded; for an unpopulated field, c does not need to be downloaded.
--- End quote ---

Yes, this is essentially what I was referring to in the last paragraph of my post. I don't think it matters much if there isn't much of a time saving in the general case, as long as there is a big saving in cases where only a few fields are required. But the savings are only going to happen to the extent pages don't have to be downloaded. If only three fields need data, but they come from three different pages, it's not going to be much faster. :-\
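
To make that concrete, here is a rough sketch that counts pages rather than fields. The field-to-page mapping and the page names are invented for the example and do not describe how IMDb or the plugin actually splits its data:

```python
# Hypothetical mapping of fields to the page each one is scraped from.
FIELD_PAGE = {
    "title": "main",
    "year": "main",
    "rating": "main",
    "cast": "fullcredits",
    "runtime": "technical",
    "poster": "images",
}


def pages_to_download(needed_fields, field_page=FIELD_PAGE):
    """Return the set of pages required to fill the given fields."""
    return {field_page[f] for f in needed_fields}


def pages_saved(needed_fields, field_page=FIELD_PAGE):
    """Number of pages that can be skipped compared with a full update."""
    all_pages = set(field_page.values())
    return len(all_pages) - len(pages_to_download(needed_fields, field_page))


if __name__ == "__main__":
    # Three fields on one page: only one download out of four.
    print(pages_to_download(["title", "year", "rating"]))
    # Three fields on three different pages: three downloads out of four.
    print(pages_to_download(["rating", "cast", "runtime"]))
```

Under this made-up mapping, three fields from the same page skip three of the four pages, while three fields spread over three pages skip only one.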


--- Quote ---As such, documenting program usage instructions in a forum is the opposite of what we need.
--- End quote ---

I, of course, disagree with this. We have a wiki. It doesn't seem to be used much, and there's quite obviously no one with any interest in contributing to it. So the evidence seems to be to the contrary. No one is going to argue that things like tutorials or videos or help files wouldn't be nice to have. But it seems clear no one is willing to create them. Besides, most people are still going to find software like this a bit of a challenge to learn, regardless of the documentation available. So they're going to be back here asking for help anyway.

buah:
Maybe we should try to make videos on specific issues from now on? Whoever resolved the issue could make a video tutorial for it, or someone else who's willing to. A new "Video Tutorials" board could be established for that purpose...
