Author Topic: Typing/Clicking to search for Tags yields different results  (Read 9362 times)


Offline Pensare

  • Member
  • *
  • Posts: 13
Typing/Clicking to search for Tags yields different results
« on: November 14, 2010, 10:56:49 pm »
Version: 0.9.9.21
OS: Windows Vista (32 bit)

Problem: Searching for multi-word tags by typing in the search field (Tags) yields different results than clicking on the hyperlinked tag with the same words.

Background:
I will explain my personal procedure for entering data into the database.  The basic procedures have been used since I set up my database in November 2009.  I have always installed the latest updates whenever they were available.  If the version during import is an issue, I can tell you what day a movie was imported and, perhaps, you could determine which version was current at the time.  I use the "PVD Classic" skin.

1. In a browser, search for and navigate to movie on IMDB.
2. Select URL and copy to Windows clipboard
3. On the "Movies" screen, click "New" icon
4. Enter Title and paste URL into appropriate fields
5. Save by clicking "Apply Changes"
6. Click import plug-in icon (defaulted to, in my case, [EN] Get Movie Information from IMDB.com)

I can recreate in my database (you can have a copy if you'd like) using these steps:

1.  Select "12 Angry Men" (Added 11/10/2009)
2.  Expand "Tags", and click "Based on Play"
3.  Observe 50 movies are displayed in left panel.
4.  Clear the search criteria
5.  With Tags still as the search element, type "Based on Play" into search criteria.
6.  Observe 8 different movies are displayed in left panel
NOTE: The same is true if I click on the hyper-linked tag for one of these 8 movies ("The Awful Truth" is an example).


I can do this with many tags that include spaces.  What causes this difference?  Can I merge the tags so search results are complete?

Offline rick.ca

  • Global Moderator
  • *****
  • Posts: 3241
  • "I'm willing to shoot you!"
Re: Typing/Clicking to search for Tags yields different results
« Reply #1 on: November 15, 2010, 12:23:49 am »
Sorry, I'm unable to reproduce your problem. This may be because of the nature of my database. I import AllMovie "Themes" and "Tones" to my Categories and Tags fields. Tones are all single words, unlike IMDb tags. Themes include phrases, but it seems to work fine. AFAIK, all the link does is execute a simple search for the term in that field—exactly as if it were done manually. :-\

I suggest you examine the Tags of the 42 "extra" hits you get using the link. Do they have the "Based on Play" tag or not? Are they selected because other tags contain the words "Based" or "Play"? Do you have the same problem with your Categories field?

Maybe someone else who fills Tags from IMDb can test this for us.

Offline Pensare

  • Member
  • *
  • Posts: 13
Re: Typing/Clicking to search for Tags yields different results
« Reply #2 on: November 15, 2010, 01:39:35 am »
Quote
Do you have the same problem with your Categories field?

I have only a few movies with a Categories value: those I inadvertently imported from AllMovie as well as IMDb.  Unfortunately, no two of those movies share a category, so I cannot test it.

mgpw4me@yahoo.com

  • Guest
Re: Typing/Clicking to search for Tags yields different results
« Reply #3 on: November 15, 2010, 03:37:40 am »
I did a quick check on my database and the numbers all jibe.  I get 54 "Based on Play" movies with either the link or a typed search.  "Based on Novel" and other multi-word tags also work for me.

Offline rick.ca

  • Global Moderator
  • *****
  • Posts: 3241
  • "I'm willing to shoot you!"
Re: Typing/Clicking to search for Tags yields different results
« Reply #4 on: November 15, 2010, 04:00:52 am »
And I trust Spike and Chester agree on this one? ;)

Offline Pensare

  • Member
  • *
  • Posts: 13
Re: Typing/Clicking to search for Tags yields different results
« Reply #5 on: November 17, 2010, 03:13:47 am »
Is there a way to query the database directly?  Maybe my database has "unprintable" characters in it.

Offline rick.ca

  • Global Moderator
  • *****
  • Posts: 3241
  • "I'm willing to shoot you!"
Re: Typing/Clicking to search for Tags yields different results
« Reply #6 on: November 17, 2010, 08:36:33 am »
Quote
I suggest you examine the Tags of the 42 "extra" hits you get using the link. Do they have the "Based on Play" tag or not? Are they selected because other tags contain the words "Based" or "Play"?

mgpw4me@yahoo.com

  • Guest
Re: Typing/Clicking to search for Tags yields different results
« Reply #7 on: November 17, 2010, 03:01:08 pm »
Quote
Is there a way to query the database directly?  Maybe my database has "unprintable" characters in it.

I suggest you use EXPORT to create a text file.

Export documentation:
http://www.videodb.info/help/hlp_export.html

Offline Pensare

  • Member
  • *
  • Posts: 13
Re: Typing/Clicking to search for Tags yields different results
« Reply #8 on: November 18, 2010, 03:15:54 am »
Quote
I suggest you examine the Tags of the 42 "extra" hits you get using the link. Do they have the "Based on Play" tag or not? Are they selected because other tags contain the words "Based" or "Play"?

I exported each set twice: once with the default "XML (Movies)" export plugin template, and once with a modified template that differs only in having the encoding option removed.  Both exports look the same when viewed with the usual Windows tools (Notepad, WordPad, PSPad text editor), but when I view them using Unix-like utilities, I find there are non-printable characters in the larger (hyperlink) set.  The extra characters only show up in the tags field.

(We'll see what your online editor does with them.  But I'll attach screenshots of the XML files in Emacs.)
=========
grep -Ei "[^/g]title|based on play" filename.xml|less

encoding="UTF8"
<title>12 Angry Men</title>
<tags>Murder, Jury, Trial, Jury Room, Restroom, All Male Cast, Heat Wave, Based On TV Movie, Watchmaker, Racism, Real Time,Ensemble Cast, Eyeglasses, Evidence, Photograph, Immigrant, Single Set Production, New York City, Legal, Advertising Executive, Witness, No Music, Switchblade, One Day, Death Threat, Stockbroker, Law, Architect, Salesman, Bank Clerk, Father Son Estrangement, Coach, Rainstorm, Based On Play, Number In Title, AFI Top 100-2007, MPAA Approved</tags>
<title>Amadeus</title>
<tags>Envy, Flashback, Musician, Asylum, Catholic Priest, Priest, Genius, Emperor, Classical Music, Loss Of Father, Wheelchair,1820s, Salon, Archbishop, Domineering Father, Maid, Prayer, Musical Chairs, Marriage, 1790s, Powdered Wig, Billiards, Jealousy, Buxom, Fainting, Marriage Proposal, Opera, Flatulence, Fireplace, Deceit, Lifting Person In Air, Mockery, Choking, Laughter, Talent, Pregnancy, Mediocrity, Cemetery, Primadonna, Crucifix, Piano, Candy, Child Prodigy, Italian, Father Son Relationship, Music Lesson, Intrigue, Loss Of Husband, Theater, Dictation, Madhouse, Blindfold, Funeral, 18th Century, Vienna Austria, Dog, Wedding, Tragedy, Composer, Confession, Censorship, Dwarf, Told In Flashback, Mozart's Requiem, Tony Award Source, Actor, Mother In Law, Murder, Kidney Failure, Male Frontal Nudity, Opera Parody, Masquerade, Mass Grave, Mask, One Word Title, Suicide Attempt, 1780s, Independent Film, Based On Play, Character Name In Title, MPAA R, AFI Top 100-1998</tags>

Unencoded
<title>12 Angry Men</title>
<tags>Murder, Jury, Trial, JuryáRoom, Restroom, AlláMaleáCast, HeatáWave, BasedáOnáTVáMovie, Watchmaker, Racism, RealáTime, EnsembleáCast, Eyeglasses, Evidence, Photograph, Immigrant, SingleáSetáProduction, NewáYorkáCity, Legal, AdvertisingáExecutive, Witness, NoáMusic, Switchblade, OneáDay, DeatháThreat, Stockbroker, Law, Architect, Salesman, BankáClerk, FatheráSonáEstrangement, Coach, Rainstorm, BasedáOnáPlay, NumberáInáTitle, AFI Top 100-2007, MPAA Approved</tags>
<title>Amadeus</title>
<tags>Envy, Flashback, Musician, Asylum, CatholicáPriest, Priest, Genius, Emperor, ClassicaláMusic, LossáOfáFather, Wheelchair, 1820s, Salon, Archbishop, DomineeringáFather, Maid, Prayer, MusicaláChairs, Marriage, 1790s, PowderedáWig, Billiards, Jealousy, Buxom, Fainting, MarriageáProposal, Opera, Flatulence, Fireplace, Deceit, LiftingáPersonáInáAir, Mockery, Choking, Laughter, Talent, Pregnancy, Mediocrity, Cemetery, Primadonna, Crucifix, Piano, Candy, ChildáProdigy, Italian, FatheráSonáRelationship, MusicáLesson, Intrigue, LossáOfáHusband, Theater, Dictation, Madhouse, Blindfold, Funeral, 18tháCentury, ViennaáAustria, Dog, Wedding, Tragedy, Composer, Confession, Censorship, Dwarf, ToldáInáFlashback, Mozart'sáRequiem, TonyáAwardáSource, Actor, MotheráInáLaw, Murder, KidneyáFailure, MaleáFrontaláNudity, OperaáParody, Masquerade, MassáGrave, Mask, OneáWordáTitle, SuicideáAttempt, 1780s, IndependentáFilm, BasedáOnáPlay, CharacteráNameáInáTitle, MPAA R, AFI Top 100-1998</tags>
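For anyone who wants to run the same kind of check without Unix tools, here is a minimal Python sketch (not part of PVD) that lists the characters in a tags string that are not plain printable ASCII. The function name and the sample string are made up for illustration. It also assumes, and this is only an assumption, that the stray byte is 0xA0 (a non-breaking space in Latin-1/Windows-1252). That would be consistent with the "á" in the unencoded listing, because code page 437 displays byte 0xA0 as "á".

```python
import unicodedata


def suspect_characters(tags_field: str) -> dict[str, list[str]]:
    """Map each tag to the control or non-ASCII characters it contains."""
    report = {}
    for tag in tags_field.split(","):
        tag = tag.strip(" ")  # strip ordinary spaces only, keep look-alikes
        odd = [
            f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}"
            for ch in tag
            if ord(ch) < 32 or ord(ch) > 126
        ]
        if odd:
            report[tag] = odd
    return report


# Hypothetical sample: the first tag uses non-breaking spaces (assumed),
# the second uses ordinary spaces.
sample = "Based\u00a0On\u00a0Play, AFI Top 100-2007"
for tag, chars in suspect_characters(sample).items():
    print(repr(tag), chars)

# Why a viewer using a DOS code page would show "á" for byte 0xA0:
print(b"\xa0".decode("cp437"))
```

Tags that come back in the report are the ones a typed search with ordinary spaces would presumably miss.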



[attachment deleted by admin]

Offline rick.ca

  • Global Moderator
  • *****
  • Posts: 3241
  • "I'm willing to shoot you!"
Re: Typing/Clicking to search for Tags yields different results
« Reply #9 on: November 18, 2010, 03:45:42 am »
You still haven't explained what the problem is. How many of your movies actually have the "Based on Play" tag? 8 or 50? Is 12 Angry Men one of the 8 or one of the 42? Are you suggesting a manual search for "Based on Play" will not find 12 Angry Men because the value is actually "Based On Play"? If so, what about the 8 that are found? Are they found because they do not have special characters in the value? Is my assumption that the 8 are a subset of the 50 even true? Or is one search finding 50 records where the values include special characters and the other finds 8 different records where the values do not include special characters? Or vice versa?

mgpw4me@yahoo.com

  • Guest
Re: Typing/Clicking to search for Tags yields different results
« Reply #10 on: November 18, 2010, 03:50:28 am »
Quote
Is there a way to query the database directly?  Maybe my database has "unprintable" characters in it.

Nice catch big guy.

I can't imagine how those characters got there, but it looks like all you have to do now is a search / replace on those characters in the exported file, then use the IMPORT function to replace the data.  Be sure to back up your database first.

** NOTE ** Windows may not show the characters as being different, but dollars to donuts it will copy / paste into the replace dialog properly.
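One way to do that search / replace outside a text editor is sketched below in Python. This is only an illustration: `normalize_spaces` and `clean_export` are hypothetical helper names, the file encoding is an assumption (adjust it to match how the export was written), and the sketch treats every Unicode space separator as a candidate for replacement, which may be broader than needed. Work on a copy of the export, not the original.

```python
import unicodedata


def normalize_spaces(text: str) -> str:
    """Replace every Unicode space separator (category Zs) with an ASCII space."""
    return "".join(" " if unicodedata.category(ch) == "Zs" else ch for ch in text)


def clean_export(src_path: str, dst_path: str, encoding: str = "utf-8") -> int:
    """Write a cleaned copy of src_path to dst_path.

    Returns the number of characters that were replaced. The default
    encoding is an assumption, not something PVD is known to use.
    """
    with open(src_path, encoding=encoding) as src:
        original = src.read()
    cleaned = normalize_spaces(original)
    with open(dst_path, "w", encoding=encoding) as dst:
        dst.write(cleaned)
    # Replacement is one-for-one, so a position-by-position comparison
    # counts exactly the characters that changed.
    return sum(1 for before, after in zip(original, cleaned) if before != after)
```

A call would look something like `clean_export("export.xml", "export_clean.xml")`, where both file names are placeholders. A return value of 0 means nothing needed replacing.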


It's probably best to reset your IMDb plug-in preferences to OVERWRITE your TAGS and try getting the data again.  If you still have the same problem, there could be issues with the plug-in.
« Last Edit: November 18, 2010, 03:56:02 am by mgpw4me@yahoo.com »

Offline Pensare

  • Member
  • *
  • Posts: 13
Re: Typing/Clicking to search for Tags yields different results
« Reply #11 on: November 18, 2010, 05:58:16 am »
Quote
You still haven't explained what the problem is. How many of your movies actually have the "Based on Play" tag? 8 or 50? Is 12 Angry Men one of the 8 or one of the 42? Are you suggesting a manual search for "Based on Play" will not find 12 Angry Men because the value is actually "Based On Play"? If so, what about the 8 that are found? Are they found because they do not have special characters in the value? Is my assumption that the 8 are a subset of the 50 even true? Or is one search finding 50 records where the values include special characters and the other finds 8 different records where the values do not include special characters? Or vice versa?

S1. You still haven't explained what the problem is.
R1. I apologize for not being clear.  The problem is, apparently, that my database has characters that appear to be spaces but are not. 

Q1. How many of your movies actually have the "Based on Play" tag? 8 or 50?
A1. To the best of my knowledge, 58.

Q2. Is 12 Angry Men one of the 8 or one of the 42?
A2. One of the 50.

Q3. Are you suggesting a manual search for "Based on Play" will not find 12 Angry Men because the value is actually "Based On Play"?
A3. Yes
 
Q4. If so, what about the 8 that are found? Are they found because they do not have special characters in the value?
A4. That is my belief.  The string "Based on Play" contains space characters which are not in the tags of "12 Angry Men".

Q5. Is my assumption the 8 are a subset of the 50 even true?
A5. No. (Once again, sorry about not being clear).

Q6. Or is one search finding 50 records where the values include special characters and the other finds 8 different records where the values do not include special characters?
A6. Clicking the tag link on a record with special characters matches the other records that contain the same sequence of characters (50 in my case), while typing in the Search box, or clicking the tag on a record without special characters, matches only the records whose tag uses ordinary spaces (8 in my case).
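
To illustrate the answers above, here is a small Python sketch of how two tags that look identical on screen can give two separate result sets. The catalogue is a made-up three-movie stand-in for the database, the non-breaking space is an assumed identity for the look-alike character, and the whole-tag, case-insensitive comparison is an assumption about how the search behaves, not a description of PVD's actual code.

```python
NBSP = "\u00a0"  # assumed look-alike character; displays like a space

# Hypothetical mini-catalogue using titles and tags mentioned in this thread.
movies = {
    "12 Angry Men": [f"Based{NBSP}On{NBSP}Play", "Jury"],
    "Amadeus": [f"Based{NBSP}On{NBSP}Play", "Opera"],
    "The Awful Truth": ["Based On Play"],
}


def search_by_tag(catalogue, term):
    """Return titles that have a tag equal to term, ignoring case only."""
    wanted = term.casefold()
    return sorted(
        title
        for title, tags in catalogue.items()
        if any(tag.casefold() == wanted for tag in tags)
    )


# Typing the tag produces ordinary spaces.
typed = search_by_tag(movies, "Based on Play")
# Clicking the link reuses the stored characters, look-alikes included.
clicked = search_by_tag(movies, movies["12 Angry Men"][0])

print(typed)
print(clicked)
```

In this sketch the typed search finds only the record stored with ordinary spaces, and the clicked link finds only the records stored with the look-alike character, so the two lists never overlap.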

Offline Pensare

  • Member
  • *
  • Posts: 13
Re: Typing/Clicking to search for Tags yields different results
« Reply #12 on: November 18, 2010, 06:23:12 am »
Quote
It's probably best to reset your imdb plug-in preferences to OVERWRITE your TAGS and try getting the data again.
I tried mgpw4me's suggestion and the non-printable characters are gone.  Thanks.

Now comes the process of re-importing most of my database.  :-\

Offline rick.ca

  • Global Moderator
  • *****
  • Posts: 3241
  • "I'm willing to shoot you!"
Re: Typing/Clicking to search for Tags yields different results
« Reply #13 on: November 18, 2010, 06:59:47 am »
Quote
A1...6

That's much better, thanks. ;)

Hopefully, downloading the data again will fix everything. I suppose it would be pointless to try to narrow down the cause, especially with the number of plugin updates that have been required in the recent past. If anyone else is affected, doing the same thing is probably the only practical fix. After updating, please do the same export to verify all is well. If it's not, we'll need to distract nostra from 1.0.0.1 development. :-\