Author Topic: Bug: Export templates and UTF8BOM

Offline svenne

  • Power User
  • ****
  • Posts: 145
Bug: Export templates and UTF8BOM
« on: June 07, 2010, 12:36:48 am »
This is about export templates and the charset encoding UTF8BOM, PVD v0.9.9.21:
It seems to me that something goes wrong when exporting with:
encoding="UTF8BOM"
Every template using UTF8BOM generates three characters (the byte order mark) at the beginning of the exported file. These should be invisible, but none of the programs I tested interprets them; they are displayed instead. Special characters within the text are broken as well. For example, "ü" becomes "Ã¼". Notepad (Unicode-capable, WinXP) reports the files as ANSI-encoded instead of UTF-8. Other files (not generated by PVD) saved as UTF-8 with or without a byte order mark are handled correctly by Notepad.
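
For readers unfamiliar with the symptom, here is a minimal Python sketch of what happens when UTF-8 bytes are read as ANSI. It assumes "ANSI" means Windows code page 1252 (the Western European default); it does not use PVD or reproduce its export code.

```python
import codecs

# "ü" is stored as two bytes in UTF-8.
utf8_bytes = "ü".encode("utf-8")           # C3 BC

# An editor that falls back to ANSI (assumed here: cp1252) shows
# each byte as a separate character.
misread = utf8_bytes.decode("cp1252")
print(misread)

# A valid UTF-8 byte order mark is the three bytes EF BB BF.
bom = codecs.BOM_UTF8
print(bom.hex(" "))

# Read as cp1252, the BOM itself becomes three visible characters.
bom_misread = bom.decode("cp1252")
print(bom_misread)

# A BOM-aware decoder strips the mark and returns the original text.
decoded = (bom + utf8_bytes).decode("utf-8-sig")
print(decoded)
```

If the mark at the start of the file is damaged, an editor has no signal that the file is UTF-8 and may fall back to ANSI, which would explain both the visible leading characters and the broken umlauts.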

buah

  • Guest
Re: Bug: Export templates and UTF8BOM
« Reply #1 on: June 07, 2010, 01:11:18 am »
Another unicode issue?

Offline rick.ca

  • Global Moderator
  • *****
  • Posts: 3241
  • "I'm willing to shoot you!"
Re: Bug: Export templates and UTF8BOM
« Reply #2 on: June 07, 2010, 01:39:27 am »
Quote
Every template using UTF8BOM generates three characters (the byte order mark) at the beginning of the exported file.

So that's why I was getting "?»?" at the beginning of my files. I'm glad I'm not alone. :-\

Offline nostra

  • Administrator
  • *****
  • Posts: 2852
    • Personal Video Database
Re: Bug: Export templates and UTF8BOM
« Reply #3 on: July 05, 2010, 10:41:10 pm »
Strange, but all the apps I tried did interpret the BOM.
You are using Windows 7 as well, Rick, where do you see these characters?
Gentlemen, you can’t fight in here! This is the War Room!

buah

  • Guest
Re: Bug: Export templates and UTF8BOM
« Reply #4 on: July 05, 2010, 11:18:50 pm »
Quote
Rick, where do you see these characters?

I believe in HTML files edited with Notepad, or some other "pad". I can confirm that those characters also appear outside PVD: I saw them with earlier Unicode issues in HTML files exported from "WhereIsIt?".

Offline nostra

  • Administrator
  • *****
  • Posts: 2852
    • Personal Video Database
Re: Bug: Export templates and UTF8BOM
« Reply #5 on: July 05, 2010, 11:33:28 pm »
Unfortunately I can't reproduce the problem with either Notepad or Notepad++.

Offline rick.ca

  • Global Moderator
  • *****
  • Posts: 3241
  • "I'm willing to shoot you!"
Re: Bug: Export templates and UTF8BOM
« Reply #6 on: July 06, 2010, 12:04:19 am »
Quote
You are using Windows 7 as well, Rick, where do you see these characters?

Windows 7. If, for example, I export using the "plain list" template and open the result in Notepad++, the first three characters displayed are ?»?.

Offline nostra

  • Administrator
  • *****
  • Posts: 2852
    • Personal Video Database
Re: Bug: Export templates and UTF8BOM
« Reply #7 on: July 06, 2010, 12:16:17 am »
Hmm, I do not know what I am doing wrong, but I cannot reproduce it :(

Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 2667
Re: Bug: Export templates and UTF8BOM
« Reply #8 on: July 06, 2010, 05:30:01 am »
Quote
Every template using UTF8BOM generates three characters (the byte order mark) at the beginning of the exported file.

So that's why I was getting "?»?" at the beginning of my files. I'm glad I'm not alone. :-\
It is the same for me, using XP Pro SP3.
« Last Edit: July 06, 2010, 05:31:58 am by Ivek23 »
Ivek23
Win 10 64bit (32bit)   PVD v0.9.9.21, PVD v1.0.2.7, PVD v1.0.2.7 + MOD


Offline svenne

  • Power User
  • ****
  • Posts: 145
Re: Bug: Export templates and UTF8BOM
« Reply #9 on: July 07, 2010, 01:03:24 pm »
Quote
Unfortunately I can't reproduce the problem
Just a guess: perhaps something is different with your database that has an effect on this issue? Did you try it with a newly generated database, too?

Offline nostra

  • Administrator
  • *****
  • Posts: 2852
    • Personal Video Database
Re: Bug: Export templates and UTF8BOM
« Reply #10 on: July 07, 2010, 03:38:35 pm »
Quote
Unfortunately I can't reproduce the problem
Just a guess: perhaps something is different with your database that has an effect on this issue? Did you try it with a newly generated database, too?

I have tried it with a new database, with the same result. I think I must try it on a Win XP machine...

mgpw4me@yahoo.com

  • Guest
Re: Bug: Export templates and UTF8BOM
« Reply #11 on: July 07, 2010, 05:14:42 pm »
Vista also shows the header, regardless of the app (e.g. Notepad++).

Offline svenne

  • Power User
  • ****
  • Posts: 145
Re: Bug: Export templates and UTF8BOM
« Reply #12 on: July 08, 2010, 12:18:36 am »
Ok, to narrow it down... the problem seems to be: all files exported by PVD with utf8bom (at least on my computer with WinXP) start with the following three bytes:
3F BB 3F
Displayed as ISO 8859-1 (or ANSI) this is: ?»?

But the correct BOM for UTF8 should be:
EF BB BF
Displayed as ISO 8859-1 (or ANSI) this is: ï»¿
This one works fine and is recognized as UTF8.

I attached a zip containing a text file exported with PVD (template: "Plain list.ptm") using UTF8BOM (called "file as exported by PVD.txt") and a second file where I (manually) changed the first three bytes to EF BB BF ("same file with BOM corrected manually.txt")...
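
The manual correction described above can also be scripted. The following Python sketch is only a workaround under stated assumptions: the function name `repair_bom` and the sample bytes are hypothetical, and it patches just the first three bytes, exactly like the manual edit. It does not fix the cause inside PVD.

```python
UTF8_BOM = b"\xef\xbb\xbf"      # correct UTF-8 byte order mark
BROKEN_BOM = b"\x3f\xbb\x3f"    # bytes observed at the start of PVD exports


def repair_bom(data: bytes) -> bytes:
    """Replace a leading 3F BB 3F with EF BB BF; leave other data untouched."""
    if data.startswith(BROKEN_BOM):
        return UTF8_BOM + data[len(BROKEN_BOM):]
    return data


# Hypothetical export content: broken header followed by UTF-8 text.
exported = BROKEN_BOM + "1. Die Brücke".encode("utf-8")
fixed = repair_bom(exported)

print(fixed[:3].hex(" "))
print(fixed.decode("utf-8-sig"))
```

To apply it to a real file, read the file in binary mode (`open(path, "rb")`), pass the bytes through `repair_bom`, and write the result back in binary mode (`"wb"`), preferably to a copy.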

Plain list.ptm:
Code:
%OPTIONS%
filter="Text Files|*.txt"
encoding="UTF8BOM"
%OPTIONS%{%value=203}. {%value=title} / {%value=origtitle} ({%value=year}) [{%value=genre}]

Quite often programs change characters to "?" when the character that is exported does not fit the encoding. Might be a hint, but I'm wondering if it makes any sense, because ï and ¿ are both present in the ANSI character table.
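
One way to test that hint is the Python sketch below. It rests on an assumption, not on anything known about PVD's code: that the BOM is treated as the three characters ï»¿ and then converted to some ANSI code page, with unmappable characters replaced by "?". The code pages chosen are examples only.

```python
# The three BOM bytes interpreted as individual Latin-1 characters.
bom_as_text = b"\xef\xbb\xbf".decode("latin-1")

# Convert that text to several Windows code pages, replacing any
# character the code page lacks with "?" (0x3F).
results = {}
for codepage in ("cp1252", "cp1250", "cp1251"):
    encoded = bom_as_text.encode(codepage, errors="replace")
    results[codepage] = encoded.hex(" ")
    print(codepage, results[codepage])
```

Under this assumption the Western European page (cp1252) preserves all three bytes, while the Central European (cp1250) and Cyrillic (cp1251) pages keep » but lack ï and ¿, which yields exactly 3f bb 3f. That would fit the observed header, though it does not by itself explain why the problem appears on a machine whose ANSI table contains all three characters.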


[attachment deleted by admin]
« Last Edit: July 08, 2010, 09:40:31 am by svenne »

Offline daddydave

  • Member
  • *
  • Posts: 28
    • Me on Google+
Re: Bug: Export templates and UTF8BOM
« Reply #13 on: August 11, 2010, 01:16:17 am »
I have this problem as well and was able to reproduce svenne's troubleshooting. Is the fix on the to-do list for Version 1?

Offline nostra

  • Administrator
  • *****
  • Posts: 2852
    • Personal Video Database
Re: Bug: Export templates and UTF8BOM
« Reply #14 on: August 11, 2010, 06:37:16 pm »
A fix will be available soon