Hi all, we have occasionally discussed issues with character encoding and UTF-8. I cannot recall what was done or decided, so I want to ask the community for common knowledge first, before investing time in looking into the code.
I have the following use case. I am reading a book with tde-ebook-reader. I can copy the following text out of the book (example: “English: The 1931 Malay census was an alarm bell. IPA: ðə 1931 ˈmeɪleɪ ˈsɛnsəs wɑz ən əˈlɑrm bɛl.”), but in the reader the IPA characters are shown as squares. Is it a font issue or an encoding-handling issue in tde-ebook-reader?
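To narrow this down, it may help to know exactly which code points are involved. The following is a minimal Python sketch, not anything taken from tde-ebook-reader: it lists the non-ASCII characters of the IPA sample with their code points and Unicode names, using only the standard `unicodedata` module. The function name `non_ascii_report` is made up for this illustration.

```python
import unicodedata

# IPA part of the sample text quoted in the message.
SAMPLE = "ðə 1931 ˈmeɪleɪ ˈsɛnsəs wɑz ən əˈlɑrm bɛl."


def non_ascii_report(text):
    """Return (char, 'U+XXXX', name) for each distinct non-ASCII character,
    in order of first appearance."""
    seen = []
    for ch in text:
        if ord(ch) > 127 and ch not in seen:
            seen.append(ch)
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in seen]


if __name__ == "__main__":
    for ch, code_point, name in non_ascii_report(SAMPLE):
        print(ch, code_point, name)
```

Whatever font the reader ends up using needs glyphs for every code point this prints; plain ASCII letters in the same sentence say nothing about that.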
I attached a capture of kmail (under FreeBSD). I hope this is the answer to your question.
On Tuesday 17 March 2026 10:47:50 deloptes via tde-devels wrote:
Hi all, we have occasionally discussed issues with character encoding and UTF-8. I cannot recall what was done or decided, so I want to ask the community for common knowledge first, before investing time in looking into the code.
I have the following use case. I am reading a book with tde-ebook-reader. I can copy the following text out of the book (example: “English: The 1931 Malay census was an alarm bell. IPA: ðə 1931 ˈmeɪleɪ ˈsɛnsəs wɑz ən əˈlɑrm bɛl.”), but in the reader the IPA characters are shown as squares. Is it a font issue or an encoding-handling issue in tde-ebook-reader?
Denis Kozadaev via tde-devels wrote:
I attached a capture of kmail (under FreeBSD). I hope this is the answer to your question.
Well, I assume the problem is somewhere in tde-ebook-reader. I see those characters displayed properly in knode, kwrite, etc. too.
I'll put it on the todo list to investigate why they are not displayed properly.
BR
I recall reporting this bug:
https://mirror.git.trinitydesktop.org/gitea/TDE/tdepim/issues/45
The problem there would suggest an encoding issue, since in one case the font works and in another it doesn't.
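One rough way to tell the two cases apart (a rule of thumb, not a diagnosis): when UTF-8 bytes are decoded with the wrong single-byte codec, each special character usually turns into two or more wrong characters, whereas a missing glyph usually shows up as one square per character. The Python sketch below simulates the first case. `mojibake` is a hypothetical helper name, and Latin-1 is only an assumed example of a wrong codec.

```python
def mojibake(text, wrong_codec="latin-1"):
    """Simulate an encoding bug: encode as UTF-8, then decode with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec)


# First two IPA characters of the sample: U+00F0 (eth) and U+0259 (schwa).
original = "\u00f0\u0259"
garbled = mojibake(original)

# Each character is two bytes in UTF-8, so the wrong decode yields
# four characters where there used to be two.
print(len(original), len(garbled))
print(repr(garbled))
```

If the reader shows exactly one square per IPA character, as described in this thread, that pattern looks more like a missing glyph than like this kind of mis-decoding, though it does not rule out other encoding problems.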
Cheers, Janek
Jan Stolarek via tde-devels wrote:
I recall reporting this bug:
https://mirror.git.trinitydesktop.org/gitea/TDE/tdepim/issues/45
The problem there would suggest an encoding issue, since in one case the font works and in another it doesn't.
Yes, I remember this. As it is repetitive, I was wondering if it is a generic problem or a matter of implementation in specific apps. I guess it is the latter.
I once went through the PIM part and improved the encoding (with help from Michele), but it seems there are other places or situations.
It's possible it is a tde-ebook-reader issue. I didn't check the code, but the original code was designed to work on many different OSes (not just Unix/Linux/BSD), so it could well have its own custom handling of Unicode characters.
I remember this. As it is repetitive I was wondering if it is a generic problem
This may actually point to a font issue though. If you compare with the original text, you see squares only where the special characters are. Try using a Unicode-enabled font (Unifont Upper, for example) and see if the characters show up.
Cheers Michele
Michele Calgaro via tde-devels wrote:
It's possible it is a tde-ebook-reader issue. I didn't check the code, but the original code was designed to work on many different OSes (not just Unix/Linux/BSD), so it could well have its own custom handling of Unicode characters.
I'll have a look into the code.
This may actually point to a font issue though. If you compare with the original text, you see squares only where the special characters are. Try using a Unicode-enabled font (Unifont Upper, for example) and see if the characters show up.
I use Arial in the settings, but changing the font does not change how the text is displayed. On Windows, e-book-reader displays the characters, so the document is sane.
Michele Calgaro via tde-devels wrote:
It's possible it is a tde-ebook-reader issue. I didn't check the code, but the original code was designed to work on many different OSes (not just Unix/Linux/BSD), so it could well have its own custom handling of Unicode characters.
Out of curiosity, I opened the EPUB file, and I think the problem might be an interpretation issue (how the reader interprets the content). It looks like this
Michele Calgaro via tde-devels wrote:
This may actually point to a font issue though. If you compare with the original text, you see squares only where the special characters are. Try using a Unicode-enabled font (Unifont Upper, for example) and see if the characters show up.
When I open the XHTML from the extracted files in Konqueror, I do not see the symbols. In Firefox, I do see them.
On 2026/03/18 02:57 AM, deloptes via tde-devels wrote:
When I open the XHTML from the extracted files in Konqueror, I do not see the symbols. In Firefox, I do see them.
That could be a font selection issue. For the time being, Unicode characters are displayed correctly if a suitable font is selected. I still haven't worked on automatic font selection for when a glyph is not available in the current font. Can you install the Unifont Upper font and check whether it displays the symbols?
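For illustration only, automatic font selection of the kind mentioned here could be sketched as follows. This is a toy Python model and not how TDE or TQt handles fonts: the font names, the coverage tables, and the functions `pick_font` and `assign_fonts` are all hypothetical, and a real implementation would query the font system for glyph coverage.

```python
def pick_font(ch, preferred, fallbacks, coverage):
    """Return the first font whose coverage set contains ch, or None."""
    for font in [preferred, *fallbacks]:
        if ord(ch) in coverage.get(font, ()):
            return font
    return None


def assign_fonts(text, preferred, fallbacks, coverage):
    """Split text into (font, run) pairs, switching font only when needed."""
    runs = []
    for ch in text:
        font = pick_font(ch, preferred, fallbacks, coverage)
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs


# Hypothetical coverage tables: sets of supported code points per font.
COVERAGE = {
    "BasicLatinOnly": set(range(0x20, 0x7F)),
    "WideCoverage": set(range(0x20, 0x7F)) | set(range(0xA0, 0x300)),
}

# "b", U+025B (open e), "l": only the middle character needs the fallback.
print(assign_fonts("b\u025bl", "BasicLatinOnly", ["WideCoverage"], COVERAGE))
```

A character for which `pick_font` returns `None` is the case that would still end up as a square.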
Cheers Michele
Michele Calgaro via tde-devels wrote:
That could be a font selection issue. For the time being, Unicode characters are displayed correctly if a suitable font is selected. I still haven't worked on automatic font selection for when a glyph is not available in the current font. Can you install the Unifont Upper font and check whether it displays the symbols?
In my case the font is embedded in the EPUB file. I did not look into it very deeply, but my understanding is that this code is quite old and needs either replacing or updating. I traced the encoding handling to the embedded zlibrary, which seems to be "borrowed" from fbreader, but both are from 2010 or 2012. Modern readers use WebKit or other HTML5-capable engines. But perhaps this has nothing to do with HTML5; the reader should "just" be able to process the CSS and load the font :-) I may open a ticket and leave it for now.
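To check what an EPUB actually ships, a small script can list the font files in the archive and the `@font-face` rules in its stylesheets. The sketch below is a rough Python helper under some assumptions: the function name `embedded_fonts` is made up, fonts are recognized by file extension only, stylesheets are assumed to be UTF-8, and the regular expression is a simple approximation rather than a CSS parser.

```python
import re
import zipfile

FONT_EXTENSIONS = (".ttf", ".otf", ".woff", ".woff2")
# Simple approximation: a @font-face block contains no nested braces.
FONT_FACE_RE = re.compile(r"@font-face\s*\{[^}]*\}", re.IGNORECASE)


def embedded_fonts(epub):
    """Return (font_files, font_face_rules) for an EPUB path or file-like object."""
    font_files, rules = [], []
    with zipfile.ZipFile(epub) as archive:
        for name in archive.namelist():
            lower = name.lower()
            if lower.endswith(FONT_EXTENSIONS):
                font_files.append(name)
            elif lower.endswith(".css"):
                css = archive.read(name).decode("utf-8", errors="replace")
                rules.extend(FONT_FACE_RE.findall(css))
    return font_files, rules
```

Calling `embedded_fonts("book.epub")` (with a placeholder file name) returns the two lists. If a font file is present and a rule refers to it, yet the reader still shows squares, that would be consistent with the reader not loading the embedded font.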