Ticket #241 (closed defect: fixed)

Opened 5 years ago

Last modified 10 months ago

Auto-sense language and character set on a per-email basis

Reported by: rjl
Owned by: dmorton
Priority: normal
Milestone: 1.0.3
Component: PHP scripts
Version: 1.0.0 RC5
Severity: normal
Keywords: language charset character set
Cc:

Description

Some users may operate in multilingual environments, such that they send and receive mail in several languages. Currently, Maia only allows a user to specify a single language and character set to use to interpret all mail, so these users would need to repeatedly switch their language/charset combinations manually in order to read mail properly in the Mail Viewer.

Since the language and character set are encoded in the mail headers, however, it stands to reason that Maia should be able to detect these, select the appropriate supported language from a list that the user provides, and automatically display the viewmail.php page in the appropriate language/charset when the user views that mail item.

Users would continue to have a "preferred" language/charset as they do now, but would need to be able to create a personal list of other languages that they understand (and for which the administrator has installed language modules). When a user opens a mail item in the Mail Viewer, the headers of the mail item could then be scanned to determine its language and charset. If the language matches one the user has stated they can read, the viewmail.php page can then be displayed in that language with the charset described in the headers. No other Maia pages need be affected by this; they would continue to use the user's "preferred" language/charset.

If the mail item is in a language and charset the user has NOT indicated that they understand (or a language for which the administrator has not installed a matching translation pack), falling back to the user's "preferred" language/charset may be a reasonable default. Alternatively, it may make sense not to display the decoded contents at all in that case, and only offer the "raw" contents. Trying to display the decoded contents of such a message should perhaps bring up a warning/error message explaining that the appropriate language/charset is not available.
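The selection logic described above can be sketched roughly as follows. This is an illustration in Python, not Maia's PHP code, and it rests on several assumptions: the function name `pick_display_settings`, the `readable_languages` set, and the `preferred` tuple are all hypothetical, and the language is taken from the optional Content-Language header, which many messages do not carry. The charset comes from the Content-Type header of the top-level message only.

```python
from email import message_from_string


def pick_display_settings(raw_message, readable_languages, preferred):
    """Return (language, charset, matched) for displaying one mail item.

    readable_languages: set of language codes the user says they can read
                        (hypothetical per-user list).
    preferred:          (language, charset) from the user's settings.
    matched:            False means the caller fell back to the preferred
                        settings and may want to warn or show raw contents.
    """
    msg = message_from_string(raw_message)

    # Charset parameter of the Content-Type header, lowercased, or None.
    charset = msg.get_content_charset()

    # Content-Language is optional; take the first listed language tag.
    header = msg.get("Content-Language", "")
    language = header.split(",")[0].strip().lower() or None

    if language in readable_languages and charset:
        return language, charset, True

    # Unknown or unsupported language: fall back to the user's preference.
    return preferred[0], preferred[1], False
```

The third return value leaves the policy choice open: a caller could use it to show the warning or to offer only the "raw" contents, as discussed above.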

Attachments

iconv.patch (6.1 kB) - added by dmorton 2 years ago.
initial stab at using iconv to display other charsets in utf-8

Change History

Changed 5 years ago by dmorton

  • milestone changed from 1.0.0 RC6 to 1.1.0

Changed 4 years ago by rjl

  • patch set to 0

Ultimately the proper character set to use in config.php is "UTF-8" (Unicode), so that multiple character sets can be displayed on a single web page. Using the GNU libiconv library and its iconv function, converting portions of a web page to UTF-8 becomes possible, so the mail viewer and quarantine/cache lists should be able to display the proper characters based on the content-type header of the email or MIME-part.
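As a rough illustration of the conversion step described in this comment, the sketch below decodes a single MIME part using the charset from its Content-Type header and re-encodes it as UTF-8. It is written in Python and uses Python's built-in codecs rather than libiconv, so it is not the attached patch; in PHP the corresponding call would presumably be `iconv()`. The function name `part_text_as_utf8` and the `fallback_charset` parameter are hypothetical.

```python
from email import message_from_bytes


def part_text_as_utf8(part, fallback_charset="iso-8859-1"):
    """Decode one non-multipart MIME part and return its text as UTF-8 bytes."""
    # Undo the transfer encoding (base64, quoted-printable) to get raw bytes.
    raw = part.get_payload(decode=True)

    # Charset declared in the part's Content-Type header, if any.
    charset = part.get_content_charset() or fallback_charset

    try:
        text = raw.decode(charset, errors="replace")
    except LookupError:
        # The declared charset name is unknown to Python; use the fallback.
        text = raw.decode(fallback_charset, errors="replace")

    return text.encode("utf-8")
```

Undecodable bytes are replaced rather than raising an error, on the assumption that a partially readable message is more useful in a viewer than none at all.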

Changed 2 years ago by dmorton

  • owner changed from rjl to dmorton
  • status changed from new to assigned
  • milestone changed from 1.1.0 to 1.0.3

I have a basic version working pretty well for me, but it needs some additional work to interact with the display charset used by the end user, if the value in settings is to have any effect. This shouldn't be too hard to port back to 1.0.3.

Changed 2 years ago by dmorton

initial stab at using iconv to display other charsets in utf-8

Changed 2 years ago by rjl

Patch committed to the 1.0 branch in [1253], minus the template changes that are only needed in the trunk version. It appears to work well, but could definitely use more testing across a wider range of browsers and locales.

Changed 2 years ago by dmorton

  • status changed from assigned to testing
  • resolution set to fixed

Changed 10 months ago by mortonda@…

  • status changed from testing to reopened
  • resolution fixed deleted

I discovered that a server setting was forcing UTF-8 on all my pages. Turning it off gives a more accurate picture: the default charset cannot display much from other charsets. As Robert said above, the correct thing to do is to make all web pages UTF-8, and then the iconv stuff is much easier. I think we need to force all references to $html_charset to UTF-8.

Changed 10 months ago by mortonda@…

  • status changed from reopened to closed
  • resolution set to fixed

[1439] removed $html_charset from most pages and set it to UTF-8.
