European character set problems

I am having problems displaying European accented characters correctly using PHP and MySQL. My data set was created in Microsoft Access using Alt + number pad to enter names with European accented characters, e.g. é is Alt + number pad 130.

I have tried several ways of getting the Access data to MySQL:

1. Using .csv files truncates many words, e.g. Bétrisey becomes B

2. Using Export to ODBC appears to work correctly if I view the data with phpMyAdmin, e.g. Bétrisey looks like Bétrisey, but if I view the data with a PHP-generated web page Bétrisey looks like BÃ©trisey.

When I step through the code below using ZendStudio the value of $row is BÃ©trisey.

Is it necessary to do something at the start of a PHP file to declare that the character set is utf8 (collation utf8_general_ci), or does the PHP code adopt the character set of the MySQL table?

Any other suggestions?

Bob Holmström

[code:1]
// show authors
$result = mysql_query($sql);
if(!$result or !mysql_num_rows($result))
  echo "<p>No author results.\n";
else {
  echo "<hr /><ul>\n";
  while($row = mysql_fetch_object($result)) {
    echo "<li>",
      build_href("bhmfind.php",
        "sqlType=title&AuthorID=$row->AuthorID",
        last_name_last($row->FullName)),
      "</li>\n";
  }
  echo "</ul>\n";
}
[/code:1]

Re:European character set problems

Mark,

I spoke with a representative of my ISP (Infinity) and after much discussion they said that they could not change their PHP installation because it might break some of their customers' sites. They suggested that by using .htaccess in my home directory I could "override php settings". Any ideas if that is possible?

Bob

Re:European character set problems

Mark,

Thank you - it works as you suggested it would.

Now I need to look over my code and figure out the most efficient way to use it, since accented characters can appear in many fields - the application is a specialized book database with documents in many languages.

Bob Holmström

Re:European character set problems

Glad that worked. Perhaps a more appropriate solution for you (considering this seems to come up a fair amount) would be the multi-byte string extension of PHP. Unfortunately this is not an extension that gets compiled in by default (and if I recall properly, your host isn't very good about working with you to add functionality).

You can read about it at [url]http://php.net/mbstring[/url].
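
As a rough sketch of what the extension buys you: strlen() counts bytes, while mb_strlen() counts characters. The char_count() helper below is a made-up name, and its fallback branch (counting every byte that is not a UTF-8 continuation byte) is my own assumption for hosts without mbstring; it only holds for well-formed UTF-8 input.

```php
<?php
// Hypothetical helper: count the characters in a UTF-8 string.
function char_count(string $s): int
{
    if (extension_loaded('mbstring')) {
        // mbstring is available: let it do the counting.
        return mb_strlen($s, 'UTF-8');
    }
    // Fallback (assumption): each character has exactly one byte
    // outside the continuation range 0x80-0xBF, so count those bytes.
    return (int) preg_match_all('/[^\x80-\xBF]/', $s);
}

$name = "B\xC3\xA9trisey";     // "Bétrisey" written as UTF-8 bytes

echo strlen($name), "\n";      // 9 (bytes)
echo char_count($name), "\n";  // 8 (characters)
```

Calling extension_loaded('mbstring') on its own is also a quick way to find out what your host has actually enabled.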

Mark

Character set problems

I'm wondering if it's MySQL and not PHP that is your problem. I know that MySQL can be compiled with specific character set support (latin1 by default). If the server is using a binary distribution from MySQL.com it has been compiled with extra character set support, but still defaults to latin1 when creating databases, tables and columns. The character set can be overridden at any level. This page from the MySQL documentation [url]http://dev.mysql.com/doc/mysql/en/charset-examples.html[/url] gives some examples.

The fact that it replaces the character with not one wrong character but two makes me think it's sticking two single-byte characters in the place where a single double-byte character should be.
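
To make that concrete, here is a small sketch of how one accented character turns into two wrong ones. latin1_to_utf8() is a hypothetical helper written only for this illustration; it does to each byte what a Latin-1 to UTF-8 conversion would do. I am assuming the name is stored as UTF-8 and then gets treated as Latin-1 somewhere along the way, which I have not confirmed for your setup.

```php
<?php
// Hypothetical helper: treat every byte as an ISO-8859-1 character
// and re-encode it as UTF-8.
function latin1_to_utf8(string $bytes): string
{
    $out = '';
    for ($i = 0, $n = strlen($bytes); $i < $n; $i++) {
        $b = ord($bytes[$i]);
        // ASCII bytes pass through; bytes 0x80-0xFF become two-byte sequences.
        $out .= $b < 0x80
            ? chr($b)
            : chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F));
    }
    return $out;
}

$utf8   = "B\xC3\xA9trisey";       // "Bétrisey" as UTF-8: é is the two bytes C3 A9
$double = latin1_to_utf8($utf8);   // those two bytes mistaken for two Latin-1 characters

echo strlen($utf8), "\n";          // 9
echo strlen($double), "\n";        // 11
echo bin2hex($double), "\n";       // 42c383c2a9747269736579
```

The bytes C3 83 and C2 A9 in the result are the UTF-8 encodings of Ã and ©, which is exactly the pair that shows up in place of the é.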

Just a thought.

Mark

Re:European character set problems

Mark - Thanks for your thoughts on the problem.

I agree that the first place I would have looked for a solution was the topic you point out - except that everything reads correctly with phpMyAdmin. I have now checked the MySQL data using MySQL Query Browser and it is ok also.

So two MySQL gui interfaces give the correct information - which is what led me to suspect PHP creation of a web page as the problem.

Sorry I put this question to the wrong section of the forum - how can I move it?

Bob

Re:European character set problems

[quote]So two MySQL gui interfaces give the correct information - which is what led me to suspect PHP creation of a web page as the problem. [/quote]

I agree. Have you looked into this? [url]http://php.net/manual/en/function.utf8-decode.php[/url]

This bit of test code gives me the correct result (i.e. Bétrisey):
[code:1]
<?php
echo utf8_decode('BÃ©trisey');
?>
[/code:1]

So then maybe something like this would work:

[code:1]
// show authors
$result = mysql_query($sql);
if(!$result or !mysql_num_rows($result)) {
  echo "<p>No author results.\n";
} else {
  echo "<hr /><ul>\n";
  while($row = mysql_fetch_object($result)) {
    echo "<li>"
      .build_href("bhmfind.php",
        "sqlType=title&AuthorID=$row->AuthorID",
        last_name_last(utf8_decode($row->FullName)))
      ."</li>\n";
  }
  echo "</ul>\n";
}
[/code:1]

Or perhaps the utf8_decode() function might be better placed within your last_name_last() or build_href() functions.
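
If more than one field needs decoding, another option is to decode the whole row in one place. This is only a minimal sketch, assuming every string field in the row needs the same treatment. decode_row() and undo_double_utf8() are made-up names, and the row is a hand-built object standing in for what mysql_fetch_object() returns. undo_double_utf8() is a stand-in that does what utf8_decode() does for characters up to U+00FF, so the example also runs cleanly on PHP 8.2 and later, where utf8_decode() is deprecated; on your install you could pass 'utf8_decode' instead.

```php
<?php
// Stand-in for utf8_decode(), limited to characters up to U+00FF:
// turns each two-byte UTF-8 sequence (C2 or C3 plus a continuation byte)
// back into the single Latin-1 byte it encodes.
function undo_double_utf8(string $s): string
{
    return preg_replace_callback(
        '/([\xC2\xC3])([\x80-\xBF])/',
        function (array $m): string {
            return chr(((ord($m[1]) & 0x03) << 6) | (ord($m[2]) & 0x3F));
        },
        $s
    );
}

// Hypothetical helper: return a copy of $row with $decode applied
// to every string field; the original row is left untouched.
function decode_row(object $row, callable $decode): object
{
    $copy = clone $row;
    foreach (get_object_vars($copy) as $field => $value) {
        if (is_string($value)) {
            $copy->$field = $decode($value);
        }
    }
    return $copy;
}

// Made-up row for illustration only.
$row = new stdClass();
$row->AuthorID = '7';
$row->FullName = "B\xC3\x83\xC2\xA9trisey";   // the double-encoded form

$clean = decode_row($row, 'undo_double_utf8');
echo bin2hex($clean->FullName), "\n";         // 42c3a9747269736579
```

In the loop above that would mean calling decode_row($row, ...) once right after mysql_fetch_object() and leaving last_name_last() and build_href() alone.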

Mark

Re:European character set problems

No, .htaccess isn't going to do it for the multi-byte string extension. The PHP executable needs to be built with that enabled. It can be useful for tailoring other settings though. Moxley seems to be the whiz at .htaccess files.

Mark