International characters in PHP
Aug 11, 2009 Basic PHP
I must have done this a hundred times, yet I always forget the exact syntax needed to convert UTF-8 characters into ISO-8859-1 or HTML entities.
Not a biggie for US developers, but here in Europe, and I’d bet in Asia as well, you end up with pages filled with characters that look this way: Ã¨.
This happens because the web form, or maybe the RSS feed or the SQL database the text comes from, uses UTF-8 encoding, while the page displaying it is read as ISO-8859-1.
Quick solution:
$string_to_clean='è'; echo utf8_decode ($string_to_clean);
This will properly print ‘è’. We can even go further and write:
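To see what utf8_decode actually does, it helps to look at the raw bytes. The snippet below is only a small sketch: I write the character as an escaped byte sequence so the result does not depend on how the PHP file itself is saved. Keep in mind that utf8_decode is deprecated as of PHP 8.2, so newer versions may print a deprecation notice.

```php
<?php
// "\xC3\xA8" is the two-byte UTF-8 encoding of è.
$string_to_clean = "\xC3\xA8";

echo strlen($string_to_clean), "\n";   // 2 (two bytes in UTF-8)
echo bin2hex($string_to_clean), "\n";  // c3a8

// utf8_decode() converts UTF-8 to ISO-8859-1, where è is the single byte E8.
$latin1 = utf8_decode($string_to_clean);

echo strlen($latin1), "\n";            // 1
echo bin2hex($latin1), "\n";           // e8
```

Read as ISO-8859-1, the bytes C3 and A8 are the two characters Ã and ¨, which is exactly the garbage that shows up on the page.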
htmlentities (utf8_decode ($string_to_clean));
That will return &egrave; instead of ’è’, and is also a wise security measure to harden our forms against HTML injection. Just remember to do this before you add your own HTML tags.
For example, assuming we are building a list from, say, an RSS feed of tweets:
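One caveat if you run this on a recent PHP: since PHP 5.4, htmlentities assumes UTF-8 input by default, so feeding it the ISO-8859-1 bytes produced by utf8_decode can come back as an empty string. A possible workaround, sketched here with a hypothetical helper name of my own (utf8_to_entities is not a built-in), is to name the charset explicitly:

```php
<?php
// Hypothetical helper, not a PHP built-in: UTF-8 text in, HTML entities out.
function utf8_to_entities(string $utf8): string
{
    // Step 1: UTF-8 -> ISO-8859-1.
    $latin1 = utf8_decode($utf8);

    // Step 2: tell htmlentities() the input is ISO-8859-1, so it does not
    // try to read the single-byte characters as UTF-8.
    return htmlentities($latin1, ENT_QUOTES, 'ISO-8859-1');
}

echo utf8_to_entities("\xC3\xA8"), "\n";
// &egrave;

echo utf8_to_entities("<b>caff\xC3\xA8</b>"), "\n";
// &lt;b&gt;caff&egrave;&lt;/b&gt;
```

The second call shows the security side effect: any tags in the input are escaped too, so they are displayed as text instead of being interpreted by the browser.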
$text.='<li>'.htmlentities (utf8_decode ($string_to_clean)).'</li>';
If we were to clean the resulting $text variable instead, we would lose all the ’<li>’ tags built in the loop, because they would be escaped along with everything else.
Of course this will slow down execution a bit, because both functions are called every time through the loop.
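Here is a rough sketch of the whole loop, just to make the ordering concrete. The $tweets array is made-up sample data standing in for whatever the feed returns, and I pass the charset to htmlentities explicitly on the assumption that the code runs on PHP 5.4 or later.

```php
<?php
// Made-up sample data standing in for items read from an RSS feed.
$tweets = array("caff\xC3\xA8 <b>now</b>", "plain text");

$text = '';
foreach ($tweets as $string_to_clean) {
    // Escape the untrusted text first...
    $clean = htmlentities(utf8_decode($string_to_clean), ENT_QUOTES, 'ISO-8859-1');

    // ...then wrap it in our own markup, which must stay unescaped.
    $text .= '<li>' . $clean . '</li>';
}

echo $text, "\n";
// <li>caff&egrave; &lt;b&gt;now&lt;/b&gt;</li><li>plain text</li>
```

The <b> tag that came in with the tweet is neutralised, while the <li> tags we added ourselves survive.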
Tags: php function, utf8

