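Be careful when splitting with regular expressions: character set and
locale issues can give unexpected results. Here \W treats the Norwegian
letters å, æ and ø as non-word characters, even though a matching locale
has been set, so the first word is split into pieces: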
<pre>
<?php
$string = 'blårbærøl er greit';
$string = iconv( 'utf-8', 'latin1', $string );
setlocale( LC_ALL, 'nb_NO.iso-8859-1');
var_dump( preg_split( '/\W/', $string ) );
?>
Output:
array(6) {
  [0]=>
  string(2) "bl"
  [1]=>
  string(2) "rb"
  [2]=>
  string(1) "r"
  [3]=>
  string(1) "l"
  [4]=>
  string(2) "er"
  [5]=>
  string(5) "greit"
}
</pre>
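One possible workaround, sketched here as a suggestion rather than a definitive fix, is to keep the string in UTF-8 and use the /u modifier with Unicode character properties instead of relying on the locale. This assumes the source file is saved as UTF-8 and that PCRE was built with Unicode support; the variable name $words is only illustrative.

```php
<?php
// Assumption: this file is saved as UTF-8, so the literal below is UTF-8.
$string = 'blårbærøl er greit';

// The /u modifier makes PCRE treat pattern and subject as UTF-8.
// \p{L} matches any Unicode letter and \p{N} any Unicode number, so the
// negated class splits on everything that is not a letter, digit or underscore.
$words = preg_split('/[^\p{L}\p{N}_]+/u', $string, -1, PREG_SPLIT_NO_EMPTY);

var_dump($words);
```

With this pattern the three words blårbærøl, er and greit come back intact, and no iconv() or setlocale() call is needed.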