Hello.
My site runs on the Japanese version of Xoops, with xhld (an RSS feed module) installed to fetch RSS feeds from news sites written in Japanese, English, and Russian.
The issue I'm experiencing is that a feed from BBC Russia cannot be displayed correctly on my site: all the Cyrillic characters in the title line output by xhld are garbled. The encoding of the news feed provided by BBC Russia (http://newsrss.bbc.co.uk/rss/russian/russia/rss.xml) is not directly viewable, but it seems to be windows-1251.
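One way I could imagine confirming the declared encoding is to read it from the XML prolog of the fetched feed. The following is only a minimal sketch: `detect_declared_encoding` is a name I made up (it is not part of xhld), and the sample prolog is invented for illustration rather than copied from the BBC feed.

```php
<?php
// Hypothetical helper (not part of xhld): return the encoding declared in the
// XML prolog, lower-cased, or 'utf-8' when no encoding is declared.
function detect_declared_encoding($xml)
{
    // Only the start of the document is relevant for the prolog.
    $top_of_xml = substr($xml, 0, 255);
    if (preg_match('/^<\?xml\s[^>]*encoding=[\'"]?([0-9a-z_-]+)/i', $top_of_xml, $regs)) {
        return strtolower($regs[1]);
    }
    return 'utf-8';
}

// Invented sample; a real check would pass in the downloaded feed body.
$sample = '<?xml version="1.0" encoding="windows-1251"?><rss version="2.0"></rss>';
echo detect_declared_encoding($sample), "\n";
```

This uses the same idea as the auto-detection in the module files quoted further down, just isolated so it can be run on its own.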
More specifically, when I entered the site's URL and its RSS URL in the required boxes on the xhld admin panel, I got the following error:
"XmlParse error: not well-formed (invalid token) at line 4"
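My unconfirmed guess is that the windows-1251 bytes reach the XML parser without being converted, so it tries to read them as UTF-8. The sketch below tries to reproduce that outside Xoops. It assumes PHP's xml and iconv extensions are available; `parses_ok` and the tiny feed are invented for illustration, and the exact error text may differ from the one above depending on the parser library.

```php
<?php
// Hypothetical helper: true when the XML extension parses the whole string.
function parses_ok($xml)
{
    $parser = xml_parser_create('UTF-8');
    // Third argument marks this chunk as the final (and only) piece of input.
    return xml_parse($parser, $xml, true) === 1;
}

// A word in Cyrillic written as raw windows-1251 bytes, with no encoding
// declaration, so the parser has to assume UTF-8.
$raw = "<rss><title>\xD0\xEE\xF1\xF1\xE8\xFF</title></rss>";

// The same document after converting the bytes to UTF-8.
$utf8 = iconv('WINDOWS-1251', 'UTF-8', $raw);

var_dump(parses_ok($raw));
var_dump(parses_ok($utf8));
```

I would expect the first call to fail and the second to succeed, which would point at a missing conversion step rather than at the feed itself.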
Could anyone kindly explain how to solve this issue? I have looked into two of the relevant files in the module, russian/headlinerenderer.php and japanese/headlinerenderer.php, but I am not sure which lines in either or both of the files should be changed.
russian/headlinerenderer.php reads as follows:
// This is a sample of using iconv() for converting UTF-8 <-> non-iso-encoding
// Replace "WINDOWS-1251" to your encoding
// Don't forget adding the encoding into _AM_ENCODINGS of language/(your language)/admin.php
// Russian translation and russian encoding adaptation by Vladislav "FractalizeR" Rastrusny.
// http://www.vrsi.ru
if( ! class_exists( 'XhldRendererLocal' ) ) {
class XhldRendererLocal extends XhldRenderer
{
    function XhldRendererLocal( &$headline , $mydirname='xhld0' )
    {
        parent::XhldRenderer( $headline , $mydirname ) ;
    }
    function convertFromUtf8(&$value, $key)
    {
        if( ! is_string( $value ) ) return ;
        if( stristr( _CHARSET , 'iso-8859-1' ) ) {
            $value = utf8_decode( $value ) ;
        } else if( $this->_hl->getVar('headline_encoding') == 'iso-8859-1' && ! $this->_hl->getVar('headline_allowhtml') ) {
            $value = htmlentities( utf8_decode( $value ) ) ;
        } else {
            $value = iconv( "UTF-8" , "WINDOWS-1251" , $value ) ;
        }
    }
    function &convertToUtf8(&$xmlfile)
    {
        $encoding = $this->_hl->getVar('headline_encoding') ;
        // auto detection
        if( empty( $encoding ) ) {
            $top_of_xml = substr( $xmlfile , 0 , 255 ) ;
            preg_match( "/^<\?xml .* encoding=['\"]?([0-9a-z_-]+)/i", $top_of_xml , $regs ) ;
            if( empty( $regs ) ) {
                $encoding = 'utf-8' ;
            } else {
                $encoding = strtolower( $regs[1] ) ;
            }
            $this->_hl->setVar( 'headline_encoding' , $encoding ) ;
            $headline_handler =& xoops_getmodulehandler('headline', $this->_mydirname);
            $headline_handler->insert($this->_hl);
        }
        switch( strtolower( $encoding ) ) {
            case 'iso-8859-1' :
                $xmlfile = utf8_encode( $xmlfile ) ;
                break ;
            case 'windows-1251' :
                $xmlfile = iconv( "WINDOWS-1251" , "UTF-8" , $xmlfile ) ;
                break ;
            case 'koi8-r' :
                $xmlfile = iconv( "KOI8-R" , "UTF-8" , $xmlfile ) ;
                break ;
            case 'koi8-u' :
                $xmlfile = iconv( "KOI8-U" , "UTF-8" , $xmlfile ) ;
                break ;
            case 'koi8-ru' :
                $xmlfile = iconv( "KOI8-RU" , "UTF-8" , $xmlfile ) ;
                break ;
            case 'utf-8' :
            default :
                break ;
        }
        return $xmlfile;
    }
}
}
?>
japanese/headlinerenderer.php, on the other hand, reads:
if (function_exists('mb_convert_encoding') && ! class_exists( 'XhldRendererLocal' ) ) {
class XhldRendererLocal extends XhldRenderer
{
    function XhldRendererLocal( &$headline , $mydirname='xhld0' )
    {
        parent::XhldRenderer( $headline , $mydirname ) ;
        if( ! preg_match( '/(EUC-JP|UTF-8|SJIS)/i' , mb_internal_encoding() ) ) {
            mb_internal_encoding( 'EUC-JP' ) ;
        }
    }
    function convertFromUtf8(&$value, $key)
    {
        if( ! is_string( $value ) ) return ;
        if( stristr( _CHARSET , 'iso-8859-1' ) ) {
            $value = utf8_decode( $value ) ;
        } else if( $this->_hl->getVar('headline_encoding') == 'iso-8859-1' && ! $this->_hl->getVar('headline_allowhtml') ) {
            $value = htmlentities( utf8_decode( $value ) ) ;
        } else {
            $value = mb_convert_encoding( $value , mb_internal_encoding() , 'UTF-8' ) ;
        }
    }
    function &convertToUtf8(&$xmlfile)
    {
        $encoding = $this->_hl->getVar('headline_encoding') ;
        // auto detection
        if( empty( $encoding ) ) {
            $top_of_xml = substr( $xmlfile , 0 , 255 ) ;
            preg_match( "/^<\?xml .* encoding=['\"]?([0-9a-z_-]+)/i", $top_of_xml , $regs ) ;
            if( empty( $regs ) ) {
                $encoding = 'utf-8' ;
            } else if( stristr( $regs[1] , 'JIS' ) ) {
                $encoding = 'shift_jis' ;
            } else if( stristr( $regs[1] , 'euc' ) ) {
                $encoding = 'euc-jp' ;
            } else if( stristr( $regs[1] , 'utf-8' ) ) {
                $encoding = 'utf-8' ;
            } else {
                $encoding = strtolower( $regs[1] ) ;
            }
            $this->_hl->setVar( 'headline_encoding' , $encoding ) ;
            $headline_handler =& xoops_getmodulehandler('headline', $this->_mydirname);
            $headline_handler->insert($this->_hl);
        }
        switch( strtolower( $encoding ) ) {
            case 'iso-8859-1' :
                $xmlfile = utf8_encode( $xmlfile ) ;
                break ;
            case 'shift_jis' :
                $xmlfile = str_replace( chr( 0 ) , '' , mb_convert_encoding( $xmlfile , "UTF-8" , "Shift_JIS" ) ) ;
                break ;
            case 'euc-jp' :
                $xmlfile = str_replace( chr( 0 ) , '' , mb_convert_encoding( $xmlfile , "UTF-8" , "EUC-JP" ) ) ;
                break ;
            case 'utf-8' :
            default :
                break ;
        }
        return $xmlfile;
    }
}
}
?>
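Comparing the two files, the Japanese one has no 'windows-1251' case in its switch, so such a feed falls through to the default branch unconverted. As a rough idea of what I imagine a fix might look like, here is a stand-alone sketch with that case added. It is untested against xhld itself: `to_utf8_sketch` is a hypothetical function, not the module's `convertToUtf8`, and it assumes the mbstring extension is available and supports Windows-1251.

```php
<?php
// Hypothetical stand-alone sketch (not xhld code): convert a feed body to
// UTF-8 based on its declared encoding, with a windows-1251 case added.
function to_utf8_sketch($xmlfile, $encoding)
{
    switch (strtolower($encoding)) {
        case 'iso-8859-1':
            return mb_convert_encoding($xmlfile, 'UTF-8', 'ISO-8859-1');
        case 'shift_jis':
            return mb_convert_encoding($xmlfile, 'UTF-8', 'SJIS');
        case 'euc-jp':
            return mb_convert_encoding($xmlfile, 'UTF-8', 'EUC-JP');
        case 'windows-1251':
            // The added case: Cyrillic feeds declared as windows-1251.
            return mb_convert_encoding($xmlfile, 'UTF-8', 'Windows-1251');
        case 'utf-8':
        default:
            // Anything else is passed through unchanged, as in the files above.
            return $xmlfile;
    }
}

// A Cyrillic word as raw windows-1251 bytes, invented for illustration.
$cp1251 = "\xD0\xEE\xF1\xF1\xE8\xFF";
echo to_utf8_sketch($cp1251, 'windows-1251'), "\n";
```

I do not know whether a matching change would also be needed in convertFromUtf8, since my site's own charset is Japanese and cannot represent every Cyrillic character the same way.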
I would very much appreciate it if any of you could shed light on this issue.
Thank you.
russy