1
irmtfan
the utf-8 problem in preg_replace function
  • 2006/4/7 14:11

  • irmtfan

  • Module Developer

  • Posts: 3419

  • Since: 2003/12/7


The autolink feature in the "wordbook" module doesn't work with the Unicode (UTF-8) charset when the term in $search_term has more than 3 characters. I don't understand the pattern this function uses.

This is the code from wordbook/entry.php:
// singular
$term_q = preg_quote($term, '/');
$search_term = "/\b$term_q\b/i";
$replace_term = "<span><b><a style='color: #2F5376; text-decoration: underline; ' href='".XOOPS_URL."/modules/".$xoopsModule->dirname()."/entry.php?entryID=".ucfirst($entryID)."'>".$term."</a></b></span>";
$parts[$key] = preg_replace($search_term, $replace_term, $parts[$key]);



For example, with this term it works fine:

$test1 = preg_replace("/\bعلی\b/i", "change ok", "علی");

But if I use a term with more than 3 characters, it fails to find the term:
$test2 = preg_replace("/\bعلیب\b/i", "change NOT ok", "علیب");

phppp gave me this answer:
Quote:

It is not necessarily a preg_replace Unicode problem, but more like a word-boundary problem for multibyte languages.
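
To illustrate the word-boundary point: `\b` is defined in terms of `\w`, and without Unicode-aware matching `\w` may not treat the bytes of an Arabic letter as word characters, so the result can vary from term to term. One way to avoid relying on `\b` is to spell the boundary out with Unicode property lookarounds. The following is only a sketch: `replace_whole_term` is a made-up helper name (it is not part of wordbook), and it assumes PHP 7 or later with a PCRE build that supports the `u` modifier and `\p{...}` properties.

```php
<?php
// Hypothetical helper, not part of wordbook: replaces whole-word occurrences
// of $term in $text without relying on \b.
function replace_whole_term(string $term, string $replacement, string $text): string
{
    $term_q = preg_quote($term, '/');
    // (?<![\p{L}\p{N}_]) : not preceded by a letter, digit or underscore
    // (?![\p{L}\p{N}_])  : not followed by a letter, digit or underscore
    // The u modifier makes PCRE read the pattern and the subject as UTF-8.
    $pattern = '/(?<![\p{L}\p{N}_])' . $term_q . '(?![\p{L}\p{N}_])/iu';
    // $replacement is a preg replacement string, so $1 or \1 inside it
    // would be interpreted as a backreference.
    $result = preg_replace($pattern, $replacement, $text);
    // preg_replace returns null on error (for example, invalid UTF-8 input).
    return $result === null ? $text : $result;
}

echo replace_whole_term('علیب', 'change ok', 'علیب'), "\n";
```

With this pattern a term is only replaced when it is not glued to another letter or digit, so searching for the shorter term inside the longer word leaves the text unchanged.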


Can anyone help?

2
mondarse
Re: the utf-8 problem in preg_replace function
  • 2006/6/14 14:56

  • mondarse

  • Just popping in

  • Posts: 96

  • Since: 2003/2/3


Sorry, I can't help you. I didn't code that part; I only made some improvements to the module. Please ask the main coder, hsalazar.

3
leostotch
Re: the utf-8 problem in preg_replace function
  • 2006/6/14 16:35

  • leostotch

  • Just popping in

  • Posts: 76

  • Since: 2006/4/1


With no guarantee at all: I've read that the preg functions have better UTF-8 support in recent PHP versions.
Quote:
u (PCRE_UTF8)
This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern strings are treated as UTF-8. This modifier is available from PHP 4.1.0 or greater on Unix and from PHP 4.2.3 on win32. UTF-8 validity of the pattern is checked since PHP 4.3.5.


So you may want to try adding this modifier to the regexp:
$search_term = "/\b$term_q\b/i";
// change to
$search_term = "/\b$term_q\b/iu";
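
Whether the `u` modifier is enough on its own seems to depend on the PHP/PCRE build: it makes PCRE treat pattern and subject as UTF-8, but `\b` only helps if `\w` also recognises non-ASCII letters. A quick way to check this on a given server, as a sketch only (the function name is made up for illustration, and the code assumes PHP 7 or later):

```php
<?php
// Hypothetical check: does \w (and therefore \b) recognise an Arabic letter
// when the u modifier is set on this PHP build?
function unicode_word_class_available(): bool
{
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match('/^\w$/u', 'ب') === 1;
}

if (unicode_word_class_available()) {
    echo "\\w matches Arabic letters with /u on this build\n";
} else {
    echo "\\w is ASCII-only here, so \\b will not help for Arabic terms\n";
}
```

If the check fails, a pattern that does not depend on `\b` is probably the safer route.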

4
irmtfan
Re: the utf-8 problem in preg_replace function
  • 2006/6/14 17:09

  • irmtfan

  • Module Developer

  • Posts: 3419

  • Since: 2003/12/7


It still doesn't work in PHP 5.1 and PHP 4.3.11.
Could you give me the link to where you found that?

5
leostotch
Re: the utf-8 problem in preg_replace function
  • 2006/6/17 21:37

  • leostotch

  • Just popping in

  • Posts: 76

  • Since: 2006/4/1


I found that here, on the pattern modifiers PHP manual page.

Actually, there are a lot of comments about UTF-8 on that page, so you may find the solution to your problem there.

6
rasme
Re: the utf-8 problem in preg_replace function
  • 2007/10/2 0:41

  • rasme

  • Just popping in

  • Posts: 39

  • Since: 2005/5/9


I know this is an old topic, but maybe it has not been fixed yet.

You will find 3 statements like this in entry.php:
$search_term = "/\b$term_q\b/i";

Change all of them to:

$search_term = "/$term_q([^A-Za-z0-9_-])/i";

I tested it on my website.
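
One thing to watch with this pattern: the character class consumes the character after the term, so that character is replaced too unless the replacement puts it back, and a term at the very end of the text has nothing after it to match. Below is a minimal sketch of two ways to handle that, using a made-up sample string rather than the module's data:

```php
<?php
$term_q = preg_quote('cat', '/');
$text   = 'cat, catalog and cat';

// Pattern from the post: the trailing character is captured, so it is put
// back with $1. The final "cat" is not matched because nothing follows it.
$with_group = preg_replace("/$term_q([^A-Za-z0-9_-])/i", '[cat]$1', $text);

// Lookahead variant: checks the next character without consuming it,
// and also succeeds at the end of the string.
$with_lookahead = preg_replace("/$term_q(?![A-Za-z0-9_-])/i", '[cat]', $text);

echo $with_group, "\n";      // [cat], catalog and cat
echo $with_lookahead, "\n";  // [cat], catalog and [cat]
```

Neither variant checks the character before the term, so a term that appears at the end of a longer word would still be matched.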
