1
dlh
URL Rewrite Update: Google Indexing and Xoops
  • 2004/10/27 23:24

  • dlh

  • Posts: 182

  • Since: 2002/2/20


Dear Xoopsers:

I just received a message from the Google staff that I thought you would be interested in. I recently asked them to include my site (www.guitargearheads.com) in their news indexing service. Unfortunately, they were unable to. I have posted their reasons below, but it essentially comes down to the way they index URLs.

Based on this, I am requesting that the URL rewrite hack formally become a permanent feature of XOOPS (perhaps switchable). This is important so that it is no longer a hack and upgrade paths are preserved.

Thanks,

Dan


--------- From Google ----------------------

Quote:
Hi Daniel,

Thank you for your inquiry regarding inclusion in Google News. We apologize for our delayed response. After some investigation, we've found that our system cannot crawl some of your articles because of the format of their URLs. In order to have your articles crawled by Google News, their URLs must contain a number consisting of at least three digits.

For example, our news crawler would not crawl articles with the following URLs:

www.google.com/news/article23.html
www.google.com/lemurs_in_the_mist.html

It would crawl these pages:
www.google.com/news/08112003/article.html
www.google.com/news/lemurs_in_the_mist/23467.html

An example of a site that we are able to crawl successfully is http://english.chosun.com.

Please note that each article on this site has a highly unique URL.

We apologize for this limitation of our system. If you are able to make changes on your end to allow us to crawl your content, please let us know.

Regards,
The Google Team
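
For anyone who wants to test their own article URLs against the rule quoted above, here is a minimal Python sketch. It assumes that "a number consisting of at least three digits" means a run of three or more consecutive digits anywhere in the URL; that reading is a guess from the examples in the message, not a documented Google rule, and the function name `has_three_digit_number` is made up for illustration.

```python
import re

# Assumption: "a number consisting of at least three digits" is read as
# three or more consecutive digits anywhere in the URL string.
THREE_DIGIT_RUN = re.compile(r"\d{3,}")


def has_three_digit_number(url: str) -> bool:
    """Return True if the URL contains a run of at least three digits."""
    return THREE_DIGIT_RUN.search(url) is not None


if __name__ == "__main__":
    # The four example URLs from the Google message.
    examples = [
        "www.google.com/news/article23.html",
        "www.google.com/lemurs_in_the_mist.html",
        "www.google.com/news/08112003/article.html",
        "www.google.com/news/lemurs_in_the_mist/23467.html",
    ]
    for url in examples:
        print(url, has_three_digit_number(url))
```

Under that assumption, the first two example URLs fail the check and the last two pass, which matches what the message says would and would not be crawled.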

2
tl
Re: URL Rewrite Update: Google Indexing and Xoops
  • 2004/10/28 0:57

  • tl

  • Friend of XOOPS

  • Posts: 999

  • Since: 2002/6/23


Not sure if they are telling the truth. The following are the two most recent news items crawled by Google. It had no problem with the more complex URL (the second one).

Frenchman to stay on until 2008 (1 hour ago)
http://soccernet.espn.go.com/headlinenews?id=314598&cc=3888

Richmond Times Dispatch (1 hour ago)
http://www.timesdispatch.com/servlet/Satellite?pagename=RTD%2FMGArticle%2FRTD_BasicArticle&c=MGArticle&cid=1031778787338&path=!weekend!music&s=1045855936364


But URL rewriting (as Xaraya has) would be great. Being able to get rid of "modules" in the URL would be really nice.
