1
macmend
Why do Googlebot and Inktomi use all my bandwidth?
  • 2005/8/30 19:18

  • macmend

  • Quite a regular

  • Posts: 285

  • Since: 2004/2/27


My XOOPS site at http://www.jonathanspencer.net seems to send the search engine spiders into a loop, so that they request my weblog articles hundreds of times.

My other site at http://www.macmend.com does not get this problem in the same way.

The two sites are more or less identical in structure, except for a module here and there, and of course one is a weblog using the weBLog module.

I have adapted this module so it gives proper page titles for each article page.

Both sites use the sitemap module and have an XML sitemap registered with Google, based on the PHP file that comes with this module (it is hacked so Google will accept it).

I am lucky that my host (qawebhosting.com) likes me and just keeps raising my bandwidth allowance, so far at no cost.

I can't think what I might have done or what is happening in the weblog module that is causing the spiders to loop hundreds of times.
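One way to narrow this down is to count, per URL, how often a given spider appears in the web server's access log. The following is a minimal sketch, assuming the Apache "combined" log format; `bot_hits` is a hypothetical helper name, and the sample lines (IP addresses, paths, sizes) are made up for illustration, not taken from either site.

```python
import re
from collections import Counter

# Assumes Apache "combined" log format; adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] '
    r'"(?:GET|HEAD|POST) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def bot_hits(lines, bot_token):
    """Count requests per path whose User-Agent contains bot_token."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and bot_token.lower() in match.group("agent").lower():
            counts[match.group("path")] += 1
    return counts


# Made-up sample log lines, for illustration only.
GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER = "Mozilla/5.0 (Macintosh; U; PPC Mac OS X) Safari/412"
SAMPLE = [
    '192.0.2.10 - - [30/Aug/2005:19:00:01 +0000] '
    '"GET /modules/weblog/details.php?blog_id=1 HTTP/1.1" 200 5120 "-" "%s"' % GOOGLEBOT,
    '192.0.2.10 - - [30/Aug/2005:19:00:05 +0000] '
    '"GET /modules/weblog/details.php?blog_id=1 HTTP/1.1" 200 5120 "-" "%s"' % GOOGLEBOT,
    '192.0.2.10 - - [30/Aug/2005:19:00:09 +0000] '
    '"GET /modules/weblog/index.php HTTP/1.1" 200 2048 "-" "%s"' % GOOGLEBOT,
    '192.0.2.77 - - [30/Aug/2005:19:01:00 +0000] '
    '"GET /modules/weblog/details.php?blog_id=1 HTTP/1.1" 200 5120 "-" "%s"' % BROWSER,
]

if __name__ == "__main__":
    # Show the paths the bot requested most often.
    for path, hits in bot_hits(SAMPLE, "Googlebot").most_common(5):
        print(hits, path)
```

If the same article URL shows up hundreds of times with only small differences in the query string, that would point at the links the module generates rather than at the spider itself.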

(Not sure if this is the right place for this, but it's a start.)
Free Mac Support

Ordinary Wisdom

apache server with php sshexec turned on
xoops version 2.0.18.1 & 2.3.1
php version 5.2.5
mysql version 5.0.45

2
mboyden
Re: Why do Googlebot and Inktomi use all my bandwidth?
  • 2005/8/30 20:37

  • mboyden

  • Moderator

  • Posts: 484

  • Since: 2005/3/9


I had the same problem on a site when I added piCal. I decided it was because of all the links in the calendar itself: every page links to each month, and each month links to the months before and after it, I'm assuming on to eternity. If all of your pages link to and from each other, you'll see that until the bot gets it all figured out. Also, each time you add a new entry, it starts all over again. You might add a rule to the robots.txt file in the root directory of your web server to tell the bots not to crawl a certain module.
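As a rough illustration of such a rule, here is a minimal sketch that checks a robots.txt against sample URLs using Python's standard `urllib.robotparser`. The `/modules/piCal/` path assumes the usual XOOPS layout of `/modules/<dirname>/`, and `www.example.com`, the weblog path, and the `is_allowed` helper are placeholders for illustration, not details from the sites in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the module path is an assumption
# based on the usual /modules/<dirname>/ layout.
ROBOTS_TXT = """\
User-agent: *
Disallow: /modules/piCal/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "http://www.example.com"  # placeholder host
    for path in ("/modules/piCal/index.php", "/modules/weblog/details.php"):
        print(path, is_allowed(ROBOTS_TXT, "Googlebot", base + path))
```

Checking the rules this way before uploading the file helps catch a typo in a module directory name, since robots.txt paths are matched as case-sensitive prefixes.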

Hope that helps.
Pessimists see difficulty in opportunity; Optimists see opportunity in difficulty. --W Churchill

XOOPS: Latest | Debug | Hosting and Web Development

3
macmend
Re: Why do Googlebot and Inktomi use all my bandwidth?
  • 2005/8/31 7:25

  • macmend

  • Quite a regular

  • Posts: 285

  • Since: 2004/2/27


I already have piCal disallowed; however, I have now also set mygallery and liens (links) to disallow, to see whether they were creating a loop.

Was the site very slow for you? It seems slow to me, and I am not sure whether this is because of the short URLs hack or something else.

4
dargosch
Re: Why do Googlebot and Inktomi use all my bandwidth?
  • 2005/8/31 7:53

  • dargosch

  • Friend of XOOPS

  • Posts: 118

  • Since: 2004/12/21


I'm not sure this is the cause of the flood of search engine traffic. I have been having trouble with Inktomi hitting me in about 200 unique instances, and that is on a site that has not even been converted to XOOPS yet! It is plain PHP, with not many new features per week.

My thought is that the search engines are just very active right now.

/Fredrik
My Gentoo + PVR-350 + IVTV + MythTV blog is on
http://gentoomythtv.blogspot.com/
