1
dickinsd
My News Pages are no longer being crawled by the major search engines.
  • 2005/11/21 18:21

  • dickinsd

  • Quite a regular

  • Posts: 278

  • Since: 2004/11/14


I have a strange problem at the moment: items from the News module are no longer being crawled by Google since I updated the XOOPS core to 2.2.3.

I have the same problem on 2 sites.

One site is using News 1.42 and the other is using News 1.3.

Google is able to crawl the site: I have not disallowed Googlebot, and I have index, follow set in my admin options.
Google used to crawl the site all the time, and when I look for other pages from my sites, Google has plenty of news from before I did the update.
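
One way to double-check the "not disallowed" part is to run the site's robots.txt rules through Python's standard `urllib.robotparser`. This is only a minimal sketch: the robots.txt text and the `example.com` news URL in the usage note are made-up placeholders, not taken from either site, and `googlebot_allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

Calling `googlebot_allowed(text, "https://example.com/modules/news/article.php?storyid=1")` with the real robots.txt contents would show whether a rule is blocking the News module. It says nothing about the robots meta tag, which has to be checked in the page source.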

I use a recent News block on one site and the Spotlight module on another. If a news item is in one of these blocks when Googlebot stops by, then searching for that term will return a result pointing to my homepage. This is not very good, because by the time you get there the news item may no longer be in the recent news block.

Oh, by the way, neither site uses a module as the homepage; my homepage on both sites is made up of blocks only.

Any ideas?

If there is another post on this, please point me in the right direction, as the XOOPS search seems to be bringing up nothing similar to my problem.

Dave

2
dickinsd
Re: My News Pages are no longer being crawled by the major search engines.
  • 2005/11/22 12:19

  • dickinsd

  • Quite a regular

  • Posts: 278

  • Since: 2004/11/14


Ok.

I am VERY sorry for suggesting that my upgrade to XOOPS 2.2.3 might be the reason.

I have checked all the news items that Google has crawled for 2 of my sites. My problem seems to have come about once I hit about 200 news stories on each site, which happened within about a month of each other.

Now I am not sure, but after scouring the web for a while, I think the problem is actually down to the number of news stories I have.

I read somewhere that although Google DOES crawl dynamic pages (e.g. https://xoops.org/modules/newbb/reply.php?forum=20&topic_id=43832 ... ), it limits the number of them it crawls on a site, apparently because crawling a massive number of dynamic pages could use up all of your bandwidth.

I will get a better description and link to the description if this is news to you.

ANYWAY...

I am wondering if the large number of news articles on my sites has caused the problem, in that Google crawled the first 200 and then gave up because of the large number involved. (Both sites would have more than 400 news stories at the moment.)

I should point out that I never set an expiry date on a single news story. Could this also be a cause of the problem?

If I had set the pages to expire after, say, 30 days, would this mean fewer pages for Google to crawl? And if I had done this, would Google still crawl the new News pages?

Can anyone help me figure this out? Anyone got any possible suggestions?

Dave

3
Goober
Re: My News Pages are no longer being crawled by the major search engines.
  • 2005/11/22 14:38

  • Goober

  • Not too shy to talk

  • Posts: 101

  • Since: 2003/3/30


Have you looked into Google Sitemaps?

http://www.google.com/webmasters/sitemaps/login

If you can't run the Python program, you can always get a Windows one to do it. I use the one at http://johannesmueller.com/gs/
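
If neither generator is an option, a sitemap is simple enough to build by hand. The following is a rough sketch using only Python's standard library, not Google's own script. It assumes the `urlset`/`url`/`loc` layout and the namespace from the later sitemaps.org protocol (the schema Google accepted at the time of this thread may have differed), and `build_sitemap` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the sitemaps.org protocol, not necessarily
# the one Google required when this thread was written.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes "&" in dynamic URLs as "&amp;" on output.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Passing it a list of article URLs and saving the returned string as `sitemap.xml` gives a file to submit. Dynamic URLs containing `&` are escaped automatically, which is easy to get wrong when writing the XML by hand.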
Dispelling the Mystical belief of Web Standards and tableless CSS.
Nobody gets excited about the tools used to build a house, people get excited about how the house looks and performs

4
dickinsd
Re: My News Pages are no longer being crawled by the major search engines.
  • 2005/11/22 18:24

  • dickinsd

  • Quite a regular

  • Posts: 278

  • Since: 2004/11/14


I did look at using phpsitemap (I think it was called) but was having trouble with it.

I tried using the Google service, which required Python access. I ran the script and provided the relevant info to Google.

That was last night, so I guess I will have to give Google a couple of days to see if that has made a difference.

Thanks for the suggestions Goober.

Dave

5
Goober
Re: My News Pages are no longer being crawled by the major search engines.
  • 2005/11/22 18:54

  • Goober

  • Not too shy to talk

  • Posts: 101

  • Since: 2003/3/30


It will, but it takes a week or two for them to index the thing. I couldn't use the Python program, so I had to use the Windows one. I think it took about 3 weeks for them to finally index it. I now just add a new map once a month.

You might also submit your map to Yahoo: http://submit.search.yahoo.com/free/request . They have jumped on board (like they have a choice, since Google's doing it) and are allowing sitemap submissions. The difference is that Yahoo wants you to create a simple text file with one URL per line. You can gzip it like the Google one, though.
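
The one-URL-per-line format is easy to produce with a few lines of Python. This is just a sketch under assumptions: `write_url_list` is a hypothetical helper, and the file name and URLs in the usage note are placeholders rather than anything Yahoo prescribes.

```python
import gzip


def write_url_list(urls, path):
    """Write one URL per line to a gzip-compressed text file.

    Blank entries are skipped; returns the number of URLs written.
    """
    cleaned = [u.strip() for u in urls if u.strip()]
    # "wt" opens the gzip stream in text mode; newline="\n" keeps
    # line endings identical on every platform.
    with gzip.open(path, "wt", encoding="utf-8", newline="\n") as fh:
        for url in cleaned:
            fh.write(url + "\n")
    return len(cleaned)
```

For example, `write_url_list(["https://example.com/modules/news/article.php?storyid=1"], "urllist.txt.gz")` would write a single-line compressed file. The `article.php?storyid=` pattern is assumed here to match the News module's article links, so check it against the real URLs on the site.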

Have you noticed MSN kicking butt in indexing lately? They still don't supply the hits Google does, but they have about a 1-day turnaround on anything I put up.
