1
wishcraft
Spiders 1.01 - Robot Management Tool

Spiders is a robot management tool that imports a list of the crawler and scanner robots known on the web. It lets you use XOOPS permissions to control which data on your site robots can list online. It also logs the robot in using a post loader and shows the robot in your 'Who's Online' block while it is online.

The robot text file is taken from an online resource of robot data and stored in your database. Remember to adjust your mainfile.php to include the post loader after the common file is loaded.

Robot Manager (Spiders) is a good way to control what your site displays in search engines.

Spiders is written only for XOOPS 2.3 and later.

Download: xoops2.3_spiders_1.01.zip

Remember, the line for your mainfile.php looks like this. You will not be able to update or uninstall the module while the loader is active.

include_once XOOPS_ROOT_PATH."/modules/spiders/post.loader.spiders.php";
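
For anyone worried about the include line pointing at a missing file (for example after the module folder has been removed), a guarded variant might look like the sketch below. It is only an illustration: spiders_loader_path() is a hypothetical helper name that is not part of the Spiders module, and the only path it uses is the one from the line above.

```php
<?php
// Sketch only: spiders_loader_path() is a hypothetical helper, not shipped with Spiders.
// It returns the full path of the post loader if the file exists, otherwise null.
function spiders_loader_path($rootPath)
{
    $loader = rtrim($rootPath, '/') . '/modules/spiders/post.loader.spiders.php';
    return is_file($loader) ? $loader : null;
}

// In mainfile.php this would go after the common file has been loaded.
if (defined('XOOPS_ROOT_PATH')) {
    $loader = spiders_loader_path(XOOPS_ROOT_PATH);
    if ($loader !== null) {
        include_once $loader;
    }
}
```

This does not change the point made above: while the include runs, the loader is active, so the line still has to be commented out before updating or uninstalling the module.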
www.ohloh.net/accounts/226400

Follow, Like & Read:-

twitter.com/RegaltyFamily
github.com/Chronolabs-Cooperative
facebook.com/DrAntonyRoberts

2
trabis
Re: Spiders 1.01 - Robot Management Tool
  • 2009/5/24 2:07

  • trabis

  • Core Developer

  • Posts: 2269

  • Since: 2006/9/1 1


This is very interesting, we can set permissions to the bots group! I can close my site for maintenance and keep it open for bots! How cool is that ;)
http://www.xuups.com/userinfo.php?uid=375

I just changed line 50 of the post loader to suppress some warnings:
if(@strpos(' '.$_SERVER['HTTP_USER_AGENT'], $spider->getVar('robot-exclusion-useragent'))||@strpos(' '.$_SERVER['HTTP_USER_AGENT'], $spider->getVar('robot-useragent'))) {


Thank you, great job!
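
An alternative to silencing the warnings with @ is to skip empty values before calling strpos(). The following is a rough sketch under that assumption, not the module's own code: spider_agent_matches() is a hypothetical helper, and in the loader the needles would come from the two $spider->getVar() calls shown above. It keeps the case-sensitive strpos() matching of the original line.

```php
<?php
// Sketch only: spider_agent_matches() is a hypothetical helper, not part of Spiders.
// Returns true if any non-empty needle occurs in the user agent string.
function spider_agent_matches($userAgent, array $needles)
{
    if (!is_string($userAgent) || $userAgent === '') {
        return false; // no user agent was sent
    }
    foreach ($needles as $needle) {
        // Empty needles are skipped; they are what makes strpos() complain.
        if (is_string($needle) && $needle !== '' && strpos($userAgent, $needle) !== false) {
            return true;
        }
    }
    return false;
}

// Read the header defensively, since some clients do not send it.
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$isBot = spider_agent_matches($agent, array('Googlebot', ''));
```

Comparing against false with !== also removes the need for the ' ' prefix trick, because a match at position 0 is no longer mistaken for "not found".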

3
wishcraft
Re: Spiders 1.01 & Protector Interactions - post.loader.spiders.php

btw trabis, if you want to let the bot access your site while it is around, here is a module that will change the 'welcome agent' setting of Protector and allow the bot to index while it is online.

<?php
// $Author: wishcraft $
//  ------------------------------------------------------------------------ //
//                XOOPS - PHP Content Management System                      //
//                    Copyright (c) 2000 XOOPS.org                           //
//                       <https://xoops.org/>                             //
//  ------------------------------------------------------------------------ //
//  This program is free software; you can redistribute it and/or modify     //
//  it under the terms of the GNU General Public License as published by     //
//  the Free Software Foundation; either version 2 of the License, or        //
//  (at your option) any later version.                                      //
//                                                                           //
//  You may not change or alter any portion of this comment or credits       //
//  of supporting developers from this source code or any supporting         //
//  source code which is considered copyrighted (c) material of the          //
//  original comment or credit authors.                                      //
//                                                                           //
//  This program is distributed in the hope that it will be useful,          //
//  but WITHOUT ANY WARRANTY; without even the implied warranty of           //
//  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the            //
//  GNU General Public License for more details.                             //
//                                                                           //
//  You should have received a copy of the GNU General Public License        //
//  along with this program; if not, write to the Free Software              //
//  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA //
//  ------------------------------------------------------------------------ //
// Author: Simon Roberts (AKA wishcraft)                                     //
// URL: http://www.chronolabs.org.au                                         //
// Project: The XOOPS Project                                                //
// ------------------------------------------------------------------------- //

    
if (!defined('_MI_SPIDERS_DIRNAME')) {
    define('_MI_SPIDERS_DIRNAME', 'spiders');
}

global $xoopsConfig;

// Check that the Spiders module is installed.
$module_handler =& xoops_gethandler('module');
$criteria = new CriteriaCompo(new Criteria('dirname', _MI_SPIDERS_DIRNAME));
$installed = $module_handler->getCount($criteria);

if ($installed != 0) {
    $module =& $module_handler->getByDirname(_MI_SPIDERS_DIRNAME);
    if ($module->getVar('isactive') == true) {
        $spider_handler =& xoops_getmodulehandler('spiders', _MI_SPIDERS_DIRNAME);
        $suser_handler =& xoops_getmodulehandler('spiders_user', _MI_SPIDERS_DIRNAME);
        // Fetched before the loop, because it is needed to look up each robot's user.
        $member_handler =& xoops_gethandler('member');

        // User agents of the robots that are online, joined with '|' for Protector.
        $dos_crsafe = '';

        $spiders = $spider_handler->getObjects(NULL);
        foreach ($spiders as $spider) {
            $suser = $suser_handler->get($spider->getVar('id'));
            $robot =& $member_handler->getUser($suser->getVar('uid'));
            // The leading space keeps a match at position 0 from evaluating to false.
            if (strpos(' '.$_SERVER['HTTP_USER_AGENT'], $spider->getVar('robot-exclusion-useragent')) || strpos(' '.$_SERVER['HTTP_USER_AGENT'], $spider->getVar('robot-useragent'))) {

                /**
                 * User Sessions
                 */
                $xoopsUser = '';
                $xoopsUserIsAdmin = false;
                $sess_handler =& xoops_gethandler('session');
                @ini_set('session.gc_maxlifetime', $xoopsConfig['session_expire'] * 60);
                session_set_save_handler(array(&$sess_handler, 'open'), array(&$sess_handler, 'close'), array(&$sess_handler, 'read'), array(&$sess_handler, 'write'), array(&$sess_handler, 'destroy'), array(&$sess_handler, 'gc'));
                session_start();

                // Log the request in as the user account mapped to this robot.
                $_SESSION['xoopsUserId'] = $suser->getVar('uid');

                /**
                 * Log user in and deal with Sessions and Cookies
                 */
                if (!empty($_SESSION['xoopsUserId'])) {
                    $xoopsUser =& $member_handler->getUser($_SESSION['xoopsUserId']);
                    if (!is_object($xoopsUser) || (isset($hash_login) && md5($xoopsUser->getVar('pass') . XOOPS_DB_NAME . XOOPS_DB_PASS . XOOPS_DB_PREFIX) != $hash_login)) {
                        $xoopsUser = '';
                        $_SESSION = array();
                        session_destroy();
                    } else {
                        $GLOBALS['sess_handler']->update_cookie();
                        if (isset($_SESSION['xoopsUserGroups'])) {
                            $xoopsUser->setGroups($_SESSION['xoopsUserGroups']);
                        } else {
                            $_SESSION['xoopsUserGroups'] = $xoopsUser->getGroups();
                        }
                        $xoopsUserIsAdmin = $xoopsUser->isAdmin();
                        if (in_array(XOOPS_GROUP_BANNED, $xoopsUser->getGroups())) {
                            include_once $GLOBALS['xoops']->path('include/site-banned.php');
                            exit();
                        }
                    }
                }

                // Collect the user agents of this robot if its account is online.
                if (is_object($robot)) {
                    if ($robot->isOnline()) {
                        $dos_crsafe .= $spider->getVar('robot-exclusion-useragent').'|'.ucfirst($spider->getVar('robot-exclusion-useragent')).'|'.$spider->getVar('robot-useragent').'|';
                    }
                }
            }
        }

        // Write the collected agents into Protector's dos_crsafe setting.
        if (strlen($dos_crsafe) > 0) {
            $module =& $module_handler->getByDirname('protector');
            if (is_object($module) && !empty($module)) {
                $config_handler =& xoops_gethandler('config');
                $criteria = new CriteriaCompo(new Criteria('conf_name', 'dos_crsafe'), "AND");
                $criteria->add(new Criteria('conf_modid', $module->getVar('mid')));
                if ($config_handler->getConfigCount($criteria) > 0) {
                    $configs = $config_handler->getConfigs($criteria);
                    if (is_object($configs[0])) {
                        $configs[0]->setVar('conf_value', '/(msnbot|Googlebot|'.$dos_crsafe.'Yahoo! Slurp)/i');
                        $config_handler->insertConfig($configs[0]);
                    }
                }
            }
        }
    }
}
?>
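
The loader above pastes the robot user agents straight into the dos_crsafe regular expression. User agent strings can contain characters such as '/', '(' or '+', so one possible refinement is to escape each agent first. The sketch below shows that idea only: build_crsafe_pattern() is a hypothetical helper that is not part of Spiders or Protector, and the three default agents are the ones already hard-coded in the listing.

```php
<?php
// Sketch only: build_crsafe_pattern() is a hypothetical helper, not part of Spiders or Protector.
// Builds a case-insensitive alternation such as '/(msnbot|Googlebot|...)/i'.
function build_crsafe_pattern(array $extraAgents, array $defaults = array('msnbot', 'Googlebot', 'Yahoo! Slurp'))
{
    $parts = array();
    foreach (array_merge($defaults, $extraAgents) as $agent) {
        $agent = trim((string) $agent);
        if ($agent === '') {
            continue; // an empty alternative would match every user agent
        }
        // Escape regex metacharacters and the '/' delimiter; keys remove duplicates.
        $parts[preg_quote($agent, '/')] = true;
    }
    if (count($parts) === 0) {
        return null; // nothing to match, so leave the setting alone
    }
    return '/(' . implode('|', array_keys($parts)) . ')/i';
}

$pattern = build_crsafe_pattern(array('Baiduspider', '', 'a.b(c)'));
```

In the loader, the returned string would take the place of the hand-built value passed to setVar('conf_value', ...), with the agents gathered in an array instead of the '|'-joined $dos_crsafe string.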


edited 4:10 PM AEST
edited 4:37 PM AEST

4
Kiwi_Chris
Re: Spiders 1.01 & Protector Interactions - post.loader.spiders.php
  • 2009/5/25 8:26

  • Kiwi_Chris

  • Just popping in

  • Posts: 79

  • Since: 2009/1/3 2


Hi, a question. With spiders now basically being seen as members of my site,

I am guessing this may affect my visitor stats?
Or will they be disregarded as usual?

Also, is it possible to prevent these profiles being seen by the other members? This could
cause confusion on my site, as it's a social site and real members could mistake them
for real people.

Thanks.
