1
vmax2extreme
Setting .htaccess restrictions

Hello,

My site is running the XOOPS 2.5.5 release, and some non-members contacted me to say that Google searches return hits for information on our site that is supposed to be restricted to admins or various board members only. Is there a way to stop bots and crawlers from going deep into our site and indexing all of our restricted HTML pages? The pages in question are from the Xcenter Content Module 2.1.6. Permissions are set so that anonymous users have no viewing access at all, but this isn't a permissions problem; it's the crawlers finding these HTML pages and indexing them.

Please help if you can assist me in fixing this.

Thanks in advance,

Mike

2
wishcraft
Re: Setting .htaccess restrictions

Okay, under 'Basic Permissions' there are two options: the 'default permissions' and the 'content permissions'.

The problem is that your anonymous default permission has 'View content' enabled. So in this setting:

[Screenshot: 'Basic Permissions' with the anonymous 'View content' option enabled]

Turn off 'View content'; otherwise anonymous users will be able to view all content. Make it look like this:

[Screenshot: 'Basic Permissions' with 'View content' turned off for anonymous]

You will also have to add the following line to your robots.txt so crawlers don't index your files, and add a blank index.html to the /html path so the web server doesn't show a directory listing that crawlers could follow.

Add to robots.txt:
Disallow: /modules/xcenter/html/
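
Note that a Disallow rule only takes effect underneath a User-agent line, so the complete robots.txt entry would look something like this:

User-agent: *
Disallow: /modules/xcenter/html/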


You can just copy the index.html from the /class folder, or from one of the other folders containing one with the history.go(1) line.

3
vmax2extreme
Re: Setting .htaccess restrictions

This didn't solve the issue of anonymous users being able to view our HTML files in that module. It seems that the crawlers have already indexed all our HTML pages, and we need to know how to set restrictions so the crawlers do NOT index these pages.

4
wishcraft
Re: Setting .htaccess restrictions

Okay, add a .htaccess file to the folder containing the HTML files, with the following line in it:

AddType application/x-httpd-php .html


This will make .html files behave like PHP files. Add the following lines to the top of each HTML file:

<?php
// Load the XOOPS bootstrap (four directory levels up from /modules/xcenter/html/)
// and redirect anonymous visitors to the homepage.
include dirname(dirname(dirname(dirname(__FILE__)))) . '/mainfile.php';
if (!is_object($GLOBALS['xoopsUser'])) {
    header("HTTP/1.1 301 Moved Permanently");
    header('Location: ' . XOOPS_URL);
    exit;
}
?>


This will get the link removed from the search engines as well as prevent the files from being viewed unless the visitor is logged in.
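
To show how it fits together, a protected page in /modules/xcenter/html/ would end up looking something like this (the markup below is only a placeholder, not the real Xcenter output):

<?php
// Guard from above: bootstrap XOOPS from the site root and send anyone
// who is not logged in back to the homepage.
include dirname(dirname(dirname(dirname(__FILE__)))) . '/mainfile.php';
if (!is_object($GLOBALS['xoopsUser'])) {
    header("HTTP/1.1 301 Moved Permanently");
    header('Location: ' . XOOPS_URL);
    exit;
}
?>
<html>
<head><title>Restricted page (example)</title></head>
<body>
<!-- the original restricted HTML content stays here, unchanged -->
</body>
</html>

The exact PHP handler name used in .htaccess can vary between shared hosts, so check with your host if .html files do not start executing as PHP after the change.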

5
irmtfan
Re: Setting .htaccess restrictions

I think the final solution for preventing everybody from seeing a page/file/picture is to move that content outside the wwwroot.

A professional module that I have proudly used on some of my websites is "Wraps" by GIJ:
http://xoops.peak.ne.jp/md/mydownloads/singlefile.php?lid=97&cid=1
This module still works with the latest XOOPS 2.5.5 version.

e.g. I moved all private content like resumes, private pictures, recommendations, ... outside the root and set permissions for the desired groups.
For example, you cannot see my resume unless you are registered on the website and belong to the "customers" group:
http://d.jadoogaran.org/modules/customer/Recommendations/resume.pdf

This is the best way I can imagine.

6
vmax2extreme
Re: Setting .htaccess restrictions

Why can't the /.htaccess file be written to restrict .html access to non-members throughout the site, or in specific locations? I don't want to maintain an additional module or edit individual files, since they are replaced on a regular basis when the site is updated. That would be the optimal goal I am trying to accomplish, with little to no overhead in the long run. There's got to be someone who has done this and made it simple.

7
irmtfan
Re: Setting .htaccess restrictions

vmax2extreme:
Everybody (crawlers included) can access all of the HTML files directly because they are outside the control of the XOOPS (or any other CMS) permissions system.
What wishcraft wrote is a way to apply the XOOPS permissions to your HTML files (by converting them to PHP files).

As I said before, the final solution is moving the folder containing the HTML files outside the wwwroot.
It is very simple: you just go to cPanel and copy the /modules/xcenter/html folder to a folder outside the root.
Then follow the installation instructions for "Wraps".
You can keep the same folder structure as /modules/xcenter/html.

As for crawlers, wishcraft's code should stop them.
Add to robots.txt:
Disallow: /modules/xcenter/html/

But anyway, everybody can still access your HTML files directly, even if Google will not index them.

8
vmax2extreme
Re: Setting .htaccess restrictions

Unfortunately, our site is on virtual shared storage, so the root of our site is the highest level we can reach with our available resources. Simon's suggestion is very hard to do, since we change our files often when we update them, which makes them hard to manage.

I guess at this point we really don't have an option then, since copies won't work and modifying the individual files is too cumbersome. There has to be a simpler way to do this. I am sure I am not the only one out there with this type of issue.

Mike

9
irmtfan
Re: Setting .htaccess restrictions

Don't worry, there is always a solution.
The Wraps module can still solve your problem.
With the "Wraps" module, moving the folder outside the root is not necessary; I (and also GIJ, the module's creator) just recommend it. Personally, I like to keep my important data outside the root.
Anyway, you can make a folder inside the root but add a .htaccess file containing "DENY FROM ALL",
then install Wraps and follow the instructions.
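
For reference, the .htaccess in that folder would just hold the standard Apache deny rule, something like this on Apache 2.2 (or 2.4 with mod_access_compat):

Order allow,deny
Deny from all

or, on Apache 2.4:

Require all denied

The web server then refuses direct requests to those files, while Wraps can still read them through PHP and present them inside XOOPS, where the group permissions apply.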
