1
tedsmith
Spidering and Indexing the Content of a Folder?
  • 2004/12/17 13:24

  • tedsmith

  • Home away from home

  • Posts: 1151

  • Since: 2004/6/2 1


Not sure if a module already exists for this, which is why I'm asking - if it does, please let me know.

What we need is the following:

We are members of a newsgroup. In the e-mails that we get there is a great deal of useful information that we want to store in a folder in the web root, and for the data to be accessible by our XOOPS Intranet site and the search block. However, we don't really want to have to manually copy the content of each and every e-mail and paste it into an article system if avoidable.

Is there a module that we can ask to spider a particular folder stored in the web root (which we will just dump the e-mails into), index the content, and then make it viewable in some way via the search block? If not, is there a similar module that does something comparable? If not, can anyone suggest an alternative way for us to achieve a similar effect? If not, does anyone think that they could create a module to do that? I'm too dim so I can't do it!

Thanks

Ted

2
ajaxbr
Re: Spidering and Indexing the Content of a Folder?
  • 2004/12/17 15:31

  • ajaxbr

  • Quite a regular

  • Posts: 276

  • Since: 2003/10/25


What format will the emails be in? I've seen Perl scripts to parse, index and search .mdb files... and PHPDig can index HTML files, but I wonder whether an email/thread-aware app wouldn't be better.

Take a look at Lurker; it sounds interesting.

3
tedsmith
Re: Spidering and Indexing the Content of a Folder?
  • 2004/12/17 18:32

  • tedsmith

  • Home away from home

  • Posts: 1151

  • Since: 2004/6/2 1


I was thinking just standard eml files copied straight out of Microsoft Outlook (not OE) into a Windows Explorer folder. So it won't have to trawl through the pst file or anything like that.

Lurker does indeed look interesting, but I'd rather not use non-XOOPS modules if I can help it. But it is something exactly like Lurker that I need - a mailing list archiver, but for use on XOOPS.
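
For illustration only, the "folder of eml files" idea can be sketched outside XOOPS. This is not a XOOPS module (XOOPS itself is PHP); it is a minimal Python sketch that assumes the files are standard RFC 822 .eml messages sitting directly in one folder. The function name load_eml_folder and the dictionary keys are made up for this example.

```python
from email import policy
from email.parser import BytesParser
from pathlib import Path


def load_eml_folder(folder):
    """Return one dict per .eml file found directly inside `folder`.

    Hypothetical helper: the name and the record layout are assumptions
    made for this sketch, not part of XOOPS or any existing module.
    """
    records = []
    for path in sorted(Path(folder).glob("*.eml")):
        with path.open("rb") as handle:
            message = BytesParser(policy=policy.default).parse(handle)
        # Prefer the plain-text part; fall back to HTML if that is all there is.
        body_part = message.get_body(preferencelist=("plain", "html"))
        body = body_part.get_content() if body_part is not None else ""
        records.append({
            "file": path.name,
            "subject": str(message["subject"] or ""),
            "sender": str(message["from"] or ""),
            "body": body,
        })
    return records
```

Each record could then be inserted into whatever table a search-enabled module reads from, which is the part that would still need to be written on the XOOPS side.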

5
tedsmith
Re: Spidering and Indexing the Content of a Folder?
  • 2004/12/20 11:27

  • tedsmith

  • Home away from home

  • Posts: 1151

  • Since: 2004/6/2 1


One last bump just in case someone in the know missed it before.

6
tedsmith
Re: Spidering and Indexing the Content of a Folder?
  • 2005/1/5 15:04

  • tedsmith

  • Home away from home

  • Posts: 1151

  • Since: 2004/6/2 1


OK, ignore the fact that the data is in e-mail format.

If it was in one massive text file, or if we exported the e-mails into another format of some kind, would there be a way of getting XOOPS to index that so that my users could enter a search string and get hits back?
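
To make the "enter a search string and get hits back" part concrete, here is a minimal sketch of a keyword index over plain text. It is written in Python purely as an illustration and does not use or represent the XOOPS search API; the names tokenize, build_index and search are assumptions for this example, and a query matches a document only when every query word appears in it.

```python
import re
from collections import defaultdict

# Words are runs of lowercase letters or digits; everything else separates them.
WORD = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Split text into lowercase words."""
    return WORD.findall(text.lower())


def build_index(documents):
    """Map each word to the set of document ids that contain it.

    `documents` is a dict of document id -> text (for example one entry
    per exported e-mail, or per chunk of one large text file).
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)
```

A real module would keep the index in the database rather than in memory, but the lookup logic would be along the same lines.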
