Don't Release RC2!!
With the forthcoming release of RC2, XOOPS developers are encouraging module developers to go to town, promising consistency with the final release. My concern is that this might be too early given that the metadata (content about content) aspect of XOOPS is still immature. I'm suggesting that at this stage a "wrapper" be introduced for all content which, in addition to carrying metadata, also builds an abstraction mechanism into content in XOOPS, permitting greater code reuse and communication between modules. XOOPS is already moving in this direction, and perhaps I'll be surprised and find that RC2 has already taken steps this way.
Conceptually, we should be able to agree on a basic content class which all other content inherits from, adding its own attributes which specifically describe that data. This is a design issue. We do not necessarily need to build this using classes given that PHP4 is still limited in its object-oriented qualifications. Basic attributes include a unique ID (across all content on a XOOPS site), its class ID, its creation timestamp, its author ID, a read count, the ID of the group that has permission to read it, etc. I think we can very quickly come to an agreement on what these basic attributes are.
Content also has "methods" which can operate on that data. Each piece of content knows how to display itself in various ways: sideblock, central pane, as a list member, etc. This is how I'd like to view modules. A review module is simply code that knows how to manipulate reviews. A users module is simply code that knows how to manipulate users, or rather user profiles. User profiles are content too. Perhaps a central number allocation module is also needed which ensures that all content (irrespective of its class) has a unique ID, whether it's a review, news or a user profile. There isn't a performance issue here: a module can simply ask for a block of unique ID numbers and work with them.
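As a rough sketch of the "ask for a block of unique ID numbers" idea (written in Python rather than PHP purely for illustration; IdAllocator and reserve_block are made-up names, not anything in XOOPS):

```python
class IdAllocator:
    """Hypothetical central allocator for site-wide unique content IDs."""

    def __init__(self, start=1):
        self._next = start  # first ID that has not been handed out yet

    def reserve_block(self, size):
        """Reserve `size` consecutive IDs and return them as a range."""
        if size < 1:
            raise ValueError("size must be at least 1")
        block = range(self._next, self._next + size)
        self._next += size  # the next caller starts after this block
        return block


allocator = IdAllocator()
review_ids = allocator.reserve_block(100)  # e.g. requested by a reviews module
news_ids = allocator.reserve_block(100)    # e.g. requested by a news module
```

Each module then hands out IDs from its own block without going back to the allocator for every new piece of content. A real version would have to persist the counter in the database so that blocks survive between requests.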
A list mechanism for content is also fundamental. Think of it as a bookmark list. Each user will have a collection of bookmarks: a list of users s/he frequently wishes to post messages to as a group, a list of favourite forum threads s/he is tracking, an archived list of news stories. When a user logs in it is a simple procedure to go through that user's bookmarks and alert him/her when that content has changed, whether because one of his/her favourite authors has written a new article, a forum thread has been updated, or a calendar event they were "subscribed to" (ie bookmarked) has been cancelled. Users will want to share bookmarks, and manipulate bookmarks as a group. In fact we should probably think of bookmarks as content with a module to manipulate it.
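The login check could look something like this. It is only a sketch in Python with invented names (Content, BookmarkList, changed_since), and it assumes every piece of content carries a "modified" timestamp:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Content:
    content_id: int
    title: str
    modified: int  # Unix timestamp of the last change


@dataclass
class BookmarkList:
    owner_id: int
    content_ids: List[int] = field(default_factory=list)


def changed_since(bookmarks: BookmarkList,
                  site_content: Dict[int, Content],
                  last_login: int) -> List[Content]:
    """Return the bookmarked content that changed after the user's last login."""
    return [site_content[cid]
            for cid in bookmarks.content_ids
            if cid in site_content and site_content[cid].modified > last_login]


site = {
    10: Content(10, "Forum thread", modified=200),
    11: Content(11, "Calendar event", modified=50),
}
mine = BookmarkList(owner_id=1, content_ids=[10, 11, 99])  # 99 no longer exists
alerts = changed_since(mine, site, last_login=100)
```

Because the list holds plain content IDs, the same function works whether the bookmark points at a user, a thread or a calendar event.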
Let's say I'm building my XOOPS site around a large database. In my case, a database of tens of thousands of films. I create a film module which wraps each piece of film content in a common XOOPS content wrapper. Additional information specific to film content, such as release date, country of origin, etc, is stored in a separate database table, but the information common to all site content (creation date, read count, etc) is kept in the common XOOPS content table. Yes, there are performance issues, but in database design one commonly breaks the information into separate tables. Conceptually, film content inherits from the basic content class in that it is assumed to have the same basic attributes and methods.
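To make the two-table idea concrete, here is a small sketch using Python's built-in sqlite3 module. The table and column names (content, film, release_year and so on) are my own assumptions, not an existing XOOPS schema, and the values are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE content (              -- common attributes for ALL site content
    id         INTEGER PRIMARY KEY,
    class_id   TEXT    NOT NULL,
    created    INTEGER NOT NULL,
    read_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE film (                 -- attributes specific to film content
    content_id   INTEGER PRIMARY KEY REFERENCES content(id),
    release_year INTEGER,
    country      TEXT
);
""")

# One film record: a row in the common table plus a row in the film table.
conn.execute(
    "INSERT INTO content (id, class_id, created) VALUES (?, ?, ?)",
    (1, "film", 1012000000),
)
conn.execute("INSERT INTO film VALUES (?, ?, ?)", (1, 2001, "USA"))

# Reading a film back means joining the two tables on the shared ID.
row = conn.execute(
    """SELECT c.id, c.read_count, f.release_year, f.country
       FROM content AS c JOIN film AS f ON f.content_id = c.id
       WHERE c.id = ?""",
    (1,),
).fetchone()
```

The join is the performance cost mentioned above; the gain is that search, bookmarks and associations only ever need to look at the common table.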
The next issue is metadata: content about content. When everything is content, basically we're talking about associations between content which could be implemented as a join file or by each piece of content having a bookmark list of associated content. A forum post, for example, would associate itself with various other pieces of content, all listed in the main content table, which may be a film record in a movie site, a species record in an endangered animal site, a book in a literature site, or a person in a sports site, etc. I would suggest two basic types of association across all content defined in the basic content class: primary and secondary. An advanced search could find all content (forum and news posts, etc) which are primarily about, say, Tom Cruise, without listing discussions of Mimi Rogers which mention Cruise in passing.
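A toy version of the primary/secondary search, in Python, with the associations held as a simple list of (content ID, subject ID, kind) rows. All IDs and names here are invented for illustration:

```python
PRIMARY, SECONDARY = "primary", "secondary"

# Hypothetical association rows: (content_id, subject_id, kind)
associations = [
    (101, 1, PRIMARY),    # post 101 is primarily about subject 1
    (102, 2, PRIMARY),    # post 102 is primarily about subject 2 ...
    (102, 1, SECONDARY),  # ... and only mentions subject 1 in passing
]


def find_about(subject_id, kind=PRIMARY):
    """Return the IDs of content associated with a subject in the given way."""
    return sorted(cid for cid, sid, k in associations
                  if sid == subject_id and k == kind)


primary_hits = find_about(1)               # content primarily about subject 1
passing_hits = find_about(1, SECONDARY)    # content that merely mentions it
```

In a database this list would be the join table (or bookmark list) described above, and find_about would be a single indexed query.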
The issue arises here of the overhead for the user. My first get-out clause is that it is the responsibility of an editor to ensure that content is properly classified. Secondly, I suggest that content is created in a different way when you have rich classification. When I want to discuss Mr Cruise I go to the page about him where I see the list of all (primary) news stories, all (primary) forum posts and all (primary) calendar events, etc, ordered by date. When I add a forum post it is automatically associated with Mr Cruise. If I'm commenting on a post which is also associated with the Vanilla Sky content record then my post also "inherits" that primary association. An advanced Text Sanitizer could also suggest other associations through a simple search of the main content table, perhaps only looking at the "title" attribute.
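Both ideas (inheriting associations from the post being replied to, and suggesting associations by matching titles) fit in a few lines. This is a Python sketch with made-up function names and IDs; the title match is a naive substring search, standing in for whatever the Text Sanitizer would really do:

```python
def inherited_primary(page_subject_id, parent_primary_ids):
    """Primary associations for a new post: the page it was posted from,
    plus everything the parent post was primarily associated with."""
    return sorted({page_subject_id, *parent_primary_ids})


def suggest_associations(text, titles):
    """Suggest content IDs whose title appears in the text (case-insensitive)."""
    lowered = text.lower()
    return sorted(cid for cid, title in titles.items()
                  if title.lower() in lowered)


# Hypothetical slice of the main content table: content ID -> title
titles = {1: "Tom Cruise", 2: "Vanilla Sky", 3: "Mimi Rogers"}

reply_links = inherited_primary(1, parent_primary_ids=[1, 2])
suggested = suggest_associations("Vanilla Sky is Tom Cruise at his best", titles)
```

The suggestions would only be offered to the author or editor; deciding whether each one is primary or secondary still needs a human.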
I'll list below the attributes I'd like to see in the basic class. I'd like to build in the concept that content can be fundamentally objective or subjective, at least in intent (for all you philosophers out there). I'd suggest that content defined as objective would need approval before posting and would be edited in Wiki-style. For simplicity, I don't include the attributes necessary for a versioning system, for rewinding changes, but that should be built into the basic content class. Same for multilingualism, which I'll write about separately. And a user rating system should also be built in here, with perhaps higher-rated new stories being included in the newsletter.
One important attribute across all content is a bookmark ID determining who has read access to that content. Default is everyone, perhaps represented by NULL. But this builds in a mechanism where I can annotate any content on a XOOPS site (be it a review, a news story, a calendar event or another user) in a way which only I or my workgroup can view. One could imagine a workgroup having a private forum discussion about a calendar event. "Workgroup" is a loaded word: also think about it as a "Buddy List" of friends planning a night out. Buddy Lists or Workgroups are essentially User Groups defined by the user: for administration purposes I'd suggest the user who defines the buddy group controls who can and cannot join the group. The same bookmark mechanism (since it's just a list of content IDs which may be users or any other content) could be used for a group to collect together a list of films they want to include in a book, or an individual to bookmark videotapes they own or want to buy: a longterm shopping cart.
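The access check itself is tiny. A Python sketch, assuming bookmark lists are looked up by ID and that None stands in for the NULL "everyone" default (can_read and the sample IDs are invented):

```python
def can_read(user_id, reader_list_id, bookmark_lists):
    """Return True if the user may read content guarded by reader_list_id.

    reader_list_id of None (NULL) means the content is public; otherwise it
    names a bookmark list whose members are the permitted user IDs.
    """
    if reader_list_id is None:
        return True
    return user_id in bookmark_lists.get(reader_list_id, ())


# Hypothetical bookmark lists: list ID -> user IDs on that list
bookmark_lists = {
    500: {7},          # a one-member list: personal annotation by user 7
    501: {7, 8, 9},    # a buddy list / workgroup
}

public_ok = can_read(42, None, bookmark_lists)
private_ok = can_read(8, 501, bookmark_lists)
private_denied = can_read(8, 500, bookmark_lists)
```

Changing who is in list 501 changes access to every piece of content guarded by it, without touching the content records themselves.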
So, what do we get here that's worth holding back RC2 (or at least an open call to module developers) for? 1) An advanced search mechanism which works across all content, whatever your site subject matter. 2) An organisational mechanism for all content by subject matter (primary and secondary association). 3) An advanced bookmarking system which works across whatever kind of content you have on your site. 4) A method of sharing content within user-defined groups. Please feel free to comment, however brutally. I'm not an experienced PHP coder but I include the attributes below as a starting point.
Wanted to post this before RC2 is released so forgive me for any inconsistencies, spelling mistakes, repetition, etc.
Basic Content Class Attributes:
01) Unique ID
Unique for every piece of content on a site, whether it be a user profile, a calendar event, a review ... whatever. Unique because this allows any content to be associated with any other piece of content irrespective of class. This also allows bookmarks to hold mixed content.
02) Unique Class ID
I suggest here that Class ID is prefixed with the author's XOOPS.org membership number to ensure that this ID is unique across the universe of XOOPS sites. This ensures compatibility whichever modules one introduces to a site. Clearly there is an issue here when one calendar module is switched to another calendar module, but it should be simple for a site administrator to update this.
03) Author ID
Author is a User who is just content. Everything is content. I guess there is an argument to have this as a bookmark ID which lists a group of authors for collaborative content or where a number of authors have contributed over time. But I need to think more about how versioning affects this when one has Wiki-style editing.
04) Parent ID
Here I'm not referring to inheritance but to the previous forum post. Since all content can be commented on, at least as personal annotation, this is built in here.
05) Child ID List
Again, this refers to follow-up posts. There is a performance issue here when it comes to annotation where hundreds of users may annotate the same news story. Perhaps annotation should be implemented with bookmarks "owned" by the user (and possibly shared with a group through the user's account).
06) Timestamp when Posted/Created
07) Timestamp when Modified
08) Primary Association ID List
The list of primary associations which is nothing more than a list of IDs. Perhaps best implemented as the ID of a Bookmark List in the central Bookmark Table. This allows for optimisation when hundreds of posts have similar associations. Again, this is an implementation issue which I'm not so concerned about in this post.
09) Objective/Subjective Status
Fundamental to content, I believe. As stated above, I'd suggest that "objective" content should be edited Wiki-style and need approval before being visible to every user of a site.
10) Reader ID List
NULL means everyone can read this. Otherwise, it's a private piece of content for the author or a small group. Again, best to implement this as the ID of a bookmark content-type. This permits the content to be shared by a group which changes over time without updating every piece of content viewable by that group. Very often that list will have one member, indicating personal annotation.
11) Read Count
How many times a piece of content has been read
12) User Rating
I think read count is too primitive a measure of a content's worth, particularly when making qualitative decisions like whether to include a post in the newsletter. This also aids workgroups in rating the importance of a specific event, etc. Hmmm, I guess this means that the bookmark module is extended.
13) Maximum Length of Content Measured in Characters
I think it's useful to restrict the size of certain content, assuming content is as simple as a collection of characters. Even when it's not, as in a user profile, or generic database record, I think it's useful to have a description of that user or record in text. Here, of course, I'm thinking 16-bit Unicode and not being Euro-centric!
14) Language of content
Using ISO codes here. I need to think more about multilingualism along the same lines as a versioning system. Perhaps each Unique ID (1) should be extended with the language code and the version number...
15) Text Description
Again, I'm including it here for reasons given in (13). When this is very long as in an article, I'd suggest this should be the abstract of the article which one may like to include in newsletters and in links to the main content from other sites.
16) [Primary] Weblink Address
Again, debatable whether it should be here, but I think it should be built in for when one starts messing with SOAP, etc. This isn't necessarily (or even ideally) a link within one's own site. A forum post or a news piece will frequently be primarily around an article posted on another site. Optional.
17) Include in Next Newsletter
Determined by content type. Perhaps only modifiable by editors of the site.
18) Title
Almost forgot this. As in title of forum post, title of news story, etc. Very useful for the search mechanism and also for the Text Sanitizer to determine what other content to make associations with. But useful if alternate titles are also possible, even if not displayed here. Again, a multilingual system would allow other languages, but I don't want to go into that.
19) User Group List
Although not a rich enough system for classification or workgroups, it does have its very broad uses. User Groups should make semantic sense. Hence a sports site has Football, Baseball User Groups, etc. Content can be marked as being of interest only to a specific User Group (or several User Groups). Should be possible to display only content marked for members of the Baseball User Group, etc.
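Pulling the list together, the basic class might look roughly like the following. This is a Python sketch only (PHP4 would need a different shape); every field name and default is my own guess at a starting point, and the "description" field name is assumed from the text of the item about abstracts:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BaseContent:
    """Hypothetical base record shared by all content on a site."""
    unique_id: int                    # unique across all content on the site
    class_id: str                     # e.g. "<member number>-<module class>"
    author_id: int                    # the author is a user, i.e. content too
    created: int                      # timestamp when posted/created
    parent_id: Optional[int] = None   # previous post, not class inheritance
    child_ids: List[int] = field(default_factory=list)  # follow-up posts
    modified: Optional[int] = None    # timestamp when last modified
    primary_list_id: Optional[int] = None  # bookmark list of primary associations
    objective: bool = False           # objective vs subjective in intent
    reader_list_id: Optional[int] = None   # None means everyone can read
    read_count: int = 0
    user_rating: Optional[float] = None
    max_length: Optional[int] = None  # maximum length in characters
    language: str = "en"              # ISO language code
    description: str = ""             # text description or abstract (assumed name)
    weblink: Optional[str] = None     # optional primary weblink address
    in_next_newsletter: bool = False
    title: str = ""
    user_group_ids: List[int] = field(default_factory=list)

    def is_public(self) -> bool:
        """Content with no reader list is readable by everyone."""
        return self.reader_list_id is None


review = BaseContent(unique_id=1, class_id="1234-review",
                     author_id=7, created=1012000000, title="A review")
```

A film, review or user-profile class would then add only its own attributes on top of this record.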
[ Edited by AFLSC on 2002/1/26 13:29:43 ]