टाटा मूलभूत अनुसंधान संस्थान
Tata Institute of Fundamental Research

Homi Bhabha Road, Mumbai 400005, India


Summary

This page explains the difference between the core pages in the TIFR web site and those which are not part of this core (called hosted pages). The management of the core pages is outlined, including the content management system which should generate the core web site.

Core web site construction

Page types

The web site will be divided into core and hosted content. The core content will show up in the site map or equivalent navigational aid, be thoroughly checked for style and content, and have a unified feel and navigation facility. This document is about the architecture of the core content.

The core content has many different pieces: there will be almost static pages as well as very fluid information best disseminated through database queries (telephone book, tender notices, job announcements, calendar of events, search functionality, etc).

Three jobs are clearly demarcated for almost static pages: the authors are responsible for the content of a specific page, the webmaster is responsible for the information system and its technology, and the publisher is responsible for checking the content and the presentation of the web site. For responses to database queries there is no single author, and the data must be verified at the stage of entry. A database manager will be responsible for the integrity of each database.

For data security, there will always be a master copy of the complete web site on an internal machine. The publicly viewable website is created by mirroring this master by a manually invoked process. Static pages need not be updated very often. Master copies of some of the databases need to be updated fairly frequently.

Page layout elements

In order to preserve a unity of feel while permitting flexibility in content and navigation, the screen area is divided into a fixed number of elements. The look and feel of the layout is handled through CSS. This page contains illustrations of each of the elements described below.

Banner
The banner contains the name of the institute in Hindi and English, and the address in English. No content in this section is an image. This is deliberate: it enhances visibility in search engines when the keywords are typed either in the Roman or Devanagari scripts. The width, colour, font, etc are handled through CSS and can be changed as one wishes.
Navigation
The invariant navigation bar gives the main navigational elements in the site. In certain sections of the site, navigation can contain additional variable elements. At present it is placed on the right hand side of the page. Its positioning, size, font, colour, etc, can be changed through CSS.
  1. The invariant navigation bar will be constructed by the content management system. The default view will be a condensed form, and a script can cause it to switch to an expanded view (and back). The availability of an expanded view means that a link to a site map page need not be provided.
  2. The variable elements can be section dependent, and give content providers the freedom to add navigational aids within the section of the site that they manage.
Quick links
Quick links are to the Home page, the main page of the current section, a site search facility, and a mail connection to the webmaster. At present this is a horizontal line below the banner, but everything about it is controlled by changeable CSS.
Author, date, and copyright
Visible metadata about the page contains the name of the author (or an author list if there are multiple authors), the date of last modification and copyright information. This information should be added by the content management system. For database query returns, one or more of these elements may be omitted. At present this information is placed across the foot of the page but everything about this can be changed through CSS.
Content
The main content is completely separated from the visual design and the site architecture. In order to write a page, an author needs to know a very small subset of HTML, namely
  1. How to create the page title.
  2. How to delimit paragraphs.
  3. How to create lists.
  4. How to include non-text material (images, video, sound).
  5. How to declare special copyrights.

Content management system

Building a version of the web site

The content management system (CMS) should be able to synthesize the whole website entirely from authors' files by the following steps.

  1. Back up the current web site.
  2. Step through the list of authors, and for each author
  3. step through the files of the author,
  4. validating each file.
  5. For each valid file generate an entry in the long navigation list, decide on the section and enter it in the quick link, generate visible and invisible metadata, and
  6. put together the fully styled file with the author's content,
  7. store it in the proper place inside the new site.
  8. For each invalid file generate proper error messages.
  9. Collect and transmit error and status messages to authors, publisher and webmaster
  10. Exit
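The nested loop in steps 2 to 8 can be sketched as follows. This is a minimal illustration and not the CMS itself: the function name build_site, the validate and render callables, the "*.html" file pattern and the use of the author's directory name as the section are all assumptions made for the example. The backup step and the transmission of messages are left out.

```python
from pathlib import Path


def build_site(authors, validate, render, new_site):
    """Build a new copy of the site from the authors' files.

    authors  -- list of (author name, author directory) pairs
    validate -- callable(text) returning a list of error messages
    render   -- callable(text, author name) returning the styled page
    new_site -- directory that receives the generated pages
    (All four parameters are hypothetical names for this sketch.)
    """
    navigation = []  # entries for the long navigation list
    report = {}      # author name -> error messages for that author

    for name, directory in authors:
        for source in sorted(Path(directory).glob("*.html")):
            text = source.read_text(encoding="utf-8")
            errors = validate(text)
            if errors:
                # Invalid file: record the messages and do not publish it.
                report.setdefault(name, []).extend(
                    f"{source.name}: {message}" for message in errors
                )
                continue
            # Assumed convention: the directory name doubles as the section.
            target = Path(new_site) / Path(directory).name / source.name
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(render(text, name), encoding="utf-8")
            navigation.append(target.relative_to(new_site).as_posix())

    return navigation, report
```

The function writes only into the new site directory, so the visible web site stays untouched until the manually initiated publication step.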

For safety there needs to be a manually initiated step to replace the current visible web site by the newly created web site. This step should be taken by the publisher after (s)he looks at the report, and consults with the authors and webmaster about all possible problems. This step should involve

  1. Recreating the search engine database
  2. Comparing the new and old long navigation lists (site maps) in order to generate help for 404 errors
  3. Updating the public copy of the website
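The comparison in step 2 could look like the sketch below. It assumes that each navigation list is a list of page paths, and it uses one simple heuristic for 404 help (a removed page is matched to new pages with the same file name). Both function names and the heuristic are assumptions for illustration.

```python
def navigation_changes(old_pages, new_pages):
    """Return the pages that disappeared and the pages that are new."""
    old, new = set(old_pages), set(new_pages)
    return sorted(old - new), sorted(new - old)


def suggest_replacements(removed, new_pages):
    """Map each removed page to the new pages that share its file name."""
    suggestions = {}
    for page in removed:
        name = page.rsplit("/", 1)[-1]
        suggestions[page] = [
            candidate for candidate in new_pages
            if candidate.rsplit("/", 1)[-1] == name
        ]
    return suggestions
```

A removed page with no suggestion would still need a decision from the publisher, for example a pointer to the main page of its former section.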

Question: How often should new versions of the web site be published?

Maintaining a list of authors

A database of authors should contain a list of authors in the form of pairs <author name, author directory>. This should match information in the user database of the machine from which the content files are extracted.

Validating a file

  1. HTML syntax check
  2. Spelling check
  3. Certification of copy editor (?)
  4. Copyright certification (embedded: provided by author)
  5. Link checking
  6. Anything else?
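The first check in the list, the HTML syntax check, might be sketched with Python's standard html.parser module as below. The set of allowed tags is an assumption: this document names what authors must be able to do (title, paragraphs, lists, non-text material), not the exact tags. The sketch also assumes the stricter rule that every non-void tag is closed explicitly, which HTML itself does not require.

```python
from html.parser import HTMLParser

# Hypothetical subset; adjust to whatever the author instructions permit.
ALLOWED_TAGS = {"title", "p", "ul", "ol", "li", "img"}
VOID_TAGS = {"img"}  # elements that take no closing tag


class SubsetChecker(HTMLParser):
    """Collect errors for tags outside the subset or badly nested tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            self.errors.append(f"tag <{tag}> is not in the allowed subset")
        elif tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Disallowed tags were already reported; void tags have no close.
        if tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.errors.append(f"unexpected closing tag </{tag}>")


def check_syntax(text):
    """Return a list of error messages; an empty list means the file passed."""
    checker = SubsetChecker()
    checker.feed(text)
    checker.close()
    errors = list(checker.errors)
    errors.extend(f"tag <{tag}> is never closed" for tag in checker.open_tags)
    return errors
```

Returning a list of messages rather than a single pass or fail flag lets the build step forward all problems in a file to its author at once.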

Generating metadata

  1. Author name from database
  2. Date created, date modified from time stamps
  3. Copyrights from embedded notes
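A minimal sketch of these three steps follows. The format of the embedded copyright note is not defined in this document, so the HTML comment form "<!-- copyright: ... -->" and the default value are assumptions. Only the modification date is read from the file system: many Unix file systems do not record a creation time (st_ctime there is the time of the last inode change), so the creation date would probably have to be stored elsewhere, for instance when the CMS first sees the file.

```python
import os
import re
from datetime import datetime, timezone

# Hypothetical convention for an embedded copyright note.
COPYRIGHT_NOTE = re.compile(
    r"<!--\s*copyright:\s*(.*?)\s*-->", re.IGNORECASE | re.DOTALL
)
DEFAULT_COPYRIGHT = "TIFR"  # assumed default when no note is embedded


def page_metadata(path, author_name):
    """Build the visible metadata for one content file."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    match = COPYRIGHT_NOTE.search(text)
    # Date of last modification, taken from the file's time stamp (in UTC).
    modified = datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)
    return {
        "author": author_name,  # looked up in the author database by the caller
        "modified": modified.date().isoformat(),
        "copyright": match.group(1) if match else DEFAULT_COPYRIGHT,
    }
```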

Data security and backups

For security, the current content should be first created in a different machine, invisible to the web server, and then be mirrored in the file system visible to the web server. This will allow fast restoration of data in case of malicious or accidental tampering.

Periodic updates to the website are envisaged. The previous few updates can be stored in the secure machine for recovery from accidental author error.
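Keeping the previous few updates on the secure machine could be done along the following lines. This is only a sketch: the function name, the archive layout (one directory per update, labelled with an ISO date so that names sort chronologically) and the default of three retained copies are assumptions, since the text says only "the previous few".

```python
import shutil
from pathlib import Path


def store_snapshot(master, archive, label, keep=3):
    """Copy the master site into the archive and prune the oldest snapshots.

    keep must be at least 1. Returns the labels of the snapshots retained.
    """
    archive = Path(archive)
    archive.mkdir(parents=True, exist_ok=True)
    shutil.copytree(master, archive / label)
    snapshots = sorted(p for p in archive.iterdir() if p.is_dir())
    for stale in snapshots[:-keep]:
        shutil.rmtree(stale)
    return [p.name for p in snapshots[-keep:]]
```

Restoring after accidental author error would then amount to copying the chosen snapshot back over the master copy before the next publication.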

Database Managers

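The parts of the web site which rely on databases are those identified earlier as fluid information: the telephone book, tender notices, job announcements, the calendar of events and the search functionality.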

Each of these databases takes data from multiple sources. The database architecture should ensure data integrity. One person should be in administrative charge of each database. This database manager will be the point of contact with the website publisher.

Authors

An author should be able to generate content in ASCII files stored in a specific sub-directory owned by him/her. An author is expected to supply material which follows the syntax laid out and has been checked for content, language and copyright. The tools provided to authors are instructions for using a small subset of HTML, syntax checkers for that subset, spell checking, and a way to view their content as it would finally appear in the web site (partly to help them check their own links).

The webmaster

The webmaster will be in charge of the complete technical functioning of the web site, including the web server (Apache), the content management system, the database management systems, security, etc. (S)he will coordinate with the publisher on the presentation of the web site.

The publisher

The publisher is fully responsible for everything that appears in the core content of the web site. It is the publisher's responsibility to make sure that authors have performed the tasks that they are supposed to carry out, that database managers have preserved data integrity and that the information architecture is delivering what it is meant to deliver. The publisher is the person primarily responsible for monitoring accidental or malicious damage to the web site, and for raising an alarm when this happens.

Hosted content

The TIFR web server will also serve out "hosted content". These will be departmental web pages, personal pages, pages for individual projects etc.

Norms for hosted content

Copyright : TIFR ; Author : Gupta Sourendu ; Created on 2008-04-04 ; Last modified on 2009-04-17.