WebinTool: A Generic Web to Database Interface Building Tool

Jian Hu, David Nicholson, Chris Mungall,
Alan L. Hillyard* and Alan L. Archibald

Roslin Institute (Edinburgh)
Roslin, Midlothian EH25 9PS, UK

*The Jackson Laboratory
Bar Harbor, ME 04609, USA

Contact: Jian Hu (Email: jianhu@bbsrc.ac.uk)

In: Proc 7th International Conference and Workshop
on Database and Expert System Applications, DEXA 96,
pages 285-290, Zurich, Switzerland, September 1996

WebinTool: A Generic Web to Database Interface Building Tool

Jian Hu, David Nicholson, Chris Mungall, Alan L. Hillyard*, Alan L. Archibald
Roslin Institute (Edinburgh)
Roslin, Midlothian EH25 9PS, UK
- *The Jackson Laboratory
Bar Harbor, ME 04609, USA

Abstract

The increasing use of the World Wide Web (WWW or Web) for information retrieval from information systems has raised a demand for Web to database interface building tools. The requirements for such a building tool is analysed and a framework (WebinTool) for rapid building and easy maintenance of Web interfaces is presented. The key component of the WebinTool is an interface tool language which mainly consists of the HTML statements and dedicated SQL-based data manipulation statements. Using this language, a variety of user-customized Web interfaces can be created.

Introduction

A database interface provides a user with a means to access data stored in the database. Command query languages have been developed for this purpose, such as the well-known SQL (Structured Query Language) which is now an industry standard relational database query language. The main advantage of command languagess are their generality and the flexibility in manipulating the database. However, the generality and flexibility also make the language appear too complex for many users to handle [8]. As an alternative to the query language, most commercial (relational) DBMSs provide a form-based tool package. Users can manipulate databases through form-based facilities, such as Query-By-Form (QBF) and Report-By-Form (RBF) in INGRES [7]. Users are also allowed to create a form-based, user-customized interface by creating a group of forms with a form editor and specifying the operations on these forms in a so-called fourth generation language (4GL) [5,4]. This framework is usually called Application-By-Forms and has been widely used [9].

Developments in both graphics hardware and software have made it possible to make interfaces more intuitive and easier to learn and use. INGRES Window 4GL is an example where the user can create graphical interfaces using the W4GL language [6]. In order to support database applications across networks, the client/server approach has become very popular in which a database is usually located at a server machine with the graphical interface running on other networked client machines, such as PCs [2].

The applications of the World Wide Web (WWW or Web) have been growing dramtically since its inception in late 1990. A Web server upon receiving a request from a Web client sends back the required document which is marked up by HTML (Hyper Text Markup Language)[1]. An important reason why the Web is so popular is that the user does not need to learn details of the interface, but only the browser (e.g. Netscape). Since many information systems use database, this raises the issue of how to retrieve the data from the database and to present it in HTML format, i.e. how to create interfaces between the Web and databases. Considering that most databases used today are relational, we will only consider the relational databases.

In this paper, we present a generic web to database interface building tool (WebinTool ) which aims to provide a framework for rapid building and easy maintenance of web interfaces to databases. The key component of this tool is a interface tool language mainly consisting of the HTML statements which describe the interface displays and the dedicated SQL-based statements which perform data manipulations. This idea is similar to the Application By Form (ABF) framework [9]. Therefore, any Web interface to databases can be constructed by creating a series of user-defined WebinTool forms written in the interface tool language.

The rest of this paper will carry out a requirements analysis. The system functional design, architecture and implementation are given. Finally a WebinTool application using genome mapping databases is briefly reported.

Requirements analysis

For the Web interface building tool design, two kinds of users are involved: end-users and application developers. This section assesses their characteristics and requirements, followed by a description of typical scenarios of data manipulations in context of the Web.

End-users

Now consider a typical scenario for an end-user to retrieve information from a database. The user is first presented with a `home page' which displays the information available. He/she then selects (usually by a `click') the information of interest to obtain a query form and to enter search conditions for required data objects. The user submits the query and obtains the results, typically a list of the qualified data objects. A further detailed description of an object may be availabe by 'clicking' it. This style of query-list-detail is usually called the Master/Details style.

However, in many cases, an end-user may need to send a request directly asking for the required data item, e.g. the details of a gene in a genome database. In the Web environment, this can be done by issuing a request from a Web client with the query string which specifies the search condition. More generally, from the viewpoint of end-users, the data in one or more databases is perceived as belonging to some categories or classes. The end-user may wish to obtain the information about one or more data objects of a particular class directly by sending a Web request.

Therefore, in a web interface (referred to as an application) a set of keys can be introduced which basically describe how end-users see the database and mediates the end-users to the database. Note that different applications may have different keys both in terms of the names, number and semantics.

In addition to the keys, there may need to be some other parameters which put a further search restriction on the data. Therefore any database request may take the form of a query URL and look like the following:

    http://the.host.machine/path
      ?key1=val&key2=val&para1=val&para2=val

In general we call these requesting parameters.

Application developers

The role of application developers is to create an interface to meet the requirements of the end-users. An application developer is assumed to have a good knowledge of HTML, database query languages (e.g. SQL), as well as other tools which carry out data manipulations and/or produce the resulting HTML document.

A conventional Web database interface is usually implemented as a CGI program which contains all data operations and HTML statements, e.g. a C/embedded-SQL program. All resulting HTML documents are embedded within the source code. The advantage of this method is that the interface can be described in detail because of the expressiveness of the host language, e.g. C. Problem with this method are mainly related to the interface maintanence. Whenever the interface needs to be changed, the source code will have to be modified and recompiled.

One approach to solve this problem is to separate the interface displays and data operations from the source code. This topic has been addressed by a number of researcher [10] and several prototype systems are available in the public domain, such as GSQL, available at

http://www.ncsa.uiuc.edu/SDG/People/- jason/pub/gsql/starthere.html

and WDB, at

http://arch-http.hq.eso.org/brasmus/wdb

These systems allow application developers freedom to define the Web forms and associated database operations by writing user-customized script files using the system defined commands. The commands can perform specific data operations as well as limited results processing. This feature makes the system easy to use, but the price is the lack of interface expressiveness, i.e. it may be awkward or impossible to describe some complicated displays and construct complex database queries.

As already mentioned, an application developer is assumed to have a good working knowledge of HTML and SQL, and this will give them sufficient freedom to construct and maintain a wide variety of Web interfaces.

Data manipulations

The purpose of the data manipulation statements is to specify the database operations and the subsequent display of the results. This subsection will discuss the basic features of the data manipulation statements by investigating typical scenarios of data query under the Web environment.

Among all SQL statements, the SELECT statement, which retrieves the required information from the database is the most important. A statement specifying a SELECT statement can look like:

    .EXEC <Select statement>

Users should be allowed to specify their particular search conditions which can be submitted through the URL requests. A straightforward method is to introduce variables in the statement. For example,

    .EXEC SELECT sname, fnames, department
        from employee
        where department='$dept_name';

The user makes his/her required URL request by defining the parameter dept_name with the appropriate value. The above statements are referred to as execution statements.

If the search condition is complex, the execution statements may not be sufficient. For example, one may want to search for all employees with surname `Smith', another one may want all in department `Biology', and someone else may want those with surname `Smith' AND working in the department `Biology'. These should correspond to the following three WHERE clauses in SQL:

    .WHERE sname = 'Smith';
    .WHERE department = 'Biology';
    .WHERE sname = 'Smith'
        AND department = 'Biology';

This requirement is very common in a `form-based' interface, where the user is presented with a form with fields to be filled in. Some fields may be filled in while others may be left blank. The status of these fields (filled or blank) will specify the search conditions against the database.

A method to deal with this is to specify a set of WHERE clauses (generally referred to as query statements) each of which corresponds to a search condition unit. A search unit usually contains one or more variables which correspond to the form fields. The construction of the final condition will be based on whether the fields have been filled in or not, and how the WHERE clauses are connected (i.e. through AND or OR). In the above example, there may be the following statements:

    .WHERE sname = '$emp_sname';
    .WHERE department = '$dept_name';
    .WHEREOP AND;

Another widely used query pattern is the so-called master/detail query (see Section 2.1 End-users). In the Web environment, this can be implemented by making each data object of the list a hypertext link which leads to the respective details. For example, suppose that there is a (master) query asking for lecturers of grade A in the Department of Computer Science. The results can be a list of hypertext links (anchors) each of which displays the name of the lecturers. The details of any lecturer of interest can be obtained by clicking the corresponding hypertext link. Therefore, there is a need to specify a hypertext link template with the master query.

The above discussion raises the essential requirements for variables, SQL execution statements and query statements, as well as additional arguments associated with a query (e.g. hypertext link template). Starting from these observations, a functional design of the tool language is presented in the next section.

Functional design

Variables and substitutions

The need of variables is raised from the perspective of query processing, but in practice it is always desirable that variables can be applied at any place in a form.

A variable in the WebinTool is any identifier (a string of alphabetic, digit and underscore characters starting with an alphabetic character) or a braced identifier preceded by the dollar sign, e.g. $dbname, $email_addr, or by the at-sign `@', e.g. @surname, @employer. We call variables starting with a `$' input variables while `@' are column variables.

The basic source of input variable definitions is by parsing requesting parameters received from a Web client request and have the following form:

    para1=val&para2=val&para3=val&para1=val

Note that a variable may have a list of values associated with it, like para1 above which has two values.

In addition, it has been found very useful if an input variable can be defined and altered within the form. So a set of statements that definie and alter variable values has been designed.

Semantics of input variables 1

The basic semantic processing for input variables is substitution, i.e. replacing a variable with its (first) value. This is referred to as simple substitution.

In the Web environment, in order to support the HTML form interface two kinds of constructs are provided: one is for the INPUT tag with the input type `radio' or `checkbox' (radio substitution) and another is for the SELECT tag (select substitution).

1.: Radio substitution The format is
$variable:'string' or $variable:"string"

If the (first) value of $variable is equal to the string, then the expression will be replaced by CHECKED. Otherwise, it is discarded.
2.: Select Substitution
There are three kinds of Select substitutions

(a)
SELECT 1 format is
$variable;'string' or $variable;"string"

This is the same as the Radio substitution except the result of substitution is `SELECTED' or empty instead of `CHECKED' or empty.
(b)
SELECT 2 The format is
$selects/$values

This expression will be replaced based on the following rules. For each value, say val in the value list of $values, if val also occurs in the value list of $slects, then the following line is added:
<OPTION SELECTED> val
Otherwise, simply
<OPTION> val

(c)
SELECT 3
Ths format is
$selects/$values/$labels
This is similar to SELECT 2 except that the labels of each option will be a corresponding item from the value list of $labels instead of the default value from $values.

Semantics of column variables

The introduction of the column variable allows some operations to be carried out with one or more column values while the query results are being produced. For example, consider a hypertext link template

    /cgi-bin/script?objtype=dept&objid=@d_no

for a query

    select dept_name, d_no from department
    where location='Bldg-A'

The results will be a list of rows each of which is a hypertext link and displays the department names and department number. For instance,

    <A HREF=
      /cgi-bin/script?objtype=dept&objid=3
      > Mathematics  3 </A>
    <A HREF=
      /cgi-bin/script?objtype=dept&objid=9
      > Biology  9 </A>

Execution statements

An execution statement (with name EXEC) allows a SQL statement to be specified and executed. Any part of the statement can have input variabes which will be replaced with their values before being sent for execution. For the SQL SELECT statement, the results of the query are usually displayed in a tabular format. To cater for various requirements, other forms of the results output are allowed. The following forms have been designed:

Tabular output;
HTML table output;
Default record output;
User-defined record output;

An execution statement will generally take the syntax:

    EXEC outform := SQL-statement;

Query statements

A query statement in WebinTool is any clause of the SELECT statement, e.g. SELECT, FROM, WHERE, UNION, or SORT. The design of the query statements is to facilitate the form-based query interfaces. Variables can occur anywhere in a query statement. However, the semantics of variables are different from the general case described above and this is determined by the characteristics of the query statements.

Semantics of input variables 2

1.: Variables in SELECT, FROM, WHERE, UNION and SORT query statements For variables in any of the statements SELECT, FROM and SORT, if the variable has a non-null value then it is replaced with all of its values separated by commas, while in the case of WHERE and UNION a variable is replaced by its first value. If the variable does not exist or has a null value, then the clause is discarded.
2.: Variables in WHERE/AND, and WHERE/OR query statements
These two are an extension of WHERE statement which allows all the values of a variable to be used. The restriction for using these two statements is that they may, at most, have one variable. For each value of the variable, one sub-clause is produced by replacing the variable with that value. The final clause is constructed by connecting all sub-clauses by 'AND' or 'OR'. If the variable does not exist or has the NULL value, then the clause is discarded.

Argument statements

As discussed in Section 2.3, there is a need to specify arguments with a query, e.g. hypertext link template. The following arguments have been found essential:

1.

Database, Database User, Password: These arguments specify the database to be accessed, as well as the user identifier and password;

2.

Link:

The Link argument specifies a hypertext link template which is to be filled in and embedded in each result row of the query;

3.

Variable Columns:

The argument specifies a list of columns whose results will be stored in variables with the same name as the corresponding column. These variables can then be referenced in the form.

Control statements

To become a practical script language, some kinds of control statements must be included. Among them, two kinds are essential: the conditional statement and the loop statement. In WebinTool, these statements are designed as:

The IF statement:
```
	    .IF logical_expression_1
	        segment_1
	    .ELSEIF logical_expression_2
	        segment_2
	    .ELSE
	        segment_3
	    .ENDIF
```
A segment can contain any HTML lines and webin statements, including if-else-endif statements, i.e. nesting is allowed.
The FOR statement:
Two kinds of FOR statements are found essential. The first has the syntax:
```
	    .FOR lp_var = ar_expr_1 TO ar_expr_2
	        segment
	    .ENDFOR
```
which repeats executing the segment for the loop variable lp_var going through from the value of arithmetic expression arith_expr_1 to that of arith_expr_2.
The second has the syntax:
```
	    .FOR loop_var IN $input_variable
	        segment
	    .ENDFOR
```
which repeats executing the loop_body for each value of the variable <input variable>.
The loop body segment can contain any HTML lines and webin statements (including other FOR loops).

This section has outlined the description of some essential data manipulation statements. The complete specification of all statements can be found in [3].

System architecture and implementation

Using the WebinTool interface tool language, a user can write a form file which specifies Web interface displays and the database manipulations. Usually an application may have many relevant forms. All form files and supporting files can be organised into a directory and we call it the application home. There may be many applications on a server machine each of which performs a specific task.

An interpreter plays a central role within the WebinTool. It is invoked by an appropriate CGI script (usually a shell script) specifying the path and other parameters needed by the application. For a request from a web client, the interpreter

Locates the application and identifies the required form by parsing the requesting parameters and searching an index which serves as a map between key parameter values and the form files, and
produces the resulting HTML file by parsing and interpreting the form, including performing database operations if necessary.

The system architecture of the WebinTool is shown in Figure 1, where webin is the interpreter.