back to previous section up to contents on to next section

Critique of the Design of the Slipnet

some unresolved issues


The Problems Posed by Large Databases

In any database system problems are created when the data becomes too large fit into the active memory (RAM). While the ROD model provides a mechanism for addressing this problem, it introduces some others.

There is really only one solution to the problem of not having enough RAM, disk caching. That is, to answer a given query, we must determine which portions of the data we need to access, and to load it into memory if it is not already there. Similarly, when we need to make more space available, we need to determine which portions of RAM can be safely jettisoned.

Although we have chosen not to clutter our code with unimplemented "stubs", it is easy to see how a disk caching system would fit into our framework. We need only add a layer of abstraction between the query parser and the actual Slipnet. This layer would then determine whether or not a particular portion of the graph is currently in memory, and would access it in the appropriate fashion.

In part, the graphical structure of the Slipnet allows us to perform this "swapping" of data between disk and memory space in a fairly effective manner, since we can easily determine which nodes are "connected" to those we are interested in. In fact, a node's activation should provide a useful heuristic for determining precisely which nodes we are actually interested in.

Suppose that whenever a query is made, the activation levels of the "returned" nodes goes up. Activation then "spreads" from highly activated nodes to their neighbors, in manner that is determined in part by the length of the associated Link. Similarly, the activation level of a node might "decay" over time, so that only recently accessed nodes maintain high levels of activation. It should be possible, then, to determine which nodes are of interest to a user by examining activation levels. Hence we might "anticipate" future queries by loading into memory highly activated nodes before a user has actually requested them. Similarly, when more memory is needed, we might assume that the least activated nodes are no longer of interest to the user and expunge those first.

Two currently unresolved problems arise. The first is that some queries require an examination of the entire Slipnet, and so are still forced to load the entire graph, at least in a piecemeal fashion, to fully answer some queries. Moreover, the construction of our activation-spreading routine, as outline above, guarantees that we must examine the entire graph on every iteration.

The first problem is unsurmountable. We can only hope that users will restrict their queries on large databases in order to receive a response in a reasonable period of time. As for the second, one possible strategy is to use a "time radius", whereby those nodes currently in memory are "iterated" after every query, those adjacent to nodes in memory after, say, every other query, and so on, so that "time" travels more slowly for nodes far from the current focus. This doesn't really solve the problem, but at least greatly reduces its significance. (We can, in fact, arbitrarily reduce the significance of iterating the activation levels within the Slipnet by making such iterations arbitrarily infrequent. There is, of course, a corresponding loss in accuracy, and the utility of our heuristic will begin to suffer.)


The Representation of Infinite Domains

A different problem arises in the consideration of AI-based applications of the Slipnet.

An intriguing application of AI search engines is to mathematics and other formal logic systems. An entire class of problem domains is known as theorem proving (and the related task, theorem generation), largely for this reason.

Unfortunately, it is awkward to represent many of the problems in these area as a semantic network. Suppose, for example, we wish to encode the eight queens problem within our framework. One solution might be to create a node that corresponds to each of the possible board configurations, although this is quickly eliminated, since it requires 64*63*62*61*60*59*58*57 ~= 2 * 10^14 nodes. An alternative might be to encode the position of each queen as a node (so 64 nodes total), with links between "queen positions" that are "legal" (or perhaps "illegal", as that would yield a sparser graph). The require query would then be somewhat difficult to construct. An additional graph theoretic query that provides a near minimal graph coloring would provide an adequate solution to the problem, when expressed as the search for "independent" (not connected) nodes within the Slipnet. (Given the 64 node representation of the problem space, with edges between two nodes if one can attack the other, a "monochromatic" collection of 8 nodes would then correspond to a solution to the eight queens problem, since no edge could possibly exist between the selected nodes.)

For some problems a clever encoding of the information will not suffice. Suppose, for example, that one wanted to implement a number-theoretical theorem prover. Clearly it is unreasonable to expect to create a node representing each integer. Moreover, there doesn't seem to be any direct way to reduce the number of nodes required to represent any arbitrary integer. This problem remains unresolved in our current implementation. A possible solution is to create a new kind of node that encapuslates "variable" or "active" information. Our integers, for example, could be encoded (internally) as a Java integer, with the requisite extensions needed to handle relations such as "greater than" or "plus".


Footnote: The eight queens problem, simply stated, is to find a placement of eight queens on a standard eight-by-eight chessboard such that no queen can attack another. (Since this implies that there is exactly one queen in each row and column, it is related to the problem of constructing "Latin Squares" and similar group-theoretic notions.)

Footnote: A "proper coloring" of a graph is an assignment of "colors" to each node in the graph such that no link joins nodes of the same color. A "minimal coloring" is one that uses the fewest possible number of colors. We suggest looking for "near minimal" colorings since in general the problem is known to be NP complete. It is not difficult, however, to produce merely "good" colorings within a reasonable period of time.


Back to the top of this page.


back to previous section up to contents on to next section