Fundamentals of Processing Overview Representation

Some Improvements to the Classification Model

This section of the tutorial is intended to consolidate the material learnt so far, give you more experience in editing and running COGENT models, and teach a little more about message passing and some of the complexities of rules. A number of modifications to the classification model are considered, and these are used to demonstrate further aspects of the representation language. Further familiarity with the environment should be obtained by performing the modifications and running the resultant models.

More on Research Programmes

Rather than destructively modifying the model that we have just developed, we are going to make modifications to copies of the first model. If we always follow this procedure, we should be able to reconstruct the motivation for particular aspects of any particular model by looking at the history behind that model. We are therefore finished with the current model, and must return to the research programme level in order to create the second model in our Categorisation research programme.

Close all windows associated with the model by pressing the Done button on the window containing the model's box/arrow diagram. Note that by default pressing this button will save any modifications you have made to the model.

Find the research programme history window. You may notice that the first model's name (e.g., Classification from LTM Knowledge) now appears below the blob corresponding to that model:

To create a new model based on the existing model, right-click on the blob corresponding to that model, and select Create... from the menu that appears. A copy of the original model will be created, and the history diagram will be updated to show this:

Within the history diagram time is represented as going from left to right: models to the right were created after models to the left. The new model (which we are about to modify) is thus the right-most version.

Using a Single LTM Buffer

The first improvement to the model we are going to consider involves the use of a single buffer to store long term knowledge. Presumably knowledge about vertebrates and invertebrates is not stored in functionally or structurally distinct stores. As such, the use of separate buffers for the two forms of information is inappropriate. A better model would allow the storage of information concerning vertebrates and invertebrates in a single functional store. In order to do this within COGENT, we need to use compositional representations: representations which are composed of multiple parts, and whose interpretation or meaning is a function of the interpretation or meaning of those parts. In particular, we need to represent facts such as "tigers are vertebrates" and "spiders are invertebrates" in a uniform manner.

COGENT's representation language (which is based on Prolog) allows this by the use of compound terms, such as vertebrate(tiger) and invertebrate(spider). In general, these terms comprise a functor (such as vertebrate or invertebrate), followed by an opening round bracket, a comma separated sequence of arguments (such as the single element sequence spider), and a closing round bracket.
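The structure of a compound term can be sketched outside COGENT. The following Python fragment is only an analogy, not part of COGENT: the class name Term and its fields are hypothetical names chosen for illustration. It shows a term as a functor plus a sequence of arguments, printed in the bracketed form described above.

```python
from typing import NamedTuple, Tuple


class Term(NamedTuple):
    """A compound term: a functor followed by a sequence of arguments."""
    functor: str
    args: Tuple[str, ...]

    def __str__(self) -> str:
        # functor, opening bracket, comma-separated arguments, closing bracket
        return f"{self.functor}({', '.join(self.args)})"


tiger = Term("vertebrate", ("tiger",))
spider = Term("invertebrate", ("spider",))

print(tiger)   # vertebrate(tiger)
print(spider)  # invertebrate(spider)
```

Because both facts share one shape (a functor and its arguments), they can sit side by side in a single store.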

The representation language thus allows us to represent all of the information concerning vertebrates and invertebrates in a single format:

Our second model is therefore arrived at by replacing the two buffers of our first model with a single LTM buffer containing information in this form:

Exercise: Open the copy of the original model and modify this second version of the classification model by deleting one buffer and altering the other buffer so that it looks like the above.

The resultant model should look like:

Note that the revised model's name should also be changed to distinguish it from the original model. This revision will appear on the research programme's main history canvas.

The modification of representations required in order to represent both vertebrate and invertebrate information in a single buffer is not sufficient to produce a working model. The rules which reference buffer information must also be altered.

Open Classification Rules in the modified model and look at the rules. If you deleted one buffer, you will notice that the rule whose condition previously matched that buffer now matches an unidentified buffer, indicated by ???. The rules may well be displayed as:

The ??? indicates that the buffer which the first rule is attempting to match against is invalid or unknown. Edit the first rule by double-clicking on it and altering the matched-against buffer to LTM. Note that the second rule has automatically altered the buffer it references to LTM. This is because, although we deleted the first buffer, the second buffer still exists, but with a different name: we altered the name of the second buffer to LTM, and references to the second buffer have been updated accordingly. Within COGENT, boxes can be renamed and references to those boxes will be updated automatically.

In the present case, this automatic renaming is not sufficient. The rules need to be edited so as to conform to the altered representations used in the LTM buffer. Change the rules in Classification Rules to the following, being sure to use the case of letters indicated:

The mixture of upper- and lower-case in these rules needs some explaining. Within the representation language variables are indicated by names beginning with an upper-case letter. In the previous rules, Animal is a variable. The first rule will fire if the value of Animal which triggers the rule (e.g., fox) is listed as a vertebrate in LTM (i.e., if LTM contains vertebrate(fox)). Similarly, the second rule will fire if the value of Animal which triggers the rule (e.g., spider) is listed as an invertebrate in LTM (i.e., if LTM contains invertebrate(spider)). The use of variables within compound terms here is very important, and should be well understood before moving on.
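How a variable inside a compound term is matched can be sketched in Python. This is an analogy only, assuming a much simplified matcher; the helper names is_variable, match and classify are hypothetical, and the LTM contents are just the two examples mentioned above. Terms are written as (functor, arguments) pairs, and a name starting with an upper-case letter is treated as a variable.

```python
def is_variable(name):
    """Names beginning with an upper-case letter are variables."""
    return name[:1].isupper()


def match(pattern, fact, bindings):
    """Return the extended bindings if pattern matches fact, else None."""
    p_functor, p_args = pattern
    f_functor, f_args = fact
    if p_functor != f_functor or len(p_args) != len(f_args):
        return None
    result = dict(bindings)
    for p, f in zip(p_args, f_args):
        if is_variable(p):
            # A variable must keep any value it already has.
            if result.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return result


# Illustrative long-term memory: vertebrate(fox), invertebrate(spider).
LTM = [("vertebrate", ("fox",)), ("invertebrate", ("spider",))]


def classify(animal):
    """Apply one rule per functor; the trigger fixes the value of Animal."""
    trigger = {"Animal": animal}
    outputs = []
    for functor in ("vertebrate", "invertebrate"):
        pattern = (functor, ("Animal",))
        if any(match(pattern, fact, trigger) is not None for fact in LTM):
            outputs.append(functor)
    return outputs


print(classify("fox"))     # ['vertebrate']
print(classify("spider"))  # ['invertebrate']
```

Since Animal is already bound by the triggering message, the pattern vertebrate(Animal) only matches the LTM element that mentions that particular animal.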

Exercise: Complete the edits and run the model to ensure that it produces the expected output.

The attentive student might ask if the above rules could be merged into a single rule by the use of a further variable, viz:

IF:   Type(Animal) is in LTM
THEN: send Type to Output

In this putative rule, the variable Type is used to match against the functor of an LTM representation. This form of rule is not valid. Within Prolog/COGENT, a variable cannot serve as the functor of a compound term. The "term" Type(Animal) is therefore syntactically invalid, and this approach to merging the rules cannot be adopted.

Nevertheless, the two rules can be merged via the use of compound representations:

Exercise: Alter the representation of long term knowledge again, so that, for example, vertebrate(fox) is represented as animal(fox, vertebrate). Now merge the rules in Classification Rules so that classification can be effected by a single rule. The rule might look like:

IF:   animal(Animal, Type) is in LTM
THEN: send Type to Output

Note how in this rule the variable Type, which may be equated with either vertebrate or invertebrate depending on the value of Animal, is used as an argument in a syntactically valid compound term.
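The effect of the merged rule can be sketched in Python as an analogy. The function name classify is hypothetical and the LTM contents are illustrative. Each long-term fact animal(Name, Type) is held as a (functor, arguments) pair, and the single rule reads the type from the second argument rather than from the functor.

```python
# Illustrative long-term memory in the revised representation.
LTM = [
    ("animal", ("fox", "vertebrate")),
    ("animal", ("spider", "invertebrate")),
]


def classify(animal):
    """One rule: for every animal(Animal, Type) in LTM, output Type."""
    return [
        args[1]
        for functor, args in LTM
        if functor == "animal" and len(args) == 2 and args[0] == animal
    ]


print(classify("fox"))     # ['vertebrate']
print(classify("spider"))  # ['invertebrate']
```

The functor is now the constant animal, so the part that varies (the type) sits in an argument position, where a variable is permitted.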

Exercise: Complete the edits and run the model to ensure that it produces the expected output.

Handling Multiple Classifications Simultaneously

Recall that the data source Test Cases feeds one animal type to Classification Rules on each processing cycle. In general, data sources may feed any number of messages (including none) to any number of boxes on each cycle.

Open Test Cases by double-clicking on its icon and edit any input message by double-clicking on the message. The following window shows the data source editor open on one such item:

Note that the Add Element menu button is active. Right-click on this button and add a second element to the set of messages to be added on the given cycle. Fill in the appropriate text field with an animal name:

Close the element editor and note how the Input Data canvas of Test Cases changes to reflect the sending of two messages on the same cycle.

It is also possible to create empty messages, so that no messages are sent from a source on a given cycle. Right-click either on an existing message or between two existing messages, and select Create->Input Data from the menu that appears. An empty message editor window will appear. If you close this window without creating any messages, no messages will be sent on the corresponding processing cycle. The following diagram shows a set of possible input data for Test Cases.
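One way to picture such input data is as a list with one entry per cycle, where each entry is itself a (possibly empty) list of messages. The Python sketch below is only an illustration: the animal names, the function name messages_per_cycle and the numbering of cycles from 1 are assumptions, not COGENT's own data format.

```python
# Each inner list holds the messages sent on one cycle; [] sends nothing.
# The animal names are illustrative, not the tutorial's actual test data.
test_cases = [["fox"], ["spider", "tiger"], [], ["worm"]]


def messages_per_cycle(schedule):
    """Yield (cycle number, messages) pairs, numbering cycles from 1."""
    for cycle, messages in enumerate(schedule, start=1):
        yield cycle, list(messages)


for cycle, messages in messages_per_cycle(test_cases):
    print(f"cycle {cycle}: {len(messages)} message(s) {messages}")
```

In this sketch the second cycle delivers two animals at once and the third cycle delivers none.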

Exercise: Complete the edits and run the model to ensure that it produces the expected output.

You may notice one significant difficulty with the model as it currently exists: there is no indication in the output of the animal being classified, and on cycles where multiple animals are being classified it is not possible to tell which input animals map onto which output animal types.

This problem may be rectified by building compound terms for output.

Exercise: Alter the classification rule such that the output message takes the form is(Animal, Type). Run the resultant model and examine the contents of Output.

You may notice that the output of the model after the above modification is performed is not is(spider, invertebrate) (as might be expected), but is in fact spider is invertebrate. This (more readable) output is due to the fact that is is defined in COGENT's representation language as an infix operator. This means that it has a similar status to arithmetic operators such as "+" and "-", and COGENT thus prefers to print the is (which is actually a binary functor) between its two arguments.
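The printing convention can be imitated in a few lines of Python. This is a sketch of the idea only, not COGENT's actual printer: the function name show and the small operator set are assumptions made for illustration.

```python
# A hypothetical, deliberately tiny set of infix operators.
INFIX_OPERATORS = {"is", "+", "-"}


def show(functor, args):
    """Print binary infix operators between their arguments."""
    if functor in INFIX_OPERATORS and len(args) == 2:
        return f"{args[0]} {functor} {args[1]}"
    # Every other term is printed in the usual bracketed form.
    return f"{functor}({', '.join(args)})"


print(show("is", ("spider", "invertebrate")))      # spider is invertebrate
print(show("animal", ("spider", "invertebrate")))  # animal(spider, invertebrate)
```

The term itself is the same in both styles; only the way it is displayed changes.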

Handling Unknown Inputs

An obvious inadequacy of our current model concerns its behaviour in situations where information is either incomplete or inconsistent. Suppose we asked the current model to classify an animal for which it has no information. None of our classification rules will fire and the model will therefore fail to produce an answer. In this situation a more appropriate behaviour would be to produce an answer such as unknown. The reverse case, in which the model thinks an animal is both a vertebrate and an invertebrate, also yields inappropriate behaviour. In such a case the model will output both vertebrate and invertebrate as responses.

To avoid the above behaviour we must modify our rules so that they explicitly check for the absence of information. We require rules such as:

Notice the use of negation (expressed as not) in the above rules. We've added a second condition to each of the first two rules so that they explicitly check that an animal thought to be a vertebrate is not also thought to be an invertebrate (and vice versa). In a similar way, the third rule checks for animals which are thought to be both vertebrates and invertebrates, and the fourth rule checks for animals which are not thought to be either.
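The four cases just described can be sketched in Python as an analogy to the negated conditions. Several details here are assumptions: long-term knowledge is held as (name, type) pairs mirroring animal(Name, Type), the function name classify is hypothetical, and the answer inconsistent for the third case is an illustrative choice, since only unknown is named in the text.

```python
def classify(animal, ltm):
    """Return the responses produced by the four rules for one animal."""
    is_vert = (animal, "vertebrate") in ltm
    is_invert = (animal, "invertebrate") in ltm
    outputs = []
    if is_vert and not is_invert:      # rule 1: vertebrate only
        outputs.append("vertebrate")
    if is_invert and not is_vert:      # rule 2: invertebrate only
        outputs.append("invertebrate")
    if is_vert and is_invert:          # rule 3: conflicting knowledge
        outputs.append("inconsistent")  # hypothetical label
    if not is_vert and not is_invert:  # rule 4: no knowledge at all
        outputs.append("unknown")
    return outputs


# Illustrative knowledge, including a deliberately contradictory entry.
ltm = {
    ("fox", "vertebrate"),
    ("spider", "invertebrate"),
    ("slug", "vertebrate"),
    ("slug", "invertebrate"),
}

print(classify("fox", ltm))   # ['vertebrate']
print(classify("slug", ltm))  # ['inconsistent']
print(classify("dodo", ltm))  # ['unknown']
```

Exactly one of the four conditions holds for any animal, so each input now produces exactly one response.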

Negation is introduced to the conditions of COGENT rules by the use of a qualifier. In general, the conditions of rules are implicitly universally quantified. This means that, unless otherwise stated, all successful bindings of elements to variables will cause each rule to fire. Thus, a single rule may fire multiple times on a single cycle if there are multiple reasons for that rule to fire. This default behaviour may be altered by the use of qualifiers on conditions. COGENT currently supports four qualifiers: once, exists, unique and not. A condition may be qualified by clicking on the menu button immediately to the left of the start of the condition and selecting Add qualifier from the menu that appears:

A condition qualified by not succeeds just in case the embedded condition does not succeed. A condition qualified by exists succeeds just in case there is at least one successful solution of the embedded condition, but variables within the embedded condition are not bound. A condition qualified by once succeeds just in case there is at least one successful solution of the embedded condition, in which case any variables in the embedded condition are bound to the first solution found. A condition qualified by unique succeeds just in case there is exactly one solution to the embedded condition, in which case variables within the embedded condition are bound to that solution. In the current case, it is clear that the qualifier we require is not.
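The difference between the qualifiers can be made concrete with a small Python sketch. It is an analogy under stated assumptions, not COGENT's implementation: the embedded condition is represented simply by the list of variable bindings that satisfy it, in the order they are found, and the function names are hypothetical. Each function returns the list of binding sets under which the qualified condition succeeds; an empty list means that it fails.

```python
def unqualified(solutions):
    """Default behaviour: the rule fires once per successful binding."""
    return list(solutions)


def q_not(solutions):
    """Succeeds (binding nothing) only if there is no solution."""
    return [{}] if not solutions else []


def q_exists(solutions):
    """Succeeds if there is at least one solution; binds no variables."""
    return [{}] if solutions else []


def q_once(solutions):
    """Succeeds with the first solution found, if any."""
    return list(solutions[:1])


def q_unique(solutions):
    """Succeeds only if there is exactly one solution."""
    return list(solutions) if len(solutions) == 1 else []


# An animal listed under two types gives two solutions for Type.
solutions = [{"Type": "vertebrate"}, {"Type": "invertebrate"}]

print(len(unqualified(solutions)))  # 2
print(q_once(solutions))            # [{'Type': 'vertebrate'}]
print(q_exists(solutions))          # [{}]
print(q_unique(solutions))          # []
print(q_not(solutions))             # []
```

With no solutions at all, only q_not succeeds, which is why not is the qualifier needed to test for missing knowledge.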

You will notice that a fifth "qualifier", trace, appears under the Add qualifier menu. Strictly speaking, trace is not a qualifier, in that it has no effect on the embedded condition. Its effect is to simplify the process of tracing condition evaluation if a model does not work as intended. When completing the exercises that follow, you might like to qualify some conditions with trace and observe the output that results in the Message Log region of the relevant process's window.

Exercise: Create a new classification model in which the rules are negated in the way suggested above. Experiment with the rule editor by adding and deleting qualifiers. Also run the resultant model with different inputs to ensure that the rules work as intended.
