We present an experiment and a computational model addressing the question: What is the meaning of probability terms (such as ``probable'' and ``possible'') as used in everyday language? Historically, such terms have often been taken to convey intervals of confidence or probability on some analogue scale, such as a probability scale or fuzzy membership functions. An alternative possibility is based on the view that human reasoning under uncertainty involves a process of logical argumentation, in which qualitative arguments for or against a proposition (reasons to believe or to doubt it) are at least as important as, and possibly more important than, representation in terms of quantitative values. On this view, probability words may convey qualitative structures of such arguments rather than numerical degrees of belief.
We present an experimental study of the relationship between probability words and linguistic statements about the reasons to believe or not to believe propositions. In a balanced design, subjects were presented with two sentences, one using a probability word (such as ``it is possible that P'') and the other phrased in terms of reasons to believe (such as ``there are more reasons to believe P than to doubt it''). Subjects were asked to judge whether the second sentence was an acceptable paraphrase of the first. The results show that for certain term/paraphrase pairs there was a high degree of consensus about equivalence in meaning, whereas for others the subjects were divided, or decided that the distinction was unclear. In some cases the order of presentation of the two sentences was also important.
We have developed an information-processing model for this judgement task, which has been implemented in COGENT. On this model the decision process involves two stages. In the first stage, a set of internal ``mental'' models is constructed, each consistent with the first phrase presented. In the second stage, the compatibility of the second phrase with the first is assessed by testing the set of internal models against the second phrase. The model was run using two different representations for the meanings of the phrases used in the experiment, one based on probability intervals and one based directly on argument patterns. Both versions of the model give a good account of the data, both in terms of which paraphrases are judged to be ``correct'' (including the effect of order of presentation) and in terms of the relative proportions of subjects agreeing or disagreeing. The version based on argumentation gives a slightly better fit than the version which uses the probabilistic representation.
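The two-stage decision process can be sketched in code. The sketch below uses the probability-interval representation; the specific interval endpoints, term names, and function names are illustrative assumptions for exposition, not the actual COGENT implementation or the intervals used in the experiment.

```python
# Illustrative sketch of the two-stage paraphrase-judgement model,
# using the probability-interval representation. All intervals below
# are hypothetical placeholders, not the experiment's actual values.

# Hypothetical interval semantics for probability terms and for
# reasons-to-believe paraphrases.
TERM_INTERVALS = {
    "possible": (0.01, 1.00),
    "probable": (0.50, 1.00),
}
PARAPHRASE_INTERVALS = {
    "more reasons to believe than to doubt": (0.50, 1.00),
}

def build_models(interval, n=11):
    """Stage 1: construct a set of internal 'mental' models (here,
    point probabilities) consistent with the first phrase's interval."""
    lo, hi = interval
    return [lo + i * (hi - lo) / (n - 1) for i in range(n)]

def judge_paraphrase(term, paraphrase):
    """Stage 2: test each model built from the first phrase against the
    second phrase; accept the paraphrase only if every model is
    compatible with the second phrase's interval."""
    models = build_models(TERM_INTERVALS[term])
    lo, hi = PARAPHRASE_INTERVALS[paraphrase]
    return all(lo <= p <= hi for p in models)
```

Because Stage 1 builds models only from the phrase presented first, swapping which sentence is presented first can change the model set being tested, which is one way a scheme of this kind can reproduce an effect of presentation order.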