sequences together with a value on one or more axes to enable classification of subsequently-analyzed documents that contain the same words or combinations of words (¶ 0051).

6. Thus, the automated classification process operates to determine scores for axes for documents based on extreme words and their synonyms and antonyms that are determined on an iterative basis. This avoids human subjective input that may give inaccurate retrieval results (¶ 0064).

7. The result of the classification process is a series of scores (i.e., one on each axis) for each of the training texts. The output is illustrated schematically in Figure 5. Associated with each Training Text (illustrated by a dotted line) is a table or Score Table ST. The Score Table shown comprises two columns, namely an axis number and a score for each axis. Well-known memory management techniques can be used to store the information efficiently. For example, a document number could simply be followed by n scores in a data array, thereby eliminating the storage of the axis identification numbers (¶ 0065).

8. Brown generates a word stem and word stem sequences that are stored in association with the appropriate group. Using the example of the Happy-Sad axis, the stem “happi” will be expected to occur most frequently in group G0 of this axis. Thus, when this word stem “happi” is found in a new text, the training data can be used to provide an indication that the document should be placed in one of the groups G0 on the Happy-Sad axis (¶ 0093-0097).
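For illustration only, the sketch below shows one way the structures described in findings 7 and 8 could be represented: the compact score storage in which a document number is simply followed by n axis scores in a data array, and the association of a word stem such as “happi” with group G0 on the Happy-Sad axis. This is not Brown's implementation; all names (AXES, pack_scores, stem_groups, classify_stem) and the axes other than Happy-Sad are hypothetical.

```python
# Illustrative sketch of the data structures described in findings 7 and 8;
# names and axis labels other than Happy-Sad are assumptions, not Brown's code.

AXES = ["Happy-Sad", "Axis-2", "Axis-3"]  # n = 3 axes (hypothetical)

def pack_scores(doc_number, scores):
    """Finding 7: store a record as the document number followed by its n
    axis scores, so the axis identification numbers need not be stored."""
    assert len(scores) == len(AXES)
    return [doc_number, *scores]

def unpack_scores(record):
    """Recover a per-axis Score Table (axis -> score) from a packed record."""
    doc_number, scores = record[0], record[1:]
    return doc_number, dict(zip(AXES, scores))

# Finding 8: word stems from the training texts are stored in association
# with the group in which they occur most frequently, e.g. "happi" with
# group G0 on the Happy-Sad axis.
stem_groups = {
    "happi": ("Happy-Sad", "G0"),
}

def classify_stem(stem):
    """If a stem found in a new text is known from training, return the
    (axis, group) placement it suggests; otherwise None."""
    return stem_groups.get(stem)

if __name__ == "__main__":
    record = pack_scores(42, [0.9, 0.2, 0.5])   # doc 42, one score per axis
    print(unpack_scores(record))                # (42, {'Happy-Sad': 0.9, ...})
    print(classify_stem("happi"))               # ('Happy-Sad', 'G0')
```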