Ex Parte Gosby et al - Page 6

               Appeal 2009-3941                                                                            
               Application 10/334,370                                                                      

                      9. The next step in the process is determining a score for each                      
               word stem and word stem sequence.  This is carried out on a statistical basis.              
               One example of a calculation of the likelihood or probability of occurrence                 
               of each of the stem words, doubles, and triples will now be described.  It                  
               should be noted that, while a mathematical probability is given in the                      
               following examples, this need not be the case in practice (¶ 0098).                         
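
               As an illustration only, the statistical scoring of word stems,
               doubles, and triples described in finding 9 might be sketched as
               follows.  The sketch assumes the stems have already been extracted
               and uses relative frequency as the score.  The function name
               `ngram_scores` and the sample stems are hypothetical and do not
               come from the reference, which states that a mathematical
               probability need not be used in practice.

```python
from collections import Counter


def ngram_scores(stems, max_n=3):
    """Return {n: {ngram: relative frequency}} for n = 1..max_n.

    n = 1 covers single word stems, n = 2 doubles, n = 3 triples.
    Relative frequency is an assumed scoring choice, not the reference's.
    """
    scores = {}
    for n in range(1, max_n + 1):
        # Slide a window of width n over the stem sequence.
        grams = [tuple(stems[i:i + n]) for i in range(len(stems) - n + 1)]
        counts = Counter(grams)
        total = len(grams)
        scores[n] = {g: c / total for g, c in counts.items()} if total else {}
    return scores


# Hypothetical stem sequence for a short training text.
stems = ["patent", "claim", "patent", "claim", "text"]
scores = ngram_scores(stems)
```

               Here `scores[1]` holds the single-stem scores, `scores[2]` the
               doubles, and `scores[3]` the triples.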
                      10. The classification system processes texts in the same way as the                 
               training texts to identify word stems and their count, which are determined                 
               by a score (¶ 0112).                                                                        
                      11. For each axis, the probability of the new text belonging to each                 
               group on the axis is calculated (¶ 0125).  This relates the probability of the              
               text being allocated to a particular group on each axis on the basis of the                 
               training data and the text being classified.  This is performed by multiplying              
               (for every word) the probabilities of that word occurring in a document that                
               is allocated to that group (based on the training data) (¶ 0126).                           
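
               The multiplication described in finding 11 can be sketched as
               below, purely for illustration.  The per-word probabilities, the
               group names, the `floor` value used for words absent from the
               training data, and the final normalization step are all
               assumptions; the reference as summarized here specifies only that
               the per-word probabilities are multiplied for each group.

```python
import math


def group_probabilities(words, word_probs_by_group, floor=1e-6):
    """Score each group on one axis by multiplying per-word probabilities.

    word_probs_by_group maps group -> {word: P(word | group)} as estimated
    from training data. `floor` (assumed) stands in for unseen words so a
    single missing word does not force the product to zero.
    """
    raw = {}
    for group, probs in word_probs_by_group.items():
        # Multiply, for every word in the text, that word's probability
        # of occurring in a document allocated to this group.
        raw[group] = math.prod(probs.get(w, floor) for w in words)
    total = sum(raw.values())
    # Normalizing so the scores sum to 1 is an added assumption.
    return {g: (p / total if total else 0.0) for g, p in raw.items()}


# Hypothetical training-derived probabilities for two groups on one axis.
training = {
    "group_a": {"good": 0.5, "bad": 0.1},
    "group_b": {"good": 0.1, "bad": 0.5},
}
result = group_probabilities(["good", "good", "bad"], training)
```

               For long texts, a practical implementation would usually sum
               logarithms rather than multiply raw probabilities, to avoid
               numerical underflow.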
                      12. Having determined the differences using the split-merge-
               compare algorithm for the original training data, the classifications and word
               stem data for texts that were determined to give scores of high confidence                  
               are added to the original training data to provide modified training data,                  
               which is compared with the differences generated for the original training                  
               data (¶ 0157).                                                                              
                      13. Brown discloses different methods for comparison between
               scores.  Figure 13 illustrates the hierarchical structure of a
               classification tree.  In this embodiment, the qualities or axes
               have extreme values indicating how much the document is concerned with a                    




Last modified: September 9, 2013