Figure 1 shows how the average accuracy for the intuitive judgments evolves across the different reduced lexicons. Figure 2 shows the same evolution for the textual judgments. The classifier configuration used to produce these figures was not the optimal one (figures showing the results with the optimal configuration are still in preparation), but the results for the different lexicons can still be compared with each other. These preliminary results show that the best accuracy is obtained with very small reduced lexicons.
A detailed look at the optimal lexicons allows us to validate the earlier steps we took to enrich the lexicon. For example, the reduced lexicon containing the most informative tokens for textual judgments contains the drug categories we included (Diabetes_DRUG, Depression_DRUG and Hypertension_DRUG), the special gender tokens and also some negated tokens, despite the fact that 'N' judgments are very rare (for example, NO_CRI turns out to be very informative for Hypertriglyceridemia). The number of binary negated tokens increases significantly as the lexicon size grows.
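As an illustration of how a reduced lexicon of "most informative tokens" might be built, here is a minimal sketch. It assumes that informativeness is measured by information gain over binary token presence, which is our assumption and not something stated above. The function names (`entropy`, `information_gain`, `reduced_lexicon`) are hypothetical, and the four toy documents are invented for the example; only the token names come from the text.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, token):
    """Entropy reduction obtained by splitting on presence of `token`.

    `docs` is a list of token sets (binary presence features).
    """
    present = [y for d, y in zip(docs, labels) if token in d]
    absent = [y for d, y in zip(docs, labels) if token not in d]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def reduced_lexicon(docs, labels, size):
    """Return the `size` tokens with the highest information gain.

    Ties are broken alphabetically so the result is deterministic.
    """
    vocabulary = set().union(*docs)
    ranked = sorted(
        vocabulary,
        key=lambda t: (-information_gain(docs, labels, t), t),
    )
    return ranked[:size]


# Toy data, invented for illustration only (not taken from the experiments).
docs = [
    {"Diabetes_DRUG", "patient"},
    {"Diabetes_DRUG", "NO_CRI"},
    {"patient", "NO_CRI"},
    {"patient"},
]
labels = ["Y", "Y", "N", "N"]

top_tokens = reduced_lexicon(docs, labels, 2)
```

Sweeping `size` over a range of values and retraining the classifier on each resulting lexicon would give an accuracy-versus-lexicon-size curve of the kind the figures describe. A different ranking criterion (for example chi-square) could be swapped in by replacing `information_gain`.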
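[Figure 1: average accuracy for the intuitive judgments with the different reduced lexicons]
[Figure 2: average accuracy for the textual judgments with the different reduced lexicons]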
Carlos 2008-10-16