Table 1 shows the mean and standard deviation of the accuracy obtained in 4-fold cross-validation experiments under three settings: using no drug categories, adding drug categories as explained above, and replacing each drug occurrence with its corresponding category (replacing lantus with Diabetes_DRUG in the previous example). The results show that adding drug categories clearly improves accuracy only for Diabetes, Depression and Hypertension. For the remaining diseases, the best results are generally obtained without drug categories. It seems that these three diseases have a better-defined drug profile, while the other diseases share a large number of related drugs. Hence, we proceeded by adding drug categories only for those drugs associated with Diabetes, Depression and Hypertension according to our compiled lists.
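The two annotation variants compared here can be sketched roughly as follows. This is only an illustration, not the code used in the experiments: the function name `annotate_drugs`, the `mode` parameter and the small `DRUG_CATEGORIES` dictionary are assumptions made for the example (only the lantus / Diabetes_DRUG pair comes from the text; the compiled drug lists themselves are not reproduced here).

```python
import re

# Hypothetical stand-in for the compiled drug lists. Only "lantus" appears
# in the text; the other entries are illustrative. Keys must be lowercase.
DRUG_CATEGORIES = {
    "lantus": "Diabetes_DRUG",
    "metformin": "Diabetes_DRUG",
    "zoloft": "Depression_DRUG",
    "lisinopril": "Hypertension_DRUG",
}


def annotate_drugs(text, categories=DRUG_CATEGORIES, mode="add"):
    """Annotate drug mentions in `text` with their disease category.

    mode="add":     keep the drug and append the category after it.
    mode="replace": substitute the drug mention with the category.
    """
    if mode not in ("add", "replace"):
        raise ValueError("mode must be 'add' or 'replace'")
    if not categories:
        return text

    # Longest names first so a longer drug name wins over a shorter prefix.
    names = sorted(categories, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b",
        re.IGNORECASE,
    )

    def substitute(match):
        drug = match.group(0)
        category = categories[drug.lower()]
        return f"{drug} {category}" if mode == "add" else category

    return pattern.sub(substitute, text)


print(annotate_drugs("patient was started on lantus at night"))
print(annotate_drugs("patient was started on lantus at night", mode="replace"))
```

Restricting the dictionary to drugs for Diabetes, Depression and Hypertension, as in this sketch, corresponds to the selective setting described above; drugs for the other diseases are simply left untouched.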
Table 1: Average accuracy (standard deviation in parentheses) over 4-fold cross-validation runs with no drug categories added, with drug categories added after the drug occurrence in the text (e.g. lantus Diabetes_DRUG), and with the drug occurrence replaced by the corresponding category (e.g. replacing lantus occurrences with Diabetes_DRUG).
| Disease | No drug categories | Adding drug categories | Replacing drugs with category |
|---|---|---|---|
| Diabetes | 0.843 (0.031) | 0.876 (0.017) | 0.877 (0.018) |
| CHF | 0.849 (0.017) | 0.847 (0.021) | 0.853 (0.018) |
| Asthma | 0.931 (0.016) | 0.929 (0.013) | 0.930 (0.016) |
| CAD | 0.821 (0.005) | 0.807 (0.019) | 0.804 (0.012) |
| Depression | 0.864 (0.033) | 0.878 (0.035) | 0.876 (0.035) |
| Gallstones | 0.864 (0.017) | 0.867 (0.018) | 0.845 (0.020) |
| GERD | 0.895 (0.014) | 0.888 (0.017) | 0.893 (0.014) |
| Gout | 0.952 (0.005) | 0.946 (0.014) | 0.945 (0.007) |
| Hypercholesterolemia | 0.719 (0.042) | 0.722 (0.038) | 0.720 (0.046) |
| Hypertension | 0.807 (0.014) | 0.820 (0.022) | 0.809 (0.011) |
| Hypertriglyceridemia | 0.973 (0.015) | 0.971 (0.019) | 0.971 (0.019) |
| OA | 0.872 (0.021) | 0.874 (0.023) | 0.875 (0.024) |
| Obesity | 0.848 (0.018) | 0.847 (0.031) | 0.848 (0.023) |
| OSA | 0.926 (0.011) | 0.926 (0.008) | 0.924 (0.011) |
| PVD | 0.909 (0.014) | 0.912 (0.013) | 0.915 (0.012) |
| Venous Insufficiency | 0.977 (0.008) | 0.975 (0.006) | 0.975 (0.006) |
| AVG all diseases | 0.8715 | 0.874 | 0.8723 |
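As a rough check of which diseases benefit from added drug categories, the mean accuracies in Table 1 can be compared programmatically. The sketch below is an assumption-laden illustration: the helper `diseases_helped` and the `min_gain` threshold of 0.01 are hypothetical choices made for this example (no explicit threshold was used in the experiments), and the standard deviations are ignored.

```python
# Mean accuracies copied from Table 1:
# (no drug categories, adding drug categories, replacing drugs with category)
ACCURACY = {
    "Diabetes": (0.843, 0.876, 0.877),
    "CHF": (0.849, 0.847, 0.853),
    "Asthma": (0.931, 0.929, 0.930),
    "CAD": (0.821, 0.807, 0.804),
    "Depression": (0.864, 0.878, 0.876),
    "Gallstones": (0.864, 0.867, 0.845),
    "GERD": (0.895, 0.888, 0.893),
    "Gout": (0.952, 0.946, 0.945),
    "Hypercholesterolemia": (0.719, 0.722, 0.720),
    "Hypertension": (0.807, 0.820, 0.809),
    "Hypertriglyceridemia": (0.973, 0.971, 0.971),
    "OA": (0.872, 0.874, 0.875),
    "Obesity": (0.848, 0.847, 0.848),
    "OSA": (0.926, 0.926, 0.924),
    "PVD": (0.909, 0.912, 0.915),
    "Venous Insufficiency": (0.977, 0.975, 0.975),
}


def diseases_helped(accuracy, min_gain=0.01):
    """List diseases whose mean accuracy rises by at least `min_gain`
    when drug categories are added (hypothetical selection rule)."""
    return [
        disease
        for disease, (baseline, added, _replaced) in accuracy.items()
        if added - baseline >= min_gain
    ]


print(diseases_helped(ACCURACY))
```

With this threshold the selection coincides with the three diseases named in the text; for every other disease the change from adding categories is 0.003 or less, or negative.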
 
Carlos
2008-10-16