Smabbler’s Graph Language Model was used to label a training set for an ML model built for symptom-based disease identification. The case study uses a publicly available dataset, and an SVC (support vector classifier) performs the disease classification. The evaluation scores for most of the predicted labels (different types of hepatitis) were at or close to 1.
Building an ML model for symptom-based hepatitis detection takes as little as 2 hours: the model is trained on a dataset that Smabbler labels in 5 minutes, and it achieves 96-100% accuracy.
• Training dataset: 5.63k rows
• Test dataset: 1.41k rows
• Automated dataset labeling time: 5 minutes
• ML model setup: less than 2 hours
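
The classification step summarized above can be sketched roughly as follows. This is a minimal illustration, not the pipeline used in the case study: it assumes scikit-learn, and because the dataset itself is not reproduced here it generates synthetic binary symptom data. The helper names (`make_synthetic_symptom_data`, `train_and_evaluate`), the number of symptoms and classes, and the SVC hyperparameters are all assumptions made for the example. The row counts listed above are consistent with roughly an 80/20 train/test split of about 7.04k rows, so the sketch uses that ratio.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def make_synthetic_symptom_data(n_rows=7040, n_symptoms=20, n_classes=4, seed=0):
    """Generate hypothetical data: one binary column per symptom, one label per row.

    This stands in for the labeled dataset; it is NOT the case-study data.
    """
    rng = np.random.default_rng(seed)
    # Each class gets its own probability of showing each symptom.
    profiles = rng.uniform(0.05, 0.95, size=(n_classes, n_symptoms))
    y = rng.integers(0, n_classes, size=n_rows)
    X = (rng.random((n_rows, n_symptoms)) < profiles[y]).astype(int)
    return X, y


def train_and_evaluate(X, y, test_size=0.2, seed=0):
    """Split the data, fit an SVC, and return the model with basic metrics."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    # Kernel and C are assumed defaults, not settings taken from the case study.
    model = SVC(kernel="rbf", C=1.0)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "model": model,
        "n_train": len(X_train),
        "n_test": len(X_test),
        "accuracy": accuracy_score(y_test, y_pred),
        # Per-class precision, recall, and F1 as a printable text table.
        "report": classification_report(y_test, y_pred),
    }


if __name__ == "__main__":
    X, y = make_synthetic_symptom_data()
    result = train_and_evaluate(X, y)
    print(f"train rows: {result['n_train']}, test rows: {result['n_test']}")
    print(f"accuracy: {result['accuracy']:.3f}")
    print(result["report"])
```

The per-class report is the kind of output in which scores "at or close to 1" for individual labels would appear. Results on the synthetic data say nothing about the accuracy reported for the real dataset.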
