Current benchmark techniques for speech-based depression detection rely on acoustic speech parameters collected from large sets of representative speech recordings. This study investigates, for the first time, depression detection based on higher order influence model (HOIM) coefficients and emotional transition parameters derived from a relatively small set of conversational speech recordings: 63 parent-adolescent conversations, each 20 minutes long. The adolescents included 29 (24 female and 5 male) individuals diagnosed with major depressive disorder and 34 (24 female and 8 male) healthy individuals. The mental state of the parents was not assessed. The model-based depression diagnosis was compared with benchmark techniques based on acoustic speech parameters, namely mel frequency cepstral coefficients (MFCC) and the Teager energy operator (TEO). Classification into depressed or non-depressed categories was performed using a Gaussian mixture model (GMM) for the acoustic parameters and a support vector machine (SVM) for the HOIM features. The model-based technique achieved the highest average classification accuracy, 94% for the HOIM of order 4, whereas the best benchmark techniques scored 70% for the optimized MFCCs and 71% for the optimized TEO features.