Solution

AI System Research & Development

AI System Structure

The AI system consists of three models: (a) a facial analysis AI model, (b) a voice transcription AI model, and (c) an audio feature extraction AI model. The results of each analysis will be compiled against the Hamilton Rating Scale for Depression (HAM-D) to give the final result on depression severity. The system overview flowchart is shown below.

Facial Analysis AI Model

FaceMesh is a machine learning model for detecting key facial features from images, published by Google in 2019. While a typical face keypoint detector computes 68 two-dimensional points (x, y), FaceMesh returns 468 three-dimensional points (x, y, z).
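To make the difference in data size concrete, the following is a minimal sketch of how one frame of landmarks could be held and flattened into a feature vector. The `Landmark` class and `flatten_landmarks` function are hypothetical names used for illustration only; they are not part of FaceMesh or of this project's code.

```python
from dataclasses import dataclass
from typing import List

# Number of landmarks per face, as quoted in the text above.
NUM_FACEMESH_LANDMARKS = 468


@dataclass(frozen=True)
class Landmark:
    """One 3D facial landmark (hypothetical container for illustration)."""
    x: float
    y: float
    z: float


def flatten_landmarks(landmarks: List[Landmark]) -> List[float]:
    """Flatten one frame of landmarks into [x0, y0, z0, x1, y1, z1, ...]."""
    if len(landmarks) != NUM_FACEMESH_LANDMARKS:
        raise ValueError(
            f"expected {NUM_FACEMESH_LANDMARKS} landmarks, got {len(landmarks)}"
        )
    flat: List[float] = []
    for point in landmarks:
        flat.extend((point.x, point.y, point.z))
    return flat


# Dummy frame: every landmark at the origin.
frame = [Landmark(0.0, 0.0, 0.0) for _ in range(NUM_FACEMESH_LANDMARKS)]
features = flatten_landmarks(frame)
print(len(features))  # 468 * 3 = 1404 values, versus 68 * 2 = 136 for a 68-point detector
```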

ARCore, Google's augmented reality SDK, used through the Unity engine, will detect the face mesh. Only the mesh is sent for analysis, so the data is anonymous and the user's privacy is protected.

The DeepFaceLIFT model, a convolutional neural network system, will be used as the deep learning model for analyzing the facial data. It translates facial reactions into emotions.

The facial analysis AI model flowchart is shown below.

Voice Transcription AI Model

A Thai speech recognition model, developed by the Faculty of Engineering, Chulalongkorn University, will be used to transcribe voice into text. This model is based on NVIDIA's Conformer Connectionist Temporal Classification (CTC) model and a contextualized CTC model.

The text will then be analyzed with several Natural Language Processing (NLP) techniques, including Term Frequency–Inverse Document Frequency (TF-IDF) features, logistic regression, and WangchanBERTa.
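As an illustration of the TF-IDF step, the sketch below computes one common TF-IDF variant (term frequency multiplied by log(N / document frequency)) in plain Python. It is not the project's implementation: the `tf_idf` function is a hypothetical helper, the tokens are English placeholders, and the input is assumed to be already tokenized, since Thai text would first need a word segmenter.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} dict per document.

    weight = (count of term in doc / doc length) * log(N / docs containing term)
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term appears.
    doc_freq: Counter = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))

    weights: List[Dict[str, float]] = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Placeholder transcripts, already split into tokens.
transcripts = [
    ["sad", "tired", "sad"],
    ["happy", "tired"],
]
scores = tf_idf(transcripts)
```

A term that appears in every transcript ("tired" here) receives a weight of zero, while terms specific to one transcript receive positive weights. Vectors like these could then serve as input features for a classifier such as logistic regression.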

The voice transcription AI model flowchart is shown below.

Audio Feature Extraction AI Model

A Thai Speech Emotion Recognition (SER) model, co-developed by one of our members, will be used to extract audio features from the user's voice.

This model can recognize 5 emotions, including angry, sad, happy, and frustrated, with an accuracy of 70%.

The audio feature extraction AI model flowchart is shown below.

Clinical Trial Research

This research will collect data from a sample population to improve the reliability and validity of the AI model, using the Hamilton Rating Scale for Depression (HAM-D) as the benchmark.

Research Tools

Personal information questionnaire

The questionnaire collects the participant's personal information: age, gender, domicile, education, religion, marital status, number of children, income, physical medical history, and mental health history.

Hamilton Rating Scale for Depression (HAM-D), Thai language version

This Thai version of the HAM-D, developed by Lotrakul et al., has good internal reliability and concurrent validity (Cronbach's alpha = 0.858; r = 0.72) [16]. The scale has 5 levels of depression severity: Not depressed (0–6 points), Mild (7–12 points), Moderate (13–17 points), Severe (18–29 points), and Very severe (30 points or more). The cut-off point for depression is 13 points.
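The severity bands and the cut-off can be written as a small lookup, sketched below. The function names are hypothetical and used for illustration only. The top band is taken to start at 30 so that every score falls in exactly one band.

```python
# Cut-off score for depression, as stated in the text above.
DEPRESSION_CUTOFF = 13


def hamd_severity(score: int) -> str:
    """Map a total HAM-D score to the severity band described above."""
    if score < 0:
        raise ValueError("HAM-D score cannot be negative")
    if score <= 6:
        return "Not depressed"
    if score <= 12:
        return "Mild"
    if score <= 17:
        return "Moderate"
    if score <= 29:
        return "Severe"
    return "Very severe"  # 30 points and above


def meets_depression_cutoff(score: int) -> bool:
    """True when the score is at or above the 13-point cut-off."""
    return score >= DEPRESSION_CUTOFF


for example_score in (5, 12, 13, 29, 30):
    print(example_score, hamd_severity(example_score), meets_depression_cutoff(example_score))
```

The cut-off of 13 coincides with the lower edge of the Moderate band, so a participant in the Mild band is not counted as depressed under this rule.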

Depression data collection software

The software will be developed to assist with the interview process. Running on an Apple iPad, it will interact with the participants, prompting them to answer questions and tell their story while it records video and audio through the built-in front camera and microphone. The software contains a terms of use page, the personal information questionnaire page, the Hamilton Rating Scale for Depression (HAM-D) Thai language version (17 items) page, and an interview question about daily life that addresses happy and sad moments in the last two weeks.

Participant selection criteria

Inclusion criteria

  • Aged 18–60 years

  • Fluent in reading and communicating in the Thai language

For depression participants (target population)

  • Score 13 points or more on the HAM-D test

For non-depression participants (control population)

  • Have no medical record of depression

  • Score less than 13 points on the HAM-D test

Exclusion criteria

  • Have unstable medical conditions that affect communication skills

  • Have a disease that affects facial expression, such as Bell's palsy, Parkinson's disease, or facial palsy

  • Have a mental disorder, such as a schizophrenia spectrum disorder, anxiety disorder, obsessive-compulsive disorder, dissociative disorder, psychosomatic disorder, or feeding and eating disorder
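The selection criteria can be summarized as a screening function, sketched below. This is an illustration rather than the study's actual procedure. It assumes, in line with the 13-point HAM-D cut-off, that the depression group scores 13 or higher and the control group scores below 13. The `Participant` fields and the `assign_group` function are hypothetical names, and all exclusion criteria are collapsed into a single flag.

```python
from dataclasses import dataclass
from typing import Optional

HAMD_CUTOFF = 13  # assumed group boundary, taken from the HAM-D cut-off


@dataclass
class Participant:
    age: int
    fluent_in_thai: bool
    hamd_score: int
    has_depression_record: bool
    has_exclusion_condition: bool  # any of the exclusion criteria applies


def assign_group(p: Participant) -> Optional[str]:
    """Return "depression", "control", or None if the person is not eligible."""
    # Inclusion criteria shared by both groups.
    if not (18 <= p.age <= 60) or not p.fluent_in_thai:
        return None
    # Exclusion criteria.
    if p.has_exclusion_condition:
        return None
    # Group-specific criteria.
    if p.hamd_score >= HAMD_CUTOFF:
        return "depression"
    if not p.has_depression_record:
        return "control"
    # Below the cut-off but with a depression record: fits neither group.
    return None


candidate = Participant(age=35, fluent_in_thai=True, hamd_score=15,
                        has_depression_record=True, has_exclusion_condition=False)
print(assign_group(candidate))
```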

Population sample size

This research requires both depression and non-depression population groups. According to the literature review, the proper sample size is 50 people for each population group, or 100 people in total.

Data collection experiment

The research proposal will be submitted to the Ethics Committee of the Institutional Review Board (IRB), Faculty of Medicine, Chulalongkorn University, to obtain permission for the clinical trial.

The experiment will be conducted in 2 onsite events of 50 participants each, held at the Department of Psychiatry, Faculty of Medicine, Chulalongkorn University.

Each participant will be in a private room with a psychiatrist who observes the clinical trial. The depression data collection software described above will ask the questions and record video and audio of the answers. Each session will take around 20–30 minutes.

Mobile Application Development

  1. The application will be compatible with the Android and iOS operating systems (more than 80% of the available versions of both systems)

  2. The user interface will consist of the following:

    1. Landing page

    2. Register and login page

    3. Password recovery page

    4. User consent page

    5. Personal information page

    6. Interview / Testing page

    7. Dashboard page

  3. To make the application accessible to users with visual disabilities, screen readability and keyboard navigability will be the key design concepts. ARIA live regions will be used to notify screen readers when content on the page is updated. The face detection feature will tell users with visual disabilities how to adjust their face position to stay within the camera frame.

  4. Examples of the user interface design are as follows:

Marketing & Scaling

  1. Online marketing

    1. Develop marketing materials, such as infographics and infomotions

    2. Run social media campaigns on Facebook, YouTube, etc.

  2. Launch event

    1. Partner with the Ministry of Social Development and Human Security of Thailand to organize a press conference.

    2. Invite more than 5 relevant organizations, more than 10 media outlets, and other relevant stakeholders to attend.

    3. Stream the event online, targeting more than 100 virtual attendees.

  3. Disability advocacy

    1. Partner with the Department for Empowerment of Persons with Disabilities of Thailand to reach out to 500 users with disabilities.

    2. Organize workshops to train the staff of the Disability Skill & Career Path Development Center at 10 centers in major cities.

  4. Education outreach

    1. Partner with Chulalongkorn University to organize an academic seminar on the topic of “AI Technology Utilization for a Better Mental Health Service”.

    2. Invite academics, researchers, psychiatrists, students, healthtech startups, and other relevant stakeholders to attend.

Examples of previous events and workshops organized by Vulcan Coalition
