The COVID-19 crisis has tested healthcare systems around the world. Access to vaccines against COVID-19 has gradually stabilized the situation. However, large-scale population screening with nucleic acid testing for the novel coronavirus has had to continue in order to detect positive cases and thus break possible chains of virus transmission. Scientists must therefore investigate new technologies that reduce the time and cost of diagnostic testing so that it can be performed at scale in a convenient, effective and economical manner. In the framework of the Interspeech 2021 conference, a research group presented their system for the cough sound track of the Diagnosing COVID-19 Using Acoustics (DiCOVA) Challenge. The article describing their contribution has been accepted for inclusion in the Interspeech scientific program.
The research was led by UPF audiovisual systems engineering alumni Adrià Mallol, now a researcher at the University of Augsburg (Germany), and Helena Cuesta, with the participation of Emilia Gómez (European Commission Joint Research Centre), both members of the Music Technology Research Group (MTG) of the UPF Department of Information and Communication Technologies (DTIC), and of researcher Björn Schuller, of the University of Augsburg and Imperial College London (UK).
Previous AI-based systems have proven effective in detecting coughing and sneezing and in identifying respiratory abnormalities. Following these advances in digital health, artificial intelligence has also been used in mental health to identify patients with depression or post-traumatic stress disorder. “Inspired by these studies and based on the respiratory diseases caused by COVID-19, we set ourselves the challenge of investigating whether artificial intelligence techniques could detect diseases related to this virus through automated cough analysis,” explained Helena Cuesta, a member of the research team.
Cough signals altered in patients who tested positive for COVID-19
In this paper, the authors studied two different neural network architectures that share a common structure: a first block processes the input spectrogram and extracts a set of embedded features, and a second block classifies whether these features correspond to a COVID-19-positive patient or a healthy one.
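The article does not specify the layer types or sizes of either block, so the following is only an illustrative sketch of the two-block structure using NumPy with random placeholder weights (the function names, the 32-dimensional embedding, and the fake spectrogram shape are all assumptions, not the authors' design):

```python
import numpy as np

rng = np.random.default_rng(0)

def feature_extractor(spectrogram: np.ndarray) -> np.ndarray:
    """Block 1 (illustrative): pool the time axis, then apply a
    random linear layer to produce an embedding vector."""
    pooled = spectrogram.mean(axis=1)            # (n_freq_bins,)
    w = rng.standard_normal((32, pooled.size))   # placeholder weights
    return np.tanh(w @ pooled)                   # 32-dim embedding

def classifier(embedding: np.ndarray) -> float:
    """Block 2 (illustrative): logistic layer giving P(COVID-positive)."""
    w = rng.standard_normal(embedding.size)      # placeholder weights
    return 1.0 / (1.0 + np.exp(-(w @ embedding)))

# Fake one-second spectrogram: 128 frequency bins x 87 time frames.
spec = np.abs(rng.standard_normal((128, 87)))
p = classifier(feature_extractor(spec))
print(f"P(positive) = {p:.3f}")   # a probability between 0 and 1
```

In a real system both blocks would of course be trained jointly on labelled cough recordings rather than using random weights; the sketch only shows how the embedding produced by the first block feeds the second.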
The first step is to pre-process the input data. In general, the database recordings contain several coughs separated by silences (the typical pattern when we cough). “In order to keep only the part of the recording that contains relevant information, i.e. coughs, we use a signal energy-based sound activity detector (SAD),” Cuesta explains. After filtering the data, the next step is to extract the features and segment them. “Our model uses a spectrogram, a time-frequency representation of the audio signal, as input,” she adds. “First, we calculate the spectrogram for each recording in the database and then segment it into one-second excerpts.”
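The pipeline described above can be sketched in NumPy. The sample rate, window length, hop size, and energy threshold below are assumptions (the article gives no parameter values), and the synthetic signal simply stands in for a real cough recording:

```python
import numpy as np

SR = 16_000   # assumed sample rate (Hz)
FRAME = 512   # assumed analysis window length (samples)
HOP = 256     # assumed hop size (samples)

def sound_activity_mask(x, frame=FRAME, hop=HOP, threshold=0.01):
    """Energy-based SAD: flag frames whose RMS energy exceeds a threshold."""
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack([x[i * hop : i * hop + frame] for i in range(n_frames)])
    rms = np.sqrt((frames ** 2).mean(axis=1))
    return rms > threshold

def spectrogram(x, frame=FRAME, hop=HOP):
    """Magnitude STFT with a Hann window: (freq_bins, time_frames)."""
    win = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack([x[i * hop : i * hop + frame] * win
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

def one_second_segments(spec, sr=SR, hop=HOP):
    """Split a spectrogram into non-overlapping one-second chunks."""
    per_sec = sr // hop
    n = spec.shape[1] // per_sec
    return [spec[:, i * per_sec : (i + 1) * per_sec] for i in range(n)]

# Fake 3-second recording: two "cough" bursts separated by silence.
t = np.arange(3 * SR) / SR
x = np.zeros_like(t)
x[SR // 2 : SR] = 0.5 * np.sin(2 * np.pi * 300 * t[SR // 2 : SR])
x[2 * SR : 2 * SR + SR // 2] = 0.5 * np.sin(2 * np.pi * 500 * t[2 * SR : 2 * SR + SR // 2])

mask = sound_activity_mask(x)
spec = spectrogram(x)
segments = one_second_segments(spec)
print(len(segments), "one-second segments;", mask.sum(), "active frames")
```

On this synthetic signal the SAD mask is active only around the two bursts, and the three-second spectrogram yields three one-second segments, mirroring the filter-then-segment order the researchers describe.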
Gender of the patient matters
An interesting contribution of the project was to study different versions of the neural network to investigate whether the patient’s gender was a consideration when analyzing the cough. “Intuitively, when we approached this work, one of our assumptions was that male and female coughs should have different characteristics because their vocal tracts differ in size and shape,” the authors commented.
From a spectral point of view, coughs produced by males and females are not necessarily equal
One of the most remarkable findings from the experiments conducted in this work is that, in most of the evaluated cases, the model that included information about the patient’s gender obtained better predictions. This supports the hypothesis that, from a spectral point of view, coughs produced by men and coughs produced by women are not necessarily equal.
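The article does not say how gender information enters the network. One common design, which may or may not match the authors’ exact architecture, is to append a binary gender flag to the embedded features before the final classification layer; the NumPy sketch below (placeholder weights, assumed 32-dimensional embedding) illustrates only that idea:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal(33)   # placeholder weights: 32-dim embedding + gender flag

def classify_with_gender(embedding: np.ndarray, is_female: bool) -> float:
    """Illustrative gender-aware classifier: concatenate a binary gender
    flag to the embedding, then apply a logistic layer."""
    conditioned = np.concatenate([embedding, [1.0 if is_female else 0.0]])
    return 1.0 / (1.0 + np.exp(-(W @ conditioned)))

emb = rng.standard_normal(32)   # fake embedding from the first network block
p_f = classify_with_gender(emb, is_female=True)
p_m = classify_with_gender(emb, is_female=False)
print(f"female: {p_f:.3f}  male: {p_m:.3f}")
```

Because the flag contributes its own weight to the logit, the model can learn gender-dependent decision boundaries, which is one plausible mechanism behind the improvement the authors observed.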
Cough Tracks – DiCOVA Challenge
The organizers of the DiCOVA challenge provided participants with a database (the Coswara dataset) containing 1,040 audio recordings of people coughing, each lasting between 1 and 15 seconds. Along with the recordings, this database provides a range of metadata associated with each one: COVID-19 status (positive/negative) and the individual’s gender and nationality. “Based on these data, we developed and evaluated two different neural networks that use one second of audio to predict a positive or negative COVID-19 status,” the authors noted.
Although this work is only a first approach to detecting COVID-19 through automated cough analysis, the experiments presented by the authors provide clues that can be followed up in the next stages of this study. It remains to be understood how the cough signal is altered in COVID-19-positive patients; with that knowledge, specific features could be extracted and dedicated neural networks designed to improve the quality of the model.