Keynote Speaker

Prof. Huiyu Zhou, University of Leicester

Bio: Dr. Huiyu Zhou received a Bachelor of Engineering degree in Radio Technology from Huazhong University of Science and Technology, China, and a Master of Science degree in Biomedical Engineering from the University of Dundee, United Kingdom. He was awarded a Doctor of Philosophy degree in Computer Vision from Heriot-Watt University, Edinburgh, United Kingdom. Dr. Zhou is currently a full Professor at the School of Computing and Mathematical Sciences, University of Leicester, United Kingdom. He has published over 500 peer-reviewed papers in the field. His research has been supported by UK EPSRC, ESRC, AHRC, MRC, EU, Innovate UK, Royal Society, British Heart Foundation, Leverhulme Trust, Puffin Trust, Alzheimer’s Research UK, Invest NI and industry. Homepage: https://le.ac.uk/people/huiyu-zhou

Title: Video Understanding for Behavioural Analysis

Abstract: Video understanding has emerged as a powerful tool in behavioural analysis, offering innovative methodologies to capture and interpret complex behaviours from visual data. This talk explores the techniques used in video understanding, including machine learning, deep learning, and computer vision, which can be applied to address the challenges in this field, such as accurately detecting and tracking multiple subjects, recognising subtle and nuanced behaviours, and managing large volumes of video data. The applications of video understanding extend across numerous sectors, from multimedia and healthcare to security, with the potential to revolutionise behavioural analysis and beyond.

Prof. Zhou will share his experience and insights in video understanding for behavioural analysis. He will present a case study on Parkinson’s disease (PD) diagnosis to demonstrate the capability of video understanding in healthcare. He will describe the methodologies developed to analyse behaviours in both animals (e.g., mice) and humans. These include pioneering techniques for detecting and tracking single and multiple mice, recognising individual and social behaviours, conducting comprehensive social behaviour analysis, and distinguishing between normal and PD-afflicted mice by examining their interactions and movements. He will conclude with a vision for the future of video understanding in behavioural analysis, outlining anticipated advancements in technology and methodology, the potential for broader applications, and ongoing research efforts aimed at overcoming current limitations.

Prof. Mark Plumbley, University of Surrey

Bio: Prof. Mark Plumbley is Professor of Signal Processing at the Centre for Vision, Speech and Signal Processing (CVSSP) at the University of Surrey, in Guildford, UK. He is an expert on analysis and processing of audio, using a wide range of signal processing and machine learning methods. He led the first international data challenge on Detection and Classification of Acoustic Scenes and Events (DCASE), and is a co-editor of the book “Computational Analysis of Sound Scenes and Events” (Springer, 2018). He currently holds a 5-year EPSRC Fellowship “AI for Sound” on automatic recognition of everyday sounds. He is a Member of the IEEE Signal Processing Society Technical Committee on Audio and Acoustic Signal Processing, and a Fellow of the IET and IEEE.

Title: Machine Learning for Everyday Sounds: Recognition, Captioning, Visualization, Separation and Generation of Audio

Abstract: The last few years have seen a rapid increase of interest in machine learning for everyday sounds. Starting a decade ago with acoustic scene classification and sound event detection, the challenges and workshops on Detection and Classification of Acoustic Scenes and Events (DCASE) have brought together researchers from academia and industry to establish a new research community. In this talk, I will highlight some of the recent work taking place in this area at the University of Surrey, including pretrained audio neural networks (PANNs), audio captioning, audio visualization, audio source separation and audio generation (AudioLDM). I will also touch on cross-cutting issues such as dataset collection and algorithm efficiency, and discuss how we might design future audio machine learning applications for the benefit of people and society.