The programme includes keynotes, invited talks, contributed paper presentations and demonstrations.
09:30   Multimodal Video Search by Examples (Hui Wang, Queen’s University Belfast, UK)
09:50   Keynote: “Searching Through Video: More than Needles and Haystacks” (Alan Smeaton, Dublin City University, Ireland)
10:30   MVSE incremental hashing (Wing Ng and Xing Tian, South China University of Technology, China)
10:45   IRM2R: Image retrieval method based on two models re-ranking (Dapeng Zhang, Fujian Normal University, China)
11:00-11:15   Tea break (15 min)
11:15   Vision Keynote (Miroslaw Bober, University of Surrey, UK)
11:55   MixProp: Towards High-Performance Image Recognition via Dual Batch Normalisation (Jinghao Zhang, University of Surrey, UK)
12:10   Keynote: “Sounds Like What We’re Looking For: Some thoughts on the role of audio in MVSE” (Douglas W. Oard, University of Maryland, USA)
12:50-13:30   Lunch break (40 min)
13:30   Transformer based audio classification (Sara Atito, University of Surrey, UK)
13:45   Multimodal learning and representations (Nishant Rai, Waymo, USA)
14:00   MVSE Demo (Ivor Spence and Guanfeng Wu, Queen’s University Belfast, UK)
14:15   MVSE Interface design (Raymond Bond and Maurice Mulvenna, Ulster University, UK)
14:30   Video retrieval assessment “in the wild” (All)
Keynote Speakers:
Alan Smeaton, Dublin City University, Ireland: “Searching Through Video: More than Needles and Haystacks”
Abstract: Information seeking is an essential, daily activity that billions of us pursue as part of our work and leisure. It is a journey, a conversation between a searcher and an information repository, and it can have many final destinations, or none. In its early years, video search, like text search and information retrieval more broadly, focused on known-item search and searching around a broad topic. That focus has been supported and nurtured by benchmarking activities such as TREC, TRECVid, CLEF and FIRE, and the systems we have developed, and that we use in our essential daily information seeking, reflect it. In this presentation I reflect on the broader information seeking landscape and on how a twist of good (or bad) timing has left us with search tools that do not necessarily deliver the best support for information seeking. This broader perspective will then allow us to contextualise the roles that various forms of video search can play in our information seeking journeys.
Biography: Alan Smeaton is Professor of Computing at Dublin City University, where he has previously been Head of School and Dean of Faculty. He is an IEEE Fellow, a Principal Fellow of Advance HE and an elected member of the Royal Irish Academy. He is the 2022 winner of the ACM SIGMM Technical Achievement Award for contributions to multimedia computing. Alan’s interests are in human memory and why we remember some things and forget others. For decades we have been building systems that help people find information accurately and quickly, yet half the time we search for things we once knew but have forgotten. Alan focuses on building systems that help us to remember and to learn, not just plug the gap when we forget. This involves the use of machine learning and data analytics, and the application of text, image, and video analysis in areas such as learning analytics, personal sensing, and lifelogging.
Douglas W. Oard, University of Maryland, USA: “Sounds Like What We’re Looking For: Some thoughts on the role of audio in MVSE”
Abstract: Central to the idea of multimedia search is that there is some user performing some task, and our goal is to help them accomplish that task. In this talk, I’ll focus on the aspects of that process that involve audio search. I’ll start with the MVSE task: ranked retrieval, making a couple of connections to information seeking behavior. I’ll then drill down to look in more detail at the ranking task, with particular attention to where useful side information might be found. I’ll then pop back up to look at tasks other than ranked retrieval, illustrating the diversity of tasks with two examples: redaction and the creation of derivative works. In this way I’ll seek to illustrate both the centrality of system-centered research and the key role of the user’s task in shaping the questions on which we work.