Integrating Image Processing with Large-Scale Vision/Language Models for Advanced Visual Understanding Workshop

(ICIP 2024)

Tuesday 29 October

Introduction


This workshop aims to bridge the gap between conventional image processing techniques and the latest advancements in large-scale models, namely large language models (LLMs) and large vision-language models (LVLMs). In recent years, the integration of large-scale models into image processing tasks has shown significant promise in improving visual object understanding and image classification.


This workshop will provide a platform for researchers and practitioners to explore the synergies between conventional image processing methods and cutting-edge large language models and large vision-language models, fostering innovation and collaboration in the field.


Our objectives are as follows:


This workshop is designed for researchers, academics, and industry professionals working in the fields of image processing, computer vision, multimedia processing, and natural language processing. Participants should have a basic understanding of image processing concepts and an interest in exploring innovative approaches for visual understanding.


The workshop will consist of paper presentations by leading experts in image processing and large-scale language/vision models. Participants will have the opportunity to engage in discussions, exchange insights, and collaborate on potential research projects.

Program Schedule

ICIP 2024 Workshop 8 Program Schedule

Organizers


Prof. Yong Man Ro earned his Ph.D. degree from the Department of Electrical Engineering at KAIST. He conducted research at various institutions, including Columbia University and the University of California, Irvine and Berkeley. Additionally, he served as a visiting professor at the University of Toronto. Currently, he is a full professor at the School of Electrical Engineering and an ICT endowed chair professor at KAIST. Furthermore, he is the director of the Center for Applied Research in Artificial Intelligence, the Image Video System Lab, and the Integrated Vision and Language Lab at KAIST. Prof. Ro has received notable recognition, including the Young Investigator Finalist Award from ISMRM in 1992 and the Scientist of the Year Award (Korea) in 2003. He has contributed to the academic community as an associate editor, previously for IEEE Signal Processing Letters and currently for IEEE Transactions on Circuits and Systems for Video Technology. He is also an IVMSP committee member in the IEEE Signal Processing Society. Moreover, he has played key roles in organizing numerous international conferences, including serving as the organizing chair/program chair of MMM 2020/PCM 2015 and IWDW 2004. He has also curated several special sessions, such as "Explainable Deep Neural Networks for Image/Video Processing" at ICIP 2021 and 2022, "Digital Photo Album Technology" at AIRS 2005, "Social Media" at DSP 2009, and "Human 3D Perception and 3D Video Assessments" at DSP 2011. Prof. Ro's recent research interests span various AI areas, including deep learning in computer vision and image processing; multimodal learning; integrating vision, speech, and language for AI; and explainable and robust AI. His scholarly output includes over 520 peer-reviewed papers published in international journals and conferences.


2. Hak Gu Kim (Chung-Ang University, South Korea), e-mail address: hakgukim@cau.ac.kr

Prof. Hak Gu Kim received his Ph.D. degree from the Department of Electrical Engineering at Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea, in 2019. From 2019 to 2021, he was a postdoctoral researcher at École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland. He is currently an assistant professor at the Graduate School of Advanced Imaging Science, Multimedia & Films (GSAIM) at Chung-Ang University, South Korea. He served as the tutorial chair for the 2023 IEEE International Conference on Electronics, Information, and Communication (ICEIC). His research interests include deep learning and machine learning in 2D/3D/VR image and video processing and computer vision, human visual perception, multi-modal learning, and the vulnerability of deep neural networks for the convergence of AI and reality.


3. Nikolaos (Nikos) Boulgouris (Brunel University London, United Kingdom), e-mail address: nikolaos.boulgouris@brunel.ac.uk

Prof. Nikolaos (Nikos) Boulgouris is an academic with the Department of Electronic and Computer Engineering of Brunel University London. From 2004 to 2010, he was an academic member of staff with King's College London, and prior to that he was a researcher with the Department of Electrical and Computer Engineering of the University of Toronto, Canada. He has published more than 100 papers in international journals and conferences and has participated in numerous national and international research consortia. Dr. Boulgouris was on the organizing committee of six major IEEE conferences, and served as Technical Program Chair for the 2018 IEEE International Conference on Image Processing (ICIP). He served as Senior Area Editor for the IEEE Transactions on Image Processing and as Associate Editor for the IEEE Transactions on Circuits and Systems for Video Technology, from which he received the 2017 Best Associate Editor Award. He also served as Associate Editor for the IEEE Transactions on Image Processing and the IEEE Signal Processing Letters. He was co-editor of the book Biometrics: Theory, Methods, and Applications, which was published in the Wiley-IEEE Press Series on Computational Intelligence, and guest co-editor for two journal special issues. From 2020 to 2022, he served as an elected member of the IEEE Multimedia Signal Processing Technical Committee (MMSP-TC). From 2014 to 2019, he served as an elected member of the IEEE Image, Video, and Multidimensional Signal Processing Technical Committee (IVMSP-TC). Dr. Boulgouris is a Senior Member of the IEEE and a Fellow of the Higher Education Academy.

Workshop Committees


Zhu Li (University of Missouri, United States), e-mail address: lizhu@umkc.edu

Wen-Huang Cheng (National Taiwan University, Taiwan), e-mail address: wenhuang@csie.ntu.edu.tw

Wesley De Neve (Ghent University, Belgium), e-mail address: Wesley.DeNeve@ghent.ac.kr

Cong-Thang Truong (The University of Aizu, Japan), e-mail address: thang@u-aizu.ac.jp

Minsu Kim (Meta, United Kingdom), e-mail address: minsu@meta.com

Hyung Il Kim (ETRI, South Korea), e-mail address: hikim@etri.re.kr

Youngjoon Yu (KAIST, South Korea), e-mail address: greatday@kaist.ac.kr

Chen Liu (City University of Hong Kong, Hong Kong), e-mail address: chen.liu@cityu.edu.hk

Tania Stathaki (Imperial College London, United Kingdom), e-mail address: t.stathaki@imperial.ac.uk

Yiannis Kompatsiaris (CERTH-ITI, Greece), e-mail address: ikom@iti.gr

Nikos Deligiannis (Vrije Universiteit Brussel, Belgium), e-mail address: ndeligia@etrovub.be


Supported by


This workshop is supported by the Center for Applied Research in Artificial Intelligence (CARAI).