I will be joining Amazon Agentic AI as an Applied Scientist in June 2025. I received my Ph.D. from KAIST.
Humans are inherently multi-modal learners, naturally understanding the world by looking (vision), listening (audio), and communicating (language). I am passionate about advancing machine intelligence to mirror this ability, enabling systems to understand the world holistically and generate faithful, human-centered content.
My work explores, but is not limited to, the following:
smwoo95 [at] kaist.ac.kr
shmwoo9395 [at] gmail.com
291, Daehak-ro, Yuseong-gu, Daejeon, Republic of Korea 34141
Ph.D. in EE, KAIST, 2025
on "Deep Visual and Multimodal Generation: Advancing Diffusion Models and Large Vision Language Models"
M.S. in EECS, GIST, 2021
on "Learning to Detect Visual Relationships in Images and Videos"
B.S. in EE, KNU, 2019
25.05 1 paper got accepted to ACL 2025 Findings!
25.05 I successfully defended my Ph.D.! 🎓
25.04 1 paper got accepted to CVIU!
25.02 1 paper got accepted to CVPR 2025!
25.01 1 paper got accepted to NAACL 2025 Main!
24.12 1 paper got accepted to AAAI 2025!
24.09 I am excited to keep collaborating with the team remotely!
24.09 I had a fantastic summer internship with Amazon Bedrock!
24.07 3 papers got accepted to ECCV 2024!
24.06 I joined Amazon Bedrock as a summer intern!
CVIU 2025
[ paper ]
CVPR 2025
[ paper ]
arXiv 2025
[ paper ]
NAACL 2025
[ paper ]
ECCV 2024
[ paper ]
ECCV 2024
[ paper ]
CVPR 2024
Featured by HuggingFace Daily Papers
Finalist, Qualcomm Innovation Fellowship 2024 Korea
VCIP 2023 Oral presentation
[ paper ]
VCIP 2023
[ paper ]
CVIU 2023
[ paper ]
ICCV 2023
Invited Paper Talk @ CARAI Workshop
[ paper ]
WACV 2023
[ paper ]
ICIP 2022
[ paper ]
Applied Sciences 2022
[ paper ]