The KAIST Visual AI Group, led by Minhyuk Sung, conducts research on technologies for generating, processing, and analyzing diverse visual data. Our work spans machine learning, computer vision, and computer graphics.
Research Highlights
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing (NeurIPS 2025)
An inference-time scaling method for flow models that introduces stochastic sampling, interpolant conversion, and an adaptive search algorithm to improve reward alignment.
ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation (NeurIPS 2025)
The first zero-shot method for 3D orientation grounding in text-to-image generation across multiple objects and diverse categories.
Ψ-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models (NeurIPS 2025)
A framework for score-based generative models enabling efficient inference-time reward alignment via initial particle sampling.
Neural Green’s Functions (NeurIPS 2025)
A neural operator designed for linear PDEs, providing strong generalization across diverse source terms, boundary conditions, and geometries.
Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation (ICCV 2025)
A framework enabling vision-language models to perform spatial reasoning from arbitrary perspectives.
VideoHandles: Editing 3D Object Compositions in Videos Using Video Generative Priors (CVPR 2025)
The first method for 3D object composition editing in videos without any training.
StochSync: Stochastic Diffusion Synchronization for Image Generation in Arbitrary Spaces (ICLR 2025)
A method that combines diffusion synchronization and score distillation sampling for generating images in arbitrary spaces (e.g., 360° panoramas and mesh textures).
SyncTweedies: A General Generative Framework Based on Synchronized Diffusions (NeurIPS 2024)
A novel approach for synchronizing multiple reverse diffusion processes to generate diverse visual content.
Neural Pose Representation Learning for Generating and Transferring Non-Rigid Object Poses (NeurIPS 2024)
A framework for learning neural pose representations that facilitate the generation and transfer of non-rigid object poses.
Occupancy-Based Dual Contouring (SIGGRAPH Asia 2024)
A dual contouring method that provides state-of-the-art performance for various neural implicit functions.
ReGround: Improving Textual and Spatial Grounding at No Cost (ECCV 2024)
A cost-free network reconfiguration that improves text-prompt fidelity in layout-guided image generation.
Posterior Distillation Sampling (CVPR 2024)
A novel optimization method for editing parameterized images, applicable to NeRF, 3D Gaussian Splatting, and SVG.
As-Plausible-As-Possible: Plausibility-Aware Mesh Deformation Using 2D Diffusion Priors (CVPR 2024)
A plausibility-aware mesh deformation framework integrating Jacobian-based geometry representation and generative image priors.
SyncDiffusion: Coherent Montage via Synchronized Joint Diffusions (NeurIPS 2023)
A zero-shot plug-and-play module that synchronizes multiple reverse diffusion processes, producing coherent images of various sizes.
SALAD: Part-Level Latent Diffusion for 3D Shape Generation and Manipulation (ICCV 2023)
A cascaded diffusion model based on a part-level implicit 3D representation.
OptCtrlPoints: Finding the Optimal Control Points for Biharmonic 3D Shape Deformation (Pacific Graphics 2023)
A data-driven framework identifying the optimal sparse set of control points for biharmonic 3D shape deformation.
PartGlot: Learning Shape Part Segmentation from Language Reference Games (CVPR 2022, Oral)
A neural framework for learning semantic part segmentation of 3D shape geometry based solely on part-referential language.