To address the degradation of visual-language (VL) representations during VLA supervised fine-tuning (SFT), we introduce Visual Representation Alignment. During SFT, we pull a VLA’s visual tokens ...
In this post, we’ll highlight a few of our favorite visuals from 2025 and walk through how we made them and what makes them ...
Abstract: Self-supervised learning (SSL) is an efficient pre-training method for medical image analysis. However, current research is mostly confined to certain modalities, consuming considerable time ...