Sivan Doveh

I am a student researcher at Google and a PhD student in Computer Science at the Weizmann Institute of Science, supervised by Prof. Shimon Ullman. I study how vision-language models function, exploring their core mechanisms, strengths, and limitations, mainly by developing new data and training approaches.

I earned my Master’s degree in Electrical Engineering from Tel Aviv University and my Bachelor’s degree in Electrical Engineering from Ben-Gurion University of the Negev (BGU). In parallel with my academic journey, I have also worked at Applied Materials and IBM Research.

I am actively looking for student collaborators in the area of multi-modal learning.

Contact: sivan.doveh [at] weizmann.ac.il

Recent News

05/25: Invited talk at BIU CS Multi-Modal Day.
05/25: Invited talk at New Tech Event.
04/25: Our workshop "Long Multi-Scene Video Foundations: Generation, Understanding and Evaluation" got accepted at ICCV 2025.
04/25: Invited talk at 14th Israel Machine Vision Conference (IMVC) 2025.
02/25: 2 papers accepted at CVPR 2025 (workshops).
01/25: LiveXiv accepted at ICLR 2025.
01/25: Life update: started an internship at Google :)
12/24: Our workshop "What's Next in Multi-Modal Foundation Models" got accepted at CVPR 2025.
09/24: Invited talk at TU Graz.
12/23: Our workshop "What's Next in Multi-Modal Foundation Models" got accepted at CVPR 2024.

Selected Publications