Top Computer Vision Papers for the Week of 11/12 to 17/12
Stay Updated with Recent Computer Vision Research
Every week, top-tier academic conferences and journals showcase innovative research in computer vision, presenting exciting breakthroughs in subfields such as image recognition, vision model optimization, generative adversarial networks (GANs), image segmentation, video analysis, and more.
This article provides a comprehensive overview of the most significant papers published in the third week of December 2023, highlighting the latest research and advancements in computer vision. Whether you are a researcher, practitioner, or enthusiast, it offers valuable insights into state-of-the-art techniques and tools in the field.
Table of Contents:
Stable Diffusion
Vision Language Models
Image Generation & Editing
Video Generation & Editing
Image Segmentation
Image Recognition
My E-book: Data Science Portfolio for Success Is Out!
I recently published my first e-book, Data Science Portfolio for Success, a practical guide to building your data science portfolio. The book covers the following topics:
The Importance of Having a Portfolio as a Data Scientist
How to Build a Data Science Portfolio That Will Land You a Job
1. Stable Diffusion
1.1. Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors
1.2. Customizing Motion in Text-to-Video Diffusion Models
1.3. Efficient Quantization Strategies for Latent Diffusion Models
1.4. FreeInit: Bridging Initialization Gap in Video Diffusion Models
1.5. DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing
1.6. Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation
1.7. PEEKABOO: Interactive Video Generation via Masked-Diffusion
1.8. Clockwork Diffusion: Efficient Generation With Model-Step Distillation
1.9. LIME: Localized Image Editing via Attention Regularization in Diffusion Models
1.10. UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation
2. Vision Language Models
2.1. Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models
2.2. VILA: On Pre-training for Visual Language Models
2.3. CCM: Adding Conditional Controls to Text-to-Image Consistency Models
2.4. How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation
2.5. CogAgent: A Visual Language Model for GUI Agents
2.6. A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
2.7. Pixel-Aligned Language Models
2.8. Vision-Language Models as a Source of Rewards
3. Image Generation & Editing
3.1. ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations
3.2. SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds
3.3. Mosaic-SDF for 3D Generative Models
3.4. SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance
4. Video Generation & Editing
4.1. DreaMoving: A Human Dance Video Generation Framework Based on Diffusion Models
4.2. Photorealistic Video Generation with Diffusion Models
4.3. MVDD: Multi-View Depth Diffusion Models
4.4. VideoLCM: Video Latent Consistency Model
5. Image Segmentation
5.1. CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor
6. Image Recognition
6.1. Localized Symbolic Knowledge Distillation for Visual Commonsense Models
Are you looking to start a career in data science and AI but do not know how? I offer data science mentoring sessions and long-term career mentoring:
Mentoring sessions: https://lnkd.in/dXeg3KPW
Long-term mentoring: https://lnkd.in/dtdUYBrM