📚 Weekly AI Paper Digest

Period: 2026-03-09 ~ 2026-03-14 | Selection: the five most-noticed papers of the week


🏆 This Week's Top 5

| Rank | Paper | ⬆️ | Deep Dive |
|---|---|---|---|
| 🥇 | Geometry-Guided Reinforcement Learning f… | 136 | DD-041 |
| 🥈 | Penguin-VL: Exploring the Efficiency Lim… | 104 | DD-042 |
| 🥉 | OpenClaw-RL: Train Any Agent Simply by T… | 90 | DD-043 |
| 4. | Lost in Stories: Consistency Bugs in Lon… | 81 | DD-044 |
| 5. | Holi-Spatial: Evolving Video Streams int… | 77 | DD-045 |

🔍 This Week's Trends

Key Keywords

  • 3D spatial intelligence (Spatial Intelligence): building large-scale 3D data from video streams and keeping 3D scene edits consistent
  • Consistency of generated output (Consistency): resolving the contradictions that arise in 3D multi-view editing and long-form story generation, and maintaining consistency
  • Agent online learning (Agentic Online RL): using "next-state signals" such as user conversations and tool results as real-time training data
  • Lightweight VLMs (Efficient VLM): developing small models that reduce dependence on very large vision encoders and target deployment on mobile and edge devices
  • LLM-based vision processing: exploring visual understanding built on LLM architectures as a replacement for conventional contrastive-learning-based vision encoders

Common Themes

This week's papers show generative AI moving beyond simply "getting bigger" and evolving toward solving concrete problems in real-world settings. In particular, they focus on maximizing **consistency and efficiency** through 3D spatial understanding, real-time agent learning, and model compression. Attempts to extend 2D vision or text data into 3D or interactive experiences stand out.

Highlights

The most striking point is that reinforcement learning (RL) is being applied more widely. RL is emerging as a key tool for raising the precision of generation and control tasks: one paper uses geometric constraints as the reward signal when editing 3D scenes (Paper 1), and another treats every state change that occurs during a conversation as an immediate learning opportunity (Paper 3). In addition, the attempt to systematically convert the vast amount of video on the web into 3D spatial intelligence (Paper 5) suggests a new paradigm for addressing data scarcity.

Practical Implications

Developers and researchers should pay attention to architectures that overcome the structural limits of specific domains (3D, mobile, long text) rather than to ever-larger parameter counts. If you are planning an on-device AI service, a lightweight VLM design that performs well without a very large vision encoder (Paper 2) is essential. When building agents, you need a strategy that treats user interaction itself as the core data pipeline for improving the model (Paper 3).


📑 Paper Summaries

🥇 1. Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing

arXiv: 2603.03143 | ⬆️ 136 → View Deep Dive
Tags: 3d-editing reinforcement-learning multi-view-consistency flux 3d-gaussian-splatting vggt rlhf computer-vision

A framework for 3D scene editing that addresses the data scarcity of conventional supervised fine-tuning (SFT). Building on the observation that edits can be verified geometrically, it combines reinforcement learning (RL) with a 3D foundation model (VGGT) to achieve multi-view consistency.

📖 Detailed analysis: → see the Deep Dive for the in-depth analysis.
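To make the idea of a geometrically verifiable reward more concrete, here is a minimal sketch. It is not the paper's reward. It assumes a hypothetical setup in which each edited view has already been lifted to 3D positions for the same set of scene landmarks (for example by a 3D foundation model), and it scores an edit by how closely those per-view positions agree. The function name, the `scale` parameter, and the exponential mapping are all illustrative assumptions.

```python
import math
from itertools import combinations


def multiview_consistency_reward(views, scale=1.0):
    """Toy reward in (0, 1]; equals 1.0 when all views agree on every landmark.

    views[i][k] is the (x, y, z) position of landmark k estimated from view i.
    Hypothetical illustration only, not the reward used in the paper.
    """
    if len(views) < 2:
        raise ValueError("need at least two views")
    n_landmarks = len(views[0])
    if n_landmarks == 0 or any(len(v) != n_landmarks for v in views):
        raise ValueError("every view must cover the same non-empty landmark set")

    total, count = 0.0, 0
    # Compare every pair of views, landmark by landmark.
    for view_a, view_b in combinations(views, 2):
        for p, q in zip(view_a, view_b):
            total += math.dist(p, q)  # Euclidean distance between the two estimates
            count += 1

    mean_disagreement = total / count
    # Map disagreement to a bounded reward; larger `scale` is more forgiving.
    return math.exp(-mean_disagreement / scale)
```

Identical views give a reward of 1.0, and the reward falls toward 0 as the views drift apart. The point of the sketch is only that the score needs no human-labeled edit, which is what makes this kind of signal attractive when supervised data is scarce.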


🥈 2. Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders

arXiv: 2603.06569 | ⬆️ 104 → View Deep Dive
Tags: vision-language-model efficiency llm-based-encoder edge-ai multimodal-learning compact-model mobile-ai

This paper matters because, rather than relying on a very large parameter count, it introduces an LLM-based vision encoder and probes the limits of efficiency. The result is a compact, high-quality vision-language model that can actually be deployed on mobile and edge devices.

📖 Detailed analysis: → see the Deep Dive for the in-depth analysis.


🥉 3. OpenClaw-RL: Train Any Agent Simply by Talking

arXiv: 2603.10165 | ⬆️ 90 → View Deep Dive
Tags: rlhf agentic-rl online-learning llm-agents prm sglang model-optimization

A framework that converts the outcome of every interaction (the next-state signal), which existing AI agents simply discard, into real-time training data. It trains across different environments such as conversation, coding, and GUI control in a single unified reinforcement learning loop.

📖 Detailed analysis: → see the Deep Dive for the in-depth analysis.
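As a rough illustration of what "turning next-state signals into training data" could look like, the sketch below maps each interaction turn to a (context, action, reward) sample. It is not OpenClaw-RL's implementation: `Turn`, `Sample`, `keyword_judge`, and `turns_to_samples` are hypothetical names, and the keyword-based judge is a deliberately crude stand-in for whatever scoring model a real system would use.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    context: str     # what the agent saw before acting
    action: str      # what the agent produced (reply, patch, GUI command, ...)
    next_state: str  # what came back (user reply, tool output, new screen, ...)


@dataclass
class Sample:
    context: str
    action: str
    reward: float


def keyword_judge(next_state: str) -> float:
    """Hypothetical placeholder scorer: negative cues are checked first."""
    text = next_state.lower()
    if any(word in text for word in ("error", "traceback", "wrong")):
        return -1.0
    if any(word in text for word in ("thanks", "passed", "works")):
        return 1.0
    return 0.0  # no clear signal either way


def turns_to_samples(turns, judge=keyword_judge):
    """Score each turn by what happened next and emit one training sample per turn."""
    return [Sample(t.context, t.action, judge(t.next_state)) for t in turns]


turns = [
    Turn("fix the failing test", "patch A", "Traceback (most recent call last): ..."),
    Turn("fix the failing test", "patch B", "All tests passed"),
]
samples = turns_to_samples(turns)
```

Because the same `Turn` shape can hold a chat reply, a tool result, or a screen state, one loop can in principle consume samples from several environments, which is the property the summary emphasizes.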


4. Lost in Stories: Consistency Bugs in Long Story Generation by LLMs

arXiv: 2603.05890 | ⬆️ 81 → View Deep Dive
Tags: story-generation consistency-benchmark llm-evaluation long-context automated-checker nlp hallucination

This paper matters because it systematically defines the "lack of consistency" problem, in which a large language model (LLM) loses track of facts or world-building it established itself during long-form story generation. It presents the first benchmark for measuring the problem quantitatively (ConStory-Bench) together with an automated evaluation pipeline (ConStory-Checker).

📖 Detailed analysis: → see the Deep Dive for the in-depth analysis.
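The kind of error described here can be illustrated with a very small checker. The sketch below is not ConStory-Checker. It assumes facts have already been extracted from the story as (chapter, entity, attribute, value) tuples, which is the hard part and is out of scope, and it simply flags any later value that contradicts the first one established. The function name and the tuple layout are assumptions made for this example.

```python
def find_contradictions(facts):
    """Flag facts that contradict the first value established for the same slot.

    facts: iterable of (chapter, entity, attribute, value), in story order.
    Returns a list of (entity, attribute, first_chapter, first_value, chapter, value).
    Hypothetical illustration only; fact extraction is assumed to happen elsewhere.
    """
    established = {}  # (entity, attribute) -> (chapter, value)
    conflicts = []
    for chapter, entity, attribute, value in facts:
        key = (entity, attribute)
        if key not in established:
            established[key] = (chapter, value)
        elif established[key][1] != value:
            first_chapter, first_value = established[key]
            conflicts.append(
                (entity, attribute, first_chapter, first_value, chapter, value)
            )
    return conflicts


story_facts = [
    (1, "Mira", "eye_color", "green"),
    (2, "Mira", "home", "Port Vale"),
    (7, "Mira", "eye_color", "brown"),  # contradicts chapter 1
    (9, "Mira", "home", "Port Vale"),   # consistent restatement
]
conflicts = find_contradictions(story_facts)
```

Here `conflicts` contains a single entry for Mira's eye color changing between chapters 1 and 7. A real evaluator would also have to handle legitimate changes over time, such as a character moving house, which this sketch treats as contradictions.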


5. Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence

arXiv: 2603.07660 | ⬆️ 77 → View Deep Dive
Tags: spatial-intelligence 3d-reconstruction gaussian-splatting automated-pipeline dataset-generation vlm computer-vision 3d-vision

This paper matters because it automatically converts raw video into large-scale, precise 3D spatial datasets without human intervention, addressing at its root the data scarcity that limits training of spatial-intelligence models.

📖 Detailed analysis: → see the Deep Dive for the in-depth analysis.


📅 Generated: 2026-03-15 | 🤖 GLM-4.7 Weekly Digest