📚 Weekly AI Paper Digest

Period: 2026-03-16 ~ 2026-03-21 | Selection: the five papers that drew the most attention this week


๐Ÿ† ์ด๋ฒˆ ์ฃผ Top 5

| Rank | Paper | ⬆️ | Deep Dive |
|------|-------|----|-----------|
| 🥇 | Demystifing Video Reasoning | 346 | DD-046 |
| 🥈 | InCoder-32B: Code Foundation Model for I… | 290 | DD-047 |
| 🥉 | AI Can Learn Scientific Taste | 266 | DD-048 |
| 4 | SocialOmni: Benchmarking Audio-Visual So… | 239 | DD-049 |
| 5 | MiroThinker-1.7 & H1: Towards Heavy-Duty… | 172 | DD-050 |

๐Ÿ” ์ด๋ฒˆ ์ฃผ ํŠธ๋ Œ๋“œ

ํ•ต์‹ฌ ํ‚ค์›Œ๋“œ

  • ๊ณ ๋„ํ™”๋œ ์ถ”๋ก  ๋Šฅ๋ ฅ (Advanced Reasoning): ๋‹จ์ˆœํ•œ ํŒจํ„ด ๋งค์นญ์„ ๋„˜์–ด, ๋น„๋””์˜ค ์ƒ์„ฑ ๋ชจ๋ธ์˜ ๋‚ด๋ถ€์  ๋…ผ๋ฆฌ ๊ตฌ์กฐ๋ฅผ ๊ทœ๋ช…ํ•˜๊ณ  ๋ณต์žกํ•œ ๋‹ค๋‹จ๊ณ„ ๋ฌธ์ œ ํ•ด๊ฒฐ ๋Šฅ๋ ฅ์„ ๊ฐ•ํ™”ํ•˜๋Š” ์—ฐ๊ตฌ.
  • ๊ฒ€์ฆ ๊ธฐ๋ฐ˜ ์‹ ๋ขฐ์„ฑ (Verification & Reliability): ๊ธด ํ˜ธ๋ผ์ด์ฆ˜(Long-horizon)์„ ๊ฐ€์ง„ ์—ฐ๊ตฌ ์—์ด์ „ํŠธ์—์„œ โ€˜๊ฒ€์ฆ(Verification)โ€™ ๊ณผ์ •์„ ํ†ตํ•ด ์ถ”๋ก ์˜ ์‹ ๋ขฐ์„ฑ์„ ํ™•๋ณดํ•˜๋ ค๋Š” ์‹œ๋„.
  • ์˜์—ญ ํŠนํ™” ๋ฐ ์‹ฌ์ธต ์ดํ•ด (Domain Specialization): ์‚ฐ์—… ํ˜„์žฅ์˜ ์—„๊ฒฉํ•œ ํ•˜๋“œ์›จ์–ด ์ œ์•ฝ์ด๋‚˜ ๊ณผํ•™์  ํ†ต์ฐฐ(Scientific Taste) ๋“ฑ ์ผ๋ฐ˜์ ์ด์ง€ ์•Š์€ ํŠน์ • ๋„๋ฉ”์ธ์˜ ๊นŠ์ด ์žˆ๋Š” ์ดํ•ด ํ•„์š”์„ฑ.
  • ์‚ฌํšŒ์  ์ƒํ˜ธ์ž‘์šฉ (Social Interactivity): ์ •์ ์ธ ํ…์ŠคํŠธ/์ด๋ฏธ์ง€ ์ฒ˜๋ฆฌ๋ฅผ ๋„˜์–ด, ์˜ค๋””์˜ค์™€ ๋น„์ฃผ์–ผ์ด ํ†ตํ•ฉ๋œ ๋™์ ์ธ ๋Œ€ํ™” ํ™˜๊ฒฝ์—์„œ์˜ ์‚ฌํšŒ์  ์ƒํ˜ธ์ž‘์šฉ ๋Šฅ๋ ฅ ํ‰๊ฐ€.

Common Themes

This week's papers show AI evolving from an "executor" that merely generates language or images into **an intelligent agent that makes and verifies its own judgments in complex real-world environments**. In particular, research concentrated on two directions: mimicking higher human cognitive abilities (handling industrial constraints, judging scientific value, understanding social context) and securing practical reliability.

Highlights

The most interesting result is the study that trains an AI's "scientific taste" so that it can judge the potential impact of research ideas. It suggests that AI could develop from a tool that simply runs experiments into a partner that proposes research directions. In addition, the study showing that video generation models reason through an entirely different mechanism than "sequential reasoning across frames" offers an important clue to the inner workings of generative models, which have long been treated as black boxes.

Practical Implications

Developers and researchers should look beyond raising the performance of general-purpose models and pay attention to the need for "specialized models" that meet the strict requirements of specific domains (industry, science, and so on). Likewise, when building agents that carry out complex tasks, simply chaining reasoning steps is not enough: a "verification mechanism" that checks the result of each step and corrects errors has to be part of the architecture if the system is to be trustworthy.


📑 Paper Summaries

🥇 1. Demystifing Video Reasoning

arXiv: 2603.16870 | ⬆️ 346 | → View Deep Dive
Tags: video-reasoning diffusion-models mechanistic-interpretability chain-of-steps denoising-process ai-safety computer-vision

📖 Detailed analysis: see → View Deep Dive for the in-depth write-up.


🥈 2. InCoder-32B: Code Foundation Model for Industrial Scenarios

arXiv: 2603.16790 | ⬆️ 290 | → View Deep Dive
Tags: code-llm industrial-ai hardware-design fine-tuning long-context semantic-reasoning optimization

This paper matters because it proposes the first 32-billion-parameter code foundation model that goes beyond general coding ability to meet complex industrial requirements such as chip design and GPU kernel optimization.

📖 Detailed analysis: see → View Deep Dive for the in-depth write-up.


🥉 3. AI Can Learn Scientific Taste

arXiv: 2603.14473 | ⬆️ 266 | → View Deep Dive
Tags: ai-science scientific-taste rlcf preference-learning citation-analysis research-automation reinforcement-learning

This paper is significant because it is the first to demonstrate that an AI can learn "scientific taste": using large-scale community feedback (citation counts), it learns to judge and propose research by its potential impact, rather than merely to carry out experiments.

📖 Detailed analysis: see → View Deep Dive for the in-depth write-up.


4. SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

arXiv: 2603.16859 | ⬆️ 239 | → View Deep Dive
Tags: olm social-interactivity benchmarking audio-visual-learning speaker-diarization turn-taking nlp human-computer-interaction

This paper matters because it proposes the first comprehensive benchmark for the "dynamic conversational interaction ability" that existing omni-modal large language models (OLMs) have overlooked, making it possible to properly evaluate when, to whom, and how an AI should speak.

📖 Detailed analysis: see → View Deep Dive for the in-depth write-up.


5. MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification

arXiv: 2603.15726 | ⬆️ 172 | → View Deep Dive
Tags: llm reasoning verification agent research-automation fine-tuning reliability mirothinker

📖 Detailed analysis: see → View Deep Dive for the in-depth write-up.


๐Ÿ“… ์ƒ์„ฑ์ผ: 2026-03-22 | ๐Ÿค– GLM-4.7 Weekly Digest