
Elon Musk Praises Kimi's Research: Same Computing Power, 25% Efficiency Gain Challenges Foundation of LLMs

2026-03-19 · News · Kimi, AI Research, LLM Architecture, Elon Musk

On March 16, Kimi published research challenging a foundational pillar of large language models: the residual connection. The proposed architecture replaces the traditional 'equal addition' of residual connections with an 'elegant rotation' mechanism, in which each layer actively selects information from previous layers via a small query vector. This prevents intermediate layers from becoming 'ineffective workers' as networks deepen.
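The article does not give the exact formulation, but the idea of a layer selecting from its predecessors via a small query vector can be sketched as attention over the stack of previous layer outputs. The function name, shapes, and softmax weighting below are illustrative assumptions, not Kimi's published method:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def select_from_history(history, query):
    """Hypothetical sketch: weighted combination of previous-layer
    outputs, in place of a plain residual addition.

    history: list of (d,) hidden-state vectors from layers 0..k-1
    query:   (d,) per-layer query vector (assumed learned)
    """
    H = np.stack(history)                    # (k, d)
    scores = H @ query / np.sqrt(len(query)) # scaled dot-product scores
    weights = softmax(scores)                # attention over depth
    return weights @ H                       # (d,) selected residual input

rng = np.random.default_rng(0)
d = 8
history = [rng.standard_normal(d) for _ in range(4)]
query = rng.standard_normal(d)
out = select_from_history(history, query)
```

Because the weights are a softmax, the result is a convex combination of earlier layer outputs rather than their unweighted sum, which is one plausible way a layer could emphasize the most relevant predecessors.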

The work quickly drew attention in Silicon Valley. OpenAI's Jerry Tworek called it the beginning of 'Deep Learning 2.0,' while Andrej Karpathy noted that the industry still has room to explore the fundamentals laid out in 'Attention Is All You Need.' Experiments show a 7.5% improvement on the GPQA-Diamond science-reasoning benchmark, gains of 3.6% on math and 3.1% on code generation, with an inference-latency increase of less than 2%.
