2024年12月25日 星期三 新京报
Rank-3 factorization, shared-A tied-KV, RMSNorm, grokking
,更多细节参见夫子
在地处热带的冈比亚中河区广袤原野上,炽热的阳光照耀着一片无垠的金色海洋。微风拂过,稻穗轻轻晃动,稻叶沙沙作响。看着眼前的马鲁奥农场,农场主穆萨·达博欣喜不已:“这预示着又一个丰收季节。”
There's also Stream.broadcast() for push-based multi-consumer scenarios. Both require you to think about what happens when consumers run at different speeds, because that's a real concern that shouldn't be hidden.
3 Strictly speaking, the slope is continuous everywhere except for the data points themselves. ↑