Conclusion
Sarvam 30B and Sarvam 105B represent a significant step in building high-performance, open foundation models in India. By combining efficient Mixture-of-Experts architectures with large-scale, high-quality training data and deep optimization across the entire stack, from tokenizer design to inference efficiency, both models deliver strong reasoning, coding, and agentic capabilities while remaining practical to deploy.
By comparison, checking a 32GB M1 Max with llmfit shows that it can at best run models of around 35B parameters at 2-bit or 4-bit quantization.
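A rough back-of-the-envelope check makes the 32GB limit plausible. The sketch below estimates the size of the weights alone from the parameter count and the bits per weight. It is not how llmfit computes its result (that is not described here), and the helper name `weight_memory_gb` is hypothetical. It also ignores the KV cache, activations, and the memory the operating system keeps for itself, so real headroom is smaller than these figures suggest.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at the same bit width, with no
    per-block scales or other quantization metadata.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


RAM_GB = 32  # unified memory of the M1 Max mentioned in the text

for bits in (2, 4, 8, 16):
    size = weight_memory_gb(35, bits)
    verdict = "fits" if size <= RAM_GB else "does not fit"
    print(f"35B at {bits}-bit: ~{size:.2f} GB of weights, {verdict} in {RAM_GB} GB")
```

Under these assumptions a 35B model needs about 8.75 GB at 2-bit and 17.5 GB at 4-bit, while 8-bit already needs about 35 GB, which exceeds the 32GB of the machine before any runtime overhead is counted.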
Main structure completed on the Jialing River Bridge, a key project of the Chengdu-Chongqing high-speed railway
fn random(min: int, max: int) -> int