China's Houmo AI Debuts Edge AI Chip for Large Models at WAIC

By AI中国

AsianFin -- China’s Houmo Intelligence unveiled a high-efficiency AI chip for running large models on edge devices at the 2025 World Artificial Intelligence Conference (WAIC). The launch sharpens the startup’s pivot from intelligent driving to edge computing as demand for local AI inference intensifies.

The Houmo Manjie M50, launched in Shanghai during the conference, is China’s first AI chip to integrate memory and computation in a single architecture specifically designed for large-scale inference at the edge. With 160 TOPS of INT8 compute and 100 TFLOPS of bFP16 performance, plus up to 48GB of memory and a bandwidth of 153.6 GB/s, the M50 runs models from 1.5 billion to 70 billion parameters—all within 10 watts of power. The chip is aimed at devices such as PCs, smart speakers, and robots, offering plug-and-play large model capabilities.
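As a rough check on the 1.5-billion-to-70-billion-parameter range, the sketch below estimates weight storage against the 48GB memory ceiling. The quantization formats and bytes-per-parameter values are assumptions for illustration, not Houmo specifications. The article does not say which precision each model size uses, and the estimate ignores KV cache, activations, and runtime overhead.

```python
# Rough weight-memory estimate for a model on a 48GB device.
# Assumption: weights dominate memory; KV cache and activations are ignored.

M50_MEMORY_GB = 48  # "up to 48GB of memory", as quoted in the article

# Assumed storage formats (hypothetical; the article does not name them).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_footprint_gb(params_billions: float, fmt: str) -> float:
    """Return weight storage in GB, taking 1 GB as 1e9 bytes."""
    return params_billions * BYTES_PER_PARAM[fmt]


def fits(params_billions: float, fmt: str, memory_gb: float = M50_MEMORY_GB) -> bool:
    """True if the weights alone fit in the given memory budget."""
    return weight_footprint_gb(params_billions, fmt) <= memory_gb


if __name__ == "__main__":
    for size in (1.5, 7, 14, 32, 70):
        for fmt in BYTES_PER_PARAM:
            gb = weight_footprint_gb(size, fmt)
            verdict = "fits" if fits(size, fmt) else "does not fit"
            print(f"{size:>5}B @ {fmt}: {gb:6.1f} GB -> {verdict}")
```

Under these assumptions, a 70-billion-parameter model needs about 35GB at 4-bit precision and fits in 48GB, while the same model at INT8 (about 70GB) would not.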

“The M50 is just the beginning,” said CEO Wu Qiang at a media briefing. “Our vision is to make AI computing power as accessible as electricity—embedded in every device, across every industry.”

The company also rolled out complementary products including the Liqing-series M.2 cards, Limou-series accelerator cards, and compute boxes, targeting everything from consumer devices to intelligent industrial terminals. It’s the latest sign that China’s edge AI race is heating up as generative models move off the cloud and into everyday products.

Founded in 2020, Houmo initially focused on AI chips for intelligent driving. But by late 2023, Wu concluded the sector was overcrowded and stagnating. “The industry was obsessed with cost competition, and no one believed in L3 autonomy anymore,” Wu said. “We had a chip with strong performance, but the market didn’t want it.”

Instead, the company saw promise in compute-in-memory (CIM), a chip design that departs from the traditional von Neumann architecture by embedding computation directly into memory arrays. This reduces data movement and energy consumption, easing the bandwidth and latency bottlenecks that constrain large AI models.
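To see why bandwidth becomes the bottleneck, consider a conventional design in which every weight is streamed from memory once per generated token. The sketch below computes the resulting upper bound on decode speed. It is a back-of-the-envelope model, not a benchmark of any Houmo chip: the function name is hypothetical, the one-byte-per-parameter figure is an assumption, and the 153.6 GB/s bandwidth quoted for the M50 is used only as a sample input.

```python
# Bandwidth-bound decode estimate for a conventional (non-CIM) architecture.
# Assumption: each generated token requires reading all weights from memory once,
# and compute is never the limiting factor.


def max_decode_tokens_per_s(
    params_billions: float, bytes_per_param: float, bandwidth_gb_s: float
) -> float:
    """Upper bound on tokens/s when weight reads saturate memory bandwidth."""
    model_gb = params_billions * bytes_per_param  # GB read per token
    return bandwidth_gb_s / model_gb


if __name__ == "__main__":
    bandwidth = 153.6  # GB/s, sample value taken from the article
    for size in (1.5, 7, 70):
        rate = max_decode_tokens_per_s(size, 1.0, bandwidth)  # 1 byte/param assumed
        print(f"{size:>5}B params: at most {rate:.1f} tokens/s")
```

The bound falls in proportion to model size, which is why designs that cut the number of bytes moved per token matter more as models grow.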

Houmo re-engineered its first-generation product in under a year, launching the M30 chip for edge large model inference in early 2024. A key vote of confidence came from China Mobile, which used the chip to run a 60-billion-parameter model. In July, the company secured strategic funding from China Mobile’s digital economy funds in Beijing and Shanghai.

“People questioned why I pivoted,” Wu said. “But survival outweighed pride. Autonomous driving was a dead end. Edge AI is a new frontier—and there’s still space to lead.”

Houmo’s new accelerator lineup includes the Limou LM5050 and LM5070 cards, equipped with two and four M50 chips respectively. The four-chip LM5070 offers up to 640 TOPS for ultra-large model inference. The company’s transition from SRAM-based CIM to DRAM-based PIM (processing-in-memory) further boosts its hardware’s efficiency and scalability.
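The 640 TOPS figure is consistent with four M50 chips at 160 TOPS each. The short sketch below reproduces that arithmetic, assuming peak INT8 throughput simply adds across chips. Real throughput would depend on interconnect and workload, which the article does not describe, and the 320 TOPS value for the two-chip card is derived here rather than quoted.

```python
# Peak INT8 throughput per card, assuming linear scaling across M50 chips.

M50_INT8_TOPS = 160  # per-chip figure quoted in the article

# Chips per card, as stated in the article.
CARDS = {"LM5050": 2, "LM5070": 4}


def peak_int8_tops(chips: int, per_chip_tops: int = M50_INT8_TOPS) -> int:
    """Aggregate peak TOPS under the linear-scaling assumption."""
    return chips * per_chip_tops


if __name__ == "__main__":
    for card, chips in CARDS.items():
        print(f"{card}: {chips} chips -> {peak_int8_tops(chips)} TOPS (peak, INT8)")
```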

Wu says the firm is already developing its next-generation DRAM-PIM AI chip, expected as early as 2026. The chip aims to triple current energy efficiency and support widespread local deployment of models with tens of billions of parameters on devices like tablets and PCs.

“DRAM-PIM is the next step,” said Wu. “It tightens the bond between memory and compute, unlocking real-time intelligence at the edge.”

The WAIC 2025 event, themed “Intelligent Era, Shared Future,” featured more than 1,200 global guests, including 12 Turing and Nobel laureates, and showcased over 3,000 innovations. The exhibition space surpassed 70,000 square meters for the first time, drawing more than 800 companies and hosting more than 100 world-first or China-first product debuts.

As generative AI evolves, China is increasingly positioning edge AI as a key national focus. Public data suggests China’s compute-in-memory chip market could exceed 110 billion yuan ($15.2 billion) by 2030.

Houmo’s investors include Sequoia Capital China, Qiming Venture Partners, Matrix Partners China, Lenovo Capital, Walden International, and China Mobile.

“Edge intelligence is where the future is headed,” Wu said. “We’re not just building chips—we’re building the infrastructure for the intelligent era.”
