Artificial intelligence is all about scale, which is why AI companies are racing to pack ever more compute into their systems. The prevailing scaling laws indicate that more data and compute yield better performance in the form of smarter reasoning and stronger coding, which is likely why companies keep releasing large language models (LLMs) with ever-larger parameter counts.
While the world debates whether bigger models are worth their cost, and experts argue that scale alone will not produce real intelligence, a US deep-tech AI startup, Tiiny AI Inc., recently unveiled the world’s smallest personal AI supercomputer. Named Tiiny AI Pocket Lab, it has been officially verified by Guinness World Records under the category – The Smallest MiniPC (100 LLM Locally).
According to the company, this is the first time in AI supercomputing that a pocket-sized device can run a full 120-billion-parameter LLM entirely on-device, with no cloud connectivity, servers, or high-end GPUs.
“Tiiny envisions a world where powerful AI is no longer distant or exclusive. By bringing large-scale intelligence to the edge, we make advanced AI accessible, personal, and seamlessly integrated into everyday life,” read the company’s vision statement on its official website.
In simple terms, Tiiny AI Pocket Lab is a device that can run an LLM with as many as 120 billion parameters. The pocket-sized unit measures about 14.2 x 8 x 2.53 cm, weighs around 300 grams, and operates as a complete AI inference system. The device is significant because it represents a shift in how LLMs may be distributed and accessed by consumers.
The Tiiny AI Pocket Lab has been designed as an energy-efficient personal intelligence device. It operates within a 65W power envelope. Most notably, it offers large-model performance at a fraction of the energy consumption of conventional GPU-backed AI systems. At a time when cloud-based AI is increasingly grappling with soaring energy costs, sustainability concerns and growing privacy risks, Tiiny AI offers an alternative built around personal, portable and fully private intelligence. The company believes that the real bottleneck in today’s AI ecosystem is not computing power; rather, it is the dependence on the cloud.
What makes this notable is not raw power, but that all of this happens without the cloud. The device is powered by an ARMv9.2 12-core CPU with a dedicated neural processing unit delivering about 190 TOPS of AI compute and is backed by 80GB of LPDDR5X memory and 1TB of storage. According to the company, Tiiny AI Pocket Lab operates in the ‘golden zone’ of personal AI (10B-100B parameters), which is ideal for more than 80 per cent of real-world needs. It reportedly delivers intelligence comparable to GPT-4o, allowing PhD-level reasoning, multi-step analysis, and deep contextual understanding.
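Some back-of-the-envelope arithmetic shows why fitting a 120B-parameter model into 80GB of memory implies aggressive weight quantization. The sketch below is an illustration, not Tiiny's published method; the ~10% overhead factor for KV cache and activations is an assumption.

```python
# Rough estimate of LLM weight-memory footprint at common quantization levels,
# checked against the Pocket Lab's 80 GB of LPDDR5X memory.
# The 10% overhead for KV cache/activations is an illustrative assumption.

def model_memory_gb(params_billion: float, bits_per_weight: int,
                    overhead: float = 1.1) -> float:
    """Approximate memory needed to hold the model weights, in GB."""
    bytes_per_weight = bits_per_weight / 8
    return params_billion * 1e9 * bytes_per_weight * overhead / 1e9

for bits in (16, 8, 4):
    need = model_memory_gb(120, bits)
    verdict = "fits in" if need <= 80 else "exceeds"
    print(f"120B model @ {bits}-bit ~= {need:.0f} GB -> {verdict} 80 GB")
```

On these assumptions, a 120B model only fits the 80GB budget at around 4 bits per weight (roughly 66 GB), which is consistent with how on-device inference stacks typically serve models of this size.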
The AI system works on two core technologies, TurboSparse and PowerInfer, which make it possible to run large-parameter models viably in a compact form factor. TurboSparse is a neuron-level sparse activation technique that significantly improves inference efficiency. PowerInfer is an open-source inference engine that accelerates heavy LLM workloads by dynamically sharing computation across CPU and NPU. Combined, these technologies enable Tiiny AI Pocket Lab to demonstrate capabilities that previously required professional GPUs worth thousands of dollars.
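The intuition behind neuron-level sparse activation can be sketched in a few lines: if a predictor can tell in advance which neurons will be zeroed out by the activation function, their dot products can be skipped entirely. This toy example illustrates the general idea behind techniques like TurboSparse; it is not Tiiny's actual implementation, and the weights and "active" set are made up for illustration.

```python
# Toy illustration of neuron-level sparse activation.
# Neurons predicted to be zero after ReLU are skipped, saving their
# entire dot product; the result should match the dense computation.

def relu(v: float) -> float:
    return v if v > 0 else 0.0

def dense_layer(W, x):
    """Baseline: compute every neuron's dot product."""
    return [relu(sum(w * xi for w, xi in zip(row, x))) for row in W]

def sparse_layer(W, x, active):
    """Compute only neurons in the predicted-active set; others emit 0."""
    return [relu(sum(w * xi for w, xi in zip(W[i], x))) if i in active else 0.0
            for i in range(len(W))]

W = [[0.5, -0.2], [-0.9, -0.1], [0.3, 0.8]]  # toy 3-neuron layer
x = [1.0, 2.0]
active = {0, 2}  # hypothetical predictor: neuron 1 will be zeroed by ReLU

print(dense_layer(W, x))
print(sparse_layer(W, x, active))  # same output, one dot product skipped
```

In real models a large fraction of neurons are inactive for any given input, so skipping them cuts compute roughly in proportion to the sparsity, which is what makes large models tractable on low-power hardware.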
Tiiny AI Pocket Lab also offers one-click installation of prominent open-source models such as OpenAI GPT-OSS, Qwen, DeepSeek, Llama, Phi, and Mistral. This allows seamless deployment of open-source AI agents like OpenManus, ComfyUI, Flowise, Libra, Presenton, Bella, and SillyTavern. The company claims that users will receive continuous updates, including official OTA hardware upgrades. These features are slated for release at CES in January 2026.
Simply put, by reducing reliance on cloud servers, the Tiiny AI Pocket Lab cuts operational costs, reduces latency, and eases the sustainability concerns that have become integral to data-centre-scale operations. Most importantly, it makes advanced AI models accessible to individual users, particularly those in resource-limited environments.
Tiiny AI Inc. was founded in 2024; its team comprises engineers from MIT, Stanford, Intel, Meta, HKUST and SJTU.
Primary Source: The Indian Express
