Google’s Gemma Already Acts Like Gemini—Someone Made It Think Like Claude Opus Too

Posted on April 15, 2026


If you’ve been following the local AI scene, you probably know Qwopus—the open-source model that tried to distill Claude Opus 4.6’s reasoning into Alibaba’s Qwen, so you could run something resembling Opus on your own hardware for free. It worked surprisingly well. The obvious catch: Qwen is a Chinese model, and not everyone is comfortable with that.

Jackrong, the same pseudonymous developer behind that project, heard the feedback. His answer is Gemopus—a new family of Claude Opus-style fine-tunes built entirely on Google’s open-source Gemma 4. All-American DNA, same idea: frontier-level reasoning, running locally on hardware you already own.

The family comes in two flavors. Gemopus-4-26B-A4B is the heavier option—a Mixture of Experts model that has 26 billion total parameters but only activates around 4 billion during inference, which means it punches well above its weight on constrained hardware.

Parameters are what determine an AI’s capacity to learn, reason, and store information. Having 26 billion total parameters gives the model a huge breadth of knowledge. But by only “waking up” the 4 billion parameters relevant to your specific prompt, it delivers the high-quality results of a massive AI while remaining lightweight enough to run smoothly on everyday hardware.
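
To see how that works, here is a minimal, self-contained sketch of sparse Mixture of Experts routing in PyTorch. Everything in it is illustrative: the layer sizes, expert count, and top-k value are toy numbers, not Gemma 4's actual configuration. The mechanism is the point: a router scores the experts for each token, and only the top few are actually run.

```python
import torch
import torch.nn as nn

# Toy sparse Mixture-of-Experts block. Every expert contributes to the total
# parameter count, but each token is routed to only the top-k experts, so the
# compute (the "active" parameters) per token stays small.
# Sizes below are illustrative, not Gemma 4's real configuration.
class ToyMoE(nn.Module):
    def __init__(self, dim=256, n_experts=16, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)   # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                                    # x: (tokens, dim)
        weights, idx = self.router(x).topk(self.top_k, -1)   # keep k experts per token
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():                                # expert e runs only on its tokens
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

# Total parameters grow with n_experts; per-token compute grows only with top_k.
layer = ToyMoE()
print(sum(p.numel() for p in layer.parameters()))  # total capacity (toy scale)
print(layer(torch.randn(8, 256)).shape)            # torch.Size([8, 256])
```

At toy scale the totals are tiny, but the principle carries over directly: all 26 billion parameters of the 26B-A4B sit in memory as capacity, while only the routed ~4 billion do the work for any given token.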

The other is Gemopus-4-E4B, a 4-billion parameter edge model engineered to run comfortably on a modern iPhone or a thin-and-light MacBook—no GPU required.



The base model choice matters here. Google’s Gemma 4, released on April 2, is built directly from the same research and technology as Gemini 3—the company said so explicitly at launch. That means Gemopus carries something no Qwen-based fine-tune can claim: the DNA of Google’s own state-of-the-art closed model under the hood, with Anthropic’s thinking style layered on top. The best of both worlds, more or less.

What makes Gemopus different from the wave of other Gemma fine-tunes flooding Hugging Face right now is the philosophy behind it. Jackrong deliberately chose not to force Claude’s chain-of-thought reasoning traces into Gemma’s weights—a shortcut most competing releases take.

His argument, backed by recent research, is that stuffing a student model with a teacher’s surface-level reasoning text doesn’t actually transfer real reasoning ability. It teaches imitation, not logic. “There is no need for excessive imagination or superstitious replication of the Claude-style chain of thought,” the model card reads. Instead, he focused on answer quality, structural clarity, and conversational naturalness—fixing Gemma’s stiff Wikipedia tone and its tendency to lecture you about things you didn’t ask about.

AI infrastructure engineer Kyle Hessling ran independent benchmarks and published the results directly on the model card. His verdict on the 26B variant was pretty favorable. “Happy to have benched this one pretty hard and it is an excellent finetune of an already exceptional model,” he wrote on X. “It rocks at one-shot requests over long contexts, and runs incredibly fast thanks to the MOE (mixture of experts) architecture.”


The smaller E4B variant passed all 14 core competence tests—instruction following, coding, math, multi-step reasoning, translation, safety, caching—and cleared all 12 long-context tests at 30K and 60K tokens. On needle-in-haystack retrieval, it passed 13 out of 13 probes including a stretch test at one million tokens with YaRN 8× RoPE scaling.

The 26B extends natively to 131K context and all the way out to 524K with YaRN, which Hessling also stress-tested: “It also crushed my simple needle-in-the-haystack tests all the way out to an extended context of 524k!”
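
For a sense of what that kind of extension looks like in practice, here is a hedged sketch using the llama-cpp-python bindings. The GGUF file name is made up, the parameter and constant names follow recent llama-cpp-python releases, and many GGUF builds already ship RoPE-scaling metadata, in which case simply raising the context size may be enough.

```python
from llama_cpp import Llama, LLAMA_ROPE_SCALING_TYPE_YARN

# Hedged example: push a GGUF build past its native window with YaRN scaling.
# The file name is illustrative; parameter names assume a recent llama-cpp-python.
llm = Llama(
    model_path="gemopus-4-26b-a4b.Q4_K_M.gguf",      # illustrative file name
    n_ctx=524_288,                                    # extended window (~524K tokens)
    rope_scaling_type=LLAMA_ROPE_SCALING_TYPE_YARN,   # force YaRN instead of model default
    yarn_orig_ctx=131_072,                            # the model's native 131K window
    n_gpu_layers=-1,                                  # offload every layer that fits
)
```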

On edge hardware, the E4B is genuinely fast. Jackrong reports 45–60 tokens per second on an iPhone 17 Pro Max, and 90–120 tokens per second on a MacBook Air M3/M4 via MLX. The 26B’s MoE architecture lets it offload gracefully onto unified-memory systems or GPUs with under 10GB of VRAM. Hessling called it his daily-driver recommendation for VRAM-starved setups.
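
Those Mac numbers are straightforward to check yourself with Apple’s mlx-lm package. The sketch below assumes an MLX conversion of the E4B exists under the repo name shown, which is hypothetical; substitute whatever conversion you actually download.

```python
from mlx_lm import load, generate

# Minimal on-device inference with Apple's mlx-lm.
# The repo id below is hypothetical; swap in the MLX conversion you actually use.
model, tokenizer = load("mlx-community/Gemopus-4-E4B-4bit")

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Summarize the tradeoffs of MoE models in three sentences."}],
    add_generation_prompt=True,
)

# verbose=True prints generation speed, so you can compare against the
# 90-120 tokens-per-second figure quoted for MacBook Air M3/M4.
print(generate(model, tokenizer, prompt=prompt, max_tokens=200, verbose=True))
```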

Both models are available in GGUF format, which means you can drop them straight into LM Studio or llama.cpp without configuration. The full training code and a step-by-step fine-tuning guide are on Jackrong’s GitHub—same pipeline he used for Qwopus, same Unsloth and LoRA setup, reproducible on Colab.
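
In rough outline, that setup looks like the sketch below. It is a hedged reconstruction rather than Jackrong’s exact recipe: the base checkpoint name and LoRA hyperparameters are placeholders, and the real values, along with the full training loop, live in his repository.

```python
from unsloth import FastLanguageModel

# Hedged sketch of an Unsloth + LoRA setup, not Jackrong's exact recipe.
# The checkpoint name and hyperparameters are placeholders.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="google/gemma-4-e4b-it",   # placeholder id for the Gemma 4 base
    max_seq_length=4096,
    load_in_4bit=True,                    # QLoRA-style 4-bit base keeps this Colab-sized
)

# Attach small LoRA adapters; the frozen base stays untouched, which is why
# the whole run fits on a single free-tier GPU.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                                 # LoRA rank: adapter size vs. expressiveness
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```

From there, a standard supervised fine-tune over the curated answer-quality data produces the LoRA adapters, which would typically be merged back into the base weights before a GGUF export.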

Gemopus is not without its rough edges. Tool calling remains broken across the entire Gemma 4 series in llama.cpp and LM Studio—call failures, format mismatches, loops—so if your workflow depends on agents using external tools, this is not your model yet. Jackrong himself calls it “an engineering exploration reference rather than a fully production-ready solution,” and recommends his own Qwopus 3.5 series for anyone who needs something more stable for real workloads.

And because Jackrong deliberately avoided aggressive Claude-style chain-of-thought distillation, don’t expect it to feel as deeply Opus-brained as Qwopus—that was a conscious tradeoff for stability, not an oversight.

Hessling elaborated on X: “Yeah the philosophy on this one was stability first, it is my understanding that the Gemma models tend to become unstable if you force a bunch of Claude thinking traces into them, you can see this when testing many other Opus gemma fine tunes on hugging face. Jackrong tried a…” — Kyle Hessling (@KyleHessling1), April 10, 2026

For those who want to go deeper into Gemma fine-tuning for reasoning specifically, there is also a separate community project worth watching: Ornstein by pseudonymous developer DJLougen, which takes the same 26B Gemma 4 base and focuses on improving its reasoning chains without borrowing the logic or style of any specific third-party model.

One honest caveat: Gemma’s training dynamics are messier than Qwen’s for fine-tuners—wider loss fluctuations, more hyperparameter sensitivity. Jackrong says so himself. If you need a more battle-tested local model for production workflows, his Qwopus 3.5 series remains more robustly validated. But if you want an American model with Opus-style polish, Gemopus is currently your best available option. A denser 31B Gemopus variant is also in the pipeline, with Hessling teasing it as “a banger for sure.”

If you want to try running local models on your own hardware, check our guide on how to get started with local AI.

