Google’s Gemini Omni Enters Market Defined by DeepSeek’s Price Collapse
Native multimodal architecture addresses enterprise deployment fragmentation, but Chinese cost disruption has already reshaped foundational model economics.
Google launched Gemini Omni Flash on May 19, positioning unified multimodal processing against OpenAI and Anthropic while DeepSeek’s 90% price cuts—effective since April 26—have fundamentally altered competitive dynamics in the foundational model market.
The release introduces native multimodal-to-multimodal inference, collapsing previously separate vision, audio, and video models into a single architecture. Omni Flash is designed to accept any input modality and generate video, image, or audio output without intermediate conversion steps. Initial deployment limits video generation to 10-second clips, with Omni Pro handling longer content; image and audio output capabilities will arrive in subsequent releases, per MindStudio.
API access was promised “in coming weeks” at the May 19 announcement, but no firm date had been announced as of May 23. Pricing remains unpublished. Mid-case projections estimate $2.00 per million input tokens and $0.40 per second of video output, interpolated from existing Veo 3.1 ($0.05-$0.60/sec) and Gemini 3.5 Flash ($1.50/$9.00) pricing structures, according to TECHSY.
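To make the projected rates concrete, the following minimal sketch estimates the cost of a single request. It assumes the projection means $2.00 per million input tokens plus $0.40 per second of generated video; both rates are the unconfirmed mid-case estimate, not published Google pricing, and the function name and example token count are hypothetical.

```python
# Hypothetical cost estimator. The rates below are the unconfirmed
# mid-case projection cited in the article, not published pricing.
PROJECTED_INPUT_USD_PER_M_TOKENS = 2.00   # assumption: billed per million input tokens
PROJECTED_VIDEO_USD_PER_SECOND = 0.40     # assumption: billed per second of video output


def estimate_request_cost(input_tokens: int, video_seconds: float) -> float:
    """Return the estimated USD cost of one request under the projected rates."""
    input_cost = input_tokens / 1_000_000 * PROJECTED_INPUT_USD_PER_M_TOKENS
    output_cost = video_seconds * PROJECTED_VIDEO_USD_PER_SECOND
    return input_cost + output_cost


# Example: a 5,000-token prompt (illustrative) producing one 10-second clip.
cost = estimate_request_cost(input_tokens=5_000, video_seconds=10)
print(f"Estimated cost: ${cost:.2f}")
```

Under these assumed rates, the per-second output charge dominates: the 10-second clip accounts for $4.00 of the total, while the prompt adds about one cent.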
“Omni is described as a ‘native multimodal’ model where all modalities are treated as first-class citizens during training,” per MindStudio analysis. Previous multimodal systems required separate models for each modality with intermediate conversion steps, such as a text-to-image pipeline followed by image-to-video processing. Omni’s unified architecture reduces latency, eliminates transcoding artifacts, and simplifies enterprise deployment infrastructure.
DeepSeek’s Price Disruption Reshapes Competition
While Google optimized architecture, DeepSeek upended unit economics. The Chinese lab reduced cache hit pricing to $0.0037 per million tokens on April 26—one-tenth the launch price and roughly 1/30th the combined cost of OpenAI GPT-5.5 and Anthropic Claude Opus. Output pricing fell to $0.878/M with a 75% discount extended through May 31, per KuCoin.
The pricing collapse preceded Omni’s launch by about three weeks, establishing a cost baseline that Google’s unpublished API rates will be measured against. DeepSeek’s strategy shifts the basis of competition from capability leadership to cost-at-scale dominance. “Foundation large models are rapidly becoming ‘infrastructure-like,’ similar to water and electricity,” DeepSeek analysts noted in material reviewed by KuCoin. “The future competitive focus will shift comprehensively from competition over single model parameter scale to optimization of inference costs and market share of developer ecosystems.”
ByteDance’s Doubao model now processes 120+ trillion tokens daily, driven by AI-generated video content. China surpassed US weekly token usage in March 2026, with Chinese models occupying the top three positions globally in total usage, per CEIBS reporting in April. The adoption gap reflects DeepSeek’s cost advantage translating into deployment velocity in price-sensitive markets.
Infrastructure Advantage vs. Margin Defense
Google’s response centers on infrastructure scale. CEO Sundar Pichai confirmed that capital expenditure of $180-190 billion is expected by the end of 2026, focused on AI infrastructure and chips. AI Mode in Google Search surpassed 1 billion monthly users as of May 19, with queries more than doubling each quarter since launch, according to the official Google blog.
That user base provides distribution leverage competitors lack. Enterprise tier access for Gemini Omni through Google Cloud arrives in Q3 2026. Bundling multimodal capabilities with existing cloud contracts reduces customer acquisition costs and locks in recurring revenue—critical as unit economics compress.
“Frontier labs will try to hold the line at first. But even with DeepSeek’s 90% lower cached input token prices on the table, gross token spend will keep surging. Jevons Paradox is undefeated.”
— Val Bercovici, Chief AI Officer at Weka
Video generation infrastructure requirements amplify semiconductor demand. Production deployment requires a minimum of 48GB of VRAM, with H200 GPUs (141GB HBM3e) needed for state-of-the-art models. Achieving throughput of 10-100 videos per hour demands an A100 at minimum and ideally an H100, per GMI Cloud analysis. Video’s 10-100x token multiplier versus text accelerates infrastructure strain across the industry.
Market Concentration and Regulatory Lag
The foundational model market reached a $12 billion valuation in 2026 and is projected to hit $19.89 billion by 2030 at a 13.5% CAGR, driven by multimodal AI advancements, per Research and Markets. Growth is concentrating among fewer players. A single firm holds 80%+ market share in three AI infrastructure segments, with the top three firms collectively holding 60%+ in three others, the OECD reported in November 2025.
- High fixed training costs with low marginal deployment costs create natural monopoly dynamics
- Long lead times and proprietary software ecosystems raise barriers to entry
- Regulatory frameworks lag market consolidation by 12-18 months
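The market-size projection cited in this section can be sanity-checked with the standard compound-growth formula. The sketch below assumes four compounding years (2026 to 2030); the function names are illustrative and do not come from the Research and Markets report.

```python
def project_value(start: float, annual_rate: float, years: int) -> float:
    """Compound a starting value forward at a fixed annual growth rate."""
    return start * (1 + annual_rate) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Back out the annual growth rate implied by a start and end value."""
    return (end / start) ** (1 / years) - 1


# Figures from the article, in billions of USD.
START_2026 = 12.0
END_2030 = 19.89
YEARS = 4  # assumption: 2026 -> 2030 counts as four compounding periods

print(f"Projected 2030 value at 13.5%: ${project_value(START_2026, 0.135, YEARS):.2f}B")
print(f"CAGR implied by the endpoints: {implied_cagr(START_2026, END_2030, YEARS):.2%}")
```

Compounding $12 billion at 13.5% for four years gives roughly $19.9 billion, so the cited endpoint and growth rate are consistent with each other to within rounding.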
Foundation models exhibit natural monopoly tendencies, requiring a two-pronged regulatory approach: antitrust measures to maintain contestability plus quality standards covering safety, privacy, reliability, and interoperability, per Brookings Institution analysis in October 2025. No such framework has materialized in major jurisdictions.
US-China Competition Enters “Decathlon” Mode
Google Gemini 3 Pro leads the May 2026 leaderboard with an Arena score of 1490, followed by xAI Grok-4.1 (1477) and Claude Opus 4.5 (1469). But capability advantages have narrowed. Google DeepMind CEO Demis Hassabis assessed in January that China trails the US by 6-12 months on model performance, down from a 24-month gap two years prior, per Pinggy reporting.
“China’s lead in open-source AI and applied AI focus could prove winning formula for global market share,” Atlantic Council assessed in January. “US-China AI competition entering ‘decathlon’ mode across multiple dimensions—model capability, compute access, infrastructure control, standards.”
The competition splits along capability versus adoption axes. US labs maintain frontier model performance but face margin compression from Chinese cost efficiency. DeepSeek’s promotional pricing, a 75% discount that expires May 31, will revert to regular rates ($3.48/M output), testing whether promotional adoption converts to sustained market share or proves to be temporary arbitrage.
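The size of the price reversion follows directly from the figures quoted in this article. The sketch below uses only those numbers ($3.48/M regular output rate, 75% discount, $0.878/M promotional output rate); the function name is hypothetical, and the small gap between the computed and quoted promotional price is not explained in the source.

```python
def discounted_price(regular_price: float, discount: float) -> float:
    """Apply a fractional discount (0.75 means 75% off) to a regular price."""
    return regular_price * (1 - discount)


# Figures quoted in the article, in USD per million output tokens.
REGULAR_OUTPUT_PRICE = 3.48
QUOTED_PROMO_PRICE = 0.878
DISCOUNT = 0.75

computed_promo = discounted_price(REGULAR_OUTPUT_PRICE, DISCOUNT)
reversion_multiple = REGULAR_OUTPUT_PRICE / QUOTED_PROMO_PRICE

print(f"Promo price implied by a 75% discount: ${computed_promo:.3f}/M")
print(f"Price increase when the promotion ends: {reversion_multiple:.2f}x")
```

A 75% discount on $3.48 works out to about $0.87/M, close to the quoted $0.878/M, and reverting to the regular rate would raise output costs roughly fourfold for customers who adopted during the promotion.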
What to Watch
Gemini Omni API pricing documentation from Vertex AI and AI Studio will determine whether Google defends 50%+ gross margins or matches DeepSeek’s cost structure. Publication is expected by mid-June based on the “coming weeks” guidance from May 19.
June-July token usage metrics from China will confirm whether the April adoption surge reflected promotional pricing arbitrage or a durable shift in deployment patterns. Watch for ByteDance Doubao usage trends and DeepSeek post-discount retention rates.
Regulatory movement on foundational model concentration bears monitoring across the US, EU, and UK. Brookings’ natural monopoly framework provides an analytical baseline; actual antitrust action or interoperability mandates would materially reshape competitive dynamics.
Semiconductor order flows from Google, ByteDance, and hyperscalers will signal the pace of infrastructure buildout. Allocation of H200 and next-generation HBM3e capacity will determine which labs can scale video generation to production volumes. Lead times remain 9-12 months, making Q2-Q3 2026 orders decisive for 2027 capacity.