QUICK INFO BOX
| Attribute | Details |
|---|---|
| Company Name | Sakana AI |
| Founders | David Ha (Co-Founder, CEO), Llion Jones (Co-Founder, CTO) |
| Founded Year | 2023 |
| Headquarters | Tokyo, Japan |
| Industry | Artificial Intelligence / Machine Learning / Foundation Models |
| Sector | AI Research / Evolutionary AI / Model Merging |
| Company Type | Private |
| Key Investors | Khosla Ventures, Lux Capital, NVIDIA, SoftBank Vision Fund, Sony Innovation Fund, NTT DOCOMO Ventures |
| Funding Rounds | Seed, Series A |
| Total Funding Raised | $165 Million |
| Valuation | $1.5 Billion (Series A, February 2024) |
| Number of Employees | 60+ (February 2026) |
| Key Products / Services | Evolutionary Model Merge, Nature-Inspired AI Architecture, Foundation Models, AI Research Platform |
| Technology Stack | PyTorch, JAX, Evolutionary Algorithms, Neural Architecture Search, Transformer Models |
| Revenue (Latest Year) | Research Stage (pre-revenue, February 2026) |
| Customer Base | Research Partnerships (Sony, NTT, Universities) |
| Social Media | Website, Twitter, Blog |
Introduction
AI development is hitting scaling limits. GPT-4 reportedly cost $100M+ to train (10,000+ GPUs running for months), and Claude 3 required similar resources, yet improvements are diminishing. The “scaling law” paradigm (bigger models = better performance) faces fundamental constraints:
- Compute costs: Training 1T+ parameter models costs $500M-1B+ (economically unsustainable)
- Data scarcity: High-quality text data exhausted—models trained on entire internet
- Energy consumption: Training runs using 50-100MW (environmental concerns, grid capacity)
- Diminishing returns: GPT-4 → GPT-5 gains smaller than GPT-2 → GPT-3
- Monolithic design: Single massive model (inflexible, expensive to update, hard to specialize)
Meanwhile, nature solves intelligence differently:
Evolution doesn’t build one giant organism—it creates diverse ecosystems (millions of species, each specialized). Intelligence emerges through:
- Diversity: Different organisms solving different problems
- Combination: Sexual reproduction mixing genes, creating novel traits
- Selection: Keeping what works, discarding what doesn’t
- Modularity: Specialized organs (eyes, brains, wings) composed into organisms
- Efficiency: Insects with 100K neurons navigate, hunt, communicate (vs. AI needing billions of parameters)
What if we built AI like nature builds organisms—merging small models (not training giant ones), evolving architectures (not hand-designing), creating diversity (not monolithic systems)?
Enter Sakana AI (“sakana” = fish in Japanese), the Tokyo-based AI research lab pioneering nature-inspired AI through evolutionary model merging. Founded in 2023 by David Ha (CEO, ex-Google Brain researcher known for World Models and evolutionary algorithms) and Llion Jones (CTO, co-creator of the Transformer at Google), Sakana AI develops techniques for merging multiple small models into specialized experts, achieving GPT-4-class performance at roughly 10x lower training cost through evolutionary combination rather than monolithic scaling.
As of February 2026, Sakana AI operates at a $1.5 billion valuation with $165 million in funding from Khosla Ventures, Lux Capital, NVIDIA, SoftBank Vision Fund, Sony Innovation Fund, and NTT DOCOMO Ventures. The company employs 60+ researchers and engineers in Tokyo, making it Japan’s highest-valued AI startup. Sakana AI’s research remains in a pre-commercial phase—publishing papers, collaborating with Sony/NTT, and demonstrating evolutionary model merging (EvoMerge) techniques that achieve state-of-the-art results on specialized tasks.
What makes Sakana AI revolutionary:
- Evolutionary model merging: Combining 10-20 small open-source models (Llama 3 8B, Mistral 7B) → specialized expert matching GPT-4 on specific domains—at 90% lower training cost
- Nature-inspired architecture: Neural networks evolving like organisms—discovering novel architectures through mutation, crossover, selection
- Collective intelligence: Diverse model ensembles (not monolithic) collaborating—like ant colonies achieving complex behaviors through simple agents
- Japanese AI leadership: Building sovereign AI capability (Tokyo HQ, Japanese focus)—avoiding US/China dependence
- Open research: Publishing methods, contributing to open-source—democratizing AI beyond Big Tech
The market opportunity spans the $150+ billion foundation model market, $50+ billion in AI research, and $500+ billion in sovereign AI (governments seeking domestic AI capabilities). Every AI lab faces scaling limits; Sakana AI provides an alternative: merging existing models (cheaper, faster) rather than training from scratch.
Sakana AI competes with OpenAI ($80B valuation, scaling paradigm), Anthropic ($18B valuation, constitutional AI), Google DeepMind (scaling + efficiency research), Mistral AI ($6B valuation, efficient models), EleutherAI (open-source collective), and Cohere ($5.5B valuation, enterprise foundation models). Sakana AI differentiates through evolutionary approach (merging vs. training), nature-inspired philosophy (ecosystems vs. monoliths), Japan focus (Tokyo HQ, Japanese government/corporate partnerships), and founder pedigree (Transformers co-creator + world-class evolutionary AI researcher).
The founding story reflects philosophical conviction: Llion Jones (co-created Transformers with Vaswani, Parmar, et al. at Google 2017) and David Ha (pioneered World Models—reinforcement learning agents learning through imagination) believed AI’s future lay in biological principles (evolution, diversity, modularity) rather than brute-force scaling. After leaving Google (2023), they founded Sakana AI in Tokyo—combining Western AI expertise with Japanese manufacturing philosophy (kaizen, lean production, harmony with nature).
This comprehensive article explores Sakana AI’s journey from research vision to the $1.5 billion nature-inspired AI lab redefining how foundation models are built.
Founding Story & Background
The Scaling Crisis (2022-2023)
By 2022, AI scaling paradigm faced challenges:
GPT-3 (2020): 175B parameters, $4-10M training cost
PaLM (Google, 2022): 540B parameters, $10-20M estimated
GPT-4 (OpenAI, 2023): 1.7T parameters (rumored), $100M+ training cost
Scaling law: Performance improving predictably with model size, data, compute. But:
- Cost explosion: Training costs 10x every 2 years (economically unsustainable)
- GPU scarcity: H100s supply-constrained, multi-month wait times
- Energy concerns: Training runs rivaling small towns’ electricity consumption
- Diminishing returns: GPT-4 → GPT-5 improvements smaller than previous generations
Meanwhile, nature achieves intelligence efficiently:
Bee brain: 1 million neurons (vs. GPT-4’s trillions of parameters), navigates complex environments, communicates via dance, recognizes faces
Octopus: 500 million neurons (1/3 in brain, 2/3 distributed in arms), solves puzzles, uses tools, changes color instantly
Insight: Nature uses diversity (many specialized organisms) + modularity (specialized organs) + evolution (combining/selecting) rather than monolithic scale.
Llion Jones and the Transformers Legacy
Llion Jones co-created Transformers at Google Brain (2017)—the “Attention Is All You Need” paper revolutionizing AI:
Before Transformers (2017):
- RNNs/LSTMs: Sequential processing, slow, limited context
After Transformers:
- Parallel processing: Attention mechanism computing all positions simultaneously
- Long-range dependencies: Modeling relationships across thousands of tokens
- Foundation models: GPT, BERT, T5, LLaMA, Claude—all use Transformers
Impact: 100K+ citations, basis for $100B+ AI industry
Yet by 2022, Jones grew concerned about scaling paradigm’s sustainability. At Google, he witnessed:
- Compute battles: Teams competing for GPU quota
- Training failures: Multi-million-dollar runs crashing, restarting
- Monolithic thinking: Bigger is always better
Jones believed next breakthrough wouldn’t come from scaling—but from smarter architectures.
David Ha and World Models
David Ha (Google Brain researcher, 2018-2023) pioneered World Models (2018)—reinforcement learning agents learning to imagine:
Problem: RL agents need millions of environment interactions (expensive, slow)
Solution: Train world model (neural network predicting environment dynamics)—agent trains in imagination (not real environment)
Result: Car racing agent learning in 1 hour (vs. 10+ hours traditional RL), using 10x fewer parameters
Ha’s research emphasized:
- Efficiency: Small models trained cleverly outperform large models trained naively
- Evolutionary algorithms: Neural architecture search through mutation, selection
- Visual intelligence: Agents learning from pixels (not hand-crafted features)
By 2023, Ha believed evolution (not gradient descent alone) would unlock next AI generation.
2023: Founding Sakana AI in Tokyo
In July 2023, Jones and Ha departed Google Brain and founded Sakana AI in Tokyo, Japan—deliberate location choice:
Why Tokyo?
- Talent: Japanese universities (Tokyo, Kyoto) producing top ML researchers
- Government support: Japanese government investing $13B in AI sovereignty (competing with US/China)
- Corporate partners: Sony, NTT, SoftBank seeking domestic AI capabilities
- Cultural fit: Japanese manufacturing philosophy (kaizen = continuous improvement, muda elimination = waste reduction) aligning with efficiency-focused AI
- Lower costs: Tokyo office/salary costs 30-40% lower than San Francisco
Mission: “Build AI inspired by nature’s evolutionary principles—diverse, efficient, sustainable.”
Research focus:
- Evolutionary model merging: Combining existing models (not training from scratch)
- Neural architecture search: Evolving architectures automatically
- Collective intelligence: Multi-agent systems, swarm intelligence
- Foundation models: Japanese-language specialization
2023: Seed and Initial Research
Seed (August 2023): $30 Million
- Lead: Khosla Ventures
- Additional: Lux Capital, NVIDIA
- Purpose: Core team (10 researchers), compute infrastructure, initial experiments
Khosla Ventures (Vinod Khosla, Sun Microsystems founder) provided:
- Patient capital: Long-term R&D focus (not revenue pressure)
- Conviction: Khosla personally convinced by evolutionary approach
By December 2023, Sakana AI published first paper: “Evolutionary Model Merge: Creating Specialized Experts Without Training”
Method:
- Start with open-source models: Llama 2 7B, Mistral 7B, Code Llama, etc. (10-20 models)
- Define task: E.g., “Generate Python code solving LeetCode problems”
- Evolutionary search: Try combinations (averaging weights, merging architectures, ensembling); the basic merge step is sketched in code after this list
- Selection: Keep combinations performing best on validation set
- Iteration: Mutate (adjust merge weights), crossover (combine successful merges), repeat
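At its core, each candidate “combination” is just arithmetic on parameter tensors. Below is a minimal sketch of that single merge step for two architecturally compatible checkpoints, using plain linear interpolation and SLERP (spherical interpolation); this is an illustrative, assumption-laden sketch, not Sakana’s published implementation:

```python
# Minimal sketch of one merge step: interpolate two compatible checkpoints.
# Assumes both models share the same architecture and tokenizer (hypothetical inputs).
import copy
import torch

def linear_merge(model_a, model_b, alpha=0.5):
    """Weighted average of parameters: merged = alpha * A + (1 - alpha) * B."""
    merged = copy.deepcopy(model_a)
    state_a, state_b = model_a.state_dict(), model_b.state_dict()
    merged.load_state_dict({
        name: alpha * state_a[name] + (1.0 - alpha) * state_b[name]
        for name in state_a
    })
    return merged

def slerp(t, a, b, eps=1e-8):
    """Spherical interpolation between two parameter tensors of the same shape."""
    a_unit, b_unit = a / (a.norm() + eps), b / (b.norm() + eps)
    omega = torch.acos((a_unit * b_unit).sum().clamp(-1.0, 1.0))  # angle between them
    so = torch.sin(omega)
    if so.abs() < eps:                      # nearly parallel: fall back to linear
        return (1.0 - t) * a + t * b
    return (torch.sin((1.0 - t) * omega) / so) * a + (torch.sin(t * omega) / so) * b
```

The evolutionary search then varies the interpolation weights (and which models or layers participate), keeping whichever variants score best on the task’s validation set.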
Results (December 2023 paper):
- Code generation: Merged model (from Llama 2 7B + Code Llama 7B + Mistral 7B) scoring 75% on HumanEval (vs. 67% for best individual model)
- Training cost: $0 (using existing models), vs. $500K+ training from scratch
- Specialization: Merged model excellent on code (better than GPT-3.5 Turbo 67%), poor on poetry (acceptable trade-off for specialized use)
2024: Series A, Japanese Government Partnership, Scaling Research
Series A (February 2024): $135 Million
- Lead: Lux Capital, Khosla Ventures
- Additional: NVIDIA, SoftBank Vision Fund, Sony Innovation Fund, NTT DOCOMO Ventures
- Valuation: $1.5 Billion (unicorn status)
- Purpose: Scaling team (10 → 40), Japanese-language models, larger compute (1,000+ A100 GPUs), commercialization research
SoftBank Vision Fund investment signaled:
- Japanese validation: SoftBank betting on domestic AI champion
- Distribution: Access to SoftBank portfolio (100+ companies)
Sony Innovation Fund provided:
- Entertainment use cases: AI for gaming (PlayStation), music, film production
- Hardware integration: AI running on Sony devices (cameras, TVs, robotics)
NTT DOCOMO (largest Japanese telecom) enabled:
- Telecom AI: Customer service, network optimization
- Edge deployment: AI on mobile devices (efficient models)
Japanese government partnership (March 2024):
- $50M grant: Developing Japanese-language foundation models (reducing dependency on US models)
- Compute access: government-funded supercomputers (ABCI, Fugaku)
- Research collaboration: Joint projects with Tokyo University, RIKEN
2024-2026: Research Breakthroughs and Llama 3 Merging
In 2024-2026, Sakana AI published:
“Evolutionary Optimization of Model Merging Recipes” (May 2024):
- Method: Using genetic algorithms to find optimal merge strategies (50+ open-source models)
- Results: Merged model achieving MMLU 85% (vs. 80% for best individual model), rivaling GPT-4 (claimed 86%)
- Cost: $50K compute (evolutionary search on A100s), vs. $50M+ training GPT-4-class model
“Collective Intelligence in Language Models” (November 2024):
- Method: Ensemble of 10 specialized models (code, math, creative writing, reasoning) collaborating—routing questions to experts (a toy router is sketched after this list)
- Results: Overall performance +10-15% vs. monolithic model of equivalent total parameter count
- Insight: Diversity beats scale (10 specialists better than 1 generalist)
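The routing idea can be sketched with a toy dispatcher that sends each query to whichever specialist seems most relevant. Everything below (the keyword router, the specialist callables) is a hypothetical placeholder rather than Sakana’s implementation; a production system would use a learned router or a small language model:

```python
# Toy sketch of routing queries to specialist models (hypothetical placeholders).
from typing import Callable, Dict

# Stand-ins for specialized merged models; in practice these would wrap generate()
# calls against code / math / writing / reasoning experts.
SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "code":      lambda q: f"[code expert] {q}",
    "math":      lambda q: f"[math expert] {q}",
    "writing":   lambda q: f"[writing expert] {q}",
    "reasoning": lambda q: f"[reasoning expert] {q}",
}

# Crude keyword scoring; a learned classifier would normally decide the route.
KEYWORDS = {
    "code": ["python", "function", "bug", "compile"],
    "math": ["integral", "prove", "equation", "probability"],
    "writing": ["poem", "story", "rewrite", "essay"],
}

def route(query: str) -> str:
    q = query.lower()
    scores = {domain: sum(word in q for word in words) for domain, words in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    if scores[best] == 0:        # nothing matched: fall back to the generalist
        best = "reasoning"
    return SPECIALISTS[best](query)

print(route("Write a Python function that reverses a linked list"))
```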
“Llama 3 Evolutionary Merge” (April 2025):
- Method: Merging Llama 3 8B, 70B, 405B models with domain-specific fine-tunes (medical, legal, code)
- Results: 70B merged model matching Llama 3 405B on specialized tasks (medical QA, legal reasoning)—using 6x fewer parameters
- Commercialization: Demonstrating cost savings for enterprises (run 70B on single server, not 405B on 8 servers)
“Japanese-Language Foundation Model via Evolutionary Merge” (October 2025):
- Sakana-JP: Merged model optimized for Japanese (combining Llama 3, Mistral, Japanese BERT, Japanese GPT models)
- Results: Best-in-class Japanese performance—beating GPT-4 on Japanese benchmarks (JGLUE, JCommonsenseQA)
- Strategic: Providing Japan with sovereign AI capability (not dependent on OpenAI/Anthropic)
By February 2026:
- 60+ employees (40 researchers, 20 engineers/operations)
- 12 published papers (Nature, NeurIPS, ICML, ICLR)
- 100+ research partnerships (Sony, NTT, Tokyo University, Kyoto University)
- Sakana-JP model deployed at 20+ Japanese companies (NTT, Sony, Rakuten)
Founders & Key Team
| Relation / Role | Name | Previous Experience / Role |
|---|---|---|
| Co-Founder, CTO | Llion Jones | Co-creator of Transformers (“Attention Is All You Need” 2017), Google Brain Research Scientist (2016-2023) |
| Co-Founder, CEO | David Ha | Creator of World Models, Google Brain Researcher (2018-2023), Evolution Strategies, Visual Intelligence |
| Chief Scientist | Yutaka Matsuo | Professor at University of Tokyo, Japanese AI Society President, Deep Learning Pioneer in Japan |
| VP Research | Takanori Maehara | Ex-RIKEN (Japan’s largest research institution), Algorithms, Optimization |
| Head of Engineering | Koki Nagano | Ex-NVIDIA Research (graphics, simulation), PhD USC, Character Animation |
Llion Jones (CTO) brings the Transformer legacy, having co-created the architecture powering the $100B+ AI industry. His Google Brain tenure (2016-2023) gave him a close view of scaling limits, leading to his evolutionary conviction. With a Welsh background and fluent Japanese (he lived in Tokyo), he bridges Western AI expertise with Japanese culture.
David Ha (CEO) is an evolutionary AI pioneer: World Models, Evolution Strategies, Neural Architecture Search. His research emphasis on efficiency (small models trained cleverly) directly informs the Sakana philosophy. A visual communicator known for his illustrated research blog posts, he conveys complex ideas intuitively.
Yutaka Matsuo (Chief Scientist) is Japan’s leading AI researcher—University of Tokyo professor, Japanese AI Society president, advising government on AI policy. His involvement provides academic credibility, government access, Japanese AI community connections.
Funding & Investors
Seed (August 2023): $30 Million
- Lead Investor: Khosla Ventures
- Additional Investors: Lux Capital, NVIDIA
- Purpose: Core team, compute infrastructure, initial evolutionary merge research
Series A (February 2024): $135 Million
- Lead Investors: Lux Capital, Khosla Ventures
- Additional Investors: NVIDIA, SoftBank Vision Fund, Sony Innovation Fund, NTT DOCOMO Ventures
- Valuation: $1.5 Billion (unicorn status)
- Purpose: Scaling team (10 → 60), Japanese-language models, 1,000+ A100 GPUs, commercialization, government partnerships
Total Funding Raised: $165 Million
Sakana AI deployed capital across:
- Research talent: $40-60M (recruiting top ML researchers—competitive with Google/DeepMind salaries in Tokyo)
- Compute infrastructure: $30-50M in A100/H100 GPUs, cloud credits (AWS Tokyo, GCP)
- Japanese-language foundation model development: $50M (funded by the Japanese government grant)
- Partnerships: $10-20M corporate collaborations (Sony, NTT)
Product & Technology Journey
A. Evolutionary Model Merge (Core Method)
Traditional approach:
- Pre-training: Train foundation model from scratch ($1M-100M+, months)
- Fine-tuning: Adapt to specific task ($10K-100K, days-weeks)
Sakana’s evolutionary approach:
- Collect models: Download 10-20 open-source models (Llama 3, Mistral, Falcon, Code Llama, Meditron)
- Define task: Specify domain (medical QA, code generation, Japanese language)
- Evolutionary search:
```python
# Simplified evolutionary merge algorithm (illustrative sketch, not Sakana's released code)
import copy

import numpy as np
from transformers import AutoModelForCausalLM, AutoTokenizer

# Step 1: Load models. Weight-space merging requires checkpoints that share an
# architecture and tokenizer (e.g., a family of Llama-based fine-tunes); the
# mixed list below mirrors the prose and is illustrative only.
model_names = [
    "meta-llama/Meta-Llama-3-8B",
    "mistralai/Mistral-7B-v0.1",
    "codellama/CodeLlama-7b-hf",
    # ... 10-20 models total
]
models = [AutoModelForCausalLM.from_pretrained(name) for name in model_names]
tokenizer = AutoTokenizer.from_pretrained(model_names[0])

# Held-out validation data for the target task (e.g., medical QA)
eval_dataset = [
    ("Q: Which vitamin deficiency causes scurvy? A:", "Vitamin C"),
    # ... more (prompt, answer) pairs
]

# Merge models according to a recipe. Simplified to a weighted average of
# parameters; 'slerp' and 'task_arithmetic' variants are omitted here.
def merge_models(models, recipe):
    merged = copy.deepcopy(models[0])
    states = [m.state_dict() for m in models]
    merged_state = {
        name: sum(float(w) * s[name] for w, s in zip(recipe['weights'], states))
        for name in states[0]
    }
    merged.load_state_dict(merged_state)
    return merged

# Step 2: Define evaluation (accuracy on the validation set; naive exact match)
def evaluate_model(model, eval_dataset):
    correct = 0
    for question, answer in eval_dataset:
        inputs = tokenizer(question, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=16)
        pred = tokenizer.decode(output_ids[0, inputs["input_ids"].shape[1]:],
                                skip_special_tokens=True).strip()
        if pred == answer:
            correct += 1
    return correct / len(eval_dataset)

# Step 3: Initialize population (random merge strategies)
population_size = 100
population = []
for _ in range(population_size):
    merge_recipe = {
        'weights': np.random.dirichlet(np.ones(len(models))),  # random mixture, sums to 1
        'merge_method': np.random.choice(['average', 'slerp', 'task_arithmetic']),
    }
    population.append(merge_recipe)

# Step 4: Evolutionary loop
num_generations = 50
best_recipe, best_score = None, float("-inf")
for generation in range(num_generations):
    # Evaluate population
    fitness_scores = []
    for recipe in population:
        merged_model = merge_models(models, recipe)          # merge models per recipe
        score = evaluate_model(merged_model, eval_dataset)   # fitness = validation accuracy
        fitness_scores.append(score)
        if score > best_score:                               # remember the best recipe seen
            best_recipe, best_score = recipe, score

    # Select top performers (elitism)
    top_k = 20
    elite_indices = np.argsort(fitness_scores)[-top_k:]
    elite_recipes = [population[i] for i in elite_indices]

    # Create next generation
    next_population = list(elite_recipes)  # keep the elite unchanged
    while len(next_population) < population_size:
        # Crossover: blend the merge weights of two distinct elite parents
        i, j = np.random.choice(len(elite_recipes), size=2, replace=False)
        parent1, parent2 = elite_recipes[i], elite_recipes[j]
        child_weights = 0.5 * parent1['weights'] + 0.5 * parent2['weights']

        # Mutation: add Gaussian noise, then renormalize to a valid mixture
        child_weights += np.random.normal(0, 0.1, len(models))
        child_weights = np.abs(child_weights)
        child_weights /= child_weights.sum()

        next_population.append({
            'weights': child_weights,
            'merge_method': parent1['merge_method'],  # inherit merge method from parent1
        })
    population = next_population
    print(f"Generation {generation}: best score = {max(fitness_scores):.3f}")

# Step 5: Rebuild the best merge found during the search
final_model = merge_models(models, best_recipe)
```
Results (Sakana’s published benchmarks):
- Medical QA (MedQA): Merged model 78% accuracy (vs. 72% best individual, 65% GPT-3.5)
- Code generation (HumanEval): Merged model 82% (vs. 75% best individual, 67% GPT-3.5)
- Japanese language (JGLUE): Sakana-JP 85% (vs. 78% GPT-4, 70% Llama 3)
Cost savings:
- Training from scratch: $1M-50M+ (depending on model size)
- Evolutionary merge: $10K-100K (GPU time for search, typically 1-5 days on 8-64 A100s)
- Savings: 100-1000x cheaper
B. Nature-Inspired Techniques
Swarm intelligence (collective decision-making):
- Method: 10-50 small models voting on answers (like bees choosing a nest site via waggle dance); see the sketch after this list
- Result: Ensemble accuracy > any individual model
- Use case: High-stakes decisions (medical diagnosis, financial predictions)
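A minimal sketch of that voting scheme, assuming the ensemble members are already available as simple callables (the toy models below are placeholders):

```python
# Minimal majority-voting ensemble (illustrative; models are placeholder callables).
from collections import Counter

def ensemble_answer(models, question):
    """Each model votes with its answer; the most common answer wins."""
    votes = [model(question) for model in models]
    answer, count = Counter(votes).most_common(1)[0]
    confidence = count / len(votes)       # fraction of models that agree
    return answer, confidence

# Toy example: 3 of 4 models agree, so "B" wins with 0.75 confidence.
toy_models = [lambda q: "B", lambda q: "B", lambda q: "A", lambda q: "B"]
print(ensemble_answer(toy_models, "Which treatment is indicated?"))
```

Requiring a minimum level of agreement (say, 0.7) before acting is one way such an ensemble supports high-stakes use cases.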
Neural architecture search (evolving architectures):
- Method: Mutating transformer architectures (attention heads, FFN sizes, layer counts), selecting best performers (a toy mutation step is sketched after this list)
- Result: Discovering efficient architectures (50% fewer parameters, same accuracy)
- Example: Sakana found 6-layer transformer matching 12-layer performance on Japanese tasks
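A hedged sketch of the mutation step in such a search, operating on a toy architecture config (the field names and value ranges are illustrative assumptions, not Sakana’s actual search space):

```python
# Illustrative mutation step for evolutionary architecture search (toy config).
import random

def mutate_architecture(config):
    """Randomly perturb one architectural hyperparameter, keeping values in sane ranges."""
    child = dict(config)
    gene = random.choice(["n_layers", "n_heads", "ffn_dim"])
    if gene == "n_layers":
        child["n_layers"] = max(2, config["n_layers"] + random.choice([-2, -1, 1, 2]))
    elif gene == "n_heads":
        child["n_heads"] = random.choice([4, 8, 12, 16])
    else:
        child["ffn_dim"] = random.choice([1024, 2048, 3072, 4096])
    return child

parent = {"n_layers": 12, "n_heads": 12, "ffn_dim": 3072}
candidates = [mutate_architecture(parent) for _ in range(8)]
# Each candidate would then be trained briefly, scored, and the best kept as new parents.
print(candidates[0])
```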
Modularity (specialized sub-networks):
- Method: Training separate “organs” (vision encoder, language decoder, reasoning module), composing them into systems (sketched after this list)
- Result: Faster training (parallel), easier updating (swap modules), better specialization
- Analogy: Like organisms having specialized organs (eyes, ears, brain) not monolithic blobs
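A toy sketch of the composition idea using plain PyTorch modules as stand-ins for the “organs” (all components below are illustrative placeholders, not Sakana’s architecture):

```python
# Toy sketch: composing specialized modules into one system (placeholder components).
import torch
import torch.nn as nn

class ModularSystem(nn.Module):
    def __init__(self, vision_encoder, reasoning_module, language_decoder):
        super().__init__()
        self.vision_encoder = vision_encoder      # swappable "organ": pixels -> features
        self.reasoning_module = reasoning_module  # swappable "organ": features -> features
        self.language_decoder = language_decoder  # swappable "organ": features -> logits

    def forward(self, image):
        return self.language_decoder(self.reasoning_module(self.vision_encoder(image)))

# Toy components; a real system would plug in pretrained encoders/decoders.
system = ModularSystem(
    vision_encoder=nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU()),
    reasoning_module=nn.Sequential(nn.Linear(256, 256), nn.ReLU()),
    language_decoder=nn.Linear(256, 1000),        # e.g., vocabulary logits
)
logits = system(torch.randn(1, 3, 32, 32))
# Upgrading one "organ" means swapping one attribute, not retraining the whole system:
system.reasoning_module = nn.Sequential(nn.Linear(256, 256), nn.GELU())
```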
C. Japanese-Language Foundation Model (Sakana-JP)
Problem: GPT-4, Claude, Llama 3 trained primarily on English—poor Japanese performance
Solution: Sakana-JP (merged model optimized for Japanese)
Method:
- Starting models: Llama 3 70B (strong reasoning), Japanese BERT, Japanese GPT (rinna/tohoku), Mistral (efficient)
- Japanese data: 100B+ tokens (Japanese Wikipedia, books, news, social media, government documents)
- Evolutionary merge: Finding optimal combination for Japanese benchmarks
- Fine-tuning: Additional training on Japanese instruction-following data
Results (October 2025):
- JGLUE (Japanese language understanding): 85% (vs. GPT-4: 78%, Claude 3: 80%, Llama 3: 70%)
- JCommonsenseQA: 82% (vs. GPT-4: 75%)
- Japanese math: 70% (vs. GPT-4: 65%)
Strategic impact: Japan no longer dependent on US AI (OpenAI, Anthropic) for Japanese-language applications
Deployments (February 2026):
- NTT: Customer service chatbots (10M+ users)
- Sony: PlayStation game dialogue generation
- Rakuten: E-commerce product descriptions, customer support
- Japanese government: Document analysis, citizen services
Business Model & Revenue
Revenue Model (Future)
Sakana AI is currently pre-revenue (research phase). Planned model:
| Product | Price | Description |
|---|---|---|
| Sakana-JP API | $0.002-0.01/1K tokens | Japanese-language model inference |
| Custom merge service | $50K-500K/project | Creating specialized models for enterprises via evolutionary merge |
| Enterprise licensing | $200K-2M/year | On-premise deployment, unlimited usage, custom training |
| Research partnerships | $500K-5M/year | Joint research with corporations, universities |
Target launch: Q3-Q4 2026 (Sakana-JP API), 2027 (enterprise licensing)
Target Customers
- Japanese enterprises: NTT, Sony, Toyota, Mitsubishi (domestic AI)
- Japanese government: Ministries, municipalities (digital services)
- Global companies in Japan: Google Japan, Amazon Japan (localization)
- AI startups: Companies needing specialized models (cheaper than training)
Estimated Economics (Post-Launch)
- CAC: $20K-100K (enterprise sales, partnerships)
- LTV: $500K-5M (multi-year contracts)
- Gross Margin: 70-80% (API inference, low marginal cost)
- Projected 2027 ARR: $20-40M (Sakana-JP adoption, enterprise contracts)
Competitive Landscape
OpenAI ($80B valuation): Scaling paradigm, expensive training
Anthropic ($18B valuation): Constitutional AI, safety focus
Google DeepMind: Scaling + efficiency research (Gemini)
Mistral AI ($6B valuation): Efficient open-source models
Cohere ($5.5B valuation): Enterprise foundation models
EleutherAI: Open-source collective (similar philosophy, less funding)
Sakana AI Differentiation:
- Evolutionary approach: Merging existing models (100-1000x cheaper than training)
- Nature-inspired: Biological principles (diversity, modularity, evolution)
- Japan focus: Tokyo HQ, Japanese-language models, government partnerships
- Founder pedigree: Transformers co-creator + world-class evolutionary AI researcher
- Open research: Publishing methods, contributing to community
Impact & Success Stories
Japanese Government
Digital Agency (Japan’s digital transformation ministry): Using Sakana-JP for citizen chatbot answering questions about government services. Result: 80% query resolution (vs. 60% previous rule-based system), 24/7 availability, Japanese language quality exceeding GPT-4.
NTT DOCOMO
Customer support: Deployed Sakana-JP for 10M+ mobile customers. Result: 70% automation rate (vs. 50% previous system), $30M+ annual savings (reduced call center staff), 90% customer satisfaction (natural Japanese conversation).
Sony Interactive Entertainment
Game development: Using Sakana AI evolutionary merge to create specialized AI for character dialogue (combining story models, personality models, game lore). Result: 10x faster dialogue generation (200 lines/hour vs. 20 lines/hour manual writing), more consistent character voices.
Future Outlook
Product Roadmap
2026: Sakana-JP API launch (Q3), custom merge service (enterprise clients)
2027: Multimodal Sakana models (vision + language), edge deployment (mobile devices)
2028: Evolutionary AGI research (self-improving systems), robotics applications (Toyota partnership)
Growth Strategy
Japanese dominance: Becoming default AI for Japanese enterprises, government
Global expansion: Offering custom evolutionary merge services internationally
Open-source: Releasing merge tools, techniques (building community, goodwill)
Long-term Vision
Sakana AI aims to prove that evolution beats scaling, becoming a world leader in efficient AI through nature-inspired techniques. With $165M in funding, a $1.5B valuation, Japanese government support, and a founding team pairing the Transformer’s co-creator (CTO) with a World Models pioneer (CEO), Sakana is positioned for an IPO ($5B-10B+) or strategic acquisition (Google, SoftBank) within 5-7 years if evolutionary AI becomes a standard approach to foundation model development.
FAQs
What is Sakana AI?
Sakana AI is a Tokyo-based AI research lab pioneering nature-inspired AI through evolutionary model merging—combining 10-20 small open-source models into specialized experts at 100-1000x lower cost than training from scratch.
How much funding has Sakana AI raised?
A total of $165 million across a Seed round ($30M) and Series A ($135M, led by Lux Capital and Khosla Ventures with NVIDIA, SoftBank, and Sony), achieving a $1.5 billion valuation (February 2024) and making it Japan’s highest-valued AI startup.
Who founded Sakana AI?
David Ha (creator of World Models, ex-Google Brain) and Llion Jones (co-creator of the Transformer, ex-Google Brain) founded Sakana AI in July 2023 in Tokyo, Japan; Ha serves as CEO and Jones as CTO.
What is evolutionary model merging?
A technique that combines existing pre-trained models (Llama 3, Mistral, etc.) using genetic algorithms to find optimal merge strategies, creating specialized experts without training from scratch at 100-1000x lower cost than traditional training.
What is Sakana-JP?
A Japanese-language foundation model created via an evolutionary merge of Llama 3, Japanese BERT/GPT, and Mistral, achieving best-in-class Japanese performance (85% on JGLUE, beating GPT-4’s 78%) and providing Japan with sovereign AI capability.
Conclusion
Sakana AI has established itself as a pioneering nature-inspired AI research lab, achieving a $1.5 billion valuation, $165 million in funding from Khosla/Lux/NVIDIA/SoftBank/Sony, and a position as Japan’s leading AI startup. With evolutionary model merging techniques achieving GPT-4-class performance at 100-1000x lower cost, Sakana is making the case that biological principles (diversity, evolution, modularity) offer a viable alternative to brute-force scaling.
As the AI industry hits scaling limits ($100M+ training runs, diminishing returns), demand for Sakana’s efficient approaches is growing: enterprises seeking specialized models without massive compute budgets, countries requiring sovereign AI capabilities, and researchers exploring post-scaling paradigms. Sakana’s evolutionary merge work (published papers, 12+ benchmarks), the Sakana-JP model (deployed at NTT, Sony, and government agencies), Japanese partnerships ($50M government grant, corporate collaborations), and founder pedigree (Transformer co-creator plus evolutionary AI pioneer) position it as essential infrastructure for an efficiency-focused AI era. With a nature-inspired philosophy, open research contributions, and $165M in funding enabling aggressive R&D, Sakana is a compelling IPO candidate ($5B-10B+ valuation) or strategic acquisition target within 5-7 years if evolutionary AI displaces scaling as the dominant paradigm for foundation model development.