Benchmark Data & Results
Comprehensive performance metrics from 8 standardized test scenarios evaluating LLM-powered ant colony behaviors
- Total Simulations: 8 (standardized scenarios)
- Avg. Food Collected: 6.3 (per simulation)
- Avg. Collaborations: 36 (touch communications)
- Total LLM Requests: 2,024 (across all tests)
🏆 Best Performing Scenario: Resource Abundance
- Efficiency Score: 6.06
- Food Collected: 12/15
- Pheromone Efficiency: 62.3%
Detailed Results by Scenario
| Scenario | Duration (s) | Colony Size | Food Collected | Exploration % | Pheromone Eff. % | Collaborations (per ant) | Efficiency (food/LLM req) |
|---|---|---|---|---|---|---|---|
| Baseline Performance | 305 | 20 | 7/8 (88%) | 67.3 | 78.2 | 23 (1.1) | 2.86 |
| Resource Scarcity | 428 | 20 | 3/3 (100%) | 89.1 | 91.5 | 45 (2.3) | 1.04 |
| Resource Abundance | 289 | 20 | 12/15 (80%) | 45.7 | 62.3 | 18 (0.9) | 6.06 |
| Linear Trail | 245 | 20 | 8/8 (100%) | 34.2 | 94.7 | 31 (1.6) | 4.79 |
| Scattered Resources | 478 | 20 | 5/8 (63%) | 92.8 | 56.9 | 12 (0.6) | 1.60 |
| Large Colony | 389 | 40 | 6/8 (75%) | 78.4 | 43.2 | 67 (1.7) | 1.15 |
| Small Colony | 412 | 5 | 4/8 (50%) | 23.6 | 85.1 | 3 (0.6) | 4.60 |
| Communication Test | 334 | 15 | 5/6 (83%) | 58.9 | 82.7 | 89 (5.9) | 2.46 |
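The headline figures can be cross-checked against the per-scenario rows. The sketch below recomputes the two averages and, under the assumption that the Efficiency column is scaled as food per 100 LLM requests (an inference from the numbers, not something the page states), estimates the total request count. The names `ROWS` and `summarize` are illustrative only.

```python
# Rows copied from the results table:
# (scenario, colony size, food collected, food available, collaborations, efficiency)
ROWS = [
    ("Baseline Performance", 20, 7, 8, 23, 2.86),
    ("Resource Scarcity", 20, 3, 3, 45, 1.04),
    ("Resource Abundance", 20, 12, 15, 18, 6.06),
    ("Linear Trail", 20, 8, 8, 31, 4.79),
    ("Scattered Resources", 20, 5, 8, 12, 1.60),
    ("Large Colony", 40, 6, 8, 67, 1.15),
    ("Small Colony", 5, 4, 8, 3, 4.60),
    ("Communication Test", 15, 5, 6, 89, 2.46),
]


def summarize(rows):
    """Return (average food, average collaborations, implied LLM requests)."""
    n = len(rows)
    avg_food = sum(r[2] for r in rows) / n
    avg_collab = sum(r[4] for r in rows) / n
    # Assumption: efficiency = 100 * food / requests, so requests = 100 * food / efficiency.
    implied_requests = sum(100 * r[2] / r[5] for r in rows)
    return avg_food, avg_collab, implied_requests


avg_food, avg_collab, implied_requests = summarize(ROWS)
print(f"avg food collected: {avg_food:.2f}")
print(f"avg collaborations: {avg_collab:.1f}")
print(f"implied LLM requests: {implied_requests:.0f}")
```

Under that scaling assumption the implied request total lands within a few requests of the reported 2,024, with the gap attributable to rounding in the two-decimal efficiency scores.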
📊 Performance Insights
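- Resource Abundance achieved the highest efficiency (6.06); Linear Trail ranked second (4.79), consistent with its structured food placement
- Resource Abundance enabled the highest absolute food collection (12/15 sources)
- Scattered Resources required the most exploration (92.8% coverage) but yielded low efficiency (1.60); only Large Colony (1.15) and Resource Scarcity (1.04) scored lower
- The small colony showed high per-ant efficiency (4 food from 5 ants, efficiency score 4.60) but limited overall throughput (4/8 food collected)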
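Rankings like these are easy to verify by sorting the table. A minimal sketch, with the values copied from the results table and `RESULTS` and `rank_by` as hypothetical helper names:

```python
# (scenario, exploration %, reported efficiency score), copied from the results table
RESULTS = [
    ("Baseline Performance", 67.3, 2.86),
    ("Resource Scarcity", 89.1, 1.04),
    ("Resource Abundance", 45.7, 6.06),
    ("Linear Trail", 34.2, 4.79),
    ("Scattered Resources", 92.8, 1.60),
    ("Large Colony", 78.4, 1.15),
    ("Small Colony", 23.6, 4.60),
    ("Communication Test", 58.9, 2.46),
]


def rank_by(rows, index):
    """Return scenario names ordered from highest to lowest value in column `index`."""
    return [row[0] for row in sorted(rows, key=lambda row: row[index], reverse=True)]


by_efficiency = rank_by(RESULTS, 2)
by_exploration = rank_by(RESULTS, 1)

print("Top three by efficiency:", by_efficiency[:3])
print("Lowest efficiency:", by_efficiency[-1])
print("Most exploration:", by_exploration[0])
```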
🔬 Behavioral Patterns
- Pheromone efficiency was highest with structured food arrangements (94.7% in Linear Trail)
- Collaboration events were elevated in the Resource Scarcity scenario (45 events, 2.3 per ant), possibly due to competition; only Large Colony (67) and Communication Test (89) recorded more
- Large colonies showed coordination challenges, with more boundary violations
- The Communication Test recorded the most collaborations (89, or 5.9 per ant) with enhanced touch prompting
Methodology Notes
Data Collection
- Each scenario was run for 4,000+ ticks or until completion
- Real-time behavioral tracking and metric calculation
- Gemini 2.5 Flash with naturalistic ant prompting
- Automatic CSV export for reproducible analysis
Key Metrics
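- Efficiency: Food collected per LLM request. The reported scores appear to be scaled per 100 requests: 7 food at a score of 2.86 implies about 245 requests, and the implied per-scenario counts sum to roughly the 2,024 total
- Exploration: Percentage of grid cells visited
- Pheromone Eff.: Successful trail-following events, reported as a percentage
- Collaborations: Ant-to-ant touch communications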
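The definitions above can be written out as small functions. This is a sketch under stated assumptions, not the benchmark's actual code: the function names, the `per=100` scaling, and the set-of-cells representation of visited grid positions are all hypothetical, and the request count of 245 in the usage lines is an inferred value rather than a reported one.

```python
def efficiency(food_collected: int, llm_requests: int, per: int = 100) -> float:
    """Food collected per `per` LLM requests (per=100 is an assumed scaling)."""
    if llm_requests <= 0:
        raise ValueError("llm_requests must be positive")
    return per * food_collected / llm_requests


def exploration_pct(visited_cells: set, grid_width: int, grid_height: int) -> float:
    """Percentage of grid cells visited at least once."""
    return 100 * len(visited_cells) / (grid_width * grid_height)


def collaborations_per_ant(collaborations: int, colony_size: int) -> float:
    """Touch communications divided by the number of ants in the colony."""
    return collaborations / colony_size


# Usage: Baseline Performance collected 7 food; 245 requests is an inferred figure.
print(round(efficiency(7, 245), 2))
# Communication Test: 89 collaborations among 15 ants.
print(round(collaborations_per_ant(89, 15), 1))
# Toy 2x2 grid with two visited cells.
print(exploration_pct({(0, 0), (0, 1)}, 2, 2))
```

Pheromone efficiency is left out because the page does not say what the successful trail-following events are divided by.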