Benchmark Data & Results

Comprehensive performance metrics from 8 standardized test scenarios evaluating LLM-powered ant colony behaviors

  • Total Simulations: 8 (standardized scenarios)
  • Avg. Food Collected: 6.3 (per simulation)
  • Avg. Collaborations: 36 (touch communications)
  • Total LLM Requests: 2,024 (across all tests)

🏆 Best Performing Scenario: Resource Abundance

  • Efficiency Score: 6.06
  • Food Collected: 12/15
  • Pheromone Efficiency: 62.3%

Detailed Results by Scenario

| Scenario | Duration | Colony Size | Food Collected | Exploration % | Pheromone Eff. % | Collaborations | Efficiency (food/LLM req) |
|---|---|---|---|---|---|---|---|
| Baseline Performance | 305s | 20 | 7/8 (88%) | 67.3% | 78.2% | 23 (1.1 per ant) | 2.86 |
| Resource Scarcity | 428s | 20 | 3/3 (100%) | 89.1% | 91.5% | 45 (2.3 per ant) | 1.04 |
| Resource Abundance | 289s | 20 | 12/15 (80%) | 45.7% | 62.3% | 18 (0.9 per ant) | 6.06 |
| Linear Trail | 245s | 20 | 8/8 (100%) | 34.2% | 94.7% | 31 (1.6 per ant) | 4.79 |
| Scattered Resources | 478s | 20 | 5/8 (63%) | 92.8% | 56.9% | 12 (0.6 per ant) | 1.60 |
| Large Colony | 389s | 40 | 6/8 (75%) | 78.4% | 43.2% | 67 (1.7 per ant) | 1.15 |
| Small Colony | 412s | 5 | 4/8 (50%) | 23.6% | 85.1% | 3 (0.6 per ant) | 4.60 |
| Communication Test | 334s | 15 | 5/6 (83%) | 58.9% | 82.7% | 89 (5.9 per ant) | 2.46 |
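
The headline averages can be re-derived from the per-scenario figures. The following is a minimal sketch, not the project's own analysis code: the tuple layout and the function names `summarize` and `collaborations_per_ant` are assumptions made for illustration, and the numbers are transcribed from the results above.

```python
# Per-scenario figures transcribed from the results above.
# Tuple layout (an illustrative choice):
# (name, colony_size, food_collected, food_available, collaborations)
SCENARIOS = [
    ("Baseline Performance", 20, 7, 8, 23),
    ("Resource Scarcity", 20, 3, 3, 45),
    ("Resource Abundance", 20, 12, 15, 18),
    ("Linear Trail", 20, 8, 8, 31),
    ("Scattered Resources", 20, 5, 8, 12),
    ("Large Colony", 40, 6, 8, 67),
    ("Small Colony", 5, 4, 8, 3),
    ("Communication Test", 15, 5, 6, 89),
]


def summarize(rows):
    """Return the scenario count and the mean food and collaboration counts."""
    n = len(rows)
    total_food = sum(row[2] for row in rows)
    total_collaborations = sum(row[4] for row in rows)
    return {
        "simulations": n,
        "avg_food": total_food / n,
        "avg_collaborations": total_collaborations / n,
    }


def collaborations_per_ant(rows):
    """Map each scenario name to collaborations divided by colony size."""
    return {name: collab / size for name, size, _, _, collab in rows}


if __name__ == "__main__":
    print(summarize(SCENARIOS))
    for name, value in collaborations_per_ant(SCENARIOS).items():
        print(f"{name}: {value:.2f} collaborations per ant")
```

Under these figures the mean food collected is 50 / 8 = 6.25, which matches the reported 6.3 after rounding, and the mean collaboration count is 288 / 8 = 36.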

📊 Performance Insights

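  • Linear Trail achieved the second-highest efficiency (4.79, behind Resource Abundance at 6.06) and the highest pheromone efficiency (94.7%), due to structured food placement
  • Resource Abundance enabled the highest absolute food collection (12/15 sources)
  • Scattered Resources required the most exploration (92.8% coverage) yet reached only 1.60 efficiency, among the lowest of the eight scenarios
  • Small colonies showed high per-ant efficiency (4.60 efficiency score with 5 ants) but limited overall throughput (4/8 food collected)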
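
Comparative claims like these are easy to check by sorting the reported values. A small sketch, assuming only the efficiency and exploration figures listed in the detailed results; the dictionary names and the helper `rank_by` are illustrative and not part of the benchmark tooling.

```python
# Efficiency scores and exploration percentages as reported per scenario.
EFFICIENCY = {
    "Baseline Performance": 2.86,
    "Resource Scarcity": 1.04,
    "Resource Abundance": 6.06,
    "Linear Trail": 4.79,
    "Scattered Resources": 1.60,
    "Large Colony": 1.15,
    "Small Colony": 4.60,
    "Communication Test": 2.46,
}

EXPLORATION_PCT = {
    "Baseline Performance": 67.3,
    "Resource Scarcity": 89.1,
    "Resource Abundance": 45.7,
    "Linear Trail": 34.2,
    "Scattered Resources": 92.8,
    "Large Colony": 78.4,
    "Small Colony": 23.6,
    "Communication Test": 58.9,
}


def rank_by(metric, descending=True):
    """Return scenario names ordered by the given metric's value."""
    return sorted(metric, key=metric.get, reverse=descending)


if __name__ == "__main__":
    for position, name in enumerate(rank_by(EFFICIENCY), start=1):
        print(f"{position}. {name}: {EFFICIENCY[name]:.2f}")
    print("Most exploration:", rank_by(EXPLORATION_PCT)[0])
```

Sorting by efficiency places Resource Abundance first, Linear Trail second, and Resource Scarcity last; sorting by exploration places Scattered Resources first.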

🔬 Behavioral Patterns

  • Higher pheromone efficiency correlated with structured food arrangements
  • Among the 20-ant scenarios, collaboration events were highest under Resource Scarcity (45 events, 2.3 per ant), attributed to competition
  • Large colonies showed coordination challenges with more boundary violations
  • The Communication Test recorded the most collaborations overall (89 events, 5.9 per ant) with enhanced touch prompting

Methodology Notes

Data Collection

  • Each scenario ran for 4,000+ ticks or until completion
  • Real-time behavioral tracking and metric calculation
  • Gemini 2.5 Flash with naturalistic ant prompting
  • Automatic CSV export for reproducible analysis
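
An exported CSV could be loaded for re-analysis along the following lines. This is a sketch under stated assumptions: the export schema is not documented here, so the column names (`scenario`, `duration_s`, `colony_size`, `food_collected`, `food_available`) and the function `load_results` are hypothetical. The two sample rows reuse figures from the results above.

```python
import csv
import io

# Hypothetical export layout; the real column names are not documented here.
SAMPLE_CSV = """scenario,duration_s,colony_size,food_collected,food_available
Baseline Performance,305,20,7,8
Linear Trail,245,20,8,8
"""


def load_results(text):
    """Parse CSV text into typed records and add a derived food percentage."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        collected = int(record["food_collected"])
        available = int(record["food_available"])
        rows.append({
            "scenario": record["scenario"],
            "duration_s": int(record["duration_s"]),
            "colony_size": int(record["colony_size"]),
            "food_collected": collected,
            "food_available": available,
            "food_pct": 100.0 * collected / available,
        })
    return rows


if __name__ == "__main__":
    for row in load_results(SAMPLE_CSV):
        print(row["scenario"], f"{row['food_pct']:.1f}%")
```

Reading from a file instead of a string only requires passing an open file object to `csv.DictReader` in place of the `io.StringIO` wrapper.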

Key Metrics

  • Efficiency: Food collected per LLM request (the reported scores appear to be scaled per 100 requests)
  • Exploration: Percentage of grid cells visited
  • Pheromone Eff.: Percentage of trail-following events that were successful
  • Collaborations: Ant-to-ant touch communications
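
The metric definitions above can be written as small functions. This is a hedged sketch, not the benchmark's implementation: the function names, the `per=100` scaling, and the grid-cell arguments are assumptions. The scaling is inferred from the reported numbers: dividing each scenario's food count by its efficiency score and multiplying by 100 gives request counts that sum to roughly 2,023, close to the reported total of 2,024.

```python
def efficiency(food_collected, llm_requests, per=100):
    """Food collected per `per` LLM requests (per=100 is an inferred scaling)."""
    if llm_requests <= 0:
        raise ValueError("llm_requests must be positive")
    return per * food_collected / llm_requests


def exploration_pct(cells_visited, total_cells):
    """Percentage of grid cells visited at least once."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    return 100.0 * cells_visited / total_cells


def implied_requests(food_collected, efficiency_score, per=100):
    """Invert the efficiency formula to estimate the LLM request count."""
    return per * food_collected / efficiency_score


# (food_collected, reported_efficiency) for the eight scenarios, in table order.
REPORTED = [
    (7, 2.86), (3, 1.04), (12, 6.06), (8, 4.79),
    (5, 1.60), (6, 1.15), (4, 4.60), (5, 2.46),
]

if __name__ == "__main__":
    total = sum(implied_requests(food, score) for food, score in REPORTED)
    print(f"Implied total LLM requests: {total:.0f} (reported: 2,024)")
```

If the scores were instead plain food per request, a score of 2.86 with 7 food items would imply fewer than three LLM requests for a 305-second run, which is why the per-100 reading is used here.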