Benchmark Data & Results

Comprehensive performance metrics from 8 standardized test scenarios evaluating LLM-powered ant colony behaviors

Standardized scenarios: 8

Average food collected: 6.3 per simulation

Touch communications: 2,024 across all tests

🏆 Best Performing Scenario: Resource Abundance

Efficiency Score: 6.06 food/LLM req

Food Collected: 12/15 (80%)

Pheromone Efficiency: 62.3%

Detailed Results by Scenario

Scenario	Duration (s)	Colony Size	Food Collected	Exploration %	Pheromone Eff. %	Collaborations (per ant)	Efficiency (food/LLM req)
Baseline Performance	305	20	7/8 (88%)	67.3%	78.2%	23 (1.1)	2.86
Resource Scarcity	428	20	3/3 (100%)	89.1%	91.5%	45 (2.3)	1.04
Resource Abundance	289	20	12/15 (80%)	45.7%	62.3%	18 (0.9)	6.06
Linear Trail	245	20	8/8 (100%)	34.2%	94.7%	31 (1.6)	4.79
Scattered Resources	478	20	5/8 (63%)	92.8%	56.9%	12 (0.6)	1.60
Large Colony	389	40	6/8 (75%)	78.4%	43.2%	67 (1.7)	1.15
Small Colony	412	5	4/8 (50%)	23.6%	85.1%	3 (0.6)	4.60
Communication Test	334	15	5/6 (83%)	58.9%	82.7%	89 (5.9)	2.46
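
The percentage and per-ant figures in the table are derived from the raw counts. A minimal Python sketch of that arithmetic follows; the `ScenarioResult` class and its field names are illustrative assumptions, not part of the benchmark's own code, and the values are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioResult:
    """One row of the results table (hypothetical structure for illustration)."""
    name: str
    colony_size: int
    food_collected: int
    food_available: int
    collaborations: int
    efficiency: float  # food per LLM request, as reported in the table

    @property
    def collection_pct(self) -> float:
        # Share of available food sources that were collected, in percent.
        return 100.0 * self.food_collected / self.food_available

    @property
    def collaborations_per_ant(self) -> float:
        # Collaboration events divided by colony size.
        return self.collaborations / self.colony_size


RESULTS = [
    ScenarioResult("Baseline Performance", 20, 7, 8, 23, 2.86),
    ScenarioResult("Resource Scarcity", 20, 3, 3, 45, 1.04),
    ScenarioResult("Resource Abundance", 20, 12, 15, 18, 6.06),
    ScenarioResult("Linear Trail", 20, 8, 8, 31, 4.79),
    ScenarioResult("Scattered Resources", 20, 5, 8, 12, 1.60),
    ScenarioResult("Large Colony", 40, 6, 8, 67, 1.15),
    ScenarioResult("Small Colony", 5, 4, 8, 3, 4.60),
    ScenarioResult("Communication Test", 15, 5, 6, 89, 2.46),
]


def mean_food_collected(results) -> float:
    """Average number of food sources collected per scenario."""
    return sum(r.food_collected for r in results) / len(results)


if __name__ == "__main__":
    for r in RESULTS:
        print(f"{r.name:<22}{r.collection_pct:6.1f}%  "
              f"{r.collaborations_per_ant:5.2f} per ant")
    print(f"Mean food collected: {mean_food_collected(RESULTS):.2f}")
```

The mean works out to 50 / 8 = 6.25 food sources per scenario, which is consistent with the 6.3 per-simulation figure in the summary if that figure is average food collected (an interpretation, not something the source states).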

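• Linear Trail achieved the second-highest efficiency (4.79 food/LLM req, behind Resource Abundance at 6.06) and the highest pheromone efficiency (94.7%), consistent with its structured food placement
• Resource Abundance had the highest absolute food collection (12 of 15 sources) and the highest efficiency (6.06 food/LLM req)
• Scattered Resources required the most exploration (92.8% coverage) but reached only 1.60 food/LLM req, the third-lowest efficiency
• The Small Colony (5 ants) showed high efficiency per LLM request (4.60) but limited overall throughput (4 of 8 sources, 50%)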
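
The rankings behind these findings can be reproduced by sorting the table's columns. The sketch below is one way to do that; the dictionary names and the `rank_descending` helper are hypothetical, and the values are copied from the results table.

```python
from typing import Dict, List

# Efficiency in food per LLM request, from the results table.
EFFICIENCY: Dict[str, float] = {
    "Baseline Performance": 2.86,
    "Resource Scarcity": 1.04,
    "Resource Abundance": 6.06,
    "Linear Trail": 4.79,
    "Scattered Resources": 1.60,
    "Large Colony": 1.15,
    "Small Colony": 4.60,
    "Communication Test": 2.46,
}

# Exploration coverage in percent, from the results table.
EXPLORATION_PCT: Dict[str, float] = {
    "Baseline Performance": 67.3,
    "Resource Scarcity": 89.1,
    "Resource Abundance": 45.7,
    "Linear Trail": 34.2,
    "Scattered Resources": 92.8,
    "Large Colony": 78.4,
    "Small Colony": 23.6,
    "Communication Test": 58.9,
}


def rank_descending(metric: Dict[str, float]) -> List[str]:
    """Return scenario names ordered from highest to lowest metric value."""
    return sorted(metric, key=lambda name: metric[name], reverse=True)


if __name__ == "__main__":
    for position, name in enumerate(rank_descending(EFFICIENCY), start=1):
        print(f"{position}. {name}: {EFFICIENCY[name]:.2f} food/LLM req")
```

Sorting by efficiency puts Resource Abundance first, Linear Trail second, and Resource Scarcity last, while sorting by exploration puts Scattered Resources first.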

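• Higher pheromone efficiency correlated with structured food arrangements (Linear Trail: 94.7%; Scattered Resources: 56.9%)
• Resource Scarcity produced 45 collaboration events (2.3 per ant), the highest per-ant rate outside the Communication Test, attributed to competition
• The Large Colony (40 ants) showed coordination challenges, with more boundary violations and the lowest pheromone efficiency (43.2%)
• The Communication Test recorded the most collaborations (89 events, 5.9 per ant) with enhanced touch prompting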
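
Because colony sizes differ across scenarios, collaboration counts read differently as raw totals than as per-ant rates. The following sketch compares the two orderings and also ranks pheromone efficiency; the variable names are assumptions made for illustration, and the data comes from the results table.

```python
from typing import Dict, Tuple

# (collaboration events, colony size) per scenario, from the results table.
COLLABORATIONS: Dict[str, Tuple[int, int]] = {
    "Baseline Performance": (23, 20),
    "Resource Scarcity": (45, 20),
    "Resource Abundance": (18, 20),
    "Linear Trail": (31, 20),
    "Scattered Resources": (12, 20),
    "Large Colony": (67, 40),
    "Small Colony": (3, 5),
    "Communication Test": (89, 15),
}

# Pheromone efficiency in percent, from the results table.
PHEROMONE_EFF_PCT: Dict[str, float] = {
    "Baseline Performance": 78.2,
    "Resource Scarcity": 91.5,
    "Resource Abundance": 62.3,
    "Linear Trail": 94.7,
    "Scattered Resources": 56.9,
    "Large Colony": 43.2,
    "Small Colony": 85.1,
    "Communication Test": 82.7,
}


def per_ant(events: int, colony_size: int) -> float:
    """Collaboration events normalized by colony size."""
    return events / colony_size


# Order scenarios by raw event count and by per-ant rate (highest first).
by_total = sorted(COLLABORATIONS,
                  key=lambda n: COLLABORATIONS[n][0], reverse=True)
by_rate = sorted(COLLABORATIONS,
                 key=lambda n: per_ant(*COLLABORATIONS[n]), reverse=True)
by_pheromone = sorted(PHEROMONE_EFF_PCT,
                      key=lambda n: PHEROMONE_EFF_PCT[n], reverse=True)
total_events = sum(events for events, _ in COLLABORATIONS.values())

if __name__ == "__main__":
    print("By total events:", by_total[:3])
    print("By per-ant rate:", by_rate[:3])
    print("By pheromone efficiency:", by_pheromone[:3])
    print("Total collaboration events:", total_events)
```

By raw count the Large Colony ranks above Resource Scarcity, but per ant the order reverses, which is why the normalized rate is the more useful comparison across colony sizes.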