fix: Add Visualization column to main table (not just benchmark tables)
5f628a6
openhands committed on
feat: Add Visualization column for Laminar eval links
dfa8bfc
openhands committed on
Widen Logs column to prevent vertical stacking of download icons
9cab912
openhands committed on
Add download icons to Logs column for benchmark results
67867ec
openhands committed on
Format date column to show only date, not time
2b7cd27
openhands committed on
Fix: Preserve mark_by selection during periodic data refresh
df058f7
openhands committed on
Add 'Mark systems by' selector for scatter plot icons (Company/Openness/Country)
ed6e90d
openhands committed on
Clean up unused code, files, and assets
bb0cd90
openhands committed on
Only show 'Detailed Benchmark Results' when more than one benchmark exists
d17eff0
openhands committed on
Connect 'Show only open models' checkbox to Winners and Evolution sections
49f9739
openhands committed on
Winners by Category: put scores before names
24ff7a3
openhands committed on
Use emojis instead of images in Winners by Category table
63c73f3
openhands committed on
Refactor Winners by Category to single unified table
4f4eb00
openhands committed on
Add Winners by Category section to main page
c14a283
openhands committed on
Add 'Show only open models' checkbox filter
5cdf97c
openhands committed on
Add runtime column and Cost/Performance + Runtime/Performance charts to all pages
2854ddd
openhands committed on
Move Download column to benchmark-specific tables only
4d0ae13
openhands committed on
Add Download column for trajectory archives and increase table font size
b5317d7
openhands committed on
Add timer-based auto-refresh for leaderboard data
974f31f
openhands committed on
Move 'Show incomplete entries' checkbox above plot and apply filter to both
361b5c2
openhands committed on
Add periodic cache refresh for leaderboard data
6bddf26
openhands committed on
Update DeepSeek logo, tooltip format, and category names
5778893
openhands committed on
Fix table icons layout and add Qwen/MiniMax logos
72b86cb
openhands committed on
UI cleanup and About page updates
6737ff3
openhands committed on
Multiple graph and table improvements
fcb3d0b
openhands committed on
Replace open/closed model distinction with lock emojis in tables
8a3a9eb
openhands committed on
Remove open/closed distinction from graph, use company logos as data points
b6ec318
openhands committed on
Add company logos to graphs and tables, label frontier points with model names
800e404
openhands committed on
Replace total_cost with cost_per_instance (average cost per instance)
b1f3e49
openhands committed on
fix: Column naming and incomplete entries toggle
4ab5f97
openhands committed on
feat: Update leaderboard calculations and add incomplete entries toggle
5998027
openhands committed on
Fix UI score formatting: do not coerce NaN to 0; rely on format_score_column to show 'Not Submitted'.

Co-authored-by: openhands <openhands@all-hands.dev>
c68aa7d
openhands committed on
Fix data plotting requirements and server port handling; ensure per-benchmark plots use correct agent column.

- Respect HOST/PORT env for local runs
- Use 'OpenHands Version' in plot requirements
- Avoid plotting when use_plotly=False

Co-authored-by: openhands <openhands@all-hands.dev>
fb3d0db
openhands committed on
Remove unused AstaBench category files and update UI to OpenHands categories
6a0d1cb
openhands committed on
Fix score calculation to match AstaBench methodology and update categories
e734bf6
openhands committed on
Swap column order and fix duplicate column warnings