ARXIV
AI BUCKET

Primary category mix inside the AI surge · cs.LG, CV, CL & allies

Categories: 6
Window: 2015–25
Rule: Primary

The headline story is well known: AI-related submissions on arXiv grew fast. The more interesting question for practitioners is which lanes inside “AI” carried the mass: machine learning (`cs.LG`), vision (`cs.CV`), language (`cs.CL`), narrow AI (`cs.AI`), neural computing (`cs.NE`), and statistics-side ML (`stat.ML`). The sections below show the mix within a defined AI bucket (each paper counted once by primary category), plus how that bucket sits against all CS primaries. All-arXiv yearly totals use the official monthly submissions CSV from arxiv.org/stats/get_monthly_submissions, summed by calendar year.

Each paper is counted once, by primary category. AI bucket = cs.LG + cs.CV + cs.CL + cs.AI + cs.NE + stat.ML. · All-arXiv yearly totals are summed from arXiv’s official monthly submissions CSV at arxiv.org/stats/get_monthly_submissions. · Data cut: 2026-03-27
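The "summed by calendar year" step can be sketched as follows. This is a minimal illustration, not the page's actual pipeline: it assumes the monthly CSV has a `month` column in `YYYY-MM` form and a `submissions` column, and the sample rows are invented for the example rather than real arXiv counts.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the ASSUMED layout of the monthly CSV
# (month as YYYY-MM, submissions, historical_delta).
# These numbers are made up for illustration only.
SAMPLE_CSV = """month,submissions,historical_delta
2024-11,100,0
2024-12,120,0
2025-01,130,0
2025-02,110,0
"""

def yearly_totals(csv_text):
    """Sum monthly submissions by calendar year and count months seen per year."""
    totals = defaultdict(int)
    months = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        year = int(row["month"][:4])
        totals[year] += int(row["submissions"])
        months[year] += 1
    return dict(totals), dict(months)

totals, months = yearly_totals(SAMPLE_CSV)

# Years with fewer than 12 months of data are partial and should not be
# compared against full-year totals.
partial_years = sorted(y for y, n in months.items() if n < 12)

print(totals)
print(partial_years)
```

Tracking the month count per year makes it easy to flag a partial year, such as a year-to-date figure, before it is placed next to full-year rows.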

2026 YTD · all arXiv: 73,322 (Jan–Mar 2026, from monthly CSV; partial year, not comparable to full-year chart rows)
AI bucket (primary): 9.8K → 51K (418% growth, 2015–2025)
cs.LG share of bucket: 35.5% → 40.5% (machine learning primary codes)
cs.CL share of bucket: 15.5% → 22.6% (NLP / language)
All arXiv submissions: 105K → 284K (official annual totals, sum of months)
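The growth and share figures above are simple arithmetic, sketched here using only the rounded headline numbers. The helper name `pct_growth` is introduced for illustration. With rounded inputs the bucket growth comes out near 420%, slightly above the stated 418%, which presumably reflects unrounded counts behind the page.

```python
def pct_growth(start, end):
    """Percentage growth from start to end."""
    return (end - start) / start * 100.0

# Rounded headline figures, in thousands of submissions.
bucket_growth = pct_growth(9.8, 51.0)     # AI bucket, 2015 to 2025
all_growth = pct_growth(105.0, 284.0)     # all arXiv, same window

# Share changes are differences in percentage points, not growth rates.
lg_shift = round(40.5 - 35.5, 1)          # cs.LG share of bucket
cl_shift = round(22.6 - 15.5, 1)          # cs.CL share of bucket

print(round(bucket_growth), round(all_growth), lg_shift, cl_shift)
```

On these rounded inputs the AI bucket grew roughly two and a half times as fast as arXiv as a whole over the same window.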

What each label means

Codes are arXiv primary subject classes.

cs.LG · Machine learning
Core ML: supervised/unsupervised/RL, methodology, robustness, fairness; general learning papers and many ML applications.
cs.CV · Computer vision
Images/video, recognition, segmentation, scene understanding: the vision side of “AI” on arXiv.
cs.CL · Computation & language (NLP)
Natural language: models, retrieval, speech/text, NLP benchmarks; often where LLM-era work lands.
stat.ML · Statistics / ML
ML with a statistics framing (same research universe as cs.LG, different archive).
cs.AI · Artificial intelligence (narrow)
Classic AI topics (planning, KR, search); excludes ML/NLP/vision, which have their own codes.
cs.NE · Neural & evolutionary
Neuro-inspired and evolutionary algorithms, neurodynamic models, related non-mainstream ML threads.
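The counting rule (each paper once, by primary category) can be sketched like this. The paper records are hypothetical, and the sketch assumes the primary category is the first entry in each paper's category list; cross-lists are ignored.

```python
from collections import Counter

# The six primary codes that define the AI bucket.
AI_BUCKET = {"cs.LG", "cs.CV", "cs.CL", "cs.AI", "cs.NE", "stat.ML"}

# Hypothetical records: (paper id, categories with the primary listed first).
papers = [
    ("p1", ["cs.LG", "stat.ML"]),   # counted once, under cs.LG
    ("p2", ["cs.CV", "cs.LG"]),     # counted once, under cs.CV
    ("p3", ["math.ST", "stat.ML"]), # primary is outside the bucket: not counted
    ("p4", ["stat.ML", "cs.LG"]),   # counted once, under stat.ML
]

def bucket_counts(records):
    """Count each paper once under its primary category, if that is in the bucket."""
    counts = Counter()
    for _paper_id, categories in records:
        primary = categories[0]
        if primary in AI_BUCKET:
            counts[primary] += 1
    return counts

counts = bucket_counts(papers)
bucket_total = sum(counts.values())
print(dict(counts), bucket_total)
```

Under this rule a paper cross-listed to `cs.LG` but filed primarily elsewhere does not enter the bucket, so the bucket understates how much work touches ML.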

Mix within the AI bucket

Stacked bars show how primary submissions split across six AI-related categories each year. Width is proportional to papers in that category.

Year    AI bucket total (primary)
2015    9.8K
2016    14K
2017    18K
2018    24K
2019    30K
2020    36K
2021    38K
2022    37K
2023    40K
2024    44K
2025    51K
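The yearly bucket totals lend themselves to a quick year-over-year check. This sketch uses the rounded totals listed for 2015 through 2025 (in thousands), so the rates are approximate; the variable names are illustrative.

```python
# Rounded AI-bucket totals by year, in thousands of primary submissions.
totals_k = {
    2015: 9.8, 2016: 14.0, 2017: 18.0, 2018: 24.0, 2019: 30.0, 2020: 36.0,
    2021: 38.0, 2022: 37.0, 2023: 40.0, 2024: 44.0, 2025: 51.0,
}

years = sorted(totals_k)

# Year-over-year growth in percent, keyed by the later year.
yoy = {
    y: round((totals_k[y] / totals_k[y - 1] - 1.0) * 100.0, 1)
    for y in years[1:]
}

# Years in which the bucket shrank relative to the previous year.
declines = [y for y in years[1:] if yoy[y] < 0]

# Compound annual growth rate across the ten-year window.
cagr = (totals_k[2025] / totals_k[2015]) ** (1.0 / 10.0) - 1.0

print(yoy)
print(declines, round(cagr * 100.0, 1))
```

On these rounded figures the only year-over-year decline is 2022, and growth is fastest in the first half of the window.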

2015 vs 2025 β€” share of the AI bucket

Normalized to 100% within the bucket.

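Year    cs.LG    cs.CV    cs.CL    stat.ML    cs.AI    cs.NE
2015    35.5%    31%      15.5%    6.1%       7.2%     4.7%
2025    40.5%    24.1%    22.6%    6.9%       3.6%     2.3%

Column order is assumed to follow the label list under “What each label means”; the extracted rows carry no category labels. The cs.LG and cs.CL values match the headline figures.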
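Normalizing to 100% within the bucket can be checked directly. The sketch below treats the published shares positionally, without assigning category labels, and confirms that each year sums to 100 and that the percentage-point shifts cancel out. The `shares` helper and its example counts are hypothetical.

```python
def shares(counts):
    """Convert raw counts to percentage shares of their total, rounded to 0.1."""
    total = sum(counts)
    return [round(100.0 * c / total, 1) for c in counts]

# Published shares of the AI bucket, in the order they appear in the source.
shares_2015 = [35.5, 31.0, 15.5, 6.1, 7.2, 4.7]
shares_2025 = [40.5, 24.1, 22.6, 6.9, 3.6, 2.3]

sum_2015 = round(sum(shares_2015), 1)
sum_2025 = round(sum(shares_2025), 1)

# Percentage-point shift per position between 2015 and 2025.
shifts = [round(b - a, 1) for a, b in zip(shares_2015, shares_2025)]

# Hypothetical raw counts, only to show the normalization itself.
example = shares([2, 1, 1])

print(sum_2015, sum_2025, shifts, example)
```

Because both years are normalized to the same base, the shifts must net to zero: any share gained by one category is lost by the others.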

AI bucket as % of all CS primaries

Roughly what fraction of computer-science primary submissions falls into these six categories combined. Totals are illustrative and aligned with the same JSON.

Year    AI bucket as % of CS primaries
2015    37.6%
2016    47.3%
2017    55%
2018    60.8%
2019    64.9%
2020    67.4%
2021    68.5%
2022    68.7%
2023    68.2%
2024    67.3%
2025    67.3%
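Combining these percentages with the bucket totals gives a rough implied size for all CS primaries. This is a back-of-envelope sketch on rounded inputs, not a figure from the source. It also assumes the percentage is simply bucket total divided by CS total; how the page treats stat.ML, which sits in a different archive, is not stated.

```python
# Rounded inputs taken from the figures in this document.
bucket_k = {2015: 9.8, 2025: 51.0}        # AI bucket totals, thousands
share_of_cs = {2015: 37.6, 2025: 67.3}    # AI bucket as % of CS primaries

# Implied CS total = bucket total / share (ASSUMED definition of the ratio).
implied_cs_k = {
    year: round(bucket_k[year] / (share_of_cs[year] / 100.0), 1)
    for year in bucket_k
}

# Implied non-bucket CS primaries, in thousands.
other_cs_k = {year: round(implied_cs_k[year] - bucket_k[year], 1) for year in bucket_k}

print(implied_cs_k, other_cs_k)
```

Read this way, CS primaries outside the bucket also grew over the window, just far more slowly than the bucket itself.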

Data Sources