The 100 Most-Cited AI Papers of 2022
Who is publishing the most influential AI research? With AI innovation advancing at breakneck speed, staying on top of the frontier matters, and no one has time to read everything; these 100 papers, however, give a clear signal of where AI technology is heading. The real test of an R&D organization's influence is, of course, how its technology shows up in products. OpenAI stunned the world when it released ChatGPT at the end of November 2022, hot on the heels of its March 2022 paper "Training language models to follow instructions with human feedback". Product adoption that fast is rare, so for a broader view we turn to a classic academic metric: citation counts. A detailed analysis of the 100 most-cited papers from each of 2022, 2021, and 2020 supports some early conclusions: the United States and Google still dominate, and DeepMind has had remarkable success, but given the size of its output, OpenAI is truly in a class of its own, both in product impact and in research that gets cited widely and quickly.
Using data from the Zeta Alpha platform, combined with careful manual curation (more on the methodology below), we collected the most-cited papers in AI for 2022, 2021, and 2020 and analyzed the authors' affiliations and countries. This lets us rank organizations by R&D impact rather than by raw publication volume.
We also added Twitter mentions, which are sometimes treated as an early indicator of impact, but so far the correlation between the two appears weak and deserves further study.
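Whether tweets track citations can be checked with a rank correlation. Below is a minimal, dependency-free sketch computed over six illustrative rows from the list in this article that report both numbers; the full analysis would use every such row, and the result here is only suggestive:

```python
from math import sqrt

# (tweets, citations) pairs taken from rows of the list that report both numbers.
rows = [
    (857, 835),    # A ConvNet for the 2020s
    (105, 718),    # Hierarchical Text-Conditional Image Generation with CLIP Latents
    (2462, 390),   # Photorealistic Text-to-Image Diffusion Models...
    (448, 254),    # Training language models to follow instructions with human feedback
    (862, 124),    # Large Language Models are Zero-Shot Reasoners
    (0, 191),      # data2vec
]

def ranks(values):
    """Rank values from 1 (smallest); ties receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Walk forward over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(xs, ys):
    """Spearman's rho = Pearson correlation computed on the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

tweets, cites = zip(*rows)
print(f"Spearman rho: {spearman(tweets, cites):+.2f}")  # prints: Spearman rho: +0.03
```

On this small sample the rank correlation is close to zero, consistent with the weak relationship noted above.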
The full top-100 list follows, with titles, tweet and citation counts, and affiliations.
| Title | Tweets | Citations | Organizations | Countries | Org Types |
| --- | --- | --- | --- | --- | --- |
| AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models | – | 1331 | European Molecular Biology Laboratory | – | Academia |
| ColabFold: making protein folding accessible to all | – | 1138 | Max Planck Institute for Multidisciplinary Sciences | Germany | Academia |
| A ConvNet for the 2020s | 857 | 835 | Meta, UC Berkeley | USA, USA | Industry, Academia |
| Hierarchical Text-Conditional Image Generation with CLIP Latents | 105 | 718 | OpenAI | USA | Industry |
| PaLM: Scaling Language Modeling with Pathways | 445 | 426 | – | USA | Industry |
| Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding | 2462 | 390 | – | USA | Industry |
| Instant Neural Graphics Primitives with a Multiresolution Hash Encoding | 11 | 342 | NVIDIA | USA | Industry |
| SignalP 6.0 predicts all five types of signal peptides using protein language models | – | 274 | Technical University of Denmark, ETH Zurich | Denmark, Switzerland | Academia, Academia |
| Swin Transformer V2: Scaling Up Capacity and Resolution | 87 | 266 | University of Science and Technology of China | China | Academia |
| Training language models to follow instructions with human feedback | 448 | 254 | OpenAI | USA | Industry |
| Chain of Thought Prompting Elicits Reasoning in Large Language Models | 378 | 224 | – | USA | Industry |
| Flamingo: a Visual Language Model for Few-Shot Learning | 71 | 218 | DeepMind | UK | Industry |
| Classifier-Free Diffusion Guidance | 53 | 194 | – | USA | Industry |
| Magnetic control of tokamak plasmas through deep reinforcement learning | 0 | 194 | DeepMind | UK | Industry |
| data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language | 0 | 191 | Meta | USA | Industry |
| OPT: Open Pre-trained Transformer Language Models | 812 | 187 | Meta | USA | Industry |
| BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | 79 | 184 | Salesforce | USA | Industry |
| A Generalist Agent | 231 | 180 | DeepMind | UK | Industry |
| LaMDA: Language Models for Dialog Applications | 473 | 180 | – | USA | Industry |
| CMT: Convolutional Neural Networks Meet Vision Transformers | 0 | 172 | University of Sydney | Australia | Academia |
| Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model | 271 | 158 | Microsoft | USA | Industry |
| What Makes Good In-Context Examples for GPT-3? | 0 | 157 | Duke University | USA | Academia |
| Ensemble unsupervised autoencoders and Gaussian mixture model for cyberattack detection | – | 145 | Ningbo University of Technology | China | Academia |
| Training Compute-Optimal Large Language Models | – | 144 | DeepMind | UK | Industry |
| Learning robust perceptive locomotion for quadrupedal robots in the wild | 3 | 141 | ETH Zurich | Switzerland | Academia |
| Do As I Can, Not As I Say: Grounding Language in Robotic Affordances | 82 | 135 | – | USA | Industry |
| How Do Vision Transformers Work? | 193 | 129 | Yonsei University, NAVER | South Korea, South Korea | Academia, Industry |
| Scaling Up Your Kernels to 31×31: Revisiting Large Kernel Design in CNNs | 30 | 127 | Tsinghua University | China | Academia |
| Large Language Models are Zero-Shot Reasoners | 862 | 124 | University of Tokyo | Japan | Academia |
| Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time | 0 | 122 | University of Washington | USA | Academia |
| Patches Are All You Need? | 117 | 116 | Carnegie Mellon University | USA | Academia |
| Competition-Level Code Generation with AlphaCode | – | 113 | DeepMind | UK | Industry |
| TensoRF: Tensorial Radiance Fields | 73 | 110 | ShanghaiTech University | China | Academia |
| Video Diffusion Models | 0 | 103 | – | USA | Industry |
| Data Analytics for the Identification of Fake Reviews Using Supervised Learning | – | 102 | Dr. Babasaheb Ambedkar Marathwada University | India | Academia |
| Visual Prompt Tuning | 26 | 102 | Cornell University | USA | Academia |
| DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection | 15 | 100 | Hong Kong University of Science and Technology | Hong Kong | Academia |
| VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training | 66 | 100 | Nanjing University, Tencent | China, China | Academia, Industry |
| Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? | 199 | 99 | University of Washington, Meta | USA, USA | Academia, Industry |
| BEVFormer: Learning Bird’s-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers | 11 | 96 | Nanjing University, Shanghai AI Lab | China, China | Academia, Academia |
| Conditional Prompt Learning for Vision-Language Models | 51 | 93 | Nanyang Technological University | Singapore | Academia |
| Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution | 151 | 93 | Stanford University | USA | Academia |
| Measuring and Improving the Use of Graph Information in Graph Neural Networks | 1 | 93 | Chinese University of Hong Kong | Hong Kong | Academia |
| Exploring Plain Vision Transformer Backbones for Object Detection | 205 | 91 | Meta | USA | Industry |
| GeoDiff: a Geometric Diffusion Model for Molecular Conformation Generation | 26 | 90 | Mila, University of Montreal | Canada, Canada | Academia, Academia |
| OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework | 91 | 88 | Alibaba Group | China | Industry |
| Block-NeRF: Scalable Large Scene Neural View Synthesis | 641 | 86 | UC Berkeley | USA | Academia |
| Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents | 24 | 86 | UC Berkeley | USA | Academia |
| Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models | 881 | 81 | University of Notre Dame | USA | Academia |
| Outracing champion Gran Turismo drivers with deep reinforcement learning | – | 80 | Sony | Japan | Industry |
| BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning | 10 | 77 | – | USA | Industry |
| DN-DETR: Accelerate DETR Training by Introducing Query DeNoising | 0 | 74 | Hong Kong University of Science and Technology | Hong Kong | Academia |
| Equivariant Diffusion for Molecule Generation in 3D | 131 | 73 | University of Amsterdam | Netherlands | Academia |
| Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images | 6 | 73 | NVIDIA | USA | Industry |
| GPT-NeoX-20B: An Open-Source Autoregressive Language Model | 50 | 72 | EleutherAI | – | Industry |
| Online reinforcement learning multiplayer non-zero sum games of continuous-time Markov jump linear systems | – | 72 | Anhui University | China | Academia |
| Detecting Twenty-thousand Classes using Image-level Supervision | 35 | 70 | Meta | USA | Industry |
| Image fusion in the loop of high-level vision tasks: A semantic-aware real-time infrared and visible image fusion network | – | 68 | Wuhan University | China | Academia |
| LAION-5B: An open large-scale dataset for training next generation image-text models | 53 | 66 | LAION | Germany | Industry |
| Denoising Diffusion Restoration Models | 0 | 65 | Technion | Israel | Academia |
| VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance | 175 | 64 | EleutherAI | – | Industry |
| CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields | 33 | 63 | City University of Hong Kong | Hong Kong | Academia |
| Solving Quantitative Reasoning Problems with Language Models | 139 | 63 | – | USA | Industry |
| Masked Autoencoders As Spatiotemporal Learners | 120 | 61 | Meta | USA | Industry |
| Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language | 499 | 59 | – | USA | Industry |
| ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond | 2 | 59 | University of Sydney | Australia | Academia |
| Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks | 178 | 58 | Microsoft | USA | Industry |
| Language-driven Semantic Segmentation | 95 | 57 | Cornell University | USA | Academia |
| Vision-Language Pre-Training with Triple Contrastive Learning | 34 | 56 | University of Texas at Arlington | USA | Academia |
| Deep Reinforcement Learning-Based Path Control and Optimization for Unmanned Ships | – | 55 | Tongji University | China | Academia |
| EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction | 208 | 54 | MIT | USA | Academia |
| Omnivore: A Single Model for Many Visual Modalities | 89 | 54 | Meta | USA | Industry |
| Quantifying Memorization Across Neural Language Models | 106 | 54 | – | USA | Industry |
| DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection | 36 | 53 | Johns Hopkins University | USA | Academia |
| Genetic Algorithm-Based Trajectory Optimization for Digital Twin Robots | – | 53 | Wuhan University of Science and Technology | China | Academia |
| Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors | 280 | 53 | Meta | USA | Industry |
| Discovering faster matrix multiplication algorithms with reinforcement learning | – | 52 | DeepMind | UK | Industry |
| DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation | 221 | 52 | Google, Boston University | USA, USA | Industry, Academia |
| PETR: Position Embedding Transformation for Multi-View 3D Object Detection | 4 | 52 | Megvii | China | Industry |
| Protein structure predictions to atomic accuracy with AlphaFold | – | 51 | DeepMind | UK | Industry |
| ABAW: Valence-Arousal Estimation, Expression Recognition, Action Unit Detection & Multi-Task Learning Challenges | 2 | 50 | Queen Mary University of London | UK | Academia |
| HumanNeRF: Free-viewpoint Rendering of Moving People from Monocular Video | 72 | 50 | University of Washington | USA | Academia |
| UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models | 38 | 49 | University of Hong Kong | Hong Kong | Academia |
| A Systematic Evaluation of Large Language Models of Code | 61 | 48 | Carnegie Mellon University | USA | Academia |
| Robust Speech Recognition via Large-Scale Weak Supervision | 40 | 48 | OpenAI | USA | Industry |
| Diffusion Models: A Comprehensive Survey of Methods and Applications | 274 | 47 | Peking University | China | Academia |
| Can language models learn from explanations in context? | 113 | 46 | DeepMind | UK | Industry |
| NELA-GT-2021: A Large Multi-Labelled News Dataset for The Study of Misinformation in News Articles | 9 | 46 | Rensselaer Polytechnic Institute | USA | Academia |
| ActionFormer: Localizing Moments of Actions with Transformers | 0 | 44 | Nanjing University, 4Paradigm Inc. | China, China | Academia, Industry |
| Least-to-Most Prompting Enables Complex Reasoning in Large Language Models | – | 44 | – | USA | Industry |
| Diffusion-LM Improves Controllable Text Generation | 253 | 43 | Stanford University | USA | Academia |
| Overview of The Shared Task on Homophobia and Transphobia Detection in Social Media Comments | 0 | 41 | National University of Ireland Galway | Ireland | Academia |
| Text and Code Embeddings by Contrastive Pre-Training | 23 | 40 | OpenAI | USA | Industry |
| Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality | 125 | 40 | Hugging Face | USA | Industry |
| BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | 325 | 39 | BigScience Team | France | Industry |
| Red Teaming Language Models with Language Models | 40 | 39 | DeepMind, New York University | UK, USA | Industry, Academia |
| Transformer Memory as a Differentiable Search Index | 372 | 39 | – | USA | Industry |
| Torsional Diffusion for Molecular Conformer Generation | 109 | 38 | MIT | USA | Academia |
| Unified Contrastive Learning in Image-Text-Label Space | 66 | 37 | Microsoft | USA | Industry |
| Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks | 149 | 36 | University of Washington | USA | Academia |
Methodology
We first collected the most-cited papers for each year on the Zeta Alpha platform, then manually checked the date of first publication (usually an arXiv preprint) so that each paper was placed in the correct year. We supplemented this list by mining highly cited AI papers on Semantic Scholar, whose broader coverage and ability to sort by citation count surfaced additional papers, mainly from high-impact closed-source publishers (e.g. Nature, Elsevier, Springer, and other journals). We then took each paper's citation count on Google Scholar as the representative metric and sorted papers by this number to arrive at the year's top 100. For these papers, we used GPT-3 to extract the authors, their affiliations, and the corresponding countries, and checked the results manually (when the country was not clear from the publication, we used the country of the institution's headquarters). Papers with multiple affiliations were counted once for each affiliation.
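The final counting rule, where a multi-affiliation paper contributes one count to each of its affiliations, can be sketched in a few lines of Python. This is a minimal illustration run on four rows from the table above; the GPT-3 extraction and manual checks described in the methodology are omitted:

```python
from collections import Counter

# Four illustrative rows from the top-100 table (title, list of affiliations).
papers = [
    {"title": "A ConvNet for the 2020s", "orgs": ["Meta", "UC Berkeley"]},
    {"title": "Training language models to follow instructions with human feedback", "orgs": ["OpenAI"]},
    {"title": "Flamingo: a Visual Language Model for Few-Shot Learning", "orgs": ["DeepMind"]},
    {"title": "OPT: Open Pre-trained Transformer Language Models", "orgs": ["Meta"]},
]

counts = Counter()
for paper in papers:
    # A paper with multiple affiliations counts once for each affiliation;
    # set() guards against an affiliation listed twice on the same paper.
    counts.update(set(paper["orgs"]))

# Rank organizations by number of top-100 papers.
for org, n in counts.most_common():
    print(f"{org}: {n}")
```

Ranking organizations by these per-paper counts, rather than by total citations, is what lets the analysis compare R&D impact across labs of very different publication volumes.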