Introduction
Attention Is All You Need
Abstract
1 Introduction
2 Background
3 Model Architecture
3.1 Encoder and Decoder Stacks
3.2 Attention
3.3 Position-wise Feed-Forward Networks
3.4 Embeddings and Softmax
3.5 Positional Encoding
4 Why Self-Attention
Papers
2020.acl-main.6
2020.acl-main.7
2020.acl-main.8
2020.acl-main.9
2020.acl-main.10
2020.acl-main.11
2020.acl-main.20
2020.acl-main.23
2020.acl-main.25
2020.acl-main.34
2020.acl-main.36
2020.acl-main.40
2020.acl-main.41
2020.acl-main.42
2020.acl-main.54
2020.acl-main.55
2020.acl-main.57
2020.acl-main.60
2020.acl-main.61
2020.acl-main.64
2020.acl-main.67
2020.acl-main.68
2020.acl-main.76
2020.acl-main.80
2020.acl-main.94
2020.acl-main.98
2020.acl-main.101
2020.acl-main.127
2020.acl-main.130
2020.acl-main.131
2020.acl-main.143
2020.acl-main.144
2020.acl-main.145
2020.acl-main.147
2020.acl-main.148
2020.acl-main.150
2020.acl-main.152
2020.acl-main.154
2020.acl-main.155
2020.acl-main.164
2020.acl-main.165
2020.acl-main.167
2020.acl-main.171
2020.acl-main.184
2020.acl-main.185
2020.acl-main.186
2020.acl-main.201
2020.acl-main.218
2020.acl-main.219
2020.acl-main.221
2020.acl-main.222
2020.acl-main.223
2020.acl-main.224
2020.acl-main.227
2020.acl-main.228
2020.acl-main.251
2020.acl-main.252
2020.acl-main.254
Neural Architectures for Named Entity Recognition
Abstract
1 Introduction
2 LSTM-CRF Model
3 Transition-Based Chunking Model
4 Input Word Embeddings
5 Experiments
6 Related Work
7 Conclusion
References
Teaching Machines to Read and Comprehend
Abstract
1 Introduction
2 Supervised training data for reading comprehension
3 Models
4 Empirical Evaluation
5 Conclusion
References
Exploring the Limits of Language Modeling
Abstract
1 Introduction
2 Related Work
3 Language Modeling Improvements
4 Experiments
5 Results and Analysis
6 Discussion and Conclusions
References
Effective Approaches to Attention-based Neural Machine Translation
Abstract
1 Introduction
2 Neural Machine Translation
3 Attention-based Models
4 Experiments
5 Analysis
6 Conclusion
References
Conditional Random Fields as Recurrent Neural Networks
Abstract
1. Introduction
2. Related Work
3. Conditional Random Fields
4. A Mean-field Iteration as a Stack of CNN Layers
5. The End-to-end Trainable Network
Exploring the Limits of Language Modeling
Link: http://arxiv.org/pdf/1602.02410