Awesome Deep Learning

Enhancements

pdf Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift 2015 @ Ioffe, Szegedy
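
  • For reference, a minimal NumPy sketch of the batch-norm transform from the paper (training-time mini-batch statistics only; running inference statistics and the backward pass are omitted for brevity, and the function name `batch_norm` is illustrative):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """BN transform: normalize activations x (shape [N, D]) per feature
    using mini-batch statistics, then scale and shift with the learnable
    parameters gamma and beta."""
    mu = x.mean(axis=0)                    # per-feature mini-batch mean
    var = x.var(axis=0)                    # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalize
    return gamma * x_hat + beta            # scale and shift

# usage: a mini-batch of 32 samples with 4 features
x = np.random.randn(32, 4)
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
```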

Sequence Processing (NLP, Time Series)

pdf BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding 2019 @ Devlin et al

  • Bidirectional Transformer-encoder architecture, pre-trained with masked language modeling and next-sentence prediction objectives.

pdf Attention Is All You Need 2017 @ Vaswani et al

  • Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train.
  • The encoder is composed of a stack of N = 6 identical layers. Each layer has two sub-layers: the first is a multi-head self-attention mechanism, the second a position-wise fully connected feed-forward network. A residual connection is applied around each of the two sub-layers, followed by layer normalization; that is, the output of each sub-layer is LayerNorm(x + Sublayer(x)). To facilitate these residual connections, all sub-layers in the model produce outputs of dimension d_model = 512 (see the sketch below).
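
This description maps directly onto a few lines of PyTorch. The following is a minimal sketch of one encoder layer, not the paper's reference implementation; d_model = 512, h = 8 heads, and d_ff = 2048 follow the paper's base model, while dropout, padding masks, and positional encodings are omitted for brevity:

```python
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    """One Transformer encoder layer: multi-head self-attention and a
    position-wise feed-forward network, each wrapped in a residual
    connection followed by layer normalization (post-norm, as in the
    paper): output = LayerNorm(x + Sublayer(x))."""
    def __init__(self, d_model=512, n_heads=8, d_ff=2048):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):                 # x: [batch, seq_len, d_model]
        attn_out, _ = self.attn(x, x, x)  # self-attention: Q = K = V = x
        x = self.norm1(x + attn_out)      # residual + layer norm
        x = self.norm2(x + self.ff(x))    # residual + layer norm
        return x

# usage: a stack of N = 6 identical layers, as in the paper's encoder
encoder = nn.Sequential(*[EncoderLayer() for _ in range(6)])
out = encoder(torch.randn(2, 10, 512))    # [batch=2, seq=10, d_model=512]
```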
