# Home
Let's use this wiki to keep a reading list of interesting papers. Feel free to edit it.
* [CS224D: Deep Learning for NLP](http://cs224d.stanford.edu/syllabus.html)
* The Sequence-to-Sequence Model
* Attention Model, another one, and a Conversation Model
* Memory Network and [End-to-End Memory Network](http://arxiv.org/pdf/1503.08895v5.pdf), Dynamic Memory Network, DMN for Visual
* NLP with Distributed Representation
* NLP from scratch - the DeepText model
* Character-level LM, character-level DL for classification
* Generation of conversational responses
* A chatroom dataset and its GitHub repository
* Function approximation with second-order optimization
* An introduction to Go-playing AI (in Chinese), Google's AlphaGo, and Facebook's Darkforest
* The Atari RL paper and its Nature version, the [Google Atari RL architecture](http://www.iclr.cc/lib/exe/fetch.php?media=iclr2015:silver-iclr2015.pdf), and Google's distributed RL paper
* Baidu's DNN-based speech recognition system
* ImageNet 2015 winning solution: Deep Residual Learning
* Deep Networks with Stochastic Depth
* Clipping and regularization to alleviate exploding and vanishing gradients (a minimal clipping sketch appears after this list)
* A good introduction to LSTM and its variants, and a [good lecture](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec7.pdf)
* Colah's blog. The author explains neural network concepts in an unusually clear way; I especially like how he describes LSTM
* Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks (sketched after this list)
* Distributed training with SSP (stale synchronous parallel) by Xing's group
* The original CNN paper. Although it was written in 1998, the first two sections are still well worth reading today
* A list of deep learning papers (a bit dated) and some other list
* A DL talk introducing applications by Yann LeCun, and the 2015 NIPS DL tutorial
* Deep learning framework comparison by Bartvm
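
The clipping entry above boils down to one operation; here is a minimal sketch of global-norm gradient clipping, assuming gradients arrive as a list of NumPy arrays (the helper and its names are my own illustration, not code from the linked paper):

```python
import numpy as np

def clip_by_global_norm(grads, max_norm=5.0):
    # Rescale all gradients when their joint L2 norm exceeds max_norm,
    # the usual remedy for exploding gradients in recurrent nets.
    global_norm = np.sqrt(sum(np.sum(g * g) for g in grads))
    scale = min(1.0, max_norm / (global_norm + 1e-12))
    return [g * scale for g in grads]

# usage: a deliberately huge gradient gets shrunk; a small one is untouched
clipped = clip_by_global_norm([np.full((2, 2), 100.0), np.full(3, 100.0)])
```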
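
The Weight Normalization paper's reparameterization also fits in a few lines: w = g * v / ||v||, so the length and direction of each weight vector are trained separately. A sketch, with names of my own choosing:

```python
import numpy as np

def weight_norm(v, g):
    # Reparameterize a weight vector as w = g * v / ||v||:
    # g controls the length, v / ||v|| the direction.
    return g * (v / np.linalg.norm(v))

# hypothetical single-output dense layer
rng = np.random.default_rng(0)
v = rng.normal(size=64)          # direction parameters
g = 1.5                          # scalar length parameter
x = rng.normal(size=(8, 64))     # a batch of 8 inputs
y = x @ weight_norm(v, g)        # forward pass with the reparameterized weight
```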
## Compression of deep learning
* [Binary Connect](http://arxiv.org/pdf/1511.00363v2.pdf): binarize weights and quantize (hidden or raw) inputs to save multiplications (a sketch follows this list)
* [BinaryNet](http://arxiv.org/pdf/1602.02830v2.pdf): binary weights and activations
* [1-bit compression](http://research.microsoft.com/pubs/230137/IS140694.PDF): works well for dense data like speech, but is questionable for sparse data like text
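
A minimal sketch of the BinaryConnect-style trick: binarize weights deterministically in the forward pass, but keep and update real-valued master weights. The gradient here is a random placeholder standing in for real backprop; everything else is assumption-level illustration, not the papers' code.

```python
import numpy as np

def binarize(w):
    # Deterministic binarization: the forward pass uses sign(w),
    # so matrix products need only additions and sign flips.
    return np.where(w >= 0.0, 1.0, -1.0)

rng = np.random.default_rng(0)
w_real = rng.normal(scale=0.01, size=(256, 128))  # real-valued master weights
x = rng.normal(size=(32, 256))                    # a batch of inputs
h = x @ binarize(w_real)                          # forward pass with binary weights

# Updates are applied to the real-valued weights (straight-through:
# the gradient w.r.t. the binary weights is reused unchanged), which are
# clipped to [-1, 1] so they stay in the binarizable range.
grad = rng.normal(size=w_real.shape)              # placeholder for a real gradient
w_real = np.clip(w_real - 0.01 * grad, -1.0, 1.0)
```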
## Embedding
* GloVe from the Stanford NLP group
* Large target vocabulary in LSTM, and importance sampling (a sampled-softmax sketch follows this list)
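
The importance-sampling entry is about making the output softmax affordable when the vocabulary is huge: score the true word against a handful of sampled negatives instead of every word. A rough sketch with uniform negatives (the papers weight samples by a proposal distribution; all names here are mine):

```python
import numpy as np

def sampled_softmax_loss(h, w_out, target, num_samples=64, rng=None):
    # Cross-entropy over the target plus a few sampled negatives,
    # instead of a softmax over the full vocabulary.
    rng = rng or np.random.default_rng(0)
    negatives = rng.integers(0, w_out.shape[0], size=num_samples)
    cand = np.concatenate(([target], negatives))   # true word sits at index 0
    logits = w_out[cand] @ h                       # num_samples + 1 dot products
    logits -= logits.max()                         # numerical stability
    return -logits[0] + np.log(np.exp(logits).sum())

# usage: one position with a 100k-word output vocabulary
rng = np.random.default_rng(1)
loss = sampled_softmax_loss(rng.normal(size=128),
                            rng.normal(size=(100_000, 128)),
                            target=42, rng=rng)
```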
## Linear Model
* [FTRL: Follow the Regularized Leader](http://arxiv.org/pdf/1403.3465v3.pdf), and Google's LR system using FTRL (the update rule is sketched below)
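
For readers who want the actual update: the per-coordinate FTRL-Proximal step from McMahan et al. is short enough to sketch. This follows the published pseudocode, though the variable names and the toy usage are my own:

```python
import numpy as np

def ftrl_step(w, z, n, g, alpha=0.1, beta=1.0, l1=1.0, l2=1.0):
    # Accumulate gradient statistics in z and n, then solve the
    # per-coordinate L1/L2 proximal problem in closed form.
    sigma = (np.sqrt(n + g * g) - np.sqrt(n)) / alpha
    z = z + g - sigma * w
    n = n + g * g
    w = np.where(
        np.abs(z) <= l1,
        0.0,  # the L1 term snaps small coordinates exactly to zero
        -(z - np.sign(z) * l1) / ((beta + np.sqrt(n)) / alpha + l2),
    )
    return w, z, n

# usage: a few steps on stand-in gradients for a 10-feature LR model
rng = np.random.default_rng(0)
w, z, n = np.zeros(10), np.zeros(10), np.zeros(10)
for _ in range(5):
    w, z, n = ftrl_step(w, z, n, rng.normal(size=10))
```

The closed-form threshold is what gives FTRL its sparse models: coordinates with small accumulated gradients come out exactly zero.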
## Recommendation
* An extensive study by Xavier
## Others
* Bayesian Program Learning