Notes for Paper “Lattice Long Short-Term Memory for Human Action Recognition”


Sun, Lin, et al. “Lattice long short-term memory for human action recognition.” arXiv preprint arXiv:1708.03958 (2017).

  • Basics
    • CNN methods for spatial appearance
    • RNN methods (LSTM) for temporal dynamics: naively applying an RNN is only suitable for short-term motions.
  • Main methods
    • Lattice-LSTM: extends LSTM by learning independent hidden-state transitions of memory cells for individual spatial locations.
      • Control gates are shared between the RGB and optical flow streams.
      • Greatly enhances the capacity of the memory cell to learn motion dynamics.
    • Multi-modal training procedure: jointly trains both the input gates and the forget gates in the network. (Other two-stream networks train the two streams separately.)


  • Take home message
  • Other methods mentioned
    • Extension of CNN: C3D learns both space and time, but only covers a short range of the sequence.
    • Training another neural network on optical flow.
    • Methods for obtaining a better combination of appearance and motion: spatial-temporal features learned with a sequential procedure, using 2D spatial (short) and 1D temporal (long) information.
    • ResNets
    • RNN, LSTM — encoder and decoder
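
The Lattice-LSTM item under "Main methods" can be made concrete with a small sketch. This is not the paper's implementation. It is a minimal pure-Python illustration of the idea in these notes: the gates use weights shared by every spatial location, while the hidden-state transition of the memory cell has its own weight at each location. Two simplifications are assumed here for brevity: the feature map has a single channel, and every transition is pointwise (a 1x1 kernel) rather than a spatial convolution. The function name `lattice_lstm_step` and all weight names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lattice_lstm_step(x, h_prev, c_prev, shared, local_w_hc):
    """One recurrent step on an H x W single-channel feature map.

    shared:     dict of scalar weights used at every location
                (all gates, plus the input-to-cell weight).
    local_w_hc: H x W grid; local_w_hc[i][j] is the hidden-to-cell
                weight that belongs to location (i, j) only.
    Assumption: pointwise (1x1) transitions, no biases.
    """
    rows, cols = len(x), len(x[0])
    h_new = [[0.0] * cols for _ in range(rows)]
    c_new = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            xi, hi, ci = x[i][j], h_prev[i][j], c_prev[i][j]
            # Gates: the same weights are reused at every location.
            in_gate = sigmoid(shared["w_xi"] * xi + shared["w_hi"] * hi)
            forget_gate = sigmoid(shared["w_xf"] * xi + shared["w_hf"] * hi)
            out_gate = sigmoid(shared["w_xo"] * xi + shared["w_ho"] * hi)
            # Candidate memory: the hidden-state transition is
            # location-specific, so each location can model its own motion.
            candidate = math.tanh(shared["w_xc"] * xi + local_w_hc[i][j] * hi)
            c_new[i][j] = forget_gate * ci + in_gate * candidate
            h_new[i][j] = out_gate * math.tanh(c_new[i][j])
    return h_new, c_new


# Two locations receive identical input and state but own different
# hidden-to-cell weights, so their memory cells evolve differently.
shared = {k: 0.5 for k in ("w_xi", "w_hi", "w_xf", "w_hf", "w_xo", "w_ho", "w_xc")}
x = [[1.0, 1.0]]
h0 = [[0.5, 0.5]]
c0 = [[0.0, 0.0]]
h1, c1 = lattice_lstm_step(x, h0, c0, shared, local_w_hc=[[1.0, -1.0]])
print(c1)
```

Replacing `local_w_hc` with a grid of identical values recovers an ordinary shared-weight recurrent step, in which every location is forced to follow the same dynamics. That is the stationarity assumption that limits a naively applied RNN to short-term motions.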


