Transformer

Computer-Nerd 2023. 4. 9.

Information

  • The Transformer is a deep learning architecture introduced in the 2017 paper "Attention Is All You Need" by Vaswani et al. for natural language processing tasks.
  • Unlike traditional RNNs, Transformers do not process a sequence step by step to learn context. Instead, they use a self-attention mechanism that lets every position attend to all positions in the input simultaneously (a minimal sketch of this computation follows this list).
  • The architecture of the Transformer is composed of an encoder and a decoder. The encoder maps the input sequence to a sequence of contextualized representations, one per input position, and the decoder generates the output sequence while attending to those representations (see the usage example after this list).
  • The self-attention mechanism allows the model to weigh the importance of every input position when producing each output. The multi-head variant used in the paper runs several attention operations in parallel, letting the model capture different kinds of relationships within the same layer.
  • The Transformer has been shown to achieve state-of-the-art results in many natural language processing tasks, including machine translation and language modeling.
  • In addition to NLP, Transformers have also been applied to other tasks such as image generation, video classification, and speech recognition.
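
To make the self-attention bullets above concrete, here is a minimal sketch of single-head scaled dot-product self-attention, softmax(QKᵀ/√d_k)V, in NumPy. This is an illustration, not the paper's full multi-head implementation; the dimensions, weight matrices, and helper names are made up for the example.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))  # subtract max for numerical stability
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(X, Wq, Wk, Wv):
        # X: (n, d_model) token sequence; Wq/Wk/Wv: learned projection matrices.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        d_k = K.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)      # (n, n): every position scores every position
        weights = softmax(scores, axis=-1)   # each row sums to 1: per-position importance
        return weights @ V                   # weighted sum of value vectors

    # Toy example: 4 tokens, model width 8 (all numbers here are arbitrary).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 8))
    Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8)

Note that the weights matrix is computed in one shot for all positions, which is what allows the simultaneous processing mentioned above, in contrast to an RNN's step-by-step recurrence.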
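
For the encoder-decoder structure, one quick way to see the moving parts is PyTorch's built-in nn.Transformer module, shown here with the paper's base configuration (model width 512, 8 attention heads, 6 encoder and 6 decoder layers). The tensor shapes and random inputs are purely illustrative.

    import torch
    import torch.nn as nn

    model = nn.Transformer(d_model=512, nhead=8,
                           num_encoder_layers=6, num_decoder_layers=6)

    # Illustrative random inputs; shapes are (sequence length, batch, d_model)
    # because nn.Transformer defaults to batch_first=False.
    src = torch.rand(10, 32, 512)  # source sequence fed to the encoder
    tgt = torch.rand(20, 32, 512)  # target sequence fed to the decoder
    out = model(src, tgt)          # shape (20, 32, 512): one vector per target position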
