Sentiment Analysis of Chinese Microblog Based on Stacked Bidirectional LSTM
Yue Lu1, Junhao Zhou1, Hong-Ning Dai1, Hao Wang2, Hong Xiao3
1 Macao University of Science and Technology
2 Norwegian University of Science and Technology
3 Guangdong University of Technology
Outline
Sentiment Analysis
Microblog texts → Sentiment Analysis Model → Positive / Negative
Related Works - Word Representation
Feature engineering
• Hand-crafted features
• Sentiment lexicons
Problem
Example sentiment-lexicon scores: excited 0.641, satisfied 0.531, cool 0.375, thought 0.266, make 0.063, sadly -0.266, unhappy -0.531, annoying -0.719
Can these features be manually designed?
E1. 为祖国疯狂打call
( Cheer for my country! )
E2. 陈独秀请你坐下
( Your idea was quite brilliant! )
Traditional feature engineering cannot encode semantic features automatically
Related Works - Document Representation
Sentiment Analysis
Problem
Example word weights for "have a nice day": nice 0.719, day 0.094, a 0.000, have 0.000
Can the sentiment orientation of the middle two sentences be correctly identified without referring to the context?
E3. 为什么要这么苛刻呢?8 分钟展现出这么多中国元素,中国科技,展现出中国的热情和自信。张艺谋导演真的是鞠躬尽瘁了。搞不懂这些人!
( Why are they so mean? This 8-minute show exhibited so many Chinese elements, Chinese technologies as well as our people's enthusiasm and confidence. The director Zhang Yimou has already tried his best. I really can't understand these people! )
Traditional non-RNN-based methods cannot handle long-range dependencies between words (contextual features): a non-RNN-based model misclassifies E3, whose overall sentiment is positive (Positive ✗).
Challenges for sentiment analysis of Chinese Microblog
Outline
Overview of Methodology
semantic features
contextual features
Overview of Methodology
semantic features
Continuous Bag-of-Words (CBOW)
Beijing
Olympics
remarkable
excellent
brilliant
perfect
Each word is mapped to a 100-dimensional column vector (100 elements).
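How such vectors arise can be illustrated with a toy CBOW trainer in plain numpy (the tiny corpus, a 10-dimensional embedding instead of 100, and the learning rate are all illustrative assumptions, not the paper's training setup):

```python
import numpy as np

# Toy CBOW: predict the current word from the average of its context words.
rng = np.random.default_rng(0)
corpus = "beijing olympics remarkable excellent brilliant perfect".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 10          # vocabulary size, embedding dimension

W_in = rng.normal(0, 0.1, (V, D))   # input embeddings (one row per word)
W_out = rng.normal(0, 0.1, (D, V))  # output projection

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

lr, window = 0.1, 2
for _ in range(200):
    for t, word in enumerate(corpus):
        ctx = [idx[corpus[j]]
               for j in range(max(0, t - window), min(len(corpus), t + window + 1))
               if j != t]
        h = W_in[ctx].mean(axis=0)               # average of context embeddings
        p = softmax(h @ W_out)                   # predicted distribution over vocab
        grad = p.copy()
        grad[idx[word]] -= 1.0                   # cross-entropy gradient at the output
        g_h = W_out @ grad                       # gradient flowing back to the hidden layer
        W_out -= lr * np.outer(h, grad)
        for c in ctx:
            W_in[c] -= lr * g_h / len(ctx)

vec = W_in[idx["brilliant"]]   # the learned word vector
print(vec.shape)
```

After training, each row of `W_in` is the dense vector used to represent that word downstream.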
Overview of Methodology
contextual features
Recurrent neural network (RNN)
Problem:
An RNN stores all the information from previous inputs without filtering out useless information, so it cannot handle long-range dependencies between words.
inputs
outputs
Long Short Term Memory (LSTM) [1]
Memory Unit of LSTM [2]
forget useless info
memorize useful info
[1] Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
[2] Zhang, L., Wang, S., & Liu, B. (2018). Deep learning for sentiment analysis: a survey. Wiley Interdisciplinary Reviews Data Mining & Knowledge Discovery.
Sentiment Analysis Based on Stacked Bi-LSTM
A bidirectional LSTM combines past contexts and future contexts.
Stacked Bidirectional LSTM Model
(Diagram legend: LSTM cell; information to be kept; information to be forgotten)
Sentiment Analysis Based on Stacked Bi-LSTM
Sentiment Prediction
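The stacked bidirectional classifier can be sketched in plain numpy (a minimal illustration with assumed dimensions, two layers, and untrained random weights; the paper's actual architecture and training procedure are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def init(din, dh):
    # One stacked weight block for the four gates: [forget, input, candidate, output].
    return rng.normal(0, 0.1, (4 * dh, din + dh)), np.zeros(4 * dh)

def lstm_run(xs, W, b, dh):
    """Run an LSTM over a sequence, returning the hidden state at every step."""
    h, c, out = np.zeros(dh), np.zeros(dh), []
    for x in xs:
        z = W @ np.concatenate([h, x]) + b
        f, i = sigmoid(z[:dh]), sigmoid(z[dh:2*dh])
        g, o = np.tanh(z[2*dh:3*dh]), sigmoid(z[3*dh:])
        c = f * c + i * g          # new cell state
        h = o * np.tanh(c)         # new hidden state
        out.append(h)
    return out

def bilstm(xs, dh, din):
    """One bidirectional layer: forward pass + backward pass, concatenated per step."""
    Wf, bf = init(din, dh)
    Wb, bb = init(din, dh)
    fwd = lstm_run(xs, Wf, bf, dh)
    bwd = lstm_run(xs[::-1], Wb, bb, dh)[::-1]
    return [np.concatenate([f, b_]) for f, b_ in zip(fwd, bwd)]

# Two stacked bidirectional layers over a toy sequence of 100-dim word vectors.
T, D, H = 6, 100, 32
xs = [rng.normal(size=D) for _ in range(T)]
layer1 = bilstm(xs, H, D)            # each step yields 2*H features
layer2 = bilstm(layer1, H, 2 * H)    # second layer consumes the first layer's outputs
w = rng.normal(size=2 * H)
p_positive = sigmoid(w @ layer2[-1])  # sentiment prediction from the final step
print(round(float(p_positive), 3))
```

Stacking lets the second layer read representations that already combine past and future contexts, which is the point of the stacked Bi-LSTM design.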
Outline
Experiment Setting
Results
Influence of Different Factors
Influence of Different Factors
Outline
Conclusion
Thanks
Corpus Construction
Preprocessing of raw microblog text:
Removing hashtags, reply symbols, user-name references (@user), and links.
Chinese word segmentation and stop-word removal
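The cleaning step can be sketched with regular expressions (a minimal illustration; the exact patterns and the Chinese segmentation tool used in the paper are assumptions here):

```python
import re

def clean_microblog(text: str) -> str:
    """Strip hashtags, reply symbols, @user mentions, and links from raw microblog text."""
    text = re.sub(r"https?://\S+", " ", text)   # links
    text = re.sub(r"#[^#]+#", " ", text)        # Weibo-style #hashtag# topics
    text = re.sub(r"@\S+", " ", text)           # @user mentions
    text = re.sub(r"回复|//", " ", text)        # reply markers
    return re.sub(r"\s+", " ", text).strip()

print(clean_microblog("回复@user: 为祖国疯狂打call #冬奥会# http://t.cn/xyz"))
# → 为祖国疯狂打call
```

The cleaned text would then be passed to a Chinese word segmenter before stop-word removal.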
Skip-gram: predicts the context words from the current word (Mikolov et al., 2013)
Continuous Bag-of-Words (CBOW): predicts the current word from its context words (Mikolov et al., 2013)
Long Short Term Memory (LSTM) [1]
Memory Unit of LSTM [2]
Notation: C_{t-1} (old cell state), C_t (new cell state), h_{t-1} (output from the previous hidden layer), x_t (current input), h_t (output of the current hidden layer).
• Forget gate: what information to dump from the cell state: f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
• Input gate: what new information to store in the cell state: i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
• New memory: new candidate values after adding the new input: C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)
• New cell state: update the old cell state C_{t-1} into the new cell state: C_t = f_t ∘ C_{t-1} + i_t ∘ C̃_t
• Output gate: decides which parts of the cell state to output: o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
• Output of the current hidden layer: put the cell state through tanh and multiply it by the output of the sigmoid gate: h_t = o_t ∘ tanh(C_t)
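The memory-unit update described above can be written down step by step (a minimal numpy sketch; the dimensions and random weights are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, P):
    """One LSTM memory-unit update following the gate descriptions above."""
    hx = np.concatenate([h_prev, x_t])
    f = sigmoid(P["Wf"] @ hx + P["bf"])   # forget gate: what to dump from the cell state
    i = sigmoid(P["Wi"] @ hx + P["bi"])   # input gate: what new information to store
    g = np.tanh(P["Wc"] @ hx + P["bc"])   # new memory: candidate values
    c = f * c_prev + i * g                # new cell state C_t
    o = sigmoid(P["Wo"] @ hx + P["bo"])   # output gate: which parts of C_t to output
    h = o * np.tanh(c)                    # output of the current hidden layer
    return h, c

D, H = 100, 8   # illustrative input and hidden sizes
P = {f"W{k}": rng.normal(0, 0.1, (H, H + D)) for k in "fico"}
P.update({f"b{k}": np.zeros(H) for k in "fico"})
h, c = lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H), P)
print(h.shape, c.shape)   # (8,) (8,)
```

Because h_t is an output-gated tanh of the cell state, every component of h stays strictly inside (-1, 1).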