Shandong Science ›› 2017, Vol. 30 ›› Issue (1): 115-121. doi: 10.3976/j.issn.1002-4026.2017.01.019

• Other Research Papers •

Sentiment orientation analysis of news text based on lexicon and rules

李晨,朱世伟,魏墨济,于俊凤,李新天   

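  1. Information Institute, Shandong Academy of Sciences, Jinan 250014, Shandong, China; 2. Biology Institute, Shandong Academy of Sciences, Jinan 250014, Shandong, China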
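  • Received: 2016-07-13 Online: 2017-02-20 Published: 2017-02-20
  • About the author: LI Chen (b. 1988), male, master's degree; research interests: big data and data mining.
  • Supported by:
    Shandong Province Science and Technology Development Plan (2014GGX101013); Shandong Province Key Research and Development Program (2015GGX101032, 2015GGX101037, 2016GGX101018)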

Lexicon and rules based news text sentiment analysis

LI Chen, ZHU Shi-wei, WEI Mo-ji, YU Jun-feng, LI Xintian

  1. Information Institute, Shandong Academy of Sciences, Jinan 250014, China; 2. Biology Institute, Shandong Academy of Sciences, Jinan 250014, China
  • Received: 2016-07-13 Online: 2017-02-20 Published: 2017-02-20

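Abstract: Based on a structural analysis of the news genre, news texts are divided by paragraph, and a sentiment key sentence extraction method that combines a sentiment lexicon with semantic rules is used to analyze the sentiment of the sentences within each paragraph. A sentiment lexicon is built that takes into account sentiment, adversative, negation, degree and summarizing words. The news text is segmented by rules into sense groups, sentences, paragraphs and the whole document; the orientation values of the sentiment key sentences are computed with the established rules, and the orientation values of each paragraph and of the whole document are then obtained, which gives the sentiment orientation of the news item. Comparison with the experimental results of a sentiment lexicon method and an SVM sentiment classification method shows that the proposed method performs well in judging the orientation of news text and is feasible.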

Key words: sentiment analysis, rules, online news, sentiment lexicon

Abstract: Based on a structural analysis of the news genre, news texts were divided into paragraphs. A method for extracting sentiment key sentences, combining a sentiment lexicon with semantic rules, was used to analyze the sentiment of the sentences within each paragraph. First, a sentiment lexicon was built covering sentiment, adversative, negation, degree and summarizing words. Second, the news text was segmented by rules into sense groups, sentences, paragraphs and the whole document. Third, the orientation values of the sentiment key sentences were computed with the established rules, and the orientation values of each paragraph and of the whole document were obtained as weighted averages of the sentence values, giving the sentiment orientation of the news item. Comparison with a lexicon-based method and SVM sentiment classification shows that the proposed method performs well in identifying the orientation of news text and is feasible.

Key words: rules, sentiment lexicon, sentiment analysis, online news
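
The pipeline outlined in the abstract (lexicon lookup, segmentation into sense groups, sentences and paragraphs, then averaging up to the document) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the tiny English word lists, the modifier rules, the adversative weight of 2.0, the definition of a key sentence as one with a nonzero score, and the equal weights used in the averages. The paper's actual lexicon, rules and weights are not given on this page.

```python
import re

# Hypothetical toy lexicon (the paper's real lexicon is not reproduced here).
SENTIMENT = {"good": 1.0, "success": 1.0, "bad": -1.0, "failure": -1.0}
NEGATION = {"not", "never"}
DEGREE = {"very": 1.5, "slightly": 0.5}
ADVERSATIVE = {"but", "however"}
ADVERSATIVE_WEIGHT = 2.0  # assumed emphasis on the clause after a turn


def tokens(text):
    """Lowercase word tokens of a text span."""
    return re.findall(r"[a-z']+", text.lower())


def score_sense_group(group):
    """Sum sentiment words; preceding negation flips and degree words scale."""
    score, scale, sign = 0.0, 1.0, 1.0
    for tok in tokens(group):
        if tok in NEGATION:
            sign = -sign
        elif tok in DEGREE:
            scale *= DEGREE[tok]
        elif tok in SENTIMENT:
            score += sign * scale * SENTIMENT[tok]
            scale, sign = 1.0, 1.0  # modifiers apply to one sentiment word
    return score


def score_sentence(sentence):
    """Split into sense groups at commas/semicolons and add weighted scores."""
    total = 0.0
    for group in re.split(r"[,;]", sentence):
        toks = tokens(group)
        if not toks:
            continue
        weight = ADVERSATIVE_WEIGHT if toks[0] in ADVERSATIVE else 1.0
        total += weight * score_sense_group(group)
    return total


def score_paragraph(paragraph):
    """Average over key sentences (assumed: sentences with nonzero score)."""
    scores = [score_sentence(s) for s in re.split(r"[.!?]", paragraph)]
    key = [s for s in scores if s != 0.0]
    return sum(key) / len(key) if key else 0.0


def score_document(text):
    """Equal-weight average of paragraph scores (blank-line separated)."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    scores = [score_paragraph(p) for p in paragraphs]
    return sum(scores) / len(scores) if scores else 0.0


def orientation(text):
    """Map the document score to a polarity label."""
    score = score_document(text)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"
```

Calling `orientation(text)` on a news item whose paragraphs are separated by blank lines returns one of `"positive"`, `"negative"` or `"neutral"`. Replacing the equal-weight averages with position-dependent weights (for example, emphasizing the first and last paragraphs) would be one way to approximate a weighted average, but which weights the authors used is not stated here.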

CLC number:

  • TP311.1