Jiangsu Science & Technology Information (江苏科技信息) ›› 2016, Vol. 33 ›› Issue (8): 27-29. doi: 10.3969/j.issn.1004-7530.2016.08.004
• Articles •
Yan Shun (严顺)
Abstract: Chinese word segmentation is an important research area in natural language processing (NLP), yet word segmentation for ancient Chinese remains largely unexplored. Based on the conditional random field (CRF) model, this article investigates automatic word segmentation of ancient Chinese texts. It designs two sets of comparative experiments and studies part-of-speech (POS) tagging models on an ancient Chinese corpus comprising 27 classic pre-Qin works.
Keywords: CRF, ancient Chinese corpus, part-of-speech (POS) tagging
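CRF-based segmentation is usually framed as character-level sequence labeling, and the sketch below illustrates that framing for joint segmentation and POS tagging. The abstract does not state which label scheme the article uses, so the B/M/E/S position scheme, the POS tag names (`nr`, `v`), the example sentence, and both function names are assumptions made purely for illustration.

```python
def to_char_labels(words):
    """Convert (word, pos) pairs into per-character (char, label) pairs.

    Labels combine an assumed position tag (B = begin, M = middle,
    E = end, S = single-character word) with the word's POS tag.
    """
    labeled = []
    for word, pos in words:
        if len(word) == 1:
            positions = ["S"]
        else:
            positions = ["B"] + ["M"] * (len(word) - 2) + ["E"]
        for ch, position in zip(word, positions):
            labeled.append((ch, f"{position}-{pos}"))
    return labeled


def from_char_labels(labeled):
    """Rebuild (word, pos) pairs from per-character labels."""
    words, buffer = [], ""
    for ch, label in labeled:
        position, pos = label.split("-", 1)
        buffer += ch
        # A word is complete when its last character (E) or a
        # single-character word (S) has been read.
        if position in ("S", "E"):
            words.append((buffer, pos))
            buffer = ""
    return words


# Hypothetical gold-standard annotation of a short pre-Qin sentence.
sentence = [("孟子", "nr"), ("见", "v"), ("梁惠王", "nr")]
labels = to_char_labels(sentence)
restored = from_char_labels(labels)
```

Under this framing, a CRF would be trained to predict the label sequence from character features, and the predicted labels would then be decoded back into words and POS tags as `from_char_labels` does.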
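Yan Shun. Research on a CRF-Based Word Segmentation and Tagging Model for Ancient Chinese [J]. Jiangsu Science & Technology Information, 2016, 33(8): 27-29.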
Link to this article: https://qkcb.jssti.cn/jskj/CN/10.3969/j.issn.1004-7530.2016.08.004
https://qkcb.jssti.cn/jskj/CN/Y2016/V33/I8/27