Deep Learning based approaches
- APPLYING DEEP LEARNING TO ANSWER SELECTION: A STUDY AND AN OPEN TASK [IBM Watson]
We apply a general deep learning framework to address the non-factoid question answering task. Our approach does not rely on any linguistic tools and can be applied to different languages or domains…the top-1 accuracy can reach up to 65.3% on a test set, which indicates a great potential for practical use.
We treat the QA from a text matching and selection perspective.
By definition (from the answer selection perspective), the QA problem can be regarded as a binary classification problem: for each question, each answer candidate is either appropriate or not. To find the best pair, we need a metric that measures the matching degree of each QA pair, so that the pair with the highest metric value is chosen.
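A minimal sketch of the "score every QA pair, keep the highest" idea. The scoring function here is cosine similarity over bag-of-words counts, which is only an assumed stand-in for a learned representation; `bag_of_words`, `cosine`, and `select_answer` are illustrative names, not from the paper.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace-token counts; a stand-in for a learned encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_answer(question, candidates):
    """Score every (question, candidate) pair and return the best one."""
    q_vec = bag_of_words(question)
    scored = [(cosine(q_vec, bag_of_words(c)), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score
```

Swapping `bag_of_words` for a neural encoder leaves the selection step unchanged, which is the point of treating QA as matching and selection.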
Section 4 (“RESULTS AND DISCUSSIONS”) needs careful attention.
- A Neural Network for Factoid Question Answering over Paragraphs
- Deep Learning for Answer Sentence Selection
- A Deep Architecture for Matching Short Texts
However, before learning can take place, such pairs need to be mapped from the original space of symbolic words into some feature space encoding various aspects of their relatedness, e.g. lexical, syntactic and semantic.
While pairwise and listwise approaches claim to yield better performance, they are more complicated to implement and less effective to train.
To train the embeddings we use the skipgram model with window size 5 and filtering words with frequency less than 5. The resulting model contains 50-dimensional vectors for about 3.5 million words. Embeddings for words not present in the word2vec model are randomly initialized with each component sampled from the uniform distribution U[−0.25, 0.25].
Additionally, even for the words found in the word matrix, as noted in [38], one of the weaknesses of approaches relying on distributional word vectors is their inability to deal with numbers and proper nouns. This is especially important for factoid question answering, where most of the questions are of type what, when, who that are looking for answers containing numbers or proper nouns.
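The out-of-vocabulary initialization quoted above can be sketched as follows. This is only an illustration: `pretrained` is assumed to be a plain dict from word to vector (standing in for a loaded word2vec model), and `build_embedding_table` is a hypothetical helper name.

```python
import random

EMBEDDING_DIM = 50  # dimensionality quoted in the excerpt above


def build_embedding_table(vocab, pretrained, dim=EMBEDDING_DIM, seed=0):
    """Copy pretrained vectors; draw missing ones from U[-0.25, 0.25]."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    table = {}
    for word in vocab:
        if word in pretrained:
            table[word] = list(pretrained[word])
        else:
            # Word absent from the pretrained model: random initialization.
            table[word] = [rng.uniform(-0.25, 0.25) for _ in range(dim)]
    return table
```

Note that random vectors carry no information about numbers or proper nouns, which is the weakness the excerpt points out.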
- Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection
- Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement [… a novel neural network architecture that demonstrates state-of-the-art accuracy on three SemEval tasks and two answer selection tasks.]
- Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks
- Multitask Learning with Deep Neural Networks for Community Question Answering
- Inner Attention based Recurrent Neural Networks for Answer Selection
- LSTM-based Deep Learning Models for non-factoid answer selection
Undefined
- A survey on question answering systems with classification
- WikiQA: A Challenge Dataset for Open-Domain Question Answering
- Automatic Feature Engineering for Answer Selection and Extraction
- Question Answering Using Enhanced Lexical Semantic Models
- A Joint Model for Answer Sentence Ranking and Answer Extraction
- Learning concept importance using a weighted dependence model
- Parameterized concept weighting in verbose queries [6 and 7 are mentioned as state-of-the-art in the Watson lab’s paper]
Questions (q,q)
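- Keywords
“宝宝能吃辣椒吗?” (Can babies eat chili peppers?) and “宝宝能吃鱼吗?” (Can babies eat fish?) get a high similarity score
- Affirmation vs. negation
- Semantics
- Similar questions with different answers
- Ambiguity
“宝宝多大能睡枕头” (At what age can a baby sleep on a pillow?) vs. “宝宝能睡多大的枕头” (How big a pillow can a baby sleep on?)
- Time/quantity words
“7个月宝宝能吃鱼吗?” (Can a 7-month-old baby eat fish?) vs. “9个月宝宝能吃鱼吗?” (Can a 9-month-old baby eat fish?)
- Question words
The question asks “what should I do” (怎么办) but the answer addresses “why” (为什么)
- Word segmentation granularity
Segmentation errors on 大三阳 and 小三阳 (two hepatitis B test result patterns)
- Abbreviations
孕前 (before pregnancy) and 怀孕前期 (early pregnancy); 唐筛 and 唐氏筛查 (Down syndrome screening)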
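To see why the cases listed above are hard, here is a small sketch of a purely surface-level similarity: Jaccard overlap between the sets of characters in two questions. This measure is an assumption chosen for illustration, not a method from the notes; it ignores word order and treats every character as equally important.

```python
# Punctuation and spaces that should not count as content characters.
PUNCTUATION = set("??。.,,!! ")


def char_set(text):
    """Set of content characters in a question (order is discarded)."""
    return {c for c in text if c not in PUNCTUATION}


def char_jaccard(a, b):
    """Jaccard overlap between the character sets of two questions."""
    sa, sb = char_set(a), char_set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


pairs = [
    ("宝宝多大能睡枕头", "宝宝能睡多大的枕头"),      # ambiguity / focus
    ("7个月宝宝能吃鱼吗?", "9个月宝宝能吃鱼吗?"),    # quantity words
    ("宝宝能吃辣椒吗?", "宝宝能吃鱼吗?"),            # different keyword
]
for left, right in pairs:
    print(left, right, round(char_jaccard(left, right), 3))
```

All three pairs score well above zero even though each pair asks for different information, which is why keyword weighting, quantity handling, and focus identification are needed on top of surface overlap.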
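(1) Keyword identification: estimate how important each word is; besides BM25, the frequency with which a term appears in search engines can also be used;
“孕妇怎么补锌?” (How should a pregnant woman supplement zinc?) vs “孕妇怎么补钙?” (How should a pregnant woman supplement calcium?)
(2) Removing additional words:
e.g. “快” (fast) in “速度有多快” (how fast is the speed), or “症状” (symptoms) in “炎症有哪些症状” (what symptoms does inflammation have); restoring omitted words is a similar problem;
(3) Attribute word identification:
急性 (acute), 慢性 (chronic)
(4) Gender word identification:
前列腺 (prostate), 经期 (menstrual period)
(5) Abbreviation expansion:
大学三年级 (third year of university) vs 大三
唐筛 vs 唐氏筛查 (Down syndrome screening)
(6) Computing with quantity words:
3周半 (three and a half weeks), 7个月 (7 months)
(7) Word segmentation granularity:
小/三/阳, 大三/阳
(8) Question word identification: interrogative sentences fall into different categories
哪些 (which)? 多少 (how many)?
(9) Time word identification:
天 (day), 周 (week), 月 (month), 岁 (year of age)
(10) Removing qualifiers:
怀孕早期能不能喝蜂蜜 (Can I have honey in early pregnancy?)
怀孕晚期能不能喝蜂蜜 (Can I have honey in late pregnancy?)
(11) Ambiguity resolution (focus identification)
婴儿多大能用枕头? (At what age can an infant use a pillow?)
婴儿能用多大枕头? (How big a pillow can an infant use?)
(12) Negation word identification?
“孩子白天不爱睡觉怎么办?” (What should I do if my child does not like to sleep during the day?) vs “孩子白天爱睡觉怎么办?” (What should I do if my child likes to sleep during the day?)
“我这个结果是正常的吗?” (Is my result normal?) vs “我这个结果有问题吗?” (Is there a problem with my result?)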
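A rough sketch of three of the items above: abbreviation expansion (5), quantity computation (6), and negation detection (12). Everything here is an assumption for illustration: the lookup tables are tiny hand-written examples, a month is approximated as 30 days and a year as 365, and only Arabic digits are handled.

```python
import re

# Hypothetical abbreviation table. "大三" is deliberately left out: naive
# substring replacement would also rewrite "大三阳", the segmentation
# problem noted in item (7).
ABBREVIATIONS = {"唐筛": "唐氏筛查"}

# Approximate day counts per time unit (assumed values).
UNIT_DAYS = {"天": 1, "周": 7, "月": 30, "岁": 365}

# Digits, optional measure word "个", a unit, and an optional "半" (half).
QUANTITY = re.compile(r"(\d+)个?(天|周|月|岁)(半)?")

# Assumed list of negation markers; it will over-trigger on some words.
NEGATION_WORDS = ("不", "没", "无")


def expand_abbreviations(text):
    """Replace each known abbreviation with its full form."""
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)
    return text


def quantity_in_days(text):
    """Convert the first quantity expression to days, or return None."""
    match = QUANTITY.search(text)
    if match is None:
        return None
    value = int(match.group(1)) + (0.5 if match.group(3) else 0.0)
    return value * UNIT_DAYS[match.group(2)]


def has_negation(text):
    """True if any listed negation marker occurs in the text."""
    return any(word in text for word in NEGATION_WORDS)
```

With quantities mapped to a common unit, “7个月” and “9个月” can be compared numerically instead of as unrelated tokens. The marker list does not capture polarity pairs such as “是正常的吗” vs “有问题吗”, which contain no explicit negation word.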
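Training data granularity. Word segmentation, embeddings, feature extraction
- paragraph: first sentence and last sentence
- Aggregate the first and last sentences of different answers to the same question for data augmentation
- Question classification; look at the error cases
- Extract the question's keywords, focus, and intent; full parsing is not necessarily required
Feedback: 1. Ask the user directly whether they like the answer. 2. Whether they click into related questions. 3. Whether they choose like or dislike after clicking in.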
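One possible reading of the first/last-sentence idea above, as a sketch. The sentence splitter is a naive punctuation rule (it will break on decimals and abbreviations), and joining the pieces into a single synthetic answer is an assumed interpretation of "aggregate"; `split_sentences`, `first_and_last`, and `augment` are illustrative names.

```python
import re

# A run of non-terminator characters plus an optional sentence terminator.
SENTENCE = re.compile(r"[^。!?!?.]+[。!?!?.]?")


def split_sentences(text):
    """Naive sentence split on Chinese and English end punctuation."""
    return [s.strip() for s in SENTENCE.findall(text) if s.strip()]


def first_and_last(text):
    """Keep only the first and last sentence of an answer paragraph."""
    sentences = split_sentences(text)
    if len(sentences) <= 1:
        return sentences
    return [sentences[0], sentences[-1]]


def augment(answers):
    """Build one synthetic answer from several answers to one question."""
    pieces = []
    for answer in answers:
        for sentence in first_and_last(answer):
            if sentence not in pieces:  # skip exact duplicates
                pieces.append(sentence)
    return " ".join(pieces)
```

For example, `augment` applied to two answers of three and two sentences yields a four-sentence synthetic answer that can be paired with the same question as an extra training example.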