
NAACL2015 presentation

Published on: Mar 3, 2016
Source: www.slideshare.net


Transcripts - NAACL2015 presentation

  • 1. Extractive Summarisation Based on Keyword Profile and Language Model
    Han Xu, Eric Martin and Ashesh Mahidadia
    School of Computer Science and Engineering, Faculty of Engineering
  • 2. Research Overview
    Motivation:
    - Research benefits from and expands on the work of others
    - The interdependent nature of knowledge creates a need to explore new fields
    - The large amount of literature makes this a challenging task
    Objective: design a tool that facilitates identification of the key contributions of papers,
    1. first by identifying keywords that capture a paper's most important contributions,
    2. then by creating an extractive summary of information-rich sentences that cover those keywords.
  • 3. Theoretical Background
    Abstract:
    - Retrospective perception by the authors
    - Written at the time of publication
    - Low information redundancy
    Citation summary:
    - Extrospective judgment by the community
    - Accumulated over a period of time
    - High information redundancy
    Example citation sentences:
    - McDonald et al. (2005) use the Chu-Liu-Edmonds (CLE) algorithm to solve the maximum spanning tree problem.
    - To learn these structures we used online large-margin learning (McDonald et al., 2005) that empirically provides state-of-the-art performance for Czech.
    Source of contributions: Qazvinian's single-paper summarisation corpus:
    - 25 highly cited papers in the ACL Anthology Network, from 5 different domains
    - Two files provided for each paper: 1. a citation summary; 2. a manually constructed key contribution list
  • 4. Qualifying Key Contributions
    Statistical characteristics:
    - Over-representedness: keywords that are frequently used when citing a paper
    - Exclusiveness: keywords that are only used when citing a paper
    [Figure: overlap between citation sentences containing W1 and citation sentences containing W2]
  • 5. Multi-stage Extractive Summarisation
  • 6. Stage 1: Keyword Profiling
    Input:
    1. Citation summary of the target paper
    2. Citation summaries of all papers in the target paper's domain
    Method: one-tailed Fisher's exact test
    Output: keyword profile
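Fisher's exact test asks whether a word occurs in the target paper's citation summary more often than chance would predict given the rest of the domain. A minimal pure-Python sketch of the right-tailed test is below; the counts and the exact 2x2 table construction are assumptions for illustration, not taken from the paper.

```python
from math import comb

def fisher_one_tailed(a, b, c, d):
    """Right-tailed Fisher's exact test on a 2x2 contingency table:
                      target summary | rest of domain
        word w      :       a        |       c
        other words :       b        |       d
    Returns P(X >= a) under the hypergeometric null."""
    n = a + b + c + d
    word_total = a + c        # all occurrences of w
    target_total = a + b      # all tokens in the target summary
    p = 0.0
    for x in range(a, min(word_total, target_total) + 1):
        p += (comb(target_total, x)
              * comb(n - target_total, word_total - x)
              / comb(n, word_total))
    return p

# Hypothetical counts: "parser" occurs 12 times among 80 tokens of the
# target paper's citation summary, but only 5 times among 400 tokens
# citing other papers in the same domain.
p = fisher_one_tailed(12, 68, 5, 395)
# A small p-value marks "parser" as over-represented and exclusive,
# so it would enter the keyword profile.
```

A library routine such as `scipy.stats.fisher_exact` would serve the same purpose; the loop above just makes the hypergeometric tail explicit.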
  • 7. Stage 2: Keyword Profile Language Modelling
    Input: paper keyword profile
    Method: negative log transformation
    Output: keyword profile language model (KPLM), which
    1. directly encodes word salience as pseudo-generative probabilities
    2. is more discriminative than a traditional language model
  • 8. Stage 3: Summarisation as Model Divergence IR
    Input: keyword profile language model
    Method: negative cross-entropy retrieval model
    Output: top-k sentences whose MLE models have the smallest divergence from the KPLM
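Retrieval by negative cross entropy can be sketched as scoring each candidate sentence by the sum over profile keywords of P_KPLM(w) * log P_sent(w), where P_sent is the sentence's smoothed maximum-likelihood model; higher scores mean smaller divergence from the profile. The additive smoothing and toy vocabulary size below are assumptions, not the paper's exact retrieval model.

```python
from math import log

def neg_cross_entropy(sentence, profile, vocab_size, mu=0.01):
    """Negative cross entropy between the KPLM ("profile") and the
    sentence's smoothed MLE model; higher is better.  Additive
    smoothing with mu is an assumption made for this sketch."""
    tokens = sentence.lower().split()
    n = len(tokens)
    def p_sent(w):
        return (tokens.count(w) + mu) / (n + mu * vocab_size)
    return sum(p * log(p_sent(w)) for w, p in profile.items())

profile = {"parser": 0.6, "tree": 0.4}   # toy KPLM
pool = ["The parser builds a tree", "We report results on Czech"]
ranked = sorted(pool,
                key=lambda s: neg_cross_entropy(s, profile, vocab_size=50),
                reverse=True)
# Sentences covering the profile keywords float to the top of the
# ranking; the top-k of this list feed Stage 4.
```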
  • 9. Stage 4: Novelty-driven Re-ranking
    Input: top-k ranked sentence pool
    Method: Top Sentence Re-ranking (TSR)
    Output: 5-sentence extractive summary
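The transcript does not spell out how TSR works, so the sketch below substitutes a generic novelty-driven greedy selection over the Stage 3 ranking: a sentence is kept only if enough of its words are not already covered by the summary. The 0.5 novelty threshold and the word-overlap measure are assumptions for illustration only, not the paper's TSR.

```python
def novelty_rerank(pool, k=5, threshold=0.5):
    """Illustrative novelty-driven greedy selection (not necessarily
    the paper's TSR): walk the relevance-ranked pool and keep a
    sentence only if at least `threshold` of its words are new."""
    summary, covered = [], set()
    for sent in pool:                      # pool ordered by Stage 3 score
        words = set(sent.lower().split())
        if len(words - covered) / len(words) >= threshold:
            summary.append(sent)
            covered |= words
        if len(summary) == k:
            break
    return summary

pool = ["the parser uses a spanning tree",
        "the parser uses a spanning tree algorithm",   # near-duplicate
        "results on czech are state of the art"]
summary = novelty_rerank(pool, k=2)
# The near-duplicate contributes almost no new words, so it is
# skipped in favour of the genuinely novel third sentence.
```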
  • 10. Results
    - Evaluation method: Pyramid score
    - Performance comparison
    - Resilience to a more stringent summarisation size limit
  • 11. Conclusion
    KPLM: a multi-stage statistical summarisation framework:
    1. Keyword profiling
    2. Keyword profile language modelling
    3. Summarisation as model-divergence-based IR
    4. Novelty-driven re-ranking
    - State-of-the-art performance in summarising scientific papers
    - Good resilience to a more stringent summary length limit
    Future work:
    - Higher-order n-grams
    - Multi-paper summarisation
  • 12. THANK YOU! QUESTIONS?
