Multi-Target Machine Translation with Multi-Synchronous Context-free Grammars @NAACL Reading Group, Komachi Lab

Published on: Mar 3, 2016
Source: www.slideshare.net


Transcripts - Multi-Target Machine Translation with Multi-Synchronous Context-free Grammars @NAACL Reading Group, Komachi Lab

  • 1. Multi-Target Machine Translation with Multi-Synchronous Context-free Grammars
    Graham Neubig, Philip Arthur and Kevin Duh
    Presenter: Shin Kanouchi
    NAACL_Reading2015@KomachiLab, TMU
  • 2. Motivation
    • When translating into language T1, equivalent translations into a second language T2 can help
      • T1 has a weak language model
      • T2 has a strong language model
    → Can we use a T2 LM to improve results?
  • 3. Overall view
    • Motivation
    • Propose multi-synchronous context-free grammars (MSCFGs)
    • How to learn MSCFGs
    • How to perform search (decoding)
      – including calculation of LM probabilities over multiple target language strings
    • Experiments
      – gains of up to 0.8-1.5 BLEU points
  • 4. Proposed Framework
    • Build on the well-known synchronous context-free grammars (SCFGs)
    • Propose multi-synchronous context-free grammars (MSCFGs), with multiple target strings
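To make the extension concrete, here is a minimal sketch of the difference between an SCFG rule (one source side γ, one target side α1) and an MSCFG rule (one source side, several target sides). The class and field names are illustrative, not from the paper's implementation:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SCFGRule:
    """Standard synchronous CFG rule: X -> <gamma, alpha1>."""
    lhs: str           # non-terminal label, e.g. "X"
    source: List[str]  # gamma, the source side
    target: List[str]  # alpha1, the single target side

@dataclass
class MSCFGRule:
    """Multi-synchronous CFG rule: X -> <gamma, alpha1, ..., alphaN>.

    alpha1 is the target actually being produced (T1); the additional
    targets exist so that their LMs can also score the derivation.
    """
    lhs: str
    source: List[str]
    targets: List[List[str]]  # [alpha1, alpha2, ...], one per target language

# Example from the phrase-extraction slide: a ratifié -> 批准 了 | ratified
rule = MSCFGRule(lhs="X",
                 source=["a", "ratifié"],
                 targets=[["批准", "了"], ["ratified"]])
```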
  • 5. How to learn MSCFGs
    • Learn from trilingual parallel data
    1. Alignment
      • alignments for each sentence are obtained automatically
      • IBM models implemented in GIZA++ (Och and Ney, 2003)
      • [Figure: a source sentence aligned to T1 and to T2, the two alignments computed independently]
    2. Phrase Extraction
    3. Calculate Features
  • 6. How to learn MSCFGs
    • Learn from trilingual parallel data
    1. Alignment
    2. Phrase Extraction
      • phrase-extract algorithm of Och (2002)
      • Source → T1:
        • a → 了
        • ratifié → 批准
        • a ratifié → 批准 了
      • Source → T2:
        • X
        • X
        • a ratifié → ratified
      • Combined: a ratifié → 批准 了 | ratified (see the sketch below)
    3. Calculate Features
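The combination step can be pictured as a join on the shared source side: a source phrase yields a multi-target rule when it was extracted against both T1 and T2. A rough sketch under that reading (the function and table layout are assumptions, not the paper's code):

```python
def combine_phrases(src_to_t1, src_to_t2):
    """Join two extracted phrase tables on their shared source phrase.

    src_to_t1: dict mapping a source phrase to a set of T1 phrases
    src_to_t2: dict mapping a source phrase to a set of T2 phrases
    Returns (source, t1, t2) triples for sources extracted in both tables.
    """
    combined = []
    for src, t1_phrases in src_to_t1.items():
        for t1 in t1_phrases:
            for t2 in src_to_t2.get(src, ()):
                combined.append((src, t1, t2))
    return combined

# The example from this slide:
src_to_t1 = {"a": {"了"}, "ratifié": {"批准"}, "a ratifié": {"批准 了"}}
src_to_t2 = {"a ratifié": {"ratified"}}
print(combine_phrases(src_to_t1, src_to_t2))
# -> [('a ratifié', '批准 了', 'ratified')]
```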
  • 7. 3. Calculate Features (13 features)
    • In standard SCFGs:
      – P(γ|α1) and P(α1|γ): log forward and backward translation probabilities
      – Plex(γ|α1) and Plex(α1|γ): log forward and backward lexical translation probabilities
      – a word penalty counting the terminals in α1
      – a constant phrase penalty of 1
    • Added in MSCFGs:
      – P(γ|α2) and P(α2|γ)
      – Plex(γ|α2) and Plex(α2|γ)
      – a word penalty for α2
    • In addition:
      – P(γ|α1, α2) and P(α1, α2|γ)
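The translation probability features are presumably relative frequencies over extracted rule counts, as in standard rule-table estimation. A toy sketch of how the T1 and joint features could be computed (counting scheme and numbers are invented for illustration):

```python
import math
from collections import Counter

# Toy counts of extracted (gamma, alpha1, alpha2) rule triples.
rules = Counter({
    ("a ratifié", "批准 了", "ratified"): 3,
    ("a ratifié", "批准 了", "has ratified"): 1,
})

c_gamma = Counter()  # c(gamma)
c_g_a1 = Counter()   # c(gamma, alpha1)
for (g, a1, a2), n in rules.items():
    c_gamma[g] += n
    c_g_a1[(g, a1)] += n

g, a1, a2 = "a ratifié", "批准 了", "ratified"

# log P(alpha1 | gamma): forward translation probability for the T1 side
print(math.log(c_g_a1[(g, a1)] / c_gamma[g]))     # log(4/4) = 0.0
# log P(alpha1, alpha2 | gamma): the joint feature over both targets
print(math.log(rules[(g, a1, a2)] / c_gamma[g]))  # log(3/4) ≈ -0.288
```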
  • 8. Decoding
    (a) one LM (only T1)
    (b) joint search method: based on consecutively expanding the LM states of both T1 and T2
    (c) sequential search method: first expands the state space of T1, then later expands the search space of T2
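One way to see the difference between (b) and (c) is in what each hypothesis must remember during search. A minimal sketch of the state bookkeeping, assuming n-gram LMs (names and structure are illustrative, not Travatar's internals):

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class JointState:
    """(b) Joint search: a hypothesis is distinguished by the LM contexts
    of both targets, so the state space is roughly their product."""
    t1_context: Tuple[str, ...]  # last n-1 words of the T1 string
    t2_context: Tuple[str, ...]  # last n-1 words of the T2 string

@dataclass(frozen=True)
class SequentialState:
    """(c) Sequential search, first pass: only the T1 context matters;
    hypotheses differing only in T2 are merged, and the T2 LM is applied
    in a later pass over the already-pruned T1 search space."""
    t1_context: Tuple[str, ...]
```

The trade-off this picture suggests: joint search scores both LMs exactly while pruning but multiplies the state space, whereas sequential search keeps the first pass small at the risk of pruning hypotheses the T2 LM would have preferred.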
  • 9. Experiments
    • MultiUN Corpus:
      – parallel / T1 LM data: 100,000 sentences
      – T2 LM data: 4,000,000 sentences
      – S: en; T1, T2: ar, es, fr, ru, zh (all combinations)
    • Decoder:
      – Travatar (Neubig, 2013)
    • Baseline:
      – a standard SCFG grammar with only the source and T1
    • Proposed:
      – the full MSCFG model with the T2 LM
  • 10. Result 1
    • T2 = Spanish gives the best results
    • Particularly effective for similar languages
    • [Figure: BLEU scores]
  • 11. Result 2
    • BLEU scores for different T1 LM sizes, without (-LM2) or with (+LM2) an LM for the second target
