java - Creating word pairs, triplets etc. for evaluation in BLEU


I need to create lists of word pairs, triplets, etc. to evaluate with the BLEU metric. BLEU starts with unigrams (single words) and goes up to N-grams, with N specified at runtime.

For example, take the sentence "Israeli officials are responsible for airport security".

For unigrams this is just the list of words. The bigrams would be:

  Israeli officials
  officials are
  are responsible
  responsible for
  for airport
  airport security

The corresponding trigrams would be:

  Israeli officials are
  officials are responsible
  are responsible for
  responsible for airport
  for airport security

I have coded up a working BLEU implementation, which hard-codes the n-gram order to 4 and brute-forces the unigram calculation. It's ugly as hell, and besides, I need to be able to supply N at runtime.

Snippet trying to generate the pairs, triplets, etc.:

  String current = "";
  int temp = 0;
  for (int i = 0; i < goldWords.length - N_GRAM_ORDER; i++) {
      current = current + ":" + goldWords[i];
      while (temp < N_GRAM_ORDER) {
          current = current + ":" + goldWords[temp + i];
          temp++;
      }
      goldGrams.add(current);
      current = "";
      temp = 0;
  }

Edit: the output of this snippet for bigrams should be:

  Israeli:officials officials:are are:responsible responsible:for for:airport airport:security

where goldWords is a String array containing the individual words that make up the n-grams. I have been tinkering with this loop for days and it just will not click for me. Can anyone see what I am doing wrong?

I would change this:

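  String current = "";
  int temp = 0;
  for (int i = 0; i < goldWords.length - N_GRAM_ORDER; i++) {
      current = current + ":" + goldWords[i];
      while (temp < N_GRAM_ORDER) {
          current = current + ":" + goldWords[temp + i];
          temp++;
      }
      goldGrams.add(current);
      current = "";
      temp = 0;
  }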

To this:

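  String current = "";
  for (int i = 0; i < goldWords.length; i++) {
      for (int j = 0; j < N_GRAM_ORDER; j++) {
          // Guard against running past the end of the array.
          if (i + j < goldWords.length) {
              if (j > 0) current += ":";
              current += goldWords[i + j];
          }
      }
      goldGrams.add(current);
      current = "";
  }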

So the outer for loop iterates over the first word of each n-gram, and the inner loop iterates over all the words to include. One thing to note is that the if statement is there to avoid an out-of-bounds error; if you only want full n-grams, it should be moved outside the inner loop.

With the code as written, you will get a trailing partial gram:

  Israeli:officials officials:are are:responsible responsible:for for:airport airport:security security

If you want:

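  Israeli:officials officials:are are:responsible responsible:for for:airport airport:security

try this code:

  String current = "";
  for (int i = 0; i < goldWords.length; i++) {
      // Only build a gram when a full N_GRAM_ORDER words remain.
      if (i + N_GRAM_ORDER <= goldWords.length) {
          for (int j = 0; j < N_GRAM_ORDER; j++) {
              if (j > 0) current += ":";
              current += goldWords[i + j];
          }
          goldGrams.add(current);
          current = "";
      }
  }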

(The above code was written without checking it against a compiler, so there may be an off-by-one or small syntax error in it. Verify it, but it should get you close.)
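
For anyone who wants something to paste and run, here is a minimal self-contained sketch of the same idea: one method that returns only the full n-grams for an order supplied at runtime. The class name NGramSketch, the method name ngrams, and the use of String.join with Arrays.copyOfRange are my own choices for illustration, not part of the code above, and it assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NGramSketch {

    // Returns every full n-gram of the given order, words joined by ':'.
    // Hypothetical helper; an order below 1 or above words.length gives an empty list.
    static List<String> ngrams(String[] words, int n) {
        List<String> grams = new ArrayList<>();
        if (n < 1) {
            return grams;
        }
        // i is the index of the first word; stop once fewer than n words remain.
        for (int i = 0; i + n <= words.length; i++) {
            grams.add(String.join(":", Arrays.copyOfRange(words, i, i + n)));
        }
        return grams;
    }

    public static void main(String[] args) {
        String[] goldWords =
                "Israeli officials are responsible for airport security".split(" ");
        // The maximum order comes from the command line, defaulting to 4.
        int maxOrder = args.length > 0 ? Integer.parseInt(args[0]) : 4;
        for (int n = 1; n <= maxOrder; n++) {
            System.out.println(n + "-grams: " + ngrams(goldWords, n));
        }
    }
}
```

Calling ngrams once per order from 1 up to N gives the lists BLEU needs, without hard-coding the order anywhere.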
