java - Creating word pairs, triplets etc. for evaluation in BLEU
I need to create lists of word pairs, triplets, etc. for evaluation with the BLEU metric. BLEU starts with unigrams (single words) and goes up to N-grams, where N is specified at runtime.
For example, take the sentence "Israeli officials are responsible for airport security". For unigrams this is simply the list of words. For bigrams it would be:
Israeli officials
officials are
are responsible
responsible for
for airport
airport security
The trigrams would be:
Israeli officials are
officials are responsible
are responsible for
responsible for airport
for airport security
I've coded a working BLEU metric that hard-codes the n-gram order to 4 and brute-forces the calculation of the unigrams. It's ugly as hell, and besides, I need to be able to supply N at runtime.
Snippet attempting to generate the pairs/triplets etc.:
    String current = "";
    int temp = 0;
    for (int i = 0; i < goldWords.length - N_GRAM_ORDER; i++) {
        current = current + ":" + goldWords[i];
        while (temp < N_GRAM_ORDER) {
            current = current + ":" + goldWords[temp + i];
            temp++;
        }
        goldGrams.add(current);
        current = "";
        temp = 0;
    }
Edit: the output of this snippet should be, for bigrams:
Israeli:officials
officials:are
are:responsible
responsible:for
for:airport
airport:security
where goldWords is a String array containing the individual words. I have been tinkering with this loop all day and it just will not click for me. Can anyone see what I am doing wrong?
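To see what the loop in the question actually produces, here is a minimal sketch that wraps it in a method so the output can be printed. The class name `QuestionLoopDemo` and the method name `questionLoop` are hypothetical, and the loop body is reproduced as it reads in the question, so treat this as an illustration under that assumption rather than the asker's exact program.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper class, used only to make the question's loop runnable.
class QuestionLoopDemo {

    // The loop from the question, unchanged apart from taking its inputs as parameters.
    static List<String> questionLoop(String[] goldWords, int nGramOrder) {
        List<String> grams = new ArrayList<>();
        String current = "";
        int temp = 0;
        // Stops one start position too early: the last full n-gram is never built.
        for (int i = 0; i < goldWords.length - nGramOrder; i++) {
            current = current + ":" + goldWords[i];            // appends word i ...
            while (temp < nGramOrder) {
                current = current + ":" + goldWords[temp + i]; // ... and appends it again when temp == 0
                temp++;
            }
            grams.add(current);
            current = "";
            temp = 0;
        }
        return grams;
    }

    public static void main(String[] args) {
        String[] goldWords = "Israeli officials are responsible for airport security".split(" ");
        for (String gram : questionLoop(goldWords, 2)) {
            System.out.println(gram);
        }
    }
}
```

For bigrams this prints five lines, starting with `:Israeli:Israeli:officials`. Three things go wrong: every entry starts with a stray colon, the first word of each entry is added twice (once before the `while` loop and once inside it), and the bound `i < goldWords.length - N_GRAM_ORDER` skips the final bigram `airport:security`.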
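I would change this:
    String current = "";
    int temp = 0;
    for (int i = 0; i < goldWords.length - N_GRAM_ORDER; i++) {
        current = current + ":" + goldWords[i];
        while (temp < N_GRAM_ORDER) {
            current = current + ":" + goldWords[temp + i];
            temp++;
        }
        goldGrams.add(current);
        current = "";
        temp = 0;
    }
To this:
    String current = "";
    for (int i = 0; i < goldWords.length; i++) {
        for (int j = 0; j < N_GRAM_ORDER; j++) {
            // guard against running past the end of the array
            if (i + j < goldWords.length) {
                // put ":" between words, not before the first one
                current += (j == 0 ? "" : ":") + goldWords[i + j];
            }
        }
        goldGrams.add(current);
        current = "";
    }
So the outer for loop iterates over the first word of each n-gram, and the inner loop iterates over all the words to include in it. One thing to note is that the if statement is there to avoid an out-of-bounds error; if you only want full n-grams, it should be moved outside the inner loop.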
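With the input above and N_GRAM_ORDER set to 2, you will get:
Israeli:officials
officials:are
are:responsible
responsible:for
for:airport
airport:security
security
If you want:
Israeli:officials
officials:are
are:responsible
responsible:for
for:airport
airport:security
then try this code:
    String current = "";
    for (int i = 0; i < goldWords.length; i++) {
        // only build an n-gram when N_GRAM_ORDER words are still available
        if (i + N_GRAM_ORDER <= goldWords.length) {
            for (int j = 0; j < N_GRAM_ORDER; j++) {
                current += (j == 0 ? "" : ":") + goldWords[i + j];
            }
            goldGrams.add(current);
            current = "";
        }
    }
(The above code was written without checking it against a compiler, so there may be an off-by-one or a small syntax error in it. Verify it, but it should get you close.)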
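Since BLEU needs every order from unigrams up to N, the same two-loop idea can be wrapped in a method that takes the order as a parameter. The following is a minimal sketch, not the answerer's code: the class `NGramSketch` and the methods `nGrams` and `allOrders` are hypothetical names, and it assumes only full n-grams are wanted, joined with ":" as in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class illustrating n-gram generation with the order chosen at runtime.
class NGramSketch {

    // Returns every full n-gram of the given order, with words joined by ":".
    static List<String> nGrams(String[] words, int order) {
        List<String> grams = new ArrayList<>();
        if (order < 1) {
            return grams; // no sensible n-grams for a non-positive order
        }
        // i is the index of the first word; stop once fewer than `order` words remain.
        for (int i = 0; i + order <= words.length; i++) {
            StringBuilder current = new StringBuilder(words[i]);
            for (int j = 1; j < order; j++) {
                current.append(':').append(words[i + j]);
            }
            grams.add(current.toString());
        }
        return grams;
    }

    // Collects the n-gram lists for orders 1..maxOrder; index k holds the (k + 1)-grams.
    static List<List<String>> allOrders(String[] words, int maxOrder) {
        List<List<String>> result = new ArrayList<>();
        for (int order = 1; order <= maxOrder; order++) {
            result.add(nGrams(words, order));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] goldWords = "Israeli officials are responsible for airport security".split(" ");
        int maxOrder = args.length > 0 ? Integer.parseInt(args[0]) : 4; // N supplied at runtime
        List<List<String>> byOrder = allOrders(goldWords, maxOrder);
        for (int k = 0; k < byOrder.size(); k++) {
            System.out.println((k + 1) + "-grams: " + byOrder.get(k));
        }
    }
}
```

Building each n-gram in a fresh StringBuilder removes the need to reset a shared `current` string, and starting it from `words[i]` avoids a leading separator. If the order is larger than the number of words, `nGrams` simply returns an empty list.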