apache pig - Embed shell in PIG script


I'm new to Pig and to shell pattern matching.

I have a file whose third column holds content like "M2534896R402Qnew". I need to extract the number between 'M' and 'R'. In my Pig script I have:

  raw = load 'record.txt' using PigStorage('\t') as (chararray, chararray, chararray, chararray);
  data = stream raw through `shell command`;

How can I replace the third column so that, for every record, it holds the number extracted from the raw value?

Thank you.

There is no need to use streaming for this. Pig's built-in UDF REGEX_EXTRACT can already handle it:

  $ cat record.txt
  f1 f2 M2534896R402Qnew f4
  f1 f2 M2534896R987Qxyz f4
  f1 f2 M2534897R421Qabc f4
  f1 f2 M47Rzxcvzxcv f4
  f1 f2 12345M000R f4
  f1 f2 M23551finn f4
  f1 f2 M298793R133R23Quin f4

  $ cat test.pig
  raw = load 'record.txt' using PigStorage('\t') as (f1:chararray, f2:chararray, f3:chararray, f4:chararray);
  ext = FOREACH raw GENERATE REGEX_EXTRACT(f3, 'M(\\d+)R', 1);
  dump ext;

  $ pig -x local test.pig
  (2534896)
  (2534896)
  (2534897)
  (47)
  (000)
  ()
  (298793)
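Since the question asks to replace the third column rather than extract it on its own, the same call can sit inside a projection that passes the other fields through unchanged. This is only a sketch: it assumes record.txt is tab-separated with four columns, and the relation name `replaced` is made up for illustration.

```pig
-- Load the four tab-separated columns as strings.
raw = LOAD 'record.txt' USING PigStorage('\t')
      AS (f1:chararray, f2:chararray, f3:chararray, f4:chararray);

-- Keep f1, f2 and f4 as they are; replace f3 with the digits
-- captured between 'M' and 'R' (capture group 1).
replaced = FOREACH raw GENERATE
    f1,
    f2,
    REGEX_EXTRACT(f3, 'M(\\d+)R', 1) AS f3,
    f4;

DUMP replaced;
```

Rows whose third column does not match the pattern get a null in that position, which DUMP prints as an empty field.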

Note that REGEX_EXTRACT returns a chararray. If you want an int, you have to cast it.
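A minimal sketch of that cast, again assuming a tab-separated record.txt; the names `nums`, `num` and `matched` are illustrative only.

```pig
raw = LOAD 'record.txt' USING PigStorage('\t')
      AS (f1:chararray, f2:chararray, f3:chararray, f4:chararray);

-- Cast the extracted string to int; rows that do not match yield null.
nums = FOREACH raw GENERATE
    (int) REGEX_EXTRACT(f3, 'M(\\d+)R', 1) AS num:int;

-- Optionally drop the rows where nothing was extracted.
matched = FILTER nums BY num IS NOT NULL;

DUMP matched;
```

If the extracted values could exceed the int range, casting to long instead would be the safer choice.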
