Write a parallel array Haskell expression once, run on CPUs & GPUs with repa and accelerate


Repa and Accelerate API similarity

The Haskell repa library is for automatically parallel array computation on CPUs. The accelerate library provides automatic data parallelism on GPUs. The APIs are quite similar, with identical representations of N-dimensional arrays. One can even switch between Accelerate and Repa arrays with fromRepa and toRepa in Data.Array.Accelerate.IO:

    fromRepa :: (Shapes sh sh', Elt e) => Array A sh e -> Array sh' e
    toRepa   :: Shapes sh sh' => Array sh' e -> Array A sh e

There are multiple backends for Accelerate, including LLVM, CUDA and FPGA (see Figure 2). I have spotted a Repa backend for Accelerate, though the library does not appear to be maintained. Given that the Repa and Accelerate programming models are similar, I am hopeful that there is an elegant way of switching between them, i.e. a function written once can be executed with Repa's R.computeP or with one of Accelerate's backends, e.g. with the CUDA run function.

Two very similar functions: Repa and Accelerate on a pumpkin

Take a simple image processing thresholding function: if a grayscale pixel value is less than 50, it is set to 0; otherwise it retains its value. Here's what it does to a pumpkin:

The following code presents the Repa and Accelerate implementations:

    module Main where

    import qualified Data.Array.Repa as R
    import qualified Data.Array.Repa.IO.BMP as R
    import qualified Data.Array.Accelerate as A
    import qualified Data.Array.Accelerate.IO as A
    import qualified Data.Array.Accelerate.Interpreter as A

    import Data.Word

    -- Apply threshold over image using accelerate
    thresholdAccelerate :: IO ()
    thresholdAccelerate = do
      img <- either (error . show) id `fmap` A.readImageFromBMP "pumpkin-in.bmp"
      let newImg = A.run $ A.map evalPixel (A.use img)
      A.writeImageToBMP "pumpkin-out.bmp" newImg
        where
          -- *** Exception: Prelude.Ord.compare applied to EDSL types
          evalPixel :: A.Exp A.Word32 -> A.Exp A.Word32
          evalPixel p = if p > 50 then p else 0

    -- Apply threshold over image using repa
    thresholdRepa :: IO ()
    thresholdRepa = do
      let arr :: IO (R.Array R.U R.DIM2 (Word8,Word8,Word8))
          arr = either (error . show) id `fmap` R.readImageFromBMP "pumpkin-in.bmp"
      img <- arr
      newImg <- R.computeP (R.map applyAtPoint img)
      R.writeImageToBMP "pumpkin-out.bmp" newImg
      where
        applyAtPoint :: (Word8,Word8,Word8) -> (Word8,Word8,Word8)
        applyAtPoint (r,g,b) =
            let [r',g',b'] = map applyThresholdOnPixel [r,g,b]
            in (r',g',b')
        applyThresholdOnPixel x = if x > 50 then x else 0

    data BackendChoice = Repa | Accelerate

    main :: IO ()
    main = do
      let userChoice = Repa -- pretend this is a command line flag
      case userChoice of
        Repa       -> thresholdRepa
        Accelerate -> thresholdAccelerate
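The exception mentioned in the evalPixel comment comes from using Haskell's built-in if and > on values of type Exp, which are descriptions of a computation rather than concrete numbers. Below is a minimal sketch of one way to keep the conditional inside the embedded language. It assumes accelerate 1.0 or later (where the comparison is A.> and the conditional is A.cond; older releases spell the comparison differently) and the reference interpreter backend. The four-element sample vector is made-up data standing in for the BMP image.

```haskell
module Main where

import qualified Data.Array.Accelerate             as A
import qualified Data.Array.Accelerate.Interpreter as I
import           Data.Word                         (Word32)

-- Threshold expressed with Accelerate's own comparison and conditional,
-- so no Prelude comparison is ever applied to an Exp value.
-- Assumption: accelerate >= 1.0.
evalPixel :: A.Exp Word32 -> A.Exp Word32
evalPixel p = A.cond (p A.> 50) p 0

main :: IO ()
main = do
  -- Made-up sample data instead of reading "pumpkin-in.bmp".
  let pixels = A.fromList (A.Z A.:. 4) [10, 50, 51, 200] :: A.Vector Word32
      result = I.run (A.map evalPixel (A.use pixels))
  print (A.toList result)
```

Swapping the interpreter's run for another backend's run should leave evalPixel untouched, which is the property the question is after.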

Question: can I write this only once?

The implementations of thresholdAccelerate and thresholdRepa are very similar. Is there an elegant way to write array processing functions once, then opt for multicore CPUs (Repa) or GPUs (Accelerate) with a switch in the program? I can think of choosing my import according to whether I want CPU or GPU, i.e. importing either Data.Array.Accelerate.CUDA or Data.Array.Repa, to execute an action of type Acc a with:

    run :: Arrays a => Acc a -> a

Or, to use a type class, e.g. something roughly like:

    main :: IO ()
    main = do
      let userChoice = Repa -- pretend this is a command line flag
      action <- case userChoice of
        Repa       -> applyThreshold :: RepaBackend ()
        Accelerate -> applyThreshold :: CudaBackend ()
      action
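To make the type class idea concrete, here is a rough sketch that uses only base and no array library. The kernel is written once against a hypothetical Expr class, and each "backend" is an instance: Eval evaluates directly in Haskell (standing in for a CPU backend) and Render emits the kernel as text (standing in for a code-generating backend). All names here (Expr, Eval, Render, thresholdPixels, kernelSource) are invented for illustration and belong to neither Repa nor Accelerate.

```haskell
module Main where

import Data.Word (Word8)

-- Hypothetical expression interface; each backend supplies an instance.
class Expr repr where
  constant    :: Word8 -> repr Word8
  greaterThan :: repr Word8 -> repr Word8 -> repr Bool
  ifThenElse  :: repr Bool -> repr a -> repr a -> repr a

-- The thresholding kernel, written once against the interface.
threshold :: Expr repr => repr Word8 -> repr Word8
threshold p = ifThenElse (p `greaterThan` constant 50) p (constant 0)

-- Stand-in for a CPU backend: evaluates directly in Haskell.
newtype Eval a = Eval { runEval :: a }

instance Expr Eval where
  constant = Eval
  greaterThan (Eval x) (Eval y) = Eval (x > y)
  ifThenElse (Eval c) t e = if c then t else e

-- Stand-in for a code-generating backend: renders the kernel as text.
newtype Render a = Render { runRender :: String }

instance Expr Render where
  constant n = Render (show n)
  greaterThan (Render x) (Render y) = Render ("(" ++ x ++ " > " ++ y ++ ")")
  ifThenElse (Render c) (Render t) (Render e) =
    Render ("(" ++ c ++ " ? " ++ t ++ " : " ++ e ++ ")")

data BackendChoice = Direct | Generated

-- Run the kernel over a list of pixels with the direct evaluator.
thresholdPixels :: [Word8] -> [Word8]
thresholdPixels = map (runEval . threshold . Eval)

-- Render the same kernel as source text, with "p" as the pixel variable.
kernelSource :: String
kernelSource = runRender (threshold (Render "p"))

main :: IO ()
main = do
  let userChoice = Direct -- pretend this is a command line flag
  case userChoice of
    Direct    -> print (thresholdPixels [10, 50, 51, 200])
    Generated -> putStrLn kernelSource
```

The design point is that threshold never mentions a backend; choosing Eval or Render at the call site selects the interpretation. Whether the real libraries can be wrapped behind such a class without losing performance is a separate question that this sketch does not settle.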

Or is it the case that, for every parallel array function I want to express for both CPUs and GPUs, I have to implement it twice: once with the repa library and again with the accelerate library?

The short answer is that, at the moment, you unfortunately need to write both versions.

However, we are working on CPU support for Accelerate, which will obviate the need for the Repa version of the code. In particular, Accelerate recently gained a new LLVM-based backend that targets both GPUs and CPUs.

This new backend is still incomplete, buggy and experimental, but we are planning to make it a viable alternative to the current CUDA backend.
