In response to a recent post highlighting some performance problems with arrays in haskell I decided that there are some fairly primitive functions missing in the current array library. My attempt at fixing these issues is now on hackage in the array-utils package. My hope is that some or all of these functions will be added to the array package in GHC 7.2.
The functions I have implemented basically try to remove as much bounds checking as possible, so the implementation of these functions all use the unsafeRead, unsafeWrite and unsafeIndex functions to help avoid extra overhead. Some of the functions that are included are:
updateElems :: (MArray a e m, Ix i) => (e -> e) -> a i e -> m ()
Which updates every element in the array with the given function.
updateElemsM :: (MArray a e m, Ix i) => (e -> m e) -> a i e -> m ()
the monadic version
updateElemsIx :: (MArray a e m, Ix i) => (i -> e -> e) -> a i e -> m ()
also provides the index to the update function. There’s also a monadic version of this.
updateWithin :: (MArray a e m, Ix i) => (e -> e) -> (i,i) -> a i e -> m ()
Which updates every element in the line/rectangle/prism defined by the start and end indexes.
updateOn :: (MArray a e m, Ix i) => (e -> e) -> [i] -> a i e -> m ()
Which updates the given indices.
updateSlice :: (MArray a e m, Ix i) => (e -> e) -> (i,i) -> a i e -> m ()
Which updates every element from the start index until the end index, so every element in the flat array from start to end.
Update: The difference between updateWithin and updateSlice is that if you have a 2D array with indices from (1,1) to (10,10) and you say updateSlice (+10) ((2,5),(4,2)) arr, then it will add 10 to all elements whose index is between
index ((1,1),(10,10)) (2,5) which is 14 and
index ((1,1),(10,10)) (4,2) which is 35. So it will update elements 5 to 10 on row 2, 1 to 10 on row 3, and 1 to 2 on row 4. If you used updateWithin here, it wouldn’t update anything, because range ((2,5),(4,2)) returns an empty list. I might do another post with images to help clear this up.
All functions in the module use Int based indexing and unsafe functions internally to hopefully speed up the code that’s generated.
I’m yet to benchmark these functions and see whether they would make any difference to the results of the above article (I doubt they’d be any faster than the Ptr versions). Whether they are faster or not, they should hopefully save a fair amount of code for a lot people that’s easy to get wrong. When I do benchmark these, I’ll add the results to this blog.
Speaking of getting it wrong, while I am fairly confident, I haven’t fully tested these functions yet, so if you feel they would be useful to you, and you run into strange results, I would love to know about it! I’m hoping to figure out how to get quickcheck to run some tests, and hopefully I’ll have that done next weekend.
If you can think of any more functions you think should be in the array package, please let me know, and I’ll see if I can add them. All the code is available on GitHub.