»
S
I
D
E
B
A
R
«
Back at uni
Feb 27th, 2009 by axman6

It’s Friday night, and I have just finished my first week of my second year at the Australian National University. This is the reason I haven’t written anything here for the last week, and my blog rate will probably remain low for the next 10 weeks… but I’ll do my best! This semester I’m doing some rather interesting courses (and some not so much…)

  • COMP2300Introduction to Computer Systems Which is about.. well computer systems. It covers topics like binary representations of numbers, C, computer architecture, etc. The thing I am however most looking forward to the most will be the third assignment, which will be based around the uni’s new T2 (the one Ben Lippmeier is using for most of his GHC on SPARC work)

  • COMP2100Software Construction This seems to be more about how to write larger software, than actual programming (though it does emphasise programming weekly to get into good habits). I’m interested to see for the Personal Software Process (PSP) stuff goes, it seems to me that it can make you a more productive programmer quite easily. After only an hour of using the time keeping for it, i’ve noticed i don’t get distracted as much, because it’s not programming time, so all i do is program in this time. Means i distract myself less, which is very helpful.

  • ENGN2211 – Electronic Circuits This is a second year electronics course, and should be fairly interesting too. Not much to say about this one…

  • ENGN2226 – Engineering Systems Analysis This course is taught by a pretty cool German maths PhD, and focuses mostly on modelling systems, which is obviously an integral part of engineering. We’ll have to use MATLAB for this, which from what I’ve heard from others may not be very fun at all. Could be a good place to use some of the nice statistical packages for haskell…

Anyway, I’ll try and keep up with the haskelling, but I’ll be doing a lot of non haskell this semester (C, Java, ASM, MATLAB), so it may have to fall by the way side for now :(

– Axman

AVar changes
Feb 22nd, 2009 by axman6

Yesterday, i stuck AVar 0.0.4 on hackage. Again, not thinking things through as well ad i should have, i realised later it should have been the 0.1.0 release, since it was the first major change to the package since i first wrote it.

So what’s new? Well, i split it up a bit, into three modules: Data.AVar, the classic interface, Data.AVar.Unsafe, the same as Data.AVar, but it throws any exceptions encountered by the variable, instead of passing them back to the user, and Data.AVar.Internal, which contains the code for the actual for AVars, and the datatypes used by them.

I think this is a nice way of doing things, and hopefully will make using the system a little easier if you don’t expect to cause exceptions. So as always, if you have any suggestions, please let me know.

– Axman

AVar released (three times)
Feb 20th, 2009 by axman6

The other day, I put my AVar package on hackage. Doing so taught me more than i expected it would, mainly that i need to do more testing of cabal packages before submitting them, and also to make sure you’ve exported all the functions you need to actually use the package (I forgot to put putAVar in the module exports -_-). So i’m up to release 0.0.3, without much work being done at all (although, I can’t think of much more to do really).

With the current release, i think that all exceptions that may occur from functions passed by the user to the variable should be caught by and handled correctly by the variable. I guess in a sense, AVars are smart variables that know what is and isn’t good for them, and will refuse to hurt themselves (hopefully!).

If you’re interested in seeing what AVar can do, check out the Hackage page. I really have no idea how it might be useful, but I’d love to hear others thoughts on it. If you have any requests, please let me know, and I’ll see what I can do (proper transactional … transactions are not something I want to tackle, especially with uni starting on monday)

– Axman

ASTM updates
Feb 18th, 2009 by axman6

Today, I’ve been talking with some people in #haskell about this ASTM thing, and I’ve had some great input, mainly from Stefan Ljungstrand (ski on #haskell), along with quicksilver and Simon Marlow, and I thought I’d share the changes I’ve put in with their help. I’m still not sure how useful this stuff is, but it’s fun and I’m learning. If you’re wondering what i’m on about, see this post.

So, the new features I’ve added are mainly to do with exception handling, so that variables don’t ‘die’ if you call tail on [], they report the error, and don’t change the ’state’. I’ve also deleted the Swap constructor, and replaced it with a more general Mod’ constructor:

data Transaction a =
Put a
| Get (MVar a)
| Mod (a -> a) (MVar (Maybe E.SomeException))
| forall b. Mod' (a -> (a,b)) (MVar (Either E.SomeException b))
| Atom (a -> Bool) (a -> a) (a -> a) (MVar Bool)

This you provide a function which takes the current ’state’, possibly modifies it, and returns something else, and return that something else back to you. This works a lot like the State monad, but allows concurrency.

I’ve changed handler as well to look like this:

handler :: Chan (Transaction a) -> a -> IO b
handler chan !x = do
req <- readChan chan
case req of
Put a         -> handler chan a
Get mvar      -> do
putMVar mvar x
handler chan x
Mod f mvar    -> do
let x' = f x
p <- E.catch (E.evaluate x' >> return Nothing)
(\e -> return (Just e))
putMVar mvar p
case p of
Nothing -> handler chan x'
_       -> handler chan x
Mod' f mvar    -> do
let y@(a,b) = f x
p <- E.try (E.evaluate a >> E.evaluate b)
case p of
Right _  -> do
putMVar mvar (Right b)
handler chan a
(Left e) -> do
putMVar mvar (Left e)
handler chan x
Atom test y n res ->
if test x
then do
putMVar res True
handler chan (y x)
else do
putMVar res False
handler chan (n x)

As you can see, I’ve added some exception handling, but I still need to add some more to the Atom/condModAVar case. What I want is something that can keep running even after there’s been an exception. I mean, who wants a mutable variable to just stop working huh?. I think once I get that done, along with haddock docs, I’ll stick it on hackage.

– Axman

ASTM: redundant STMish fun
Feb 17th, 2009 by axman6

So, tonight, I had an idea for something that I’m not sure if it’s either useful, efficient, interesting or anything other than a waste of time. It was an idea for the emulation of mutable variables, using functions as a way of storing the values in an accumulating parameter of sorts. They could safely be shared between threads, and could (and sort of do) support atomic actions.

It turned out to be really simple to implement, and I think just showing you the code would explain what I’m on about more than anything else. I don’t know how else this could be extended (I’m sure it could be though, for some fairly interesting ideas), and since it has been written in the wee hours of the morning, I may be doing some pretty stupid stuff…

So, I’d love to hear some discussion (though yes, I do already know this could probably be done with MVars or TVars or what have you) and ideas, and here’s the code:

module ASTM where
import Control.Concurrent
import Control.Concurrent.MVar
import Control.Concurrent.Chan
data Transaction a =
Put a
| Get (MVar a)
| Mod (a -> a)
| Atom (a -> Bool) (a -> a) (a -> a) (MVar Bool)
data AVar a = AVar (Chan (Transaction a))
newAVar :: a -> IO (AVar a)
newAVar x = do
chan <- newChan :: IO (Chan (Transaction a))
forkIO (handler chan x)
return (AVar chan)
where
handler chan x = do
req <- readChan chan
case req of
Put a         -> handler chan a
Get mvar      -> do
putMVar mvar x
handler chan x
Mod f         -> handler chan (f x)
Atom test y n res ->
if test x
then do
putMVar res True
handler chan (y x)
else do
putMVar res False
handler chan (n x)
putAVar (AVar chan) x = writeChan chan (Put x)
modAVar (AVar chan) f = writeChan chan (Mod f)
getAVar (AVar chan)   = do
res <- newEmptyMVar
writeChan chan (Get res)
takeMVar res
condModAVar (AVar chan) p t f = do
res <- newEmptyMVar
writeChan chan (Atom p t f res)
takeMVar res

Shootout PiDigits program kinda sucks (and possibly so does GHC)
Feb 12th, 2009 by axman6

So I’ve been trying to see if I could speed up the pidigits program, and, well, failed. I’m not alone though, Stephen Blackheath and others have tried too, and no one’s managed to make things much, if at all faster. The main problem seems to be that GHC is allocating a lot of RAM, which must be related to the need for Integers. The current program uses up 500+MB by the time it finishes. Not good. So, I’m putting a call out to see if anyone else can either think of a better way to get the answer than the current entry, or any tips about reducing memory usage. If you come up with anything, I’d love to hear from you. Cheers, ~ Axman

Other shootout news
Feb 6th, 2009 by axman6

I’ve been talking to Stephen Blackheath on the #haskell-in-depth channel, and he’s just submitted another new program to the shootout. This entry is for the regex-dna problem.

For this benchmark, each language is limited by the speed of the regex library it’s using.
Right… let me see. Here’s what I did;
I am not certain, but it seems that forcing the evaluation of s1, s2 and s3 (the blobs of input data) before parallelizing the processing
gave better CPU utilisation.
I also changed the part at the end where it does large numbers of pattern substitutions, so that it updated everything in-place in a mutable buffer.
I found that splitting the pattern substitutions into chunks sped things up by a factor of three. It looks like there’s some sort of overhead incurred on large buffers when the regex ‘match’ function creates its AllMatches array.
The reason why I needed memmove instead of memcpy is that memmove can work on overlapping buffers.
These changes improved the performance from 1:00 user / 0:42.2 real to 0:40.1 user / 0:25.12 real on my dual core, compared with 23.3 sec user for the slower non-parallel one of the two C++ implementations. (The parallel one wouldn’t build for me.)

You can see the details of the submission here, and the actual code. He also adds:

I took my unsafePerformIO’s out too. I have a reputation to maintain!

It’s good to see that someone else is working on making haskell look good (and doing a much better job than me thankfully!). Maybe soon we’ll be topping the list, or at least beating Java to move up to 4th place.

More n-bodies speedups
Feb 5th, 2009 by axman6

So I’ve been taking full advantage of the new #haskell-in-depth channel, and with the thanks of blackh and others, I’ve made a few more changes, and I’ve shaved another whole 30 seconds of my already not too shabby 1m17s. I’m now down to 47.215s, making a saving of 30s, or a saving of 39%.

If you’re wondering what I’m on about, take a look at this post for the story of my n-bodies program so far, and this port for what my code looked like before.

So how did I do it this time? Well, I decided STRefs weren’t necessary, and could easily be replaces with accumulating parameters in basically all of my code that used them. I did this in the hope they would be kept in registers, but even if they’re not, it’s still faster. I also saved about 3 seconds by inlining all my read* and write* functions with {-# INLINE #-} pragmas that would write or fetch the various parts of the Body’s info. So here’s the code, first the original one:

advance dt ps = do
forM_ index1 $ \i -> do
-- (B p v m) <- readBody ps i
p <- readPos ps i
v <- readVel ps i
m <- readMass ps i
vvar <- newSTRef v
forM_ (index2 i) $ \j -> do
-- (B p' v' m') <- readBody ps j
p' <- readPos ps j
v' <- readVel ps j
m' <- readMass ps j
let dp = p .-. p'
dst = mag dp
magnit = dt / (dst  dst  dst)
change = magnit . dp
(sca1,sca2) = (m' . change, m *. change)
writeVel ps j (v' .+. sca2)
modifySTRef vvar (.-. sca1)
vvar'@(V nx ny nz) <- readSTRef vvar
-- unsafeIOToST (print vvar')
writeVel' ps i nx ny nz
writePos ps i $ move' dt (B p vvar' m)

And the new one, with an inner loop:

advance dt ps =
forM_ index1 $ \i -> do
-- (B p v m) <- readBody ps i
p <- readPos ps i
v <- readVel ps i
m <- readMass ps i
let loop2 !j !v@(V !x !y !z)
| j > numPs = return v
| otherwise  = do
p' <- readPos ps j
v' <- readVel ps j
m' <- readMass ps j
let dp = p .-. p'
dst = mag dp
magnit = dt / (dst  dst  dst)
change = magnit . dp
(sca1,sca2) = (m' . change, m *. change)
writeVel ps j (v' .+. sca2)
loop2 (j+1) (v .-. sca1)
v' <- loop2 (i+1) v
-- unsafeIOToST (print v')
writeVel ps i v'
writePos ps i $ move' dt (B p v' m)

TextMate haskell bundle improvements
Feb 4th, 2009 by axman6

For the last week or so, ozy` has been working on improving the haskell bundle in TextMate. He’s done a lot of work, and it really makes a difference. The changes should be in SVN sometime soon I hope.

Changelist

  • Fixed regex for backquote-infix notation
  • Added highlighting for unit () and empty list []
  • Fixed up module, class, import, and instance declarations. In particular, some non-reserved words that are only meaningful in the above contexts are highlighted appropriately.
  • Added highlighting for all operators.
  • Changed scope selectors for some varSyms that were previously highlighted as “punctuation.separator.” This scope is not highlighted by many themes.
  • Added highlighting for numeric constants. Floats and integrals have separate scope selectors.
  • Changed scope selectors for some things like pragma syntax.
  • Fixed minor errors in regexes that highlight function names.
  • Modularized a lot of patterns. (There’s still a lot to be done here.)
  • Added highlighting for Prelude type names and constructors. This is scope aware, in the sense that (eg.) Maybe only gets special treatment in the context of a type signature. (This is incomplete, though.)
  • Added highlighting for commas. At the moment this is utterly useless.
  • Fixed some dumb issues where pragmas would only be scoped as such if they appeared inside a block comment.
  • Highlighted a couple of pragma directives. (Incomplete.)
  • Fixed up the type signature scope a little bit. (Not much.)

I’ve been using the changes while he’s been working on them, and they certainly do make the code look quite a bit better. I’ll be happy when haskell is a first class language in TextMate, and this is the first big step in getting it there. Hooray!

~ Axman

N-bodies evolution
Feb 4th, 2009 by axman6

So my last post got submitted to reddit by dons, and someone asked if I translated the code from C, so I thought I’d talk about my evolution of the problem, and how big the difference is runtime from start to finish is so far. First I should explain the problem. What needs to happen is I take 5 planets and the sun, each with an initial position and velocity, and a mass. Then I set about modifying each body’s position and velocity according to the gravitational pull of all the other bodies. For the purposes of this problem, in the end we’re only interested in the total energy in the system, which changes as time goes along. For the actual shootout, the bodies must be iterated 50,000,000 times, with a delta time of 0.1 (Not actually sure what that’s 0.1 of, possibly days). The reason I got interested in this problem was because I decided to take a look at the current haskell submission. I was a little horrified when I say it as all pointer nonsense and back magic. Full of stuff haskell can do fine (And possibly better and more safely that other languages), but should really be avoided where possible. I wanted to see if you could write a fast version, using more accessible techniques.

Lists

So my initial try used lists. I thought about how I could make is faster, and cool at the same time, and failed by using Control.Parallel, and I think all I have to say about this version that I am not proud of it, and when I just tried to run it with n=50000000, I decided to kill it after it reached 3.3GB of RAM usage in under a minute. Yeah, not great at all.

Parallel Arrays

So my next iteration of the problem basically continued on from the list version, but this using parallel arrays. As I’d basically followed in the same steps I’d used for the list version, its performance was on par with that of the list version; it was also killed after using 3GB of RAM in about 30 seconds, and again, I was not proud. It should also be mentioned that neither the list of parallel array versions produced the correct answers with n=1000, so they were never going to be contenders

STArrays

It was about this time I discovered the ST monad, and I fell in love (Sort of). The ST monad allows you to write imperative programs, using mutable variables and arrays, in a totally pure way. It’s a little like the State monad, but it has a generic state of ‘forall s’. I’m not going to go into the details of ST, but if you’re interested, check out SPJ’s paper on the subject of Lazy Functional State Threads (.ps.Z). This is where I finally started to get some good results. The program runs in about 1.4MB RAM (:O), which means I could actually time its runtime. So I did that, and well, at least it finished. The runtime was 14m46.957s according to time, which is still considerably faster than both Pythons, JavaScript under TraceMonkey, ruby, perl, php, and ruby again, but it’s slower than Lua, which is 24x slower than the current fastest C++. But I was getting somewhere! I could see this was the way to go, with memory usage being the most important thing so far, ST made keeping it really low easy as can be. Now for some speed…

STUArrays

I can’t remember where I found out about STUArrays, but I could see they were the obvious solution. They’re almost as close to C arrays as you’re going to get in nice haskell, while still remaining pure, and not so ugly. Basically STUArrays are unboxed arrays (U) in the ST monad (ST). This means that when out put a type into them, instead of putting a pointer to the data in the type inside the array, you put the actual data into the array, so it’s all sitting next to each other. This of course restricts the type you can out into an array somewhat to things like Doubles, Ints, Words, and the few other things with single constructors (This rules out Bools and Integers). There is quite a speed advantage to using STUArrays, but writing your own instance for say MArray (STUArray s) Body (ST s) is a little off putting. Luckily I had talked with Dons about what I was working on, and he’d already started working on something similar. So I nicked his MArray instance, modified it, and… Failed. For some reason, I kept getting segfaults when trying to get this STUArray stuff to work with my Body type with n > 22:

data Vec3 = V !Double !Double !Double
data Body = B {
pos ::  {-# UNPACK #-} !Vec3,
vel ::  {-# UNPACK #-} !Vec3,
mass :: !Double
}
I was not happy! And I still have no idea what went wrong. But I moved on.

STUArray s Int Double

I decided I’d avoid the MArray instance completely and use Double STUArrays directly. This turned out to be a good idea indeed, bringing some nice performance benefits. By doing this, I didn’t need to worry about taking out redundant data from the array by fetching a whole Body each time, I could pick out the data I needed, be it the Velocity, Mass or Position, or all three if I needed them all. With this work, I brought the runtime down to around 2m40 (I think), an approx. 80% improvement in runtime! This was massive! I was certainly onto something. I was now sitting behind Erlang at 5.9x and below Scheme PLT at 9.1x.

Performance

Ok, I’d come this far, but I was now stuck for ways to increase the speed. I’d done some algorithm rejiggering to see if I could gain any speed, and I managed to cut another 10 seconds or so off, but that wasn’t going to cut it. So, I took a look at the performance pages on the wiki. The Arrays page was most useful, suggesting that the use of unsafeRead and unsafeWrite could make huge improvments by removing redundant bounds checking. The floating point page also helped a little, with the suggestion of using the -fexcess-precision flag to speed up the floating point calculations. With these two changes, I managed to cut a whole minute 15 off the time. Yes, I just got a 50% speedup! Now I was getting happy. I was now beating Erlang HiPE, but sadly I was nowhere near C#’s 45 seconds, and definitely nowhere near C++’s 21 seconds. Still, how often do you get to produce a 50% speedup in some already relatively fast code?

Conclusion and the future

So what next? Well, I think I can save some more time be using explicit loops with accumulation parameters instead of using STRefs which I have a feeling may carry some overhead. I don’t know what kind of gains I can expect, but I doubt they’ll be that large. I might also consider making my code even more C like, by directly accessing parts of the arrays directly. I may even get rid of the whole ST thing, and working in IO and with more primitive operations, but by doing that, I’ll be moving away from the easier to understand code I have now. So, what can I conclude from all this? Well, Haskell sure can be fast, while still remaining beautiful. But making it really fast starts to get ugly (Introducing unsafe* does not help :\ ). There are obvious way to do things in haskell, and even some obvious fast ways to do them. But really fast code takes effort (Which is true of most, if not all languages). I’ve learnt a lot while writing this program, and I’ve still got a lot to learn. Wish me luck!

»  Substance: WordPress   »  Style: Ahren Ahimsa
© Alex Mason (Axman6) 2009