Wednesday, 25 November 2009 00:00
I've been spending my downtime playing around with PyMC over the past few days. There are quite a few good examples, including the coal mining change-point analysis (using the Jarrett 1979 data) and a Bayesian ANOVA example using data from Montgomery. I decided to try to reproduce a Bayesian Belief Net (BBN) result with it instead, albeit the hard and pointlessly computationally intensive way.

The package comes pre-built with several common distributions, but not much that works well with the table-based probabilities that a lot of BBN problems are packaged with. As a result I had to implement my own distributions by hand, which ended up being a good exercise in extending the package, even if in a somewhat trivial way.

Three 'gotchas' that jumped out at me:

- PyMC does one-at-a-time sampling, so if you have any 1/0 conditional probabilities floating around, you can find yourself locked into a particular configuration of your network pretty rapidly: neither of two linked nodes can change until the other does, even though both being in the other state simultaneously is perfectly possible. You can fix this by defining the child node as a 'deterministic', but if you're playing around with different probabilities it can be a bit of a pain to switch it back and forth.
- If you don't initialize your network to a 'possible' state (one with non-zero probability), PyMC gets severely bent out of shape. You need to initialize it appropriately, then take that into account for the burn-in.
- Despite what the documentation says, PyMC likes you to initialize variables; it's not likely to draw 'good' starting values on its own.
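
To make the first two gotchas concrete, here is a minimal sketch in plain Python. It is not PyMC code and does not use PyMC's API: the function `gibbs_two_node`, its parameters, and the two-node network A -> B are all made up for illustration, and Gibbs-style single-site updates stand in for whatever step method PyMC actually picks. The idea is only to show how a table of 1/0 conditional probabilities can freeze a sampler that updates one node at a time, and why an impossible starting state is a problem.

```python
import random


def gibbs_two_node(p_a, cpt_b_given_a, start, n_steps, seed=0):
    """Single-site Gibbs sampling on a hypothetical binary net A -> B.

    p_a           -- P(A = 1)
    cpt_b_given_a -- table {a: P(B = 1 | A = a)} for a in (0, 1)
    start         -- initial (a, b) state
    Returns the list of (a, b) states visited after each sweep.
    """
    rng = random.Random(seed)

    def joint(a, b):
        # Joint probability read straight off the tables.
        pa = p_a if a == 1 else 1.0 - p_a
        pb1 = cpt_b_given_a[a]
        pb = pb1 if b == 1 else 1.0 - pb1
        return pa * pb

    a, b = start
    if joint(a, b) == 0.0:
        # Gotcha 2: an impossible starting state gives the sampler
        # nothing sensible to work from.
        raise ValueError("initial state has zero probability")

    trace = []
    for _ in range(n_steps):
        # Update A while holding B fixed.
        w1, w0 = joint(1, b), joint(0, b)
        a = 1 if rng.random() < w1 / (w1 + w0) else 0
        # Update B while holding A fixed.
        w1, w0 = joint(a, 1), joint(a, 0)
        b = 1 if rng.random() < w1 / (w1 + w0) else 0
        trace.append((a, b))
    return trace


# Gotcha 1: with a 1/0 table, B must equal A, so neither node can
# move on its own and the chain never leaves its starting state.
stuck = gibbs_two_node(0.5, {0: 0.0, 1: 1.0}, start=(0, 0), n_steps=2000)

# Softening the table slightly lets the chain move between states.
mixed = gibbs_two_node(0.5, {0: 0.1, 1: 0.9}, start=(0, 0), n_steps=2000)

print(sorted(set(stuck)))
print(sorted(set(mixed)))
```

In the first run both (0, 0) and (1, 1) have probability 0.5, but the sampler can only flip one node at a time and every intermediate state is impossible, so it never gets from one to the other. Making the child a deterministic function of its parent, as described above, avoids the problem because the pair then moves together.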
Those complaints aside, it works as advertised. It's not without its own quirks, but I already much prefer it to WinBUGS, and the speed seems to be pretty comparable.