It’s a Very Small World

I ran into a buddy (Nathan Renick) from my Panama City Beach summer in the Knoxville airport on Wednesday night. That’s exceptionally unlikely, since he lives in Mississippi and I live in North Carolina. It just worked out that the only flight I could book (late) was through Knoxville, and Nathan’s family was getting together in eastern Tennessee for Thanksgiving this year. It is a very, very small world.

The EdBlog works again, and other topics

The EdBlog “broke” at the end of May, when I found a ColdFusion MX limitation. (The last two entries had to be done by hand.) It took me a while today to fix it, but it’s finally up again.

Much of my free time over the last two weeks has been spent assembling an online directory for the Panama City Beach, 1999, Campus Crusade for Christ, summer project. It’s up, and appears to be working quite well.

The news of the day is that Rice beat SMS in Round 1 of the College World Series (CWS). In our four trips, that was only our second win and our first first-round win. I think we’re going all the way this year, though, so I wasn’t surprised.

Meanwhile, I’m not working at the New Life Resources office, anymore – I’ll be telecommuting for a little while, until I finish some open projects, then working on an as-needed basis. Since we have some potentially huge contracts in the works with Topsail Consulting, I need to focus on that.

That’s about what’s new! I’m going to go get my butt kicked by Fritz 7 (a chess program) and try to relax a bit.

Spammunition

Today’s breakthrough is discovering a great and free spam filter for Outlook 2000 (and up) which uses Bayesian filtering. It’s called Spammunition.

For those not familiar with these filters, a Bayesian filter operates on the assumption that history repeats itself: the odds of something being true in the future (or the present) can be predicted extremely well from how often the same thing was true in the past. In other words, without doing any complex combinatorics or statistical analysis, the fact that a playing card is red 50% of the time over hundreds of draws is a great predictor that the odds of the next card being red are also 50%. In terms of email, this would imply that if emails containing the word “weight” are spam 99% of the time (they are), we can delete any future email containing that word and be right about 99% of the time.
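To make the "history repeats itself" idea concrete, here is a minimal Python sketch of estimating how often a word has shown up in spam. It is only an illustration: the function name `word_spam_probability` and the tiny message lists are made up for this example, and I am not claiming this is how Spammunition works internally.

```python
def word_spam_probability(word, spam_messages, ham_messages):
    """Estimate how often a message containing `word` was spam in the past.

    `spam_messages` and `ham_messages` are lists of message texts that were
    already sorted by hand (a hypothetical training set).
    """
    word = word.lower()
    # Count messages (not occurrences) that contain the word at least once.
    spam_hits = sum(1 for m in spam_messages if word in m.lower().split())
    ham_hits = sum(1 for m in ham_messages if word in m.lower().split())
    total = spam_hits + ham_hits
    if total == 0:
        # Never seen before: no evidence either way (an assumption of this sketch).
        return 0.5
    return spam_hits / total


# Hypothetical history: three spam messages, two legitimate ones.
spam = ["lose weight fast", "weight loss pills", "drop weight now"]
ham = ["meeting at noon", "the weight of the package is 2 lbs"]

print(word_spam_probability("weight", spam, ham))
```

With this made-up history, "weight" appears in three spam messages and one legitimate one, so the estimate is three out of four. A real filter would learn from thousands of messages rather than five.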

The trick here is to get beyond that 99%. Imagine that the other 1% are legitimate emails that get deleted, and you happen to own an online business. That could be very expensive! So, we look at the weights of all the words in the message, including headers, subjects, etc., and combine them appropriately. Some filters using this technique (called Bayesian filtering, because it’s based on Bayes’s Rule) exceed 99.95% accuracy. Now, that’s getting there. For more info, read Paul Graham’s website.
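One way to "combine them appropriately" is the formula Paul Graham describes: multiply the per-word spam probabilities together, multiply their complements together, and compare the two products. Here is a rough Python sketch of it. The function name `combine_word_probabilities` and the clamping bounds are my own choices for illustration, the formula treats the words as independent (a simplification), and I don't know whether Spammunition uses exactly this calculation.

```python
from math import prod


def combine_word_probabilities(probabilities, low=0.01, high=0.99):
    """Combine per-word spam probabilities into one score for the message.

    Each probability is clamped to [low, high] so that a single word can
    never force the score to exactly 0 or 1 (and so we never divide by zero).
    """
    clamped = [min(high, max(low, p)) for p in probabilities]
    spam_score = prod(clamped)                  # evidence the message is spam
    ham_score = prod(1.0 - p for p in clamped)  # evidence it is legitimate
    return spam_score / (spam_score + ham_score)


# Two words that are each 99% "spammy" push the combined score above 99%.
print(combine_word_probabilities([0.99, 0.99]))

# One spammy word and one equally strong legitimate word cancel out.
print(combine_word_probabilities([0.99, 0.01]))
```

This is where the extra accuracy comes from: a message has to look like spam across many words, not just one, before its score gets high enough to delete.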