Those of you who might be normal, sorry. Somehow the 20-page MIS paper I’ve got hanging over my head has put me in the mood to write an entry that’s technical and nerdy. If you’re reading – thanks a lot, Dr. Yen. Is technical and nerdy better than nothing? Probably not. But here it is. Maybe I’ll write something interesting… next month?
When I order something or register with a website, I normally do not want to hear from them ever again. Did you send me what I bought? Can I log in when I want to read something? Ok then, thanks, have a nice life. When they do email me, as even the more trustworthy companies insist on doing once every 4 hours or so, it makes me mad. Usually the first thing I will say at the sight of another “$10 off your next $3000 order” message is the name of the offending company, followed or preceded by a gentle expletive of some sort. In this sense, I am no different from your garden-variety nerd.
Not lately. Lately, I’ve been trying to teach my computer the meaning of “spam.” As such, I’d love to get some emails about political commentary or upcoming sales or general developments at websites I enjoy. I spent half an hour yesterday subscribing to the mailing list for each of my favorite bands and news sources. Why? Because that would be something to train my computer to like, versus the billions of true garbage messages I get about transsexuals and poodles and transsexual poodles.
Is anyone here familiar with Bayesian filtering? Yeah, that’s what I figured. Apparently it’s fairly effective as an email categorizing technology. You set up a handy program to classify your email on its way to your inbox, and then you create a filter in whatever program (Eudora, Outlook, etc.) you use for reading mail. The classification program keeps track of every message it processes, and a browser-based interface lets you categorize new messages so it knows which ones are good and which are bad. Meanwhile, the filter in your email client shoves the ‘bad’ messages into a junk box or straight to the trash. Doesn’t that sound nice?
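For the curious, here is roughly what “teaching” a filter looks like. This is only a toy sketch of a plain naive Bayes word counter, not what SpamBayes actually does internally (its scoring is more sophisticated). The class name TinyBayes, its methods, and the sample messages are all made up for illustration.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy naive Bayes filter: counts words seen in ham and in spam."""

    def __init__(self):
        self.words = {"ham": Counter(), "spam": Counter()}
        self.messages = {"ham": 0, "spam": 0}

    def train(self, text, label):
        # label must be "ham" or "spam"
        self.words[label].update(text.lower().split())
        self.messages[label] += 1

    def spam_probability(self, text):
        vocab = set(self.words["ham"]) | set(self.words["spam"])
        total = sum(self.messages.values())
        log_score = {}
        for label in ("ham", "spam"):
            # Prior: how often we have seen this kind of message (smoothed).
            score = math.log((self.messages[label] + 1) / (total + 2))
            n = sum(self.words[label].values())
            for word in text.lower().split():
                # Add-one smoothing so unseen words never give log(0).
                count = self.words[label][word]
                score += math.log((count + 1) / (n + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability between 0 and 1.
        diff = log_score["ham"] - log_score["spam"]
        diff = max(min(diff, 700.0), -700.0)  # keep math.exp from overflowing
        return 1.0 / (1.0 + math.exp(diff))


f = TinyBayes()
f.train("cheap pills buy now", "spam")
f.train("buy cheap watches now", "spam")
f.train("band tour dates announced", "ham")
f.train("meeting notes for the paper", "ham")

print(f.spam_probability("buy cheap pills"))          # closer to 1 means spam
print(f.spam_probability("tour dates for the band"))  # closer to 0 means ham
```

The point is just that the filter knows nothing until you hand it examples of both kinds of mail, which is why I suddenly want more of the good kind.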
It is pretty nice, and I’m going to use it right this time around. See, when I originally set up iamthathero.com I was glad to finally have a .com site. I put my new email address on every page, and never thought twice – or even once, to be honest – about the fact that dirty, stupid, inconsiderate people (‘spammers’) make a living by selling email addresses to other dirty, stupid, inconsiderate people (‘other spammers’). And they have automated scripts that exist for the sole purpose of pulling email addresses off of websites. As a result, I was soon getting roughly 1,000 junk emails to every .6 real ones.
A close friend and superior nerd set up SpamBayes on my laptop, and I was on my way to enjoying trash-free email browsing. Or so I thought. As it turns out, SpamBayes develops a deep identity crisis when you classify several hundred emails a day as “spam” and one email every few weeks as “ham” (a simple but amazingly confusing slang term for good email). But that’s what I did, and instead of spending the summer sifting through metric tons of junk email I spent the summer sifting through metric tons of junk email and then telling SpamBayes that each one was junk. Don’t do that.
I’ve just started using SpamBayes on my new computer, and when it correctly classifies a message as spam I just discard it. Because why mess with its database when it’s getting the categorization right? If you get a lot of junk mail and are nerdy enough to try to fix it, check out spambayes.org for the free software and a better explanation. While you’re at it, search the web for ‘spam trap’ to find articles and tricks for killing spam robots that visit your own site. Oh, and if you know someone who sends mass unsolicited emails for a living, kick them in the teeth.
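If you want the “only correct it when it’s wrong” habit spelled out, it looks something like the sketch below. This is not SpamBayes code. KeywordFilter is a deliberately dumb stand-in filter I invented so the example runs on its own, and train_on_error is a hypothetical helper name.

```python
class KeywordFilter:
    """Made-up stand-in for a real filter: remembers words seen in spam."""

    def __init__(self):
        self.spam_words = set()
        self.times_trained = 0

    def predict(self, text):
        words = set(text.lower().split())
        return "spam" if self.spam_words & words else "ham"

    def train(self, text, label):
        self.times_trained += 1
        words = set(text.lower().split())
        if label == "spam":
            self.spam_words |= words
        else:
            self.spam_words -= words


def train_on_error(spam_filter, text, true_label):
    """Train only when the filter's guess disagrees with the human's label."""
    if spam_filter.predict(text) != true_label:
        spam_filter.train(text, true_label)
        return True   # the filter was wrong, so we corrected it
    return False      # the filter was right, so we left its database alone


f = KeywordFilter()
first = train_on_error(f, "cheap pills", "spam")     # wrong guess, gets trained
second = train_on_error(f, "cheap watches", "spam")  # right guess, skipped
print(first, second, f.times_trained)
```

Skipping the messages it already gets right keeps the pile of spam examples from burying the handful of ham examples, which is exactly the mistake I made last summer.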