Another milestone

•January 30, 2010 • 4 Comments

Well this is pretty impressive. As of sometime yesterday, this blog officially hit 30,000 visits.

I first created it back in February 2008, so it’s barely been around for two full years. The visiting rates have fluctuated wildly since then, but what’s been interesting is the last year, or really, the last three months. Since November 2009, this blog has averaged almost 4,000 visitors each month. That’s over 100 every single day.

Now. I’m quite sure that a good portion of those visits – if not the majority – aren’t even human, but most likely aliens. Or intarweb bots. But probably the former.

So in celebration…here’s a lolcat :) Thank you again for reading and commenting on my thoughts, however inane and non-substantive they may be.

Anecdotes do not constitute valid scientific conclusions

•January 23, 2010 • 2 Comments

I don’t think I’ve ever mentioned it here before, but I am a huge fan of the Bad Astronomer. Dr Phil Plait is a scientist-turned-writer who has one of the most levelheaded minds on this side of the intert00bz. I read his book Death From the Skies! and enjoyed every minute of it (once, of course, I got past the whole “this book is about how the Earth could end” concept. Kind of depressing initially, before it hit me just how cool all the stuff was…and the fact that it would likely never happen in my lifetime or those of my great-great-great-great-great-grandchildren).

Dr Plait makes no secret of the fact that he is a member of the newly-coined “Skeptic” movement, something which has always been around but was only recently formalized in the wake of those who actively campaign against vaccines, global warming, evolution, and other scientific fields of research garnering significant media time.

The latest incident involved something called the Shorty Awards, an internet-based award given out by voting over Twitter for popular individuals across several different fields. A man by the name of Mike Adams, an editor on the website Natural News (a pretty ridiculous website about vaccine alternatives and homeopathy) was nominated and picked up many, many votes in a short period of time. Later, his votes were rescinded when it was revealed that many of his votes were coming from newly-created Twitter accounts whose only post was a vote for him.

Suspicious, right? Apparently not to the vitriolic and frothing Mike Adams. Some of my favorite lines:

Within a few days, thanks to the votes of our very large base of readers, myself and Dr. Mercola were leading the health category, having taken the #1 and #2 positions. This was all done with legitimate votes from real people from all over the world who support our work.

Mr Adams, I don’t know if you’re aware of the nature of Twitter or the Internet at large, but unless you personally spoke to each and every individual who voted for you, I don’t know how you could possibly know this.

I was set to take the top prize, and Dr. Mercola was in a solid second place when some vaccine pushers got word that a couple of “natural medicine whackos” (as they described us) [...]

Wellllll…

But the opposition didn’t stop there: They unleashed a campaign of slanderous and false accusations against NaturalNews readers, accusing the readers of somehow engaging in fraudulent voting.

Again, unless you personally identified every voter, you’re glossing over the fact that it’s very, very simple for an individual to create as many Twitter accounts as they want. I could spend 5 minutes writing a program which does this automatically. Welcome to the internets, Mr Adams.

Without a shred of supporting evidence (because none exists) [...]

Or because you’re really not thinking that hard. Then again, it’s a lot easier to rail against vaccines, recommend purchasing an organic avocado, and send a patient on their merry way than it is to devote your life to researching the biochemical reactions responsible for various ailments and running countless time-consuming lab tests to discover how to counteract them, isn’t it?

We are legitimate producers of natural health content who are both known all over the world, and we have very large numbers of followers and readers spanning well over a hundred countries.

Known all over the world, sure. Large number of followers, I’ll buy that too. Even all over the world! Cool. But how does any of that equate to legitimate votes? Simple: it doesn’t. You’ve committed the cardinal human sin of data inference: calculating P(B|A) instead of P(A|B). Just because you are known all over the world (never mind negative perspectives), just because you have lots of followers (never mind the phenomenon known as the internet troll), and just because you have a presence all over the world (um…internet?), does not inherently equate to legitimate votes. Legitimate votes, on the other hand, would be more indicative of a large fanbase. Or a small fanbase with programming talent.
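To put some numbers on that mix-up, here is a minimal sketch using Bayes' rule. Every probability in it is invented purely for illustration (nobody measured these), and the event names are my own labels: "legit" means a burst of votes is legitimate, "famous" means the nominee has a large worldwide following.

```python
def posterior(prior, likelihood, likelihood_if_not):
    """Bayes' rule: P(H | E) from P(H), P(E | H) and P(E | not H)."""
    evidence = likelihood * prior + likelihood_if_not * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers, chosen only to show the asymmetry.
p_legit = 0.5                    # prior: P(legit)
p_famous_given_legit = 0.9       # P(famous | legit)
p_famous_given_not_legit = 0.6   # P(famous | not legit): famous people get stuffed ballots too

# The quantity that actually matters: P(legit | famous)
p_legit_given_famous = posterior(p_legit, p_famous_given_legit, p_famous_given_not_legit)

print("P(famous | legit) =", p_famous_given_legit)
print("P(legit | famous) =", round(p_legit_given_famous, 3))
```

Under these made-up inputs the two conditionals differ noticeably, which is the whole point: a high P(famous | legit) says very little on its own about P(legit | famous).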

Man, that critical-thinking science stuff keeps getting in the way, doesn’t it?

It wasn’t really surprising to see the vaccine quacks engaging in their false accusations, of course: Lying and cheating is par for the course for the vaccine and pharmaceutical industries. Their supporters apparently reflect that same lack of ethical behavior. They will apparently do anything to win, even if it means engaging in widespread false accusations and trying to get natural health people removed from the contest altogether.

This is, by far, my favorite. Mr Adams, you’ve completely unraveled.

First, judging from the vehemence of your post, I’m extremely skeptical when you say you weren’t surprised. Second, this has absolutely nothing to do with the pharmaceutical industry; this has to do with legitimate scientists and researchers. Don’t confuse the two, as they are very different entities (it does strike me as bizarre that the pharmaceutical industry – manufacturing drugs in order to save lives – is a for-profit industry), and yes, Big Pharma has a tendency (as for-profit entities are wont to have) to be less than fully honest, forthcoming, and ethical in its business practices.

Third, nice job lumping everyone in one big bin and condemning them all, though I must say that is very characteristic of the anti-critical-thinking crowd. Again, these are not supporters of the industry so much as the researchers behind the medicine. I am one such supporter, and I can say without reservation that if I heard an anti-vaxxer was currently in the lead for a health-related award, I would mobilize every person I knew to swing the vote the other way.

As for your unfortunate disqualification, there is certainly no solid proof that any wrongdoing occurred. But don’t flatter yourself: neither is there any way to know that no wrongdoing occurred, and the evidence we do have is not encouraging. If it were my decision, I wouldn’t have removed you from the contest, but rather simply vacated the votes cast from newly-created Twitter accounts.

Regarding your work, Mr Adams: getting things wrong is inherent to the scientific process. Thomas Watson once said: “If you want to succeed, double your failure rate.” Proving previous theories wrong is how science advances its understanding of the physical world. It’s the best tool we have for grasping the rules of the phenomena we can observe. This “natural” stuff you’re advocating? I’m pretty sure that if there really were a 25-day cure to Type II diabetes, you’d have retired by now.

Don’t confuse a map with lots of blank edges for a world with no land masses; not being aware of something does not constitute nonexistence.

Farewell VMWare, Hello VirtualBox

•January 19, 2010 • Leave a Comment

It’s official, I’ve given up on VMWare Server.

Oh, we had a lovely run. I tinkered with it while I was still an undergraduate at Georgia Tech, and I began running two full-time VMWare virtual machines – Windows XP and Ubuntu – when I arrived at Carnegie Mellon. VMWare Server saw me upgrade my host machine from Windows XP to Windows Vista to 64-bit Vista to its current Windows 7 platform. There were a few snafus (got that spelling right), of course: VMWare 1.x behaved erratically, VMWare 2.x didn’t have a web-UI plugin for OS X, and my virus scanner threw regular fits over VMWare’s subprocesses.

All relatively minor. Until December 2009 rolled around.

For whatever reason, VMWare stubbornly refused to allow my virtual machines’ startup process to go beyond 95%, at least in the web UI. Oh, the machines themselves were up and running in the background – essentially, the process was headless – but that didn’t help when I needed to administer, say, the Windows XP machine. I had no access to the web UI. I couldn’t even shut the machines down without killing them from the process manager.

I even started support threads: one on Serverfault, one on the VMWare forums. No answers. I found several other similar threads in the VMWare forums, but each seemed to have a different solution specific to the platform, none of which worked for me. The common thread seemed to involve 64-bit host machines, but that aside no solution was adequate.

My ultimate solution? VirtualBox!

Its look and feel are very reminiscent of VMWare 1.x, except this actually works. Not only that, it comes with a bonus: full support for Windows’ Remote Desktop protocol (obviously with a few extra security measures in place), so there’s no need to install a proprietary web-UI plugin, only a client that can communicate via RDP. I’ve been using this cool open source RDP tool for OS X called CoRD. It’s still in beta, but it shows incredible promise. Until the codebase matures a little further (perhaps over the summer I may try to participate in its development), I’m using it alongside Microsoft’s own OS X port of Remote Desktop. Which is, frankly, an annoying application.

I now have both Windows XP and Ubuntu virtual machines running happily within VirtualBox, and it works like a charm. My only complaint so far is that I can’t close the VM windows on the host machine without shutting down or pausing the virtual machine itself. But that’s a pretty minor complaint, a far cry from having no manual access whatsoever.

Just in time, too: my thesis research is really gearing up, and I need access to a sandbox machine with highly configurable Python, Django, and MySQL access. A whole lot of extra hard disk space is another big plus. And setting up a Dyn-DNS domain name for easy access doesn’t hurt.

Facebook Plea

•January 7, 2010 • 2 Comments

Recently, Facebook updated its security and privacy settings to provide users with more protection of their data, more customizability in who has access to that data, and generally tighten things up as a whole.

Whether or not this had anything to do with the TOS controversy from last year is another matter.

But ever since then, something has been nagging at me. And it’s not getting any better; in fact, it’s getting worse. And it’s not only me. My girlfriend posted an entry of her own – much more bluntly than I ever could have – highlighting the annoyance.

If we have so much control over our information, why can’t we disable these little blurbs polluting our walls?

All those little "recent activity" blurbs in between posts. Those buggers.

This may sound somewhat pedantic and trite, but here’s the crux of it all: every single little action a user takes – commenting on a photo, “like”-ing someone’s status, or editing their profile – is posted on the wall. Even if I post a link on my cousin’s wall, the friend of mine at Carnegie Mellon who is not friends with my cousin can see this action that I took.

That’s a privacy issue. And it’s not under my control. Which is not cool. Plus, it’s really freaking annoying! I don’t want my Mom to know how much I’m actually on Facebook when I should be getting work done for graduate school!… :P

And now I’m going to go back to super-duper top-secret project planning. Oh yes indeed.

::Edit:: Facebook recently posted a message on their new security section:

Whether we display a story on your profile is now controlled by the privacy of the content itself, rather than an additional setting. For example, only people who can see both your Wall, and the Wall to which you posted would be able to see a story about you writing on a friend’s Wall. You cannot completely turn off recent activity stories anymore. However, if you want to remove a particular story that currently shows up, simply click the “Remove” button that appears to the right of the story after you move your mouse over it.

So my previous argument about friends-of-friends seeing posts I’ve made on not-mutual friends’ walls doesn’t necessarily stand anymore, provided you have your security set up properly. Still, why they’ve eliminated the ability to turn off “Recent Stories” altogether is entirely beyond me.

Stupid Questions

•January 4, 2010 • 4 Comments

I’ve been slacking off, I realize this. I shall indeed return to a more regular posting schedule…later! For now, I want to delve into a topic that has been nagging at me since reading someone’s Facebook status update on the subject.

That image, while blatantly satirical and cynical, offers quite a few subtler insights. The status I read went as follows:

[Jane] has made a pact with [Joe] that our New Year’s resolution will be “Not answering dumb questions.”

As an aside, let me begin by saying that I am taking all possible avenues to keep this entry from becoming a rant. Rather, I want to point out the assumptions made in and problems with this statement, and suggest viable alternatives. I suppose there’s really no way to explain myself out of those who would choose to view this as me getting up on my high horse – you are certainly entitled to your opinions – but to those who are undecided I ask only that you withhold opinion until you’ve finished reading.

Problems

The biggest problem with this statement is that it’s grossly naïve. What’s wrong with dumb questions? Why shouldn’t they be answered? What’s the motivation for leaving them unanswered? How does one respond instead? Not only are these questions explicitly unanswered, but their implicit answers are likely the reasons for this statement’s naïveté.

For example, giving possible answers:

“Dumb questions are pointless to answer; it won’t fix anything.” Now you’re making the assumption the person asking them is dumb as well. Which is it – the person or the question? Or both? Or are you just too lazy and impatient to deal with someone who may just not be as mentally agile as you?

“I would respond by telling them it’s a dumb question.” I’ll give you points here for at least engaging in a dialog, even if it’s still rather one-sided. At least you’re giving your reasons for refusing to answer, and maybe the person who asked the original question can then rephrase it in a way that meets your high standards of approval.

The other huge problem is that it makes the definition of “dumb” purely subjective. Think about it: to Bill Gates, we’re all dirt poor plebeians; to the 99.99999999999999999999999% of the physical universe on which humans have absolutely no impact, it wouldn’t mean a hill of beans if Earth vanished in a brilliant fireball tomorrow; to the illiterate cotton farmer in Turkmenistan, America is a meaningless footnote that exists only in the rumor mill.

Who are you to say what is dumb and what isn’t? Does the nature of a question make it dumb or not dumb, or does the person asking weigh into the equation as well? If a question is asked by someone who works exceptionally hard but just happened to miss a relatively simple detail, would that be less dumb than the exact same question asked by someone who has a reputation for slacking off? If the question is leading, are you more likely to consider it dumb if you disagree with it? If a gifted and brilliant mathematician had to squeeze in one last semester of poetry in order to graduate, would you consider a question about the basics of iambic pentameter dumb?

Solutions

The picture itself might be specific to the sciences, but its applicability extends far beyond the academic realm.

As a graduate student with quite a bit of tutoring, teaching assistantships, and general Q&A under his belt, I will certainly be the first to admit that being asked “is Java compiled or interpreted?” 30 times in the space of a few hours can chip away at one’s sanity. While it does help to keep in mind that most of those were asked honestly, it should go without saying that some people who asked them were lazy bums who didn’t show up to lecture, didn’t come to recitation, and expected me – the TA – to do their work for them.

This brings me to my point: the definition of a “dumb question” has almost nothing whatsoever to do with its content or the content of its answer; rather, it has almost entirely to do with why the question was asked in the first place.

And therein lies the rub. We cannot know the “why” explicitly, so we approximate implicitly by way of reputation, or content of question/answer, or some other metric that we may not even be actively aware of at the time. Statistically speaking, human beings royally suck at approximating, so the end result is the original statement: naïveté which is just as bad as, if not worse than, the questions this person is actively seeking to avoid.

The solution is, as with most things, to keep an open mind. Be aware of your assumptions, and for Pete’s sake, don’t throw blanket generalizations around. They’re always, always wrong.

(did u c wut I just did thar?)

Merry Christmahanukwanzaramadanukkah to all!

•December 25, 2009 • Leave a Comment

I hope everyone is enjoying a beautiful Christmas morning, and is having a fabulous holiday season with friends, family, and loved ones. :)

It’s the most wonderful time of the year

•December 16, 2009 • Leave a Comment

Walnut St, in Shadyside, Pittsburgh. Old Man Winter is upon us.

I’m finally back in Atlanta. The semester, oddly enough, wrapped up pretty well; still waiting on my thesis research grade to come in, but otherwise – all things considered – I can’t really mark it as anything but a success.

Here’s how it all went down:

Astrophysics

It’s unfortunate, but I had to drop the course shortly before midterm. My other courses were entirely too high-priority relative to this one, which was an elective. As interesting as the course material was (we had been discussing the physics underlying supernovae when I dropped), graduating in May was more important. Maybe as a PhD student I could take another crack at astrophysics.

Computational Modeling and Simulation

This was the last course I needed in order to be eligible to receive my degree. In theory, by passing this class, I could have graduated this month, but I’m sticking around another semester to deliver my thesis and be a TA. At any rate, though my grades – both for the class (B) and on the final project (B) – weren’t as outstanding as I would have liked, I still did well and am now eligible to graduate. Plus, how many people have the opportunity to set up a simulation to observe the effects of disease spread and drug resistance in a population, given a particular dosage regimen?

Web Applications

My “fun” course proved challenging in its own right. I was happily seated with an A+ in the course at midterm, having aced all three of my homework assignments. Given that, however, I opted to make the fourth and final assignment orders of magnitude more difficult, and as a result I had to turn the assignment in late (for a 70/100). Furthermore, for the final project, my partner and I ran into time constraints; as M.S. students we were just too busy to devote enough time to really do a thorough job (at least, until last week when I submitted, albeit late, the complete product). I ended up scoring a 149/200 on the final exam, which was apparently a full standard deviation above the mean. That was enough to push me to an A-.

Thesis Research

Ho man, talk about a full-time commitment. I was technically only registered for 12 credits (roughly, a 4-hour course) but was doing about 18 or even 24’s worth for most of the semester, particularly in the second half. My work this semester was but the first phase in what will eventually culminate in the delivery and defense of a thesis in May, centering around the work of the Murphy Lab in protein localization. Specifically, my work involves deriving some method of comparing data from disparate sources of information, and as of two weeks ago, the first phase of this work was completed. I also co-authored two papers (set to be published in the next month or so), and am working with some HCI folks to enhance the UI of my thesis project for entrance into an upcoming competition.

Overall, busy busy busy; the sad part is, I have a lot of work to finish over the break as well. Thankfully, I polished off the last of my PhD applications a week ago (8 in total!), and am now making sure all my recommendations have been submitted and that all supporting documentation (GRE scores, transcripts…bleh) arrives safely.

Oh…and if, for some reason, anyone wants to buy me a Christmas gift, check out my NewEgg and Amazon wish lists I just set up! :)

Merry Christmahanukwanzaaramadanukkah!

Literally pulling 25-hour days

•December 2, 2009 • Leave a Comment

I promise I’m still alive! Just trapped beneath the avalanche of final semester coursework and research, plus polishing off my PhD applications. Here is an accurate portrayal of my current situation:

Not quite according to plan

•November 14, 2009 • 4 Comments

I’ve spent the better part of the last few hours reinstalling my Ubuntu virtual machine from scratch after completely botching my previous install’s configuration. How? I was attempting to get Python 2.5 up and running with a few custom packages, and ended up accidentally removing everything Python-related, which included many rather important system packages. Synaptic then froze up, and most of the basic system operations stopped working.

Freaking awesome. Just how I wanted to spend my Saturday evening.

But now that I have it working again, I wanted to delve into my latest project: a semi-intelligent Twitterbot! There are three core components to this project:

  1. Read the public Twitter timeline to accumulate posts.
  2. Build a Markov Model out of all accumulated posts.
  3. Use a cronjob to modulate the frequencies of #1 and #2.

I’ve posted about Hidden Markov Models before, and this is an example of theory put into practice. Granted, the utility of this application is questionable, but if for no other reason, it sure is entertaining. In fact, since activating my Twitterbot a little over a week ago, it’s already garnered a decent response. Here are some of my favorites thus far:

lulz

It’s endlessly amusing to me that so many people seem to think my bot is actually a person. At least a few also seem to be amused by its antics. Still others respond as though nothing is amiss. It’s also managed to flag down multinational Twitter users. And it’s even attracted the attention of other bots!

How does it work, ya say? Welllllll…

The underlying assumption of HMMs is that there is a hidden state that influences whatever output we actually observe. Within this context, it means we’re assuming there’s an unobservable pattern to the sentences Twitter users post that results in the actual words we can see. Thus, if we observe enough of these posts, we should, in theory, be able to infer those hidden states.

Yeah yeah, that wasn’t very simply put. Nevertheless, let’s move on.

The assumption my bot makes is pretty straightforward: each word that is observed depends only on the word before it. Put another way, this means that, given a single word, there is only a certain number of words that can come after it. Of this finite number of words that can come after it, some are much more likely than others.

This makes intuitive sense. Take any one of the sentences in this post, for example: after you read one word, you’re already expecting a certain word or number of words that could follow it (it’s how we read, in fact; ever heard that humans only actively read about 70% of the words on a page? all the others are inferred by this same method). It’s basically a primitive form of contextual analysis.
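To make the "some followers are more likely than others" idea concrete, here is a minimal Python sketch that counts which words follow which and turns the counts into probabilities. The function name and the three sample sentences are made up for illustration; this is not the bot's actual code.

```python
from collections import Counter, defaultdict

def transition_probabilities(sentences):
    """Map each word to {next_word: probability}, estimated from raw counts."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        # Pair every word with the word that immediately follows it.
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    table = {}
    for word, followers in counts.items():
        total = sum(followers.values())
        table[word] = {nxt: n / total for nxt, n in followers.items()}
    return table

# Hypothetical mini-corpus standing in for real posts.
posts = ["the cat sat on the mat", "the cat ate the fish", "the dog sat"]
table = transition_probabilities(posts)

# Followers of "the", most likely first.
for nxt, p in sorted(table["the"].items(), key=lambda kv: -kv[1]):
    print(nxt, round(p, 2))
```

In this toy corpus "the" is followed by "cat" more often than by anything else, so "cat" ends up with the largest share of the probability.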

From a technical standpoint, this dependence on only a single previous word is called a “first-order” Markov Model. HMMs can go as high as you’d like. There is another similar Twitterbot built by a friend of mine which uses a “second-order” Markov Model, in that each word depends on the two previous words, resulting in a sentence that probably makes more sense than mine will. But for those of you who are ahead of me, this also means much more of the original posts used to build the HMM will show up in the generated posts.
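Here is a rough sketch of why a higher order tends to parrot its source material. It builds the same kind of model for any order by keying on a tuple of the previous words, then measures how many distinct next words each state offers on average. The helper names and the sample sentences are invented for illustration, and a three-sentence corpus is far too small to prove anything in general.

```python
from collections import defaultdict

def build_model(posts, order):
    """Map each tuple of `order` consecutive words to the words seen right after it."""
    model = defaultdict(list)
    for post in posts:
        words = post.split()
        for i in range(len(words) - order):
            state = tuple(words[i:i + order])
            model[state].append(words[i + order])
    return dict(model)

def average_choices(model):
    """Average number of distinct next words available per state."""
    return sum(len(set(followers)) for followers in model.values()) / len(model)

# Hypothetical mini-corpus standing in for real posts.
posts = ["the cat sat on the mat", "the dog sat on the rug", "a cat ate the fish"]

first = build_model(posts, order=1)
second = build_model(posts, order=2)

print("order 1:", round(average_choices(first), 2), "choices per state")
print("order 2:", round(average_choices(second), 2), "choices per state")
```

With fewer choices per state, a second-order model has less room to wander, so it reproduces longer runs of the original text.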

And honestly, I wanted my bot’s posts to be as random as possible while still kind of making sense :P Hence I purposefully implemented a first-order model.

My bot accumulates 800 posts from the public feed over 20 minutes, then uses those posts to build a first-order HMM. From that model, it then constructs a post by sampling from the model, and posts it to the Twitter account.
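The build-and-sample step can be sketched in a few lines of Python. This is a simplified stand-in, not the bot's real source: the function names, the start/end markers, and the sample posts are all assumptions, and fetching from and posting to Twitter are left out entirely. Strictly speaking, what the sketch builds is a plain Markov chain over visible words rather than a model with hidden states.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical markers for sentence boundaries

def build_chain(posts):
    """First-order chain: each word maps to the list of words seen right after it."""
    chain = defaultdict(list)
    for post in posts:
        words = [START] + post.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain

def generate_post(chain, rng, max_chars=140):
    """Walk the chain from START until END is drawn or the length limit is hit."""
    if not chain[START]:
        return ""  # no training posts, nothing to say
    words = []
    current = START
    while True:
        # Repeated entries in the list make frequent followers more likely.
        current = rng.choice(chain[current])
        if current == END:
            break
        if len(" ".join(words + [current])) > max_chars:
            break
        words.append(current)
    return " ".join(words)

# Hypothetical stand-ins for the 800 accumulated posts.
sample_posts = [
    "i love coffee in the morning",
    "the morning is too early for me",
    "i love the weekend",
]
chain = build_chain(sample_posts)
post = generate_post(chain, random.Random(0))
print(post)
```

Storing followers as a list with duplicates keeps sampling trivial: `rng.choice` picks each follower in proportion to how often it was observed, with no explicit probability table.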

If you’re interested in following the bot, you can find it here.

I’m in the process of refining the current model, perhaps a hybrid first-second order HMM. I may also include some topics that are weighted more heavily than others, so the generated posts more accurately reflect the trending topics. And of course, I’m open to suggestions!

Yes, this bot provides a wonderful source of amusement, particularly since I am in way over my head in schoolwork. Applying for jobs, applying to PhD programs, conducting research with Dr Murphy, and actually keeping up with my coursework are all proving very difficult to juggle these last few weeks of the semester. So it’s nice to have a new joke to read every 20 minutes!

I also want to mention that, a few days ago, my total hits on this blog surpassed 20,000. Thank you again to all those who seem to find something on this blog interesting :)


I would like to go on record…

•November 3, 2009 • 2 Comments

…with the following statement. AHEM!

Though I am a student of the sciences; though I am a pupil of technology and a patron of theory; though I relish immersing myself into the latest and greatest advances and leaps in the cutting edge of research, nothing will ever rival the sheer utility of this, the greatest resource a graduate student could possibly have at his/her disposal:

Seriously. My beautiful and fabulous girlfriend bought me one for my 24th birthday, and it’s seen an unbelievable amount of action since this fall semester began. It’s truly the shining beacon of human ingenuity.