There is no such thing as an unscientific one, either.
A poll is this: a question or questions that are interpreted and answered by a group of individuals at a given time.
There is no difference between a poll and a vote save one. A poll is a sample of all people it is said to represent. A vote is taken of all those people (or their representatives).
Any statistics derived from a given poll are certain (excepting mistakes in calculation) for the group questioned. For example, in honor of Occupy Wall Street, suppose I stand at 72nd Street and Second Avenue tomorrow at noon and ask the first fifty people I meet, “Yes or No: are you in favor of eating the rich?”
There are only three possibilities, resulting in fixed percentages of “Yes”es, “No”es, and variously worded versions (including silence) of “Get away from me, you nut.” Suppose these percentages are 30%, 20%, and 50%, respectively.
I could then announce to the world my poll results: “30% in Favor of Eating the Rich.” And I’d be telling the truth, too.
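To make the arithmetic concrete, here is a minimal sketch of the tally. The individual answer counts are invented so that they reproduce the headline figure, and the function name `tally` is only illustrative.

```python
from collections import Counter

# Hypothetical answers from the fifty passers-by; the counts are made up
# so that they reproduce the "30% in favor" headline.
answers = ["yes"] * 15 + ["no"] * 10 + ["get away from me"] * 25


def tally(responses):
    """Return each answer's share of the group questioned, in percent."""
    counts = Counter(responses)
    total = len(responses)
    return {answer: 100 * n / total for answer, n in counts.items()}


shares = tally(answers)
print(shares)
```

Whatever the counts turn out to be, the shares are exact for these fifty people. Nothing in the calculation says anything about anybody else.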
Now, could I help it if you thought that the 30% of which I spoke meant “all Americans”? Remember that 30% is the real, God’s honest percentage of folks I interviewed. But is it representative of “all Americans”? That depends on the extent to which the people I interviewed “look like” the rest of Americans.
So what does “look like” mean? And why should I care? Because a poll is not a vote but we want to use the poll to guess what the vote might have been had we been able to sample the entire universe of people under consideration.
In other words, a poll is always a prediction, a forecast, a guess.
Everybody has certain measurable characteristics. Some of these match characteristics shared by others, some don’t. One characteristic is sex; another is age. Others are height, weight, kind and place of residence, diet, gene sequences, presence of certain maladies, religious and political affiliation, subscription to various beliefs, ownership of certain cars, books, and other objects, and so on. The number of characteristics we could track is essentially limitless.
The people I sampled have characteristics, of course. I would have had to select a limited number of these characteristics (time is finite), which I could have broken down by category: so many of Lithuanian descent, so many ex-league softball players, and so on. My poll is then a valid guess of what the vote would be for other groups of people who have the same fraction of those of Lithuanian descent, etc.
The result of any poll is a valid guess of what the vote would be among all the people who share the same characteristics as those polled.
This goes for polls conducted on blogs or on MSNBC viewers, just as it holds for those polls issued by Gallup, Rasmussen, etc. All of these are valid and are predictive of other groups of people who “look like” those who were polled; that is, who share characteristics in the same frequency.
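One way to picture “look like” is to compare, characteristic by characteristic, how often each category turns up in the polled group and in the group you want to speak for. The sketch below does just that. The two groups, the characteristics, and the function names are all invented for illustration. It compares each characteristic separately, which is a simplification of “same frequency of all characteristics.”

```python
def frequencies(people, characteristic):
    """Fraction of a group falling in each category of one characteristic."""
    counts = {}
    for person in people:
        value = person[characteristic]
        counts[value] = counts.get(value, 0) + 1
    return {value: n / len(people) for value, n in counts.items()}


def largest_gap(sample, target, characteristics):
    """Largest absolute difference in category frequency between two groups."""
    gap = 0.0
    for c in characteristics:
        fs, ft = frequencies(sample, c), frequencies(target, c)
        for value in set(fs) | set(ft):
            gap = max(gap, abs(fs.get(value, 0.0) - ft.get(value, 0.0)))
    return gap


# Hypothetical groups, invented purely for illustration.
sample = [{"sex": "F", "age": "under 40"}] * 3 + [{"sex": "M", "age": "40+"}] * 1
target = [{"sex": "F", "age": "under 40"}] * 2 + [{"sex": "M", "age": "40+"}] * 2

gap = largest_gap(sample, target, ["sex", "age"])
print(f"largest frequency gap: {gap:.2f}")
```

A gap of zero on the characteristics you chose means the two groups “look like” each other on those characteristics only. It says nothing about the limitless others you did not track.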
The impassable difficulty is that no two people ever match on all measurable characteristics; therefore no two groups of people will ever match frequencies of all characteristics. In that sense, no polls are ever valid for any but the group directly questioned.
To fix this predicament, we have to enforce limits on characteristics. But which; how to pick? It depends on what vote the poll is approximating, and what behavior is implied by that vote.[1]
Take presidential polls. There are, at this writing, 307 million Americans. We have a good guess (but it is only a guess) of the number of these Americans who are female and who are male. We have less knowledge about the breakdown of age and race. Everything else is various levels of vague.
There is a distinct behavior associated with this poll: the election itself, though we still have to decide what the election means. Is it the winner of the electoral college or the popular vote? Suppose the former.
I then conduct a poll of “voters” using a set and frequency of characteristics I assumed would be probative and wait until the election and see how well the poll matches the results.
If the poll turns out to be a good guess, then I have confidence the characteristics (and frequency) I chose were adequate. But if the poll were a poor guess, then I would refine those characteristics or frequency or both.[2]
Organizations like Gallup and Rasmussen pay attention to these kinds of details (they have to, else they would go broke), which is why their polls are taken more seriously than those of groups like MSNBC and other news organizations that produce polls solely for their entertainment value.
[1] Not all polls represent an unambiguous behavior. For example, what exactly does “I like the direction the country is taking” even mean in terms of measurable behavior? Be wary of these kinds of polls.
[2] A poll’s “margin of error” is just a number pumped out of a formula which takes as input the size of the poll and the size of the universe the poll is said to represent. It matches the accuracy of the poll-as-vote only by accident, and is therefore of only limited use. Part of a poll’s actual accuracy is how the poll is conducted; i.e. its mechanics (time of day administered, how many landlines called, how many cell phones, etc.; how the questions are written and interpreted).
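For the curious, here is a sketch of the kind of formula the margin-of-error footnote has in mind. The footnote does not say which formula pollsters use, so this assumes the usual textbook one for a simple random sample with a finite population correction. The function name, the 1.96 multiplier (a 95% level), and the worst-case proportion of 0.5 are all assumptions of the sketch.

```python
import math


def margin_of_error(sample_size, universe_size, z=1.96, p=0.5):
    """Textbook margin of error for a proportion, as a fraction.

    Assumes a simple random sample. z=1.96 corresponds to a 95% level and
    p=0.5 is the worst case; both are assumptions of this sketch.
    """
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    # Finite population correction: shrinks to zero when everybody is asked.
    correction = math.sqrt((universe_size - sample_size) / (universe_size - 1))
    return z * standard_error * correction


# A 1,000-person poll said to represent 307 million Americans.
national = margin_of_error(1_000, 307_000_000)
print(f"{100 * national:.1f} percentage points")

# Asking everybody is a vote, and the formula returns zero.
print(margin_of_error(100, 100))
```

Notice what goes in: two sizes and nothing else. Who was asked, when, how, and in what words never enter the calculation.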