December 17, 2004

Snohomish Anomaly Still Unexplained

If you assume that the newly added absentee ballots should break to both candidates in the same proportions as in the previously counted absentee ballots, then the probability of Rossi receiving at most **75** of the 194 votes is **0.15%**. This is not an outcome that can be reasonably attributed to random variation. There is no cause to allege that anything improper has occurred, but an explanation from the vote counters is in order.

The *Everett Herald* offers this explanation of Gregoire's 44-vote pickup:

results of the hand recount favored Gregoire, reflecting where the candidates drew their most electoral support. She won 51 percent of mail ballot votes while Rossi won 53 percent of those cast on touch-screen machines on Election Day. The increase in votes in the recount came solely from mail ballots; the results of votes cast electronically did not change. Thus, Gregoire could be expected to pass Rossi.

[Hat tip to our good friend David "Horse Man" Goldstein for posting the link to the Herald article in a comment.]

The Herald's theory explains only a small part of the unexpected swing to Gregoire. I pulled some numbers off the Snohomish County recount canvass:

| Absentee Ballots | Gregoire | Rossi | Rossi Percentage |
|---|---|---|---|
| Machine Recount | 96,925 | 95,153 | 49.54% |
| Manual Recount | 97,044 | 95,228 | 49.53% |
| Pickup in new ballots | +119 | +75 | 38.66% |
| Expected breakdown of new ballots | +98 | +96 | |
| Discrepancy | +21 | -21 | |
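The figures in the table, and the 0.15% probability quoted at the top of the post, can be reproduced with a short calculation. The sketch below is an illustration only, not the spreadsheet used for the post: it assumes each new ballot is an independent draw with Rossi's share fixed at his machine-recount absentee share, and the helper name `binomial_cdf` is introduced here for convenience.

```python
from math import comb

def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Absentee totals taken from the table above.
gregoire_machine, rossi_machine = 96_925, 95_153
gregoire_manual, rossi_manual = 97_044, 95_228

# Rossi's share of the previously counted absentee ballots.
rossi_share = rossi_machine / (gregoire_machine + rossi_machine)

# Ballots added between the machine and manual recounts.
new_gregoire = gregoire_manual - gregoire_machine
new_rossi = rossi_manual - rossi_machine
new_total = new_gregoire + new_rossi

# Expected split if the new ballots broke like the old ones.
expected_rossi = round(new_total * rossi_share)
expected_gregoire = new_total - expected_rossi

# Assumption: ballots are independent draws at the countywide share.
tail = binomial_cdf(new_rossi, new_total, rossi_share)

print(f"new ballots: Gregoire +{new_gregoire}, Rossi +{new_rossi}")
print(f"expected:    Gregoire +{expected_gregoire}, Rossi +{expected_rossi}")
print(f"P(Rossi <= {new_rossi} of {new_total}) = {tail:.4%}")
```

The tail probability depends entirely on the independence assumption; if the new ballots came from an unrepresentative batch, the number would be different.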

The Snohomish County Auditor's office is emailing me the complete canvass files in spreadsheet form, so I can study the numbers more closely and see if there are any patterns in the precinct level returns.

Posted by Stefan Sharkansky at December 17, 2004 11:10 AM

Comments

I suppose that if the standard deviation of each candidate's share of the vote is large on a precinct-by-precinct basis, and the "uncounted" ballots are likely to come from a few concentrated (instead of many, diffuse) precincts, perhaps the chance of such an incremental vote pick-up (on either side) is bigger than what you have assumed. Is this right? (Of course, "ain't it a bitch" how this always seems to happen when a Dem candidate needs to make sure every vote is counted...)

Posted by: Curious Out of Stater on December 17, 2004 11:32 AM

Your probability analysis also assumes that each newly discovered ballot is an independent random event, which may or may not be the case. For example, it could be that some precinct misplaced a stack of ballots that was then found during the manual recount. In that case, those ballots would be correlated, not independent.
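To see how much the independence assumption matters, here is a rough sketch comparing two cases: the 194 ballots drawn at the countywide absentee share, versus all 194 coming from a single stack whose voters lean differently. The 40% precinct share is a made-up number chosen purely for illustration; nothing in the canvass figures quoted so far says where the new ballots came from.

```python
from math import comb

def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

countywide_share = 0.4954            # Rossi's absentee share, from the post
hypothetical_precinct_share = 0.40   # invented value, for illustration only

# Case 1: new ballots are a random sample of all absentee ballots.
p_countywide = binomial_cdf(75, 194, countywide_share)

# Case 2: new ballots are one misplaced stack from a Gregoire-leaning area.
p_single_stack = binomial_cdf(75, 194, hypothetical_precinct_share)

print(f"countywide share:      P(Rossi <= 75) = {p_countywide:.3%}")
print(f"hypothetical precinct: P(Rossi <= 75) = {p_single_stack:.1%}")
```

Under the hypothetical share, a 75-of-194 result stops being unusual, which is why the precinct-level source of the ballots matters.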

Posted by: MathGeek on December 17, 2004 11:49 AM

1) So, the question now is: Will there be a new election?

2) If one occurs, will it ONLY be Rossi and the Gre-gore clone or what?

3) If so, the eliminated 3rd party ballots would most likely determine the winner on a re-vote ..... that solution would be one that all Washington citizens "could hold their nose and accept".

Posted by: leaddog2 on December 17, 2004 11:51 AM

I suggested the same possible explanation, but I have to wonder how this can happen for mailed ballots. These, I assume, are cast relatively independently (of precinct involvement) and, I also assume, are all mailed to and handled at a central county-level facility (I am not sure the latter is true in this case).

Posted by: Curious Out of Stater on December 17, 2004 11:57 AM

Snohomish found some 200 new ballots. They had a previous base of about 200,000 ballots. If the new ballots they found numbered, say, 10,000 or 20,000, then you would have a case in saying they should be pretty close to the same proportion. But when the base of new ballots is so infinitesimally tiny, no statistical correlation could possibly occur. It is what it is. Period.

Look at flipping a coin. If you flip a coin 10,000 times, you will certainly get 5,000 heads and 5,000 tails. But if you flip that coin just 10 times, it would not be surprising to get 10 heads and no tails, or 10 tails and no heads.

Get off your correlation kick. It doesn't exist in such small numbers.

Posted by: Nelson on December 17, 2004 12:20 PM

You are as ignorant as you are hysterical. It is absolutely not the case that "If you flip a coin 10,000 times, you will certainly get 5,000 heads and 5,000 tails".

If you flip a coin 10,000 times you *expect* to get 5,000 heads. Expectation is not the same thing as certainty. Being off by a sufficiently small number is not surprising.

And if you flip a coin 10 times, it is surely possible to get all heads. But it is not "unsurprising", it is surprising, as the probability is ... low. What is it exactly, Nelson? Do you know? If you can't calculate it, why not do an experiment? Flip a coin ten times. Write down the number of times it comes up heads. Repeat until one of your trials gives you heads all ten times. Let us know how many times you had to repeat this test until you got all 10 heads.
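For reference, both coin-flip claims can be checked directly. A minimal sketch, assuming a fair coin and independent flips (the variable names are illustrative):

```python
from math import comb

# Probability that 10 flips of a fair coin all come up heads.
p_ten_heads = 0.5 ** 10

# Average number of 10-flip trials needed before seeing all heads
# (mean of a geometric distribution with success probability p).
expected_trials = 1 / p_ten_heads

# Probability of getting exactly 5,000 heads in 10,000 flips.
# Python integers are arbitrary precision, so this ratio is exact
# before being rounded to a float.
p_exactly_5000 = comb(10_000, 5_000) / 2 ** 10_000

print(f"P(10 heads in 10 flips)       = {p_ten_heads:.4%}")
print(f"expected trials until success = {expected_trials:.0f}")
print(f"P(exactly 5,000 of 10,000)    = {p_exactly_5000:.3%}")
```

Ten heads in a row has about a 1-in-1,024 chance, and an exact 5,000/5,000 split happens in fewer than 1% of 10,000-flip runs, so neither is "certain" nor "unsurprising".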

Posted by: Stefan Sharkansky on December 17, 2004 12:30 PM

With all counties but Spokane and King reporting, the NET margin is +1 Rossi, after 1,107 votes were added to Gregoire (553) and Rossi (554) combined.

This is not posted on the Sec of State site yet; my source is the Pierce County Auditor site.

Posted by: Neil Sullivan on December 17, 2004 12:40 PM

What are the odds of that?

Posted by: David Goldstein on December 17, 2004 12:41 PM

Also there are attacks from the likes of Nelson in this post regarding the statistical anomalies in the newly found ballots.

None of these attacks, or the articles in the MSM, acknowledge the obvious questions.

All the Republicans are asking for is to stop the process long enough to allow for some explanation as to why the ballots are being found, a chain of custody, an explanation of statistical anomalies, etc.

There is no basis for these attacks. The Democrats should have nothing to hide if these found ballots are honest votes that need to be counted.

So Dean Logan et al., open up and tell us all what is going on here with the newly found ballots. Let independent auditors review the signatures, envelopes, signature databases, etc.

You have nothing to hide, right?

Posted by: Jeff B. on December 17, 2004 12:44 PM

Insufficient information. Your flipping technique may not randomize the event. Just as it takes something like seven shuffles of a deck of cards with an automatic shuffler to produce random distributions, it takes a certain number of flips. If you get in a groove and flip with the same forces, you induce repetitive results, like a bowler who bowls a perfect game.

Nice try.

Posted by: srogers on December 17, 2004 12:49 PM

Otherwise, I will continue to use logic and some working knowledge of data correlations in my conclusions. Did you know, for example, that the odds against winning a State lottery that uses 5 numbers plus a mega number, are 120,000,000 to 1?

So if a State (like California) sells 10,000,000 lottery tickets per drawing, your correlation table would say that there would be a winner less than once out of every 12 weeks, since there would also be some duplicate numbers chosen by lottery players.

We know that isn't correct. California, for example, has 104 lottery drawings a year (2 each week). There are winners in about 40% of the drawings (in some cases multiple winners), not in less than 8% of the drawings.

Do you think there is someone from the Snohomish or King County board of elections who is also working at the California State Lottery Board?

Posted by: Nelson on December 17, 2004 01:02 PM

Pierce County: Gregoire +232, Rossi +201

With Spokane and King left to report, we're almost back to where we started, with a net gain of +1 for Rossi.

Posted by: David Goldstein on December 17, 2004 01:02 PM

What I want to know is how do you recount an electronic-only voting county?

Posted by: Jacqueline on December 17, 2004 01:19 PM

How about an initiative to completely abolish the sales tax? It shouldn't be difficult to get the necessary signatures. And it should pass, since Republicans will vote for it. So will a lot of lower and middle income Democrats, since the sales tax is an unfair and regressive tax that has relatively minor impact on the upper income people (who happen to lean Democrat in this state -- only middle income people lean Republican).

The passage of the initiative will, of course, virtually cripple state government and force massive cutbacks in state programs. It will cause far more of the heavily Democrat state bureaucracy to lose their jobs, than would have been the case if Rossi were elected.

This is a drastic measure, but it is probably one of the most effective forms of justice that can be exacted to deal with this stolen election.

Posted by: Richard Pope on December 17, 2004 01:21 PM

How about an initiative to completely abolish the sales tax?

Or we could just set off a small nuclear device in Olympia during the session. It's just as reasonable a proposal, and far less destructive to the rest of the state.

Posted by: David Goldstein on December 17, 2004 01:42 PM

Presumably we have a recount to correct errors from the previous count, without also introducing new errors. If there is a small discrepancy between the counts, it is reasonable to assume that either we've corrected a small error, or we've introduced a small error. In either case, it's probably not worth the trouble to investigate what happened.

On the other hand, in the event of a large discrepancy, we want to ask ourselves whether

(a) we caught and corrected a significant error

(b) we introduced a significant new error

If (a), we should understand what caused the original error in the first place so it can be avoided in the future. If (b), we should want to correct the error before the outcome is certified. Either way, the vote counters owe it to the voters to explain what happened differently in the two counts that produced such different answers. Don't you agree?

Posted by: Stefan Sharkansky on December 17, 2004 02:03 PM

Don't even spend a minute dreaming that we will all sit back and do nothing after this is over. There are huge numbers of us that had NO idea we were living in the Ukraine.

Posted by: Julie on December 17, 2004 02:14 PM

Congratulations, people of Washington (sadly including myself). We are witnessing the result of default elections of Dems to various local positions of power. Expect to continue being taxed into oblivion. Expect more businesses and doctors to move out of state. Expect our economy to go into the crapper, all because the WSDP cannot accept defeat when it happens. I hope you're happy.

Posted by: Ferrous on December 17, 2004 02:19 PM

The CalTech/MIT studies clearly conclude that recounts are more accurate than the original "preliminary" count... indeed, the "tabulation validation rate" is a measure of the accuracy of the preliminary count, using the recount as the benchmark. Pierce County's tabulation error rate comes to about 0.3%... better than, but relatively consistent with the NH study.

As to why these under votes broke for Gregoire, it has been suggested here that since us Democrats are stoopid, we have more trouble filling out ballots in a manner that is machine readable. But while I have heard this anecdotally elsewhere, I've seen no studies that back it up.

Posted by: David Goldstein on December 17, 2004 02:46 PM

Not correct. I posted my correspondence with the principal author of these studies and he *assumes* that hand recounts are more accurate, but doesn't show the data to back that up. Besides, most of the other CalTech/MIT results from N.H. and other states are not consistent with what we've observed here in WA...

I'm willing to compromise on Seattle.

A flaw in the optical scanning system...they weren't designed to pick up crayon marks...;>)

Suppose we have, say about 194,000 balls and exactly half are painted blue, the remainder painted red, in a cement truck. We let the truck's mixer run for a while and really mix up the balls. Then we blindfold a Ukrainian and have her pull exactly 194 balls out of the cement truck.

The statistical assistant from Kiev takes the balls as they are handed over and arranges them in a row on the ground, blue balls in a line on the left, red balls extending the line on the right.

Now, at the end of this, what are the chances that there are at most 75 red balls in the 194 that were drawn?

With this simplified setup, the odds of that result are 38.97% (the odds of *exactly* 75 red ones are the same as the odds of exactly *97* red ones: 0.513% or 1/195)

I claim that the simplifications that I made to get this result are minor and do not impact the result in any appreciable way.

First, I used a 50-50 distribution, and not the tad off from 50-50 that appears to be the actual case.

Secondly, I assumed that the probability of each drawn color did not change as balls were drawn. That's not quite accurate, but we drew a mere 0.1% of the balls and I say that can be ignored too.

How did I arrive at this remarkable result?

Notice that from 0 red balls to 194 all-red balls there are 195 cases. 76 of these have 75 or fewer red balls. The cases are not quite equiprobable, but the difference is so minor in drawing only 0.1% of them that let's assume they are equiprobable. Then the odds of 75 or fewer reds is 76/195 = 38.97% (The odds for 75 or fewer blues are the same of course.)

Could the blindfolded Ukrainian have substituted some blue balls for ones being drawn from the cement truck (without knowing which color they were)? Sure.

Can you determine that from this statistical result? No. The law of small numbers is working against you.

Of course, you can believe there is cheating regardless. But you didn't need these calculations for that.

Posted by: orcmid on December 19, 2004 11:08 PM

=binomdist(75,194,0.4954,1)

Posted by: Stefan Sharkansky on December 19, 2004 11:53 PM

If we assume that the sample is large enough that we can substitute a normal distribution for the binomial, it goes like this:

The mean of a sample of 194 is 97 (with equiprobable blue and red).

The variance of a sample of 194 is 194/4 = 48.5

The standard deviation is the square root of that, or 6.96.

119 blue balls is (119-97)/6.96 = 3.16 standard deviations above the mean.

The normal-distribution probability of all values below +3.16 standard deviations is approximately 0.9991836477 (I used 3.15 because I have that exact value in a table).

The probability of 119 or more blue (that is, 75 or fewer red balls) is

1-0.9991836477 = 0.0816% which is way the heck out there. That shows how bad my Ukrainian math is!

This is a pessimistic result because the normal distribution tails off to infinity in both directions, rather than ranging from 0 to 194. Using Stefan's Excel formula with my simplified equiprobable case, we get

binomdist(75, 194, 0.5, 1) = 0.097%

for the probability of 75 or fewer red balls with a 50-50 chance of a red ball.

This squares with Stefan's

binomdist(75, 194, 0.4954, 1) = 0.148% result and

binomdist(75, 194, 0.4953, 1) = 0.150%

with the second using the recounted population.

Notice that as the probability of a red ball goes down, the probability of 75 or fewer goes up.
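The normal approximation and the exact binomial figures above can be compared side by side. The sketch below is only an illustration under the same assumptions (independent draws, fixed probability of a red ball); the helper names are invented here, and the continuity-corrected variant is an addition that does not appear in the calculation above.

```python
from math import comb, erfc, sqrt

def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def normal_upper_tail(z: float) -> float:
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))

n = 194
mean = n * 0.5          # 97 with equiprobable blue and red
sd = sqrt(n * 0.25)     # square root of the variance 48.5

# Normal approximation as done above: 119 or more blue balls.
z = (119 - mean) / sd
normal_tail = normal_upper_tail(z)

# Same approximation with a continuity correction (an extra step added here).
z_cc = (118.5 - mean) / sd
normal_tail_cc = normal_upper_tail(z_cc)

# Exact binomial tails for the three probabilities quoted above.
exact = {p: binomial_cdf(75, n, p) for p in (0.5, 0.4954, 0.4953)}

print(f"z = {z:.2f}, normal tail = {normal_tail:.4%}")
print(f"z (continuity corrected) = {z_cc:.2f}, tail = {normal_tail_cc:.4%}")
for p, tail in exact.items():
    print(f"binomdist(75, 194, {p}, 1) = {tail:.3%}")
```

All of these land around a tenth of a percent, so the choice between the approximation and the exact sum does not change the conclusion.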

So it makes no sense to attribute this result to chance, as if the additional ballots were equivalent to random samples from a uniformly distributed population. I don't have enough information to do anything else. Maybe the detailed breakdown Stefan is getting will provide some insight, if it matters. I wonder how one can apply statistics at all at this level.

So, let's see. My back-of-the-envelope approach produced a result that was off by a factor of 400. That'll teach me to be such a smart ass. I did better with a finger-in-the-air prediction of the Monorail recall outcome!

Posted by: orcmid on December 20, 2004 01:05 AM