Malibu Media’s geolocation accuracy: more scrutiny
Yesterday I wrote about certain skepticism expressed by California judges regarding Malibu Media’s geolocation reliability. Particularly, I mentioned Malibu’s declaration in response to a CASD magistrate’s order for supplemental briefing. That declaration included a list of reasonably precise IP address resolutions that purports to prove the following claim:
From undersigned’s experience filing lawsuits in California, Maxmind [a geolocation provider Malibu uses] has always been 100% accurate to the state level, 100% accurate at identifying the ISP and has predicted the correct district 98 out of 99 times.
However, Maxmind lists much more modest numbers on its website, and Malibu is aware of those numbers: it copied them into its own declaration:
Maxmind’s geolocation tracing service is “99.8% accurate on a country level, 90% accurate on a state level, 81% accurate on a city level for the US within a 50 kilometer radius.”
Anyone with a math background should immediately experience a WTF moment. Indeed, a 90% probability of correct resolution on the state level translates to a 0.00000072% likelihood (0.9^178, treating each lookup as independent) of error-free resolution of all 178 addresses (the number of subpoenas issued in the Northern District of California). To be fair, it is higher than the probability of a copyright troll having a heart, yet still extremely low. Even if we assume that Maxmind is being CYA-style cautious with the numbers, and the actual error rate is 10 times lower (99% accuracy), there would still be only a 16.7% likelihood that all 178 addresses resolve correctly on the state level.
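The arithmetic behind those two percentages is easy to reproduce. The sketch below assumes that each IP lookup succeeds or fails independently of the others, which is a simplification (errors for addresses in the same ISP block are likely correlated); the function name is only illustrative.

```python
def prob_all_correct(accuracy: float, n: int) -> float:
    """Probability that n lookups are ALL correct, assuming each one is
    independently correct with the given per-lookup accuracy."""
    return accuracy ** n


N_SUBPOENAS = 178  # subpoenas issued in the Northern District of California

# 90% is Maxmind's published state-level figure; 99% is the generous
# "error rate ten times lower" scenario discussed above.
for accuracy in (0.90, 0.99):
    p = prob_all_correct(accuracy, N_SUBPOENAS)
    print(f"per-lookup accuracy {accuracy:.0%}: "
          f"P(all {N_SUBPOENAS} correct) = {p:.8%}")
```

Under the independence assumption, the first line comes out at well under a millionth of a percent, and the second at roughly one chance in six.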
So, I smelled bullshit.
Now, the list of successful resolutions has 106 entries. For simplicity, let’s talk only about CAND cases as they are the most abundant: 83 entries. Why list only 83 out of 178 subpoenas issued in CAND? OK, let’s assume that Comcast has not yet coughed up subscribers’ personal information in the latest batch of lawsuits (32 cases filed on 3/31/2016). Also, in 15 cases the judge denied early discovery. Still, 178 − 32 − 15 = 131, which is far greater than 83.
Cherry-picking is apparent.
Since cases dismissed with prejudice (i.e. settled) are not likely the ones that would involve a wrong jurisdiction, let’s only pay attention to those dismissed without prejudice. I identified 11 CAND cases that were not listed in Malibu’s report embedded above:
- Malibu Media v John Doe 76.102.251.41 (CAND 15-cv-04167) 9/11/2015-12/1/2015: 81 days
- Malibu Media v John Doe 98.234.120.117 (CAND 15-cv-04177) 9/11/2015-11/18/2015: 68 days
- Malibu Media v John Doe 24.5.113.76 (CAND 15-cv-04211) 9/15/2015-11/18/2015: 64 days
- Malibu Media v John Doe 67.169.187.169 (CAND 15-cv-04247) 9/17/2015-11/20/2015: 64 days
- Malibu Media v JOHN DOE 24.130.61.195 (CAND 15-cv-05384) 11/24/2015-3/1/2016: 98 days
- Malibu Media v JOHN DOE 67.169.83.102 (CAND 15-cv-05394) 11/24/2015-2/10/2016: 78 days
- Malibu Media v JOHN DOE 71.202.112.89 (CAND 15-cv-05398) 11/24/2015-4/1/2016: 129 days
- Malibu Media v JOHN DOE 76.21.69.22 (CAND 15-cv-05405) 11/24/2015-2/10/2016: 78 days
- Malibu Media v JOHN DOE 98.210.169.17 (CAND 15-cv-05407) 11/24/2015-2/10/2016: 78 days
- Malibu Media v JOHN DOE 98.248.144.132 (CAND 16-cv-01019) 2/29/2016-5/18/2016: 79 days
- Malibu Media v JOHN DOE 24.6.171.5 (CAND 16-cv-01621) 3/31/2016-5/19/2016: 49 days
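The day counts in the list above are plain calendar-day differences between the filing and dismissal dates. Here is a minimal sketch to re-derive them, assuming the dates are exactly as transcribed in the list (the `days_open` helper and the dictionary layout are illustrative, not taken from any docket tool):

```python
from datetime import date


def days_open(filed: date, dismissed: date) -> int:
    """Calendar days between filing and dismissal."""
    return (dismissed - filed).days


# Case number -> (filed, dismissed), copied from the list above.
omitted_cases = {
    "15-cv-04167": (date(2015, 9, 11), date(2015, 12, 1)),
    "15-cv-04177": (date(2015, 9, 11), date(2015, 11, 18)),
    "15-cv-04211": (date(2015, 9, 15), date(2015, 11, 18)),
    "15-cv-04247": (date(2015, 9, 17), date(2015, 11, 20)),
    "15-cv-05384": (date(2015, 11, 24), date(2016, 3, 1)),
    "15-cv-05394": (date(2015, 11, 24), date(2016, 2, 10)),
    "15-cv-05398": (date(2015, 11, 24), date(2016, 4, 1)),
    "15-cv-05405": (date(2015, 11, 24), date(2016, 2, 10)),
    "15-cv-05407": (date(2015, 11, 24), date(2016, 2, 10)),
    "16-cv-01019": (date(2016, 2, 29), date(2016, 5, 18)),
    "16-cv-01621": (date(2016, 3, 31), date(2016, 5, 19)),
}

for case, (filed, dismissed) in omitted_cases.items():
    print(f"{case}: {days_open(filed, dismissed)} days")
```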
Dismissal without prejudice means that the troll does not see any monetary sense in pursuing the case. There are many reasons for that: small businesses with open wi-fi, financial hardship (i.e., negative Turnip Test), military service etc.
Should we also add wrong ISP and/or wrong district to the list of reasons?
It is not a stretch to assume that after receiving the subpoena result from Comcast in one or more cases listed above, the troll couldn’t proceed because the putative defendant couldn’t be found in the Northern District.
An additional consideration: except 15-cv-05398, all the omitted cases were dismissed earlier than the other ones filed on the same dates, which also suggests a possibility of early error detection rather than a decision not to pursue the case based on investigation.
Also, Malibu listed the entries ordered predominantly by case number, and it is especially strange that the cases in question were excluded not from the beginning or the end of the list, but pulled from the middle, seemingly deliberately.
Without seeing the subpoena results, everything I wrote above is no more than speculation. I don’t have the power to request results for the omitted cases. However, judges do, and I really hope for a sua sponte order compelling Malibu to fill the gaps. Also, defense attorneys can petition judges to allow them to subpoena the ISP for very limited information: the municipality is enough.
In any case, apparent cherry-picking should raise judicial brows, and the troll must explain why those particular IP addresses (60% of total) and not the others were chosen to prove the accuracy.
Recommended reading on geolocation
- Fusion: How an internet mapping glitch turned a random Kansas farm into a digital hell by Kashmir Hill
- Politics & P2P: Geolocation — The Fools Tool by Andrew Norton
Update
12/6/2016
Today I was excited to discover Judge Alsup’s order denying (albeit without prejudice) ex-parte discovery in 53 CAND cases based on the same reasoning:
[…] Attorney Mosesi appended a spreadsheet to back up that data, but the spreadsheet omitted dozens of cases filed in this district alone.
It appears those cases were omitted because Malibu Media never received a response from the Internet service provider in those cases, but the failure to address so many cases in this district (and presumably elsewhere in California) casts significant doubt on counsel’s personal knowledge of the accuracy of the Maxmind database. Maxmind’s own statements of its accuracy, restated in counsel’s declaration, are hearsay. Malibu Media has failed to provide sworn evidence to support the reliability of the Maxmind database, which is necessary to show that this Court has personal jurisdiction over each of the defendants and that venue is proper here. Accordingly, Malibu Media’s motions are DENIED.
It is a gradual process but increasingly the federal judiciary is smelling the stink of the geolocation “evidence” and finding it wanting. Your calculations seem to be supported by the fairly recent Rightscorp foray into copyright trolling. Their client, Rotten Records, filed 18 lawsuits in NJ and PA by their attorney, Jordan Rushie, according to PACER. At least 1 of these 18 lawsuits resulted in the geolocation tech misidentifying the STATE (NY, not NJ), never mind the municipality or county. lol
Your site software must agree Raul is a really important person…it tells us his response to this post is “2 Responses” in the discussion, lol.
I think a defense attorney is about to have a field day!
WordPress counts pingbacks as responses, which does not negate your (correct) presumption of Raul’s importance.
It should probably be mentioned that there were some good articles earlier this year on the inaccuracies of geolocation data. MaxMind’s specifically, though other companies came in for some scrutiny as well.
http://fusion.net/story/287592/internet-mapping-glitch-kansas-farm/
http://fusion.net/story/290772/ip-mapping-maxmind-new-us-default-location/
http://ktetch.co.uk/2016/04/geolocation-the-fools-tool/
Of particular note is the quote from a MaxMind co-founder: “We have always advertised the database as determining the location down to a city or zip code level. To my knowledge, we have never claimed that our database could be used to locate a household.”
The articles were mainly about how poorly chosen default locations were sending people to unsuspecting people’s houses. In short, any geolocation tool that gives you anything more specific than a city, county, or zipcode, is probably full of shit.
Yes, Kashmir Hill’s article was truly fascinating, not only because of the information about location services, but also as an encouraging example of how a journalist can cause a change for the better: after the story was published, Maxmind changed its default locations to point to neutral places (e.g., the middle of a lake).
Thanks for the reminder, I’ll add a “recommended reading on geolocation” link section.