Why Internet Filters Don’t Work and Why Libraries That Filter Are Wrong
Librarian in Black | Sarah Houghton | May 7, 2010
The Washington State Supreme Court ruled yesterday, in a 6-3 decision, that public library internet filtering is not censorship because filtering is “collection development.” You can read more in Library Journal, on ReadWriteWeb, or read the actual court decision and the majority and dissenting opinions.
My reaction is simple, as someone who has fought, and won, an internet filtering challenge in my own library. Our communities’ intellectual freedom is at risk.
This is a huge step backward for intellectual freedom. And if we follow the logic in this case, the library is leaving its internet collection development up to an automated software system and some untrained minimum-wage lackeys at the filtering company. Filters are not collection development, and filters don’t work. My frustration at the decision-makers’ lack of education about these issues is immeasurable.
I posted comments on the LJ & RWW sites. Those comments are duplicated below. If you want to know more about filters, read on.
This is a gigantic issue for public libraries, and I have serious fears about what it means for the future of information access in our communities.
ReadWriteWeb’s coverage brought up the ethical argument against filtering. Just because someone is using a library computer, does that mean that he or she automatically has less access to information? It shouldn’t, and libraries are fighting for information access rights every day.
Besides the ethical argument against filtering, there are plenty of practical arguments. Namely, filters don’t work, they cost a lot of money, and they take a lot of time to operate.
I’m the Digital Futures Manager for the San Jose Public Library. A couple of years ago, a filtering challenge was brought by one of our city council members to the library. We were told to filter, we said no, and we embarked upon an extensive study about the effectiveness of filters, which you can find at: http://www.sjpl.org/sites/all/files/userfiles/agen0208_report.pdf. The overall results? Internet filtering software **does not work**.
Looking at our own library’s study as well as all of the published studies done in the last decade [**see the end of this post for a complete table**], it’s consistently found that 15-20% of the time, content is over-blocked (i.e. benign sites are blocked incorrectly). And 15-20% of the time, content is under-blocked (i.e. sites deemed “bad” get through anyway). We found that overall, filters have only about a 60-70% accuracy rate for traditional text content. Looking at all surveys of filtering accuracy from 2001-2008 (no studies have been done in 2009 or 2010 that I’m aware of), the average accuracy of all the tests combined was 78.347%, and that is measuring only text content, with only one study looking at images. If we think “well, filters get better over time, right?” and only look at studies from 2007-2008, we see a nominally higher accuracy percentage: 83.316%. So, while filters may be getting a little better…they’re still wrong 17% of the time for text content, and over half the time for image, video, and other non-text content. If you think about what that means practically speaking for your browsing experience, you may think: “We’re spending money and time on these systems why again?”
Filters simply do not work on multimedia content, which is usually what people think the filters are for (naughty videos and photos). The accuracy in filtering images, audio, video, RSS feeds, and social networking content is embarrassingly low: about 40%. That means that *over half the time*, the filter makes the wrong decision about blocking a photo or video. Again, why would we foist these failed systems willingly upon our communities?
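To make those percentages concrete, here is a minimal sketch that turns each accuracy figure quoted above into an expected number of wrong decisions. The accuracy values come from this post; the function name and the 1,000-request sample size are assumptions chosen purely for illustration.

```python
# Accuracy figures quoted in this post. Everything else here is an
# illustrative assumption, not data from any of the studies.
ACCURACY = {
    "text, all studies 2001-2008": 0.78347,
    "text, studies 2007-2008": 0.83316,
    "images, video, and other non-text": 0.40,
}


def expected_wrong_decisions(accuracy: float, requests: int) -> int:
    """Expected number of wrong block/allow decisions out of `requests`."""
    return round((1 - accuracy) * requests)


if __name__ == "__main__":
    requests = 1000  # hypothetical number of page requests
    for label, accuracy in ACCURACY.items():
        wrong = expected_wrong_decisions(accuracy, requests)
        print(f"{label}: about {wrong} wrong decisions per {requests} requests")
```

Under these assumptions, even the better text figure works out to roughly one wrong decision in every six requests, and the multimedia figure to roughly six in every ten.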
And how do filters work? There are automated little spiders crawling the web, looking for naughty content. Usually there’s a formula (which the companies will never tell you) that looks for some combination of trigger keywords, trigger URLs, too many images on the site, a weird combination of letters and numbers in the URL, etc. If the spider determines something fits in the “naughty” category, then there it goes. If the company is particularly vigilant (often not the case), they will have some minimum-wage untrained lackey spot-checking results from the spider. So if a filter constitutes collection development, we have left our online collection development in the hands of an automated software system and untrained non-library staff. Worse yet, the company won’t even tell us why or how they choose to categorize items. You usually do have the ability to add things to the white list (OK stuff) or black list (naughty stuff). But as subjectivity is key in issues of content, even among library staff, who gets to decide what is bad and what isn’t?
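Since the vendors keep their real formulas secret, the following is only a toy sketch of the kind of signal-counting described above, not any company's actual method. The trigger lists, the image limit, the threshold, and every name in it are hypothetical. It is here to show how such a formula can over-block a health page and under-block an image-only page at the same time.

```python
from dataclasses import dataclass

# Hypothetical trigger lists and limits, invented for illustration only.
TRIGGER_KEYWORDS = {"breast", "sex", "xxx"}
TRIGGER_URL_FRAGMENTS = {"xxx", "adult"}
MAX_IMAGES = 20
BLOCK_THRESHOLD = 2


@dataclass
class Page:
    url: str
    text: str
    image_count: int


def score(page: Page) -> int:
    """Count how many crude signals fire for a page."""
    words = set(page.text.lower().split())
    points = len(words & TRIGGER_KEYWORDS)
    points += sum(frag in page.url.lower() for frag in TRIGGER_URL_FRAGMENTS)
    if page.image_count > MAX_IMAGES:
        points += 1
    return points


def is_blocked(page: Page) -> bool:
    """Block a page once enough signals have fired."""
    return score(page) >= BLOCK_THRESHOLD


if __name__ == "__main__":
    health = Page(
        url="https://example.org/health/screening",
        text="Breast cancer screening and safe sex education for teens",
        image_count=3,
    )
    gallery = Page(
        url="https://example.org/gallery/photos",
        text="Photo set 12",
        image_count=15,
    )
    print("health page blocked:", is_blocked(health))
    print("image gallery blocked:", is_blocked(gallery))
```

In this sketch the health page trips two keywords and is blocked, while the gallery passes no matter what its photos show, because nothing in the formula looks at the images themselves.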
Also concerning: library customers report that they are usually unwilling to ask for something to be unblocked. They are embarrassed, because the library has automatically put them in the position of appearing to look at something “naughty” even when it isn’t. So how many of our library customers walk away without the information they need? And whose fault is that? Ours!
Beyond that, the time it takes for staff to unblock sites and handle the administrative paperwork to do so is incredible: many libraries estimate it at 60 minutes of staff time per request. The return on the dollars and time invested is negative. You lose when you install a filter.
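For a rough sense of scale, here is a small back-of-the-envelope sketch of the staff cost of unblock requests. Only the 60-minutes-per-request estimate comes from this post; the request volume and hourly staff cost are made-up placeholders that you would replace with your own library's numbers.

```python
MINUTES_PER_REQUEST = 60  # estimate quoted in this post


def annual_unblock_cost(requests_per_month: int, hourly_staff_cost: float) -> float:
    """Yearly staff cost of handling unblock requests, in the same currency
    as `hourly_staff_cost`."""
    hours_per_year = requests_per_month * 12 * MINUTES_PER_REQUEST / 60
    return hours_per_year * hourly_staff_cost


if __name__ == "__main__":
    # Hypothetical inputs: 25 requests a month at 30.00 per staff hour.
    print(annual_unblock_cost(requests_per_month=25, hourly_staff_cost=30.0))
```

That figure covers staff time only; whatever the library pays for the filter itself would come on top of it.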
And that’s the bottom line. Filters make the library lose money and time. Filters make the customers lose access, time, and confidence in the library’s use and relevance.
People who want to install filters in libraries have the best intentions (usually). They think that it will “protect the children” or “filter out pictures of penises.” Sadly, the technology has not caught up with our expectations for how it should work. People truly believe that filters work, but only because they haven’t looked at the research or tried one out themselves. If there were filters that didn’t overblock or underblock, I’d be the first in line to take a look at them. But the software is fallible. And turning over an entire community’s freedom of information access to a known-failed software system is just about the most foolish thing any library could choose to do.
Filtering Studies and Their Findings, 2001-2008 (no studies found in 2009 or 2010)
Average accuracy 2001-2008: 78.347%
Average accuracy 2007-2008: 83.316%
(Someone argued that if we count only recent survey results, the accuracy will be significantly higher, but it’s less than 5 percentage points higher, which is within the margin of error cited in all of these surveys.)
| Year | Study | Author or publisher (commissioned by) |
|---|---|---|
| 2008 | Protecting Children on the Net with Content Filtering | EU Safer Internet |
| 2008 | Closed Environment Testing of ISP-level Internet Content Filtering | Australian Communications and Media Authority |
| 2008 | Deep Throat Fight Club Open Testing of Porn Filters | Untangle |
| 2008 | Expert Report | Dr. Paul Resnick (for North Central Regional Library District) |
| 2007 | Report on the Accuracy Rate of FortiGuard | Bennett Haselton (for the ACLU) |
| 2006 | Expert Report | Philip B. Stark (for the DOJ) |
| 2006 | Websense: Web Filtering Effectiveness Study | Veritest (for Websense) |
| 2004 | Report on the evaluation of the final version of the NetProtect Product | Net-Protect.org |
| 2003 | Internet Blocking in Public Schools | Online Policy Group |
| 2002 | Corporate Content Filtering Performance and Effectiveness Testing Websense Enterprise v4.3 | eTesting Labs (for Websense) |
| 2002 | No Evil: How Internet Filters Affect the Search for Health Information | Kaiser Family Foundation |
| 2001 | Expert report of Dr. Joseph Janes | Dr. Joseph Janes (for the ACLU) |
| 2001 | Internet Filtering Accuracy Review | Cory Finnell for the Certus Consulting Group (for the DOJ) |
| 2001 | Updated Web Content Software Filtering Comparison Study | eTesting Labs (for the DOJ) |
| 2001 | Digital Chaperones for Kids | Consumer Reports |
| 2001 | Effectiveness of Internet Filtering Software Products | Paul Greenfield, Peter Rickwood, and Huu Cuong Tran (for the Australian Broadcasting Authority) |
| 2001 | Report for the European Commission: Review of Currently Available COTS Filtering Tools | Sylvie Brunessaux et al. |
My name is Sarah Houghton and I am the Director of the San Rafael Public Library (California), a two-library system serving our town of 60,000. And, of course, I write this Librarian in Black blog website thing, which has been around since 2003.