Ping (zestyping) wrote,

Factville: a new project.

I'm starting a new open source project. It's something i've been thinking about for quite a while now, and have mentioned to people here and there. Let me tell you a bit about it.

Exhibit A: a book on gender differences

A little while ago, i wrote about a newspaper article in the SF Chronicle. The subject of the article was a new book by Louann Brizendine called The Female Brain. On the cover of the book is a brain-shaped mass of white plastic telephone cord, the old kind that comes in a long springy coil — a visual wisecrack depicting the book's central claim that women are born communicators ("excess testosterone shrinks the communications center").

The book jacket lists several gender stereotypes as bullet points. One of them is a specific numerical claim: A woman uses about 20,000 words per day while a man uses about 7,000. Other sources give a wide range of numbers, from "7,000 vs. 2,000" to "50,000 vs. 25,000". Perhaps a lot of people believe women are inherently more talkative. But there doesn't seem to be much evidence to back this up. In fact, a recent study suggests that men and women talk about the same amount.

Nonetheless, Brizendine's claim was quoted all over the media. It made a huge impact (the book was a bestseller), and a considerable amount of time went by before it was debunked. To a casual observer, the claim probably doesn't even appear to be debunked at all: a reputable scientist says one thing, a little while later another scientist says the opposite — who's to say which is right? Another virtual throwing up of the hands, another shaking of heads, another anecdote about those silly academics who can never agree on anything.

Catching and recovering from misconceptions

Of course, this sort of thing goes on all the time. Brizendine, as i said, is a reputable scientist — she is a medical doctor and has been on the faculty at Harvard and UCSF. Plenty of facts and figures quoted in the media are presented by people who don't even claim to be scientists or to have evidence. Public misconceptions are pervasive, stubborn, and can be enormously costly.

When you come across a fact — or something that's claimed to be a fact — how do you know whether it's true? Maybe you Google for it; after all, the Web is somewhat more democratic than the TV and print media. But the Internet is also notoriously good at spreading rumours. Maybe you check Wikipedia, trusting its community editing process to do a good job of weeding out errors. Or perhaps you visit Snopes, hoping that the rumour you heard is common enough that someone there will have written an article about it, and you think the people who run that site are pretty decent at what they do.

On the other hand, Wikipedia and Google are a little too general: they may give you an article that's generally related to your topic, and then you need to examine it to see if it mentions the particular claim you want to check. And in both cases, the filtering process is hard to examine: Google's ranking algorithm is secret, and although at Wikipedia everything is public, you could spend weeks reading through the discussion pages trying to find out how a particular claim got inserted into the article. Snopes offers an excellent overview of each rumour, but there's only so much that two people can write. And of course you have to trust those two people.

An idea for a new service

So i think there's a useful service that could be provided by a new website: something with the openness and democratic participation of Wikipedia, but more focused on specific claims and the evidence for them. Thus Factville: a community-edited database of facts and supporting evidence. The site i have in mind would not be an alternative to Wikipedia, but rather a tool to help Wikipedians. A large part of the debating on Wikipedia consists of people gathering sources to support statements they want to put in the article; Factville could help them organize these sources and settle these debates. Factville would also be a tool for bloggers and journalists. When a controversial claim appears in the media, articles spring up all over, taking sides on the claim, quoting and citing sources to support their position. Why not have a place to gather the complete list of sources? Why not discuss them and rate them, the way the Web has taught us to discuss and rate photos, discuss and rate URLs, discuss and rate movies?

That's what Factville is about. It's going to be a Frankensteinian cross between Wiki-style websites (community-edited, completely freeform text, with a recorded history of changes to establish accountability) and Flickr-style websites (community-maintained, structured information, with tags, comments, and ratings). The big challenge will be to make this simple and easy to use. Here is the basic design:
  • The site is a database of claims.
  • Each claim has lists of supporting and refuting citations.
  • Each citation quotes from a source and explains how it relates to the claim.
  • Each claim can also have supporting and refuting arguments.
Each of these things is an editable page, with associated discussion and rating tools.

A source can be any kind of published work — a newspaper article, a conference paper, a video clip, a blog entry, etc. Some sources stand on their own (like Brizendine's book); others belong to a publication venue and rest partly on the venue's reputation (the credibility of an article in the New York Times is related to your opinion of its editing standards).

Citations and sources are separate things because the same source could be used for several claims, or even cited as evidence on both sides of the same claim (perhaps quotations excerpted from different parts of the same source). Information on sources could also be automatically drawn from the syndication feeds of popular publications.
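The structure described above can be sketched in plain Python. This is only an illustration under assumptions, not the project's actual code: the class names, field names, and the `citations_with` helper are all hypothetical, and the example data is placeholder text. It shows one source being stored once and cited on both sides of the same claim.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stance(Enum):
    """Which side of a claim a citation or argument is on."""
    SUPPORTS = "supports"
    REFUTES = "refutes"


@dataclass(frozen=True)
class Source:
    """A published work, stored once and shared by many citations."""
    title: str
    venue: str = ""  # empty when the source stands on its own, like a book


@dataclass
class Citation:
    """A quotation from a source plus a note on how it relates to a claim."""
    source: Source
    quote: str
    explanation: str
    stance: Stance


@dataclass
class Argument:
    """A reasoned case for or against a claim."""
    text: str
    stance: Stance


@dataclass
class Claim:
    statement: str
    citations: list = field(default_factory=list)
    arguments: list = field(default_factory=list)

    def citations_with(self, stance):
        """Return the citations on one side of the claim."""
        return [c for c in self.citations if c.stance is stance]


# Placeholder data: the source, claim, and quotes below are made up.
report = Source(title="Example Report", venue="Example Journal")
claim = Claim("Example claim")
claim.citations.append(Citation(
    report, "placeholder quote from one part",
    "appears to back the claim", Stance.SUPPORTS))
claim.citations.append(Citation(
    report, "placeholder quote from another part",
    "appears to undercut the claim", Stance.REFUTES))

supporting = claim.citations_with(Stance.SUPPORTS)
refuting = claim.citations_with(Stance.REFUTES)
print(len(supporting), len(refuting))
print(supporting[0].source is refuting[0].source)
```

Keeping `Source` separate from `Citation` means the details of a publication are entered once, and each citation only adds the quotation and the explanation that are specific to one claim.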

When a contributor wants to put together several sources or other claims on Factville, and combine them into a reasoned case for or against a claim, they can write an argument. Other visitors can rate the arguments up or down so that the most convincing arguments get the most attention.

The ratings of claims, citations, and arguments are not supposed to tell you what is true. They can only tell you about other people's opinions. But the goal is to give you as complete as possible a view of all the evidence, and to let the collaborative power of a large crowd help you find the most relevant factors to consider, as you make your own decision whether to believe each claim.
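One simple way to let up and down ratings decide which arguments get the most attention is to sort by net score. The post does not say how ratings would be combined, so the scoring rule here (up votes minus down votes) and the names `RatedArgument`, `rate`, and `rank` are assumptions made for illustration. As noted above, the resulting order reflects only what raters think, not what is true.

```python
from dataclasses import dataclass


@dataclass
class RatedArgument:
    """An argument with simple up/down tallies (hypothetical structure)."""
    text: str
    up: int = 0
    down: int = 0

    def rate(self, is_up):
        """Record one visitor's rating."""
        if is_up:
            self.up += 1
        else:
            self.down += 1

    @property
    def score(self):
        """Net score: up votes minus down votes (an assumed rule)."""
        return self.up - self.down


def rank(arguments):
    """Return arguments ordered from highest to lowest net score."""
    return sorted(arguments, key=lambda a: a.score, reverse=True)


# Placeholder arguments and votes, made up for the example.
a = RatedArgument("Argument A")
b = RatedArgument("Argument B")
for vote in (True, True, False):
    a.rate(vote)
for vote in (True, True, True):
    b.rate(vote)

ranked = rank([a, b])
print([arg.text for arg in ranked])
```

A net score is easy to explain but favours arguments that have simply been seen by more people; a real site might prefer a rule that accounts for the number of ratings.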

A modest start

I don't have a running website yet. I have a lot of ideas, some in my head and some written down, many in this journal entry. And i have a start at some code that implements the database structure i just described. Today i registered a new project at Launchpad, an open source project hosting service. You can monitor my progress on the Factville page there. The code i've written so far is available from that page. It's written in Python and runs on Django, which i'm still learning.