Free as in Freebase

It’s been a while since I’ve blogged about Freebase, the semantic web database maintained by Metaweb. But I recently had the chance to meet Freebasers Robert Cook and Jamie Taylor and hear them present to the New York Semantic Web Meetup on “Content, Identifiers and Freebase” (slides embedded above).

It was a fun and informative presentation. Perhaps the most surprising revelation about Freebase was that all of their data fits in RAM on a 32G box (yes, some of you caught me live-tweeting that during the presentation). Their biggest challenge is collecting good data that lends itself to the reconciliation needed to make Freebase useful as a data repository. Despite the lack of a near-term revenue model, the Freebasers are bullish about their approach: strong identifiers, strong semantics, open data. On the last point, almost all of Freebase is available under the Creative Commons Attribution License (CC-BY)–which, as far as I can tell, makes anyone free to develop a mirror of Freebase. Indeed, many people are using this data, including Google and Bing.

You might wonder whether Freebase is a business or a non-profit foundation–and the question did come up. The answer is that Freebase eventually expects to make money by providing services, e.g., helping advertisers. They see their graph store as a competitive advantage–but they freely admit that this advantage will erode over time. Indeed, the surprisingly small size of their graph makes me wonder how much speed and scalability matter, compared to the challenge of data scarcity.

I’d like to see Freebase succeed. I’m particularly a fan of the work David Huynh has done there on interfaces for semantic web browsing. Clearly their investors are true believers–Metaweb has raised a total of $57M in funding. I don’t quite get it, but I’m happy we can all benefit from the results.
