spotsearch

Friday, June 29, 2007

PowerSet's PowerLabs Demo Day, 6/28/2007



Finally, PowerSet opens up! They held a demo day for bloggers and users who signed up for their PowerLabs web site, and revealed some of their thoughts. They've expressed a very refreshing attitude of openness, which is, ironically, quite unusual for a search engine company. PowerSet is making use of existing technologies where they can. For instance, they use Wikipedia data, Freebase semi-structured data, and WordNet. They have also announced that they will use Ruby on Rails. Currently their main test index is fairly small (Wikipedia), but they plan to scale up, with a focus on verticals.

Their main competitive edge is their linguistic approach to search. By augmenting their index with additional linguistic information, they hope to answer search queries more accurately, especially question-and-answer queries. Cofounder Steve Newcomb says that Moore's Law is on their side: currently, the linguistic parsing they do during indexing is quite expensive, on the order of 1 second per page. However, machines are getting faster every year and code can continually be optimized, which should make their approach feasible and practical. For their processing they use both their own 720-CPU-core cluster and the Amazon Elastic Compute Cloud (EC2), where they use about 1000 CPU cores.
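To get a feel for what "1 second per page" means at scale, here is a rough back-of-envelope sketch. The per-page cost and core counts are the figures quoted at the demo; the corpus sizes are my own hypothetical round numbers, and the calculation assumes perfect parallelism across all cores, which a real cluster won't achieve.

```python
# Back-of-envelope estimate of indexing time. Corpus sizes are hypothetical.
SECONDS_PER_PAGE = 1.0   # parsing cost quoted at the demo (order of magnitude only)
OWN_CORES = 720          # PowerSet's own cluster
EC2_CORES = 1000         # approximate number of cores used on Amazon EC2


def indexing_hours(num_pages, cores, seconds_per_page=SECONDS_PER_PAGE):
    """Wall-clock hours to parse num_pages, assuming perfect parallelism."""
    total_cpu_seconds = num_pages * seconds_per_page
    return total_cpu_seconds / cores / 3600.0


total_cores = OWN_CORES + EC2_CORES

# Hypothetical corpus sizes: a Wikipedia-scale test index vs. a web-scale index.
for label, pages in [("2 million pages", 2_000_000),
                     ("1 billion pages", 1_000_000_000)]:
    print(f"{label}: {indexing_hours(pages, total_cores):.1f} hours")
```

Under these assumptions, a couple of million pages takes well under an hour, while a billion pages takes most of a week. That gap is presumably why faster hardware and code optimization matter so much to their plans.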

An interesting experiment on their part is PowerLabs. They view it as a way to build a community that gets early access to their search engine and, in return, provides early feedback during its development. I'm a believer in getting early feedback and rapid iteration, so I think they're on the right track with that.

Their demo was impressive. However, it was a series of "canned" queries. They unfortunately didn't take random audience queries or let anyone play directly with their search engine, so at this time it is quite difficult to judge how strong or weak their technology is. I think it will take quite a long time for them to perfect the system, as they also have to handle standard short web queries well, in addition to their core competence (question-and-answer queries). One weak area is that they currently don't do link analysis for ranking, though they do intend to have some sort of link analysis incorporated by the time they launch. I suspect they will have a significant amount of work to do in tuning their ranking algorithms. They also mentioned that they intend to create snippets augmented with metadata. For instance, if the result had the name of a famous person, they might add a parenthetical bit of metadata about the person: George W. Bush (President of the United States).
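The metadata-augmented snippet idea is easy to picture with a toy sketch. This is only my illustration of the concept, not how PowerSet does it: the lookup table and the `annotate_snippet` function are hypothetical, and a real system would presumably draw its descriptions from a source like Freebase and use proper entity recognition rather than exact string matching.

```python
import re

# Hypothetical metadata table, standing in for a real knowledge source.
ENTITY_METADATA = {
    "George W. Bush": "President of the United States",
}


def annotate_snippet(snippet, metadata=ENTITY_METADATA):
    """Add a parenthetical description after the first mention of each known name."""
    for name, description in metadata.items():
        # A function replacement avoids any escape handling in the description text.
        snippet = re.sub(
            re.escape(name),
            lambda match, d=description: f"{match.group(0)} ({d})",
            snippet,
            count=1,
        )
    return snippet


print(annotate_snippet("George W. Bush spoke to reporters on Thursday."))
```

Only the first mention is annotated, so a snippet that repeats the name doesn't get cluttered with the same parenthetical over and over.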

Overall, it was an interesting Demo Day. I liked their attitude of openness and community-building. However, whether they succeed or not will come down to how good their technology is, and whether they can get people to switch. Building a good demo is one thing, and a good first step. Making it work in the real world on a huge range of real user queries is another. I look forward to trying out PowerSet soon.



For another take on the event, see OnoTech's post.

Thursday, June 28, 2007

Spotsearch is back!

Hi,

There are too many things I want to blog about, so Spotsearch is now back!

PowerSet is having a demo day today in their San Francisco offices. They have promised great things: a search engine to take on Google and a fresh "open" approach to sharing information (are they the Anti-Google?). Will they live up to their hype? Only time will tell.