Livin’ on the Edge

That’s the title of a good Aerosmith song; not the best, but still quite good… I think that the edge is also where search needs to go. Google is trying to do it in one centralized location using hundreds of thousands of PCs, but search will only scale if you do it on the edge.

Actually, that’s not quite true. We couldn’t be where we are today without Yahoo!, Google, Microsoft, and those that came before them. But we’re now in the era of Web 2.0, and while a whole bunch of companies are sprouting up touting AJAX-y features that they claim will disrupt life as we know it, I think Web 2.0’s huge disruptions are still to come. We have yet to see the Netscape IPO of Web 2.0. I think it’s time to migrate the search knowledge we’ve acquired so far to the edge of the network.

In fact, I think that search now belongs on the client as an application that sits there, observes, and learns from your behavior. Let’s be honest: Search 1.0 tries to divine your hopes, fears, wants, and needs through a lousy text box as the interface between you and it. If it gets lucky, you might throw a few more words at it, but usually only because it was utterly incompetent at finding what you wanted in the first place. Worse, if you get exasperated and move on to a different search service, it never learns whether it succeeded or failed.

Now, a search application on the client knows a lot more about you, since it has full and undisputed access to 100% of your clickstream. It knows that you were listening to Concrete Blonde when you launched your browser, googled the band, read up on them at Wikipedia, and noticed that Johnette Napolitano had released an album called ‘Sketchbook 2’. You pointed your browser to Amazon, where you searched for Johnette Napolitano and came up with absolutely nothing. You went back to Google, queried ‘Johnette Sketchbook 2’, and found what you were looking for right here, and promptly made the purchase.
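To make that concrete, here is a minimal sketch, in TypeScript, of the kind of client-side clickstream log such an application might keep. Every name here (`ClickstreamEvent`, `ClickstreamLog`, the event shape) is hypothetical and for illustration only; the recorded session simply mirrors the Concrete Blonde story above.

```ts
// A hypothetical sketch of client-side clickstream logging; not any real
// product's API. The client records every action locally, so it can later
// connect a failed Amazon search to the Google query that succeeded.

interface ClickstreamEvent {
  timestamp: number;                        // when it happened (ms since epoch)
  site: string;                             // where the user was
  action: "search" | "visit" | "purchase";  // what the user did
  detail: string;                           // query text, page title, or item
}

class ClickstreamLog {
  private events: ClickstreamEvent[] = [];

  record(site: string, action: ClickstreamEvent["action"], detail: string): void {
    this.events.push({ timestamp: Date.now(), site, action, detail });
  }

  // The full stream stays on the client, available for local mining.
  history(): readonly ClickstreamEvent[] {
    return this.events;
  }
}

// The session described above, as the client application would see it:
const log = new ClickstreamLog();
log.record("google.com", "search", "Concrete Blonde");
log.record("wikipedia.org", "visit", "Concrete Blonde");
log.record("amazon.com", "search", "Johnette Napolitano");   // no results
log.record("google.com", "search", "Johnette Sketchbook 2"); // success
log.record("store", "purchase", "Sketchbook 2");
```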

Search 2.0 looks something like this: clickstream logging and aggregation happen on the client. I can also bookmark and tag items, stored either locally or on the network, and I can share this information with others (or not). The sharing options can be more nuanced than just public or private: perhaps my privacy policy taps into my social network, so that only friends and friends of friends can view my favorites. Whatever… No matter what, the key is this: I can store 100% of this information right here on my own machine. I *own* my data, and I can be sure of it, because the code I’m running is open source, so I don’t have to worry about spyware. If I choose to share the data, I can do so freely, since the client application supports common P2P protocols.
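As a sketch of what that social-network-aware privacy policy could look like, here is one hedged take in TypeScript. `SocialGraph`, `Visibility`, and `mayView` are invented names, and a real policy would surely be richer; this only shows the friends and friends-of-friends check described above.

```ts
// Hypothetical visibility levels for a bookmarked, tagged item.
type Visibility = "private" | "friends" | "friends-of-friends" | "public";

class SocialGraph {
  // Adjacency list: user -> set of direct friends.
  private friends = new Map<string, Set<string>>();

  addFriendship(a: string, b: string): void {
    if (!this.friends.has(a)) this.friends.set(a, new Set());
    if (!this.friends.has(b)) this.friends.set(b, new Set());
    this.friends.get(a)!.add(b);
    this.friends.get(b)!.add(a);
  }

  isFriend(owner: string, viewer: string): boolean {
    return this.friends.get(owner)?.has(viewer) ?? false;
  }

  isFriendOfFriend(owner: string, viewer: string): boolean {
    for (const f of this.friends.get(owner) ?? []) {
      if (this.isFriend(f, viewer)) return true;
    }
    return false;
  }
}

// Decide whether a viewer may see one of the owner's shared favorites.
function mayView(graph: SocialGraph, owner: string, viewer: string,
                 visibility: Visibility): boolean {
  switch (visibility) {
    case "public":  return true;
    case "private": return viewer === owner;
    case "friends": return viewer === owner || graph.isFriend(owner, viewer);
    case "friends-of-friends":
      return viewer === owner ||
             graph.isFriend(owner, viewer) ||
             graph.isFriendOfFriend(owner, viewer);
  }
}
```

The point of the sketch is that the policy runs on my machine, against my copy of the graph; nothing has to leave the client unless `mayView` says it can.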

Now maybe I want the convenience of storing my data on the network so I can use and retrieve it wherever I am. The choice is mine. Perhaps I pay a monthly fee to have a trusted third party host that data for me; perhaps I willingly give up some of my data to cover hosting costs; or, if I really have nothing to hide, I might even try to make a buck by selling every piece of data I produce. I repeat: the choice is mine.

As for a business model, there’s one in running these “edge boxes”: you either host data for users for a fee, or you buy their data, aggregate it, slice and dice it, and sell it to corporations after stripping off the identity or, for those few brave souls, along with it.

It’s time to move search to the edge. When a new programming language surfaces, the first compiler is written using an older established language. The language comes to life when you can finally write a compiler for that language using the language itself. Extending the metaphor, Search 1.0 has provided us with a “new programming language.” It’s time to write the compiler.