Our social search approach – Part II

June 20, 2007

Continued from Our social search approach – Part I

The most damaging phrase in the language is: “It’s always been done that way.”
— Grace Murray Hopper

Picking up where I left off, the basic concept of a “social search-marking” service (a loose term for integrated social bookmarking and social search) came to my mind almost four months ago. This concept was not the best thing since sliced bread, nor was it rocket science to get started with. However, the benefits of a “social search-marking” service outweighed the shortcomings I could foresee. Of course, there are already a couple of big players in social bookmarking (especially with Yahoo buying out Del.icio.us), and a number of interesting social search engines are cropping up.

So, I embarked on a research spree, mainly investigating the feasibility of “social search-marking”, existing market players with at least 80% similarity to what I had in mind, the revenue structure for such a service (profit-making is essential for any venture), and the technology infrastructure needed.

After 3-4 weeks of extensive research and analysis, I had a good picture of what was involved in developing, launching and running a “social search-marking” service. However, not everything aligned with my pre-research plans. I came across a number of bottlenecks that I had to rethink and adjust for. The two most important findings from my research were:

1. There were no existing web-based services or software tools that combined social bookmarking and social search, although the benefits of such a combination were quite clear to me, at least from a mental visualization and some rough paper sketches (on coffee-shop paper napkins). Overall, I looked into the offerings and business models of several “social-driven” and “user-generated” services.

2. I also learnt that the technology infrastructure needed to build, test and reliably run this service couldn’t be based on an old-school client-server architecture and a sub-standard application hosting environment. By then, I was sure that the application side was essentially a web-based, data-driven application plus a small browser extension (to assist in browser-based access to the application and, most importantly, search engine integration). The hardware, network backbone and resource management side was a different story. Imagine a tiny (and safe) plugin running in your browser and pinging a web service on a remote server for every Google or Yahoo search you make, to look up relevant bookmarks across your social search network. That would be very resource-intensive for the web server alone, hitting its CPU, RAM and bandwidth hard, and eventually its performance and stability. We had to address all such issues without escalating the budget.

By the end of March, I had concluded from my research that a “social search-marking” service has potential: it will add value to the “online bookmarking and web search” domain, it has prospective users, it can be profitable, and it needs quite a lot of technology infrastructure planning before we can even think of getting it off the ground. By mid-April, we were busy! We worked in parallel: doing further research (10% time allocation), initiating the system development (10% time allocation), and planning the core architecture for the service (80% time allocation). We applied the 80-20 rule throughout the project plan, so the majority of our time and effort went into the most critical aspects of the system. Once development overtook planning, we strictly followed the guideline to “release early, release often”. That meant test-driven development, a lot of prototyping, prioritizing the tasks for the first release, and aligning only the required processes with the intended objectives. We had to stay focused. I had actually planned to launch a blog to jot down the project progress and “lessons learnt” as we went along, but that didn’t happen until recently. I feel that knowledge management goes a long way.

The entire architecture could be broken into three layers:

1. A data-driven web application – for online bookmarks management and social search networking;

2. A web browser addon – a thin-client to assist in browser-based access to the application, and most importantly search engine integration;

3. A web service – to act as the broker interface between the browser addon and the remote DAL (database abstraction layer).

We chose the LAMP stack for all web development because it involved no learning curve for us and offered strong performance along with scalability and flexibility in solution design. Instead of writing the web application from scratch, we based its “hull” on an in-house PHP/MySQL framework that I’ve used for several projects. This gave us a very flexible, time-tested base to work with. Special consideration was given to the database abstraction layer, since this would be a demanding application with very frequent database hits. A caching component and a tuned database configuration gave us a stable backend. I’m sure we’ll have to work on this further once the application is in production, based on real-world usage analysis. Trust me, no amount of stress testing can prepare you for a flood due to the Slashdot effect!
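To make the caching idea a little more concrete, here is a minimal sketch of a read-through cache sitting in front of a database lookup. It is written in Python purely for illustration (the framework itself is PHP/MySQL), and every name in it (`ReadThroughCache`, `loader`, `ttl_seconds`) is an assumption made up for this example, not part of the actual CoReap code.

```python
import time


class ReadThroughCache:
    """Tiny in-memory read-through cache with a per-entry time-to-live.

    Hypothetical illustration only: `loader` stands in for whatever
    function the database abstraction layer exposes to fetch a value.
    """

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader      # called on a cache miss
        self._ttl = ttl_seconds    # how long an entry stays fresh
        self._clock = clock        # injectable so it can be tested
        self._entries = {}         # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Fresh entry: serve it without touching the database.
            self.hits += 1
            return entry[1]
        # Missing or expired: load from the database and remember it.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop an entry, e.g. after the underlying row is updated."""
        self._entries.pop(key, None)
```

With something like this in place, repeated lookups for the same key within the time-to-live window are answered from memory, and only the first one (or the first one after expiry) reaches the database.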

The browser addon was inherently browser-dependent, which meant that we had to develop one extension for the Mozilla breed (Firefox/Netscape/Flock) and another for Internet Explorer. Based on some user preference metrics, we decided to focus on the former (Firefox) and keep the latter (IE) for a future release. The CoReap Firefox extension, which will ideally be a browser-pane frontend to the CoReap web application, mainly assisting in search engine integration (through AJAX), had to be a pure JavaScript and XUL solution. To improve client-side performance and reduce overhead (server requests), we also made good use of Firefox’s built-in support for SQLite (an embeddable, zero-configuration SQL database). While remotely hosted web applications have a native security model, we made sure that the browser addon follows the same policy and maintains high levels of stability and client-side safety. The Greasemonkey tool came in quite handy for rapid prototyping of the basic functionality of the Firefox extension. We knew that once the Firefox extension took shape, we’d be able to extend the same feature set to IE (although building it in .NET).
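As a rough illustration of how a local SQLite store can cut down on server requests, the sketch below keeps a copy of bookmarks on the client and answers keyword lookups from it. It uses Python’s standard `sqlite3` module rather than the extension’s JavaScript/XUL, and the table layout and function names (`open_local_store`, `save_bookmarks`, `find_bookmarks`) are assumptions invented for this example, not the real CoReap schema.

```python
import sqlite3

# One keyword must appear in either the title or the tags.
# The backslash is declared as the LIKE escape character.
_MATCH = "(title LIKE ? ESCAPE '\\' OR tags LIKE ? ESCAPE '\\')"


def open_local_store(path=":memory:"):
    """Open (or create) the local bookmark store. Schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bookmarks ("
        " url TEXT PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " tags TEXT NOT NULL DEFAULT '',"
        " owner TEXT NOT NULL)"
    )
    return conn


def save_bookmarks(conn, rows):
    """Insert or refresh (url, title, tags, owner) rows in one transaction."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO bookmarks (url, title, tags, owner)"
            " VALUES (?, ?, ?, ?)",
            rows,
        )


def _like_pattern(term):
    # Escape LIKE wildcards so user input is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return "%" + escaped + "%"


def find_bookmarks(conn, query, limit=10):
    """Return (url, title, owner) rows matching every keyword in `query`."""
    terms = query.split()
    if not terms:
        return []
    sql = (
        "SELECT url, title, owner FROM bookmarks WHERE "
        + " AND ".join(_MATCH for _ in terms)
        + " ORDER BY title LIMIT ?"
    )
    params = []
    for term in terms:
        pattern = _like_pattern(term)
        params.extend([pattern, pattern])
    params.append(limit)
    return conn.execute(sql, params).fetchall()
```

In a setup like this, the remote web service only needs to be contacted to refresh the local copy, not on every single search.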

One of the most interesting features of CoReap, our social search approach, is that it integrates seamlessly with existing search engines. So, you have access to conventional web search results (as you do now, although they are rarely looked at beyond the first few pages) and to bookmarks (personally recommended resources) from the social search network that you build around your friends and known experts, all organized on the same search results page! Our search engine integration model was designed to be highly flexible, so that the social search results (bookmarks by you and your friends) can be embedded in any web search engine or platform. However, to keep things simple and cater to the largest possible audience, the first release of CoReap will only support Google search (around 43% market share) and Yahoo! search (around 28% market share). While I’m on it, I should mention that CoReap is neither affiliated with, nor endorsed by, Google Inc. or Yahoo! Inc. We do plan to provide an API for CoReap, so that platform-independent integration is not an issue. I’ll post more about the roadmap and timeline in the coming days.
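One possible way to picture “web results and bookmarks on the same page” is sketched below in Python. It is only an assumption about how such a merge could work, not a description of CoReap’s actual integration: web results that someone in your network has bookmarked get annotated with who recommended them, and the remaining bookmarks are returned as a separate social-only list. The types and the `combine` function are hypothetical names, and the URL normalization is deliberately simplistic.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class WebResult:
    url: str
    title: str
    recommended_by: List[str] = field(default_factory=list)


@dataclass
class Bookmark:
    url: str
    title: str
    owner: str


def normalize(url: str) -> str:
    # Simplistic on purpose: trims whitespace and trailing slashes only.
    return url.strip().rstrip("/")


def combine(
    web_results: List[WebResult], bookmarks: List[Bookmark]
) -> Tuple[List[WebResult], List[Bookmark]]:
    """Annotate web results with recommenders; return leftover bookmarks."""
    by_url = {}
    for bookmark in bookmarks:
        by_url.setdefault(normalize(bookmark.url), []).append(bookmark)

    annotated = []
    seen = set()
    for result in web_results:
        key = normalize(result.url)
        owners = [b.owner for b in by_url.get(key, [])]
        annotated.append(WebResult(result.url, result.title, owners))
        seen.add(key)

    # Bookmarks the search engine did not return are shown separately.
    social_only = [b for b in bookmarks if normalize(b.url) not in seen]
    return annotated, social_only
```

The search engine’s own ranking is left untouched here; the social layer only adds annotations and an extra list alongside it.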

Last but not least, we have spent (and are still spending) a lot of time working out a managed hosting environment for CoReap. A hybrid hosting solution with dedicated resources (i.e. isolated server resources and a strong network backbone) was the only way to go. This will also allow us to scale up as and when demand increases. More on this later.

To be continued in part III …

