Octazen + Facebook: 5 Interesting Facts

Last week Facebook made a talent acquisition of a two-person Malaysian startup called Octazen, which builds functionality that lets users of a social network import their friends from other sites. Here are 5 things I learned after reading about this on TechCrunch, GigaOm, and Quora.

1. Octazen and Facebook go a long way back. I knew that Facebook had offered contact importing since its early days, but it seems that they were using Octazen all along. One would have thought that they had built their own. According to GigaOm:
Facebook last week acquired a small Malaysian startup called Octazen Solutions, [...] that the social network had already been using to grow its number of users

2. There's a vibrant industry around contact importing. It includes the companies Octazen, Improsys, and CloudSponge, as well as the open source project OpenInviter.

3. The contact-importing space is semi-legit. Besides clearly breaching the TOS of the services hosting the data, the contact-importing companies (like Octazen) seem to have been in a constant cat-and-mouse game with the mail providers and others who host the data, such as Microsoft Hotmail and Yahoo Mail. Here is one account from a TechCrunch commenter:
when we used them yahoo will ban us often, the way around is to use lot of servers (with different hosting providers) with multiple ips and rotate the ips (as they get banned). This was basically a cat/mouse game.

4. Contact importing is a widely accepted practice. Improsys lists as its clients MySpace, Orkut, Photobucket, iLike, StumbleUpon, and a who's who of the Web 2.0 social networking players. View the full list here.

5. Facebook will benefit from Octazen's acquisition on multiple fronts. These include gaining two domain experts in distributed scraping technologies, preventing other companies from using Octazen's technology, deterring people from scraping Facebook's user data, and maybe even reducing its licensing costs (in case a significant increase in usage was projected). A knowledgeable user on Quora answers the question "Why did Facebook acquire Octazen?" with the following 8 points:
    1. disable the ability for loads of 3rd party sites from benefitting from value propositions that defeat the point of FB Connect (ie shutdown available tools that enable quickly building up a portable social graph that don't depend on FB Connect)
    2. hire the leading experts in how to build distributed systems that can get around rate limits
    3. hire the experts in data scraping techniques (which can be used to help lockdown FB data from similar experts - which helps ensure that FB is a walled garden) 
    4. hire expert h4ck3r5 to assist security team efforts
    5. onboard experts who can help optimize address book importing tools
    6. potentially keep past business relationship discressions private (unlikely they are doing anything tons of other non-publicly traded sites aren't already doing - ie breaking ToS)
    7. hire extremely competent engineering talent
    8. potential patent 
Analysis

The fact that a contact importing industry exists tells us two things: companies are interested in accessing user data from other services, and they want to protect user data on their own services.
If there were easy-to-use, all-you-can-eat APIs for extracting user data, social networks would not have to resort to the likes of Octazen. To me this emphasizes the importance of initiatives like the data portability project. Clearly data is an important competitive barrier that benefits the company hosting it. However, not all data is the same. It makes sense for first-level user data to be set free and made conveniently portable, to be used wherever a user wishes. A company can still keep within its walls other second-level statistical data that can help it improve its own product and provide a better user experience. So for example:
  • Gmail can allow users to export their email contacts along with how frequently they message each one. And it can keep data about how often users log in, what percentage of emails are unread at any given moment, on which dates each message was sent out, etc.
  • Facebook can allow users to save out their name, family, date of birth, and other information to any other site with the click of a button, as long as the user clearly knows what is happening. And it can retain data around how often a user views a friend's profile, when friends were added, and so on. Such exporting is currently not permitted by the Facebook developer terms of use.
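To make the split between first-level and second-level data concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the field names, the `UserRecord` type, and the `export_first_level` helper are hypothetical and do not correspond to any real Facebook or Gmail API.

```python
from dataclasses import dataclass, asdict

# Hypothetical field names, chosen only to mirror the Facebook example above.
FIRST_LEVEL_FIELDS = {"name", "family", "date_of_birth"}


@dataclass
class UserRecord:
    # First-level data: belongs to the user and should be portable.
    name: str
    family: list
    date_of_birth: str
    # Second-level data: usage statistics the host keeps for itself.
    profile_views_by_friend: dict
    friend_added_dates: dict


def export_first_level(record: UserRecord, user_consented: bool) -> dict:
    """Return only the portable fields, and only with the user's consent."""
    if not user_consented:
        raise PermissionError("user has not approved the export")
    return {
        key: value
        for key, value in asdict(record).items()
        if key in FIRST_LEVEL_FIELDS
    }
```

The allow-list is the point of the sketch: a field is exported only if it is explicitly marked as first-level, so any statistics the host adds later stay inside its walls by default.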
One particular environment where user data was, and still is, walled off is IM. Sites like Meebo and Imo.im started by connecting users to their IM networks from a unified web interface. While this was clearly against the official terms of use of AOL, MSN Messenger, etc., it is becoming a more and more accepted practice, and the web IMs are winning out. Established networks are interoperating. Jabber is being adopted. And data is similarly likely to be freed on the rest of the web.