The Data Wars Revisited

Companies are increasingly using data as swords and shields.
Jul | 7 | 2015

 


Nearly eight years ago, in a great Wired piece, Josh McHugh asked the question, “Should Web Giants Let Startups Use the Information They Have About You?” The article examines the pros and cons of allowing small companies to scrape data.

In the nearly eight years since the publication of that piece, scraping data has remained a controversial practice. To be sure, there’s significant demand for tools that pull data from websites and return it in usable formats. Startups such as Grepsr, Krakio, PromptCloud, and import.io¹ allow non-technical users to grab data en masse from websites and create customized application programming interfaces (APIs). Put differently, these tools go well beyond old-school copying and pasting.

import.io screenshot.

For those with mad Python chops, libraries such as Beautiful Soup and Scrapy can typically go well beyond what WYSIWYG scrapers can do.
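To make "pulling data from a page and returning it in a usable format" concrete, here is a minimal sketch. To stay self-contained it uses Python's standard-library `html.parser` rather than Beautiful Soup or Scrapy, both of which offer far richer selectors and crawling features. The HTML fragment, the `product` and `price` class names, and the `ProductParser` and `scrape_products` names are all hypothetical, invented purely for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page fragment (an assumption for this sketch). A real scraper
# would fetch the page over HTTP and should respect the site's terms of service.
SAMPLE_HTML = """
<ul>
  <li class="product"><a href="/widgets/1">Blue Widget</a> <span class="price">$9.99</span></li>
  <li class="product"><a href="/widgets/2">Red Widget</a> <span class="price">$12.50</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collect one dict per <li class="product"> element."""

    def __init__(self):
        super().__init__()
        self.products = []    # finished records
        self._current = None  # record being built, or None outside a product
        self._field = None    # field that the next text node belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "product":
            self._current = {"name": "", "url": "", "price": ""}
        elif self._current is not None:
            if tag == "a":
                self._current["url"] = attrs.get("href", "")
                self._field = "name"
            elif tag == "span" and attrs.get("class") == "price":
                self._field = "price"

    def handle_data(self, data):
        # Only keep text that sits inside a field we are tracking.
        if self._current is not None and self._field:
            self._current[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._field = None
        elif tag == "li" and self._current is not None:
            self.products.append(self._current)
            self._current = None


def scrape_products(html):
    """Return a list of {"name", "url", "price"} dicts found in the HTML."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    return parser.products
```

Calling `scrape_products(SAMPLE_HTML)` returns a list of dictionaries, one per product, which could then be written to CSV or served as JSON. That step from markup to structured records is roughly what the point-and-click tools automate.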

Across the aisle, many companies view scraping their data as a tremendous threat. They’re not wrong. For instance, the practice represents one way to get yourself banned from Facebook. Zuck understandably doesn’t want people gobbling up reams of Facebook data, without question one of his company’s most valuable assets.

The Larger Trend


I’m not going to argue the merits and demerits of scraping here. I do, however, want to call attention to the larger trend. The data wars are not confined to popular sites such as Facebook and Google. The battle for data is becoming increasingly bloody. What’s more, it’s manifesting itself in decidedly unsexy areas such as HR software. (See my post earlier this month on the Zenefits-ADP scuffle.)

A company that wants to protect its data has at least three options:

  • Build a custom or proprietary API. This is no longer the sole purview of tech behemoths.
  • Build a data moat, something that Netflix, Amazon, and Facebook have effectively done.
  • Close or limit access to its API. Many have done this, including Twitter and LinkedIn. Yes, developers can violate the terms of an API and get slapped for doing so.

Simon Says: The data wars have arrived.

To be sure, there are pros and cons with all strategies. For instance, option three might “protect” data, but it’s going to earn the ire of developers and users. Wooing developers and partners by opening “platforms” and APIs is standard practice at the beginning. As companies grow, however, they start to restrict access to their APIs.

In The Age of the Platform, there are no simple answers.

IBM paid me to write this post, but the opinions in it are mine.

Footnotes

  1. Read my interview with import.io CEO David White here.


2 Comments

  1. Veronica Pullen

    Hey Phil, I was just alerted to your post in my daily alert from Mention, and I wanted to drop by and say thank you for linking back to my post on getting banned from Facebook.

    Since I wrote that post, Facebook have now removed the ability to scrape data completely, by withdrawing access to their API from the scraping software I mentioned. If you try and target an ad to a previously scraped audience, your ad will be denied immediately too.

    Great post, and thank you again for quoting my post.

    Warm regards
    Veronica

    • Phil Simon

      You know more about it than I do, but that doesn’t surprise me. You can’t even copy and paste something for the most part.
