Saturday, September 10, 2011

Diffbot sees the web as people do, and is now free for developers

Diffbot is a geeky and incredibly interesting technology that uses robots, algorithms, computer vision and artificial intelligence to process content on the web the way a human can. "The whole Internet can be broken down into 30 different types of pages," explains co-founder Mike Tung, also known as "Diffbot Mike," and "Diffbot can identify them all." Diffbot knows the difference between a social network profile, a blog, a home page, a product page, an events page and dozens more.

Diffbot today released its first set of APIs, which are now open to all developers for free. The launch could significantly affect the types of applications that developers can create, and for consumers, it means that a variety of intelligent applications is about to arrive.

New APIs: On-Demand & Follow

With the two APIs available now, developers can create applications that automatically extract values from pages, applications that understand what is trending and who is saying it, applications that provide RSS feeds where none were available previously, and apps that present only the relevant parts of a web page, ignoring the ads, header, and footer copy.

And that's just for starters. Future APIs will enable developers to automatically turn event pages into calendar appointments and social network profiles into vCards, or to automatically extract shipping prices or reviews from product pages, among other things. Although Diffbot does not have an established roadmap, it is looking to launch these new APIs over the next several months.

Today, the first two APIs available are:

On-Demand API: This API covers the page types "Frontpage" and "Article." The first is used to analyze a site's home pages and index pages, using common layout markers such as headlines, photos, articles, ads, etc. The Article API extracts clean article text, photos and tags. (For an example, see Readability.)

Follow API: This is used to track changes and updates to any web page. Diffbot automatically determines the portion of the page that a developer wants to follow and extracts metadata such as titles, images, text, summaries and more, and then segments the page into relevant sections (see photo).
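To make the idea of following a page concrete, here is a minimal Python sketch of change tracking between two snapshots of a page's text. It is not Diffbot's method, which relies on computer vision to pick out the relevant part of the page. It only compares plain text line by line with the standard library's difflib, and the function name and sample strings are made up for illustration.

```python
import difflib
from typing import List


def added_lines(old_text: str, new_text: str) -> List[str]:
    """Return the lines present in new_text that were not in old_text.

    Hypothetical helper: a crude, text-only stand-in for page following.
    """
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff prefixes lines that exist only in the new sequence with "+ ".
    return [line[2:] for line in diff if line.startswith("+ ")]


# Two invented snapshots of the same page, taken at different times.
yesterday = "Council meeting on Monday\nRoad works on Main St"
today = "Council meeting on Monday\nRoad works on Main St\nNew park opens Friday"

print(added_lines(yesterday, today))  # ['New park opens Friday']
```

A feed or tweet bot built on this would poll the page on a schedule and publish whatever the function returns.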

What can Diffbot actually do?

These APIs are already being used by companies such as speech recognition system maker Nuance, AOL (disclosure: TechCrunch is owned by AOL), social media monitoring firm SocMetrics and others.

AOL uses Diffbot to retrieve the title, author, image, text, video, topics, and other metadata for its new iPad magazine, AOL Editions. Nuance uses the technology to improve the natural language processing in a product for physicians, which requires an understanding of complex medical terminology. SocMetrics sends bit.ly short links to Diffbot to get the full article text so that it can determine which social media users are talking the most about which topics.

These are just a few of the big-name examples. There are smaller, but just as innovative, use cases out there, too. Take Hacker News Radio, for example, which reads Hacker News and its comments to you. Or FeedBeater, which lets you automatically turn any URL into an RSS feed (one of Diffbot's first creations). Or the Twitter feed created by Diffbot that tracks changes to a web page for the city of São Paulo, Brazil (as it has no RSS), and tweets the updates.

The new self-service platform for developers is free for up to 50,000 API calls per month. A Cloud plan provides 100,000 calls for $500, and then charges $0.002 per call after that. A managed enterprise plan involves custom pricing.
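As a rough worked example of that pricing, the Python sketch below estimates a monthly bill. It assumes the tiers read as follows: nothing is charged up to 50,000 calls, anything above that moves to the Cloud plan at a flat $500 covering the first 100,000 calls, and each further call costs $0.002. That reading, the monthly billing period for the Cloud plan, and the function name are assumptions, not official Diffbot terms.

```python
FREE_CALLS = 50_000             # free tier limit per month (from the article)
CLOUD_INCLUDED_CALLS = 100_000  # calls covered by the Cloud plan fee
CLOUD_BASE_MILLS = 500_000      # $500 expressed in mills (1 mill = $0.001)
OVERAGE_MILLS_PER_CALL = 2      # $0.002 per extra call


def monthly_cost_usd(calls: int) -> float:
    """Estimate the monthly cost in dollars under the assumed tier reading."""
    if calls < 0:
        raise ValueError("calls must be non-negative")
    if calls <= FREE_CALLS:
        return 0.0
    extra_calls = max(0, calls - CLOUD_INCLUDED_CALLS)
    # Integer mills avoid floating-point rounding until the final division.
    total_mills = CLOUD_BASE_MILLS + extra_calls * OVERAGE_MILLS_PER_CALL
    return total_mills / 1000


print(monthly_cost_usd(40_000))   # 0.0
print(monthly_cost_usd(100_000))  # 500.0
print(monthly_cost_usd(150_000))  # 600.0
```

Under this reading, an app making 150,000 calls a month pays $500 plus 50,000 extra calls at $0.002 each, or $600 in total.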

Diffbot was founded by Mike Tung and Leith Abdulla, both Stanford graduate students who left to create the company. The idea arose from Tung's desire to use technology to automatically track new assignments on his class websites. Diffbot was also the first startup funded by the Stanford incubator program now called StartX (formerly SSE Labs).


Diffbot, founded in 2008 by two Stanford students, applies computer vision methods to obtain the semantic structure of web pages. Diffbot analyzes documents much as a person would, using ...

