The line between book and Internet will disappear

11th September 2010, 09:24 AM
A few months ago I posted a tweet that said:

The distinction between “the internet” & “books” is totally totally arbitrary, and will disappear in 5 years. Start adjusting now.

The tweet got some negative reaction. But I'm certain this shift will happen, and should happen (I won't take bets on the timeline though).

It should happen because a book properly hooked into the Internet is a far more valuable collection of information than a book not properly hooked into the Internet. And once something is "properly hooked into the internet," that something is part of the Internet.

It will happen, because: what is a book, after all, but a collection of data (text + images), with a defined structure (chapters, headings, captions), metadata (title, author, ISBN), and prettied up with some presentation design? In other words, what is a book, but a website that happens to be written on paper and not connected to the web?

An ebook is just a print book by another name

Ebooks to date have mostly been approached as digital versions of print books that readers can read on a variety of digital devices, with some thought to enhancing ebooks with a few bells and whistles, like video. While the false battle between ebooks and print books will continue -- you can read one on the beach, with no batteries; you can read another at night with no bedside lamp -- these battles only scratch the surface of what the move to digital books really means. They continue to ignore the real, though as-yet unknown, value that comes with books being truly digital; not the phony, unconnected digital of our current understanding of "ebooks."

Of course, thinking of ebooks as just another way to consume a book lets the publishing business ignore the terror of a totally unknown business landscape, and concentrate on one that looks at least similar in structure, if not P&L.

While you can list advantages and disadvantages of print books versus ebooks, these are all asides compared with the kind of advantages that we have come to expect of digital information that is properly hooked into the Internet.

Defining a book by what you cannot do

What's striking about this state of affairs -- though not surprising, given the conservative nature of the publishing business, and the complete unknowns about business models -- is that we define ebooks by a laundry list of things one cannot do with them:

* You cannot deep link into an ebook -- say to a specific page, paragraph, chapter, image, or table
* Indeed you cannot really "link" to an ebook, only to various access points to instances of that ebook, because there is no canonical "ebook" to link to ... there is no permalink for a chapter, and no Uniform Resource Locator (URL) for an ebook itself
* You (usually) cannot copy and paste text, the most obvious thing one might wish to do
* You cannot query across, say, all books about Montreal, written in 1942 -- even if they are from the same publisher

You cannot do any of these things, because we still consider that books -- the information, words, and data inside of them -- live outside of the Internet, even if they are of the e-flavor. You might be able to buy them on the Internet, but the stuff contained within them is not hooked in. Ebooks are an attempt to make it easier for people to buy and read books, without changing this fundamental fact, without letting ebooks become part of the Internet.
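
To make the list above concrete, here is a minimal sketch of what deep linking and cross-book querying could look like if book contents were addressable. Everything in it is an assumption made for illustration: the `Book` record, the `/books/<id>/chapters/<n>/paragraphs/<n>` path scheme, the function names, and the sample titles are all invented, and no publisher is known to expose anything like this.

```python
from dataclasses import dataclass, field


@dataclass
class Book:
    # Hypothetical record: metadata plus chapters, each a list of paragraphs.
    book_id: str
    title: str
    year: int
    subjects: set[str]
    chapters: list[list[str]] = field(default_factory=list)


# Invented sample data, for illustration only.
LIBRARY = {
    "b1": Book("b1", "Streets of Montreal", 1942, {"Montreal"},
               [["First paragraph.", "Second paragraph."], ["Chapter two opens."]]),
    "b2": Book("b2", "Prairie Winters", 1942, {"Saskatchewan"}, [["Snow."]]),
    "b3": Book("b3", "Montreal Revisited", 1987, {"Montreal"}, [["Later."]]),
}


def resolve(path: str) -> str:
    """Resolve a hypothetical permalink to the paragraph text it names."""
    parts = path.strip("/").split("/")
    if len(parts) != 6 or parts[0::2] != ["books", "chapters", "paragraphs"]:
        raise ValueError(f"unrecognized path: {path}")
    book_id, chapter, paragraph = parts[1], int(parts[3]), int(parts[5])
    if chapter < 1 or paragraph < 1:
        raise ValueError("chapter and paragraph numbers start at 1")
    # Paths are 1-indexed, the way a reader counts chapters and paragraphs.
    return LIBRARY[book_id].chapters[chapter - 1][paragraph - 1]


def query(subject: str, year: int) -> list[str]:
    """Return the titles of every book matching a subject and a year."""
    return sorted(book.title for book in LIBRARY.values()
                  if subject in book.subjects and book.year == year)
```

With this in place, `resolve("/books/b1/chapters/1/paragraphs/2")` returns one specific paragraph, and `query("Montreal", 1942)` answers the "all books about Montreal, written in 1942" question from the list. The point of the sketch is only that both operations become trivial once the text and metadata sit behind stable addresses.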

Many people don't want books to become part of the Internet, because we just don't know what business would look like if they were.

This will change, slowly or quickly. While the value of the digitization of books for readers has primarily been, to date, about access and convenience, there is massive and untapped (and unknown) value to be discovered once books are connected. Once books are accessible in the way well-structured websites are.

What lurks beneath the EPUB spec

The secret among those who have poked around EPUB, the open specification for ebooks, is that an .epub file is really just a website, written in XHTML, with a few special characteristics, and wrapped up. It's wrapped up so that it is self-contained (like a book! between covers!), so that it doesn't appear to be a website, and so that it's harder to do the things with an ebook that one expects to be able to do with a website. EPUB is really a way to build a website without letting readers or publishers know it.
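
The "wrapped-up website" point can be checked with nothing more than a ZIP library, since an .epub file is a ZIP archive. The sketch below builds a toy archive in memory and reads it back with Python's standard `zipfile` module. It is not a valid EPUB: the `container.xml` body is a placeholder, the `OEBPS` folder name is only a common convention, and the helper names are invented here. What it does mirror is the general layout, with a `mimetype` entry stored first and uncompressed, followed by ordinary XHTML pages.

```python
import io
import zipfile

XHTML = ('<html xmlns="http://www.w3.org/1999/xhtml">'
         "<body><p>{text}</p></body></html>")


def build_sample_archive() -> bytes:
    """Build a toy EPUB-like archive in memory (illustrative, not spec-valid)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry comes first and is stored without compression.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        # Placeholder only: a real container.xml points at the package file.
        zf.writestr("META-INF/container.xml", "<container/>")
        zf.writestr("OEBPS/chapter1.xhtml", XHTML.format(text="Chapter one."))
        zf.writestr("OEBPS/chapter2.xhtml", XHTML.format(text="Chapter two."))
    return buf.getvalue()


def list_pages(archive: bytes) -> list[str]:
    """List the XHTML pages inside the archive, in archive order."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return [name for name in zf.namelist() if name.endswith(".xhtml")]


def read_page(archive: bytes, name: str) -> str:
    """Return one page's markup, exactly as a browser would receive it."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.read(name).decode("utf-8")
```

Calling `list_pages(build_sample_archive())` returns the two chapter files, and `read_page` hands back plain XHTML that any browser could render. Pointing `zipfile.ZipFile` at a real, DRM-free .epub on disk should show the same kind of contents.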

But everything exists within the EPUB spec already to make the next obvious -- but frightening -- step: let books live properly within the Internet, along with websites, databases, blogs, Twitter, map systems, and applications.

There is little talk of this anywhere in the publishing industry that I know of, but the foundation is there for the move -- as it should be. And if you are looking at publishing with any kind of long-term business horizon, this is where you should be looking. (Just ask Google, a company that has been laying the groundwork for this shift with Google Books).

An API for books

An API is an "Application Programming Interface." It's what smart web companies build so that other innovative companies and developers can build tools and services on top of their underlying databases and services.

For instance:

* Google Maps has an API so that geolocation services (for instance Yelp) can use Google Maps and the business data contained therein to better serve their niche customers
* Twitter has an API so that other services can build Twitter clients, search Twitter, provide Twitter analytics, etc.
* Amazon has an API that lets developers easily find and point to product information.
* Wikipedia has an API, so that you can do things like make books out of every edit done on the Wikipedia article, "The Iraq War"

We are a long, long way from publishers thinking of themselves as API providers -- as the Application Programming Interface for the books they publish. But we've seen countless times that value grows when data is opened up (sometimes selectively) to the world. That's really what the Internet is for; and that is where book publishing is going. Eventually.

I don't know exactly what an API for books would look like, nor do I know exactly what it means.

I don't know what smart things people will start to do when books are truly of the Internet.

But I do know that it will happen, and the "Future of Publishing" has something to do with this. The current world of ebooks is just a transition to a digitally connected book publishing ecosystem that won't look anything like the book world we live in now.


* View From the Trenches: Surviving Change
* What publishers can and should learn from "The Elements"
* The Tragic Death of Everything
* Metadata, Not E-Books, Can Save Publishing...

tags: digital publishing, distribution, ebooks, epub

Comments: 14

jeroenvduffelen [10 September 2010 10:26 AM]

It will happen? It is happening already! At Widescript we are bringing the e-book (epub & zhook format) to every connected device through a web browser. At the moment we are developing some examples to show off the potential of web-enabled interactive e-books. We are considering building an API for some of the interactive features but don't really know what should be made public through it. What kind of features would you think should be opened up with an API?

Antoine RJ Wright [10 September 2010 10:37 AM]

I will agree with the above poster, this is already happening in niche areas. It is only a matter of time before such conventions are normal.

With respect to publishers becoming more aligned with developers, we've seen this in the Biblical publishing arena with two events (Olive Tree partnering with Zondervan, and Logos's work on the Biblia API). Both efforts blur the lines between publishers and developers, and set the stage for books being redefined (see the EEC commentary coming from Logos).

Just a matter of time; and you folks at O'Reilly should be right on the leading edge of this as well :)

Hugh McGuire [10 September 2010 11:47 AM]

@jeroenvduffelen ... certainly it *is* happening already. But on the fringes of the publishing world, which is where it *should* happen. As for what kind of info should be exposed via API, I don't quite know. At least: all metadata, and associated images. Maybe you should be able to find paragraph #142, sentence #7 (or - be told that "To be or not to be" is found there). Maybe the API should allow readers or partners to add metadata to phrases, words, locations contained in the text - i.e. to build a semantic metadata layer "behind" the book, including links etc. Certainly all books should allow for commenting, to be displayed as: a) every comment, b) comments from just me, c) comments from just john smith, d) comments from a particular group of people (e.g. my editors, my family, my English lit class etc); probably the API should allow a metadata analysis tool to find place names within the text, and add appropriate links to, say, Google Maps.

etc. etc. etc.

Hugh McGuire [10 September 2010 11:51 AM]

13th September 2010, 02:04 PM
So true. We just need to accept it. Do we even write with pens anymore?