Pagination in the Browser:

What, Why, and How

Nellie McKesson

@nelliemckesson

nellie@hederis.com

Laura’s intro

Dave’s talk: history, experimentation

Got something for everyone: definitely gonna talk about some pretty geeky theory and tech stuff, do some complaining about ebook reading systems, and show some fun examples of what’s out there today.

Brief intro: pagination in the browser means viewing paged content in the browser

@nelliemckesson | HEDERIS.COM

Here’s what we’re talking about:

https://s3.amazonaws.com/pagedmedia/live/examples/index.html

It looks like this (talk through what we’re seeing)

To many people, this is kind of a no-brainer. Like, obviously this is/should be a thing.

We’re so used to seeing paginated things, and the internet is so established, that it’s kind of inconceivable that this is relatively new.

To better understand both why this is kind of revolutionary, and why it took so long, I’m going to start off by going over some background and history.

Some Background

To be clear, the idea of books in browsers is not new.


Books in browsers aren’t new:

http://www.gutenberg.org/files/2701/2701-h/2701-h.htm

There are plenty of websites that host entire book texts that people can read online.

Most of these websites offer the standard website reading experience that we’re accustomed to: the long scroll.

But it was really ebooks that led to the expanded interest in digital publishing and reading.

[Slide: timeline. 2006: Sony Reader]

In 2006 we got the Sony Reader

[Slide: timeline. 2006: Sony Reader. 2007: EPUB 1.0, Amazon Kindle]

In 2007 we got the first EPUB spec (I always get self-conscious saying these kinds of firm dates in front of these crowds, because I know there are a number of you who were part of that history, so forgive me if I’m off slightly)

Also in 2007 Amazon released the first Kindle

[Slide: timeline. 2006: Sony Reader. 2007: EPUB 1.0, Amazon Kindle. 2009: B&N Nook]

In 2009 Barnes and Noble released the Nook

[Slide: timeline. 2006: Sony Reader. 2007: EPUB 1.0, Amazon Kindle. 2009: B&N Nook. 2010: Apple iPad & iBooks]

In 2010 Apple released the iPad and iBooks


Here are all the eReaders that have existed:

https://en.wikipedia.org/wiki/Comparison_of_e-readers

And a variety of other people jumped on the ereader device manufacturing train.

So, a couple years later, this is what we’re looking at (scroll through, describe it all - some is crazy, but there’s at least one person trying to use this device to read books)

Each of these devices was essentially a brand new browser,

with different levels of support for the standard CSS that was being used on the Web,

Based on what the manufacturer thought was important for books,

and how much their developers wanted to build.


2012 is when I was managing the ebook development program at O’Reilly

I try to use my own stories to illustrate this stuff, so this is kind of an old example,

but sadly the point is still relevant.


O’Reilly is primarily a reference publisher, publishing books about computers, coding languages, and deeply geeky stuff.

So we were dealing with content across a broad spectrum of topics:

code in pretty much every language,

but also math, chemistry, just about any technical topic you can think of.


Part of our team’s job was to go through all the customer complaints from the various distribution platforms, mainly Kindle and iBooks,

see what files people were complaining about,

and then try to fix either the file itself or our overarching toolchain,

to create files that would no longer have that specific problem,

on whatever device the customer was using.

This was kind of a nightmare, because more often than not we didn’t have much data around what platform the person was loading the file onto.

So we had to load the file onto all the devices in our device library,

and see if we could replicate the error,

and then figure out a fix that wouldn’t cause new errors on other devices.


CSS template

The result was a generalized ebook template that was optimized to work on as many devices as possible,

which means that it was pared down as much as possible purely for the purposes of not breaking,

not generating unhappy customers,

and being somewhat adaptable to various screen sizes and platforms.

In more recent years I also headed up a project at Macmillan trade to develop a similar universal CSS template for some of our EPUBs.

This was trade content, so a bit less complex than O’Reilly,

But we’d still get customer complaints about something not showing up correctly,

And we’d have to do pretty much the same process for testing and fixing the problem.


Device support is still an issue today

The landscape has certainly gotten better since then,

But support is still a thorn in ebook developers’ sides,

especially if you’ve got folks using the old device that they bought years ago.

This puts publishers in the same position as software developers,

who have to say they only support certain platforms, and certain operating systems.

Now, I’m not quite ready to talk about the details of pagination in the browser,

But I will point out that a big reason that lots of companies are releasing cloud-based apps,

Is because it’s a lot easier to manage a single code base,

Than a bunch of different versions of software on all kinds of operating systems.

Devices, Today

So, let’s look at the device landscape today.

Because what we’re seeing is that the way that people read is still changing.

The shift that started in 2006 still hasn’t really settled.

I’m not a business analyst, but I’m going to present some numbers to you that seem significant, even to me.

Kindle Unit Sales

2010: 10.1m
2011: 23.2m
2016: 7.1m

Source: Tom’s Guide - https://www.tomsguide.com/us/is-the-ereader-dead,review-5158.html

So, the first thing we see is that people aren’t buying as many reading-specific devices.

In 2010, Amazon sold 10.1 million Kindles, and the Kindle was only 63% of e-readers shipped worldwide.

By 2011, Kindle shipments more than doubled, to 23.2 million.

Flash forward to 2016, and Amazon shipped only 7.1 million Kindles.

Nook Sales (dollars)

2016: $191m
2017: $146m
2018: $111m

Source: Forbes: https://www.forbes.com/sites/ellenduffer/2018/06/23/nook-sales-decrease-in-2018/#29ddbbc41bf5

It’s hard to find unit sales numbers for Nook, but based on B&N financial reports, we know that NOOK sales decreased 23.9% in 2018 over 2017 (from $146 million to $111 million)

And decreased 23.5% in 2017 over 2016 (from $191 million to $146 million)


Relatedly, in 2015, only 19% of U.S. adults owned an e-reader, pretty uniformly spread across sex, location, age


2017

Kindle Oasis

Nook GlowLight 3

"This past Prime Day was the best ever day for Kindle sales both in the U.S. and globally"
- Techcrunch, 2018

BUT ereader device makers are still manufacturing devices:

In 2017, both Amazon and B&N released new ereading devices: The Kindle Oasis and The Nook GlowLight 3

Press release, 2018: "this past Prime Day was the best ever day for Kindle sales both in the U.S. and globally"

Now let’s look at book unit sales.


If you look at book unit sales, you can see that while print sales are still doing fine, there’s a healthy segment of folks who have adopted the ebook reading format and want to keep reading that way.

Book sales trends are complicated and there are plenty of people, possibly in this room, who devote their careers to analyzing this type of data, so I’m not going to suggest that there’s a single simple factor behind these numbers.

Pricing is certainly a factor

I’d wager that the reading experience is also a big factor

It was interesting to me to hear Rachel Comerford say that PDF continued to be more popular at Macmillan Education

Because for a long while we found this to be true at O’Reilly as well,

perhaps because it was a cleaner and more beautiful reading experience.

The problem isn’t that you can’t create beautiful layouts with HTML and CSS, it’s that the varying degrees of device support leave you with one hand tied behind your back.

For me, what this says is that we, the bookmakers, are still looking for the way that we fit in, in the digital age, in a way that benefits authors, bookmakers, and readers.


People spend lots of time on the internet (talk about where I got this slide, and go over the chart: US, Canada, France)


And there’s a slight shift away from tablets as a browsing device (Hootsuite cut this from their 2019 report, so I’m using the 2018 data)


Old Devices

New Devices

General Devices

Internet use

Existing digital readers

New digital readers

So to sum all this up, you’ve got a bunch of people out there with older devices, some people out there with new devices,

and a majority of people with a phone or tablet or computer, but no device devoted specifically to reading.

And you’ve got lots of people spending lots of time on the internet, looking at web browsers.

You don’t want to lose your readers who HAVE switched over to digital,

And you want to gain new readers who are already digital but maybe haven’t realized how wonderful books are.

The landscape is ripe for experimentation, as Dave mentioned earlier this morning


Digital publishing with high perceived value

Book production that can scale

So we’re looking for a digital publishing model that readers are willing to pay for

And in order to experiment quickly and efficiently, we need a book production workflow that isn’t going to bankrupt us.

So, in addition to having full control over your book’s layout and design,

This is also about finding ways to expand your market and move faster.


Full CSS support*

Complex content

Easy to update

$A = \frac{1}{2}bh$

When it comes to platforms for reading or showcasing content, putting content in the browser has some pretty clear benefits:

Full support for CSS - you can make beautiful digital books (caveat that we’ll talk about shortly)

You can render complicated things like math and more accurate colors

It’s also easy to update:

If you’re hosting your content from a single database, then you can make updates and those updates will be pushed out live to readers.

[Slide: workflow chart with HTML, EPUB, ??, and PDF]

But the really exciting possibility for me, is using the browser to create your book files -- especially your print files.

I’m a bookmaker, first and foremost.

I gave a talk at ebookcraft in 2017 about book production toolchains built around HTML and CSS.

[describe the automated workflow chart]

At this point, HTML and CSS are fully established, viable tools for making book files - ebook files, and PDF for print.

These kinds of workflows have historically relied on special software that can take HTML and CSS files, and translate them into laid-out PDFs.

Software like Prince or Antenna House.

These workflows already rely on CSS3’s paged media spec.

This is a special set of CSS properties just for working with paged content, like books.

They control things like trim size, running headers and footers, page numbering, and so on.


www.nelliemckesson.com/speaking.html

This workflow is fast and it can make nice-looking books,

especially ebooks,

but it doesn’t give you the same visual control over your print output that InDesign-based workflows have.

For example, you can’t just adjust the number of lines on a page by dragging a box up or down.

You have to do some pretty tedious work to adjust your page layout.

I’m not going to dive too deeply into this, because it’s a whole talk of its own --

in fact, it’s a talk I’ve given in the past, and I have some links on my website if you’re interested.


But what browser-based pagination does is it opens up that layout process,

so that you can see exactly how things are going to look, and adjust your content accordingly.


And from a digital reading standpoint, it allows you to present readers with beautiful, print-quality layouts, in digital form.


<HTML>

So let’s dive into what this really means and how it works.

To do browser-based pagination, you need to get content into HTML

Again, I’m not going to dive into that side of things too much today, but I’m always happy to chat about it.

EPUB was a huge step in that direction - we started getting a critical mass of content already in the HTML format,

which is the language of the web,

and publishers started to get used to marking up their content with HTML.

So the logical progression, from a developer standpoint, is to just have your paged content right in the browser,

rather than jumping through a lot of hoops to display HTML,

when browsers are already perfectly set up for that purpose.


And the CSS3 paged media spec basically laid out a blueprint for how this should work.

These new CSS rules made it much easier to build browser-based pagination scripts.

Not because browsers support these rules - on the contrary, most browsers have no support for most of the CSS3 paged media spec.

But it gave developers a clearly defined set of rules to follow when building their own pagination systems, so that they could focus just on the technical aspects of getting pagination to work.

Pagination:

Box vs. Column

There are two main philosophies right now around paginating content:

The column method, and the box method

I have a clear favorite, and it seems like most of the other devs in this space have reached the same conclusion

But I’m going to show you both, to save you some trouble


Some ereader apps already use the column method, and it makes sense in theory.

You basically split your content into a series of columns,


Viewable Area

and then set your viewport to only display one (or two) columns at a time.

The benefit of this method is that you can easily reflow your content as it changes.


That’s just built into the browser: the browser knows to split the content into columns, and it’s going to automatically flow your content among those columns.
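To make the "one column at a time" idea concrete, here is a minimal sketch of the arithmetic involved. The function name columnOffset and its parameters are made up for illustration, and it assumes every column has the same width and a single uniform gap.

```javascript
// Hypothetical helper: how far to shift a multi-column flow so that only
// the requested page (column) sits inside the viewable area.
// Assumes equal-width columns separated by one uniform gap.
function columnOffset(pageIndex, columnWidth, columnGap) {
  if (!Number.isInteger(pageIndex) || pageIndex < 0) {
    throw new RangeError("pageIndex must be a non-negative integer");
  }
  // Each step forward moves past one column plus one gap.
  return pageIndex * (columnWidth + columnGap);
}

// In a browser the result might be applied roughly like this (not run here):
//   flow.style.transform = `translateX(-${columnOffset(2, 600, 40)}px)`;
```

The browser does the hard part (flowing text into the columns); the script only decides which slice of that flow is visible.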

Where the column method gets tricky is with basically anything having to do with spacing and positioning.


Like when you’re adding things like margins around the pages,

or dealing with things like footnotes -- anything that needs to have a fixed size, in a fixed position on a page.

[Describe the spread that is being shown -- example of a standard, albeit rather ugly, print spread]

Page margins are especially important here, when you’re talking about true book-like pagination.

These are defined as part of the CSS paged media spec, and the spec is a bit wordy but the concept is pretty simple for anyone who’s ever dealt with a print book.


Basically, you can think about a page as a collection of boxes.

You’ve got your main content box, which is where your actual book text goes.

[Slide: page mockup with the running header “Moby Dick” and page number 5]

And then you’ve got a variety of possible margin areas around the edge that you can put other content in.

People typically use these areas for running headers and running footers.

At a minimum, you have 8 areas: like this,


but you have the option of using up to 16 margin areas, each filled with different things and with different designs.


So if we’re talking about the column model, the way it works is that each column represents the content area for a page

And then you draw additional boxes around each column to represent the margin areas.

[Describe what we’re seeing: the margin area is highlighted green just to highlight it]

You can do this either on a page-by-page basis, if you only need to display a couple pages at a time, as we’re seeing here

[Slide: several page mockups in a row, pages 5 through 7, each with the running header “Moby Dick”]

Or you can do it all at once, for example if you’re printing all of these pages to PDF

[Slide: page design mockup with margins labeled 0.5 in, 0.75 in, 0.75 in, 0.5 in]

Things start to get tricky when you start introducing variable elements.

A simple one is having differing inside and outside margins, which is a standard part of print book design.

[Describe the page design mockup]

[Slide: three columns labeled as right-hand and left-hand pages, with margins of 0.5 in and 0.75 in]

So, in the standard English-language book layout,

if you think of your columns as pages,

the odd-numbered columns would be right-hand pages,

and the even-numbered columns would be left-hand pages.

And if you think of the column gap as the sum of the margin widths,

you’re actually dealing with a few different column gap requirements:

One value for the inside margins on a spread,

and then a different value for the transition between spreads.

But this column gap is controlled by CSS,

and it’s only possible to have a single column-gap value for a set of columns,

so you’ll have to do some fancy script maneuvering to make this work.
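As a rough illustration of why one column-gap value is not enough, this sketch computes the gap that would need to follow each column. The function name gapAfterColumn is hypothetical, and it assumes the convention above (odd columns are right-hand pages) with inside and outside margin values passed in by the caller.

```javascript
// Sketch: the gap that has to follow column n (1-based) when inside and
// outside margins differ. Assumes odd columns are right-hand pages and
// even columns are left-hand pages.
function gapAfterColumn(columnNumber, insideMargin, outsideMargin) {
  const isRightHandPage = columnNumber % 2 === 1;
  // After a right-hand page we cross into the next spread, so the gap is
  // two outside margins. After a left-hand page we cross the gutter of
  // the same spread, so the gap is two inside margins.
  return isRightHandPage ? 2 * outsideMargin : 2 * insideMargin;
}
```

With a 0.75 in inside margin and a 0.5 in outside margin (illustrative values), the gaps alternate between 1 in and 1.5 in, which is exactly what a single CSS column-gap can't express.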

[Slide: two page mockups. Chapter master page: running header, page number 5. Frontmatter master page: running header, page number v]

This gets even more complicated when you introduce the idea of different master pages.

I’m borrowing this term, master pages, from InDesign,

which essentially means that you have a bunch of page templates for different sections in your book.

A simple version of this is that you have one master page for frontmatter sections,

that uses a roman numeral for the page numbers,

and then a different master page for main chapters, that uses a decimal number.
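Here is a small sketch of that numbering difference. The master page names "frontmatter" and "chapter" and both function names are placeholders I'm using for illustration, not part of any spec or tool.

```javascript
// Convert a positive integer to a lowercase roman numeral.
function toRoman(n) {
  const numerals = [
    [1000, "m"], [900, "cm"], [500, "d"], [400, "cd"],
    [100, "c"], [90, "xc"], [50, "l"], [40, "xl"],
    [10, "x"], [9, "ix"], [5, "v"], [4, "iv"], [1, "i"],
  ];
  let remaining = n;
  let result = "";
  for (const [value, symbol] of numerals) {
    while (remaining >= value) {
      result += symbol;
      remaining -= value;
    }
  }
  return result;
}

// Hypothetical rule: frontmatter pages get roman numerals,
// everything else gets decimal numbers.
function formatPageNumber(masterPage, pageNumber) {
  return masterPage === "frontmatter" ? toRoman(pageNumber) : String(pageNumber);
}
```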


How do we know which master page to use on this content?

When you’re working with column layout,

you need some way to detect whether the content you’re working with is inside a frontmatter section or a chapter,

in order to display the correct master page with the correct margin boxes.

But HTML doesn’t have great ways to target columnar content,

or to figure out where things are within a columnar spread.

So to work with columns, you need to do a lot of calculating locations and figuring out what’s visible in the current column that you’re looking at.


Total content width / single column width = total page count

So the solution is to use a scripting language like Javascript to figure out the position of the content within the full column spread,

And what type of master page to use.

For example, you’d write a script that says:

First let’s figure out how wide our content is, and divide that by the width of one column, and that gives us our total page count.
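That first step might look something like this. It's a sketch only: totalPageCount is a made-up name, the column gap is ignored to match the formula on the slide, and rounding up is my assumption for a partly filled last column.

```javascript
// Sketch of: total content width / single column width = total page count.
// Ignores column gaps. In a browser, contentWidth might come from
// something like element.scrollWidth on the multi-column container.
function totalPageCount(contentWidth, columnWidth) {
  if (columnWidth <= 0) {
    throw new RangeError("columnWidth must be positive");
  }
  // Round up so a partly filled final column still counts as a page.
  return Math.ceil(contentWidth / columnWidth);
}
```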

Column position / column width = current page number

12 in.

6 in.

Then we figure out what the position of this specific column is,

And that’s how we know if this is an odd or even page, and how to size the margin areas.
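A sketch of that position check follows. The name describeColumn is hypothetical. It also assumes the position is measured from the left edge of the whole flow to the left edge of the column, so the raw quotient is treated as a zero-based index and 1 is added to get a page number. If you measure position differently, adjust accordingly.

```javascript
// Sketch: work out which page a column is and which side of the spread
// it falls on. Assumes columnPosition is the distance from the start of
// the flow to the column's left edge, and that column gaps are ignored.
function describeColumn(columnPosition, columnWidth) {
  const zeroBasedIndex = Math.floor(columnPosition / columnWidth);
  const pageNumber = zeroBasedIndex + 1;
  // Odd pages are right-hand pages in the layout described above.
  const side = pageNumber % 2 === 1 ? "right" : "left";
  return { pageNumber, side };
}
```

Once you know the side, you know which margin is the inside margin and which is the outside one.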


<section epub:type="chapter">

</section>

Then we look at the content that is visible in the column I’m currently looking at.

What kind of container is this content inside of?

If it is a frontmatter container, then adjust my margin content accordingly.
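The container check could be sketched like this. The list of frontmatter types and the names FRONTMATTER_TYPES and masterPageFor are assumptions for illustration; a real book would define its own mapping.

```javascript
// Hypothetical set of epub:type values that should use the frontmatter
// master page. Adjust to match your own content.
const FRONTMATTER_TYPES = new Set([
  "frontmatter", "titlepage", "preface", "foreword", "dedication",
]);

// Given the epub:type value of the enclosing container, pick a master page.
// In a browser the value might come from something like
//   node.closest("section").getAttribute("epub:type")
function masterPageFor(epubType) {
  // epub:type can hold several space-separated values.
  const types = String(epubType || "").split(/\s+/).filter(Boolean);
  return types.some((t) => FRONTMATTER_TYPES.has(t)) ? "frontmatter" : "chapter";
}
```

The lookup itself is easy. The hard part in the column method is finding which container the visible content belongs to in the first place.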


Where this completely falls apart is if you have different page sizes within a book, which is rare but not unheard of.

There’s no way to set columns with differing widths using standard CSS column handling.

I imagine there is probably a way to make this happen, but it would take some serious scripting gymnastics.


<section epub:type="chapter">

<h2 epub:type="title">Loomings</h2>

<p>Call me Ishmael. Some years ago⁠—never mind how long precisely⁠—having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people’s hats off⁠—then, I account it high time to get to sea as soon as I can. This is my substitute for pistol and ball. With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. If they but knew it, almost all men in their degree, some time or other, cherish very nearly the same feelings towards the ocean with me.</p>

<p>There now is your insular city of the Manhattoes, belted round by wharves as Indian isles by coral reefs⁠—commerce surrounds it with her surf. Right and

. . .

Alright, so now let’s talk about the box method.

The box method takes your single flow of HTML, and re-chunks it into page-sized chunks.

So, instead of having one continuous flow,

you have a bunch of chunks that each have duplicates of all the required parent elements and so on.

For example, instead of having one section container with your chapter in it,


<section epub:type="chapter">

<h2 epub:type="title"> Loomings</h2>

<p class="split">Call me Ishmael. Some years ago⁠—never mind how long precisely⁠—having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen and regulating the circulation. Whenever I find myself growing grim</p>

</section>

<section epub:type="chapter">

<p class="continuation split"> about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically</p>

</section>

<section epub:type="chapter">

<p class="continuation"> knocking people’s hats off⁠—then, I account it high time to get to sea as soon as I can. This is my substitute for pistol and ball. With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. If they but knew it, almost all men in their degree, some time or other, cherish very nearly the same feelings towards the ocean with me.</p></section>

you have a bunch of section containers, with little pieces of your chapter in them.

Duplicating these parent containers is an essential part of making sure that each page chunk carries all the required information that you need to correctly display each page.

Which means all the kinds of things we talked about with the column method, like knowing whether this is frontmatter or a main chapter.

So, all the instructions you need for picking your master page and figuring out how to set up your margins,

Are right in the content that you’re looking at,

and you don’t have to do as much hunting as in the column method.
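To show the duplication in miniature, here is a sketch that splits one paragraph at a known break index and wraps each half in its own copy of the parent container. The function name splitChapterParagraph is made up, the class names follow the slide, and it assumes the text is already safe to drop into HTML.

```javascript
// Sketch: split one paragraph into two page chunks, duplicating the
// parent <section> so each chunk carries its own context.
// Assumes `text` needs no HTML escaping and breakIndex is already known.
function splitChapterParagraph(text, breakIndex) {
  const open = '<section epub:type="chapter">';
  const close = "</section>";
  const first = text.slice(0, breakIndex);
  const rest = text.slice(breakIndex);
  return [
    `${open}<p class="split">${first}</p>${close}`,
    `${open}<p class="continuation">${rest}</p>${close}`,
  ];
}
```

Because each chunk still says epub:type="chapter", picking the master page for a page is a matter of reading the chunk itself.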


<section epub:type="chapter">

<p class="continuation"> knocking people’s hats off⁠—then, I account it high time to get to sea as soon as I can. This is my substitute for pistol and ball. With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. If they but knew it, almost all men in their degree, some time or other, cherish very nearly the same feelings towards the ocean with me.</p></section>

The DOM:
The relationships between elements in an HTML document

parent container

child element

The pages themselves are set up in a way that the DOM understands:

The DOM basically means the relationships between the different elements in your HTML.

In the box method of pagination, you’ve got discrete elements inside of other discrete elements,

and you do things with your elements based on those nesting relationships.

That’s HTML 101.


<section epub:type="chapter">

<h2 epub:type="title">Loomings</h2>

<p>Call me Ishmael. Some years ago⁠—never mind how long precisely⁠—having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people’s hats off⁠—then, I account it high time to get to sea as soon as I can. This is my substitute for pistol and ball. With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. If they but knew it, almost all men in their degree, some time or other, cherish very nearly the same feelings towards the ocean with me.</p>

<p>There now is your insular city of the Manhattoes, belted round by wharves as Indian isles by coral reefs⁠—commerce surrounds it with her surf. Right and

. . .

<section epub:type="chapter">

<h2 epub:type="title"> Loomings</h2>

<p class="split">Call me Ishmael. Some years ago⁠—never mind how long precisely⁠—having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen and regulating the circulation. Whenever I find myself growing grim</p>

</section>

<section epub:type="chapter">

<p class="continuation split"> about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically</p>

</section>

The hard part of the box method is almost all up front:

getting your content correctly split into your page chunks.

As with the column method, this involves some pretty intense scripting,

calculating locations of various things within a view area.


The general idea is that you take your rendered content --

and rendered is a key part here --

this means your content with all the design rules applied to it,

so that it is truly reflective of how things will look and where things will sit on a page.

[Slide: page mockup with the running header “Moby Dick” and page number 1]

You take this rendered content, and you try to fit it all into a page container, that has all the margin areas and everything built into it.

You then figure out if it all fits or not.

If it doesn’t fit, then you figure out where exactly it stopped fitting,

typically by going through each character until you find the exact breaking point.
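The character-by-character search can be sketched as follows. The fits callback is a stand-in I'm assuming for whatever measurement a real script does, such as comparing the rendered height against the page box. It is also assumed that once a prefix stops fitting, every longer prefix fails too.

```javascript
// Sketch: walk through the text one character at a time and return the
// length of the longest prefix that still fits on the page.
// `fits` is a hypothetical predicate supplied by the caller.
function findBreakIndex(text, fits) {
  let lastFit = 0;
  for (let end = 1; end <= text.length; end++) {
    if (!fits(text.slice(0, end))) break;
    lastFit = end;
  }
  return lastFit;
}
```

A real script would likely back up to a word or hyphenation boundary after finding this point, and might swap the linear walk for a binary search to save measurements.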


Now that you’ve got your break point,

you split the remaining content off,

making sure to close any boxes and then restart them,

and then move on to the next page box and you do it all over again.
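Put together, the loop might look roughly like this. It's a simplified sketch working on plain text: paginate and fits are hypothetical names, and closing and reopening the surrounding boxes is left out to keep the loop itself visible.

```javascript
// Sketch of the box-method loop on plain text: fill a page until the
// content stops fitting, split off the remainder, and repeat.
// `fits` is a hypothetical predicate supplied by the caller.
function paginate(text, fits) {
  const pages = [];
  let remaining = text;
  while (remaining.length > 0) {
    let end = 0;
    // Grow the page one character at a time while it still fits.
    while (end < remaining.length && fits(remaining.slice(0, end + 1))) {
      end++;
    }
    if (end === 0) {
      // Guard against an endless loop when not even one character fits.
      throw new Error("Nothing fits on an empty page");
    }
    pages.push(remaining.slice(0, end));
    remaining = remaining.slice(end);
  }
  return pages;
}
```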


Once you’ve got your content chunked, you’ll need to repeat the process anytime something changes.

So, anytime there’s a design change or text change that affects the amount of space that elements take up

you need to re-run your pagination script to make sure your content is correctly chunked.

In this example, I increased the font size of the main body font, from 14 points to 16 points,

And you can see that the content is now overflowing the page.

This is different from the column method, which is great at content reflow,

and which is why it works great for ereading systems,

where users are adjusting things like font-size etc.,

and want to see those changes instantaneously.


<section epub:type="chapter">

<h2 epub:type="title">Loomings</h2>

<p>Call me Ishmael. Some years ago⁠—never mind how long precisely⁠—having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people’s hats off⁠—then, I account it high time to get to sea as soon as I can. This is my substitute for pistol and ball. With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. If they but knew it, almost all men in their degree, some time or other, cherish very nearly the same feelings towards the ocean with me.</p>

<p>There now is your insular city of the Manhattoes, belted round by wharves as Indian isles by coral reefs⁠—commerce surrounds it with her surf. Right and

. . .

<section epub:type="chapter">

<h2 epub:type="title"> Loomings</h2>

<p class="split">Call me Ishmael. Some years ago⁠—never mind how long precisely⁠—having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen and regulating the circulation. Whenever I find myself growing grim</p>

</section>

<section epub:type="chapter">

<p class="continuation split"> about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically</p>

</section>

If you’re dealing with content changes,

this also means you need to make sure those content changes are getting made in your original single block of HTML

so that it can be re-chunked,

or else recompile your pages into one block of HTML and then re-chunk that.
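The recompile step could be sketched like this, using a simplified data shape I'm assuming for illustration: each chunk is an object with its text and a continuation flag that mirrors the "continuation" class on the slide.

```javascript
// Sketch: stitch page chunks back into whole paragraphs so the content
// can be re-chunked. Each chunk is { text, continuation }, where
// continuation means "this continues the previous paragraph".
function recombineParagraphs(chunks) {
  const paragraphs = [];
  for (const chunk of chunks) {
    if (chunk.continuation && paragraphs.length > 0) {
      paragraphs[paragraphs.length - 1] += chunk.text;
    } else {
      paragraphs.push(chunk.text);
    }
  }
  return paragraphs;
}
```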


Recap:

Columns: great for reflow

Boxes: great for page layout

I actually prefer the box method.

The column method is great for reflow, but makes intricate layout harder.

The box method, because it’s set up as these mini DOM chunks, is actually really easy to manipulate and adjust,

and the speed of reflow is something that a good programmer can work around, as we’ll see in a minute.

Examples in Action

I’m about to do a bunch of live demos, so let’s all hope that nothing blows up

One of the first tools for pagination in the browser was Vivliostyle. They had some organizational changes, and split off their commercial products to a company called Trim-Marks, rebranding the products as VersaType.

But they still maintain the open source portions of their code under the Vivliostyle name.

As you can see, it’s pretty straightforward pagination in the browser, similar to what you’d expect from an e-reading system.

It reads whatever CSS styling information you’ve included in the file, which in this example is just coming from the same repository that the HTML is in.

It has decent support for CSS paged media.

I haven’t done extensive testing with it, but I know there are people who have.

I think specifically Tzviya from Wiley has done some work with them on using Vivliostyle in live book production workflows,

and may be able to speak better to where the technology stands today.

A more recent tool is Colibrio reader.

Colibrio started out by focusing on their pagination script.

At least, back when I chatted with them a year ago, they were using the box method, but they had some special magic that made reflow super fast.

They also had a cool way to configure the actual page breaking rules,

using a series of weighted variables to figure out where the best places to break each page were.

Essentially it lets you rank different potential break points.

The concept is kind of like this:

You could say that there should definitely never be a break after a chapter title,

and there should probably not be a break after a subhead title,

but there absolutely can be a break after a blockquote.

So, if you assign numeric values to those things,

the script can look at those values and figure out where the best places to break each page are,

which hopefully leads to a little less work for page layout staff,

and a nicer reading experience for users.

This is really similar to the way most automated systems approach hyphenation:

a ranked list of places where it’s OK to break words.
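
Here is a rough sketch of that ranking idea. To be clear, this is not Colibrio's actual algorithm or API. The penalty table, the `chooseBreak` function, and the choice to add leftover page space to the penalty are all assumptions made for illustration.

```javascript
// Hypothetical penalties: higher means "worse place to break after".
const BREAK_PENALTY = {
  "chapter-title": Infinity, // never break right after a chapter title
  subhead: 50,               // probably don't break after a subhead
  paragraph: 10,
  blockquote: 0,             // breaking after a blockquote is fine
};
const DEFAULT_PENALTY = 10;

// blocks: [{ type, height }]; pageHeight uses the same units as height.
// Returns how many blocks to place on the current page.
function chooseBreak(blocks, pageHeight) {
  let used = 0;
  let best = null;
  for (let i = 0; i < blocks.length; i++) {
    used += blocks[i].height;
    if (used > pageHeight) break; // this block no longer fits
    const known = BREAK_PENALTY[blocks[i].type];
    const penalty = known !== undefined ? known : DEFAULT_PENALTY;
    // Empty space left at the bottom of the page also counts against a break.
    const score = penalty + (pageHeight - used);
    if (score < Infinity && (best === null || score <= best.score)) {
      best = { count: i + 1, score };
    }
  }
  // Fall back to a single block if nothing fits or every break is forbidden.
  return best ? best.count : 1;
}

const blocks = [
  { type: "paragraph", height: 40 },
  { type: "blockquote", height: 20 },
  { type: "subhead", height: 10 },
  { type: "paragraph", height: 40 },
];
console.log(chooseBreak(blocks, 100)); // 2: break after the blockquote
```

A naive "fill the page" rule would put three blocks on this page and strand the subhead at the bottom. With the weights, the script gives up a little space to break after the blockquote instead.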

Colibrio is set up as an SDK, meaning that it can be plugged into other people’s tools and websites.

For example, the folks over at Circular Software are using Colibrio in their new tool, Masterplan, as a way of giving page-by-page previews of book interiors.

They’re building on top of Colibrio,

and then adding some extra functionality to restrict the pages that are displayed,

so that you don’t have to maintain a bunch of different files for your content (sample chapter, whole book, etc.).

Alright, so, another browser-based pagination toolset is paged.js, created by the folks at PagedMedia.org.

They actually have a number of book-based open-source tools, and paged.js is one of the newest ones.

It’s an open-source JavaScript library for pagination in the browser.

Of the tools we’re talking about today, this is the one I’m the most familiar with, and that I now use in my own tools.

They use the box method as well for paging content, but have built-in functionality to make reflow pretty fast when you make changes.

This is the demo that I showed you back at the very beginning, with paged content.

This HTML has been configured to be editable, so you can type in here,

and if you type enough, you’ll trigger the reflow functionality.

You can see it hang for a second when we get to that point,

because it is re-chunking the entire adjusted content.
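
One way a tool could shorten that hang is to re-chunk only the pages an edit can affect. The sketch below is a hypothetical illustration of that idea, not how paged.js works internally. It assumes block heights are already measured, pages are filled greedily from top to bottom, and blocks are never split. `paginate` and `reflowFrom` are made-up names.

```javascript
// Greedy box-method chunker: fill each page until the next block won't fit.
// blocks: [{ id, height }]
function paginate(blocks, pageHeight) {
  const pages = [];
  let current = [];
  let used = 0;
  for (const block of blocks) {
    if (current.length > 0 && used + block.height > pageHeight) {
      pages.push(current);
      current = [];
      used = 0;
    }
    current.push(block);
    used += block.height;
  }
  if (current.length > 0) pages.push(current);
  return pages;
}

// After an edit on page `editedIndex`, keep the earlier pages as they are and
// re-chunk from one page before the edit. Starting one page early covers the
// case where a block shrinks and now fits at the bottom of the previous page.
function reflowFrom(pages, editedIndex, pageHeight) {
  const start = Math.max(0, editedIndex - 1);
  const untouched = pages.slice(0, start);
  const affected = pages.slice(start).flat();
  return untouched.concat(paginate(affected, pageHeight));
}

const ids = ["a", "b", "c", "d", "e", "f", "g"];
const pages = paginate(ids.map((id) => ({ id, height: 40 })), 100);
pages[2][0].height = 70; // typing made block "e" on the third page taller
console.log(reflowFrom(pages, 2, 100).map((page) => page.map((b) => b.id).join("")));
```

With greedy filling, pages well before the edit can't change, so the result matches a full re-chunk while touching fewer blocks. Real layout rules such as widows, orphans, and split paragraphs would need more care than this.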

This is a pure script library, so it’s not something that a non-coder can just pick up and start playing with.

But it’s a pretty concise stand-alone library that can relatively easily be plugged into other people’s tools.

As with most of these tools, the paged.js folks have included support for a lot of things from the CSS paged media spec,

so that you can write CSS that conforms to that spec and paged.js will correctly parse it and apply it to your final laid-out pages.

As I mentioned, we decided to use paged.js in our own tools at my company Hederis.

We actually built our own pagination script at the beginning using the box method,

but paged.js was working really nicely and I’m a fan of killing your darlings, so to speak.

So, here’s an example of paged.js in action.

This is a simple little demo that we whipped up for you guys.

This is our own tool, which lets you design and lay out pages in the browser, instead of expecting you to hand-code your own CSS design files.

We’re trying to break open that book layout and design process

so that you can have automated publishing with HTML, and still have WYSIWYG page layout.

Final Thoughts

So what’s next?

Other standard book things:

Table of Contents

Reading order

Metadata

etc.

We’ve only talked about pure pagination, but when it comes to viewing books in the browser,

there are a variety of other things to figure out how to handle,

like tables of contents and how to dictate reading order.

A great thing about EPUB is that they’ve figured out how to handle a lot of these standard book things,

so it’s a great place to get some inspiration.

And in fact, the lead developer of paged.js also developed epub.js, for displaying EPUBs in the browser.

I know there are also some W3C groups thinking about how to build specs around books in the browser,

outside of the EPUB format,

like the packaged web publication format that Dave touched on earlier.

So there are people thinking about what rules need to exist around books on the web, and reading on the web.

HTML

EPUB

??

PDF

Post-processing

And when you’re using pagination in the browser for book production workflows to create print PDF files,

you can use most of these tools to create layouts in the browser, and save those layouts to PDF,

but you’ll need to do a bit of extra work to make sure that your PDFs are compliant with your printer’s specifications.

That’s something that’s built into things like Prince and InDesign that browsers haven’t quite figured out.

Thank you!

Nellie McKesson

@nelliemckesson

nellie@hederis.com

And with that, I will unceremoniously end my talk.

Thank you!

Pagination in the Browser (ebookcraft 2019)