A few simplified points on web and document security

(I’m largely thinking out loud with this and noting this down for myself, so feel free to ignore. Also, most of the following is extremely simplified. The actual security issues involved can get quite a bit more complicated.)

To prevent the spread of malware, modern client systems refuse to run code without safety assurances.

(Historically, well-implemented whitelist or whitelist-oriented security models have tended to be safer than blacklists or heuristics, all else being equal.)

1. User overrides

In most cases the end user can override the system and run the code regardless. Some only with great effort (iOS jailbreaking). Some with relative ease (Android side-loading). Some with no difficulty at all (Windows XP everything).

The easier it is for the user to force unsafe code to run, the more likely the system is to become infected. This has been solidly confirmed through observation and data.

The countervailing force is the platform’s median level of user expertise. The higher it is, the fewer precautions the system has to take and the more options it can offer for overriding those precautions. And since ecosystems that require high expertise tend to be small, the gain in targeting them often does not offset the difficulty for malware makers. This is not a panacea: as soon as the perceived value of the ecosystem as a target increases, the whole thing becomes a sitting duck.

Tying overrides to expertise in some way isn’t as useful as you’d think, since people can memorise complex activities without understanding them. That is, if users think they’re being inconvenienced by a security precaution, they can usually find step-by-step guides on how to override it, provided the overrides are available, and with use those overrides become muscle memory or habit. No understanding or expertise is required. This means that a system that intentionally allows the user to override precautions (as opposed to unintentionally, as with iOS jailbreaks) will become more and more porous over time.

If the balance of security precautions and user expertise is wrong, the system’s ecosystem gets hit with a ‘malware tax’: the cost users incur in making their systems safe. If that malware tax is too high, large parts of the ecosystem become unsustainable, which results in a loss of diversity and leaves a lot of potentially valuable use cases unaddressed.

2. The web’s security is built on origin

The web’s basic model of safety is this:

Does this come from a location I trust and, if so, how much can I trust it?

It measures that trust in multiple ways:

The only way to get the full set of capabilities is to come from an origin with the highest level of trust. Extensions can access more capabilities than encrypted websites, which in turn have more than unencrypted websites. Even when a site is run offline (e.g. with service workers), it is strongly bound to its origin (origin is a basic part of the service worker security model).

Origin as a measure of security is a basic assumption of most, if not all, web standards.
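As a concrete illustration of what ‘origin’ means here: an origin is the scheme + host + port triple of a URL, and two resources only share trust when all three components match exactly. A minimal sketch using the standard URL API (the helper names are mine, not from any spec):

```typescript
// An "origin" is the (scheme, host, port) triple of a URL.
// Two URLs share an origin only when all three components match exactly.
function origin(urlString: string): string {
  // URL.origin normalises default ports (443 for https, 80 for http).
  return new URL(urlString).origin;
}

function sameOrigin(a: string, b: string): boolean {
  return origin(a) === origin(b);
}

// Same scheme, host, and port: same origin, same trust.
console.log(sameOrigin("https://example.com/book", "https://example.com/toc")); // true
// Different scheme (http vs https): a different, less trusted origin.
console.log(sameOrigin("http://example.com/", "https://example.com/")); // false
// Different port: also a different origin.
console.log(sameOrigin("https://example.com:8443/", "https://example.com/")); // false
```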

3. Publishing is a low expertise environment

This is both in terms of in-house technical expertise and—more importantly—in terms of end-user expertise. Publications should require even less expertise of the end-user than an iOS app. Expecting your average reader to have more technical know-how and a better understanding of software security than your average iOS user is a non-starter in multiple ways.

Just as important is the lack of technical expertise within the publishing industry. An extremely common pattern among publishers and imprints is to only have a single in-house digital staff member who manages and oversees the outsourcing of all work related to digital.

Which leads us to the tech brain drain the industry is experiencing at the moment. A substantial proportion of the tech people I knew in the industry a year ago are no longer working in publishing proper. And the flow keeps increasing: publishers are reacting to the ebook slowdown by cutting down on digital staff.

Add to that the universally lower salaries throughout the publishing industry and you get an environment where technical expertise is limited to much smaller and less influential clusters than in comparable industries.

This isn’t an environment that’s conducive to making nuanced decisions on software security, especially given the possible repercussions of implementing an unsafe ecosystem. A ‘malware tax’ could easily kill off the entire digital publishing ecosystem, especially when its nearest competitors (non-portable websites and mobile phone apps) have a relatively low ‘malware tax’.

4. Portable web publications are… difficult

Portable Web Publications, the idea that a web page can be seamlessly transformed into a document with the same portability characteristics as an ebook (i.e. a packaged file that users can give to each other), is not-so-subtly being presented as the next-generation saviour of the obviously broken and dysfunctional ebook ecosystem.

The problem with this vision is this:

It assumes a symmetry in capabilities and behaviour between the hosted document (i.e. one with an HTTP or HTTPS origin) and the portable document. The web’s inherent security model dictates that the only safe portable document is one that’s severely restricted in its capabilities. And the publishing industry’s user-expertise requirements dictate that there cannot be any exceptions to these restrictions.

The web’s model of security is to give documents capabilities in proportion to how trusted their origin is. For documents whose origin cannot be trusted because it is a fully portable document, the only safe and compatible option is to give it extremely limited capabilities, much more limited than even those given to a regular unencrypted website.

There aren’t many ways around this and none that seem viable to me.

  1. A new security model from scratch, making sure that it slots neatly in as a replacement for the origin model.
  2. An app store-style whitelist: only publications from specific sources get full capabilities. This is what browsers do for extensions.
  3. A publisher whitelist (the OS X Gatekeeper model): only publications signed by a key from a central authority get full capabilities.
  4. An origin whitelist: only give full capabilities to portable documents whose code is identical in every way to code that’s available from a trusted source.
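The last option could, in principle, look something like this: at install time the client hashes the portable copy and compares it against the bytes the trusted origin actually serves. A rough sketch, where every name and the overall flow is hypothetical rather than taken from any spec:

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch of the origin-whitelist option: a portable document
// only gets full capabilities if its bytes are identical to what a trusted
// origin serves.
function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// In a real client, trustedCopy would be fetched over HTTPS from the
// document's claimed origin at install time; here both copies are passed in.
function grantFullCapabilities(
  localCopy: Uint8Array,
  trustedCopy: Uint8Array
): boolean {
  return sha256Hex(localCopy) === sha256Hex(trustedCopy);
}
```

Note that this is exactly where the portability problem shows up: without a network connection to the trusted origin, the comparison can never happen.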

The first option is impractical given the sheer amount of work involved and the high potential for error (a single design mistake dooms the entire system to be unsafe forever).

The second and the third are impractical, as limiting capabilities to central authorities completely destroys the utility of such documents and ties the ecosystem to silos; it becomes no better than Apple’s App Store or Amazon’s Kindle store.

The last option turns portability into a joke: it means you can only get full capabilities if there’s a network connection and an accessible origin at install time, which makes such documents functionally indistinguishable from progressive web apps, except more complicated and more error-prone.

Which leads us to…

5. What model would work?

If full and unrestricted capabilities are not an option for portable documents, the question becomes how we should restrict them. You’ll note that options 1-3 below are the ones ebook reading systems have tended to opt for (ebooks are, in essence, a dysfunctional subset of portable web documents).

  1. Block all network access. No cross-origin requests. No WebRTC. No image hot-linking. No nothing. EPUB3’s current model of only allowing an immutable whitelist of media assets (nothing dynamic) would work. Given how easy it is to get data out of a container like this using nothing but social engineering, combining this option with option 2 might be necessary.
  2. Limit storage. Depending on what other restrictions are in play this could be anything from ‘no persistent storage’ (like a browser’s incognito mode) to storage that is either limited severely in size or duration or both.
  3. Limit the context of where JavaScript can be run. This is the model used by iBooks Author books where JS can only be run in short-duration pop-up widgets with limited storage capabilities.
  4. Only allow a community-defined subset of JavaScript to be run: behaviours are exclusively defined by an open source set of custom elements whose implementation is accessed at a single authoritative origin. This is the model chosen by Google for AMP.
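For what it’s worth, option 1 maps fairly closely onto what a maximally strict Content Security Policy expresses. A rough, hypothetical sketch of such a policy, wrapped over several lines for readability (note that CSP alone doesn’t reliably cover every channel; WebRTC in particular has historically been hard to block this way):

```
Content-Security-Policy: default-src 'none';
                         img-src 'self';
                         media-src 'self';
                         style-src 'self';
                         font-src 'self';
                         script-src 'self';
                         connect-src 'none'
```

Here ‘self’ would mean the package itself, and connect-src 'none' blocks fetch, XHR, and WebSockets; everything not explicitly listed is denied by default.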

I personally favour option 4 since it is implicitly evolvable and postpones standardisation, and the bureaucracy that involves, indefinitely. It’s a model that allows for plurality (each niche or market can define its own set of custom elements for behaviour) and explicitly does away with the idea that one solution can possibly fit all. It would inevitably lead to a diverse set of pseudo-formats, each of which would target a specific use case but could still be run in a browser. And the browser would not have to implement any support or understand the format itself as anything other than standard HTML.

Options 1-3 are still viable but need to be clearly defined within the spec (something it took EPUB3 ages to clear up, for example) and require proper implementation by the client systems themselves, most of which are outside the publishing ecosystem’s sphere of influence (browser vendors, OSes, major reading systems). Also, experience with EPUB3 reading systems shows that implementing those options is error- and bug-prone.

The state of implementation in publishing both on the publisher side and on the reading system side is abysmal. Publisher ebooks barely use EPUB3 features and certainly aren’t idiomatic HTML (constructs like <p class="h1"> are the norm). Reading system EPUB3 support is spotty and very, very buggy. Therefore any path that requires pre-existing actors to suddenly become much better at their job than they have been to date has a much lower chance of success than a path that does not.

The only path that has a realistic chance of long-term success is community- and implementation-driven code where each use case is addressed separately by the people who need it.

Premature standardisation, i.e. standardising when the community isn’t yet (or shouldn’t yet be) sure what needs to be done or how, is an extremely risky path to take. Especially when you consider that ebook standardisation has, to date, been nothing short of an abject failure.

(The above is just off the top of my head. Notes for myself hammered out over coffee this afternoon. Feel free to ignore or disregard.)


If you care about portable web documents working in web-based systems, options 1-3 are not viable.

As in, if you want to be able to take a portable web document and read it in a reading system that has been built as, or integrated into, a website, even semi-unfettered JavaScript becomes too hard to implement. The logistics involved would make most, if not all, companies simply drop the feature from their roadmap.

And if you don’t think that’s an issue, ask yourself how far Atom/RSS would have gotten if they had mandated support for JavaScript in entries.

It can be done, sure. But it basically requires using Caja at this point, which is a major pain in the ass and adds another point of failure. In theory you could use Secure ECMAScript (SES), but that doesn’t seem to have progressed much over the past seven years and only supports a subset of JavaScript.

Which means that if you want to maximise the odds of implementation, you pretty much have to subset JavaScript in some way in Portable Web Publications.

And we all saw how well not caring about implementation worked out for EPUB3, didn’t we?