That’s you, frontend hacker

It’s time to kill the web

by Mike Hearn

Something is going on. The people are unhappy. The spectre of civil unrest stalks our programming communities.

For the first time, a meaningful number of developers are openly questioning the web platform. Here’s a representative article and discussion. Here’s another. Yet another. I could list more but if you’re interested enough in programming to be reading this you’ve already read at least one hilarious rant this year about the state of modern web development. This is not one of those articles. I can’t do a better job of mocking the status quo than the poor people who have to live it every day. This is a different kind of article.

I want to think about how one might go about making a competitor to the web so good that it eventually replaces and subsumes it, at least for the purpose of writing apps. The web has issues as a way of distributing documents too, but not severe enough to worry about.

This is the first of two articles. In part one I’m going to review the deep, unfixable problems the web platform has: I want to convince you that nuking it from orbit is the only way to go. After all, you can’t fix problems if you don’t analyse them first. I’ll also briefly examine why it is now politically acceptable to talk about these issues, although they are not actually new.

In part 2 I’ll propose a new app platform that is buildable by a small group in a reasonable amount of time, and which (IMHO) should be much better than what we have today. Not everyone will agree with this last part, of course. Agreeing on problems is always easier than agreeing on solutions.

Part 1. Here we go.

Why the web must die

Web apps. What are they like, eh? I could list all kinds of problems with them but let’s pick just two.

  1. Web development is slowly reinventing the 1990s.
  2. Web apps are impossible to secure.

Here’s a good blog post on Flux, the latest hot web framework from Facebook. The author points out that Flux has recreated the programming model used by Windows 1.0, released in 1985. Microsoft used this model because it was appropriate for very slow computers, but it was awkward to develop for and so within less than a decade there was an ecosystem of products (like OWL) that abstracted away the underlying WndProc messaging system.

One of the reasons React/Flux works the way it does is that web rendering engines are very slow. This is true even though the end result the user actually sees is only a little bit fancier than what a Windows user might have seen 20 years ago:

Windows 98

Sure, the screen resolution is higher these days. The shade of grey we like has changed. But the UI you see above is rather similar in complexity to the UI you see below:

Google Docs

Even the icon fashions are the same! Windows 98 introduced a new trend of flat, greyscale icons with lots of padding to a platform where the previous style had been colourful, tightly packed pixel art.

But Office 2000 was happy with a 75 MHz CPU and 32 MB of RAM, whereas the Google Docs shown above is using a 2.5 GHz CPU and almost exactly 10x more RAM.

If we had gained a 10x increase in productivity or 10x in features perhaps that’d be excusable, but we haven’t. Developer platforms in 1995 were expected to have all of the following, just as the price of entry:

  • A visual UI designer with layout constraints and data binding.
  • Sophisticated support for multi-language software components. You could mix statically typed native code and scripting languages.
  • Output binaries so efficient they could run in a few megabytes of RAM.
  • Support for graphing of data, theming, 3D graphics, socket programming, interactive debugging …

Many of these features have only been brought to the web platform in the past few years, and often in rather sketchy ways. Web apps can’t use real sockets, so servers have to be changed to support “web sockets” instead. Things as basic as UI components are a disaster zone, there’s no Web IDE worth talking about and as for mixing different programming languages … well, you can transpile to JavaScript. Sometimes.

One reason developers like writing web apps is that user expectations on the web are extremely low. Apps for Windows 95 were expected to have icons, drag and drop, undo, file associations, consistent keyboard shortcuts, do useful things in the background … and even work offline! But that was just basic apps. Really impressive software would be embeddable inside Office documents, or extend the Explorer, or allow itself to be extended with arbitrary plugins that were unknown to the original developer. Web apps usually do none of these things.

All this adds up. I feel a lot more productive when I’m writing desktop apps (even including the various “taxes” you are expected to pay, like making icons for your file types). I prefer using them too. And I know from discussions with others that I’m not alone in that.

I think the web is like this because whilst HTML as a document platform started out with some kind of coherent design philosophy and toolset, HTML as an app platform was bolted on later and never did. So basic things like file associations don’t exist, yet HTML5 has peer to peer video streaming, because Google wanted to make Hangouts and Google’s priorities dictate what gets added. To avoid this problem you need a platform that was designed with apps in mind from the start, and then maybe add documents on top, rather than the other way around.

Web apps are impossible to secure

At the end of the 1990’s a horrible realisation was dawning on the software industry: security bugs in C/C++ programs weren’t rare one-off mistakes that could be addressed with ad-hoc processes. They were everywhere. People began to realise that if a piece of C/C++ was exposed to the internet, exploits would follow.

We can see how innocent the world was back then by reading the SANS report on Code Red from 2001:

“Representatives from Microsoft and United States security agencies held a press conference instructing users to download the patch available from Microsoft and indicated it as “a civic duty” to download this patch. CNN and other news outlets following the spread of Code Red urged users to patch their systems.”

Windows did have automatic updates, but if I recall correctly they were not switched on by default. The idea that software might change without the user’s permission was something of a taboo.

First signs of a Blaster infection

The industry began to change, but only with lots of screaming and denial. Back then it was conventional wisdom amongst Linux and Mac users that this was somehow a problem specific to Microsoft … that their systems were built by a superior breed of programmer. So whilst Microsoft accepted that it faced an existential crisis and introduced the “secure development lifecycle” (a huge retraining and process program) its competitors did very little. Redmond added a firewall to Windows XP and introduced code signing certificates. Mobile code became restricted. As it became apparent that security bugs were bottomless “Patch Tuesday” was introduced. Clever hackers kept discovering that bug types once considered benign were nonetheless exploitable, and exploit mitigations once considered strong could be worked around. The Mac and Linux communities slowly woke up to the fact that they were not magically immune to viruses and exploits.

The final turning point came in 2008 when Google launched Chrome, a project notable for the fact that it had put huge effort into a complex but completely invisible renderer sandbox. In other words, the industry’s best engineers were openly admitting they could never write secure C++ no matter how hard they tried. This belief and design have become a de facto standard.

Now it’s the web’s turn

Unfortunately, the web has not led us to the promised land of trustworthy apps. Whilst web apps are kind of sandboxed from the host OS, and that’s good, the apps themselves are hardly more robust than Windows code was circa 2001. Instead of fixing our legacy problems for good the web just replaced one kind of buffer overflow with another. Where desktop apps have exploit categories like “double free”, “stack smash”, “use after free” etc, web apps fix those but then re-introduce their own very similar mistakes: SQL injection, XSS, XSRF, header injection, MIME confusion, and so on.

This leads to a simple thesis:

I put it to you that it’s impossible to write secure web apps.

Let’s get the pedantry out of the way. I’m not talking about literally all web apps. Yes you can make a secure HTML Hello World, good for you.

I’m talking about actual web apps of decent size, written under realistic conditions, and it’s not a claim I make lightly. It’s a belief I developed during my eight years at Google, where I watched the best and brightest web developers ship exploitable software again and again.

The Google security team is one of the world’s best, perhaps the best, and they put together this helpful guide to some of the top mistakes people make as part of their internal training program. Here’s their advice on securely sending data to the browser for display:

To fix, there are several changes you can make. Any one of these changes will prevent currently possible attacks, but if you add several layers of protection (“defense in depth”) you protect against the possibility that you get one of the protections wrong and also against future browser vulnerabilities. First, use an XSRF token as discussed earlier to make sure that JSON results containing confidential data are only returned to your own pages. Second, your JSON response pages should only support POST requests, which prevents the script from being loaded via a script tag. Third, you should make sure that the script is not executable. The standard way of doing this is to append some non-executable prefix to it, like ])}while(1);</x>. A script running in the same domain can read the contents of the response and strip out the prefix, but scripts running in other domains can't.
NOTE: Making the script not executable is more subtle than it seems. It’s possible that what makes a script executable may change in the future if new scripting features or languages are introduced. Some people suggest that you can protect the script by making it a comment by surrounding it with /* and */, but that's not as simple as it might seem. (Hint: what if someone included */ in one of their snippets?)

Reading this ridiculous pile of witchcraft and folklore always makes me laugh. It should be a joke, but it’s actually basic stuff that every web developer at Google is expected to know, just to put some data on the screen.
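
To make the quoted folklore concrete, here is roughly what it amounts to in server code. This is only a sketch using the plain Servlet API; the endpoint, header name, session attribute and response body are illustrative assumptions, not Google’s actual implementation.

import javax.servlet.http.HttpServlet
import javax.servlet.http.HttpServletRequest
import javax.servlet.http.HttpServletResponse

// Hypothetical endpoint returning confidential JSON, following the quoted advice.
class AccountServlet : HttpServlet() {
    // Only answer POST, so the response cannot be pulled in via a <script src=...> tag.
    override fun doPost(req: HttpServletRequest, resp: HttpServletResponse) {
        // XSRF token check (the header and session attribute names are made up).
        val token = req.getHeader("X-XSRF-Token")
        if (token == null || token != req.session.getAttribute("xsrfToken")) {
            resp.sendError(HttpServletResponse.SC_FORBIDDEN)
            return
        }
        // Non-executable prefix, so the body can never run if treated as a script.
        resp.contentType = "application/json"
        resp.writer.write("])}while(1);</x>")
        resp.writer.write("""{"accountBalance": 42}""")
        // The legitimate page fetches this via XHR and strips the prefix before parsing.
    }
}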

Actually you can do all of that and it still doesn’t work. The HEIST attack allows data to be stolen from a web app that implements all of the above mitigations, and it doesn’t require any mistakes. It exploits unfixable design flaws in the web platform itself. Game over.

Not really! It gets worse! Protecting REST/JSON endpoints is only one of many different security problems a modern web developer must understand. There are dozens more (here’s an interesting example and another fun one).

My experience has been that attempting to hire a web developer that has even heard of all these landmines always ends in failure, let alone hiring one who can reliably avoid them. Hence my conclusion: if you can’t hire web devs that understand how to write secure web apps then writing secure web apps is impossible.

The core problem

Virtually all security problems on the web come from just a few core design issues:

  • Buffers that don’t specify their length
  • Protocols designed for documents not apps
  • The same origin policy

Losing track of the size of your buffers is a classic source of vulnerabilities in C programs and the web has exactly the same problem: XSS and SQL injection exploits are all based on creating confusion about where a code buffer starts and a data buffer ends. The web is utterly dependent on textual protocols and formats, so buffers invariably must be parsed to discover their length. This opens up a universe of escaping, substitution and other issues that didn’t need to exist.

The fix: All buffers should be length prefixed from database, to frontend server, to user interface. There should never be a need to scan something for magic characters to determine where it ends. Note that this requires binary protocols, formats and UI logic throughout the entire stack.
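
To make that concrete, here is a minimal sketch of length-prefixed binary framing. The exact layout and the size limit are arbitrary choices for illustration, not part of any real protocol.

import java.io.DataInputStream
import java.io.DataOutputStream
import java.io.InputStream
import java.io.OutputStream

// Every buffer is preceded by its size, so nothing downstream ever scans for
// quotes, angle brackets or other magic characters to find out where it ends.
fun writeFrame(out: OutputStream, payload: ByteArray) {
    val stream = DataOutputStream(out)
    stream.writeInt(payload.size)   // 4-byte length prefix
    stream.write(payload)           // raw bytes; no escaping needed, any content is legal
    stream.flush()
}

fun readFrame(input: InputStream, maxSize: Int = 1 shl 20): ByteArray {
    val stream = DataInputStream(input)
    val size = stream.readInt()
    require(size in 0..maxSize) { "Unreasonable frame size: $size" }  // integrity check
    val payload = ByteArray(size)
    stream.readFully(payload)       // read exactly `size` bytes, never "until a delimiter"
    return payload
}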

HTTP and HTML were designed for documents. When Egor Homakov was able to break Authy’s 2-factor authentication product by simply typing “../sms” inside the SMS code input field, he succeeded because like all web services Authy is built on a stack designed for hypertext, not software. Path traversal is helpful if what you’re accessing is an actual set of directories with HTML files in them, as Sir Tim intended. If you’re presenting a programming API as “documents” then path traversal can be fatal.

REST was bad enough when it returned XML, but nowadays XML is unfashionable and instead the web uses JSON, a format so badly designed it actually has an entire section in its wiki page just about security issues.

The fix: Let’s stop pretending REST is a good idea. REST is a bad idea that twists HTTP into something it’s not, only to work around the limits of the browser, another tool twisted into being something it was never meant to be. This can only end in tears. Taking into account the previous point, client/server communication should be using binary protocols that are designed specifically for the RPC use case.

The same origin policy is another developer experience straight out of a Stephen King novel. Quoth the wiki:

The behavior of same-origin checks and related mechanisms is not well-defined in a number of corner cases … this historically caused a fair number of security problems.
In addition, many legacy cross-domain operations predating JavaScript are not subjected to same-origin checks.
Lastly, certain types of attacks, such as DNS rebinding or server-side proxies, permit the host name check to be partly subverted.

The SOP is a result of Netscape bolting code onto a document format. It doesn’t actually make any sense and you wouldn’t design an app platform that way if you had more than 10 days to do it in. Still, we can’t really blame them as Netscape was a startup working under intense time pressure, and as we already covered above, back then nobody was thinking much about security anyway. For a 10 day coding marathon it could have been worse.

Regardless of our sympathy it’s the SOP that lies at the heart of the HEIST attack, and HEIST appears to break almost all real web apps in ways that probably can’t be fixed, at least not without breaking backwards compatibility. That’s one more reason writing secure web apps is impossible.

The fix: apps need a clear identity and shouldn’t be sharing security tokens with each other by default. If you don’t have permission to access a server you shouldn’t be able to send it messages. Every platform except the web gets this right.

There are a bunch of other design problems in the web that make it hard to secure, but the above examples are hopefully enough to convince.

Conclusion

HTML 5 is a plague on our industry. Whilst it does a few things well those advantages can be easily matched by other app platforms, yet virtually none of the web’s core design flaws can be fixed. This is why the web lost on mobile: when presented with competing platforms that were actually designed instead of organically grown, developers almost universally chose to go native. But we lack anything good outside of mobile. We desperately need a way of conveniently distributing sandboxed, secure, auto-updating apps to desktops and laptops.

Ten years ago I’d have been crucified for writing this article. I expect some grumbling now too, but in recent times it’s become socially acceptable to criticise the web. Way back then, the web was locked in a competition with proprietary platforms like Flash, Shockwave and Java. The web was open but its survival as a competitive platform wasn’t clear. Its eventual resurgence and victory is a story calculated to push all our emotional buttons: open is better than closed, collective ownership is better than proprietary, David can beat Goliath etc. Many programmers developed a tribal loyalty to it. Prefixing anything with “Web” made it instantly hip. Suggesting that Macromedia Flash might actually be good would get your geek card revoked.

But times change. The web has grown so fat that calling it open is now pretty meaningless: you have no chance of implementing HTML5 unless you have a few billion dollars you’d like to burn. The W3C didn’t meet its users’ needs and is now irrelevant, so unless you work at Google or Microsoft you can’t meaningfully impact the technical direction of the web. Some of the competing platforms that were once closed opened up. And the JavaScript ecosystem is imploding under the weight of its own pointless churn.

It’s time to go back to the drawing board. Now grab a drink and read the next article in this series: what follows the web?

Link: https://blog.plan99.net

What should follow the web?

Design principles and an instantiation for a post webapp world

In part 1 of this series I argued that it’s time to start thinking about how to replace the web. The justifications are poor productivity and unfixable security problems.

Some people felt that what I wrote was too negative, and didn’t recognise the web’s good points. That’s right: part one was “let’s discuss the idea that we’ve dug ourselves a big hole”, part two is “how do we design something better?” This is a huge topic so it will actually take more than two parts.

Let’s call our competitor to the web, NewWeb (er, branding can come later). To start we must understand why the web became successful in the first place. The web out-competed other ways of writing apps that had far better tools for building graphical applications, so clearly it had some qualities that overwhelmed its flaws. If we can’t match those we’re doomed.

Building a GUI with Matisse, a UI designer with automatic alignment and constraints. Long live the (newly stylish) King?

We also need to focus on cheapness. The web has vast teams of full time developers building it. Much of their work is duplicated or thrown away, some of what they’ve created can be reused, and small amounts of new development might be possible … but by and large any NewWeb will have to be assembled out of bits of software that already exist. Beggars can’t be choosers.

Here’s my personal list of the best 5 things about the web:

  1. Deployment & sandboxing
  2. Shallow learning curve for users and developers
  3. Eliminates the document / app dichotomy
  4. Advanced styling and branding
  5. Open source and free to use

These aren’t exactly the most famous design principles: if you asked most developers to sum up the essence of the web’s architecture they’d probably say URLs, view source, cross platform, HTTP and so on. But those are just implementation details. The web didn’t take over from Delphi and Visual Basic because JavaScript rocked so much more. There were deeper reasons.

The above things are pretty self explanatory. I’ll analyse them more as we go. To beat the web we must go further than just matching what it does well, we must also offer unique improvements the web cannot easily incorporate. Otherwise there’d be no point: we’d just work on making the web better instead.

I offer two things in this text: some design principles I think any serious web competitor should follow, and a concrete instantiation cobbled together out of various open source projects. Many readers won’t like that concrete instantiation because they’ll have other projects or languages in mind they like better, and that’s fine. I don’t care. It’s just an illustration — really it’s the principles that matter. I name specific codebases only to show that dreaming is not totally unrealistic.

Design principles

The web lacks a coherent design philosophy, at least for the app parts. Here are some things I think it should have but lacks:

  1. A clear notion of app identity
  2. A unified data representation from backend to frontend
  3. Binary protocols and APIs
  4. Platform level user authentication
  5. IDE oriented development
  6. Components, modules and a great UI layout system — just like regular desktop or mobile development.
  7. It will also need the things the web does well: instant start of streaming apps with no installation or manual updates required, sandboxing of those apps, sophisticated UI styling, mix and match of “documenty” stuff with “appy” stuff (like being able to link to particular app views), a very incremental learning curve and a design that discourages developers from building UI that is too complicated.

The first four items are all security related. That’s because security issues are my motivation for taking such an extreme position. Perhaps the low productivity of the web could be fixed with yet another JavaScript framework — perhaps. But I don’t think security can be. I’ll look at the remaining items in later parts.

App identity and linking

To get the deployment and sandboxing benefits of the web, some sort of app browser will be required. We’d also like the ability to treat parts of an app as if it were a hypertext document so we can link to things. That’s a key part of the web’s success — an Amazon search results page is sort of an app, but because we can link to it, it’s also sort of a document.

But our app browser shouldn’t physically resemble a web browser. Look at it with fresh eyes: web browser UI is not excellent. 

The URL is a key part of the web’s design but it does make a mess sometimes!

The first problem is that URLs leak all over the UI. The URL bar routinely holds random bits of browser memory encoded in a form that is difficult to understand for both humans and machines, leading to a parade of exploits. If a desktop app dumped random internal variables into the title bar we would rightly regard that as brand-destroying bugginess, so why do we tolerate it here?

The second problem is that URLs are hard to read for machines too (mind blowing exploits available here). Even ignoring clever encoding tricks like those, the web has various ways to establish the identity of an app. Browsers need some notion of app identity for separating cookie jars and tracking granted permissions. This is an “origin”. The concept of a web origin evolved over time and is essentially impossible to describe. RFC 6454 tries, but as it admits:

Over time, a number of technologies have converged on the web origin concept as a convenient unit of isolation. However, many technologies in use today, such as cookies [RFC6265], pre-date the modern web origin concept. These technologies often have different isolation units, leading to vulnerabilities.

For example, ponder what stops your server setting a cookie on the .com domain. Now what about kamagaya.chiba.jp? (it’s not a website, it’s a section of the hierarchy like .com is!)

Being able to unexpectedly link to parts of an app that don’t benefit from it and aren’t expecting it is one of the sources of the “dangerous link” problem. Heavily URL-oriented design makes all of your app subject to data injection by malicious third parties, a design constraint that’s proven near impossible to handle well, yet is blithely encouraged by the web’s architecture.

Still, being able to link to the middle of an app from documents and vice-versa is pretty nice when it works and is expected by developers. Let’s take a couple of requirements:

Requirement: app identity (isolation) must be simple and predictable.

Requirement: we must be able to deep link into an app, but not always.

To their credit, the Android designers recognised these requirements and came up with solutions. We can start there:

  • App identity is defined by a public key. All apps signed by the same key share the same isolation domain. String parsing is not used to determine isolation.
  • Apps can opt in to receiving strongly typed packets of data that ask them to open themselves with some state. Android calls this concept “intents”. Intents aren’t actually very strongly typed in Android, but they could have been. They can be used for internal navigation as well, but to allow linkage from some other app the intent must be published in the app’s manifest. By default, the app cannot be invoked via links.
  • Where to find an app to download? A domain name is a good enough starting point. Whilst we could support HTTP[S] downloads too, ideally we’ll find a NewWeb server listening on a well known port and we can fetch the code from there. So a domain name becomes a starting point for getting the app, but is otherwise quite unimportant.

To distinguish URLs and intents from our own take, let’s call them link packets.

Requiring developers to opt-in to external linkage will lower the linkability of NewWeb in order to improve its security. Perhaps that is a poor tradeoff and will be fatal. But many links people can theoretically create on the web today are invisibly useless anyway due to authentication requirements, or because they don’t contain any useful state to begin with (many SPAs). And such apps would minimize their surface area for attack. The malicious link that is the starting point of so many exploits would immediately become less severe. That seems valuable.

There are other benefits too — the importance of the domain name for webapps makes it tempting for domain providers to seize domains their executives personally disagree with. Whilst simple websites can be relocated without too much pain, more complex sites that may have established email reputations, OAuth tokens and so on are hurt far more badly. Private keys can be lost or stolen, but so can domain names. If you lose your private key at least it’s your own fault.

As for the rest of the browser UI, it can probably be scrapped. Tabs could be useful, but reload should never be necessary and the back button can be provided by the app itself when it makes sense to do so, a la iOS.

How to build the app browser? Even though NewWeb is pretty different to the web, the basic UI of fullscreen apps is OK and users will want to switch back and forth. So why not fork Chromium and add a new tab mode?

A unified data representation

Maybe the link packet concept sounded a bit vague. What is this thing? To define it more precisely we need data structures.

The web has a variety of ways to do this but they’re all text based. To stress a point from my last article, text based protocols (not just JSON) have the fundamental flaw that buffer signalling is all ‘in band’, that is, to figure out where a piece of data ends you must read all of it looking for a particular character sequence. Those sequences may be legitimately a part of the data, so that means you also need an escaping mechanism. And then because these protocols are meant to be sorta human readable, or at least nerd readable, there are often odd corner cases to do with whitespace handling or Unicode canonicalisation that end up leading to exploits, like HTTP header splitting attacks.

In the early days this might have helped the web catch on with developers. View source has definitely helped me out more times than I can count. But the average web page is now over 2 megabytes in size, let alone a web app. Even boring web pages of static text frequently involve masses of minified JavaScript that you need machine help to even begin to read. The benefits seem less than they once did.

A primitive decompiler

In fairness, the web has in recent years been slowly moving away from text based protocols. HTTP/2 is binary. The much touted “WebAssembly” is a binary way to express code, although it doesn’t actually fix any of the problems I’ve been discussing. But still.

Requirement: data serialisation should be automatic, typed, binary and consistent from data store to front end.

Serialisation code is not only tedious to write but also a major attack surface. A good platform would handle it for you. Let’s define a toy syntax for expressing data structures:

 

enum class SortBy {
    Featured,
    Relevance,
    PriceLowToHigh,
    PriceHighToLow,
    Reviews
}

@PermazenType
class AmazonSearch(
    val query: String,
    val sortBy: SortBy?
) : LinkPacket
 

The web equivalent of this would be an amazon.com URL. It defines an immutable typed data structure to represent a request to open the app in some state.
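
For contrast, here’s what creating and consuming such a link packet might look like in application code, next to the stringly typed URL it replaces. This is purely illustrative and reuses the toy types defined above.

// The web version: one big string that must be parsed, escaped and validated by hand.
//   https://www.amazon.com/s?k=...&sort=...
//
// The typed version: a value that cannot be half-escaped or malformed.
val packet = AmazonSearch(query = "mechanical keyboard", sortBy = SortBy.Reviews)

fun openSearch(packet: AmazonSearch) {
    // No parsing step: the fields already have the right types when the app receives them.
    println("Searching for '${packet.query}', ordered by ${packet.sortBy ?: "default"}")
}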

That data structure is marked with @PermazenType. What’s this all about?

If we’re serious about being length prefixed, strongly typed and injection-free throughout the entire stack then we have to do something about SQL. The Structured Query Language is a friendly and well understood way to express sophisticated queries to a variety of stonkingly powerful database engines, so this is a pity. But, SQL is a text based API. Despite SQL injection being one of the easiest exploit types to understand and fix, it’s also one of the most common bugs in real websites. That’s no surprise: the most obvious way to use SQL in a web server will work fine but silently opens you up to being hacked. Parameterised queries help but are a bolt-on fix, which can’t be used in every situation. The tech stack creates a well disguised bear trap for us to fall into, and our goal is to maximise the feature-to-bear-trap ratio.
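
For anyone who hasn’t stepped in this particular trap, here is a minimal JDBC sketch of the obvious-but-injectable query next to the parameterised fix. The table and column names are made up.

import java.sql.Connection

// The "obvious" way: works fine in testing, silently injectable in production.
fun findUserUnsafe(conn: Connection, name: String) =
    conn.createStatement().executeQuery(
        "SELECT * FROM users WHERE name = '$name'"   // name = "x' OR '1'='1" dumps the table
    )

// The bolt-on fix: a parameterised query. The value travels out of band of the SQL text,
// which is exactly the no-in-band-signalling idea behind length-prefixed buffers.
fun findUserSafe(conn: Connection, name: String) =
    conn.prepareStatement("SELECT * FROM users WHERE name = ?").apply {
        setString(1, name)
    }.executeQuery()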

SQL has some other issues. You quickly hit the object-relational mapping problem. The results of a SQL query can’t be natively sent over an HTTP connection or embedded into a web page, so there’s always some required transformation into another format. SQL hides the performance of the underlying queries from you. The backends are hard to scale. Schema changes often involve taking table locks, making them effectively un-deployable without triggering unacceptable downtime.

NoSQL engines are not convincingly better: they typically fix one or two of these problems but at the cost of throwing out SQL’s solutions to the rest, meaning you’re often stuck with a solution that is different but not necessarily better. As a consequence the biggest companies like Google and Bloomberg have spent a lot of time researching how to make SQL databases scale (F1, Comdb2).

Permazen is a new take on data storage that I find quite fascinating at the moment. Quoting the website:

Permazen is a completely different way of looking at persistence programming. Instead of starting from the storage technology side, it starts from the programming language side, asking the simple question, “What are the issues that are inherent to persistence programming, regardless of programming language or database storage technology, and how can they be addressed at the language level in the simplest, most correct, and most language-natural way?”

Permazen is a Java library; however, its design and techniques could be used in any platform or language. It requires a sorted key/value store of the kind that many cloud providers can supply (it can also use an RDBMS as a K/V store), but it does everything else inside the library. It doesn’t have tables or SQL. Instead it:

  • Uses the collection operations of the host language to do unions and intersections (joins).
  • Schemas are defined directly by classes; no object/relational mapping needs to be defined.
  • Can do incremental schema evolution ‘just in time’, avoiding the need for table locks to change how data is represented in the data store.
  • Transactions are precisely defined and copying data out of a transaction or back in is clearly controllable by the developer.
  • Change monitoring, such that you can get callbacks when the underlying data changes, including across processes (if the underlying K/V store supports this, which they often do).
  • There’s a CLI that lets you query and work with the data store (e.g. to force data migrations if you don’t want just-in-time migration), and a GUI.

Really, you should just read the excellent white paper or read the slides (the slides use the old name for the library but it’s the same software). Permazen isn’t well known, but it’s the cleverest and slickest approach to integrating data storage tightly with an object oriented language that I’ve yet seen.

One of the interesting things about Permazen is that you can use ‘snapshot transactions’ to serialise and deserialise an object graph. That snapshot even includes indexes, meaning you can stream data into local storage if memory is too small and do indexed queries over it. It should be clear how this can provide a unified form of data storage with offline sync support. Resolving conflicts can be done transactionally as all Permazen operations can be done inside a serialisable transaction (if you want a Google Docs like conflict-free experience that will require an operational transform library).

NewWeb doesn’t have to use this sort of approach. You could use something a bit more traditional like protobufs, but then you need to use a special IDL which is equally inconvenient in all languages, and also tag your fields. You could use CBOR or ASN1. In my current project, Corda, we have built our own object serialisation engine that builds on AMQP/1.0 and in which messages are by default self describing, so given a binary packet you can always get back the schema — as such it can do a View Source feature, kinda like XML or JSON. AMQP is nice because it’s an open standard and the base type system is pretty good (e.g. it knows about dates and annotations).

The point is that receiving data from potentially malicious sources is risky, so you want as few degrees of freedom in the format as possible and as many integrity checks as possible. All buffers should be length prefixed, so users can’t inject malicious data into a field to try and terminate it early. Data needs to be thoroughly checked as it’s loaded, so the best place to do that is in the type system and constructors of the data structures themselves — that way you can’t forget. Schemas help understand what the structure should be, making it harder for attackers to create type confusions.
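
As a small sketch of what “check the data in the type system and constructors” looks like in practice, validation can live in the types themselves, so a deserialiser cannot hand the rest of the program a malformed value. The field names and limits here are illustrative assumptions.

// Validation runs whenever an instance is constructed, including during deserialisation.
class Username(val value: String) {
    init {
        require(value.length in 1..64) { "Bad length: ${value.length}" }
        require(value.all { it.isLetterOrDigit() || it == '_' }) { "Illegal characters" }
    }
}

class SearchRequest(val user: Username, val query: String, val pageSize: Int) {
    init {
        require(query.length <= 1_000)   // no unbounded buffers
        require(pageSize in 1..100)      // no absurd result sets
    }
}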

Despite the fully binary nature of the proposed scheme, a one-way transformation to text can be defined for debugging and learning purposes. In fact, Permazen does that too.

Simplicity and learning curve

One problem with type systems is that they add complexity; they can feel like a lot of pointless bureaucracy to developers who aren’t used to them. And without tooling, binary protocols can be harder to learn and debug.

Requirement: a shallow learning curve

I’m convinced that a big part of the web’s success is because it’s essentially untyped — everything is just strings. Bad for security, productivity and maintainability but really good for the learning curve.

One way this is being addressed in the web world is with gradually typed versions of JavaScript. That’s good and valuable research, but those dialects aren’t widely used and most new languages are at least somewhat strongly typed (Rust, Swift, Go, Kotlin, Ceylon, Idris …).

Another way to address this might be with a clever IDE: if you let developers start out with untyped data structures (everything is defined as Any) the runtime can figure out what the underlying types seem to be, and feed them back into the tool. It can then offer to swap in better type annotations. If the developer seems to be encountering typecast errors as they work, the IDE can offer to relax the constraint back again.

Of course, this gives less safety and maintainability than forcing the developer to really think about things up front, but even fairly weak heuristically determined types that only trigger errors at runtime can protect against a lot of exploits. Runtimes like the JVM and V8 already collect this sort of type information. In Java 9 there’s a new API that lets you control the VM at a low level (the JVMCI) which exposes this sort of profiling data — it’d be a lot easier to experiment with this type of tooling now.

Languages and VMs

Talking of which, what language should NewWeb use? Trick question of course: the platform shouldn’t be fussy.

There have been many attempts to introduce non-JavaScript languages to the web over the years, but they all died trying to get consensus from browser vendors (mostly Mozilla). In fact the web has gone backwards over time in this regard — you used to be able to run Java, ActionScript, VBScript and more, but as browser vendors systematically erased plugins from the platform everything that wasn’t JavaScript got scrubbed out. That’s a shame. It could certainly use competition. It’s a sad indictment of the web platform that WebAssembly is the only effort to add a new web language, and that language is C … which I hope you don’t want to write web apps in! XSS is bad enough without adding double frees on top.

It’s always easier to agree on problems than solutions, but never mind — it’s time to get (more) controversial. The core of my personal NewWeb design would be the JVM. This will not surprise those who know me. Why the JVM?

  • HotSpot can run the largest variety of languages of any virtual machine, and the principle of cheapness dictates that we want to be able to use as much open source code as possible. 
    Yes, that means I want to be able to make apps that include JavaScript (run at high speed), along with Ruby, Python and Haskell, and oh yes — C, C++ and Rust should come along for the ride too. All this is possible because of two really cool projects called Graal & Truffle, which I’ve written about extensively before. These projects let the JVM run even languages you wouldn’t expect (like Rust) in a virtualized, fast and interoperable way. I don’t know of any other VM that can blend so many languages together, so seamlessly, and with such good performance.
  • The OpenJDK sandbox has been attacked relentlessly for years and has learned the same painful lessons that web browser makers have. The last zero day exploit was in 2015, and the one before that was in 2013. Just two sandbox 0-days in 5 years is good enough for me, especially because lessons have been learned from each exploit found and fundamental security improvements have been made over time. No sandbox has a perfect track record, so really this boils down to a personal judgement call — what level of security is good enough?
  • There’s huge quantities of high quality libraries that solve key parts of the puzzle, like Permazen and JavaFX.
  • I happen to know it quite well and I like Kotlin, which has a JVM backend.

I expect lots of people to disagree with that choice. No problem — the same design ideas can be implemented on top of Python or V8 or Go or Haskell, or whatever else floats your boat. Best if you can pick something that has an open spec with competing implementations though (the JVM does).

The politics of Oracle don’t concern me much because Java is open source. There aren’t a lot of high quality, high performance open source runtimes out there to choose from. The ones that do exist are all built by giant software firms who have all made their own questionable decisions in the past as well. The web has at various times been heavily influenced or outright controlled by Microsoft, Google, Mozilla and Apple. All of them have done things I found objectionable: that’s the nature of large companies. You won’t agree with their actions all the time. Oracle won’t win any popularity contests, but the nice thing about open source code is that it doesn’t really have to.

There are a few things I’d want to reuse from Google Chrome specifically. My app browser should support an equivalent to WebGL (i.e. eGL), and Chromium’s ANGLE project would be useful for that. The app browser should silently auto update like Chrome does, and Google’s auto update engines are open source too. Finally, whilst the Pack200 format compresses bytecode by a lot, throwing in nice codecs like Zopfli wouldn’t hurt.

RPC

Binary data structures aren’t enough. We need binary protocols too. A standard way to link client with server is RPC.

One of the coolest British startups right now is Improbable, who have published a great blog post on how they’re moving away from REST+JSON and towards gRPC+protobufs from the browser to the server. Improbable describe the result as a “type safe nirvana”. Browsers don’t make this as easy as HTTP+XML or JSON, but with enough JavaScript libraries you can bolt it on top. It’s a good idea. Back in the real world where we’re all writing web apps, you should do this.

RPC protocols take some experience to design. My dream RPC supports:

  • Returning futures (promises) and ReactiveX Observables, so you can stream “push” events naturally.
  • Returning large or infinitely sized byte streams (e.g. for live video).
  • Flow control for high vs low priority transfers.
  • Versioning, interface evolution, type checking.
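
Sketched very roughly, and with purely illustrative types (reusing the toy AmazonSearch packet from earlier), such an RPC surface might look like the interface below. None of this is a real library; the point is that the platform would generate the wire format from the interface, so there is nothing to parse by hand and no paths to traverse.

import io.reactivex.Observable
import java.util.concurrent.CompletableFuture

interface StoreService {
    // Plain request/response, returned asynchronously.
    fun search(request: AmazonSearch): CompletableFuture<List<Product>>

    // A "push" stream of events, e.g. price changes for a watched product.
    fun watchPrice(productId: Long): Observable<PriceUpdate>

    // A large or unbounded byte stream, with flow control handled by the transport.
    fun downloadInvoice(orderId: Long): Observable<ByteArray>
}

// Illustrative data types for the sketch above.
class Product(val id: Long, val title: String, val priceCents: Long)
class PriceUpdate(val productId: Long, val newPriceCents: Long)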

Again, in the Corda project we’ve tackled designing an RPC stack with these properties. It isn’t currently released as a standalone library but perhaps in future we’ll do that.

HTTP/2 is one of the most recent and most designed parts of the web, and it’s a decent enough framing and transport protocol. But it inherits a lot of cruft from HTTP. Consider the hacks opened up by obscure HTTP methods that are never used. HTTP’s odd approach to state doesn’t help. HTTP itself is stateless, but apps need stateful sessions, so app devs have to add their own on top using a mishmash of easily stealable cookies, tokens and so on. This leads to problems like session fixation attacks.

It’d be better to split things more clearly into layers:

  • Encryption, authentication and session management. TLS does a decent job of this. We’d just reuse it.
  • Transport and flow control. HTTP/2 does this, but messaging protocols like AMQP also do it, without the legacy.
  • RPC

The RPC stack is obviously integrated with the data structure framework, so all transfers are typed. Developers never need to do any parsing by hand. RPC naming is just a simple string comparison. There are no path traversal vulnerabilities as a consequence.

User authentication and sessions

NewWeb would not use cookies. It’s better to use a public/private keypair to identify a session. The web is heading in this direction, but to be compatible with existing web apps there has to be a complicated “binding” step where bearer tokens like cookies are connected to the underlying cryptographic session and that’s just needless complexity in a security sensitive component — it can be tossed in a clean redesign. Sessions are only ever identified on the server side by a public key.
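
A minimal sketch of the idea, using the standard java.security APIs; the challenge/response framing is an illustrative assumption rather than a real protocol.

import java.security.KeyPairGenerator
import java.security.SecureRandom
import java.security.Signature

fun main() {
    // Client: generate a keypair for this session (or reuse a long-lived one).
    val keyPair = KeyPairGenerator.getInstance("EC").apply { initialize(256) }.generateKeyPair()

    // Server: issue a random challenge and remember it against the connection.
    val challenge = ByteArray(32).also { SecureRandom().nextBytes(it) }

    // Client: sign the challenge; no bearer token ever crosses the wire.
    val signature = Signature.getInstance("SHA256withECDSA").run {
        initSign(keyPair.private)
        update(challenge)
        sign()
    }

    // Server: the session is now simply "whoever holds this public key".
    val ok = Signature.getInstance("SHA256withECDSA").run {
        initVerify(keyPair.public)
        update(challenge)
        verify(signature)
    }
    println("Session bound to public key: $ok")
}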

User authentication is one of the hardest bits of a web app to get right. I wrote down some thoughts on that previously so I won’t go deep on that again. Suffice it to say that linking a session to an email address is a feature best handled by the base platform and not constantly re-invented at the app layer. A client side TLS certificate is sufficient to implement a basic single sign-on system, with the basic ‘send email and sign certificate if received’ flow being so cheap that Let’s Encrypt style free providers are not unthinkable.

This approach builds on tech that already exists, but should make a serious dent in phishing, password database breaches, password guessing and many other related issues.

Conclusion

A platform that wants to compete with the web should have a strong focus on security, as that’s the primary justification and competitive advantage. It’s the main thing you can’t easily fix about the web. That means: binary, type safe data structures and APIs, RPC, cryptographic sessions and user identification.

It also means code that’s sandboxed and streaming, UI that has great layout management yet is also responsive and styleable, some way to blend documents and apps together as well as the web does, and a very productive development environment. We also need to consider the politics of the thing.

I’ll take a look at these other aspects in a future post.

Link: https://blog.plan99.net
