When I discuss the ideas behind cloudalion with people, a concern that keeps coming up is whether a declarative implementation of an application can ever perform as well as an imperative implementation.
In this post I intend to address this concern by giving a few examples in which declarative approaches lead to better performance than imperative ones.
HTML and CSS
HTML and CSS are declarative languages. They allow Web designers to describe the structure of a page, and then provide rules for styling its components. HTML and CSS are complex languages, giving very powerful features to Web designers. Their declarative nature allows the task of developing a Web application to be split between coders, who write “code” (client-side and server-side) and non-coders (Web designers), who write HTML and CSS.
But in my opinion, the reason for these standardized languages is not this division of labor. The reason is performance.
HTML is faster than X11
Let’s look at HTML first. It was first introduced in the early 90’s as a simple document format. However, around 1994, with the viral popularization of the Web following the introduction of Netscape, a previously esoteric part of HTML got more and more attention: forms. Unlike previous document formats, HTML included a way for users to write information in fields, check checkboxes, select elements from lists, and then submit it all to the server. This opened the door to the first Web applications.
Why was the introduction of forms so revolutionary? Because for the first time you could run an arbitrary application from a server located half-way around the world, with the GUI running (and responding) on your machine.
Before HTML, what were your options?
Well, there were some Windows applications that provided a GUI for specific services. If you wanted to connect to a new service, you had to download the application first, and you did not even have the Web to do that. You had to know exactly where it was located and use FTP for the download.
If you were a UNIX user, you were more in luck. The UNIX world has (and had, long before the Web) a distributed UI system commonly known as “X-Windows“. In this system, the client is an application running on a remote machine that wants to display some UI, and the server is a program running on the local machine, displaying the graphics to the user. The protocol used by the client and the server is named X11.
The X11 protocol allows clients (applications) and servers (windowing systems) to send messages to one another. The servers send events to the clients: mouse clicks, key-presses, etc, along with requests to draw parts of a window. The client, in turn, sends drawing commands to the server.
Now imagine a simple button. When a user clicks a button, we expect the windowing system to animate the click by changing the shading of the button to look pressed during the click. In X-Windows, this is done with these steps:
- The server sends a click event to the client.
- The client sends drawing commands for a pressed button to the server.
- The server draws the pressed button.
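The cost of that round-trip can be sketched with some back-of-the-envelope arithmetic. The latency and drawing-time numbers below are assumptions for illustration, not measurements:

```python
# Hypothetical latency figures comparing the X11 approach, where animating
# a button press requires a full client round-trip, with the HTML approach,
# where the browser animates the press locally.

RTT_LAN_MS = 1.0    # assumed LAN round-trip time
RTT_WAN_MS = 150.0  # assumed intercontinental round-trip time
DRAW_MS = 5.0       # assumed time to redraw the button

def x11_press_latency(rtt_ms):
    # click event to the client + drawing commands back = one round-trip,
    # and only then the server can draw the pressed button
    return rtt_ms + DRAW_MS

def html_press_latency(rtt_ms):
    # the browser knows what a button is and animates it by itself;
    # the network is not involved at all
    return DRAW_MS

print(x11_press_latency(RTT_LAN_MS))   # 6.0 ms: fine on a LAN
print(x11_press_latency(RTT_WAN_MS))   # 155.0 ms: visibly sluggish
print(html_press_latency(RTT_WAN_MS))  # 5.0 ms: independent of the network
```

The point of the sketch is that the X11 figure grows with distance while the HTML figure does not.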
A consequence is that drawing this simple animation requires a round-trip between the server and the client. As long as they are both on the same local-area network (LAN), this is fine. But over a wide-area network (WAN)? No way! As someone who sometimes needs to use X-Windows applications over a WAN, I can tell you it is not pretty. Usability drops significantly.
HTML pages, in contrast, show great usability even when the server and the client (which swap places with respect to X11) are located on two different continents. Why? Because HTML is more expressive, and describes the elements on the page, including buttons. Then, the client handles the animation of a click by itself. No round-trip needed.
You could say I’m comparing apples and oranges. Besides X11 being imperative (drawing commands and events) and HTML being declarative (the structure of a document), there are other differences between the two. HTML, for example, knows what a button is, while X11 does not. One could argue that if we extended X11 to include buttons, the X-server would be able to animate them by itself. But if X11 had known what a button was, kept track of its state, and animated it on the X-server side, wouldn’t that have made X11 a little more declarative?
Relational is Faster Than Hierarchical
This decade will be remembered in the history of databases as the decade in which SQL gave way to NoSQL. SQL was the “lingua franca” of databases for so long that only a few of us remember that some time ago, SQL itself was the new kid on the block, and the database world was dominated by other kinds of databases.
The hierarchical database model was one of a few popular models in the pre-relational era. A lone survivor of this family is LDAP, a protocol based on the hierarchical model that is still used for user databases.
Hierarchical databases were structured as trees. They exposed an API for programs to perform lookups and queries through paths in the tree. You could, for example, provide a path to a node and ask for all its children. You could set and get properties of each node, identified by its path.
With hierarchical databases you could probably do anything you can do with a relational database. So how come relational databases became so successful, while hierarchical databases have gone all but extinct?
I think the reason is that while you can represent any data model you can think of in either database model, the big difference between the two is what you can express in a single query. Consider the Twitter example we have discussed time and time again on this blog. In SQL, a timeline query is a single SELECT with a join between two tables. In a hierarchical database, a timeline query means fetching one’s followees (one query), and then, for each followee, fetching his or her tweets (another query per followee). That’s a lot of queries…
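Here is a minimal sketch of the difference, using SQLite from Python. The follows/tweets schema and the data are invented for illustration:

```python
import sqlite3

# A toy schema standing in for Twitter's data model (invented for this example).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE follows (follower TEXT, followee TEXT);
    CREATE TABLE tweets  (author TEXT, text TEXT);
    INSERT INTO follows VALUES ('alice', 'bob'), ('alice', 'carol');
    INSERT INTO tweets  VALUES ('bob', 'hi'), ('carol', 'hello'), ('dave', 'noise');
""")

# Relational style: the whole timeline is ONE query. The database sees the
# entire task at once and can plan the join (scan order, indexes) by itself.
timeline = conn.execute("""
    SELECT t.author, t.text
    FROM follows f JOIN tweets t ON t.author = f.followee
    WHERE f.follower = 'alice'
""").fetchall()

# Hierarchical style: one query for the followees, then one query per
# followee. That is N+1 queries, and the database cannot optimize across them.
followees = [row[0] for row in conn.execute(
    "SELECT followee FROM follows WHERE follower = 'alice'")]
timeline_n_plus_1 = []
for who in followees:
    timeline_n_plus_1 += conn.execute(
        "SELECT author, text FROM tweets WHERE author = ?", (who,)).fetchall()

print(sorted(timeline) == sorted(timeline_n_plus_1))  # True: same result, very different cost
```

Both versions return the same timeline; the difference is how much of the task the database gets to see at once.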
The number of queries is not the problem by itself. The main problem with performing one task through multiple queries is that it does not allow the database to optimize the task as a whole.
Relational databases go to great lengths to optimize queries. When you perform a query that joins two tables, the database will probably do its best to scan each table only once. One approach could be to scan one of the tables, store in memory the locations it needs to visit in the other table, and then visit them in their order on disk. All this to minimize seek time, and as a result, the total query time.
When we give the database one query at a time, its ability to perform such fancy optimizations is limited. Think of which would be the more efficient way for you to do your job: when your boss tells you in advance everything she needs from you, or when she calls you every five minutes with the next thing she needs you to do?
Stateless is Faster Than Stateful
MapReduce is a concept that revolutionized big-data processing. Google’s implementation of the same name, and Hadoop (the open-source project built on the same concept, which soon became the de-facto standard of big-data processing), are imperative in nature, but they are built on declarative principles, and these very principles are what make the concept work so well. The fact that mappers and reducers have no access to any state external to them is what makes it possible to run thousands of mappers and reducers in parallel.
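The shape of the concept can be sketched in a few lines of Python. The word-count example and the function names are mine, not Hadoop’s API:

```python
from itertools import groupby

# The user supplies two pure, stateless functions. Because they read no
# external state, a framework is free to run thousands of copies in parallel.

def mapper(line):
    # one input record -> a list of (key, value) pairs
    return [(word, 1) for word in line.split()]

def reducer(key, values):
    # all the values collected for one key -> one result
    return (key, sum(values))

lines = ["declarative code", "declarative principles", "code"]

# The framework's job: run the mappers (in any order, on any machine),
# shuffle the pairs by key, then run the reducers (again, in any order).
pairs = sorted(p for line in lines for p in mapper(line))
counts = dict(reducer(k, [v for _, v in g])
              for k, g in groupby(pairs, key=lambda p: p[0]))

print(counts)  # {'code': 2, 'declarative': 2, 'principles': 1}
```

The sequential loop above is just for illustration; the declarative contract (pure mapper, pure reducer) is what allows the real frameworks to parallelize it.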
Another example of declarative principles boosting performance through massive parallelism is GPU shaders. Traditionally, 3D graphics standards such as OpenGL and DirectX only included ways to draw solid objects. Effects such as fire or water required algorithms running on the CPU to update the geometry of the 3D scene at every frame. A CPU is typically not fast enough for this, so in most cases such effects were sketchy and unrealistic.
To address this problem, GPU vendors started at some point to allow user code to run on the GPU. OpenGL and DirectX provided programming languages that allow game creators to write their own algorithms to simulate things like water, fire, and smoke. These little programs, named kernels, are essentially functions that run for each pixel separately. They have access to the location of the pixel, as well as to arbitrary data that can be passed to them, and each returns the color of one pixel.
The idea of a kernel comes from signal processing, where kernels are usually expressed as mathematical functions. In OpenGL and DirectX they are expressed as programs in a C-like language, but these programs are so isolated and stateless that we can consider them declarative.
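In Python terms, a kernel is just a pure function of pixel coordinates and parameters. The toy “ripple” formula below is a made-up example, not real shader code:

```python
WIDTH, HEIGHT = 4, 4

def kernel(x, y, t):
    # Hypothetical effect: brightness depends only on position and time.
    # No shared state is read or written, so every pixel is independent.
    return (x + y + t) % 256

def render(t):
    # A real GPU evaluates the kernel for all pixels at once; the
    # sequential loops here are just for illustration.
    return [[kernel(x, y, t) for x in range(WIDTH)] for y in range(HEIGHT)]

frame = render(t=0)
print(frame[0])  # first row of the rendered frame: [0, 1, 2, 3]
```

Because the kernel has no access to its neighbors or to mutable state, the order of evaluation does not matter, and that is exactly what the GPU exploits.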
So, is Declarative Code Faster?
We’ve seen examples where declarative languages and approaches were taken to improve performance relative to imperative approaches. What do these examples have in common? Can we infer from them that declarative approaches are always faster than imperative ones?
The answer to the last question is obviously no. Imperative code will usually be faster at doing “imperative stuff”. Think about a sorting algorithm. Such an algorithm requires random access to an array and state modification. These are imperative things. You could define these operations in a declarative language and have some runtime execute them, but that will never run as fast as simply executing them imperatively. So what do these examples have in common? What makes declarative approaches improve performance in the examples above?
One thing common to almost all the above examples is that the declarative approach allows the software implementing it to make more informed decisions. Be it a Web browser that is better informed than an X11 server about the intended behavior of the GUI it presents, a relational database that is given the entire query at once and not piece by piece, or Hadoop, which is given the entire job definition (the mapper and the reducer) ahead of time, allowing it to plan how to run them in massive parallelism.
Another thing common to most of these examples is that the declarative language or model is captured in standards. These standards allow different implementations of browsers, databases, or GPUs to take different imperative approaches to make our Web pages, queries, or shaders run faster, without us having to do anything about it. In fact, standards encourage vendors to compete with one another on exactly that point.
The cloudalion project is designed to provide a purely-declarative development stack for Web applications. What does that mean? It means that you (the Web developer) write, in a declarative way, what your app needs to do, and we (the cloudalion project) make sure your app runs as well and as fast as we can make it. How fast? We start with slow, then move to somewhat slow, then to almost acceptable speed, then to acceptable speed, and maybe at some point we get to fast. This will not happen in one day, one month, or even one year. It takes time and effort. We are willing to take the time and make the effort. Are you?