The Abstraction Gap

On Artima, Bill Venners describes what Josh Bloch calls the “semantic gap” problem, which concerns the performance optimization of programs. I think that the term “semantic gap” is misused here: it is already used in computer science to denote a rather different kind of problem. In embedded system development we already know this problem very well, and we call it the “Abstraction Gap” (see for example this paper or some literature on MDA software development for embedded systems).

The Abstraction Gap problem has to do with the ever-increasing distance between the model of the computation platform that designers and programmers have in mind when developing their software, and the actual platform on which the software will eventually execute.

When I was a boy, the personal computers were the Commodore VIC-20 and the Sinclair ZX Spectrum.

VIC 20

The first one had 2 KB of RAM, with the (costly) option of expanding to 16 KB. It was not possible to do much with 2 KB, but some people actually managed to code simple video games on it.

When the Commodore 64 arrived on the market, 64 KB looked like a lot of memory. Do you remember the video games that were available on the C64? Some of them were masterpieces, quite impressive at the time. Programmers were able to exploit every single byte to improve the quality of the graphics and the sound. Every variable was double-checked to see if it could be eliminated or reused later in the same program.

Soccer game for C64

The games were so good that people started writing emulators, and the nostalgic can now play some of these games, even online.

Today, with no less than 4 GB of RAM, we have about 5 orders of magnitude more RAM than the C64, and an even greater increase in computational power. However, it is clear that programmers today do not put as much emphasis on performance optimization as they used to. Why?
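The "about 5 orders of magnitude" figure can be checked with a few lines of arithmetic. This is only a quick sketch: the constant names are mine, and the sizes are simply the 64 KB and 4 GB mentioned in the text.

```python
import math

# Sizes taken from the text; the names are illustrative only.
C64_RAM_BYTES = 64 * 1024          # 64 KB, the Commodore 64
MODERN_RAM_BYTES = 4 * 1024 ** 3   # 4 GB, the "today" figure

# How many times more RAM, and how many powers of ten that is.
ratio = MODERN_RAM_BYTES // C64_RAM_BYTES
orders_of_magnitude = math.log10(ratio)

print(ratio)                          # 65536
print(round(orders_of_magnitude, 1))  # 4.8
```

So the ratio is 2^16, a little under five orders of magnitude.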

The answer is complex, and there are many reasons behind it, the most important being:

  1. The cost of software development grows with the time it takes to complete it. Quicker development means a greater probability that we can make money out of it. Moreover, thanks to Moore's law, RAM size grows exponentially, as does processing power (at least as of today). Therefore, it is less costly to wait for a more powerful platform (which will come anyway) than to spend months on costly and dangerous code optimization (the probability of introducing new bugs while optimizing is not so small…)
  2. In the old and glorious times of the Commodore 64, programmers were coding for one specific platform only. In many cases, a video game was available only on the C64, and it was rarely ported to other platforms (like the ZX Spectrum). Therefore, programmers could tailor their coding style and technique to the specific hardware, and use every single feature of it. This is still true in some cases (for example when programming small embedded systems). However, in the vast majority of cases, portability is a must today. Designers and programmers sometimes develop software before the hardware is even available! They must take into account that their program needs to run on many hardware platforms, and sometimes be portable across many operating systems.

So, the two main drivers are the need to speed up development and the need to write portable applications (write once, run anywhere, WORA). To accomplish these goals, the “level of abstraction” was raised little by little. First, the operating system abstracted away the hardware limitations (for example, by introducing virtual memory). Then, libraries and tools provided pre-implemented data structures and algorithms, making it easier to write complex applications. Then, virtual machines (like the Java Virtual Machine) and interpreters made it easy to write portable code. Now, operating system virtualization allows multiple OS images to execute on the same hardware. And web browsers can execute code too, behaving like virtual machines themselves.

However, this happened not only in software but also in hardware. The Intel assembly instructions have been more or less the same for a very long time, but behind them we can find two or three levels of cache, pipelining, speculative parallelism at the micro-instruction level, instruction reordering, and multi-threading. All of this is completely hidden from the programmer. It is almost impossible to calculate precisely how long a certain piece of code will take on a modern processor, because it depends on too many things and too many mechanisms, many of them undocumented.
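A small experiment makes the point tangible. The sketch below times exactly the same function several times and prints the fastest and slowest run; `work` and `time_runs` are names I made up for the illustration. On most machines the two numbers differ, although the code never changes, and the spread also includes effects from the interpreter and the operating system, not only the processor.

```python
import time

def work(n=10_000):
    """A fixed amount of arithmetic; the instructions never change."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def time_runs(runs=20):
    """Time the same call repeatedly; return the durations in nanoseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        work()
        samples.append(time.perf_counter_ns() - start)
    return samples

samples = time_runs()
print("fastest:", min(samples), "ns")
print("slowest:", max(samples), "ns")
```

The exact numbers depend on the machine and on what else it is doing at that moment, which is precisely the problem.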

Every time the “abstraction level” of the programmer is raised, the distance between the program and the actual code executed on the bare processor increases. Paraphrasing a famous aphorism: “All problems in computer science can be solved by another level of abstraction”. True, but at what price?

Consider what happens to JavaScript code running in a web browser that runs on an OS that runs inside a VM that runs on a different OS running on the latest multicore processor. Very nice and portable indeed, but what can the programmer know about the performance of that JavaScript code?

Clearly, no one wishes to go back to the old and glorious days of low-level programming. Abstraction is necessary to address complexity. However, it may be time for the programmer to regain some control over the execution of the code. For this reason, many researchers are trying to “fill the abstraction gap” between high-level programming models and low-level implementation platforms. Platforms should provide enough information on non-functional properties related to timing and performance and, whenever possible, appropriate “knobs” to control such properties. Programmers should then take such properties into account when developing their software and, if necessary, turn the “knobs” to improve performance as needed.
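To give a rough idea of what such “knobs” could look like from the programmer's side, here is a toy sketch. Everything in it is hypothetical: `SimulatedPlatform`, its advertised `cost_per_item_us`, the `batch_size` knob and `tune_for_deadline` are names invented for this illustration and do not correspond to any real platform or API.

```python
from dataclasses import dataclass

@dataclass
class SimulatedPlatform:
    """Hypothetical platform: advertises a timing property and one knob."""
    cost_per_item_us: float = 2.0   # advertised non-functional property (assumed value)
    batch_size: int = 1000          # the "knob" the programmer may turn

    def estimated_time_us(self) -> float:
        """Estimated time to process one batch, in microseconds."""
        return self.cost_per_item_us * self.batch_size

def tune_for_deadline(platform: SimulatedPlatform, deadline_us: float) -> int:
    """Halve the batch size until the platform's estimate meets the deadline."""
    while platform.estimated_time_us() > deadline_us and platform.batch_size > 1:
        platform.batch_size //= 2
    return platform.batch_size

platform = SimulatedPlatform()
chosen = tune_for_deadline(platform, deadline_us=600.0)
print(chosen, platform.estimated_time_us())  # 250 500.0
```

The program does not need to know why an item costs what it costs; it only needs the platform to publish the figure and to let it adjust the setting.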

Will such an approach be successful in future software platforms? In my opinion, at least something will leak from research into commercial products. It will probably be substantially different from programming on old hardware platforms like the C64, where the programmer was in complete control of the platform. However, we may find a new way to think about and develop software in the future.
