About good programmers, good programs and performance

Well, not really.

I won’t present any benchmarking results here, for two reasons.

First, there are a lot of them out there; just use Google.

Second, they are not informative anyway.

Those tests usually run either some kind of optimized numerical algorithm or code that does nothing.

Even in unfavorable conditions (the JVM is just not meant for short-running tasks), you can see that the latest JVM (especially JDK 11, which includes some nice vectorization optimizations) performs very close to natively compiled languages like C++ or Rust, which makes its compiler one of the fastest ever and definitely the fastest JIT compiler.

The myth about the slowness of the JVM originates in the early 2000s, when the compiler was indeed much slower.

Current tests, performed for example during the development of the kmath library, show about a 10% performance increase compared to similar numpy operations.

Yet, there are still some problems.

Thinking outside of the box

Kotlin eliminates a lot of performance problems known from Java.

Inline functions make it possible to perform functional operations without creating intermediate objects, thus removing the infamous stream processing overhead.

Inline classes in some cases make it possible to avoid unnecessary allocations (though handling inline classes is a bit tricky at the moment; it is not always obvious whether the class will actually be inlined).

Most importantly, coroutines avoid the hidden cost of thread creation, which has always been one of the major performance pitfalls of parallel programming.

Modern JDKs are very good at inlining method calls, so even a complicated object structure does not introduce additional runtime costs.

The major problem that is left is boxing.

Let me explain it for those who do not know what I am talking about.

As you probably know (or should know), in most modern languages there are two kinds of types: primitive types like numbers and booleans, and reference types, also known as classes.

Different languages treat them differently, but on the JVM primitives have separate types and are always passed by value, while reference types are passed by reference.

It is possible to wrap a primitive in an object that holds it and thus create a reference type, frequently called a box.

Using boxes has both pros and cons.

On the plus side, one can hide an abstraction behind the reference and use the same generic reference for different implementations; for example, a Number could hold a double, an int or even a complicated structure like a BigDecimal.

Also, one can use boxes in structures that work with references, like lists.

The minus is that each access to the value inside the box requires an additional dereference operation and heap access, which is rather expensive compared to operations on primitives.
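To make both sides of that trade-off concrete, here is a minimal sketch in Java (Java rather than Kotlin, because Java spells out the difference between a primitive and its box). The class `BoxSketch` and the method `sumAsDouble` are hypothetical names made up for this illustration.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;

final class BoxSketch {
    // Plus side: a single reference type (Number) hides the implementation,
    // so one method accepts Integer, Double and BigDecimal alike.
    static double sumAsDouble(List<? extends Number> values) {
        double total = 0.0;
        for (Number n : values) {
            // Minus side: every read follows a reference to an object on the heap.
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        // The literals 1 and 2.5 are wrapped into Integer and Double boxes
        // so that they can live in a list of references.
        List<Number> mixed = Arrays.<Number>asList(1, 2.5, new BigDecimal("3.25"));
        System.out.println(sumAsDouble(mixed)); // prints 6.75
    }
}
```

The same list could not hold plain `int` or `double` values at all, which is exactly why boxes exist.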

Java 1.5 and later has a feature called autoboxing, which allows passing a boxed value like Integer to places that require a primitive int and vice versa.

The compiler automatically inserts the code that puts the value into the box or extracts the primitive from the box.

The bad thing is that performance still suffers from these operations, especially if one performs multiple boxing-unboxing operations in a row.
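A classic way to run into repeated boxing and unboxing is a boxed accumulator in a loop. The sketch below is only an illustration: `AutoboxingSketch` and its methods are hypothetical names, and it makes no timing claims, since the JIT can sometimes remove part of the overhead.

```java
final class AutoboxingSketch {
    // Boxed accumulator: each iteration unboxes total, adds i,
    // and boxes the result again.
    static long sumBoxed(long n) {
        Long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total; // unboxed once more on return
    }

    // The same loop on a primitive accumulator: no boxes are involved.
    static long sumPrimitive(long n) {
        long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000));     // prints 499500
        System.out.println(sumPrimitive(1000)); // prints 499500
    }
}
```

Both methods return the same result; the only difference is the single capital letter in `Long`, which is easy to miss in a code review.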

From the developer’s point of view, this means that to get good performance on primitive operations one needs to write specialized code that deals with primitives and primitive arrays, like double[] in Java or DoubleArray in Kotlin.

And it is really hard to write code that works fast on generic numbers.
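The difference between specialized and generic code can be sketched with a dot product in Java. The names `DotProductSketch`, `dot` and `dotGeneric` are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

final class DotProductSketch {
    // Specialized version: works on double[] and never touches a box.
    static double dot(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double result = 0.0;
        for (int i = 0; i < a.length; i++) {
            result += a[i] * b[i];
        }
        return result;
    }

    // Generic version: accepts any Number, but every element is a boxed
    // object and the arithmetic has to go through doubleValue().
    static double dotGeneric(List<? extends Number> a, List<? extends Number> b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("length mismatch");
        }
        double result = 0.0;
        for (int i = 0; i < a.size(); i++) {
            result += a.get(i).doubleValue() * b.get(i).doubleValue();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(dot(new double[] {1, 2, 3}, new double[] {4, 5, 6})); // prints 32.0
        System.out.println(dotGeneric(Arrays.asList(1, 2, 3), Arrays.asList(4.0, 5.0, 6.0))); // prints 32.0
    }
}
```

Note that the generic version quietly converts everything to `double`; keeping the arithmetic generic as well (so that a BigDecimal stays a BigDecimal) is the part that is hard to do without losing speed.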

The boxing problem is also present for structures, but there it does not have such a dramatic performance impact as it has for primitive operations.

Kotlin is both good and bad in terms of boxing.

It is good because there is no distinction between primitive numbers and boxed numbers; the compiler decides which one to use automatically.

This means that it will use the unboxed variant if possible and the boxed one if not.

On the bad side, it is much harder to understand just by looking at the code whether boxing happens or not.

The same goes for inline classes.

They can be used to avoid object boxing locally, but if a value is passed somewhere else, it becomes boxed, and it is really hard to understand where that happens without decompiling the bytecode.

Also, one needs to remember that Kotlin function types are generic by nature, so any primitive or object passed through them will be boxed if the function is not inlined.
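The effect of generic function types can be made visible in Java, where the boxed and the primitive variants have different names. This is only an analogy, under the assumption that a non-inlined Kotlin function type behaves like Java's generic `Function` interface on the JVM; `FunctionBoxingSketch` and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;
import java.util.function.Function;

final class FunctionBoxingSketch {
    // Generic function type: argument and result travel as Double objects.
    static double[] mapGeneric(double[] xs, Function<Double, Double> f) {
        double[] out = new double[xs.length];
        for (int i = 0; i < xs.length; i++) {
            out[i] = f.apply(xs[i]); // xs[i] is boxed, the result is unboxed
        }
        return out;
    }

    // Primitive specialization: the interface method takes and returns double.
    static double[] mapPrimitive(double[] xs, DoubleUnaryOperator f) {
        double[] out = new double[xs.length];
        for (int i = 0; i < xs.length; i++) {
            out[i] = f.applyAsDouble(xs[i]); // no boxes involved
        }
        return out;
    }

    public static void main(String[] args) {
        double[] xs = {1.0, 2.0, 3.0};
        System.out.println(Arrays.toString(mapGeneric(xs, x -> x * 2)));   // prints [2.0, 4.0, 6.0]
        System.out.println(Arrays.toString(mapPrimitive(xs, x -> x * 2))); // prints [2.0, 4.0, 6.0]
    }
}
```

In Java the developer has to pick the primitive interface by hand; in Kotlin the source looks the same in both cases, which is why the boxing is so easy to overlook.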

The developer can avoid those problems, but that requires careful handling.

In the future, the problem of boxing on the JVM will probably be partially solved by the introduction of Valhalla value types and better escape analysis in GraalVM (even now Graal shows a very promising boost on boxed array evaluation).

Also, the Kotlin language team is working on language-specific solutions.

The boxing problem is not specific to Kotlin and Java; it arises in one form or another in all languages.

Yes, even in those that have specialized solutions like value types.

Python, for example, solves it with “brute force”: it just infers the dynamic type and then uses a specialized native implementation for this type.

This solution is available in Kotlin as well.

You can just write specialized versions for mathematics and they will work really fast (similar to or in some cases even faster than a native implementation).

Or you can just use JNI to connect to your favorite native implementation.

Access to native libraries is more cumbersome on the JVM than it is in Python, but in fact not by much.

The verdict

You do not need a good language to write a good program if you are a good programmer.

And yes, good means fast as well.

You still want to use a good language even if you are a good programmer, because a good language allows you to write the program faster and more safely; it also means that your program will evolve faster, and tooling is important.

The user does not need a good language; he wants a simple language.

And he does not want a good program; he wants a working program.

It is OK to write simple working programs, but in the long run, their evolution will be slow and painful.

Language performance is a myth.

What matters is the ability of the program to solve problems.

If a problem does not need high performance, you do not need to optimize for it.

If a program needs to be fast, that can be done in any language.

So the language should be selected for its convenience, not for mythical performance.

In real life, there is only a limited number of places where performance matters, and those places can easily be written once and hidden inside libraries.

The JVM ecosystem, and Kotlin in particular, is mature and comfortable enough to be used to solve high-performance problems.

And the language is really a good compromise between simplicity and flexibility.

It has some JVM-based limitations, but performance problems are mostly solved, and the one actual problem, boxing, can be avoided (and in fact will be solved in the future by Valhalla, GraalVM or both).

So, yes, you can write performance-critical applications in the nice Kotlin language without a native back-end.

Afterword

I intended this article to be much more technical, but in the end, it does not matter.

There are a lot of articles about performance out there, and most of them do not make sense.

What I wanted to say is that you can make your program run fast in any language.

What matters is that you make it fast without sacrificing simplicity and language features.

Also, you need to keep a balance in all things.

For illustration I used snapshots from the popular Russian cartoon Смешарики (there is an English translation out there, but the original is better).

The plot of this specific episode is that there is a race in the desert and everyone is building a bizarre-looking race car.

In the end, everyone crashes: one is too fast and can’t turn in time, one blows up, one goes flying, and after all that the winner is the simplest and slowest one (and that is not the conclusion I want you to draw from the article above).
