The Subjective Vision Of An Ideal Programming Language

Alex Maison · Jan 17

What follows is my personal point of view.

Perhaps it will allow someone to take a fresh look at the design of programming languages or to see some advantages and disadvantages of specific features.

I will not go into private details like “there should be a while construct in the language,” but I will simply describe the general approaches.

Impact of previous experience

The programming languages we wrote in before drive our thinking into the framework of those languages.

We ourselves may not notice this, but an outsider with a different experience may advise something unexpected or learn something new.

This mental framework widens a bit after mastering several languages.

Then in language A, you may want to have a feature from B and vice versa, and there will be an awareness of the strengths and weaknesses of each language.

For example, when I tried to invent and create my own language, my thoughts were completely different.

I thought about completely different things within completely different terms.

Below I will describe the features of the language that I would like to see in the “ideal” programming language.

The effect of syntax on code style

It would seem that in any programming language you can express almost any idea, so the syntax of the language should not matter.

But a typical program is written in the simplest and shortest way possible, and some features of the language end up prevailing over others.

Code examples (I did not check them for correctness; they are just a demonstration of the idea). Python:

filtered_lst = [elem for elem in lst if elem.y > 2]
filtered_lst = list(filter(lambda elem: elem.y > 2, lst))

In Python, declaring an anonymous function is long and heavyweight. It is more convenient to write the first line, although the second option seems more beautiful algorithmically.

Scala:

val filteredLst = lst.filter(_.y > 2)

This is closer to the ideal. Nothing extra. If Python allowed declaring a lambda in a shorter way, at least it => it.y > 2, then list comprehensions would not be so necessary.

The most interesting thing is that the Scala-style approach scales well into a chain of calls like

lst.map(_.x).filter(_ > 0).distinct()

We read and write code from left to right, and the elements travel along the chain of transformations from left to right as well; it is comfortable and organic. In addition, the development environment can provide adequate hints based on the code to the left of the cursor.

In Python, in a line like [elem for elem in …], the development environment has no idea until the very end what type elem has. Large constructions have to be read from right to left, which is why the biggest constructions in Python are usually not written:

filtered_lst = set(filter(lambda it: it > 2, map(lambda it: it.x, lst)))

The lst.filter(…).map(…) approach could exist in Python, but it is killed in the bud by dynamic typing and imperfect IDE support: the environment is not always aware of a variable's type. Suggesting the max function from numpy, on the other hand, always works.

Therefore, most library designs revolve not around an object with a bunch of methods, but around a primitive object and a bunch of functions that accept it and do something with it.

Another example, this time in Java:

int x = func();
final int x = func();

The constant version is stricter, but it takes up more space, reads worse, and is used far less widely than it could be. In Rust, the developers deliberately made the mutable declaration longer so that programmers reach for constants more often:

let x = 1;
let mut x = 1;

It turns out that the syntax of a language really is important and should be as simple and concise as possible.

The language should be designed from the start around its frequently used features.

An anti-example is C++, where, for historical reasons, a class definition is scattered across a pair of files, and the declaration of a simple function may not fit on one line thanks to words like template, typename, inline, virtual, override, const, constexpr, plus the far-from-short descriptions of the argument types.

Static typing

Perhaps, looking at static typing in C, one could say that it does not describe the whole variety of relationships between objects and their types, and that only dynamic typing will save us.

But that is not the case.

There is no escape from types, and if we do not indicate them, this does not mean that they are not there.

If you write code without a rigid framework, then it becomes very easy to produce errors and it is difficult to prove the correctness of what is happening.

As the program grows, this gets worse, and truly large systems, I think, do not get written in dynamic languages.

There are powerful type systems that allow you to create flexible constructs while maintaining rigor.

In addition, nothing prevents you from creating an island of dynamism inside a statically typed program, with the help of the same hashmap or something else.

And while an island of dynamic objects can be made in a static program without problems, the reverse does not work.

Third-party code is written without any consideration for types, and it is not always possible to describe this chaos correctly with the existing Python tools.

(For example, a function that receives some tricky parameter may return a different type than it would without that parameter.) So static typing is a must-have for modern languages.

Of the advantages of static typing, it is worth noting the greater rigor of the program, the detection of errors at the compilation stage, as well as more room for optimization and code analysis.

On the minus side, types sometimes need to be written out, but type inference reduces this problem.

Unit, void and the difference between a function and a procedure

In Pascal/Delphi there is a division into procedures (which do not return values) and functions (which return something).

But no one forbids us to call a function and simply not use the return value.

Hmm, so what is the difference between a function and a procedure? Nothing, really; it is the inertia of thinking.

A peculiar legacy that crept into Java, C++, and a bunch of other languages.

You say: “but there is void!” The problem is that void in those languages is not quite a real type, and once you get into templates or generics, the difference becomes noticeable.

For example, in Java, HashSet<T> is implemented on top of HashMap<T, Object>, where the value type is just a stub, a crutch.

It is not needed there: HashMap should not require a value at all just to record that a key is present.

In C/C++ there are also nuances with sizeof(void).

So, in an ideal language there should be a type Unit, which occupies 0 bytes and takes only one value (it does not matter what that value is; there is exactly one, and if you have a Unit, that is it).

This should be a full-fledged type; then the compiler becomes simpler, and the design of the language more beautiful and more logical.

In an ideal language, you can implement HashSet<T> as HashMap<T, Unit> and have no overhead for storing unnecessary objects.
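Rust already works this way: its standard HashSet<T> is documented as a thin wrapper over HashMap<T, ()>, and the unit type () really does occupy zero bytes. A minimal sketch (make_set is a made-up helper name):

```rust
use std::collections::HashMap;

// A toy set built as a map to the zero-sized unit type `()`.
fn make_set(items: &[&'static str]) -> HashMap<&'static str, ()> {
    let mut set = HashMap::new();
    for &item in items {
        set.insert(item, ()); // the `()` value costs nothing at run time
    }
    set
}

fn main() {
    let set = make_set(&["apple", "banana"]);
    println!("unit size = {} bytes", std::mem::size_of::<()>());
    println!("contains apple: {}", set.contains_key("apple"));
}
```

Since size_of::<()>() is 0, the values genuinely add no storage overhead.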

Tuples

We still carry some historical legacy here, probably from mathematics.

Functions can take many values, and return only one.

What kind of asymmetry?! Yet this is how most languages do it, which leads to the following problems: functions with a variable number of arguments require special syntax, so the language becomes more complicated; making a universal proxy function becomes more difficult; and to return multiple values at once, you have to declare a special structure or pass out-parameters by reference.

It is not convenient.

The funny thing is that from the hardware point of view there are no limitations: just as we lay out the arguments in registers or on the stack, we can do exactly the same with the returned values.

There are some steps in this direction with std::tuple in C++, but it seems to me that this should not live in the standard library; it should exist directly in the type system of the language and be written, for example, as (T1, T2).

(By the way, you can look at the type Unit as a tuple with no elements.)
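Rust's tuples behave exactly this way: (T1, T2) is an anonymous product type built into the language, the empty tuple () is the unit type, and returning several values needs no special structure. A sketch (divmod is my own example function):

```rust
// A function from one tuple to another: divmod returns two values at once,
// with no special struct and no out-parameters.
fn divmod(pair: (u32, u32)) -> (u32, u32) {
    let (a, b) = pair;
    (a / b, a % b)
}

fn main() {
    let (q, r) = divmod((17, 5)); // destructure the returned tuple
    println!("17 = 5 * {} + {}", q, r);
}
```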

The signature of a function should then be described as T => U, where T and U are arbitrary types.

Perhaps one of them is Unit; perhaps one is a tuple.

Frankly, I am surprised that most languages do not do this.

Apparently, the inertia of thinking again.

Since we can return Unit, we can completely abandon the expression/statement division and make every construction in the language return something.

This is already implemented in relatively young languages like Scala/Kotlin/Rust, and it is convenient.
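In Rust, for instance, a block is an expression whose value is the last expression inside it; a minimal sketch:

```rust
// In Rust, a function body and every `{ … }` block are expressions;
// the value is the last expression, with no `return` and no semicolon.
fn seconds_in_ten_days() -> i32 {
    let seconds_in_day = 24 * 60 * 60;
    let days_count = 10;
    days_count * seconds_in_day
}

fn main() {
    let a = 10 * 24 * 60 * 60;
    let b = {
        // the same computation, written inline as a block expression
        let seconds_in_day = 24 * 60 * 60;
        10 * seconds_in_day
    };
    println!("{} == {} == {}", a, b, seconds_in_ten_days());
}
```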

val a = 10 * 24 * 60 * 60
val b = {
  val secondsInDay = 24 * 60 * 60
  val daysCount = 10
  daysCount * secondsInDay
}

Enums, Union and Tagged Union

This feature is more high-level, but it seems to me that it is also needed, so that programmers do not suffer from null-pointer errors or from returning (value, error) pairs in the style of Go.

First, the language must support a lightweight declaration of enum types.

It is desirable that in runtime they turn into ordinary numbers and there is no extra burden from them.

Otherwise it is all pain and sadness, when some functions return 0 on success or an error code otherwise, while other functions return true (1) on success and false (0) on failure.

Done right, the declaration of an enumeration type should be so short that the programmer can write it directly in the signature of a function that returns something like success | fail or ok | failReason1 | failReason2.

In addition, enumeration types that can carry values are very convenient: for example, ok | error(code) or Pointer[MyAwesomeClass] | null. This approach avoids a lot of errors in the code.
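This is exactly what Rust's enum gives: each variant may carry a payload, and match forces every case to be handled, so an error code cannot be silently ignored. A sketch (Status and describe are my own example names):

```rust
// A sum type: success carries no data, failure carries an error code.
enum Status {
    Ok,
    Error(i32),
}

fn describe(s: &Status) -> String {
    // `match` must cover every variant, so no case can be forgotten.
    match s {
        Status::Ok => "ok".to_string(),
        Status::Error(code) => format!("error {}", code),
    }
}

fn main() {
    println!("{}", describe(&Status::Ok));
    println!("{}", describe(&Status::Error(404)));
}
```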

In general terms, this can be called sum-types.

They contain one of several values.

The difference between a union and a tagged union is what happens when the member types coincide, for example int | int.

From the point of view of a plain union, int | int == int, since either way we have an int; plain unions generally collapse like this.

A tagged union int | int, by contrast, also carries the information about which of the two ints we have: the first or the second.

A small digression

Generally speaking, if we take tuples and sum types (unions), we can draw an analogy between them and arithmetic operations.

List(x) = Unit | (x, List(x))

Almost like lists in Lisp. If we replace the sum type with addition (it is called a sum for good reason) and interpret the tuple as a product, we get

f(x) = 1 + x * f(x)

In other words, f(x) = 1 + x + x*x + x*x*x + …, and in terms of product types (tuples) and sum types it looks like this:

List(x) = Unit | (x, Unit) | (x, (x, Unit)) | …
        = Unit | x | (x, x) | (x, x, x) | …

A list of x is an empty list, or a single x, or a tuple of two, or … We can say that (x, Unit) == x; the analogy in the world of numbers is x * 1 = x, and likewise (x, (x, (x, Unit))) can be flattened into (x, x, x).
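The equation List(x) = Unit | (x, List(x)) can be written almost literally in Rust; the Box is needed only so the recursive type has a known size:

```rust
// List(x) = Unit | (x, List(x)), written as a Rust enum.
enum List<T> {
    Nil,                   // the "Unit" alternative: an empty list
    Cons(T, Box<List<T>>), // the "(x, List(x))" alternative
}

fn length<T>(list: &List<T>) -> usize {
    match list {
        List::Nil => 0,
        List::Cons(_, rest) => 1 + length(rest),
    }
}

fn main() {
    // the list (1, (2, (3, Unit)))
    let l = List::Cons(
        1,
        Box::new(List::Cons(2, Box::new(List::Cons(3, Box::new(List::Nil))))),
    );
    println!("length = {}", length(&l));
}
```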

It follows, unfortunately, that theorems about ordinary numbers can be expressed in the language of types, and since such theorems are not always easily proved (if they can be proved at all), relationships between types can also turn out to be quite complex.

Perhaps that is why in real languages ​​such possibilities are severely limited.

However, this is not an insurmountable obstacle: for example, the template language in C++ is Turing-complete, which does not prevent the compiler from digesting adequately written code.

In short, sum types are needed in the language, and they are needed right in its type system so that they combine normally with product types (tuples).

This leaves a lot of room for type transformations: (A, B | C) == (A, B) | (A, C).

Constants

It may sound unexpected, but immutability can be understood in different ways.

I can see as many as four degrees of mutability.

1. A mutable variable.
2. A variable that “we” cannot change, but which is mutable in general (for example, a container passed into a function via a constant reference).
3. A variable that was initialized once and will never change again.
4. A constant whose value can be computed right at compile time.

The difference between points 2 and 3 is not entirely obvious, so here is an example. Suppose that in C++ we are handed a pointer to const memory holding an object.

If we save this pointer somewhere inside our class, we have no guarantee that the contents of the memory behind the pointer will not change during the lifetime of our object.

In some cases we need exactly the third kind of immutability: for example, when reading an object from several threads, or when caching something computed from the properties of the object.

It is the third type of immutability that will allow the compiler to perform some clever optimizations.

An example of use is the final field in java.
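Rust distinguishes several of these degrees directly in the syntax: let mut for degree 1, a shared reference for degree 2 (we cannot change the data through it, but the owner still can), and a plain let binding for degree 3. A sketch:

```rust
fn sum(v: &[i32]) -> i32 {
    // Degree 2: through a shared reference *we* cannot modify the data,
    // but the owner may still mutate it between calls.
    v.iter().sum()
}

fn main() {
    let mut v = vec![1, 2, 3]; // degree 1: a mutable variable
    let s1 = sum(&v);
    v.push(4);                 // the owner mutates it
    let s2 = sum(&v);

    let total = s1 + s2;       // degree 3: initialized once, never changes
    println!("{} + {} = {}", s1, s2, total);
}
```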

Personally, it seems to me that the nuances of the first and second kinds of mutability can be handled with interfaces and omitted getters/setters.

For example, suppose we have an immutable object that contains a pointer to mutable memory.

It is possible that we will want several “interfaces” for using the object: one that merely forbids changing the object itself, and one that, for example, also closes off access to the external memory.

(As you might guess, I have been influenced by the JVM languages, which have no const.) Computation at compile time is also a very interesting topic.

In my opinion, the most beautiful approach is the one used in D.

You write something like static value = func(42); and a perfectly ordinary function is explicitly computed during compilation.
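Rust offers something similar with const fn: an ordinary-looking function that the compiler evaluates at compile time whenever it is used in a constant context. A sketch (the function name is my own):

```rust
// An ordinary function that is also callable at compile time.
const fn seconds(days: u32) -> u32 {
    days * 24 * 60 * 60
}

// Evaluated by the compiler, not at run time.
const TEN_DAYS: u32 = seconds(10);

fn main() {
    println!("ten days = {} seconds", TEN_DAYS);
    println!("one day  = {} seconds", seconds(1)); // same fn, run time call
}
```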

Kotlin Tricks

If you have ever used Gradle, then perhaps, staring at a broken build file, you have had a thought like “wtf? What should I do?”

android {
    compileSdkVersion 28
}

This is just Groovy code. The android function simply accepts the closure { compileSdkVersion 28 }, and somewhere in the wilds of the Android plugin this closure is handed an object in whose context it will actually be run.

The problem here is the dynamism of Groovy: the development environment has no idea which fields and methods are available inside our closure and cannot highlight errors.

Kotlin, however, has cunning types, and this could be implemented along these lines:

class UnderlyingAndroid {
    var compileSdkVersion: Int = 42
}

fun android(func: UnderlyingAndroid.() -> Unit) { … }

Right in the function signature we say that we accept something that works with the fields/methods of the UnderlyingAndroid class, and the development environment will immediately highlight errors.

You could say that this is all syntactic sugar and that instead we could write:

android { it -> it.compileSdkVersion = 28 }

But it is ugly! And what if we nest several such structures inside each other? The Kotlin approach plus static types allows you to build very concise and convenient DSLs.
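Without receiver lambdas you can still get close to this shape in a static language with an ordinary closure that receives the configuration object; here is a Rust sketch, where the names android and compile_sdk_version are placeholders mirroring the Gradle example:

```rust
// A hypothetical configuration object, mirroring the Gradle example.
struct Android {
    compile_sdk_version: i32,
}

// `android` hands a mutable config object to the caller's closure,
// so the compiler knows exactly which fields the closure may touch.
fn android(configure: impl FnOnce(&mut Android)) -> Android {
    let mut cfg = Android { compile_sdk_version: 0 };
    configure(&mut cfg);
    cfg
}

fn main() {
    let cfg = android(|a| {
        a.compile_sdk_version = 28;
    });
    println!("compileSdkVersion = {}", cfg.compile_sdk_version);
}
```

The IDE can complete and type-check everything inside the closure, which is the whole point of the statically typed DSL.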

I hope that sooner or later the whole of Gradle will be rewritten in Kotlin, and usability will grow significantly.

I would like to have such a feature in my language, although it is not critical.

Extension methods

A similar story: extension methods are syntactic sugar, but quite convenient.

It is not necessary to be the author of the class to add another method to it.

They can also be nested inside some scope, and thus avoid cluttering the global namespace.

Another interesting application is that you can hang these methods on existing collections.

For example, if a collection contains objects of type T that support addition with themselves, you can add a sum method to the collection, a method that exists only if T allows it.
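Rust's extension traits do exactly this: a method is bolted onto an existing type, and the bound on the impl makes the method exist only for element types that can be summed. The trait name TotalExt is my own:

```rust
use std::iter::Sum;

// An "extension method": `total` appears on slices without touching their type.
trait TotalExt<T> {
    fn total(&self) -> T;
}

// The method exists only for element types that can be summed.
impl<T: Copy + Sum<T>> TotalExt<T> for [T] {
    fn total(&self) -> T {
        self.iter().copied().sum()
    }
}

fn main() {
    let xs = [1, 2, 3, 4];
    println!("sum = {}", xs.total()); // works because i32: Sum<i32>
}
```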

Call-by-name semantics

This is again syntactic sugar, but it is convenient and, in addition, lets you write lazy computations.

For example, in code like map.getOrElse(key, new Smth()), the second argument is not taken by value, and therefore a new object will be created only if the key is not in the table.

Similarly, functions like assert(cond, makeLogMessage()) look much nicer and are more comfortable to use.

In addition, no one forces the compiler to create an actual anonymous function: for example, it can inline the assert function, and then the call turns into just if (!cond) { log(makeLogMessage()) }, which is also good.
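Rust makes the laziness explicit with closures instead of call-by-name: the standard library's unwrap_or_else takes a closure that runs only when the value is absent. A sketch (expensive_default is a made-up stand-in for a costly constructor):

```rust
use std::collections::HashMap;

fn expensive_default() -> String {
    // Imagine this allocates or computes something costly.
    "default".to_string()
}

fn main() {
    let map: HashMap<&str, String> = HashMap::new();

    // The closure runs only if the key is missing; with call-by-name
    // semantics the bare expression `expensive_default()` would suffice.
    let value = map.get("key").cloned().unwrap_or_else(expensive_default);

    println!("{}", value);
}
```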

I will not say that this is a must-have feature of the language, but it clearly deserves attention.

Co-, contra-, in- and non-variance of template parameters

All of this is needed. “Input” types can be widened, “output” types can be narrowed, with some types nothing can be done, and some can be ignored.

In a modern language, you need to have support for this directly in the type system.

Explicit Implicit Conversions

On the one hand, implicit conversions from one type to another can lead to errors.

On the other hand, the experience of the same Kotlin shows that writing explicit conversions turns out to be rather dull.

Ideally, the language should let you explicitly enable implicit conversions, so that they are used consciously and only where necessary.

For example, the same conversion from int to long.
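Rust takes this line for int-to-long: widening i32 to i64 never happens silently, but the conversion is declared once via the standard From trait and then invoked with an explicit, lightweight .into():

```rust
fn main() {
    let small: i32 = 42;

    // No silent coercion: the conversion must be written out, but stays short.
    let big: i64 = small.into(); // uses the std `From<i32> for i64` impl
    let also_big = i64::from(small);

    println!("{} {} {}", small, big, also_big);
}
```

User types get the same treatment: implement From once, and the explicit-but-cheap conversion becomes available everywhere.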

Where to store object types?

You can store them nowhere at all: in C, for example, all types are known at compile time, and at run time this information no longer exists.

You can store them together with the object (this is done in languages with virtual machines, as well as for classes with virtual methods in C++).

Personally, I find the third approach the most interesting: the type (a pointer to the method table) is stored directly in the pointer itself.

Values, references, pointers

The language should hide implementation details from the programmer.

In C++ there are problems with writing templates, since T in a template can turn out to be some unexpected kind of type: a value, a pointer, a reference, or an rvalue reference, some of them spiced with the word const.

I cannot say exactly how this should be done, but I can definitely see how it should not be.

Something close to the ideal of convenience exists in Scala and Kotlin, where primitive types “pretend” to be objects, so everything we work with looks uniform and burdens neither the programmer’s brain nor the syntax of the language.

Minimum of entities

Here is what I do not like about C#: a lot of things were dragged into the language, it all combines somewhat strangely, and the complexity of the language keeps growing.

(I may be very wrong in the details, since I wrote C# a long time ago and only under Unity.) For example, there are class fields, properties, and methods.

Three entities! They do not combine well with each other: you can declare several methods with the same name but different signatures, yet for some reason you cannot declare a property with that same name.

Or, if an interface requires a property, you cannot simply declare a field in the class; it has to be a property.

In Kotlin/Scala this is done better: all fields are private and are accessed from the outside through generated getters and setters.

Technically, these are simply methods with special names and can be overridden at any time.

That is all; no contortions.

Another example is the word inline in C++/Kotlin.

Do not drag it into the language! In both cases the word inline changes the logic of compilation and code execution, so people start writing inline not for the sake of inlining itself, but for the ability to define a function in a header (C++) or to make a tricky return from a nested function on behalf of the caller (Kotlin).

Then __forceinline, noinline, and crossinline appear in the language, each affecting some nuance and complicating the language further.

It seems to me that the language should be as flexible and simple as possible; the same inline could be an annotation that does not affect the logic of code execution and merely helps the compiler.

Macros

The language should have macros that take the syntax tree as input and transform it.

In the case of dull repetitive code, macros can save from errors and make the code several times shorter.

Unfortunately, quite serious languages like C++ have a complex syntax with a bunch of nuances, which is probably why proper macros have not appeared there yet.

In languages such as Lisp and Scheme, where the program itself is suspiciously similar to a list, writing macros does not cause major problems.
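Rust's macro_rules! is a small step in this direction: the macro receives syntax fragments and expands the repetitive code for us. A toy sketch (the macro and the generated names are made up):

```rust
// A macro that generates one tiny function per listed name,
// sparing us from writing the same dull code N times.
macro_rules! make_constants {
    ($($name:ident = $value:expr;)*) => {
        $(fn $name() -> i32 { $value })*
    };
}

make_constants! {
    answer = 42;
    seconds_in_day = 24 * 60 * 60;
}

fn main() {
    println!("{} {}", answer(), seconds_in_day());
}
```

Procedural macros go further and operate on a real token stream of the syntax tree, which is closer to what the article asks for.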

Functions inside functions

A flat structure is sad. If something is used in only one or two places, why not make it as local as possible: for example, allow declaring local functions or classes inside functions.

This is convenient in the same way: the namespace is not cluttered, and when you delete a function’s code, its “internal” details are deleted with it.
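Rust allows exactly this; a helper function can be declared inside the function that uses it and disappears together with it (the names here are my own):

```rust
fn normalize(words: &[&str]) -> Vec<String> {
    // A local helper: invisible outside `normalize`,
    // and deleted together with it.
    fn clean(word: &str) -> String {
        word.trim().to_lowercase()
    }

    words.iter().map(|w| clean(w)).collect()
}

fn main() {
    println!("{:?}", normalize(&["  Hello", "WORLD  "]));
}
```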

Substructural type system

It is possible to build a type system that imposes restrictions on how values are used: for example, a variable may be used exactly once or, say, at most once.

Why is this useful? Move semantics and the idea of ownership rest on the fact that you can give ownership of an object away only once.

In addition, any object with internal state may imply a certain protocol of use.

For example, we first open a file, read or write something, and then close it.

Today the state of the file lies on the programmer’s conscience, although the actions on it could (in theory) be pushed into the type system, getting rid of a class of errors.
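Rust's ownership already gives part of this: a method can consume its receiver, so a "closed" handle simply ceases to exist and any later use fails to compile. A toy typestate sketch (the types and strings are made up for the illustration; no real file I/O happens):

```rust
// A made-up resource whose protocol is enforced by moves:
// once `close` consumes the handle, further use cannot compile.
struct OpenFile {
    name: String,
}

impl OpenFile {
    fn open(name: &str) -> OpenFile {
        OpenFile { name: name.to_string() }
    }

    fn read(&self) -> String {
        format!("contents of {}", self.name)
    }

    // Takes `self` by value: the handle is moved in and destroyed.
    fn close(self) -> String {
        format!("{} closed", self.name)
    }
}

fn main() {
    let f = OpenFile::open("log.txt");
    println!("{}", f.read());
    println!("{}", f.close());
    // println!("{}", f.read()); // would not compile: `f` was moved
}
```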

Some particular applications, like object ownership, are needed right now; others will become popular once they appear in mainstream languages.

Dependent types

On the one hand, they look very interesting and promising; on the other, I have no idea how to implement them.

Of course, I would like to have clever types, for example a list of at least one element, or a number that is divisible by 5 but not by 3, but I have little idea how such properties can be proved in a reasonably complex program.

Building

First, the language must be able to work without the standard library.

Secondly, the library should consist of individual pieces (perhaps with dependencies between them), so that if desired, only a part of them can be included.

Thirdly, a modern language needs a convenient build system (in C++ this is pain and sadness).

Functions, variables, and classes exist to describe the course of the computation; they are not something that must be stuffed into the binary one-to-one.

To export something to the outside world, you can annotate the necessary pieces, but everything else should be handed over to the compiler, and let it do whatever it wants with the code.

Conclusion

So, in my opinion, an ideal programming language should have:

A powerful type system that supports, from the very beginning, union types and tuples, restrictions on template parameters and their relationships with each other, and perhaps something exotic like linear or even dependent types.

A convenient syntax: laconic, encouraging a declarative style and the use of constants, uniform across value and pointer (reference) types, as simple and flexible as possible, and designed with the IDE in mind for convenient interaction with it.

A bit of syntactic sugar, like extension methods and lazy function arguments.

The ability to move part of the computation to the compilation stage, macros that work directly with the AST, and convenient companion tooling.

Ambiguous features such as the presence or absence of a garbage collector I did not consider, because living languages are needed both with and without one.

I am not sure that everything on my wishlist combines well together.

I tried to describe the vision of a programming language in which I would like to write code.

Most likely, your views are different — it will be interesting to read the comments.
