Johann Joss's ideas about improvements to computer languages:

In development by Mr. J. Joss ...


Use mathematical notation

Algol introduced different styles of representation. Today we have hardware which would allow us to use the reference language directly as the hardware representation, but we are stuck with ASCII or the ASR-32 teletype.

We do not even have the mathematical symbols for greater-or-equal, less-or-equal, unequal, logical and and or!

What many people lose sight of is that a programming language is a description of an algorithm. Prof. Rutishauser once said that an informal description is not an algorithm. Informally you can say: and then one iterates until eps is small enough. In an Algol program, you have to say what "small enough" means, and you have to guarantee that this "small enough" is actually reached.

In my opinion, programming languages should evolve towards easier reading. We should be allowed to write (in TeX notation) \frac{a+b}{c+d} instead of (a+b)/(c+d). Of course, the printed output and the IDE should look like the typeset TeX.

New floating point standard:

I consider the IEEE floating point standard a catastrophe. Using guard digits just makes a program very difficult to control. Fortunately, the GNU Compiler Collection corrects this. Tests were run on a PC using C and Fortran: guard digits are used unless one switches on optimization or debug mode. In those modes everything is predictable; no guard digits are used.
Guard digits amount to mixed-precision arithmetic. It is difficult enough to really understand fixed precision; mixed precision just adds more complexity to an already difficult situation.

Partial underflow: the number range in which partial underflow operates is very small. Say you have a decimal exponent range of +-300 and 14-digit precision arithmetic. Partial underflow may help down to about -314, which is not really worth having. An algorithm which makes good use of partial underflow is difficult to find, if it exists at all.
Squelching partial underflow to 0 makes it much easier to detect and guard against. Implementing partial underflow in software is extremely dangerous: a program which runs into that range almost stops, and if this happens in a real-time situation it becomes a disaster. If it is implemented in silicon, it is a waste of silicon and energy, which would have better uses.

When I learned programming, I got the impression that Algol real numbers are the mathematical real numbers plus dirt. In a lecture, Prof. Rutishauser presented a different view: there is the set of machine numbers M, which forms a finite subset of the real numbers. The machine operations are a mapping MxM->M. This mapping should conform to certain rules (axioms), so that theorems about algorithms can be proven.

The language designer has no business defining the machine operations, but he has to present a well-defined interface to the underlying hardware. If he tries otherwise, the resulting programs become so slow that they are almost useless (at least for a very large set of problems). This is often not understood at all.

Hardware data type for array addresses:

A big question is whether a compiler should check the bounds of an index into an array or not. Doing so slows down the program; not doing so may lead to difficult-to-find or even dangerous errors.

I cannot understand why the hardware does not help here. Here is my proposal:

Have a hardware datatype for addresses. This datatype contains three addresses: the address of the element with index 0, and the first and last addresses of the memory area. Address arithmetic would be done with special address-manipulation instructions, and every indirect reference would be checked by the hardware.
This would not slow down execution because, for a read access, the value could be given to the processor immediately and the check done during execution. If the check is violated, an interrupt would be generated.

Of course, this interrupt would come a bit late, and the program counter would no longer point to the offending instruction. But this is not a real limitation, because optimizing compilers shuffle the machine instructions around anyhow.

In any case it would be a big help in finding and removing otherwise difficult-to-track errors. It would also stop all buffer overflow attacks and so make our computers much safer. When memory is freed by a program, the pointer would be made invalid.
The compiler could use such pointers only by indirect reference, so that there are no copies of the pointer-type variable, only references. This would slow down execution, so it would be up to the compiler writer (and the user, if different compiler options were made available).

Just as a remark: the instruction set of the PDP-10 computer had a hardware datatype called a byte pointer. This was heavily used in the operating system (at the time, operating systems were usually written in assembly language) for accessing bit fields in system tables. It was also used in I/O. You just had to redefine the byte pointer to make a program work on full words (36 bits), 7-bit characters (normal ASCII) or 6-bit characters (used, e.g., for file names). In those days memory was very valuable, so 6-bit characters made sense.

Comments about string implementation

The standard C string library is inherently unsafe and requires great care from the programmer. Historically this is very understandable. When C was invented, computers had very limited memory and were, by today's standards, slow. One was happy if the task could be done at all. Also, programs were small and therefore less error prone.

C uses 0-terminated strings. This has often been criticised because, for example, determining the length requires scanning. But everybody who programmed in assembly language in the 1960s and 70s found 0-terminated strings very easy to work with. No registers are wasted on counters and/or limits, which made programs much smaller and usually much faster as well. The biggest drawback is that one character (zero) is no longer available. Usually this is not a real problem.

Today, program size is much less of a problem. I cannot understand why the standard C library, with its danger of buffer overflow, has not been deprecated and even forbidden for web applications.

goto and if-then-else comments

The goto statement is quite controversial. Many damn it. There are even some who declare any goto-free program structured.
The problem with the goto statement is not the goto but the label. When I see a goto in a program, I know immediately what it means. When I see a label, I have to search through the whole block to see where all the gotos are.
I guess that most of the aversion against the goto is in reality a fight against Fortran. Most of the gotos in a Fortran program are in fact if-then-else or if-then. An if-then-else is much preferable to a goto because it communicates the intentions of the programmer much better.

Also, the Fortran style invites a bad layout of the program. If you use if-then-else, you find the then-part and the else-part together. In many Fortran programs you find a structure like the following: at a certain point a special condition has to be handled, so an if with a goto is inserted and the normal flow continues. Later the special handling is programmed, terminated by a jump back to the main sequence. Very often the main code makes assumptions which are no longer valid in the special case, or there is not even a good point to continue. This makes the programs difficult to read and error prone.

Basically, there are two types of gotos: forward gotos and backward gotos. In the theory of computation, about the most difficult problem is to determine whether an algorithm terminates. Forward gotos never create loops by themselves and are for this reason quite harmless. Backward gotos, on the other hand, can create loops and need special attention. One place where gotos are very handy is error handling. If a procedure finds some error condition, it very often has to do some cleanup before returning, and these error points can occur at various places in the procedure. For this, gotos come in very handy. This technique is used extensively in the Linux kernel. Some declare it bad style; I consider it good style and the critics as not understanding the problem.

Many goto-free programs contain cleanup code spread out through the whole procedure, followed by a return statement. This makes maintenance very hard and error prone: when a new cleanup is needed, the programmer has to scan the whole procedure for the points where it must be added. It also leads to duplicated cleanup code.

Another place where I prefer a goto is when user input is checked and, if found incorrect, an error message is displayed and the input is requested again. I have mixed feelings about this: there is usually only one reference to the label; a label is a global item being used for something purely local; and it is a backward goto.

The alternative solution is a loop. I consider this extremely ugly. It is syntactically a loop, but the loop is usually executed just once, so semantically it is not a loop. It also misleads the compiler, which may try to move invariant code out of a loop that is only executed once. Also, many programmers use an auxiliary variable for the termination of the loop, like: ok:=false; while(!ok) ...

The ok is an even more global object than the label. Also, ok:=false is just not really correct: nothing is "not ok" at that point. This just adds more ugly parts to an already ugly concept.

I am not really happy with this use of gotos, but I do not know any better alternative at this moment.

own as a static variable declaration

One cumbersome concept of Algol is the own variable. The big problem there is its initialization.
In modern object-oriented programming, this concept is called a static variable.
It is still needed, but it remains a cumbersome concept.
It is a situation like Dirac said: "Some new ideas are here needed."




Prof. Rutishauser, one of the founders of Algol, has a daughter who is working on a biography of her father. I may get some useful data from there.