Friday, July 8, 2016

Software Archaeology

This blog discusses what a software engineer, in this case me, has to do when a program has suffered years of neglect.

I work in the embedded systems space, so this blog will talk about embedded programs, not Windows, not Unix, but embedded programs. Some written exclusively in assembly, some in C. Most with no threads or other OS assistance.

Definitions: Software Archaeology - The investigation, research, documentation, and rewriting required to gain a meaningful understanding of long-abandoned or neglected software programs.

What causes a program to be abandoned or neglected? Why is archaeology required in the first place?

The programs I have worked with were written in the early '90s. Standard software practices are better now than they were then. Many projects today are built to a higher standard, but many are still built the same way they were 20 years ago.

Software consultant, Joe, talking to friend consultant John.

"Joe - How's the new assignment going?" ask John. "Oh, they're writing legacy code" replied Joe.

When software is written, it combines 1) the author's domain knowledge, and 2) the author's understanding of the underlying hardware.

The software is constrained by how well the underlying hardware can accomplish the task to be completed. The software is also constrained by the author's knowledge of the domain problem that is the source of information for the task. The author then brings their personality, experience, drive, and insight to the writing of software.

Software is the art and science of translating human goals into a language in which a computer can perform the task described by the goal.

What do I find when I read code from another era? I find the remnants of enough of the domain and programming language knowledge to do the job, but no more.

I took a computer languages course in 1976. I was introduced to Algol, PL/1, Snobol, and APL. I do not use any of these languages today. I don't know who does. In other courses I learned and used FORTRAN, which is still widely used in numerical computing applications. C was just starting to be used in research labs.

If I had to resurrect a program from that era, I would have to learn, to a certain extent, the actual computer language, its syntax and nuance, to understand how the program functioned.

Sometimes the source code itself is not necessary. Sometimes all that is required is a complete definition of the inputs and outputs of the program. This is probably a simple program, but if you know that a certain list of numbers goes into the program, a set of operations is performed on that list, and a new set of numbers is created, then any programming language that handles the inputs and outputs can be used to convert the input to the output.
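To make that concrete, here is a minimal sketch assuming a purely hypothetical recovered definition: the input is a stream of raw sensor counts, one per line, and the output is each count scaled and offset. The constants and format are invented for illustration, not taken from any real program.

```c
#include <stdio.h>

/* Hypothetical recovered I/O definition:
 *   input : raw sensor counts, one per line
 *   output: calibrated values, value = count * SCALE + OFFSET
 * SCALE and OFFSET are placeholders, not real calibration data.
 */
#define SCALE   0.125
#define OFFSET  -40.0

int main(void)
{
    unsigned int count;

    /* Read one raw count per line, emit one calibrated value per line. */
    while (scanf("%u", &count) == 1)
        printf("%.3f\n", (double)count * SCALE + OFFSET);

    return 0;
}
```

Any language that honors the same contract would be just as correct; the old source only matters when that contract is buried inside it.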

The constraint on recovering an old program is that the inputs and outputs are still in place; they cannot be changed. What is missing is their exact definition. The program knows what those inputs and outputs are, but the complete definition lives only in the code.

Unfortunately, the program can't expound on its nuances or give background on what the author was thinking. Comments help, when they are present.

Refactoring techniques are the ones that provide the most insight and the best chance of documenting the inputs and outputs.

The other difficulty in working with old programs is the tools. With each generation of processors comes a new generation of tools.

In the '90s the method by which one debugged an embedded program was either very primitive or very sophisticated, with not much in between. The primitive method was to output RS-232 messages displaying the current state of the code. Each output would reveal the changing state, and analysis would then determine what might be wrong. The very sophisticated, and thus very expensive, method was to use an In-Circuit Emulator, or ICE.
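As a rough sketch of the primitive method, assume a hypothetical memory-mapped UART; the register addresses, bit mask, and the motor_start routine below are invented for illustration.

```c
/* Minimal RS-232 "state message" debugging for a bare-metal target.
 * UART_STATUS, UART_DATA, and UART_TX_READY are hypothetical
 * memory-mapped registers; real hardware has its own addresses.
 */
#define UART_STATUS   (*(volatile unsigned char *)0xFFFF8000)
#define UART_DATA     (*(volatile unsigned char *)0xFFFF8001)
#define UART_TX_READY 0x01

static void debug_putc(char c)
{
    while (!(UART_STATUS & UART_TX_READY))
        ;                       /* busy-wait until the transmitter is free */
    UART_DATA = (unsigned char)c;
}

static void debug_puts(const char *s)
{
    while (*s)
        debug_putc(*s++);
}

/* Sprinkled through the code under study: */
void motor_start(void)
{
    debug_puts("motor_start: entered\r\n");
    /* ... original logic ... */
}
```

Every one of those strings costs ROM and every busy-wait costs time, which is exactly the disturbance noted further down.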

Memory was expensive in the '90s. Embedded processors did not have cache. Programs ran from Read Only Memory, which may have been PROMs, EPROMs, or Flash. The processor would have breakpoint capability, but only if the memory location could be changed to an 'illegal instruction' to cause a jump to the interrupt handler that provided the debugging support. This only worked if the program was running in RAM; inserting an illegal instruction into ROM is impossible. This is the same mechanism used today for software breakpoints. Hardware breakpoints were nowhere to be seen.
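A minimal sketch of that mechanism, assuming a hypothetical processor whose one-byte BREAK_OPCODE raises an illegal-instruction trap; the opcode value and function names are invented.

```c
#include <stdint.h>

#define BREAK_OPCODE 0xCCu    /* placeholder trap opcode */

static uint8_t *bp_addr;      /* where the breakpoint was planted     */
static uint8_t  saved_byte;   /* original instruction byte at bp_addr */

/* Patch the trap opcode over the instruction at addr.
 * Only possible when the code is running from RAM. */
void breakpoint_set(uint8_t *addr)
{
    bp_addr    = addr;
    saved_byte = *addr;
    *addr      = BREAK_OPCODE;
}

/* Put the original instruction back. */
void breakpoint_clear(void)
{
    *bp_addr = saved_byte;
}

/* The illegal-instruction handler would dump registers, restore the
 * saved byte, and resume - the same idea behind today's software
 * breakpoints. */
```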

The ICE provided a way for a host processor, a PC, to have RAM memory substitute for ROM memory as well as take over the clock operations of the processor, allowing the user to watch the processor step through each instruction in as much detail as desired.

Breakpoints are essential.

RS-232 output disturbs the timing of the program and uses up precious memory. The ICE was an emulator and thus provided the debugging functions without rewriting code or using any additional memory.

If the program was neglected, then the tools have been neglected too. The ICE unit may no longer power on, if it can be found at all.

The history of how the author came to write the code is lost. The author learned, most likely by trial and error, the nuances of the language and the hardware. That history is not documented.

All in all, a puzzle.

That's another good definition of software archaeology: the study of puzzles created by time and neglect.