Process Intensity

Volume 1, Issue 5, December 1987

I have in previous issues of the JCGD referred to process intensity; I propose in this essay to describe this most useful concept.

Process intensity is the degree to which a program emphasizes processes instead of data. All programs use a mix of process and data. Process is reflected in algorithms, equations, and branches. Data is reflected in data tables, images, sounds, and text. A process-intensive program spends a lot of time crunching numbers; a data-intensive program spends a lot of time moving bytes around.

The difference between process and data is profound. Process is abstract where data is tangible. Data is direct, where process is indirect. The difference between data and process is the difference between numbers and equations, between facts and principles, between events and forces, between knowledge and ideas.

Processing data is the very essence of what a computer does. There are many technologies that can store data: magnetic tape, punched cards, punched tape, paper and ink, microfilm, microfiche, and optical disk, to name just a few. But there is only one technology that can process data: the computer. This is its single source of superiority over the other technologies. Using the computer in a data-intensive mode wastes its greatest strength.

Because process intensity is so close to the essence of "computeriness", it provides us with a useful criterion for evaluating the value of any piece of software. That criterion is a rough quantification of process intensity: the ratio of operations per datum, which I call the crunch per bit ratio. By an operation I mean any process applied to a datum, such as an addition, subtraction, logical operation, or a simple boolean inclusion or exclusion. A datum in this scheme can be a bit, a byte, a character, or a floating-point number; it is a small piece of information.
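A minimal sketch may make the ratio concrete. The C fragment below is my own illustration (the function names and the smoothing computation are invented for this essay, not drawn from any real program); it contrasts a data-intensive routine, which merely moves bytes, with a process-intensive routine, which applies many operations to each datum.

    #include <stddef.h>
    #include <string.h>

    /* Data-intensive: each byte is moved once and never touched again.
       Essentially zero operations per datum -- crunch per bit near zero. */
    void copy_to_screen(unsigned char *screen, const unsigned char *image, size_t n)
    {
        memcpy(screen, image, n);   /* pure byte-shuffling */
    }

    /* Process-intensive: every sample is folded into a repeated smoothing
       computation, so the operations-per-datum ratio climbs well above one. */
    double crunch_samples(const double *samples, size_t n, int passes)
    {
        double state = 0.0;
        for (int p = 0; p < passes; p++)                 /* repeat the crunching */
            for (size_t i = 0; i < n; i++)
                state = 0.9 * state + 0.1 * samples[i];  /* several ops per datum */
        return state;
    }

With, say, ten passes the second routine performs dozens of operations per datum, while the first performs nothing worth mentioning beyond the move itself.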

To demonstrate its utility, I shall apply this criterion to word processing software. Suppose that you are going to write a book on your word processor. Suppose further that you are omniscient in the subject matter of the book, impeccably organized, and a perfect typist. You simply sit down at the keyboard and start typing as you compose, making not a single mistake. When you are finished, you will have misused your word processor, for you could have done the same thing on a typewriter. In short, the word processor had zero utility for this project. And what was the crunch per bit ratio? It was zero, because not a single word or character was actually crunched by the program. The words moved directly from your keyboard to the printer with no significant intervening processing.

Now suppose that you discover that your omniscience was less omni than you thought, and there are a few little mistakes that you need to clean up. You go back to the word processor, make a few minor changes, and print out the new manuscript. Now the word processor has a minor advantage over the typewriter, but not a stupendous one; you could probably have managed with a little cutting and pasting, and perhaps retyping a few pages. Note that the crunch per bit ratio has gone up from zero to a small value because you have manipulated some of the data in the file.

Now suppose that you are older and wiser and you realize that your manuscript is riddled with errors. You need to change the spellings of many words, completely reorganize the book and most of its chapters, and change its layout while you’re at it. You’ll be doing intensive reprocessing of the data as you move things around, execute massive global search-and-replace operations, and in general crunch the hell out of your manuscript. Here we have a case of very high crunch per bit, and (not coincidentally) one in which the word processor shines its brightest.

The same analysis works with other applications. Spreadsheets show their greatest value when you recalculate the same data many times with many different variations. Database managers earn their price only when you have them sort, search, report on, and otherwise munch the data in many different ways.

The same is true with games: the higher the crunch per bit ratio, the more "computery" the game is and the more likely the game will be entertaining. Indeed, games in general boast the highest crunch per bit ratios in the computing world. Consider how little data a player enters into a flight simulator and how extensive are the computations that this data triggers.

The crunch per bit criterion also works well as an exposer of bad software ideas. For example, you old hands might recall the early days of the personal computing era and the ill-famed "checkbook balancing program". This piece of software was universally cited whenever anybody was boorish enough to question the value of personal computers. It wasn’t vaporware, either; there were lots of these checkbook balancing programs floating around. The thing was, nobody ever seemed to use them. Why not? Nobody could quite say why; the programs just weren’t practical. Let’s apply the notion of process intensity to the problem. These programs have a low crunch per bit ratio because they perform very little processing on each datum. Every number is either added to or subtracted from the checkbook balance, and that’s just about all. That’s one operation per datum -- not very good.
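To see just how anemic that ratio is, here is a hypothetical checkbook loop of my own devising (not taken from any actual product of the era): each transaction is touched exactly once, with a single addition.

    #include <stddef.h>

    /* One pass over the register: a single add per transaction
       (deposits positive, checks negative). Crunch per bit is about
       one -- scarcely more processing than a pencil provides. */
    double balance_checkbook(const double *amounts, size_t n)
    {
        double balance = 0.0;
        for (size_t i = 0; i < n; i++)
            balance += amounts[i];
        return balance;
    }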

The same reasoning works just as well against that other bugaboo from the early days: the kitchen recipe program. This was a piece of vaporware (actually, "bullshitware" would be a better cognomen) frequently cited by husbands seeking to obtain their wives’ acquiescence to the purchase of one of these toys. It sounded great but in practice it never worked. Low process intensity was the reason.

We can even apply the process intensity principle to bad games. Does anybody remember that smash hit arcade game of summer 1983, Dragon’s Lair? This was the first videodisk game, and its glorious cartoon graphics created an instant sensation. The press rushed to write stories about this latest grand breakthrough; consumers threw bushelfuls of quarters at the machines; and Atari frantically initiated half a dozen videodisk game projects. Amid all the hubbub, one solitary figure stood unimpressed in his ivory tower, nose held high in contemptuous dismissal, disappointing reporters with comments that this was merely a fad. How was I able to correctly perceive that the videodisk game was doomed to failure once its fad value was exhausted? Simple: its crunch per bit ratio stank. All that data came roaring in off the disk and went straight on to the screen with barely a whisper of processing from the computer. The player’s actions did little more than select animation sequences from the disk. Not much processing there.

The "process intensity principle" is grand in implications and global in sweep. Like any such all-encompassing notion, it is subject to a variety of minor-league objections and compromising truths.

Experienced programmers know that data can often be substituted for process. Many algorithms can be replaced by tables of data. This is a common trick for expending RAM to speed up processing. Because of this, many programmers see process and data as interchangeable. This misconception arises from applying low-level considerations to the higher levels of software design. Sure, you can cook up a table of sine values with little trouble, but can you imagine a table specifying every possible behavioral result in a complex game such as Balance of Power?
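The sine table trick itself takes only a few lines; the sketch below is a throwaway illustration (not code from any shipping game) that precomputes 256 entries and replaces a call to sin() with an array lookup. The substitution works only because the function's entire domain can be enumerated in advance -- exactly what cannot be done for the behavior of a complex game.

    #include <math.h>

    #define TABLE_SIZE 256
    #define TWO_PI     6.283185307179586

    static double sine_table[TABLE_SIZE];

    /* Fill the table once: data standing in for process. */
    void init_sine_table(void)
    {
        for (int i = 0; i < TABLE_SIZE; i++)
            sine_table[i] = sin(TWO_PI * i / TABLE_SIZE);
    }

    /* Replace a computation with a lookup; 'turns' is a fraction of a
       full circle in the range [0, 1). */
    double fast_sine(double turns)
    {
        return sine_table[(int)(turns * TABLE_SIZE) % TABLE_SIZE];
    }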

A more serious challenge comes from the evolution of personal computing technology. In the last ten years, we have moved from an eight-bit 6502 running at 1 MHz to a 16-bit 68000 running at 8 MHz. This represents about a hundredfold increase in processing power. At the same time, though, RAM sizes have increased from a typical 4 kilobytes to perhaps 4 megabytes -- a thousandfold increase. Mass storage has increased from cassettes holding, say, 4K, to hard disks holding 20 megabytes -- a five thousandfold increase. Thus, data storage capacity is increasing faster than processing capacity. Under these circumstances, we would be foolish to fail to shift some of our emphasis to data intensity. But this consideration, while perfectly valid, is secondary in nature; it is a matter of adjustment rather than fundamental stance.

Then there is the argument that process and data are both necessary to good computing. Proponents of this school note that an algorithm without data to crunch is useless; they therefore claim that a good program establishes a balance between process and data. While the argument is fundamentally sound, it does not suggest anything about the proper mix between process and data. It merely establishes that some small amount of data is necessary. It does not in any way suggest that data deserves emphasis equal to that accorded to process.

The importance of process intensity does not mean that data has intrinsic negative value. Data endows a game with useful color and texture. An excessively process-intensive game will be so devoid of data that it will take on an almost mathematical feel. Consider, for example, this sequence of games: checkers - chess - Diplomacy - Balance of Power. As we move along this sequence, the amount of data about the world integrated into the game increases. Checkers is very pure, very clean; chess adds a little more data in the different capabilities of the pieces. Diplomacy brings in more data about the nature of military power and the geographical relationships in Europe. Balance of Power throws in a mountain of data about the world. Even though the absolute amount of data increases, the crunch per bit ratio remains high (perhaps it even increases) across this sequence. My point here is that data is not intrinsically evil; the amount of data can be increased if the amount of process is concomitantly raised.

The most powerful resistance to process intensity, though, is unstated. It is a mental laziness that afflicts all of us. Process intensity is so very hard to implement. Data intensity is easy to put into a program. Just get that artwork into a file and read it onto the screen; store that sound effect on the disk and pump it out to the speaker. There’s instant gratification in these data-intensive approaches. It looks and sounds great immediately. Process intensity requires all those hours mucking around with equations. Because it’s so indirect, you’re never certain how it will behave. The results always look so primitive next to the data-intensive stuff. So we follow the path of least resistance right down to data intensity.

Process intensity is a powerful theoretical concept for designing all kinds of software, not just games. Because it is so abstract, it is difficult to understand and implement, and numerous exceptions and compromising considerations arise when applying it. Nevertheless, it remains a useful theoretical tool in game design.