Abstract Graphics II

 

I have some clearer thoughts on the abstract graphics concept:

 

Frame

The image will be framed in a rectangle 640 pixels wide by 320 pixels high. The remaining pixels available on the screen (120 pixels after menubar and window titlebar) will be dedicated to the menu list.

 

Background

This will be a high-resolution, photorealistic image selected on the basis of the Location field of the substory. It will fill the image frame. The source image for this will be a long, panoramic 360-degree view of the location; the presented window will be set at a 50-degree slice of this (this assumes a 14" monitor viewed from 18" away). The slicing process will be semi-random but continuous within a single thread. That is, when a thread begins, we randomly choose any 50-degree subset of the 360-degree circle; however, so long as the thread persists, we retain that same 50-degree subset. This means that the player will get many opportunities to see the same area.
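
The slicing rule can be sketched as follows. This is only an illustration, not a settled design: the names slice_start and slice_columns, the per-thread cache, and the idea of addressing the panorama by pixel column are all my assumptions. The one piece taken from the text is the rule itself: pick a random 50-degree window when a thread begins, then keep it for the life of the thread.

```python
import random

PANORAMA_DEGREES = 360
WINDOW_DEGREES = 50

# Hypothetical cache: thread id -> starting angle of that thread's slice.
_slice_start_by_thread = {}


def slice_start(thread_id, rng=random):
    """Return the starting angle for a thread, choosing it once at random."""
    if thread_id not in _slice_start_by_thread:
        _slice_start_by_thread[thread_id] = rng.uniform(0, PANORAMA_DEGREES)
    return _slice_start_by_thread[thread_id]


def slice_columns(thread_id, panorama_width_px, rng=random):
    """Return the panorama pixel columns covered by the thread's slice.

    The slice may run past the 360-degree seam, so columns are taken
    modulo the panorama width.
    """
    px_per_degree = panorama_width_px / PANORAMA_DEGREES
    first = int(slice_start(thread_id, rng) * px_per_degree)
    count = round(WINDOW_DEGREES * px_per_degree)
    return [(first + i) % panorama_width_px for i in range(count)]
```

If the 50-degree slice is to map one-for-one onto the 640-pixel frame (an assumption), the full panorama would need to be 640 x 360 / 50 = 4608 pixels wide.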

 

The background will be enhanced with variable overlays in the form of background people images. Thus, the static image will consist of the landscape, buildings, room interiors, and other permanent features, but for each thread we will augment the static image with some static images of people in the background. For this purpose we will have a library of static overlays. There will be no animations in the background, as they would distract the eye.

 

Face Displays

The subject and the direct object will each be displayed using second-generation face display technology. This second generation will use 24-bit color and build a face 300 pixels high by 225 pixels wide. The display system will use the same basic algorithms as we used for the original face display technology, but the number of defining points of facial details will be increased. Moreover, this generation will include head attitude modifications for emotional expression. There will be three dimensions of head attitude, each constituting a rotation about an axis. These three are pitch, yaw, and tilt. Pitch is a rotation about a horizontal axis through the neck that is perpendicular to the line of sight. Thus, when pitch is applied, the person's nose goes up or down. Yaw is a side-to-side rotation of the head about a vertical axis through the center of the neck; when yaw is applied, the nose moves left and right. Tilt is a rotation of the head about an axis parallel to the line of sight through the neck. When tilt is applied, the nose will describe a circular arc. The technical details of pitch and yaw are simple. Each will offer only three settings: positive, zero, or negative. Each combination of pitch and yaw settings will require its own hand-drawn background face, so a total of nine hand-drawn faces (three pitch settings times three yaw settings) will be required for each character. Tilt will be accomplished with direct rotation of the image.
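
To make the bookkeeping concrete, here is a rough sketch of how the nine faces might be indexed and how tilt reduces to a plain 2D rotation. The function names, the -1/0/1 encoding of the three settings, and the choice of pivot point are my assumptions, not part of the design.

```python
import math

SETTINGS = (-1, 0, 1)  # negative, zero, positive (assumed encoding)


def face_index(pitch, yaw):
    """Map a (pitch, yaw) pair to one of the nine hand-drawn faces (0..8)."""
    if pitch not in SETTINGS or yaw not in SETTINGS:
        raise ValueError("pitch and yaw must each be -1, 0, or 1")
    return (pitch + 1) * 3 + (yaw + 1)


def apply_tilt(point, pivot, tilt_degrees):
    """Rotate a 2D image point about a pivot (the neck) by the tilt angle."""
    angle = math.radians(tilt_degrees)
    dx, dy = point[0] - pivot[0], point[1] - pivot[1]
    return (pivot[0] + dx * math.cos(angle) - dy * math.sin(angle),
            pivot[1] + dx * math.sin(angle) + dy * math.cos(angle))
```

Rotating every pixel of the chosen face through apply_tilt is what makes the nose describe a circular arc about the neck; a real implementation would presumably rotate the whole bitmap at once rather than point by point.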

 

Debatable question: should we have sideways faces so that subject and direct object face each other? It seems important, yet would require a completely new set of algorithms and data. The problem is, we lose a lot of facial information this way. I don't like it.

 

I think that we should assign separate emotional expressions to each actor in the substory. Perhaps these expressions should be assigned on the basis of role; this would be difficult, as it would require the S-code interpreter to intervene in the face display algorithm. Perhaps they should be assigned on the basis of semantic position (subject, dirobject, indobject). This would be more direct. After all, who needs to show anybody other than the actors themselves?

 

How do we differentiate between player-witnessed events, events described indirectly to the player, and events in which the player is a participant?

 

Witnessed: standard treatment, with indobject in the background.

 

Described: player sees describer in the corner, scaled down, with a scaled-down version of what he would see in a witnessed event.

 

Participant: player sees subject and verb only. No presentation of the player himself occurs.

 

Verb Display

This is the most critical problem. We need to represent (not depict!) each verb. This representation will take a more iconic style, and is idiosyncratic to each verb. How do we handle it? Should we rely on animation to communicate the notion of action? Or should we rely instead on static panels as in the comics? I lean against animation on the grounds that a simple three-step animation will become grating after a few minutes. Panels are the way to go. But will we have screen space?

 

Text Display

Supporting text will be presented in a comics style. That is, the text can appear anywhere in the image, and will have to be scrunched into place to fit. We'll use text bubbles or partially wiped backgrounds or bars along the bottom or top. This implies that we cannot have mobile elements in the screen; animations must be animations-in-place.

 

Composition

How are these four elements (subject, verb, dirobject, and text) placed on the screen? One approach is to fix the four elements: subject on the left, verb in the middle, dirobject on the right, and text shoved in either top or bottom. But this strikes me as too formal. We could provide either automatic composition or specifiable composition. Automatic composition would figure out some pattern that holds all the pieces. Specifiable composition allows the storybuilder to make that determination and build it into the verb.

 

An important point about composition: if there's no free space, there's no freedom of composition. Perhaps I need to reduce the sizes of my components so that there's enough free space on the screen to permit some composition. Hmm...

 

All these considerations imply to me several conclusions: first, we must have specifiable composition. Second, we must provide the ability to scale each component of the composition as well as place it.
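
One way to picture specifiable composition with scaling is a small placement record per component. This is a sketch under my own assumptions: the Placement record, its field names, and the helper functions are hypothetical, and only the 640 by 320 frame and the 225 by 300 face size come from the notes above.

```python
from dataclasses import dataclass

FRAME_W, FRAME_H = 640, 320  # image frame size from the Frame section


@dataclass
class Placement:
    """Hypothetical record: where one component goes and how large it is."""
    name: str
    x: int
    y: int
    width: int   # unscaled width in pixels
    height: int  # unscaled height in pixels
    scale: float = 1.0

    def rect(self):
        """Return (left, top, right, bottom) after scaling."""
        return (self.x, self.y,
                self.x + round(self.width * self.scale),
                self.y + round(self.height * self.scale))


def fits_in_frame(p):
    """True if the scaled component lies entirely inside the frame."""
    left, top, right, bottom = p.rect()
    return left >= 0 and top >= 0 and right <= FRAME_W and bottom <= FRAME_H


def overlaps(a, b):
    """True if two scaled components share any pixels."""
    al, at, ar, ab = a.rect()
    bl, bt, br, bb = b.rect()
    return al < br and bl < ar and at < bb and bt < ab


def free_area(placements):
    """Frame area minus component areas (assumes no overlaps)."""
    used = sum((r - l) * (b - t)
               for l, t, r, b in (p.rect() for p in placements))
    return FRAME_W * FRAME_H - used
```

Two full-size faces alone take 2 x 225 x 300 = 135,000 of the frame's 204,800 pixels, about two thirds, which is why scaling has to be part of any composition scheme.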

 

Static Animations

We must also provide a capability for static animations: image manipulations that continuously modify the image without moving it. The most obvious of these is the dissolve, of which I have developed many variations. Are these really necessary and useful? We could also do some color substitution. Could we also perhaps specify some kinds of static animations that include facial expressions? That could be expensive. I suspect that this is the biggest area of opportunism in our efforts.
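
For reference, the plainest form of dissolve looks something like the sketch below. It is a generic random-order pixel dissolve, not any of the particular variations mentioned above; the function name, the list-of-rows image format, and the seed parameter are assumptions made for illustration.

```python
import random


def dissolve_steps(old, new, steps, seed=0):
    """Yield intermediate images that dissolve `old` into `new` in place.

    Images are equal-sized lists of rows of pixel values. Each step swaps
    in another batch of pixels, chosen in a shuffled order, so nothing
    moves on screen; the image simply changes where it stands.
    """
    height, width = len(old), len(old[0])
    order = [(y, x) for y in range(height) for x in range(width)]
    random.Random(seed).shuffle(order)
    frame = [row[:] for row in old]  # work on a copy; `old` is untouched
    for step in range(1, steps + 1):
        start = (step - 1) * len(order) // steps
        end = step * len(order) // steps
        for y, x in order[start:end]:
            frame[y][x] = new[y][x]
        yield [row[:] for row in frame]
```

Color substitution would fit the same pattern: replace the per-pixel copy with a palette lookup and the image still changes without moving.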