First Thoughts on Abstract Graphics

This essay is my first attempt at defining what I mean by "abstract graphics".

 

Combinatorial imagery

The first conceptual component of abstract graphics is combinatorial imagery. This means that images are assembled out of parts rather than blitted in their entirety onto the screen. The parts should correspond to the components of the substory being represented.

 

Representation rather than depiction

Our goal is not to depict actions but to represent them. Representation implies a degree of abstraction and indirection. This is not the same as low resolution or poor imagery; it means that our imagery will be detailed but not explicit.

 

A graphics language

This necessarily entails the creation of an abstract graphics language that allows us to send a message to an object that will in turn create the imagery. We can think of the current face technology in such terms. We simply tell it three things: whose face we want, what facial expression we expect, and where we want it drawn. It does the rest. We'll need a greatly expanded version of this concept.
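As a minimal sketch of what such a message might look like, the three-part face request can be modeled as a small record handed to a renderer object. The names `FaceRequest` and `FaceRenderer` and the expression string are hypothetical, chosen only for illustration; nothing here reflects how the actual face technology is built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceRequest:
    """The three things the caller specifies (hypothetical names)."""
    character: str              # whose face we want
    expression: str             # what facial expression we expect
    position: tuple[int, int]   # where we want it drawn (x, y in pixels)


class FaceRenderer:
    """Stand-in for the object that 'does the rest'."""

    def __init__(self):
        self.draw_log = []

    def draw(self, request: FaceRequest) -> str:
        # A real renderer would assemble the face from parts here;
        # this sketch only records what it was asked to do.
        x, y = request.position
        description = f"{request.character} looking {request.expression} at ({x}, {y})"
        self.draw_log.append(description)
        return description


renderer = FaceRenderer()
message = renderer.draw(FaceRequest("Arthur", "angry", (0, 0)))
```

An expanded graphics language would presumably add more message types of this kind, one per sort of image the system can assemble.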

 

Representing action

It is easy to represent characters; we need merely use their faces (although we could also use their full bodies when appropriate). However, representing action is considerably more difficult. There is no general solution to the problem; each action must be tackled separately.

 

The role of sound

We should include sound within these considerations as well. I think that we are necessarily talking about music rather than sound effects. Music is, by the way, an excellent source of inspiration for our thinking in this matter. Who can listen to Beethoven's Sixth Symphony without seeing the shepherd with his broken flute? What soul is so dead as to listen to the "Fall" hunting movement of Vivaldi's Four Seasons without seeing the hunters gaily galloping in search of their prey? The only problem with music is its linearity, its expectation of continuity. A series of romantic substories with musical accompaniment swelling in romantic power would indeed be a powerful experience, but what if it were interrupted by a bit of bad news? How could that transition be handled effectively? I fear that this issue makes music a factor that should be included only after we have gotten a system up and running and developed some experience with it.

 

Classification as opposed to instantiation

Here's a novel approach: rather than think about each verb as a separate instance that will require its own custom fragment of identifying imagery, what if we classify each verb according to a variety of factors, rather like the classifier used in the Erasmotron, and then associate some imagery with each classification? Here is the set of classifications used in LMD, and some possible images associated with each one:

 

Battle: swords clashing

Light Romance: hearts and flowers

Heavy Romance

Involves Sex

Dealmaking

Witness

Nasty

Nice

Unwise

Magnanimous

Submissive

Deceptive

Assertive

Masculine

Feminine

Inquiry

Talk

Action

Conflict

Cooperation

Deferred

Justice

Concerns Arthur

Concerns Mordred

Plotline

Seedstory

Remembrance

Criminal

Two-party

Three-party

Involves Wealth
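To make the idea concrete, here is a rough sketch in which each verb carries a set of classifications and the imagery is attached to the classifications rather than to the verbs. The verb names and the `imagery_for` helper are assumptions for illustration; only the two images given in the list above come from the text.

```python
# Imagery attached to classifications, not to individual verbs.
# Only the two entries that have imagery in the list above are filled in.
CLASSIFICATION_IMAGERY = {
    "Battle": "swords clashing",
    "Light Romance": "hearts and flowers",
}

# Hypothetical verbs, each tagged with classifications from the list.
VERB_CLASSIFICATIONS = {
    "duel": {"Battle", "Conflict", "Two-party"},
    "flirt": {"Light Romance", "Talk", "Two-party"},
}


def imagery_for(verb):
    """Return the imagery for every classification of the verb that has any."""
    tags = VERB_CLASSIFICATIONS.get(verb, set())
    return sorted(
        CLASSIFICATION_IMAGERY[tag] for tag in tags if tag in CLASSIFICATION_IMAGERY
    )


duel_imagery = imagery_for("duel")
flirt_imagery = imagery_for("flirt")
```

The point of the design is that a new verb costs no new artwork: it only needs to be tagged.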

 

Wait a Minute!

What if I took a different approach? This classification system is specific to LMD; I have always assumed that every environment will have its own classification system. What if I establish a master classification system that will apply to all stories? If it's big enough, it could be useful. And what would be the basis of this divinely bestowed classification system? Roget's Thesaurus.

 

Here is a subset of the classifications used in Roget's:

 

Life / Death

Physical Sensibility / Physical Insensibility

Physical Pleasure / Physical Pain

Taste / Insipidity

Sweetness / Sourness

Fragrance / Fetor

Sound / Silence

Light / Darkness

Intellect / Absence of Intellect

Attention / Inattention

Care / Neglect

Inquiry / Answer

Probability / Improbability

Belief / Doubt

Assent / Dissent

Truth / Error

Expectation / Nonexpectation

Information / Concealment

Vigor / Feebleness

Plainness / Ornament

Loquacity / Taciturnity

Will / Necessity

Predetermination / Impulse

Good / Evil

Pursuit / Avoidance

Importance / Unimportance

Health / Disease

Improvement / Deterioration

Remedy / Bane

Safety / Danger

Action / Inaction

Haste / Leisure

Cunning / Artlessness

Difficulty / Facility

Hindrance / Aid

Opposition / Cooperation

Discord / Concord

Attack / Defense

Retaliation / Resistance

Success / Failure

Severity / Mildness

Disobedience / Obedience

Commission / Annulment

Permission / Prohibition

Offer / Refusal

Acquisition / Loss

Taking / Restitution

Wealth / Poverty

Credit / Debt

Expenditure / Receipt

Dearness / Cheapness

Liberality / Economy

Relief / Aggravation

Cheerfulness / Dejection

Rejoicing / Lamentation

Beauty / Ugliness

Hope / Hopelessness

Courage / Cowardice

Rashness / Caution

Desire / Dislike

Pride / Humility

Insolence / Servility

Friendship / Enmity

Gratitude / Ingratitude

Forgiveness / Revenge

Respect / Disrespect

Flattery / Detraction

Probity / Improbity

Innocence / Guilt

 

There are some 70 pairs of classifications here, with considerable overlap and a number of classifications that don't really apply to our work. What if I came up with a set of 32 standard classifications, and the storybuilder could add another 32 custom classifications specific to her story? That would work. OK, let's take a stab at 32 standard classifications:

 

Romantic

Sexual

Dealmaking

Nice

Nasty

Unwise

Magnanimous

Criminal

Noble

Ignoble

Submissive

Assertive

Deceptive

Masculine

Feminine

Inquiry

Talk

Action

Conflict

Cooperation

Deferred

Immediate

One-Party

Two-party

Three-party

Involves Wealth

Expression of Feeling

Vigorous

Retaliatory

Sensational

 

That's only 30, but I think this establishes that it can be done. OK, now let's consider how these classifications might be handled with abstract graphics:

 

Romantic: hearts and flowers border

Sexual: throbbing

Dealmaking: bilateral symmetry of image

Nice: bright clean colors

Nasty: ugly colors

Unwise: darkness

Magnanimous: wide field of view

Criminal: low point of view

Noble: high point of view

Ignoble: low point of view

Submissive: downward looking

Assertive: strong bold lines

Deceptive: inversion

Masculine: lines

Feminine: circles

Inquiry

Talk: light, fluffy imagery

Action: heavy, solid imagery

Conflict: collision

Cooperation: parallelism

Deferred: distance

Immediate: proximity

One-Party: one face

Two-party: two faces

Three-party: three faces

Involves Wealth

Expression of Feeling: texture

Vigorous: thrusting

Retaliatory: cyclicity

Sensational: explosion
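One way the scheme of 32 standard plus 32 custom classifications might be stored is as a single 64-bit mask per verb, with the standard classifications in the low 32 bits and the storybuilder's custom ones in the high 32. This is only a sketch under that assumption; the class name `ClassificationRegistry` is made up, and "Concerns Arthur" is borrowed from the LMD list as an example of a story-specific classification.

```python
STANDARD = [
    "Romantic", "Sexual", "Dealmaking", "Nice", "Nasty", "Unwise",
    "Magnanimous", "Criminal", "Noble", "Ignoble", "Submissive",
    "Assertive", "Deceptive", "Masculine", "Feminine", "Inquiry",
    "Talk", "Action", "Conflict", "Cooperation", "Deferred",
    "Immediate", "One-Party", "Two-party", "Three-party",
    "Involves Wealth", "Expression of Feeling", "Vigorous",
    "Retaliatory", "Sensational",
]  # the 30 standard classifications listed so far

STANDARD_SLOTS = 32   # bits 0..31 are reserved for standard classifications
CUSTOM_SLOTS = 32     # bits 32..63 are reserved for the storybuilder


class ClassificationRegistry:
    def __init__(self):
        self.bit_of = {name: i for i, name in enumerate(STANDARD)}
        self.custom_count = 0

    def add_custom(self, name):
        """Assign the next free custom bit to a story-specific classification."""
        if self.custom_count >= CUSTOM_SLOTS:
            raise ValueError("all 32 custom slots are in use")
        self.bit_of[name] = STANDARD_SLOTS + self.custom_count
        self.custom_count += 1

    def mask(self, *names):
        """Combine the named classifications into one integer mask."""
        result = 0
        for name in names:
            result |= 1 << self.bit_of[name]
        return result

    def names(self, mask):
        """List the classifications whose bits are set in the mask."""
        return [n for n, bit in self.bit_of.items() if (mask >> bit) & 1]


registry = ClassificationRegistry()
registry.add_custom("Concerns Arthur")
verb_mask = registry.mask("Conflict", "Two-party", "Concerns Arthur")
```

Two standard slots remain unassigned here, which leaves room for the list to grow to the full 32.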

 

OK, now let's come at this problem from a completely different direction: what's our palette? What visual tools do we have available?

 

background color

background texture

foreground texture

foreground color

background texture animation

image frame

dissolves

image motion

image shimmer

image solidity

line width

speed of animation

 

 

By themselves, these things don't do anything for me. But what if we also had a repertoire of basic images, symbols, or animations that could go into this? Just as a rapid sequence of high notes suggests something small and rapid, and the reverberation of a gong suggests exotic power, can we not find a set of visual elements that suggest meaning?

 

Is this not really a matter of finding hieroglyphics? Or are they too schematic already?

 

Tuesday, March 19th

 

Last night I re-read Scott McCloud's Understanding Comics. It triggered many useful thoughts. First and most fundamental, do I want to use full animation or "static animation" with still primary images? Full animation is the obvious choice, but it suffers from the dual defect that 1) it demands the total attention of the player (dontcha dare miss anything, y'hear?) and 2) it lasts for only a few seconds, leaving a dead image while the player contemplates his next move. For both of these reasons, I am unenthusiastic about this option, however expected it might be. The second option uses a set of static primary images in the style of a comic strip, with the animation confined to a short-period cyclic animation of the background or some other secondary feature of the image. This solves both of the previous problems, but raises new problems with screen space. How will I fit this into the screen? I could use a schematic style, with each verb represented by a static image or two, nestled between facial images of the subject and the direct object. I could even use an explicitly comic/hieroglyphic approach, with a vertical sequence of events, each row containing a single event in subject/verb/direct object sequence, but this strikes me as overly schematic in style. It's probably best to communicate a single event on each screen, separating the events by time rather than screen space.

 

Here's what I keep coming back to: a dissolving collage of three basic images:

 

1. The facial image of the subject, animated with emotional expression

2. One or two images representing the verb (NOT depicting the action!)

3. The facial image of the direct object, animated with emotional expression

 

Suppose now that this takes place in a pane 512h x 384v. We come up with a broad collection of dissolve/wipes: standard dissolve, shimmer dissolve, radar dissolve, horizontal edge dissolve, vertical edge dissolve, horizontal slat dissolve, vertical slat dissolve, reverberatory dissolve, random patch dissolve, radial dissolve, spiral dissolve, and so forth. Each dissolve can of course have its own intensity, variable and less than 100%. Suppose now that each facial image is 256h x 384v; the two facial images by themselves fill the pane. The verb image would occupy the entire 512h x 384v space, but it would timeshare the pane with the two facial images using the dissolves. That's visually curious, parsimonious of screen space, yet stable in its information delivery.
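The geometry and the timesharing can be sketched roughly as follows. The pane and face dimensions come from the paragraph above; everything else is an assumption for illustration: the `Rect` type, placing the subject on the left and the direct object on the right, and treating a standard dissolve as a per-channel linear blend in which `intensity` is the verb image's share.

```python
from dataclasses import dataclass

PANE_W, PANE_H = 512, 384   # the whole pane
FACE_W, FACE_H = 256, 384   # each facial image


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int


def layout():
    """Two faces side by side; the verb image covers the whole pane."""
    return {
        "subject": Rect(0, 0, FACE_W, FACE_H),
        "direct_object": Rect(FACE_W, 0, FACE_W, FACE_H),
        "verb": Rect(0, 0, PANE_W, PANE_H),
    }


def dissolve_pixel(face_rgb, verb_rgb, intensity):
    """Blend one 24-bit pixel; intensity is the verb image's share, 0.0 to 1.0."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must be between 0 and 1")
    return tuple(
        round(f * (1.0 - intensity) + v * intensity)
        for f, v in zip(face_rgb, verb_rgb)
    )


rects = layout()
half_way = dissolve_pixel((200, 100, 0), (0, 100, 200), 0.5)
```

The other dissolves in the list (slat, radial, spiral, and so on) would differ mainly in how `intensity` varies across the pane and over time, rather than in the blend itself.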

 

I just wish that there were another image to toss into the mix; alternating between two images is visually penurious. What if I gave the artist the option to create multiple images to represent the verb? If there were two or even three such images, it could get interesting. Disk real estate: 1,000 verbs times 2 images per verb at 512h x 384v by 24 bits deep yields (uncompressed) just about a gigabyte. However, 24-bit images will compress quite well, so I don't think this will be a problem. I am a little worried about those dissolves operating on 24-bit images; they may run too slowly.
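The storage estimate can be checked with a few lines of arithmetic. The figures are the ones given above; the only assumption is that 24 bits means three bytes per pixel with no padding.

```python
VERBS = 1_000
IMAGES_PER_VERB = 2
WIDTH, HEIGHT = 512, 384
BYTES_PER_PIXEL = 3  # 24 bits, assuming no padding

bytes_per_image = WIDTH * HEIGHT * BYTES_PER_PIXEL
total_bytes = VERBS * IMAGES_PER_VERB * bytes_per_image
total_gib = total_bytes / 2**30
```

That works out to 589,824 bytes per image and 1,179,648,000 bytes in total, a little over a gigabyte and consistent with the estimate above. Three images per verb would raise the total by half.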

 

Words plus images: Scott's right: it's idiotic to try to communicate everything with the image alone. Words complement the deficiencies of images, and vice versa; we should use both forms of representation to communicate a substory.

 

Tales versus substories: a tale is a collection of causally related substories. The use of words alone allows me to string together tales, but with images I'll have to break up a tale into a sequence of substories. There's no way around this problem. Grin and bear it.

 

Background color, texture, etc: I'd sure like to use more abstract visual forms to communicate something. The use of animated textures or background colors could work wonders here. It would be so nice if this could be tied in with the taxonomy I was considering earlier. So far this idea is a dry hole.

 

Building that taxonomy: this is something I need professional help with. It needs to be carefully honed.

 

Time to take a break and digest...