Enactive Interface Perception and Affordances

Posted in artificial intelligence, interfaces, philosophy on November 14th, 2011 by Samuel Kenyon

There are two freaky theories of perception that interest me not just for artificial intelligence, but also from the point of view of interfaces, interactions, and affordances. The first is Alva Noë’s enactive approach to perception. The second is Donald D. Hoffman’s interface theory of perception.

Enactive Perception vs. Interface Perception

Enactive Perception

The key element of the enactive approach to perception is that sensorimotor knowledge and skills are a required part of perception [1].

In the case of vision, there is a tradition of keeping vision separate from the other senses and from sensorimotor abilities, and of treating it as a reconstruction program (inverse optics). The enactive approach suggests that visual perception is not simply a transformation of 2D pictures into a 3D representation, and that vision is dependent on sensorimotor skills. Indeed, the enactive approach claims that all perceptual representation is dependent on sensorimotor skills.

Example of optical flow (one of the ways to get structure from motion)

My interpretation of the enactive approach proposes that perception co-evolved with motor skills such as how our bodies move and how our sensors (for instance, our eyes) move. A static 2D image cannot tell you which color blobs are objects and which are merely artifacts of the sensor or environment (e.g. light effects). But if you walk around the scene, and take into account how you are moving, you get a lot more data to figure out what is stable and what is not. We have evolved to have constant motion in our eyes via saccades, so even without walking around or moving our heads, we are getting this motion data for our visual perception system.
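To make the moving-observer point concrete, here is a minimal sketch of motion parallax. It assumes an idealized pinhole camera that slides sideways by a known distance while the scene stays still. Under those assumptions a point at depth Z shifts in the image by d = f * T / Z, so knowing your own motion T lets you turn image shift into depth. The function name and the numbers are hypothetical, chosen only for illustration.

```python
def depth_from_parallax(focal_length_px, baseline_m, shift_px):
    """Depth of a static point seen by a pinhole camera that moved sideways.

    Assumes pure lateral translation (no rotation) and a static scene.
    """
    if shift_px <= 0:
        raise ValueError("shift must be positive; zero shift gives no depth cue")
    return focal_length_px * baseline_m / shift_px


# Hypothetical numbers: 800 px focal length, observer steps 0.5 m sideways.
near = depth_from_parallax(800.0, 0.5, 40.0)  # large image shift
far = depth_from_parallax(800.0, 0.5, 4.0)    # small image shift
print(near, far)
```

Blobs that shift a lot as you move are close, and blobs that barely shift are far away. A single static image gives you no such shift to work with.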

Of course, there are some major issues that need to be resolved, at least in my mind, about enactive perception (and related theories). As Aaron Sloman has pointed out repeatedly, we need to fix or remove dependence on symbol grounding. Do all concepts, even abstract ones, exist in a mental skyscraper built on a foundation of sensorimotor concepts? I won’t get into that here, but I will hopefully return to it in a later blog post.

The enactive approach says that you should be careful about making assumptions that perception (and consciousness) can be isolated on one side of an arbitrary interface. For instance, it may not be alright to study perception–or consciousness–by looking just at the brain. It may be necessary to include much more of the mind-environment system–a system which is not limited to one side of the arbitrary interface of the skull.

Perception as a User Interface

human-computer interfaces (this still from Matrix Reloaded)

The Interface Theory of Perception says that “our perceptions constitute a species-specific user interface that guides behavior in a niche” [2].

Evolution has provided us with icons and widgets to hide the true complexity of reality. This reality user interface allows organisms to survive better in particular environments, hence the selection for it.

Perception as an interface

Or, as Hoffman et al. summarize the conceptual link from computer interfaces [3]:

An interface promotes efficient interaction with the computer by hiding its structural and causal complexity, i.e., by hiding the truth. As a strategy for perception, an interface can dramatically trim the requirements for information and its concomitant costs in time and energy, thus leading to greater fitness. But the key advantage of an interface strategy is that it is not required to model aspects of objective reality; as a result it has more flexibility to model utility, and utility is all that matters in evolution.

Besides supporting the theory with simulations, Hoffman [2] uses a colorful real-world example: he describes how male jewel beetles use a reality user interface to find females. This perceptual interface is composed of simple rules involving the color and shininess of female wing cases. Unfortunately, it evolved in a niche that did not include the trash humans now drop, which leads to false positives. The result is male jewel beetles humping empty beer bottles.
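The beetle's perceptual shortcut can be sketched as a two-feature rule. This is a toy illustration, not a model taken from [2] or [4]: the feature names, the 0-to-1 scales, and the thresholds are all made up.

```python
from dataclasses import dataclass


@dataclass
class Sighting:
    label: str
    brownness: float  # hypothetical 0..1 scale
    shininess: float  # hypothetical 0..1 scale


def looks_like_female(s: Sighting) -> bool:
    """Cheap interface rule: brown enough and shiny enough counts as a mate."""
    return s.brownness > 0.6 and s.shininess > 0.6


female = Sighting("female wing case", brownness=0.7, shininess=0.7)
bottle = Sighting("beer bottle", brownness=0.8, shininess=0.9)
leaf = Sighting("dry leaf", brownness=0.7, shininess=0.1)

for s in (female, bottle, leaf):
    print(s.label, looks_like_female(s))
```

The rule never checks whether the object is actually a beetle. In the niche where it evolved that check was unnecessary, so the bottle passes just as easily as the female.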

Male Australian jewel beetle attempting to mate with a discarded “stubby” (beer bottle)

For more info on the beetles, see this short biological review [4] which includes “discussion regarding the habit of the males of this species to attempt mating with brown beer-bottles.” It also notes:

Schlaepfer et al. (2002) point out that organisms often rely on environmental cues to make behavioural and life-history decisions. However, in environments which have been altered suddenly by humans, formerly reliable cues might no longer be associated with adaptive outcomes. In such cases, organisms can become trapped by their evolutionary responses to the cues and experience reduced survival or reproduction (Schlaepfer et al., 2002).

All perception, including that of humans, evolved for adaptation to niches. There is no reason or evidence to suspect that our reality interfaces provide “faithful depictions” of the objective world. Fitness trumps truth. Hoffman says that Noë supports a version of faithful depiction within enactive perception, although I don’t see how that is necessary for enactive perception.
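Here is a toy sketch of how "fitness trumps truth" can play out. It is not the simulation from [2] or [3]; it only borrows the general idea. The assumptions are mine: a resource quantity between 0 and 100, a payoff that peaks at a moderate quantity (too little or too much is bad), and two strategies that each get a single bit of perception. The "realist" bit tracks the structure of the world (more vs. less), while the "interface" bit tracks utility (good vs. bad).

```python
import math
import random


def payoff(quantity):
    """Hypothetical fitness payoff: peaks at a moderate quantity (50)."""
    return math.exp(-((quantity - 50.0) ** 2) / (2 * 15.0 ** 2))


def realist_percept(quantity):
    """One bit that tracks the world's structure: 'more' (1) vs 'less' (0)."""
    return 1 if quantity > 50.0 else 0


def interface_percept(quantity):
    """One bit that tracks utility: 'good' (1) vs 'bad' (0)."""
    return 1 if abs(quantity - 50.0) < 25.0 else 0


def choose(percept, a, b):
    """Pick the territory whose percept is higher; on a tie take the first."""
    return a if percept(a) >= percept(b) else b


def average_payoff(percept, trials=10_000, seed=0):
    """Mean payoff over random pairs of territories (same pairs per seed)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        a, b = rng.uniform(0, 100), rng.uniform(0, 100)
        total += payoff(choose(percept, a, b))
    return total / trials


realist = average_payoff(realist_percept)
interface = average_payoff(interface_percept)
print(f"realist: {realist:.3f}, interface: {interface:.3f}")
```

In this setup the interface strategy comes out ahead. The realist's bit is true to the ordering of quantities, but because the payoff is not monotonic in quantity, "more" tells it nothing about which territory is better.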


One might think of perception as interactions within a system. This system contains the blobs of matter we typically refer to as an “organism” and its “environment.”

You’ll notice that in the diagram in the previous section, “environment” and “organism” are in separate boxes. But that can be very misleading. Really the organism is part of the environment:

Of course, the organism itself is part of the environment.

True Perception is Right Out the Window

How do we know what we know about reality? There seems to be a consistency at our macroscopic scale of operation. One consistency is due to natural genetic programs–and programs they in turn cause–which result in humans having shared knowledge bases and shared kinds of experience. If you’ve ever not been on the same page as somebody, then you can imagine what it would be like if we didn’t have anything in common conceptually. Communication would be very difficult. For every other entity you wanted to communicate with, you’d have to establish communication interfaces, translators, interpreters, etc. And how would you even know who to communicate with in the first place? Maybe you wouldn’t have even evolved communication.

So humans (and probably many other related animals) have experiences and concepts that are similar enough that we can communicate with each other via speech, writing, physical contact, gestures, art, etc.

But for all that shared experience and ability to generate interfaces, we have no inkling of reality.

Since the interface theory of perception says that our perception is not necessarily realistic, and is most likely not even close to being realistic, does this conflict with the enactive theory?

Noë chants the mantra that the world makes itself available to us (echoing some of the 1980s/1990s era Rod Brooks / behavioral robotics approach of “world as its own model”). If representation is distributed in a human-environment system, does it have to be a veridical (truthful) representation? No. I don’t see why that has to be the case. So it seems that the non-veridical nature of perception should not prevent us from combining these two theories.


A chair affords sitting, a book affords turning pages.

Another link that might assist synthesizing these two theories is that of J.J. Gibson’s affordances. Affordances are “actionable properties between the world and an actor (a person or animal)” [5].

The connection of affordances to the enactive approach is provided by Noë (here he’s using an example of flatness):

To see something is flat is precisely to see it as giving rise to certain possibilities of sensorimotor contingency…Gibson’s theory, and this is plausible, is that we don’t see the flatness and then interpret it as suitable for climbing upon. To see it as flat is to see it as making available possibilities for movement. To see it as flat is to see it, directly, as affording certain possibilities.

Noë also states that there is a sense in which all objects of perception are affordances. I think this implies that if there is no affordance relationship between you and a particular part of the environment, then you will not perceive that part. It doesn’t exist to you.

The concept of affordances is also used, in a modified form, in interaction design. If you are a designer or understand design, you can perhaps see how affordances in nature have to be perceived by animals so that they can survive. It is perhaps the inverse of the design problem: instead of making the artifact afford action for the user, the animal had to make itself comprehend certain affordances through evo-devo.

Design writer Don Norman makes a point of distinguishing between “real” and “perceived” affordances [5]. That makes sense in the context of his examples, such as human-computer interfaces. But are any affordances actually real? That question leads back to the interface theory of perception: animals perceive affordances, but there’s no guarantee those affordances are veridical.

1. Noë, A., Action in Perception, Cambridge, MA: MIT Press, 2004.
2. Hoffman, D.D., “The interface theory of perception: Natural selection drives true perception to swift extinction,” in Dickinson, S., Leonardis, A., Schiele, B., & Tarr, M.J. (Eds.), Object Categorization: Computer and Human Vision Perspectives, Cambridge, UK: Cambridge University Press, 2009, pp. 148-166.
3. Mark, J.T., Marion, B.B., & Hoffman, D.D., “Natural selection and veridical perceptions,” Journal of Theoretical Biology, vol. 266, 2010, pp. 504-515.
4. Hawkeswood, T., “Review of the biology and host-plants of the Australian jewel beetle Julodimorpha bakewelli,” Calodema, vol. 3, 2005.
5. Norman, D., “Affordances and Design.” http://www.jnd.org/dn.mss/affordances_and_design.html

Image credits: iamwilliam, T. Hawkeswood [4], Matrix Reloaded (film), Old Book Illustrations.
Diagrams created by Samuel H. Kenyon.

This is an improved/expanded version of an essay I originally posted February 24th, 2010, on my blog SynapticNulship.

Gamification and Self-Determination Theory

Posted in interaction design on November 9th, 2011 by Samuel Kenyon

Games are not just for fun anymore—and indeed “fun” is not a good enough description for the psychology of gameplay anyway. Designers are trying to “gamify” applications which traditionally were not game-like at all. And this isn’t limited to just the Serious Games movement that’s been around for several years. This is a type of design thinking that has spread from the gaming world and is now merging with the User Experience Design / Interaction Design world.

Beyond the hype and mistakes of gamification that might be going on right now, there does seem to be a kind of design thinking emerging that aims to increase users’ engagement with products and their motivation to use them. I assume the business angle is that this can bring in more users and keep them longer.

Dustin DiTommaso, experience design director at Mad*Pow, presented “Beyond Gamification: Architecting Engagement Through Game Design” yesterday. As I already mentioned, he says that “fun” is not a good definition. His main psychological theory (at least for this presentation) is Self-Determination Theory (SDT). What follows are my notes based on DiTommaso’s presentation (hopefully I haven’t butchered it too much).

Games keep people in intrinsic motivation. There are three intrinsic motivation needs (these terms are directly from SDT):

  1. Competence
  2. Autonomy
  3. Relatedness


Competence

This is about meaningful growth. Good games achieve a path to mastery. The user experiences increased skill over time. There are nested short-term achievable goals that lead to success of the overarching long-term goal.

The experience should be that of a challenge. If you’re familiar with Csíkszentmihályi’s Flow, it is similar to (or perhaps exactly the same as) that.

As with most good interaction design, there has to be feedback. Specifically, there has to be:

  1. Meaningful information
  2. Recognition
  3. Next steps

Action-Rules-Feedback loop

On the meaningful info item: Progress should be made visible. But rewards have to be meaningful. Rewards for meaningless actions are not good in the long term: users will hack (or “game”) the system if they get bored and/or detached.
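The nested goals and the three feedback items above can be sketched as a small data structure. This is my own illustration, not anything DiTommaso presented: the `Goal` class, the `feedback` function, and the song example are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Goal:
    name: str
    done: bool = False
    subgoals: List["Goal"] = field(default_factory=list)

    def progress(self) -> float:
        """Fraction complete: leaves are 0 or 1, parents average their children."""
        if not self.subgoals:
            return 1.0 if self.done else 0.0
        return sum(g.progress() for g in self.subgoals) / len(self.subgoals)


def next_step(goal: Goal) -> Optional[Goal]:
    """First unfinished leaf goal in depth-first order, or None if all are done."""
    if not goal.subgoals:
        return None if goal.done else goal
    for sub in goal.subgoals:
        step = next_step(sub)
        if step is not None:
            return step
    return None


def feedback(root: Goal, just_completed: Goal) -> dict:
    """Bundle the three feedback items: information, recognition, next steps."""
    step = next_step(root)
    return {
        "information": f"{root.progress():.0%} of '{root.name}' complete",
        "recognition": f"Nice work finishing '{just_completed.name}'",
        "next_step": step.name if step else "All done",
    }


verse = Goal("Verse", done=True)
song = Goal("Learn the song", subgoals=[verse, Goal("Chorus"), Goal("Solo")])
result = feedback(song, just_completed=verse)
print(result)
```

Progress is computed from the goal tree rather than stored, so the visible number always reflects completed short-term goals instead of arbitrary points.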

Screenshot from Rock Band 3 (developed by Harmonix)

DiTommaso says that you should strive for “juicy” feedback. For example, the interface for the popular video game series Rock Band is entirely “juicy” feedback. Visual Thesaurus is a good example of juicy feedback that is less flashy than Rock Band.

Failure should be allowed in a graceful manner if it provides an opportunity to learn and grow. This might sound weird for interaction design where usually you don’t want users to fail at all. Mad*Pow supposedly has done research to back this up.


Autonomy

The game belongs to the user. Choice, control, and personal preference lead to deep engagement and loyalty. There has to be the right feedback for the type of autonomy for a given user. Experience pathways can be designed “on rails” to limit or give the illusion of freedom.

To motivate sustained interest the game should provide opportunities for action. For example, on a ski mountain, there are literally multiple pathways, and multiple levels of difficulty.


Relatedness

This is about mutual dependence. We’re intrinsically motivated to seek meaningful connections with others.

A game should provide meaningful communities of interest. The users should somehow be able to value something in the game beyond the mechanics that run the system. The users should get recognition for actions that matter to them. And they should be able to inject their own goals. An example of a system that allows user-customizable goals is Mint.com.

It’s also worthwhile to think of non-human relatedness. Dialogues between user interface avatars and humans actually matter and affect motivation. They are a type of relationship. So scripts, text, tones, etc. are very important.


This is my rough interpretation of DiTommaso’s “Framework for Success” intended for designers and related professions.

  1. Why gamify? Consider the users and the business cases.
  2. Research the player profile(s) (perhaps game-oriented personas?). This research can and should inspire the design. What are the motivational drivers? Is it more about achievement or enjoyment? Is it more about structure or freedom? Is it more about control of others or connecting with others? Is it more about self interest or social interest?
  3. Goals and objectives: What’s the Long Term Goal? What steps? Etc.
  4. Skills and actions: consider what physical, mental, and social abilities are necessary. Can the skills be tracked and measured?
  5. Look through the lenses of interest. The concept of “lenses of interest” comes from Jesse Schell. The list of lenses provided by DiTommaso is:
    • Competition types
    • Time pressure
    • Scarcity
    • Puzzles
    • Novelty
    • Levels
    • Social pressure/proof (the herd must be right)
    • Teamwork
    • Currency
    • Renewals and power-ups
  6. Desired outcomes: What are the tangible and intangible rewards? What outcomes are triggered by user actions vs. schedules? How do users see and feel incremental success and failure on the way to the Ultimate Objective?
  7. Play-test and polish: Platforms are never done. This isn’t really specific to gamification. I would say this is about the general shift from waterfall to iterative development methodologies (which I have used successfully in my own work). This can even extend out to the actual end users—they can be involved in the loop and even expect updates for improvement.

Image Credits:
1. Nightrob
2. Dustin DiTommaso / Mad*Pow
3. IGN
4. Mount Sunapee
