What are Symbols in AI?

Posted in artificial intelligence on February 22nd, 2011 by Samuel Kenyon

A main underlying philosophy of artificial intelligence and cognitive science is that cognition is computation.  This leads to the notion of symbols within the mind.

There are many paths to explore how the mind works.  One might start from the bottom, as is the case with neuroscience or connectionist AI, and thereby avoid symbols at first.  But once you start poking around the middle and the top, symbols abound.

Besides the metaphor of top-down vs. bottom-up, there is also the crude summary of Logical vs. Probabilistic.  Some people have made theories that they think could work at all levels, starting with the connectionist basement and moving all the way up to the tower of human language, for instance Optimality Theory.   I will quote one of the Optimality Theory creators, not because I like the theory (I don’t, at least not yet), but because it’s a good summary of the general problem [1]:

Precise theories of higher cognitive domains like language and reasoning rely crucially on complex symbolic rule systems like those of grammar and logic. According to traditional cognitive science and artificial intelligence, such symbolic systems are the very essence of higher intelligence. Yet intelligence resides in the brain, where computation appears to be numerical, not symbolic; parallel, not serial; quite distributed, not as highly localized as in symbolic systems. Furthermore, when observed carefully, much of human behavior is remarkably sensitive to the detailed statistical properties of experience; hard-edged rule systems seem ill-equipped to handle these subtleties.

Now, when it comes to theorizing, I’m not interested in getting stuck in the wild goose chase for the One True Primitive or Formula.  I’m interested in cognitive architectures that may include any number of different methodologies.  And those different approaches don’t necessarily result in different components or layers.  It’s quite possible that within an architecture like the human mind, one type of structure can emerge from a totally different structure.  But depending on your point of view—or level of detail—you might see one or the other.

At the moment I’m not convinced of any particular definition of mental symbol.  I think that a symbol could in fact be an arbitrary structure, for example an object in a semantic network which has certain attributes.  The sort of symbols one uses in everyday living come into play when one structure is used to represent another structure.  Or, perhaps instead of limiting ourselves to “represent,” I should just say “provides an interface.”  One would expect a good symbol-producing interface to be a simplifying one.  As an analogy, you use symbols on computer systems all the time.  One touch of a button on a cell phone activates thousands of lines of code, which may in turn activate other programs and so on.  You don’t need to understand how any of the code works, or how any of the hardware running the code works.  The symbols provide a simple way to access something complex.
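As a toy sketch of that idea (every name below is invented for illustration, not taken from any real phone API), a single symbol, the function name `dial`, hides all of the lower-level machinery behind one token:

```python
# Toy sketch of a symbol as a simplifying interface.
# All names here are made up for illustration; this is not a real phone API.

def encode_digits(number):
    """Stands in for the thousands of lines that turn digits into signals."""
    return [int(d) for d in number]

def transmit(frames):
    """Stands in for the radio and protocol layers underneath."""
    return "sent {} frames".format(len(frames))

def dial(number):
    """The one 'button press' the user sees; everything else is hidden."""
    return transmit(encode_digits(number))

print(dial("5551234"))  # the caller needs no knowledge of the layers below
```

The point is not the code itself but the shape of it: the caller manipulates one simple token while arbitrarily complex structure does the work underneath.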

A system of simple symbols that can be easily combined into new forms also enables wonderful things like language.  And the ability to set up signs for representation (semiosis) is perhaps a partial window into how the mind works.

One of my many influences is Society of Mind by Marvin Minsky [2], which is full of theories of these structures that might exist in the information flows of the mind.  However, Society of Mind attempts to describe most structures as agents.  An agent isn’t merely a structure being passed around; it is also actively processing information itself.

Symbols are also important when one is considering if there is a language of thought, and what that might be.  As Minsky wrote:

Language builds things in our minds.  Yet words themselves can’t be the substance of our thoughts.  They have no meanings by themselves; they’re only special sorts of marks or sounds…we must discard the usual view that words denote, or represent, or designate; instead, their function is control: each word makes various agents change what various other agents do.

Or, as Douglas Hofstadter puts it [3]:

Formal tokens such as ‘I’ or “hamburger” are in themselves empty. They do not denote.  Nor can they be made to denote in the full, rich, intuitive sense of the term by having them obey some rules.

Throughout the history of AI, I suspect, people have made intelligent programs and chosen some atomic object type to use for symbols, sometimes even something intrinsic to the programming language they were using.  But simple symbol manipulation doesn’t result in human-like understanding.  Hofstadter, at least in the 1970s and 80s, said that symbols have to be “active” in order to be useful for real understanding.  “Active symbols” are actually agencies which have the emergent property of symbols.  They are decomposable, and their constituent agents are quite stupid compared to the kind of cognitive information the symbols take part in.  Hofstadter compares these symbols to teams of ants that pass information between teams which no single ant is aware of.  And then there can be hyperteams and hyperhyperteams.

[1] P. Smolensky http://web.jhu.edu/cogsci/people/faculty/Smolensky/
[2] M. Minsky, Society of Mind, Simon & Schuster, 1986.
[3] D. Hofstadter, Metamagical Themas, Basic Books, 1985.


Enactive Interface Perception

Posted in artificial intelligence, interfaces on February 24th, 2010 by Samuel Kenyon

UPDATE 2011: There is a new/better version of this essay:  “Enactive Interface Perception and Affordances”.

There are two theories of perception which are very interesting to me not just for AI, but also from a point of view of interfaces, interactions, and affordances.  The first one is Alva Noë’s enactive approach to perception.  The second one is Donald D. Hoffman’s interface theory of perception.

Enactive Perception vs. Interface Perception

Enactive Perception

The key element of the enactive approach to perception is that sensorimotor knowledge and skills are a required part of perception.

A lot of artificial perception schemes, e.g. for robots, run algorithms on camera video frames.  Some programs also use the time dimension, e.g. structure from motion.  They can find certain objects and even extract 3D data (especially if they also use a range scanner such as LIDAR, ultrasound, or radar).  But the enactive approach suggests that animal visual perception is not simply a transformation of 2-D pictures into a 3-D (or any other kind of) representation.

Example of optical flow (one of the ways to get structure from motion). Credit: Naoya Ohta.
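For a concrete feel for the machinery, here is a minimal sketch of the brightness-constancy idea behind optical flow, using only NumPy.  It recovers one global flow vector from two synthetic frames; a real system would compute a dense per-pixel field, so treat this as just the core least-squares step:

```python
import numpy as np

def gaussian_frame(cx, cy, size=64, sigma=6.0):
    """A smooth synthetic frame: a 2-D Gaussian blob centered at (cx, cy)."""
    y, x = np.mgrid[0:size, 0:size]
    return np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))

f1 = gaussian_frame(30.0, 32.0)
f2 = gaussian_frame(31.0, 32.0)   # the blob has moved +1 pixel in x

# Spatial and temporal derivatives (brightness-constancy assumption:
# Ix*u + Iy*v + It = 0 for the true motion (u, v)).
Ix = np.gradient(f1, axis=1)
Iy = np.gradient(f1, axis=0)
It = f2 - f1

# Solve [Ix Iy] @ [u, v]^T = -It over all pixels by least squares.
A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
b = -It.ravel()
(u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
print(u, v)  # u should be close to 1.0 and v close to 0.0
```

This is the same brightness-constancy constraint that dense methods like the one pictured above solve per neighborhood rather than globally.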

My interpretation of the enactive approach is that it suggests perception co-evolved with motor skills such as how our bodies move and how our sensors, for instance our eyes, move.  A static 2D image cannot tell you which color blobs are objects and which are merely artifacts of the sensor or environment (e.g. light effects).  But if you walk around the scene, and take into account how you are moving, you get a lot more data with which to figure out what is stable and what is not.  We have evolved to have constant motion in our eyes via saccades, so even without walking around or moving our heads, we are getting this motion data for our visual perception system.

Of course, there are some major issues that need to be resolved, at least in my mind, about enactive perception (and related theories).  As Aaron Sloman has pointed out repeatedly, we need to fix or remove dependence on symbol grounding.  Do all concepts, even abstract ones, exist in a mental skyscraper built on a foundation of sensorimotor concepts?  I won’t get into that here, but I will return to it in a later blog post.

The enactive approach says that you should be careful about making assumptions that perception (and consciousness) can be isolated on one side of an arbitrary interface.  For instance, it may not be alright to study perception (or consciousness) by looking just at the brain.  It may be necessary to include much more of the mind-environment system—a system which is not limited to one side of the arbitrary interface of the skull.

Perception as a User Interface

The Interface Theory of Perception says that “our perceptions constitute a species-specific user interface that guides behavior in a niche.”

Evolution has provided us with icons and widgets to hide the true complexity of reality.  This reality user interface allows organisms to survive better in particular environments, hence the selection for it.

Perception as an interface

Hoffman’s colorful example describes how male jewel beetles use a reality user interface to find females.  This perceptual interface is composed of simple rules involving the color and shininess of female wing cases.  Unfortunately, it evolved for a niche which could not have predicted the trash dropped by humans, and that trash leads to false positives—which results in male jewel beetles humping empty beer bottles.

Male Australian jewel beetle attempting to mate with a discarded "stubby" (beer bottle). Credit: Trevor J. Hawkeswood.
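The beetle’s plight can be caricatured in a few lines of code (a deliberately silly toy; the attributes are made up): the perceptual rule tests only surface cues, never the object’s true identity, so anything matching the cues passes.

```python
# Toy caricature of a perceptual "user interface": the rule sees only
# surface cues (invented attributes), never what the object really is.

def looks_like_mate(obj):
    """The beetle's entire 'perception' of a mate: color + shininess + texture."""
    return obj["color"] == "brown" and obj["shiny"] and obj["dimpled"]

environment = [
    {"name": "female beetle", "color": "brown", "shiny": True,  "dimpled": True},
    {"name": "beer bottle",   "color": "brown", "shiny": True,  "dimpled": True},
    {"name": "gum leaf",      "color": "green", "shiny": False, "dimpled": False},
]

matches = [obj["name"] for obj in environment if looks_like_mate(obj)]
print(matches)  # the bottle passes: the cues were selected for fitness, not truth
```

The rule worked perfectly well in the niche it evolved for; it is the niche that changed.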

All perception, including of humans, evolved for adaptation to niches.  There is no reason or evidence to suspect that our reality interfaces provide “faithful depictions” of the objective world.  Fitness trumps truth.  Hoffman says that Noë supports a version of faithful depiction within enactive perception, although I don’t see how that is necessary for enactive perception.

Of course, the organism itself is part of the environment.

True Perception is Right Out the Window

How do we know what we know about reality?  There seems to be a consistency at our macroscopic scale of operation.  One consistency is due to natural genetic programs—and the programs they in turn cause—which result in humans having shared knowledge bases and shared kinds of experience.  If you’ve ever not been on the same page as somebody, then you can imagine what it would be like if we didn’t have anything in common conceptually.  Communication would be very difficult.  For every other entity you wanted to communicate with, you’d have to establish communication interfaces, translators, interpreters, etc.  And how would you even know whom to communicate with in the first place?  Maybe you wouldn’t have evolved communication at all.

So humans (and probably many other related animals) have experiences and concepts that are similar enough that we can communicate with each other via speech, writing, physical contact, gestures, art, etc.

But for all that shared experience and ability to generate interfaces, we have no inkling of reality.

Since the UI theory says that our perception is not necessarily realistic, and is most likely not even close to being realistic, does this conflict with the enactive theory?

Noë chants the mantra that the world makes itself available to us (echoing some of the 80s/90s era Brooksian “world as its own model”).  If representation is distributed in a human-environment system, doesn’t that mean it must be a pretty accurate representation?  No.  I don’t see why that has to be true.  So it seems we can combine the two theories together.

There may be some mutation to enactive theories if we have to slant or expand perception more towards what happens in the environment and away from the morphology-dependent properties.  In other words, we may have to emphasize the far environment (everything you can observe or interact with) even more than the near environment (the body).  As I think about this and conduct experiments, I will report on how this merging of theories is working out.


Noë, A., Action in Perception, Cambridge, MA: MIT Press, 2004.

Hoffman, D.D., “The interface theory of perception: Natural selection drives true perception to swift extinction,” in Dickinson, S., Leonardis, A., Schiele, B., & Tarr, M.J. (Eds.), Object Categorization: Computer and Human Vision Perspectives, Cambridge, UK: Cambridge University Press, 2009, pp. 148–166.

Hawkeswood, T., “Review of the biology and host-plants of the Australian jewel beetle Julodimorpha bakewelli,” Calodema, vol. 3, 2005.


What does SynapticNulship mean?

Posted in meta on December 28th, 2009 by Samuel Kenyon

I have been using the handle SynapticNulship for several months, and now it is the title of this blog.  I created this term because it conveys my interest in cognitive neuroscience combined with the concept of an advanced flying device.  A nullship is an antigravity conveyance—one might imagine a small volantor (flying car) or a large hovering ship not limited by typical aircraft lift constraints.  I was introduced to this word via a Heinlein novel that I read as a teenager (unfortunately, searching for “nullship” on the web turns up almost…null).  Spelling “null” as “nul” is unnecessary, but it could be considered an extra nod to computer nerds and those into the whole brevity thing.  A synapse is an interface between neurons in the brain, and the brain (along with the rest of the nervous system) is the captain’s chair of the mind (not the best metaphor but it sounds cool).  That notion paired with the notion of a futuristic antigravity ship results in a vague sense of flying above current minds with future technology.  Or it could mean a rising propelled by one’s mind.
