This is the "and Computers" part of the course. If I had to chose one phrase to describe the topics we will cover, I'd pick "machine intelligence" - simulating Brain or Mind with a computer.
Use of computers to model some aspect of "brain" or "mind".
There are many different approaches and lots of ways to categorize them, but I like to group them into three topics:
I talked a little bit about the first topic earlier in the semester. Today I'll talk some about the second. But the emphasis for the rest of my lectures will be on the third. Some of you may want to attack the first two in your projects.
So, what is the point of making computer models of "brain" or "mind"? I can think of two main reasons why we might want to do this -
Scientists and people who are interested in understanding fundamentals tend to be interested in the first goal, and engineers and people with a more practical orientation and an interest in applications tend to focus on the second. Clearly, there is a lot of overlap between these groups -- an engineer needs to know something about the first question in order to answer the second. Of course, a philosopher will want to start by answering the question "what is intelligence?" This can be a lot of fun to debate with friends late at night after a few beers, but I'm not going to say much about it because it could take forever. We all recognize that although computers are very fast, there are many things which people, and even very primitive animals, do better and faster. For example, which of these is not a tree? Or, what does this represent?
(viewgraph: trees and a telephone pole - all are "tree-like"; a very stylized line drawing of a cat)
These are hard problems for a computer to solve, even though we can solve them easily. All three approaches (Computational Neuroscience, AI, and Artificial Neural Nets) attempt to understand how problems like these can be solved. It's interesting to think about what we do when we solve them. What goes on in your mind or in your brain when you solve one of these problems?
This introspective approach is basically the one taken by traditional artificial intelligence (AI). How would you tell someone what you did to solve these problems? (Or is your answer just a rationalization of something you did much less consciously?)
Computational Neuroscience falls mostly into the first category of goals, although it may shed some light on how to accomplish the goals of the second. The other two approaches, AI and Neural Networks, both offer a more practical approach toward implementing "machine intelligence". In order to understand how the brain works, we might have to model it down to the detail of single ion channels, but (at least with present computer technology) we wouldn't expect our simulations to be a practical way of performing "brain-like" computations. We need to leave out some details.
So, how do we make a computer "intelligent"?
This knotty problem is like a tangled ball of wire which we can attack from two sides. The two practical approaches are "Traditional AI" and "Artificial Neural Nets". The first approach has its roots in psychology, and might be called "Minds and Computers". It tends to focus on high level abstractions like "the mind". The other approach, which I'll call "Brains and Computers", tries to apply what biology and physiology tell us about the way the brain works.
(Diagram: the "machine intelligence" problem drawn as a tangled ball of wire, attacked from two ends -
    Traditional AI: "Minds & Computers" - psychology (behavior) - high level, top-down, macroscopic
    Neural Nets: "Brains & Computers" - biology (physiology) - low level, bottom-up, microscopic)

Traditional Artificial Intelligence - falls into both categories of goals - represents the opposite extreme in terms of the level of microscopic detail in the models. The models tend to be high level abstract models based more on psychology or linguistics than on biology. (Example: Freudian psychology uses non-biological concepts like the ego, superego and id. Jungian psychology uses others. Even if there are no biological structures corresponding to these functions, they may be useful models for understanding mental processes.) AI has been used as a tool for understanding "how intelligence works". But I think it is most useful as a practical tool for making computers more like minds.
Artificial Neural Networks - largely the second category (although many workers in this field believe that simplified models can shed light on the workings of the brain). Here, the idea is to perform computations with networks of neuron-like elements. The approach has a lot in common with computational neurobiology, but we would like to leave out as many of the complicated biological details as we can safely ignore.
People with different interests and backgrounds tend to have different opinions/prejudices about the best end of the problem to attack. Having worked as a solid state physicist, I like the bottom-up approach. If I want to understand the behavior of fluids, I start with a computer simulation of interacting molecules and try to understand their macroscopic behavior on the basis of what we know about the microscopic behavior of the components. On the other hand, someone who needs to predict the behavior of 40 weight racing oil won't find this approach very helpful, at least in the short term. He or she may need a more empirical approach, using a lot of "black boxes" which aren't understood in detail. Also, we'll see that some types of "intelligent behavior" are better treated with one approach than the other. You might think about which approach seems best for the two pattern recognition problems which I posed. Do you take a cognitive approach (thinking about how you would describe the differences between a tree and a telephone pole)? Or is your process less conscious? Or do you think about how a frog recognizes a fly? A frog probably doesn't intellectualize things much. A pattern falls across its retina, and ZOT!
The nice thing about being an engineer is that you get to pick and choose among the various alternatives, going with the one that seems to offer the best possibilities at the time. Sometimes when untangling a ball of wire, you work on one loose end until progress slows down, and then switch to the other. You hope to eventually meet in the middle. My own opinion is that recently the Artificial Neural Net approach has been more fruitful - particularly when it is based on an understanding of the biology of the brain.
Another thing that I probably don't need to warn you about is that any time someone offers you a nice clear-cut duality like the one I've posed here, you should be VERY suspicious. You've been around long enough to know that these sorts of dichotomies are, at best, convenient but over-simplified approximations. There's a lot more overlap between these categories than I've implied. We have people in the psychology department here at CU who study both artificial and biological neural networks. People in the CS department work with both AI and neural network approaches to problems in linguistics. Marvin Minsky, who is a big name in traditional AI, has had a large influence (not completely positive) on the development of neural network theory. Plenty of biologists and neural network researchers take their direction from ideas about the behavior of the "mind". Some of them even speculate on fuzzy ideas like "consciousness". As we see progress from both directions, these groups will meet in the middle. Over time, we'll see the AI models become more grounded in biological fact, and the Neural Net models become more sophisticated and hierarchical in their organization. At some stage, it may no longer be convenient to make distinctions between these approaches.
I'm going to give you a very brief overview of the AI approach today. Then I'll concentrate on the Neural Net approach for the next six lectures.
If you would like to do some more reading on AI and artificial neural nets, you may want to look at this list of references, which gives some suggestions for optional reading. (This is not very up-to-date, however.)
When I say "AI" I mean traditional AI, because many people like to classify the neural net approach as just another part of AI. (This is not an unreasonable thing to do.)
The goals of what I call "traditional AI" are essentially the ones that I mentioned before: to use computer models to help us understand "intelligence", and to find ways to make computers exhibit intelligent behavior. Although I won't try to define intelligence, it doesn't hurt to try to list a few of the attributes of intelligent behavior. What things distinguish intelligent human behavior from a cleverly written computer program? Any ideas?
It might include the ability to:
Of course, there are many definitions of AI. The one I like the best comes from the book by Rich in the list of references: "AI is the study of how to make computers do things which, at the moment, people do better". It points out that AI is a moving target. We call something AI if we are on the verge of getting a computer to do it, but once we are successful, the mystery disappears and it becomes just another clever computational technique.
One question which we could ask (but probably not answer) is: If a program exhibits the outward signs of intelligence, but does it in a completely different way from the way people do it, is it intelligent? (We discussed this question in class.)
The answer to this question probably depends on your goals - understanding the mind and intelligence vs. practical applications. However, to solve truly hard problems, it may not be possible to avoid paralleling human thought processes. Does evolution tend to produce optimum solutions? There is evidence that, at least in some areas, it does. (Humans can detect single photons, and the information-processing efficiency of fly vision has been shown to be near the theoretical limit.)
I'll list some applications, and give some history along the way. One reason for listing some of these potential applications and goals of Artificial Intelligence is so that later, after we know something about artificial neural networks, we can ask: "which of these are best solved by neural nets and which are best left to traditional AI techniques?".
Game playing - limited domain of knowledge - appeared easy, but chess has a "branching factor" of about 35, which leads to a "combinatorial explosion" (see the short calculation after this list)
    Alan Turing - early '50s chess program
    Arthur Samuel - 1952 checkers program - pioneered modern search and learning strategies - "informed search" - evaluation functions for the "goodness" of a move
    John McCarthy (one of AI's founding fathers, inventor of LISP) - 1966 arranged a computer chess match between the US and Russia - neither played very good chess
    1967 - Richard Greenblatt of the MIT AI lab wrote the first of the modern chess programs, MacHack (see "Hackers" by Steven Levy)
    1989 - Neurogammon (a backprop ANN) won the international computer backgammon championship, competing against programs using traditional AI game-playing strategies (an important milestone)

Theorem proving - specialized knowledge and formal logic, but still requires "good judgment" and intuition - you can learn a lot about thought processes by trying to write such a program - how would you teach someone good strategies for proving trig identities? - introspection is a popular tool for AI

Natural language processing, machine translation, speech recognition (this last is very difficult - "It's hard to wreck a nice beach")
    1950's - success with trivial problems ==> disillusionment
    1954 - Georgetown U Russian/English translation for petroleum engineering had 250 words and 6 rules - looked promising, but was harder than it looked - an early translation to Russian and back to English gave "the vodka is good, but the meat is rotten" ("The spirit is willing, but the flesh is weak".)
    An important application: intelligent database retrieval, or an intelligent internet search engine
    1966 - NSF report - $20 million wasted ==> 1970's "winter of AI"
    1973 - in England, the "Lighthill report" drew similar conclusions and stopped research in AI for a decade
    lesson: knowledge representation is important!
    resurgence and hype in the '80s - software ads "new and improved with AI" - like a detergent - the popularity of Neural Nets has had a similar history

Vision/pattern recognition - industrial and military applications

Robotics - uses vision, plus problem solving, dealing with obstacles, formulating goals and plans - "put the red block on top of the yellow one" (The red block may be under the blue one.)

Automatic programming - output high level code from specifications

Scheduling - optimal path - Traveling Salesman Problem - applications to manufacturing

Machine learning - performance should improve with experience

Expert Systems - probably the most commercially successful area in AI - replace an expert in some specialized (and commercially profitable) domain
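To get a feel for the "combinatorial explosion" in game playing, here is a back-of-the-envelope calculation (the search depths are arbitrary, chosen only for illustration):

    # Rough size of a chess game tree with an average branching factor of 35.
    BRANCHING_FACTOR = 35
    for plies in (2, 4, 6, 8):
        print(plies, "plies:", BRANCHING_FACTOR ** plies, "positions")
    # Six plies (three moves per side) already gives about 1.8 billion positions,
    # which is why brute-force search needs pruning and evaluation functions.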
Expert Systems have undoubtedly been the most successful AI application in recent years, and have been largely responsible for the resurgence of interest in AI, so they merit a little more discussion. Cognitive scientists who are trying to understand the mind and the nature of human thought don't find these very interesting, but they work! It will be interesting to compare the way that an expert system program and a neural net simulation solve the same problem.
They are called Expert Systems because they replace a human expert in some specialized (and usually commercially profitable) domain. They are also called Rule Based Systems or Knowledge Based Systems because they incorporate the expert's knowledge in an explicit set of rules.
Examples of some Expert Systems:
MYCIN - medical diagnosis in a specialized domain (infectious blood diseases)

XCON - used by DEC to configure mainframe computer systems from a customer's order (there are many decisions to be made about physical placement of components in cabinets, cabling, power supplies, etc.) - a more modern example would be computerized layout of semiconductor chips

PROSPECTOR - decision making in mineral exploration - "is it worthwhile digging here?" - it discovered a large mineral deposit

DENDRAL - deduces a chemical structure from mass spectrograms
There are also numerous programs for diagnosing equipment problems, giving tax advice, scheduling in manufacturing, etc. These are examples of certain types of problems for which this approach works well. All of them have some things in common:
Components of a Rule Based System (RBS):
The control strategy or inference engine works in a loop, identifying rules that apply, choosing a rule and applying it. This modifies the contents of working memory, so that other rules may apply. The process repeats until the goal is reached, or no rules apply. There are various conflict resolution schemes which are used to pick which rule to use when more than one is applicable.
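As a concrete sketch of this loop (the rule encoding is my own illustration, not taken from any particular expert-system shell; the facts echo the car-repair example that comes up later in these notes), a minimal forward-chaining "recognize-act" cycle might look like this:

    # A minimal forward-chaining loop, assuming each rule is a
    # (condition-set, conclusion) pair over simple string facts.
    rules = [
        ({"engine won't start", "starter doesn't turn over"}, "check the battery"),
        ({"check the battery", "battery is dead"}, "see if it holds a charge"),
    ]

    working_memory = {"engine won't start", "starter doesn't turn over", "battery is dead"}

    while True:
        # 1. Match: find rules whose conditions are satisfied and whose
        #    conclusion is not already in working memory.
        applicable = [(cond, concl) for cond, concl in rules
                      if cond <= working_memory and concl not in working_memory]
        if not applicable:
            break                                   # no rules apply: stop
        # 2. Conflict resolution: here, prefer the rule with the most
        #    conditions (i.e. the most specific one).
        cond, concl = max(applicable, key=lambda r: len(r[0]))
        # 3. Act: add the conclusion, which may enable other rules next pass.
        working_memory.add(concl)

    print(working_memory)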
Sometimes, the components are arranged a little differently. The Rule Base and LTM can also be considered as part of the permanent "Knowledge Base", with the temporary STM shown in a separate box. The interactions between the components are the same, but the division of the knowledge base shown here emphasizes the differences between what we call Procedural and Declarative memory. This is justified by evidence that you are using different neural circuitry when learning to drive a car than when you are memorizing facts.
This approach was proposed in the early 60's as a model of methods used by people to solve problems. The concepts of LTM and STM arise from psychological experiments and have a basis in biology. For example, STM seems to obey the 7 ± 2 rule. When memorizing strings of numbers or items displayed on a table, most people can only hold about 7 ± 2 "chunks" of information in their mind at one time unless they are converted into long term memory. Of course, we don't have to abide by this size limitation in designing an expert system. However, it is useful to make this distinction between LTM and STM, and have an area of memory for temporary knowledge that will soon be thrown out.
Also, the rule base resembles a set of stimulus-response pairs which mirror the way experts solve a problem: "Well, in the case of so-and-so, I usually do such-and-such". "If the starter doesn't turn over, I check the battery. If the battery is dead, then I see if it will hold a charge."
Whether or not you like this as a good cognitive model, you have to agree that it is an effective technique for problem solving, if used on the right kinds of problems.
Here are some possible conflict resolution schemes: (Not covered in lecture)
An example: DENDRAL
In a mass spectrometer, the unknown compound is bombarded with a beam of electrons, expelling electrons and breaking it into a number of positively charged fragments. These are accelerated to a known velocity by an electric field, and deflected into a circular path by a magnetic field. From the radius of the path and the known applied fields, it is possible to calculate the mass/charge ratio of the detected particles.
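In the common single-focusing arrangement (a fragment of charge q accelerated through a potential V and bent into a circle of radius r by a field B), combining qV = (1/2)mv^2 with r = mv/(qB) gives m/q = B^2 r^2 / (2V). A quick numerical sketch (the field, voltage and radius values below are made up for illustration, not taken from DENDRAL):

    # m/q from the bending radius and the applied fields:
    #   qV = (1/2) m v^2   and   r = m v / (q B)   =>   m/q = B^2 r^2 / (2 V)
    B = 0.5          # magnetic field, tesla       (illustrative value)
    V = 2000.0       # accelerating potential, V   (illustrative value)
    r = 0.094        # bending radius, metres      (illustrative value)

    m_over_q = B**2 * r**2 / (2 * V)               # kg per coulomb

    e = 1.602e-19    # elementary charge, C
    amu = 1.661e-27  # atomic mass unit, kg
    print("m/q =", m_over_q, "kg/C")
    print("mass of a singly charged fragment:", m_over_q * e / amu, "amu")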
The viewgraph for DENDRAL shows the mass spectrogram plot of the relative intensity measured by the detector as a function of the mass/charge ratio of the particles, and the chemical formula and structure for a particular compound, C8H16O. This formula can be known from quantitative analysis, but there are 698 possible combinations of the 8 carbon, 16 hydrogen and one oxygen atoms in a molecule. The actual structure, which has to be determined from the mass spectrometer results, is:
CH3-CH2-C(=O)-CH2-CH2-CH2-CH2-CH3

Some of the DENDRAL rules are:
Rule 74:
IF    the spectrum for the molecule has two peaks at masses X1 and X2 such that:
          X1 + X2 = M + 28, and
          X1 - 28 is a high peak, and
          X2 - 28 is a high peak, and
          at least one of X1 or X2 is high
THEN  the molecule contains a ketone group
Rule 75:
IF    there is a high peak at mass 71, and
      there is a high peak at mass 43, and
      there is a high peak at mass 86, and
      there is a high peak at mass 58
THEN  there must be an N-propyl-ketone3 structure
Note that rule 75 leads to more specific conclusions than rule 74. If both of these apply, the conflict resolution scheme might choose rule 75 instead of 74. The conclusions of either of these rules would presumably be part of the conditions of others.
How does an expert system program differ from a program in Pascal or C with lots of IF-THEN-ELSE or CASE statements? The important distinction is that the rules are used as Data rather than Code, i.e. the program is "data driven" rather than "procedural". (We can see this from the example rules for the DENDRAL system.) This has some important consequences:
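For one, new knowledge can be added by appending rules, without rewriting the program that applies them. As a rough sketch (the encoding below is my own illustration, not DENDRAL's actual representation, and a spectrum is reduced to just the set of masses with "high" peaks), Rule 75 might be stored as a data entry that a generic matcher reads:

    # Rule 75 written as data rather than as code.
    RULES = [
        {"name": "Rule 75",
         "if_high_peaks_at": {71, 43, 86, 58},
         "then": "there must be an N-propyl-ketone3 structure"},
        # New chemical knowledge goes here as more entries;
        # the interpreter below never changes.
    ]

    def conclusions(high_peaks):
        """Return the conclusion of every rule whose conditions are satisfied."""
        return [rule["then"] for rule in RULES
                if rule["if_high_peaks_at"] <= high_peaks]

    print(conclusions({43, 58, 71, 86, 99}))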
Discuss: how would we program a computer to recognize an insect? Would we use an expert system, or something more like "frog intelligence"?
Some of these will be just "buzzwords" without adequate explanation. The idea is just to identify some things that you might want to learn about some day.
Search - searching a decision tree - like "20 questions" - "combinatorial explosion" - a "cost function" may be used to prune the tree

Inference and logical deduction - deduction (cause --> effect), induction (generalization), and abduction (effect --> cause) - the PROLOG language for predicate logic

Fuzzy Logic - often incorporated into Expert Systems

Semantic networks - represent interrelationships in a way that facilitates deductions - property inheritance (LISP and PROLOG have features that make it easy to implement this) - Example: a network of relationships that apply to Clyde the elephant (What color is Clyde? Can elephants move? See the sketch after this list.) - This representation has the neural analogy of "spreading activation".

Frames - slot (category) and filler (information content) notation, developed by Minsky (MIT, '70s) - represent stereotypes which help us make sense of a situation - you walk into an unfamiliar room and have certain expectations (windows on walls, chairs on the floor and not on the ceiling, etc.) - a robot should recognize a rectangle on the wall with light entering as a window - when understanding a report of an earthquake in a newspaper, one expects the location, the number killed, the dollar amount of damages, etc. Organizing information this way allows a program to represent knowledge in a way that lets it answer questions. (A good program will handle exceptions well.)

Scripts - Schank (Yale) - "John went into the restaurant and sat down. After waiting a long time, he got angry and left." (Did he eat? Why was he angry?) - similar to frames in the sense that a script makes use of a stereotyped sequence of events - a "restaurant script" involves someone coming to take your order before you eat.

These last three are forms of knowledge representation as well as techniques for querying an "intelligent" program.
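To make property inheritance concrete, here is a minimal sketch in Python rather than LISP or PROLOG (the is-a chain and the properties are my own illustrative choices):

    # Property inheritance along "is-a" links:
    # Clyde is an elephant, an elephant is a mammal, a mammal is an animal.
    ISA = {"Clyde": "elephant", "elephant": "mammal", "mammal": "animal"}
    PROPERTIES = {
        "animal":   {"can move": True},
        "mammal":   {"warm blooded": True},
        "elephant": {"color": "gray", "has trunk": True},
    }

    def lookup(node, prop):
        """Walk up the is-a chain until the property is found (or give up)."""
        while node is not None:
            if prop in PROPERTIES.get(node, {}):
                return PROPERTIES[node][prop]
            node = ISA.get(node)
        return None

    print(lookup("Clyde", "color"))      # gray  (inherited from elephant)
    print(lookup("Clyde", "can move"))   # True  (inherited, via mammal, from animal)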