From alnl-owner Wed Mar 4 08:33:24 1992
Received: from clvax1.cl.msu.edu ([35.8.2.1]) by scapa.cs.ualberta.ca with SMTP id <42585>; Wed, 4 Mar 1992 08:04:37 -0700
Date: Wed, 4 Mar 1992 08:02:00 -0700
From: "DOUG PIERCE CPS115"
Subject: Extracting rules from ALNs?
To: "alnl"
Message-Id: <92Mar4.080437mst.42585@scapa.cs.ualberta.ca>

Greetings! I am new to this list, and so will introduce myself. My academic background is: MS in Biostatistics, MS in Computer Science. Working right now as a data manager and statistician at a medical school. Have done lots of logic programming work (in Prolog).

I am interested in inductive synthesis (or whatever you want to call it -- extraction, etc.) of rules from ALNs. I don't see how one could possibly do this from any other type of net I know about. My bias: I think you have to be an EE at heart to love those other nets (which I am not at all, though I have spent lots of time studying BP and similar algorithms).

Is there anyone out there who has demonstrated, or who would like to attempt to demonstrate, that you can extract coherent rules from ALNs? It would not take a very complex example to make some judgment about this. However, I think this idea is closely linked to others that would benefit from more substantial examples. For one thing, I am excited to hear any mention of problem decomposition with ALNs, but I haven't seen any clear attempts to illustrate how this might work -- let me know if I'm missing something here, I wouldn't be surprised. I guess this would be a matter of using ALNs to find partitions or cover sets of input elements, then identifying hierarchical relationships within and between these sets. But that may be a pretty wild guess. If anyone has a clue about this, please let me know.

From alnl-owner Wed Mar 4 08:53:50 1992
Received: from clvax1.cl.msu.edu ([35.8.2.1]) by scapa.cs.ualberta.ca with SMTP id <42765>; Wed, 4 Mar 1992 08:42:41 -0700
Date: Wed, 4 Mar 1992 08:30:00 -0700
From: "DOUG PIERCE CPS115"
Subject: Looking for a good metaphor
To: "alnl"
Message-Id: <92Mar4.084241mst.42765@scapa.cs.ualberta.ca>

One thing that is missing (for me) in the ALN literature is a metaphor that captures in a succinct way the real strengths of this approach. The discussion of implementation problems and their associated theoretical problems is of course the real meat of this type of interchange. But... NNs and GAs (genetic algorithms) have their biological metaphors ready-made and have gotten lots of mileage from such associations. This is in spite of the extreme looseness of these associations. Proponents of ALNs have to look elsewhere for inspiration, I think. Actually, my favorite kind of inspiration comes from positive results! Which may turn out to be a distinguishing feature of ALNs. In any case, I think this is a significant issue, not just fluff. Any comments?

Doug Pierce

From alnl-owner Fri Mar 6 09:01:11 1992
Received: from sumac.ecn.purdue.edu ([128.46.161.181]) by scapa.cs.ualberta.ca with SMTP id <42505>; Fri, 6 Mar 1992 04:37:53 -0700
Received: by sumac.ecn.purdue.edu (5.65/1.30jrs) id AA10580; Fri, 6 Mar 92 06:37:22 -0500
Date: Fri, 6 Mar 1992 04:37:22 -0700
From: mccauley@ecn.purdue.edu (Darrell McCauley)
Message-Id: <9203061137.AA10580@sumac.ecn.purdue.edu>
To: alnl@cs.ualberta.ca
Subject: other appl of encoding/decoding

Has anyone used the encoding/decoding stuff in atree for learning paradigms other than adaptive logic? Has anyone investigated the consequences of using bit vectors for something that expects inputs in [0,1]?
Darrell McCauley           Department of Ag Engr, Purdue Univ
mccauley@ecn.purdue.edu    West Lafayette, Indiana 47907-1146

From alnl-owner Fri Mar 6 12:32:57 1992
Received: by scapa.cs.ualberta.ca id <42528>; Fri, 6 Mar 1992 12:09:14 -0700
From: Bill Armstrong
To: alnl
Subject: using ALNs on quantized reals
Cc: arms
Message-Id: <92Mar6.120914mst.42528@scapa.cs.ualberta.ca>
Date: Fri, 6 Mar 1992 12:08:35 -0700

I would like to make some points about the use of adaptive logic networks to solve problems with continuous values as inputs and/or outputs.

atree release 2.0 has facilities for encoding quantized reals into boolean vectors automatically. A random walk technique is used: one specifies how many bits are used to represent a value (e.g. 20), and how many bits are changed to go from the code for one value to the code for the next higher quantization level (e.g. 1). This is called a 20:1 code. Encoding is easily done by a table, but decoding is done by linear search, which is *very* inefficient. Allen Supynuk has done decoding far more efficiently using Golay codes. Note that the output variable below has a coding of 16:3, where it is hoped that the 3 will lead to fewer errors in decoding.

Here is part of a sample program for lf for some medical data:

    tree size = 8191
    min correct = 122
    max epochs = 20
    function
    domain dimension = 29
    largest = 200 200 100 200 100 100 100 100 120 100 100 300 150 10 150 10 0 3000 6000 2000 2000 20 20 200 100 20 5 50 50 5 20
    smallest = 0 0 0 0 0 0 0 0 0 0 0 100 0 0 0 0 -2000 -2000 -2000 -2000 0 0 0 0 0 0 0 0 0 0
    coding = 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 20:1 16:3
    quantization = 101 101 101 101 101 101 101 101 121 101 51 151 151 101 15 1 101 101 101 101 101 101 101 101 101 21 51 51 51 51 21
    training set size = 122
    training set = 69 102 55 68 43 24 31 17 0 3.45 17 158 92.8 1.68 50 24.4 1182 2423 324 664 3.2 1.6 46.2 22.5 1.45 0.71 21 .1 10.3 2.05 13
    etc. ...
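[For readers who have not seen these m:k codes, here is a minimal sketch of the idea in C. This is illustrative code only, not the atree implementation; all names are invented, and a real generator would also have to keep the codewords distinct, which is omitted here.]

    #include <stdio.h>
    #include <stdlib.h>

    #define NBITS   20   /* bits per codeword: the "20" in a 20:1 code  */
    #define KSTEP   1    /* bits flipped per quantization step: the "1" */
    #define NLEVELS 101  /* number of quantization levels               */

    static unsigned char code[NLEVELS][NBITS];

    /* Build the table: each level's codeword differs from the previous
       level's in KSTEP randomly chosen bit positions.  (With KSTEP > 1
       a position could be drawn twice; ignored in this sketch.) */
    static void build_walk(void)
    {
        int lev, i, k;
        for (i = 0; i < NBITS; i++)
            code[0][i] = rand() & 1;
        for (lev = 1; lev < NLEVELS; lev++) {
            for (i = 0; i < NBITS; i++)
                code[lev][i] = code[lev - 1][i];
            for (k = 0; k < KSTEP; k++)
                code[lev][rand() % NBITS] ^= 1;
        }
    }

    /* Encoding is just a table lookup ... */
    static const unsigned char *encode(int level) { return code[level]; }

    /* ... but decoding must search for the nearest codeword: O(NLEVELS)
       Hamming comparisons per decode -- the inefficiency noted above. */
    static int decode(const unsigned char *v)
    {
        int lev, i, d, best = 0, bestd = NBITS + 1;
        for (lev = 0; lev < NLEVELS; lev++) {
            for (d = 0, i = 0; i < NBITS; i++)
                d += v[i] != code[lev][i];
            if (d < bestd) { bestd = d; best = lev; }
        }
        return best;
    }

    int main(void)
    {
        build_walk();
        printf("level 42 decodes to %d\n", decode(encode(42)));
        return 0;
    }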
I now believe that the random walk encoding technique is the wrong way to use ALNs on continuous inputs; one should just compute a threshold like xi >= tj or xi < tj to get each input to the adaptive tree. The problem with random walk encoding lies not only in the somewhat complex decoding step, but in the fact that the boolean functions produced by the random walk encodings cannot be forced to be monotonically increasing or decreasing when this is necessary. Namely, if you look at what happens to, say, the 5th bit of the 20 used above, it may change back and forth between 0 and 1 several times as one goes sequentially through all the quantization levels of the original real. Hence the function produced by the tree can do the same.

I feel monotonicity is an extremely important tool for assuring that the function synthesized by an ALN is within the specified range of values for *all* possible inputs. Otherwise the network might be very unsafe to use. There are in general two ways I can think of for developing safe neural nets.

1) You might be able to show, using a bound on the magnitude of the weights in an MLP, that you have covered the set of possible inputs adequately in testing, so that between test points the function can only vary by some acceptable epsilon. For systems that use approximate arithmetic and table lookup, those factors also have to be taken into account. For low dimensional spaces where each variable is quantized to few values, this technique may provide a solution. In many cases, though, the input space is of higher dimension, e.g. 10, and each dimension could be quantized to 100 levels. Already this gives 100^10 points in the input space, which is too many to test individually. If the required epsilon allows you to cover 5^10 points by testing just one at the centre of a ball, you still have 20^10 tests to make, which is still ten million million.

2) You could try to force monotonicity, so that the greatest and least values of the network output are guaranteed to be taken on at two corners of a parallelepiped in the space of inputs, which you make test points. In the usual MLPs, where the function to be learned is not monotonic, you will have both positive and negative weights influencing the output for *every* input. Hence you can't guarantee a safe system by this means.

If you want a safe ALN system, I suggest you break the input space up into (perhaps overlapping) parts using either conditions on the input variables themselves, e.g. x < 15 or x >= 10, or some computed functions c(x1,...,xn) < 0, etc. In each of the parts so created, your a priori knowledge should suggest that the function to be learned can be assumed monotonic in each input separately. The parts learned and tested separately can then be safely combined by higher level gates, with very little additional cost in hardware or time.

You set up the monotonic problems so that instead of learning the value of the function f, you learn a predicate P(x1,...,xn,y), where the value of P is 1 if y < f(x1,...,xn) and is 0 otherwise. Then you force the ALN trained on that part to be appropriately monotonic by using as inputs to the adaptive tree thresholds of only one direction (< or >=, but not both), the direction depending on the direction of monotonic increase in the variable concerned. In particular, y only gets thresholds of the form y < tj (otherwise binary search won't work). Now the predicate answers (by giving a 1) questions of the form "is the function value for x1,...,xn greater than y?" If we want only a decision like this, we are finished. If we want an approximation to the actual f(x1,...,xn), we use the same tree iteratively, each time changing just y to perform binary search on y. The number of steps of search depends on the precision to which you need the function value.

This technique can allow one to develop safe solutions to problems, and to avoid the complexity and obscurity involved in using the random walks. The current atree code doesn't make it easy to develop safe solutions, so I can only hope that people using ALNs and other MLPs are somehow taking steps to guarantee safety where it is required. The currently used "black box" technique of train and test is definitely risky unless additional steps are taken.
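[To spell the search out, here is a minimal sketch in C. It is hypothetical code, not part of atree; the stand-in predicate P plays the role of a trained tree whose output is 1 exactly when y < f(x).]

    #include <stdio.h>

    /* Stand-in for a trained ALN predicate: returns 1 iff y < f(x).
       Here f is a known function so the sketch is self-contained;
       in practice P would be one evaluation of the logic tree. */
    static int P(double x, double y) { return y < x * x + 1.0; }

    /* Binary search on y over [lo, hi]: because the tree only sees
       thresholds of the form y < tj, P is monotonic (nonincreasing)
       in y, so the point where P drops from 1 to 0 is f(x). */
    static double approx_f(double x, double lo, double hi, int steps)
    {
        int i;
        for (i = 0; i < steps; i++) {
            double mid = 0.5 * (lo + hi);
            if (P(x, mid))
                lo = mid;   /* mid < f(x): move up    */
            else
                hi = mid;   /* mid >= f(x): move down */
        }
        return 0.5 * (lo + hi);
    }

    int main(void)
    {
        /* Each call to P costs one pass through the tree; the loop
           count is the "number of steps of search" in the text. */
        printf("f(3) is approximately %g\n",
               approx_f(3.0, 0.0, 100.0, 30));
        return 0;
    }

[Run over one of the x's with y held fixed instead, the same loop gives the inverse-function and constraint-style queries taken up in the next message.]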
From alnl-owner Wed Mar 11 23:53:39 1992
Received: by scapa.cs.ualberta.ca id <42672>; Wed, 11 Mar 1992 18:17:53 -0700
From: Bill Armstrong
To: alnl
Message-Id: <92Mar11.181753mst.42672@scapa.cs.ualberta.ca>
Date: Wed, 11 Mar 1992 18:17:32 -0700

Doug Pierce writes (in the message with Subject: "Extracting rules from ALNs?", Message-Id: <92Mar4.080437mst.42585@scapa.cs.ualberta.ca>) in introducing himself to the alnl list:

> Have done lots of logic programming work (in Prolog).

> For one thing, I am excited to hear any mention of problem decomposition
> with ALNs, but I haven't seen any clear attempts to illustrate how this
> might work -- let me know if I'm missing something here, I wouldn't be
> surprised. I guess this would be a matter of using ALNs to find
> partitions or cover sets of input elements, then identifying
> hierarchical relationships within and between these sets.

The use of predicates to efficiently represent continuous-valued functions is related to Prolog, i.e. y = f(x) is represented as P(x,y), where the logic tree of ANDs and ORs gets thresholds x < t1, x >= t2, y < t3, etc. as inputs. Since thresholds of the form y >= t4 are not used, we can do binary search to find y given x.

If we decompose the space hierarchically, with ALNs as you suggest or just with simple conditions like thresholds or linear discriminants, then in each part we can impose a condition of monotonicity by restricting the thresholds on x to one direction, e.g. x >= t5. Now, given y, we can find in any part the least x such that P(x,y) = 1. This provides an inverse of the monotonic function on the part. The details are tedious to work out, but the idea is that the ALN approach enables you to instantiate any variables in a predicate and then compute the constraints that apply to the remaining variables. E.g. with P(x,y,z,w), you can instantiate x and w and still be able to find, for each y, the least (or greatest) value of z which makes P = 1. So I think ALNs fit nicely into the framework of newer versions of Prolog that emphasize constraint solving.

> (I am ...) interested in inductive synthesis (or whatever you want to
> call it -- extraction, etc.) of rules from ALNs. I don't see how one
> could possibly do this from any other type of net I know about.

I think it is an NP-complete problem to decide if a given logic tree will ever produce a 1 output (satisfiability). Hence the general problem of telling what an ALN does is intractable. Since ALNs are special cases of multilayer perceptrons, it is also an intractable problem to tell what an MLP does.

> My bias: I think you have to be an EE at heart to love those other nets
> (which I am not at all, though I have spent lots of time studying BP and
> similar algorithms).

On the contrary, I think EEs, who are very familiar with switching theory, should love ALNs. Maybe numerical analysts should love BP nets; after all, BP is just approximation theory with a particular twist based on Kolmogorov's theorem.

From Doug's second message:

> One thing that is missing (for me) in the ALN literature is a metaphor
> that captures in a succinct way the real strengths of this approach.

ALNs are fast, simple and can support the design of safe systems. Until we get a good metaphor, if there is one, "fast, simple, and safe" is a good characterization.

> NNs and GAs (genetic algorithms) have their biological metaphors
> ready-made and have gotten lots of mileage from such associations. This
> is in spite of the extreme looseness of these associations.

I would also like to think that brain theory people would prefer an artificial neural network where all the signals are 0 or 1, like action potentials in axons and dendrites. Some of the adaptation algorithms I have tried over the years have a flavor much closer to (my understanding of) biological systems than either backprop or the published ALN algorithms. However, given a choice between something that looks biological and something that produces a near optimal solution to a statistical pattern recognition problem, I would invariably choose the latter. (After all, why aspire to merely *equal* biological brains' power? There is a real need for something better -- don't you agree?)
> Proponents of ALNs have to look elsewhere for inspiration, I think.
> Actually, my favorite kind of inspiration comes from positive results!
> Which may turn out to be a distinguishing feature of ALNs.

Hope so.

> In any case, I think this is a significant issue, not just fluff.

As someone who would like the ALN field to flourish, I agree. Thanks for your thoughtful comments.

Bill

From alnl-owner Thu Mar 12 22:02:17 1992
Received: from bramble.ecn.purdue.edu ([128.46.161.174]) by scapa.cs.ualberta.ca with SMTP id <42234>; Thu, 12 Mar 1992 21:42:55 -0700
Received: by bramble.ecn.purdue.edu (5.65/1.30jrs) id AA05210; Thu, 12 Mar 92 23:42:15 -0500
Date: Thu, 12 Mar 1992 21:42:15 -0700
From: mccauley@ecn.purdue.edu (Darrell McCauley)
Message-Id: <9203130442.AA05210@bramble.ecn.purdue.edu>
To: alnl@cs.ualberta.ca
Subject: quantization as a filter?

I am completing a test of adaptive logic networks for a particular application (predicting a physical property using image features). It *appears* (meaning I haven't fully verified this) that ALNs did much better at predicting than statistical regression. Now, I'm trying to figure out/explain why.

One theory is that the quantization of real-valued inputs by the atree software may have acted as a filter, or, in some other way, was beneficial in function mapping. Has this notion ever been suggested by anyone? References?

Darrell
--
James Darrell McCauley     Department of Ag Engr, Purdue Univ
mccauley@ecn.purdue.edu    West Lafayette, Indiana 47907-1146

From alnl-owner Fri Mar 13 10:34:43 1992
Received: by scapa.cs.ualberta.ca id <42466>; Fri, 13 Mar 1992 10:17:17 -0700
Subject: Re: quantization as a filter?
From: Bill Armstrong
To: mccauley@ecn.purdue.edu (Darrell McCauley)
Date: Fri, 13 Mar 1992 10:16:58 -0700
Cc: alnl
In-Reply-To: <9203130442.AA05210@bramble.ecn.purdue.edu>; from "Darrell McCauley" at Mar 12, 92 9:42 pm
X-Mailer: ELM [version 2.3 PL11]
Message-Id: <92Mar13.101717mst.42466@scapa.cs.ualberta.ca>

> I am completing a test of adaptive logic networks for a particular
> application (predicting a physical property using image features). It
> *appears* (meaning I haven't fully verified this) that ALNs did much
> better at predicting than statistical regression. Now, I'm trying to
> figure out/explain why.
>
> One theory is that the quantization of real-valued inputs by the atree
> software may have acted as a filter, or, in some other way, was
> beneficial in function mapping. Has this notion ever been suggested by
> anyone? References?
>
> Darrell
> --
> James Darrell McCauley     Department of Ag Engr, Purdue Univ
> mccauley@ecn.purdue.edu    West Lafayette, Indiana 47907-1146

You don't say whether you are using linear regression or not. If the problem is non-linear, why shouldn't a flexible mechanism like ALNs do much better? The quantization mechanism may act to filter out small amounts of noise, but really, if the coding of reals into booleans is done using random walks, it should merely preserve (at least locally) the relative distances between quantization levels. I don't see why it would improve things much over an appropriate statistical technique. Anyway, I hope the result holds up when you verify it.

Bill
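[The "at least locally" caveat above is easy to check numerically: with one bit flipped per level, the Hamming distance between two codewords grows with the level difference only until the walk starts revisiting bits. A tiny self-contained check in C -- illustrative code, not atree's:]

    #include <stdio.h>
    #include <stdlib.h>

    #define NLEVELS 101

    static unsigned long walk[NLEVELS];

    /* Hamming distance between two codewords held in bitmasks */
    static int hamming(unsigned long a, unsigned long b)
    {
        unsigned long x = a ^ b;
        int n = 0;
        while (x) { n += x & 1; x >>= 1; }
        return n;
    }

    int main(void)
    {
        int i;
        /* a 20-bit random walk, one bit flipped per level (20:1 code) */
        walk[0] = 0;
        for (i = 1; i < NLEVELS; i++)
            walk[i] = walk[i - 1] ^ (1UL << (rand() % 20));

        /* Distance from level 50 to nearby levels: roughly proportional
           to the level difference while it is small, but it saturates
           (and can even shrink) once bit positions get reused. */
        for (i = 50; i <= 60; i++)
            printf("levels 50 vs %d: distance %d\n",
                   i, hamming(walk[50], walk[i]));
        return 0;
    }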
From alnl-owner Fri Mar 27 10:01:32 1992
Received: from clover.ecn.purdue.edu ([128.46.161.152]) by scapa.cs.ualberta.ca with SMTP id <42585>; Fri, 27 Mar 1992 09:29:49 -0700
Received: by clover.ecn.purdue.edu (5.65/1.30jrs) id AA05643; Fri, 27 Mar 92 11:29:16 -0500
Date: Fri, 27 Mar 1992 09:29:16 -0700
From: mccauley@ecn.purdue.edu (Darrell McCauley)
Message-Id: <9203271629.AA05643@clover.ecn.purdue.edu>
To: alnl@cs.ualberta.ca
Subject: interesting error trend with atree

Howdy,

I recently wrote that I was using atree to predict a physical property using image features. After running a series of trials in which I gradually increased the "min correct" parameter in 'lf' and plotting prediction error over the range of the physical property, I noticed a puzzling but interesting trend. [I'm going to take a little bit to explain this, so bear with me. I've found that if you ever ask netland for help, you had better be as descriptive as possible.]

Generally, the function mapping resulted in over-prediction for low values and under-prediction for high values (which is probably due to the features that I am using and the distribution of the data set). Try to visualize a graph of prediction error against the actual values. If a regression line is plotted, it has a negative slope. Now, for a relatively low value of "min correct," many of the points in the prediction error graph lie almost on a straight line! Bothersomely so (if that's a word). As I gradually increased "min correct," this predictability of the error went away. For the best case (lowest mean prediction error), there's not really any trend.

In other words, fitting (predicted - actual) = m(actual) + b resulted in a higher R-square for the lower values of "min correct" and a lower R-square for higher values (from 0.6 down to 0.2). The slope (m) decreased with increasing "min correct." (I know this may be hard to visualize - I can send a 30KB PostScript graph if you are genuinely interested.) BTW, the average difference between predicted and actual decreased with increasing "min correct" (as you would expect).

If you've got this picture in your head, you can see why this is disturbing---the average error is higher but *much* more predictable when the network is required to learn fewer patterns.

My theory is that (1) I am not using enough (or the right) features to characterize low/high values of the physical property, and (2) the decreasing predictability of the error trend with more training patterns (recall R-square decreases from 0.6 to 0.2) may indicate some form of over-training (too many patterns, not too many iterations). I could have a "whole heap of" outliers, but previous studies with this type of data indicate high levels of variability/unpredictability. The straight line in the graph I described for lower "min correct" may indicate that, with careful study, I could just post-process the data to correct the error.

Well, I thought that this was interesting. If you have any comments, please send them.

Darrell
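[For concreteness, the fit described above is an ordinary least-squares regression of the prediction error on the actual value. A minimal sketch in C with invented toy numbers, not Darrell's data:]

    #include <stdio.h>

    /* Fit err = m*actual + b by least squares and report R-squared,
       the statistic quoted above (0.6 falling to 0.2). */
    static void fit(const double *actual, const double *err, int n)
    {
        double sx = 0, sy = 0, sxx = 0, sxy = 0, ss_res = 0, ss_tot = 0;
        double m, b, ybar;
        int i;
        for (i = 0; i < n; i++) {
            sx  += actual[i];
            sy  += err[i];
            sxx += actual[i] * actual[i];
            sxy += actual[i] * err[i];
        }
        m = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        b = (sy - m * sx) / n;
        ybar = sy / n;
        for (i = 0; i < n; i++) {
            double r = err[i] - (m * actual[i] + b);
            ss_res += r * r;
            ss_tot += (err[i] - ybar) * (err[i] - ybar);
        }
        printf("m = %g, b = %g, R^2 = %g\n", m, b, 1.0 - ss_res / ss_tot);
    }

    int main(void)
    {
        /* toy data showing over-prediction at low values and
           under-prediction at high values (negative slope) */
        double actual[] = { 1, 2, 3, 4, 5, 6 };
        double err[]    = { 0.9, 0.5, 0.2, -0.1, -0.6, -0.8 };
        fit(actual, err, 6);
        return 0;
    }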
From alnl-owner Mon Mar 30 07:48:13 1992
Received: from clvax1.cl.msu.edu ([35.8.2.1]) by scapa.cs.ualberta.ca with SMTP id <42434>; Mon, 30 Mar 1992 07:27:02 -0700
Date: Mon, 30 Mar 1992 07:26:00 -0700
From: "DOUG PIERCE CPS115"
Subject: RE: interesting error trend with atree
To: "mccauley"
cc: "alnl"
Message-Id: <92Mar30.072702mst.42434@scapa.cs.ualberta.ca>

==================================================================
From: mccauley@ecn.purdue.edu (Darrell McCauley)
Message-Id: <9203271629.AA05643@clover.ecn.purdue.edu>
To: alnl@cs.ualberta.ca
Subject: interesting error trend with atree

Howdy, I recently wrote that I was using atree to predict a physical property using image features. After running a series of trials in which I gradually increased the "min correct" parameter in 'lf' and plotting prediction error over the range of the physical property, I noticed a puzzling but interesting trend.

etc.........

Darrell
================================================================

This is very interesting, but I am a bit confused about what is changing and what is constant. You say:

    If you've got this picture in your head, you can see why this
    is disturbing---the average error is higher but *much* more
    predictable when the network is required to learn fewer patterns.

Before I got to this point, I thought the error trend was dependent on your increasing the "min correct" parameter, *not* the number of patterns. Which is it? If the number of patterns enters in here, I would guess that for small numbers, the net is memorizing the individual patterns, and thus the "error" terms are systematic, not random.

I would also like to know how you processed the images (for other reasons entirely). Did you digitize photographs, and if so, how did you transform a grey-scale image? Did you end up with a grid of 0's and 1's (black/white), or did you use grey-scale values as input to the ALN? What was the resolution (X*Y)? Did you have to reduce the size of the inputs to cope with processing time? The reason I ask is that I am doing a project on recognition of facial features, and will have to deal with these issues very soon. Thanks for any ideas. I am hoping that ALNs will be much faster than other net types for this application.

Doug

From alnl-owner Mon Mar 30 11:45:10 1992
Received: from brier.ecn.purdue.edu ([128.46.161.175]) by scapa.cs.ualberta.ca with SMTP id <42143>; Mon, 30 Mar 1992 11:27:16 -0700
Received: by brier.ecn.purdue.edu (5.65/1.30jrs) id AA09272; Mon, 30 Mar 92 13:26:43 -0500
Date: Mon, 30 Mar 1992 11:26:43 -0700
From: mccauley@ecn.purdue.edu (Darrell McCauley)
Message-Id: <9203301826.AA09272@brier.ecn.purdue.edu>
To: alnl@cs.ualberta.ca
In-Reply-To: "DOUG PIERCE CPS115"'s message of Mon, 30 Mar 1992 07:26:00 -0700 <92Mar30.072702mst.42434@scapa.cs.ualberta.ca>
Subject: interesting error trend with atree

> This is very interesting, but I am a bit confused about what is
> changing and what is constant. You say:
>
>     If you've got this picture in your head, you can see why this
>     is disturbing---the average error is higher but *much* more
>     predictable when the network is required to learn fewer patterns.
>
> Before I got to this point, I thought the error trend was
> dependent on your increasing the "min correct" parameter, *not*
> the number of patterns. Which is it?

Yes? Isn't "min correct" the number of patterns that the net is *required* to learn? (I.e., adapt until you have learned "min correct" patterns or reach "max epochs," then predict for the test set and exit.)
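[In other words, the control flow being described is roughly the following. This is a paraphrase in C of the behaviour discussed in this thread, not the actual lf source; the helper names and counter bounds are invented:]

    #include <stdio.h>

    #define TRAIN_SIZE 122

    /* Stubs standing in for the real evaluate/adapt machinery. */
    static int learned[TRAIN_SIZE];
    static void present_pattern(int i) { learned[i] = 1; /* adapt tree */ }
    static int pattern_correct(int i) { return learned[i]; }

    int main(void)
    {
        int min_correct = 122, max_epochs = 20; /* as in the sample lf file */
        int epoch = 0, correct = 0, i;

        /* "min correct" is a target count of training patterns answered
           correctly, not a training-set size: adapt until that many of
           the (fixed) patterns are right, or until "max epochs" passes
           over the training set have been made. */
        while (correct < min_correct && epoch < max_epochs) {
            correct = 0;
            for (i = 0; i < TRAIN_SIZE; i++) {
                present_pattern(i);
                if (pattern_correct(i))
                    correct++;
            }
            epoch++;
        }
        printf("stopped after %d epoch(s), %d correct\n", epoch, correct);
        /* then predict for the test set and exit */
        return 0;
    }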
To clarify: the number of patterns in the training set never changed, only the "min correct" number.

> If the number of patterns enters in here, I would guess that for
> small numbers, the net is memorizing the individual patterns, and
> thus the "error" terms are systematic, not random.

This could be, but does anyone have a clue why I saw the particular pattern I did?

From alnl-owner Mon May 4 19:42:50 1992
Received: from thistle.ecn.purdue.edu ([128.46.161.178]) by scapa.cs.ualberta.ca with SMTP id <42552>; Mon, 4 May 1992 19:28:37 +0100
Received: by thistle.ecn.purdue.edu (5.65/1.30jrs) id AA25023; Mon, 4 May 92 20:28:07 -0500
Date: Mon, 4 May 1992 19:28:07 +0100
From: mccauley@ecn.purdue.edu (Darrell McCauley)
Message-Id: <9205050128.AA25023@thistle.ecn.purdue.edu>
To: alnl@cs.ualberta.ca
Subject: aln application paper available

The following paper (11 pages in length) is available via anonymous ftp from

    pasture.ecn.purdue.edu:pub/mccauley/mccauley.beef.ps.Z

and

    archive.cis.ohio-state.edu:pub/neuroprose/mccauley.beef.ps.Z

A shorter version was submitted to Transactions of the ASAE. Any comments or questions should be sent to mccauley@ecn.purdue.edu. Though I cannot mail hardcopies, I may be willing to e-mail compressed, uuencoded PostScript versions.

-------------------------------------------------------------------------

FAT ESTIMATION IN BEEF ULTRASOUND IMAGES USING TEXTURE AND ADAPTIVE LOGIC NETWORKS

James Darrell McCauley, USDA Fellow, Dept of Agricultural Engineering, Purdue University (mccauley@ecn.purdue.edu)
Brian R. Thane, Graduate Student, Dept of Agricultural Engineering, Texas A&M University (thane@diamond.tamu.edu)
A. Dale Whittaker, Assistant Professor, Dept of Agricultural Engineering, Texas A&M University (dale@diamond.tamu.edu)

Overviews of Adaptive Logic Networks and co-occurrence image texture are presented, along with a brief synopsis of instrument grading of beef. These tools are used for both prediction and classification of intramuscular fat in beef from ultrasonic images of both live beef animals and slaughtered carcasses. Results showed that Adaptive Logic Networks perform better than any fat prediction method for beef ultrasound images to date and are a viable alternative to statistical techniques.

\keywords{Meat, Grading, Automation, Ultrasound Images, Neural Networks.}

--
James Darrell McCauley     Department of Ag Engr, Purdue Univ
mccauley@ecn.purdue.edu    West Lafayette, Indiana 47907-1146
** "Do what is important first, then what is urgent." (unknown) **

From alnl-owner Mon May 11 00:48:34 1992
Received: by scapa.cs.ualberta.ca id <42481>; Mon, 11 May 1992 00:37:39 +0100
From: Bill Armstrong
To: alnl
Subject: new version for Windows
Message-Id: <92May11.003739mdt.42481@scapa.cs.ualberta.ca>
Date: Mon, 11 May 1992 00:37:20 +0100

There is a new version of atree for Windows on the PC, atree release 2.5, available via ftp from menaik.cs.ualberta.ca [129.128.4.241] in directory pub. If you just want to try running it, you can get the file a25exe.exe. This is self-extracting: copy it into your main directory on a PC and run it. It creates a directory atree_25 which contains the executables.

This is a much enhanced version, and you will enjoy the demonstration of noise immunity shown by the ocr demo. Try running ocr.exe and see how your eyes compete with the ALNs in distinguishing the characters in the presence of noise! If you intend to do some programming, you can get the full source in atre25.exe, which also contains everything the shorter version does.
Both files are also available on CompuServe in the WINADV and AIEXPERT forums.

ALNs have had great successes lately in several areas: beef grading, classifying particles at an accelerator facility, and designing walking prostheses for spinal-cord damaged patients. To date we have distributed over 1400 direct copies of the atree release 2.0 software on the Internet and over 800 on CompuServe. ALNs will be taught in some university courses as an alternative approach to backprop.

The Unix version is still release 2.0, but this will be upgraded soon to include some of the new features of release 2.5 under Windows.

Good luck with release 2.5!

Bill

PS atree release 2.5 was done by Monroe Thomas, who deferred taking on a steady job so he could work on it and on a commercial version that may be ready in the Fall. Thanks Monroe!

From alnl-owner Wed Jun 17 14:28:31 1992
Received: from mail.uunet.ca ([142.77.1.1]) by scapa.cs.ualberta.ca with SMTP id <42453>; Wed, 17 Jun 1992 14:15:27 +0100
Received: from mitel by mail.uunet.ca with UUCP id <10018>; Wed, 17 Jun 1992 16:14:30 -0400
Received: from Software.Mitel.COM (spock) by Mitel.COM (4.1/SMI-4.1) id AA04031; Wed, 17 Jun 92 12:26:09 EDT
Received: from msn506 by Software.Mitel.COM (4.0/SMI-4.0) id AA28176; Wed, 17 Jun 92 12:26:08 EDT
Date: Wed, 17 Jun 1992 10:26:08 +0100
From: Gary.Murphy@Software.Mitel.COM (Gary Murphy)
Message-Id: <9206171626.AA28176@Software.Mitel.COM>
To: alnl@cs.ualberta.ca
Subject: AI.CD-ROM

From alnl-owner Sat Jul 4 00:28:12 1992
Received: by scapa.cs.ualberta.ca id <42429>; Sat, 4 Jul 1992 00:13:48 +0100
From: Bill Armstrong
To: alnl
Subject: atree release 2.6 is available now
Message-Id: <92Jul4.001348mdt.42429@scapa.cs.ualberta.ca>
Date: Sat, 4 Jul 1992 00:13:41 +0100

Dear alnl subscriber:

The adaptive logic network (ALN) simulation package atree release 2.6 is available on menaik. It runs on IBM PCs and compatibles under Windows 3.x. Included are documentation and ON-LINE HELP that will help you understand the basic principles of adaptive logic networks and enable you to try out some examples of your own creation.

This package is not a toy. Experimenters are using it on challenging problems of medicine, physics and the environment. It is possible to use off-the-shelf programmable logic devices to realize the results of training these logic networks in very high-speed hardware, though we do not provide those facilities here. Please read the license, which among other things allows only non-commercial use. A commercial version is planned for later release.

Atree release 2.6 is available in either of two files in the pub/ directory: atre26.exe and a26exe.exe. The file atre26.exe contains the full C and C++ sources for those who want to program some applications using Borland C++ 3.x and Application Frameworks, or some similar C/C++ development environment. The other, smaller file contains just the executables for those who just want to try out atree release 2.6.

It is recommended that you execute atre26.exe in your main directory, whereupon it will create a subdirectory atree_26 and extract everything into it. Running "setup" in that directory will create a group of icons you can use to invoke demos and the facilities for programming adaptive logic network applications in the lf language. The "Open" command gives you access to numerous instructive examples.

The new OCR demo in release 2.6 allows the user to create his own character shapes.
Rotations and translations of the characters and a large amount of "salt and pepper" noise can be overcome by the adaptive logic networks. We think you will be impressed.

The Unix version, atree release 2.0, will be upgraded soon to add some of the functionality of release 2.6. It is not yet capable of the nice color displays of the Windows version, but besides running on Unix workstations it can be easily ported to the Macintosh, the Amiga, and other machines.

Thanks to all those who have sent comments about their work on ALNs, and have helped us to develop new concepts at a rapid rate. We would appreciate your continued help in finding any problems in release 2.6. Thanks to all of you on the alnl mailing list. I hope the list will become more active now that real results are starting to appear in high energy physics, medicine, agriculture, etc.

Bill

From alnl-owner Thu Jul 16 14:36:27 1992
Received: by scapa.cs.ualberta.ca id <42906>; Thu, 16 Jul 1992 14:18:13 +0100
Subject: Problems with atree 2.0 and 2.6
From: Bill Armstrong
To: alnl
Date: Thu, 16 Jul 1992 14:18:02 +0100
X-Mailer: ELM [version 2.3 PL11]
Message-Id: <92Jul16.141813mdt.42906@scapa.cs.ualberta.ca>

Shyam Kamadolli pointed out some problems with atree 2.0 and 2.6:

> 1) atree 2.0 on UNIX
>
> My problem is that the histogram generating script complains about the
> line being too long after I run the lf routine on my training and test sets.

Rolf Manderscheid suggests trying gawk (gnu-awk) instead of awk if you have it. Let me know what happens please.

----------------------------------

I did try gawk.. it WORKS. Thanks again.

2) atree 2.6

In lfBatch, if you try to change directories, the application dies. This is a bug, which will be corrected soon.

If others have found any problems, please let us know asap. Thanks.

Bill

From alnl-owner Fri Jul 24 14:58:00 1992
Received: by scapa.cs.ualberta.ca id <42376>; Fri, 24 Jul 1992 14:20:09 +0100
From: Bill Armstrong
To: alnl
Message-Id: <92Jul24.142009mdt.42376@scapa.cs.ualberta.ca>
Date: Fri, 24 Jul 1992 14:19:56 +0100

For those who would like to understand the ALN learning algorithm used in the atree demo software, here is a little writeup that should help. The best published introduction to ALN learning is the paper:

    W. Armstrong and J. Gecsei, "Adaptation Algorithms for Binary Tree Networks", IEEE Trans. on Systems, Man and Cybernetics, 9, 1979, pp. 276-285.

The only significant changes in atree 2.0 and 2.6 from that paper are that the heuristic responsibility is now a combination of true responsibility, error responsibility, and responsibility of both children of LEFT and RIGHT nodes. The details are all in the "adapt" routine in atree.c, shown in part below.

    private void adapt(tree, vec, desired)
    atree *tree;
    bit_vec *vec;
    bool_t desired;
    {
        register int lres;
        register int rres;
        register int funct;

        /*
         * INCR and DECR implement bounded counters.
         */
    #define INCR(t) ((t) += ((t) < MAXSET))
    #define DECR(t) ((t) -= ((t) > MINSET))

        funct = tree -> c_tag;

The c_tag is 0 for an AND node and 1 for an OR node. LEFT and RIGHT nodes have tags 2 and 3; which is which doesn't matter here. The node we are adapting is the root of a recursively defined "tree" datatype. First the node has to have its inputs evaluated, if that hasn't already been done. (Of course, we don't adapt leaves.)
    if (funct != LEAF) {
        /* If the left child is unevaluated, evaluate it */
        if ((lres = tree -> n_sig_left) == UNEVALUATED) {
            lres = atree_eval(tree -> n_child[LEFT_CHILD], vec);
            tree -> n_sig_left = lres;
        }

Now n_sig_left stores the result of evaluation on the left.

        /* If the right child is unevaluated, evaluate it */
        ...

Same thing on the other side.

        /* Update the counters if needed */

There are two counters, one associated with the (left,right) = (0,1) input pair and one with the (1,0) input pair. The appropriate counter is counted up or down depending on the desired network output. (Yes, the desired *network* output -- because all nodes are monotonic increasing functions, there is a positive correlation between the net output and any element's output.)

        if (lres != rres)

If there is a (0,0) or (1,1) input pair to the node, no adaptation occurs, according to the above condition.

        {
            if (lres) {
                (void) (desired ? INCR(tree -> n_cnt_left)
                                : DECR(tree -> n_cnt_left));
            }
            else {
                (void) (desired ? INCR(tree -> n_cnt_right)
                                : DECR(tree -> n_cnt_right));
            }

            /* has the node function changed? */

The values of the two counters determine the function of the node; e.g. if they are both above (below) their midpoints, it is an OR (AND) node. If they are on different sides of their midpoints, the node is a LEFT or RIGHT.

            tree -> c_tag = node_function(tree);
            funct = tree -> c_tag;
        }

        /* Assign responsibility to the left child */

This is symmetric to the responsibility assignment for the right child discussed below, so we don't need to discuss it.

        /* Assign responsibility to the right child */
        if (lres != desired || funct != (lres ? OR : AND)) {
            adapt(tree -> n_child[RIGHT_CHILD], vec, desired);
        }

This finishes the algorithm, except for the one line that must be understood to grasp the entire atree credit-assignment algorithm, and that is the condition in the last if-statement:

    lres != desired || funct != (lres ? OR : AND)

The current node is heuristically responsible when we get to this if-statement, and the question is which children should be h. resp.

If the left input is not equal to the desired network output (for a h. resp. node, lres != desired is called an "error" on the left), the right child is made responsible. The rationale is that when there is an error on the left, the left child may not be able to correct it (maybe it's a leaf), so the right child is given the chance.

If the node function is not OR or AND (i.e. it is LEFT or RIGHT, with funct == 2 or 3, which are never values of the expression (lres ? OR : AND)), then the right child is made h. resp. This says: try to put the subtrees that are cut off by a LEFT or RIGHT to some useful work. If such a cut-off subtree gets good enough, the node function might change to AND or OR to give that subtree an effective role in the net.

Finally, if the left input is 0 for an OR node or 1 for an AND node (i.e. if the right signal rres is enabled to pass through), then the right child is heuristically responsible. This last part of the credit assignment by h. resp., if implemented by itself, is true responsibility for the AND and OR nodes: a node is truly responsible if a change in its output would change the network's output. This component of heuristic responsibility is just hill-climbing or gradient descent. The two earlier components of h. resp. go beyond hill climbing to give the learning algorithm much more power than hill climbing alone.

Note that the left input controls the right child's responsibility, and vice-versa. This implies that every node in the tree can interact with every other node at every pattern presentation.
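[One consistent reading of the node_function step above is sketched below. This is illustrative C, not the atree source; the counter bounds are invented, and the orientation of the two counters may be the opposite of atree's -- as noted above, which is which doesn't matter.]

    #include <stdio.h>

    #define MINSET 0
    #define MAXSET 31
    #define MID ((MINSET + MAXSET) / 2)

    enum { AND, OR, LEFT, RIGHT };

    /* One counter per disagreeing input pair: cnt10 votes on what the
       node should output for inputs (1,0), cnt01 for (0,1).  AND and OR
       already agree on (0,0) and (1,1), so no counters are needed for
       those.  A counter above its midpoint votes for output 1. */
    static int node_function(int cnt10, int cnt01)
    {
        int on10 = cnt10 > MID, on01 = cnt01 > MID;
        if (on10 && on01)   return OR;   /* 1 on both mixed pairs  */
        if (!on10 && !on01) return AND;  /* 0 on both mixed pairs  */
        return on10 ? LEFT : RIGHT;      /* pass one input through */
    }

    int main(void)
    {
        printf("%d %d %d %d\n",
               node_function(MAXSET, MAXSET),  /* OR    */
               node_function(MINSET, MINSET),  /* AND   */
               node_function(MAXSET, MINSET),  /* LEFT  */
               node_function(MINSET, MAXSET)); /* RIGHT */
        return 0;
    }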
There is one other trick: if the network output is still wrong after the first adaptation, adapt again. This compensates for a shortage of errors to correct as the end of adaptation approaches.

(The code above was written by Rolf Manderscheid, to whom should go the glory or blame for putting a set of nested conditions into one line.)
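[In terms of the routines shown above, that trick might look like the following. This is a sketch only, not the actual atree driver code; it assumes the declarations in atree.h and omits details such as clearing the cached node signals between evaluations.]

    /* Per-pattern training step with the "adapt again" trick:
       evaluate, adapt on error, and if the network output is
       still wrong after re-evaluation, adapt a second time. */
    void train_on_pattern(atree *tree, bit_vec *vec, bool_t desired)
    {
        if (atree_eval(tree, vec) != desired) {
            adapt(tree, vec, desired);         /* first adaptation   */
            if (atree_eval(tree, vec) != desired)
                adapt(tree, vec, desired);     /* still wrong: again */
        }
    }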