next up previous contents
Next: OptPars Up: THE RSM DATA STRUCTURES Previous: THE RSM DATA STRUCTURES

Task Specification

The TaskSpec structure is used to define an optimization problem to the system. Here is an example:

                  |  online->optweights NULL  |  final->xweights  NULL   |     
                  |  online->ytarget    NULL  |  final->pn_string NULL   |     
                  |  online->yweights   NULL  |  final->max       TRUE   |     
                  |  online->xtarget    NULL  |  con->fixed_dims  0      |     
                  |  online->xweights   NULL  |  con->fixed_vals  0.5    |     
                  |  online->pn_string  NULL  |  con->lo_vals     0      |     
                  |  online->max        TRUE  |  con->hi_vals     1      |     
                  |  final->optweights  1     |  total_steps      30     |     
                  |  final->ytarget     NULL  |  auto_facode      FALSE  |     
                  |  final->yweights    NULL  |  [usin_size       1]     |     
                  |  final->xtarget     NULL  |  [usout_size      1]     |
Online and final are the online and final costs. Each is specified by a quality structure. The cost function given by a quality structure is:
C(X,Y) = eval(pn) + 
         optweights . Y + 
         (X - xtarget)' * diag(xweights) * (X - xtarget) +
         (Y - ytarget)' * diag(yweights) * (Y - ytarget)
There are four terms. The first is a parse node. It is given by specifying string expression such as "5.0 * x[1] - 3.2 *y[0] + sin(x[0])". Look in the source file "damut/parse.c" to see its full syntax. x and y are used for the input and output variables, but their respective variable names (described later in the sidat structure) may also be used. This parse node could be the sole method of describing cost functions since each of the remaining 3 terms could be written in it. However, those terms are listed separately because they are the most common forms for cost functions, and some algorithms can take advantage of knowing the cost function is in those forms.

The second term is a weighted sum of the outputs. This is most useful for straight minimization or maximization of output values.

The third and fourth terms penalize squared deviation from a target input and output value respectively. Common uses are two find the X that achieves a target value of Y rather than minimizing or maximizing it. Similarly, it may be desirable to find the X nearest some nominal X that achieves a target or minimal cost, and thus there might be a penalty on squared deviation from the nominal. This is good for regularizing a problem that would otherwise be ill-defined because many Xs achieve the target Y. Normally, a quadratic penalty function can include a full matrix of weights. In practice, it is rare that this matrix would have non-zero terms anywhere but on the diagonal. Therefore, the structure specifies the weights as a vector of the weights on the diagonal.

For simplicitly, the algorithms expect the online and final costs to both be maximization, or both be minimization.

The constraints describe what the range of allowed experiments is. Fixeddims and fixedvals allow some of the input variables to be fixed at a certain value and thus excluded from the search. Fixeddims is an integer vector with 1s for any dimension that should be held fixed and 0s for the ones that should be experimented over. The fixedvals vector specifies what the values should be for the dimensions that are fixed. The values in the fixedvals vector of the dimensions that are not fixed are ignored.

Why not just remove any dimensions that you don't want to search over from the problem specification altogether? The reason is that you may have data which has your "fixed variables" set at values other than what you intend to fix them at. You would still like to use that data and have it contribute to the function approximator's estimation even though you already know what value you want for those variables.

Fixeddims and fixedvals are equality constraints on some of the variables. Inequality constraints are specified with the lovals and hivals vectors. Only axis-parallel constraints may be given and you must either give upper and lower bounds on all the input variables, or on none of them. The software will still operate if you give no bounds, but it is almost always better to set some reasonable limits on the input variables.

totalsteps is used by OPTEX to compute n in the second term of its valuation function. It is the total number of experiments you intend to run including any historical results you have. n is totalsteps minus the number of data points already given.

autofacode is a boolean indicating whether you would like to use an algorithm that will automatically choose the best facode for you. The facode is part of the GMBL function approximator and is described later. Use of this option is not recommended yet. In its current form, it always suggests gmstrings of the from L?0:SN:9 where the ? is filled in by its estimate of the best smooth code to use based mostly on leave one out cross validation error. Because of the special requirements of the RSM algorithms, choosing a facode based strictly on leave one out prediction error is not necessarily the best thing to do. It is important that the function approximator extrapolates in a reasonable way because the RSM algorithms need to have reasonable estimates of what will happen if they explore outside the region currently covered by the data. Reasonable extrapolation is not something that is directly encouraged by minimizing leave one out error.



next up previous contents
Next: OptPars Up: THE RSM DATA STRUCTURES Previous: THE RSM DATA STRUCTURES



Jeff Schneider
Thu Apr 25 13:10:56 EDT 1996