========================================= U N C E R T A I N T Y A R I T M E T I C ========================================= This file describes the arithmetic operations on uncertain values and functions, used in the optimization and preference elicitation. ------- Objects ------- We use the following hierarchy of classes to represent objects in the uncertainty arithmetic. Numeric interval Integer interval Real interval Numeric PUD Integer PUD Real PUD Uncertain PLF Y-uncertain PLF Certain PLF -= Numeric interval =- A "numeric interval" is a closed interval defined by its endpoints; for example, [2..4] represents the set of values between 2 and 4. If the lower endpoint is strictly larger than the upper endpoint, the interval is empty. We also allow specifying an interval as "Unknown", which means that we do not have any information about it. An integer interval represents the set of integers between its endpoints, whereas a real interval represents all reals between its endpoints. For example, the integer interval [2..4] represents three numbers (2, 3, and 4), whereas the real interval [2..4] represents infinitely many numbers. -= Numeric PUD =- A "numeric PUD" is a set of intervals with respective real-valued probabilities, which represents a probability distribution. It cannot contain empty or "Unknown" intervals, its intervals must not overlap, and the probabilities must add up to 1.0. Intervals in a real PUD may have common endpoints (e.g. [2..4] and [4..6]), whereas intervals in an integer PUD must not have common endpoints. We assume that intervals in a PUD are sorted by their lower endpoints. We also allow specifying a PUD as "Unknown", which means that we do not have any information about the distribution. An integer PUD represent a distribution over a finite number of integers, whereas a real PUD is a distribution over reals. For example, suppose that a PUD includes only one interval, [2..3]. If it is an integer PUD, it represents a distribution over two values (2 and 3); if it is a real PUD, it represents a uniform distribution over all reals between 2 and 3. -= PLF =- A "certain PLF" is a piecewise-linear function specified by a set of "inflection points," where the specification of each point includes its x and y coordinates. We assume that these points have distinct x-coordinates, and that they are sorted on x. An "y-uncertain PLF" is a list of inflection points, where each point has a specific x-coordinate, and either a specific value or a numeric PUD for its y-coordinate; these numeric PUDs must not be "Unknown". An "uncertain PLF" is list of y-uncertain PLFs along with respective probability values, which must add up to 1.0. We also allow specifying an uncertain PLF as "Unknown". ---------- Operations ---------- We plan to support the following operations on uncertain values and PLFs. -= Conversions =- Interval(Number) -> Interval Number(Interval) -> Number (or nil) Numeric-PUD(Number) -> Numeric-PUD Number(Numeric-PUD) -> Number (or nil) Interval(Numeric-PUD) -> Interval Numeric-PUD(Interval) -> Numeric-PUD (or nil) -= Interval arithmetic =- - Interval -> Interval Interval + Number -> Interval Interval * Number -> Interval Interval + Interval -> Interval Interval * Interval -> Interval Union(Interval, Interval) -> Interval Intersection(Interval, Interval) -> Interval -= PUD arithmetic =- - Numeric-PUD -> Numeric-PUD Mean(Numeric-PUD) -> Number (or nil) Std-Dev(Numeric-PUD) -> Number (or nil) Numeric-PUD + Number -> Numeric-PUD Numeric-PUD * Number -> Numeric-PUD Numeric-PUD + Numeric-PUD -> Numeric-PUD Numeric-PUD * Numeric-PUD -> Numeric-PUD -= Application of certain PLFs =- Apply(Certain-PLF, Number) -> Number Apply(Certain-PLF, Interval) -> Interval Apply(Certain-PLF, Numeric-PUD) -> Numeric-PUD -= Application of uncertain PLFs =- Apply(Uncertain-PLF, Number) -> Numeric-PUD Apply(Uncertain-PLF, Interval) -> Interval Apply(Uncertain-PLF, Numeric-PUD) -> Numeric-PUD ----------- Conversions ----------- The "conversion" operations are similar to type-casting; that is, they convert an object into a similar object of a different type. -= Interval(Number) =- Convert a number into an interval that includes only this number: Interval(2) -> [2..2] Formula: Interval(W) -> [W .. W] -= Number(Interval) =- If an interval represents a single number, return this number; else, return nil: Number([2..2]) -> 2 Number([2..4]) -> nil Number(Unknown) -> nil Formula: Number([L .. U]) -> L, if L = U -> nil, if L != U Number(Unknown) -> nil -= Numeric-PUD(Number) =- Convert a number into a numeric PUD that includes only this number: Numeric-PUD(2) -> (1.0 [2..2]) Formula: Numeric-PUD(W) -> (1.0 [W .. W]) -= Number(Numeric-PUD) =- If a PUD represents a single number, return this number; else, return nil: Number((1.0 [2..2])) -> 2 Number((0.5 [2..2], 0.5 prob [4..6])) -> nil Formula: Number((1.0 [W .. W])) -> W Number(Any-Other-Case) -> nil -= Interval(Numeric-PUD) =- Determine the minimal interval that includes all intervals of a PUD: Interval((1.0 [2..4])) -> [2..4] Interval((0.5 [2..2], 0.5 [4..6])) -> [2..6] Interval(Unknown) -> Unknown Formula: Interval((P1 [L1 .. U1], ..., Pn [Ln .. Un])) -> [L1 .. Un] Interval(Unknown) -> Unknown -= Numeric-PUD(Interval) =- Convert an interval into a PUD, where all values are equally likely; if the interval is Unknown, return Unknown: Numeric-PUD([2..4]) -> (1.0 [2..4]) Numeric-PUD(Empty) -> nil Numeric-PUD(Unknown) -> Unknown Formula: Numeric-PUD([L .. U]) -> (1.0 [L .. U]), if L <= U Numeric-PUD(Empty) -> nil Numeric-PUD(Unknown) -> Unknown ------------------- Interval arithmetic ------------------- -= - Interval =- Determine the negation of an interval: - [2..4] -> [-4..-2] - [-2..4] -> [-4..2] - Unknown -> Unknown Formula: - [L .. U] -> [-U .. -L], if L <= U - Empty -> Empty - Unknown -> Unknown -= Interval + Number =- Determine the smallest interval that includes all "a + NUMBER" values, where "a" is from the given interval: [2..4] + 6 -> [8..10] Unknown + 6 -> Unknown Formula: [L .. U] + W -> [L + W .. U + W], if L <= U Empty + W -> Empty Unknown + W -> Unknown -= Interval * Number =- Determine the smallest interval that includes all "a * NUMBER" values, where "a" is from the given interval: [2..4] * 2 -> [4..8] [2..4] * -2 -> [-8..-4] Formula: [L .. U] * W -> [L * W, U * W], if W >= 0 and L <= U -> [U * W, L * W], if W < 0 and L <= U Empty * W -> Empty Unknown * W -> [0..0], if W = 0 -> Unknown, if W != 0 -= Interval + Interval =- Determine the smallest interval that includes all "a + b" values, where "a" is from the first given interval, and "b" is from the second: [2..4] + [4..6] -> [6..10] [2..4] + Unknown -> Unknown Formula: [L1 .. U1] + [L2 .. U2] -> [L1 + L2 .. U1 + U2], if L1 <= U1 and L2 <= U2 Empty + Any -> Empty Unknown + Any-Nonempty -> Unknown -= Interval * Interval =- Determine the smallest interval that includes all "a * b" values, where "a" is from the first given interval, and "b" is from the second: [2..4] * [4..6] -> [8..24] [2..4] * [-1..2] -> [-4..8] Unknown * [2..4] -> Unknown Unknown * [0..0] -> [0..0] Formula: [L1 .. U1] * [L2 .. U2] -> [min(L1 * L2, L1 * U2, L2 * U1, L2 * U2) .. max(L1 * L2, L1 * U2, L2 * U1, L2 * U2)], if L1 <= U1 and L2 <= U2 Empty * Any -> Empty Unknown * Any-Nonempty -> Unknown -= Union(Interval, Interval) =- Determine the smallest interval that includes both given intervals. Union([2..4], [6..10]) -> [2..10] Union([2..4], Unknown) -> Unknown Formula: Union([L1 .. U1], [L2 .. U2]) -> [min(L1, L2) .. max(U1, U2)], if L1 <= U1 and L2 <= U2 Union(Empty, [L .. U]) -> [L .. U] Union(Unknown, Any) -> Unknown -= Intersection(Interval, Interval) =- Determine the largest interval contained in both given intervals. Intersection([2..4], [3..6]) -> [3..4] Intersection(Unknown, [2..4]) -> Unknown Intersection(Unknown, Empty) -> Empty Formula: Intersection([L1 .. U1], [L2 .. U2]) -> [max(L1, L2) .. min(U1, U2)], if L1 <= U1 and L2 <= U2 Intersection(Empty, Any) -> Empty Intersection(Unknown, Any-Nonempty) -> Unknown -------------- PUD arithmetic -------------- -= - Numeric-PUD =- Determine the negation of a PUD: - (0.7 [2..4], 0.3 [6..10]) -> (0.3 [-10..-6], 0.7 [-4..-2]) - Unknown -> Unknown Formula: - (P1 [L1 .. U1], ...) -> (..., P1 [-U1 .. -L1]) - Unknown -> Unknown -= Mean(Numeric-PUD) =- Determine the mean value of the specified distribution; if the PUD is Unknown, return nil: Mean(0.5 [2..4], 0.5 [6..8]) -> 5 Mean(Unknown) -> nil Formula: Mean(P1 [L1 .. U1], ...) -> 0.5 * (P1 * (L1 + U1) + ...) Mean(Unknown) -> Unknown -= Std-Dev(Numeric-PUD) =- Determine the standard deviation of the specified distribution; if the PUD is Unknown, return nil: Std-Dev(0.5 [2..4], 0.5 [6..8]) -> 2.4 Std-Dev(Unknown) -> Unknown Formula: For real PUDs: Std-Dev(P1 [L1 .. U1], ...) -> Sqrt( ((P1 * (U1^2 + U1 * L1 + L1^2)) + ...) / 3 - Mean^2 ) where Mean is the mean value of the distribution For integer PUDs: Std-Dev(P1 [L1 .. U1], ...) -> Sqrt( (P1 * (F(U1) - F(L1 - 1)) / (U1 - (L1 - 1)) + ...) - Mean^2) where Mean is the mean value of the distribution, and F(X) = x^3/3 + x^2/2 + x/6 Std-Dev(Unknown) -> Unknown -= Numeric-PUD + Number =- Determine the distribution of the "a + NUMBER" values, where "a" is distributed according to the given PUD: (0.7 [2..4], 0.3 [6..10]) + 4 -> (0.7 [6..8], 0.3 [10..14]) Unknown + 4 -> Unknown Formula: (P1 [L1 .. U1], ...) + W -> (P1 [L1 + W .. U1 + W], ...) Unknown + W -> Unknown -= Numeric-PUD * Number =- Determine the distribution of the "a * NUMBER" values, where "a" is distributed according to the given PUD: (0.7 [2..4], 0.3 [6..10]) * 2 -> (0.7 [4..8], 0.3 [12..20]) (0.7 [2..4], 0.3 [6..10]) * -2 -> (0.3 [-20..-12], 0.7 [-8..-4]) Unknown * 2 -> Unknown Formula: (P1 [L1 .. U1], ...) * W -> (P1 [L1 * W .. U1 * W], ...), if W > 0 -> (..., P1 [U1 * W .. L1 * W]), if W < 0 -> (1.0 [0..0]), if W = 0 Unknown * W -> (1.0 [0..0]), if W = 0 Unknown, if W != 0 -= Numeric-PUD + Numeric-PUD, Numeric-PUD * Numeric-PUD =- SKIP FOR NOW... --------------------------- Application of certain PLFs --------------------------- -= Apply(Certain-PLF, Number) =- Apply a function to a given number; that is, determine the y coordinate that corresponds to the given x coordinate: Apply(((0,1), (1,2)), 0.5) -> 1.5 Algorithm: Apply(((X1, Y1), (X2, Y2), ..., (Xn, Yn)), W) - If W < X1, return Y1 - If W > Xn, return Yn - If W = Xi for some i, return Yi - If X[i-1] < W < Xi for some i, return (W * Yi - W * Y[i-1] + Xi * Y[i-1] - X[i-1] * Yi) / (Xi - X[i-1]) -= Apply(Certain-PLF, Interval) =- Apply a function to a given interval; that is, determine the smallest interval that includes all "Apply(PLF, a)" values, where "a" is from the given interval: Apply(((0,1), (4,5)), [1..3]) -> [2..4] Apply(((0,1), (4,5)), Unknown) -> Unknown Algorithm: Apply(((X1, Y1), (X2, Y2), ..., (Xn, Yn)), [L .. U]) - Let X[i-1] < L <= Xi and X[j-1] <= U < Xj - Let YL be the result of applying PLF to L, and YU be the result of applying PLF to U - Return [min(YL, YU, Yi, ..., Y[j-1]) .. max(YL, YU, Yi, ..., Y[j-1])] -= Apply(Certain-PLF, Numeric-PUD) =- SKIP FOR NOW... ----------------------------- Application of uncertain PLFs ----------------------------- SKIP FOR NOW...