CLHS: Issue HASH-TABLE-REHASH-SIZE-INTEGER Writeup

Issue: HASH-TABLE-REHASH-SIZE-INTEGER

Reference: Draft 8.81, p.19-3

Category: CLARIFICATION/CHANGE

Edit History: Version 1, 06/16/91, Kim Barrett

Version 2, 09/26/91, Steve Haflich, add Franz current practice

Version 3, 01/10/91, Steve Haflich, Lucid & Chestnut current practice

Problem Description:

The semantics for the :REHASH-SIZE argument to MAKE-HASH-TABLE are unclear.

The description in the draft says it can be

"an integer greater than zero, which is the number of entries to add, or it

can be a floating-point number greater than 1, which is the ratio of the

new size to the old size."

When the :REHASH-SIZE argument is an integer, it is unclear whether it is

expected to be scaled as the size is increased or if it is supposed to

indicate an expected additional number of entries to add.

At issue is whether a programmer can use the type of value provided for the

:REHASH-SIZE argument to give the implementation a hint as to the expected

growth rate for the table, with an integer indicating an additive growth rate

and a float indicating a multiplicative growth rate.
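The two forms of the argument can be illustrated with a short sketch. The variable names and the FILL-TABLE helper are hypothetical and exist only for this example. Under either reading of the draft, both tables must hold every entry stored in them; what is at issue is only how an implementation is expected to grow them.

```lisp
;; Integer :REHASH-SIZE, intended as an additive growth hint.
(defparameter *additive-table*
  (make-hash-table :size 100 :rehash-size 50))

;; Float :REHASH-SIZE, intended as a multiplicative growth hint.
(defparameter *multiplicative-table*
  (make-hash-table :size 100 :rehash-size 1.5))

;; Hypothetical helper: store N entries so the table has to grow
;; past its initial :SIZE.  Returns the table.
(defun fill-table (table n)
  (dotimes (i n table)
    (setf (gethash i table) (* i i))))
```

Calling (fill-table *additive-table* 1000) forces the table to enlarge several times. How much it grows at each step is exactly what this issue leaves to the implementation.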

Proposal:

Specify that if the :REHASH-SIZE argument is an integer then the

implementation may assume that the expected growth rate for the table is

additive, and that if the argument is a float then it may assume that the

expected growth rate is multiplicative.

Clarify that the value of the :REHASH-SIZE argument does not constrain the

implementation to use any particular method for computing the new size when

the hash table is enlarged. The actual method for computing the new size is

implementation dependent and the :REHASH-SIZE argument only provides hints

from the programmer to the implementation.
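As a minimal sketch of the distinction the proposal draws, an implementation that chose to follow the hint literally might compute a candidate size as below. SUGGESTED-NEW-SIZE and GROWTH-SEQUENCE are hypothetical names, not part of any implementation, and a real implementation remains free to adjust the result (for example, rounding to a prime or a power of two) or to ignore the hint entirely.

```lisp
;; Hypothetical: a literal reading of the :REHASH-SIZE hint.
(defun suggested-new-size (old-size rehash-size)
  "Return a candidate new size for a table of OLD-SIZE entries."
  (etypecase rehash-size
    ;; Integer: additive growth, add a fixed number of entries.
    ((integer 1 *) (+ old-size rehash-size))
    ;; Float: multiplicative growth, scale the old size.
    ;; MAX guarantees the table grows by at least one entry.
    ((float (1.0) *)
     (max (1+ old-size) (ceiling (* old-size rehash-size))))))

;; Hypothetical: the sizes reached after STEPS successive enlargements.
(defun growth-sequence (initial-size rehash-size steps)
  (loop for size = initial-size
          then (suggested-new-size size rehash-size)
        repeat (1+ steps)
        collect size))
```

Under these assumptions, (growth-sequence 100 50 3) gives (100 150 200 250), while (growth-sequence 100 1.5 3) gives (100 150 225 338). The first enlargement is the same in both cases; the two hints diverge only as the table keeps growing.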

Editorial Impact:

An isolated change to MAKE-HASH-TABLE and HASH-TABLE-REHASH-SIZE.

Rationale:

Provides a means for the programmer to reliably provide to the implementor a

particular piece of information about the programmer's intent, without

constraining the implementor to any particular implementation technique.

Current Practice:

Symbolics Genera and IIM appear to use the :SIZE and integral :REHASH-SIZE

arguments to produce a ratio which is then used as the effective rehash size

in the same way as if a float with the same value had been specified for the

:REHASH-SIZE.
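The writeup does not give the exact formula these implementations use. One plausible reading, shown here purely as an assumption, is that the ratio is the size after one additive step divided by the initial size. EFFECTIVE-REHASH-RATIO is a hypothetical name.

```lisp
;; Hypothetical: fold :SIZE and an integer :REHASH-SIZE into a float
;; ratio.  The formula is an assumption, not taken from any
;; implementation's source.
(defun effective-rehash-ratio (size rehash-size)
  (let ((base (max 1 size)))            ; avoid dividing by zero
    (float (/ (+ base rehash-size) base))))
```

With :SIZE 100 and :REHASH-SIZE 50 this yields 1.5, so the table would then grow multiplicatively even though the programmer supplied an integer.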

Lucid, Franz, and Chestnut conform to the interpretation of

:REHASH-SIZE here suggested.

Discussion:

Pitman:

Only the application programmer knows whether new elements are expected to

arrive in an additive or multiplicative way. All the implementation knows at

the time growth needs to occur is that there isn't room. It can't tell how

many extra elements are coming. And just because the computation on the size

is allowed to be slightly fuzzy, that doesn't mean the input to that

computation should be allowed to be fuzzy.

Control of memory growth is a frequently cited reason for preferring C over

Lisp. In some places, fixing the problems that underlie this is virtually a

research topic because no one can even figure out what they'd want to write

down in order to advise the program about what to do. Controlled growth as

in vector-push-extend or hash-table-rehash-size is not in that camp. To the

extent that we have linguistic facilities begging for the opportunity to be

properly expressive, I see no reason to be vague--even if we're going to let

implementations do something a little different than what was asked for, we

should still define the language in such a way that the implementation at

least knows what was asked for. Both of the possible values (integers and

floats) are potentially meaningful in distinct ways, and trivial to

implement, so why blur their intent?