ISSUE: Change Comment Characters Counter Proposal

REVISION HISTORY:  2-17-94, William Lott (Gwydion)
                   2-20-94, Scott Fahlman (Gwydion)
                   2-21-94, William Lott (Gwydion)

RELATED ISSUES:
Change Comment Characters
Restrict Characters in Variable Names
Restrict Infix Operators

CATEGORY: Change

PROBLEM DESCRIPTION:

The Change Comment Characters issue adopts C and C++ style comments, except
that it says:

    All text in a multi-line comment must be syntactically valid as Dylan
    lexical tokens.

This was added so that "/*" and "*/" embedded in strings and other literals
would not confuse the comment-parsing machinery.  However, this restriction
creates several major problems:

1. This will greatly confuse C and C++ programmers who will expect to be
able to say things like:

    /* Compute the window's title. */

which is not a valid sequence of Dylan lexical tokens because of the lone
apostrophe.  The confusion will be all the greater because the comment
convention now looks like C but behaves very differently in a common
situation.

2. An important use for multi-line comments is to comment out chunks of
code that are incomplete, under construction, or perhaps have just been
imported from some other language and have not yet been translated to
Dylan.  In all of these cases, the commented-out text may contain
material that is not syntactically valid as Dylan lexical tokens.

PROPOSAL:

Remove the #<space>, ##, and #{...#} comment conventions.

If // appears where a token would otherwise start, the // and everything
following it on the current line is treated as whitespace, including any
additional uses of //, /*, and */.

If /* appears where a token would otherwise start, it and everything up to
and including the next appearance of */ is treated as whitespace.  In
scanning for the end of the comment, no attempt is made to tokenize the
material passed over.  The scan is sensitive to only three character
sequences:

   //   Ignore everything, including any additional //, /*, and */, up to
        the next newline.

   /*   Begin an embedded comment.  Recursively scan for the end of this
        comment, then continue scanning for the end of the outer comment.

   */   End of the comment.
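
The scanning rule above can be sketched as follows.  This is only an
illustration, written in Python rather than Dylan; the function name
skip_block_comment and the use of ValueError for an unterminated comment
are assumptions made for the sketch, not part of the proposal.

```python
def skip_block_comment(text: str, start: int) -> int:
    """Return the index just past the */ that closes the /* at text[start].

    Hypothetical helper: it looks only for //, /* and */ and never
    tokenizes the text it passes over.
    """
    if not text.startswith("/*", start):
        raise ValueError("no /* at start position")
    depth = 1            # the opening /* has already been seen
    i = start + 2
    while i < len(text):
        if text.startswith("//", i):
            # Ignore the rest of the line, including any /* or */ on it.
            newline = text.find("\n", i)
            if newline == -1:
                break
            i = newline + 1
        elif text.startswith("/*", i):
            depth += 1   # begin an embedded comment
            i += 2
        elif text.startswith("*/", i):
            depth -= 1   # end of the innermost open comment
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated /* comment")


src = "/* Compute the window's title. */ x := 1"
print(repr(src[skip_block_comment(src, 0):]))
```

Because nothing is tokenized, the apostrophe in the sample comment is
passed over like any other character.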

OPTIONAL AMENDMENT:

In addition to the above, retain #{ and #} as comment delimiters
specifically for code.

    #{ and #} must appear in matched pairs where a token could otherwise
    appear.

    Everything between the #{ and the matching #} is treated as
    whitespace.

    /*...*/ and // comments take precedence over #{ and #} comments.

Note that because #{ and #} can only appear where tokens can start, the
presence of the characters ``#{'' or ``#}'' in a string will not cause any
problems in finding the #} that corresponds to a particular #{.
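
A rough sketch of the matching rule in the amendment, again in Python and
under loudly simplifying assumptions: "where a token could start" is
approximated as "anywhere outside a string literal or a // or /*...*/
comment", only double-quoted strings with backslash escapes are
recognized, and character literals are not handled at all.  All function
names are hypothetical.

```python
def _skip_string(text: str, start: int) -> int:
    """Return the index just past the string literal opening at text[start]."""
    i = start + 1
    while i < len(text):
        if text[i] == "\\":
            i += 2                      # skip the escaped character
        elif text[i] == '"':
            return i + 1
        else:
            i += 1
    raise ValueError("unterminated string literal")


def _skip_block(text: str, start: int) -> int:
    """Return the index just past the (possibly nested) /*...*/ at start."""
    depth, i = 0, start
    while i < len(text):
        if text.startswith("//", i):
            newline = text.find("\n", i)
            if newline == -1:
                break
            i = newline + 1
        elif text.startswith("/*", i):
            depth, i = depth + 1, i + 2
        elif text.startswith("*/", i):
            depth, i = depth - 1, i + 2
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated /* comment")


def skip_code_comment(text: str, start: int) -> int:
    """Return the index just past the #} matching the #{ at text[start]."""
    if not text.startswith("#{", start):
        raise ValueError("no #{ at start position")
    depth, i = 0, start
    while i < len(text):
        if text.startswith("//", i):
            # // takes precedence: any #{ or #} on this line is ignored.
            newline = text.find("\n", i)
            i = len(text) if newline == -1 else newline + 1
        elif text.startswith("/*", i):
            i = _skip_block(text, i)    # /*...*/ takes precedence as well
        elif text[i] == '"':
            i = _skip_string(text, i)   # "#}" inside a string is not a token
        elif text.startswith("#{", i):
            depth, i = depth + 1, i + 2
        elif text.startswith("#}", i):
            depth, i = depth - 1, i + 2
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated #{ comment")


src = '#{ print("#}"); #} rest'
print(repr(src[skip_code_comment(src, 0):]))
```

Unlike the /*...*/ scan, this one has to recognize literals, which is
why the amendment is expected to cost about as much to implement as the
earlier comment proposal.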

RATIONALE:

The restriction that /* and */ can only wrap syntactically valid Dylan code
means that they can *only* be used to disable sections of legal Dylan code.
They cannot be used for multi-line comments, and they cannot be used to wrap
ill-formed code fragments.  Both of these uses are very important, and some
mechanism is needed for them.

The proposal above allows // and /*...*/ to be used as in C and C++, with
the additional ability to nest /*...*/ comments in almost all situations.

Given that they look the same as C/C++ comments, many (most?) C/C++
programmers will expect them to behave as in C/C++.  Nesting of /*...*/
comments will not surprise these programmers, but the inability to use them
for general commenting will.

We believe that the original proposal goes overboard in trying to make the
multi-line commenting convention "do the right thing" in all conceivable
cases.  The cases listed above will be very common, and the presence of /*
or */ in literals will be very rare.  In this proposal, we sacrifice the
rare case and handle the common one.  We are not aware of any other
language that tries to ignore the comment-closing sequence if it appears
inside a literal in commented-out code.

If users do have /* or */ embedded in code that they want to comment out as
a block, they will have to find each occurrence and hide it individually
with //.

If the other Dylan designers feel a compelling need for a safe mechanism
for commenting out chunks of Dylan code that are believed to be
well-formed, we believe that an alternative mechanism should be
provided for this purpose that does not involve the C-like /* and
*/ notation.  The optional amendment allows for this.

EXAMPLES:

    /*
     * multi-line comment,
     * of a style I've seen
     * in lots of C code.
     */

    if (~x)
      // Default x if unsupplied.
      x := compute_default()
    end

The examples below assume the optional amendment:

    // This conditional is unnecessary.
    #{ if (some_condition) #}
      do_something()
    #{ end #}

    #{ // This stuff doesn't work.
    lots();
    of();
    stuff()
    #}

    #{ // example of nested #{ and #}
    define method foo (x, y, z)
      #{ // This stuff doesn't work.
      lots();
      of();
      stuff();
      #}
      some_stuff_that_works()
    end;
    #}

COST TO IMPLEMENTORS:

The basic proposal should be easier to implement than the earlier
comment proposal, which requires tokenizing of the commented-out material.
With the amendment, it should be roughly equivalent work.

COST TO USERS:

Occasional problems and puzzlement when trying to comment out code that
happens to contain "*/" and "/*" within literals.

The amendment, if adopted, would require users to learn a third comment
notation.

BENEFITS:

Won't confuse C and C++ programmers, who will expect the familiar /*...*/
notation to behave as it does in C and C++.

Preserves the ability to create multi-line comments containing text that
would confuse the Dylan tokenizer.

Preserves the ability to disable sections of ill-formed code.

The optional amendment preserves the ability to safely comment out chunks
of well-formed Dylan code.

AESTHETICS:

In the eye of the beholder.  Some feel that the #{ and #} conventions are
ugly, especially if the symmetrical pair /* and */ are also present.  Of
course, the choice of #{ and #} in the amendment was arbitrary, and some
other sequence could be used there instead.

IMPLEMENTATION NOTES:

FUTURE FEATURES:

DISCUSSION:

VOTES: