REBOL3 tracker
Ticket #0001302

Type: Issue | Status: waiting | Date: 27-Oct-2009 13:45
Version: alpha 92 | Category: Documentation | Submitted by: meijeru
Platform: All | Severity: minor | Priority: normal

Summary: Undocumented restrictions on allowed character combinations for words
Description: According to the documentation, a word can start with an alphabetic character (never mind the definition of that, but see #1230) or any of the following: !&*+-.<=>?_`|~
Subsequent characters can also be a digit or a single quote.

However, some combinations are apparently not allowed: a < or > sign inside a word, a word of more than two < or > signs, the combination <=>, and the combinations << and >> used as set-words. Any combination starting with < and continuing with something legal but different from = or > is taken as the start of a tag. I have not been able to explore this exhaustively, but if there is a system behind it, I would like to know.
Example code
>> length? [a<<]
== 2
>> type? quote <<<
** error!
>> type? quote a<
** error!
>> type? quote a>
** error!
>> type? quote <=>
** error!
>> type? quote <<:
** error!
>> type? <!>
== tag!
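
To make the mismatch concrete, the documented rule can be written down as a small predicate. What follows is only a Python sketch of the rule as paraphrased in the description, not of the R3 scanner. The name `allowed_by_docs` is made up for illustration, and `str.isalpha` stands in for the documentation's unspecified notion of "alphabetic".

```python
# Characters the documentation lists as legal at the start of a word,
# in addition to alphabetic characters.
DOC_SPECIAL = set("!&*+-.<=>?_`|~")


def allowed_by_docs(text):
    """Return True if `text` is a word according to the documented rule only."""
    def first_ok(ch):
        # Assumption: "alphabetic" is approximated by str.isalpha.
        return ch.isalpha() or ch in DOC_SPECIAL

    def rest_ok(ch):
        # Subsequent characters may also be an ASCII digit or a single quote.
        return first_ok(ch) or "0" <= ch <= "9" or ch == "'"

    return bool(text) and first_ok(text[0]) and all(rest_ok(c) for c in text[1:])


# Inputs from the example code that the scanner rejected or split,
# although the documented rule admits every one of them.
for candidate in ["<<<", "a<", "a>", "<=>", "a<<"]:
    print(candidate, allowed_by_docs(candidate))
```

Every candidate in the loop passes the documented rule, which is exactly the gap this ticket asks to have documented.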

Assigned to: n/a | Fixed in: - | Last Update: 18-Feb-2012 10:51


Comments
(0001704)
BrianH
27-Oct-2009 20:15

Some words are special-cased by the syntax scanner, but otherwise their characters conflict with other types and can't be used. These special cases need to be documented, and the characters that cause conflicts otherwise need to be removed from the list of standard accepted characters.
(0001717)
Carl
28-Oct-2009 17:55

I marked this as waiting for documentation changes.

As BrianH indicates, the matter is complicated by what the lexical analyzer will accept as a word. The < and > characters are overloaded as tag delimiters, so their use in words is a special case (for the comparison operators).

Most of the task is to make doc changes, but I do not rule out some extra checks in the code as well.
(0003110)
BrianH
17-Feb-2011 09:49

Here is a set of tested rules for the alpha 110 lexical analyzer of R3 words, at least for the ASCII range.

word-char-prefix: charset "+-."
word-char-first: charset ["!&*=?^^_`|~" #"A" - #"Z" #"a" - #"z"]
word-char-rest: charset ["!&'*+-.=?^^_`|~" #"0" - #"9" #"A" - #"Z" #"a" - #"z"]
word-follow: charset [0 - 32 {"()/;[]{}} 127]
word-char: complement union word-follow charset "@"
money-char: complement word-follow
word-follow-error: charset "#$%,<>\" ; See #537 for details
word-follow-arrow: charset [0 - 32 "]" 127] ; See #1317 for why "]" is here
word-follow-slash: union word-follow charset "+-.#$%,<>\" ; See #1856 for why +-. is here
digits: charset "0123456789"
word-rule: [
    some "/" [end | and word-follow-slash] |
    "<" [end | and word-follow-arrow] |
    a: ["," not digits | "\"] any word-char not "@" b: (
        cause-error 'syntax 'invalid reduce ["word" copy/part a b]
    ) |
    a: ["<>" | "<=" | ">=" | "<<" | ">>" | ">"]
    [end | and word-follow | some word-follow-error b: (
        cause-error 'syntax 'invalid reduce ["word" copy/part a b]
    )] |
    a: [
        word-char-prefix opt [word-char-first any word-char-rest] |
        word-char-first any word-char-rest
    ] [end | and word-follow | "$" any money-char b: (
        cause-error 'syntax 'invalid reduce ["money" copy/part a b]
    ) | some word-follow-error b: (
        cause-error 'syntax 'invalid reduce ["word" copy/part a b]
    )]
]

This is just for the syntax of the word! type. The word-rule will match correct words, and will trigger errors in the specific cases shown above.
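
For readers who do not want to trace the PARSE rule by hand, the accepting branches of word-rule can be approximated outside REBOL. The Python sketch below is a loose transliteration under several assumptions: it covers only the ASCII range, it ignores the follow-character checks (it tests a whole string rather than a position in a source text), and it simply returns False where word-rule would raise a syntax error. The names `ARROW_WORDS`, `SLASHES`, `ORDINARY` and `is_word` are illustrative and do not appear in the rules above.

```python
import re

# Special-cased words built from < and >, taken from the alternatives
# "<" and ["<>" | "<=" | ">=" | "<<" | ">>" | ">"] in word-rule.
ARROW_WORDS = {"<", ">", "<>", "<=", ">=", "<<", ">>"}

# some "/" : one or more slashes form a word on their own.
SLASHES = re.compile(r"/+")

# word-char-first and word-char-rest as regex character classes.
FIRST = r"[!&*=?^_`|~A-Za-z]"
REST = r"[!&'*+\-.=?^_`|~0-9A-Za-z]"

# word-char-prefix opt [word-char-first any word-char-rest]
#   | word-char-first any word-char-rest
ORDINARY = re.compile(rf"(?:[+\-.](?:{FIRST}{REST}*)?|{FIRST}{REST}*)")


def is_word(text):
    """Approximate check: would word-rule accept `text` as a complete word?"""
    return (
        text in ARROW_WORDS
        or SLASHES.fullmatch(text) is not None
        or ORDINARY.fullmatch(text) is not None
    )


for candidate in ["foo", "<=", "-a1", "a<", "<<<", "<=>"]:
    print(candidate, is_word(candidate))
```

Under this sketch `foo`, `<=` and `-a1` are accepted, while `a<`, `<<<` and `<=>` from the ticket are rejected. The sketch does not distinguish a plain non-match from the cases where word-rule calls cause-error.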

Note: See #1230, #1477, #1478, #1855 and #1856 for the syntax bugs of the other word types.
(0003220)
Ladislav
18-Feb-2012 10:51

See also #1229.

Date | User | Field | Action | Change
18-Feb-2012 10:51 | Ladislav | Comment : 0003220 | Added | -
18-Feb-2012 03:00 | BrianH | Comment : 0003110 | Modified | -
29-Mar-2011 11:32 | BrianH | Comment : 0003110 | Modified | -
29-Mar-2011 11:01 | BrianH | Comment : 0003110 | Modified | -
29-Mar-2011 04:25 | BrianH | Comment : 0003110 | Modified | -
29-Mar-2011 04:20 | BrianH | Comment : 0003110 | Modified | -
29-Mar-2011 01:59 | BrianH | Comment : 0003110 | Modified | -
17-Feb-2011 22:22 | BrianH | Comment : 0003110 | Modified | -
17-Feb-2011 10:48 | BrianH | Comment : 0003110 | Modified | -
17-Feb-2011 10:48 | BrianH | Comment : 0003110 | Modified | -
17-Feb-2011 09:49 | BrianH | Comment : 0003110 | Modified | -
17-Feb-2011 09:49 | BrianH | Comment : 0003110 | Added | -
28-Oct-2009 17:55 | carl | Comment : 0001717 | Added | -
28-Oct-2009 17:49 | carl | Status | Modified | reviewed => waiting
27-Oct-2009 20:15 | BrianH | Comment : 0001704 | Added | -
27-Oct-2009 20:12 | BrianH | Type | Modified | Note => Issue
27-Oct-2009 20:12 | BrianH | Status | Modified | submitted => reviewed
27-Oct-2009 13:49 | meijeru | Description | Modified | -
27-Oct-2009 13:48 | meijeru | Code | Modified | -
27-Oct-2009 13:48 | meijeru | Description | Modified | -
27-Oct-2009 13:45 | meijeru | Ticket | Added | -