Formal grammar for Data Jevko

Darius J Chuck

2021-12-10

The formal grammar below, written in ABNF, describes Data Jevko precisely enough to build or generate a parser for it.

Value = List / Keyed / Primitive / Emptiness
List = 1*(Blanks Subvalue) Blanks
Keyed = 1*(Blanks Key Blanks Subvalue) Blanks
Primitive = 1*Symbol
Emptiness = ""
Subvalue = "[" Value "]"
Key = Nonblank [*Symbol Nonblank]
Blanks = *(%x20 / %x0a / %x0d / %x09) ; (1) 
Symbol = Escape / %x0-5a / %x5e-d7ff / %xe000-10ffff ; (2)
Nonblank = Escape / %x21-5a / %x5e-d7ff / %xe000-10ffff ; (3)
Escape = "\" ("\" / "[" / "]")

Lines that end with a number in parentheses are explained below:

  1. Blanks means zero or more occurrences of space, newline, carriage return, or horizontal tab (" " / \n / \r / \t).
  2. Symbol means an Escape sequence or any Unicode code point except [ / ] / \. The surrogate range U+D800 to U+DFFF is also excluded.
  3. Nonblank means Symbol minus space and the control characters below it, i.e. minus U+0000 to U+0020.
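
As an illustration of how the rules above fit together, here is a minimal Python sketch of a hand-written parser. It is not a reference implementation. The names parse, DataJevkoError, and the helper functions are hypothetical, and the mapping to Python values (List to list, Keyed to dict, Primitive and Emptiness to str) is an assumption made for this example only. The sketch also assumes the input contains no lone surrogates, and a repeated key simply overwrites the earlier one.

```python
BLANKS = " \n\r\t"  # the four characters allowed by the Blanks rule


class DataJevkoError(ValueError):
    """Raised when the input does not match the Data Jevko grammar."""


def parse(text):
    """Parse a whole document as a Value."""
    value, pos = _parse_value(text, 0)
    if pos != len(text):
        raise DataJevkoError(f"unexpected ']' at index {pos}")
    return value


def _parse_value(text, pos):
    """Read text chunks and subvalues until ']' or end of input."""
    prefixes, children, chunk = [], [], []
    while pos < len(text):
        ch = text[pos]
        if ch == "\\":
            # Escape = "\" ("\" / "[" / "]")
            if pos + 1 >= len(text) or text[pos + 1] not in "\\[]":
                raise DataJevkoError(f"invalid escape at index {pos}")
            chunk.append(text[pos + 1])
            pos += 2
        elif ch == "[":
            # Subvalue = "[" Value "]"
            prefixes.append("".join(chunk))
            chunk = []
            child, pos = _parse_value(text, pos + 1)
            if pos >= len(text):
                raise DataJevkoError("missing ']'")
            children.append(child)
            pos += 1  # skip the closing "]"
        elif ch == "]":
            break
        else:
            chunk.append(ch)
            pos += 1
    return _classify(prefixes, children, "".join(chunk)), pos


def _classify(prefixes, children, suffix):
    """Decide which alternative of Value the collected parts match."""
    if not children:
        return suffix  # Primitive, or Emptiness when suffix == ""
    if suffix.strip(BLANKS):
        raise DataJevkoError("only blanks may follow the last subvalue")
    keys = [p.strip(BLANKS) for p in prefixes]
    if not any(keys):
        return children  # List: every subvalue is preceded only by Blanks
    if all(keys):
        for key in keys:
            # Key must start and end with a Nonblank (above U+0020).
            if ord(key[0]) <= 0x20 or ord(key[-1]) <= 0x20:
                raise DataJevkoError(f"invalid key {key!r}")
        return dict(zip(keys, children))  # Keyed
    raise DataJevkoError("cannot mix keyed and unkeyed subvalues")
```

With this sketch, parse(" name [Jevko] tags [[x][y]] ") yields {"name": "Jevko", "tags": ["x", "y"]}, while parse("[a]b") raises DataJevkoError because only Blanks may follow the last Subvalue.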

Jevko as a strict superset

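To show that the grammar above describes a strict subset of Jevko, we can relax the Value rule to the following, which every alternative of the original Value rule still matches, because Blanks and Key consist only of Symbols: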

Value = *(Subvalue / Primitive)

The whole grammar then reduces to:

Value = *(Subvalue / Primitive)
Primitive = 1*Symbol
Subvalue = "[" Value "]"
Symbol = Escape / %x0-5a / %x5e-d7ff / %xe000-10ffff
Escape = "\" ("\" / "[" / "]")

which is equivalent to Jevko.
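
Because any Symbol may appear anywhere in the reduced grammar, a string matches it exactly when its escapes are valid and its unescaped brackets are balanced. A minimal Python sketch of such a recognizer follows. The function name is_jevko is hypothetical, and the sketch assumes the input contains no lone surrogates.

```python
def is_jevko(text: str) -> bool:
    """Return True if text matches the reduced grammar."""
    depth = 0  # number of currently open "[" brackets
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            # Escape = "\" ("\" / "[" / "]")
            if i + 1 >= len(text) or text[i + 1] not in "\\[]":
                return False
            i += 2
            continue
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                return False  # "]" without a matching "["
        i += 1
    return depth == 0  # every "[" must be closed
```

For example, is_jevko("key [value]") and is_jevko("[a]b") are both True, while is_jevko("[a") is False. The input "[a]b" matches the reduced grammar but not the Data Jevko grammar, where only Blanks may follow the last Subvalue, which is why the subset is strict.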