Hacker News
Ungrammar (rust-analyzer.github.io)
152 points by todsacerdoti on Oct 24, 2020 | 14 comments


Very interesting. I have actually been working for a few days now on something closely related: generating syntax tree data structures (in Swift) from a grammar (the actual grammar used for parsing). What I found very useful is to distinguish between hidden, auxiliary and visible symbols. Hidden symbols don't appear in the tree, visible symbols appear as normal, and for auxiliary symbols only their children appear in the syntax tree. Seems to work very nicely so far. This way, left or right recursion, for example, doesn't matter if your recursion symbol is auxiliary, because it will just be flattened into a list anyway. Therefore I can use regular expressions in a grammar by desugaring them in the usual way into context-free rules over auxiliary symbols, and still have a sensible syntax tree generated for that.
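To make the hidden/auxiliary/visible idea concrete, here is a minimal sketch in Python (not the commenter's Swift code). The names `Visibility`, `Node` and `build_tree` are hypothetical, and it assumes that a hidden symbol drops its whole subtree, which the comment does not spell out. It shows a right-recursive and a left-recursive parse of `a, b, c` collapsing to the same flat list once the recursion symbol is auxiliary.

```python
from dataclasses import dataclass, field
from enum import Enum


class Visibility(Enum):
    HIDDEN = "hidden"        # assumption: the node and its subtree are dropped
    AUXILIARY = "auxiliary"  # the node is dropped, its children move up to the parent
    VISIBLE = "visible"      # the node is kept as-is


@dataclass
class Node:
    symbol: str
    text: str = ""
    children: list = field(default_factory=list)


def build_tree(node, visibility):
    """Return the list of syntax tree nodes that `node` contributes to its parent."""
    kind = visibility.get(node.symbol, Visibility.VISIBLE)
    if kind is Visibility.HIDDEN:
        return []
    kept = [n for child in node.children for n in build_tree(child, visibility)]
    if kind is Visibility.AUXILIARY:
        return kept  # splice the children into the parent
    return [Node(node.symbol, node.text, kept)]


# Hypothetical symbol classification for a comma-separated list.
VISIBILITY = {"Rest": Visibility.AUXILIARY, "Comma": Visibility.HIDDEN}

# Right-recursive parse: List -> Item Rest ; Rest -> "," Item Rest | empty
right_recursive = Node("List", children=[
    Node("Item", "a"),
    Node("Rest", children=[
        Node("Comma", ","), Node("Item", "b"),
        Node("Rest", children=[
            Node("Comma", ","), Node("Item", "c"), Node("Rest"),
        ]),
    ]),
])

# Left-recursive parse: List -> Rest ; Rest -> Rest "," Item | Item
left_recursive = Node("List", children=[
    Node("Rest", children=[
        Node("Rest", children=[
            Node("Rest", children=[Node("Item", "a")]),
            Node("Comma", ","), Node("Item", "b"),
        ]),
        Node("Comma", ","), Node("Item", "c"),
    ]),
])

flat_right = build_tree(right_recursive, VISIBILITY)[0]
flat_left = build_tree(left_recursive, VISIBILITY)[0]
```

Both `flat_right` and `flat_left` end up as a single `List` node with the three `Item` children in source order, so the shape of the recursion no longer shows in the tree.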


Left or right recursion makes a semantic difference for non-associative operators like subtraction.


In those cases I'd suggest you don't make the corresponding symbol auxiliary.
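A small Python illustration of why the grouping matters once the tree is evaluated (the helper names `fold_left` and `fold_right` are made up for this example): the same flat operand list gives different results for subtraction depending on which way it is folded, while addition does not care.

```python
import operator
from functools import reduce


def fold_left(operands, op):
    """Group as ((a op b) op c), which is what left recursion encodes."""
    return reduce(op, operands)


def fold_right(operands, op):
    """Group as (a op (b op c)), which is what right recursion encodes."""
    result = operands[-1]
    for value in reversed(operands[:-1]):
        result = op(value, result)
    return result


operands = [10, 3, 2]

sub_left = fold_left(operands, operator.sub)    # (10 - 3) - 2 = 5
sub_right = fold_right(operands, operator.sub)  # 10 - (3 - 2) = 9

add_left = fold_left(operands, operator.add)    # (10 + 3) + 2 = 15
add_right = fold_right(operands, operator.add)  # 10 + (3 + 2) = 15
```

So flattening is harmless for things like argument lists, but for an operator such as subtraction the nesting carries meaning and has to survive into the tree.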


Wasn't this what ASDL did, way back from the Zephyr project?

https://www.cs.princeton.edu/research/techreps/TR-554-97

I think Python is still using this.


It's in the post very briefly:

If you’ve heard about ASDL, ungrammar is ASDL for concrete syntax trees.

https://www.oilshell.org/blog/2016/12/11.html


That will show me to read fully. :/


Is it possible to take some extended BNF and standard nonterminal names (stat, funcname, etc.) then output some Hello, World! example from it?

It would be really cool to just generate the look and feel of a programming language just using extended BNF.


Yes, grammars can be used for both string matching and string generation. You could generate strings from an ungrammar, but you would need a way to generate valid tokens too.
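As a rough sketch of what string generation could look like, here is a toy Python generator. The grammar, the token generators and the `generate` function are all invented for illustration and are not taken from ungrammar or any real language. It assumes that the first alternative of every rule is non-recursive, so that generation can be forced to terminate once a depth limit is reached.

```python
import random

# Toy grammar: nonterminal -> list of alternatives, each a list of symbols.
# Assumption: the first alternative of every rule is non-recursive.
GRAMMAR = {
    "program": [["stat"]],
    "stat": [["funcname", "(", "string", ")"]],
    "funcname": [["NAME"], ["NAME", ".", "funcname"]],
    "string": [["STRING"]],
}

NAMES = ["print", "io", "write"]

# The grammar only names the token kinds, so valid token text
# has to come from somewhere else. These generators are made up.
TOKENS = {
    "NAME": lambda rng: rng.choice(NAMES),
    "STRING": lambda rng: '"Hello, World!"',
}


def generate(symbol, rng, depth=0, max_depth=5):
    """Expand `symbol` into a list of token strings."""
    if symbol in TOKENS:
        return [TOKENS[symbol](rng)]
    if symbol not in GRAMMAR:
        return [symbol]  # a literal such as "(" or "."
    alternatives = GRAMMAR[symbol]
    # Past the depth limit, always take the first (non-recursive) alternative.
    chosen = alternatives[0] if depth >= max_depth else rng.choice(alternatives)
    tokens = []
    for part in chosen:
        tokens.extend(generate(part, rng, depth + 1, max_depth))
    return tokens


rng = random.Random(0)  # seeded so that repeated runs give the same sample
sample = "".join(generate("program", rng))
```

`sample` is a dotted function name followed by `("Hello, World!")`. Getting the "look and feel" of a real language would additionally need sensible whitespace and more realistic token generators.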


Can someone ELI5? Is it just a parser for Rust's concrete syntax tree?


It's a generator for the parse tree nodes which the parser uses to construct a concrete syntax tree.


AFAICT it's not even a parser:

   In rust-analyzer, it is paired with a hand-written parser.

I think it's just a convention for how to represent concrete syntax trees as Rust abstract datatypes.
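One way to picture "a description of tree nodes, separate from the parser" is a typed view over an untyped tree. The sketch below is Python rather than Rust, and every name in it (`SyntaxNode`, `BinExpr`, the kind strings) is hypothetical; it is not rust-analyzer's actual API. It assumes a rule stating that a binary expression consists of an expression, an operator token and another expression. The parser, hand-written or otherwise, only has to produce the untyped nodes.

```python
from dataclasses import dataclass, field

# Hypothetical node kinds for a tiny expression language.
EXPR_KINDS = {"LITERAL", "BIN_EXPR"}
OP_KINDS = {"PLUS", "MINUS"}


@dataclass
class SyntaxNode:
    """Untyped concrete tree: keeps every token, including whitespace."""
    kind: str
    text: str = ""
    children: list = field(default_factory=list)

    def source_text(self):
        # Concatenating the leaves reproduces the original source text.
        if not self.children:
            return self.text
        return "".join(child.source_text() for child in self.children)


class BinExpr:
    """Typed view over a BIN_EXPR node, as a generator might emit it."""

    def __init__(self, syntax):
        if syntax.kind != "BIN_EXPR":
            raise ValueError("expected a BIN_EXPR node")
        self.syntax = syntax

    def _operands(self):
        return [c for c in self.syntax.children if c.kind in EXPR_KINDS]

    def lhs(self):
        operands = self._operands()
        return operands[0] if len(operands) > 0 else None

    def op(self):
        return next((c for c in self.syntax.children if c.kind in OP_KINDS), None)

    def rhs(self):
        # Accessors return None for missing parts, e.g. while the user is typing.
        operands = self._operands()
        return operands[1] if len(operands) > 1 else None


complete = SyntaxNode("BIN_EXPR", children=[
    SyntaxNode("LITERAL", "1"), SyntaxNode("WHITESPACE", " "),
    SyntaxNode("PLUS", "+"), SyntaxNode("WHITESPACE", " "),
    SyntaxNode("LITERAL", "2"),
])
incomplete = SyntaxNode("BIN_EXPR", children=[
    SyntaxNode("LITERAL", "1"), SyntaxNode("PLUS", "+"),
])
```

The typed wrapper says nothing about how the text was parsed. It only names the children a node of that kind may have, which is the sense in which a description like this differs from a grammar used for parsing.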


... that is, is it a semantic as opposed to a syntactic model?


Cool project, really weird and off-putting branding.

I'd call this a "concrete grammar". Calling something that's a grammar an "ungrammar" is cutesy. It feels lazy, like they stopped at the first idea they had.


I didn’t get that impression



