For industrial and production settings, Dhall is my favourite. You offload to the language / type checker a lot of tribal knowledge that would usually end up ill-specified in a README or learnt through countless deployment incidents.
Creating languages is fun, but writing code in a half-assed language created by a brilliant but not very experienced engineer, with no real support for anything, is not! Then, when this engineer leaves your company, you are left with a mess to clean up...
IMO a better strategy would be to reduce config files to the absolute minimum and let the code that reads them handle all the complexity. Whatever your main language is, it is better supported than any of this.
Yeah at one job I worked, a smart but then-inexperienced guy had developed a language, implemented a transpiler, written a bunch of tooling in the language, and left the company. We'll call him "Steve". By the time I got there it had been largely replaced, but one still heard rumors about the dreaded "Stevecode".
I wrote a little mini language[1] for a project last year (I only inflicted it on myself, not any poor user). It served its purpose well: it let me quickly implement people's requests in a way that was sandboxed/isolated from other people's accounts, it was impossible to take the system down no matter how bad the bugs, and I could implement these requests without redeploying the underlying system. In that respect, it was a huge success (and that's what it was for). But in every other way, I think it was a failure: I came to hate writing code in it, despite having designed it. It had missing language features that I never got around to implementing and strange bugs that I didn't have time to fix (so I just avoided doing those things), and nobody but me could use it. (The project was just me developing, but the users couldn't use it either, despite having access to it: it replaced a simple but very limited command language, but in doing so added too much real programming.) So while it succeeded in what it set out to do, it wasn't a sustainable or future-proof solution. I did learn what users actually wanted, though, and got to test/prototype it before adding it to the underlying system. I'm also working on a declarative replacement now (basically a TOML file where values can be expressions -- a template system) that supports everything the users actually want, but is vastly simpler and has much less custom logic.
It was a great learning experience though and I'm glad I did it, both in terms of learning how to write grammars (I used Clojure's Instaparse) and interpreters, and in learning about the users' needs and wants.
But I shudder to think what it would have been like if anybody else had had to use it...
[1] Some interesting (IMHO) technical details: it was a synchronous language (of the transformational system variety), meaning that while it was executing, time was essentially frozen: it would gather all of its inputs, then evaluate the code as a pure transformation of its inputs (made it very easy to interpret!) and then write its output. The inputs are essentially immutable while the evaluation is happening. This was run in a database transaction for consistency. Each individual script was basically stop-the-world non-concurrent, but the scripts were independent and many could execute at once. I'm a very big fan of event driven synchronous languages.
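The gather / evaluate / write cycle described in that footnote can be sketched roughly as follows. This is only an illustration in Python, not the original implementation (which is not shown in the thread): `run_tick`, `store`, and `add_script` are hypothetical names, and a plain dict stands in for the database transaction.

```python
from types import MappingProxyType


def run_tick(store, input_keys, script):
    """Run one synchronous step: gather inputs, evaluate, write outputs."""
    # 1. Gather: snapshot the inputs into a read-only view so the script
    #    cannot mutate them while it is being evaluated.
    inputs = MappingProxyType({key: store[key] for key in input_keys})
    # 2. Evaluate: the script is a pure function from inputs to outputs.
    outputs = script(inputs)
    # 3. Write: apply all outputs only after evaluation has finished.
    #    (Assumption: a real system would do this inside a transaction.)
    store.update(outputs)
    return outputs


def add_script(inputs):
    # Hypothetical user script: a pure transformation of its inputs.
    return {"total": inputs["a"] + inputs["b"]}


store = {"a": 2, "b": 3}
run_tick(store, ["a", "b"], add_script)
```

Because the script only ever sees a frozen snapshot and returns plain data, interpreting it stays simple, and independent scripts can run side by side without observing each other's partial writes.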
Thanks for sharing. There are often good reasons for creating DSLs, internal or external.
However, I'd argue that I haven't seen many cases where configuration DSLs were well justified.
By configuration here I mean something that describes the differences between deployed instances of the same code. If you take all deployed instances of an existing system and work out the minimal amount of configuration needed to describe them, you will usually find that very little information, just a few lines, is enough.
The rest of the configuration file complexity comes from the designer's attempts to predict future extensions. My point is that these attempts should be resisted for the sake of future generations of maintainers.
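As a rough illustration of the "few lines of config, complexity in code" idea, here is a minimal Python sketch. The `Settings` fields, their defaults, and `load_settings` are made up for the example; the assumption is that only the values that actually differ between instances live in the file.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # Only `region` must be supplied per instance (hypothetical fields);
    # everything else has a default that lives in code, not in the file.
    region: str
    replicas: int = 2
    log_level: str = "info"


def load_settings(text):
    """Parse a tiny JSON config; unknown keys fail loudly with TypeError."""
    return Settings(**json.loads(text))


settings = load_settings('{"region": "eu-west-1"}')
```

The per-instance file stays at one line here, and anything the file does not mention is decided, typed, and documented in the main language.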
What do you think of the HashiCorp configuration language (HCL)?
Overall, I completely agree with you, but I do find that HCL seems to be a pretty good fit for what it's used for (at least for Terraform, which I have more experience with than their other tools).
My little story was really just me dumping my thoughts after reading about Stevecode, as someone who made a little language (not a configuration language though); it isn't an argument for doing so at all. I'm definitely in favour of sticking with an existing format for configuration (personally, I like to use TOML or EDN for my own projects' configuration needs, or JSON if it's not normally meant to be edited by humans), and even for scripting I'd normally stick with Lua or JavaScript. My language came into existence because neither of those was sandboxed in the way I wanted (it was also deterministic and guaranteed to halt, as there were no unbounded loops).
I think you are conflating several distinct concerns.
1. Using a file format that depends on a particular engineer
2. Static vs dynamic configuration
3. Volume of configuration
For example, you can reduce your configuration to a minimal set of files which use a standard dynamic configuration format (e.g., starlark).
Note that the principle of lesser power should guide your decision between a static and a dynamic config format; however, if you find yourself building dynamism atop YAML (e.g., CloudFormation) or doing text templating of YAML (e.g., Helm), you’re misunderstanding the principle and suffering unnecessarily.
Starlark is not configuration, it's a Turing complete "feature-challenged" programming language. If this is what you mean by dynamic configuration, this looks like an unfortunate outcome of someone's urge to try on humans what they learned in the PL class (or in SW methodology class :-) ). After suffering a few thousand lines of a bloody mess created in this language, I can certainly say that Python or Java would have been much better in this context. Pants (Twitter's version of Bazel) looks like a much better-architected system in this respect.
The problem with what you call the principle of lesser power is that people fail to accurately predict the "suitability" of their solution and choose the wrong tool for the job. They start with "I just want to set a couple of variables" and end up with full-scale development system with IDE, debugger and profiler, badly designed and partially implemented.
This is pretty smug, but it seems like you don't understand the principle of least privilege at all. Notably, it exists to prevent exactly the scenario you describe: 'They start with "I just want to set a couple of variables" and end up with full-scale development system with IDE, debugger and profiler, badly designed and partially implemented.'
> Starlark is not configuration, it's a Turing complete "feature-challenged" programming language.
It's both. Programs are configuration. Not all configuration can be expressed statically except by expressing the program that derives it. That's what we mean by "dynamic configuration". Further, for the purposes of configuring most common applications, we want the language to be free of side effects ("feature-challenged" as you call it) because we want the execution to be reproducible--note that this is also an application of the principle of least privilege. There are also lots of practical reasons not to use Python or the JVM, notably the dependencies on big bulky runtimes.
> After suffering a few thousand lines of a bloody mess created in this language, I can certainly say that Python or Java would have been much better in this context. Pants (Twitter's version of Bazel) looks like a much better-architected system in this respect.
I've used Pants in vain and I wasn't able to discern much 'architecture' in the original version. The rewrite seems to be better thought-through, benefiting from the experience of the initial version, but time will tell how it fares. That said, Bazel also left a lot to be desired in the way of extensibility, usability, quality, documentation, etc. the last time I looked into it. In either case, the configuration language isn't the problem with either system. That said, as I respond to this comment, I'm building my own build system that is dramatically simpler and friendlier than existing systems, and it also uses Starlark as its configuration language (only because it's familiar and the most straightforward to integrate).
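On the "programs are configuration" point, here is a small sketch of what side-effect-free dynamic configuration can look like. It is written in plain Python rather than Starlark, and `shard_configs` and its parameters are hypothetical; the only claim is that the output is a pure function of the arguments, so evaluating it twice gives the same result.

```python
from typing import Dict, List


def shard_configs(service: str, shards: int, base_port: int) -> List[Dict]:
    """Derive per-shard settings instead of writing each entry by hand."""
    # No I/O, clock, environment, or randomness: the result depends only
    # on the arguments, which is what makes the config reproducible.
    return [
        {"name": f"{service}-{index}", "port": base_port + index}
        for index in range(shards)
    ]


configs = shard_configs("search", shards=3, base_port=9000)
```

A static file would need three near-identical entries here, and thirty for thirty shards; the derived form keeps the single fact (the shard count) in one place.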
We can definitely have deterministic Turing-complete programs. Do you mean you want totality? I agree that's nice, but in practice, being free from side-effects is likely more important than ensuring the program halts.
This is something I often hear about both configuration and validation languages, but why? You are evidently OK with loading programs that may or may not crash your computer, so the crashes just happen at the configuration step first.
EDN with a few added reader tags is great for configuration. I love Duct[1] configs: it's just EDN, but it adds tags for including other files, reading environment variables, and referencing other keys in the config (that last one is added by Integrant[2], not Duct itself): https://github.com/duct-framework/duct/wiki/Configuration
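To show the general idea of tagged values in otherwise plain data, here is a loose Python analogue. It is not Duct's or Integrant's API: the `Tagged` wrapper, the tag names "env" and "ref", and `resolve` are all invented for this sketch, and a "ref" here simply copies another key's resolved value (with no cycle detection).

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Tagged:
    tag: str    # hypothetical tag name: "env" or "ref"
    value: Any  # the tag's argument, e.g. a variable name or a config key


def resolve(node, root, env):
    """Walk plain data and replace tagged values with what they point at."""
    if isinstance(node, Tagged):
        if node.tag == "env":
            return env[node.value]  # look up an environment variable
        if node.tag == "ref":
            return resolve(root[node.value], root, env)  # follow the key
        raise ValueError(f"unknown tag: {node.tag}")
    if isinstance(node, dict):
        return {key: resolve(value, root, env) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(item, root, env) for item in node]
    return node


config = {
    "db-url": Tagged("env", "DATABASE_URL"),
    "handler": {"db": Tagged("ref", "db-url")},
}
fake_env = {"DATABASE_URL": "postgres://localhost/app"}
resolved = resolve(config, config, fake_env)
```

The config itself stays ordinary data; the only "language" added is a fixed, small set of tags that the loader knows how to expand.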
Author here, thanks for the feedback. This is v. much a work-in-progress, I'm in the process of trying to build something non-trivial with it. Just a couple of things to address in general from the comments so far:
- I'm aware of dhall, jsonnet and friends, and I think they're cool, but the most important thing for me is that this is just js - that means nothing new to learn + you can use it from front end code + free linters etc.
- I think the to-html bit is important, this is something I tangentially talked about in more detail here - https://leontrolski.github.io/dom-syntax.html. There was some discussion on hacker news about that one too, again, I think the important "feature" is that it's just js. Again, to me that's really important, so hiccup, jsx, etc don't cut it.
To clarify, the non-trivial thing I'm building is more for exploring the "unified templating language for backend and frontend" bit. I want to be able to use the same code to:
By now I shouldn't be, but I'm always a little surprised when people come to the conclusion that the best way to represent HTML is JSON or Javascript (or anything but HTML).
Why is XML a better way to represent HTML than any other hierarchical data representation language? “JSON doesn’t have Multiline strings” would be a good argument, but there are lots of good reasons that XML is a bad format (overly verbose, overly complicated, parsers frequently have security vulnerabilities, etc etc).
Element attributes are awkward to express in JSON, at the very least. Maybe if the markup language was originally designed around json it would be a more natural fit.
Why so? In the web world we represent most of our non-UI structural data in JSON. HTML is also structural data, so what makes it so special that its own serialization format is needed? Or why don't we represent abstract business data as HTML, even when it has a similar structure?
Therefore I think it's very understandable why people come to that conclusion so often.
For people who claim this, most of the time another goal is having "programmable components". Also, there can always be ergonomic/stylistic improvements, like pug.
I do think that something that adds just enough to JSON to make it more usable for config (comments, imports, referencing env vars, etc.) would be great to have, but I don't see the two goals (config and templates) of dnjs as particularly compatible. HOCON[1] is the best attempt at a "better JSON" that I've seen, but its cross-support of Java properties syntax makes it a little weird to use in a JavaScript-centric stack.
I agree that there's a ton of value in only using a subset of JavaScript, but wouldn't this be better implemented as a set of lint rules?
Are there other advantages to a custom runtime that I'm missing? Even if it's only a subset of the language, the Python runtime probably isn't faster than Node.js.
I don't understand dnjs either, but the reasons I use jsonnet over a mainstream language (like Python, Lua or JS) for configuration:
- purity (within bounds of the local filesystem state; no dynamically computed imports; no networking; no file writes)
- declarative, lazily evaluated
- override syntax (allowing for a balance between explicitly defined APIs and some level of monkeypatching)
- file-relative import syntax
- simple interpreter that's easily embeddable into tools (no modules, no interpreter path, no global configuration files, etc)
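The override point in the list above can be approximated in a mainstream language with a recursive merge. The sketch below is in Python and only loosely mirrors what nested `+:` overrides do in jsonnet; it has no late binding (`self`/`super`), and `deep_merge` and the sample keys are hypothetical.

```python
def deep_merge(base, override):
    """Return a new dict where `override` wins, merging nested dicts."""
    merged = dict(base)  # shallow copy so `base` itself is not modified
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are objects: merge field by field.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace the base value.
            merged[key] = value
    return merged


base = {"replicas": 2, "resources": {"cpu": "500m", "memory": "1Gi"}}
prod = deep_merge(base, {"replicas": 5, "resources": {"memory": "4Gi"}})
```

This covers the "patch a few fields of a shared base" use case, but the purity, laziness, and import rules in the list are properties of the language and its interpreter, which a helper function in Python does not provide.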