Symbolic strings (Nix string contexts-like) #948

@yannham

Is your feature request related to a problem? Please describe.

While working on Nickel-nix, and on Nix integration in general (#693), we have needed something like Nix string contexts.

String context is a way of implicitly and automatically attaching and combining metadata to string values (in the case of Nix, the dependencies that must be built before the paths present inside the string become valid). When interpolating strings with context inside another string, all the dependencies (the contexts) are combined. This feature is really useful to avoid specifying obvious dependencies explicitly (e.g. source files).

However, we don't want to implement Nix string contexts as they are, because they are pretty ad hoc and Nix-specific. We would rather have a more general mechanism, of which string contexts would be just one instance, and which could be used for other domains (Terraform, Kubernetes, etc.) or for different use cases within Nix (IFD/recursive Nix-like).

Fundamentally, Nix string contexts are an overloading of string interpolation (and other string operations) to work on richer values than plain strings. Very schematically, Nix strings are rather {ctxt : Array Deps, value : Str}.
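To make the schematic record above concrete, here is a minimal Python model of a string with context. It is only an illustration of the idea, not how Nix implements it: the names CtxString and interpolate, the use of a set instead of an array, and the store paths are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CtxString:
    """A string paired with its context, mirroring {ctxt : Array Deps, value : Str}."""
    ctxt: frozenset  # dependencies that must be built before the paths are valid
    value: str


def interpolate(*parts):
    """Concatenate plain strings and CtxStrings, combining all the contexts."""
    ctxt = frozenset()
    value = ""
    for part in parts:
        if isinstance(part, CtxString):
            ctxt |= part.ctxt
            value += part.value
        else:
            value += part  # a plain string carries no context
    return CtxString(ctxt, value)


# Hypothetical dependencies and store paths, for illustration only.
gcc = CtxString(frozenset({"gcc.drv"}), "/nix/store/aaa-gcc")
hello = CtxString(frozenset({"hello.drv"}), "/nix/store/bbb-hello.c")
cmd = interpolate(gcc, "/bin/gcc ", hello, " -o hello")
```

Here cmd.value is the concatenated command line, and cmd.ctxt contains both gcc.drv and hello.drv without either dependency having been listed by hand.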

We've discussed the possibilities many times. Having a general ad-hoc overloading mechanism would be possible but pretty heavy (think trait/typeclasses, or even a very restricted form just for strings), with the usual problems of coherence, complexity for new users, etc.

In some way, Nix string contexts could be implemented using effects (#85), e.g. if we allowed effects to be performed at string interpolation. However, such an effect system is still to be properly designed for Nickel, and effect handlers would be implemented in Rust, as interpreter plugins, which makes them rather heavy to implement and distribute. For something like Nix string contexts, that could be OK, as we would have to do it once and for all per target tool. It's still a long way to get there.

This issue makes a simpler and lighter proposal that could achieve the same effect, but relies on only one very small language feature and otherwise pure Nickel library code. It also seems to be forward-compatible with performing effects at string interpolation.

Describe the solution you'd like

We propose to introduce a new form of strings, let's call them symbolic strings, and write them using the delimiters s%" and "%s. Normal strings with interpolation are parsed as a list of chunks, where one chunk is either a string literal or an interpolated expression. For example, "foo %{bar} baz" is represented as (something like) [Chunk::Literal("foo "), Chunk::Expr(..), Chunk::Literal(" baz")]. String chunks are then evaluated at runtime, when first encountered, and turned into an actual string.
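The chunk representation described above can be sketched in Python as follows. This is a toy model, not the Nickel parser: the Literal and Expr classes stand in for Chunk::Literal and Chunk::Expr, the regex does not handle nested braces or escapes, and "evaluating" an expression is reduced to a lookup in a hypothetical env dictionary.

```python
import re
from dataclasses import dataclass


@dataclass
class Literal:
    text: str


@dataclass
class Expr:
    source: str  # unevaluated expression text


def parse_chunks(s):
    """Split a string on %{...} markers (no nesting or escapes handled)."""
    chunks = []
    pos = 0
    for m in re.finditer(r"%\{([^}]*)\}", s):
        if m.start() > pos:
            chunks.append(Literal(s[pos:m.start()]))
        chunks.append(Expr(m.group(1)))
        pos = m.end()
    if pos < len(s):
        chunks.append(Literal(s[pos:]))
    return chunks


def evaluate(chunks, env):
    """Turn chunks into an actual string by looking each expression up in env."""
    return "".join(
        c.text if isinstance(c, Literal) else str(env[c.source]) for c in chunks
    )


chunks = parse_chunks("foo %{bar} baz")
result = evaluate(chunks, {"bar": "BAR"})
```

For a normal string, the chunk list is only an intermediate step: evaluate collapses it into a plain string and the structure is lost.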

Symbolic strings would be almost the same, but they would return the chunks as a normal Nickel expression, and wouldn't evaluate them further. For example:

s%"foo %{bar} baz"%s would just be equivalent to writing

{
  tag = `StrChunks,
  chunks = [
    {tag = `Literal, value = "foo "},
    {tag = `Expression, value = bar},
    {tag = `Literal, value = " baz"}
  ]
}

(the shape of chunks is just an example, and is up for discussion)

Then, the library consuming such a string, or even just the contract attached to the field, would be in charge of doing whatever they want with it. Typically, in Nickel-Nix, there already is a nix_string_hack function that can process this kind of list and produce an AST that is re-interpreted on the Nix side, reconstructing the contexts, thus giving the same automatic and implicit dependency management as in Nix. But it uses normal function calls and arrays, which is arguably not very nice to read. Here is an example of how it is used:

args = [ "-c",
  ([inputs.gcc, "/bin/gcc ", inputs.hello, " -o hello\n"]
   @ [ inputs.coreutils, "/bin/mkdir -p $out/bin\n"]
   @ [ inputs.coreutils, "/bin/cp hello $out/bin/hello"])
   |> nix.lib.nix_string_hack
]

Symbolic strings would just be an alternative, nicer syntax for this expression, allowing us to write:

args = [ "-c", s%"
  %{inputs.gcc}/bin/gcc %{inputs.hello} -o hello
  %{inputs.coreutils}/bin/mkdir -p $out/bin
  %{inputs.coreutils}/bin/cp hello $out/bin/hello
"%s, ]

This is really no different from what you would write in Nix today.
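As a rough picture of what a consuming library could do with such a value, here is a Python sketch of an interpreter for the record shape proposed above. It is a simplified stand-in, not what nix_string_hack does (that function produces an AST that is re-interpreted on the Nix side): here packages are assumed to be dictionaries with hypothetical "name" and "path" keys, and they are resolved eagerly.

```python
def lit(text):
    return {"tag": "Literal", "value": text}


def expr(value):
    return {"tag": "Expression", "value": value}


def render(sym):
    """Interpret a symbolic string, returning (text, deps).

    Assumption: an interpolated package is a dict with hypothetical
    "name" and "path" keys; any other value is converted with str().
    """
    if sym.get("tag") != "StrChunks":
        raise ValueError("expected a symbolic string")
    text = []
    deps = []
    for chunk in sym["chunks"]:
        value = chunk["value"]
        if chunk["tag"] == "Expression" and isinstance(value, dict):
            text.append(value["path"])
            if value["name"] not in deps:
                deps.append(value["name"])  # record each dependency once
        else:
            text.append(str(value))
    return "".join(text), deps


# Made-up packages and store paths, for illustration only.
gcc = {"name": "gcc", "path": "/nix/store/aaa-gcc"}
coreutils = {"name": "coreutils", "path": "/nix/store/bbb-coreutils"}

script = {
    "tag": "StrChunks",
    "chunks": [
        expr(gcc), lit("/bin/gcc hello.c -o hello\n"),
        expr(coreutils), lit("/bin/mkdir -p $out/bin\n"),
        expr(coreutils), lit("/bin/cp hello $out/bin/hello"),
    ],
}
text, deps = render(script)
```

The dependencies end up in deps purely as a by-product of interpolation, which is the behaviour the proposal is after.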

The change on the language side is really minimal (interpolated strings are already parsed as chunks, we just need to transform them into a Nickel value). Because symbolic strings are just composite Nickel values, the only operation that is natively supported is interpolation (for example, you can't call string.length or ++ on them). That being said, interpolation seems to be what you use 99% of the time, and string operations don't even make sense in some cases (such as knowing the length of a Terraform computed value like an IP). The library writers providing an "interpreter" for those strings may then export additional string manipulation functions if they make sense (in the case of Nix, we can know the path at evaluation time, so we may define and export more string primitives in the library).
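To illustrate the last point, a library could export a length operation that is only defined when every chunk is known at evaluation time. The Python sketch below assumes the record shape from the example above; sym_length and the Computed placeholder are hypothetical names, not part of any existing library.

```python
class Computed:
    """Placeholder for a value only known later, e.g. a Terraform computed IP."""


def sym_length(sym):
    """Length of a symbolic string, defined only if all chunk values are plain strings."""
    total = 0
    for chunk in sym["chunks"]:
        value = chunk["value"]
        if not isinstance(value, str):
            raise ValueError("length is undefined for a value not known at evaluation time")
        total += len(value)
    return total


known = {
    "tag": "StrChunks",
    "chunks": [
        {"tag": "Literal", "value": "foo "},
        {"tag": "Expression", "value": "bar"},
    ],
}
unknown = {
    "tag": "StrChunks",
    "chunks": [
        {"tag": "Literal", "value": "ip = "},
        {"tag": "Expression", "value": Computed()},
    ],
}
```

Calling sym_length(known) succeeds, while sym_length(unknown) raises an error instead of returning a meaningless number.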

Related approaches

In fact, this idea is very close to the quasiquote/unquote/unquote-splicing mechanism of Lisp. Or, even more specifically, to the G-expressions of Guix, but with a more idiomatic Nickel string syntax (and probably a few unimportant differences: in this proposal, interpolation would probably be more like unquote than unquote-splicing, that is, we wouldn't automatically "flatten" the AST but would leave that to the library code).
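The difference matters when a symbolic string is interpolated inside another one: with unquote-like behaviour, the inner value stays nested as a single chunk. A library that wants the spliced form could flatten it itself, along the lines of this Python sketch (the record shape follows the example above, and flatten is a hypothetical helper):

```python
def flatten(sym):
    """Splice nested symbolic strings into their parent, recursively."""
    out = []
    for chunk in sym["chunks"]:
        value = chunk["value"]
        nested = (
            chunk["tag"] == "Expression"
            and isinstance(value, dict)
            and value.get("tag") == "StrChunks"
        )
        if nested:
            out.extend(flatten(value)["chunks"])
        else:
            out.append(chunk)
    return {"tag": "StrChunks", "chunks": out}


inner = {
    "tag": "StrChunks",
    "chunks": [
        {"tag": "Literal", "value": "-o "},
        {"tag": "Expression", "value": "hello"},
    ],
}
outer = {
    "tag": "StrChunks",
    "chunks": [
        {"tag": "Literal", "value": "gcc "},
        {"tag": "Expression", "value": inner},  # kept nested, as with unquote
    ],
}
flat = flatten(outer)
```

Before flattening, outer has two chunks, one of which is itself a symbolic string; after flattening, flat has three chunks at a single level.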
