> This might be the worst advice I have seen in a long while. All the world is n...

teddyh · on Oct 1, 2024

> I didn’t say $TERM isn’t used. I said it’s unreliable for determining terminal capabilities.

You did say that every terminal has vt100 capabilities. But TERM=dumb does not, which is why some environments do set it that way.

> Hardware terminals don’t even use env vars, instead they use ANSI escape sequences to describe terminal capabilities.

Firstly, for physical (i.e. non-emulated) terminals, the environment variable is set by getty, who gets it from either /etc/inittab, or now possibly a systemd service setting. So TERM is supposed to always be set. Secondly, how would you know what escape sequences to send to query the capabilities, if TERM is not set? You can’t send VT100 escape codes blindly, because, again, TERM might be “dumb”.

hnlmorg · on Oct 1, 2024

> You did say that every terminal has vt100 capabilities. But TERM=dumb does not, which is why some environments do set it that way.

Fair. However dumb terminals aren't a typical use case because a whole plethora of standard utilities don't work properly on them. eg from pagers to `top`. They're also very easy to detect and do not invalidate any of the other points I've been making.

Dumb terminals aren't exactly common though. They were already superseded by smart terminals as early as the 1970s. In fact the last time I used a dumb terminal professionally was way back in the 1990s for a Bullfrog time sharing system that was already ancient at that point.

Your eshell example is fair, however last time I checked that wasn't even hooked to a TTY and there's ansi-term for when eshell needs smart functionality. If that's still the case then the original mitigation of TTY detection works anyway. So I don't agree that even the eshell example proves your point.

Let's be completely honest here, how many dumb terminals with a TTY attached do you think people are actually using? It's likely a rounding error of 0. I'd expect tools like terminfo to cater for edge cases like these but an internal company tool? I don't think that's a fair ask of anyone's time.

> Firstly, for physical (i.e. non-emulated) terminals, the environment variable is set by getty, who gets it from either /etc/inittab, or now possibly a systemd service setting. So TERM is supposed to always be set.

On modern POSIX systems, yes. It's not on non-POSIX systems nor on many ancient mainframes.

On some non-POSIX systems, it's entirely down to the drivers for the terminal / authors of the terminal emulator to set that env var. Pretty much all of them these days will do but in my career I've actually used several terminals that didn't.

So it's not a guarantee you'll have $TERM, or even a string that's truly representative set in $TERM (I'll go into more below)

But I don't really want to get into the weeds about whether $TERM is set or not. My real issue is what people set it's value to, not whether it's set or not. And how there are conceptually better ways to check for terminal capabilities (in my opinion at least) that unfortunately fell by the wayside.

> Secondly, how would you know what escape sequences to send to query the capabilities, if TERM is not set? You can’t send VT100 escape codes blindly, because, again, TERM might be “dumb”.

Dumb terminals are an extreme edge case and there's already easy methods to detect them. So lets move on from that:

For detecting terminal capabilities on the older hardware terminals, there are escape sequences like the following:

    CSI Ps c  Send Device Attributes (Primary DA).

Personally, I much prefer this approach because it more specifically defines what a terminal can and cannot do. Like how Javascript can query browser capabilities instead of making assumptions based purely on a User-Agent string. But instead we have $TERM and nearly every terminal emulator then defaults that value to `xterm`, or some variation of. Or `screen` if you're a multiplexer.

Clearly that does naff all to help our situation when detecting terminal capabilities.

To be more specific: the $TERM annoyance hampers our abilities to expose new terminal features to applications. For example, if we want to check for things like image support, there's half a dozen different environmental variables that need to be checked to identify what terminal emulator is actually running and whether inlining images will work (ie not get intercepted and broken by a multiplexer).

This isn't just a theoretical example either, I've written tools for working with image data in the terminal and detecting what escape codes to send is a bloody nightmare.

----

Going back to the original reason for this discussion:

For the OPs software, hardcoding vt100 sequences and falling back dumb mode if stdout is not a TTY is a perfectly reasonable suggestion given their requirements.

If their tool explodes in popularity (unlikely because its an internal work tool) and they find people are needing more sophisticated terminal detection, then they can worry about that at that stage. But we both know that my suggestion is good enough to cover even all of the common edge cases. Plus, as it was a work utility anyway, they have complete control over those edge cases to begin with. Thus catering for non-vt100 compatible terminals running regular shells is clearly an overreach of their time.

I also really don't appreciate the tone of your comments. You come off as being antagonistic. Maybe I'm reading your comments too harshly. If I am then I apologize.

[edited: toning down my own comment because it was potentially a bit rude in places]

teddyh · on Oct 2, 2024

> It's not on non-POSIX systems

If we cannot assume POSIX, we cannot assume even the existence of environment variables.

> For the OPs software, hardcoding vt100 sequences and falling back dumb mode if stdout is not a TTY is a perfectly reasonable suggestion given their requirements.

Linking to ncurses and (if stdout is a TTY) using ncurses to look up the correct escape sequences is simple and guaranteed to be correct. If ncurses comes back and says “nope, this terminal does not support your requested operations”, in that case you have some leeway in what to do. (I would personally advocate to stopping there; if somebody has a capable terminal, but has not configured their system to advertise it and its capabilities correcly, that is their problem.) If you insist on doing extra work to deal with mis- or unconfigured systems, you should first certainly first handle TERM=dumb specially and not send any escape codes.

But hardcoding terminal escape codes? People worked hard to make libraries to abstract away all these things for programs. Programs should offload as much work as possible to (sufficiently reliable) higher-level libraries.

This feels like advocating for x86-64 assembly in, say, 2016, when the Intel/AMD hegemony was more or less absolute. It might feel like a reasonable assumption at the time, but the pendulum always swings back to multi-platform support being necessary.

hnlmorg · on Oct 2, 2024

> If we cannot assume POSIX, we cannot assume even the existence of environment variables.

That's basically just expanding my point, but yeah, you cannot guarantee $TERM.

For what it's worth though, all the major non-POSIX platforms do support the concept of environmental variables. Even Windows.

> Linking to ncurses and (if stdout is a TTY) using ncurses to look up the correct escape sequences is simple and guaranteed to be correct.

Simple if you want to write in a language that supports C FFIs. Some don't. Furthermore, even those C calls can be sub-optimal. For example C calls break thread safety in some languages.

This particular tool was written in Go. And frankly there isn't any good options:

- cgo creates build and runtime complexities

- the established Go-native ncurses-style libraries come with way way way more than what is needed here. You just dont want nor need that bloat. It's more to audit, it's more to go wrong.

- the Go-native terminfo libraries haven't been battle tested. So you're not really much better off with them than you are hardcoding the tiny number of escape sequences needed (more on that later) and assuming vt100 compatibility for TTYs.

If we were talking about writing a full TUI then I'd agree with you regarding relying on a battle tested ncurses (or equivalent) package. But that's not what this project is. This is literally just printing an ordered list.

> If you insist on doing extra work to deal with mis- or unconfigured systems, you should first certainly first handle TERM=dumb specially and not send any escape codes.

You're the one insisting on doing extra work. Not me :)

Plus I've already put the dumb terminal argument to bed.

> But hardcoding terminal escape codes?

Lets be specific here, I wasn't suggesting every application should hardcode every escape sequence.

I was suggesting that the escape sequences for this specific internal company tool being discussed here are simple enough that they can be hardcoded.

It's literally just 2 escape sequences needed. 3 if you want to get clever.

Having that hardcoded makes more sense than having a 3rd party library that you then need audit (again, this is an internal company tool), just to print 2 escape sequences.

Context matters.

> People worked hard to make libraries to abstract away all these things for programs.

I know. I'm the author of several such libraries ;)

> Programs should offload as much work as possible to (sufficiently reliable) higher-level libraries.

"sufficiently reliable" is doing a lot of heavy lifting there. See my comments about auditing imports for internal corporate tools.

> This feels like advocating for x86-64 assembly in, say, 2016, when the Intel/AMD hegemony was more or less absolute. It might feel like a reasonable assumption at the time, but the pendulum always swings back to multi-platform support being necessary.

That's an absurd comparison and I think even you know that. My entire point is about trying to avoid boiling the ocean -- you're the one advocating creating more unnecessary work, not me.

And if you're worried about vt100 compatibility disappearing tomorrow, actually the opposite is true. We are seeing more non-POSIX platforms implement vt100 compatibility. (to be honest, I'd love to see terminals progress beyond in-band data formats presented as buggy escape sequences, but it's just not going to happen).

---

I don't think we're going to see eye to eye on this. So maybe we just agree to disagree?

teddyh · on Oct 3, 2024

Hardcoding an escape sequence or two because you’re using an as-of-yet immature language environment with cumbersome and/or inadequate, or even buggy, libraries is… actually a good argument. Personally, I would have the program run tput(1) as a subprocess and capture (and save) the correct escape sequences from the pipe, for later re-use, but I could imagine how, in some special cases, that might not be practical.

> You're the one insisting on doing extra work.

I am insisting that programs handle terminals correctly. It was you who brought up the rather remote edge case of when TERM is not enough to identify a terminal; i.e. when the operations you want to perform are so advanced that they might differ even when TERM is the same, or be missing even if TERM is set correctly, or not even be present in terminfo. What I wrote was my suggestions to handle that case, which would indeed be extra work, but it was you who asked for that case to be handled.

> I wasn't suggesting every application should hardcode every escape sequence.

> I was suggesting that the escape sequences for this specific internal company tool being discussed here are simple enough

You originally wrote “in reality you can get away with hardcoding vt100 sequences for most things.” But now you want to argue that you were always talking only about this specific use case?

(Also, it seems that for this specific use case, only one escape sequence would be needed: “cursor_up”/“cuu1”.)

> That's an absurd comparison and I think even you know that.

I do not.

> And if you're worried about vt100 compatibility disappearing tomorrow, actually the opposite is true.

The same was said about the x86 instruction set, and it did look completely entrenched for many years. But then it wasn’t. And old ideas and concepts about what can be done with terminals, like MGR, may yet rise again. I see things like sixels only as symptoms of a pent-up demand for this. And people won’t want to go through ANSI X3.64 sequences to render their remote applications, so non-ANSI terminal protocols are likely to emerge again.

hnlmorg · on Oct 3, 2024

> Personally, I would have the program run tput(1) as a subprocess and capture (and save) the correct escape sequences from the pipe, for later re-use, but I could imagine how, in some special cases, that might not be practical.

That's not a bad suggestion generally speaking. I've seen a fair amount of shell scripts that do this too.

In this specific case, they're running DNS lookups in parallel to speed the tool up, so I think forking tput wouldn't be preferable.

But for tools that are more forgiving for latency (and specifically targeting POSIX too), that's definitely an option.

> You originally wrote “in reality you can get away with hardcoding vt100 sequences for most things.” But now you want to argue that you were always talking only about this specific use case?

Fair point.

> (Also, it seems that for this specific use case, only one escape sequence would be needed: “cursor_up”/“cuu1”.)

I was thinking cursor down too, but actually you're right, \n would do that job better for a multitude of reasons.

Edit: thinking about this more, there’s a reverse carriage return ASCII character so you don’t actually need any ANSI escape sequences at all.

> The same was said about the x86 instruction set, and it did look completely entrenched for many years. But then it wasn’t. And old ideas and concepts about what can be done with terminals, like MGR, may yet rise again. I see things like sixels only as symptoms of a pent-up demand for this. And people won’t want to go through ANSI X3.64 sequences to render their remote applications, so non-ANSI terminal protocols are likely to emerge again.

Professionally speaking:

I can't see that happening. More systems are implementing vt100 support, not less. And escape sequences are "easy" to implement and "good enough" that most people have a quick moan about them but then quickly move on to getting whatever job they want done, done.

Plus the decline of x86 was driven by $$$ (like most changes in IT). Apple wanting to own more of the pipeline. Datacentres wanting to reduce infrastructure and operational costs. And ARM was already as old and established as x86 so it wasn't like trying to create a new standard (how many other architectures have fallen by the wayside?). There isn't anything like that happening for terminals right now.

However, personally speaking:

I genuinely hope you're right. The status quo sucks. Just so long as whatever new that comes along is a complete redesign. In-band meta sequences are just horrible.

But I think realistically, those that don't like the status quo with terminals either use web technologies or Microsoft RPCs instead. And those that like the terminal have already learned to live with its many warts.

teddyh · on Oct 4, 2024

Just like UTF-8 unified ASCII with extended character sets, maybe ANSI X3.64 can be unified with something more, incorporating features from MGR and/or sixels.

> there’s a reverse carriage return ASCII character

Which one? And if it works, why does ”tput cuu1” not output it, instead of ^[M or ^[[A?

hnlmorg · on Oct 5, 2024

> Just like UTF-8 unified ASCII with extended character sets, maybe ANSI X3.64 can be unified with something more, incorporating features from MGR and/or sixels.

Not another character set please. Terminals are better off UTF8 and we are better off keeping control sequences out of character sets. It made sense in the 60s and 70s but makes zero sense these days. Plus Unicode has enough footguns as it is without introducing control sequences.

In my opinion the only good way to advance beyond escape sequences is to do away entirely with in-band control codes. Control codes should be on a separate channel. This is basically the only thing HTML gets right: separating content from formatting and control sequences.

Having that separation in the terminal would also allow for less hacks in pipe detection. To elaborate, `isatty()`, the C function that's used to detect if a fd is a pipe or TTY, basically just performs a terminal-specific I/O control operation to see if the file descriptor can handle terminal-specific controls, and if it can't, then assumes the fd is not a TTY. Its a complete hack in my opinion. A hack that works and has worked for a great many years. But a hack non-the-less.

Also having a separate data and control channel allows you to incorporate more meta data about the data channel. I actually use this trick in my $SHELL, Murex [0]. It's a typed shell (like Powershell but far less verbose and far more ergonomic to use) however it works perfectly fine with standard POSIX pipes because the type annotations are sent over a different channel. So you have the best of both worlds: raw byte streams for anything classical, but also rich type metadata for anything that understands Murex pipes. While introducing zero boilerplate code for anyone who uses the shell.

> > there’s a reverse carriage return ASCII character

Which one? And if it works, why does ”tput cuu1” not output it, instead of ^[M or ^[[A?

You're right. I was thinking of ^[M [1] (C1 control sequence) and getting confused with C0 sequences (ie ASCII characters). Wasn't helped by me misremembering that modern terminal emulators tend to treat multiple different ASCII characters as LF, such as VT (vertical tab) and FF (form feed) [2]

[0] https://murex.rocks

[1] https://github.com/lmorg/mxtty/blob/main/virtualterm/ansi_c1...

[2] https://github.com/lmorg/mxtty/blob/main/virtualterm/ansi_c0...

teddyh · on Oct 6, 2024

> Not another character set please.

Oh no, I was merely using character sets as an analogy. What I was envisoning was a new control scheme, backward compatible with ANSI X3.64, but not merely a bundle of extensions either.

> Control codes should be on a separate channel.

I’m not sure about that. Two separate channels results in synchronization problems (which is why the parallel port went away and why USB and all modern protocols are serial). However, I would not be adverse to having the channel transferring richer data than plain bytes.

hnlmorg · on Oct 6, 2024

> Oh no, I was merely using character sets as an analogy. What I was envisoning was a new control scheme, backward compatible with ANSI X3.64, but not merely a bundle of extensions either.

Interesting.

I'm working on something that's vaguely in that area at the moment: https://github.com/lmorg/mxtty

I'm currently working on getting - full vt100 compatibility (it's mostly there, just failing a couple of tests in vttest which I believe are related to cursor save and restore)

- most of xterm

- plus a few original escape codes to demonstrate it's concept

I plan to add Tektronix 4014 support

and a few other never seen before Terminal features that I haven't yet decided how best to implement.

I'm also going to leverage the fact that I already have a relatively mature $SHELL and tie in some specific support between the shell and the terminal emulator. If just to demonstrate the terminal's capabilities.

It's still pretty alpha but I'd definitely welcome any feedback on the project's mission.

> I’m not sure about that. Two separate channels results in synchronization problems (which is why the parallel port went away and why USB and all modern protocols are serial).

Synchronizing electrical signals is a very different problem to synchronizing data streams. There's plenty of protocols out there that solve the second problem already: TCP/IP (order of packets), video codecs (audio / video synchronization), and so on.

They key isn't to have the zero control sequences in the content stream. It's to reduce those control sequences to just being tags or markers, and shifting the meta data out to the data stream.

An example of this is HTML vs CSS. HTML is the content, CSS is the formatting. Granted they're not streamed, but it's a visual demonstration for how separate concerns can be divided but still connected.

This type of approach would also solve the existing synchronization issues with stdout and stderr. Though if we're already redesigning the terminal in backwards incompatible ways, I'd do away with stderr entirely and replace it dedicated control sequences and first class error handling.

None of this would ever happen but a guy can dream.... (though this is why I started the mxtty project: to see just how far I could push terminal capabilities in a backwards compatible way)