I'm not convinced your L/S dichotomy applies. The concern there is that the natural world (or some objective target domain) has natural joints, and the job of the scientist (, philosopher, et al.) is to uncover those joints. You want to keep 'hair splitting' until the finest bones of reality are clear, then grouping hairs up into lumps, so their joints and connections are clear. The debate is whether the present categorisation objectively under/over-generates , and whether there is a factor of the matter. If it over-includes, then real structure is missing.
In the case of embeddings vs. vectors, classical vs., baroque, transpiler vs., compiler -- i think the apparent 'lumper' is just a person ignorant of classification scheme offered, or at least, ignorant of what property it purports to capture.
In each case there is a real objective distinction beneath the broader category that one offers in reply, and that settles the matter. There is no debate: a transpiler is a specific kind of compiler; an embedding vector is a specific kinds of vector; and so on.
There is nothing at stake here as far as whether the categorisation is tracking objective structure. There is only ignorance on the part of the lumper: the ignorant will, of course, always adopt more general categories ("thing" in the most zero-knowledge case).
A real splitter/lumper debate would be something like: how do we classify all possible programs which have programs as their input and output? Then a brainstorm which does not include present joint-carving terms, eg., transformers = whole class, transformer-sourcers = whole class on source code, ...
> i think the apparent 'lumper' is just a person ignorant of classification scheme offered, or at least, ignorant of what property it purports to capture.
>In each case there is a real objective distinction
No, Lumper-vs-Splitter doesn't simply boil down to plain ignorance. The L/S debate in the most sophisticated sense involves participants who actually know the proposed classifications but _chooses_ to discount them.
Here's another old example of a "transpiler" disagreement subthread where all 4 commenters actually know the distinctions of what that word is trying to capture but 3-out-of-4 still think that extra word is unnecessary: https://news.ycombinator.com/item?id=15160415
Lumping-vs-Splitting is more about emphasis vs de-emphasis via the UI of language. I.e. "I do actually see the extra distinctions you're making but I don't elevate that difference to require a separate word/category."
The _choice_ by different users of language to encode the difference into another distinct word is subjective not objective.
Another example could be the term "social media". There's the seemingly weekly thread where somebody proclaims, "I quit all social media" and then there's the reply of "Do you consider HN to be social media?". Both the "yes" and "no" sides already know and can enumerate how Facebook works differently than HN so "ignorance of differences" of each website is not the root of the L/S. It's subjective for the particular person to lump in HN with "social media" because the differences don't matter. Likewise, it's subjective for another person to split HN as separate from social media because the differences do matter.
> Here's another old example of a "transpiler" disagreement subthread where all 4 commenters actually know the distinctions of what that word is trying to capture but 3-out-of-4 still think that extra word is unnecessary
Ha. I see this same thing play out often where someone is arguing that “X is confusing” for some X, and their argument consists of explaining all relevant concepts accurately and clearly, thus demonstrating that they are not confused.
I agree there can be such debates; that's kinda my point.
I'm just saying, often there is no real debate it's just one side is ignorant of the distinctions being made.
Any debate in which one side makes distinctions and the other is ignorant of them will be an apparent L vs. S case -- to show "it's a real one" requires showing that answering the apparent L's question doesnt "settle the matter".
In the vast majority of such debates you can just say, eg., "transpilers are compilers that maintain the language level across input/output langs; and sometimes that useful to note -- eg., that typescript has a js target." -- if such a response answers the question, then it was a genuine question, not a debate position.
I think in the cases you list most people offering L-apparent questions are asking a sincerely learning question: why (because I don't know) are you making such a distinction? That might be delivered with some frustration at their misperception of "wasted cognitive effort" in such distinction-making -- but it isnt a technical position on the quality of one's classification scheme
> it's just one side is ignorant of the distinctions being made.
> No, Lumper-vs-Splitter doesn't simply boil down to plain ignorance.
If I can boil it down to my own interpretation: when this argument occurs, both sides usually know exactly what each other are talking about, but one side is demanding that the distinction being drawn should not be important, while the other side is saying that it is important to them.
To me, it's "Lumpers" demanding that everyone share their value system, and "Splitters" saying that if you remove this terminology, you will make it more difficult to talk about the things that I want to talk about. My judgement about it all is that "Lumpers" are usually intentionally trying to make it more difficult to talk about things that they don't like or want to suppress, but pretending that they aren't as a rhetorical deceit.
All terminology that makes a useful distinction is helpful. Any distinction that people use is useful. "Lumpers" are demanding that people not find a particular distinction useful.
Your "apparent L's" are almost always feigning misunderstanding. It's the "why do you care?" argument, which is almost always coming from somebody who really, really cares and has had this same pretend argument with everybody who uses the word they don't like.
I mean, I agree. I think most L's are either engaged in a rhetorical performance of the kind you describe, or theyire averse to cognitive effort, or ignorant in the literal sense.
There are a small number of highly technical cases where an L vs S debate makes sense, biological categorisation being one of them. But mostly, it's an illusion of disagreement.
Of course, the pathological-S case is a person inviting distinctions which are contextually inappropriate ("this isnt just an embedding vector, it's a 1580-dim! EV!"). So there can be S-type pathologies, but i think those are rarer and mostly people roll their eyes rather than mistake it as an actual "position".
All ontologies people claim to be ontologies are false in toto
All "ontologies" are false.
There is, to disquote, one ontology which is true -- and the game is to find it. The reason getting close to that one is useful, the explanation of utility, is its singular truth
In the case of embeddings vs. vectors, classical vs., baroque, transpiler vs., compiler -- i think the apparent 'lumper' is just a person ignorant of classification scheme offered, or at least, ignorant of what property it purports to capture.
In each case there is a real objective distinction beneath the broader category that one offers in reply, and that settles the matter. There is no debate: a transpiler is a specific kind of compiler; an embedding vector is a specific kinds of vector; and so on.
There is nothing at stake here as far as whether the categorisation is tracking objective structure. There is only ignorance on the part of the lumper: the ignorant will, of course, always adopt more general categories ("thing" in the most zero-knowledge case).
A real splitter/lumper debate would be something like: how do we classify all possible programs which have programs as their input and output? Then a brainstorm which does not include present joint-carving terms, eg., transformers = whole class, transformer-sourcers = whole class on source code, ...