thesz's comments | Hacker News

  > Assume FP64 units are ~2-4x bigger.
This is a wrong assumption. FP64 usually uses the same circuitry as two FP32 units, adding not that much extra ((de)normalization, mostly).

Off the top of my head, the overhead is around 10% or so.

  > Why would gamers want to pay for any features they don't use?
https://www.youtube.com/watch?v=lEBQveBCtKY

Apparently FP80, which is even wider than FP64, is beneficial for pathfinding algorithms in games.

Pathfinding for hundreds of units is a task worth putting on the GPU.


Has FP80 ever existed anywhere other than x87?

Then you get to definitions like ": open ( string -- handle 1 | 0) ... ;", which describes returning the algebraic type Maybe Handle unboxed on the stack. Algebraic types are fun: they can easily represent Peano arithmetic and get us into the realm of Goedel's incompleteness theorem very quickly.

Or you can deduce a signature for the EXEC EXEC sequence. EXEC's stack effect can be described as ( \alpha (\alpha -- \beta) -- \beta ), where each greek letter is a placeholder for a stack part of arbitrary length. Notice that this type comment has nested brackets and does not adhere to the Forth stack-effect comment convention.

When I thought about this more than fifteen years ago, I got at least two equally valid types for EXEC EXEC: one where the xt at the top of the stack consumes all of its input and leaves no output, ( \alpha (\alpha -- \gamma) \beta (\beta -- ) -- \gamma ), and one where the first EXEC produces something for the second one to execute upon, ( \alpha \beta (\beta -- \gamma (\alpha \gamma -- \theta)) -- \theta ).

One can argue that the second type of EXEC EXEC subsumes the first one, if the greek-letter-named stack parts are allowed to be empty.

Still, it shows that typing Forth, at the very least, needs unification at the level of Peano arithmetic, implementing deduction over stack parts from length zero up to unbounded length.
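
To make the shape of the problem concrete, here is a toy sketch in Haskell (my own names and simplifications, not any existing Forth type checker). A stack type is either a row variable, i.e. a stack part of arbitrary length like the greek letters above, or one item pushed onto a smaller stack, which is exactly the zero/successor structure of Peano numerals:

  import qualified Data.Map as M

  -- A stack shape: a row variable (\alpha, \beta, ...) standing for an
  -- unknown rest of the stack, or a named item on top of a smaller stack.
  data Stack
    = Row String
    | Push String Stack
    deriving (Eq, Show)

  type Subst = M.Map String Stack

  -- Apply the substitution accumulated so far to a stack shape.
  walk :: Subst -> Stack -> Stack
  walk su (Row v)    = maybe (Row v) (walk su) (M.lookup v su)
  walk su (Push t s) = Push t (walk su s)

  -- Unify two stack shapes. Item types are compared by name only and the
  -- occurs check is omitted to keep the sketch short.
  unify :: Subst -> Stack -> Stack -> Maybe Subst
  unify su a b = go (walk su a) (walk su b)
    where
      go (Row v) (Row w) | v == w = Just su
      go (Row v) s                = Just (M.insert v s su)
      go s (Row v)                = Just (M.insert v s su)
      go (Push t1 s1) (Push t2 s2)
        | t1 == t2  = unify su s1 s2
        | otherwise = Nothing

  -- Matching EXEC's input "\alpha xt" against a concrete stack
  -- "... int int xt" forces \alpha to absorb everything below the xt:
  --
  --   unify M.empty (Push "xt" (Row "alpha"))
  --                 (Push "xt" (Push "int" (Push "int" (Row "rest"))))
  --     == Just (M.fromList [("alpha", Push "int" (Push "int" (Row "rest")))])

The point is only that deducing the length of \alpha is unification over an unbounded structure, which is precisely what EXEC EXEC exercises.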

So, in my opinion, for an LLM to dependably combine typed Forth/concatenative definitions, it needs to call an external tool like Prolog to properly deduce the type(s) of a sequence of Forth (or other concatenative language) definitions.

And here we enter a realm that is interesting in itself.

Here it is: https://github.com/stassa/louise

This is a Prolog system that learns programs in polynomial time. For one example, it can one-shot-learn a grammar without being "trained" on millions of samples.

So, should one use an LLM that either needs paid access or is just slow to run, or freely go where "old school" systems like Eurisko [1] and Cyc went?

[1] https://en.wikipedia.org/wiki/Eurisko

Eurisko demonstrated superhuman abilities in 1982-83. It also demonstrated knowledge transfer at the time, where rules from VLSI place-and-route algorithms were used to design a winning Traveller TCS fleet.


How do several wrong assumptions make it right with an increasing number of trials?

You can ask Opus 4.6 to do a task and leave it running for 30 minutes or more to attempt to one-shot it. Imagine doing this with three agents in parallel in three separate worktrees. Then spin up a new agent to decide which approach of the three is best on the merits. Repeat this analysis in fresh contexts and sample until there is clear consensus on one. If no consensus after N runs, reframe to provide directions for a 4th attempt. Continue until a clear winning approach is found.

This is one example of an orchestration workflow. There are others.


  > Then spin up a new agent to decide which approach of the three is best on the merits. Repeat this analysis in fresh contexts and sample until there is clear consensus on one.
If there are several agents doing analysis of the solutions, how do you define a consensus? Should it be unanimous, or above some threshold? Are the agents' scores soft or hard? How is the threshold defined if the scores are soft? There is a whole lot of science in voting approaches; which voting approach is best here?
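
For concreteness, even the most naive reading of "clear consensus", plurality above a threshold, already has free parameters (a toy sketch with my own names, not necessarily what the parent's system does):

  import qualified Data.Map as M
  import Data.List (maximumBy)
  import Data.Ord (comparing)

  -- Each fresh-context judge run returns the index of the approach it
  -- prefers. Declare consensus only when the plurality winner gets at
  -- least a threshold fraction of all votes; otherwise signal "no
  -- consensus" so the orchestrator can reframe and try a 4th attempt.
  consensus :: Double -> [Int] -> Maybe Int
  consensus _ [] = Nothing
  consensus threshold votes
    | share >= threshold = Just winner
    | otherwise          = Nothing
    where
      tally           = M.toList (M.fromListWith (+) [(v, 1 :: Int) | v <- votes])
      (winner, count) = maximumBy (comparing snd) tally
      share           = fromIntegral count / fromIntegral (length votes)

  -- consensus 0.6 [0, 0, 1, 0, 2] == Just 0
  -- consensus 0.8 [0, 0, 1, 0, 2] == Nothing

Unanimity is threshold 1.0, simple majority is anything above 0.5, and none of this even touches soft scores or ranked ballots.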

Is it possible for the analyzing agents to choose the best among several wrong solutions? E.g., the longest memorized table of FizzBuzz answers amongst memorized tables of FizzBuzz answers.


We have a voting algorithm that we use, but we're not at the level of confidential disclosure if we proceed further in this discussion. There's lots of research out there into unbiased voting algorithms for consensus systems.

You conveniently decided not to answer my question about the quality of the solutions being voted on (ranking FizzBuzz memorization).

To me, our discussion shows that what you presented as a simple thing is not simple at all: even the voting is complex, and actually getting a good result is so hard that it warrants omitting the answer altogether.


Yeah, you've got unrealistic expectations if you expect me to divulge my company's confidential IP in a HN comment.

I had no expectations at all; I just asked questions, expecting answers. At the very beginning, the tone of your comment, as I read it, was "agentic coding is nothing but simple, look, they vote." Now the answers to simple but important questions are "confidential IP."

Okay then: agentic coding is nothing but a complex task requiring knowledge of unbiased voting (what is that thing, really?) and, apparently, the use of a necessarily heavy test suite and/or theorem provers.


It was a scene from a sci-fi movie (I mean the Claude demo to CTOs).

  > It will accelerate the work and change your role from what it is today to something different;
We have yet to see whether different is good.

My short experience with an LLM reviewing my code is that the LLM's output is overly explanatory, and it slows me down.

  > something that takes time and experience to work with.
So you invite us to participate in the sunk cost fallacy.

Tell it to summarize?

I cannot, because these reviews are comments in GitHub PRs. I have to read them.

  > You should have test coverage, type checking, and integration tests that catch the edge cases automatically.
You should assume that if you are going to cover the edge cases, your tests will be tens to hundreds of times as big as the code being tested. It is the case for several database engines (MariaDB has 24M of C++ in the sql directory and 288M of tests in the mysql-test directory), and it was the case when I developed a VHDL/Verilog simulator. And not everything can be covered with type checking; many things can, but not all.

AMD had hundreds of millions of test cases for its FPU, and formal modeling still caught several errors [1].

[1] https://www.cs.utexas.edu/~moore/acl2/v6-2/INTERESTING-APPLI...

SQLite used to have 1100 LOC of tests per one LOC of C code; the multiplier is smaller now, but it is still big.


  > This capability has been demonstrated multiple times, specially when it was politically convenient, like for example the intercepted Hamas calls that showed that some of the rockets fall inside Gaza by mistake.
Could it be something generated? One can display something that is politically convenient and untrue at the same time.

"IDF releases intercepted Hamas call on hospital explosion" - https://www.timesofisrael.com/liveblog_entry/idf-releases-in...

  > Dion language demo (experimental project which stores program as AST).
Michael Franz [1] invented slim binaries [2] for the Oberon System. Slim binaries were program (or module) ASTs compressed with some kind of LZ-family algorithm. At the time they were much smaller than Java's JAR files, despite a JAR being a ZIP archive.

[1] https://en.wikipedia.org/wiki/Michael_Franz#Research

[2] https://en.wikipedia.org/wiki/Oberon_(operating_system)#Plug...

I believe that this storage format is still in use in Oberon circles.

Yes, I am that old; I even correctly remembered Franz's last name. I thought then that he was a genius, and I still think he is. ;)


Interesting. It looks to me like this was more about the portability of the resulting binary, IIUC.

The Dion project was more about the user interface to the programming language and about unifying tools to use the AST (or Typed AST?) as the source of truth instead of text, and about what that unlocks.

Dion demo is here: https://vimeo.com/485177664


I took a look.

Their system allows for intermediate states with errors. If such an erroneous state can be stored to disk, they are using a storage representation that is equivalent to text. If an erroneous state cannot be stored, that makes the Dion system much less usable, at least for me.

They also deliberately avoided the pitfalls of languages like C. They can do that because they can, but I'd like to see how they would extend their concepts of a user interface to the programming language and of tools unified around a (Typed) AST to C or, forgive me, C++, and what that would unlock.

Also, there is an interesting approach using error-correcting parsers: https://www.cs.tufts.edu/comp/150FP/archive/doaitse-swierstr...

A much extended version is available in Haskell on Hackage: https://hackage.haskell.org/package/uu-parsinglib

As it allows monadic parsing combinators, it can parse context-sensitive grammars such as C's.
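
To illustrate why monadic combinators buy context sensitivity, here is a minimal hand-rolled sketch (not the uu-parsinglib API, just the underlying idea): the value produced by one parser decides what the next parser does.

  -- A bare-bones monadic parser over String.
  newtype P a = P { runP :: String -> Maybe (a, String) }

  instance Functor P where
    fmap f (P p) = P $ \s -> fmap (\(a, rest) -> (f a, rest)) (p s)

  instance Applicative P where
    pure x = P $ \s -> Just (x, s)
    P pf <*> P pa = P $ \s -> do
      (f, s')  <- pf s
      (a, s'') <- pa s'
      Just (f a, s'')

  instance Monad P where
    P pa >>= k = P $ \s -> do
      (a, s') <- pa s
      runP (k a) s'

  -- Consume any single character.
  item :: P Char
  item = P $ \s -> case s of
    (c:cs) -> Just (c, cs)
    []     -> Nothing

  -- Consume a single decimal digit and return its value.
  digit :: P Int
  digit = P $ \s -> case s of
    (c:cs) | c >= '0' && c <= '9' -> Just (fromEnum c - fromEnum '0', cs)
    _                             -> Nothing

  -- Context sensitivity via bind: the digit just parsed determines how
  -- many further characters to consume.
  counted :: P String
  counted = do
    n <- digit
    sequence (replicate n item)

  -- runP counted "3abcrest" == Just ("abc", "rest")
  -- runP counted "2abcrest" == Just ("ab", "crest")

An applicative-only interface cannot let the rest of the grammar depend on a previously parsed value; monadic bind can, which is what typedef-style context sensitivity needs.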

It would be interesting to see whether their demonstration around 08:30, where Visual Studio is unable to recover from an error properly, could be improved with error correction.


Arithmetic coding of a single bit preserves the ordering of the encoded bits, if CDF(1) > CDF(0). If a byte is encoded from its higher bits to its lower bits, arithmetic coding (even with a dynamic model) will preserve the ordering of individual bytes.

In the end, arithmetic coding preserves the ordering of encoded strings. Thus, comparison operations can be performed on the compressed representation of strings (and on big-endian representations of integers and even floating-point values), without the need to decompress the data until the decompressed strings themselves are needed.

Another view: strings are compared by memcmp as if they were mantissas with base 256. "hi!" is 'h'·(1/256) + 'i'·(1/256)^2 + '!'·(1/256)^3 + 0·(1/256)^4, and then zeroes out to infinity. Arithmetic coding represents encoded strings as mantissas with base 2. Range coding can use other bases, such as 256.
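
To make this concrete, here is a toy sketch in Haskell with a static model over a four-symbol alphabet (my own construction, not a production coder: exact rationals instead of emitted bits, and an end-of-string symbol that sorts lowest so that a prefix orders before its extensions):

  import Data.Ratio ((%))
  import Data.List (sort, sortBy)
  import Data.Ord (comparing)

  -- Static model: each symbol owns a sub-interval of [0, 1), assigned in
  -- symbol order. '$' is the end-of-string marker and sorts lowest.
  model :: [(Char, (Rational, Rational))]
  model =
    [ ('$', (0 % 4, 1 % 4))
    , ('a', (1 % 4, 2 % 4))
    , ('b', (2 % 4, 3 % 4))
    , ('c', (3 % 4, 4 % 4))
    ]

  -- Narrow the current interval [lo, hi) by one symbol.
  step :: (Rational, Rational) -> Char -> (Rational, Rational)
  step (lo, hi) c = case lookup c model of
    Just (cl, ch) -> let w = hi - lo in (lo + w * cl, lo + w * ch)
    Nothing       -> error ("symbol not in model: " ++ [c])

  -- Encode a string as the low end of its final interval.
  encode :: String -> Rational
  encode s = fst (foldl step (0, 1) (s ++ "$"))

  main :: IO ()
  main = do
    let xs = ["abc", "ab", "b", "aab", "ba", "a", ""]
    -- Sorting by the encoded values gives exactly the lexicographic order.
    print (sortBy (comparing encode) xs == sort xs)   -- True

A real coder emits bits of this number instead of keeping an exact rational, but the interval-nesting argument is the same.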


Intriguing claim! At first I was skeptical, thinking that there would be an issue with leading zeros being discarded. However, with some piloting, I was able to use Claude and ChatGPT to construct a proof of your claim.

Sketch of the argument:

First, an arithmetic coder maps strings to non-overlapping subintervals of [0, 1) that respect lexicographic order.

Second, the process of emitting the final encoding preserves this. If enc(s) ∈ I(s), enc(t) ∈ I(t), and I(s) <= I(t), then enc(s) <= enc(t).

Finally, binary fractions compared left-to-right bitwise yield the same order as their numerical values — this is just memcmp.

Thus, we have a proof of your claim that arithmetic coding preserves lexicographic order! Nice result!

My mistake was in thinking that leading zeros are discarded -- it is trailing zeros that are discarded!


Alphabet Inc., as YouTube's owner, faces a class action lawsuit [1] which alleges that the platform enables bad behavior and promotes behavior leading to mental health problems.

[1] https://www.motleyrice.com/social-media-lawsuits/youtube

In my not-so-humble opinion, what AI companies enable (and what this particular bot demonstrated) is bad behavior that can lead to mental health problems for software maintainers, particularly because of the sheer amount of work needed to read excessively lengthy documentation and to review the often huge amounts of generated code. Never mind the attempted smear we are discussing here.


So I went to check whether LLM addiction is a thing, because that was the pole around which the grandparent's comment revolved.

It appears that LLM addiction is real and it is in the same room as we are: https://www.mdpi.com/1999-4893/18/12/789

I would like to add that sugar consumption is a risk factor for many dependencies, including, but not limited to, opioids [1]. And LLM addiction can be seen as fallout from sugar overconsumption in general.

[1] https://news.uoguelph.ca/2017/10/sugar-in-the-diet-may-incre...

Yet, LLM addiction is being investigated in medical circles.


I definitely don't deny that LLM addiction exists, but attempting to paint literally everyone that uses LLMs and thinks they are useful, interesting, or effective as addicted or falling for confidence or persuasion tricks is what I take issue with.

Did he do so? I read his comment as a sad take on the situation when one realizes that one is talking to a machine instead of (directly) to another person.

In my opinion, participating in a discussion through an LLM is a sign of excessive LLM use, which can be a sign of LLM addiction.


Interesting how you've painted everyone who uses LLMs and LLM addicts the same color to steelman your argument.
