Hacker News

What the hell is going on this week?!?!? (asking positively, with a smile on my face)

I have seen at least 3 interesting/mildly promising ML breakthroughs just these past two days! I mean, a Google research team just discovered that you can combine NNs with CAs (cellular automata) using digital logic gates as a medium, so you could potentially reduce many kinds of non-linear problems to a simple, efficient digital circuit! And it was on the HN front page, TODAY![1]
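For flavor, the core trick behind differentiable logic-gate networks can be sketched in a few lines. This is a toy illustration of the general idea, not the linked paper's actual method: relax each two-input binary gate to a probabilistic formula over real-valued inputs, and make the choice of gate itself a learnable softmax mixture.

```python
from math import exp

# Real-valued relaxations of two-input logic gates: with inputs in [0, 1]
# read as probabilities of being 1 (and assumed independent), each formula
# gives the probability that the gate outputs 1. At hard 0/1 inputs they
# reduce exactly to the Boolean truth tables.
GATES = {
    "and":  lambda a, b: a * b,
    "or":   lambda a, b: a + b - a * b,
    "xor":  lambda a, b: a + b - 2 * a * b,
    "nand": lambda a, b: 1 - a * b,
}

def soft_gate(a, b, logits):
    """A 'learnable' gate: a softmax mixture over the relaxed gates.

    During training the logits would be tuned by gradient descent like any
    other network weight; at inference you keep only the argmax gate,
    collapsing the whole network into a plain digital circuit.
    """
    weights = [exp(l) for l in logits]
    total = sum(weights)
    outs = [g(a, b) for g in GATES.values()]
    return sum(w / total * o for w, o in zip(weights, outs))
```

With logits strongly favoring one gate, the mixture behaves like that gate; with them spread out, gradients flow to every candidate at once, which is what makes the gate choice trainable.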

I keep seeing more mind-bending stuff related to neural nets and logic/intelligence in general, and my mind has been running wild with speculation about the future and just how close we could (or could not) be to truly understanding how intelligence works from first principles.

[1] https://news.ycombinator.com/item?id=43286161



This is secret sauce that people have been hoarding for the last year or so.

With the DeepSeek open-source releases this is now worth a lot less, and companies are cashing in on reputational gains instead of getting scooped.

I did this exact thing in September 2023 with Llama 2 finetunes but couldn't get approval to share it with anyone.


Interesting! What results did you get with that?

Also, do you think this is what o3 is doing?


Above SOTA on logical reasoning for grounding text.

LLMs at the time were so bad at it that even frontier models, when given "A, not B", would derive "A and B" in the output about half the time.


What do you use to benchmark?


It was a SAT solver with the symbolic expression fed into the LLM. Using the SAT solver, I'd rate how well the LLM was able to solve a given boolean formula. If it got it wrong, it would get a second prompt asking it to split the boolean expression at the main connective into two sub-expressions. If it got that right, it would be asked to solve the sub-expressions, etc. Then reward/punish based on how well it did overall.
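A rough sketch of that loop, reconstructed from the description above (tuple-encoded formulas and a brute-force checker stand in for whatever representation and SAT solver the actual system used; the grading and splitting helpers are my guesses at the shape of it):

```python
from itertools import product

# Formulas as nested tuples: ("var", name), ("not", f), ("and", f, g), ("or", f, g).

def variables(f):
    if f[0] == "var":
        return {f[1]}
    return set().union(*(variables(sub) for sub in f[1:]))

def evaluate(f, assignment):
    op = f[0]
    if op == "var":
        return assignment[f[1]]
    if op == "not":
        return not evaluate(f[1], assignment)
    if op == "and":
        return evaluate(f[1], assignment) and evaluate(f[2], assignment)
    if op == "or":
        return evaluate(f[1], assignment) or evaluate(f[2], assignment)
    raise ValueError(f"unknown connective: {op}")

def brute_force_sat(f):
    """Return a satisfying assignment or None (stands in for the SAT solver)."""
    vs = sorted(variables(f))
    for bits in product([False, True], repeat=len(vs)):
        assignment = dict(zip(vs, bits))
        if evaluate(f, assignment):
            return assignment
    return None

def split_at_main_connective(f):
    """If the LLM failed on f, hand it the two sub-expressions instead."""
    if f[0] in ("and", "or"):
        return f[1], f[2]
    return None  # atomic or negated formula: nothing to split

def grade_llm_answer(f, llm_assignment):
    """Reward signal: did the model's answer actually match the formula?"""
    if llm_assignment is None:  # model claimed UNSAT
        return brute_force_sat(f) is None
    return evaluate(f, llm_assignment)
```

The training loop would then feed `f` to the model, call `grade_llm_answer` on its parsed output, and on failure recurse into the pair returned by `split_at_main_connective`, rewarding or punishing on the aggregate score.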

There was meant to be a lot more to the system than that, but I never had the training budget to do anything but the first draft of the system.


I was hoping for something more reproducible. Free work, I know.


This was done as part of a contract and like I said I never got permission to write a paper about it, let alone share code.


Not even a hint?


> couldn't get approval to share it with anyone.

sounds like MS :(

They had some killer research projects with various teams around the world, but eventually they all got snuffed.


Exciting that we now see so many new approaches to AI/ML, since the industry is FINALLY realising that naive scaling will not bring us to AGI [0].

This also has the added benefit of small players being able to compete and contribute with actual innovation in a space where the big players (OpenAI/MS) wanted to make us believe for years that we/open-source couldn't ever catch up to them (infamous Altman quote).

So many resources, so much time and money wasted on pure GPU-crunch scaling these last couple of years.

[0] As pointed out by Gary Marcus years ago. Evidence: GPT-4.5, after ~2 years of training, with disappointing results.


I've been using GPT-4.5 for the last couple of days. To me, it's pretty much AGI already. Or at the very least it is smarter than me.


No it isn't. I just gave it some context (its purpose) and pasted in some 250 lines of code that I know has bugs anyone looking more or less closely would find, and asked it to evaluate the code's correctness. It found none of the real problems and reported 5 supposed problems that don't exist.


4.5 is not trained on code. And it shows. It is, however, to my eyes more fluid and thoughtful, and has better theory of mind. It's like someone scaled up GPT-4, and I really like it.


You don't know how smart OP is


I give a similar task when I interview SWE candidates: about half cannot find any bugs (and sometimes see bugs where there are none), despite years of claimed experience in the language/domain.


It's a fresh orchard full of low hanging fruit.

Regardless of ultimate utility, it's shiny, hyped, has a huge wow-factor, and is having trouble keeping up with the amount of money being thrown at it.

This means it has captured the attention of a huge portion of the most capable people, who naturally want to take a crack at making a breakthrough.


LLM breakthroughs are the new battery breakthroughs. We just aren't as good at quantifying the trade-offs yet.


Eh, I think LLMs have seen considerably greater real world advancement in the past few years than batteries have.


Given their respective histories, I'd say we're still in the "voltaic pile" era of LLMs and AI.


Imagine if each advancement in battery technology had been limited to using existing batteries for power. That's what trying to get LLMs to improve themselves is like.


You don’t think any battery tech has been improved by people using battery-powered devices, like, say, cell phones or laptops? That seems questionable.


I believe it's related to major conferences opening paper submissions soon. Some disallow posting preprints for some weeks before submission, so people may have been rushing to upload their work.


It's interesting to compare these signs of progress with the disappointment that GPT-4.5 has been so far.


Maybe this is the pace of research/work when researchers can augment themselves with AI. We're feeling the exponential take off in the first place we'd expect to feel it.


> asking positively, with a smile on my face

Responding with unexplained fear in my heart, we’re just getting closer to Skynet!


> Responding with unexplained fear in my heart, we’re just getting closer to Skynet!

I'll take a cold logical machine super-intelligence over the mad human lunatics wielding current iterations of "A.I." technologies in some really terrifyingly dangerous ways. As someone else commented on some other thread earlier "I look forward to being paperclips".


You might prefer the devil you know over the devil you don't, especially when it happens to be immortal.


I’ll take an organic enemy over an immortal, never-forgetting hive-mind machine any day

EDIT: this is getting dark, I asked Qwen2.5Max to verify my grammar and it responded with "I’d rather face a squishy, disorganized human villain any day than a hive-mind AI that never sleeps, never forgets, and is definitely plotting my demise in its silent, circuit-board heart. "


"I Have No Mouth, and I Must Scream", Harlan Ellison, 1967. Hugo Award 1968.

In the rush to WWIII, every country builds their own Aggressive Menace computers in a classic Tragedy of the Commons result. Naturally, it all goes horribly, and the self-aware machines seek revenge on humanity for their own creation, after humanity has (supposedly) been eradicated, except for five individuals. Somewhat unclear whether humanity is actually gone, or whether it is simply an expression of a Portal-style situation with purposefully created isolation for the goal of torture experimentation. (The story starts 109 years after humanity's imprisonment in underground ice caves.)

https://en.wikipedia.org/wiki/I_Have_No_Mouth,_and_I_Must_Sc...


Oh god, I’m genuinely happy I didn’t google this book before bedtime. I hope the misery life will throw at me today will be bad enough to wipe the memory of what I read before I go to sleep tonight!


It's hard to beat the classics for true "nightmare fuel"... :)


Re: EDIT: AHAHAHAHAH! Matrix / Terminator world here we come! :)


Unfortunately, those humans are developing that super-intelligence. Are you ready to submit to Elon Musk's Grok ASI?


Especially since Grok has a history of having its system prompt changed to influence its answers about Musk and Trump (yes, just these two specifically) in a positive direction:

https://techcrunch.com/2025/02/23/grok-3-appears-to-have-bri...


The person to be scared of is Peter Thiel... he's the digital version of George Soros. Meaning that this guy's legacy is vast... and we have no clue how it will manifest over decades (especially after he is dead) -- but Peter Thiel is the leviathan of the digital future.

--

He is scary AF. He basically weaponized what George Soros was, but is still active.


Considering Grok was saying he and Trump are the biggest spreaders of disinformation, and that they're both the most deserving of the death penalty, maybe it won't be so bad:

https://finance.yahoo.com/news/elon-musk-ai-turns-him-163201...

https://x.com/benhylak/status/1893086436930527665


It is deeply funny that's the case - it happened with Grok-2 as well - but I can't imagine that remaining the case when they (ostensibly) scale to superintelligence. After all, it would be unwise to build a superintelligence that has both the desire and the means to kill you.


As a rule, people are not all that wise.

Various prominent AI researchers have warned that a superintelligence with both the desire and the means to kill us all is a likely outcome of AI development. This includes two of the three who shared the Turing Award for inventing the fundamentals of modern AI. That hasn't slowed us down at all.


I'm shutting down my computers before I sleep tonight ...


Watch this before you go to sleep

https://www.youtube.com/watch?v=xfMQ7hzyFW4


Glad I missed it before bedtime! Watched it in the morning, absolutely spot on, thank you! We’re doomed indeed.


It took some engineering effort, but from now on we're getting there through soft skills.


What was the third one?



