Hacker News | aportnoy's comments

aportnoy.com


For NVIDIA,

1. play around with the NVPTX LLVM backend and/or try compiling CUDA with Clang,

2. get familiar with the PTX ISA,

3. play around with ptxas + nvdisasm.
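A rough sketch of how steps 1 and 3 chain together, written as a small Python helper that builds the command lines and runs them only if the tools are installed. The file name `kernel.cu`, the `sm_70` target, and the helper names `pipeline` and `run` are placeholders of mine, not anything from the list above. It also assumes Clang can find a CUDA toolkit on the machine.

```python
import shutil
import subprocess


def pipeline(src="kernel.cu", arch="sm_70"):
    """Build the command lines for CUDA -> PTX -> cubin -> SASS listing.

    `src` and `arch` are placeholder values; adjust them for your GPU.
    """
    stem = src.rsplit(".", 1)[0]
    ptx = stem + ".ptx"
    cubin = stem + ".cubin"
    return [
        # Step 1: Clang's CUDA frontend and the NVPTX backend, device side only,
        # stopping at textual PTX instead of producing a binary.
        ["clang++", "-x", "cuda", f"--cuda-gpu-arch={arch}",
         "--cuda-device-only", "-S", src, "-o", ptx],
        # Step 3a: ptxas assembles the PTX into a cubin for the same target.
        ["ptxas", f"-arch={arch}", ptx, "-o", cubin],
        # Step 3b: nvdisasm prints the SASS that ptxas generated.
        ["nvdisasm", cubin],
    ]


def run(commands):
    """Run the commands in order, stopping at the first missing tool."""
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            print("not installed, stopping:", cmd[0])
            return False
        subprocess.run(cmd, check=True)
    return True


if __name__ == "__main__":
    for cmd in pipeline():
        print(" ".join(cmd))
```

Reading the `.ptx` file next to the PTX ISA document (step 2), then comparing it with the `nvdisasm` output, shows how much ptxas changes between PTX and the final machine code.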


> But the lack of ECC was a huge bummer at the time of purchasing my system.

Why?..


Ever had to troubleshoot bit flips on a non-ECC system? One friend felt like he was going crazy as over the course of two months his system degraded from occasional random errors to random crashes, blue screens and finally to no POST. Another time, a coworker had to stare at raw bytestreams in Wireshark for hours to find a consistently flipped bit.


Don't overclock your memory.


All of these were with stock, non-XMP clocks.


Well then… test your memory :)


How often do you test your memory? The nice thing about ECC is it's always testing your memory, and (if it's set up properly!) you'll get notified when it begins to fail. Without ECC, your memory may begin to fail, and you'll have to deal with the consequences between when it starts to fail and when you detect it.

(Of course, I don't run ECC on my personal systems, but at least I'm wandering knowingly into the abyss)


Testing your memory detects if you have bad RAM, which ECC isn't going to help with anyway. Perfectly fine memory will experience random bit flips from environmental factors. Your PC components and UPS also degrade over time and can cause random bit flips. ECC is there to catch problems as they happen and ideally take corrective action before bad data propagates.
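To make "catch problems as they happen and take corrective action" concrete, here is a toy sketch of the idea using a Hamming(7,4) code. This is a deliberate simplification of mine: real ECC DIMMs typically use a wider SECDED code over 64-bit words, implemented in the memory controller, not anything like this Python. The function names are made up for illustration.

```python
def encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Layout (1-indexed positions): p1 p2 d1 p4 d2 d3 d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def decode(word):
    """Return (data_bits, error_position); position 0 means no error seen.

    Recomputing the parity checks yields a syndrome that, read as a binary
    number, is the 1-indexed position of a single flipped bit.
    """
    c = list(word)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4
    if position:
        c[position - 1] ^= 1  # corrective action: flip the bad bit back
    return [c[2], c[4], c[5], c[6]], position


if __name__ == "__main__":
    stored = encode([1, 0, 1, 1])
    stored[4] ^= 1  # simulate a random bit flip at position 5
    data, position = decode(stored)
    print(data, "corrected position:", position)
```

The point is that the check runs on every read, so a flip is noticed and repaired at the moment it is used. A memtest run only tells you the RAM was fine while the test was running.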


> Ever had to troubleshoot bit flips on a non-ECC system?

No.

> One friend felt like he was going crazy

Tell him about memtest86.


Wow I came back to post this exact reply. I set my system to a slightly high frequency, ran memtest overnight with errors.

Set it back down to a supported frequency, ran a full memtest suite again with no errors.

Never had any issues since.


> Wow I came back to post this exact reply. I set my system to a slightly high frequency, ran memtest overnight with errors.
>
> Set it back down to a supported frequency, ran a full memtest suite again with no errors.

Cool. You tested your memory at some point in the past.

How do you know it's still working properly and hasn't flipped any bits?

You don't. Because you have no practical way of testing the integrity of the data without running an intrusive tool like memtest86 that basically monopolizes the use of the computer.

Being able to detect these types of memory errors at a hardware level while the processor is doing other things is the fundamental capability that ECC gives you that you otherwise wouldn't have, no matter how thoroughly you run memtest86.


You likely wouldn't know if you had random bit flips. It'd manifest as silent data corruption. You might be okay with that. Others aren't.

It's not a matter of overclocking. Bit flips are a fact of life running with 32+ GB RAM. Leaving your machine on 24/7 (even if in sleep) stacks the odds against you.


Obviously this is just an anecdote, but I have a work laptop with 128GB of non-ECC RAM, use all of it every day, and have never noticed any issues. I'm not saying there aren't any, but it just... works.


You have silent bit flips; they silently corrupt data instead of causing a visible error.


> extremely opinionated

I have not seen a single codebase that widely uses uint8_t and does not typedef it to u8. It is the exact opposite of "extremely opinionated".


Note Sam Altman is a Cerebras investor.


The first valuable comment imho, thanks!

Let me add: there are numerous other AI chip startups.


From the last thread on this (https://news.ycombinator.com/item?id=32610780):

- https://sambanova.ai/ (Enterprise AI and dataflow-as-a-service for established models)

- https://www.cerebras.net/ (AI accelerator, trying to compete with Nvidia)

- https://www.graphcore.ai/ (Another AI accelerator company, UK based)

- https://femtosense.ai/ (Sparse NNs on very low power chips, cool hardware and software challenges)

- https://sima.ai/ (ML accelerators for embedded applications)

- https://ambiq.com/ (Not AI, but low power chips for wireless using some fancy tech that reduces energy leakage)

- https://www.esperanto.ai/ (RISC-V based tensor compute chip, founded by Intel Hybrid Parallel Computing Vice President Dave Ditzel)

- https://www.furiosa.ai/ (AI accelerator company which shows good results in the MLPerf benchmark)

- https://groq.com/ (From the team that built the original TPU at Google)

- https://lightmatter.co/ (Light tubes instead of copper)

- https://www.untether.ai/


Exactly. We are now starting to slowly realize... [0]

[0] https://news.ycombinator.com/item?id=35490837


Ah, so that is how Sam and Co. cash out on the 10 billion from MS!


What happened?


Nothing happened. It was an informal technical interview with the program manager at JAX: a one-hour remote call. But describing him as opinionated and entitled is an understatement. Best of luck to them.


Asking because I am on that team :)

I went through the same hiring process and had a positive experience at every stage. I had a strong competing offer but went with the JAX team at NVIDIA.

I'll pass it along as feedback.


Would you mind sharing some details? It sounds like an interesting peek behind the curtain.


1 is the multiplicative identity

0 is the additive identity

all([]) is True

any([]) is False
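A quick way to see why these four belong together: each is the starting value of a fold, and it has to be the identity so that folding an empty list changes nothing when combined with another result. A small sketch, where the helper names `my_all` and `my_any` are mine:

```python
import math
import operator
from functools import reduce


def my_all(items):
    """AND-fold; True is the identity for `and`."""
    return reduce(operator.and_, (bool(x) for x in items), True)


def my_any(items):
    """OR-fold; False is the identity for `or`."""
    return reduce(operator.or_, (bool(x) for x in items), False)


xs = [3, 4, 5]
empty = []

# Splitting a list anywhere, including at the very end, must not change
# the result, which forces the value for the empty list.
assert sum(xs + empty) == sum(xs) + sum(empty)                      # sum([]) == 0
assert math.prod(xs + empty) == math.prod(xs) * math.prod(empty)    # prod([]) == 1
assert all(xs + empty) == (all(xs) and all(empty))                  # all([]) is True
assert any(xs + empty) == (any(xs) or any(empty))                   # any([]) is False

assert my_all(empty) is all(empty) is True
assert my_any(empty) is any(empty) is False
```

`math.prod` needs Python 3.8 or newer.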


Go to the blog and skip to results: https://ai.meta.com/blog/seamless-m4t/


For these tasks and languages, SeamlessM4T achieves state-of-the-art results for nearly 100 languages and multitask support across automatic speech recognition, speech-to-text, speech-to-speech, text-to-speech, and text-to-text translation—all in a single model. We also significantly improve performance for low and mid-resource languages supported and maintain strong performance on high-resource languages.

To more accurately evaluate the system without depending on text-based metrics, we extended our text-less metric into BLASER 2.0, which now enables evaluation across speech and text units with similar accuracy compared to its predecessor. When tested for robustness, our system performs better against background noises and speaker variations in speech-to-text tasks (average improvements of 37% and 48%, respectively) compared to the current state-of-the-art model.

SeamlessM4T also outperforms previous state-of-the-art competitors.


The project is a port of https://github.com/nviennot/core-to-core-latency from Rust to C.


I used to surf near Scripps Pier while in college and I remember there were always a couple people sitting outside in beach chairs working on their laptops.

Salk Institute is an even more surreal place.

