
Just some observations and experience from an indiehacker who has been primarily using Django for the last 4-5 years.

So just to start: I've been suffering from back pain in recent months due to sciatica, and it has basically killed my motivation for coding, on top of the mild burnout I was already dealing with. My usual workflow before was just sitting for long stretches and focusing on code.

As a recovery thing I've been walking a lot, ~2-3 hours per day, which ends up making me feel unproductive, because my focus time is in the mornings and if I don't start coding first thing I'm not motivated enough to start in the afternoon.

All this preamble is to explain why I had the thought of coding on an iPad in the first place. With Claude Code I've noticed that most of my "coding" time is just me inputting text and waiting for it to actually implement the change. Which led to a thought: if simple sentences are all I'm inputting, would it not be possible to use dictation on the iPad and work on my side projects that way instead of typing everything out? I go into detail about my setup in the video itself. Apologies for the bad audio; it started as a test video to check my webcam's mic and I forgot to switch to my actual external mic.

tl;dr: use vscode.dev in the browser, Superwhisper for dictation, and Roo Code (Claude Code doesn't work well with dictation because it lives in the terminal).


Not the OP, but this is something I've vibecoded using Cursor: https://bestphoto.ai/ (MRR ~$150). It basically started as a clone of my other site, https://aieasypic.com (MRR $2.5k, $5-8k/mo revenue). I was having trouble keeping the code context in mind, and Claude was pretty bad at doing full features with the tech stack I used for that site (Django BE, NextJS FE), making new features a pain. So I completely switched to a stack Claude is very good at, NextJS fullstack (tRPC BE), and now it can basically one-shot a feature request.

Just putting this here because AI coding often gets dismissed as something that can't do actual work, i.e. generate revenue. In reality, making money as a solo dev is already pretty rare, and if you're working in a corp instead, you're not going to just post your company name when asked for examples of what you're using AI for.


Those are exactly the kind of AI slop products I would expect to be vibe coded. You've created yet another wrapper around LLM APIs where the business model is charging a premium over existing services. Your revenue depends on the ignorance of customers to not realize they can get the same or better service for cheaper from companies that actually do the hard work. I bet SEO hacking is really important to you.

It's irrefutable that AI tools can be used to create software that generates revenue. What's more difficult is using them to create something that brings actual value into the world.


> yet another wrapper around LLM APIs

Patio11 famously built, ran for a number of years (profitably), and then sold a "wrapper for a random number generator" (bingocardcreator.com).

Value is in the eye of the beholder, and only tangentially related to the technical complexity or ingenuity.


I'm not arguing in favor of technical complexity or ingenuity.

My point is that the perceived value of a service or product is directly related to its competitive advantage, product differentiation, and so on. When the service is made from the same cookie-cutter template as all the others, the only way to extract value from it is by duping customers who don't know better.

There are entire industries flooded with cheap and poorly made crap from companies that change brand names every week. Code generation tools have now enabled such grifters to profit from software as well.


> When the service is made from the same cookie cutter template as all the others,

You're conflating "underlying tech" and "cookie cutter template".

The fact that something uses the same LLM under the hood as some other product doesn't mean it's not differentiated.


It's people configuring WordPress with various themes from Envato at jacked-up prices in a trench coat /s

I'm only half joking.


Eh, it is more like an extended/better UI. Plenty of people are willing to pay for just that.

There are lots of people that only use LLMs in whatever UI the model companies are providing. I have colleagues that will never venture outside the ChatGPT website, even though with some effort they could make their tooling richer by using the API and building some wrapper or UI for it.
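For illustration, the smallest version of such a wrapper is only a few lines against the OpenAI Python SDK (the model name and prompt here are placeholders):

  # Tiny "wrapper" sketch: hitting the model through the API instead of
  # the ChatGPT website. Assumes OPENAI_API_KEY is set in the environment.
  from openai import OpenAI

  client = OpenAI()

  def ask(prompt: str) -> str:
      response = client.chat.completions.create(
          model="gpt-4o-mini",  # placeholder model name
          messages=[{"role": "user", "content": prompt}],
      )
      return response.choices[0].message.content

  print(ask("Summarize this changelog in two sentences: ..."))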


Sure man, any product you don't like is just "another wrapper". I guess every website is just a wrapper over Postgres or WordPress too. I run my own serverless GPU containers on Runpod with a combination of ComfyUI and my own FastAPI servers using diffusers; not that it'd even matter if I just used some third-party APIs. It originally started as something hacked together on 4x 4070 Ti Supers in my basement that I then moved to Runpod. Indiehacking is mostly marketing; nobody cares if you built some technically beautiful thing.

Also, it's easy to criticize from the sidelines, but do you have products that you made by yourself that are used by hundreds of thousands of people? I have 5 such sites, 2 of which I named above.


Hey, don't blame me for the fact that your sites are indistinguishable from hundreds of others offering the same service. Everything I said is a logical assumption, since all these sites look the same.

Good on you for learning how AI tools work, but there's no way for anyone to tell whether your backend is self-managed or not, and practically it doesn't really matter. I reckon your users would get better results from proprietary models that expose an API than self-hosted open source ones, but then your revenue would probably be lower.

> Also, it's easy to criticize from the sidelines, but do you have products that you made by yourself that are used by hundreds of thousands of people? I have 5 such sites, 2 of which I named above.

That's a lazy defense considering anyone is free to criticize anyone else's work, especially if they're familiar with the industry. Just like food and film critics don't need to be chefs and movie producers.

But I'll give you credit for actually building and launching something that generates revenue. I admit that that is more than I have managed with my personal projects.


I love examples like these. I eventually want to start a bunch of projects like this too.

Thanks for sharing.


Mostly LoRA training, not full finetunes. Eval for image gen is especially hard because if you look at AI-generated images of someone for too long, you start to lose track of which one actually looks more similar, even with your own images.

And yes, model training is available on the site. The serverless pipeline itself was easy; making it fast (less than an hour) and maximizing similarity, while staying flexible enough for a general user, was the hard part.

Let me know if that answers your question.


Thanks a lot for the reply, that does answer my questions.

As a follow-up, I think I have something that could potentially help you, especially for the evaluation part. We have an API that lets you very quickly get feedback and preference data for images, for example whether one image is better than another based on some criteria. You can try it out via our UI at https://app.rapidata.ai/compare or read more about the API capabilities at https://docs.rapidata.ai. Do you think this could be useful to you? I'm happy to chat in more detail; feel free to hit me up, my email is in my profile.
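To give a feel for what preference data buys you (a toy sketch, not our actual SDK), pairwise votes reduce to a per-image ranking with something as simple as win rates:

  # Hypothetical illustration: ranking images by win rate from pairwise votes.
  from collections import Counter

  votes = [("img_a", "img_b", "img_a"),  # (option 1, option 2, winner)
           ("img_a", "img_c", "img_c"),
           ("img_b", "img_c", "img_c")]

  appearances, wins = Counter(), Counter()
  for a, b, winner in votes:
      appearances[a] += 1
      appearances[b] += 1
      wins[winner] += 1

  for img in sorted(appearances, key=lambda i: wins[i] / appearances[i], reverse=True):
      print(img, f"win rate: {wins[img] / appearances[img]:.2f}")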


I mean, if you're looking for a todo list: travel to third-world countries, tip just $5 for normal service, and watch people's eyes light up. It's fun.


I did this once in a restaurant in a small town in Mexico. I called over one of the service people (not even a waiter; this was a casual joint where you order at the counter), slipped them $100, and watched their face light up. It was cool.


You don’t need $60m to do that.

I’ll bet literally every person here in the comments can do that.


Location: Slovakia, EU. OK with working in US timezones

Technologies: Python, Django, NextJS, AWS, Docker

Experience with platforms like AWS, Azure, Hetzner, Runpod

In addition: worked with frontend technologies such as NextJS, React, Zustand, Mantine, Material UI, Tailwind

Email: mesmerpercy@gmail.com

Resume/CV: DM/Email

Some extra info: 6 years of experience as a fullstack dev with Django + React/NextJS. I've built and run a site that receives ~20 million pageviews per month (private; can showcase in an interview if needed). Currently interested in generative AI, and I have experience creating serverless pipelines for GPU tasks to reduce costs instead of using something like Replicate, Huggingface, etc. Recent public examples of my work:

https://aieasypic.com/ (800k views per month, $4k+ MRR) - Built my own serverless API for the DreamBooth training and AI image inference to reduce costs. Also uses 4 of my own 4090s in my basement.
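For anyone curious what "my own serverless API" means in practice, this is roughly the shape of a Runpod serverless worker (a generic sketch, not my actual pipeline; the handler body is a placeholder):

  # Runpod calls `handler` once per job and scales workers to zero between
  # requests, which is where the savings over an always-on GPU API come from.
  import runpod

  def handler(job):
      prompt = job["input"]["prompt"]
      # ... run the diffusers pipeline here, upload the result somewhere ...
      return {"image_url": "https://example.com/output.png"}

  runpod.serverless.start({"handler": handler})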


Looks like it's not even available in the EU; I'll call that a win.


Where are you? I'm in Poland and I just disabled it both on phone and tablet; the setting switcher is "conveniently" placed at the bottom of the Photos settings page.


Demo on an actual 4090 with Flux Schnell for the next few hours: https://5jkdpo3rnipsem-3000.proxy.runpod.net/

It's basically H100 speeds on a 4090: 4.80 it/s, 1.1 sec for Flux Schnell (4 steps) and 5.5 seconds for Flux Dev (25 steps). Compare that to normal speeds (ComfyUI fp8 with the "--fast" optimization): 3 seconds for Schnell and 11.5 seconds for Dev.
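Back-of-the-envelope, those numbers are self-consistent if you figure time ≈ steps / (it/s) plus a bit of fixed overhead for the text encoder and VAE decode:

  it_per_s = 4.80
  print(4 / it_per_s)   # ~0.83 s of denoising for Schnell (4 steps) -> ~1.1 s total
  print(25 / it_per_s)  # ~5.2 s for Dev (25 steps) -> ~5.5 s total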


It's worth noting this is the laptop 4090 GPU, which is more in the range of desktop 4070 performance.


The specific link I shared is the quant running on a 4090 I rented on Runpod; I have no affiliation with the repo itself.


The compute differential between an H100 and a 4090 is not huge. The main single GPU benefits are larger memory (and thus memory bandwidth) and native fp8. But these matter less for diffusion models.


That's what I thought as well, but FP8 is much faster on the H100, like 2-3x. You can check the it/s numbers here: https://github.com/aredden/flux-fp8-api

It's why fal, Replicate, and pretty much all the big diffusion API providers use H100s.

tl;dr: the 4090 maxes out at 3.51 it/s even with all the current optimizations. The H100 does 11.5 it/s with all optimizations, and even without them it's at 6.1 it/s.


Providers use H100s because running 4090s in data centers is a grey area, since Nvidia doesn't permit it.

The paper discussed here uses 4-bit compute, which is 4x bf16 compute on the 4090, while the H100 doesn't have this at all (the best you get there is 2x compute with fp8). So this paper evens out the difference between the two to some extent. Judging by theoretical numbers, the H100 has 1979 TFLOPS of fp8 compute and the 4090 has 1321 TOPS of int4, which puts the 4090 at around ~65% of the H100's performance. Given its ~$2K price compared to the H100's ~$30K, that seems like a very good deal.
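A quick check of those figures (theoretical peaks, so take with salt):

  h100_fp8_tflops = 1979
  rtx4090_int4_tops = 1321
  print(rtx4090_int4_tops / h100_fp8_tflops)  # ~0.67, i.e. roughly 65%
  # perf per dollar at ~$2K vs ~$30K:
  print((rtx4090_int4_tops / 2_000) / (h100_fp8_tflops / 30_000))  # ~10x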

But again, no 4090 in DCs.


Damn, it runs very fast.


Hey, can you share the inference code please? Thanks.



Cannot compile it locally on Fedora 40:

  nunchaku/third_party/spdlog/include/spdlog/common.h(144): error: namespace "std" has no member "function"
  using err_handler = std::function<void(const std::string &err_msg)>;
                                   ^


Yeah, it's a pain. I'm trying to make an API endpoint for a website I own and am working on a Docker image. This is what I have for now that "just" works:

The conda always-yes setting makes sure you can just paste the script and it all runs, instead of you having to press "y" for each install. Also, if you don't feel like installing a wheel from a random person on the internet, replace that step with "pip install -e ." as the repo suggests. I compiled that one with CUDA 12.4, since compilation is the part that takes the most time and most often seems to break.

Also, I'm not sure if this will work on Fedora. I tried it on a Runpod machine with a 4090 (apparently it only works on a few cards: 3090, 4090, A100, etc.) with CUDA 12.4 on the host machine, and "runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04" as the base image.

EDIT: using pastebin instead, as HN doesn't seem to jive with code blocks: https://pastebin.com/zK1z0UdM


Almost working:

  [2024-11-09 19:33:55.214] [info] Initializing QuantizedFluxModel
  [2024-11-09 19:33:55.359] [info] Loading weights from ~/.cache/huggingface/hub/models--mit-han-lab--svdquant-models/snapshots/d2a46e82a378ec70e3329a2219ac4331a444a999/svdq-int4-flux.1-schnell.safetensors
  [2024-11-09 19:34:01.432] [warning] Unable to pin memory: invalid argument
  [2024-11-09 19:34:02.143] [info] Done.
  terminate called after throwing an instance of 'CUDAError'
    what():  CUDA error: pointer does not correspond to a registered memory region (at /nunchaku/src/Serialization.cpp:32)


Probably make sure your host machine's CUDA is also 12.4, and if it isn't, update the CUDA versions in the pastebin to the one you have. I don't think it works with CUDA 11.8 though; I remember trying that once.

But yeah, I can't help you outside of Runpod; I haven't even tried this on my home PCs yet. For my use case of a serverless API, it seems to work.


It's that color for people who recently signed up.


Interesting, that's the first time I've noticed it.


You can check out civitai's repo if you want something for NextJS: https://github.com/civitai/civitai

Found it pretty useful for my use case at least, since most of my tech stack aligns with theirs, along with the component framework they use (Mantine).


What aspects of this codebase do you consider to be exemplary? I'm not familiar enough with some of the libraries to offer a meaningful critique at a glance, but I'm also not seeing any tests.


This is a fantastic example.

