
I like IPFS, I really do, but whenever I try to use it, it's either too slow to become usable or sometimes it plain doesn't work. I pinned a whole bunch of files on IPFS a while back to experiment with it and the system seems to work, but every time I try to fetch those resources from a location that hasn't cached the content yet, it takes several seconds to show me the HTML/JSON/PNG files.

HTTP may be inefficient for document storage, but IPFS is inefficient for almost everything else.

I like the concepts behind IPFS but it's simply not practical to use the system in its current form. I hope my issues will get resolved at some point but I can't shake the thought that it'll die a silent death like so many attempts to replace the internet before it.



I love IPFS. It's one of my favorite recent technologies, but I think people have unrealistic expectations about such a young idea.

Decentralized tech doesn't work well until the network effects build up.

IPFS has the interesting quality that the more popular a piece of content is, the easier it is to get ahold of. If millions of people were using IPFS, the most popular content would be being served by many thousands of people, and finding it and downloading it would be extremely fast. Then subsequent viewings would be instantaneous because you can cache it for life.

This creates interesting incentives for monetizing pinning and caching of less popular content.

It makes sense if you ask me. If I love a piece of music so much that I'm willing to give it to others for free, then everyone benefits from being able to access it more easily.

Content that people care about organically becomes more resilient and nearly impossible to remove.

Content that no one cares about is slow and inefficient because it has to be hauled out of cold storage the one time a year anyone cares.

If someone thinks that content is more important than people are giving it credit for they can host it or pay for someone else to do it.

If you have a website and you have "fans" that subscribe to you and help pin all your stuff, then your stuff becomes faster and easier to get. Your "fans" can even get paid for helping to serve your content.

So, to me, it's early days for IPFS, and the way to make it better is to try to build apps that increase its usage, so the power of the network effects is felt.


> IPFS has the interesting quality that the more popular a piece of content is, the easier it is to get ahold of.

That sounds like the worst of the current internet, but even worse


Or, you know, BitTorrent, which works perfectly fine.


There's a reason private trackers have to incentivize keeping the long tail alive.


Look at any public tracker and you'll find torrents that are very old, swarming strong.


Sometimes I try to get something that is a year old (or even 6 months) and it's stuck for weeks at 99%


Along with those that have 0 seeders.


IPFS solved that by making sure even popular stuff is difficult to get.


Thanks, quite a bit of coffee is now on my keyboard...


Dude this made my day. Thanks! xD


Survivor bias, there.


Yes, there is a reason: because they're private and have limited users.


It’s not a young idea though. The basic technology for p2p networks has been around for decades. DHTs, voting, vouching, etc. were all hot academia topics 10-20 years ago. It’s just an engineering problem at this point.

I remember popcorntime was as responsive as Netflix at the time it came out and it scared the shit out of the MPAA so they killed it with prejudice.

IPFS doesn’t have an excuse for sucking beyond a basic lack of engineering effort.


> If millions of people were using IPFS...

...then IPFS would just get even slower and use even more resources to manage the index and find content as I am pretty sure the DHT they are using doesn't scale the way you seem to think it does.


Why? IIRC from when I studied them, DHTs scale pretty well (like log(size of the network) complexity for everything).


log(size of the network) still means it gets slower as the network gets larger, without any of the aforementioned speed advantages for anything but the IPFS equivalent of Google-level-popularity content.


You seem to underestimate how slowly logarithms grow.
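
For a rough sense of scale, here's a toy sketch, assuming a lookup costs on the order of log2(n) hops (a simplification; a Kademlia-style DHT with k-buckets like IPFS's does somewhat better, but the shape is the same):

    import math

    # How many overlay hops a lookup needs as the network grows,
    # assuming a plain O(log2 n) DHT lookup.
    for nodes in (10**3, 10**6, 10**9):
        print(f"{nodes:>13,} nodes -> ~{math.log2(nodes):.0f} hops")

Going from a thousand nodes to a billion only triples the hop count.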


That doesn't matter, as it clearly grows faster than constant (as in, O(1)). That means that as the system gets larger, it will take more time to do queries and maintain the index (which is ridiculously expensive on IPFS), not less, which was the claim we are contending is wrong... a claim which would still be wrong even if the system somehow were magically constant. And any supposed advantages in "caching" don't fix this, except maybe for extremely popular files: at best, ones more popular than the median file, though my intuition tells me it is going to be some inverse log worse than that, and I also suspect it might be the mean file instead of the median. One should expect both the number of unique files stored in the system and the number of queries performed to scale with the number of people using the system.


> think people have unrealistic expectations about such a young idea.

This is interesting given your description:

> IPFS has the interesting quality that the more popular a piece of content is, the easier it is to get ahold of. If millions of people were using IPFS, the most popular content would be being served by many thousands of people, and finding it and downloading it would be extremely fast.

This idea hasn’t been new since the turn of the century (BitTorrent offered exactly that in 2001), and nothing in that description explains why this is different from the many previous attempts. It’d be interesting to hear how IPFS plans to maintain that without the problems with abuse, and how it keeps competitive performance relative to non-P2P in a world where things like CDNs are much cheaper and more easily available than they were around the turn of the century. Using P2P means giving up a lot of control for the content provider, and that’s a challenge both for the types of content offered and for the ability to update or otherwise support it on your schedule.


IPFS is basically BitTorrent if all the torrents could share with each other. IPFS is as if each "torrent" is a single chunk of data instead of a siloed collection of stuff.

IPFS expands BitTorrent into a global filesystem.

You can mount IPFS on your filesystem and address files by pointing at local resources on your machine. So you could have an HTML file say `<img src="/ipfs/QmCoolPic" />`. You can't do that with BitTorrent.
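
For example, assuming you've run `ipfs daemon` and `ipfs mount` (which needs FUSE), any program can read content-addressed data as if it were a local file; a minimal sketch in Python, using the example readme CID that ships with IPFS:

    # Read a file straight out of the FUSE-mounted /ipfs namespace.
    # Assumes `ipfs daemon` is running and `ipfs mount` has been run.
    path = "/ipfs/QmQPeNsJPyVWPFDVHb77w8G42Fvo15z4bG2X8D2GhfbSXc/readme"
    with open(path) as f:
        print(f.read())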


Okay, but it's not 2001 anymore. Bittorrent was useful because parallelizing uploads across a broad network increased speed to a degree that content hosts couldn't manage.

But that's not true anymore: most internet power-users are on broadband connections, many of which are symmetric, so transfer speeds up or down are no longer a limiter that pushes people towards decentralization.

So when considering a decentralized system like IPFS, the downsides of decentralization, like availability, edit control, and service support, are much more salient.

There are a lot of things that "could work if everybody uses it". You can never get there if the thing isn't desirable compared to existing alternatives.


BitTorrent v2 dedups chunks between torrents as a side effect of changes to the hashing algorithm.


> This idea hasn’t been new since the turn of the century (BitTorrent offered exactly that in 2001)

I feel like freenet (2000) is maybe a better comparison


Yes. I can't stand that IPFS basically took the Freenet model, stripped out all the anonymity & privacy, and added the weird "interplanetary" marketing push.


> If millions of people were using IPFS, the most popular content would be being served by many thousands of people, and finding it and downloading it would be extremely fast.

Except that most people outside tech would probably be using phones/tablets/crappy pcs, with upload speed that is 10% of their download speed.


It isn't that early. They have a stupendous amount of money and have been around since 2014. By now they should have something to show for their work.


> IPFS has the interesting quality that the more popular a piece of content is, the easier it is to get ahold of.

So niche content is difficult to get a hold of?

Sounds like a bad idea.


Yeah sounds like our existing situation where content is largely moderated by centralized governing bodies. Big -1


No more difficult than it already was to obtain.

The point being, obtaining the popular stuff is no longer subject to DoS because distributed caching is built into the protocol itself.


> I think people have unrealistic expectations about such a young idea.

It launched in Feb 2015.

Things that have launched since then:

The idea of Donald Trump as President of the USA

TikTok

Covid-19

Tesla Model 3

OpenAI

It's entirely possible that everything could change any day now. It's equally plausible that it's just a bad implementation of a decent idea, and something similar could come along and deliver on its promise.


>The idea of Donald Trump as President of the USA

I find this disturbing, because not only is it untrue that the idea originated in 2015 (many people have commented on how The Simpsons predicted it in 2000)...

https://en.wikipedia.org/wiki/Bart_to_the_Future

...but the reason they predicted it had a lot to do with Trump running in 2000, and more recently, he's reportedly been saying he is such a winner he won the first time he ran.

So it reminds me of the famous photo with Stalin and the "vanishing Commissar"...but what do you know - that has been deleted from Wikipedia recently!

https://commons.wikimedia.org/wiki/File:The_Commissar_Vanish...

It was there prior to 2016 though:

https://web.archive.org/web/20150516002908/https://commons.w...


You are missing the point of what I'm saying.

Trump wasn't taken seriously as a candidate in 2015. He didn't declare his candidacy until July 2015, at which point his odds were 150/1, and they never got better than 66/1 in 2015.

https://www.bbc.co.uk/news/newsbeat-36392621

In 2000 his run wasn't taken seriously either. He had an approval rating of 7%. https://en.m.wikipedia.org/wiki/Donald_Trump_2000_presidenti...


Ok, fine, you wrote "the idea" and you meant "taking seriously".

But I suspect you had it right the first time. I sort of think "the idea" is the relevant stage. And that adds ~15 years to that particular thing.


> If millions of people were using IPFS

You criticize people for having unrealistic expectations, and then you make an unrealistic claim yourself...


I don't think that's necessarily true. Ethereum 2 is using libp2p to facilitate p2p communication between nodes. IPFS also uses libp2p. That means that every Ethereum node could easily become an IPFS node so people may end up running an IPFS node without even realizing it.


Wonder if it's practical to "buffer" popular content on IPFS by copying it to normal HTTP servers.

Requesting an IPFS document would query a few popular repositories, then revert back to normal IPFS if it's not found.

These buffer servers would also track what's popular and shuffle around what they store accordingly.
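
Roughly, the client-side logic could be as dumb as this sketch (the gateway list and timeout are just illustrative, and a real version would verify the content hash rather than trusting the gateways):

    import subprocess
    import urllib.request

    # Hypothetical "buffer first" fetch: try well-known HTTP gateways,
    # then fall back to resolving over the IPFS network itself.
    GATEWAYS = ["https://ipfs.io", "https://cloudflare-ipfs.com"]

    def fetch(cid):
        for gw in GATEWAYS:
            try:
                with urllib.request.urlopen(f"{gw}/ipfs/{cid}", timeout=5) as resp:
                    return resp.read()
            except Exception:
                continue  # gateway miss or timeout, try the next one
        # Last resort: ask the local daemon, which does the DHT lookup.
        return subprocess.run(["ipfs", "cat", f"/ipfs/{cid}"],
                              capture_output=True, check=True).stdout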


I think this is exactly what Cloudflare's and ipfs.io's web proxies do. They won't cache your stuff forever, but they'll cache it as long as someone requests the content again before the content gets removed from cache.

The downside of this approach is that it only works with popular nodes and you'd be back to the old, centralised internet architecture for all real use cases.

I don't think you can accurately gauge what is and isn't popular in a P2P network like IPFS. You never have a view of the entire network, after all.

There's also the problem of running such a system. Who pays for the system's upkeep and do we trust them? If we'd use Cloudflare's excellent features, who says Cloudflare won't intentionally uncache a post criticising their centralisation of the internet, forcing the views they disagree with to the slow net while the views they agree with get served blazingly fast.

I don't think such a system would work well if we intend to keep the decentralised nature of IPFS alive. Explicit caching leads to centralization; that's the exact reason caching works.

Instead, the entire network needs a performance boost. I don't know where the performance challenges in IPFS lie, but I'm sure there are ways to improve the system. Getting more people to run and use IPFS would obviously work, but then you'd still only be caching popular content.

Edit: actually, I don't really want to see caching happen through popularity of the service either, because as it stands IPFS essentially shares your entire browsing history with the world by either requesting documents in plain text or even caching the documents you've just read. I wonder if that IPFS-through-Tor section on their website ever got filled in, because the last few times I've checked that was just a placeholder in their documentation.


How much were you paying for your IPFS pin? E.g., if you are getting something via HTTP, there's a server somewhere with that content just waiting for you to request it, typically stored on an SSD, etc., vs. IPFS pins, which are typically packed onto massive disks shared with lots of other people.

IDK a whole lot about IPFS though. Maybe it was the metadata resolving / DHT lookup or whatever that was super slow. BitTorrent latency was always pretty high, but it didn't matter because throughput was also high


My IPFS pin was just one or two of my servers running an IPFS daemon. Since that daemon was running on Oracle's free VPS's, the answer is probably "a small fraction of what it costs for Oracle to have you in their database".

Paying for pinning sounds like something that could work but it would introduce some of the same problems that the real web suffers from back into IPFS. The idea "a web for the people, by the people" becomes problematic when you start paying people to make your content more accessible.


If it was slow running on a dedicated VPS, that's not super encouraging.

The thing I liked about the idea of IPFS pinning is that you are paying per byte stored vs. per byte accessed, as long as the p2p sharing works. I.e. hosting-via-pinning a website only you read would cost the same as hosting a website that the whole internet reads.


To be fair to the software itself, the system was never pegged for CPU usage or anything, and it wasn't a fast VPS to begin with.

From what I could tell the performance issue was mostly located in the networking itself, getting the client to resolve the content on the right server. That's something that could be improved through all kinds of algorithms without breaking compatibility or functionality, so there's hope.

I agree that pinning comes with some interesting ways to monetize hosting without the need for targeted advertising that the web seems to have these days. Small projects like blogs, webcomics and animations could be entirely hosted and supported by the communities around a work, while right now giant data brokers need to step in and host everything for "free".


> They won't cache your stuff forever, but they'll cache it as long as someone requests the content again before the content gets removed from cache.

"It stays in the cache as long as it stays in the cache"

??? What on earth does this mean?


Content is cached for a certain amount of time (default is 24 hours, I think?) before it gets deleted. If the content is requested again, the timer is reset.
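
In other words, something like this toy sketch (not the gateways' actual code; the 24-hour figure is just the default I've seen mentioned):

    import time

    TTL = 24 * 60 * 60   # seconds; the default I've seen quoted
    cache = {}           # cid -> (content, expiry timestamp)

    def get(cid, fetch_from_ipfs):
        entry = cache.get(cid)
        if entry and entry[1] > time.time():
            content = entry[0]              # still cached
        else:
            content = fetch_from_ipfs(cid)  # expired or never seen
        # Every request, hit or miss, pushes the expiry back out.
        cache[cid] = (content, time.time() + TTL)
        return content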

This is opposed to long-term caches like Cloudflare's that'll cache the contents of your website regardless of how many requests come in. Cloudflare will happily just refresh the contents of your website even if nobody has been to your website for weeks, and quickly serve it up when it's needed.


That’s not how Cloudflare normally works: the HTTP cache is demand based and does not guarantee caching. What you’re describing sounds like their Always Online feature which regularly spiders sites to serve in the event of an error.


I read it as saying that if someone downloads it before the cache timer deletes it, it resets the timer. So if the file is downloaded regularly, it is never removed from the cache.


The irony is that this and other IPFS problems will (must?) be fixed by recentralization. Cloudflare is doing this with IPFS Gateway, and Google will surely embrace/extend/usurp IPFS if it becomes popular. The user experience of bare IPFS is just not good enough.


I agree with a [previously] dead/deleted comment at this level:

"Doesn't matter. the point of ipfs is that when cloudflare and google shut down their gateway, the ipfs content is still available at the same address."


This really is one of the cruxes of decentralisation being built in at the protocol level. Even if centralised services exist, as long as one person exists who cares, the content lives on.

Without decentralisation being supported at the protocol level, as soon as the host dies, it's gone. This is particularly problematic because centralised services slowly subsume small services/sites and this either cuts off the flow to the other small sites or eventually something changes on the big centralised site and a bunch of these little sites break.


… if someone else paid to host a copy. Major companies hosting it makes that less likely, and if their backing increases usage, that also increases the cost of hosting everything, making it less likely that the content you want will be available. When Google shuts down their mirror, suddenly all of that traffic is hitting nodes with far fewer resources.

The underlying problem is that storage and bandwidth cost money and people have been conditioned not to think about paying for what they consume so things end up either being ad supported or overwhelming volunteers.


> suddenly all of that traffic is hitting nodes with far fewer resources.

One of the points of IPFS (and bittorrent before it) is that this is not a problem; each node that downloads the data also uploads it to other nodes, so having lots of traffic actually makes it easier to serve something (indeed, if it was already widely seeded by Google's mirror, there wouldn't be any sudden traffic).


I'm not particularly familiar with IPFS: does it have some solution for free-riding?

BitTorrent as many have noted is great for popular things, even not-particularly-popular things, but absent incentives to continue seeding (i.e. private trackers' ratio requirements) even once-popular things easily become inaccessible as the majority of peers don't seed for long, or at all.

I guess what I don't quite get is what IPFS adds vs., say, a trackerless BitTorrent magnet link that uses DHT? Or is it really just a slight iteration/improvement on that system?


> I guess what I don't quite is what IPFS adds vs. say, a trackerless BitTorrent magnet link that uses DHT?

Beats me! I think there might be support for finding new versions of things, but I'm not sure about the details or how it prevents authors from memory-holing stuff by saying "The new version is $(cat /dev/null), bye!".


No it doesn’t

If nobody pins a link it disappears, but there is no strong incentive; it just rides on abundant space and bandwidth, and on wealthy Gen Xers who want to be a part of something.

The same group released Filecoin, which experiments with digital asset incentives... and venture capital.

Inconclusive results


BitTorrent use breaks Tor; IPFS downloads do not.

so that's one advantage to one audience


Doesn't matter. the point of ipfs is that when cloudflare and google shut down their gateway, the ipfs content is still available at the same address.


> Wonder if it's practical to "buffer" popular content on IPFS by copying it to normal HTTP servers.

I guess the approach would be to simply run IPFS on those servers, with the popular content in it, as a seed.


Sounds like it's working fine for you. "Several seconds" of lag is nothing for an "Inter-Planetary File System", in fact it's on par with other decentralised P2P networks.


> "Several seconds" of lag is nothing for an "Inter-Planetary File System", in fact it's on par with other decentralised P2P networks.

That's good enough for kicking off batch file transfers (assuming you mean P2P networks like BitTorrent), but there's no evidence that people will tolerate a slow web, and lots of evidence that they won't.


Skynet fetches files in under 100ms, you can definitely get a decentralized system going as fast as the centralized web if you build it right.

The main challenge for me with this comment is that you can't expect distributed/decentralized networks to win if you set an expectation that "things will just be slower than the normal web". Nobody is going to migrate to that.


> Skynet

I don't know Skynet. I first checked Google, got a Wikipedia link describing a movie, then checked the Wikipedia disambiguation page, but got nothing.

https://en.wikipedia.org/wiki/Skynet

Also, why would a project duplicate the efforts of IPFS rather than contribute to it?


https://siasky.net/ and https://docs.siasky.net/

IPFS has chosen an architecture which fundamentally keeps it non-performant, Skynet is built from the ground up in a different way, and gets 10-100x improvements on performance for content-addressed links, and 100-1000x improvements on performance for dynamic lookups (IPNS)


Try browsing the IPFS example "website". It opens for me in under a few hundred milliseconds.

    ipfs cat /ipfs/QmQPeNsJPyVWPFDVHb77w8G42Fvo15z4bG2X8D2GhfbSXc/readme


The IPFS network tends to run quickly if the file you are fetching is stored in the cache of either ipfs.io or in the cache of cloudflare. Everything else has lookup times of 30-60 seconds, sometimes more.

DHTs just aren't a good choice for massive data systems if you need low latency.


I think that's an unfair example because those files come pre-pinned or at least pre-loaded with most IPFS installs.

I've just pinned a 12MiB file filled with random bytes on one of my servers (`dd if=/dev/urandom of=test.dat bs=1M count=12; ipfs add test.dat; ipfs pin add <hash that came out>`). The server has a 50 Mbps uplink, so transferring the file to my laptop should take about two seconds.

Dumping this blog's contents over IPFS takes the server about 3 seconds (first time load) so the network seems to be in working order, at least when downloading data. `ipfs swarm peers` lists about 800 known peers. On the server itself, `ipfs cat /ipfs/redacted > /tmp/test.dat` runs in about a second, which is all perfectly acceptable overhead for a transfer that'll take two to three seconds anyway.

On my laptop, I've tried to get the file but I just cancelled it after waiting for 16 minutes. Halfway throughout the wait, I've tried opening the file through the ipfs.io proxy, which finally gave me the file after a few minutes, but no such luck yet if I retry the ipfs command.

I don't know if it's the random file, the size, or something different, but if I'm launching a blog or publishing documents on IPFS, visitors should not be expected to wait five to ten minutes for the data to load. "After the first twenty visitors it'll get faster" is not a solution to this problem, because there won't be twenty visitors to help the network cache my content.

Maybe I'm expecting too much here; maybe the files shouldn't be expected to be available within half an hour, or before Cloudflare caches it. Maybe there's something wrong with my laptop's setup (I haven't done any port forwarding and I'm behind a firewall). Either way, if I follow the manual but can still buy a domain, set up DNS and hosting on my VPS and send a link to a friend faster than I can get the file through P2P, I don't think IPFS will ever get off the ground. Fifteen minutes is an awful lot of time for a data transfer these days!

Edit: actually, now it seems ipfs.io and cloudflare have picked up the file in their caches. Data transfer is up to normal speed now. If you want to try to replicate my experiment, I've just uploaded a new test file to /ipfs/QmbBD872kjfoutAmTKFCxTCApw9LBB9qxxRyXpEGYzsqMH.

Edit 2: I realized that by saying I downloaded the file and that the file is random, I just announced my personal IP address to the world through the IPFS hash, so I removed it. That was pretty dumb of me, and also a pretty clear problem of IPFS in my book.


    $ time ipfs cat /ipfs/QmQPeNsJPyVWPFDVHb77w8G42Fvo15z4bG2X8D2GhfbSXc/readme
    (...)
    real    0m0.219s


I agree that the http caching layers are currently needed to achieve decent UX. But it’s also possible that won’t be necessary forever. The network could expand and get to the point where resolution time comes down to accessible levels. Only time will tell I suppose.


On what basis do you think this is likely? What part of IPFS gets faster when more users are online? If the file exists on a system in the network, it should make a connection and start fetching the file within milliseconds ideally. It's not like we are distributing multi gb files where more seeders means more speed, this is all just slowness in setting up a connection.

Even torrents take 1-10 seconds to initiate downloads with an absolutely massive network.


If you want to optimize for latency, you can't use a DHT. You need something that goes point-to-point instead of routing through a series of machines.


Which is why a DHT system is not appropriate for web style uses. No one wants to wait multiple seconds for a page to even start downloading. It's fine for a large file download but not for frequent small requests.


When downloading content using a magnet link, downloads usually start quite slowly, while .torrent files usually start right away. It's not a fair comparison since the torrents are from a private tracker with high-quality peers, but it's noticeable that the DHT stuff is slower. Not sure how cjdns solves that.

Will the network be faster the more people join it?

Is centralized infra inevitable for fast things because of all the assumptions and insight a centralized provider can use? BTC is also slow; I don't know of any cryptocurrency that can provide Visa transaction volumes, even the ones that use more power than some entire countries.


Cryptocurrency is different from storage because cryptocurrency needs to provide global consistency guarantees.

Skynet is a decentralized network with lookup times that have a p50 TTFB of under 200ms. It achieves this by looking things up directly on hosts rather than routing through a DHT. There's a bit of overhead to accomplish this (around 200kb of extra bandwidth per lookup), but for a smooth web experience that tradeoff is more than worthwhile.


I keep IPFS Companion turned on constantly, but when I hit a site that's getting loaded through IPFS it often takes so long that I end up turning IPFS off so it just fetches it from the central server.


This echoes my experience as well, and I (used to) run a pinning service.


WireGuard and the usual tools for file search and retrieval work fine.

Do we need all that comes with IPFS? Not just technically, but the user training and pivot of technical doers?

So many of these projects feel like programmer vanity projects; there’s really little difference between them and a guy on the corner telling me why Protestants are wrong and that I should join his flock.

That it’s a technical project and not entirely ephemeral nonsense doesn’t matter; solutions already exist, we just don’t implement them that way.



