I see a lot of notes about EFS's performance in the comments. I figured it's at least worth noting, for anyone considering using ECS with EFS, that just last week EFS had its read IOPS on the General Purpose tier increased by 400%.
That probably won't solve all EFS performance issues, but it's a pretty big boost and a nice announcement to come alongside ECS support.
Yes, these containers are supposed to be stateless, but I was tasked with converting an app at my previous job over to using ECS on Fargate and we hit so many issues because of the limits on storage per container instance. We ended up having to tweak the heck out of nginx caching configurations and other processes that would generate any "on disk" files to get around the issues. Having EFS available would have made solving some of those problems so much easier.
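The kind of tweak I mean is roughly this: cap nginx's on-disk proxy cache so it fits within the task's ephemeral storage (paths, sizes, and the upstream name are just illustrative):

    # keep the proxy cache small enough to fit the task's ephemeral storage
    proxy_cache_path /tmp/nginx-cache levels=1:2 keys_zone=app_cache:10m
                     max_size=500m inactive=10m use_temp_path=off;

    server {
        location / {
            proxy_cache app_cache;
            proxy_pass  http://app_upstream;  # placeholder upstream
        }
    }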
I've also been wanting to use ECS on Fargate for running scheduled tasks with large files (50 GB+), but it wasn't really possible given the previous 4 GB limit on storage.
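With the new support, that kind of task can mount EFS straight from the task definition (on Fargate this needs platform version 1.4, if I recall correctly). A minimal sketch, where the file system ID, image, and paths are placeholders:

    {
      "family": "big-file-batch",
      "volumes": [
        {
          "name": "shared-data",
          "efsVolumeConfiguration": {
            "fileSystemId": "fs-12345678",
            "rootDirectory": "/"
          }
        }
      ],
      "containerDefinitions": [
        {
          "name": "job",
          "image": "example/batch-job:latest",
          "mountPoints": [
            { "sourceVolume": "shared-data", "containerPath": "/data" }
          ]
        }
      ]
    }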
> Yes, these containers are supposed to be stateless,
You got it backwards. NFS-type services help containers be stateless because the file system is a separate service accessed through an interface, where all the state is handled by a third party.
Thus by using an NFS-type service to store your local files, you are free to kill and respawn containers at will because their data is persisted elsewhere.
Containers shouldn't necessarily be stateless; most existing code doesn't know how, or doesn't want, to talk to services via RPC interfaces. In some sense, a mounted remote filesystem is just a standard API the OS provides for accessing state in a convenient way that happens to be high performance, indexed, etc.
Oh man, awesome. We had a rather janky workload where ECS would spin up an EC2 that would then mount an EFS volume and then write a file over to S3. This is going to make that so much easier and cleaner.
If you're wondering why you'd ever have to do something like that, the answer is SAP.
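For the curious, the moving parts were roughly this (file system ID, paths, and bucket are placeholders; it assumes amazon-efs-utils is installed and the instance role can write to the bucket):

    # on the EC2 instance ECS spun up
    sudo mkdir -p /mnt/efs
    sudo mount -t efs fs-12345678:/ /mnt/efs
    aws s3 cp /mnt/efs/export/output.dat s3://example-bucket/output.dat
    sudo umount /mnt/efs

With native EFS volumes, a task could presumably mount the file system itself and do the copy, no dedicated EC2 needed.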
(One of the product managers on the Amazon EFS team here). We have many customers that use EFS for a wide variety of use cases, including hosting Postfix. As with all applications, performance needs are relative. Use EFS if your application requires consistently low single-digit-ms latencies, a shared POSIX file system, and a pay-as-you-go elastic usage model. As with all AWS services, EFS is continually launching greater performance capabilities, including higher IOPS, higher throughput, and lower latencies, to meet the needs of our customers. As an example, on 4/1, EFS launched a 400% improvement in read IOPS for its General Purpose performance mode, from 7,000 to 35,000. Given the type of file system operations that Postfix performs, it should benefit nicely from this improvement.
How's the performance on EFS? Has anyone used it in production that is willing to share their experience?
We evaluated it for a relatively simple use case, and the performance seemed abysmal, so we didn't select it. I'm hoping that we made a mistake in our evaluation protocol, which would give me an excuse to give it another try.
It's terrible. Very slow when we tried to use it. There are ways to work around this, and ways to tune the performance, but honestly it was not worth it for our use case and instead we found a way to make EBS work.
EFS is a great way to get a lot of iowait on your cpu graphs. Would not recommend it for anything that had to be fast.
AWS just last week upped the read IOPS on EFS pretty significantly (4x). That probably won't solve all of your speed problems, and it's still not as fast as EBS, but it might be worth giving it another try if your workload isn't write-heavy.
I have to agree with this: I found it challenging to tune EFS to get the claimed performance. The most important details of any system like this are to provision a very large filesystem, use large files, and use lots of concurrent access (threads or machines).
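As a rough illustration of what "lots of concurrent access" means in practice, a benchmark along these lines (mount path and sizes are illustrative) tends to show EFS in a much better light than a single-threaded copy:

    # 16 concurrent readers streaming large files in 1 MB blocks
    fio --name=efs-read --directory=/mnt/efs/bench \
        --rw=read --bs=1M --size=2G \
        --numjobs=16 --ioengine=libaio --iodepth=16 \
        --direct=1 --group_reporting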
There is a whole market of small companies that make high performance filers that do what many people want but they also have limits (high cost/byte).
Can you say more about what you did with EBS? It seems like it would be necessary to make some compromises in availability and disaster recovery because any given EBS volume is restricted to the availability zone where it was created.
We were hosting third-party software in an EKS cluster and needed a way to share state between components of the system. We tried EFS initially, but it actually killed the EKS cluster with iowait under load. We found a way to divert most of the system's requirements to local emptyDir volumes, leaving only infrequently accessed media files on EFS.
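The split looked roughly like this (names are illustrative; the media claim assumes an EFS-backed PersistentVolume, e.g. via the EFS CSI driver):

    apiVersion: v1
    kind: Pod
    metadata:
      name: app
    spec:
      containers:
      - name: app
        image: example/app:latest        # placeholder image
        volumeMounts:
        - name: scratch
          mountPath: /var/cache/app      # hot, frequently written data stays node-local
        - name: media
          mountPath: /srv/media          # infrequently accessed media files on EFS
      volumes:
      - name: scratch
        emptyDir: {}
      - name: media
        persistentVolumeClaim:
          claimName: efs-media           # PVC bound to an EFS PersistentVolume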
(One of the product managers on the Amazon EFS team here). Drupal is a common application that customers use with EFS, both in combination with a CDN and without. The following are two important considerations when running Drupal on an NFS service like EFS:
i) You should configure the OPcache so that it does not revalidate its cache on every request. Cache validation uses stat() in a serial loop on potentially hundreds of files, where each stat() would add O(ms) to the request.
ii) We recommend you store log files locally. NFS does not define an atomic O_APPEND operation, so appends require a file lock to prevent interleaving with appends from other clients. I've seen PHP applications do hundreds of file locks on a log file per request, each adding O(ms) to the total request latency. This is what you'd like to avoid.
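On point (i), a minimal php.ini sketch (illustrative values; with validate_timestamps=0 you have to reset OPcache or reload PHP-FPM on deploys):

    opcache.enable=1
    ; skip the per-request stat() loop entirely
    opcache.validate_timestamps=0
    ; or, less aggressively, keep validation but rate-limit it:
    ; opcache.validate_timestamps=1
    ; opcache.revalidate_freq=60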
It is highly dependent on your needs. It's NFS, and performs accordingly (though EFS has been rock solid in a few different scenarios, both in availability and in baseline performance, assuming you use provisioned throughput).
Should you run a database on EFS? No. Can you use it to back media files for a web application that are cached using a CDN, or for data files used for processing or temporary storage? Yes, it shines in those use cases... and it's cheaper than dedicating the time required to maintain your own NFS cluster.
Even Gluster or Ceph is, IMO, not worth the effort unless you (a) know how to run and maintain it, and (b) absolutely need the potential speed up that you can get, assuming a well-configured and well-maintained system.
It feels like the performance and cost are really built around a very specific use case that basically boils down to "write logs and only read a tiny fraction of those logs".
And then, I've seen way too many people treat it like a traditional file system, and stick things on it that don't expect to find themselves on NFS, and wonder why they get corrupted files.
And, really, I tend to avoid the AWS services with "Burst Balances". It's painful to get a system running smoothly only to have it grind to a halt when you use it under load because some burst balance somewhere went to zero. Your mileage may vary, of course.
Trust me, I know. We have alarming on all of our SSD burst balances after a few painful lessons.
At least those are mostly OK: in our case, the really EBS-hungry clients now have volumes of 1024 GB or more, so the burst balance issues don't apply.
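For anyone who hasn't set that alarming up, a sketch of a per-volume alarm on the EBS BurstBalance metric (volume ID and SNS topic are placeholders):

    aws cloudwatch put-metric-alarm \
      --alarm-name ebs-burst-balance-low \
      --namespace AWS/EBS \
      --metric-name BurstBalance \
      --dimensions Name=VolumeId,Value=vol-0123456789abcdef0 \
      --statistic Average --period 300 --evaluation-periods 3 \
      --threshold 20 --comparison-operator LessThanThreshold \
      --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts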
For a high performance shared file system on AWS, an alternative is ObjectiveFS[0]. It uses memory caching to achieve performance closer to local disk.
Technically ECS supported EFS before, but you had to configure everything manually (or with your own automation). Having it native is a lot nicer, and brings provisioning of NFS-style volumes up to par with the current Kubernetes experience.
https://aws.amazon.com/about-aws/whats-new/2020/04/amazon-el...