
> profilers are more important than careful design.

> I have found that it's actually possible to have worse performance with threads, if you write in a blocking fashion

But isn't excessive blocking/synchronization something that should already be tackled in your design, rather than reworked after the fact?

I would expect profiling to mostly lead to micro-optimisations, e.g. combining or splitting the regions over which a lock is held. But while you're still designing, you can look at avoiding as much need for synchronization as possible, e.g. sharing data copy-on-write (no locks required as long as you hold a reference) instead of locking the data on every access. A sketch of that idea follows.
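For concreteness, here's a minimal C++ sketch of the copy-on-write idea (names are made up, and it assumes concurrent writers are serialized elsewhere; the pre-C++20 atomic shared_ptr free functions are used):

    #include <atomic>
    #include <memory>
    #include <vector>

    // Readers pin an immutable snapshot without locking;
    // a writer copies, modifies, and atomically publishes a new snapshot.
    class SharedTable {
    public:
        std::shared_ptr<const std::vector<int>> snapshot() const {
            return std::atomic_load(&data_);  // lock-free read path
        }
        void append(int value) {              // writers serialized by caller
            auto next = std::make_shared<std::vector<int>>(
                *std::atomic_load(&data_));   // private copy of current data
            next->push_back(value);
            std::atomic_store(&data_,
                std::shared_ptr<const std::vector<int>>(std::move(next)));
        }
    private:
        std::shared_ptr<const std::vector<int>> data_ =
            std::make_shared<const std::vector<int>>();
    };

Readers never take a mutex; they just hold the snapshot they grabbed, and any in-flight readers keep the old version alive until they drop it.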

As another commenter says:

> with asyncio we deploy a thread per worker (loop), and a worker per core. We also move cpu bound functions to a thread pool

you can't easily go from, e.g., thread-per-connection to a worker pool. That should have been caught during design. (A rough sketch of the worker-pool shape is below.)
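For reference, this is roughly the shape being contrasted with thread-per-connection: a hedged C++ sketch of a fixed pool (one worker per core, tasks queued), not anyone's production code:

    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    // Fixed-size worker pool: N workers drain one task queue,
    // instead of spawning one thread per connection.
    class WorkerPool {
    public:
        explicit WorkerPool(unsigned n = std::thread::hardware_concurrency()) {
            for (unsigned i = 0; i < n; ++i)
                workers_.emplace_back([this] { run(); });
        }
        ~WorkerPool() {
            { std::lock_guard<std::mutex> lk(m_); done_ = true; }
            cv_.notify_all();
            for (auto& w : workers_) w.join();
        }
        void submit(std::function<void()> task) {
            { std::lock_guard<std::mutex> lk(m_); tasks_.push(std::move(task)); }
            cv_.notify_one();
        }
    private:
        void run() {
            for (;;) {
                std::function<void()> task;
                {
                    std::unique_lock<std::mutex> lk(m_);
                    cv_.wait(lk, [this] { return done_ || !tasks_.empty(); });
                    if (done_ && tasks_.empty()) return;
                    task = std::move(tasks_.front());
                    tasks_.pop();
                }
                task();  // runs outside the lock
            }
        }
        std::mutex m_;
        std::condition_variable cv_;
        std::queue<std::function<void()>> tasks_;
        std::vector<std::thread> workers_;
        bool done_ = false;
    };

CPU-bound work gets submit()-ed rather than getting its own thread; retrofitting this onto a thread-per-connection codebase means rethreading every blocking call site, which is why it's a design-time decision.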



> But isn't excessive blocking/synchronization something that should already be tackled in your design, rather than reworked after the fact?

Yes and no. Again, I have not profiled or optimized servers or interpreted/JIT-compiled languages, so I bet there's a different ruleset there.

Blocking can come from unexpected places. For example, if we use dependencies, then we don't have much control over the resources accessed by the dependency.

Sometimes, these dependencies are the OS or the standard library. We would sometimes have to choose alternate system calls, as the ones we initially chose caused issues that were not exposed until we ran the profiler. One classic example of this is sketched below.
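An illustration of that kind of swap (not necessarily what the author hit): with lseek()+read(), threads sharing a file descriptor contend on the shared file offset and serialize, while pread() takes an explicit offset and sidesteps that entirely:

    #include <unistd.h>

    ssize_t read_block(int fd, char* buf, size_t len, off_t offset) {
        // Instead of: lseek(fd, offset, SEEK_SET); read(fd, buf, len);
        // pread() carries its own offset, so concurrent readers on the
        // same fd don't fight over the shared file position.
        return pread(fd, buf, len, offset);
    }

That's exactly the kind of change nothing flags until a profile shows threads piling up behind a single descriptor.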

In my experience, the killer for us was often cache-breaking. Things like the length of the data in a variable could determine whether it stayed in a register or low-level cache or got evicted, and the impact could be astounding. This could lead to remedies like applying a visitor to break up a [supposedly] inconsequential temp buffer into cache-friendly bites, as in the sketch below.
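A toy version of that remedy (names and sizes invented; the real chunk size would be tuned to the target's cache): run every processing stage over one cache-sized chunk before moving on, instead of making several full passes over a buffer too big to stay resident.

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    constexpr std::size_t kChunk = 16 * 1024;  // hypothetical cache-friendly size

    void process(std::vector<float>& buf) {
        for (std::size_t base = 0; base < buf.size(); base += kChunk) {
            const std::size_t end = std::min(buf.size(), base + kChunk);
            // Do all stages on this chunk while it is still hot in cache.
            for (std::size_t i = base; i < end; ++i) buf[i] *= 2.0f;  // stage 1
            for (std::size_t i = base; i < end; ++i) buf[i] += 1.0f;  // stage 2
        }
    }
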

Also, we sometimes had to recombine work that we had split across threads, because the split itself hurt the cache.
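The usual culprit when a split across threads hurts the cache is false sharing; the standard fix, sketched here under the common assumption of 64-byte cache lines (C++17 for the over-aligned vector storage), is to pad per-thread data onto separate lines:

    #include <cstdint>
    #include <thread>
    #include <vector>

    struct alignas(64) PaddedCounter {  // one cache line per counter
        std::uint64_t value = 0;
    };

    std::uint64_t count_in_parallel(unsigned n_threads, std::uint64_t iters) {
        std::vector<PaddedCounter> counters(n_threads);
        std::vector<std::thread> threads;
        for (unsigned t = 0; t < n_threads; ++t)
            threads.emplace_back([&counters, t, iters] {
                for (std::uint64_t i = 0; i < iters; ++i)
                    ++counters[t].value;  // each thread touches only its own line
            });
        for (auto& th : threads) th.join();
        std::uint64_t total = 0;
        for (auto& c : counters) total += c.value;
        return total;
    }

Without the alignas, adjacent counters land on one cache line and the cores ping-pong it back and forth, which can make the threaded version slower than recombined single-threaded work.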

Unit testing could be useless. For example, the test images we often used were of the classic "Photo Test Diorama" variety: a bunch of stuff crammed onto a well-lit table, with a few targets.

Then, we would run an image from a pro shooter, with a Western prairie skyline, and the lengths of some of the convolution target blocks would be different. This could sometimes hurt the cache, demoting a buffer. It taught us to use a large pool of test images, which was sometimes quite difficult. In some cases, we actually had to use synthesized images.

Since we were working on image processing software, we were already doing this in other work, but we learned to do it in the optimization work, too.

When my team was working on C++ optimization, we had a team from Intel come in and profile our apps.

It was pretty humbling.



