Please don't use lossy compression for scientific data.
Disk space is cheap relative to people-time and data-collection-time.
"Perceptual" is misleading- we're not just humans looking at jpegs. Many people are doing hard-code quantitative analysis for machine learning. better to downscale the images (2X, 4X, 8X or more) than to use lossy compression if you are truly low on disk budget.
It sounds like there should be a standard for compression of scientific data, probably as a noise ceiling instead of a compression ratio, if the noise is normally distributed (a big if). But if I had to choose, I would side with you and vote to use lossless in order to avoid a whole class of errors that might come from lossy compression. Especially when we can get a 1 TB SSD for under $50.
To play devil's advocate, I think there might be an opportunity for lossy compression of images below the Nyquist rate of the microscope:
I only bring this up because I worked with industrial cameras at a previous job and even the best lenses only got us to maybe 1/2 or 1/4 of the resolution that the sensors were capable of. Microscopes should probably include an effective resolution in each image file's metadata. If they don't, then we have no data on how noisy the image is, so this is all a moot point.
Next, when you make a point like "we can get a 1 TB SSD for under $50", it shows you're not familiar with the domain microscopists work in. Their scopes typically cost $100K (with a camera for another $50K and a stage for another $10K). The cost of the storage system (or parking the data in S3) for 10 years is usually far less than the capital cost of the scope, or the operational expense of maintaining staff on site.
I believe the effective resolution can already be computed from the captured metadata, which includes the objective's numerical aperture as well as the wavelength used for imaging (many microscopes today use lasers at a very specific wavelength to excite fluorescence, which itself has a fairly tight emission spectrum).
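One standard way to turn that metadata into a resolution figure is the Abbe diffraction limit, d = λ / (2·NA). A minimal sketch (the function name and the example NA/wavelength values are my own illustrative assumptions):

```python
def abbe_limit_nm(wavelength_nm: float, numerical_aperture: float) -> float:
    """Smallest resolvable feature size (Abbe limit), in nanometres.

    d = wavelength / (2 * NA)
    """
    return wavelength_nm / (2.0 * numerical_aperture)

# e.g. a 488 nm laser line through a 1.4 NA oil-immersion objective:
d = abbe_limit_nm(488.0, 1.4)   # roughly 174 nm
```

Comparing d against the pixel size projected into sample space would tell you how far the image is oversampled, which is exactly the headroom the devil's-advocate compression idea would exploit.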
Thanks, those are better terms. I remember doing a test where the camera photographs a ceramic plate with an almost perfectly black diagonal line to calculate the effective resolution, but I can't remember what it's called; maybe it's related to the diffraction limit. It's basically a measure of how blurry the image is, where a matched lens and sensor would look like a Bresenham line with no antialiasing.
TBH, I think that the scientific community has huge blind spots, mostly due to its own gatekeeping. Struggles like this one remind me of the design-by-committee descent of web development: the brilliant academic work of the 1990s has been mostly replaced by nondeterministic async soup and build processes of such complexity that we have to be full-stack developers just to get anything done. It's all the fault of private industry hoarding the wealth and avoiding any R&D that might risk its meal ticket, starving the public of grants, let alone reforms that are actively blocked, like UBI. Nobody ever seems to step back and examine problems from first principles anymore. That's all I was trying to do.
Edit: I found the term for calculating the effective resolution of a camera with a high-contrast diagonal line, it's "slanted edge modulation transfer function (MTF)":