CHROMiX

BUG: Wrong values extracted from images with dim > 500px

This one’s taken days to track down. Even wrote a few C++ programs to look at pixel values in TIFF files to help figure it out.

Fortunately for you, once I figured out what’s going on, there’s a much simpler way to illustrate the bug. Boy, the other post I thought I was going to have to write was going to be a much longer one.

The way that I ran into this bug was from graphing images that I knew didn’t have LAB values within a few points of L* 100, and ColorThink was graphing pixels above that. (Scanned files converted from the scanner’s ICC profile to ProPhoto RGB using absolute colorimetric, so the whitepoint wasn’t scaled. There simply can’t be pixels of L* values higher than the profile’s whitepoint.)

… So, this isn’t some weird case with the sample TIFF I’m presenting. It is going to affect all photographs and scanned images that have a dimension greater than 500 pixels. I have no idea how many wrong decisions I’ve made from not having noticed this before…

Reproduce - From Scratch
Skip steps 1-4 by downloading this 640x640px TIFF file - which contains alternating rows of black and white, and is saved with LZW (lossless) compression.

(Step 1) Create a small document in Photoshop. I started with 10x10 pixels. Zoom in as far as you can - probably to 3200%.

(Step 2) Make the document contain alternating rows of black and white. Select the pencil tool. Make it 1 pixel, 100% hardness, 100% opacity. Draw a line in rows 1, 3, 5, 7, and 9.

(Step 3) Turn this into a document with its long dimension LARGER than 500 pixels, containing the same alternating pattern. I went to 640x640 pixels. I did this by repeatedly turning the background layer into its own layer, doubling the canvas size in both dimensions with the layer anchored to the upper left, duplicating the layer three times, moving the duplicates into position, merging visible layers, and repeating. You get from 10x10 to 640x640 in only six doublings - doesn’t take long at all.

(Step 4) Flatten the document, and save to a TIFF file. Use no compression, LZW, or ZIP for a lossless save. Don’t use JPEG compression.
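If you'd rather not click through Photoshop, the same pattern from steps 1-3 can be built programmatically. A minimal Python sketch of the test image as a 2D array (writing it out as a lossless TIFF is left to your imaging tool of choice):

```python
# Build a 640x640 image of alternating black (0) and white (255) rows,
# matching the pattern from steps 1-3: rows 1, 3, 5, ... are black.
SIZE = 640

def alternating_rows(size):
    """Return a size x size grid: even-indexed rows black, odd rows white."""
    return [[0 if y % 2 == 0 else 255 for _ in range(size)] for y in range(size)]

image = alternating_rows(SIZE)

# The file should contain exactly two distinct values: pure black and pure white.
unique = {v for row in image for v in row}
print(sorted(unique))   # [0, 255]
print(SIZE * SIZE)      # 409600 pixels in total
```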

Reproduce - Can jump in here at Step 5 if you download this file

(Step 5) In ColorThink, go into a 3D Graph, and open the TIFF file. You should see two points: one at LAB 0/0/0, and another at LAB 100/0/0. All the pixels in the TIFF should bunch up at those two exact points. However, you won’t see a dot at either of them. You’ll see quite a few dots at X/0/0 - but not one at L* 0 or 100.

(Step 6) In ColorThink, go into Color Worksheet. Open the same TIFF file. (You’ll notice the thumbnail doesn’t look like alternating black and white stripes. This does NOT matter, at least it shouldn’t. Displaying a small thumbnail which can’t handle such a fine pattern shouldn’t affect the other functionality of the program. You get the same effect in any image software, including Photoshop, when you zoom out.) Now, next to the Images label click the TIFF filename. Select “Extract Unique Color Values”. Wait for it, expecting to see two rows, one for LAB 0/0/0 and another for LAB 100/0/0. Nope! In the case of 640x640px, you get 21 rows showing LAB values ranging between L* 13.39 and 92.46.

What’s going on here?

Well, if you look at the Color Worksheet where you opened the image and extracted the unique values, you’ll see the Name fields have values of 500, 10000, or 39500. These numbers are the number of times ColorThink found that RGB combination in the image.

Add all those up, and you come up with 250,000 pixels. However, our 640x640 pixel file has 409,600 pixels. Just so happens the square root of 250,000 is 500.

If you use files of other dimensions, you wind up with a total that is always 250,000 pixels or fewer. Divide the number of counted pixels by 500, and you get exactly the other dimension of a 500-pixel-long-side image with the same aspect ratio as the one you gave ColorThink.

This leads me to believe that when ColorThink opens an image larger than 500 pixels on its long dimension, it internally resizes it to exactly 500 pixels on the long dimension without showing you that it does this.
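The arithmetic behind that hypothesis is easy to check. A sketch of what I suspect ColorThink does internally (my guess, not its actual code):

```python
import math

def downsampled_size(width, height, limit=500):
    """My guess at ColorThink's internal resize: scale the long side
    down to `limit` pixels, preserving the aspect ratio."""
    long_side, short_side = max(width, height), min(width, height)
    if long_side <= limit:
        return width, height
    scaled_short = round(short_side * limit / long_side)
    return (limit, scaled_short) if width >= height else (scaled_short, limit)

# Our 640x640 test file: 409,600 real pixels, but only 250,000 counted.
w, h = downsampled_size(640, 640)
print(w * h)               # 250000
print(math.isqrt(w * h))   # 500

# A non-square example: counted pixels divided by 500 recovers the short side.
w, h = downsampled_size(1000, 800)
print(w * h // 500)        # 400, i.e. 800 scaled by the same 500/1000 ratio
```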

I think it’s reasonable to scale it down. I have no problem with that, although being able to quickly extract unique values from the full image would of course be better. When I forced ColorThink to take a CGATS text file containing all 1.7 million pixels of a reasonably sized 1212x1418px image, it choked: it got up to 1.8GB of memory in Task Manager, then crashed.

Scaling it down properly wouldn’t cause this issue; you’d just run the risk of a few pixels not being graphed.

The only explanation that makes sense to me is that ColorThink is internally using an improper resampling technique during this downsizing, such as bilinear or bicubic. If you use one of these resampling techniques in Photoshop, you also lose the pure black and white pixels.

This resizing is what creates RGB & LAB values in ColorThink that aren’t in the image.

In Photoshop, using nearest neighbor keeps pure black and white pixels - while of course destroying the alternating pattern. (It couldn’t keep the pattern unless you cropped it rather than resized/resampled it down.)
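The difference is easy to show in one dimension. Averaging adjacent pixels (what bilinear-style filters do when halving) manufactures mid-gray values that were never in the image, while nearest neighbor can only ever pick values that already exist. A minimal sketch, not any particular resampler's exact math:

```python
# One column of our test image: alternating pure black and pure white.
column = [0, 255] * 8   # 16 pixels

def halve_by_averaging(values):
    """Average each pair of neighbors, as bilinear-style filters do
    when downsizing by 2x - this invents brand-new values."""
    return [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]

def halve_by_nearest(values):
    """Keep every other pixel; only values already in the image survive."""
    return values[::2]

print(halve_by_averaging(column))  # [127.5, 127.5, ...] - values not in the image
print(halve_by_nearest(column))    # [0, 0, ...] - pure black, but the pattern is gone
```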

The bottom line is that ColorThink should not show RGB or LAB values that aren’t actually in an image. This affects:

  • Graphing an image to determine which material can best represent the image’s gamut.
  • Using Color Worksheet to track profile transformations.

A workaround until a new version is released is to downsize your image to 500 pixels on the longest dimension in your editor, using nearest neighbor. I’ve always downsized to 1000 pixels on the largest dimension, knowing that was small enough it wouldn’t take ColorThink down.

BONUS BUG

  • If I crop the 640x640px image to 1x640px, ColorThink silently fails when opening it in Color Worksheet or 3D graphing. Once you click Open, the file dialog box goes away, and nothing happens - no error. Best guess (I’m less sure of this than everything above) is that it computes a target size of 0x500px and just skips the whole process.
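My guess about that bonus bug, as a sketch: scaling a 1x640 image so the long side is 500 rounds the short side down to zero, and a 0x500px image has nothing to process. (This assumes integer truncation internally, which I can't verify.)

```python
def downsampled_short_side(short, long, limit=500):
    """Hypothetical target for the short side when the long side is
    scaled to `limit`, assuming integer truncation."""
    return short * limit // long

print(downsampled_short_side(1, 640))    # 0 - a 0x500px image, nothing to open
print(downsampled_short_side(640, 640))  # 500 - the normal square case
```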

I appreciate all the hard work that went into the diagnosis of this… bug, but it’s actually a documented behavior:

http://www.colorwiki.com/index.php?title=ColorThink_Pro_-_Color_Worksheet_Part_1&oldid=7713#Automatic_down-sampling

But, to be fair, it would be best if ColorThink let you know what was going on.

Also, the 500px limit (and the downsampling method) was set quite some time ago and both could be updated for the more modern machines we use today.

This is also interesting. I’ll see about killing this one in the next release.

thanks,

Steve

Very welcome. I hadn’t seen the documentation saying it automatically downsamples to 500 pixels. I see that now, and agree that part is documented.

As long as it used a downsampling method that chose pixel values that were actually in the original image, that would be fine with me. Nearest neighbor, or something to that effect.

It’s not doing that, though; it’s using a downsampling method that creates new RGB values.

See a previous post I made at http://one.imports.literateforums.com/t/colors-expand-converting-from-printer-profile-to-prophotorgb/1008/8 about “Colors expanding converting from printer profile to ProPhotoRGB”, which I now know was caused by the resampling method chosen. It perfectly illustrates the problem with using a downsampling method that tries to make the image look nicer.

All of the pixels in graph 3 of that post should be in the gamut of the printer profile, but are massively expanded outside of it.