
identify -format %n number of images tiff fails large file

Posted: 2012-05-23T11:09:56-07:00
by mmuller
Linux 2.6.31..., 3GB memory. identify peaked at 22% of memory (according to top).
I see no problem with page thrashing or memory (checked with top, vmstat, /proc/#/status).
myfile can be downloaded:
ftp://ftp.swri.org/pub/incoming/multi29 ... bit.tif.gz (1.17GB)
or not gzipped:
ftp://ftp.swri.org/pub/incoming/multi29 ... s16bit.tif (2.15GB)

identify -format %n myfile.tif
never returns (at least not in 4 hours). The file holds images 0-29515.
identify -format %n myfile.tif[29000-30000] returns 516 as expected. -format %n works fine on files with only a few thousand images.

I have 2 alternatives, but neither is as short as using identify (tiffinfo requires another package):
1)
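#!/bin/bash
# Estimate the image count from file sizes, then count only the last block of 1000.
file=myfile.tif
convert -quiet "$file[0]" delme.tif
onesiz=$(ls -ln delme.tif | awk '{print $5}')   # bytes in a single-image file
allsiz=$(ls -lnH "$file" | awk '{print $5}')    # bytes in the whole file
rm delme.tif
nguess=$(( allsiz / onesiz ))
nlow=$(( nguess / 1000 * 1000 ))                # round down to a multiple of 1000
nhigh=$(( nlow + 1000 ))
ncount=$(identify -format %n -quiet "$file[$nlow-$nhigh]")
nImages=$(( nlow + ncount - 1 ))                # index of the last image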
2)
tiffinfo SOK120116_01493480_16bit.tif 2>errs | grep TIFF | wc -l
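The arithmetic in alternative 1 can be isolated into a small helper, which makes it easy to check the probe window before running identify. This is only a sketch: estimate_window and its optional step argument are hypothetical names, and the byte counts in the example are made-up round numbers, not measured sizes.

```shell
#!/bin/bash
# estimate_window ALLSIZ ONESIZ [STEP]   (hypothetical helper, not part of ImageMagick)
# Prints "nlow nhigh": the index range to hand to identify, e.g. myfile.tif[nlow-nhigh].
estimate_window() {
    local allsiz=$1 onesiz=$2 step=${3:-1000}
    local nguess=$(( allsiz / onesiz ))        # rough image count from file sizes
    local nlow=$(( nguess / step * step ))     # round down to a multiple of step
    echo "$nlow $(( nlow + step ))"
}

# Example with illustrative sizes (assumed, not measured):
# estimate_window 2150000000 72400
```

If identify reports a full window (step + 1 images), the real end lies beyond nhigh and the window has to be moved up and probed again.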

Also, after I crop a slice from the middle of each of the 30K images in the multi29515images file and try to paste the 30K resulting files together with convert, that also takes forever. Appending 10K files takes 120 minutes, 5K files takes 7 minutes, and 2K files takes just 0.5 minutes.
The code creates 29516 files from multi29515images.tif, then appends all 29516 tiny slice files into 1 file, like this:
#!/bin/bash
# Crop one slice from each image (indices 0..29515, 29516 files), then append them all.
for (( i = 0; i <= 29515; i++ )); do
    orderFile=$myfilename$(printf "%06d" $i).tif
    convert -quiet -crop "1"x+"95" "$myfile.tif[$i]" tempDir/$orderFile
done
convert tempDir/*.tif +append ${myfilename}_alltogether.tif
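Since the timings above grow much faster than the file count (0.5 minutes for 2K files, 120 minutes for 10K), one thing to try is appending in batches and then appending the batch results. The sketch below is untested against the real data, so whether it helps is an open question. append_in_batches and the CONVERT override variable are hypothetical names introduced here; only convert and +append come from the script above.

```shell
#!/bin/bash
# append_in_batches OUT BATCH FILE...   (hypothetical helper)
# Appends FILEs left to right in groups of BATCH, then appends the group results.
# Set CONVERT to use a different convert binary (assumed convention, defaults to "convert").
append_in_batches() {
    local out=$1 batch=$2
    shift 2
    (( $# > 0 && batch > 0 )) || return 1
    local convert_cmd=${CONVERT:-convert}
    local tmp part=0 partfile rc=0
    local -a chunk parts=()
    tmp=$(mktemp -d) || return 1
    while (( $# > 0 && rc == 0 )); do
        chunk=("${@:1:batch}")                 # next BATCH files (fewer at the end)
        partfile=$(printf '%s/part_%06d.tif' "$tmp" "$part")
        "$convert_cmd" "${chunk[@]}" +append "$partfile" || rc=$?
        parts+=("$partfile")
        shift "${#chunk[@]}"
        part=$(( part + 1 ))
    done
    if (( rc == 0 )); then
        # Zero-padded part names are kept in creation order, so slice order is preserved.
        "$convert_cmd" "${parts[@]}" +append "$out" || rc=$?
    fi
    rm -rf "$tmp"
    return "$rc"
}
```

With the names from the script above, a call might look like: append_in_batches "${myfilename}_alltogether.tif" 2000 tempDir/*.tif. The batch size of 2000 is a guess based on the 0.5-minute timing, not a tuned value.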


Here is the snipped output of identify -verbose on myfile.tif[0]:
Image: myfile_16bit.tif
Base filename: myfile_16bit.tif
Format: TIFF (Tagged Image File Format)
Class: DirectClass
Geometry: 190x190+0+0
Resolution: 100x100
Print size: 1.9x1.9
Units: Undefined
Type: Grayscale
Base type: Grayscale
Endianess: MSB
Colorspace: RGB
Depth: 16-bit
Channel depth:
gray: 16-bit
Channel statistics:
gray:
min: 403 (0.00614939)
max: 737 (0.0112459)
mean: 589.022 (0.0089879)
standard deviation: 48.8253 (0.000745027)
kurtosis: 0.772726
skewness: -0.57136
Histogram:
---snip---
Rendering intent: Undefined
Interlace: None
Background color: white
Border color: rgb(223,223,223)
Matte color: grey74
Transparent color: black
Page geometry: 190x190+0+0
Dispose: Undefined
Iterations: 0
Compression: None
Orientation: TopLeft
Properties:
date:create: 2012-05-09T14:03:47+00:00
date:modify: 2012-05-09T14:03:47+00:00
signature: bfcf539284fd2e11b6b782232322ac5073f665b8d0629e31828110424f15e03d
tiff:photometric: min-is-black
tiff:rows-per-strip: 4
Artifacts:
verbose: true
Tainted: False
Filesize: 1.9998gb
Number pixels: 35.3kb
Version: ImageMagick 6.5.4-8 2009-10-24 Q16 OpenMP http://www.imagemagick.org
identify: myfile_16bit.tif: unknown field with tag 4881 (0x1311) encountered. `TIFFReadDirectory' @ tiff.c/TIFFWarnings/546.

Advice?

Re: identify -format %n number of images tiff fails large f

Posted: 2012-05-23T11:42:40-07:00
by magick
Can you post a URL to your TIFF image? We need to reproduce the problem and track it so we can identify the bottleneck before we can offer any solution.

Re: identify -format %n number of images tiff fails large f

Posted: 2012-05-23T23:38:53-07:00
by anthony
File has 29515 images
You do realise that IM will try to read ALL the images into memory so it can count them!
That is a lot of images, and IM may be reaching memory limits and swapping to disk, which is very slow!

You may be better off with a more specialised TIFF-specific utility.
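
As a rough illustration of what a TIFF-specific count can look like without installing anything, the sketch below walks the chain of image file directories (IFDs) in a classic TIFF and counts them, reading only a few bytes per image with od. It assumes a classic (non-BigTIFF) file with a well-formed IFD chain; count_tiff_dirs, read_bytes and to_int are hypothetical helper names, and it has not been run against the file in this thread.

```shell
#!/bin/bash
# read_bytes FILE OFFSET N: print N bytes at OFFSET as unsigned decimals.
read_bytes() {
    od -An -v -tu1 -j "$2" -N "$3" "$1"
}

# to_int ORDER BYTE...: combine bytes into an integer (ORDER is II or MM).
to_int() {
    local order=$1 value=0 b
    shift
    if [ "$order" = II ]; then
        local -a rev=()
        for b in "$@"; do rev=("$b" "${rev[@]}"); done   # little-endian: reverse
        set -- "${rev[@]}"
    fi
    for b in "$@"; do value=$(( value * 256 + b )); done
    echo "$value"
}

# count_tiff_dirs FILE: print the number of IFDs (images) in a classic TIFF.
count_tiff_dirs() {
    local file=$1 order offset n count=0
    local -a hdr
    hdr=($(read_bytes "$file" 0 8)) || return 1
    case "${hdr[0]},${hdr[1]}" in
        73,73) order=II ;;                     # "II": little-endian
        77,77) order=MM ;;                     # "MM": big-endian
        *) echo "not a TIFF file" >&2; return 1 ;;
    esac
    if [ "$(to_int "$order" "${hdr[2]}" "${hdr[3]}")" -ne 42 ]; then
        echo "not a classic TIFF (BigTIFF is not handled)" >&2
        return 1
    fi
    offset=$(to_int "$order" "${hdr[@]:4:4}")  # offset of the first IFD
    while (( offset != 0 )); do
        # Each IFD: 2-byte entry count, 12 bytes per entry, 4-byte next-IFD offset.
        n=$(to_int "$order" $(read_bytes "$file" "$offset" 2))
        count=$(( count + 1 ))
        offset=$(to_int "$order" $(read_bytes "$file" $(( offset + 2 + 12 * n )) 4))
    done
    echo "$count"
}
```

Usage would be count_tiff_dirs myfile.tif. It spawns od twice per image, so a compiled tool such as tiffinfo should be faster where it is available.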