JPEGsnoop - JPEG File Decoding Utility
by Calvin Hass © 2008
JPEGsnoop is a free Windows application that examines and decodes the inner details of JPEG and MotionJPEG AVI files. It can also be used to analyze the source of an image to test its authenticity.
Overview
Introduction

Every digital photo contains a wealth of hidden information -- JPEGsnoop was written to expose these details to those who are curious.
Not only can one determine the various settings that were used in the digital camera in taking the photo (EXIF metadata, IPTC), but one can also extract information that indicates the quality and nature of the JPEG image compression used by the camera in saving the file. Each digital camera specifies its own compression quality levels, many of them wildly different, which means that some cameras produce far better JPEG images than others.
While there are several free JPEG decoding utilities out there, most are unsupported and command-line based (e.g. JPEGdump or dumpJPEG).
What can I do?
Check out a few of the many possible uses for JPEGsnoop!
One of the latest features in JPEGsnoop is an internal database that compares an image against a large number of compression signatures. JPEGsnoop reports what digital camera or software was likely used to generate the image. This is extremely useful in determining whether or not a photo has been edited / tampered with in any way. If the compression signature matches Photoshop, then you can be pretty sure that the photo is no longer an original! This type of feature is one of several sometimes referred to as Digital Image Ballistics / Forensics.
JPEGsnoop reports a huge amount of information, including: quantization table matrix (chrominance and luminance), chroma subsampling, estimated JPEG Quality setting, JPEG resolution settings, Huffman tables, EXIF metadata, Makernotes, RGB histograms, etc. Most of the JPEG JFIF markers are reported. In addition, you can enable a full Huffman VLC decode, which will help those who are learning about JPEG compression and those who are writing a JPEG decoder.
Other potential uses: determining the quality setting used in Photoshop's Save As or Save for Web dialogs, improving your scanner quality settings, locating recoverable images / videos, decoding AVI files, examining .THM files and JPEG EXIF thumbnails, extracting embedded images from Adobe PDF documents, etc.
File Types Supported
JPEGsnoop will open and attempt to decode any file that contains an embedded JPEG image, such as:
- .JPG - JPEG Still Photo
- .THM - Thumbnail for RAW Photo / Movie Files
- .AVI* - AVI Movies
- .DNG - Digital Negative RAW Photo
- .CRW, .CR2, .NEF, .ORF, .PEF - RAW Photo
- .MOV* - QuickTime Movies, QTVR (Virtual Reality / 360 Panoramic)
- .PDF - Adobe PDF Documents
* Note that video file formats (such as .AVI and .MOV) are containers, which can include video streams encoded in one of a wide variety of codecs. JPEGsnoop can only interpret this video footage if the codec used is based on Motion JPEG (MJPG).
Fixing Corrupt Photos!
After spending considerable time analyzing JPEG bitstreams, I am now able to offer to fix your corrupted photos!
Download the Latest Version of JPEGsnoop!
Click to Download .ZIP
Version: 1.2.0 (Version History)
Released: 01/22/2008
Downloads: 13381

Help Support JPEGsnoop Development
If you have found JPEGsnoop useful and would like to support its continued development, consider making a small contribution. Donations will help encourage me to add new and interesting features. Found an interesting use for the tool? Let me know!

Beta Version Available!
Version: 1.2.1 (beta 3) (Version History)
Released: 04/09/2008
Click to Download .ZIP
Please note that beta versions have not seen as much testing and therefore may not be as reliable. As always, please use JPEGsnoop on a copy of your images. If you encounter any issues with the beta, please let me know. Thanks!

System Requirements
This application has been designed and tested to run on Windows XP and Windows Vista, but it should also work for Windows 95/98/NT/2000.

Terms of Use
JPEGsnoop is free for personal use. Commercial users are kindly asked to make a contribution.

Installation
No installation required. Simply unzip the download and run!

Version History
For information about features added in previous versions of JPEGsnoop, please check out the version history page.
[Screenshot: Main Window]
[Screenshot: Channel Histograms]
[Screenshot: MCU Grid & Positioning]
Documentation
Please see the options page for information on how to use JPEGsnoop, as well as other interesting uses for the tool.
Recent Features
- Full detailed Huffman VLC decoding output for those interested in writing a decoder or learning JPEG compression
- Automatic display of YCC DC block values (16-bit)
- MCU Grid overlay and automatic display of mouse MCU position and file offset in image display window.
- Test overlay function enhanced to allow quick apply and binary code readout.
- Full-resolution image view - Inverse DCT decode done on (DC+AC) in addition to DC-only mode. Marking of partial MCU boundaries.
- Image zoom level from 12.5% - 800%.
- Extract embedded JPEGs -- can be used to extract thumbnails, hidden JPEG files, as well as frames from Motion JPEG AVI files.
- Compression detection enhanced to detect rotated signatures, comment field.
- Complete rewrite of user interface: clipboard, printing, find, unlimited report log length, etc.
- Drag & Drop files to main window to open
- Full AVI file parsing (to identify MotionJPEG)
- DQT table searches in Executables (for "hackers")
- Management of user signature database
- Automatic checking for new updates
- Configuration options and user preferences retained
- Increased image decode performance by 10x
- Enhanced support of DNG files
- Detect edited images or identify original digital camera that took a photo!
- Integrated database of thousands of compression signatures (image fingerprint) for digital cameras and editing software
- FlashPix decode (partial)
- Proper decoding of Grayscale images
- Horizontal scrolling of enlarged previews (useful when zooming in to preview)
- File overlay test function
- Multi-channel preview: RGB, YCC, R/G/B, Y/Cb/Cr
- Multiple zoom levels for MCU block preview (100% - 800%)
- Pixel position lookup into file offset
- Examine Motion JPEG .AVI or .MOV (Quicktime) files (MJPG or MJPEG) and play through!
- Examine any file fragments that may contain a JPEG image
- Decode JPEG JFIF Thumbnail image
- Search forward/backward for next image (SOI)
- User-specified offset into files (hex or decimal)
- Display IPTC IIM v4 Metadata decode / extraction
- Display and dumping of Luminance DC Histogram
- Split screen view of Image & Decoding data
- Drag and drop onto File Icon to open
- RGB channel histogram display
- Color correction / clipping statistics reports
- Command-line execution
- Better memory handling, improved scan parsing speed
- Huffman variable-length code statistics
- Calculation of compression ratio
- Expansion of DHT (Huffman Table Expansion into bitstrings)
- Determine IJG JPEG Quality factor
- Detection of truncated images
- Detection of YCC or RGB clipping during color conversion step
- Parsing of Scan Data segment (matching variable-length huffman codes)
- Extra options: Makernote decode, Scan Dump, Scan Parsing
- Extract chroma subsampling parameters
Background Material
I strongly recommend reading my articles on JPEG compression in order to understand some of the details that are reported by JPEGsnoop.
Upcoming Features
I'm always looking for new ideas, but here are some of the ones that I'm working on:
- Improved image display and AC scan decode performance.
- Enhanced Digital Image Ballistics / Forensics / Tampering utilities
- Detection and enhancement of embedded JPEG quality in software
- Full AVI / MotionJPEG (MJPEG) video file decoding, including parsing of RIFF
- Video analysis: run-time plots of bandwidth, JPEG quality, etc.
- Correct / fix corrupted JPEG images!
- JPEG Comparison Test
This feature can be used to perform a lossless rotation test on your JPEG images. Many people are unsure whether or not their software uses lossless rotation. This test will let you know!
- Batch processing
Suggestions
As this is a work in progress, I would be very interested in hearing from you, particularly for feature requests, suggestions, comments, bug reports, etc. If you currently use JPEGsnoop and find it useful, let me know!
Reader's Comments:
Please leave your comments or suggestions below!

Thanks for the very great job!
Here's the data:
Data copied and pasted from a hex editor:
FF DB 00 C5 00 04 03 03 03 03 02 04 03 03 03 04
04 04 05 06 0A 06 06 05 05 06 0C 08 09 07 0A 0E
0C 0F 0E 0E 0C 0D 0D 0F 11 16 13 0F 10 15 11 0D
0D 13 1A 13 15 17 18 19 19 19 0F 12 1B 1D 1B 18
1D 16 18 19 18 01 04 04 04 06 05 06 0B 06 06 0B
18 10 0D 10 18 18 18 18 18 18 18 18 18 18 18 18
18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18
18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18
18 18 18 18 18 18 02 04 04 04 06 05 06 0B 06 06
0B 18 10 0D 10 18 18 18 18 18 18 18 18 18 18 18
18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18
18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18
18 18 18 18 18 18 18 FF C4
Copy and paste from JPEGsnoop:
*** Marker: DQT (xFFDB) ***
Define a Quantization Table.
OFFSET: 0x00001C49
Table length = 197
----
Precision=8 bits
Destination ID=0 (Luminance)
DQT, Row #0: 4 3 2 4 6 10 12 15
DQT, Row #1: 3 3 3 5 6 14 14 13
DQT, Row #2: 3 3 4 6 10 14 17 13
DQT, Row #3: 3 4 5 7 12 21 19 15
DQT, Row #4: 4 5 9 13 16 26 25 18
DQT, Row #5: 6 8 13 15 19 25 27 22
DQT, Row #6: 12 15 19 21 25 29 29 24
DQT, Row #7: 17 22 23 24 27 24 25 24
Approx quality factor = 88.05 (scaling=23.90 variance=1.21)
----
Precision=8 bits
Destination ID=1 (Chrominance)
DQT, Row #0: 4 4 6 11 24 24 24 24
DQT, Row #1: 4 5 6 16 24 24 24 24
DQT, Row #2: 6 6 13 24 24 24 24 24
DQT, Row #3: 11 16 24 24 24 24 24 24
DQT, Row #4: 24 24 24 24 24 24 24 24
DQT, Row #5: 24 24 24 24 24 24 24 24
DQT, Row #6: 24 24 24 24 24 24 24 24
DQT, Row #7: 24 24 24 24 24 24 24 24
Approx quality factor = 87.95 (scaling=24.11 variance=0.22)
----
Precision=8 bits
Destination ID=2 (Chrominance)
DQT, Row #0: 4 4 6 11 24 24 24 24
DQT, Row #1: 4 5 6 16 24 24 24 24
DQT, Row #2: 6 6 13 24 24 24 24 24
DQT, Row #3: 11 16 24 24 24 24 24 24
DQT, Row #4: 24 24 24 24 24 24 24 24
DQT, Row #5: 24 24 24 24 24 24 24 24
DQT, Row #6: 24 24 24 24 24 24 24 24
DQT, Row #7: 24 24 24 24 24 24 24 24
Approx quality factor = 87.95 (scaling=24.11 variance=0.22)
*** Marker: DHT (Define Huffman Table) (xFFC4) ***
Thanks!
George
When JPEGsnoop generates and compares a signature, it is considering a number of characteristics about the image in addition to the DQT. However, the DQT tables do make up the majority of this comparative data.
You correctly identified the DQT section within the JFIF file in your hex editor output. If you skip past the marker (FF DB) and the additional header data (00 C5 00), you reach the raw table contents.
The part that may not be immediately obvious is that the sequence of bytes in the file does not match the matrix representation that most people use to document the quantization tables. That is because a zig-zag ordering is used (please refer to my page on JPEG compression for further detail). In essence, you will find the table's bytes in the file bytestream if you traverse the matrix in the following order:
Start at the 04 (top-left corner of the DQT matrix), then move across to the 2nd column (03). Advance diagonally down and to the left (a single move brings us to the 1st column, 2nd row = 03). Now drop down one row (to the 1st column, 3rd row = 03) and head back up diagonally to the right (03 02). Continue this zig-zag sequence until every entry in the matrix has been visited!
Hopefully this makes it a little clearer. Ideally, I should draw up a diagram to help indicate the sequencing.
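In the meantime, a short script may serve in place of a diagram. Here is a minimal sketch (illustrative only, not JPEGsnoop's own code) that reorders the 64 raw DQT bytes from the bitstream into the row/column matrix shown in the report, using the standard zig-zag index table from the JPEG specification:

```python
# Standard JPEG zig-zag order: ZIGZAG[k] is the natural (row*8 + col)
# position of the k-th byte read from the bitstream.
ZIGZAG = [
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63,
]

def dqt_bytes_to_matrix(raw64):
    """Place 64 zig-zag-ordered DQT bytes into an 8x8 row-major matrix."""
    flat = [0] * 64
    for k, value in enumerate(raw64):
        flat[ZIGZAG[k]] = value
    return [flat[r * 8 : r * 8 + 8] for r in range(8)]

# Example: the 64 luminance bytes that follow FF DB 00 C5 00 in the hex dump
# above; the printed rows match the "DQT, Row #n" lines in the report.
lum = [0x04, 0x03, 0x03, 0x03, 0x03, 0x02, 0x04, 0x03, 0x03, 0x03, 0x04,
       0x04, 0x04, 0x05, 0x06, 0x0A, 0x06, 0x06, 0x05, 0x05, 0x06, 0x0C,
       0x08, 0x09, 0x07, 0x0A, 0x0E, 0x0C, 0x0F, 0x0E, 0x0E, 0x0C, 0x0D,
       0x0D, 0x0F, 0x11, 0x16, 0x13, 0x0F, 0x10, 0x15, 0x11, 0x0D, 0x0D,
       0x13, 0x1A, 0x13, 0x15, 0x17, 0x18, 0x19, 0x19, 0x19, 0x0F, 0x12,
       0x1B, 0x1D, 0x1B, 0x18, 0x1D, 0x16, 0x18, 0x19, 0x18]
for row in dqt_bytes_to_matrix(lum):
    print(row)
```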
I know it's too much, but I'm curious if there's a tool for all formats, or separate tools for each format.
Thanks for the reply. I finished my decoder/encoder, but I am facing a problem on the Linux platform: the output of the encoder is a 0-byte file. Please tell me, is there any specific writing format on Linux, or do I have to change the JPEG header?
Waiting for your reply.
regards,
giri
I was wondering whether you would consider adding the ability to export the image to a file in XYZ colourspace, decoded in accordance with the sYCC specification, i.e. using unclipped rgb values and the gamma encoding mirrored about zero. Ideally calculated with 16bit precision to a 16bpc tiff.
Since I posted here last summer I have only found two programs that seem to understand the sYCC colourspace. It is strange when more than 100 million cameras are produced each year that encode to this space. I don't know whether it is the NIH factor or just that the camera makers speak Japanese and the software writers speak English and they don't understand each other.
One issue is that the ICC specification annex F requires clipping of rgb, so sYCC cannot be handled directly in a colour managed system.
The two programs are Silkypix (which is Japanese) and QPcolorsoft which I am using.
QPcolorsoft is software for color matching to a target. It simply adjusts the colours and saves to AdobeRGB colourspace as a JPEG or TIFF. Because it removes the oversaturation and saves to a larger colourspace it reduces the clipping. But it is noticeable that it will still save out-of-range colours back to a JPEG. So it seems to understand sYCC for both loading and saving.
If JPEGsnoop could export an XYZ tiff, it should be possible to load it into a colour managed application. Or failing that, use TIFFICC to convert to another colourspace and do gamut checking to see how much damage was being done.
It would be interesting to be able to compare a camera JPEG "perfectly decoded" by JPEGsnoop with the clipped version most programs produce.
It seems that even if APP2 is absent, it's OK, but if APP3 is not present, it gives an error, saying
"expected marker 0xFF at offset ---"
I tried to disable "show maker notes", but nothing changes. Do you have a hint?
I found your JPEGSnoop tool two weeks ago, and it is quite exactly what I need to analyze problems with JPEG codecs. The format of the detailed decode with the bitstream is very useful.
The only problem is that the detailed decode sometimes stops to consume bits and to decode at places where there is not even an error in the stream. E.g. I have some images where it stops at a certain point when I make a detailed decode from beginning, but when I start the detailed decode at a later MCU, it decodes this part correctly. If you are interested I can send you such an image with details how to reproduce this problem.
Many thanks & best regards,
Michael
I saved the file to m_sfw.jpg with quality 60, extracted the tables with JPEGsnoop, and used them as input for cjpeg.exe.
..\cjpeg.exe -sample 1x1,1x1,1x1 -qtables sfw_60.txt -optimize -dct float m.bmp result.jpg
But the cjpeg version still has much more noise than the Photoshop version.
What other tricks does SFW (Save for Web) use?
images, .bat conversion file - http://files.shapegame.ru/fileshare/public/compare_cjpeg_sfw.zip
The differences are especially noticeable in some of the MCUs near the bottom of the girl's pink dress (beside the yellow trim).
It looks like Photoshop is doing a better job at avoiding some of the compression artefacts that cjpeg is producing. It is possible that Photoshop might be doing some smoothing on the bitmap data before passing it to the compression routine (even though Photoshop's Blur setting in the Save for the Web dialog may be set to 0). It is also possible that the color conversion algorithm (RGB -> YCC) differs, but that doesn't appear to be the cause.
Do you have plans to do the same thing for MPEG/H.264?
I cannot find a comparable product on the market.
Considering the amount of video streams that torture my hard disk everyday, it would be good to have MPEGsnoop.
JPEGsnoop has helped me a lot while trying to create a binary-to-JPEG image converter. But I have some skewed images being created. I am writing the code in MATLAB and am not able to find the error. What's your email, as I would like to send you the image to see if you can find where the errors are created.
I am currently studying JPEG images. I think it would be great if the decoder also supported non-interleaved and progressive images. In non-interleaved images, I see that it is able to decode only the luminance part and not the chrominance part.
Cheers
Shobana
I would like to convey what your work has enabled me to accomplish.
Some months ago, I had a fairly bad disk crash that resulted in a bunch of jpg files being concatenated as lost clusters and recovered by Norton Disk Doctor upon reboot (98SE/FAT32 here). At the time, I had little clue as to how to proceed with them, so I just archived the lot and stuck them away, practically forgetting them in the interim.
Later, I found jpeg analyser console app, which at least gave me the offsets to look for in a hex editor, and through a series of copy and paste operations was able to recover the images out of two of the concatenated .ndd files. It took so long though that it was easier to just put the whole matter off.
Then I found your site, and impressed as I was with the general knowledge, had my sox blown off by the presence of JPEGSnoop. It wasn't until a couple days later even that I decided to load one of the .ndd files in it, then got to poking around in the menus and found a dedicated function to seek to the next SOI marker, and yet another to save as a separate file.
I actually couldn't believe it; almost turned inside out (ewww!). Well, I got the job of recovery done in around two hours over 16 or so .ndd files, recovering some 170 usable jpgs, thanks to JPEGSnoop.
Had it been left to the hex editor method, the .ndd files would in all probability be still sitting there, waiting for me to complete a job for which the word "tedium" is an underscored understatement.
Many thanks!
Is there any way to detect the brightest pixel using JPEGsnoop? I'm looking for a command line tool that will allow me to screen images for the brightest pixel and dump those coordinates to a file or DB.
As for detecting the brightest pixel, I can definitely add this. A few others have asked for something similar -- it may even allow for the image whitepoint / white balance to be determined. I'll add it to my list.
Please help
Glenn
If you open up the AVI file with JPEGsnoop, you'll see that your file should indicate that MotionJPEG was detected (a FourCC value of "MJPG"). In the videos that you are trying to copy over, is this the case? If not, what FourCC Codec value is shown?
In addition to the .AVI file, you will also require a companion .THM file. This file contains the metadata (video date, time, thumbnail, etc.) that the digicam will use when browsing the memory card. Once you have made sure that the .AVI file is in the correct format, creating the .THM file will be easy. Again, you can use JPEGsnoop to compare the .THM files that are generated from videos created by the digicam against the one that you are trying to convert to place back onto the memory card.
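For anyone comfortable with a little scripting, here is a rough, independent sketch (not part of JPEGsnoop) of how the codec FourCC could be read out of an AVI: the file is a RIFF container, and each 'strh' (stream header) chunk begins with the stream type followed by the handler code.

```python
def avi_stream_handlers(path):
    """Naive scan of an AVI (RIFF) file for 'strh' chunks.

    Returns a list of (stream type, codec FourCC) pairs, e.g. ('vids', 'MJPG')
    for a MotionJPEG video stream. A proper parser would walk the RIFF chunk
    tree instead of searching for the FourCC bytes directly.
    """
    with open(path, "rb") as f:
        data = f.read()
    handlers = []
    pos = data.find(b"strh")
    while pos != -1:
        # Chunk layout: 'strh', uint32 little-endian size, then the stream
        # header data, which starts with fccType followed by fccHandler.
        fcc_type = data[pos + 8 : pos + 12].decode("ascii", "replace")
        fcc_handler = data[pos + 12 : pos + 16].decode("ascii", "replace")
        handlers.append((fcc_type, fcc_handler))
        pos = data.find(b"strh", pos + 4)
    return handlers

# Example (hypothetical file name): print(avi_stream_handlers("clip.avi"))
```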
Try to do any more than that and the best you'll be able to do is modify (decompress, edit, then recompress) some sub-blocks of the image (8x8 or 16x16 pixel regions) and then preserve the remaining blocks "losslessly". There are a few programs (such as BetterJPEG) that allow you to do localized edits while preserving the rest of the image, but this is not commonly supported.
Lastly, it may also be possible to reduce the amount of recompression error by simply saving out your edited JPEG image as a bitmap / TIFF, then using a custom JPEG encoder (such as cjpeg) to reuse the same quantization tables as the original JPEG file. This would minimize recompression error if the edits were kept to a small region. But this method does take a few extra steps and is not particularly straightforward. If I have misunderstood your question, please feel free to repost!
I just discovered JPEGsnoop! Thanks a LOT for this little app! Finally I see why my PS Save for Web @ 60 images always looked so much better than IrfanView's @ 60 images. Adobe seems to have shifted/adjusted their 0-100 scale a bit.
Finally I can directly analyze my digicam compression settings without running sort of successive approximation test compressions on an image. So cool!
I really found your whole page about digital photography highly interesting!
Cheers,
Hennes
Anyshoes, I spent over a week reading and trying to learn about jpeg coding and encoding and asking questions in forums. There have been no answers forthcoming. Then I found JPEG Snoop. Now I have an answer! Granted I still have NO idea what it means, but at least I know for a fact that there IS a difference between the blasted "identicalmyrearend" images even if they do look the same. Even if windows and graphics programs and a few forums full of well meaning knowitalls insist they are the same - they are not, JPEG Snoop sees the difference. So thank you very much for sharing this creation. You're a prince sir. I wish I had found this that first night...
I did try to do a tiny return of the favor by adding my camera to your database when asked but I receive the dreaded illegal-op error and the program closes when I try to send it. I would be happy to send the info in an email or post it here if you like. I'm on a win98se machine so I'm guessing that like your readers, your software is also too advanced for the old timer!
I would really like to ask a jpeg question here though, because I've been getting conflicting answers from the Adobe, Paint Shop Pro and other forums - is it normal for a 60kb jpg to balloon UP over 100kb when it is re-saved and no changes have been made? Logic tells me no, but one fellow insists in the adobe forum that jpegs get bigger "EVERY time you save/close/reopen/save a .jpg..." I don't believe that. But, I have found that when I "save copy as" via my graphics program the copy is over twice the original size, same thing happens if I change 1 pixel and then just save it. I cannot tell by JPEG Snoop what is happening to the image after I save it to make such a huge difference because of a single new or changed pixel.
(For clarity, I am not looking at the image's file size when it is open in the graphics program either - I do know that the open file is larger while you work on it - these are closed images with file size as reported by Windows.)
Would some kind soul please enlighten me in a way that even a spoonless bowl of Jello could comprehend?
Thank you again Calvin, and thank you Uber-smart-folk for reading!
p.s. my feature request would be a split pane window that you could drag&drop images into to do side by side compares of course!!!
After reading much of the same repeated information in the forums, I had enough doubts about the reliability of what was being said that I decided to develop this program. In developing JPEGsnoop, I can now confirm that much of what gets shared on the internet regarding image quality, resaving, lossless, etc. is unfortunately often incorrect.
In any case, if you try to compare an image before and after resaving, you will almost certainly notice a file size change. This doesn't necessarily mean that the image content has been modified! The file size you see in Windows Explorer is composed of two parts: 1) the image content (scan data) and 2) the metadata (EXIF photo details, etc.). When we are concerned about image changes, we are only really interested in the scan data differences -- although some people would be upset if the metadata (date, time, camera setting details, etc.) were to be removed or modified.
Resaving an image will almost always change the quantization tables (see the DQT section in the JPEGsnoop listing), and in doing so, it almost guarantees that the image was recompressed. In really general terms, the quantization table tells the photo saving software what details of the image can be left out to save file space!
Recompression is exactly where most of the "error" from resaving comes in. The selection of values in the quantization tables is the most significant determinant of overall file size. An image with a particular DQT (with a reported quality of 50) might be 500Kb, while the same image saved with a different DQT (with a reported quality of 95) might be 3500Kb. In the first example, the table tells the software to ignore most of the small details in the image so that only a small amount of file space is used in preserving the remaining large details of the image. In the second example, the table tells the software to keep almost all of the details, large and small, but obviously this will cost us more file space.
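For the technically curious, the reported "Approx quality factor" and "scaling" values follow roughly the IJG (libjpeg) convention, where a quality setting scales a base table. The sketch below illustrates that convention only; it is an assumption used for explanation, not JPEGsnoop's exact code.

```python
# Standard IJG luminance base table (JPEG spec Annex K), row-major order.
IJG_BASE_LUM = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]

def scale_table(base, quality):
    """Build a quantization table for an IJG-style quality setting (1-100)."""
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [min(255, max(1, (v * scale + 50) // 100)) for v in base]

def estimate_quality(table, base=IJG_BASE_LUM):
    """Estimate the quality that produced `table` (cf. the 'scaling=' value)."""
    scaling = sum(100.0 * t / b for t, b in zip(table, base)) / len(table)
    return (200.0 - scaling) / 2.0 if scaling <= 100.0 else 5000.0 / scaling

print(round(estimate_quality(scale_table(IJG_BASE_LUM, 75))))  # -> 75
```

A lower quality setting inflates every table entry, so more coefficients quantize to zero and the file shrinks; a higher setting shrinks the entries and the file grows.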
If you are seeing a difference of 100Kb, it is very likely due to a change in the quantization tables (i.e. you are getting some loss in image information after resaving, and the new table is more "wasteful" of file size).
Almost without exception, resaving an image with these photo editors (even without touching any pixels in the image) will cause a "lossy" recompression which can either increase or decrease the resulting image size -- it depends on what DQT the software is using compared to the DQT of the previous save / camera. This also explains the "Save Copy As" behaviour -- your photo editor is not actually duplicating the original file -- instead it is decompressing it, displaying it, then resaving it with a different filename. The new file (after resaving) will use whatever DQT that software chose for the given quality setting.
Now that we've convinced ourselves that the image content is in fact different, you are probably wondering "how different"? This sounds like a fantastic feature request for a new version of JPEGsnoop! Resaving images within photo editors (when the "Quality Setting" is reasonably high) often produces images that are imperceptibly different to the average person, but sometimes visible to those who know what to look for. However, if the "Quality Setting" is low enough, the differences in the images can really start to become significant (blocking and ringing on edges, etc.) -- to the point of ruining the image. Many photographers are hyper-sensitive to image "loss", hence the hype over lossless operations (rotation, crop, etc.).
Hopefully this has explained some of what you're observing, and please feel free to post any other questions!
PS> As for the Add Signature function not running in Windows 98, this is unfortunately due to the internet functionality not being compatible with Win98. I am not sure yet if I will plan to rewrite it to support the older operating systems.
[UPDATE 10/06/2007]: The most recent version of JPEGsnoop now supports Win98. Give it a try.
I just thought I should notify you of this.
Thanks for a great program,
I have a situation here dealing with some pornographic images, which are part of my forensics investigation (file carving). The person who uploaded the images had modified them using Photoshop, which scrambled the face of the person. What I need to know is: is there a way I can see the original JPEG, prior to modification?
Thumbs.db is not available, so I am stuck.
Also, many of the JPEGs have no EXIF data; the editor might have used Save for Web.
Please suggest any options.
Nitin Kushwaha
India
Thanks for the quick reply, it all works fine for me now.
I have to say this is a great program, and very helpful site, it helped me no end in designing my own hardware encoder.
I have included a dump of the output.
...< output listing removed - showed 1 DQT, 1 channel, followed by the error: >
Thanks!
Very nice tool for checking JPG info. It would be insanely useful if you had a 'resave' option as well, where you could save using the same luminance and chrominance tables that the file already uses (to minimise loss of quality when resaving), or even an option to choose custom tables to use when saving the picture. Of course this would only be useful if it also imported BMP files, so you could touch up a JPG in Photoshop, save as a BMP and then resave using your program with a minimum of loss.
have fun,
steven
I know this is a special case but I can't be the only person doing this sort of work. And I've run into a whole bunch of old Photoshop for Mac images that have two complete JPGs in a single file (why??!)
In extending JPEGsnoop to decode AVI files, I am going to make it more flexible with the decode process. Nonetheless, perhaps you could email me an example file so that I could understand your need better.
Thanks, Cal.
Suggestions... I'd like to be able to point it at a directory tree and have it gang-process all the JPGs therein, appending the extracted info to a single file, with the option to only list files that have errors. (In fact, "append to log" would be a good option generally.)
Also, I'd like the option to display the thumbnail fullsize if the file is below a certain size, say 320x240 or so. At 1/8th size, small images' display is microscopic. :)
I was surprised when it showed me thumbs from an AVI that the HD recovery software had broken up into individual frames...now I know for sure what those files were!! No other image tool could even OPEN them, let alone view them. (Anyone know of a tool that can reconstruct these orphan frames into a working AVI?? The index was lost; all I have are naked, mostly-headerless JPGs as peeled out of the AVI. I *might* still have the RIFF header.)
Side thoughts: it would be great if JPEGsnoop could be incorporated into a JPG editor such that one could both manipulate the raw data, and SEE the result on the spot. Or at least have each control byte marked as to function, so one could edit 'em right then and there. I've not seen any tool like this.
Again, I'm SO glad I stumbled onto JPEGsnoop -- thanks so much for making it!!
~REZ~
Thanks Rez!
One of the planned upcoming features for JPEGsnoop was in fact to recover damaged JPEGs. I have been analyzing corrupted JPEGs trying to decide the best techniques to recover the images. Many types of corruption can be corrected, but I still need more time to decide on the best approach. The hardest part is figuring out precisely where an error occurs in the scan data segment. So, with that in mind, I tried to develop the tool to be slightly tolerant of errors.
I actually have batch processing built into JPEGsnoop, but haven't exposed the interface as I want to decide on a suitable output methodology: I'm thinking that I'd simply write an individual log file per image, with the logs simply having a different extension.
As for the small thumbnails, I agree. I've started working with JPEGsnoop on video files and thumbnails (.THM) and can appreciate that enlarging the preview in some cases (perhaps user-selectable) would be nice.
The reason why JPEGsnoop can actually view the broken-out AVI frames is that I have already been extending JPEGsnoop to decode AVI files (MotionJPEG), so it can handle the implicit huffman tables. The ability to open AVI files completely will be available soon (I just need to work out how to do this from the user-interface perspective).
With respect to the editing -- what sections did you feel would benefit from being editable?
Thanks for sharing your great suggestions!
Thanks for the quick response and thoughtful reply.
So there is no magic bullet, huh? The quest continues...
Your example clarified a lot for me. I thought the degradation of an image from multiple re-saves was a downward slide, so whichever image held the higher approximate quality rating was "better", more or less. I can see now it's not so simple.
The true resize-resample-re-save history of many images in the public domain is an unknown that will never be solved.
The suggestion I made could have some limited appeal--in the sense that, for example, out of a thousand images of Paris, you would be able to group together the images with a similar most-recent save history because their approximate quality value is at least in the same ballpark. (This presumes the last person to tinker with the photos kept their settings the same for each one.) Or if someone wants to tag a logo onto their own photos from a few years back, but doesn't remember the setting they used to save them the last time.
I don't know if either of those scenarios warrants the time it would take from you to implement this idea. I'll leave that up to you. Maybe it's best as "food for thought."
I read your "Quantitative Results" section, and not having the original image to compare RGB values against would seem to be a dead end as far as "intelligent" image evaluation / comparison goes. I have to believe such an idea has great value, but I have no idea how it would best be implemented.
Thanks again... I learned quite a bit today.
From a different perspective, it is conceivably possible to estimate JPEG compression quality through other means. Quantifying the degree of JPEG blocking artefacts (discontinuities across MCU boundaries) as well as the lack of high-frequency content may help provide another perspective when comparing different photos. However, it would only be practical in filtering out images that were saved at fairly low image qualities in the past. A combination of methods might actually be a reasonable starting point: deprioritize images with a high degree of blocking artefacts and those that have been saved most recently with a low JPEG quality setting.
I may consider seeing if the qualitative measure of JPEG images quality (through blocking artefact detection, etc.) would be an interesting feature to add, purely for the curiosity factor.
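To make that idea concrete, here is a very rough sketch of one way blocking could be quantified: compare the average luminance step across 8-pixel block boundaries against the average step inside blocks. This is purely an illustration of the concept, not a JPEGsnoop feature.

```python
def blockiness(luma, width, height):
    """Crude blocking score for a row-major 8-bit luminance plane.

    Returns the ratio of the mean absolute step across 8x8 column boundaries
    to the mean step at interior columns; values well above 1.0 suggest
    visible JPEG blocking. (A fuller measure would also check row boundaries.)
    """
    def avg_step(columns):
        total = count = 0
        for y in range(height):
            row = y * width
            for x in columns:
                total += abs(luma[row + x] - luma[row + x - 1])
                count += 1
        return total / max(count, 1)

    boundary = [x for x in range(1, width) if x % 8 == 0]
    interior = [x for x in range(1, width) if x % 8 != 0]
    return avg_step(boundary) / max(avg_step(interior), 1e-6)
```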
Great program, and very informative site.
I have a suggestion for JPEGsnoop--if this is not in keeping with your goals or plans for the program, then I apologize in advance for taking the ball and running in the wrong direction with it!
I have collected a great many digital images of historic places. To keep the collection manageable, I first will run a utility to detect "byte for byte" or hash-based duplicates and eliminate those. Then I will run a visual similarity-type utility on what's left. This will identify images which appear to be identical to the human eye, although they are often of different file size, resolution, and possibly format. I am on my own for deciding which one of each pair of duplicates I want to keep. If you have a large collection, opening them up one by one in Photoshop and enlarging them to compare "defects" or "jaggies" is not practical.
To my knowledge, there is no tool currently available that will evaluate a batch collection of jpg images and report back the approximate quality of each. Granted, "quality" is often subjective (like art) but it is also true that "numbers do not lie."
JPEGsnoop is perfect for identifying an image's approximate quality rating. Since I am not a programmer I don't know how difficult my next suggestion is, but here goes: a tool that works similarly to JPEGsnoop that will process a large batch of jpg files, and allow the file name to be renamed with its approximate quality value as a prefix. For example: prefix "Q092_" added to "grand canyon.jpg", leaving you with "Q092_grand canyon.jpg". This is desirable because it would allow me to sort my collection by quality first, since often these images may have been renamed or re-compressed many times over the years.
If I'm going to delete a visually similar image, I might as well keep the one that has the approximately higher quality value--even if the difference is slight. It's not always the higher resolution version, and in many cases the resolution is the same.
A utility that would use an "intelligent" evaluative comparison of a batch of jpgs would fill a void that has plagued historians, photographers, and anyone else who needs to find the best quality image of a given subject quickly.
Thanks for allowing me to share my suggestion. Keep up the good work.
- James
I think I understand your requirements. However, I think it is worth pointing out an issue that may make this approach less useful:
Without entering into the realm of complex image processing (analyzing mosquito noise, blocking artefacts, etc.), the only pertinent information available within the JPEG image is the most recent compression quality. There is no history saved within JPEG images, so the only objective datapoint is the approximate quality factor of the most recent save.
Therefore, if you are comparing images that have been taken from a multitude of different cameras, without editing, then this methodology may work. It will indicate images from cameras with the least compression (even though the content itself may be out of focus!). However, if you are including images in this comparison that have been edited / resized within Photoshop or other imaging tools, then you will face a dilemma: you will only be seeing the compression quality that the imaging tool used to resave the image.
In most cases, people tend to get this JPEG resaving very wrong. They resave with a quality factor that is very much unlike the quality of the original image. Most of the time people will resave with a quality factor much lower than the original (not understanding Photoshop's quality scale). More serious photographers sometimes under-compress the resulting edited image (by choosing Photoshop Quality 12).
Therefore, you may be comparing an image from a digital SLR using an approximate quality factor of 93 versus a point-and-shoot image (that would have had an original quality factor of 80) but resaved in Photoshop with a quality factor of 95. Such a comparison would yield that the second image is "better" than the first, but the converse may be true from a human observer point of view.
In summary: resaving with a higher quality factor does not improve the underlying image quality, while resaving with a lower quality factor directly decreases the image quality.
To get around this, you could consider factoring in whether or not an image had been resaved into the comparison, but due to most people's need for resizing, you'll probably find that the majority of images fall into this category.
If you don't think that this issue is likely to be a problem for your needs, then I'll see if I can find a suitable way to integrate such functionality into the tool. Thanks for sharing your suggestion!
Also, a suggestion - why not publish the estimated quality values with your jpeg quantization tables for various cameras? That's really the data I'm after anyway - that way we can all benefit from that info (and better yet, I don't have to do the work! ;) )
Thanks again!
As for the quality values, I have collated most of them into a big table on the JPEG Quality page.
In the interim, jpegdump is currently available with source code, and it might do some of what you're looking for (extraction of JFIF headers).
Have you considered the option to extract the JPEG info from within a TIFF employing JPEG compression?
Cheers,
Derek
Thank you for creating the JPEGsnoop utility.
But I have one question: what does the Annex Ratio in the log file, near the JPEG quantization table, mean? Can you explain?
Best
Eric
Thanks a lot for providing us the great tool and so many detailed explanations!
I'm writing a program to decode an Extended Sequential (SOF1) image.
Do you know of any differences between decoding a Baseline and an Extended Sequential image?
Thanks!
Length=12: -4095...-2048,2048...4095
Length=13: -8191...-4096,4096...8191
Length=14: -16383...-8192,8192...16383
Length=15: -32767...-16384,16384...32767
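These ranges follow the standard JPEG magnitude-category rule: size category n holds the values whose absolute magnitude needs exactly n bits, i.e. ±[2^(n-1), 2^n - 1]. A minimal sketch of computing the category for a coefficient difference value:

```python
def magnitude_category(value):
    """JPEG size category (SSSS): number of bits needed to represent |value|."""
    bits, magnitude = 0, abs(value)
    while magnitude:
        magnitude >>= 1
        bits += 1
    return bits

# Matches the ranges listed above, e.g. category 12 spans +/-2048..4095
# and category 15 spans +/-16384..32767.
assert magnitude_category(2048) == 12 and magnitude_category(-4095) == 12
assert magnitude_category(16384) == 15 and magnitude_category(-32767) == 15
```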
Great tool! I was just dumping a JPEG file I had that had been manipulated using the LeadTools library. It shows in the JPEGsnoop preview in greyscale, yet it shows as a colour image in other viewers I've tried. It appears to only have one DQT table (Luminance), so perhaps this is the cause. There is a single DQT (containing all "2"s), which I assume is shared between the luminance and chrominance channels. Is a single DQT fairly typical or atypical, or even recommended?
Thanks for pointing this out! As for whether it is typical or recommended, I would not say that it is typical at all. Most of the time one will choose a different quantization table for the chrominance and luminance channels as the human eye (HVS) has a different response curve for color. Setting them the same means that the compression is retaining or throwing image information away without taking into account the nuances / limitations of the human visual system. The difference in compression is quite minimal, so some encoders may decide to use a single table anyway.
That said, it is part of the standard, so most decoders should be able to interpret the images correctly.
Thank you very much for your detailed explanations! Of course you are right - the average DC Y value could be used for calculating an offset value. Meanwhile I've come up with a new approach that might be worth investigation. I call it "selective subdivide".
I divide the picture into a 3x3 grid and calculate the average DC Y value for each cell. If this value differs substantially from the whole-picture DC Y value, I further subdivide that cell into a 3x3 grid and repeat the procedure, and so forth.
I think the strength of this method is the mix of spatial and statistical information. When I reflected on the histogram-only method, I came across some weaknesses: it's not only the enormous amount of data, but also the fact that different pictures might have similar histograms.
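A rough sketch of that selective-subdivide idea (the averaging helper, threshold, and depth limit are illustrative assumptions, not the poster's actual code):

```python
def selective_subdivide(avg_dc, x0, y0, x1, y1, reference, threshold, depth, out):
    """Recursively describe a region of the image by its average DC Y value.

    avg_dc(x0, y0, x1, y1) -> mean DC luminance of the region (caller-supplied).
    A region whose mean differs from the whole-picture `reference` by more
    than `threshold` is split into a 3x3 grid and examined again, up to
    `depth` further levels. Results accumulate in `out` as (box, mean) tuples.
    """
    mean = avg_dc(x0, y0, x1, y1)
    out.append(((x0, y0, x1, y1), mean))
    if depth == 0 or abs(mean - reference) <= threshold:
        return
    xs = [x0 + (x1 - x0) * i // 3 for i in range(4)]
    ys = [y0 + (y1 - y0) * i // 3 for i in range(4)]
    for j in range(3):
        for i in range(3):
            selective_subdivide(avg_dc, xs[i], ys[j], xs[i + 1], ys[j + 1],
                                reference, threshold, depth - 1, out)
```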
If you provide me with your email address, I can provide you with further detailed information.
Kind regards,
Franz
Another interesting application for these matching algorithms is in the recent use for facial recognition. Imagine how incredible it would be for a photo cataloging application to have some degree of built-in learning recognition that assists in the tedious tagging efforts! You might find some research in this area could offer you some other clues in methods to decompose photos into details.
Email sent. Thanks!
You are awesome! Thanks for providing me with the needed functionality so immediately!!! I already gave it a try and played around with the DC Histogram to use it as a CBIR algorithm.
When trying to reproduce the algorithm in the paper, I encountered the following "challenges":
Best Regards,
Franz
JPEGsnoop dumps all 2048 bins for the sake of completeness, but obviously this will come at the expense of data management and comparison performance. I have definitely come across images that use 512 bins and others that use all 2048 bins, so it is going to be image dependent.
I haven't thought about it long enough to have a good handle on the best way to strip down the data for the purposes of similarity matching. To allow for some variability, you will probably require a combination of histogram bin reduction and scaling of the population within each bin. Reducing to a small meaningful set of coefficients is going to be difficult. When you look at the histogram of the luminance component, there is really nothing to throw away except for reducing the number / precision of the bins. Clearly there are other programs out there that perform an image matching function, but it's not evident how this is done or stored. Out of curiosity, I might look into this area further.
You are right in that the average DC value can be useful as a starting point. However, if a series of images have slightly different exposures, then I think this average value serves as an even more important characteristic. The average DC luminance value could probably serve as a reference point to adjust the rest of the luminance histogram. By shifting (offset) the histogram so that the average DC Y value is 0 (for example), the histogram comparison functionality will work even in the presence of exposure differences!
For this very reason, I'd suggest, in fact, that you don't use the average value in the comparison (as a first rough comparison before going on to the in-depth bin comparison), but instead use it to offset the rest of the histogram values.
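As an illustration of that offset-then-compare approach, here is a minimal sketch assuming the DC Y values have already been extracted (for example, from the Histo Y Dump report):

```python
def centred_histogram(dc_values, bins=2048, lo=-1024):
    """Luminance DC histogram shifted so the image's mean DC Y sits at zero.

    Re-centring on the mean makes the comparison tolerant of small exposure
    differences between otherwise similar images.
    """
    mean = sum(dc_values) / len(dc_values)
    hist = [0] * bins
    for v in dc_values:
        idx = int(round(v - mean)) - lo
        if 0 <= idx < bins:
            hist[idx] += 1
    total = float(len(dc_values))
    return [h / total for h in hist]   # normalise so image size doesn't matter

def histogram_distance(h1, h2):
    """Simple L1 distance between two normalised histograms (0 = identical)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))
```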
thanks a lot for this interesting tool - it helps me a lot to understand the internals of JPEG compression.
I'm especially interested in doing Content-Based-Image-Retrieval, i.e. finding duplicate versions of an image in different sizes and formats.
In the last few days I stumbled across a CBIR algorithm that uses DC coefficient histograms of JPEG images to calculate their similarity. The advantage: the image needn't be fully decompressed to do the analysis, which means much better performance.
I want to evaluate this algorithm with my own pictures. Does JPEGSnoop allow me to generate such histograms?
I'm interested in DC Component histograms for doing similarity searches with JPEG files. I already stumbled across the "YCC histogram in DC" section in the output file of JPEGsnoop.
However, the average value presented there seems to be not "accurate" enough for my analysis, meaning that two images with little similarity have close avg values. This is not always the case, however I'd prefer to do the calculations on my own... Is there a way to dump the DC coefficient values? Do you know of any other tool?
Nice Regards,
Franz
You are right in that the average value shown in the YCC Histogram in DC section of JPEGsnoop will not be a useful metric in comparing image similarity. Instead of averaging, you will need to compare a histogram of a sufficient number of bins. The maximum number of bins to represent the luminance (Y-channel) histogram is 2048 (-1024 to +1023 ), but as the paper indicates, you may be able to get adequate results from less precision in the binning.
I have now added this feature into JPEGsnoop release 0.6.6! Just turn on the Histo Y Dump option, and the report will contain the full-precision histogram.
I might do some experiments to see if I can use this property to search for similar JPEG images on my hard drive. I expect that the proposed algorithm will allow perfect identification of duplicates that have been rotated or flipped. It will probably also handle resizing and changes in save quality (quantization) fairly well too. However, I would expect that it would have some difficulty in identifying duplicates after an image had a curves / levels or brightness adjustment.
Let me know if this helps!
The color conversion stats was something that I just added recently, and so I am still trying to decide how best to show this. I agree that providing some sort of histogram would be useful -- so I'll look further into adding that capability.
As for how the clipping works -- I have not yet examined how other JPEG decoders handle any clipping of YCC components (after summing the cumulative DC offsets from prior MCUs). My initial assumption (remember that this decoder was largely written from my reverse engineering and the spec, not looking at other decoders) was that the encoded JPEG data would have spanned an 8-bit range in each component of the YCbCr data, but it is possible that this is not the case. If so, it may be that no clipping should be done in the YCC stage, but instead only after the conversion to RGB. I am definitely going to be investigating this area next, to ensure that the methodology and stats are meaningful and accurate. Great question!
With drag & drop, it was working (drop on the application icon), but I broke it with changes to implement the progress status bar. I have now fixed this, which also corrects command line execution as well.
All of these changes should be available in the upcoming JPEGsnoop 0.6.2 release.
Thanks for the suggestions!
If you're interested in some special hints that resulted from the codec I developed:
To get the best result with best speed, I combined a bicubic and bilinear algorithm. The bicubic is used for luminance, bilinear for chrominance components.
As long as the image is twice as large as the desired target image, a simple 2x2 to 1x1 reduction by mean value is applied. This results in sharp images. Photoshop seems to work that way too (as far as I can tell by comparison without looking at the code).
The JPEG group standard tables were used, and a quality setting is applied as a percentage to the table values. I use a quality of 75% for luminance and 20% for chrominance components. The chrominance compression can be more aggressive because those components are already reduced by 2x2 chroma subsampling. That way the colours don't "bleed" much.
For very big images, a much higher compression can be used with good results, but for web-sized images (and thumbnails) that is just fine.
A special trick that allows smaller images without visible loss is a threshold filter. If DCT values are within a certain range near 0, the value is replaced by zero. I used a range of -5 to +5 for luminance and -15 to +15 for chrominance. A nice side effect is a little smoothing that only affects MCUs that do not contain much detail.
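A tiny sketch of that thresholding step, using the quoted ranges (purely illustrative):

```python
def threshold_dct(coeffs, limit):
    """Zero out DCT coefficients whose magnitude is within +/-limit.

    E.g. limit=5 for luminance blocks and limit=15 for chrominance blocks,
    applied before quantization: small coefficients in low-detail blocks
    vanish entirely, shrinking the file and slightly smoothing those blocks.
    """
    return [0 if abs(c) <= limit else c for c in coeffs]
```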
With just a few small tweaks, an image quality was achieved that is comparable to Photoshop 60% (High) with about 25% smaller file size. The quality comparison also includes the scaling.
Hope I could give some interesting info in return for your nice tool.
I am excited about re-saving a JPG in Photoshop at the appropriate Photoshop image quality settings to match the input quality. This is for a normal save and not Save for Web. I have scanned the output file from JPEGsnoop and the only section with a Photoshop reference provides me with this
So I took a guess and presumed a save at Photoshop quality 3 was needed. This saved my file as 340 K when it started out at 4219K so I obviously got that wrong.
I hope you are enjoying Cambodia, I live out in Thailand and am planning to arrange tours for photographers into Cambodia, Laos etc.
Your help would be much appreciated, the software will be invaluable once I am past this point.
Thanks for your help. Les Wilk
I attach below a dump of my output files if that helps
A file modified in Photoshop would probably show the Exif IFD0 [Software] field as "Adobe Photoshop" or something similar. Yours shows the value (Digital Camera FinePixS2Pro Ver1.00) that is set by the camera (and perhaps the Fuji camera software). However, we do see evidence of editing by Photoshop in the APP1 and APP13 sections.
The APP13 marker section in JPEGsnoop would display a "Photoshop Quality" value if it detected Photoshop's Save As image quality setting. However, your file doesn't contain the special entry that indicates a Save As quality setting from Photoshop (it would have shown 8BIM: [0x0406]). Therefore it may have been stripped out by the Fuji software.
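For reference, here is a rough sketch of how one might walk the Photoshop image-resource blocks inside an APP13 segment and look for resource 0x0406 yourself (the internal encoding of the quality value is not decoded here). The layout shown is the standard 8BIM resource format; the code is illustrative rather than JPEGsnoop's own:

```python
import struct

def find_8bim_resource(app13_payload, wanted_id=0x0406):
    """Return the raw data of the first 8BIM resource matching wanted_id.

    Each resource block: '8BIM', uint16 resource ID, Pascal-string name
    (padded to an even total length), uint32 data size, then the data
    (also padded to an even length). Returns None if not found.
    """
    pos = app13_payload.find(b"8BIM")
    while pos != -1 and pos + 12 <= len(app13_payload):
        res_id = struct.unpack(">H", app13_payload[pos + 4 : pos + 6])[0]
        name_len = app13_payload[pos + 6]
        name_span = ((1 + name_len + 1) // 2) * 2          # pad to even
        size_off = pos + 6 + name_span
        size = struct.unpack(">I", app13_payload[size_off : size_off + 4])[0]
        if res_id == wanted_id:
            return app13_payload[size_off + 4 : size_off + 4 + size]
        next_pos = size_off + 4 + size + (size & 1)        # skip padded data
        pos = app13_payload.find(b"8BIM", next_pos)
    return None
```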
The other possibility is that when you edited this in Photoshop you were using some strange resaving mode that I have never seen before (only modifying certain parts of the image's EXIF information and not others).
So, in conclusion: JPEGsnoop did not identify the Photoshop Save As quality (it would have stated it explicitly) because the entry was missing in the file. It appears that while it may have been edited previously with Photoshop, it looks like your file may have been modified further with the Fuji camera software. JPEGsnoop can only analyze the most recent resave/edit.
Hope that helps, Cal.
PS> I just returned today from Cambodia and loved it. It was really hard to leave after spending a month in smaller towns with the very friendly people.
My JPEG has an APP0 that isn't a JFIF header, but custom for this video source. Sadly, JPEGsnoop barfed trying to parse this custom APP0 as if it were a standard JFIF or JFXX APP0.
Suggest you cease internal parsing if the identifier field isn't as expected. That should allow continued parsing of any following standard blocks.
I believe my file has a bad huffman code in it, and I was hoping to prove that one way or the other.
-Jesse
Can you help?
In the meantime, there are a number of utilities that seem to serve this purpose quite well, scanning the entire raw data of your memory card for recoverable data. One free tool that seems to have reasonable reviews is PC Inspector Smart Recovery. No harm in trying it out. Most important, though, don't use your memory card for anything else!
...
P.S. nice portal =)
If you have any questions on how to perform any of the decode steps, please feel free to ask.
Nice Work!
But when I fire up JPEGsnoop v0.4, it tells me
"Can't find the MFC71.DLL". I think you have to link MFC statically...
Please note that due to the large number of comments on ImpulseAdventure, I am not able respond to all questions.