Compressing images in existing PDFs makes the resulting PDF file bigger (Lowagie's resizing method an

Question:

I'm having a problem with image compression. I used the answer to this question: <a href="https://stackoverflow.com/questions/20614350/compress-pdf-with-large-images-via-java" rel="nofollow">compress pdf with large images via java</a>. If I set the FACTOR variable to 0.9f or 1f (original size), the resulting PDF file becomes bigger than the ORIGINAL. That is not the case for all files, though: some files I created myself get smaller as planned, but others grow by about a third, and on top of that some images end up with black backgrounds. It gets even worse when I use plain image compression without resizing the image. <a href="http://www.iso.org/iso/annual_report_2009.pdf" rel="nofollow">This</a> is my test file.

Lowagie's method (resizing the images):

<pre class="lang-java prettyprint-override">// TODO Auto-generated method stub
PdfName key = new PdfName("ITXT_SpecialId");
PdfName value = new PdfName("123456789");
// Read the file
PdfReader reader = new PdfReader(args[0]);
int n = reader.getXrefSize();
PdfObject object;
PRStream stream;
// Look for image and manipulate image stream
for (int i = 0; i < n; i++) {
    object = reader.getPdfObject(i);
    if (object == null || !object.isStream())
        continue;
    stream = (PRStream)object;
    // if (value.equals(stream.get(key))) {
    PdfObject pdfsubtype = stream.get(PdfName.SUBTYPE);
    System.out.println(stream.type());
    if (pdfsubtype != null && pdfsubtype.toString().equals(PdfName.IMAGE.toString())) {
        PdfImageObject image = new PdfImageObject(stream);
        BufferedImage bi = image.getBufferedImage();
        if (bi == null)
            continue;
        int width = (int)(bi.getWidth() * 1f);
        int height = (int)(bi.getHeight() * 1f);
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        AffineTransform at = AffineTransform.getScaleInstance(1f, 1f);
        Graphics2D g = img.createGraphics();
        g.drawRenderedImage(bi, at);
        ByteArrayOutputStream imgBytes = new ByteArrayOutputStream();
        ImageIO.write(img, "JPG", imgBytes);
        stream.clear();
        stream.setData(imgBytes.toByteArray(), false, PRStream.BEST_COMPRESSION);
        stream.put(PdfName.TYPE, PdfName.XOBJECT);
        stream.put(PdfName.SUBTYPE, PdfName.IMAGE);
        stream.put(key, value);
        stream.put(PdfName.FILTER, PdfName.DCTDECODE);
        stream.put(PdfName.WIDTH, new PdfNumber(width));
        stream.put(PdfName.HEIGHT, new PdfNumber(height));
        stream.put(PdfName.BITSPERCOMPONENT, new PdfNumber(8));
        stream.put(PdfName.COLORSPACE, PdfName.DEVICERGB);
    }
}
// Save altered PDF
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream("/Applications/XAMPP/xamppfiles/htdocs/pdf_compress/download/"+args[2]));
stamper.close();
reader.close();
</pre>

My method (using real compression by setting the quality of the image instead of resizing it):

<pre class="lang-java prettyprint-override">PdfReader reader = new PdfReader(args[0]); // Read the file
int n = reader.getXrefSize();
PdfObject object;
PRStream stream;
// Look for image and manipulate image stream
for (int i = 0; i < n; i++) {
    object = reader.getPdfObject(i);
    if (object == null || !object.isStream())
        continue;
    stream = (PRStream)object;
    PdfObject pdfsubtype = stream.get(PdfName.SUBTYPE);
    if (pdfsubtype != null && pdfsubtype.toString().equals(PdfName.IMAGE.toString())) {
        System.out.println(pdfsubtype.length());
        PdfImageObject image = new PdfImageObject(stream);
        BufferedImage bi = image.getBufferedImage();
        if (bi == null)
            continue;
        int width = (int)(bi.getWidth());
        int height = (int)(bi.getHeight());
        if (width <= 30 || height <= 30) {
            continue;
        }
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        AffineTransform at = null;
        Graphics2D g = img.createGraphics();
        g.drawRenderedImage(bi, at);
        ByteArrayOutputStream imgBytes = new ByteArrayOutputStream();
        Iterator iter = ImageIO.getImageWritersByFormatName("JPG");
        ImageWriter writer = (ImageWriter)iter.next();
        ImageWriteParam iwp = writer.getDefaultWriteParam();
        iwp.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        // here goes the compression
        iwp.setCompressionQuality(Float.valueOf(args[1]));
        ImageOutputStream imageos = ImageIO.createImageOutputStream(imgBytes);
        writer.setOutput(imageos);
        IIOImage images = new IIOImage(img, null, null);
        writer.write(null, images, iwp);
        imageos.close();
        writer.dispose();
        stream.clear();
        stream.setData(imgBytes.toByteArray(), false, PRStream.BEST_COMPRESSION);
        stream.put(PdfName.TYPE, PdfName.XOBJECT);
        stream.put(PdfName.SUBTYPE, PdfName.IMAGE);
        stream.put(PdfName.FILTER, PdfName.DCTDECODE);
        stream.put(PdfName.WIDTH, new PdfNumber(width));
        stream.put(PdfName.HEIGHT, new PdfNumber(height));
        stream.put(PdfName.BITSPERCOMPONENT, new PdfNumber(8));
        stream.put(PdfName.COLORSPACE, PdfName.DEVICERGB);
    }
}
PdfStamper stamper = new PdfStamper(reader,
        new FileOutputStream("/Applications/XAMPP/xamppfiles/htdocs/pdf_compress/download/"+args[2]));
stamper.setFullCompression();
stamper.close();
reader.close();
System.out.println("Done");
</pre>

What is wrong with the code? Should I use a different image compression method? Are there any others?

Answer1:

When I only replace JPEGs, I already get a lower file size. Removing unused objects also helps:

<pre class="lang-java prettyprint-override">// Imports assume iText 5.x (com.itextpdf.text.*)
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

import javax.imageio.ImageIO;

import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.PRStream;
import com.itextpdf.text.pdf.PdfName;
import com.itextpdf.text.pdf.PdfNumber;
import com.itextpdf.text.pdf.PdfObject;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;
import com.itextpdf.text.pdf.parser.PdfImageObject;

public class ReduceSize {

    public static final String SRC = "resources/pdfs/annual_report_2009.pdf";
    public static final String DEST = "results/images/annual_report_2009.pdf";
    public static final float FACTOR = 0.5f;

    public static void main(String[] args) throws DocumentException, IOException {
        File file = new File(DEST);
        file.getParentFile().mkdirs();
        new ReduceSize().manipulatePdf(SRC, DEST);
    }

    public void manipulatePdf(String src, String dest) throws DocumentException, IOException {
        PdfReader reader = new PdfReader(src);
        int n = reader.getXrefSize();
        PdfObject object;
        PRStream stream;
        // Look for image and manipulate image stream
        for (int i = 0; i < n; i++) {
            object = reader.getPdfObject(i);
            if (object == null || !object.isStream())
                continue;
            stream = (PRStream)object;
            // Skip everything that is not an image XObject
            if (!PdfName.IMAGE.equals(stream.getAsName(PdfName.SUBTYPE)))
                continue;
            // Only touch images that are already JPEG (DCTDecode) encoded
            if (!PdfName.DCTDECODE.equals(stream.getAsName(PdfName.FILTER)))
                continue;
            PdfImageObject image = new PdfImageObject(stream);
            BufferedImage bi = image.getBufferedImage();
            if (bi == null)
                continue;
            int width = (int)(bi.getWidth() * FACTOR);
            int height = (int)(bi.getHeight() * FACTOR);
            if (width <= 0 || height <= 0)
                continue;
            BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
            AffineTransform at = AffineTransform.getScaleInstance(FACTOR, FACTOR);
            Graphics2D g = img.createGraphics();
            g.drawRenderedImage(bi, at);
            ByteArrayOutputStream imgBytes = new ByteArrayOutputStream();
            ImageIO.write(img, "JPG", imgBytes);
            stream.clear();
            // The JPEG bytes are already compressed, so no extra compression is applied
            stream.setData(imgBytes.toByteArray(), false, PRStream.NO_COMPRESSION);
            stream.put(PdfName.TYPE, PdfName.XOBJECT);
            stream.put(PdfName.SUBTYPE, PdfName.IMAGE);
            stream.put(PdfName.FILTER, PdfName.DCTDECODE);
            stream.put(PdfName.WIDTH, new PdfNumber(width));
            stream.put(PdfName.HEIGHT, new PdfNumber(height));
            stream.put(PdfName.BITSPERCOMPONENT, new PdfNumber(8));
            stream.put(PdfName.COLORSPACE, PdfName.DEVICERGB);
        }
        reader.removeUnusedObjects();
        // Save altered PDF
        PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
        stamper.setFullCompression();
        stamper.close();
        reader.close();
    }
}
</pre>

This reduces the 10,510 KB file to 9,159 KB. Of course, fonts also take up quite a bit of space.
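The growth described in the question can also be guarded against per image: re-encode the image, then keep the new bytes only when they are actually smaller than what was there before. The sketch below is not part of the original answer. It uses only `javax.imageio`, and the class name `JpegRecompress` and the helper `keepSmaller` are hypothetical names chosen for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Iterator;

import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

// Hypothetical helper class, for illustration only.
public class JpegRecompress {

    /** Encodes an RGB image as JPEG at an explicit quality between 0.0f and 1.0f. */
    public static byte[] encodeJpeg(BufferedImage rgb, float quality) throws IOException {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("jpeg");
        if (!writers.hasNext()) {
            throw new IOException("No JPEG writer available");
        }
        ImageWriter writer = writers.next();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ImageOutputStream ios = ImageIO.createImageOutputStream(out)) {
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality);
            writer.setOutput(ios);
            writer.write(null, new IIOImage(rgb, null, null), param);
        } finally {
            writer.dispose();
        }
        // The image output stream is closed at this point, so all bytes have reached 'out'.
        return out.toByteArray();
    }

    /** Returns the candidate only if it is strictly smaller; otherwise keeps the original. */
    public static byte[] keepSmaller(byte[] original, byte[] candidate) {
        return candidate.length < original.length ? candidate : original;
    }
}
```

Inside the loop above, one would then call `stream.setData(...)` only when the re-encoded array is shorter than the image's existing bytes (for instance those returned by `PdfReader.getStreamBytesRaw(stream)` in iText 5, which is an assumption about how the old size is obtained), and otherwise `continue` and leave the stream untouched.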
