Why is there such a big difference in size between almost identical documents?

Question:

I have two PDFs: the first was created with libharu and the second with PDF::API2. Apart from the coordinates, their content is the same, yet the first PDF is four times larger than the second. The only difference I have found is the type of font embedding shown in the Fonts tab of the document properties.

In the first:

Verdana (Embedded Subset) Type: TrueType Encoding: Custom

In the second:

Verdana Type: TrueType Encoding: Custom Actual Font: Verdana Actual font Type: TrueType

How do I deal with that embedded subset?

Answer1:

Many factors affect the size of a PDF. Your problem <em>may</em> be in the way the PDF creation libraries handle font embedding, specifically:

<ul><li>"Embedded subset" means that the font program itself is stored in the file, but only for the glyphs the document actually uses rather than for the whole font.</li> <li>If the font is not embedded, the file only refers to it by name and the reader presumably loads it from the system, which reduces the size of the file.</li> </ul>
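To check which of the two cases applies without relying on the viewer's Fonts tab, a rough scan of the raw file can help. The sketch below is only a heuristic: the function name `font_report` is made up for this example, and it assumes the font descriptors are not hidden inside compressed object streams (in which case nothing will be found). It looks for `/FontFile`, `/FontFile2` and `/FontFile3` keys, which point to embedded font programs, and for the six-letter `ABCDEF+` prefix that subset fonts conventionally carry in their names.

```python
import re

# Subset fonts are conventionally named with a six-uppercase-letter tag,
# for example /ABCDEF+Verdana.
SUBSET_NAME = re.compile(rb"/([A-Z]{6})\+([^\s/<>\[\]()]+)")
# Keys in a font descriptor that point to an embedded font program.
FONT_PROGRAM = re.compile(rb"/FontFile[23]?\b")


def font_report(pdf_bytes: bytes) -> dict:
    """Rough scan of raw PDF bytes for signs of embedded fonts."""
    names = {m.group(2).decode("latin-1") for m in SUBSET_NAME.finditer(pdf_bytes)}
    return {
        "embedded_font_programs": len(FONT_PROGRAM.findall(pdf_bytes)),
        "subset_font_names": sorted(names),
    }
```

Reading each PDF with `open(path, "rb").read()` and passing the bytes to `font_report` should show a non-zero count for the file with the embedded subset and zero for the one that only references Verdana by name.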

<em>If</em> the PDF is already small (only one page, little text and no images), embedding fonts may make a relatively big difference to the size of the document. Still, in absolute terms, an embedded subset covering a handful of glyphs shouldn't take a lot of space.

Another factor you should check is compression. Page content in a PDF is mostly plain-text drawing operators, but the streams that hold it are usually stored in compressed form. Try opening both PDFs in a plain text editor and see whether the stream contents are readable or gibberish. The gibberish (compressed) form will naturally take less space.
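The same check can be done with a few lines of code instead of a text editor. This is a minimal sketch, not a PDF parser: `compression_report` is a hypothetical helper that counts `stream` keywords and `/FlateDecode` filter entries in the raw bytes, so binary data that happens to contain those byte sequences can skew the numbers slightly.

```python
import re

# "stream" followed by a line break opens a stream; the lookbehind skips
# the matching "endstream" keyword.
STREAM_START = re.compile(rb"(?<!end)stream\r?\n")


def compression_report(pdf_bytes: bytes) -> dict:
    """Count streams and Flate filters in raw PDF bytes (heuristic)."""
    return {
        "size_bytes": len(pdf_bytes),
        "streams": len(STREAM_START.findall(pdf_bytes)),
        "flate_filters": pdf_bytes.count(b"/FlateDecode"),
    }
```

If one file reports several streams but no `/FlateDecode` entries while the other has a filter on nearly every stream, compression is the likelier explanation for the size gap.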

Finally, you can inspect the objects the PDF file is composed of using one of the many PDF inspectors out there, for example <a href="http://sourceforge.net/projects/itextrups/" rel="nofollow">this one</a> (I just googled it, no guarantees it'll work as expected).

Answer2:

This is an old question, but I had a similar issue.

Did you set libharu to compress your PDF?

In C++, from the <a href="https://github.com/libharu/libharu/wiki/Usage-examples" rel="nofollow">documentation</a>:

HPDF_SetCompressionMode (pdf, HPDF_COMP_ALL);
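To get a feel for how much stream compression can matter, here is a small standalone illustration using Python's `zlib`, which implements the same deflate algorithm that PDF's Flate filter is based on. It does not call libharu; the page content is invented for the example and the exact sizes will differ for a real document.

```python
import zlib

# Hypothetical uncompressed page content: 200 lines of text-drawing operators.
raw = b"".join(
    b"BT /F1 10 Tf 50 %d Td (Line %d of the report) Tj ET\n" % (800 - i, i)
    for i in range(200)
)

# Deflate the content the way a Flate-compressed stream would store it.
packed = zlib.compress(raw)

print(f"raw: {len(raw)} bytes, deflated: {len(packed)} bytes")
```

Repetitive operator text like this shrinks considerably, so a file written without compression can easily end up several times larger than an otherwise identical compressed one.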
