
I am working on compacting a database (SQLite
) and I am trying to optimize the size of the data types that I am assigning in the schema. The data I have can be broadly divided into 3 types of numbers. They are:
101010000
- 9 digits</li>
<li>0
or 1
, - Just one or zero</li>
<li>13.4567
- Float type. I would like to just store 13.45 here.</li>
</ol>
All the numbers will be used for mathematical manipulation within a program. I am using this website as a reference: http://www.sqlite.org/datatype3.html
I would like to know which data type affinity will give me the most compact database. I am currently getting a database size close to 40 GB!
Thanks for the help.
Answer1:
In SQLite, integer data types can occupy 1,2,4 or 8 bytes, depending on how big is the data value. Boolean type is stored as integer 0 or 1 and will always occupy 1 byte. REAL (float) data needs 8 bytes to store. If you are willing to round your data to 2 decimal digits, you may want to multiply your float data by 100 and store it as 4 byte integer value (and save 4 bytes per value). 9 binary digits that you mentioned should need 2 bytes to store.
How can you optimize your data? There are many strategies. If you have a lot of boolean fields, you might be able to use one byte to store up to 8 bits. This should allow you to improve storage efficiency by factor of 8. Another good strategy is to normalize your data: split it into more than 1 table. However, this only works if your data schema can permit such split.
Often, if you have too many indexes, they may significantly increase size of database on disk. Right solution is to revise actual SQL queries are used by software using EXPLAIN, find out indexes which are not really needed and drop them.
Finally, don't forget that using VACUUM FULL you will make database occupy as small space on disk as possible, but running it could take a while and requires exclusive database lock. Good luck!