How Does a Computer Compress Files?

July 30, 2022

Today, as technology takes up more and more space in our lives, the amount of data users consume keeps increasing. File sizes are growing in parallel with this development. Now that the photos and videos we take are of higher quality, the space they occupy is increasing as well. Of course, computer scientists have found a workaround for this problem too: compression, a technology that has been part of our lives for a long time. Compression algorithms reduce large files to smaller sizes so that they take up less space.

We use applications such as WinRAR and WinZip to compress our files. So how do these applications reduce the size of these files? First of all, we should know that text compression is different from image and video compression. Image and video compression is usually lossy. In other words, the compressed version of an image contains less detail than the original. Text compression, on the other hand, must be completely lossless. We should not lose any information while compressing text, so that the decompressed text is exactly the original and there is no ambiguity.
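To make "lossless" concrete, here is a minimal sketch using Python's standard zlib module. zlib implements DEFLATE, which combines LZ77 with Huffman coding and is the method commonly used inside ZIP archives. The sample text is a made-up assumption, chosen to be repetitive so that it compresses well.

```python
import zlib

# Hypothetical sample text; the repetition makes it easy to compress.
original = ("the quick brown fox jumps over the lazy dog " * 50).encode("utf-8")

compressed = zlib.compress(original)    # shrink the bytes
restored = zlib.decompress(compressed)  # expand them again

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

The round trip gives back exactly the same bytes, which is what lossless means here.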

When you create a text in a simple single-byte encoding, each character is stored in the computer as 8 bits, that is, a sequence of eight 1s and 0s. This lets us represent 2 to the power of 8, that is, 256 different characters. These characters include uppercase and lowercase letters, punctuation marks, and even special characters in our alphabet such as ç, ü, ğ, ı, ş. So why 8? Because in such an encoding, 8 bits are enough to cover all of these characters.
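The 8-bit idea can be sketched in a few lines of Python. The article does not name an encoding, so this example assumes ISO-8859-9 (Latin-5), a single-byte encoding that includes the Turkish letters mentioned above. The helper name to_bits is made up for illustration.

```python
def to_bits(text, encoding="iso8859_9"):
    """Return each character's single-byte code as an 8-digit binary string.

    The encoding is an assumption; the article does not say which one is used.
    """
    return [format(byte, "08b") for byte in text.encode(encoding)]

print(2 ** 8)            # number of distinct 8-bit patterns: 256
print(to_bits("t"))      # one 8-digit pattern for one character
print(to_bits("çüğış"))  # five characters, five 8-digit patterns
```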

Huffman Coding

In the late 1940s, Claude Shannon and Robert Fano developed Shannon-Fano coding. Although this compression algorithm is not optimal, it compresses files to a certain degree by assigning codes according to the probabilities of the characters, using the cumulative distribution function.

In 1952, David Albert Huffman published Huffman coding, an entropy coding method he developed during his doctoral studies. For a given set of character frequencies, Huffman coding is never worse than Shannon-Fano coding, and it often compresses files further.

How Can We Do Huffman Encoding?

To compress a text with Huffman encoding, first count how many times each character occurs in the text (that is, its frequency) and list the characters in ascending order of frequency. Then take the two elements with the lowest frequencies. These two elements will be leaves, that is, the lowest elements of the Huffman tree we are going to build. Connect them under a new node and write the sum of their two frequencies as the value of that node. Put this new node back into the list in place of the two elements and repeat: each time, take the two lowest-frequency entries, whether they are single characters or nodes built earlier, and join them. When only one node is left, it is the root, and you have the complete Huffman tree.
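The tree-building steps can be sketched in Python as follows. This is only one possible way to write it. It keeps the list in a min-heap, so the two lowest frequencies are always at hand. The function name build_huffman_tree, the tuple representation of nodes, and the sample text "abracadabra" are all assumptions made for the illustration.

```python
import heapq
from collections import Counter
from itertools import count

def build_huffman_tree(text):
    """Build a Huffman tree for text.

    A leaf is a single character; an inner node is a (left, right) tuple.
    Returns (total_frequency, tree), or None if the text is empty.
    """
    frequencies = Counter(text)
    tie_breaker = count()  # unique numbers so equal frequencies never compare nodes
    heap = [(freq, next(tie_breaker), char) for char, freq in frequencies.items()]
    heapq.heapify(heap)
    if not heap:
        return None
    while len(heap) > 1:
        # Take the two entries with the lowest frequencies...
        freq_a, _, node_a = heapq.heappop(heap)
        freq_b, _, node_b = heapq.heappop(heap)
        # ...and join them under a new node whose value is the sum.
        heapq.heappush(heap, (freq_a + freq_b, next(tie_breaker), (node_a, node_b)))
    total, _, tree = heap[0]
    return total, tree

# Hypothetical sample text, not taken from the article.
total, tree = build_huffman_tree("abracadabra")
print(total)  # sum of all character frequencies, i.e. the length of the text
print(tree)
```

The value at the root always equals the length of the text, which is a quick sanity check that no character was lost while building the tree.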

Now it’s time to compress our text. Suppose the text you want to compress is “”. After creating our Huffman tree, the first character we will compress is the letter “t”. We find this letter in the tree and walk from the root to the node where it sits. Along the way, we write 1 every time we move to the right and 0 every time we move to the left. When we reach the target node, the new value of the letter “t” is the sequence of 0s and 1s we wrote down, combined in order. Since more frequently recurring characters stay near the top of the tree, they get shorter codes, and that is what increases our compression percentage. Also, Huffman encoding is not a very profitable method for texts that do not contain many bits, such as “”.
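The walk from the root can be sketched in Python as well. The example text of the article is not available here, so this sketch assumes the hypothetical text "test" and a small hand-built tree for it, where a leaf is a character and an inner node is a (left, right) tuple. The function names are made up, and the sketch assumes the text has at least two distinct characters.

```python
def assign_codes(node, prefix="", table=None):
    """Walk the tree: add "0" when going left and "1" when going right."""
    if table is None:
        table = {}
    if isinstance(node, str):   # leaf: the path so far is this character's code
        table[node] = prefix
    else:
        left, right = node
        assign_codes(left, prefix + "0", table)
        assign_codes(right, prefix + "1", table)
    return table

def encode(text, table):
    """Replace every character with its code and join the results."""
    return "".join(table[char] for char in text)

def decode(bits, tree):
    """Follow the bits down from the root; emit a character at each leaf."""
    result, node = [], tree
    for bit in bits:
        node = node[0] if bit == "0" else node[1]
        if isinstance(node, str):
            result.append(node)
            node = tree         # start again from the root
    return "".join(result)

# Hypothetical tree for the assumed text "test": "t" occurs twice, so it sits
# right under the root, while "e" and "s" occur once and sit one level lower.
tree = ("t", ("e", "s"))
table = assign_codes(tree)
bits = encode("test", table)

print(table)
print(bits, "->", len(bits), "bits instead of", 8 * len("test"))
print(decode(bits, tree))
```

Decoding works without separators because no code is the beginning of another code: every character sits at a leaf, so the walk can never stop halfway through a longer code.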

If you want to create your own Huffman tree and compress your own text, you can build a tree on this site and try text compression.
