May 29 2014
 

Today I want to repost for my readers a really interesting article by Gionatan Danti, first posted on his blog http://www.ilsistemista.net/. I hope you enjoy it as much as I did.

File compression is an old trick: one of the first (if not the first) programs capable of compressing files was “SQ”, in the early 1980s, but the first widespread, mass-known compressor was probably ZIP (released in 1989).

In other words, compressing a file to save space is nothing new. While current TB-sized, low-cost disks provide plenty of space, compression is sometimes still desirable: it not only reduces the space needed to store data, but can even increase I/O performance, because fewer bits have to be written to or read from the storage subsystem. This is especially true when comparing ever-increasing CPU speed to the more-or-less stagnant performance of mechanical disks (SSDs are another matter, of course).

While compression algorithms and programs vary, we can basically distinguish two main categories: generic lossless compressors and specialized lossy compressors.

The latter category includes compressors with quite spectacular compression factors, but they can typically be used only when you want to preserve the general information as a whole and do not need a bit-exact representation of the original data. In other words, you can use a lossy compressor for storing a high-resolution photo or a song, but not for storing a compressed executable on your disk (executables need to be stored perfectly, bit for bit) or text log files (we don’t want to lose information in text files, right?).

So, for the general use case, lossless compressors are the way to go. But which compressor should you use among the many available? Sometimes different programs use the same underlying algorithm or even the same library implementation, so choosing one or the other matters relatively little. However, when comparing compressors that use different compression algorithms, the choice must be a weighed one: do you want to favor a high compression ratio or speed? In other words, do you need a fast, low-compression algorithm or a slow but more effective one?
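The ratio-versus-speed trade-off can be observed even within a single algorithm. The following is a minimal sketch, not taken from the original article, that uses Python's standard `zlib` module at different compression levels. The sample data and the helper names `make_sample` and `measure` are assumptions made for illustration, and the timings depend entirely on the machine and the data.

```python
import time
import zlib
from typing import Tuple


def make_sample(lines: int = 20000) -> bytes:
    """Build repetitive, log-like text (hypothetical sample data)."""
    return "".join(
        f"2014-05-29 12:00:{i % 60:02d} host{i % 7} "
        f"request {i} served in {i * 37 % 1000} ms\n"
        for i in range(lines)
    ).encode("ascii")


def measure(data: bytes, level: int) -> Tuple[int, float]:
    """Return (compressed size in bytes, elapsed seconds) for one zlib level."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Lossless: decompressing must give back the exact original bytes.
    assert zlib.decompress(packed) == data
    return len(packed), elapsed


if __name__ == "__main__":
    sample = make_sample()
    # Level 1 favors speed, level 9 favors ratio, level 6 is zlib's default.
    for level in (1, 6, 9):
        size, seconds = measure(sample, level)
        print(f"level {level}: {size / len(sample):.1%} of original, "
              f"{seconds * 1000:.1f} ms")
```

On text-like input, higher levels usually produce smaller output at the cost of more CPU time, but how large the difference is should be measured on your own data.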

In this article, we are going to examine many different compressors based on a few different compression libraries:

  • lz4, a new, high speed compression program and algorithm
  • lzop, based on the fast lzo library, implementing the LZO algorithm
  • gzip and pigz (multithreaded gzip), based on the DEFLATE algorithm also used by ZIP (pigz builds on the zlib library)
  • bzip2 and pbzip2 (multithreaded bzip2), based on the libbzip2 library implementing the Burrows–Wheeler compression scheme
  • 7-zip, based mainly (but not only) on the LZMA algorithm
  • xz, another LZMA-based program
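Three of the algorithm families in the list above are also exposed by Python's standard library, which makes a quick side-by-side check easy. This is only a rough sketch and not the benchmark from the original article: it calls the `zlib`, `bz2` and `lzma` modules with their default settings rather than the command-line programs, and it leaves out lz4 and LZO, which are not part of the standard library. The `CODECS` table and the `compare` helper are names invented for this example.

```python
import bz2
import lzma
import zlib

# Label -> (compress function, decompress function), all with default settings.
CODECS = {
    "zlib (DEFLATE, as used by gzip)": (zlib.compress, zlib.decompress),
    "bz2 (Burrows-Wheeler, as used by bzip2)": (bz2.compress, bz2.decompress),
    "lzma (LZMA, .xz container as used by xz)": (lzma.compress, lzma.decompress),
}


def compare(data: bytes) -> dict:
    """Compress data with each codec and return {label: compressed size}."""
    sizes = {}
    for label, (compress, decompress) in CODECS.items():
        packed = compress(data)
        # A lossless compressor must reproduce the input bit for bit.
        if decompress(packed) != data:
            raise ValueError(f"round trip failed for {label}")
        sizes[label] = len(packed)
    return sizes


if __name__ == "__main__":
    payload = b"the quick brown fox jumps over the lazy dog\n" * 5000
    for label, size in compare(payload).items():
        print(f"{label}: {len(payload)} -> {size} bytes")
```

The sizes printed for such an artificial, highly repetitive payload say little about real files; the point is only that the same input can be fed to each library and verified to round-trip exactly.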

Continue reading »


Nov 16 2013
 

arkos

Recently I discovered a project with great ambitions:

arkOS is an open-source platform for securely self-hosting your online life.

It all started with Jacob Cook and the CitizenWeb Project he founded. arkOS is designed to run on a Raspberry Pi (a super-low-cost single-board computer) and will ultimately let users, even non-technical ones, run email, social networking, storage and other services from within their homes. These are services that are increasingly being shunted out into the cloud, and so under the control of big companies.

So in short arkOS is a lightweight Linux-based operating system that runs on a Raspberry Pi.

It allows you to easily host your own website, email, “cloud” and more, all within arm’s reach. It does this by interfacing with existing software and allowing the user to easily update and change settings with a graphical interface. No more need to depend on external cloud services, which can be insecure “walled gardens” that require you to give up control over your data.

arkOS will have several different components that come together to make a seamless self-hosting experience possible on your Raspberry Pi. Each of these components will work with each other out-of-the-box, allowing you to host your websites, email, social networking accounts, cloud services, and many other things from your arkOS node.

Continue reading »


May 22 2012
 

Back in December 2011, data-intensive Linux users rejoiced as Apache Hadoop reached its 1.0.0 milestone. Setting a benchmark for distributed computing software, this wonderful little program is now at release 1.0.3. But what is Hadoop, and how can you benefit from using it?

Designed with ‘web-scale’ operations in mind, Hadoop can handle massive amounts of information, allowing you to quickly and efficiently process volumes of data that other systems simply cannot handle. But that’s just the beginning. Hadoop also allows you to network this process – it can distribute large amounts of work across a cluster of machines, allowing you to handle workloads that a single processor simply cannot manage.
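To give a feel for what "distributing work" means, here is a toy, single-machine sketch of the split, map and reduce pattern that Hadoop's MapReduce component is built around. It is plain Python and does not use any Hadoop API; the function names are made up for this illustration, and on a real cluster each chunk would be processed on a different node instead of in a simple loop.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(lines, n_workers):
    """Deal the input lines out into n_workers roughly equal chunks."""
    return [lines[i::n_workers] for i in range(n_workers)]


def map_chunk(chunk):
    """Map step: count the words of one chunk, independently of the others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Reduce step: merge the per-chunk counts into a single total."""
    return reduce(lambda a, b: a + b, partials, Counter())


def word_count(lines, n_workers=4):
    chunks = split_into_chunks(lines, n_workers)
    # Here the chunks are processed one after another; a cluster framework
    # would hand each chunk to a different machine and collect the results.
    partials = [map_chunk(chunk) for chunk in chunks]
    return reduce_counts(partials)


if __name__ == "__main__":
    text = ["big data is big", "data moves to the cluster", "the cluster scales"]
    print(word_count(text, n_workers=2).most_common(3))
```

Because each chunk is counted without looking at any other chunk, the map step can be spread over as many machines as there are chunks, which is the property that lets this style of processing scale.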

Continue reading »
