Feb 17, 2011
 

Today I present an excellent and comprehensive article on an open source search engine: Nutch. You can find the original article, with the code examples, here.

After reading this article, readers should be somewhat familiar with the basic crawling concepts and the core MapReduce jobs in Nutch.

What is a web crawler?

A web crawler is a computer program that discovers and downloads content from the web, usually via the HTTP protocol. The discovery process of a crawler is usually simple and straightforward. A crawler is first given a set of URLs, often called seeds. Next, the crawler downloads the content from those URLs and then extracts hyperlinks, or URLs, from the downloaded content. This is much the same thing that happens in the real world when a human uses a web browser and clicks on links from a homepage, and from the pages that follow, one after another.
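The loop described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not how Nutch itself is implemented (Nutch is written in Java and runs its crawl as MapReduce jobs). The names `crawl`, `extract_links`, `fetch`, and `max_pages` are hypothetical, and the download step is passed in as a function so that the example runs against a small in-memory "web" instead of the real network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute URLs (fragments removed) for the links found in html."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.links]


def crawl(seeds, fetch, max_pages=100):
    """Breadth-first crawl starting from the seed URLs.

    fetch is a caller-supplied function (hypothetical, not a Nutch API)
    that maps a URL to its HTML text, or returns None on failure.
    Returns the URLs in the order they were successfully downloaded.
    """
    frontier = deque(dict.fromkeys(seeds))  # URLs waiting to be downloaded
    seen = set(frontier)                    # URLs already queued at some point
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue  # download failed or page does not exist
        visited.append(url)
        for link in extract_links(url, html):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# A made-up three-page site standing in for the real web.
FAKE_WEB = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/b">B</a> <a href="/">home</a>',
    "http://example.com/b": '<a href="/c#top">C</a>',
}

print(crawl(["http://example.com/"], FAKE_WEB.get))
```

The `seen` set is what keeps the crawler from downloading the same page twice when several pages link to each other. A real crawler would also need politeness delays, robots.txt handling, and error handling, all of which are left out here.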


Feb 16, 2011
 

Tired of the “usual” screensavers shipped with the major window managers?

Today we will look at some alternative programs that can be used on Linux to get new and original effects.

Matrixgl

Matrixgl is a free, open source 3D screensaver based on The Matrix Reloaded. It supports widescreen setups and can be run on Windows, Mac OS X, Linux, BSD, and many other Unix-based operating systems.



Feb 06, 2011
 

The news of the day is that the new stable version of Debian, release 6.0, code name Squeeze, has finally been released.

This is the official news from the Debian site.

After 24 months of constant development, the Debian Project is proud to present its new stable version 6.0 (code name “Squeeze”). Debian 6.0 is a free operating system, coming for the first time in two flavours. Alongside Debian GNU/Linux, Debian GNU/kFreeBSD is introduced with this version as a “technology preview”.

Debian 6.0 includes the KDE Plasma Desktop and Applications, the GNOME, Xfce, and LXDE desktop environments as well as all kinds of server applications. It also features compatibility with the FHS v2.3 and software developed for version 3.2 of the LSB.

