Jan 06 2011
 

recoll
Article written by: Francesco Di Leo
I have actually tried them all, from the legendary Beagle to Tracker. Beagle is now effectively a dead project. It was quite interesting, able to search many types of files and miscellaneous information, but personally I was not willing to give up so many resources to Mono. A pity: had it been developed from the outset with standard libraries, perhaps it would be on every computer today. I tried other, lesser-known programs such as Catfish, Pinot and a few others whose names escape me (cursed memory), until I arrived at Tracker, a search program for files (and information) for GNOME. Until version 0.6 it seemed that someone had finally decided to work seriously on a program that would spare me from relying on memory to recall which file or e-mail contained the information I was looking for. I was promptly proven wrong (and betrayed).




Not only have the later versions regressed slightly, but every new release has so many bugs that I cannot recommend it for “serious” use. Some time ago I reported a bug about the incorrect indexing of files whose names or contents consist of letters and digits arranged in various ways. The problem still partly exists, although I have not yet tried the latest versions (and I do not know when I will, because of their dependencies). Google Desktop suffers from the same problem. Disheartened by so many file search programs that promised much but turned out to be mere “demos”, I landed on Recoll, with its Spartan interface. I did not mind: I wanted a program that could index all my files without leaving me to puzzle over why it had failed. I gave it the usual “road” test and noticed that it also indexes the files on which the others failed. My search was over.

The Program

Recoll consists mainly of the client program “recoll” and the indexer “recollindex”. The latter is responsible for indexing files and can also run as a daemon to index files in real time. Depending on the distribution you may also find “recollq”, the command-line client. When launched for the first time, the Recoll client prompts you to configure it before indexing starts. Click “Cancel”, otherwise it will index your entire home folder. In the window that appears, add the folders and subfolders you want to index with the “+” button (remember to remove “~” with the “-” button). Click “OK” and Recoll will start indexing the files in these folders. When it finishes, try searching for files that contain a particular word or phrase, and you will get the results instantly. To update the index by hand later, simply select “Update Index” from the “File” menu.
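
For those who prefer the command line, the same two steps (index, then search) could be scripted. The following is only a minimal sketch, not the author's procedure: it assumes that recollindex accepts "-m" for real-time monitoring and "-c" for an alternate configuration directory, and that recollq takes the query as its last argument, as I understand the Recoll manual; check the man pages shipped with your distribution. The function names are invented for this illustration.

```python
import shutil
import subprocess


def build_index_command(monitor=False, config_dir=None):
    """Return the argument list for a recollindex run (nothing is executed)."""
    cmd = ["recollindex"]
    if config_dir is not None:
        # Assumption: -c selects an alternate configuration directory.
        cmd += ["-c", config_dir]
    if monitor:
        # Assumption: -m keeps recollindex running as a real-time monitor.
        cmd.append("-m")
    return cmd


def build_query_command(query, config_dir=None):
    """Return the argument list for a recollq search (nothing is executed)."""
    cmd = ["recollq"]
    if config_dir is not None:
        cmd += ["-c", config_dir]
    cmd.append(query)
    return cmd


def run_if_installed(cmd):
    """Run cmd only if its program is on PATH; return None otherwise."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, capture_output=True, text=True)


if __name__ == "__main__":
    # One-shot index update, followed by a search for a sample word.
    run_if_installed(build_index_command())
    result = run_if_installed(build_query_command("memory"))
    if result is None:
        print("recollq does not seem to be installed")
    else:
        print(result.stdout)
```

Keeping the command construction separate from the execution makes it possible to inspect what would be run before touching the index.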


(Figure 1 – example of search)

As you can see in Figure 1, Recoll indexes the contents of different file types along with any metadata. It also provides a text-only preview of the indexed content and metadata, as well as a relevance percentage, a summary of the content and other information.

In conclusion, if you are looking for a fairly comprehensive program to help your memory, Recoll deserves a serious test. I have been using it for some time and it fully meets my needs. It also has a comprehensive manual, something that should not be underestimated.

Notes and observations:

Recoll needs various helper programs. Under “Help” select “Show missing helpers”; if the answer is “No helpers found missing”, you have already installed everything you need on your Linux box.

Recoll also indexes e-mail messages and their attachments, but you must enter the full path of the folder that stores the e-mails (I use the mbox format).

You may encounter problems with Microsoft .doc files whose content consists of only two or three words. To be exact, the failure lies in the program used for the conversion rather than in Recoll itself; a better filter could solve this small issue.

Recoll can use extended attributes (a feature I have personally tested with the author).

The author takes the few bugs that do occur seriously.

The recoll client is based on version 4 of the Qt library (personally I use it without problems under Xfce), while recollq and recollindex depend on only a few standard libraries. The filters used may require other libraries (Python, etc.).

Recoll is highly configurable and customizable, though this is not a priority.
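
The “Show missing helpers” check in the first note can be roughly approximated from a script. This is a hedged sketch only: it simply looks for executables on the PATH, which is not necessarily how Recoll itself detects its helpers, and the helper names in the list are examples (the real set depends on the file types you index).

```python
import shutil

# Example helper names only; adjust the list to the file types you index.
EXAMPLE_HELPERS = ["catdoc", "unrtf"]


def missing_helpers(names):
    """Return the names that cannot be found as executables on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_helpers(EXAMPLE_HELPERS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All example helpers were found on PATH")
```

If a helper is installed but still reported as missing, comparing this output with what Recoll shows may help to tell a PATH problem from an indexing problem.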

Francesco Di Leo


  11 Responses to “Recoll, sometimes the memories are not enough.”

  1. Francesco, I could not agree more with you! I too searched for ages and ages to find a replacement for “Sleuthhound” which ran under Windows and was not too bad (although it could not handle indexes over networks with 200,000 files or so).

    I tried all the programs you wrote about.

    Finally Recoll came along and for me it does everything I want and need at the present time.

    regards from CH
    Martin

  2. Excellent article.
    I installed Recoll right away, but it reported some missing helpers (which I then installed from the terminal).
    However, Recoll still does not see them.
    Do I need to reboot the system, or something else?

  3. Very interesting. Would the daemon be able to run on a Synology disk station, which uses a Linux kernel:

    Linux SynologyDisk 2.6.32.12 #1372 SMP Tue Nov 2 17:57:51 CST 2010 x86_64 GNU/Linux synology_x86_1010+

    and could it then be queried remotely?

    • I have no experience with Synology. Which distribution does it run? Something Debian-like? If so, you may be able to install Recoll on it.

      Then (if possible) call the recoll client via ssh with X forwarding.
      That way you can run the remote client on your desktop and check the files and contents available on your NAS.

      Let us know if this is doable; I think it could interest many people.

  4. @Stephan
    I guess that as the Synology seems to run with X86 Linux you should be able to get recolld running. Does the Synology allow you to ssh into it? Is there a package management system for it (deb, rpm, etc)? Can you compile programs on it (is there a developer build package installed with make etc)?
    Once recolld is running, could you not mount (either as a Samba share or with sshfs) the xapiandb folder on your local machine and query the database “locally” using the recoll gui? In order for Preview and Open to work you would need to have the document tree loaded in a similar location locally as on the server I suppose. I would be interested to hear your results.
    regards
    Martin

  5. Reply to Adso72
    As for me, I installed the missing helpers through the official packages of the distribution I use. As I mentioned in the article, some of these may require other dependencies, so you need to check that all dependencies are satisfied.

    Francesco Di Leo

  6. @Linuxari, @Martin:
    I think it is Gentoo based, but I am not sure. You can SSH into the device. I haven’t checked whether it supports X. I can mount a share and make it seamless (using sshfs), so querying would work even if there’s no X.
    Synology has a package management system, but it’s not deb or rpm. They do have a developer section and actively encourage 3rd party add-ons: http://www.synology.com/enu/apps/3rd-party_application_integration.php including the offer to list the app on their site.
    I can lend moral support for a port but my C is worse than my Chinese.

  7. I too installed the missing helpers via apt-get (catdoc, antidoc, unrtf) on two PCs (home and office), both running 64-bit Kubuntu 10.10.

    One works perfectly; the other keeps telling me that it cannot find the missing helpers :(

  8. Which missing helpers can it not find?
    If you have not indexed anything yet, delete the hidden .recoll directory in your home directory.

  9. I use it. It’s a private Google on your computer, very powerful, and I love it. You will rediscover things that you had totally forgotten. A must-have.
