Jun 03 2013
 

Recently a colleague of mine needed to run multiple jobs in parallel on a server (around 20 in our specific case). Each job performs a computation that can take more or less time than the others. Once a job finishes its task, it reads the first line of a text file, which holds the ID of the next job, removes that ID from the top of the file, and starts processing it.

Problem: over a long run (50K+ jobs) it happens “frequently” that two jobs finish at the same time and both take the same ID, which causes problems for the whole process. We tried a simple lock file, “touched” right after the job opened the file with the list of IDs, but this operation seems to be too slow, and we still had some cases where two jobs took the same ID.

The solution was to use the flock command (a utility shipped with util-linux, not a bash built-in).



flock(2) is used to apply advisory locks to open files. It can be used to synchronize access to resources across multiple running processes. While flock(2) acts solely on files (actually, on file handles), the file itself need not be the resource to which access is being controlled. Instead, the file can be used as a semaphore to control access to a critical section, in which any resource can be accessed without concurrency concerns. The flock(1) command used in the rest of this post makes this system call available to shell scripts.
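Applied to the job-list problem described earlier, the idea might look like the sketch below. It is only an illustration under assumptions: the function name next_id, the variables QUEUE and QUEUE_LOCK and the file paths are made up for this example and are not taken from our real setup. The lock is taken on a separate lock file rather than on the list itself, because the list is replaced with mv and would otherwise change underneath the lock.

```shell
#!/bin/bash
# Hypothetical names: QUEUE is the text file with one job ID per line,
# QUEUE_LOCK is a separate file used only as the semaphore.
QUEUE="${QUEUE:-/tmp/job_ids.txt}"
QUEUE_LOCK="${QUEUE_LOCK:-${QUEUE}.lock}"

# Print the first ID of the list and remove it, one caller at a time.
next_id() {
    (
        # Wait until this subshell holds the exclusive lock on descriptor 9.
        flock -x 9 || exit 1

        # Critical section: only one process at a time runs these lines.
        [ -s "$QUEUE" ] || exit 1            # list is empty or missing
        id=$(head -n 1 "$QUEUE")
        # Rewrite the list without its first line.
        tail -n +2 "$QUEUE" > "${QUEUE}.tmp" && mv "${QUEUE}.tmp" "$QUEUE"
        printf '%s\n' "$id"
    ) 9>"$QUEUE_LOCK"                        # lock is dropped when the subshell ends
}
```

A worker could then call something like id=$(next_id) || exit 0 each time it finishes a job; a non-zero status means the list is empty.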

Does this sound complicated?

Perhaps a small example will help. This is my /tmp/hello.sh script:

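#!/bin/bash

set -e

# Name the lock file after the script itself
# (/var/run is usually writable only by root)
scriptname=$(basename "$0")
lock="/var/run/${scriptname}"

# Open the lock file on file handle 200, then try to lock it without waiting
exec 200>$lock
flock -n 200 || exit 1

## The code:
pid=$$
echo "$pid" 1>&200
sleep 60
echo "Hello world"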

Some explanation of the interesting parts of the code:

set -e

When this option is on, if a simple command fails for any of the reasons listed in Consequences of Shell Errors or returns an exit status value >0, and is not part of the compound list following a while, until, or if keyword, and is not a part of an AND or OR list, and is not a pipeline preceded by the ! reserved word, then the shell shall immediately exit.
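A quick way to see the rule about OR lists in action is the sketch below. It runs each test in a child bash (so that set -e cannot stop the demo itself); the variable names plain_status and list_status are just illustrative.

```shell
#!/bin/bash
# First case: a plain failing command under set -e ends the child script,
# so "not reached" is never printed.
bash -c 'set -e; false; echo "not reached"'
plain_status=$?     # exit status of the child: 1

# Second case: the failing command is part of an || list, so set -e does
# not apply to it and the child script keeps running.
bash -c 'set -e; false || echo "handled"; echo "still running"'
list_status=$?      # exit status of the child: 0

echo "plain: $plain_status, list: $list_status"
```

This is why the script spells out || exit 1 after flock: a failure inside an OR list is not caught by set -e, so the exit has to be explicit.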

exec 200>$lock

Normally “exec” in a shell script is used to turn over control of the script to some other program. But it can also be used to open a file and assign a file handle (file descriptor) to it. Normally, every script has standard input (file handle 0), standard output (file handle 1) and standard error (file handle 2) opened for it. The call “exec 200>$lock” opens the file named in $lock for writing, creating it if it does not exist and truncating it if it does, and assigns it file handle 200.
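As a minimal sketch of this use of exec, the snippet below opens a throwaway temporary file on file handle 200, writes to it twice and closes it again. The variable name demo_file is made up for the example.

```shell
#!/bin/bash
demo_file=$(mktemp)        # throwaway file, only for this demo

exec 200>"$demo_file"      # open the file for writing on file handle 200
echo "first line" >&200    # write through the handle
echo "second line" >&200   # the handle keeps its position, so this line follows the first
exec 200>&-                # close file handle 200

cat "$demo_file"
```

The file ends up containing both lines, in order, because the handle stays open between the two echo commands.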

flock -n 200||exit 1

Tells flock to take an exclusive lock on the file referenced by file handle 200. Because of -n, flock fails immediately instead of waiting if another process already holds the lock, and the script then exits with code 1. The lock remains in place after the flock call returns, because the file handle is still open. It lasts until the file handle is closed, typically when the script exits.
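The lifetime of the lock can also be checked from a single script, without a second terminal. The sketch below is only an illustration: it uses a temporary file as the lock, and a second file handle (201, opened separately on the same file) to play the role of the second process.

```shell
#!/bin/bash
lock=$(mktemp)             # hypothetical lock file, only for this demo

exec 200>"$lock"
flock -n 200 && echo "first attempt: lock acquired"

# A separate open of the same file cannot be locked while handle 200 holds the lock.
if ( flock -n 201 ) 201>"$lock"; then second="acquired"; else second="busy"; fi
echo "second attempt: $second"

exec 200>&-                # closing the handle releases the lock

if ( flock -n 201 ) 201>"$lock"; then third="acquired"; else third="busy"; fi
echo "third attempt: $third"
```

The second attempt should report busy and the third should succeed, since by then the handle that held the lock has been closed.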

After that I store the PID of this process in the variable $pid and write it to the lock file, sleep for 60 seconds (to test what happens if the script is run a second time), and finally give my message to the world.

And this is the output of this script, run with tracing enabled (bash -x):

First run:

bash -x /tmp/hello.sh 
+ set -e
++ basename /tmp/hello.sh
+ scriptname=hello.sh
+ lock=/var/run/hello.sh
+ exec
+ flock -n 200
+ pid=4683
+ echo 4683
+ sleep 60
+ echo 'Hello world'
Hello world

Second run (while the first is sleeping):

bash -x /tmp/hello.sh 
+ set -e
++ basename /tmp/hello.sh
+ scriptname=hello.sh
+ lock=/var/run/hello.sh
+ exec
+ flock -n 200
+ exit 1

The main options that you can use with the command flock are:

-s, --shared
Obtain a shared lock, sometimes called a read lock.

-x, -e, --exclusive
Obtain an exclusive lock, sometimes called a write lock. This is the default.

-u, --unlock
Drop a lock. This is usually not required, since a lock is automatically dropped when the file is closed. However, it may be required in special cases, for example if the enclosed command group may have forked a background process which should not be holding the lock.

-n, --nb, --nonblock
Fail (with an exit code of 1) rather than wait if the lock cannot be immediately acquired.

-w, --wait, --timeout seconds
Fail (with an exit code of 1) if the lock cannot be acquired within the given number of seconds. Decimal fractional values are allowed.
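To compare -n, -w and the default blocking behaviour, here is a small sketch that uses the other form of the command, flock FILE COMMAND, which locks the file only while COMMAND runs. The temporary lock file, the variable names and the timings are arbitrary choices for the demo, and the half-second pause is only a crude way to let the background job grab the lock first.

```shell
#!/bin/bash
lock=$(mktemp)                 # hypothetical lock file, only for this demo

# Hold an exclusive lock for 3 seconds in the background.
flock -x "$lock" sleep 3 &
sleep 0.5                      # give the background job time to take the lock

# -n: give up immediately because the lock is busy.
flock -n "$lock" true; nonblock_status=$?

# -w 1: wait at most one second, then give up.
flock -w 1 "$lock" true; timeout_status=$?

# No option: wait for as long as it takes.
flock "$lock" true; blocking_status=$?

wait                           # reap the background job
echo "nonblock=$nonblock_status timeout=$timeout_status blocking=$blocking_status"
```

With these timings the first two attempts should fail with status 1 and the last one should succeed once the background sleep ends.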

Conclusions

flock is an easy-to-use command that lets multiple jobs or processes run safely against resources that must be read or written by a single process at a time.


  2 Responses to “Linux shell: Introduction to Flock”

  1. I discovered flock recently and use it every time the system boots and runs many iptables rules from a script. iptables does not seem to have a locking function of its own, and as a consequence some iptables rules are not loaded at all.

    So instead of:
    IPT=`which iptables`

    I always use:
    IPT='flock /dev/shm/iptables.lock iptables'

  2. You could look at this free software project as well: http://sourceforge.net/projects/flom/
    It must be compiled and installed, but after that step, everything becomes as easy as running
    flom -- my_command_I_need_to_synchronize

    Cheers
    Ch.F.
