Recently a colleague of mine needed to run multiple parallel jobs on a server (around 20 in our specific case). Each job performs "an elaboration" that can take more or less time than the others. Once a job finishes its task, it reads the first line of a text file, which holds the ID of the next job, removes that ID from the top of the file, and starts working on it.
Problem: over a long run (50K+ jobs) it "frequently" happens that two jobs finish at the same time and both take the same ID, which causes problems for the whole process. We tried a simple lock file, "touched" right after the job opened the file with the list of IDs, but this operation seems to be too slow: checking for the lock and creating it are two separate steps, so another job can slip in between them, and we still saw some cases of failed concurrency.
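The solution was to use the flock command (a command-line utility from util-linux, not a bash built-in), which lets a shell script take an exclusive lock on a file.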
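
Here is a minimal sketch of how this could look, assuming the util-linux flock command is installed. The file names ids.txt and ids.txt.lock and the function name pop_id are made up for illustration and are not taken from our actual script. The idea is that reading the first ID and removing it from the file both happen while the job holds an exclusive lock, so two jobs can never see the same first line.

```shell
#!/usr/bin/env bash
# Sketch only: QUEUE, LOCK and pop_id are hypothetical names.
QUEUE="${QUEUE:-./ids.txt}"     # text file with one job ID per line
LOCK="${LOCK:-$QUEUE.lock}"     # separate lock file shared by all jobs

# Print the first ID of the queue and remove it from the file.
# Returns non-zero when the queue is empty.
pop_id() {
    (
        # Wait until this job holds an exclusive lock on descriptor 9.
        flock -x 9 || exit 1

        # Read the first line; give up if there is nothing left.
        IFS= read -r id < "$QUEUE" || [ -n "$id" ] || exit 1

        # Rewrite the queue without its first line, still under the lock.
        tail -n +2 "$QUEUE" > "$QUEUE.tmp" && mv "$QUEUE.tmp" "$QUEUE" || exit 1

        printf '%s\n' "$id"
    ) 9> "$LOCK"    # the lock is released when the subshell exits
}
```

Each job could then fetch its next ID with something like `id=$(pop_id)` and stop when the call fails. The lock is taken on a separate file rather than on the ID list itself because the list is replaced by mv on every call, and a lock held on the old file would not protect the new one.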