Just some questions I had while using this cluster system.

What is the maximum wall time?

It's 30 days. You can run jobs for up to that long without asking an administrator for permission. Keep in mind that this limit cannot be extended.
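As a rough sketch, this is how the 30-day limit translates into a walltime request. I'm assuming here that the scheduler accepts the usual PBS/Torque form -l walltime=HH:MM:SS with an hour count above 24; check the cluster documentation for the exact format it expects.

```shell
#!/bin/bash
# Convert the 30-day limit into hours for a walltime request.
days=30
hours=$((days * 24))

# Print the header line you would put at the top of a job script.
# Assumption: the scheduler accepts HH:MM:SS with more than 24 hours.
echo "#PBS -l walltime=${hours}:00:00"
```

Asking for less than the maximum when you can is usually a good idea, since shorter jobs tend to be easier for the scheduler to fit in.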

If you are going to run very long jobs, please consult the checkpointing manual, which contains notes from a Calcul Québec workshop.

One of my jobs is running under an account that is not mine!

Happened to me once. See this:

$ showq -r -u jmateos

30486658            R  gm-  93.14      1.2 no   jmateos eim-670-      sw-2r14-n14    12  1:16:49:28  Thu Apr 16 03:51:21
30486659            R  gm-  88.11      1.3 no   jmateos eim-670-      sw-2r15-n70    12  1:18:30:22  Thu Apr 16 05:32:15
30621080            R  gm-  91.52      1.0 no   jmateos  atlaspt      sw-2r14-n13    12  3:03:19:00  Thu Apr 16 14:20:53

Job number 30621080 seems to be running under allocation group atlaspt and not one of our own. I contacted Guillimin support and they told me this:

This is an issue with our scheduler software that is only a cosmetic problem. The job 30621080 will be correctly charged to the account eim-670-aa. The 'group' parameter in the scheduler is distinct from one called 'account' which is the one used for accounting purposes. The group parameter is correct on the worker nodes, but gets confused at some point on the scheduler node. However, it will not affect your jobs.

So don't worry too much about this.
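If you want to spot such jobs quickly, a small filter over the showq output can help. This is only a sketch: it assumes the column layout of the sample above (job ID in field 1, state in field 2, group in field 8) and uses the group label exactly as showq truncates it. The sample lines are embedded so the snippet runs anywhere; on the cluster you would pipe showq -r -u jmateos into the same awk command.

```shell
#!/bin/bash
# Sample lines copied from the showq output above.
sample='30486658            R  gm-  93.14      1.2 no   jmateos eim-670-      sw-2r14-n14    12  1:16:49:28  Thu Apr 16 03:51:21
30486659            R  gm-  88.11      1.3 no   jmateos eim-670-      sw-2r15-n70    12  1:18:30:22  Thu Apr 16 05:32:15
30621080            R  gm-  91.52      1.0 no   jmateos  atlaspt      sw-2r14-n13    12  3:03:19:00  Thu Apr 16 14:20:53'

# Group label as displayed (truncated) by showq.
expected_group="eim-670-"

# Print the IDs of running jobs whose group column differs from the expected one.
# Assumption: field 8 is the group column, as in the sample layout.
mismatched=$(printf '%s\n' "$sample" |
  awk -v g="$expected_group" '$2 == "R" && $8 != g { print $1 }')

echo "$mismatched"
```

Remember that, according to support, the group shown here is only cosmetic, so a mismatch does not mean the job is charged to the wrong account.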

I would like to receive an e-mail when my jobs are done

You can use the -m and -M switches in your script header: -m selects which events trigger a message, and -M sets the address the message is sent to. Example:

#PBS -m abe

The -m abe option instructs the scheduler to send you an e-mail when your job starts (b), finishes (e), or aborts due to an error (a). You can use any combination of these letters.
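For reference, here is a minimal job script sketch that combines both switches. The address user@example.com, the walltime value, and the echo payload are placeholders I made up for illustration; replace them with your own.

```shell
#!/bin/bash
#PBS -l walltime=00:10:00
#PBS -m abe
#PBS -M user@example.com

# Hypothetical payload: replace with the real work of the job.
msg="job finished"
echo "$msg"
```

The #PBS lines are plain comments to bash, so the script also runs locally for a quick check before you submit it with qsub.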

guillimin_faq.txt · Last modified: 2016/11/03 17:23 (external edit)