If you want to analyse a large dataset, you can create a single
master job and then specify how much data should be analysed in each of
a set of subjobs. For DaVinci jobs there are two splitters of interest:
The SplitByFiles splitter takes a single argument which
specifies the number of input data files that each subjob will process. The
following example creates a DaVinci master job using the default of 10 data
files per subjob.
In [31]:j = Job(application=dv, splitter=SplitByFiles())
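If the default does not suit your analysis, you can pass the number of files explicitly. This sketch assumes the splitter's parameter is named filesPerJob, as in recent Ganga releases; check the splitter's documentation in your installation if the name differs.

In [32]:j = Job(application=dv, splitter=SplitByFiles(filesPerJob=5))

Submitting this master job would then produce one subjob for every 5 input files in the dataset.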
The DiracSplitter works like SplitByFiles but ensures
that each subjob only processes data located at the same Grid
site. This is the recommended splitter for DaVinci jobs when using
the DIRAC backend.
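Putting this together, a DaVinci job targeting the Grid might be created as follows. This is a sketch under the assumption that the DIRAC backend object is named Dirac() in your Ganga session; adjust to match your configuration.

In [33]:j = Job(application=dv, splitter=DiracSplitter(), backend=Dirac())

Because DiracSplitter groups files by Grid site, each subjob created from this master job can run at the site where its data already resides, avoiding unnecessary data transfers.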