Dirac
DIRAC API Class
All DIRAC functionality is exposed through the DIRAC API, which also serves as a source of documentation for the project via EpyDoc.
The DIRAC API provides the following functionality:
- A transparent and secure way for users to submit jobs to the Grid, monitor them and retrieve outputs
- Interaction with Grid storage and file catalogues via the DataManagement public interfaces (more to be added)
- Local execution of workflows for testing purposes.
- class DIRAC.Interfaces.API.Dirac.Dirac(useCertificates=False, vo=None)
Bases:
DIRAC.Core.Base.API.API
DIRAC API Class
- __init__(useCertificates=False, vo=None)
Internal initialization of the DIRAC API.
- addFile(lfn, fullPath, diracSE, fileGuid=None, printOutput=False)
Add a single file to Grid storage. lfn is the desired logical file name for the file, fullPath is the local path to the file and diracSE is the name of the Storage Element to upload to. The fileGuid is optional; if it is not specified, a GUID is generated on the fly. If subsequent access depends on the file GUID, the correct one should be supplied via fileGuid.
Example Usage:
>>> print dirac.addFile('/lhcb/user/p/paterson/myFile.tar.gz','myFile.tar.gz','CERN-USER') {'OK': True, 'Value':{'Failed': {}, 'Successful': {'/lhcb/user/p/paterson/test/myFile.tar.gz': {'put': 64.246301889419556, 'register': 1.1102778911590576}}}}
- Parameters
lfn (string) – Logical File Name (LFN)
diracSE (string) – DIRAC SE name e.g. CERN-USER
printOutput (boolean) – Optional flag to print result
- Returns
S_OK,S_ERROR
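The nested result shown in the example can be unpacked with a small helper. This is a minimal sketch that assumes the `Successful`/`Failed` layout from the example output and the usual DIRAC convention that a failed call carries its text under `Message`. `split_bulk_result` is a hypothetical name, not part of the DIRAC API, and the sample dictionary is copied from the example rather than produced by a live call.

```python
def split_bulk_result(result):
    """Return (successful, failed) dictionaries from a bulk-operation result.

    Assumes result['Value'] holds 'Successful' and 'Failed' dictionaries
    keyed by LFN, as in the addFile example output.
    """
    if not result.get("OK"):
        # S_ERROR-style result: the whole call failed
        raise RuntimeError(result.get("Message", "unknown error"))
    value = result["Value"]
    return value.get("Successful", {}), value.get("Failed", {})


# Sample copied from the addFile example output (not a live call)
sample = {
    "OK": True,
    "Value": {
        "Failed": {},
        "Successful": {
            "/lhcb/user/p/paterson/test/myFile.tar.gz": {
                "put": 64.246301889419556,
                "register": 1.1102778911590576,
            }
        },
    },
}

ok, failed = split_bulk_result(sample)
for lfn, timings in ok.items():
    print("%s uploaded in %.1f s" % (lfn, timings["put"]))
for lfn, reason in failed.items():
    print("%s failed: %s" % (lfn, reason))
```

The same helper should apply to any method on this page whose example output uses the `Successful`/`Failed` layout, such as getReplicas or replicateFile.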
- checkSEAccess(se, access='Write')
Return the value of a given SE status flag (access or other).
- Parameters
se (string) – Storage Element name
access (string in ('Read', 'Write', 'Remove', 'Check')) – type of access
- Returns
True or False
- deleteJob(jobID)
Delete a job or list of jobs from the WMS (sets status=DELETED). Jobs that are still running are killed first.
Example Usage:
>>> print dirac.deleteJob(12345) {'OK': True, 'Value': [12345]}
- getAccessURL(lfn, storageElement, printOutput=False, protocol=False)
Retrieve an access URL for an LFN replica, given a valid DIRAC SE name. The file catalog and the site endpoint are contacted behind the scenes.
Example Usage:
>>> print dirac.getAccessURL('/lhcb/data/CCRC08/DST/00000151/0000/00000151_00004848_2.dst','CERN-RAW') {'OK': True, 'Value': {'Successful': {'srm://...': {'SRM2': 'rfio://...'}}, 'Failed': {}}}
- getAllReplicas(lfns, printOutput=False)
Obtain replica information from the file catalogue client. Input LFN(s) can be a string or a list.
This differs from getReplicas only in that replicas on banned SEs are included in the result.
Example usage:
>>> print dirac.getAllReplicas('/lhcb/data/CCRC08/RDST/00000106/0000/00000106_00006321_1.rdst') {'OK': True, 'Value': {'Successful': {'/lhcb/data/CCRC08/RDST/00000106/0000/00000106_00006321_1.rdst': {'CERN-RDST': 'srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/data/CCRC08/RDST/00000106/0000/00000106_00006321_1.rdst'}}, 'Failed': {}}}
- getConfigurationValue(option, default)
Export the configuration client getValue() function
- getFile(lfn, destDir='', printOutput=False)
Retrieve a single file or list of files from Grid storage to the current directory, or to destDir if one is given. lfn is the logical file name (or list of logical file names) to retrieve.
Example Usage:
>>> print dirac.getFile('/lhcb/user/p/paterson/myFile.tar.gz') {'OK': True, 'Value':{'Failed': {}, 'Successful': {'/lhcb/user/p/paterson/test/myFile.tar.gz': '/afs/cern.ch/user/p/paterson/myFile.tar.gz'}}}
- Parameters
lfn (string) – Logical File Name (LFN)
printOutput (boolean) – Optional flag to print result
- Returns
S_OK,S_ERROR
- getInputDataCatalog(lfns, siteName='', fileName='pool_xml_catalog.xml', ignoreMissing=False)
This utility will create a pool xml catalogue slice for the specified LFNs using the full input data resolution policy plugins for the VO.
If not specified the site is assumed to be the DIRAC.siteName() from the local configuration. The fileName can be a full path.
Example usage:
>>> print d.getInputDataCatalog('/lhcb/a/b/c/00001680_00000490_5.dst',None,'myCat.xml') {'Successful': {'<LFN>': {'pfntype': 'ROOT_All', 'protocol': 'SRM2', 'pfn': '<PFN>', 'turl': '<TURL>', 'guid': '3E3E097D-0AC0-DB11-9C0A-00188B770645', 'se': 'CERN-disk'}}, 'Failed': [], 'OK': True, 'Value': ''}
- Parameters
lfns (LFN str or list []) – Logical File Name(s) to query
siteName (string) – DIRAC site name
fileName (string) – Catalogue name (can include path)
- Returns
S_OK,S_ERROR
- getInputSandbox(jobID, outputDir=None)
Retrieve input sandbox for existing JobID.
This method allows the retrieval of an existing job input sandbox for debugging purposes. By default the sandbox is downloaded to the current directory, but this can be overridden via the outputDir parameter. All files are extracted into an InputSandbox<JOBID> directory that is created automatically.
Example Usage:
>>> print dirac.getInputSandbox(12345) {'OK': True, 'Value': ['Job__Sandbox__.tar.bz2']}
- Parameters
jobID (integer or string) – JobID
outputDir (string) – Optional directory for files
- Returns
S_OK,S_ERROR
- getJobAttributes(jobID, printOutput=False)
Return DIRAC attributes associated with the given job.
Each job will have certain attributes that affect the journey through the workload management system, see example below. Attributes are optionally printed to the screen.
Example Usage:
>>> print dirac.getJobAttributes(79241) {'AccountedFlag': 'False','ApplicationNumStatus': '0', 'ApplicationStatus': 'Job Finished Successfully', 'CPUTime': '0.0'}
- getJobCPUTime(jobID, printOutput=False)
Retrieve job CPU consumed heartbeat data from job monitoring service. Jobs can be specified individually or as a list.
The time stamps and raw CPU consumed (s) are returned (if available).
Example Usage:
>>> d.getJobCPUTime(959209) {'OK': True, 'Value': {959209: {}}}
- Parameters
jobID (int or string) – JobID
printOutput (Boolean) – Flag to print to stdOut
- Returns
S_OK,S_ERROR
- getJobDebugOutput(jobID)
Developer function. Try to retrieve all possible outputs including logging information, job parameters, sandbox outputs, pilot outputs, last heartbeat standard output, JDL and CPU profile.
Example Usage:
>>> dirac.getJobDebugOutput(959209) {'OK': True, 'Value': '/afs/cern.ch/user/p/paterson/DEBUG_959209'}
- Parameters
jobID (int or string) – JobID
- Returns
S_OK,S_ERROR
- getJobInputData(jobID)
Retrieve the input data requirement of any job existing in the workload management system.
Example Usage:
>>> dirac.getJobInputData(1405) {'OK': True, 'Value': {1405: ['LFN:/lhcb/production/DC06/phys-v2-lumi5/00001680/DST/0000/00001680_00000490_5.dst']}}
- getJobJDL(jobID, original=False, printOutput=False)
Simple function to retrieve the current JDL of an existing job in the workload management system. The job JDL is converted to a dictionary and returned in the result structure.
Example Usage:
>>> print dirac.getJobJDL(12345) {'Arguments': 'jobDescription.xml',...}
- Parameters
jobID (int or string) – JobID
- Returns
S_OK,S_ERROR
- getJobLoggingInfo(jobID, printOutput=False)
DIRAC keeps track of job transitions, which are stored in the job monitoring service; see the example below. A logging summary is also printed to the screen at the INFO level.
Example Usage:
>>> print dirac.getJobLoggingInfo(79241) {'OK': True, 'Value': [('Received', 'JobPath', 'Unknown', '2008-01-29 15:37:09', 'JobPathAgent'), ('Checking', 'JobSanity', 'Unknown', '2008-01-29 15:37:14', 'JobSanityAgent')]}
- Parameters
jobID (int or string) – JobID
printOutput (Boolean) – Flag to print to stdOut
- Returns
S_OK,S_ERROR
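The transition records in the example can be turned into per-state durations. This is a minimal sketch that assumes each record is a tuple whose first two fields are the status and minor status and whose fourth field is a `YYYY-MM-DD HH:MM:SS` timestamp, as the example output suggests. `transition_durations` is a hypothetical helper and the records are copied from the example, not fetched from a service.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed from the example output


def transition_durations(logging_info):
    """Return [(status, minor_status, seconds_until_next_transition), ...]."""
    stamps = [datetime.strptime(rec[3], TIME_FORMAT) for rec in logging_info]
    durations = []
    # The last record has no successor, so it gets no duration
    for i, rec in enumerate(logging_info[:-1]):
        delta = (stamps[i + 1] - stamps[i]).total_seconds()
        durations.append((rec[0], rec[1], delta))
    return durations


# Records copied from the getJobLoggingInfo example (the 'Value' part)
records = [
    ("Received", "JobPath", "Unknown", "2008-01-29 15:37:09", "JobPathAgent"),
    ("Checking", "JobSanity", "Unknown", "2008-01-29 15:37:14", "JobSanityAgent"),
]

for status, minor, seconds in transition_durations(records):
    print("%s/%s lasted %.0f s" % (status, minor, seconds))
```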
- getJobOutputData(jobID, outputFiles='', destinationDir='')
Retrieve the output data files of a given job locally.
Optionally restrict the download of output data to a given file name or list of files using the outputFiles option; by default all job outputs are downloaded.
Example Usage:
>>> dirac.getJobOutputData(1405) {'OK':True,'Value':[<LFN>]}
- getJobOutputLFNs(jobID)
Retrieve the output data LFNs of a given job locally.
This does not download the output files but simply returns the LFN list that a given job has produced.
Example Usage:
>>> dirac.getJobOutputLFNs(1405) {'OK':True,'Value':[<LFN>]}
- Parameters
jobID (int or string) – JobID
- Returns
S_OK,S_ERROR
- getJobParameters(jobID, printOutput=False)
Return DIRAC parameters associated with the given job.
DIRAC keeps track of several job parameters, which are stored in the job monitoring service; see the example below. Selected parameters are also printed to the screen.
Example Usage:
>>> print dirac.getJobParameters(79241) {'OK': True, 'Value': {'JobPath': 'JobPath,JobSanity,JobPolicy,InputData,JobScheduling,TaskQueue', 'JobSanityCheck': 'Job: 768 JDL: OK, InputData: 2 LFNs OK, '}}
- Parameters
jobID (int or string) – JobID
printOutput (Boolean) – Flag to print to stdOut
- Returns
S_OK,S_ERROR
- getJobStatus(jobID)
Monitor the status of DIRAC Jobs.
Example Usage:
>>> print dirac.getJobStatus(79241) {79241: {'Status': 'Done', 'MinorStatus': 'Execution Complete', 'ApplicationStatus': 'some app status', 'Site': 'LCG.CERN.ch'}}
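Monitoring usually means calling getJobStatus repeatedly until the job reaches a final state. The sketch below shows one way to do this under some assumptions: the status callable is passed in (in real use it could be `dirac.getJobStatus`), the set of final states is illustrative rather than an official list, and the status names `Waiting` and `Running` in the stand-in are only there to drive the demo. `wait_for_job` and `fake_status` are hypothetical names.

```python
import time


def wait_for_job(get_status, job_id, final_states=("Done", "Failed"),
                 interval=30.0, max_polls=100):
    """Poll get_status(job_id) until the job reaches one of final_states.

    Returns the final status, or None if max_polls is exhausted.
    """
    for _ in range(max_polls):
        info = get_status(job_id)
        # Accept the bare mapping shown in the example as well as an
        # S_OK-style wrapper that stores the mapping under 'Value'
        if "Value" in info:
            info = info["Value"]
        status = info[job_id]["Status"]
        if status in final_states:
            return status
        time.sleep(interval)
    return None


# Stand-in for dirac.getJobStatus so the sketch runs without a Grid connection
_fake_sequence = iter(["Waiting", "Running", "Done"])


def fake_status(job_id):
    return {job_id: {"Status": next(_fake_sequence), "Site": "LCG.CERN.ch"}}


final = wait_for_job(fake_status, 79241, interval=0.0)
print("Job 79241 finished with status:", final)
```

Passing the callable in keeps the loop testable and avoids tying the sketch to a particular way of constructing the Dirac object.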
- getJobSummary(jobID, outputFile=None, printOutput=False)
Output similar to the web page can be printed to the screen or stored as a file or just returned as a dictionary for further usage.
Jobs can be specified individually or as a list.
Example Usage:
>>> dirac.getJobSummary(959209) {'OK': True, 'Value': {959209: {'Status': 'Staging', 'LastUpdateTime': '2008-12-08 16:43:18', 'MinorStatus': '28 / 30', 'Site': 'Unknown', 'HeartBeatTime': 'None', 'ApplicationStatus': 'unknown', 'JobGroup': '00003403', 'Owner': 'joel', 'SubmissionTime': '2008-12-08 16:41:38'}}}
- Parameters
jobID (int or string) – JobID
outputFile (string) – Optional output file
printOutput (Boolean) – Flag to print to stdOut
- Returns
S_OK,S_ERROR
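When a list of jobs is queried, the returned dictionary can be condensed into a count per status. This is a minimal sketch that assumes the `Value` layout from the example (job ID mapped to a dictionary with a `Status` key). `count_by_status` is a hypothetical helper, and the second job in the sample is invented for illustration.

```python
from collections import Counter


def count_by_status(summary_value):
    """Count jobs per 'Status' in the 'Value' part of a getJobSummary result."""
    return Counter(info.get("Status", "Unknown") for info in summary_value.values())


summary_value = {
    # Trimmed copy of the entry from the example output
    959209: {"Status": "Staging", "MinorStatus": "28 / 30", "Owner": "joel"},
    # Hypothetical second job, added only to make the count non-trivial
    959210: {"Status": "Done", "MinorStatus": "Execution Complete", "Owner": "joel"},
}

for status, n in sorted(count_by_status(summary_value).items()):
    print(status, n)
```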
- getLfnMetadata(lfns, printOutput=False)
Obtain replica metadata from the file catalogue client. Input LFN(s) can be a string or a list. LFN(s) can be either files or directories.
Example usage:
>>> print dirac.getLfnMetadata('/lhcb/data/CCRC08/RDST/00000106/0000/00000106_00006321_1.rdst') {'OK': True, 'Value': {'Successful': {'/lhcb/data/CCRC08/RDST/00000106/0000/00000106_00006321_1.rdst': {'Status': '-', 'Size': 619475828L, 'GUID': 'E871FBA6-71EA-DC11-8F0C-000E0C4DEB4B', 'ChecksumType': 'AD', 'CheckSumValue': ''}}, 'Failed': {}}}
- Parameters
lfns (LFN str or list []) – Logical File Name(s) to query
printOutput (boolean) – Optional flag to print result
- Returns
S_OK,S_ERROR
- getOutputSandbox(jobID, outputDir=None, oversized=True, noJobDir=False, unpack=True)
Retrieve output sandbox for existing JobID.
This method allows the retrieval of an existing job output sandbox. By default the sandbox is downloaded to the current directory but this can be overridden via the outputDir parameter. All files are extracted into a <JOBID> directory that is automatically created.
Example Usage:
>>> print dirac.getOutputSandbox(12345) {'OK': True, 'Value': ['Job__Sandbox__.tar.bz2']}
- Parameters
jobID (integer or string) – JobID
outputDir (string) – Optional directory path
oversized (boolean) – Optionally disable oversized sandbox download
- Returns
S_OK,S_ERROR
- getPhysicalFileAccessURL(pfn, storageElement, printOutput=False)
Retrieve an access URL for a PFN, given a valid DIRAC SE name. The SE is contacted directly for this information.
Example Usage:
>>> print dirac.getPhysicalFileAccessURL('srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/data/CCRC08/DST/00000151/0000/00000151_00004848_2.dst','CERN_M-DST') {'OK': True, 'Value':{'Failed': {}, 'Successful': {'srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/data/CCRC08/DST/00000151/0000/00000151_00004848_2.dst': {'RFIO': 'castor://...'}}}}
- getPhysicalFileMetadata(pfn, storageElement, printOutput=False)
Retrieve metadata for physical file(s) on a supplied storage element. Contacts the site endpoint and performs a gfal_ls behind the scenes.
Example Usage:
>>> print dirac.getPhysicalFileMetadata('srm://srm.grid.sara.nl/pnfs/grid.sara.nl/data/lhcb/data/CCRC08/RAW/LHCb/CCRC/23341/023341_0000039571.raw','NIKHEF-RAW') {'OK': True, 'Value': {'Successful': {'srm://...': {'SRM2': 'rfio://...'}}, 'Failed': {}}}
- getReplicas(lfns, active=True, preferDisk=False, diskOnly=False, printOutput=False)
Obtain replica information from file catalogue client. Input LFN(s) can be string or list.
Example usage:
>>> print dirac.getReplicas('/lhcb/data/CCRC08/RDST/00000106/0000/00000106_00006321_1.rdst') {'OK': True, 'Value': {'Successful': {'/lhcb/data/CCRC08/RDST/00000106/0000/00000106_00006321_1.rdst': {'CERN-RDST': 'srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/data/CCRC08/RDST/00000106/0000/00000106_00006321_1.rdst'}}, 'Failed': {}}}
- Parameters
lfns (LFN str or list []) – Logical File Name(s) to query
active (boolean) – restrict to only replicas at SEs that are not banned
preferDisk (boolean) – give preference to disk replicas if True
diskOnly (boolean) – restrict to only disk replicas if True
printOutput (boolean) – Optional flag to print result
- Returns
S_OK,S_ERROR
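Replica results are keyed by LFN, but it is often more convenient to see which files each Storage Element holds. This is a minimal sketch that assumes the `Successful` layout from the example (LFN mapped to a dictionary of SE name to URL). `lfns_by_se` is a hypothetical helper and the sample is copied from the example output.

```python
from collections import defaultdict


def lfns_by_se(replica_result):
    """Invert {'LFN': {'SE': url}} into {'SE': [LFN, ...]}."""
    by_se = defaultdict(list)
    for lfn, replicas in replica_result["Value"]["Successful"].items():
        for se_name in replicas:
            by_se[se_name].append(lfn)
    return dict(by_se)


LFN = "/lhcb/data/CCRC08/RDST/00000106/0000/00000106_00006321_1.rdst"

# Sample copied from the getReplicas example output (not a live call)
sample_replicas = {
    "OK": True,
    "Value": {
        "Successful": {
            LFN: {"CERN-RDST": "srm://srm-lhcb.cern.ch/castor/cern.ch/grid" + LFN}
        },
        "Failed": {},
    },
}

for se_name, lfns in lfns_by_se(sample_replicas).items():
    print(se_name, len(lfns), "file(s)")
```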
- getReplicasForJobs(lfns, diskOnly=False, printOutput=False)
Obtain replica information from file catalogue client. Input LFN(s) can be string or list.
Example usage:
>>> print dirac.getReplicasForJobs('/lhcb/data/CCRC08/RDST/00000106/0000/00000106_00006321_1.rdst') {'OK': True, 'Value': {'Successful': {'/lhcb/data/CCRC08/RDST/00000106/0000/00000106_00006321_1.rdst': {'CERN-RDST': 'srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/data/CCRC08/RDST/00000106/0000/00000106_00006321_1.rdst'}}, 'Failed': {}}}
- Parameters
lfns (LFN str or list []) – Logical File Name(s) to query
diskOnly (boolean) – restrict to only disk replicas if True
printOutput (boolean) – Optional flag to print result
- Returns
S_OK,S_ERROR
- killJob(jobID)
Issue a kill signal to a running job. If a job has already completed this action is harmless but otherwise the process will be killed on the compute resource by the Watchdog.
Example Usage:
>>> print(dirac.killJob(12345)) {'OK': True, 'Value': [12345]}
- listCatalogDirectory(directoryLFN, printOutput=False)
List the contents of a directory in the DIRAC File Catalog (DFC).
Example usage:
>>> res = dirac.listCatalogDirectory("/lz/data/test", printOutput=True)
Listing content of: /lz/data/test
Subdirectories:
/lz/data/test/reconstructed
/lz/data/test/BACCARAT_release-2.1.1_geant4.9.5.p02
/lz/data/test/BACCARAT_release-2.1.0_geant4.9.5.p02
Files:
/lz/data/test/sites.log
/lz/data/test/sites2.log
>>> print(res) {'OK': True, 'Value': {'Successful': {'/lz/data/test': {'Files': {'/lz/data/test/sites.log': {'MetaData': {'Status': 'AprioriGood', 'GUID': 'AD81AD07-3BC0-A9FE-1D82-786C4DC9D380', 'ChecksumType': 'Adler32', 'Checksum': '8b994dd5', 'Size': 1100L, 'UID': 2, 'OwnerGroup': 'lz_production', 'Owner': 'daniela.bauer', 'GID': 24, 'Mode': 509, 'ModificationDate': datetime.datetime(2021, 6, 11, 14, 23, 51), 'CreationDate': datetime.datetime(2021, 6, 11, 14, 23, 51), 'Type': 'File', 'FileID': 27519475L}}, '/lz/data/test/sites2.log': {'MetaData': {'Status': 'AprioriGood', 'GUID': 'AD81AD07-3BC0-A9FE-1D82-786C4DC9D380', 'ChecksumType': 'Adler32', 'Checksum': '8b994dd5', 'Size': 1100L, 'UID': 2, 'OwnerGroup': 'lz_production', 'Owner': 'daniela.bauer', 'GID': 24, 'Mode': 509, 'ModificationDate': datetime.datetime(2021, 6, 16, 15, 26, 21), 'CreationDate': datetime.datetime(2021, 6, 16, 15, 26, 21), 'Type': 'File', 'FileID': 27601076L}}}, 'Datasets': {}, 'SubDirs': {'/lz/data/test/reconstructed': True, '/lz/data/test/BACCARAT_release-2.1.1_geant4.9.5.p02': True, '/lz/data/test/BACCARAT_release-2.1.0_geant4.9.5.p02': True}, 'Links': {}}}, 'Failed': {}}}
- Parameters
directoryLFN (string or list in LFN format) – LFN of the directory to be listed
printOutput (bool) – prints output in a more human readable form
- Returns
S_OK,S_ERROR. S_OK returns a dictionary. Please see the example for its structure.
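The returned dictionary is deeply nested, so a small accessor can make it easier to use. This is a minimal sketch that assumes the structure shown in the example (`Files`, `SubDirs` and per-file `MetaData`). `files_and_subdirs` is a hypothetical helper, and the sample is a trimmed copy of the example output with most metadata fields omitted.

```python
def files_and_subdirs(listing_result, directory):
    """Return ({file LFN: size in bytes}, [sorted subdirectory LFNs])."""
    entry = listing_result["Value"]["Successful"][directory]
    sizes = {lfn: info["MetaData"]["Size"] for lfn, info in entry["Files"].items()}
    return sizes, sorted(entry["SubDirs"])


# Trimmed copy of the listCatalogDirectory example output
listing = {
    "OK": True,
    "Value": {
        "Successful": {
            "/lz/data/test": {
                "Files": {
                    "/lz/data/test/sites.log": {"MetaData": {"Size": 1100}},
                    "/lz/data/test/sites2.log": {"MetaData": {"Size": 1100}},
                },
                "Datasets": {},
                "SubDirs": {
                    "/lz/data/test/reconstructed": True,
                    "/lz/data/test/BACCARAT_release-2.1.1_geant4.9.5.p02": True,
                    "/lz/data/test/BACCARAT_release-2.1.0_geant4.9.5.p02": True,
                },
                "Links": {},
            }
        },
        "Failed": {},
    },
}

sizes, subdirs = files_and_subdirs(listing, "/lz/data/test")
print("total bytes:", sum(sizes.values()))
print("subdirectories:", subdirs)
```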
- peekJob(jobID, printOutput=False)
The peek function will attempt to return standard output from the WMS for a given job if this is available. The standard output is periodically updated from the compute resource via the application Watchdog. Available standard output is printed to screen at the INFO level.
Example Usage:
>>> print dirac.peekJob(1484) {'OK': True, 'Value': 'Job peek result'}
- Parameters
jobID (int or string) – JobID
- Returns
S_OK,S_ERROR
- pingService(system, service, printOutput=False, url=None)
The ping function will attempt to return standard information from a system service if this is available. If the ping() command is unsuccessful it could indicate a period of service unavailability.
Example Usage:
>>> print dirac.pingService('WorkloadManagement','JobManager') {'OK': True, 'Value': 'Job ping result'}
- Parameters
system (string) – system
service (string) – service name
printOutput (Boolean) – Flag to print to stdOut
url (string) – url to ping (instead of system & service)
- Returns
S_OK,S_ERROR
- preSubmissionChecks(job, mode)
Internal function. The pre-submission checks method allows VOs to make their own checks before job submission. To make use of this the method should be overridden in a derived VO-specific Dirac class.
- removeFile(lfn, printOutput=False)
Remove LFN and all associated replicas from Grid Storage Elements and file catalogues.
Example Usage:
>>> print dirac.removeFile('LFN:/lhcb/data/CCRC08/RAW/LHCb/CCRC/22808/022808_0000018443.raw') {'OK': True, 'Value':...}
- Parameters
lfn (string) – Logical File Name (LFN)
printOutput (Boolean) – Flag to print to stdOut
- Returns
S_OK,S_ERROR
- removeReplica(lfn, storageElement, printOutput=False)
Remove replica of LFN from specified Grid Storage Element and file catalogues.
Example Usage:
>>> print dirac.removeReplica('LFN:/lhcb/user/p/paterson/myDST.dst','CERN-USER') {'OK': True, 'Value':...}
- Parameters
lfn (string) – Logical File Name (LFN)
storageElement (string) – DIRAC SE Name
- Returns
S_OK,S_ERROR
- replicate(lfn, destinationSE, sourceSE='', printOutput=False)
Replicate an existing file to another Grid SE. lfn is the logical file name of the file to be replicated, and destinationSE is the DIRAC Storage Element at which to create the replica. Optionally the source storage element can be specified.
Example Usage:
>>> print dirac.replicate('/lhcb/user/p/paterson/myFile.tar.gz','CNAF-USER') {'OK': True, 'Value':{'Failed': {}, 'Successful': {'/lhcb/user/p/paterson/test/myFile.tar.gz': {'register': 0.44766902923583984}}}}
- Parameters
lfn (string) – Logical File Name (LFN)
destinationSE (string) – Destination DIRAC SE name e.g. CERN-USER
sourceSE (string) – Optional source SE
printOutput (boolean) – Optional flag to print result
- Returns
S_OK,S_ERROR
- replicateFile(lfn, destinationSE, sourceSE='', localCache='', printOutput=False)
Replicate an existing file to another Grid SE. lfn is the desired logical file name for the file to be replicated, destinationSE is the DIRAC Storage Element to create a replica of the file at. Optionally the source storage element and local cache for storing the retrieved file for the new upload can be specified.
Example Usage:
>>> print dirac.replicateFile('/lhcb/user/p/paterson/myFile.tar.gz','CNAF-USER') {'OK': True, 'Value':{'Failed': {}, 'Successful': {'/lhcb/user/p/paterson/test/myFile.tar.gz': {'register': 0.44766902923583984, 'replicate': 56.42345404624939}}}}
- Parameters
lfn (string) – Logical File Name (LFN)
destinationSE (string) – Destination DIRAC SE name e.g. CERN-USER
sourceSE (string) – Optional source SE
localCache (string) – Optional path to local cache, if not specified a temp dir will be created in CWD
printOutput (boolean) – Optional flag to print result
- Returns
S_OK,S_ERROR
- rescheduleJob(jobID)
Reschedule a job or list of jobs in the WMS. This operation is the same as resubmitting the same job as new. Rescheduling may be performed up to a configurable maximum number of times, but the owner of a job can also reset this counter and reschedule jobs again by hand.
Example Usage:
>>> print dirac.rescheduleJob(12345) {'OK': True, 'Value': [12345]}
- runLocal(job)
Internal function. This method is called by the DIRAC API function submitJob(job, mode='Local').
All output files are written to the local directory.
This is a method for running local tests. It skips the creation of a JobWrapper but prepares an environment that mimics it.
- Parameters
job (Job) – a job object
- selectJobs(status=None, minorStatus=None, applicationStatus=None, site=None, owner=None, ownerGroup=None, jobGroup=None, date=None, printErrors=True)
Options correspond to the web-page table columns. Returns the list of JobIDs for the specified conditions. A few notes on the formatting:
date must be specified as yyyy-mm-dd. By default, the date is today.
jobGroup corresponds to the name associated to a group of jobs, e.g. productionID / job names.
site is the DIRAC site name, e.g. LCG.CERN.ch
owner is the immutable nickname, e.g. paterson
Example Usage:
>>> dirac.selectJobs( status='Failed', owner='paterson', site='LCG.CERN.ch') {'OK': True, 'Value': ['25020', '25023', '25026', '25027', '25040']}
- Parameters
status (string) – Job status
minorStatus (string) – Job minor status
applicationStatus (string) – Job application status
site (string) – Job execution site
owner (string) – Job owner
jobGroup (string) – Job group
date (string) – Selection date
- Returns
S_OK,S_ERROR
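Because the date must be given as yyyy-mm-dd, it can help to validate it before calling selectJobs. This is a minimal sketch that builds the keyword arguments, using the parameter names from the signature above. `build_selection` is a hypothetical helper, and the explicit default of today's date simply mirrors the documented default.

```python
from datetime import date, datetime

DATE_FORMAT = "%Y-%m-%d"  # the yyyy-mm-dd format required by selectJobs


def build_selection(status=None, site=None, owner=None, job_group=None, day=None):
    """Return keyword arguments for selectJobs, dropping unset options.

    day may be a date object or a 'yyyy-mm-dd' string; a badly formatted
    string raises ValueError before anything is sent to the service.
    """
    if day is None:
        day = date.today()
    elif isinstance(day, str):
        day = datetime.strptime(day, DATE_FORMAT).date()
    kwargs = {
        "status": status,
        "site": site,
        "owner": owner,
        "jobGroup": job_group,
        "date": day.strftime(DATE_FORMAT),
    }
    return {key: value for key, value in kwargs.items() if value is not None}


selection = build_selection(status="Failed", owner="paterson",
                            site="LCG.CERN.ch", day="2008-12-08")
print(selection)
# In real use: result = dirac.selectJobs(**selection)
```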
- splitInputData(lfns, maxFilesPerJob=20, printOutput=False)
Split the supplied lfn list by the replicas present at the possible destination sites. An S_OK object will be returned containing a list of lists in order to create the jobs.
Example usage:
>>> d.splitInputData(lfns,10) {'OK': True, 'Value': [['<LFN>'], ['<LFN>']]}
- Parameters
lfns (list) – Logical File Name(s) to split
maxFilesPerJob (integer) – Number of files per bunch
printOutput (boolean) – Optional flag to print result
- Returns
S_OK,S_ERROR
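The idea of splitting by replica location and then bunching can be illustrated without a Grid connection. The sketch below is not DIRAC's implementation: it assumes the replica locations are already known as a plain dictionary of LFN to SE names, groups LFNs that share the same set of SEs, and cuts each group into bunches of at most `max_files_per_job`. `split_by_replicas`, the LFNs and the SE names are all hypothetical.

```python
from collections import defaultdict


def split_by_replicas(replicas, max_files_per_job=20):
    """Group LFNs by their set of SEs, then chunk each group.

    replicas maps each LFN to an iterable of SE names.
    Returns a list of lists of LFNs, one inner list per prospective job.
    """
    groups = defaultdict(list)
    for lfn in sorted(replicas):
        groups[frozenset(replicas[lfn])].append(lfn)

    bunches = []
    # Iterate over groups in a deterministic order (sorted SE names)
    for se_set in sorted(groups, key=sorted):
        lfns = groups[se_set]
        for start in range(0, len(lfns), max_files_per_job):
            bunches.append(lfns[start:start + max_files_per_job])
    return bunches


# Hypothetical replica locations, for illustration only
example_replicas = {
    "/vo/data/a.dst": ["SE-ONE"],
    "/vo/data/b.dst": ["SE-ONE"],
    "/vo/data/c.dst": ["SE-TWO"],
    "/vo/data/d.dst": ["SE-ONE"],
}

for bunch in split_by_replicas(example_replicas, max_files_per_job=2):
    print(bunch)
```

Keeping files with identical replica locations together means every file in a bunch can be read at the same sites, which is the property the list of lists returned by splitInputData is meant to provide for job creation.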
- submitJob(job, mode='wms')
Submit jobs to DIRAC (by default to the Workload Management System). These can be either:
- Instances of the Job Class
VO Application Jobs
Inline scripts
Scripts as executables
Scripts inside an application environment
JDL File
JDL String
Example usage:
>>> print dirac.submitJob(job) {'OK': True, 'Value': '12345'}
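When several jobs are submitted in a row, it is useful to collect the returned job IDs and any errors separately. This is a minimal sketch that assumes the result layout from the example (`OK` and `Value`) and that a failed submission carries its text under `Message`. `submit_all` and `fake_submit` are hypothetical names; in real use the callable would be `dirac.submitJob` and the items would be Job instances, JDL files or JDL strings.

```python
def submit_all(submit, jobs):
    """Submit each job with submit(job); return (job_ids, errors)."""
    job_ids, errors = [], []
    for job in jobs:
        result = submit(job)
        if result.get("OK"):
            job_ids.append(result["Value"])
        else:
            errors.append((job, result.get("Message", "unknown error")))
    return job_ids, errors


# Stand-in for dirac.submitJob so the sketch runs without a Grid connection
def fake_submit(job):
    if job == "broken.jdl":
        return {"OK": False, "Message": "could not parse JDL"}
    return {"OK": True, "Value": "12345"}


job_ids, errors = submit_all(fake_submit, ["good.jdl", "broken.jdl"])
print("submitted:", job_ids)
print("errors:", errors)
```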
- DIRAC.Interfaces.API.Dirac.parseArguments(args)