CDR FAQ: Data storage and access

4. Data storage and access - CASTOR

Q1. Where does CDR store my data?

All data written with CDR are stored in CASTOR, a hierarchical data management system developed in the IT department to manage the migration of files between disk and tape storage. All CASTOR files are stored in a namespace with a directory structure of the form:

/castor/cern.ch/<experiment>/ . . .

CASTOR uses disk pools for intermediate storage before writing data files to cartridge tape. A disk pool is made up of one or several UNIX file systems, which may be located on one or several file servers. A daemon process called the stager controls the file activity in each disk pool. In general there is one stager per experiment, although a single stager can control several different pools, as defined in its configuration file. The stager also deletes files once a pool becomes critically full; the thresholds for this are likewise defined in the stager configuration file.

To transfer data files from CASTOR to a local UNIX filesystem, use the remote file copy command rfcp. This is the recommended way to access CASTOR files.


Q2. How do I see my CDR files and directories in CASTOR?

CASTOR provides several utility programs to examine the size, ownership rights, etc. of files. These are called name server commands and are installed as part of the CASTOR software. The command names are modeled on the corresponding UNIX commands, with the letters ns prefixed to the command name. For example,

nsls -l /castor/cern.ch/<experiment>/testbeam/

will generate a long listing of all files/directories in the testbeam directory.


Q3. How do I access data recorded to CASTOR via CDR?

To transfer data files from CASTOR to a local UNIX filesystem, use the remote file copy command rfcp:

rfcp   <input-filename>   <output-filename>

A filename can be either a CASTOR file name, beginning with /castor/cern.ch/ . . . , or a local UNIX file name. rfcp always transfers the file via a staging pool and its associated stage host, the machine where the stager is running. The host and pool name can be specified with the environment variables STAGE_HOST and STAGE_POOL. If nothing is specified, the public staging pool is used.
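As a rough illustration of the paragraph above, the following Python sketch assembles an rfcp call together with the environment it would run in. Only rfcp, STAGE_HOST, STAGE_POOL and the /castor/cern.ch/ prefix come from this FAQ; the helper names is_castor_path and build_rfcp_call are hypothetical, and the sketch deliberately builds the call without executing it.

```python
import os

# Prefix of the CASTOR namespace, as described in this FAQ.
CASTOR_PREFIX = "/castor/cern.ch/"


def is_castor_path(filename):
    """Return True if the name lies in the CASTOR namespace, False for a local name."""
    return filename.startswith(CASTOR_PREFIX)


def build_rfcp_call(source, destination, stage_host=None, stage_pool=None):
    """Return (argv, env) for an rfcp copy without running it.

    Hypothetical helper: stage_host and stage_pool are optional. When one is
    omitted, the corresponding variable is left exactly as it is in the
    calling environment (possibly unset).
    """
    env = dict(os.environ)  # copy, so the caller's environment is not modified
    if stage_host is not None:
        env["STAGE_HOST"] = stage_host
    if stage_pool is not None:
        env["STAGE_POOL"] = stage_pool
    return ["rfcp", source, destination], env
```

On a machine where rfcp is installed, the returned pair could be passed to subprocess.run(argv, env=env, check=True). The host and pool names to use are site-specific and are not given here.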


Q5. Data files on the DAQ are not being migrated to storage

CDR copies files from the DAQ to an intermediate staging pool before they are migrated to tape. Problems with the CASTOR staging daemon can cause this migration to fail. Files are not deleted from the DAQ system when tape migration fails, but if a file remains unmigrated for hours or even days, cdr.support@cern.ch should be contacted.


Q6. How do I know if my data in CASTOR have been migrated to tape?

This can be checked from any system where CASTOR is installed using the nsls command:

nsls -l /castor/cern.ch/<expt>/testbeam/<group>/filename

For files that have been migrated to tape, the leftmost character of the file access privileges is set to m.
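The check described above can be sketched in Python as follows. This is a minimal sketch that assumes the access privileges are the first whitespace-separated field of each nsls -l line, as in UNIX ls -l; the function name is_migrated is hypothetical, and the sample line in the comment is made up for illustration rather than copied from real nsls output.

```python
def is_migrated(nsls_line):
    """Return True if one line of `nsls -l` output marks the file as migrated.

    Assumption: the access privileges are the first whitespace-separated
    field, and a leading 'm' in that field means "migrated to tape".
    """
    fields = nsls_line.split()
    if not fields:
        raise ValueError("empty nsls line")
    return fields[0].startswith("m")


# Made-up example line; the columns after the privileges are illustrative only:
#   is_migrated("mrw-r--r-- 1 user group 1024 Jan 01 12:00 filename")
```

A line of output could be obtained by running the nsls command shown above through subprocess.run with capture_output=True and text=True, then passing each line of stdout to the function.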


Q7. Data in CASTOR are not the same as the data on the DAQ disks

This is a serious problem. It may result from data corruption, or from a CDR configuration problem in which files are transferred while they are still being written on the DAQ. CDR will not purge such files from the DAQ system, and cdr.support@cern.ch can be contacted for help in deciding what action to take.
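One way to confirm such a mismatch before contacting support is to copy the file back from CASTOR with rfcp and compare it with the original on the DAQ disk. The Python sketch below compares two local files by SHA-256 digest. It is an illustration only, not something CDR is stated to do; the function names are hypothetical, and it assumes the CASTOR copy has already been retrieved to a local path.

```python
import hashlib


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a local file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(daq_path, castor_copy_path):
    """Return True if the DAQ file and the local copy retrieved from CASTOR match."""
    return file_digest(daq_path) == file_digest(castor_copy_path)
```

A file that is still being written on the DAQ will naturally differ from an earlier copy, so the comparison is only meaningful once the DAQ file is closed.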

 



The European Laboratory for Particle Physics
Feedback and questions concerning this site should be directed to cdr.support@cern.ch