Note
This document is available in different formats: HTML (online), PDF, or EPUB.
See: http://svn.nexusformat.org/definitions/trunk/misc/impatient/_build/html/index.html
The NeXus data format is a tool which has been designed to solve the problems of travelling scientists, who undertake experiments at several different neutron, x-ray or muon facilities. Such people will apply to different facilities with different proposals in order to get their science done, then combine the data thus collected at the various sources. However, the life of a travelling scientist is complicated by some of the following factors:
NeXus is designed to solve these problems by defining a data format with the following properties:
NeXus uses HDF-5 files as container files. HDF-5 is a popular scientific data format which has been developed by the National Center for Supercomputing Applications (NCSA), University of Illinois Urbana-Champaign and is currently being maintained by the HDF Group. There is built-in support for HDF-5 in many scientific packages. Other users of HDF-5 include NASA, Boeing, meteorological offices around the world and many more. NeXus is thus able to inherit many desirable properties for free from HDF-5, such as: extendable, self-describing, platform independent, public domain, and efficient. For historical reasons, NeXus supports two further container file formats: HDF-4 and XML. The use of these formats is now deprecated.
In order to understand NeXus it is important to know about some of the objects which live in an HDF-5 file:
HDF-5 does not, however, know anything about the application domain of neutron, muon or x-ray scattering. In order to remedy this, NeXus adds the following:
NeXus defines two main group hierarchy types:
There are additional hierarchy variations for multi-method instruments and for a general purpose dump structure. Documentation for these hierarchy types can be found in the NeXus manual.
This hierarchy is applicable to raw data files as written by some facility instrument:
entry:NXentry
   instrument:NXinstrument
      source:NXsource
      ....
      detector:NXdetector
         data:NX_INT32[512,512]
            @signal = 1
   sample:NXsample
   control:NXmonitor
   data:NXdata
      data --> /entry/instrument/detector/data
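As a rough illustration, the hierarchy above could be written with h5py along the following lines. This is a minimal sketch, not the NeXus API: it assumes the usual NeXus convention of recording the class name in an `NX_class` attribute on each group, represents the `-->` link as an HDF-5 hard link, and uses a hypothetical file name (`raw_sketch.h5`) held in memory only. The helper name `write_raw_file` is made up for this example.

```python
import h5py


def write_raw_file(h5file):
    """Build the raw-data hierarchy from the tree above in an open h5py file."""
    entry = h5file.create_group("entry")
    entry.attrs["NX_class"] = "NXentry"  # assumption: class stored as NX_class

    instrument = entry.create_group("instrument")
    instrument.attrs["NX_class"] = "NXinstrument"
    instrument.create_group("source").attrs["NX_class"] = "NXsource"

    detector = instrument.create_group("detector")
    detector.attrs["NX_class"] = "NXdetector"
    # 512 x 512 detector image of 32-bit integers, zero-filled here
    counts = detector.create_dataset("data", shape=(512, 512), dtype="int32")
    counts.attrs["signal"] = 1  # marks this field as the plottable data

    for name, nx_class in (("sample", "NXsample"), ("control", "NXmonitor")):
        entry.create_group(name).attrs["NX_class"] = nx_class

    nxdata = entry.create_group("data")
    nxdata.attrs["NX_class"] = "NXdata"
    # Hard link: /entry/data/data refers to the detector dataset, no copy made
    nxdata["data"] = counts
    return h5file


# Hypothetical file name; the "core" driver keeps everything in memory.
with h5py.File("raw_sketch.h5", "w", driver="core", backing_store=False) as f:
    write_raw_file(f)
    print(f["/entry/data/data"].shape)  # (512, 512)
```

Because the NXdata group only holds a link, the detector counts are stored once and remain reachable from both paths.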
The following groups are required to be present in all NeXus data files:
The following additional groups are present in most NeXus data files:
Note
A few words on notation in this representation:
Before starting to describe how to decide what goes into a NeXus file, some more details about NeXus groups and base classes need to be explained. As seen in the examples, NeXus uses groups with well-defined class names starting with “NX”. NeXus calls these NX classes “base classes”, which is slightly misleading when you are used to object-oriented notations. For each NeXus base class, there exists a dictionary description that details which other groups and which fields are allowed in this base class. This dictionary is where you will find appropriate field names for the data items you wish to describe. The NeXus base classes are documented in the NeXus Reference Manual. [4] A common misconception among NeXus beginners is that you have to specify all fields which exist in a given NeXus base class. This is not the case! You only need to choose those fields from the NeXus base class dictionary which make sense for your application. However, you are encouraged to store the additional information if it is available, since it can be used to diagnose problems with the instrument. The minimum set of fields that are appropriate to a given technique is usually specified in an “application definition”.
Before the mechanics of writing a NeXus file can be explained, we need to know which fields are written into the NeXus file at which position in the hierarchy. The example will be to store basic data. A couple of steps are required:
entry:NXentry
   data:NXdata
Example 3: NeXus Raw Data File Template
Before beginning this process, it might be worthwhile to look at some of the NeXus application definitions in the NeXus reference manual for examples and inspiration. But be aware that each NeXus application definition only defines the minimum sets for a certain usage case.
In this process you might encounter the situation that you wish to store more information than foreseen by NeXus. There are two options which have to be considered:
Be sure that the names of things you define have no embedded whitespace and begin with a letter.
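The naming rule can be checked mechanically before a file is written. The following is a minimal sketch that encodes only the two conditions stated above (first character is a letter, no whitespace anywhere); the function name `is_acceptable_name` is made up for this example, and restricting "letter" to ASCII letters is an assumption.

```python
import re

# Assumed reading of the rule: an ASCII letter first, then any run of
# non-whitespace characters.
_NAME_RULE = re.compile(r"[A-Za-z]\S*")


def is_acceptable_name(name: str) -> bool:
    """Return True if the name starts with a letter and has no whitespace."""
    return _NAME_RULE.fullmatch(name) is not None


for candidate in ("two_theta", "2theta", "beam size"):
    print(candidate, is_acceptable_name(candidate))
```

Here `two_theta` passes, while `2theta` (leading digit) and `beam size` (embedded space) are rejected.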
This is a simplified hierarchy style applicable to the results of data reduction or data analysis applications. Such results can consist of large multidimensional arrays, so it can be advisable to use NeXus for storing such data:
entry:NXentry
   reduction:NXprocess
      program_name = "pyDataProc2010"
      version = "1.0a"
      input:NXparameter
         filename = "sn2013287.nxs"
   sample:NXsample
   data:NXdata
      data
         @signal = 1
Here the NXentry contains:
Optionally, a processed data entry can contain an NXinstrument group in order to describe the instrument if this matters at this stage.
Scanning means to vary some variable in a certain, defined way and collect data as the variable progresses. Scans are a versatile experimental technique and are thus very difficult to standardize. NeXus solves this problem through a couple of rules. Before these rules can be discussed, the symbol NP has to be introduced. NP is simply the number of scan points.
This is an example of a NeXus raw data file describing a scan where the sample is rotated and data is collected in an area detector:
entry:NXentry
   instrument:NXinstrument
      detector:NXdetector
         data:[NP,xsize,ysize]
            @signal = 1
   sample:NXsample
      rotation_angle[NP]
         @axis=1
   control:NXmonitor
      data[NP]
   data:NXdata
      data --> /entry/instrument/detector/data
      rotation_angle --> /entry/sample/rotation_angle
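In the scan example, every array that varies during the scan has NP as its first dimension. A writer or validator could check this consistency before saving. The sketch below works on plain shape tuples; the helper name `scan_length` and the concrete sizes (10 scan points, a 512 x 512 detector) are illustrative assumptions, not values from any real file.

```python
def scan_length(shapes):
    """Return NP if all arrays share the same first dimension, else raise."""
    lengths = {name: shape[0] for name, shape in shapes.items()}
    distinct = set(lengths.values())
    if len(distinct) != 1:
        raise ValueError(f"inconsistent scan lengths: {lengths}")
    return distinct.pop()


# Illustrative shapes for the fields of the scan hierarchy above
shapes = {
    "/entry/instrument/detector/data": (10, 512, 512),  # [NP, xsize, ysize]
    "/entry/sample/rotation_angle": (10,),              # [NP]
    "/entry/control/data": (10,),                       # [NP]
}
print(scan_length(shapes))  # 10
```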
Any program whose aim is to identify plottable data should use the following procedure:
When trying to establish a data standard, we encounter a few challenges, some of which can slow effort:
However, there are many benefits to be gained from having the NeXus data standard:
Storing all this metadata when saving the data takes extra effort, but benefits include:
- The file will include the necessary fields for yet unforeseen ways to analyse the data.
- If something is wrong with the data, it becomes possible to figure out what went wrong.
- There is a better record of what has been measured. This helps to protect against scientific fraud.
The simplest way to read and plot a NeXus file is through the Python PyTree API:
import nxs
nxs.load('powder.h5').plot()
In order for this to be possible, PyTree uses the NeXus conventions to locate the plottable data and the axes to use. In particular, this plots the first NXdata group in the first NXentry in the powder.h5 file. The NeXus python package provides additional support for working with NeXus groups.
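The lookup described here (first NXentry, then its first NXdata, then the field marked with `signal = 1`) can be sketched independently of PyTree. The code below is an illustration under assumptions, not PyTree's implementation: it relies only on `.items()` and `.attrs`, which h5py groups also provide, it assumes the class name is stored in an `NX_class` attribute, and it uses a small stand-in class `Node` with a made-up tree so that it runs without a data file. Whether `counts` in powder.h5 actually carries the signal attribute is likewise assumed.

```python
def _text(value):
    """Normalize attribute values that may be bytes, str or numbers."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return str(value)


def first_child_of_class(group, nx_class):
    """Return (name, child) of the first child with the given NX_class."""
    for name, child in group.items():
        if _text(child.attrs.get("NX_class", "")) == nx_class:
            return name, child
    return None


def find_signal(root):
    """Return the path of the plottable field, or None if there is none."""
    entry = first_child_of_class(root, "NXentry")
    if entry is None:
        return None
    data = first_child_of_class(entry[1], "NXdata")
    if data is None:
        return None
    for name, field in data[1].items():
        if _text(field.attrs.get("signal", "")) == "1":
            return "/" + "/".join((entry[0], data[0], name))
    return None


class Node(dict):
    """Stand-in for an h5py group or dataset: children plus an attrs dict."""

    def __init__(self, attrs=None, **children):
        super().__init__(children)
        self.attrs = dict(attrs or {})


# Made-up tree mirroring the paths used for powder.h5 in this section
demo = Node(
    entry1=Node(
        {"NX_class": "NXentry"},
        data1=Node(
            {"NX_class": "NXdata"},
            two_theta=Node(),
            counts=Node({"signal": 1}),
        ),
    )
)
print(find_signal(demo))  # /entry1/data1/counts
```

Note that "first" depends on iteration order; h5py iterates group members by name unless creation-order tracking is enabled, so a real implementation may need to state which order it means.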
The plot could also be created by directly accessing the HDF-5 file using the h5py [1] package:
import pylab, h5py
# Open read-only so the data file cannot be modified by accident
file = h5py.File('powder.h5', 'r')
pylab.plot(file['/entry1/data1/two_theta'], file['/entry1/data1/counts'])
pylab.title(file['/entry1/title'][0])
pylab.show()
Matlab support in version R2011b is similar:
>> two_theta = h5read('powder.h5', '/entry1/data1/two_theta');
>> counts = h5read('powder.h5', '/entry1/data1/counts');
>> plot_title = h5read('powder.h5', '/entry1/title');
>> plot(two_theta, counts)
>> title(plot_title)
Note that MATLAB requires explicit casting from integer data to floating point data to perform many operations. For example, to plot a 2D data set [2] using log intensity:
>> data = h5read('lrcs3701.nx5','/Histogram1/data/data');
>> h = pcolor(log(double(data+1))); set(h,'EdgeAlpha',0)
Support for HDF is available in other scientific computing environments, including IDL, Igor, Mathematica and R.
Reading the file using the HDF-5 C API is a little more involved:
/**
 * Reading example for reading NeXus files with plain
 * HDF-5 API calls. This reads out counts and two_theta
 * out of the file generated by nxh5write.
 *
 * WARNING: I left out all error checking in this example.
 * In production code you have to take care of those errors
 *
 * Mark Koennecke, October 2011
 */
#include <hdf5.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
  float *two_theta = NULL;
  int *counts = NULL, rank;
  hid_t fid, dataid, fapl;
  hsize_t *dim = NULL, i;
  hid_t dataspace, memdatatype;

  /*
   * Open file, thereby enforcing proper file close
   * semantics
   */
  fapl = H5Pcreate(H5P_FILE_ACCESS);
  H5Pset_fclose_degree(fapl, H5F_CLOSE_STRONG);
  fid = H5Fopen("NXfile.h5", H5F_ACC_RDONLY, fapl);
  H5Pclose(fapl);

  /*
   * open and read the counts dataset
   * (H5Dopen2 is the HDF-5 1.8+ form and takes an access property list)
   */
  dataid = H5Dopen2(fid, "/scan/data/counts", H5P_DEFAULT);
  dataspace = H5Dget_space(dataid);
  rank = H5Sget_simple_extent_ndims(dataspace);
  dim = malloc(rank * sizeof(hsize_t));
  H5Sget_simple_extent_dims(dataspace, dim, NULL);
  counts = malloc(dim[0] * sizeof(int));
  memdatatype = H5Tcopy(H5T_NATIVE_INT32);
  H5Dread(dataid, memdatatype, H5S_ALL, H5S_ALL, H5P_DEFAULT, counts);
  H5Dclose(dataid);
  H5Sclose(dataspace);
  H5Tclose(memdatatype);
  free(dim); /* the dimensions are queried again for the next dataset */

  /*
   * open and read the two_theta data set
   */
  dataid = H5Dopen2(fid, "/scan/data/two_theta", H5P_DEFAULT);
  dataspace = H5Dget_space(dataid);
  rank = H5Sget_simple_extent_ndims(dataspace);
  dim = malloc(rank * sizeof(hsize_t));
  H5Sget_simple_extent_dims(dataspace, dim, NULL);
  two_theta = malloc(dim[0] * sizeof(float));
  memdatatype = H5Tcopy(H5T_NATIVE_FLOAT);
  H5Dread(dataid, memdatatype, H5S_ALL, H5S_ALL, H5P_DEFAULT, two_theta);
  H5Dclose(dataid);
  H5Sclose(dataspace);
  H5Tclose(memdatatype);

  H5Fclose(fid);

  for (i = 0; i < dim[0]; i++) {
    printf("%8.2f %10d\n", two_theta[i], counts[i]);
  }

  free(dim);
  free(counts);
  free(two_theta);
  return 0;
}
More examples of reading NeXus data files can be found in the Examples chapter of the NeXus Reference Documentation. [4]
You can obviously skip this section if you only wish to read NeXus files.
For writing the NeXus file, you have the option to use the NeXus API or to use the HDF-5 API. The complexity of NeXus file writing code is similar to the reading code. For both approaches, more information is available in the NeXus Manual [3] or the NeXus Reference Documentation. [4]
To give you a taste of what it is like to write a NeXus file using the NeXus API, here is a complete code example in C. It shows how to create a scan:NXentry/data:NXdata structure and store two arrays, counts and two_theta:
#include "napi.h"

int main()
{
  /* Placeholder values for illustration only; in practice the arrays
     tth and counts, each of length n, come from the measurement */
  int n = 3;
  float tth[] = {10.0f, 10.1f, 10.2f};
  float counts[] = {100.0f, 120.0f, 95.0f};
  NXhandle fileID;

  NXopen ("NXfile.nxs", NXACC_CREATE, &fileID);
    NXmakegroup (fileID, "Scan", "NXentry");
    NXopengroup (fileID, "Scan", "NXentry");
      NXmakegroup (fileID, "data", "NXdata");
      NXopengroup (fileID, "data", "NXdata");
        NXmakedata (fileID, "two_theta", NX_FLOAT32, 1, &n);
        NXopendata (fileID, "two_theta");
          NXputdata (fileID, tth);
          NXputattr (fileID, "units", "degrees", 7, NX_CHAR);
        NXclosedata (fileID);  /* two_theta */
        NXmakedata (fileID, "counts", NX_FLOAT32, 1, &n);
        NXopendata (fileID, "counts");
          NXputdata (fileID, counts);
        NXclosedata (fileID);  /* counts */
      NXclosegroup (fileID);  /* data */
    NXclosegroup (fileID);  /* Scan */
  NXclose (&fileID);
  return 0;
}
More examples of writing NeXus data files can be found in the Examples chapter of the NeXus Reference Documentation. [4]
Did we get you interested? Here is where you can get more information. Our main entry point is the NeXus WWW-site at http://www.nexusformat.org where you can find more information, download the NeXus API, NeXus User Manual [3] and NeXus Reference Documentation. [4]
If you encounter problems then please help us make NeXus better. Report your problem to the NeXus mailing list (nexus@nexusformat.org). Problems that we never know about have absolutely no chance of getting resolved.
NeXus is a voluntary effort. Thus, if you have spare time and are willing to lend us a hand, you are more than welcome to contact us via nexus-committee@nexusformat.org.
NeXus was developed from three independent proposals by Jonathan Tischler (APS), Przemek Klosowski (NIST) and Mark Koennecke (ISIS, now PSI), which were merged by an international team of scientists during a series of SoftNess workshops in 1994 - 1996. More work was done during NOBUGS conferences. Since 2001, NeXus has been overseen by the NeXus International Advisory Committee (NIAC), which meets once a year. The NIAC strives to have a representative for each participating facility. The NIAC has a constitution which you can find on the NeXus WWW site.
[1] h5py: http://code.google.com/p/h5py/
[2] lrcs3701.nx5 (NeXus HDF-5 data file): http://svn.nexusformat.org/definitions/exampledata/IPNS/LRMECS/lrcs3701.nx5
[3] NeXus User Manual: http://download.nexusformat.org/doc/html/UserManual.html
[4] NeXus Reference Documentation: http://download.nexusformat.org/doc/html/ReferenceDocumentation.html