In my previous blog post I described how you can use Python with JASON’s external command processing item to perform operations on your data at any point in the NMR processing chain. This is an extremely powerful and flexible tool for users who, for example, are developing novel data processing approaches.
But what if your favourite programming language is not Python? Well, JASON’s external command processing item can send data to any external program or script that can read HDF5 format data. This includes the vast majority of modern (and not so modern!) programming languages.
In this blog post, I will show how both MATLAB and R scripts can be called from JASON and perform the same operations as I described in the previous post in this series. MATLAB natively supports reading and writing HDF5 files, while R requires you to install an additional library. In the example presented in this post we use the rhdf5 library.
The MATLAB script is conceptually very similar to the invert.py Python script I described previously. MATLAB files implement a function which is applied to the data. Since MATLAB already has a function called invert(), we give this one a different name, in this case invertspec(), to avoid accidentally calling the wrong function. The h5read() and h5write() functions handle reading data from and writing data to the HDF5 file.
As with the Python script in the previous post, because we are not changing the size of the dataset, we can simply overwrite the original data with the modified data.
Calling this script from within JASON is straightforward. All you need to do is add the external command processing item to the processing list, and then set a couple of options. In the case of MATLAB, the command is just “matlab”, and the arguments supplied are: -nojvm -batch "invertspec('$TMPFILE')". The -nojvm and -batch flags stop the main MATLAB GUI from starting, so that the processing is done in the background using just the MATLAB kernel.
Due to the way function arguments are passed to MATLAB, we have to explicitly specify the name of the temporary file which JASON is going to send, and then close the brackets at the end of the string. However, for security reasons this filename is randomly generated each time the processing list is executed. So how do we achieve this? There is a special variable in JASON, called $TMPFILE, which can be used in the arguments of the external command, and JASON will replace it with the randomly generated file name.
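To make the substitution step concrete, here is a minimal Python sketch of the general idea. It is not JASON's actual implementation: the build_command() helper and the jason_ file prefix are hypothetical names used purely for illustration. It splits the arguments string into separate arguments first, and only then swaps $TMPFILE for a randomly named file, so the characters in the path cannot interfere with the quoting.

```python
import os
import secrets
import shlex
import tempfile

PLACEHOLDER = "$TMPFILE"  # the variable name used in the arguments field


def build_command(command, arguments, tmp_path):
    """Hypothetical helper: return the argument list with the placeholder filled in."""
    # shlex.split honours the double quotes, so the whole MATLAB call
    # invertspec('...') stays together as a single argument.
    parts = shlex.split(arguments)
    return [command] + [part.replace(PLACEHOLDER, tmp_path) for part in parts]


# Assumption: a random file name in the system temporary directory stands in
# for whatever name JASON generates on each run.
tmp_path = os.path.join(tempfile.gettempdir(), "jason_" + secrets.token_hex(8) + ".h5")

argv = build_command("matlab", "-nojvm -batch \"invertspec('$TMPFILE')\"", tmp_path)
print(argv)
```

Because the placeholder sits inside the quoted MATLAB expression, the brackets and quotes around it survive the substitution unchanged, which is why they have to be written out in full in the arguments field.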
The results of including this MATLAB script in the processing list should be identical to those obtained using the Python script from the previous blog post.
The R language is extremely popular for general statistical analysis, and is heavily used by the metabolomics community. We can implement the same example, but this time as an R script. To read HDF5 files we require an additional library, in this case the rhdf5 library. If you do not have this installed, you can run the command install.packages("BiocManager"), followed by BiocManager::install("rhdf5"), at the R prompt.
The general layout of our R script is the same as the Python and MATLAB versions. We use the commandArgs() function to capture the filename, and the h5read() and h5write() functions from the rhdf5 library to perform the same operations on the data as in the Python and MATLAB examples.
The command we wish to execute from within JASON in this case is the Rscript command. This is a lightweight front end to R that doesn’t launch the interactive interface, similar in effect to the flags we passed to MATLAB to stop the GUI appearing in the previous example.
The output of both the MATLAB and R scripts is identical to that of the Python scripts from my previous post. It doesn’t matter which programming language you use for your external processing. There are subtle differences in how the HDF5 bindings have been implemented in different languages, and in how they balance high-level utility functions against low-level access to the HDF5 data. Please see the documentation for the appropriate HDF5 library available for your favourite programming language!
Once the data is returned to JASON from your script, the remainder of the processing chain is applied and the results are displayed on the canvas. You can now perform all the usual visualisation and analysis operations available in JASON on your data.
In the next part of this series, we will look at implementing some more advanced data processing as an external command.
The scripts used in this series of posts are available here and can be used as the basis for your own external commands. The download includes MATLAB and R versions of both invert.py and double.py.