Introduction to Parser Instructions

Parser instructions are the central concept of the output plugin. They provide the output plugin with easily configurable and extensible parsing functionality. The role of the output plugin can thus be understood, in simple terms, as the boilerplate needed to load and execute the parser instructions and to store the results they return.

Note

Parsing of the output is achieved by executing a sequence of parser instructions!

Specifying the Parser Instruction Input

To customize the output parsing process, we specify, as part of the calculation input, which instructions should be used. The instructions are listed under a special key, PARSER_INSTRUCTIONS, within the settings input node, as shown below:

settings = {'PARSER_INSTRUCTIONS': []}
instr = settings['PARSER_INSTRUCTIONS']  # for easier access
instr.append({
    'instr': 'dummy_data_parser',
    'type': 'data',
    'params': {}
})

...

calc.use_settings(ParameterData(dict=settings))

Here calc is an instance of the VaspCalculation class.

In the example above we are appending a single data parser instruction called dummy_data_parser. Each parser instruction is specified as a dictionary with three keys: instr, type, and params.

Currently there are three parser instruction types implemented: data, error, and structure. The distinction between these types comes into play during the instruction loading, where the output parser appends different auxiliary parameters to the instruction based on its type. For example, to every error type instruction a SCHED_ERROR_FILE parameter is appended. More information about the plugin’s default behaviour can be found here.
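The shape of an instruction can be checked before it is handed to the calculation. The following is a minimal sketch and not part of the plugin: the helper validate_instruction and the two constants are hypothetical names, and only the three keys and the three types named above are taken from the text.

```python
# Hypothetical helper, not part of the plugin: checks the shape described above.
REQUIRED_KEYS = {'instr', 'type', 'params'}
KNOWN_TYPES = {'data', 'error', 'structure'}


def validate_instruction(instruction):
    """Return a list of problems found in one instruction dictionary."""
    problems = []
    missing = REQUIRED_KEYS - set(instruction)
    if missing:
        problems.append('missing keys: ' + ', '.join(sorted(missing)))
    if 'type' in instruction and instruction['type'] not in KNOWN_TYPES:
        problems.append('unknown type: {!r}'.format(instruction['type']))
    return problems


settings = {'PARSER_INSTRUCTIONS': [
    {'instr': 'dummy_data_parser', 'type': 'data', 'params': {}},
    {'instr': 'dummy_data_parser', 'type': 'datta'},  # typo, and no params
]}

# Report the problems, if any, for each instruction in the settings.
for item in settings['PARSER_INSTRUCTIONS']:
    print(item['instr'], validate_instruction(item))
```

Running such a check during the calculation setup would catch a misspelled type or a forgotten key before the job is submitted, rather than during the output parsing.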

Defining New Parser Instructions

All parser instructions inherit from the base class, BaseInstruction, which provides the interface towards the output plugin. Therefore, the BaseInstruction is a template for implementing custom parser instructions.

In order to implement a new parser instruction one must inherit from the base class and override the following two things:

  1. the list of input files, given by the class property _input_file_list_; when the names of the input files are not known in advance, set _dynamic_file_list_ = True instead.
  2. the method BaseInstruction._parser_function(self), which is called when the instruction is executed and implements the actual parsing of the output.

Below we give examples of how to implement the two resulting instruction types, static and dynamic.

Note

In future versions we may implement BaseInstruction subclasses for each instruction type, i.e. StaticInstruction and DynamicInstruction, in order to be explicit about our intents.

Static Parser Instruction

A static parser instruction is an ordinary parser instruction for which the list of input file names can be specified in advance, i.e. the input file names are static.

The use of this method is advantageous since the input plugin will automatically update the list of files to be retrieved, and the instruction itself will automatically check that the required files are present before the parsing starts.

Since static parser instructions offer both convenience for the user and additional safeguards against invalid user input, they should be preferred over dynamic parser instructions!
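The check for required files can be pictured as follows. This is a minimal sketch under assumptions: the function missing_input_files and its arguments are hypothetical, the file OUTCAR is used only for illustration, and the check actually performed by the plugin may differ.

```python
import os
import tempfile


def missing_input_files(out_folder, input_file_list):
    """Return the names from input_file_list that are absent in out_folder."""
    return [name for name in input_file_list
            if not os.path.isfile(os.path.join(out_folder, name))]


# Usage: a retrieved folder that contains vasprun.xml but not OUTCAR.
folder = tempfile.mkdtemp()
with open(os.path.join(folder, 'vasprun.xml'), 'w') as handle:
    handle.write('<modeling/>')

missing = missing_input_files(folder, ['vasprun.xml', 'OUTCAR'])
print(missing)
```

Because the file names are known in advance, a check of this kind can run before any parsing starts, which is not possible when the names arrive only as user parameters.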

Example:

The Default_vasprun_parserInstruction is an example of a static parser instruction. It operates only on the statically named vasprun.xml file.

The first step in defining a static parser instruction is to override _input_file_list_. In this case the only input file is vasprun.xml.

Next follows the implementation of the _parser_function, which contains the output parsing logic. This part depends only on the user's preferences and does not depend on the internal workings of AiiDA.

Finally, the output must be returned as a tuple, (nodes_list, parser_warnings).

The nodes_list is a list of (name, node) tuples, e.g. [('velocities', ArrayData_type), ('energies', ParameterData_type), ...], where the second value of each tuple needs to be an instance of an AiiDA Data type.

The second item in the return tuple is the parser_warnings object, which is just a dictionary in which we can log useful information, e.g. non-critical errors, during the instruction execution. For example:

parser_warnings.setdefault('error name', 'details about the error')

After the instruction returns, the parser warnings are converted to a node, ('errors@Default_vasprun_parserInstruction', ParameterData(dict=parser_warnings)), which is stored in the AiiDA database as a part of the output.

Note

In summary, a static parser instruction is implemented by overriding _input_file_list_ and _parser_function. The parsed output must be returned in the format described above.
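To make the summary concrete, here is a minimal sketch of a static instruction. It is not the code of Default_vasprun_parserInstruction. The BaseInstruction class below is a hypothetical stand-in so that the example runs without AiiDA; its constructor arguments and the execute method are assumptions. The StepCountInstruction class is likewise invented for illustration, plain dictionaries stand in for AiiDA Data nodes, and the sketch assumes that vasprun.xml holds one calculation element per ionic step.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


class BaseInstruction(object):
    """Hypothetical stand-in for the plugin's base class (assumed interface)."""
    _input_file_list_ = []

    def __init__(self, out_folder, params=None):
        self._out_folder = out_folder
        self._params = params or {}

    def execute(self):
        # Assumed behaviour: run the parser, then add the warnings as a node.
        nodes_list, parser_warnings = self._parser_function()
        name = 'errors@' + self.__class__.__name__
        return nodes_list + [(name, parser_warnings)]

    def _parser_function(self):
        raise NotImplementedError


class StepCountInstruction(BaseInstruction):
    """Hypothetical static instruction: counts <calculation> elements."""
    _input_file_list_ = ['vasprun.xml']

    def _parser_function(self):
        parser_warnings = {}
        path = os.path.join(self._out_folder, self._input_file_list_[0])
        try:
            root = ET.parse(path).getroot()
            steps = len(root.findall('calculation'))
        except ET.ParseError as err:
            # A non-critical problem is logged instead of raised.
            steps = None
            parser_warnings.setdefault('invalid xml', str(err))
        # Plain dictionaries stand in for AiiDA Data nodes here.
        nodes_list = [('step_count', {'steps': steps})]
        return nodes_list, parser_warnings


# Usage with a tiny fake vasprun.xml in a temporary folder.
folder = tempfile.mkdtemp()
with open(os.path.join(folder, 'vasprun.xml'), 'w') as handle:
    handle.write('<modeling><calculation/><calculation/></modeling>')

result = StepCountInstruction(folder).execute()
print(result)
```

In a real instruction the second element of each tuple would be an AiiDA Data instance rather than a dictionary.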

Dynamic Parser Instruction

Dynamic parser instructions differ from static parser instructions in that the input file names must be provided by the user as an instruction parameter during the instruction specification, i.e. during the VASP calculation setup. This represents an overhead and allows a typo to crash the instruction during the output parsing. For this reason static parser instructions should be used whenever possible.

Example:

The Default_error_parserInstruction is an example of a dynamic parser instruction.

The whole code is given below:

First _dynamic_file_list_ is set to True, followed by the _parser_function implementation:

  1. get the name of the input file to open for parsing, self._params['SCHED_ERROR_FILE']. (See the note below.)
  2. read the whole standard error file. We could instead look for a particular kind of error here!
  3. set up the output node list and return. In this example only one node, runtime_errors, is created, and parser_warnings is just an empty dictionary.

Note

The SCHED_ERROR_FILE parameter is appended automatically by the output parser to every instruction of the error type. This is an example of the default behaviour.
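The three steps of a dynamic instruction can be sketched as follows. This is a minimal illustration and not the code of Default_error_parserInstruction. The BaseInstruction class is a hypothetical stand-in so that the example runs without AiiDA, its constructor arguments are assumptions, the flag is assumed to be spelled _dynamic_file_list_, the file name stderr.txt is invented, and a plain dictionary stands in for an AiiDA Data node.

```python
import os
import tempfile


class BaseInstruction(object):
    """Hypothetical stand-in for the plugin's base class (assumed interface)."""
    _input_file_list_ = []
    _dynamic_file_list_ = False

    def __init__(self, out_folder, params=None):
        self._out_folder = out_folder
        self._params = params or {}

    def _parser_function(self):
        raise NotImplementedError


class ErrorFileInstruction(BaseInstruction):
    """Hypothetical dynamic instruction following the three steps above."""
    _dynamic_file_list_ = True

    def _parser_function(self):
        # 1. The file name arrives as an instruction parameter.
        name = self._params['SCHED_ERROR_FILE']
        # 2. Read the whole standard error file.
        with open(os.path.join(self._out_folder, name)) as handle:
            errors = handle.read()
        # 3. One node, runtime_errors; a plain dict stands in for a Data node.
        nodes_list = [('runtime_errors', {'runtime_errors': errors})]
        parser_warnings = {}
        return nodes_list, parser_warnings


# Usage with an invented error file in a temporary folder.
folder = tempfile.mkdtemp()
with open(os.path.join(folder, 'stderr.txt'), 'w') as handle:
    handle.write('segmentation fault\n')

instruction = ErrorFileInstruction(folder, {'SCHED_ERROR_FILE': 'stderr.txt'})
nodes_list, parser_warnings = instruction._parser_function()
print(nodes_list, parser_warnings)
```

Note that a misspelled file name in the parameters only surfaces when open is called, which is the weakness of dynamic instructions described above.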