Developing Custom Asimmodules
This section will guide you through taking your own in-house simulation and integrating it into ASIMTools. The process is designed to be as straightforward as reasonably possible, and with continued user feedback we hope to make it more seamless.
As an example, we will ASIMplify the calculation of equations of state as shown in ASE. To understand the code, visit the ASE tutorials page. Ultimately, the ASIMplified code will work with any chemical species, any supported calculator and any configured computing environment with fully parallelizable job submission on HPCs in just a few steps!
The code is given below:
from ase.build import bulk
from ase.calculators.emt import EMT
from ase.eos import calculate_eos
from ase.db import connect
db = connect('bulk.db')
for symb in ['Al', 'Ni', 'Cu', 'Pd', 'Ag', 'Pt', 'Au']:
    atoms = bulk(symb, 'fcc')
    atoms.calc = EMT()
    eos = calculate_eos(atoms)
    v, e, B = eos.fit()  # find minimum
    # Do one more calculation at the minimum and write to database:
    atoms.cell *= (v / atoms.get_volume())**(1 / 3)
    atoms.get_potential_energy()
    db.write(atoms, bm=B)
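The rescaling line in the loop, atoms.cell *= (v / atoms.get_volume())**(1 / 3), works because scaling every cell vector by a factor s multiplies the volume by s**3. The arithmetic can be checked with a minimal pure-Python sketch. The cell matrix and target volume below are illustrative numbers rather than values from the ASE example, and cell_volume is a helper defined here, not an ASE function.

```python
def cell_volume(cell):
    """Volume of a cell given as three row vectors (absolute determinant)."""
    (a, b, c), (d, e, f), (g, h, i) = cell
    return abs(a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g))


# Illustrative fcc primitive cell with lattice parameter 4.0 (arbitrary units).
cell = [[0.0, 2.0, 2.0], [2.0, 0.0, 2.0], [2.0, 2.0, 0.0]]
target_volume = 17.0  # hypothetical fitted equilibrium volume

# Scaling lengths by s scales the volume by s**3, so s = (v / V)**(1 / 3).
scale = (target_volume / cell_volume(cell)) ** (1 / 3)
scaled_cell = [[scale * x for x in row] for row in cell]

print(cell_volume(cell), cell_volume(scaled_cell))
```

After the rescaling, the cell has the fitted volume, which is why the example can follow it with one final energy calculation at the minimum.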
There are a couple of issues here that make it difficult to turn this into a high-throughput calculation. First, because we set up the calculator directly in the code, changing the calculator means changing the code. The same applies to the structures and chosen chemical species. In addition, the current code runs independent calculations one after the other. If each of these calculations were expensive, this would be much less efficient than running them as separate jobs on an HPC with a job scheduler. To fix any of these issues, we would have to modify the code, job scripts and perhaps even the file structure. We will show how ASIMTools can help with all of these issues. The end result will be that any of these changes can be made without touching code or a job script.
1. Build and test with a cheap calculator
To make things easier, it is best to first replace the expensive parts of your workflow with toy examples e.g. use an empirical calculator like EMT instead of DFT, use a smaller/simpler structure, loop over fewer cases etc.
The ASE example already uses a simple EMT calculator and simple structures.
2. Wrap in a function
The workhorse of ASIMTools is the asimmodule: any code wrapped in a function that returns a dictionary can be run within the ASIMTools framework. The easiest thing to do is to copy and paste the code into a function defined inside an asimmodule (a Python file) with the same name:
from ase.build import bulk
from ase.calculators.emt import EMT
from ase.eos import calculate_eos
from ase.db import connect
def ase_eos():
    db = connect('bulk.db')
    for symb in ['Al', 'Ni', 'Cu', 'Pd', 'Ag', 'Pt', 'Au']:
        atoms = bulk(symb, 'fcc')
        atoms.calc = EMT()
        eos = calculate_eos(atoms)
        v, e, B = eos.fit()  # find minimum
        # Do one more calculation at the minimum and write to database:
        atoms.cell *= (v / atoms.get_volume())**(1 / 3)
        atoms.get_potential_energy()
        db.write(atoms, bm=B)
    # An asimmodule must return a dictionary, even an empty one
    return {}
As it stands, this is already an asimmodule that can be run in ASIMTools. You can run it with the following sim_input.yaml:
asimmodule: /path/to/ase_eos.py
env_id: inline
workdir: results
then call
asim-execute sim_input.yaml
This is only slightly more complicated than calling python ase_eos.py.
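The convention that the function shares its name with the file is what lets a single path in sim_input.yaml identify the code to run. The sketch below illustrates that idea using only the standard library. Note that load_function_from_path is a hypothetical helper written for this example and is not ASIMTools' actual loading code.

```python
import importlib.util
from pathlib import Path


def load_function_from_path(path):
    """Import the file at `path` and return the function named after the file.

    Hypothetical helper for illustration; not part of ASIMTools.
    """
    path = Path(path)
    name = path.stem  # e.g. 'ase_eos' for '/path/to/ase_eos.py'
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, name)


if __name__ == "__main__":
    import tempfile

    # Write a toy module to a temporary directory and load its function.
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "toy_module.py"
        script.write_text("def toy_module():\n    return {}\n")
        func = load_function_from_path(script)
        print(func())
```

Under this assumption, a file whose function name does not match the file name would fail at the getattr step, which is one reason to keep the two names identical.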
3. Make the asimmodule use any calculator
This asimmodule, however, still depends on specific structures and a specific calculator. Let’s do the easy thing first and make the asimmodule work with any calculator using a simple change.
from ase.build import bulk
from ase.eos import calculate_eos
from ase.db import connect
from asimtools.calculators import load_calc
def ase_eos(
    calc_id,
):
    calc = load_calc(calc_id)
    db = connect('bulk.db')
    for symb in ['Al', 'Ni', 'Cu', 'Pd', 'Ag', 'Pt', 'Au']:
        atoms = bulk(symb, 'fcc')
        atoms.calc = calc
        eos = calculate_eos(atoms)
        v, e, B = eos.fit()  # find minimum
        # Do one more calculation at the minimum and write to database:
        atoms.cell *= (v / atoms.get_volume())**(1 / 3)
        atoms.get_potential_energy()
        db.write(atoms, bm=B)
    return {}
Just like that, we can now run the asimmodule with any correctly configured calculator for all the structures! We can even run calc_array to iterate over different calculators and collect the results for each.
4. Make the asimmodule use any structure
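The final change we will make is to parallelize over structures. Instead of looping over species inside the function, the asimmodule now receives a single image argument describing one structure, as below: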
from ase.eos import calculate_eos
from ase.db import connect
from asimtools.calculators import load_calc
from asimtools.utils import get_atoms
def ase_eos(
    image,
    calc_id,
):
    calc = load_calc(calc_id)
    db = connect('bulk.db')
    # Build the structure from the image specification
    atoms = get_atoms(**image)
    atoms.calc = calc
    eos = calculate_eos(atoms)
    v, e, B = eos.fit()  # find minimum
    # Do one more calculation at the minimum and write to database:
    atoms.cell *= (v / atoms.get_volume())**(1 / 3)
    atoms.get_potential_energy()
    db.write(atoms, bm=B)
    return {}
Easy-peasy. We now have an asimmodule that works with an arbitrary environment, arbitrary calculator and arbitrary input structure (of course, the simulation will fail if we give it a bad structure or calculator).
5. Final cleanup
We can do some final cleanup of the asimmodule so that it sends outputs to output.yaml and logs some checkpoints. Additionally, any asimmodules added to the repository will need clear type hints and documentation.
from typing import Dict
import logging
from ase.eos import calculate_eos
from ase.db import connect
from asimtools.calculators import load_calc
from asimtools.utils import get_atoms
def ase_eos(
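    image: Dict,
    calc_id: str,
    db_file: str = 'bulk.db',
) -> Dict:
    """Fit an equation of state for one structure and store the result

    :param image: Image specification passed to get_atoms
    :param calc_id: ID of the calculator to load
    :param db_file: Path of the ASE database to write to
    :return: Fitted volume, energy and bulk modulus
    """
    calc = load_calc(calc_id)
    db = connect(db_file)
    atoms = get_atoms(**image)
    atoms.calc = calc
    eos = calculate_eos(atoms)
    v, e, B = eos.fit()  # find minimum
    logging.info('Successfully fit EOS')
    # Do one more calculation at the minimum and write to database:
    atoms.cell *= (v / atoms.get_volume())**(1 / 3)
    atoms.get_potential_energy()
    db.write(atoms, bm=B)
    # Cast to plain floats so the results can be written to output.yaml
    results = {'v': float(v), 'e': float(e), 'B': float(B)}
    return results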
To run this asimmodule on an arbitrary structure, say argon, with the LennardJones calculator in a Slurm job, we can now use the following input files.
sim_input.yaml:
asimmodule: /path/to/ase_eos.py
env_id: batch
workdir: results
args:
  image:
    builder: bulk
    name: Ar
  calc_id: lj_Ar
calc_input.yaml:
lj_Ar:
  name: LennardJones
  module: ase.calculators.lj
  args:
    sigma: 3.54
    epsilon: 0.00802236
emt: # This is not used if an LJ calculator is chosen
  name: EMT
  module: ase.calculators.emt
  args: {}
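Each entry in calc_input.yaml names a class (name), the module it lives in (module) and its keyword arguments (args). One way such an entry could be turned into an object is sketched below, assuming the loader imports the module, looks up the class and calls it with args. The helper build_from_entry is hypothetical and is not ASIMTools' load_calc. The stand-in entry uses a standard-library class so the sketch runs without ASE.

```python
import importlib


def build_from_entry(entry):
    """Instantiate the class described by a {'name', 'module', 'args'} entry.

    Hypothetical helper for illustration; not ASIMTools' load_calc.
    """
    module = importlib.import_module(entry["module"])
    cls = getattr(module, entry["name"])
    return cls(**entry.get("args", {}))


# Stand-in config with the same shape as calc_input.yaml, using a
# standard-library class instead of an ASE calculator.
calc_input = {
    "half": {
        "name": "Fraction",
        "module": "fractions",
        "args": {"numerator": 1, "denominator": 2},
    },
}

obj = build_from_entry(calc_input["half"])
print(type(obj).__name__, obj)
```

With ASE installed, passing the lj_Ar entry through the same helper would, under this assumption, import ase.calculators.lj and call LennardJones(sigma=3.54, epsilon=0.00802236).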
env_input.yaml:
batch:
  mode:
    use_slurm: true
    interactive: false
  slurm:
    flags:
      - -n 2
    precommands:
      - source ~/.bashrc
      - conda activate asimtools
inline: # This is not used if env_id is batch
  mode:
    use_slurm: false
    interactive: true
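To make the role of the flags and precommands entries concrete, the sketch below assembles a Slurm batch script from an env_input-style dictionary. It assumes each flag becomes an #SBATCH line and each precommand runs before the simulation command. The function make_job_script and the placeholder run command are hypothetical, and the script ASIMTools actually writes may differ.

```python
def make_job_script(env, run_command):
    """Assemble a Slurm batch script from an env_input-style dictionary.

    Hypothetical layout for illustration; not ASIMTools' own job writer.
    """
    slurm = env.get("slurm", {})
    lines = ["#!/bin/bash"]
    # Each flag is assumed to map onto one #SBATCH directive.
    lines += [f"#SBATCH {flag}" for flag in slurm.get("flags", [])]
    # Precommands prepare the shell (e.g. activate an environment).
    lines += list(slurm.get("precommands", []))
    lines.append(run_command)
    return "\n".join(lines) + "\n"


env_input = {
    "batch": {
        "mode": {"use_slurm": True, "interactive": False},
        "slurm": {
            "flags": ["-n 2"],
            "precommands": ["source ~/.bashrc", "conda activate asimtools"],
        },
    },
}

# Placeholder command; the real command is chosen by ASIMTools.
script = make_job_script(env_input["batch"], "echo 'simulation command goes here'")
print(script)
```

Keeping these settings in env_input.yaml rather than in a hand-written job script is what allows the same asimmodule to move between clusters by switching env_id.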
6. Running multiple simulations in a workflow
Going back to the original problem, we wanted to run the simulation of multiple different elements with the EMT calculator. To achieve that in parallel, we can nest the ase_eos asimmodule in an asimtools.asimmodules.workflows.sim_array.sim_array() asimmodule as follows:
sim_input.yaml:
asimmodule: workflows.sim_array
workdir: results
args:
  key_sequence: ['args', 'image', 'name']
  array_values: ['Al', 'Ni', 'Cu', 'Pd', 'Ag', 'Pt', 'Au']
  env_ids: 'batch'
  template_sim_input:
    asimmodule: /path/to/ase_eos.py
    args:
      calc_id: emt
      image:
        builder: bulk
        crystalstructure: 'fcc'
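The key_sequence names a nested location in the template, and each entry of array_values is written to that location in its own copy of the template. The sketch below shows this expansion in plain Python. The helpers set_nested and expand_array are hypothetical names used for illustration and are not the ASIMTools implementation.

```python
import copy


def set_nested(dictionary, key_sequence, value):
    """Set dictionary[k0][k1]...[kn] = value, creating levels as needed."""
    target = dictionary
    for key in key_sequence[:-1]:
        target = target.setdefault(key, {})
    target[key_sequence[-1]] = value


def expand_array(template, key_sequence, array_values):
    """Return one independent copy of `template` per value (hypothetical)."""
    sim_inputs = []
    for value in array_values:
        sim_input = copy.deepcopy(template)  # leave the template untouched
        set_nested(sim_input, key_sequence, value)
        sim_inputs.append(sim_input)
    return sim_inputs


template_sim_input = {
    "asimmodule": "/path/to/ase_eos.py",
    "args": {
        "calc_id": "emt",
        "image": {"builder": "bulk", "crystalstructure": "fcc"},
    },
}
sim_inputs = expand_array(
    template_sim_input,
    ["args", "image", "name"],
    ["Al", "Ni", "Cu", "Pd", "Ag", "Pt", "Au"],
)
print(len(sim_inputs), sim_inputs[0]["args"]["image"])
```

Each resulting dictionary is a complete sim_input for ase_eos, so the jobs are independent and can be submitted separately.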
To make the asimmodule easier to access without having to use the full path, you can set the environment variable
export ASIMTOOLS_ASIMMODULE_DIR=/path/to/my/asimmodule/dir/
You can then move the ase_eos.py asimmodule into the asimmodule directory, /path/to/my/asimmodule/dir/ (the example below assumes it is placed in an ase_eos/ subdirectory there). This allows you to refer to asimmodules by their path relative to the asimmodule directory, as below:
asimmodule: workflows.sim_array
workdir: results
args:
  key_sequence: ['args', 'image', 'name']
  array_values: ['Al', 'Ni', 'Cu', 'Pd', 'Ag', 'Pt', 'Au']
  env_ids: 'batch'
  template_sim_input:
    asimmodule: ase_eos/ase_eos.py
    args:
      calc_id: emt
      image:
        builder: bulk
        crystalstructure: 'fcc'
The above example loops over crystals for which ASE already has FCC lattice parameters, but what if we want to loop over the species and corresponding lattice parameters? We can either specify a list of image dictionaries as array_values or use secondary_array_values. We can also explicitly tell ASIMTools to include the array_values in the directory names in the standard format (e.g. id-0000__Al__, id-0001__Ni__ etc.).
asimmodule: workflows.sim_array
workdir: results
args:
  key_sequence: ['args', 'image', 'name']
  array_values: ['Al', 'Ni', 'Cu', 'Pd', 'Ag', 'Pt', 'Au']
  labels: values
  secondary_key_sequences:
    - ['args', 'image', 'a']
  secondary_array_values:
    - [4.0479, 3.524, 3.6149, 3.8907, 4.0853, 3.9242, 4.0782]
  env_ids: 'batch'
  template_sim_input:
    asimmodule: ase_eos/ase_eos.py
    args:
      calc_id: emt
      image:
        builder: bulk
        crystalstructure: 'fcc'
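This will perform the EOS calculation for each species with the corresponding lattice parameter. That’s 7 independent jobs, each sampling 5 volumes (the calculate_eos default), for 7x5=35 energy calculations spread across parallel jobs without touching code or a job script!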
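The pairing of array_values with secondary_array_values can be pictured as walking the lists index by index, so the i-th species receives the i-th lattice parameter. The sketch below illustrates that pairing and the directory names quoted above. The function expand_with_secondary is hypothetical, and the way ASIMTools builds the names internally is an assumption.

```python
import copy


def set_nested(dictionary, key_sequence, value):
    """Set dictionary[k0][k1]...[kn] = value, creating levels as needed."""
    target = dictionary
    for key in key_sequence[:-1]:
        target = target.setdefault(key, {})
    target[key_sequence[-1]] = value


def expand_with_secondary(template, key_sequence, array_values,
                          secondary_key_sequences, secondary_array_values):
    """Pair each primary value with the secondary values at the same index."""
    for values in secondary_array_values:
        if len(values) != len(array_values):
            raise ValueError("secondary arrays must match array_values in length")
    jobs = {}
    for index, value in enumerate(array_values):
        sim_input = copy.deepcopy(template)
        set_nested(sim_input, key_sequence, value)
        for keys, values in zip(secondary_key_sequences, secondary_array_values):
            set_nested(sim_input, keys, values[index])
        # Directory name in the id-0000__Al__ style shown in the text.
        jobs[f"id-{index:04d}__{value}__"] = sim_input
    return jobs


template = {
    "asimmodule": "ase_eos/ase_eos.py",
    "args": {"calc_id": "emt",
             "image": {"builder": "bulk", "crystalstructure": "fcc"}},
}
jobs = expand_with_secondary(
    template,
    ["args", "image", "name"],
    ["Al", "Ni", "Cu", "Pd", "Ag", "Pt", "Au"],
    [["args", "image", "a"]],
    [[4.0479, 3.524, 3.6149, 3.8907, 4.0853, 3.9242, 4.0782]],
)
for name, sim_input in jobs.items():
    print(name, sim_input["args"]["image"])
```

Because the values are paired rather than combined, seven species and seven lattice parameters give seven jobs, not forty-nine.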