NistChemPy

Structural Search

This notebook demonstrates two structural-search workflows. The first uses the live NIST Chemistry WebBook structural-search endpoint. The second searches a tiny local CSV index fixture using RDKit and the indexed InChI/InChIKey fields.

RDKit is required for SMILES/InChI conversion and for local structural search. Install it with pip install -e ".[structure]" (run from a checkout of the repository) or with conda install -c conda-forge rdkit.

[1]:
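import nistchempy as nist

# Convert SMILES to a MOL block (requires RDKit)
molblock = nist.molblock_from_smiles('c1ccccc1')
# The first line of a MOL block is the title line, which may be empty
print(molblock.splitlines()[0])
# A MOL block ends with the 'M  END' terminator
print('M  END' in molblock)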

True
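
For orientation, the layout that the cell above checks can be sketched without RDKit or NistChemPy. A V2000 MOL block has three header lines (title, program/timestamp, comment), a counts line, one line per atom, one line per bond, and the M  END terminator. The block below is hand-written sample data for the heavy atoms of ethanol, and summarize_molblock is a hypothetical helper, not part of NistChemPy.

```python
# Hand-written V2000 MOL block (heavy atoms of ethanol); illustrative data only
MOLBLOCK = "\n".join([
    "",                                    # line 1: title (may be empty)
    "  hand-written example",              # line 2: program / timestamp
    "",                                    # line 3: comment
    "  3  2  0  0  0  0  0  0  0  0999 V2000",  # line 4: counts line
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    2.2500    1.3000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",                        # bond: atom 1 - atom 2, single
    "  2  3  1  0",                        # bond: atom 2 - atom 3, single
    "M  END",
])


def summarize_molblock(molblock):
    """Return (n_atoms, n_bonds, atom_symbols) read from a V2000 MOL block."""
    lines = molblock.splitlines()
    counts = lines[3]
    # The counts line uses fixed-width, 3-character fields
    n_atoms = int(counts[0:3])
    n_bonds = int(counts[3:6])
    # In each atom line the element symbol occupies columns 32-34
    symbols = [line[31:34].strip() for line in lines[4:4 + n_atoms]]
    return n_atoms, n_bonds, symbols


n_atoms, n_bonds, symbols = summarize_molblock(MOLBLOCK)
print(n_atoms, n_bonds, symbols)
```

The fixed-width slicing matters because adjacent count fields run together once a molecule has 100 or more atoms, so splitting on whitespace would be unreliable there.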

Live WebBook structural search

The live structural search sends a MOL block to the WebBook. Passing a MOL file or MOL block does not require RDKit, but this example uses RDKit to convert SMILES to a MOL block first.

[2]:
search = nist.run_structural_search(
    smiles='c1ccccc1',
    search_type='struct',
)
search.success, search.num_compounds, search.compound_ids[:5]
[2]:
(True, 1, ['C71432'])

Local structural search over an index

The local index stores InChI and InChIKey values. With RDKit installed, NistChemPy can screen those indexed structures locally. This is a linear scan over the CSV table, not a persistent fingerprint database, so every query reads the whole table and search time grows with the number of indexed compounds.

[3]:
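from pathlib import Path

# Look for the fixture in the working directory first,
# then fall back to the copy under docs/source/
index_path = Path('example_index.csv')
if not index_path.exists():
    index_path = Path('docs/source/example_index.csv')

index = nist.get_local_index(index_path)
# Exact-structure match; keep only the identifying columns
index.structural_search(
    smiles='c1ccccc1',
    mode='exact',
).loc[:, ['ID', 'name', 'formula']]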
[3]:
       ID     name formula
1  C71432  Benzene    C6H6
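
To make the "linear scan" idea concrete, here is a minimal standard-library sketch of an exact match by InChIKey over a small CSV table. It is not NistChemPy's implementation: the column names, the in-memory CSV text, and the exact_match helper are assumptions chosen for illustration. Only the two InChIKeys are real values for benzene and ethanol.

```python
import csv
import io

# Hypothetical index contents; the column names are assumed for this sketch
CSV_TEXT = """ID,name,formula,inchikey
C71432,Benzene,C6H6,UHOVQNZJYSORNB-UHFFFAOYSA-N
C64175,Ethanol,C2H6O,LFQSCWFLJHTTHZ-UHFFFAOYSA-N
"""


def exact_match(csv_text, inchikey):
    """Return every row whose InChIKey equals the query.

    Each call reads all rows, so the cost is proportional to the table size.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["inchikey"] == inchikey]


hits = exact_match(CSV_TEXT, "UHOVQNZJYSORNB-UHFFFAOYSA-N")
print([(hit["ID"], hit["name"]) for hit in hits])
```

A persistent fingerprint database would instead precompute and store a searchable representation, trading setup cost for faster repeated queries.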
[4]:
index.structural_search(
    smiles='CCO',
    mode='similarity',
    threshold=0.1,
).loc[:, ['ID', 'name', 'formula', 'similarity']]
[4]:
       ID     name formula  similarity
2  C64175  Ethanol   C2H6O         1.0
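
The similarity column and the threshold argument can be illustrated with a small sketch. It assumes a Tanimoto coefficient over fingerprint bits, which is a common choice for this kind of score. The page does not say which metric or fingerprint NistChemPy uses, so treat both as assumptions. The bit sets and candidate names below are made up, and tanimoto and similarity_hits are hypothetical helpers.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two sets of 'on' bits: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention used here for two empty fingerprints
    return len(a & b) / len(union)


def similarity_hits(query, candidates, threshold):
    """Score every candidate and keep those at or above the threshold."""
    scored = [(name, tanimoto(query, bits)) for name, bits in candidates.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    # Highest similarity first
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Made-up fingerprints, stored as sets of bit positions
candidates = {
    "cand_a": {1, 4, 9},     # same bits as the query
    "cand_b": {2, 3, 5, 8},  # no bits in common with the query
    "cand_c": {1, 4, 7},     # two of four distinct bits shared
}
query = {1, 4, 9}

hits = similarity_hits(query, candidates, threshold=0.1)
print(hits)
```

Under this assumption, a structure identical to the query scores 1.0, which agrees with the Ethanol row above, and a low threshold such as 0.1 removes only candidates that share almost nothing with the query.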

© Copyright 2023-2026, Ivan Yu. Chernyshov.
