Compound properties and structured records

This notebook walks through loading a compound live from the NIST Chemistry WebBook and the structured record helpers added in NistChemPy 2.x.

[1]:
import nistchempy as nist

compound = nist.get_compound('C71432')  # benzene
compound
[1]:
NistCompound(ID=C71432)
[2]:
compound.to_dict()
[2]:
{'record_type': 'compound',
 'compound_id': 'C71432',
 'source_url': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432',
 'retrieved_at': '',
 'ID': 'C71432',
 'name': 'Benzene',
 'synonyms': ['[6]Annulene',
  'Benzol',
  'Benzole',
  'Coal naphtha',
  'Cyclohexatriene',
  'Phenyl hydride',
  'Pyrobenzol',
  'Pyrobenzole',
  'Benzolene',
  'Bicarburet of hydrogen',
  'Carbon oil',
  'Mineral naphtha',
  'Motor benzol',
  'Benzeen',
  'Benzen',
  'Benzin',
  'Benzine',
  'Benzolo',
  'Fenzen',
  'NCI-C55276',
  'Phene',
  'Rcra waste number U019',
  'UN 1114',
  'NSC 67315',
  '1,3,5-Cyclohexatriene'],
 'formula': 'C 6 H 6',
 'mol_weight': 78.1118,
 'inchi': 'InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H',
 'inchi_key': 'UHOVQNZJYSORNB-UHFFFAOYSA-N',
 'cas_rn': '71-43-2',
 'mol_refs': {'mol2D': 'https://webbook.nist.gov/cgi/cbook.cgi?Str2File=C71432',
  'mol3D': 'https://webbook.nist.gov/cgi/cbook.cgi?Str3File=C71432'},
 'data_refs': {'cTG': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432&Mask=1#Thermo-Gas',
  'cTC': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432&Mask=2#Thermo-Condensed',
  'cTP': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432&Mask=4#Thermo-Phase',
  'cTR': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432&Mask=8#Thermo-React',
  'cSO': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432&Mask=10#Solubility',
  'cIE': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432&Mask=20#Ion-Energetics',
  'cIC': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432&Mask=40#Ion-Cluster',
  'cIR': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432&Mask=80#IR-Spec',
  'cMS': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432&Mask=200#Mass-Spec',
  'cUV': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432&Mask=400#UV-Vis-Spec',
  'cES': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432&Mask=800#Electronic-Spec',
  'cGC': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C71432&Mask=2000#Gas-Chrom',
  'Fluid Properties': 'https://webbook.nist.gov/cgi/fluid.cgi?ID=C71432&Action=Page'},
 'nist_public_refs': {'Electron-Impact Ionization Cross Sections (on physics web site)': 'https://physics.nist.gov/cgi-bin/Ionization/table.pl?ionization=C6H6xx0',
  'Gas Phase Kinetics Database': 'https://kinetics.nist.gov/kinetics/rpSearch?cas=71432',
  'X-ray Photoelectron Spectroscopy Database, version 5.0': 'https://srdata.nist.gov/xps/SpectralByCompdDd/2349',
  'NIST Polycyclic Aromatic Hydrocarbon Structure Index': 'https://pah.nist.gov/?q=pah001'},
 'nist_subscription_refs': {'NIST / TRC Web Thermo Tables, "lite" edition (thermophysical and thermochemical data)': 'https://wtt-lite.nist.gov/wtt-lite/index.html?cmp=benzene',
  'NIST / TRC Web Thermo Tables, professional edition (thermophysical and thermochemical data)': 'https://wtt-pro.nist.gov/wtt-pro/index.html?cmp=benzene'}}
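
The 'formula' field above is a spaced string ('C 6 H 6') rather than a mapping. The following is a minimal sketch of turning such a string into element counts. The helper name parse_formula is hypothetical and not part of NistChemPy, and the sketch assumes other formulas follow the same "symbol, optional count" layout seen for benzene; charged species or isotopic labels are not handled.

```python
import re


def parse_formula(formula: str) -> dict[str, int]:
    """Turn a spaced formula string such as 'C 6 H 6' into element counts.

    Hypothetical helper: assumes element symbols followed by optional
    integer counts, separated by arbitrary whitespace.
    """
    counts: dict[str, int] = {}
    # Drop the whitespace, then match each symbol with its optional count.
    compact = "".join(formula.split())
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", compact):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


print(parse_formula("C 6 H 6"))
```

With a loaded compound, the same call would take compound.to_dict()['formula'] as its argument.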

Property loaders still fetch their data from WebBook pages. They now return the loaded objects in addition to storing them on the compound object.

[3]:
ms_spectra = compound.get_ms_spectra()
len(ms_spectra), ms_spectra[0] if ms_spectra else None
[3]:
(1, Spectrum(C71432, Mass spectrum #0))
[4]:
ms_spectra[0].to_dict(include_raw=False) if ms_spectra else None
[4]:
{'record_type': 'spectrum',
 'compound_id': 'C71432',
 'source_url': 'https://webbook.nist.gov/cgi/cbook.cgi?JCAMP=C71432&Index=0&Type=Mass',
 'retrieved_at': '',
 'spectrum_type': 'MS',
 'spectrum_index': '0',
 'parsed': {}}
[5]:
chromatograms = compound.get_gas_chromatography()
len(chromatograms), chromatograms[0] if chromatograms else None
[5]:
(16,
 Chromatogram(C71432, Kovats' RI, non-polar column, isothermal: 276 data points))
[6]:
chromatograms[0].to_dict()['data'][:3] if chromatograms else []
[6]:
[{'Column type': 'Capillary',
  'Active phase': 'RTX-5',
  'Column length (m)': '30.',
  'Carrier gas': 'N2',
  'Substrate': '',
  'Column diameter (mm)': '0.25',
  'Phase thickness (μm)': '0.25',
  'Temperature (C)': '100.',
  'I': '685.',
  'Reference': 'Ádámová, M.; Orinák, A.; Halás, L., Retention indices as identification tool in pyrolysis-capillary gas chromatography, J. Chromatogr. A, 2005, 1087, 1-2, 131-141, https://doi.org/10.1016/j.chroma.2005.01.003',
  'Comment': 'MSDC-RI'},
 {'Column type': 'Capillary',
  'Active phase': 'RTX-5',
  'Column length (m)': '30.',
  'Carrier gas': 'N2',
  'Substrate': '',
  'Column diameter (mm)': '0.25',
  'Phase thickness (μm)': '0.25',
  'Temperature (C)': '120.',
  'I': '694.74',
  'Reference': 'Ádámová, M.; Orinák, A.; Halás, L., Retention indices as identification tool in pyrolysis-capillary gas chromatography, J. Chromatogr. A, 2005, 1087, 1-2, 131-141, https://doi.org/10.1016/j.chroma.2005.01.003',
  'Comment': 'MSDC-RI'},
 {'Column type': 'Capillary',
  'Active phase': 'RTX-5',
  'Column length (m)': '30.',
  'Carrier gas': 'N2',
  'Substrate': '',
  'Column diameter (mm)': '0.25',
  'Phase thickness (μm)': '0.25',
  'Temperature (C)': '60.',
  'I': '672.74',
  'Reference': 'Ádámová, M.; Orinák, A.; Halás, L., Retention indices as identification tool in pyrolysis-capillary gas chromatography, J. Chromatogr. A, 2005, 1087, 1-2, 131-141, https://doi.org/10.1016/j.chroma.2005.01.003',
  'Comment': 'MSDC-RI'}]
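
The chromatogram rows above store every value as a string, including numeric ones such as '30.' and '694.74', and leave missing values as ''. A minimal sketch of converting selected columns to floats follows. The helper to_float is hypothetical, and the rows are abridged copies of the output above so the block runs without network access.

```python
def to_float(value: str):
    """Return a float for numeric strings such as '30.'; return None otherwise."""
    try:
        return float(value)
    except ValueError:
        # Empty strings and free text (e.g. a column phase name) land here.
        return None


# Two rows abridged from the output above (only a few keys kept).
rows = [
    {"Active phase": "RTX-5", "Temperature (C)": "100.", "I": "685."},
    {"Active phase": "RTX-5", "Temperature (C)": "120.", "I": "694.74"},
]

# Pair each column temperature with its retention index.
points = [(to_float(r["Temperature (C)"]), to_float(r["I"])) for r in rows]
print(points)
```

Returning None instead of raising keeps rows with blank fields, such as 'Substrate' above, usable in the same pass.
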
[7]:
[(record.record_type, type(record).__name__) for record in compound.to_records()]
[7]:
[('compound', 'CompoundRecord'),
 ('spectrum', 'SpectrumRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord'),
 ('gas_chromatography', 'ChromatogramRecord')]
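
The flat list above is easier to read as a tally per record type. A minimal sketch using collections.Counter follows; the pairs are copied from the output above so the block is self-contained.

```python
from collections import Counter

# (record_type, class name) pairs as listed in the output above.
pairs = (
    [("compound", "CompoundRecord"), ("spectrum", "SpectrumRecord")]
    + [("gas_chromatography", "ChromatogramRecord")] * 16
)

# Count how many records of each type were produced.
counts = Counter(record_type for record_type, _ in pairs)
print(dict(counts))
```

On a live compound, the same tally can be built with Counter(record.record_type for record in compound.to_records()), using the record_type attribute shown in cell [7].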

Spectrum records intentionally keep the raw JCAMP-DX text; the 'parsed' field in the output above is empty. Numeric digitization and parsing can be added later without changing the basic record interface.
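
As an illustration of what such later parsing might look like, here is a minimal sketch that splits JCAMP-DX text into header fields and peak pairs. It is not NistChemPy code: the function name parse_jcamp_peaks is hypothetical, the sample text is invented for the example, and only the simple '##PEAK TABLE=(XY..XY)' layout with comma-separated pairs is assumed. Compressed XYDATA encodings are not handled.

```python
def parse_jcamp_peaks(text: str):
    """Split JCAMP-DX text into header fields and (x, y) peak pairs.

    Hypothetical helper: assumes '##LABEL=value' header lines and a
    '##PEAK TABLE=(XY..XY)' block of whitespace-separated 'x,y' pairs.
    """
    header, peaks, in_table = {}, [], False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("##"):
            label, _, value = line[2:].partition("=")
            label = label.strip().upper()
            # Any new label ends the previous table; only PEAK TABLE starts one.
            in_table = label == "PEAK TABLE"
            if not in_table and label != "END":
                header[label] = value.strip()
        elif in_table:
            for pair in line.split():
                x, y = pair.split(",")
                peaks.append((float(x), float(y)))
    return header, peaks


# Invented sample text, not taken from the WebBook.
sample = """\
##TITLE=Example
##JCAMP-DX=4.24
##DATA TYPE=MASS SPECTRUM
##NPOINTS=3
##PEAK TABLE=(XY..XY)
50,10 51,20
78,100
##END=
"""

header, peaks = parse_jcamp_peaks(sample)
print(header["DATA TYPE"], peaks)
```

A result of this shape could be stored under the 'parsed' key while the raw text stays untouched, which matches the stated goal of leaving the record interface unchanged.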