Scraping the IUPAC-NIST Solubility Data Series

This website was developed as a proof of concept: it takes an existing website of data, the NIST Solubility Database (extracted from papers), 'scrapes' the pages for the data and metadata, cleans them, and loads the result into a relational database (see this project on GitHub). Users (humans and computers) can then search the data more effectively and access it via a REST-style API, as follows:

For Volumes

  • Index of Volumes: /volumes
  • View a Volume: /volumes/view/<volume#>/<format>
    (format can be xml, json, or jsonld; omit it for HTML)
     

For Chemicals

  • Index of Chemicals: /chemicals
  • View a Chemical: /chemicals/view/<chemicalID>/<format>
    (chemicalID can be name, CASRN, InChI Key, or formula)
    (format can be xml, json, or jsonld; omit it for HTML)
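
Because a chemicalID can be a name, CASRN, InChI Key, or formula, a client has to place arbitrary text into a URL path. The following is a minimal sketch of one way to do that, not the site's own client code. The helper name chemical_path is hypothetical, and percent-encoding the identifier is an assumption about what the server accepts.

```python
from urllib.parse import quote

# Formats listed above; leaving the format off returns HTML.
FORMATS = {"xml", "json", "jsonld"}


def chemical_path(chemical_id, fmt=None):
    """Build /chemicals/view/<chemicalID>/<format> (hypothetical helper)."""
    # safe="" also encodes "/" so the identifier stays in one path segment.
    # Assumption: the server decodes percent-encoded identifiers.
    path = "/chemicals/view/" + quote(chemical_id, safe="")
    if fmt is not None:
        if fmt not in FORMATS:
            raise ValueError(f"unsupported format: {fmt!r}")
        path += "/" + fmt
    return path


if __name__ == "__main__":
    # Water by CASRN and by InChI Key, plus a name containing a space.
    print(chemical_path("7732-18-5", "json"))
    print(chemical_path("XLYOFNOQVPJJNP-UHFFFAOYSA-N", "jsonld"))
    print(chemical_path("acetic acid"))
```

The paths are relative; prefix them with the address at which the site is hosted.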
     

For System Types

  • Index of System Types: /systemtypes
  • View a System Type: /systemtypes/view/<systypeID>
     

For Experimental System Data

  • Index of Systems: /systems
  • View a System: /systems/view/<sysID>
     

For Citations

  • Index of Citations: /citations
  • View a Citation: /citations/view/<id>
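
All five resources above share the same index and view pattern, so a computer client can be sketched in a few lines. This is an illustration under stated assumptions rather than a definitive implementation: BASE_URL is a placeholder (the real host is not given here), the function names are hypothetical, and the JSON layout returned by the site is not assumed.

```python
import json
from urllib.request import urlopen

# Placeholder host: replace with wherever the site is actually served.
BASE_URL = "https://example.org"

# Resource names taken from the endpoint lists above.
RESOURCES = ("volumes", "chemicals", "systemtypes", "systems", "citations")


def index_url(resource, base_url=BASE_URL):
    """URL of a resource index, e.g. /citations."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource!r}")
    return f"{base_url}/{resource}"


def view_url(resource, identifier, base_url=BASE_URL):
    """URL of a single record, e.g. /systems/view/<sysID>."""
    return f"{index_url(resource, base_url)}/view/{identifier}"


def fetch_json(url, opener=urlopen):
    """GET a URL and parse the body as JSON.

    The opener parameter exists only so the call can be stubbed in tests.
    """
    with opener(url) as response:
        return json.load(response)
```

For example, fetch_json(view_url("volumes", 1) + "/json") would request a volume as JSON, assuming a volume numbered 1 exists. A format suffix is documented only for volumes and chemicals, so the sketch leaves it to the caller to append one.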

 
NOTE: The focus of this site was getting the data from the web pages into a database. Data fields were cleaned or organized only where that was necessary for the site to function correctly, so many fields remain uncleaned.