Fitting thermodynamic models with pycalphad

Overview

ESPEI

ESPEI, or Extensible Self-optimizing Phase Equilibria Infrastructure, is a tool for thermodynamic database development within the CALPHAD method. It uses pycalphad for calculating Gibbs free energies of thermodynamic models.

Read the documentation at espei.org.

Installation

Anaconda (recommended)

ESPEI does not require any special compiler, but several dependencies do. Therefore it is suggested to install ESPEI from conda-forge.

conda install -c conda-forge espei
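
A minimal check that the installation worked, assuming the environment where ESPEI was installed is active:

import espei
print(espei.__version__)  # should print the installed version, e.g. 0.8.x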

What is ESPEI?

  1. ESPEI parameterizes CALPHAD models with enthalpy, entropy, and heat capacity data using the corrected Akaike Information Criterion (AICc). This parameter generation step augments the CALPHAD modeler by providing tools for data-driven model selection, rather than relying on a modeler's intuition alone.
  2. ESPEI optimizes CALPHAD model parameters to thermochemical and phase boundary data and quantifies the uncertainty of the model parameters using Markov Chain Monte Carlo (MCMC). This is similar to the PARROT module of Thermo-Calc, but goes beyond by adjusting all parameters simultaneously and evaluating parameter uncertainty.

Details on the implementation of ESPEI can be found in the publication: B. Bocklund et al., MRS Communications 9(2) (2019) 1–10. doi:10.1557/mrc.2019.59.

What can ESPEI do?

ESPEI can be used to generate model parameters for CALPHAD models of the Gibbs energy that follow the temperature-dependent polynomials by Dinsdale (CALPHAD 15(4) (1991) 317-425) within the compound energy formalism (CEF) for endmembers and Redlich-Kister-Muggianu excess mixing parameters in unary, binary and ternary systems.

All thermodynamic quantities are computed by pycalphad. The MCMC-based Bayesian parameter estimation can optimize parameters against data for any model supported by pycalphad, including models beyond the endmember Gibbs energies and Redlich-Kister-Muggianu excess terms, such as parameters in the ionic liquid model, magnetic parameters, or two-state models. Performing Bayesian parameter estimation for arbitrary multicomponent thermodynamic data is supported.

Goals

  1. Offer a free and open-source tool for users to develop multicomponent databases with quantified uncertainty
  2. Enable development of CALPHAD-type models for Gibbs energy, thermodynamic or kinetic properties
  3. Provide a platform to build and apply novel model selection, optimization, and uncertainty quantification methods

The implementation of ESPEI first performs parameter generation by fitting the parameters of thermodynamic models that enter linearly against non-equilibrium thermochemical data. Markov Chain Monte Carlo (MCMC) is then used to optimize the candidate models from parameter generation against phase boundary data.
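
As a rough illustration of the parameter generation step, the sketch below calls generate_parameters directly from Python. The file names, the 'SGTE91' reference data choice, and the 'linear' excess model are assumptions for this example; a run is normally configured through ESPEI's YAML input file instead.

import json
from espei.datasets import load_datasets, recursive_glob
from espei.paramselect import generate_parameters

# phase_models.json describes the components and the sublattice model of each phase
with open('phase_models.json') as fp:
    phase_models = json.load(fp)
# collect all JSON datasets under the input-datasets directory
datasets = load_datasets(recursive_glob('input-datasets'))
# generate and select parameters (AICc-based model selection), then save the database
dbf = generate_parameters(phase_models, datasets, 'SGTE91', 'linear')
dbf.to_file('generated.tdb', if_exists='overwrite')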

Cu-Mg phase diagram

Cu-Mg phase diagram from a database created with and optimized by ESPEI. See the Cu-Mg Example.

History

The ESPEI package is based on a fork of pycalphad-fitting. The name and idea of ESPEI are originally based on Shang, Wang, and Liu, "ESPEI: Extensible, Self-optimizing Phase Equilibrium Infrastructure for Magnesium Alloys," Magnes. Technol. 2010 (2010) 617-622.

Implementation details for ESPEI are described in the publication by Bocklund et al. listed under Citing ESPEI below.

Getting Help

For help on installing and using ESPEI, please join the PhasesResearchLab/ESPEI Gitter room.

Bugs and software issues should be reported on GitHub.

License

ESPEI is MIT licensed.

The MIT License (MIT)

Copyright (c) 2015-2018 Richard Otis
Copyright (c) 2017-2018 Brandon Bocklund
Copyright (c) 2018-2019 Materials Genome Foundation

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Citing ESPEI

If you use ESPEI for work presented in a publication, we ask that you cite the following publication:

  1. B. Bocklund, R. Otis, A. Egorov, A. Obaied, I. Roslyakova, Z.-K. Liu, ESPEI for efficient thermodynamic database development, modification, and uncertainty quantification: application to Cu–Mg, MRS Commun. 9(2) (2019) 1–10. doi:10.1557/mrc.2019.59.
@article{Bocklund2019ESPEI,
         archivePrefix = {arXiv},
         arxivId = {1902.01269},
         author = {Bocklund, Brandon and Otis, Richard and Egorov, Aleksei and Obaied, Abdulmonem and Roslyakova, Irina and Liu, Zi-Kui},
         doi = {10.1557/mrc.2019.59},
         eprint = {1902.01269},
         issn = {2159-6859},
         journal = {MRS Communications},
         month = {jun},
         pages = {1--10},
         title = {{ESPEI for efficient thermodynamic database development, modification, and uncertainty quantification: application to Cu–Mg}},
         year = {2019}
}
Comments
  • Compute metastable/unstable single phase driving forces in ZPF error

    Thanks to Tobias Spitaler for suggesting this and to @richardotis for brainstorming this solution concept.

    This PR introduces two new functions in ZPF error, _solve_sitefracs_composition and _sample_solution_constitution. Their purpose is to facilitate computing metastable or unstable single phase driving forces when a phase has a miscibility gap. This should improve the convergence for any phase that has a stable or metastable miscibility gap.

    Rationale

    ESPEI currently computes the "single-phase hyperplane" at a vertex by performing an equilibrium calculation at a black point and then subtracting that energy from the target hyperplane energy at that composition. As illustrated in the figure Tobias constructed (below), this is problematic for phases with a miscibility gap because a "single-phase" equilibrium calculation in pycalphad will always compute the global minimum energy and give two composition sets.

    [Figure: driving-force-Spitaler]

    What ESPEI should do is what Tobias illustrates by the orange x and the green driving force line. This solution ensures that minimizing the driving force will force the Gibbs energy curve to match the energy of the black points on the multi-phase target hyperplane.

    Historically, we didn't implement this because one would like to use equilibrium to minimize the internal degrees of freedom, but pycalphad always computes the global minimum energy, so it was not possible to do via equilibrium. More recently, ESPEI introduced the idea of approximate_equilibrium, which uses starting_point to more quickly determine a minimum energy solution from a discrete point sampling grid. The approximate_equilibrium method we use still has the same problem as pycalphad's equilibrium because starting_point will still give the global minimum solution for the discrete sampling.

    Solution

    In an ideal world, pycalphad would be able to turn off global minimization (which automatically introduces new composition sets) and would support setting a condition for the composition of a phase, i.e. X(BCC,B). In practice, turning off global minimization and providing a valid starting point with only one composition set, together with a global composition condition, would simulate a phase composition condition. Unfortunately, neither turning off global minimization nor phase composition conditions is currently implemented, so we need a workaround.

    The two functions introduced here consider each single phase composition at a tie-vertex and construct a point grid that only contains points which satisfy the prescribed overall composition (and the internal phase constraints). This can be used in either approximate or exact equilibrium mode to find the lowest-energy starting point and then to run equilibrium with the constrained point grid, so that the global minimization step has no new composition sets to introduce (i.e. it cannot detect a miscibility gap).

    For performance, we pre-compute the grid of points for every phase composition in the ZPF datasets and re-use them to compute the grid, starting point, and equilibrium at every parameter iteration (note that this would be invalid if a parameter changes the number of moles, such as varying the coordination number in the MQMQA).
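
    A rough, hypothetical sketch of the constrained-grid idea (not the actual _solve_sitefracs_composition/_sample_solution_constitution implementation): sample internal site fractions for one phase, keep only the points whose implied overall composition matches the prescribed composition, and hand that reduced grid to the minimizer. The sublattice model and site ratios below are made up for illustration.

    import numpy as np

    site_ratios = np.array([3.0, 1.0])   # hypothetical (A,B)3(A,B)1 phase
    target_x_b = 0.40                    # prescribed overall X(B) at the tie-vertex
    rng = np.random.default_rng(0)

    # candidate site fractions of B on each sublattice, shape (npoints, 2)
    y_b = rng.random((100_000, 2))
    # overall X(B) implied by each candidate internal configuration
    x_b = (y_b @ site_ratios) / site_ratios.sum()
    # keep only points consistent with the prescribed composition (within tolerance)
    constrained_grid = y_b[np.abs(x_b - target_x_b) < 1e-3]
    # this reduced grid is what would be passed to the (approximate) equilibrium step,
    # so that global minimization cannot introduce new composition sets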

    To summarize the impact:

    1. This method will be entirely backwards compatible for phases without a miscibility gap.
    2. For cases where a miscibility gap is present in the parameters, but a single phase is prescribed, there will be a driving force to eliminate the miscibility gap, so the single phase compositions are more meaningful too. This is significant because you can prescribe single phase regions in ZPF datasets and it will enforce that no miscibility gap occurs, which is not true today.
    3. For phase compositions inside a miscibility gap, the Gibbs energy curve will match the multi-phase global minimum hyperplane at the phase compositions (at convergence).
    opened by bocklund 20
  • ERROR occurred using the new development version

    Dear Administrator, Several tests failed when I ran pytest after installing the new development version (2021/4/21, Beijing time). Meanwhile, some errors occurred when I ran example cases that ran successfully with earlier versions. errorlog.txt pytestfail.txt condalist.txt

    opened by duxiaoxian 12
  • Error releasing un-acquired lock in dask

    This error occurred with distributed (1.18.0). Changed to distributed (1.16.3).

      File "/Applications/anaconda/envs/my_pycalphad/bin/espei", line 11, in <module>
        sys.exit(main())
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/run_espei.py", line 135, in main
        mcmc_steps=args.mcmc_steps, save_interval=args.save_interval)
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/paramselect.py", line 754, in fit
        for i, result in enumerate(sampler.sample(walkers, iterations=mcmc_steps)):
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/emcee/ensemble.py", line 259, in sample
        lnprob[S0])
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/emcee/ensemble.py", line 332, in _propose_stretch
        newlnprob, blob = self._get_lnprob(q)
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/emcee/ensemble.py", line 382, in _get_lnprob
        results = list(M(self.lnprobfn, [p[i] for i in range(len(p))]))
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/utils.py", line 39, in map
        result = [x.result() for x in result]
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/utils.py", line 39, in <listcomp>
        result = [x.result() for x in result]
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/distributed/client.py", line 155, in result
        six.reraise(*result)
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/six.py", line 685, in reraise
        raise value.with_traceback(tb)
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/distributed/protocol/pickle.py", line 59, in loads
        return pickle.loads(x)
    RuntimeError: cannot release un-acquired lock
    bug 
    opened by ghost 10
  • dask workers can sometimes die without warning

    I haven't been able to reproduce it consistently, but dask workers sometimes die when using the dask scheduler.

    To debug this, I turned on debugging output by setting scheduler = LocalCluster(n_workers=cores, threads_per_worker=1, processes=True, silence_logs=verbosity[output_settings['verbosity']]).
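
    For reference, a self-contained version of that debugging setup might look like the sketch below; the worker count and the use of logging.DEBUG for silence_logs are assumptions for illustration, not ESPEI's exact settings.

    import logging
    from dask.distributed import Client, LocalCluster

    cores = 4  # hypothetical worker count
    scheduler = LocalCluster(n_workers=cores, threads_per_worker=1,
                             processes=True, silence_logs=logging.DEBUG)
    client = Client(scheduler)
    print(client)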

    I am still waiting for that job to have workers die so I can see the output, but for now, as iterations in emcee complete, the results are processed in Python (we know this is happening because of the progress bar output). During this time, the LocalCluster debugging gives the following output:

    distributed.core - WARNING - Event loop was unresponsive for 1.69s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
    

    Usually I get two similar messages in a row.

    As another possibility, the most recent time I was able to reproduce this was when I had two instances of ESPEI running at the same time. I wouldn't think that the different client instances would interact, but maybe it should be investigated.

    opened by bocklund 6
  • Issues reproducing Cu-Mg example

    I had several issues running the Cu-Mg example from the ESPEI website. I installed ESPEI using the conda command, and took the Cu-Mg data directory from the ESPEI-datasets repository.

    I first tried reproducing the diagram from the section titled First-principles phase diagram. The code ran successfully, but the returned phase diagram didn't match the example well (see attached image: diagram_dft).

    I then tried reproducing the results in the MCMC optimization section. I wasn't able to successfully perform the MCMC optimization. The code returned numerous errors over the course of several minutes and eventually hung with no further output.

    This file contains the full python output when I ran the optimization: espei_mcmc_error.txt

    Here is my python version and installed packages/versions: python_info.txt

    opened by npaulson 6
  • The latest version of espei = 0.7.2 gets an error when plotting

    I have recently been using the latest version of espei = 0.7.2 and I always get an error, but espei = 0.6 worked fine (see attached image).

    My current computer can't use espei = 0.6 anymore, so I don't know which version to use or what went wrong. I always get MPI errors when I use espei = 0.6 (see attached image).

    AG_CU_1214.zip

    opened by duxiaoxian 5
  • Run ESPEI via input files, rather than command line arguments

    A first draft and feedback were written in this gist

    The current iteration is:

    Header area.
    Include any metadata above the `---`.
    ---
    # core run settings
    run_type: full # choose full | dft | mcmc
    phase_models: input.json
    datasets: input-datasets # path to datasets. Defaults to current directory.
    scheduler: dask # can be dask | MPIPool
    
    # control output
    verbosity: 0 # integer verbosity level 0 | 1 | 2, where 2 is most verbose.
    output_tdb: out.tdb
    tracefile: chain.npy # name of the file containing the mcmc chain array
    probfile: lnprob.npy # name of the file containing the mcmc ln probability array
    
    # the following only take effect for full or mcmc runs
    mcmc:
      mcmc_steps: 2000
      mcmc_save_interval: 100
    
      # the following take effect for only mcmc runs
      input_tdb: null # TDB file used to start the mcmc run
      restart_chain: null # restart the mcmc fitting from a previous calculation
    

    This issue will focus on the development of a first generation input file structure and spec, and also as a place to brainstorm options that should be user-facing.
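
    As a rough sketch of how such a file could be consumed (assuming PyYAML; a line containing only '---' separates the free-form header from the settings):

    import yaml

    with open('espei-in.yaml') as fp:
        lines = fp.read().splitlines()
    # the settings document is everything after the first line that is exactly '---'
    body = '\n'.join(lines[lines.index('---') + 1:])
    settings = yaml.safe_load(body)
    print(settings['run_type'], settings['mcmc']['mcmc_steps'])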

    opened by bocklund 5
  • Limit the degrees of freedom for non-active phases in MCMC to prevent them from diverging?

    Phases that do not have phase equilibria data should have their parameters fixed before the MCMC run.

    A particular phase in an ESPEI run can have single phase DFT data and no phase equilibria. This means that the parameters that were calculated in the single phase fitting have no effect on the error function that is used in the MCMC run.

    When parameters have no effect on the error function, they diverge when used in emcee because the ensemble sampler scales them up to infinity in an attempt to force that parameter to affect the error function.

    bug enhancement 
    opened by bocklund 5
  • Error when running Cu-Mg example

    Hello, I am trying to run ESPEI for the first time.

    I created a conda env and installed ESPEI using conda. I downloaded the json and yaml files as well as the contents of the Cu-Mg folder in ESPEI-datasets, and renamed it to input-data. After running espei --input espei-in.yaml, I get the errors below. Could you please let me know if I am doing anything wrong?

    Thanks!

    Traceback (most recent call last):
      File "/Users/latmarat/miniforge3/envs/espenv/bin/espei", line 10, in <module>
        sys.exit(main())
      File "/Users/latmarat/miniforge3/envs/espenv/lib/python3.10/site-packages/espei/espei_script.py", line 307, in main
        run_espei(input_settings)
      File "/Users/latmarat/miniforge3/envs/espenv/lib/python3.10/site-packages/espei/espei_script.py", line 177, in run_espei
        dbf = generate_parameters(phase_models, datasets, refdata, excess_model,
      File "/Users/latmarat/miniforge3/envs/espenv/lib/python3.10/site-packages/espei/paramselect.py", line 517, in generate_parameters
        aliases = extract_aliases(phase_models)
      File "/Users/latmarat/miniforge3/envs/espenv/lib/python3.10/site-packages/espei/utils.py", line 370, in extract_aliases
        aliases = {phase_name: phase_name for phase_name in phase_models["phases"].keys()}
    AttributeError: 'list' object has no attribute 'keys'
    
    opened by latmarat 4
  • AttributeError: 'NoneType' object has no attribute 'values'

    Dear Administrator, An AttributeError occurred when I ran espei --input espei-in-2.yaml using the latest development version of ESPEI. Would you mind helping me check my dataset? Thanks. errorprint-log.txt verbosity-log.txt CO-CU-20201104.zip

    f:\users\zhang\pycalphad\pycalphad\codegen\callables.py:97: UserWarning: State variables in build_callables are not {N, P, T}, but {T, P}. This can lead to incorrectly calculated values if the state variables used to call the generated functions do not match the state variables used to create them. State variables can be added with the additional_statevars argument.
      "additional_statevars argument.".format(state_variables))
    Traceback (most recent call last):
      File "F:\Users\zhang\Anaconda32020\envs\espei2020test\Scripts\espei-script.py", line 33, in <module>
        sys.exit(load_entry_point('espei', 'console_scripts', 'espei')())
      File "f:\users\zhang\espei\espei\espei_script.py", line 311, in main
        run_espei(input_settings)
      File "f:\users\zhang\espei\espei\espei_script.py", line 260, in run_espei
        approximate_equilibrium=approximate_equilibrium,
      File "f:\users\zhang\espei\espei\optimizers\opt_base.py", line 36, in fit
        node = self.fit(symbols, datasets, *args, **kwargs)
      File "f:\users\zhang\espei\espei\optimizers\opt_mcmc.py", line 238, in fit
        self.predict(initial_guess, **ctx)
      File "f:\users\zhang\espei\espei\optimizers\opt_mcmc.py", line 289, in predict
        multi_phase_error = calculate_zpf_error(parameters=np.array(params), **zpf_kwargs)
      File "f:\users\zhang\espei\espei\error_functions\zpf_error.py", line 315, in calculate_zpf_error
        target_hyperplane = estimate_hyperplane(phase_region, parameters, approximate_equilibrium=approximate_equilibrium)
      File "f:\users\zhang\espei\espei\error_functions\zpf_error.py", line 186, in estimate_hyperplane
        grid = calculate(dbf, species, phases, str_statevar_dict, models, phase_records, pdens=500, fake_points=True)
      File "f:\users\zhang\espei\espei\shadow_functions.py", line 55, in calculate
        largest_energy=float(1e10), fake_points=fp)
      File "f:\users\zhang\pycalphad\pycalphad\core\calculate.py", line 190, in _compute_phase_values
        param_symbols, parameter_array = extract_parameters(parameters)
      File "f:\users\zhang\pycalphad\pycalphad\core\utils.py", line 361, in extract_parameters
        parameter_array_lengths = set(np.atleast_1d(val).size for val in parameters.values())
    AttributeError: 'NoneType' object has no attribute 'values'

    opened by duxiaoxian 4
  • Migrate pycalphad refdata to ESPEI

    Tracking from https://github.com/pycalphad/pycalphad/issues/120

    Assume that SGTE91Stable is correct per https://github.com/pycalphad/pycalphad/issues/120. Then we must

    • [x] Remove the metastable phases not present in the SGTE91 original paper
    • [ ] Check that remaining phases have correct descriptions
    opened by bocklund 4
  • MCMC Initialized chains should include initial point

    During the initialization of the chains for the MCMC optimizer, a Gaussian distribution about an initial point is taken. https://github.com/PhasesResearchLab/ESPEI/blob/7c797191d4c3178fe4a22275bbaee9c2977786ad/espei/optimizers/opt_mcmc.py#L98

    I would suggest including the initial point in that set of initial chains. If everything is set up correctly, this won't matter, but for cases where the standard deviation is too high while the initial guess is quite good, the current behavior will lead to a lot of bad starting points. Modifying the initial set to include the initial guess point should ensure that at least this state (or acceptable permutations of it) will survive the MCMC run. What do you think?
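
    For what it's worth, a minimal sketch of the suggested change (not ESPEI's actual initialization code), where the unperturbed initial guess replaces one of the Gaussian-perturbed walkers:

    import numpy as np

    rng = np.random.default_rng(0)
    initial_params = np.array([-10000.0, 5.0, 1.5])   # hypothetical parameter values
    std_deviation = 0.10 * np.abs(initial_params)
    nwalkers = 10

    chains = rng.normal(initial_params, std_deviation,
                        size=(nwalkers, initial_params.size))
    chains[0] = initial_params  # guarantee the initial guess itself is one walker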

    opened by toastedcrumpets 0
  • formatted_parameter broken by SymEngine

    Switching the symbolic backend to SymEngine broke espei.utils.formatted_parameter. Here's a test to validate (run from the tests directory for the testing_data module to be importable).

    # espei/tests/test_utils.py
    
    from pycalphad import Database
    from espei.utils import formatted_parameter, database_symbols_to_fit
    from .testing_data import CU_MG_TDB
    def test_cu_mg_parameters_can_be_formatted_to_strings():
        """Formating parameters should work for common variables parameters"""
        dbf = Database(CU_MG_TDB)
        for sym in database_symbols_to_fit(dbf):
            assert isinstance(formatted_parameter(dbf, sym), str), f"Formatted parameter for symbol {sym} (value = {dbf.symbols[sym]}) in database not a string"
    

    Running this gives an error:

    Traceback (most recent call last):
      File "/Users/bocklund1/src/calphad/espei/tests/dummy.py", line 11, in <module>
        test_cu_mg_parameters_can_be_formatted_to_strings()
      File "/Users/bocklund1/src/calphad/espei/tests/dummy.py", line 9, in test_cu_mg_parameters_can_be_formatted_to_strings
        assert isinstance(formatted_parameter(dbf, sym), str), f"Formatted parameter for symbol {sym} (value = {dbf.symbols[sym]}) in database not a string"
      File "/Users/bocklund1/src/calphad/espei/espei/utils.py", line 295, in formatted_parameter
        term = parameter_term(result['parameter'], symbol)
      File "/Users/bocklund1/src/calphad/espei/espei/utils.py", line 218, in parameter_term
        coeff, root = term_coeff.as_coeff_mul(symbol)
    AttributeError: 'symengine.lib.symengine_wrapper.Symbol' object has no attribute 'as_coeff_mul'
    

    I think the breakage might be because espei.utils.parameter_term isn't correctly picking up the first condition, since for the case of symbol being a symengine.lib.symengine_wrapper.Symbol, I think expression == symbol should evaluate to true, but evidently (via the traceback) it is evaluating to false.

    opened by bocklund 0
  • Memory leak when running MCMC in parallel

    Due to a known memory leak when instantiating subclasses of SymEngine Symbol objects (SymEngine is one of our upstream dependencies; see https://github.com/symengine/symengine.py/issues/379), running ESPEI with parallelization will cause memory to grow in each worker.

    Only running in parallel triggers significant memory growth, because running in parallel uses the pickle library to serialize and deserialize symbol objects, creating new objects that can't be freed. When running without parallelization (mcmc.scheduler: null), new symbols are not created.

    Until https://github.com/symengine/symengine.py/issues/379 is fixed, some mitigation strategies to avoid running out of memory are:

    • Run ESPEI without parallelization by setting scheduler: null
    • (Under consideration to implement): when parallelization is active, use an option to restart the workers every N iterations.
    • (Under consideration to implement): remove Model objects from the keyword arguments of ESPEI's likelihood functions. Model objects contribute a lot of symbol instances in the form of v.SiteFraction objects. We should be able to get away with only using PhaseRecord objects, but there are a few places that use Model.constituents to infer the sublattice model and internal degrees of freedom, which would need to be rewritten.
    opened by bocklund 1
  • Unable to use activity data in binary Fe-C with Graphite as reference state

    Hi,

    We are currently trying to use activity data for Fe-C. Lobo1976 measured the activity of C in alpha-iron relative to Graphite as the standard state, but we get erroneous results. (Lobo, Joseph A., and Gordon H. Geiger. "Thermodynamics and solubility of carbon in ferrite and ferritic Fe-Mo alloys." Metallurgical Transactions A 7.8 (1976): 1347-1357.)

    I have added the input file below. With this input file, we get chemical potential difference: [nan] (verbosity 3 output). Is the input file correct or are we missing something? I have had a look at the value of ref_result within activity_error.py and it gives only nan results for the specified reference state. Graphite only has C as a component, and an equilibrium calculation of Graphite specifying v.X('C') gives the error Number of dependent components different from one. Can this cause an error here as well? Used versions: espei 0.8.6 and pycalphad 0.9.2. I have attached a zip file with the TDB file and ESPEI input files which reproduces this behaviour.

    Thank you for your help, Tobias

    {
            "components": ["FE", "C", "VA"],
            "phases": ["BCC_A2", "GRAPHITE"],
            "weight": 1000,
            "reference_state": {
                    "phases": ["GRAPHITE"],
                    "conditions": {
                            "P": 101325,
                            "T": 1056.15,
                            "X_C": 1
    
                    }
            },
            "conditions": {
                    "P": 101325,
                    "T": 1056.15,
                    "X_C": [0.00013017]
            },
            "output": "ACR_C",
            "values": [[[0.087]]
                    ],
            "reference": "Lobo1976_1056K",
            "meta_data": {
                    "DOI": "10.1007/BF02658820",
                    "literature reference": "Thermodynamics and Solubility of Carbon in Ferrite and Ferritic Fe-Mo Alloys",
                    "table/figure": "table 1",
                    "measured data": "C-activity in Alpha-Iron",
                    "experimental details": "not available",
                    "weight": "default"
            }
    }
    

    minimal_example.zip
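
    For context, a hedged illustration of how the GRAPHITE reference equilibrium would normally be set up in pycalphad for a single-component phase, without a composition condition (the TDB filename is a placeholder for the file in the attached archive):

    from pycalphad import Database, equilibrium, variables as v

    dbf = Database('Fe-C.tdb')  # placeholder name for the attached TDB
    # GRAPHITE is pure C, so only N, P, T conditions are needed
    eq = equilibrium(dbf, ['C', 'VA'], ['GRAPHITE'],
                     {v.N: 1, v.P: 101325, v.T: 1056.15})
    print(eq.GM.values)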

    opened by tobiasspt 1
  • ENH: Allow multiple datasets directories to be specified in YAML input

    Sometimes it is useful to load datasets from different filesystem locations, for example if one folder contains hand-curated data and another contains automatically generated data.

    In code, it would be pretty simple to handle this. Instead of

    from espei.datasets import load_datasets, recursive_glob
    directory = '/path/to/directory/'
    load_datasets(recursive_glob(directory))
    

    we could do

    from itertools import chain
    from espei.datasets import load_datasets, recursive_glob
    directories = ['/path/to/directory_1/', '/path/to/directory_2/']
    load_datasets(chain(*map(recursive_glob, directories)))
    
    opened by bocklund 1
Releases: 0.8.9
Owner

Phases Research Lab
Research group led by Dr. Zi-Kui Liu at The Pennsylvania State University.