chalcogenide-perovskite-screening

ML-guided screening of chalcogenide perovskites as solar energy materials

Overview

Chalcogenide perovskites (ABX₃, X = S²⁻, Se²⁻) have emerged as promising absorber materials for next-generation photovoltaic devices, yet their experimental realization remains limited by competing phases, structural polymorphism, and synthetic challenges.

This repository presents a fully data-driven and experimentally grounded screening framework to assess the stability and experimental feasibility of chalcogenide perovskites, integrating interpretable analytical descriptors, machine-learning models, and sustainability metrics.

D. A. Garzón, L. Himanen, L. Andrade, S. Sadewasser, J. A. Márquez, "ML-guided screening of chalcogenide perovskites as solar energy materials", arXiv:2602.21812 (2026). https://arxiv.org/abs/2602.21812

Screening Pipeline

The framework chains five complementary computational and data-driven methods:

[Figure: Screening pipeline workflow]

| Step | Method | Purpose |
|------|--------|---------|
| 1 | SISSO-derived tolerance factor (τ*) | Interpretable analytical descriptor for perovskite structural stability |
| 2 | CrystaLLM crystal structure generation | Generative prediction to validate corner-sharing perovskite-type topology |
| 3 | GCNN crystal-likeness scoring | Experimental plausibility and synthesizability assessment |
| 4 | CrabNet bandgap estimation | Composition-based prediction trained on experimental data |
| 5 | Sustainability analysis | Multi-objective ranking using HHI, ESG scores, and supply risk metrics |
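
As background for Step 1, the classical Goldschmidt tolerance factor (the baseline that τ* is compared against) can be computed from ionic radii alone. The following is a minimal sketch, not code from this repository: the radii are approximate Shannon values chosen for illustration, and the function name `goldschmidt_t` is hypothetical. The SISSO-derived τ* expression is defined in the paper and is not reproduced here.

```python
import math

# Approximate Shannon ionic radii in angstroms (illustrative assumption,
# not read from this repository's data files).
RADII = {
    "Ba2+": 1.61,  # A site, 12-coordinate
    "Zr4+": 0.72,  # B site, 6-coordinate
    "S2-": 1.84,   # X site, 6-coordinate
}


def goldschmidt_t(r_a: float, r_b: float, r_x: float) -> float:
    """Classical Goldschmidt factor: t = (r_A + r_X) / (sqrt(2) * (r_B + r_X))."""
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))


# Example: the chalcogenide perovskite BaZrS3.
t_bazrs3 = goldschmidt_t(RADII["Ba2+"], RADII["Zr4+"], RADII["S2-"])
print(f"BaZrS3: t = {t_bazrs3:.3f}")
```

With these radii, BaZrS3 gives t of roughly 0.95. Values of t close to 1 are conventionally associated with perovskite formation, but the exact window depends on the radii set and the chemistry, which is part of the motivation for a fitted descriptor such as τ*.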

Getting Started

Installation & setup

Set up the environment with uv or pip, configure API keys, and prepare the Jupyter kernel for running the analysis pipeline.

Pipeline Notebooks

Step-by-step analysis

Run the full screening pipeline, from SISSO feature engineering through CrystaLLM validation, CrabNet bandgap prediction, and sustainability ranking.

References & Citation

Methods and attribution

Key references for the methods used in this pipeline, BibTeX citation, acknowledgements, and license information.

Key Features

  • Interpretable Descriptors: SISSO-derived tolerance factor (τ*) outperforming the classical Goldschmidt tolerance factor on experimental data
  • Generative Structure Validation: CrystaLLM crystal structure generation to confirm perovskite-type topology
  • ML Bandgap Prediction: CrabNet composition-based models trained on experimental halide perovskite and chalcogenide semiconductor data
  • Sustainability Metrics: Multi-objective ranking integrating element supply concentration (Herfindahl-Hirschman Index, HHI), ESG risk, and supply chain metrics
  • Synthesizability Scoring: GCNN-based crystal-likeness assessment for experimental plausibility
  • Fully Reproducible: All data, code, and trained models are openly available under the MIT License
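
To illustrate the HHI term in the sustainability metrics, the sketch below computes the Herfindahl-Hirschman Index as the sum of squared percentage shares, then aggregates elemental values into a compound-level score using a weight-fraction-weighted mean. The weighting scheme is one common convention and is an assumption here, not necessarily the one this pipeline uses. The function names and all numbers are hypothetical placeholders.

```python
def hhi(shares_percent):
    """Herfindahl-Hirschman Index: sum of squared shares, with shares in percent."""
    if abs(sum(shares_percent) - 100.0) > 1e-6:
        raise ValueError("shares must sum to 100 percent")
    return sum(s ** 2 for s in shares_percent)


def composition_hhi(weight_fractions, element_hhi):
    """Weight-fraction-weighted mean of elemental HHI values (assumed convention)."""
    if abs(sum(weight_fractions.values()) - 1.0) > 1e-6:
        raise ValueError("weight fractions must sum to 1")
    return sum(w * element_hhi[el] for el, w in weight_fractions.items())


# Hypothetical producer shares for one element (percent of global production).
example_shares = [50, 30, 20]
element_score = hhi(example_shares)

# Hypothetical two-element compound with placeholder elemental HHI values.
compound_score = composition_hhi({"A": 0.5, "B": 0.5}, {"A": 2000, "B": 4000})

print(element_score, compound_score)
```

On this scale, a single producer holding the entire market gives the maximum value of 10000, and lower scores indicate a more diversified supply.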

Citation

@misc{garzon2026mlguided,
  title   = {{ML-guided screening of chalcogenide perovskites as solar energy materials}},
  author  = {Garz{\'o}n, Diego A. and Himanen, Lauri and Andrade, Luisa
             and Sadewasser, Sascha and M{\'a}rquez, Jos{\'e} A.},
  year    = {2026},
  eprint  = {2602.21812},
  archivePrefix = {arXiv},
  primaryClass  = {cond-mat.mtrl-sci},
  doi     = {10.48550/arXiv.2602.21812},
  url     = {https://arxiv.org/abs/2602.21812},
}