TY - GEN
T1 - Mining the critical conditions for new hypotheses of materials from historical reaction data
AU - Ouyang, Zhenchao
AU - Liu, Yu
AU - Niu, Jianwei
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/12/4
Y1 - 2018/12/4
AB - New findings in materials science often carry a high research cost, for two reasons. First, the chemical reaction process needs continuous optimization and may consume large amounts of valuable reactants and apparatus during daily experiments. Second, the success of a designed experiment relies heavily on the researchers' experience. With the launch of the Materials Genome Initiative (MGI), researchers have begun to record historical reaction data and to seek new solutions via computational techniques such as data mining and machine learning. In this paper, we study the reaction data of inorganic-organic hybrid materials from the Dark Reaction Project at Haverford College using simple machine learning algorithms (Bayes Net, SVM and C4.5), ensemble learning models (Random Forest, Stacking, Gradient Boosting Decision Tree (GBDT) and XGBoost), and deep neural network models. Besides the accuracy of the prediction models, we also use different ranking algorithms to analyze which reaction conditions are chemically important. Through a series of evaluations, we find that a well-designed stacking-based ensemble learning model reaches the highest prediction accuracy, 61% (8% higher than GBDT and 5% higher than XGBoost), on the top-50 subsets based on 'symmetrical uncertainty ranking', evaluated on a standalone data set that had not previously been used in the Dark Reaction Project.
KW - Chemical reaction
KW - Dark Reaction Project
KW - Ensemble learning
KW - Machine Learning
KW - Materials
UR - https://www.scopus.com/pages/publications/85060307537
U2 - 10.1109/SmartWorld.2018.00087
DO - 10.1109/SmartWorld.2018.00087
M3 - Conference contribution
AN - SCOPUS:85060307537
T3 - Proceedings - 2018 IEEE SmartWorld, Ubiquitous Intelligence and Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People and Smart City Innovations, SmartWorld/UIC/ATC/ScalCom/CBDCom/IoP/SCI 2018
SP - 316
EP - 322
BT - Proceedings - 2018 IEEE SmartWorld, Ubiquitous Intelligence and Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People and Smart City Innovations, SmartWorld/UIC/ATC/ScalCom/CBDCom/IoP/SCI 2018
A2 - Loulergue, Frederic
A2 - Wang, Guojun
A2 - Bhuiyan, Md Zakirul Alam
A2 - Ma, Xiaoxing
A2 - Li, Peng
A2 - Roveri, Manuel
A2 - Han, Qi
A2 - Chen, Lei
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th IEEE SmartWorld, 15th IEEE International Conference on Ubiquitous Intelligence and Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People and Smart City Innovations, SmartWorld/UIC/ATC/ScalCom/CBDCom/IoP/SCI 2018
Y2 - 7 October 2018 through 11 October 2018
ER -