Transformer-based models for identifying alloy properties
Researchers develop AlloyBert, a modeling tool designed to predict the properties of alloys.
Identifying alloy properties can be expensive and time-consuming. Experiments involving alloys often require significant resources, and calculating alloy properties can become extremely complicated given the seemingly endless number of possible configurations. Alloy properties can be determined using Density Functional Theory (DFT) calculations; however, this method has limited applicability and can be extremely time-consuming for particularly complex alloys. Amir Barati Farimani and his team aim to reduce both the time and cost of this process, and their recent work has led to the creation of AlloyBert, a modeling tool designed to predict the properties of alloys.
AlloyBert is a transformer-based model, meaning researchers can input simple English-language descriptors to obtain their desired output. Descriptors can include information such as the temperature at which the alloy was processed or its chemical composition. AlloyBert then uses this information to predict either the elastic modulus or the yield strength of the alloy.
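To make the input format concrete, here is a minimal sketch of how structured alloy data might be turned into the kind of free-text English descriptor described above. The function name, field names, and phrasing are illustrative assumptions, not part of the AlloyBert codebase.

```python
# Hypothetical sketch: composing an English-language descriptor of an
# alloy sample, the style of flexible input a transformer model such
# as AlloyBert accepts. All names and wording here are illustrative.

def make_descriptor(composition: dict, processing_temp_c: float) -> str:
    """Build a plain-English description from composition and processing data."""
    parts = ", ".join(f"{pct}% {element}" for element, pct in composition.items())
    return (f"This alloy is composed of {parts}. "
            f"It was processed at a temperature of {processing_temp_c} C.")

descriptor = make_descriptor({"Fe": 70, "Cr": 20, "Ni": 10}, 850.0)
print(descriptor)
```

A descriptor like this would then be tokenized and passed to the model, which maps the text to a numeric property prediction such as elastic modulus or yield strength.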
Barati Farimani, an associate professor of mechanical engineering, and his team designed AlloyBert specifically to reduce the time and cost usually required to identify alloy properties. Most machine learning models require users to input their information in extremely precise wording, which is a time-consuming process. By making AlloyBert a transformer-based model, users can be more flexible with their inputs.
“We wanted a model that can easily get specific physical properties without being overly concerned with what information we have and whether it is in a specific format,” says Akshat Chaudhari, a master’s student in materials science and engineering. “Accurate information and formatting are still important, but AlloyBert allows for a much higher level of flexibility.”
AlloyBert’s foundational model is RoBERTa, a pre-existing transformer encoder. RoBERTa was chosen for its self-attention mechanism, a feature that allows the model to judge the importance of specific words in a sentence. The team fine-tuned RoBERTa on two datasets of alloy properties to produce AlloyBert. The results of the study indicate that transformer models can serve as effective tools for predicting alloy properties.
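The self-attention mechanism mentioned above can be sketched in a few lines of NumPy. This is a toy, single-head version of scaled dot-product attention for illustration only; RoBERTa uses many such heads stacked in layers, with learned weights.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence of token vectors x.

    Each row of the attention matrix is a probability distribution that
    weighs how much every token contributes to another token's updated
    representation -- the mechanism that lets an encoder like RoBERTa
    judge the importance of specific words in a sentence.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])        # query-key similarity
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax: rows sum to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                        # 4 tokens, 8-dim embeddings
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out, attn = self_attention(x, w_q, w_k, w_v)
print(attn.sum(axis=-1))                           # each row sums to 1
```

In fine-tuning, these attention weights (and the rest of the encoder) are adjusted so that the representation of an alloy descriptor predicts the target property well.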
AlloyBert currently has two deviations that the team hopes to further investigate. The accuracy of AlloyBert’s predictions is not always consistent with the level of detail of the input. The team anticipated that the more information they provided AlloyBert, the more accurate the output would be. However, their experiments indicated that in some cases, inputting the least amount of data resulted in the most accurate output. The team posits this may be due to AlloyBert’s training being limited to two datasets. “Training the model on a very large corpus may give more consistent results,” notes Chaudhari.
The second deviation emerged from the team’s two training strategies: one that involved pre-training the model before fine-tuning it, and another that involved fine-tuning alone. The team hypothesized that the method employing both pre-training and fine-tuning would produce more accurate outputs. Their hypothesis was largely supported, but in one out of eight cases in each dataset, fine-tuning alone produced better results. The team suspects this deviation might stem from their pre-training using a Masked Language Model (MLM) objective. Future studies may employ alternate pre-training approaches.
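The MLM objective mentioned above can be illustrated with a toy masking function: a fraction of input tokens is hidden, and the model is trained to recover them. This is a simplified sketch of the general RoBERTa-style masking idea, not the team’s actual pre-training code.

```python
import random

MASK = "<mask>"

def mlm_mask(tokens, mask_prob=0.15, seed=0):
    """Randomly hide a fraction of tokens for masked-language-model training.

    The model's pre-training task is to predict each hidden token from its
    surrounding context. A toy illustration of the MLM objective; real
    implementations add refinements such as random-token replacement.
    """
    rng = random.Random(seed)
    masked, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets.append(tok)   # the model must recover this token
        else:
            masked.append(tok)
    return masked, targets

tokens = "the alloy was processed at 850 C".split()
masked, targets = mlm_mask(tokens, mask_prob=0.3)
print(masked, targets)
```

Because this objective teaches the encoder to fill in missing words rather than to predict numeric properties directly, it is one plausible reason pre-training did not always help on the regression task.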
Overall, this study and AlloyBert’s development have opened the door to a number of possibilities. In addition to investigating the two deviations mentioned above, AlloyBert’s code can be further developed to characterize materials beyond alloys. Farimani’s team also envisions a model that performs the reverse operation of AlloyBert: given an alloy property as input, it would identify the elemental composition that produces it.
Transformer-based models in general are proving to be a potentially valuable tool for future scientific research. “For scientific uses, you need concrete, accurate answers, and the existing research shows that there is a good scope for that. These models can be trained in such a way to give better results than existing methods,” Chaudhari explains.
AlloyBert’s software is now accessible on GitHub.
Mechanical engineering Ph.D. student Chakradhar Guntuboina and MSE alum Hongshuo Huang, who is currently a Ph.D. student at the University of Michigan, were also part of the team that developed AlloyBert.