Advances in Deep Learning Technology

Deep learning (DL) is rapidly becoming one of the most prominent topics in materials data science, with a quickly expanding array of applications spanning atomistic (that is, treating a system as a collection of discrete units rather than as a whole), image-based, spectral, and textual data types. DL facilitates the examination of unstructured data and the automated identification of features and correlations, such as those underlying a consumer’s buying decisions.

The recent creation of extensive materials databases has particularly propelled the use of DL techniques in atomistic predictions. Conversely, progress in image and spectral data has primarily utilised synthetic data generated by high-quality forward models and generative unsupervised DL methods.

For each data type, we explore applications that include both theoretical and experimental data, common modelling strategies along with their advantages and drawbacks, as well as pertinent publicly accessible software and datasets. We wrap up the review by discussing recent interdisciplinary work on uncertainty quantification in this area and offer a brief outlook on limitations, challenges, and possible growth opportunities for DL techniques within materials science.

“Processing-structure-property-performance” is a central principle in Materials Science and Engineering (MSE). The length and time scales associated with material structures and phenomena differ considerably among these four components, adding further complexity. For example, structural data can range from precise atomic coordinates of elements, to the microscale distribution of phases (microstructure), to the connectivity of fragments (mesoscale), through to images and spectra. Drawing connections between these components presents a significant challenge, and both experimental and computational methods are valuable in uncovering such relationships, including those that drive product purchase selection. With the rapid advancement in the automation of experimental instruments and the massive growth in computational capabilities, the volume of public materials datasets has surged exponentially.

Numerous large experimental and computational datasets have emerged through the Materials Genome Initiative (MGI) and the increasing embrace of Findable, Accessible, Interoperable, Reusable (FAIR) principles. This surge of data necessitates automated analysis, which can be facilitated by machine learning (ML) approaches.

DL applications are swiftly replacing traditional systems in various facets of everyday life, such as image and speech recognition, online searches, fraud detection, email/spam filtering, and financial risk assessment, among others. DL techniques have demonstrated their ability to introduce exciting new functionalities across numerous domains (including playing Go, autonomous vehicles, navigation, chip design, particle physics, protein science, drug discovery, astrophysics, object recognition, and more). Recently, DL methods have begun to surpass other machine learning approaches in multiple scientific disciplines, including chemistry, physics, biology, and materials science.

The application of DL in MSE remains relatively novel, and the field has yet to fully realise its potential, implications, and limitations. DL offers innovative methods for exploring material phenomena and has encouraged materials scientists to broaden their conventional toolkit. DL techniques have proven to serve as a complementary strategy to physics-based methods in materials design. Although large datasets are often regarded as essential for successful DL applications, strategies such as transfer learning, multi-fidelity modelling, and active learning can frequently render DL applicable for smaller datasets as well.

General machine learning concepts

Although deep learning (DL) techniques offer numerous benefits, they also come with drawbacks, the most prominent being their opaque nature, which can obstruct our understanding of the physical processes being studied. Enhancing the interpretability and explainability of DL models continues to be a vibrant area of investigation.

Typically, a DL model comprises thousands to millions of parameters, complicating the interpretation of the model and the direct extraction of scientific insights. While there have been several commendable recent reviews on machine learning (ML) applications, the rapid progress of DL in materials science and engineering (MSE) necessitates a focused review to address the surge of research in this area.

Artificial intelligence (AI) refers to the creation of machines and algorithms that replicate human intelligence, for example, by optimising actions to achieve specific objectives. Machine learning (ML) is a branch of AI that enables systems to learn from data without being explicitly programmed for a particular dataset, such as in chess playing or social network recommendations. Deep learning (DL) is a further subset of ML that draws inspiration from biological brains and employs multilayer neural networks to tackle ML tasks.

Commonly used ML techniques include linear regression, decision trees, and random forests, where generalised models are trained to determine coefficients, weights, or parameters for a specific dataset. When applying traditional ML methods to unstructured data (like pixels or features from images, sounds, text, and graphs), challenges arise because users must first extract generalised, meaningful representations or features on their own (for instance, calculating the pair distribution for an atomic structure) before training the ML models. This makes the process labor-intensive, fragile, and difficult to scale.
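To make that workflow concrete, here is a minimal Python sketch in which the features are hand-crafted summary statistics computed before an off-the-shelf random forest is fitted; the extract_features helper and the toy data are purely illustrative assumptions, not part of any particular materials or ecommerce pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def extract_features(raw_signal: np.ndarray) -> np.ndarray:
    """Hand-engineered summary statistics standing in for domain features."""
    return np.array([raw_signal.mean(), raw_signal.std(),
                     raw_signal.max(), raw_signal.min()])

rng = np.random.default_rng(0)
raw_data = [rng.normal(size=128) for _ in range(100)]   # toy unstructured signals
X = np.stack([extract_features(s) for s in raw_data])   # the manual featurisation step
y = rng.normal(size=100)                                # placeholder target values

model = RandomForestRegressor(n_estimators=100).fit(X, y)
print(model.predict(X[:3]))
```

The fragility lies in extract_features: every new data type needs its own hand-built representation, which is exactly the step DL automates.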

This is where deep learning (DL) techniques gain significance. DL methods rely on artificial neural networks and related techniques. According to the “universal approximation theorem,” neural networks can approximate any function with arbitrary precision. However, it is crucial to recognise that this theorem does not ensure that these functions can be learned effortlessly.
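As a minimal sketch, assuming PyTorch (one of the frameworks named later in this article), the snippet below stacks linear layers and nonlinear activations into a small multilayer perceptron and runs a single forward pass on dummy data; the layer sizes are arbitrary illustrative choices.

```python
import torch
import torch.nn as nn

# Stack linear layers with nonlinear activations; sizes are illustrative only.
model = nn.Sequential(
    nn.Linear(8, 64),   # 8 input features -> 64 hidden units
    nn.ReLU(),          # nonlinearity between layers
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 1),   # single regression output
)

x = torch.randn(32, 8)   # a batch of 32 dummy inputs
y_pred = model(x)        # forward pass; the network has not been trained yet
print(y_pred.shape)      # torch.Size([32, 1])
```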

Neural networks

A perceptron, also known as a single artificial neuron, is the fundamental unit of artificial neural networks (ANNs) and facilitates the forward transmission of information. For a collection of inputs [x1, x2, …, xm] fed to the perceptron, we assign real-valued weights [w1, w2, …, wm] (plus a bias to shift the result), multiply each weight by its corresponding input, and sum the products to produce a cumulative weighted sum.
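A minimal sketch of that computation, with illustrative numbers and a sigmoid chosen arbitrarily as the activation, might look like this:

```python
import numpy as np

def perceptron(x: np.ndarray, w: np.ndarray, b: float) -> float:
    z = np.dot(w, x) + b                # cumulative weighted sum plus bias
    return 1.0 / (1.0 + np.exp(-z))     # sigmoid activation squashes z into (0, 1)

x = np.array([0.5, -1.2, 3.0])          # inputs [x1, x2, ..., xm]
w = np.array([0.8, 0.1, -0.4])          # real-valued weights [w1, w2, ..., wm]
print(perceptron(x, w, b=0.2))
```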

Some widely used software frameworks for training neural networks include PyTorch, TensorFlow, and MXNet. It is important to acknowledge that certain commercial devices, instruments, or materials are referenced in this document to clarify the experimental methodology. This mention is not meant to suggest any endorsement or recommendation by NIST, nor does it imply that the identified materials or equipment are definitively the best options available for the intended purpose.

Activation function

Activation functions (including the sigmoid, hyperbolic tangent (tanh), rectified linear unit (ReLU), leaky ReLU, and Swish) are essential nonlinear elements that allow neural networks to combine numerous simple components to learn intricate nonlinear functions. For instance, the sigmoid activation function maps real numbers to the interval (0, 1); it is frequently used in the final layer of binary classifiers to represent probabilities.
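Written out directly in NumPy, a sketch of the activation functions named above could look as follows; the Swish shown is the common beta = 1 variant.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))          # maps reals to (0, 1)

def tanh(x):
    return np.tanh(x)                         # maps reals to (-1, 1)

def relu(x):
    return np.maximum(0.0, x)                 # zero for negative inputs

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)      # small slope for negative inputs

def swish(x):
    return x * sigmoid(x)                     # Swish with beta = 1

x = np.linspace(-3.0, 3.0, 7)
for fn in (sigmoid, tanh, relu, leaky_relu, swish):
    print(fn.__name__, np.round(fn(x), 3))
```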

The selection of an activation function can influence both the efficiency of training and the final accuracy.

Loss function, gradient descent, and normalisation

The weight matrices of a neural network are either initialised randomly or derived from a pre-trained model. These weight matrices interact with the input matrix (or the output of a previous layer) and are passed through a nonlinear activation function to generate updated representations, commonly referred to as activations or feature maps. The loss function (sometimes called the objective function or empirical risk) is computed by comparing the neural network’s output with the known target values.

Typically, the weights of the network are iteratively adjusted using stochastic gradient descent algorithms to reduce the loss function until the desired level of accuracy is reached. Most contemporary deep learning frameworks support this process by employing reverse-mode automatic differentiation to calculate the partial derivatives of the loss function with respect to each network parameter through repeated application of the chain rule. This process is informally known as back-propagation. Common gradient descent algorithms include Stochastic Gradient Descent (SGD), Adam, and Adagrad, among others.
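A minimal PyTorch sketch of this loop, using dummy data and illustrative sizes, is shown below; the choice of Adam over SGD or Adagrad here is arbitrary.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # SGD or Adagrad work similarly
loss_fn = nn.MSELoss()

x = torch.randn(256, 8)   # dummy inputs
y = torch.randn(256, 1)   # dummy known targets

for epoch in range(100):
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(x), y)  # compare network output with the targets
    loss.backward()              # back-propagation via reverse-mode autodiff
    optimizer.step()             # gradient-descent update of the weights
```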

The learning rate is a crucial parameter in gradient descent. Except for SGD, all of these methods adaptively tune the learning rate during training. Depending on the specific goal, whether classification or regression, various loss functions, such as Binary Cross Entropy (BCE), Negative Log Likelihood (NLL), or Mean Squared Error (MSE), are employed. Typically, the inputs of a neural network are scaled, that is, normalised to have zero mean and unit standard deviation. Scaling is also applied to the inputs of hidden layers (through batch or layer normalisation) to enhance the stability of artificial neural networks.
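As a brief sketch, the snippet below standardises the inputs by hand, adds a batch-normalisation layer for the hidden activations, and instantiates the loss functions mentioned above; all sizes and values are illustrative assumptions.

```python
import torch
import torch.nn as nn

x = torch.randn(256, 8) * 5.0 + 3.0        # raw inputs with arbitrary scale
x = (x - x.mean(dim=0)) / x.std(dim=0)     # scale to zero mean, unit standard deviation

model = nn.Sequential(
    nn.Linear(8, 32),
    nn.BatchNorm1d(32),   # normalises hidden-layer activations across the batch
    nn.ReLU(),
    nn.Linear(32, 1),
)

mse = nn.MSELoss()              # Mean Squared Error, typical for regression
bce = nn.BCEWithLogitsLoss()    # Binary Cross Entropy on logits, for binary classification
nll = nn.NLLLoss()              # Negative Log Likelihood, for multi-class outputs
print(model(x).shape)           # torch.Size([256, 1])
```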
