Researchers at UCLA have developed a novel mathematical theorem to revolutionize the training of large-scale artificial neural networks (ANNs).
Artificial neural networks (ANNs) have gained popularity in recent years due to their exceptional performance and applicability to a wide array of machine learning applications. ANNs digitally mimic the structure and behavior of brain tissue by creating an interconnected network of simple processing units, termed neurons.
The size of an ANN increases with the complexity of the application and the desired degree of accuracy. Pivotal tasks such as medical image diagnosis, biometric security, and autonomous driving are extremely complex and require a high degree of accuracy, so they call for large networks. Every ANN must also be trained before it can execute its task, and the cost of that training grows with the size of the network.
The current gold standard for training, known as steepest descent, is ineffective at training large-scale networks. Second-order methods can train ANNs much more effectively, but their use is limited to small and medium-sized networks by the limits of current computational technology. Effective training of large-scale ANNs would have an immense effect on the advancement of artificial intelligence.
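The trade-off between steepest descent and second-order methods can be seen on a toy problem. The sketch below is only an illustration of the two standard approaches, not the UCLA method: it minimizes a hypothetical two-parameter quadratic, f(w) = 0.5 w^T A w - b^T w, where the matrix `A`, the vector `b`, the learning rate, and the tolerance are all assumed values chosen for the example. Steepest descent uses only the gradient and takes many small steps, while a Newton step also uses the second derivatives (the Hessian, here `A`) and reaches the minimum of a quadratic in a single step.

```python
import math

# Hypothetical toy problem (not from the listing):
# minimize f(w) = 0.5 * w^T A w - b^T w, whose minimum is at w = [1, 1].
A = [[10.0, 0.0], [0.0, 1.0]]   # Hessian of f (assumed, mildly ill-conditioned)
b = [10.0, 1.0]

def gradient(w):
    """Gradient of f at w, i.e. A w - b."""
    return [A[0][0] * w[0] + A[0][1] * w[1] - b[0],
            A[1][0] * w[0] + A[1][1] * w[1] - b[1]]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def steepest_descent(w, lr=0.1, tol=1e-6, max_iter=10_000):
    """First-order method: step against the gradient with a fixed rate."""
    for it in range(max_iter):
        g = gradient(w)
        if norm(g) < tol:
            return w, it
        w = [w[0] - lr * g[0], w[1] - lr * g[1]]
    return w, max_iter

def newton(w, tol=1e-6, max_iter=100):
    """Second-order method: rescale the gradient by the inverse Hessian."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    inv = [[A[1][1] / det, -A[0][1] / det],
           [-A[1][0] / det, A[0][0] / det]]
    for it in range(max_iter):
        g = gradient(w)
        if norm(g) < tol:
            return w, it
        step = [inv[0][0] * g[0] + inv[0][1] * g[1],
                inv[1][0] * g[0] + inv[1][1] * g[1]]
        w = [w[0] - step[0], w[1] - step[1]]
    return w, max_iter

w_sd, sd_iters = steepest_descent([0.0, 0.0])
w_nt, nt_iters = newton([0.0, 0.0])
print("steepest descent iterations:", sd_iters)
print("Newton iterations:", nt_iters)
```

The price of the faster convergence is the Hessian itself: for a network with n trainable parameters it is an n-by-n matrix, so storing and inverting it directly becomes impractical as n grows, which is the limitation on second-order methods described above.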
Researchers in the UCLA Department of Chemistry and Biochemistry have developed a novel mathematical theorem, and an algorithm based on it, to rapidly train large-scale ANNs. The algorithm prevents the exponential increase of computational cost with the size of the ANN. As a proof of concept, ANNs were trained on a variety of benchmark applications using steepest descent, standard second-order methods, other state-of-the-art methods, and this novel method.
The new algorithm consistently performed training operations at least 10 times faster than the other methods. This increased efficiency enables the training of networks with higher complexity and more neurons than is currently possible with existing training algorithms and computational technology.
Any application that uses artificial neural networks.
| United States of America | Published Application | 20200285956 | 09/10/2020 | 2017-730 |
Artificial intelligence, neural networks, machine learning, training, algorithms, software, big data, computational complexity, speech recognition, image processing, automated driving, robotics, automation