A New Optimizer for Image Classification using Wide ResNet (WRN)
DOI: https://doi.org/10.25007/ajnu.v9n4a858
Abstract
A new optimization algorithm called Adam Merged with AMSgrad (AMAMSgrad) is modified and used to train a Wide Residual Network (Wide ResNet, WRN), a type of convolutional neural network, for image classification. The modification uses the second moment as in AMSgrad and the Adam update rule, but with and (2) as the power of the denominator. The main aim is to improve the performance of the AMAMSgrad optimizer through a proper selection of and the power of the denominator. Applying AMAMSgrad and the two established methods (Adam and AMSgrad) to the Wide ResNet on the CIFAR-10 dataset shows that WRN performs better with the AMAMSgrad optimizer than with Adam or AMSgrad. Training, validation, and testing accuracies all improve with AMAMSgrad over Adam and AMSgrad, and AMAMSgrad needs fewer epochs to reach maximum performance. With AMAMSgrad, the training accuracies are 90.45%, 97.79%, 99.98%, and 99.99% at epochs 60, 120, 160, and 200, respectively, while the validation accuracies at the same epochs are 84.89%, 91.53%, 95.05%, and 95.23%. In testing, the WRN with AMAMSgrad reached an overall accuracy of 94.8%. All of these accuracies exceed those obtained by WRN with Adam and AMSgrad. The classification metrics indicate that the given WRN architecture performs well with all three optimizers and with high confidence, especially with the AMAMSgrad optimizer.
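The abstract names two published ingredients: the Adam update rule and the AMSgrad treatment of the second moment. The NumPy sketch below illustrates only those two ingredients and is not the AMAMSgrad update itself, whose exact denominator settings are not recoverable from the abstract. The names `make_state`, `adaptive_step`, `use_max`, and `minimize_quadratic` are hypothetical, and bias correction is applied in both modes, which is an assumption borrowed from common framework implementations (the original AMSgrad formulation omits it).

```python
import numpy as np


def make_state(shape):
    """Optimizer state: step count, first moment, second moment, running max."""
    return {
        "t": 0,
        "m": np.zeros(shape),
        "v": np.zeros(shape),
        "v_max": np.zeros(shape),
    }


def adaptive_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999,
                  eps=1e-8, use_max=False):
    """One Adam-style step; use_max=True switches to the AMSgrad second moment."""
    state["t"] += 1
    t = state["t"]

    # Exponential moving averages of the gradient and the squared gradient.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2

    if use_max:
        # AMSgrad idea: never let the second-moment estimate decrease.
        state["v_max"] = np.maximum(state["v_max"], state["v"])
        v = state["v_max"]
    else:
        v = state["v"]

    # Bias correction (assumption: applied in both modes).
    m_hat = state["m"] / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)

    # Adam update rule: square root of the second moment plus eps below the line.
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps)


def minimize_quadratic(use_max, steps=500, lr=0.05):
    """Toy problem f(theta) = ||theta||^2, used only to exercise the update."""
    theta = np.array([3.0, -2.0])
    state = make_state(theta.shape)
    for _ in range(steps):
        grad = 2.0 * theta  # gradient of ||theta||^2
        theta = adaptive_step(theta, grad, state, lr=lr, use_max=use_max)
    return theta


if __name__ == "__main__":
    print("Adam-style:   ", minimize_quadratic(use_max=False))
    print("AMSgrad-style:", minimize_quadratic(use_max=True))
```

The toy quadratic only checks that both variants move toward the minimum; it says nothing about the CIFAR-10 results reported above.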
License
Authors retain copyright
The use of a Creative Commons License enables authors/editors to retain copyright to their work. Publications can be reused and redistributed as long as the original author is correctly attributed.
- Copyright
- The researcher(s), whether the paper is single-authored or joint, must sell and transfer to the publisher (the Academic Journal of Nawroz University) the exclusive rights to the research paper/article for the full duration of publication, starting from the date this Agreement enters into force. These rights include the right to translate, reuse, transmit, or distribute the paper/article, and to use the material or any part(s) of it for publication in scientific, academic, technical, or professional journals or any other periodicals, including any works derived from them, anywhere in the world, in English and Arabic, whether in print or electronic editions of such journals and periodicals, in all types of media or formats that exist now or may exist in the future. The rights also include licensing (or granting permission to) a third party to use the materials and any works derived from them and to publish them in such journals and periodicals anywhere in the world. The transfer under this Agreement includes the right to modify such materials for use with computer systems and software, to reproduce or publish them in electronic formats, and to incorporate them into retrieval systems.
- Reproduction, reference, transmission, distribution or any other use of the content, or any parts of the subjects included in that content in any manner permitted by this Agreement, must be accompanied by mentioning the source which is (the Academic Journal of Nawroz University) and the publisher in addition to the title of the article, the name of the author (or co-authors), journal’s name, volume or issue, publisher's copyright, and publication year.
- The Academic Journal of Nawroz University reserves all rights to publish research papers/articles issued under a Creative Commons License (CC BY-NC-ND 4.0), which permits unrestricted use, distribution, and reproduction of the paper/article by any means, provided that the original work is correctly cited.
- Reservation of Rights
The researcher(s) retains all intellectual property rights except those transferred to the publisher under this Agreement.
- Researcher’s guarantee
The researcher(s) hereby guarantees that the content of the paper/article is original. It has been submitted only to the Academic Journal of Nawroz University and has not been previously published by any other party.
If the paper/article is written jointly with other researchers, the researcher guarantees that he/she has informed the co-authors of the terms of this Agreement and has obtained their signatures or written permission to sign on their behalf.
The author further guarantees:
- The research paper/article does not contain any defamatory statements or illegal comments.
- The research paper/article does not violate others' rights (including but not limited to copyright, patent, and trademark rights).
- The research paper/article does not contain any facts or instructions that could cause damage or harm to others, and publishing it does not lead to the disclosure of any confidential information.