Optimization Strategies for Deep Learning Models in Natural Language Processing

Jerry Yao; Bin Yuan

doi:10.53469/jtpes.2024.04(05).11

Authors

Jerry Yao, Trine University, AZ, USA
Bin Yuan, Trine University, AZ, USA

Keywords:

Natural Language Processing, Deep Learning, Model Optimization, Data Heterogeneity

Abstract

Deep learning models have achieved remarkable performance in natural language processing (NLP), but they still face challenges in practical applications, such as heterogeneous and complex data, the black-box nature of the models, and the difficulty of transfer learning across languages and domains. To address these issues, this paper proposes improvements from four perspectives: model structure, loss functions, regularization methods, and optimization strategies. Extensive experiments on three tasks (text classification, named entity recognition, and reading comprehension) confirm the feasibility and effectiveness of the proposed optimizations. The results show that introducing mechanisms such as Multi-Head Attention and Focal Loss, and judiciously applying techniques such as LayerNorm and AdamW, can significantly improve model performance. Finally, the paper explores model compression techniques, offering new insights for deploying deep models in resource-constrained scenarios.

Journal of Theory and Practice of Engineering Science (JTPES) ISSN: 2790-1505
