Abstract

Transfer learning has become a key technique for improving the accuracy of neural networks in low-resource, low-data settings. This study presents a quantitative comparison of pre-trained models (ResNet50, VGG16, BERT, and GPT) against baseline CNN and LSTM models across three application areas: computer vision, natural language processing (NLP), and medical imaging. Five benchmark datasets were used: ImageNet, CIFAR-10, SST-2, IMDB, and Chest X-Ray. All experiments used the same preprocessing pipeline and evaluation metrics (accuracy, F1 score, precision, recall, and ROC-AUC). The pre-trained models consistently outperformed the baselines in all domains, with accuracy gains of 9-20 percentage points and F1-score gains of 0.09-0.16. ResNet50 achieved 92% accuracy on CIFAR-10, compared with 72% for the CNN baseline, and BERT reached 92% on SST-2, compared with 80% for the LSTM baseline. VGG16 improved Chest X-Ray classification accuracy from 78% to 87% and reduced training time by up to 60%. A few cases of minor overfitting and domain mismatch were observed, underscoring the need for adaptive fine-tuning strategies. The results indicate that transfer learning significantly improves convergence speed, generalization, and computational efficiency, making it a promising approach for AI applications in domains such as healthcare, NLP, and autonomous systems.
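The accuracy gains quoted in the abstract follow directly from the per-dataset figures it reports. A minimal Python sketch recomputes them using only numbers stated in the abstract. The dictionary layout and the function name `accuracy_gains` are illustrative assumptions, not part of the study.

```python
# Accuracy figures (percent) as reported in the abstract:
# dataset -> (pre-trained model, its accuracy, reference accuracy).
# For CIFAR-10 and SST-2 the reference is the CNN / LSTM baseline;
# for Chest X-Ray it is the starting accuracy the abstract quotes (78%).
REPORTED = {
    "CIFAR-10": ("ResNet50", 92, 72),
    "SST-2": ("BERT", 92, 80),
    "Chest X-Ray": ("VGG16", 87, 78),
}


def accuracy_gains(reported):
    """Return the accuracy gain in percentage points for each dataset."""
    return {name: pre - ref for name, (_, pre, ref) in reported.items()}


gains = accuracy_gains(REPORTED)
for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points")
print(f"range: {min(gains.values())}-{max(gains.values())} points")
```

The smallest and largest differences match the 9-20 range given in the abstract, which is why the gains read most naturally as percentage points rather than relative percentages.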

Keywords

Transfer learning; Neural networks; Pre-trained models; Fine-tuning; Deep learning; Domain adaptation

Article Details

How to Cite
Ismail Wdaa, A. S., Hussein, I., & Ahmed, A. (2026). Transfer learning in neural networks: leveraging pre-trained models for improved performance. Future Technology, 5(3), 309–318. Retrieved from https://fupubco.com/futech/article/view/1010
