Abstract

Neural story generation models face two significant challenges: (1) maintaining coherence over narrative structure, especially long-range dependencies, and (2) preserving emotional coherence and consistency, as they often produce redundant or incoherent narration. This work presents a new, emotionally aware two-stage short-story generation model that combines GPT-2 with a tailored FNet model, a lightweight transformer architecture that substitutes Fourier transform layers for standard self-attention to better capture semantic and emotional relationships in text. In the first stage, GPT-2 generates a list of candidate sentences from an input question, answer, and emotional state. The candidate sentences are then filtered with a DistilRoBERTa-based emotion classifier so that only those matching the desired emotional tone are kept. The filtered sentences are fed into a fine-tuned FNet model, which examines inter-sentence relationships and enforces emotional coherence to produce a coherent and emotionally engaging narrative. An empirical comparison on three benchmark datasets demonstrates the system's superiority over earlier state-of-the-art approaches: the FNet model achieves a BLEU-1 score of 0.3093, outperforming Plan-and-Write (0.0953) and T-CVAE (0.2574), with improved narrative quality and lexical overlap with human-written narratives. Story coherence and emotion retention accuracies reach 85%, 67%, and 60% on the Visual7W, ROCStories, and Cornell Movie Dialogs datasets, respectively.
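
As a rough illustration of the two-stage pipeline described above, the following Python sketch uses the Hugging Face transformers library and PyTorch. It is a minimal sketch under stated assumptions, not the authors' implementation: the model identifiers are public checkpoints (GPT-2 and the DistilRoBERTa emotion classifier cited in the references), while the function candidates_matching_emotion, the sampling settings, and the FourierMixing module are hypothetical stand-ins. FourierMixing shows only the core FNet token-mixing step from Lee-Thorp et al. (2021); the full FNet block also wraps it with residual connections, layer normalization, and a feed-forward sublayer.

    import torch
    from transformers import pipeline

    # Stage 1: GPT-2 proposes candidate continuation sentences.
    generator = pipeline("text-generation", model="gpt2")

    # Emotion filter: the DistilRoBERTa classifier cited in the references.
    emotion_clf = pipeline(
        "text-classification",
        model="j-hartmann/emotion-english-distilroberta-base",
    )

    def candidates_matching_emotion(prompt, target_emotion, n=8):
        """Sample n candidates; keep those whose top emotion matches."""
        outs = generator(prompt, num_return_sequences=n, max_new_tokens=40,
                         do_sample=True, pad_token_id=50256)
        kept = []
        for out in outs:
            text = out["generated_text"][len(prompt):].strip()
            label = emotion_clf(text)[0]["label"]  # e.g. "joy", "sadness"
            if label == target_emotion:
                kept.append(text)
        return kept

    # Core FNet idea: mix tokens with a 2-D Fourier transform instead of
    # self-attention (FFT over the hidden axis, then the sequence axis),
    # keeping only the real part.
    class FourierMixing(torch.nn.Module):
        def forward(self, x):  # x: (batch, seq_len, hidden)
            return torch.fft.fft(torch.fft.fft(x, dim=-1), dim=-2).real

In the paper's setup, the filtered candidates would then be passed to the fine-tuned FNet model for sentence-level ordering and coherence; the sketch stops at the filtering stage.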

Keywords

Story generation; Emotionally aware narration; FNet; GPT-2; Natural Language Processing

Article Details

How to Cite
Kachare, A., Goswami, C., Gupta, A., & Chouhan, D. (2025). FNet-GPT: Fourier-based lightweight transformer for emotion-aware text generation using GPT. Future Technology, 4(4), 138–145. Retrieved from https://fupubco.com/futech/article/view/459

References

  1. Jurafsky, D., & Martin, J. H. (2000). Speech & language processing. Pearson Education India. ISBN-13: 9780131873216.
  2. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30. ISBN: 9781510860964.
  3. Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training. OpenAI.
  4. Dwivedi, Y. K., Kshetri, N., Hughes, L., Slade, E. L., Jeyaraj, A., Kar, A. K., … Others. (2023). Opinion Paper: "So what if ChatGPT wrote it?" Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy. International Journal of Information Management, 71, 102642. https://doi.org/10.1016/j.ijinfomgt.2023.102642
  5. Brown, P. F., Della Pietra, V. J., deSouza, P. V., Lai, J. C., & Mercer, R. L. (1992). Class-based n-gram models of natural language. Computational Linguistics, 18(4), 467–479. https://aclanthology.org/J92-4003/
  6. Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., … Others. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877–1901. ISBN: 9781713829546.
  7. Zhang, H., Song, H., Li, S., Zhou, M., & Song, D. (2023). A survey of controllable text generation using transformer-based pre-trained language models. ACM Computing Surveys, 56, 1–37. https://doi.org/10.1145/3617680
  8. Meehan, J. R. (1976). The metanovel: Writing stories by computer. Yale University. ISBN: 0824044096.
  9. Turner, S. R. (2014). The creative process: A computer model of storytelling and creativity. Psychology Press. https://doi.org/10.4324/9781315806464
  10. Bringsjord, S., & Ferrucci, D. (1999). Artificial intelligence and literary creativity: Inside the mind of Brutus, a storytelling machine. Psychology Press. https://doi.org/10.4324/9781410602398
  11. Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. Advances in Neural Information Processing Systems, 27. ISBN: 9781510800410.
  12. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI Blog, 1(8), 9.
  13. Sepúlveda-Torres, R., Bonet-Jover, A., & Saquete, E. (2023). Detecting misleading headlines through the automatic recognition of contradiction in Spanish. IEEE Access, 11, 72007–72026. https://doi.org/10.1109/ACCESS.2023.3295781
  14. Lebowitz, M. (1985). Story-telling as planning and learning. Poetics, 14, 483–502. https://doi.org/10.1016/0304-422X(85)90015-4
  15. Pérez y Pérez, R., & Sharples, M. (2001). MEXICA: A computer model of a cognitive account of creative writing. Journal of Experimental & Theoretical Artificial Intelligence, 13, 119–139. https://doi.org/10.1080/09528130010029820
  16. Riedl, M. O., & Young, R. M. (2010). Narrative planning: Balancing plot and character. Journal of Artificial Intelligence Research, 39, 217–268. https://doi.org/10.1613/jair.2989
  17. Cavazza, M., Charles, F., & Mead, S. J. (2002). Character-based interactive storytelling. IEEE Intelligent Systems, 17, 17–24. https://doi.org/10.1109/MIS.2002.1024747
  18. Fan, A., Lewis, M., & Dauphin, Y. (2018). Hierarchical neural story generation. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). https://doi.org/10.48550/arXiv.1805.04833
  19. Xu, J., Ren, X., Zhang, Y., Zeng, Q., Cai, X., & Sun, X. (2018). A skeleton-based model for promoting coherence among sentences in narrative story generation. arXiv preprint arXiv:1808.06945. https://doi.org/10.48550/arXiv.1808.06945
  20. Yao, L., Peng, N., Weischedel, R., Knight, K., Zhao, D., & Yan, R. (2019). Plan-and-write: Towards better automatic storytelling. Proceedings of the AAAI Conference on Artificial Intelligence, 33, 7378–7385. https://doi.org/10.1609/aaai.v33i01.33017378
  21. Wang, T., & Wan, X. (2019). T-CVAE: Transformer-based conditioned variational autoencoder for story completion. IJCAI, 5233–5239. https://doi.org/10.24963/ijcai.2019/727
  22. Chen, G., Liu, Y., Luan, H., Zhang, M., Liu, Q., & Sun, M. (2021). Learning to generate explainable plots for neural story generation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29, 585–593. https://doi.org/10.1109/TASLP.2020.3039606
  23. Zhang, Y., Shi, X., Mi, S., & Yang, X. (2021). Image captioning with transformer and knowledge graph. Pattern Recognition Letters, 143, 43–49. https://doi.org/10.1016/j.patrec.2020.12.020
  24. Brahman, F., & Chaturvedi, S. (2020). Modeling protagonist emotions for emotion-aware storytelling. arXiv preprint arXiv:2010.06822. https://doi.org/10.48550/arXiv.2010.06822
  25. Tan, B., Yang, Z., Al-Shedivat, M., Xing, E. P., & Hu, Z. (2020). Progressive generation of long text with pretrained language models. arXiv preprint arXiv:2006.15720. https://doi.org/10.48550/arXiv.2006.15720
  26. Min, K., Dang, M., & Moon, H. (2021). Deep learning-based short story generation for an image using the encoder-decoder structure. IEEE Access, 9, 113550–113557. https://doi.org/10.1109/ACCESS.2021.3104276
  27. Wu, C., Wang, J., Yuan, S., Wang, L., & Zhang, W. (2021). Generate classical Chinese poems with theme-style from images. Pattern Recognition Letters, 149, 75–82. https://doi.org/10.1016/j.patrec.2021.05.016
  28. Liu, Y., Huang, Q., Li, J., Mo, L., Cai, Y., & Li, Q. (2022). SSAP: Storylines and sentiment aware pre-trained model for story ending generation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 30, 686–694. https://doi.org/10.1109/TASLP.2022.3145320
  29. Jin, Y., Kadam, V., & Wanvarie, D. (2022). Plot writing from pre-trained language models. arXiv preprint arXiv:2206.03021. https://doi.org/10.48550/arXiv.2206.03021
  30. Chen, Y., Li, R., Shi, B., Liu, P., & Si, M. (2023). Visual story generation based on emotion and keywords. arXiv preprint arXiv:2301.02777. https://doi.org/10.48550/arXiv.2301.02777
  31. Khan, L. P., Gupta, V., Bedi, S., & Singhal, A. (2023). StoryGenAI: An automatic genre-keyword based story generation. 2023 International Conference on Computational Intelligence and Sustainable Engineering Solutions (CISES), 955–960. https://doi.org/10.1109/CISES58720.2023.10183482
  32. Hartmann, J. (2022). Emotion English DistilRoBERTa-base. https://huggingface.co/j-hartmann/emotion-english-distilroberta-base
  33. Lee-Thorp, J., Ainslie, J., Eckstein, I., & Ontanon, S. (2021). FNet: Mixing tokens with Fourier transforms. arXiv preprint arXiv:2105.03824. https://doi.org/10.48550/arXiv.2105.03824
  34. Fu, K., Li, H., & Shi, X. (2024). An encoder-decoder architecture with Fourier attention for chaotic time series multi-step prediction. Applied Soft Computing, 156, 111409. https://doi.org/10.1016/j.asoc.2024.111409
  35. Dittakan, K., Prompitak, K., Thungklang, P., & Wongwattanakit, C. (2023). Image caption generation using transformer learning methods: A case study on Instagram image. Multimedia Tools and Applications, 83(15), 46397–46417. https://doi.org/10.1007/s11042-023-17275-9
  36. Danescu-Niculescu-Mizil, C., & Lee, L. (2011). Chameleons in imagined conversations: A new approach to understanding coordination of linguistic style in dialogs. arXiv preprint arXiv:1106.3077. https://doi.org/10.48550/arXiv.1106.3077
  37. Zhu, Y. (2024). Visual7W dataset [Data set]. https://doi.org/10.57702/zqariweh
  38. Mostafazadeh, N. (2024). ROCStories [Data set]. https://doi.org/10.57702/26yy027v
  39. Lee, S., Lee, J., Moon, H., Park, C., Seo, J., Eo, S., … Lim, H. (2023). A survey on evaluation metrics for machine translation. Mathematics, 11(4), 1006. https://doi.org/10.3390/math11041006
  40. Kaptein, F., & Broekens, J. (2015). The affective storyteller: Using character emotion to influence narrative generation. In International Conference on Intelligent Virtual Agents (pp. 352–355). Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-21996-7_38
  41. Rashkin, H., Celikyilmaz, A., Choi, Y., & Gao, J. (2020). PlotMachines: Outline-conditioned generation with dynamic plot state tracking. arXiv preprint arXiv:2004.14967. https://doi.org/10.48550/arXiv.2004.14967
  42. Li, Y., Gan, Z., Shen, Y., Liu, J., Cheng, Y., Wu, Y., … & Gao, J. (2019). StoryGAN: A sequential conditional GAN for story visualization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 6329–6338). https://doi.org/10.48550/arXiv.1812.02784
  43. Keskar, N. S., McCann, B., Varshney, L. R., Xiong, C., & Socher, R. (2019). CTRL: A conditional transformer language model for controllable generation. arXiv preprint arXiv:1909.05858. https://doi.org/10.48550/arXiv.1909.05858