Main Article Content

Abstract

To bridge the gap between technological optimization and institutional design in Internet content security governance, an integrated framework was constructed that combines deep learning-based review technology with multi-stakeholder collaboration. A three-layer dynamic coupling governance model covering technology, process, and institution was developed, and an extended Stackelberg game framework was used to formally model the strategic interactions among regulators, platforms, Artificial Intelligence (AI) systems, and users. An adaptive cross-modal confidence propagation algorithm was proposed to improve the accuracy of multimodal content review, together with a Thompson sampling-based dynamic threshold optimization mechanism. On comprehensive test sets, the dynamic collaboration mechanism achieved an accuracy of 94.6% and a game equilibrium attainment rate of 95.8%. Compared with purely manual review, costs were reduced by 76% and efficiency increased 8.7-fold, while the cross-modal confidence propagation algorithm improved accuracy by 8.4% in high-uncertainty situations. Cross-scenario generalization was verified on social media, short video, online education, and e-commerce platforms. The proposed collaborative governance mechanism effectively balances accuracy, efficiency, and cost in content moderation and provides a theoretical basis for AI-enabled governance research.
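The abstract names a Thompson sampling-based dynamic threshold optimization mechanism. The paper's actual implementation is not specified here; as a minimal sketch of the underlying idea, the standard Beta-Bernoulli variant of Thompson sampling can adaptively select a review threshold from a discrete candidate set based on feedback about decision correctness. The candidate thresholds and the simulated reward signal below are illustrative assumptions, not values from the paper.

```python
import random

class ThompsonThresholdSelector:
    """Beta-Bernoulli Thompson sampling over candidate review thresholds (sketch)."""

    def __init__(self, candidate_thresholds):
        self.thresholds = list(candidate_thresholds)
        # One Beta(alpha, beta) posterior per candidate threshold.
        self.alpha = [1.0] * len(self.thresholds)
        self.beta = [1.0] * len(self.thresholds)

    def select(self):
        """Draw one sample from each posterior; pick the index of the best draw."""
        draws = [random.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, index, success):
        """Reward = 1 if the moderation decision at this threshold was judged correct."""
        if success:
            self.alpha[index] += 1.0
        else:
            self.beta[index] += 1.0


selector = ThompsonThresholdSelector([0.5, 0.6, 0.7, 0.8, 0.9])
random.seed(0)
for _ in range(1000):
    i = selector.select()
    # Simulated feedback: pretend a threshold of 0.7 yields correct decisions most often.
    success = random.random() < (0.9 if selector.thresholds[i] == 0.7 else 0.6)
    selector.update(i, success)

best = selector.thresholds[selector.select()]
```

Over repeated rounds, posterior mass concentrates on the best-performing threshold, so the sampler increasingly exploits it while still occasionally exploring alternatives, which is the exploration-exploitation trade-off a dynamic threshold mechanism of this kind must manage.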

Keywords

Content security governance; Deep learning; Stackelberg game; Human-AI collaboration; Multimodal fusion

Article Details

How to Cite
Yan, J., & Huang, F. (2026). Research on AI-enabled collaborative governance mechanism for content security: an optimization perspective of review technology based on deep learning. Future Technology, 5(2), 92–102. Retrieved from https://fupubco.com/futech/article/view/714
