Arabic Review Dataset for Deepfake Text Detection: Collection and Generation

Distinguishing authentic text from deepfake text has become increasingly difficult, particularly for low-resource languages such as Arabic. This study constructs a reliable Arabic dataset for deepfake text detection by collecting authentic YouTube comments and generating synthetic text with OpenAI GPT-4.0 Mini. The collected comments span four thematic domains (entertainment, religion, health, and sports) to capture common discussion topics and linguistic variation in Arabic online communities. Synthetic samples were generated with a structured, prompt-based methodology that applies predefined deception techniques to simulate realistic misleading content. To validate the dataset, a Bidirectional Encoder Representations from Transformers (BERT)-based model was fine-tuned for binary classification of authentic versus synthetic text. The fine-tuned model achieved an accuracy of 91.43%, indicating that the dataset supports deepfake text detection. The dataset remains limited in size and dialectal diversity; even so, the dataset and methodology are expected to support future research in Arabic natural language processing and to improve the reliability of automated deepfake detection approaches.
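The pipeline summarized above (domain-tagged samples, prompt-based generation, binary evaluation) can be illustrated with a minimal sketch. It is not the authors' implementation. The `Sample` record, the 0/1 label encoding, the `build_prompt` template, and the example technique name are all hypothetical, since the abstract does not specify the prompt wording or the list of deception techniques. Only the four domain names and the use of accuracy as the metric come from the text.

```python
from dataclasses import dataclass

# The four thematic domains named in the abstract.
DOMAINS = ("entertainment", "religion", "health", "sports")


@dataclass(frozen=True)
class Sample:
    """One dataset record (hypothetical schema)."""
    text: str
    domain: str
    label: int  # assumed encoding: 0 = authentic comment, 1 = synthetic text


def build_prompt(domain: str, technique: str) -> str:
    """Build a structured generation prompt.

    The template wording and the technique names are illustrative
    assumptions; the abstract does not give them.
    """
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    return (
        f"Write one Arabic YouTube-style comment about {domain}. "
        f"Apply this technique: {technique}."
    )


def accuracy(y_true: list[int], y_pred: list[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

With a fine-tuned classifier in place, `accuracy` would be called on the held-out labels and the model's predictions; the reported 91.43% is a value of this metric.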

