An Ensemble Model of Dual Learning for Gambling and Pornographic Websites Classification

dc.contributor.author: Sirapat Thianphan
dc.contributor.author: Kietikul Jearanaitanakij
dc.date.accessioned: 2026-05-08T19:25:08Z
dc.date.issued: 2025-5-6
dc.description.abstract: The rapid proliferation of pornographic and gambling websites poses significant challenges, as these platforms increasingly employ sophisticated techniques to evade detection. Traditional classification approaches that rely on a single feature often fail to achieve high detection rates due to the diverse strategies these websites use to bypass detection systems. To address this limitation, this study introduces an ensemble model for classifying pornographic and gambling websites by integrating two key features: URLs and textual content. A web-scraping script was developed to extract textual data from HTML elements of 3,000 websites, evenly distributed among benign, pornographic, and gambling categories, specifically curated for Thai users. The URLs undergo preprocessing to capture their meaningful semantic properties, which reflect the characteristics of the corresponding websites. Separate classifiers were then trained on each feature before being integrated into an ensemble model for final prediction. This approach achieved an outstanding accuracy of 96.83%, significantly surpassing single-feature classifiers. Moreover, the findings demonstrate the proposed model's robustness against obfuscation techniques and anti-crawling mechanisms, underscoring its potential for effective automated detection.
dc.identifier.doi: 10.1109/iceast64767.2025.11088214
dc.identifier.uri: https://dspace.kmitl.ac.th/handle/123456789/19934
dc.subject: Gambling Behavior and Treatments
dc.subject: Artificial Intelligence in Games
dc.title: An Ensemble Model of Dual Learning for Gambling and Pornographic Websites Classification
dc.type: Article
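
Illustrative sketch (not from the paper): the abstract describes preprocessing URLs, training one classifier per feature, and combining them in an ensemble, but this record does not say how the URLs are tokenized or how the two predictions are merged. The Python below assumes a simple delimiter-based URL tokenizer and a weighted average of class probabilities (soft voting). The names `url_tokens`, `soft_vote`, and `url_weight`, and the example URL and probabilities, are hypothetical.

```python
import re
from urllib.parse import urlsplit

# The three classes named in the abstract.
LABELS = ("benign", "pornographic", "gambling")


def url_tokens(url):
    """Split a URL into lowercase alphanumeric tokens.

    Assumed preprocessing: the record does not describe the authors' method.
    """
    # Add a scheme when missing so urlsplit fills in netloc correctly.
    parts = urlsplit(url if "://" in url else "http://" + url)
    text = f"{parts.netloc} {parts.path} {parts.query}".lower()
    # Break on any run of non-alphanumeric characters and drop the "www" prefix.
    return [t for t in re.split(r"[^a-z0-9]+", text) if t and t != "www"]


def soft_vote(url_probs, text_probs, url_weight=0.5):
    """Combine two per-class probability dicts by weighted averaging.

    Soft voting is an assumption; the source only says the classifiers
    are "integrated into an ensemble model".
    """
    combined = {
        label: url_weight * url_probs[label] + (1.0 - url_weight) * text_probs[label]
        for label in LABELS
    }
    # Return the class with the highest combined score, plus all scores.
    return max(combined, key=combined.get), combined


if __name__ == "__main__":
    print(url_tokens("https://www.example-casino.com/slots?bonus=100"))
    # Made-up outputs of a URL classifier and a text classifier.
    url_p = {"benign": 0.2, "pornographic": 0.1, "gambling": 0.7}
    text_p = {"benign": 0.5, "pornographic": 0.1, "gambling": 0.4}
    print(soft_vote(url_p, text_p))
```

Under this assumption, a page whose scraped text is sparse or obfuscated can still be classified from its URL, since neither feature alone decides the outcome.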
