Optimizing Contextual Embedding-Based Text Classification: A Comparative Analysis Of PSO And GSO Feature Selection Approach

PRIUSHA NARWARIYA; SUSHEEL KUMAR TIWARI

doi:10.64252/268n7n80

DOI:

https://doi.org/10.64252/268n7n80

Keywords:

Text Classification, Contextual Embeddings, Particle Swarm Optimization, Glowworm Swarm Optimization, Feature Selection, AUC-ROC, Execution Time Analysis

Abstract

Text classification is a central task in natural language processing, and feature engineering and optimization are critical levers for improving predictive performance. Conventional pipelines often struggle to balance a reduced feature space, model transparency, and accuracy. This study presents a novel hybrid architecture that uses contextual embeddings within a fusion-based feature construction, tuned with the complementary heuristics of Particle Swarm Optimization (PSO) and Glowworm Swarm Optimization (GSO). The proposed framework assembles a combined feature set from term frequency-inverse document frequency (TF-IDF) weights, pre-trained word embeddings, and condensed representations generated by singular value decomposition (SVD) and principal component analysis (PCA). Dimensionality is then reduced by a bi-level optimization scheme that alternates between PSO and GSO. Benchmarks on the curated train and test splits show clear improvements over a baseline that pairs TF-IDF with logistic regression. The PSO-tuned ensemble attained an accuracy of 96.82%, a macro F1-score of 96.77%, and an area under the receiver operating characteristic curve (AUC-ROC) of 0.987, while the GSO-tuned variant recorded 96.95%, 96.91%, and 0.989, respectively. Runtime profiling under proxy settings in Google Colab, with GPU support, showed that the PSO variant required 39.0 minutes of processing and the GSO variant 41.5 minutes, which indicates that both are operationally viable. Confusion matrices, disaggregated performance metrics, AUC-ROC curves, log-loss trajectories, and hyperparameter convergence profiles consistently identified the GSO-tuned variant as the best-performing configuration, despite its marginally higher computational cost.
This work highlights the effectiveness of swarm-intelligence-driven optimization in refining contextual-embedding-based text classification and indicates the method's broader applicability to NLP domains that demand high accuracy and robust feature selection.
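
The abstract does not specify how the swarm search encodes or scores feature subsets, so the following is only a minimal sketch of binary PSO feature selection in general, not the authors' implementation. It assumes a sigmoid transfer function for the bit updates, a nearest-centroid classifier as the fitness function (standing in for the paper's model), and synthetic data in place of the fused TF-IDF, embedding, SVD, and PCA features. The names `make_data`, `fitness`, and `binary_pso`, and all parameter values, are hypothetical.

```python
import numpy as np

def make_data(n=400, d=30, k=5, seed=0):
    """Synthetic stand-in for a fused feature matrix: only the first k
    columns carry class signal, the rest are noise."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, n)
    X = rng.normal(size=(n, d))
    X[:, :k] += 2.0 * y[:, None]
    return X, y

def fitness(mask, X_tr, y_tr, X_va, y_va, penalty=0.01):
    """Validation accuracy of a nearest-centroid classifier on the selected
    columns, minus a small penalty on the fraction of features kept."""
    if not mask.any():
        return 0.0
    A, B = X_tr[:, mask], X_va[:, mask]
    classes = np.unique(y_tr)
    centroids = np.stack([A[y_tr == c].mean(axis=0) for c in classes])
    dists = ((B[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float((pred == y_va).mean()) - penalty * float(mask.mean())

def binary_pso(X_tr, y_tr, X_va, y_va, n_particles=20, n_iter=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 vector marking selected features."""
    rng = np.random.default_rng(seed)
    d = X_tr.shape[1]
    score = lambda p: fitness(p.astype(bool), X_tr, y_tr, X_va, y_va)

    pos = (rng.random((n_particles, d)) < 0.5).astype(float)
    vel = rng.uniform(-1.0, 1.0, (n_particles, d))
    pbest = pos.copy()
    pbest_fit = np.array([score(p) for p in pos])
    g = int(pbest_fit.argmax())
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, d))
        # Standard velocity update: inertia + cognitive pull + social pull.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -6.0, 6.0)
        # Sigmoid transfer: velocity becomes the probability that a bit is 1.
        prob = 1.0 / (1.0 + np.exp(-vel))
        pos = (rng.random((n_particles, d)) < prob).astype(float)

        fit = np.array([score(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = int(pbest_fit.argmax())
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    return gbest.astype(bool), float(gbest_fit)

X, y = make_data()
X_tr, y_tr, X_va, y_va = X[:300], y[:300], X[300:], y[300:]
mask, score = binary_pso(X_tr, y_tr, X_va, y_va)
print("features kept:", int(mask.sum()), "fitness:", round(score, 3))
```

In a pipeline like the one described, the synthetic matrix would be replaced by the fused feature matrix and the fitness function by the validation score of the chosen classifier. A GSO variant would differ in how candidate solutions move (toward brighter neighbours within a local range) rather than in how subsets are scored.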
