Dwi H. Widyantoro
Department of Informatics, School of Electrical Engineering and Informatics (STEI) Bandung Institute of Technology, Jalan Ganesha 10, Bandung, 40132

Published : 3 Documents

A Multiclass-based Classification Strategy for Rhetorical Sentence Categorization from Scientific Papers
Widyantoro, Dwi H.; Khodra, Masayu L.; Trilaksono, Bambang Riyanto; Aziz, E. Aminudin
Journal of ICT Research and Applications Vol 7, No 3 (2013)
Publisher : ITB Journal Publisher, LPPM ITB

DOI: 10.5614/itbj.ict.res.appl.2013.7.3.5


Rapid identification of content structures in a scientific paper is of great importance, particularly for those who actively engage in frontier research. This paper presents a multi-classifier approach that identifies such structures by classifying rhetorical sentences in scientific papers. The approach rests on the observation that no single classifier performs best across all rhetorical categories of sentences. It therefore learns which classifiers are good at which categories, assigns a classifier to each category, and applies only the assigned classifier when classifying a given category. The paper employs k-fold cross-validation over the training data to obtain the category-classifier mapping and then relearns the classification model of the corresponding classifier using the full training data for that category. The approach was evaluated on identifying sixteen rhetorical categories in sentences collected from the ACL-ARC paper collection. The experimental results show that the multi-classifier approach can significantly improve classification performance over multi-label classifiers.
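
The selection procedure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two toy classifiers, the F1 selection criterion, the two-dimensional feature vectors, and the category names AIM and OWN are all assumptions made for illustration. For each category, every candidate classifier is scored with k-fold cross-validation on a one-vs-rest target, the best scorer is recorded in the category-classifier mapping, and that classifier is then refit on the full training data.

```python
from statistics import mean


class MajorityBaseline:
    """Toy classifier: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label = 2 * sum(y) >= len(y)
        return self

    def predict(self, X):
        return [self.label for _ in X]


class NearestCentroid:
    """Toy classifier: predicts the label of the closest class centroid."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids, key=lambda c: dist(x, self.centroids[c]))
                for x in X]


def f1_score(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p and not t)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def cross_val_f1(factory, X, y, k):
    """Mean F1 over k folds; fold f holds out indices f, f+k, f+2k, ..."""
    scores = []
    for fold in range(k):
        test = list(range(fold, len(X), k))
        held_out = set(test)
        train = [i for i in range(len(X)) if i not in held_out]
        model = factory().fit([X[i] for i in train], [y[i] for i in train])
        pred = model.predict([X[i] for i in test])
        scores.append(f1_score([y[i] for i in test], pred))
    return mean(scores)


def select_classifiers(X, labels, candidates, k=3):
    """Pick the best candidate per category, then refit it on all data."""
    mapping, models = {}, {}
    for category in sorted(set(labels)):
        y = [label == category for label in labels]  # one-vs-rest target
        best = max(candidates,
                   key=lambda name: cross_val_f1(candidates[name], X, y, k))
        mapping[category] = best
        models[category] = candidates[best]().fit(X, y)
    return mapping, models


def predict_categories(models, x):
    """Apply each category's assigned classifier to one feature vector."""
    return [c for c, m in sorted(models.items()) if m.predict([x])[0]]


# Hypothetical sentence features and labels, for illustration only.
X = [[0.0, 0.1], [5.0, 5.2], [0.3, 0.0], [5.1, 4.8], [0.2, 0.4], [4.9, 5.0],
     [0.1, 0.2], [5.3, 5.1], [0.4, 0.3], [4.8, 4.9], [0.0, 0.3], [5.2, 5.0]]
labels = ["AIM", "OWN"] * 6
candidates = {"majority": MajorityBaseline, "nearest_centroid": NearestCentroid}

mapping, models = select_classifiers(X, labels, candidates, k=3)
print(mapping)
print(predict_categories(models, [0.2, 0.1]))
```

In a realistic setting the candidates would be full learners such as SVM or Naive Bayes over text features, and the folds would normally be stratified by category.
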
Rhetorical Sentences Classification Based on Section Class and Title of Paper for Experimental Technical Papers
Helen, Afrida; Purwarianti, Ayu; Widyantoro, Dwi H.
Journal of ICT Research and Applications Vol 9, No 3 (2015)
Publisher : ITB Journal Publisher, LPPM ITB

DOI: 10.5614/itbj.ict.res.appl.2015.9.3.5


Rhetorical sentence classification is an interesting approach for making extractive summaries, but the technique still needs development because the performance of automatic rhetorical sentence classification remains poor. Rhetorical sentences are sentences that contain rhetorical words or phrases. They appear not only in the contents of a paper but also in its title. In this study, features related to section class and title class that were proposed in previous research were developed further. Our method uses different techniques for automatic section class extraction, for which we introduce new, format-based features. Furthermore, we propose automatic rhetorical phrase extraction from the title. The corpus we used was a collection of technical-experimental scientific papers. Our method uses the Support Vector Machine (SVM) algorithm and the Naïve Bayesian algorithm for classification. The four categories used were Problem, Method, Data, and Result. It was hypothesized that these features would improve classification accuracy compared to previous methods. The F-measure for these categories reached up to 14%.
International Journal of Electrical and Computer Engineering (IJECE) Vol 10, No 4: August 2020
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v10i4.pp3537-3549


Information credibility in social media is becoming the most important part of information sharing in society. The literature shows that no existing work labels information credibility based on user competencies and their posted topics. This study increases information credibility by adding 17 new features for Twitter and 49 features for Facebook. In the first step, we perform a labeling process based on user competencies and their posted topics to classify users into two groups, credible and not credible, with respect to their posted topics. These approaches are evaluated over ten thousand samples of real-field data obtained from the Twitter and Facebook networks using the Naive Bayes (NB), Support Vector Machine (SVM), Logistic Regression (Logit), and J48 classification algorithms. With the proposed new features, the credibility of information provided in social media increases significantly, as indicated by better accuracy than the existing technique for all classifiers.