Articles

MODIFICATION OF STEMMING ALGORITHM USING A NON DETERMINISTIC APPROACH TO INDONESIAN TEXT Rifai, Wafda; Winarko, Edi
IJCCS (Indonesian Journal of Computing and Cybernetics Systems) Vol 13, No 4 (2019): October
Publisher : IndoCEISS in collaboration with Universitas Gadjah Mada, Indonesia.

DOI: 10.22146/ijccs.49072

Abstract

Natural Language Processing (NLP) is a branch of Artificial Intelligence that focuses on language processing. One stage of NLP is preprocessing, which prepares data before it is processed further. Preprocessing involves many kinds of processes, one of which is stemming: finding the root word of a given word. Errors in determining root words can cause misinformation. In addition, stemming does not always produce a single root word, because some Indonesian words can be read either as a root word or as an affixed word, e.g. the word "beruang". To handle these problems, this study proposes a stemmer that produces more accurate results by employing a non-deterministic algorithm that yields more than one candidate word. All rules are checked and the resulting words are kept in a candidate list. If several candidates are found, one result is chosen. The stemmer was tested on 15,934 words and achieved an accuracy of 93%. The stemmer can therefore be used to detect words with more than one possible root word.
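The candidate-list idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' stemmer: the dictionary, the prefix and suffix rules, and the function name `stem_candidates` are all hypothetical, and the abstract does not say how the final candidate is chosen, so that step is left out.

```python
# Hypothetical toy dictionary of root words; a real stemmer would load a full lexicon.
ROOT_WORDS = {"beruang", "uang", "ruang", "makan", "ajar"}

# Hypothetical affix rules, far smaller than a real Indonesian rule set.
PREFIXES = ["ber", "be", "me", "di"]
SUFFIXES = ["kan", "an", "i"]


def stem_candidates(word):
    """Check every rule and keep every dictionary hit, rather than stopping at the first match."""
    candidates = []
    # The word itself may already be a root word.
    if word in ROOT_WORDS:
        candidates.append(word)

    # Collect every form reachable by stripping one prefix.
    forms = {word}
    for prefix in PREFIXES:
        if word.startswith(prefix):
            forms.add(word[len(prefix):])

    # From each form collected so far, also try stripping one suffix.
    for form in list(forms):
        for suffix in SUFFIXES:
            if form.endswith(suffix):
                forms.add(form[: -len(suffix)])

    # Keep only forms found in the dictionary, in a stable order.
    for form in sorted(forms):
        if form in ROOT_WORDS and form not in candidates:
            candidates.append(form)
    return candidates


if __name__ == "__main__":
    print(stem_candidates("beruang"))
    print(stem_candidates("makanan"))
```

With this toy rule set, "beruang" yields three candidates (the word itself, plus the forms left after stripping "ber" and "be"), which is the kind of ambiguity a single-answer stemmer would hide.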
PARALLELIZATION OF HYBRID CONTENT BASED AND COLLABORATIVE FILTERING METHOD IN RECOMMENDATION SYSTEM WITH APACHE SPARK Ikhsanudin, Rakhmad; Winarko, Edi
IJCCS (Indonesian Journal of Computing and Cybernetics Systems) Vol 13, No 2 (2019): April
Publisher : IndoCEISS in collaboration with Universitas Gadjah Mada, Indonesia.

DOI: 10.22146/ijccs.38596

Abstract

Collaborative filtering is a popular method for recommendation systems. One way to improve recommendation accuracy is to combine it with a content-based method, but the resulting hybrid method scales poorly. The main aim of this research is to address this scalability problem in recommendation systems that use hybrid collaborative filtering and content-based methods by parallelizing the method on the Apache Spark platform. Based on the test results, the speedup of the hybrid collaborative filtering and content-based method on an Apache Spark cluster with 2 worker nodes is 1.003, which increases to 2.913 on a cluster with 4 worker nodes and to 5.85 on a cluster with 7 worker nodes.
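To put the reported speedups in context, the sketch below uses the usual definitions: speedup as baseline time divided by parallel time, and parallel efficiency as speedup divided by the number of workers. The abstract does not state which baseline was used, so the baseline in `speedup` is an assumption; only the three speedup values come from the abstract.

```python
def speedup(baseline_seconds, parallel_seconds):
    """Standard speedup: how many times faster the parallel run is than the baseline."""
    return baseline_seconds / parallel_seconds


def efficiency(speedup_value, workers):
    """Parallel efficiency: the fraction of ideal linear speedup actually achieved."""
    return speedup_value / workers


# Speedups reported in the abstract, keyed by number of worker nodes.
reported = {2: 1.003, 4: 2.913, 7: 5.85}

if __name__ == "__main__":
    for workers, s in reported.items():
        print(f"{workers} workers: speedup {s}, efficiency {efficiency(s, workers):.2f}")
```

Under these definitions, the efficiency implied by the reported figures rises with cluster size, from roughly half of linear speedup at 2 workers to more than four fifths at 7 workers.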
APLIKASI MOBILE UNTUK ANALISIS SENTIMEN PADA GOOGLE PLAY Ilmawan, Lutfi Budi; Winarko, Edi
IJCCS (Indonesian Journal of Computing and Cybernetics Systems) Vol 9, No 1 (2015): January
Publisher : IndoCEISS in collaboration with Universitas Gadjah Mada, Indonesia.

DOI: 10.22146/ijccs.6640

Abstract

Google Play, Google's application store, currently provides approximately 1,200,000 mobile applications. This number of applications gives users many options. In addition, application developers have difficulty figuring out how to improve the performance of their applications. Because of these problems, a sentiment analysis application is needed that can process review comments to obtain valuable information. The purpose of this system is to determine the sentiment polarity of textual application reviews on Google Play from a mobile device. Mobile devices are highly portable, but some of them have limited resources. This is addressed with a client-server architecture in which the server performs the heavy tasks, training and classification, while the mobile client performs only the light ones. With this solution, sentiment analysis can be applied in a mobile environment. The classification method used in the developed application is Naïve Bayes, with a linear Support Vector Machine (SVM) used for comparison. The accuracy of the Naïve Bayes classifier in the developed application is 83.87%, lower than the 89.49% reached by the linear SVM classifier. The use of semantic handling to deal with word synonyms can reduce the accuracy of the classifier.
Keywords: sentiment analysis, google play, classification, naïve bayes, support vector machine
KLASIFIKASI DATA NAP (NOTA ANALISIS PEMBIAYAAN) UNTUK PREDIKSI TINGKAT KEAMANAN PEMBERIAN KREDIT (STUDI KASUS : BANK SYARIAH MANDIRI CABANG LUWUK SULAWESI TENGAH) Adi, Sumarni; Winarko, Edi
IJCCS (Indonesian Journal of Computing and Cybernetics Systems) Vol 9, No 1 (2015): January
Publisher : IndoCEISS in collaboration with Universitas Gadjah Mada, Indonesia.

DOI: 10.22146/ijccs.6635

Abstract

Every month, the Luwuk branch of Bank Syariah Mandiri receives a steadily growing number of credit proposals (NAP) from customers, and these need a quick response. A system is therefore needed to mine this accumulated data for specific purposes, one of which is analyzing the risk of granting credit. This study uses data mining techniques to classify the safety level of granting credit by applying the Naïve Bayes classification algorithm. The Naïve Bayes classifier is an approach based on Bayes' theorem that combines prior knowledge with new knowledge, making it a simple classification algorithm that nevertheless achieves high accuracy. Before classification, the debtor data is preprocessed. The preprocessed data is then classified with the Naïve Bayes classifier, producing a probabilistic classification model for predicting the class of subsequent debtors. Model accuracy was measured using the bootstrap method; the lowest accuracy, 80%, was obtained with a sample of 100 records, and the highest, 98.66%, with a sample of 463 records.
Keywords: accuracy, naive bayes, data mining, classification, preprocessing, NAP
ALGORITMA CPAR UNTUK ANALISA DATA KECELAKAAN (STUDI PADA KEPOLISIAN DAERAH SULAWESI TENGGARA) Ransi, Natalis; Winarko, Edi
IJCCS (Indonesian Journal of Computing and Cybernetics Systems) Vol 8, No 2 (2014): July
Publisher : IndoCEISS in collaboration with Universitas Gadjah Mada, Indonesia.

DOI: 10.22146/ijccs.6547

Abstract

Traffic accidents in Southeast Sulawesi need more effective handling because the number of resulting fatalities rises every year. One step is to analyze the characteristics of traffic accidents that are associated with fatalities. Such an analysis can consider the factors causing the accident, the type of accident, and the time of occurrence. This research applies the Classification based on Predictive Association Rules (CPAR) algorithm in data mining to analyze traffic accident characteristics. The CPAR algorithm produces Class Association Rules (CARs), which are then used to describe the traffic accident characteristics associated with fatalities. The results show that the factors leading to fatalities in traffic accidents are human factors (driving under the influence of alcohol and driving above the speed limit) and physical environment factors (damaged road infrastructure and roads with sharp bends). The type of accident (single-vehicle and head-on), the time of occurrence (the 8th to the 14th of the month, Mondays and Tuesdays, 13:00-18:59), the type of vehicle (motorcycle), and the vehicle brand (Honda) are potentially associated with fatalities. Motorcycle riders are prone to becoming victims in traffic accidents. Accuracy was tested using 10-fold cross-validation; the results show that the average accuracy of the CPAR algorithm, 48.75%, is higher than that of the PRM algorithm, 41.13%.
Keywords: data mining, CPAR algorithm, traffic accident
ADWORDS KEYWORD SET SELECTION DECISION SUPPORT SYSTEM USING AHP AND TOPSIS METHOD Chandra, Sholikin Ady; Winarko, Edi; Priyanta, Sigit
IJCCS (Indonesian Journal of Computing and Cybernetics Systems) Vol 14, No 2 (2020): April
Publisher : IndoCEISS in collaboration with Universitas Gadjah Mada, Indonesia.

DOI: 10.22146/ijccs.50731

Abstract

CV. Gitani Creative Agency is a creative agency that provides digital marketing services. The company uses the Google AdWords platform to run this service. Keyword set selection is critical to the performance of ads, but finding the right keyword set is not an easy task: the company needs to consider various criteria to get optimal advertising results. A decision support system (DSS) is needed as an objective reference in the keyword set selection process. The decision-making criteria are clicks, impressions, cost, and average CPC. The AHP method is used to compare the criteria and generate a priority weight for each, while the TOPSIS method is used to rank the alternatives. Combining the two methods aims to improve the performance of the TOPSIS method. The results of this study show that the combination of AHP and TOPSIS can be used to determine the best keyword set for ads. Based on the test results, the DSS ranks alternatives correctly, in agreement with manual calculation, and it is flexible to changes in criteria and alternatives.
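The AHP-then-TOPSIS pipeline described above can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: the AHP weights use the common column-normalization approximation rather than an exact eigenvector, the pairwise judgments and keyword-set figures are invented, and treating clicks and impressions as benefit criteria and cost and average CPC as cost criteria is an assumption.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights: normalize each column, then average each row."""
    n = len(pairwise)
    col_sums = [sum(row[j] for row in pairwise) for j in range(n)]
    return [sum(pairwise[i][j] / col_sums[j] for j in range(n)) / n for i in range(n)]


def topsis(matrix, weights, is_benefit):
    """Return one closeness score per alternative (row); a higher score ranks better."""
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Ideal best and worst values depend on whether the criterion is a benefit or a cost.
    cols = list(zip(*weighted))
    best = [max(c) if is_benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if is_benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


if __name__ == "__main__":
    # Hypothetical pairwise judgments over: clicks, impressions, cost, average CPC.
    pairwise = [
        [1, 2, 2, 4],
        [1 / 2, 1, 1, 2],
        [1 / 2, 1, 1, 2],
        [1 / 4, 1 / 2, 1 / 2, 1],
    ]
    # Hypothetical keyword sets (rows), one column per criterion.
    keyword_sets = [
        [120, 3000, 45.0, 0.38],
        [90, 4100, 30.0, 0.33],
        [150, 2500, 60.0, 0.40],
    ]
    weights = ahp_weights(pairwise)
    scores = topsis(keyword_sets, weights, [True, True, False, False])
    ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    print(weights, scores, ranking)
```

Note that `topsis` divides by the sum of the two distances, so it assumes the alternatives are not all identical.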
QUESTION CLASSIFICATION MENGGUNAKAN SUPPORT VECTOR MACHINES DAN STEMMING Abdiansah, Abdiansah Abdiansah; Winarko, Edi
Seminar Nasional Aplikasi Teknologi Informasi (SNATI) 2015
Publisher : Jurusan Teknik Informatika, Fakultas Teknologi Industri, Universitas Islam Indonesia

Abstract

Question Classification (QC) is an important component of a Question Answering System (QAS) because it directly affects overall QAS performance. So far, the method recommended by the QAS community for QC is Support Vector Machines (SVM). Text classification requires high-dimensional features, and a large number of features can reduce SVM performance. Stemming is a technique used to reduce the terms of a document. The use of stemming affects the syntax and semantics of a question. This study aims to determine the effect of stemming on SVM accuracy. Two question classification experiments were carried out, one using SVM and one using SVM+stemming. The average accuracy obtained was 86.75% for SVM and 87.48% for SVM+stemming, an increase of 0.73 percentage points. Although the increase in accuracy is not significant, stemming can reduce the number of features without lowering SVM accuracy.
Keywords: question classification, question answering system, support vector machines, stemming
ONTOLOGY-BASED WHY-QUESTION ANALYSIS USING LEXICO-SYNTACTIC PATTERNS Karyawati, A.A.I.N. Eka; Winarko, Edi; Azhari, Azhari; Harjoko, Agus
International Journal of Electrical and Computer Engineering (IJECE) Vol 5, No 2: April 2015
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v5i2.pp318-332

Abstract

This research focuses on developing a method to analyze why-questions. Previous research on why-question analysis usually used morphological and syntactic approaches without considering the expected answer types. Moreover, it rarely involved a domain ontology to capture the semantics or conceptualization of the content. Consequently, semantic mismatches occurred, resulting in inappropriate answers. The proposed method considers the expected answer types and involves a domain ontology. It adapts the simple bag-of-words-like model by using semantic entities (i.e., concepts/entities and relations) instead of words to represent a query. The proposed method expands the question by adding the semantic entities obtained by executing the SPARQL query constructed from the why-question over the domain ontology. The major contribution of this research is an ontology-based why-question analysis method that considers the expected answer types. Experiments were conducted to evaluate each phase of the proposed method. The results show good performance on all measures used (i.e., precision, recall, undergeneration, and overgeneration). Furthermore, a comparison against two keyword-based baseline methods (i.e., the term-based and the phrase-based method) shows that the proposed method performs better in terms of MRR and P@10.
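For readers unfamiliar with the two ranking measures named at the end of this abstract, the sketch below shows their standard definitions: MRR averages the reciprocal rank of the first relevant answer over all questions, and P@10 is the fraction of the top ten results that are relevant. The function names and sample data are illustrative and do not come from the paper.

```python
def reciprocal_rank(ranked, relevant):
    """Return 1/rank of the first relevant item, or 0.0 if none is retrieved."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs):
    """Average the reciprocal rank over (ranked list, relevant set) pairs, one per question."""
    return sum(reciprocal_rank(ranked, relevant) for ranked, relevant in runs) / len(runs)


def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k results that are relevant (P@10 when k is 10)."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


if __name__ == "__main__":
    # Two hypothetical questions with ranked answer ids and their relevant sets.
    runs = [
        (["a3", "a1", "a7"], {"a1"}),
        (["b2", "b9", "b4"], {"b2", "b4"}),
    ]
    print(mean_reciprocal_rank(runs))
    print(precision_at_k(list("abcdefghij"), {"a", "c"}))
```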
FLOWER POLLINATION ALGORITHM (FPA) TO SOLVE QUADRATIC ASSIGNMENT PROBLEM (QAP) Samdean, Derby Prayogo; Suprajitno, Herry; Winarko, Edi
Contemporary Mathematics and Applications Vol 1, No 2 (2019)
Publisher : Universitas Airlangga

DOI: 10.20473/conmatha.v1i2.17398

Abstract

The purpose of this paper is to solve the Quadratic Assignment Problem (QAP) using the Flower Pollination Algorithm (FPA). The QAP concerns assigning facilities to locations so as to minimize the total assignment cost, where each facility is assigned to exactly one location and each location receives exactly one facility. The FPA is an algorithm inspired by the process of flower pollination. It has two main steps, global pollination and local pollination, controlled by a switch probability. The program was written in the Java programming language and applied to three cases grouped by size: small, medium, and large. The computation obtained the objective function value for each data set under various parameter values. Based on the pattern of the computational results, it can be concluded that a high maximum number of iterations helps the algorithm reach a better solution to this problem.
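The objective being minimized can be made concrete with a small sketch. It assumes the standard flow-times-distance formulation of the QAP, which the abstract does not spell out, and it uses exhaustive search over a tiny invented instance in place of the Flower Pollination Algorithm, purely to show what any solver has to evaluate. The paper's program is in Java; Python is used here only for brevity.

```python
from itertools import permutations


def qap_cost(assignment, flow, dist):
    """Total cost when facility i is placed at location assignment[i]."""
    n = len(assignment)
    return sum(
        flow[i][j] * dist[assignment[i]][assignment[j]]
        for i in range(n)
        for j in range(n)
    )


def solve_brute_force(flow, dist):
    """Try every one-to-one assignment; only feasible for very small instances."""
    n = len(flow)
    best = min(permutations(range(n)), key=lambda p: qap_cost(p, flow, dist))
    return best, qap_cost(best, flow, dist)


# Hypothetical 3-facility instance: symmetric flows between facilities
# and symmetric distances between locations.
FLOW = [[0, 5, 2], [5, 0, 3], [2, 3, 0]]
DIST = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]

if __name__ == "__main__":
    print(solve_brute_force(FLOW, DIST))
```

Because the number of assignments grows factorially with the number of facilities, exhaustive search stops being practical very quickly, which is why metaheuristics such as the FPA are applied to the medium and large cases.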
PENERAPAN DATA MINING DALAM E-LEARNING Sutrisno, Ashari; Winarko, Edi
Proceedings of KNASTIK 2009
Publisher : Duta Wacana Christian University

Abstract

Many organizations regard information as a highly valuable asset. Data mining enables organizations to make the fullest possible use of information and to discover new information in the abundance of data they hold, in support of tasks such as prediction (showing the characteristics of the data and estimating future information), identification (identifying an event or activity), classification (sorting data so that information can be grouped by particular categories or parameters), and optimization (making the best use of limited resources while maximizing the output obtained). Data mining has been used in many fields, including business management, telecommunications, pharmaceuticals, supermarkets, industry, transportation, health care, space flight, and electronic learning (e-learning). The growing distribution of learning materials in e-learning has made the amount of data very large and complex. Data that is too large and complex becomes difficult to use. A popular and effective way to deal with this is data mining. This paper discusses data mining, its application, and its techniques for e-learning.