Hendra Bunyamin is a lecturer who graduated from the Mathematics department of Bandung Institute of Technology in 1999 and from the Software Engineering Informatics department of the same university in 2003.

He is passionate about teaching and mainly teaches Mathematics and Programming.

His research interests are machine learning and its applications.

He also enjoys sharing his faith and his understanding of maths and machine learning on his blog.

Selected Courses
Bachelor Program in Informatics Engineering
  • Linear Algebra
  • Discrete Mathematics
  • Introduction to Logic
  • Natural Language Processing
  • Data Mining
  • Design Patterns
  • Machine Learning
Bachelor Program in Information Systems
  • Object-Oriented Programming


Information Retrieval System dengan Metode Latent Semantic Indexing
Master Thesis at Bandung Institute of Technology

An information retrieval (IR) system searches for and retrieves information relevant to users’ needs: it retrieves and displays documents that are relevant to the user’s input (the query). One way to retrieve relevant information is to match the query semantically against the document collection, and Latent Semantic Indexing (LSI) is one such method. For example, given the query ‘purchase’, the words ‘purchase’ and ‘buy’ match semantically, so LSI retrieves documents that contain either or both of those words. This thesis compares the performance of the LSI method with that of the vector method. Performance is measured by non-interpolated average precision (NIAP). (download dataset) (download pdf)
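As an illustration of the evaluation measure, here is a minimal sketch of non-interpolated average precision for a single query. It assumes the commonly used definition (the mean of the precision values at each rank where a relevant document appears, divided by the total number of relevant documents); the exact variant used in the thesis may differ. The function name and the document IDs are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """Non-interpolated average precision for one query.

    ranked_ids: document IDs in the order the system returned them
                (assumed to contain no duplicates).
    relevant_ids: IDs of all documents judged relevant to the query.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank: relevant documents seen so far / rank.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


# Hypothetical ranking in which the 1st and 3rd results are relevant.
score = average_precision(["d1", "d2", "d3", "d4"], {"d1", "d3"})
print(round(score, 4))
```

Averaging this value over all test queries gives a single figure for comparing two retrieval methods on the same collection.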



Sentiment Classification Menggunakan Machine Learning: Metode Naive-Bayes dan Support Vector Machines (Studi Kasus: Movie Reviews imdb.com)
Seminar Teknologi Informasi dan Sistem Informasi (SETISI)

A great deal of information is now available in the form of online documents. Researchers have investigated automatic text categorization as part of the effort to organize this information for users. Much of the existing work focuses on topical categorization, sorting documents by subject (for example, sports vs. politics). Recently, however, a new focus has emerged: how to sort or classify documents according to their sentiment, that is, the overall opinion expressed toward the subject under discussion (for example, whether a product review is positive or negative). This study investigates the effectiveness of machine learning techniques for solving the sentiment classification problem. (download pdf)
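A minimal sketch of how one of the methods named in the title, Naive Bayes, can classify a review as positive or negative. It assumes a multinomial model with add-one (Laplace) smoothing over whitespace tokens; the paper's actual features and preprocessing are not described here, and the toy reviews and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """docs: list of (tokens, label) pairs. Returns the counts the classifier needs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def classify(model, tokens):
    """Return the label with the highest log posterior for the given tokens."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior of the class.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # Words never seen in training are ignored.
            # Add-one smoothing avoids zero probabilities.
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, not taken from the paper's dataset.
training = [
    ("great fun movie".split(), "pos"),
    ("wonderful great acting".split(), "pos"),
    ("boring bad movie".split(), "neg"),
    ("terrible bad plot".split(), "neg"),
]
model = train_naive_bayes(training)
print(classify(model, "great acting".split()))
```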



A Comparison of Retweet Prediction Approaches: The Superiority of Random Forest Learning Method
TELKOMNIKA (Telecommunication Computing Electronics and Control)

We consider the following retweet prediction task: given a tweet, predict whether it will be retweeted. In the past, a wide range of learning methods and features has been proposed for this task. We provide a systematic comparison of the performance of these learning methods and features in terms of prediction accuracy and feature importance. Specifically, from each previously published approach we take the best-performing features and group these into two sets: user features and tweet features. In addition, we contrast five learning methods, both linear and non-linear. On top of that, we examine the added value of a previously proposed time-sensitive modeling approach. To the authors’ knowledge, this is the first attempt to collect best-performing features and contrast linear and non-linear learning methods. We perform our comparisons on a single dataset and find that user features such as the number of times a user is listed, the number of followers, and the average number of tweets published per day contribute most strongly to prediction accuracy across the selected learning methods. We also find that a random forest-based learning method, which has not been employed in previous studies, achieves the highest performance among the learning methods we consider. Finally, we find that on top of properly tuned learning methods the benefits of time-sensitive modeling are very limited. (download pdf)



Automatic Topic Clustering Using Latent Dirichlet Allocation with Skip-gram Model on Final Project Abstracts
The 21st International Computer Science and Engineering Conference (ICSEC)

Topic models have been an elegant method for discovering hidden structures in knowledge collections such as news archives, blogs, web pages, scientific articles, books, images, voices, videos, and social media. The basic topic model is Latent Dirichlet Allocation (LDA), and this paper uses LDA to automatically cluster topics from a collection of final project abstracts. We compare two methods: LDA as a unigram model and LDA with a skip-gram model. Our results are evaluated by an expert against readily available categories. Overall, the words from each topic are indeed keywords describing that topic; moreover, the combination of LDA and the skip-gram model is capable of capturing key phrases from each topic. (download pdf)



Pemanfaatan Inverted Index pada Proses Penelusuran Kesamaan Isi File Dokumen pdf Tugas Akhir Mahasiswa
Seminar Nasional Teknologi Informasi dan Komunikasi (SENTIKA)

Tracing content similarity in scientific writing is one way to reduce or eliminate plagiarism among researchers, including students in higher education. This study aims to build a simple web-based application that uses an inverted index to determine how similar a document is to the documents already held. All comparison documents are stored in a local database to make searching easier. The documents being compared are PDF files containing part or all of a student's final project report written in Indonesian. Based on the experiments conducted, the resulting application is able to measure how similar the given sentences and documents are to the reference documents stored in the database. (download pdf)
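A minimal sketch of the inverted-index idea, assuming whitespace tokenization and a simple term-overlap score. The paper's actual similarity measure, PDF text extraction, and database storage are not reproduced; the function names and sample texts are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """documents: dict mapping doc_id -> text. Returns term -> set of doc_ids."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def overlap_scores(index, query_text):
    """Fraction of the query's distinct terms that occur in each reference document."""
    terms = set(query_text.lower().split())
    counts = defaultdict(int)
    for term in terms:
        # Only documents that actually contain the term are visited.
        for doc_id in index.get(term, ()):
            counts[doc_id] += 1
    return {doc_id: n / len(terms) for doc_id, n in counts.items()}


# Hypothetical reference collection and submitted text.
references = {
    "ta-001": "sistem temu kembali informasi",
    "ta-002": "klasifikasi sentimen ulasan film",
}
index = build_inverted_index(references)
print(overlap_scores(index, "sistem informasi akademik"))
```

The index lets the lookup touch only the reference documents that share at least one term with the submitted text, rather than scanning every stored document.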



The Relationship between Country Risk and Company Performance in Southeast Asia
Journal of Business & Retail Management Research Vol-12, Issue 3, April 2018

Managing risk is important. Organizations are starting to see the value of, or are asking for, strategic solutions for managing risk. Risk refers to a deviation from what the organization plans or expects. Risk has an upside (opportunity) as well as a downside, the potential negative impact on an asset. This type of risk (loss) can prevent companies from achieving strategic goals. Organizations can turn risks into opportunities through effective risk management. For public companies that have subsidiaries in many countries, one of the risks that should be managed is country risk. Country risk is defined as the risk that a foreign government will default on its bonds or other financial commitments. Country risk also refers to the broader notion of the degree to which political and economic unrest affects the securities of issuers that do business in a particular country. In this research, we analyze the effect of country risk on company performance. We employ linear regression to model the effect, and the result shows that country risk has a significant negative influence on Return on Equity (ROE). We also build nine models that predict country risk ratings from country risk reports using machine learning algorithms. Among them, the decision tree algorithm has the highest accuracy, 31.25%, on our dataset. Finally, our results show that, first, international companies that have overseas subsidiaries can benefit from using country risk as a tool to measure returns. Second, the decision tree algorithm should be used to help decision makers determine country risks based on country reports; however, the effect of feeding time-series data into the machine learning algorithms still needs more investigation. (download pdf)



Analisis Performa dan Pengembangan Sistem Deteksi Ras Anjing pada Gambar dengan Menggunakan Pre-Trained CNN Model
Jurnal Teknik Informatika dan Sistem Informasi Volume 4 Nomor 2 Agustus 2018

The main objective of this research is to develop an image recognition system for distinguishing dog breeds using Keras’ pre-trained Convolutional Neural Network (CNN) models and to compare the accuracy of those models. Specifically, the models used are ResNet50, Xception, and VGG16. The system we develop is a web application built with the Flask framework. This research also explains how deep learning approaches such as CNNs can distinguish an object in an image. After manually testing the system on a set of images, we find that the models perform differently and that Xception comes out best in terms of accuracy. We also test end-users’ acceptance of the user interface we develop. (download pdf) (source code)




MOOC Courses that I took

Deep Learning Nanodegree by Udacity

Artificial Intelligence is transforming our world in dramatic and beneficial ways with Deep Learning powering that progress. Together with experts in the deep learning field, Udacity provides a dynamic introduction to this amazing field using weekly videos, exclusive projects, and expert feedback to teach you the basics of this future-shaping technology.

Natural Language Processing by National Research University Higher School of Economics on Coursera

This course covers a wide range of tasks in Natural Language Processing from basic to advanced: sentiment analysis, summarization, dialogue state tracking, to name a few. Upon completing it, you will be able to recognize NLP tasks in your day-to-day work, propose approaches, and judge what techniques are likely to work well. The final project is devoted to one of the hottest topics in today’s NLP. You will build your own conversational chat-bot that will assist with search on the StackOverflow website. The project will be based on the practical assignments of the course, which will give you hands-on experience with such tasks as text classification, named entity recognition, and duplicate detection. Throughout the lectures, we will aim at finding a balance between traditional and deep learning techniques in NLP and cover them in parallel. For example, we will discuss word alignment models in machine translation and see how similar they are to the attention mechanism in encoder-decoder neural networks. Core techniques are not treated as black boxes. On the contrary, you will get an in-depth understanding of what’s happening inside. To succeed in that, we expect your familiarity with the basics of linear algebra and probability theory, the machine learning setup, and deep neural networks. Some materials are based on one-month-old papers and introduce you to the very state of the art in NLP research.