A Refined Similarity-Based Bigram Model

In my previous post, I discussed the similarity-based bigram model (Dagan et al., 1998) and compared its performance against other classic n-gram smoothing techniques. Although the original similarity-based bigram model achieved lower perplexity than the Katz backoff model, it didn’t fare as well as the interpolated Kneser-Ney model on either the Penn Treebank (PTB) or the WikiText-103 dataset. In this blog post, I’ll introduce a new similarity-based bigram model with better performance.
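
For reference, the perplexity numbers above come from the standard definition: the exponentiated average negative log-likelihood of the test set under the model. A minimal sketch in Python (the `bigram_prob` callback here is a placeholder for whichever smoothed estimator is being evaluated, e.g. Katz backoff, interpolated Kneser-Ney, or a similarity-based model):

```python
import math

def bigram_perplexity(test_tokens, bigram_prob):
    """Perplexity of a token sequence under a bigram model.

    `bigram_prob(prev, word)` is assumed to return a smoothed
    probability P(word | prev), so it never returns zero.
    """
    log_prob = 0.0
    n = 0
    for prev, word in zip(test_tokens, test_tokens[1:]):
        log_prob += math.log(bigram_prob(prev, word))
        n += 1
    # exp of the average negative log-likelihood per predicted token
    return math.exp(-log_prob / n)
```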

Read More

Introduction to N-gram Language Models

Language modeling has been a pivotal area in Natural Language Processing (NLP). It forms the foundation for applications such as speech recognition, machine translation, spell correction, and more. One of the earliest and simplest techniques in language modeling is the N-gram model.
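
To make the idea concrete, here is a minimal, unsmoothed bigram estimator, a toy sketch rather than anything used in practice (the corpus and helper name are made up for illustration):

```python
from collections import Counter

def train_bigram_mle(tokens):
    """Maximum-likelihood bigram estimates:
    P(w_i | w_{i-1}) = count(w_{i-1}, w_i) / count(w_{i-1})."""
    unigram = Counter(tokens)
    bigram = Counter(zip(tokens, tokens[1:]))
    return lambda prev, word: bigram[(prev, word)] / unigram[prev]

# Toy corpus; real models add smoothing to handle unseen bigrams.
tokens = "the cat sat on the mat".split()
p = train_bigram_mle(tokens)
print(p("the", "cat"))  # 0.5 -- "the" is followed by "cat" once out of two occurrences
```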

Read More

A Developing Framework for Natural Language Systems

I firmly believe that a practical theory of natural language is necessary both for comprehending the human mind and for creating interpretable and reliable AI systems. While classical linguistic theories hold some truth and offer valuable insights into the nature of language, they have not achieved significant success in the field of NLP. This can be attributed primarily to their neglect of a crucial aspect: language acquisition. The primary objective of this project is to develop a comprehensive framework for acquiring, understanding, and generating natural language, with particular emphasis on efficient learning and generalization from limited observations. By pursuing this goal, we aim to enhance our understanding of both machine and human intelligence, ultimately leading to the development of interpretable and robust AI systems.

Read More

Introduction to Distributional Semantics

What makes car and automobile, or rage and anger, synonymous? A simple answer would be that they can be used interchangeably in many situations and contexts. This observation is the intuition behind the Distributional Hypothesis: words that occur in the same contexts tend to have similar meanings.
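
As a toy illustration, the co-occurrence counts below are invented, but they show how context overlap turns into vector similarity under the simplest count-based representation:

```python
import numpy as np

# Represent each target word by how often it co-occurs with a few context
# words. These counts are hypothetical, purely for illustration.
contexts = ["drive", "engine", "road", "feel", "emotion"]
vectors = {
    "car":        np.array([8, 5, 7, 0, 0], dtype=float),
    "automobile": np.array([6, 4, 5, 0, 1], dtype=float),
    "anger":      np.array([0, 0, 0, 6, 7], dtype=float),
    "rage":       np.array([0, 0, 1, 5, 8], dtype=float),
}

def cosine(u, v):
    # Cosine similarity: 1 for parallel vectors, 0 for orthogonal ones.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(vectors["car"], vectors["automobile"]))  # close to 1
print(cosine(vectors["car"], vectors["anger"]))       # close to 0
```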

Read More

On Interpretable Language Modeling

How do we acquire and utilize language? How do infants learn word meanings, word compositions, and grammar? How do children produce new phrases or sentences they have never heard before? Furthermore, how does language relate to intelligence? These are questions that have captivated researchers for many years.

Read More

Modifying Custom Matmul CUDA Kernels

I started learning CUDA last year and began writing matrix multiplication kernels as a learning project. After some struggle, I got them to work, but was disappointed to find that my kernels were about 10 times slower than cuBLAS GEMM kernels. Maybe my expectations were a bit too high. I tried many open-source matmul kernels on GitHub, but the best one I found was still about 5 times slower (some of them were optimized for older architectures). So I started the journey of optimizing my own matmul kernel. After a few months of trial and error, my matmul kernel finally reaches speeds comparable to cuBLAS GEMM.

Read More