Lessons from the Text-to-Text Transformer (T5) ablation studies

Reading the T5 paper was a pleasure for me and has helped me learn a great deal.

The paper is written in a way that’s easy to understand and follow. Its long format (44 pages) gives the authors room to explain things in detail.

Most importantly, its explicit focus is on ablation studies that shed a clear light on what works and what doesn’t, pointing the way for future explorations.

A quick intro to T5

  • Framing every task as text-to-text lets the model solve both generation and classification problems with the exact same encoder-decoder architecture, without needing a different “head” for each problem (as was done with BERT). This is a really cool and ambitious way to model the problem. To tell the model which task to perform, a prefix is added to the input to signal what’s expected in the output (e.g. “translate English to German: <input>”).
  • In pretraining, the dropped-out tokens can be phrases (multiple contiguous words), whereas BERT drops out single words.
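The two points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `add_task_prefix` and `corrupt_spans` are hypothetical, words stand in for real subword tokens, the spans are picked by hand rather than sampled, and the `<extra_id_N>` sentinel naming follows the convention used in the Hugging Face T5 vocabulary rather than anything stated in this post.

```python
def add_task_prefix(task, text):
    """Prepend a task prefix so one model can be told which task to perform."""
    return f"{task}: {text}"


def corrupt_spans(words, spans):
    """Replace each (start, end) span of words with a sentinel token.

    `spans` must be sorted, non-overlapping, half-open index ranges.
    Returns (corrupted_input, target): the target lists each sentinel
    followed by the words it replaced, and ends with a final sentinel.
    """
    inputs, targets = [], []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"  # assumed sentinel naming
        inputs.extend(words[cursor:start])  # keep the words before the span
        inputs.append(sentinel)             # one sentinel stands in for the whole span
        targets.append(sentinel)
        targets.extend(words[start:end])    # the dropped words go to the target
        cursor = end
    inputs.extend(words[cursor:])           # keep the tail after the last span
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel
    return " ".join(inputs), " ".join(targets)


prefixed = add_task_prefix("translate English to German", "That is good.")
# prefixed == "translate English to German: That is good."

words = "Thank you for inviting me to your party last week".split()
corrupted, target = corrupt_spans(words, [(2, 4), (8, 9)])
# corrupted == "Thank you <extra_id_0> me to your party <extra_id_1> week"
# target    == "<extra_id_0> for inviting <extra_id_1> last <extra_id_2>"
```

Note how the two-word phrase “for inviting” collapses into a single sentinel, which is the difference from masking one word at a time.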

Below is a non-exhaustive list of key ablation studies in the paper.

A Transformer model for inserting Vietnamese accent marks

Update: The trained model and usage instructions can now be downloaded from Hugging Face here.

In this post, I summarize how I used Hugging Face’s Transformers library to re-solve an NLP problem related to the Vietnamese language.

The problem

After learning about Hidden Markov models more than 10 years ago, I decided to apply them to building a small but practical toy that automatically inserts accent marks into Vietnamese text.

In a nutshell, Vietnamese has some letters that carry additional marks. For example, in addition to the letter ‘a’, the Vietnamese alphabet also contains these “marked versions”: ă, â.

And on each of these 3 versions (a, ă, â), we can then put the 5 tone marks. For ‘ă’, for example, these are: ắ (acute), ằ (grave), ẳ (hook), ẵ (tilde), ặ (dot).
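To make the combinations concrete, here is a minimal sketch using Python’s standard `unicodedata` module. It builds the toned variants by appending a Unicode combining mark to each base letter and normalizing to NFC. The helper names `toned_variants` and `strip_marks` are hypothetical, and `strip_marks` only illustrates what unaccented input might look like; it is not taken from the model described in this post.

```python
import unicodedata

# The three base letters discussed above: a, ă, â.
BASES = ["a", "\u0103", "\u00e2"]

# Unicode combining characters for the five tone marks.
TONE_MARKS = {
    "acute": "\u0301",
    "grave": "\u0300",
    "hook": "\u0309",
    "tilde": "\u0303",
    "dot": "\u0323",
}


def toned_variants(base):
    """Return the five toned forms of a base letter as single characters."""
    # NFC normalization merges base letter + combining mark into one code point.
    return [unicodedata.normalize("NFC", base + mark) for mark in TONE_MARKS.values()]


def strip_marks(text):
    """Remove all combining marks, giving the unaccented form of the text.

    The letter đ has no decomposition, so it is left unchanged here.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


for base in BASES:
    print(base, toned_variants(base))

print(strip_marks("Việt"))  # prints "Viet"
```

Each base letter yields 5 toned forms, so together with the unmarked forms the three letters cover 18 variants of ‘a’, all of which collapse to a plain ‘a’ once the marks are removed. Restoring the right one is the problem being solved.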

A few notes on Item Response Theory (IRT) and Computerized adaptive testing

Recently, I was thinking about how to improve the accuracy of assessment tests for ESL learners and so I googled and found Computerized Adaptive Testing (CAT).

In the process, I stumbled upon the interesting theory behind it. It’s called Item Response Theory, or IRT for short.

So I’ve spent some time reading up about it and in the process, picked up a few very useful bits about statistical hypothesis testing, which I’m very glad to have learned.

Below, I share the most important ideas about IRT that I’ve learned.

A few thoughts on learning, responsibility and commitment

Until recently, I knew only one meaning of “learning”: taking in more knowledge, or improving one’s existing knowledge.

So it’s mostly about information and knowledge. About what one knows.

But with time and more experience in management, communication and work in general, I realized that there’s another type of learning that is even more important for one to make progress in work and life.

And it’s not quite the knowledge-gathering type of learning described above.

Building a culture of high performance: Learn to give and receive feedback well

There are many things we can learn from the book Netflix’s Culture of reinvention. Among them, one practice we can all learn and apply is its insistence on “selfless candor”: improving performance by receiving regular feedback from everyone.

To build a culture that really embraces constant learning and improvements, learning to give and receive feedback well is a sine qua non.

Without constant 360-degree feedback, we identify our mistakes more slowly (and sometimes remain completely oblivious to them) and as a result, we learn and improve more slowly.

A few thoughts on the current phase of online learning

In this article from a16z, the authors discuss the 3 phases of online learning in the US.

Here’s a quick recap of the 3 phases, according to the authors:

Phase 1: MOOCs (Massive Open Online Courses): university-style courses such as those offered by Coursera, MIT OpenCourseWare, etc.

Phase 2: Tools & resources that support in-person tutoring.
This phase includes software in 3 subcategories:

  • Learning management systems (LMS): for admin-related work
  • Pre-recorded content: such as YouTube, Khan Academy, Duolingo, etc.
    • In my opinion, we should have a new category for self-study learning software such as Duolingo, ABC Mouse, etc., because their content is far more dynamic and customized than pre-recorded videos.
  • Tutoring & tutor-matching platforms: facilitating online tutoring & student-tutor matching. Examples include PhotoMath, Brainly, etc.
