I wrote a program that speaks like the collective hive-mind of The Straits Times Forum

How can they seek recourse? And my personal favorite, “Bukit Timah belt schools breeding elitist students”, a passionate plea for students to stop wasting valuable money on Starbucks, so that the forum letter writer can sacrifice her much-less-valuable money on the exact same drinks.

I find that I cannot go a day without trawling through the Straits Times Forum for conventional wisdom so thoroughly lacking in modern Singapore.

And so I decided to create the Straits Times Forum Letter Generator (STFLG) to constantly tell me what I need to do in order for Singaporean society to not crumble to pieces.

If you’d like to give it a go, you can find it here: Straits Times Forum Letter Generator.

I will explain the algorithm used by STFLG in the next section — skip to the end for some of my favorite pieces made by STFLG.

Markov chains explained with the help of Beyoncé

The goal of STFLG is to create an article — essentially a bunch of text — in the style of the Straits Times Forum section.

The underlying algorithm I chose for STFLG is the Markov chain, a commonly used technique in next-word prediction on your mobile phone and in the prediction of stock prices.

Or in other hilarious applications like the Trump tweet generator — which is really not too different from STFLG.

Anyway, the Markov chain is a very simple algorithm that is often heavily obscured by unnecessarily complicated mathematical notation.

Let me instead attempt to explain Markov chains using a lyric from Crazy in Love by Beyoncé:

“<START> uh oh uh oh uh oh oh no no <END>”

Let’s say that we want to create a lyric in the same style as Crazy in Love.

Our algorithm would have to do two things: (1) learn the style of the song, and (2) generate a lyric with that style.

Remember, the goal isn’t to exactly reproduce the lyric but to create a new lyric that is similar in nature.

Specifically, (1) can be achieved by studying a corpus (a body of text) and calculating the probability of a word occurring given that another word has been chosen.

This sounds like abstract gobbledegook so let’s look at a concrete example.

In the lyric above, we know that “oh” comes after “uh” 100% of the time.

We also know that “uh” comes after “oh” 50% of the time, and “oh” comes after “oh” 25% of the time.

Given these probabilities, we can say the following. If the current word is:

“uh”, the next word is “oh” — 100% of the time
“oh”, the next word is “uh” — 50%, “oh” — 25%, “no” — 25% of the time respectively
“no”, the next word is “no” — 50%, “<END>” — 50% of the time respectively
“<START>”, the next word is “uh” — 100% of the time

We have just created our Markov chain!
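To make step (1) concrete, here is a minimal Python sketch of how such a chain could be built. The function name and structure are my own illustration, not the actual code behind STFLG:

```python
from collections import defaultdict

def build_chain(words):
    """For each word, count which words follow it, then turn the counts into probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    for current, nxt in zip(words, words[1:]):
        counts[current][nxt] += 1
    chain = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        chain[current] = {word: n / total for word, n in followers.items()}
    return chain

lyric = "<START> uh oh uh oh uh oh oh no no <END>".split()
chain = build_chain(lyric)
print(chain["oh"])   # {'uh': 0.5, 'oh': 0.25, 'no': 0.25}
```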

Now let’s perform step (2): generating a new lyric using our Markov chain. Let’s start with “<START>” and roll a die to pick the next word.

Since the next word that comes after “<START>” is “uh” 100% of the time, we choose “uh”.

Now we roll a die to pick our next word.

The next word that comes after “uh” is “oh” 100% of the time too, so we pick “oh”.

We now have:

“<START> uh oh”

Here’s where it gets interesting.

Given that we have the word “oh”, our next word could be “uh”, “no”, or “oh”.

Let’s say that we roll a die at this moment in time — the word happens to be “oh”.

But if we re-did this entire process, the word that was chosen could be “no” or “uh” instead.
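That “roll of a die” is just a weighted random choice. Continuing the hypothetical sketch above, picking the word that follows “oh” might look like this:

```python
import random

options = chain["oh"]    # {'uh': 0.5, 'oh': 0.25, 'no': 0.25}
next_word = random.choices(list(options), weights=list(options.values()))[0]
print(next_word)         # "uh" about half the time, "oh" or "no" about a quarter each
```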

And as such, we could have:

“<START> uh oh uh…” — 50% of the time
“<START> uh oh oh…” — 25% of the time
“<START> uh oh no…” — 25% of the time

And given that there are now three different possible outcomes when we have three words, we can have six outcomes when we have four words:

“<START> uh oh uh oh” — 50% of the time
“<START> uh oh oh uh” — 12.5% of the time
“<START> uh oh oh oh” — 6.25% of the time
“<START> uh oh oh no” — 6.25% of the time
“<START> uh oh no no” — 12.5% of the time
“<START> uh oh no <END>” — 12.5% of the time

After some #quickmaths (multiplying the probability of every transition along the way: 1 × 1 × 0.5 × 1 × 0.5 × 1 × 0.25 × 0.25 × 0.5 × 0.5), we get the original lyric:

“<START> uh oh uh oh uh oh oh no no <END>” — 0.390625% of the time

Skrrat, skidi-kat-kat.

Boom.

However, we also get lyrics that are similar to, but not exactly the same as, the original:

“<START> uh oh uh oh uh oh uh oh no no <END>” — 0.78125% of the time
“<START> uh oh uh oh oh uh oh no no <END>” — 0.390625% of the time
“<START> uh oh uh oh uh oh uh oh no <END>” — 1.5625% of the time

Skidiki-pap-pap, and a pu-pu-pudrrrr-boom.

Skya, du-du-ku-ku-dun-dun.

#quickmaths.

Sorry, I got distracted.

It is interesting to note that in this toy example, where there is only one lyric in the corpus, there is exactly a 0.390625% chance that the original lyric is generated, but a much higher chance (roughly 2.7%) that one of the three similar lyrics above is generated.

And so as the Markov chain grows, the generated lyric is unlikely to be exactly the same as the original, but it is very likely to be similar.

This is the intuition behind Markov chains.
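If you wanted to automate the whole walkthrough, a first-order generator could be as short as the sketch below (again with hypothetical names, reusing the chain built earlier):

```python
import random

def generate(chain, max_words=50):
    """Start at <START> and keep rolling the weighted die until <END> (or a length cap)."""
    words = ["<START>"]
    while words[-1] != "<END>" and len(words) < max_words:
        options = chain[words[-1]]
        next_word = random.choices(list(options), weights=list(options.values()))[0]
        words.append(next_word)
    return " ".join(words)

print(generate(chain))   # e.g. "<START> uh oh uh oh no no <END>"
```

Every run produces a different lyric, but each one respects the transition probabilities learned from the original.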

Improving our algorithm: second-order Markov chains

The algorithm described above is known as a first-order Markov chain; it works decently for the most part, but breaks in some cases.

Consider the following example.

We have generated three words, “I ate a”.

The first-order Markov chain attempts to find the next word.

However, it has five choices it can pick from — not all of which make sense to a human reader.

First-order Markov chain
Current word: “a”
Next word: “duck, window, giraffe, breeze, carrot”

But the algorithm is just as likely to pick “window” as it is “duck”.

I don’t know about you, but I don’t quite consider myself a connoisseur of windows.

This is where the first-order Markov chain trips up.

We can improve the generator by instead using a second-order Markov chain.

The second-order Markov chain is similar to the first-order Markov chain described above, except for one crucial detail.

Instead of choosing the next word based on what the previous word was, a second-order Markov chain chooses the next word based on what the previous two words were.

Consider the following example:

Second-order Markov chain
Current two words: “ate a”
Next word: “duck, carrot”

Second-order Markov chains are more likely to generate believable articles than first-order ones, since they have more “historical context”.
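As a rough sketch of that change (assuming the same hypothetical setup as before), the only real difference is that the keys of the lookup table become pairs of words:

```python
from collections import defaultdict
import random

def build_second_order_chain(words):
    """Map every pair of consecutive words to the list of words that followed that pair."""
    chain = defaultdict(list)
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        chain[(w1, w2)].append(w3)
    return chain

def generate_second_order(chain, first_two, max_words=100):
    """Walk the chain two words of context at a time."""
    words = list(first_two)
    while words[-1] != "<END>" and len(words) < max_words:
        followers = chain.get((words[-2], words[-1]))
        if not followers:          # this pair of words never appeared in the corpus
            break
        words.append(random.choice(followers))
    return " ".join(words)
```

Storing the followers as a plain list (duplicates included) means random.choice already picks them with the right empirical probabilities, so there is no need to normalise counts explicitly.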

One could conceivably implement higher-order Markov chains to improve the prediction ability of the generator.

And that is indeed the case for applications in stock price or protein folding prediction, where one would like as accurate a prediction as possible.

But in our case, we just want something that is believable — and not an exact replica of an existing Straits Times Forum letter.

Results

I very diligently studied thousands of Straits Times Forum letters and was able to create a second-order Markov chain capturing the “style” of the forum letters.

I then generated my own articles using the above-mentioned second-order Markov chain — you can play with it here: Straits Times Forum Letter Generator.
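In spirit, the whole generator is just the second-order sketch above applied to a much bigger corpus. The snippet below shows the idea with two made-up placeholder letters standing in for the real scraped corpus (which is not reproduced here):

```python
letters = [
    "I refer to the letter on errant cyclists .",
    "I urge the authorities to look into this matter .",
]  # placeholder stand-ins, not real forum letters

corpus = []
for letter in letters:
    # pad with two <START> tokens so the generator has an initial pair of context words
    corpus += ["<START>", "<START>"] + letter.split() + ["<END>"]

chain = build_second_order_chain(corpus)
print(generate_second_order(chain, ("<START>", "<START>")))
# in practice you would strip the <START>/<END> markers from the output
```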

Here are some of my favorite articles generated by STFLG.

I genuinely think some of these are poetry.

Disclaimer: none of these articles reflect any of my viewpoints… or really anyone’s.

Post to work together.

In the ward, if you are an imposed stringent code of conduct and watermark for peace in the many government grants and giving due regard to our environment immediately, but was not granted, and perhaps more outdoor seating provided for people attempting suicide; Sept 9).

It is arguably the most effective way of life, is unreasonable.

While most of us need to use Tasers to restrain the violent or weapons-wielding subjects.

Instead of begrudging ministers their high salaries.

Schools, need to respect and space requirements surface.

For instance, a sunset industry.

I may like hawker food remains affordable for all.

There is nothing more than 400 SCCs in schoolsWhile investing in the national feedback unit — guarantee access to affordable, quality goods and people fighting depression are labelled “spoilt” or “weak”.

Perhaps recess in Singapore.

Having account, ATM card can protect and facilitate but do not own a car.

They help Singaporeans understand the key driver of the taxpayer in entering into the public eye.

The estimated payout provided last month because the leaders were one with extremely few natural resources too, to ensure that these agencies have to continue increasing land sales programme without fixing the inherent problem, while dampening market sentiment with a bank account for FDWs to access social assistance to persons with disabilities to walk to a minority of Singaporeans who did well in the workforce remains relatively low property prices.

We take pride in securing this top honour.

Week’s Letter #1: Time S’pore responded like-for-like to Malaysia at Johor’s request; Jan 6).

I also wonder how much allows us back into the carriage and found it to mean targeting specific groups of National Development.

Working preserve, enhance Pulau Ubin’s living community which renders the island are organising “eat all you can” durian tours to Malaysia.

Hence we should adopt.

The recent cyber attack, I realised that it will be brave and strong corporate governance, often underpinned by technology.

The Ministry of Social and Family DevelopmentLions dearth of strikersSingapore’s biggest resource is its strength.

I have seen how these have been using gross domestic product (GDP) and GDP per capita utility usage rate is more objective to better understand the effectiveness of past and the appointment reminder sent via SMS.

We believe that a fractured Asean will respond to something they did not cease with the latest case of SNEC, its charges periodically to determine if tax avoidance based on their food.

Bit by bit, we can all be able to read about the best recruit who I voted for was the photograph that accompanied it — few would disagree with Mr Seah Yam Meng’s letter (Airport-like checks at our wet markets in the comfort of their PMDs, manage indiscriminate parking of cars drastically.

Takeaways

This was another fun-and-frivolous™ project brought to you by yours truly (mostly) because I am sick in bed.

Oh the things software engineers do for fun/instead of recuperating.

Here are my takeaways:

I was surprised at how easy it was to implement my own second-order Markov chain.

I might look back in a year or two and regret making a program that emulates the collective hive-mind of The Straits Times Forum.

If you enjoyed this, do check out my other pieces, like the one about a Machine Learning Chicken Rice Classifier that I made.
