Dissociated Press in PHP

My implementation of Dissociated Press, the classic Markov chain algorithm.

Dissociated Press is a potentially hilarious algorithm. Just give it a corpus of text, and watch it generate madness.

It started when I was reading this Daily WTF post and decided it would be fun to implement my own Markov chain algorithm.

Here's an example of the output, using the previously mentioned Daily WTF article as the corpus:

Gone! You've killed our had finished typing in so he buckled down, business platform?"
And ran out of He told him that backend.  At the time, everyone in the office ecstatic too,
"This is one day before release, a new job.  Five just have a bad-word minutes later he
came find.  He plugged in pronounces it wrong, differently.

Here's how the algorithm works:

1. Break up the corpus into a series of tokens, in my case words and some punctuation.

2. Jump to a random token.

3. Select the next four tokens and output them.

4. Find the other occurrences of those four tokens in the corpus. Randomly select one of those occurrences and jump to that location in the text.
a. If two or fewer occurrences are found, jump to a random token instead to avoid loops.

5. Repeat steps 3 and 4 until we've output enough text.
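
The steps above can be sketched in PHP roughly as follows. This is a minimal illustration, not the actual source: the function names (tokenize, findOccurrences, dissociate), the whitespace-only tokenizer, and the $maxTokens and $window parameters are all assumptions made for the example. It also assumes that "jumping to an occurrence" means continuing with the tokens that follow it, and that the occurrence count in step 4a includes the current position.

```php
<?php
// Hypothetical sketch of the steps above; not the project's actual source.

// Step 1: break the corpus into tokens. Assumption: split on whitespace
// only, so punctuation stays attached to the word it follows.
function tokenize(string $corpus): array
{
    return preg_split('/\s+/', trim($corpus), -1, PREG_SPLIT_NO_EMPTY);
}

// Return every start position where the sequence $needle appears in $tokens.
function findOccurrences(array $tokens, array $needle): array
{
    $positions = [];
    $n = count($needle);
    for ($i = 0; $i + $n <= count($tokens); $i++) {
        if (array_slice($tokens, $i, $n) === $needle) {
            $positions[] = $i;
        }
    }
    return $positions;
}

function dissociate(array $tokens, int $maxTokens, int $window = 4): array
{
    $count = count($tokens);
    if ($window < 1 || $count < $window) {
        throw new InvalidArgumentException('Corpus is too small for this window.');
    }

    $lastStart = $count - $window;      // last position with a full window
    $pos = random_int(0, $lastStart);   // Step 2: jump to a random token
    $output = [];

    while (count($output) < $maxTokens) {
        // Step 3: select the next $window tokens and output them.
        $chunk = array_slice($tokens, $pos, $window);
        array_push($output, ...$chunk);

        // Step 4: look for occurrences of that chunk in the corpus.
        $occurrences = findOccurrences($tokens, $chunk);
        if (count($occurrences) <= 2) {
            // Step 4a: too few occurrences, so jump somewhere random.
            $pos = random_int(0, $lastStart);
        } else {
            // Pick one occurrence and continue with the tokens after it.
            $pos = $occurrences[array_rand($occurrences)] + $window;
        }

        // If the jump leaves no room for a full window, jump randomly.
        if ($pos > $lastStart) {
            $pos = random_int(0, $lastStart);
        }
    }

    // Step 5 ends the loop; trim any overshoot from the last chunk.
    return array_slice($output, 0, $maxTokens);
}
```

With a corpus string in a variable such as $corpus, something like echo implode(' ', dissociate(tokenize($corpus), 50)); would print fifty tokens. The output differs on every run because the jumps are random.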

My implementation also tries to correct the formatting of the output text and to end on a complete sentence (or, more accurately, to end with a period and hope it's a sentence). It also has a few more bells and whistles, but you'll need to read the source to find out what they are.
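
As a rough idea of what that cleanup step could look like, here is a small sketch. The helper name tidyOutput and the specific rules (re-attaching punctuation, cutting at the last period, capitalizing the first letter) are assumptions for illustration; the actual source may handle this differently.

```php
<?php
// Hypothetical cleanup helper; the name and rules are assumptions,
// not taken from the project's source.
function tidyOutput(array $tokens): string
{
    $text = implode(' ', $tokens);

    // Joining with spaces leaves a gap before standalone punctuation
    // tokens ("out ." instead of "out."), so close that gap.
    $text = preg_replace('/\s+([.,;:!?])/', '$1', $text);

    // End on the last period, if any, and hope it closes a sentence.
    $lastPeriod = strrpos($text, '.');
    if ($lastPeriod !== false) {
        $text = substr($text, 0, $lastPeriod + 1);
    }

    // Start with a capital letter.
    return ucfirst($text);
}
```

When the text contains no period at all, this sketch leaves the ending untouched rather than returning an empty string.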

Give it a try HERE or check out the source HERE.