Stephen Hara

Monte Carlo Simulations and Explorations

Published on 3/11/2022

  • post
  • elixir
  • problem solving

I first learned about the Monte Carlo method of answering probabilistic questions in university, as part of a class on numerical methods. To quickly summarize: given some scenario with non-trivial but easily understood base probabilities, rather than going through the complicated process of deriving an exact answer to a particular question, a Monte Carlo simulation instead makes a large number of random observations based on the known probabilities and uses those observations to approximate the answer.

Perhaps more illustrative is to look at some example scenarios:

  • Given a random 5-card draw from a 52-card deck, how many hands would lose to a pair of Jacks?
  • Given the final ranking of each NFL team from the previous year, what are the odds of any particular team making the playoffs?
  • Not exactly a probability question, but: given the function f(x) = 7x^2 + 3x - 4, what is its integral over the range [-47.5, 81]? (A quick sketch of this one follows below.)
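
To give a flavor of how little code this takes, here's a minimal sketch of that last one in Elixir; the module name and sample count are my own placeholders. The idea is to sample x uniformly over the interval, average f(x), and scale by the interval's width.

defmodule MonteCarloIntegral do
  def estimate(f, a, b, samples \\ 100_000) do
    width = b - a

    sum =
      1..samples
      |> Enum.map(fn _ -> f.(a + :rand.uniform() * width) end)
      |> Enum.sum()

    width * sum / samples
  end
end

# The exact answer is just under 1.5 million; the estimate should land close by
MonteCarloIntegral.estimate(fn x -> 7 * x * x + 3 * x - 4 end, -47.5, 81)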

The Wikipedia page linked above has more examples, such as its (albeit limited) use in the Manhattan Project. I think it's a cool technique to have in one's arsenal, as it's pretty easy to whip up an impromptu simulation for whatever you're looking at, and it doesn't require any libraries as long as your language's standard library has some kind of random number generator. Even better, since you're doing the equivalent of throwing paint at the wall to see what patterns show up, the RNG doesn't need to be cryptographically secure! So in the worst case you can just pipe in from /dev/urandom and the effect on the end result will be immaterial.

Final Fantasy 14

I've recently been playing a lot of FF14 and working on the crafting and gathering jobs. Patch 5.21 released the most current version of the Diadem gathering instance, and I've spent a lot of time in this repetitive, monotonous zone. While zoning out to ultrawide monitor reviews on Youtube, I wanted to know the approximate pace of leveling from the Diadem experience. Rather than just an average number of gathering attempts per level, though, I wanted a rough distribution: a range of how many attempts a level might take, and how likely each count within that range is.

The simulation process was as follows:

  • Repeatedly simulate acting on a gathering node, accumulating experience while doing so and keeping track of how many nodes have been used
  • When the required amount of experience to level is accumulated, record the number of nodes
  • Repeat the above for some large-ish number of iterations, such as 10k or 100k

The most interesting part is the “simulate a gathering node” step, which has a few stochastic aspects:

  • Gathering a resource from a node has a base probability that determines the chance of success for a basic gather, which will only give you 1 of the resource
  • There is also a Gatherer’s Boon probability for you to get 2 of the resource when you succeed the basic gather, which also has a 100% exp bonus
  • Consecutive gathering successes have an experience bonus: 2 in a row is 5%, 3 is 10%, 4 is 20%, and 5+ is 50%

So it's a little less trivial than just dividing the required XP by the XP per gather, which would give an upper bound on the number of basic gather successes needed (and dividing that by the five gathers per node would give the corresponding upper bound on nodes). Still, nothing too complicated for a few for loops!

// let me add this later, it's on another machine...
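
In the meantime, here's a rough sketch of the shape it takes. All of the specific numbers below (gather and Boon chances, XP per gather, XP needed to level, five gather attempts per node) are made-up placeholders rather than the real in-game values, but the structure follows the bullets above.

defmodule DiademSim do
  # Placeholder values for illustration only; the real rates and XP numbers
  # would come from the game
  @gather_chance 0.95
  @boon_chance 0.30
  @xp_per_gather 1_000
  @xp_to_level 500_000
  @attempts_per_node 5

  # Consecutive-success bonus: 2 in a row is 5%, 3 is 10%, 4 is 20%, 5+ is 50%
  defp chain_bonus(streak) when streak >= 5, do: 0.50
  defp chain_bonus(4), do: 0.20
  defp chain_bonus(3), do: 0.10
  defp chain_bonus(2), do: 0.05
  defp chain_bonus(_), do: 0.0

  # Simulate acting on nodes until a level's worth of XP has accumulated,
  # returning how many nodes it took
  def nodes_to_level, do: count_nodes(0, 0, 0)

  defp count_nodes(xp, nodes, _streak) when xp >= @xp_to_level, do: nodes

  defp count_nodes(xp, nodes, streak) do
    {gained, streak} = simulate_node(streak)
    count_nodes(xp + gained, nodes + 1, streak)
  end

  # One node is a fixed number of gather attempts; each success can proc
  # Gatherer's Boon (double exp) and extends the success streak, while a
  # failure resets the streak
  defp simulate_node(streak) do
    Enum.reduce(1..@attempts_per_node, {0, streak}, fn _attempt, {xp, s} ->
      if :rand.uniform() < @gather_chance do
        s = s + 1
        base = @xp_per_gather * (1 + chain_bonus(s))
        gained = if :rand.uniform() < @boon_chance, do: base * 2, else: base
        {xp + gained, s}
      else
        {xp, 0}
      end
    end)
  end
end

# Run a large-ish number of iterations and tally how often each node count comes up
1..10_000
|> Enum.map(fn _ -> DiademSim.nodes_to_level() end)
|> Enum.frequencies()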

Japanese Kanji Combinations

Another silly idea I had at some point was to generate a bunch of 2-length kanji sequences and see how often they produce a valid 2-length word in Japanese. The answer is: basically never. Still fun, and pretty simple to code up in Elixir:

defmodule WordsFromKanji do
  @kanji_range "\u4e00\u9faf"

  @doc """
  Generate a random kanji sequence of the given `length`.

  ## Examples

      iex> WordsFromKanji.generate_sample() |> Enum.count()
      2
      iex> WordsFromKanji.generate_sample() |> to_string() |> String.valid?()
      true
      iex> WordsFromKanji.generate_sample() |> to_string() |> String.printable?()
      true
  """
  def generate_sample(length \\ 2) do
    [first, last] = String.to_charlist(@kanji_range)

    for _i <- 1..length do
      # :rand.uniform/1 returns 1..n, so shift by one to cover first..last inclusive
      :rand.uniform(last - first + 1) + first - 1
    end
  end
end

defmodule WordsFromKanji.WordList do
  def read_words(filename \\ "./data/kanji_words.txt") do
    filename
    |> File.read!()
    # trim: true drops the trailing "" entry from the file's final newline
    |> String.split("\n", trim: true)
  end
end

With these two modules, you can pretty easily open up an iex -S mix session and try it out:

iex(1)> words = WordsFromKanji.WordList.read_words()
["〃", "仝", "々", "○", "〇", "ABC順", "CDプレーヤー",
 "CDプレイヤー", "N響", "Oバック",
 "RS232ケーブル", "Tシャツ", "Tバック",
 "あうんの呼吸", "阿吽の呼吸", "あ・うんの呼吸", "明白",
 "明白", "偸閑", "白地", "明かん", "悪どい", "論う", "遇う",
 "配う", "馬酔木", "彼処", "彼所", "あっという間に",
 "あっと言う間に", "あっとゆう間に", "アッという間に",
 "アッと言う間に", "アッとゆう間に", "彼の", "あの人",
 "彼の人", "あの方", "彼の方", "溢れる", "阿呆陀羅", "甘子",
 "天魚", "雨子", "𩺊", "彼", "彼れ", "いい加減にしなさい",
 "いい年をして", "否々", ...]
iex(3)> for _i <- 1..1000 do
...(3)>   Enum.member?(words, to_string(WordsFromKanji.generate_sample()))
...(3)> end |> Enum.frequencies()
%{false: 1000}

I think an interesting next step for this question is to instead find all the 2-length words in the word list, take the first kanji from one of them, then generate a random second kanji and see how often that produces a valid word. This sidesteps the biggest reason the fully random approach fails: only about 4,600 of the roughly 21k kanji in the generation range ever show up as the first kanji of a 2-length word, so a random first kanji is unlikely to even be a valid start, and if the first kanji isn't valid, the word can't be either.

iex(4)> words |> Enum.filter(&(String.length(&1) == 2)) |> Enum.count()
60332
iex(5)> words |> Enum.count
199107
iex(7)> words |> Enum.filter(&(String.length(&1) == 2)) |> Enum.map(fn x -> String.at(x, 0) end) |> Enum.uniq() |> Enum.count()
4653 # this is all of the unique starting kanji
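
Before trying that, the list of unique starting kanji needs to be bound to a name; it's the same pipeline as iex(7), just without the final count. I'll call it start_kanji to match the session further down:

start_kanji =
  words
  |> Enum.filter(&(String.length(&1) == 2))
  |> Enum.map(&String.at(&1, 0))
  |> Enum.uniq()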

So, with this in mind, let's try it!

defmodule WordsFromKanji do
  # in addition to the same stuff as before
  def revised_samples(first_kanji_list) do
    # :rand.uniform/1 returns 1..n but Enum.at/2 is zero-indexed, so shift by one
    first_index = (first_kanji_list |> Enum.count() |> :rand.uniform()) - 1
    first_selected = Enum.at(first_kanji_list, first_index)

    # pair the known-good first kanji with a random second kanji from the full range
    [first, last] = String.to_charlist(@kanji_range)
    second_selected = :rand.uniform(last - first + 1) + first - 1
    first_selected <> <<second_selected::utf8>>
  end
end

# in iex

iex(39)> for _i <- 1..1000 do
...(39)>   Enum.member?(words, WordsFromKanji.revised_samples(start_kanji))
...(39)> end |> Enum.frequencies()
%{false: 1000}

Alas, still no hits. That isn't too surprising with some back-of-the-envelope math: ~60k 2-length words spread across ~4,650 starting kanji is roughly 13 valid second kanji per start, out of ~21k possibilities, so each trial has something like a 0.06% chance of landing on a real word. We'd need to run it quite a bit more than 1,000 times to reliably see a hit...but I shall leave that as an exercise for the reader!

Summary

Having questions about how the world and the various contraptions we've put into it work can lead to pretty interesting explorations. I hope this post empowers you to fire up an editor and write up some quick-'n-dirty simulations to aid your explorations, be it a Monte Carlo kind of exploration or anything else!