Previously, I had my web site hosted on Azure with my DNS held at GoDaddy. These were fine UNTIL I lost my free credits. Once that happened I was getting charged ~\$50/month for my site. Since I don’t really do anything with it (besides a place to host files for my mobile apps) I didn’t want to pay for it.

Fast forward a year and I was messing around with hosting a trained TFv2 model in the AI-Engine. I would then interface with that model for a gambling web site I am messing around with. I get into this in a later post but it didn’t go well.

While I was in the Google Cloud console I started to look at hosting my web site. I saw that I can deploy to a bucket and then have DNS redirect.

First, I created a bucket that is named after my site. I then downloaded the files from Azure and uploaded them to the bucket. I changed the permissions and we were off an running.

Second, I needed to get my DNS moved so that I could just type in http://www.EckronSoftware.com and it would redirect. This was a few more steps but I had to go to GoDaddy and “unlock” my domain. After that was done I could set up a transfer. Finally, I was able to import it to Google.

Third, I needed to create the CNAME entries so that DNS would know how to direct it. This took a bit to refresh but in the end it worked just fine.

Finally, since I am a developer and I like to automate, I followed this page on how to set up a trigger so that every time I committed to GitHub it would deploy to Google.

I have decided to move my lift to the Google Cloud. Before, my web site was hosted in Azure through GoDaddy. I have since moved my HTML to a Google Cloud bucket and my domain to their domain service.

Over the next few weeks I will discuss what I have done and why. Including trying to put my DevPost submission in the cloud before Iowa gets finished approving legalized gambling.

My first post will be moving my web site to the cloud. This includes updating my DNS, implementing deployments through GitHub, and setting permissions.

My second post will be trying to use the AI-Engine with TensorFlow v2

My third and final post will be about moving my DevPost project to the cloud and using TensorFlowJS.

Information Gain in Decision Trees

One of the ways to determine HOW to break apart a decision tree is by using the Information Gain on that column. The idea is that it tells you how much you gain by splitting on that column. For example, if you split and ALL of the values in the left child have the same value in the Target column you don’t gain anything and your value is 0. If they are split in the middle you get a gain of 1.

The equation is:

$IG(A) = Entrophy(Target-or-Prior) - Remainder(A)$
This breaks down to:
$Entropy(Target-or-Prior) = I(P_1,P_2,..P_n) = \sum_{n=1}^{n} -p_i log_2 p_i$
$Remainder(A) = \sum_{i=1}^{v}\frac{a_i + b_i}{a+b} I(\frac{a_i}{a_i + b_i};\frac{b_i}{a_i + b_i})$

Now that we have the ugly formula out of the way lets put in some actual numbers to determine how we want to split on this tree

 Sample Clouds Sun Rain 1 Yes Yes No 2 Yes Yes No 3 No Yes No 4 No No Yes 5 Yes Yes Yes 6 No No No 7 No No Yes 8 Yes Yes Yes 9 No Yes No 10 Yes No No

In this case ‘Rain’ is our Target column. We have 6 no and 4 yes values. Our first step is to find the Entropy of Rain. We then need to find the Remainder of Clouds and Sun. Finally, we find the Information Gain and use that to break out tree.

Entropy of Rain

If we remember $I(P_1,P_2) = \sum_{i=1}^{n} -p_i log_2 p_i$

We can now put in the actual numbers:

$I(\frac{6}{10},\frac{4}{10}) = - \frac{6}{10} log_2 \frac{6}{10} + - \frac{4}{10} log_2 \frac{4}{10}$

$I(\frac{6}{10},\frac{4}{10}) = -0.6 * -0.7 + -0.4 * -1.3$

$I(\frac{6}{10},\frac{4}{10}) = 0.42 + 0.52$

$I(\frac{6}{10},\frac{4}{10}) = 0.94$

We know that the Entropy of Rain is 0.94.

On a side note, $I(0,1) = 0$ AND $I(\frac{1}{2},\frac{1}{2}) = 1$.

Remainder of Cloud and Sun

If we remember the formula is $Remainder(A) = \sum_{i=1}^{v}\frac{a_i + b_i}{a+b} I(\frac{a_i}{a_i + b_i};\frac{b_i}{a_i + b_i})$

Let get some actual numbers and break down the formula.

 Cloud No Yes No 3 3 Yes 2 2
 Sun No Yes No 2 4 Yes 2 2

For the remainder we need to calculate the number of Yes/No for each of the Yes/No from Rain. These are the values in the grids above.

$Remainder(A) = \sum_{i=1}^{v}\frac{a_i + b_i}{a+b} I(\frac{a_i}{a_i + b_i},\frac{b_i}{a_i + b_i})$

$Remainder(Cloud) = \frac{6}{10}I(\frac{3}{6},\frac{3}{6})+\frac{4}{10}I(\frac{2}{4},\frac{2}{4})$

WHERE $\frac{6}{10}$ comes from the number of No’s total. $\frac{3}{6}$ comes from the number of No’s that are ALSO No’s in the Rain column. The second $\frac{3}{6}$ from the number of Yes’s that are ALSO No’s in the Rain column.

$\frac{4}{10}$ comes from the number of Yes’s total. $\frac{2}{4}$ comes from the number of No’s that are ALSO Yes’s in the Rain column. The second $\frac{2}{4}$ from the number of Yes’s that are ALSO Yes’s in the Rain column.

$Remainder(Cloud) = \frac{6}{10}(-\frac{3}{6} log_2 \frac{3}{6} + -\frac{3}{6} log_2 \frac{3}{6} )+\frac{4}{10}I(-\frac{2}{4} log_2 \frac{2}{4} + -\frac{2}{4} log_2 \frac{2}{4} )$

$Remainder(Cloud) = \frac{6}{10}(-.5 * -1 + -.5 * -1)+\frac{4}{10}I( -.5 * -1 + -.5 * -1 )$

$Remainder(Cloud) = \frac{6}{10}+\frac{4}{10}$

$Remainder(Cloud) = 1$

Now, lets do Sun

$Remainder(Sun) = \frac{6}{10}I(\frac{2}{6},\frac{4}{6})+\frac{4}{10}I(\frac{2}{4},\frac{2}{4})$

WHERE $\frac{6}{10}$ comes from the number of No’s total. $\frac{2}{6}$ comes from the number of No’s that are ALSO No’s in the Rain column. $\frac{4}{6}$ from the number of Yes’s that are ALSO No’s in the Rain column.

$\frac{4}{10}$ comes from the number of Yes’s total. $\frac{2}{4}$ comes from the number of No’s that are ALSO Yes’s in the Rain column. The second $\frac{2}{4}$ from the number of Yes’s that are ALSO Yes’s in the Rain column.

$Remainder(Sun) = \frac{6}{10}(-\frac{2}{6} log_2 \frac{2}{6} + -\frac{4}{6} log_2 \frac{4}{6} )+\frac{4}{10}I(-\frac{2}{4} log_2 \frac{2}{4} + -\frac{2}{4} log_2 \frac{2}{4} )$

$Remainder(Sun) = \frac{6}{10}(-.33 * -1.58 + -.66 * -.58)+\frac{4}{10}I( -.5 * -1 + -.5 * -1 )$

$Remainder(Sun) = \frac{6}{10}(.52 +.38)+\frac{4}{10}(1)$

$Remainder(Sun) = \frac{6}{10}(.9)+\frac{4}{10}(1)$

$Remainder(Cloud) = .54 + .4$

$Remainder(Cloud) = .94$

Information Gain

Remember $IG(A) = Entropy(Rain) - Remainder(A)$

$IG(Cloud) = 0.94 - 1 = -0.06$

$IG(Sun) = 0.94 - 0.94 = 0.0$

So, Sun has the highest Information gain and that is how you should start your decision tree.

Now, I made this simple since I only had 2 columns but if you had more columns you would need to do this again using Sun as the Target and build off of that.