QwikLabs: Intro to ML: Image Processing

As stated in my previous post, I was given 1000 credits (~$1000) for QwikLabs. Today, I finished my first “quest”. It was titled Intro to ML: Image Processing.

I will restate this: I LOVE how QwikLabs are set up. They give you an entire temporary Google Cloud account so you don’t have to mess with your own account and risk unwanted billing or other changes. Once the lab is done, the account gets deleted and you go on your way.

This lab covered a few different aspects of Google Cloud. First, the console: if you are familiar with working in Linux, it is a simple transition. Second, the storage system, “buckets”: we worked through some pretty simple permissions as well as uploading files for processing.

The part that I liked most was using the AI-Engine to host a trained model. It was a super simple model, but since I failed at this last time it was cool to see it work as expected. Plus, they showed how to host a model and access it externally. This will definitely be something I do once they start supporting TFv2.

The last few sections used the API to process images. The first was simple label recognition. This stood out because you could change the request JSON and have it return internet pages that contained the same image. Second, we processed an image to detect people’s faces and likely emotions, as well as landmarks. Finally, we processed a sign with some French text on it. We were able to translate it to English as well as add some more processing that would give us information (links, etc.) about what was printed.
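The lab drives all of this through the REST API, but if you want to poke at the same features yourself, here is a minimal sketch with the google-cloud-vision Python client (the bucket path and file name are made up, and the exact import style varies by client-library version):

    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    # Point the API at an image already sitting in a bucket (hypothetical path).
    image = vision.types.Image()
    image.source.image_uri = 'gs://my-demo-bucket/sign.jpg'

    # Label recognition: what is in the picture?
    for label in client.label_detection(image=image).label_annotations:
        print(label.description, label.score)

    # Web detection: internet pages that contain the same image.
    for page in client.web_detection(image=image).web_detection.pages_with_matching_images:
        print(page.url)

    # Face detection: faces plus likely emotions.
    for face in client.face_detection(image=image).face_annotations:
        print(face.joy_likelihood, face.anger_likelihood)

    # Text detection: pulls the French text off the sign, ready for translation.
    response = client.text_detection(image=image)
    if response.text_annotations:
        print(response.text_annotations[0].description)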

Overall, VERY COOL first lab. I will get started on my next round of cloud training soon.


Pluribus: Facebook and Carnegie Mellon’s Poker AI

Article: https://science.sciencemag.org/content/early/2019/07/10/science.aay2400

There are few things more frustrating to me in the machine learning/AI world than seeing Buzzfeed-type companies write about ML/AI. Sites like that boil the actual facts down into clickbait titles. It seems any “learning” that occurs is one step away from Skynet. I honestly don’t have a single site I trust to present the facts plainly. I always have to fall back on the technical paper, if one was written, or something from the actual authors.

I come from an ML background with very little AI experience, but most of the latest advancements are similar enough to reinforcement learning that I can piece them together. So, this is my attempt to do just that.

Poker and Machines

I have long wanted to be part of something that would be able to “solve” poker. The sheer size of the state and action spaces fascinates me. I was always of the opinion that with infinite knowledge you could beat emotional players. It always seemed the best and most consistent players were the mellow mathematicians who approached the game like a math problem. At each step, there are known percentages for the different plays. A computer, with far more memory than we could ever hope to have, could keep all of this available.

With Google doing so well at chess and Go, I figured poker was fairly close behind. Obviously, poker doesn’t have “perfect” information, since you don’t know what your opponents have in their hands.

Learning From Machines

My biggest excitement going forward is how humans can learn from machines. It has been stated that the top chess players have learned a few new opening strategies after playing AlphaZero.

EXCLUDING THE ETHICAL ASPECT, I am curious to see what we could learn on the battlefield from an advanced war simulation. Maybe there is something out there that could save lives during our ongoing conflicts that humans have never even thought of.

Evan’s Summary

Discretization

From what I can tell, the researchers at CMU had the same problem that early Q-learning did: the action space and state space were too large to handle in a traditional array. Q-learning eventually moved toward neural networks, but early on it discretized the input. That is what Pluribus does with both actions and information.

For the action space, they group similar bet sizes together (think of $105 as the same as $100 or $110), and then during play they use a search algorithm to narrow down the actual decision.

For the information collected, they group similar hands together IN FUTURE ROUNDS, since those get played the same way. In the current round, though, they use the exact hand.
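The action abstraction is easy to picture in code. Here is a toy sketch (the bucket values are invented for illustration; the paper’s actual abstraction is pot-relative and far more sophisticated):

    # Toy action abstraction: snap a raw bet size to the nearest bucket.
    # These bucket values are made up; Pluribus uses a much richer set.
    BET_BUCKETS = [50, 100, 150, 200, 300, 500, 1000]

    def abstract_bet(bet: int) -> int:
        """Map a concrete bet (e.g. $105) to its abstract bucket (e.g. $100)."""
        return min(BET_BUCKETS, key=lambda bucket: abs(bucket - bet))

    print(abstract_bet(105))  # -> 100
    print(abstract_bet(140))  # -> 150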

Offline Training (Blueprint)

To build the offline model they used what is commonly called counterfactual regret minimization (CFR). Basically, after a hand is over, they go back through it and work out what “should” have been done and how it would have affected the outcome. What is new (to me at least) is that they used Monte Carlo sampling of actions rather than traversing the entire game tree. Because the AI plays every seat in the offline games, it knows all of this information.
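CFR is a big topic, but its regret-matching core is small. Here is a minimal self-play sketch on rock/paper/scissors standing in for a single decision point (this is plain regret matching against a fixed opponent, a toy, not the paper’s Monte Carlo CFR over a full game tree):

    import random

    ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors
    # PAYOFF[my_action][opponent_action] from my point of view.
    PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

    def strategy_from_regrets(regrets):
        """Play in proportion to positive regret; uniform if there is none."""
        positive = [max(r, 0.0) for r in regrets]
        total = sum(positive)
        return [p / total for p in positive] if total > 0 else [1 / ACTIONS] * ACTIONS

    def train(iterations=100_000):
        regrets = [0.0] * ACTIONS
        strategy_sum = [0.0] * ACTIONS
        opp_strategy = [0.4, 0.3, 0.3]  # a fixed, slightly rock-heavy opponent
        for _ in range(iterations):
            strategy = strategy_from_regrets(regrets)
            for a, s in enumerate(strategy):
                strategy_sum[a] += s
            my_action = random.choices(range(ACTIONS), weights=strategy)[0]
            opp_action = random.choices(range(ACTIONS), weights=opp_strategy)[0]
            # Regret: how much better each action would have done than the one played.
            for a in range(ACTIONS):
                regrets[a] += PAYOFF[a][opp_action] - PAYOFF[my_action][opp_action]
        total = sum(strategy_sum)
        return [s / total for s in strategy_sum]

    print(train())  # the average strategy drifts toward the best response (paper)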

Counterfactual Regret

CFR guarantees that the overall interaction converges to a Nash equilibrium in a two-player zero-sum game.

CFR guarantees in all finite games that all counterfactual regrets grow sublinearly in the number of iterations. This, in turn, guarantees in the limit that the average performance of CFR on each iteration that was played matches the average performance of the best single fixed strategy in hindsight. CFR is also proven to eliminate iteratively strictly dominated actions in all finite games.

Superhuman AI for multiplayer poker

Training Time

According to the paper, training took 8 days on a 64-core server, using 12,400 CPU core-hours (roughly 8 days × 24 hours × 64 cores) and less than 512 GB of memory. Assuming current cloud rates, they said it would cost about $144 to produce.

Playing Strategy

Because the offline blueprint is “coarse” (a consequence of poker’s complexity), they only use it directly in the first betting round, where they considered it safe. After that, or if a player confuses it by betting an odd size, Pluribus uses real-time search.

While real-time search has been successful in perfect-information games, there is a problem in imperfect-information games; the paper calls the naive version “fundamentally broken” there. Their example is rock/paper/scissors. In my previous blog post I showed that the equilibrium strategy plays each move with probability 1/3 (for an expected value of 0). A naive search sees that every move has the same value and might just pick scissors every time. This would cause an issue, as the other player would notice and win every time with rock.
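A quick sketch of why that determinism is exploitable, reusing the payoff matrix from the regret-matching toy above (again my own illustration, not code from the paper):

    # Expected payoff of one strategy against another in rock/paper/scissors.
    # PAYOFF[my_action][opponent_action] from player 1's point of view.
    PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

    def expected_value(mine, theirs):
        return sum(PAYOFF[a][b] * mine[a] * theirs[b]
                   for a in range(3) for b in range(3))

    mixed = [1 / 3, 1 / 3, 1 / 3]  # the equilibrium strategy
    always_scissors = [0, 0, 1]    # what a naive search might lock onto
    always_rock = [1, 0, 0]        # the opponent's best response to that

    print(expected_value(mixed, always_rock))            # 0.0: unexploitable
    print(expected_value(always_scissors, always_rock))  # -1.0: loses every time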

There were two alternatives used in the past. DeepStack determined a leaf node’s value based on the strategy used to reach the leaf, which doesn’t scale to a large tree. The other alternative, used in Libratus, was to only search when the subgame could be extended to the end of the game. That wouldn’t work with extra poker players at the table.

Pluribus instead uses a modified form of an approach that we recently designed—previously only for two-player zero-sum games (41)— in which the searcher explicitly considers that any or all players may shift to different strategies beyond the leaf nodes of a subgame. Specifically, rather than assuming all players play according to a single fixed strategy beyond the leaf nodes (which results in the leaf nodes having a single fixed value) we instead assume that each player may choose between k different strategies, specialized to each player, to play for the remainder of the game when a leaf node is reached.

Superhuman AI for multiplayer poker

They used k = 4 continuation strategies, all based on the baseline blueprint strategy. The first is the actual blueprint, the second is the blueprint with a bias towards folding, the third is biased towards raising, and the fourth is biased towards calling.

Bluffing

Another issue in an imperfect-information game is bluffing. Viewed one hand at a time, the “optimal” strategy is to play your best hands and fold your worst, but if you do this all the time the other players will know what you have. To stay balanced, Pluribus determines the probability that it would have reached the current point with each possible hand, computes a strategy that is balanced across all of those hands, and then plays the resulting action for the hand it actually holds.

Testing

Pluribus was tested against elite human players (each had won at least $1 million playing poker) in two formats. The first was five humans at the table with one copy of Pluribus; the second was one human with five Pluribus instances.

The measurement was milli big blinds per game (mbb/game). One mbb is 1/1000 of a big blind, so this works out to the number of big blinds won per 1,000 hands of poker. Pluribus ended up winning 48 mbb/game, which is considered a very high win rate and shows it is “better” than the players it was up against.

Conclusion

I hope this helps cut through the hype in most articles written about the subject, and that it is maybe a little easier to read than the Science article linked above.

I am excited to see if this will lead any poker players to change their game or if this will kill the online poker world.

GDE Perks: QwikLabs Credits

One of the many perks of being a Google Developer Expert is that we get credits for many of their products. The most recent was 1,000 credits for QwikLabs, a training site that Google bought recently. The main feature that stuck out to me is that they create temporary Google Cloud accounts, so you are working in a real environment without having to mess with your existing account or getting charged.

We also get Google Cloud credits and I will speak on that more in the future when the AI Engine supports TFv2 models and I can get my NCAA basketball predictor running.

Derivatives

I have always been bad at calculus. I think it all started when I was locked into a good grade in college and stopped going to class. When my next calc class started I had a trash teacher, and I have been digging myself out ever since. Because of this, I haven’t been able to figure out what the heck is going on with gradients while working with TFv2.

I would see 4 × 4 equal 16, but when we took a gradient of that same thing we would get 8. So, here is my attempt to write this out and explain what is going on. I also have a slightly longer Colab Notebook on GitHub.

First, we have y = 4 and z = y^2. We then try to find the derivative \frac{dz}{dy}.

If you know calculus, which it appears EVERY web site I go to assumes you do, you can see that to find this derivative you use the power rule. This states that the derivative of x^n is nx^{n-1}.

Using that, y^2 becomes 2y^1 = 2y, and since y = 4 we get \frac{dz}{dy} = 8.

Now, to the second example, which uses a cube: x = 2, y = x^3, and we want \frac{dy}{dx}. The power rule gives 3x^2, and using x = 2 we get 3 \cdot 2^2 = 12.
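To double-check the same numbers against TFv2 itself, here is a minimal GradientTape sketch (my own sanity check, not from the Colab notebook linked above):

    import tensorflow as tf

    # First example: z = y^2 at y = 4, so dz/dy = 2y = 8.
    y = tf.Variable(4.0)
    with tf.GradientTape() as tape:
        z = y ** 2
    print(tape.gradient(z, y).numpy())  # 8.0

    # Second example: y = x^3 at x = 2, so dy/dx = 3x^2 = 12.
    x = tf.Variable(2.0)
    with tf.GradientTape() as tape:
        y = x ** 3
    print(tape.gradient(y, x).numpy())  # 12.0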

Hopefully, this clears up what is going on. If not, you can just call me a dummy like I am sure everyone else already does when I try to do calculus.

Google Cloud: TensorFlowJS

Check out my working site: http://bet.eckronsoftware.com!

After failing in my last blog post to get my trained model into the cloud, I started looking at TensorFlowJS. It is a JavaScript library that lets me load and run my TensorFlow model in the browser.

Good news: it works. Bad news: it doesn’t work on a mobile browser. I have a GitHub ticket (1586) open and am hoping it will be cleared up. The issue is wild, though: using the same JS library and the same model in the same bucket in the same cloud returns different values if you use a phone.

Anyway, here is how I got everything up and running.

First, I installed TensorFlowJS in my Google Colab notebook. I was then able to use it to save the model in the TensorFlowJS format:

    import tensorflowjs as tfjs
    tfjs.converters.save_keras_model(restored_model, location)

Second, I created another bucket like I did in my previous post. In that bucket I uploaded my plain HTML file and then a folder for my scripts. In the scripts folder I added my model.

Third, in my HTML file I needed to call out to a CDN to load TensorFlowJS:

    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@1.0.0/tf.min.js"></script>

Fourth, I needed to figure out how I was going to code this bad boy. I don’t do much JavaScript, so this took a lot of searching. One of the main resources was a tutorial that helped.

Finally, I needed to figure out what in the world a Promise in JS is. It turns out it is just an async return type, similar to a Task in .NET. Once I loaded the model on startup and then called it on a click, I was good to go.

In the near future, I am going to clean up this site as well as allow the user to select 2 teams and see what the model thinks will happen. Stay tuned!

Google Cloud: AI-Engine

After completing my DevPost project and hearing that Iowa has legalized sports betting I decided I should take my model out of Google Colab and into the magical cloud.

My plan was to follow a few web sites and upload my model. This didn’t go well: as detailed in the GitHub issue linked at the bottom, there is an incompatibility between TFv2 and the AI-Engine.

But here are the steps I went through, so that once a bug fix is in place I will be up and running.

First, I had to create a model in a format that would work. My assumption that my original h5 file was good enough was wrong: I needed to use the SavedModel format. In TFv2 the export function was moved from contrib to experimental. Once this was cleared up, I created the model and exported it from my Colab notebook.
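For reference, the export looked roughly like this in the TFv2 builds of the time (the file names are placeholders, and this experimental function has since been folded into model.save):

    import tensorflow as tf

    # Load the Keras model I had been saving as an h5 file (placeholder name).
    model = tf.keras.models.load_model('model.h5')

    # In early TFv2 the SavedModel export lived under tf.keras.experimental.
    tf.keras.experimental.export_saved_model(model, 'exported_model/')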

Second, I created a Cloud Bucket to host my model. This was straightforward.

Third, I needed to test my model. I created a text file holding my input parameters so I could test locally. Then, using the gcloud command line, I called ai-platform and got an error message. It turns out that the way the model was built isn’t compatible with what the AI-Engine expects.
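Independent of the gcloud error, you can sanity-check an export directly in Python (the path and input name here are hypothetical; the input has to match whatever the model’s signature actually declares):

    import tensorflow as tf

    # Load the exported SavedModel and grab its serving signature.
    loaded = tf.saved_model.load('exported_model/')
    infer = loaded.signatures['serving_default']

    # 'dense_input' is a hypothetical input name; inspect
    # infer.structured_input_signature to find the real one.
    example = tf.constant([[0.1, 0.2, 0.3, 0.4]])
    print(infer(dense_input=example))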

Finally, I just created the GitHub issue and asked around in my Google Slack channels. It turns out it is a known open issue. The real problem is that since this sits between the cloud and TF, I am not sure who blinks first and has to change.

Feel free to watch the issue to see if/when this gets figured out.

GitHub Issue: https://github.com/tensorflow/tensorflow/issues/28708

Google Cloud: Static Web Site

Previously, I had my web site hosted on Azure with my DNS held at GoDaddy. These were fine UNTIL I lost my free credits. Once that happened I was getting charged ~$50/month for my site. Since I don’t really do anything with it (besides a place to host files for my mobile apps) I didn’t want to pay for it.

Fast forward a year, and I was messing around with hosting a trained TFv2 model in the AI-Engine, which I would then interface with for a gambling web site I am playing with. I get into this in a later post, but it didn’t go well.

While I was in the Google Cloud console, I started to look at hosting my web site there. I saw that I could deploy the site to a bucket and then have DNS point at it.

Following this page I went to work.

First, I created a bucket named after my site. I then downloaded the files from Azure and uploaded them to the bucket. I changed the permissions and we were off and running.
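I did all of this through the console, but the same steps look roughly like this with the google-cloud-storage Python client (the file name is a placeholder):

    from google.cloud import storage

    client = storage.Client()

    # The bucket name must match the site's domain for DNS-based hosting.
    bucket = client.create_bucket('www.eckronsoftware.com')

    # Upload a file pulled down from Azure (placeholder name).
    blob = bucket.blob('index.html')
    blob.upload_from_filename('index.html')

    # Make it publicly readable so browsers can fetch it.
    blob.make_public()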

Second, I needed to get my DNS moved so that I could just type in http://www.EckronSoftware.com and have it resolve to the bucket. This was a few more steps: I had to go to GoDaddy and “unlock” my domain, then set up a transfer, and finally import it into Google.

Third, I needed to create the CNAME entry so that DNS would know where to direct requests. This took a bit to propagate, but in the end it worked just fine.
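For anyone setting this up, the record Google documents for bucket-backed static sites looks like this (with my host name):

    www  CNAME  c.storage.googleapis.com.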

Finally, since I am a developer and I like to automate, I followed this page on how to set up a trigger so that every time I committed to GitHub it would deploy to Google.
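I won’t reproduce the page here, but a minimal cloudbuild.yaml for that trigger could be a one-step sync like this (the builder, flags, and bucket are my assumptions, not the exact config from the page I followed):

    # cloudbuild.yaml: on each GitHub commit, sync the repo to the site bucket.
    steps:
    - name: gcr.io/cloud-builders/gsutil
      args: ['-m', 'rsync', '-r', '-d', '.', 'gs://www.eckronsoftware.com']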