Transcript
Employees unlock up to $1,000 tax free with a new OptionsCard digital gift card. With OptionsCard, there are no fees and no fuss. Your full balance is yours for up to 5 years. Shop your favorite brands and see your balance at all times in your mobile wallet. It's simple to buy and simple to use. Send instantly by email. No admin, registration, or forms required. You can even regift and share your OptionsCard with family and friends. Buy now at optionscard.au. There is a scandal so big that even Elon Musk has tweeted about it, and it's been trending on Twitter, Google, YouTube. And it's a scandal in the chess world, not the most likely world for scandals. Essentially, the world champion of chess, Magnus Carlsen, whom I greatly, greatly admire, he's probably the best chess player in history, I mean, he's just amazing, lost a game to a very young player, a 19-year-old named Hans Niemann, whom I also admire and whose games are wonderful to watch. Immediately afterwards, Magnus quit the tournament. He has never done that before. And there was wild speculation: why did he quit the tournament? Even Garry Kasparov, who's been on this podcast and who is a former world champion, said this was unprecedented and that we need to hear something from somebody about what is going on. But the implication was that Carlsen may have thought, and we don't know what he thought, but he may have thought Hans Niemann cheated. First off, how do you even cheat in live chess over the board? We'll discuss this in a second. But Hans Niemann did admit that when he was 12 years old and when he was 16 years old, he did some cheating online. So that is what is known. I bring on the world's greatest expert in chess cheating, who uses computer analysis to determine if people are cheating. 
He has analyzed hundreds of thousands of games and tens of thousands of cases of alleged cheating, going all the way back to a world championship in 2006 where one player accused another of cheating by using a computer in the bathroom. And Ken Regan is not only a computer science professor at the University at Buffalo who's done a lot of excellent work on chess cheating and other things, he's also an international chess master, a very, very strong player. He doesn't remember it, but in 1988 I played him a casual game when we just randomly met each other. And I lost that game. And I remember specifically him explaining how I lost the game and how I might have been able to win. But now I'm so grateful to talk to him, 35 years later, about this scandal, given that he is the expert. And I think what he says is pretty much the conclusive answer about what is happening in the chess world right now. And without further ado, here is Ken Regan. This isn't your average business podcast, and he's not your average host. This is the James Altucher Show. Professor Ken Regan, and also international chess master Ken Regan, you are the world's expert on identifying computer chess cheating. You work with all the major online chess servers. You help tournament organizers for over-the-board tournaments as well, that's when people are playing live, in front of each other. You have been very successful both at identifying cheaters and at identifying people who are not cheating. And I have many questions about cheating in general, but I also have questions about the recent scandal. Magnus Carlsen didn't accuse anyone of anything, that's the genius of Magnus Carlsen, but he insinuated that a player he had played might have been cheating over the board, or at least that there's some suspicion. And, again, we don't know what Magnus Carlsen was accusing anyone of, and there's no evidence of anything. 
But maybe give a little bit of a lay of the land. Just in general, how do people cheat at chess? Okay. Well, I put all the different mechanisms by which people have cheated at chess into a Dr. Seuss rhyme in my 2014 TEDx Buffalo talk. And I even left one out, which was: some had computers in their shoes, or had them hidden in the loos. The reality of it is, have you heard the one about how your iPhone is more powerful than the world's best supercomputer in 1993? Well, 1993 is only a little before Deep Blue beat Garry Kasparov, and the fact definitely is that your phone can play chess better than I measured Deep Blue playing. I measured Deep Blue playing at about the 2850 level against Kasparov. But with a cell phone, you can be over 3,000, far out of reach of what any human on the planet, including Magnus Carlsen, is capable of sustaining for a long period of time. And, Ken, just to define some terms, chess has a rating system where, let's say, the average player is rated 1500, and every 100 to 150 points higher or lower is another standard deviation. Meaning, if you're 1650, you beat a 1500 probably 2 out of 3 times. To put it clearly how good Magnus Carlsen is, he's 2800, which means he'll never lose to pretty much anyone except, like, one of the top 20 or top 100 players in the world. And he's at 2800 while Stockfish, which is the engine on Lichess and chess.com, is probably around 3500, from what I understand. Yes. Here's my view of the world, which AlphaZero upended to some extent. The designed standard deviation of the rating system is 200 Elo points. That's at the source; in the end, it depends on how many games you play. So the linchpin of the rating system is that if you are 200 rating points stronger than your opponent, then you expect to score about 75% of the points. Now, actually, because of rating uncertainty, it's a little less. I could go into that, but that's the main idea. 
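The 200-point, 75% relationship comes straight from the standard Elo expectation formula; a minimal sketch (the function name is mine):

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for player A (win = 1, draw = 0.5) under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# A 200-point edge gives roughly a 75% expected score, as described above:
# elo_expected_score(1700, 1500) is about 0.76
```

As Regan notes, observed scores run a little below this figure because the ratings themselves carry uncertainty.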
So this 200-rating-point, 75-percent expectation is the notion of a class unit in the US rating system. That's why Class A, Class B, Class C are all 200 rating points wide. The Hungarian writer László Mérő abstracted this to other games. The depth of a game is the number of class units from a beginning adult player to the human world champion. By the way, this is a fascinating way to look at whether a game is, quote unquote, interesting or not. So chess realistically probably has 15 or so, maybe more, maybe like 20 classes, because at the higher levels it's a little more fine-grained. At the time, Mérő and people like me pegged the beginning of the scale at 600. But we have scholastics where there are valid ratings below 100, and the 200-point difference is known to be still operating down there. So 100 is the USCF floor, but there are proposals to remove it. They don't want people to have negative ratings now. And, by the way, some games have rating systems very similar to this one. Ping-pong has a rating system. Yep. That works exactly the same. I believe backgammon... Oh, it's wider than that. 538.com uses Elo ratings, unadorned, with exactly the same principle. I'll get to that in a moment. Okay. So chess, from 600 to 2800, is 11 class units. László Mérő measured backgammon and checkers at 10 class units, Japanese chess at 14 class units. Wow. And Go, the figure I saw was 25 to 40. Obviously, you should use the lower end of that scale, but even so, AlphaZero and AlphaGo busted it. But, anyway, the point is that this is a measure of the progress of Moore's Law on the software end. So it took about 8 or 9 years longer to beat Japanese chess than our chess because of the three-class-unit difference. Okay? 
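Mérő's depth measure is simple arithmetic on the rating scale; a quick sketch using the figures quoted above (scale pegged at 600, champion at 2800, 200-point class units):

```python
def game_depth(beginner_rating: float, champion_rating: float,
               class_unit: float = 200.0) -> float:
    """Mero's 'depth' of a game: class units from a beginning adult to the champion."""
    return (champion_rating - beginner_rating) / class_unit

# Chess: (2800 - 600) / 200 = 11 class units, matching the figure in the interview.
```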
So you can phrase the software Moore's Law in terms of the number of class units per year that computers improve, and that's the conceit of a paper that I wrote by invitation for the Springer-Verlag Lecture Notes in Computer Science volume 10,000 anniversary issue, called Rating Computer Science via Chess. So at any rate, to cut it simple, our phones definitely outclass us by several class units, and I put 3500 to 3600 as a good estimate of where the best computer programs, running at standard time controls, are now. So just to be clear then: if someone is at a tournament, regardless of their rating, let's just say they're playing in a regular tournament near their town, they're 1800 rated, 1500 rated, and they have a phone in their pocket. They go to the bathroom, shut the bathroom door, put in the moves of their game, and the computer tells them a move. That move is gonna be the best move in that position. Almost certainly. Or certainly good enough. And to be clear again, there are, let's say, two types of cheating. I'll call one stupid cheating and the other more sophisticated. Stupid cheating is if you take every single move and run it through the computer. More sophisticated would be, 3 or 4 times in the game, you go to the bathroom when you're a little unsure what's happening and you get the best move in that position. And former world champion Viswanathan Anand has been known to say that even if you do that once, it could result in a difference of 150 rating points over time. And I think that is accurate. One bit is 150 Elo. I think it's accurate. And so in tournaments typically, like a big wide-open tournament, they'll say no phones in the bathroom. They have trouble enforcing it, but they enforce it as best they can. And in a more sophisticated tournament, they'll even use detectors and search you for your phone, and so on. But let's talk online cheating first, which is on chess.com. 
I could simply have my phone on next to me while playing on my computer on chess.com. Anybody could do this, and cheating apparently is very widespread. I don't wanna say it's the norm, but it happens more than one would think. Like, I regularly get emails from chess.com saying, we noticed someone you played was cheating, you got your rating points back. I get that maybe, like, once every couple of days. Right. Yeah. Unfortunately, the Bayesian prior rate of cheating in online chess is 100 to 200 times higher than over the board. I can't even imagine, by the way, how to do it over the board other than the bathroom method I think I just described, which we've seen. There was a case a few years ago: a grandmaster suddenly went from, like, 2500 to the high 2600s in a matter of months. There's a photo of him in the bathroom looking at his phone, which he was hiding in his pants. Yeah. I don't know which one; there are several of those. Well, I'd bet you're talking about Grandmaster Igors Rausis. Yes. That was in 2019. Yeah. There have been earlier cases. So before we get to the current scandal: online cheating, how do you really detect it? How does chess.com detect it? And I know they use you as a consultant. Yeah. Chess.com has a multifaceted cheating-detection system, and I generally always defend it. You could say that it has two or three prongs, of which only one prong overlaps what I do. The statistical prong involves the engine similarity of the played moves, possibly taking into account the time control. But then there's also information that they gather through their interface. That is more of a trade secret, so I cannot go into that. Right. 
But if I were to guess, from little pieces here and there, they see if you're swapping screens, and, depending on the browser and the browser's API, my guess is they probably have deals with other companies that make chess engines, so they could see if you're switching to a screen with a chess engine running, and so on. My guess is they're looking a little bit at screen swapping or tab swapping during a game. Yeah. And there are two other things, two common signals that are publicly known, which I can mention without compromising anything. One is, if you use a bot to execute your moves, that bot is gonna click on the same pixel every time relative to the square, say, the dead center of the square. Okay? That's certainly not something that a human being using a mouse is able to do. So that pattern will get you detected in 3 or 4 moves. And the other one is, if you get in the habit of consulting something off your main screen, you might show that habit even when you have an obvious recapture: a telltale little delay ahead of an obvious recapture. Okay. So those are funny things, but around those are very sophisticated Gaussian models, modeling the distribution of times actually taken by a human player to play an obvious recapture. That is the data it's compared against. It's mathematically very similar to how the Higgs boson was detected, by contrasting the bumps from the experiments involving likely decays of the Higgs boson against just ordinary background decays. I see. So if there's an obvious recapture, you either do it instantly or, for whatever reason, you're away from the board and there's an arbitrary amount of time it takes. It's not like every 3 seconds, like clockwork, there's a move. Yes. Exactly. Because sometimes I'm not looking at this game that I'm playing. 
I'm maybe reading a Facebook post because there's an obvious forcing sequence happening, and I don't always make the recapture instantly, for instance. Yeah. But then it's more random, the time I take. Right. The main thing about this is that online providers have access to much larger amounts of information than I do. I use only the moves in the game. And, curiously, I don't even use the timings of the moves, simply because those are not always available. And, you know, 5 or so years ago, they weren't necessarily reliable either. So my model is based on hundreds of thousands of games between players of all ratings, but the sources for those games don't even give the time control of the tournament, let alone the times for individual moves. So I have no basis on which to model a contrast, so I just ignore that data. I have two questions about this. If chess.com is looking for how many moves this player made that were exactly what a computer would make, there are two questions about that. One is: what if a player, knowing what Viswanathan Anand said about only needing one move, only does this every 10 moves? Or doesn't take the top suggestion, but takes the 4th suggestion each time? That's a real challenge. Yeah. But there's one mathematical thing that helps. We all know the term flying under the radar, which is what you're describing. But in the physical world, flying under the radar means you can keep a constant altitude. In statistics, you cannot keep a constant altitude. If you cheat at the same rate, no matter what fraction that rate is, like one eighth or one tenth of the time, if you do it long enough, eventually you will trip a statistic. So you're saying that one method of detection is to look at many games and see if there's a pattern where someone's always 100% accurate every 6th move. Oh, well, not necessarily so regular. Instead, what I have is, I'm able to measure the amount of discrepancy. 
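The "no constant altitude" point is just the law of large numbers at work. A toy illustration (this is not Regan's actual model, and the baseline and rates here are made up): treat each move as either matching the engine or not, and compute a z-score against a binomial null. A cheater whose match rate sits a constant few points above the baseline sees the z-score grow like the square root of the number of moves.

```python
import math

def match_z_score(matches: int, moves: int, baseline: float) -> float:
    """z-score of an observed engine-match count against a binomial null
    in which each move independently matches with probability `baseline`."""
    expected = moves * baseline
    sd = math.sqrt(moves * baseline * (1.0 - baseline))
    return (matches - expected) / sd

# A 60% match rate over a 55% baseline: the same 5-point edge, more moves.
# 100 moves -> z is about 1.0; 400 moves -> about 2.0; 1600 moves -> about 4.0.
```

Keeping the z-score flat would require shrinking that 5-point edge in proportion to one over the square root of the number of moves, which is exactly the taper Regan describes.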
And the point is that if you keep cheating at a constant rate, ultimately the deviation will go up. You would have to taper off your cheating at a rate proportional to one over the square root of the number of moves. I don't think I understand. So what's the detection technique there? It's just the laws of statistics. A similar thing: if you have someone who's insider trading, or, you know, making suspicious trades, a small number of suspicious trades might fly under the radar. But if the person keeps on making suspicious trades at a constant rate, ultimately it adds up. But what do you mean by a constant rate? Because you said the number of moves they wait before they consult a computer might change. Right, but if the average stays the same, then the deviation keeps ramping up. What do you mean, the average stays the same? Sorry. If the average number of moves that go by between your consultations of the computer stays constant, then... So wouldn't a smart cheater vary the number of moves? Well, vary it, but also taper off the global rate. Like, maybe... you know, I'm sorry, I don't wanna give too much advice here, but let's put it this way. Yeah. By the way, this is not a guide to cheating; we're just trying to... Yeah. On Ben Johnson's podcast, I gave a numerical example that basically was to the effect that if you cheat on 3 moves per game, by 9 games I can catch you. I see. But this leads to the second question. Let's say you, Ken Regan, an international master, a very strong player, were playing the average tournament player. You would expect to have almost 100% accuracy to the computer, because they're right away gonna make weak moves, and you will make the obvious best move, which would probably be the computer move. 
And even when two players of equal ability are playing each other... I was just looking yesterday: two 1000-rated players were playing with 84% accuracy to the computer. And that's because they're both making equally weak moves, and an easy way to exploit a weak move will often also be the best computer move. Yeah. Well, it is true that the number one case where my results exonerate a player with a high matching percentage to the computer is when the opponent played a forcing game and left the player only one option to stay alive, or only one option to win. And chances are a strong player and a computer are both gonna find that necessary move. This was the case with the original toilet-gate accusation in 2006, Topalov versus Kramnik in the world championship match, and in particular in game 2. The fact that Topalov was winning beautifully and then did not press his advantage and then lost was, I think, the most upsetting thing to the Bulgarians. And it is true that for the last 32 of the 64 moves of the game, I reproduced claims that Kramnik matched over 90%. But most of those moves were completely forced. This is public on my website, along with, in bold green, the statistical principle involved. And in the 16 years since, I've not had any reason to change it. And yet there are some kinds of moves that a computer will make that are very nonhuman-like. For instance, it can make a move that seems obscure, but 11 moves later you realize why it was important, and no human would have calculated why that move would be important. So it seems like it would be easy to detect cheating if you could detect any of those moves, but it's very difficult to determine whether a move is computer-like or human-like. Right. It is. 
What's interesting is, I have gone for the minimalist approach of trying to infer that organically, only from the numbers. So I have an objective, non-chess-based measure of when a position is difficult or complex. And so I'm hoping to detect smart cheating by using a distribution that upweights complex positions and downweights positions with easy choices. So if it's not a forcing sequence, but there is a move that is significantly better than all the other choices, you would weight that more. That's right. Although, if it's a dead endgame where there are 10 moves that are equal, but they all lead to draws, then I have to downweight that as well. So I actually weight by the amount of hazard in the position, by the probability and magnitude of the losses that a misstep may incur. And that's the type of position where you would most want to call on a lifeline. So that's my idea, anyway. I did all this work in 2019, but in the pandemic, for online chess, the one drawback of that approach is that by clumping the distribution, I increase the denominator of the z-score, making the model a little less sharp. Okay. So what does that mean? It means that if you clump up a distribution, the standard deviation goes up, and that standard deviation is your basic yardstick; you're talking in multiples of it. So if I have a deviation of, say, three and a half sigma the old way, and I use a larger deviation, it's like using a meter instead of a yard as my yardstick: my score is only 3 meters instead of three and a half yards. And that means my statistical score is going to be less. So unless a person really is smart-cheating, the work needed to make my program detect smart cheating actually makes it a little dull. During the pandemic, in online chess, I definitely got results that were sharper with my simple unit-weight approach rather than the smart-cheating-designed approach. 
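The clumping effect can be seen in a toy calculation (illustrative numbers, not from Regan's model): the z-score's denominator is the standard deviation of a weighted sum of per-move match indicators, and concentrating the same total weight on fewer moves inflates it.

```python
import math

def z_denominator(weights, p=0.5):
    """Standard deviation of a weighted sum of Bernoulli(p) match indicators,
    i.e. the denominator of the z-score for a weighted matching test."""
    return math.sqrt(sum(w * w * p * (1.0 - p) for w in weights))

uniform = [1.0] * 100              # unit weight on every move
clumped = [2.0] * 50 + [0.0] * 50  # same total weight, concentrated on "complex" moves

# z_denominator(uniform) is 5.0, while z_denominator(clumped) is about 7.07,
# so the same raw signal yields a smaller z-score, a slightly duller test.
```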
Translation: in online chess during the pandemic, I encountered a lot of dumb cheating. I see. People who were using the computer on every move. Yeah. Or using it in bursts, but without discrimination as to when they felt they would need the help. Whereas what Anand is talking about is a position where you could play bishop takes h7 check, but you don't know if it works: give me one bit of information on whether that move works. Right. And that's the smart-cheating approach, when you know a critical position and you consult a computer. Right. And I guess the third question, which arises from the pandemic and is also related to the current chess cheating scandal with a player named Hans Niemann: during the pandemic, a lot of young people, and Hans Niemann is still a teenager, very quickly improved because they had more time to study. And then, between 2020 and 2022, when they start playing in tournaments again, they might have had a huge leap in rating that defies the usual statistical models. So this is where I would like to share my screen. What you've just touched on has been the number one scientific activity that I've had to do during the pandemic. So let me start screen sharing, if I may. I should say one other thing about myself. I co-write one of the major blogs in computing. It was in a top-55 blog roundup two years ago, and we were in the top quartile, behind publisher sites. This is the blog started by Professor Richard J. Lipton, emeritus of Georgia Tech, formerly at Princeton. We have coauthored a textbook on quantum computing with MIT Press. This blog has over 1,000 posts in it. Wow. I could mention that Tyler Cowen has sometimes referenced this blog. So he's a friend as well, of course. Oh, and also a former New Jersey state champion, around the same age as you, and you're both from New Jersey. 
We were on winning Garden State Chess Association teams, in what became the US Amateur East, for instance. So he's been on the podcast quite a bit. Yep. So, anyway, this is an article that I wrote. One of the realities of the pandemic is that because online chess is not officially rated, the ratings of aspiring, growing junior players flatlined. Okay. So for instance, this is Annie Wang, who won the US junior girls' section last year. The point is, I estimated with a back-of-the-envelope formula, which has been surprisingly accurate, where Annie Wang's real rating should be. So I had her up around 2480 at the time I was doing this tournament. And the real challenge is not so much at the top end; the real challenge is to be able to tell, for really young players, that they are in this kind of explosive growth curve. One example I could mention is at an in-person junior tournament in Asia. A kid was on the wall chart at 1595, and he was beating 2100 and 2200 players. And I got contacted about this. But I just said, you know, my pandemic-lag adjustment formula places him already at 2100, so it's not a surprise he's beating 2200s. So it's catching people here, and I can't tell whether I'm accurate for a given player. But for tournaments en masse, I have been incredibly accurate, including for the Olympiad, especially in the women's section. There were a lot of junior teams; I think New Zealand or one of the Pacific teams had all junior players. So I applied my rating adjustment formula, and, you know, it's 28 months into the pandemic, so it's really extrapolating, but it was still 4 to 5 times more accurate than what you would get if you didn't use the adjustments, and that means closer. And, actually, for the women at the Olympiad, my average screening score was 50.00, an exact bull's-eye with the adjustment. 
Does your cheating detection algorithm take rating into account, like, if someone's playing statistically significantly above their rating level? Right. When it gets to a full test stage, I consult with people, in fact, to get the most accurate fix on the rating, not just what my formula gives. But I screen, you know, not quite 10,000 games a week. The Week in Chess and the ChessBase updates each have 5,000 games, with considerable overlap. So I get these massive tables, and I can tell that on average my formula is working just right for those massive tables. And how often do you detect a cheater where nobody asked you to detect a cheater? Yeah. That's a good question. The problem with responding to that is I can't definitively say the person was a cheater, because often this is not followed up. Let's just say I detect high outliers and inform people about them, and sometimes they're followed up and sometimes they're not. And is this for live tournaments or online tournaments? Both. And there are some cases... in fact, I just brought two players to chess.com's attention last week. And are they, like, high-rated titled players? I mean, I'm not looking for names, but I'm just curious: how prevalent is cheating at, let's say, the highest levels? It's all over the place. That is unbelievable. I would say not in the elite, not in the 2600-plus, but there have been a couple of cases of people being sanctioned at the 2600-plus level that are in the public record. So let's take a look at the Hans Niemann case. Yeah. Again, what happened was that Magnus Carlsen lost the game to Hans Niemann. Several things about it were interesting. One is that Magnus Carlsen was White. It's very unusual for Magnus Carlsen to lose a slow-rated game with the white pieces. 
Not only that, he had just gone 53 games in a row without a loss, which is unbelievable, and this was his loss to a person rated roughly 200 rating points lower than him, with Magnus playing the white pieces. Also, Magnus said, apparently, I'd heard about Hans cheating years earlier, I guess, at chess.com. It was unclear whether it was years earlier or more recent, so we still don't know. But there was some communication between chess.com and Magnus right after the game, and Magnus dropped out, not saying why, but with a video basically saying, I can't say why I'm dropping out or I'll get in trouble. He referred to a video from another sport of someone saying that. And so there's been wild speculation. A, was he accusing Hans Niemann of cheating? B, was the cheating that Hans Niemann somehow knew Magnus's specific book preparation? Or was Hans cheating in that particular game? Or was there more general cheating that Hans had been involved in, so that Magnus was disgusted and didn't wanna play anymore? There's been all this speculation, and every day there's new speculation. The most recent, this morning: I saw that Hans does significantly better in games where the moves are transmitted live to the public, as opposed to games where they're not transmitted. Hans himself has admitted cheating online when he was 12 and 16, and he has also stated that he would never cheat in an over-the-board game. So I think this summarizes everything we know that's not circumstantial evidence. There's a lot of circumstantial evidence out there that is meaningless. Yeah. And by the way, I want to state for the record, I admire Hans Niemann as a player, and I hope this is all proven false and that everything is good. He's a very interesting personality, whether you like him or not, and his games are amazing. 
And also, I want your comment on this: according to Hikaru, Hans Niemann has had the fastest rise at this level of anybody ever at that age. Even though he's 19, that's still a fairly late age to go from a 2480 rating to a 2720 rating. So that's one piece of Nakamura's circumstantial evidence that there might be suspicion warranted here. What's your feeling on that? And then, just in general, we'll get into it. So, a bunch of things here. I'll first state that I'm still right in the middle of data analysis here. So there are some things I can't say, not because I don't feel at liberty to say them, but simply because the work has not been done yet. Okay. That's number one. Can you say what work hasn't been done yet? Oh, sifting a lot of this metadata. For instance, the thing you mentioned from this morning, which I saw yesterday, about the tournaments broadcast versus non-broadcast, where he does better, and also this other question about spurts, you know, suddenly going from 2460 to 2700 being unprecedented. I think I saw a response to that, but I've not even had time to go through the details. So I'll work from the parameters of what's publicly known and what's definitely settled at this moment, and stay away from the things where it's unclear, even in my own work. First of all, the organizers released a statement on Saturday saying that both they and I, and I've been in official consultation with the tournaments at St. Louis, in fact the entire Grand Chess Tour series, from the beginning, have not found any evidence or indication of over-the-board engine-type cheating. Over the board in this tournament? In this tournament. That's right. And, again, you look at all the moves. You look at the relationship between what the computer would have played, or a second move or third move, and what Hans played. 
And is it different from what Hans normally would have played? And you found... Well, that's a separate matter. One thing about it is that my model has no chess knowledge built into it. That's on purpose, to avoid potential bias. The danger of bias is far greater than the harm from the lack of knowledge. Sure. And, in fact, sometimes I don't even look at the game, so as not to prejudice my own understanding of the case. So, anyway, remember I said that with the Kramnik game, I reproduced a high concordance to the computer, but the game was quite clear-cut. Similar things operate here. How do you define clear-cut if there's no bias? Clear-cut is when there is one clear standout move. But, again, one clear standout move could be impossible for a human to figure out, or it could be easy for a human to figure out. Yeah. That's right. Now, this is the hardest part of my model. My model does try to ascertain when the best move will be especially difficult to find, such as when there are other moves that are very tempting. And in fact, for about 1 out of 7 or 1 out of 8 moves, over players of all ratings, even the highest ratings, it does project to put the highest likelihood on an inferior move. And that gives me about 2 to 3 percentage points of advantage in predictivity. So, for instance, with a 2700-rated player, if I just predicted that the player would make the computer's best move, I'd be right 57% of the time. But if I use my model to sometimes predict inferior moves, judging when the positions are most difficult or complex, then I can get a 59 to 60% hit rate. So if you wanna bet on chess games, my model is absolutely what you should use, if you think that 2% is enough of a return on your investment. And there are a couple of other things that might surprise you. So, if you don't mind my going into screen sharing again? No. 
I'll read what's on the screen. So this is an article I wrote when I settled my model in summer 2019, relating it to betting on horse races. So if you want to read that angle, this is it. And my old model used to always put the highest probability on the top move, which is the favorite. Now, this is an experiment. When you say probability, probability of what? It's a predictive analytic model. It treats the legal moves in a chess position as the events, and puts a probability figure on a player of a given rating making a given move. So what I'm looking at here is the top move for someone of a given rating. So this is a controlled experiment. I took all those 6,000 positions in my main training set where the player to move was rated between 1000 and 1200 FIDE, and it was a position with many reasonable choices: at least 10 moves valued within a quarter pawn of optimal. Okay? So, moves where the information gained by choosing the computer's best move is most considerable. Okay. Now, this is like crowdsourcing the... Wait, I have a question about that. If 10 moves are valued relatively similar to each other, what the computer says doesn't even really matter. I can choose any of those moves. That's what you'd think, right? Well, what you're looking at are the empirical results. So these very weak chess players, well, I'd say very weak, you know, they nevertheless found the computer's best move one-sixth of the time. Certainly 14 percentage points better than the 10th-best move, which was only a quarter pawn worse. And if the evaluations really don't matter, then these percentages should all be near 10%, sheer equality. Okay, I mean, they're weak chess players, so 25% of the time they play a move outside the top 10, a blunder. But the main point is that this refutes the idea that weaker players prefer weaker moves. No.
If you get enough weaker players and crowdsource them, they will still, with 4.5 percentage points of clearance, pick the top-ranked move. Okay. And why is that? Just philosophically, what's happening there? Because there's a noisy process by which we apprehend quality at chess. We all have some basic notion of quality, and there are things that interfere or keep us from getting the full truth of that quality. Nonetheless, you know, even a novice stock trader will occasionally make a good trade, and moreover will have some idea of what feels in the gut to be a good idea, even without doing the real deep research to see if that's really so. So a novice stock trader may be at a disadvantage competing against well-armed people. But if it were a novice stock trader against the entire range of society without these tools, which is what chess is modeling a little more, then in a boom time the average stock trader will do reasonably well. Okay? You don't have the phenomenon that in a reasonable up-market time, average stock traders are going to make terrible choices. Right. They might not do as well as the brains, but that's why the stock market is publicly accepted. Because in the main, John Q. Public has done fairly well. And the doing fairly well is the kind of distribution you're seeing here. Well, not the bottom line. Okay. So, instead, what my work says makes a blunder in chess is when you're diverted by a shiny object. In other words, conned by the chess position. Okay. So that's the approach to it. Now let me stop sharing the screen and get back to what you were talking about. So that's the phenomenon that I'm trying to capture in my souped-up model. Clear-cut, therefore, means that there are not a lot of cases where my program is picking up a diversion in the chess position.
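The "noisy apprehension of quality" idea can be illustrated with a small simulation: give each simulated weak player the true engine evaluations plus random perception error and let them pick their perceived best move. The noise level and move values below are invented for illustration, but the aggregate effect Regan describes emerges: even with moves only a quarter pawn apart, the crowd still favors the truly best move.

```python
import random

def simulated_pick(true_evals_cp, noise_sd=60.0, rng=random):
    """One 'noisy' weak player: perceive each move's true value plus
    Gaussian error, then pick the perceived best. noise_sd is a
    hypothetical per-player evaluation error in centipawns."""
    perceived = [e + rng.gauss(0, noise_sd) for e in true_evals_cp]
    return max(range(len(true_evals_cp)), key=lambda i: perceived[i])

def pick_frequencies(true_evals_cp, n_players=20000, seed=42):
    """Crowdsource many noisy players and tally how often each move
    gets chosen."""
    rng = random.Random(seed)
    counts = [0] * len(true_evals_cp)
    for _ in range(n_players):
        counts[simulated_pick(true_evals_cp, rng=rng)] += 1
    return [c / n_players for c in counts]

# Ten moves all within a quarter pawn (25 cp) of optimal, like the filter
# in Regan's experiment:
evals = [0, -3, -6, -9, -12, -15, -18, -20, -22, -25]
freqs = pick_frequencies(evals)
```

The frequencies come out close to the flat 10% but measurably tilted toward the top move, mirroring the empirical chart he shows.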
In other words, what I'm saying is that, as far as my model can detect just from the way the move values jump around (that's the key: look at lower-depth values and see how much they jump around), the strategy for Niemann in that position was fairly clear-cut. White has an isolated pawn: gang up on it, win it, defend against the seventh-rank counterattack. And there was one really nice move, ...e3, sacrificing a pawn so the knight gets to e4, and White is suddenly threatened with checkmate. But Carlsen tried to create a distraction with g4, and it turns out that was his worst move of the game. He was punished for trying to create a distraction. So there weren't many distractions for Black, and that's what my model is picking up. And how does your model determine what those shiny... so, basically, a blunder happens after a shiny object, like Carlsen's g4. And if he doesn't blunder, does that suggest... I'm trying to figure out what suggests unusual play. Yeah, that's right. Well, avoiding a real trap could be unusual play. Another question about what you just said. Yeah. Did you just use chess knowledge in your description of the game? Like, you might know that in an isolated queen pawn position, these six moves are clear-cut. But how does the computer distinguish whether it's easy and clear-cut, or difficult? That's what I try to do organically, so I'll show you. But you're right. I mean, I play the c3 Sicilian as White, so I often had the same isolated-pawn structure Carlsen had. Note to self: when playing blitz against Ken Regan, prepare for the c3 Sicilian. Yes. And trade queens if you can, and gang up on my isolated c-pawn. Okay. So this now is an example. This is by my student, by the way. Tamal Biswas is now on the faculty of RKMVERI in Kolkata, India. He's from Bangladesh originally. This was one of the main pivots of his thesis.
So this is a key moment in the 2008 world championship match, Kramnik with White and Anand with Black in this position. The question is, can White capture Black's pawn? Okay. Now, a beginning player will say no, Black's queen is guarding it. A slightly stronger, deeper player will say, hey, wait a second, if Black's queen takes it, I can move my rook, and I'm skewering Black's queen to the knight, and I'll get my piece back. In fact, I might win the bishop too. Now a deeper player, getting into your level, will say, uh-oh, wait a second, Black can counterattack with ...Nf6 on my queen and move the knight out of trouble. But a world championship level player, at least Kramnik, who fell into this, will say: but after I take the queen and Black's knight takes my queen, I can go down here and get the bishop. Then I've got two healthy passed pawns on the queenside. Moreover, my bishop is defending my back rank, so I should be okay. Okay, I almost fell for that one. Yes, there you go. Well, Kramnik played into it, and he did not see what was coming until Anand executed it on the board. Anand had seen a little further, and there was ...Ne3, attacking the bishop. And after pawn takes knight, pawn takes pawn, White's rook is completely out of position to guard against the e-pawn coming down for checkmate. See, this is why chess is like a beautiful work of art. People can't see this position, and for many people it might not matter, but if you watch the video of this (we'll put the video on YouTube), this is just a beautiful, beautiful move at the end, and I could easily see how anyone could miss it, even a world championship level player. And what my model relies upon is that the computer at lower depths cannot see it either. So this was Stockfish 6, the current Stockfish in 2015. At low depths, the pawn capture, it thinks, is bad, but at depth 9 it jumps up to plus 0.77.
And it stays in the range of a little over half a pawn until depth 14. So that's a pretty considerable stretch. Depth 14, is that 14 moves ahead? Right, looking 14 half-moves, that's 7 full moves ahead. And then at 7 and a half moves ahead, suddenly it goes to minus 1.81, because it has seen Anand's trap. So this is the shiny object causing a diversion, and this is the case where my model will up-weight the probability of falling into the trap. Basically, then, if someone makes a move where it suggests they've seen... so first, there's a move where the computer's evaluation is wrong, wrong, wrong, wrong, wrong, and then it switches back massively around 15 moves deep or more. And if someone makes a move like that, it could suggest cheating. Or if someone responds accurately to a move like that, it could suggest cheating. Anything around this move. That's right. It's getting deep information. And there are people who are trying to assess this directly, which might work, you know, on the scale of a single player. But for the vetting and validation of my model, I need to make sure that its scores stay within the normal distribution on massive amounts of data. So I had to program a way to do it organically, just from the recorded engine values at various depths, not from any notion of chess knowledge of what's a deep position. Because unless I paid the entire army of master players in the world to annotate hundreds of thousands of games that way, I just could not get the training data. So wait, is this a distribution issue, like a statistics issue? Or, like, right here, I'm looking at what you showed as the computer valuation. Again, for 14 depths the computer was wrong, and then at the 15th it suddenly saw this amazing thing. And by the way, it was massively wrong; it was like a 3-pawn swing, essentially.
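The depth-swing signature Regan is describing can be sketched in a few lines. This is only an illustration of the idea, not his actual detector: the depth list and the two thresholds are invented, and a real system would look at much more than the last value on each side of the cut.

```python
def swing_moves(depth_evals, min_depth=12, swing_cp=150):
    """Toy 'deep trap' detector: a move looks like a shiny object if
    the engine's evaluation of it is stable through low depths, then
    swings sharply once the search gets deep. depth_evals maps
    depth -> evaluation in centipawns for one candidate move. The
    thresholds are illustrative, not Regan's parameters."""
    depths = sorted(depth_evals)
    shallow = [depth_evals[d] for d in depths if d < min_depth]
    deep = [depth_evals[d] for d in depths if d >= min_depth]
    if not shallow or not deep:
        return False
    # Large jump between the shallow verdict and the deep verdict.
    return abs(deep[-1] - shallow[-1]) >= swing_cp

# Roughly the Kramnik pawn grab: around +0.5 to +0.77 at low depths,
# then minus 1.81 once the trap is seen (values in centipawns):
kramnik_grab = {9: 77, 10: 60, 11: 55, 12: 58, 13: 52, 14: 55, 15: -181}
quiet_move = {9: 30, 10: 28, 11: 32, 12: 31, 13: 29, 14: 30, 15: 25}
```

A player who consistently navigates exactly these swing moves correctly is the one getting "deep information."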
So do you really need a distribution, or should you just look at every game where people make moves that mimic... That's true. So there's one distribution of values over the moves; that I get organically. But what I'm talking about, for making sure that my model is reliable, that I don't go accusing people with incorrect justification, is that I need to attend to the mass distribution of honest players and the statistics they generate. Including the fact that occasionally, you know, once every 30,000 entries of a player into an event, that player is going to have a 4-sigma result. A lucky day. Well, it could also be the case that, in that particular game, they make the move and then think, oh my god, I just blundered, and then as the game continues, they finally see the correct idea. Right, those things happen. And I do statistical randomized resampling of my training sets, in millions of validation trials, to make sure that it may happen, but it doesn't happen so often as to throw off the conformance of my model to the bell curve for the great mass of honest players. But then let's take Anand and Kramnik. They're both world champions, and they clearly can make a move like that, because Anand did make that move. But your system wouldn't accuse him of cheating, because you have statistically analyzed players' games, and they will make those moves occasionally? Or how do you do that? Yes, so the point is, they're highly rated. So among the most basic things that I do... I'm going to share my screen again; there are more articles on this blog. And I must say the blog is a pre-publication venue, so these things should go in papers, but the pandemic kept me so busy that I only had time to do this.
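The resampling validation he mentions can be sketched as a simple bootstrap: draw repeated samples from a pool of honest-player scores and confirm that the resampled statistics behave like a standard normal. This is a stripped-down illustration of the idea, not Regan's validation pipeline; the pool here is simulated and the trial counts are tiny compared to his millions.

```python
import random
import statistics

def resample_z_scores(z_scores, trials=2000, seed=7):
    """Toy bootstrap validation: repeatedly resample a pool of
    per-event z-scores from presumed-honest players and report the
    average mean and average spread of the resamples. For a standard
    normal these should sit near 0 and 1 respectively."""
    rng = random.Random(seed)
    means, sds = [], []
    for _ in range(trials):
        sample = [rng.choice(z_scores) for _ in z_scores]  # bootstrap draw
        means.append(statistics.fmean(sample))
        sds.append(statistics.stdev(sample))
    return statistics.fmean(means), statistics.fmean(sds)

# A simulated pool of honest-player event z-scores:
pool_rng = random.Random(0)
honest = [pool_rng.gauss(0, 1) for _ in range(500)]
mean_of_means, mean_of_sds = resample_z_scores(honest)
```

If the resampled means drifted from 0 or the spreads from 1, the model's bell-curve calibration for honest players would be suspect.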
In fact, I have not had time to write an article about Alireza Firouzja's ultra-bullet marathon and how that plays into my statistics. Oh, okay. Yeah, I analyzed that. Let's put that aside for a second; that's fascinating. Yes. But here's the point. And this is a good statistical thing for stock market charting as well. This is roughly how my data stood, you know, 10 years ago, when my reliable data was in the 1600 to 2700 range. And what you're looking at is the percentage first-line match versus a player's rating. So, as I said, a 2700 player will match about 57% of the time. There it is. Okay. But a 2200 player, here, will match the first line of the computer only right around 50%. Which is fascinating, because you're saying then the difference between a 2200 and, what's the highest number there, 2640? 2540? So you're saying the difference between a master and, like, a super grandmaster is only 8% difference in terms of how often they'll... Only 8 percentage points. That's why gaining 2 percentage points of advantage is huge. And I guess it's because most moves are clear-cut. Like, the first five moves of a game, for instance, are always clear-cut; it's easy to make the top move. And then maybe it's really almost around moves 20 to 40 that you're going to find the critical position. It almost would be interesting to just look at moves 20 to 40, because I bet those percentages would change a lot. That's true. Now, it is a fact that there is an important sensitivity to the index of the move within the game, which I tried to average over, but that's a really messy area. So let me pretend you didn't ask that. Okay. It might be relevant per opening, or there might be a lot more factors that are messy. Yeah, there are lots of messy sausages in my shop here.
But a sporting analogy I can make, from having seen the US Open yesterday, is that, you know, weak players do often play the best move in a position, and it's like holding serve in tennis. A clearly inferior tennis player nevertheless does expect to win more than 50% of his or her own service games, unless it was the match of Iga Swiatek against Jessica Pegula, with its 13 breaks of serve. Okay. Now, anyway, the point I'm making here is that this looks like a perfect linear relationship. It's got an r-squared of 0.99. In the social sciences, you'd kill for something like this. So, basically, what I'm seeing is there's a straight line from the bottom left to the top right, where you can see that the higher the rating, the more likely they are to pick the top computer-suggested move. So at a given rating, let's say 2200, where roughly half the time they will pick the top engine-rated move, if over a number of games they're at 60% instead of 50%, there are only two conclusions. One is they're cheating, or two is they're underrated and in a period of massive improvement, I guess. Right. Although there is a third reality that I have to discount in talking about that, and that is the fact that this is actually not a linear relationship, even though it should be, given the design of the rating system. When I got more data, including more data above 2700 (simply put, there have been more players with that rating playing a lot more games), and with the availability of reliable data under 1500, what looked like a linear relationship, when you widen it, is actually clearly curved. And I've had to revise my model to take that into account. But that's probably because, I mean, this is getting a little into the weeds, but the rating system changes below certain levels and above certain levels, and also based on your age.
So, for instance, the K-factor in Elo ratings tightens up after a certain rating level, meaning standard deviations are tighter, and I wonder if you take that into account. It might be. I mean, I don't know the root cause underneath it yet. All I know is that I've had to take this into account. It looks to me that the younger players are more volatile, because their improvement can be faster, so that's why it doesn't work as well at the very lowest levels. And at the higher levels, the difference between a 2650 and a 2600 might be a full standard deviation, as opposed to a 200-point difference. And one thing, in terms of estimating, is that the fact of higher rating uncertainty at lower ratings does bump up the standard deviation, and this has also been the case during the pandemic. You know, with official ratings frozen, there's a lot more uncertainty in my rating estimations for individual players, and that has bumped up my sigma. And that sigma bumps up in a way that is linear with the amount of data, the number of games a player has played, rather than with the square root. So it's a real pain. So, it's very interesting: if someone's improving quickly, obviously their rating will adjust fairly quickly as well. But after the pandemic, when we went back over the board, it could be the case, like you mentioned earlier, that someone rated 1500 can now be 2100 in skill. And at younger ages that tends to happen more often than at older ages. Mhmm. There also could be something where, if someone's just learned an opening, and now they're playing that opening perfectly even though they're still rated 1500, suddenly their computer-accurate moves will bump up for that opening. Right. And that's another factor that's especially important at fast chess, which is that the amount of book knowledge has increased.
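The screening logic discussed over the last few exchanges, fit the rating-to-match-rate relationship, then ask whether a player's observed match rate is consistent with their rating, can be sketched as a simple least-squares fit. The sample points below are made up to mirror the shape Regan describes (about 50% at 2200, about 57% at 2700); they are not his data, and his current model uses a curved fit rather than a straight line.

```python
def fit_line(ratings, match_pcts):
    """Ordinary least squares fit of first-line match percentage
    against Elo rating."""
    n = len(ratings)
    mx = sum(ratings) / n
    my = sum(match_pcts) / n
    sxx = sum((x - mx) ** 2 for x in ratings)
    sxy = sum((x - mx) * (y - my) for x, y in zip(ratings, match_pcts))
    slope = sxy / sxx
    return slope, my - slope * mx

# Illustrative points shaped like the chart described in the interview:
ratings = [1600, 1800, 2000, 2200, 2400, 2600, 2700]
matches = [41.5, 44.3, 47.2, 50.0, 52.9, 55.6, 57.1]
slope, intercept = fit_line(ratings, matches)

def expected_match_pct(rating):
    """Predicted first-line match percentage for a given rating."""
    return slope * rating + intercept
```

A 2200 player sitting at 60% when the fit predicts about 50% is exactly the kind of deviation that triggers a closer look.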
So the average novelty is now a move or so later than it was when I started, 10 or so years ago. This gets to kind of another prong of online cheating, which is that they could simply have the book in front of them sometimes and not other times, and you could probably detect that. So if someone plays a certain opening and they get the top computer move 50% of the time, but there's one day where, five games in a row, they were at 60%, that could suggest not that they have a computer in front of them, but that they're simply reading their opening course right in front of them. Yeah, and that is also, by the way, why the fact that prep is unknown matters. Prep comes into play at a further iteration, after I give my original report. But now let's get back to the Niemann case, because I think this is the crucial element. Yes. So in the game, Carlsen versus Niemann, and I can find it quickly; I'll show it on the screen in a moment. In that game, the official novelty was Carlsen's queen takes d4, using ChessBase Cloud and my ChessBase updates. And so, I just want to mention, to set context: right now Ken is not screen sharing; we're not looking at the game. But what I know of this game is that many people claimed that Carlsen had never played this opening before, and Hans, in the interview right afterwards, said it was by a miracle that he had studied and prepared for this opening. People were, I don't want to say suspicious; it was just unusual that he would have prepared for an opening that Magnus never played. But apparently, as Hans said, Magnus has played transpositions of this opening before, so he has reached roughly the same position at least twice before. I don't know, I have not checked this myself, but I believe Hans on... why would he not... why would I say that?
The key element is exactly to ask what was the nature of this miracle. But I will say, so here's the game, and I'll scroll through the opening. It's a Nimzo-Indian Defense, but White plays g3. Hey, Oleg Romanishin played like this when we were at the World Student Team Championships together in the 1970s; he played for what was then the Soviet Union. You're an old-school chess player. So you were playing with Fedorowicz back then, probably, when he was a junior, and Joel Benjamin, Michael Wilder? Danny Kopec, Kim Commons. Yeah, John Fedorowicz, Tisdall. So, anyway, okay. So it now actually has more of the character of a Catalan. Black is counterattacking in the center, as sometimes happens. Now queen takes d4, rather than taking with the knight. Just to describe, it's not critical for understanding this case, but we are looking at the game, Carlsen versus Niemann. The moves themselves were not that important, except that it'll help me to explain when we get to the critical position. I'm just saying that for the audience's sake. That's right. So this is the official novelty, and what follows is prep. A novelty is a move that has apparently never been played before: not in the databases, at least not of elite games. I use a policy of taking moves by players rated 2300 and above; a similar distinction was made by the Opening Master disc series. So, anyway, okay. Now Carlsen regains his pawn. Now Black, however, strikes into the center, and here we have, by the way, the isolated c-pawn. So the question is, can White generate an initiative, especially with the two bishops, to offset Black's structural advantage? Black puts the question, and now White attacks Black's queen, and this is the key move. If Black has to move the queen, then White can continue generating an initiative.
But as Niemann said he reviewed before the game, he knew that Black has to counterattack with ...Be6. Which by itself is not unusual for players at this level: it's what's called a zwischenzug, or an intermediate move. Right. So one piece is being attacked, but rather than defending it and just reacting to your opponent mindlessly, you make a move that is also an attack. So it's an intermediate move rather than a gut reaction. Right. In general, there's a little bit of risk, because White's queen could move and attack something else, and then Black would have the queen and something else attacked. But in fact, there's no way for White to really take advantage of this; this move doesn't work. I don't know exactly how Black deals with it. I guess I can ask the computer. We can cheat! And the computer says ...a5 is the right way to deal with that. Well, okay. At any rate, Carlsen took this, took this, and then on the 15th move, Carlsen thought for a long time, clearly out of his preparation. And what's actually happened is that White's initiative has been squelched. Carlsen actually took this and took this. So Black has doubled pawns on the kingside, but they're really not an issue. The issue is this isolani, and the fact that White has to shuffle his king over to defend e2 and then defend the open file. So, again, just to describe: on move 14, I guess, there was the novelty? Move 10 was the official novelty; move 13, the definite important preparation move. Right. And Hans, who said after the game, but before the cheating accusations, that he had had this position on his board before the game, because he was studying this, he came up with kind of a counterattack, and it all boils down to both sides ending up with weaknesses, and to which weakness is more important. And Hans had determined, I guess in preparation, that Magnus Carlsen's weakness was more critical than Black's weakness.
And this was the innovation that kind of led to the rest of the game, which earlier you described as roughly clear-cut. Like, now there's a kind of plan of what you do against these types of weaknesses, and the isolated pawn is a worse weakness than the slightly weakened king and the doubled pawns in an endgame. And players prepare with engines. So you doubtless could see that Stockfish 15 at depth 26, which is a higher depth than I use (well, not always, but it's a pretty nice high depth), gives Black the better side of what it still classes as an equal position. So... It's still roughly equal? Yeah, it favors Hans, but this would be ignored in most computer analysis as being better for Black or better for White. Like, it's not better enough that you would say Black is winning. Yeah. And there's also what Jacob Aagaard, or maybe someone else, calls the zone of one mistake, or the slippery slope before you get to the zone of one mistake. This is a great example. The most natural human move to make is to guard your e-pawn. But in fact, Stockfish is saying there are three moves better than guarding your e-pawn: it's time to jettison your e-pawn and start thinking about counterplay. So in other words, the computer is saying... My only instinct was to guard the e-pawn, I might add. Yeah. But the computer is saying it's time for White to make a concession and start thinking about counterplay. That in itself is inhuman in chess terms, so it's a significant fact on the ground. So now it's saying half a pawn of advantage to Black. And, okay, the other thing that my model has to deal with is that computer values jump around. Oh, it actually says that rook d8 was not the right move to play.
So, I mean, I'd certainly say that Niemann is not cheating with Stockfish 15 in this position, because what the computer is saying is that you should have played your ...Na5 strategy first and then decided which file to put the rook on. Instead, Niemann played here, allowed White to improve the king, then moved the knight, and then moved the rook to a different square. So you can see that over those three moves, Black actually lost a tempo. So this suggests, given that this was the critical position... I guess you could define the critical position as those positions where the computer is most useful, as opposed to a human brain. Right. And this is clearly a critical position; the whole board has just transformed, and that's kind of what defines a critical position. And this is where a computer would have been useful, and here's where Hans makes his weakest moves. Yeah. So it suggests that he was not cheating over the board here. Right. But now Black played the second-best move, taking the pawn right away, when ...Ne4 was also a move to consider. But he took the pawn right away because he sees that he can get the bishop for it. Why can't Black take the pawn with the knight? That's a good question. What happens if I take the pawn with the knight? Probably something really bad. In fact, actually, the computer, Stockfish, is saying that Black shouldn't grab the pawn yet at all. The pawn is going to fall; why not improve your position first? Well, the bishop-takes-pawn attack... Yes, the bishop-takes-pawn attack, if you're working that out. Okay. Yes. Alright. So, anyway, bishop takes pawn. That was a little premature by Hans. And that could even be, you know, you could say, youthful exuberance. He's got a better game against the world champion. Yeah.
So White got counterplay, and, you know, Black's knight here is sort of covered by White's bishop. I want to address that too, because after the game, again, there were a lot of people giving mostly circumstantial evidence that there was cheating involved. A lot of people said it was unusual for Magnus to not have any chances at all. But we see here, Magnus did have some counterplay; he had some chances. Yeah. Well, he should play rook d8 here, because of ...f5, and now play here. You know, this does leave some weaknesses, and if White plays f3, that's the right way to break up Black's pawn chain. Magnus did this instead. And now the problem is that, by White not having the escape square that f3 would have made, Black can actually very effectively start depriving White's bishop of squares. So this bishop is actually starved at the moment. Okay, so here's the critical question. This is move... where did g4 occur? Move 28? Move 28, yeah. So the critical question is: this is a crucial position where Magnus is offering up a shiny object. Most people would take that shiny object in a regular blitz game or whatever, and Hans didn't. So what suggests to you that there was no cheating when he didn't take that shiny object? Obviously, he knew that Magnus was planting a trap, because it even looks like a trappy sort of move. Yes, that's right. So you're saying that, to Hans, g4 looks like a trap. Yeah. Magnus maybe was still thinking, well, I have the better pawn structure, and I've got a bishop against the knight, so if I open up the game, I'll be able to at least equalize, and maybe even have chances again. But here, however, Magnus would have been well advised, as the computer would say, to remember the proverb taught to every Russian schoolboy: all rook endgames are drawn.
So Magnus should have whipped off the knight and started going toward making a draw. But instead, Magnus moved... Right, because it was kind of almost equal, slightly better for Black, but almost equal. Equal-ish. But maybe he wanted to, you know, go for the win. One thing that makes him world champion is that he goes for a win in equal positions or slightly worse positions. He's known for that, and people fall into his traps. But could it be the case... how would you identify Hans as not cheating in the moves that followed after g4, after the potential trap? Yeah, well, at least in this run of Stockfish 15, he did not always play Stockfish's best move. He made slight, very human mistakes, and that's kind of what my model picks up in the official run. So another issue that I deal with, quite in general, is that there's no such thing as, quote, the computer's move. The move values are a distribution. For instance, here, Hans played a humanly wonderful move, and I would say it's definitely the best move. But in this trial of Stockfish 15 to depth 22, Stockfish actually prefers the knight move, allowing White to get into a rook ending again. Better than where it was before, but still. Though, well, I guess the computer is saying that taking on c4 is really lost for White. So, what do I know. At any rate, if you run the computer to a slightly different depth, I bet there would be a depth where ...e3 is the best move. Well, let me ask you a question. Was ...e3 not the best move? Right now the computer is at depth 22. At depth 5, was ...e3 in the running? At depth 10, was ...e3 in the running? Yeah, it's a good question. I don't know. Let's see if I can copy the move code of the position quickly.
Because that's what you were suggesting earlier: if for the first 15 depths something like ...e3 doesn't show up, but then suddenly it shows up as the best move, and then Hans plays it, that's suggestive of cheating. But if ...e3 was always one of the top three suggestions from depth 1, it's a move like any other. Yeah, that's right. Let's see. Is there a quick way to get the... You could start analyzing the position from depth 1, right? Yeah, I could, indeed. I'm just trying to... I have the Arena chess interface here. Oh, wait a second, I'm being very slow here. I just opened up my copy of the games. So here we go: Sinquefield Cup, Carlsen, Niemann. So now let me go to this position. What move is it? It's move 32. Okay, so now, move 32, with the ...e3? Yeah. Oh, you mean move 32 for Black. Okay. So this is Stockfish 15, but let's load another engine. I have all the engines and their versions here. Do you have AlphaZero? Oh, I don't have AlphaZero, and unfortunately I don't have a working PC version of Leela. Well, I may have Lc0 for GPU, but I'm not sure it's really functional. No, let's use Stockfish, just to see. Stockfish 15, good for you? Yeah. Now, there are many other little variables here. What size of hash table do you want? Do you want it in multi-PV mode or single-PV playing mode? How do you want it? I don't know what those are, actually. Oh, well, they make a big difference to the distribution of move values that you see. Let's run it on four of my processor cores. So we sacrifice scientific reproducibility, but we get a stronger engine. Okay. We just want to see if ...e3, like the game you showed me with Kramnik and Anand, was an impossible move to see. Okay. Well, on this run, the engine is liking ...e3 clear up through depth 29, 30.
You're still seeing my screen, right? 32? Yeah. Okay. So it definitely likes e3, which is different from the same Stockfish 15 engine with the 4-PV mode. But what about depth 5? Depth 5. Let's scroll up to depth 5. At depth 5, it was knight c4. Look at that. At depths 5 and 6 it wanted knight c4. But at depth 7, in the twinkling of an instant, it switched over to e3 for all the rest of the time. And can you see the second and third suggestions at depths 3, 4, 5? Because I'm just very curious. I have to change over to multi-PV mode, else they won't show. So I'm going to go in here, I'm going to clear the hash table so we get a clean run from a fresh start. And now I'll do what Chess24 does, show 4 variations. Oh, actually they have 3, so I'll show their 3. ChessBot used to shuffle. Okay. Chess24 being the company Magnus owns, by the way, just to add? Yep, that is being acquired by chess.com. So now, again, the analysis came out evenly. At depth 4, it thought that Black was completely winning with rook c1 followed by rook a1, but that's silly. Okay. But at depth 5, it has e3 as the second choice? Yeah, at depth 5 as second choice, with rook c6 protecting the knight as top choice. Then at depths 8 and 9, it switches over to e3 being the top choice. But let's... So it's not unusual for a 2700- or even a 2500-level player to see a second choice that's 5 moves deep. Right. But also notice that here knight c4, it says, is only eleven hundredths of a pawn worse at depth 29. So the moves are really coequal as far as it's concerned. Right. So any variety of moves... what you're saying is that there were several moves that were winning here. It was hard for Hans to lose here whether he had a computer or not. Right. Between the 2 top moves, now they're exactly tied, 1.86. So even though e3 looks like a trickier move, it really is not, you know, consequential that it was a trickier move. And in fact, now knight c4 has taken over at depth 32.
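The behavior described above, where the engine prefers Nc4 at depths 5 and 6 and then settles on e3 from depth 7 onward, can be checked mechanically. This is a toy sketch I'm adding for illustration, not Regan's actual tooling: given the engine's best-move-per-depth history, find the move it settles on and the depth where its top choice last changed.

```python
def settling_depth(pv_history):
    """Return (move, depth) where the engine's top choice last changed.

    pv_history: list of (depth, move) pairs in increasing depth order.
    """
    if not pv_history:
        raise ValueError("empty history")
    settle_depth, settled = pv_history[0]
    for depth, move in pv_history[1:]:
        if move != settled:
            # The engine changed its mind at this depth.
            settled, settle_depth = move, depth
    return settled, settle_depth

# Mimicking the run described above: Nc4 through depth 6, e3 from depth 7 on.
history = [(5, "Nc4"), (6, "Nc4"), (7, "e3"), (8, "e3"), (29, "e3")]
print(settling_depth(history))  # -> ('e3', 7)
```

A move that only surfaces at high depth, as in the Kramnik game mentioned earlier, would show a large settling depth; a move the engine liked from depth 7 on is far less remarkable.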
Right. Yeah. So the evidence, according to you, is that at least in this one game, and I'm curious if you've looked at the other games from this tournament and Hans's other over-the-board tournaments... Yeah. He's not cheating in this game. He's not cheating. So, yeah, as the official press release stated, I did not have any indication of cheating in this game. He played well, but there's a large gulf between the threshold for well and the threshold for cheating. What about other Hans games that you've looked at in recent months? Evidence of cheating in over-the-board chess at all? And how many games of his have you looked at? Well, I have screened 106 events, counting online plus over the board, since January 2020. Okay, and no cheating online or offline? I get a completely normal distribution by my ROI measure. My ROI screening measure is on the scale of flipping a coin 100 times, with 50 as the expectation and 5 as the standard deviation. So I have 106 readings. So the standard deviation is 5, so on one out of every 44 readings you'll get more than 2 sigma up. Okay? That just happens by natural chance. And I have a few of those. And just to describe that again: Hans is expected, a certain percentage of the time based on his rating, to match the computer's first pick. Right. And it's not unusual in some games for him to be 2 standard deviations above that or 2 standard deviations below that. Right. But I'm saying also, the entries in this table, our counting, cover the complete performance on the available games for the tournament. And important to point out, for a lot of the European tournaments, they don't post all their games. They post only the games that are broadcast. So my sample is definitely biased toward broadcast games.
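The "one in 44" figure Regan quotes follows directly from the normal tail. A minimal sketch of the arithmetic on his coin-flip scale (mean 50, standard deviation 5), using only the standard library; the function names are mine, not his:

```python
import math

def z_score(reading, mean=50.0, sd=5.0):
    """Rescale a screening reading to standard deviations above expectation."""
    return (reading - mean) / sd

def upper_tail(z):
    """P(Z > z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# A reading of 60 on this scale sits 2 sigma above expectation...
print(z_score(60))                 # -> 2.0
# ...which an honest player produces by chance about 1 time in 44.
print(round(1 / upper_tail(2.0)))  # -> 44
```

So with 106 readings on Hans, a handful of 2-sigma excursions is exactly what chance predicts, which is Regan's point.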
And yet I have a completely normal bell curve with median 49.8 out of 106 readings of Hans, counting both online and over the board. I could restrict it to just over the board, but I'd probably get a similar result. Again, let me play devil's advocate just for a second. Mhmm. Are you able to... if it's just one move out of the game, would you be able to detect that? One move out of the game? Probably not. But 3 moves out of the game? I did a quantitative run for Ben Johnson on his podcast. And if he always picks the 4th suggestion of the computer, would you be able to detect that? Well, sometimes it's a blunder. Yes, I do actually have the top-2 and top-3 tests programmed in my system. They're not calibrated, however; as it stands, they're slightly positively biased. And again, just to play the devil's advocate, is there anything you might be missing that would suggest he was cheating over the board? Is there any test you would like to do but can't, even for theoretical reasons? Well, yes. As I said, I work with minimal information. By the way, at depth 37, things boomeranged back to knight c4 being the best move. So, as I said, my model takes into account that moves are a distribution from the get-go, not any incorrect semantic categories such as, quote, the computer's move. Right. So, at any rate, yeah, it's possible. For instance, if I had the move times: if it's Grischuk playing, I can assume that Grischuk is in time pressure before move 30, and that could influence my determination. So, okay, it's a bit of a side point, but... No, no, that's a good point, because with time pressure, maybe you could adjust your algorithm to adjust Grischuk's rating down when he's in time pressure.
The most important and most difficult data, and this is on the tournament staff, is to record times when a player was away from the board, possibly in the bathroom, or to record times when anything unusual might have been observed, any unusual movements of a player. This kind of video analysis has been important in some very high-level cases played online, but it can apply over the board as well. And then it is possible that there might be a particular bump of correlation to the player's moves at those independently determined junctures. But that's not what's happening here; you're just saying hypothetically. I have no such information here. Right. So for instance, if every time they went to the bathroom, the very next move was more highly correlated to the computer than other moves, then that suggests something. Right. And here's the thing: because of a much smaller sample, my z-score might go down. But the fact of this correlation itself is a different kind of evidence in the Bayesian calculus. Now, looking at the other circumstantial evidence, is there anything to the fact that his rating has zoomed so much in the prior 2 years? Right. Now, that's possible. One thing I will say is that, according to the pandemic post that I made, because Niemann played so much that he already had a K-factor of 10 at the beginning of the pandemic, my formula projects from 2465, which was his long-frozen rating, that he would go to 2582. Okay. So in other words, of his rating increase, a little bit more than half is exactly what I would project as something any player of his rating would do. So here's his rating: lack of in-person chess the first summer of the pandemic; that's when it's frozen. And then he did go to Europe. He was able to play, so that's why he has entries in my table. So, anyway, I wish there were a more formal study of these kinds of benchmarks.
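Regan's remark that a bathroom-break correlation is "a different kind of evidence in the Bayesian calculus" can be made concrete with odds form of Bayes' rule: each independent piece of evidence multiplies the prior odds by its likelihood ratio. A toy sketch, with every number below invented purely for illustration:

```python
def posterior_odds(prior_odds, likelihood_ratios):
    """Posterior odds = prior odds times the product of the likelihood
    ratios of independent pieces of evidence (odds form of Bayes' rule)."""
    odds = prior_odds
    for lr in likelihood_ratios:
        odds *= lr
    return odds

def odds_to_prob(odds):
    return odds / (1 + odds)

# Hypothetical numbers: a 1-in-1000 prior on cheating, a modest 3x
# move-match signal, and a 10x bump from moves right after away-from-board
# times. Even then the posterior stays below 3%.
odds = posterior_odds(1 / 999, [3.0, 10.0])
print(round(odds_to_prob(odds), 3))  # -> 0.029
```

The point is structural: a correlation with away-from-board times contributes its own likelihood ratio even when the smaller sample weakens the raw z-score.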
At any rate, my back-of-the-envelope formula would have put him at about... well, I guess you're not seeing my mouse... about 2600. Well, 2582. But factors in there could be, from what I understand, that he had a coach for the first time during this period. Yeah. So the addition of a coach could speed up the increase in rating. He also played more. I have seen something where they looked at junior players who made fast streaks up, and if you measure not by years but by number of games played... Mhmm. ...he did have enough games played that it could account for the increase in rating. Right. And then, see, what I think I'm essentially measuring is the amount of time spent improving one's game. And one of the things that ought to become part of a published paper, though it'll take a lot more work of dotting i's and crossing t's, whereas because of my role in the chess world I'm having to do this in real time, is that online chess has been just as good for improving one's chess as older offline chess has been. Okay. Well, actually, let me ask you about this. This is almost a different subject. Yeah. And I do want to ask you about this for a few minutes. But let me finish with Hans Niemann. So, over the board and online, from January 2020 on, as far as you could tell, no evidence of any cheating. Right. So given the increases in his rating as they happened, I do not get a large increase. The important contrast is that this is completely different from the picture I had with Igors Rausis, where, you know, I do my updates every week. I have tables with, for the pandemic, approaching 300,000 entries. So when I simply grabbed all the lines with Rausis, I saw, oh my goodness, the vast majority of these are above 50. And I'm wondering why chess.com was taking action. And by the way, I love chess.com. I think it's a well-managed company and site.
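Projections like Regan's 2465-to-2582 figure rest on the standard Elo update rule: each game moves the rating by K times (score minus expected score), with K=10 in Niemann's case. A minimal sketch of that rule; the run of games at the bottom is an invented example, not Niemann's actual results:

```python
def expected_score(rating, opponent):
    """Standard Elo expected score for the player with `rating`."""
    return 1 / (1 + 10 ** ((opponent - rating) / 400))

def update(rating, opponent, score, k=10):
    """One-game Elo update with K-factor k (score is 1, 0.5, or 0)."""
    return rating + k * (score - expected_score(rating, opponent))

print(expected_score(2465, 2465))  # evenly matched -> 0.5
print(update(2465, 2465, 1.0))     # one win at K=10 -> 2470.0

# Hypothetical streak: each win over 2465-rated opposition adds a bit
# less than 5 points as the rating climbs, so sustained strong results
# compound into a gain on the order of 100+ points.
r = 2465.0
for _ in range(30):
    r = update(r, 2465, 1.0)
print(round(r))
```

With K=10 the per-game steps are small, which is why a 100-plus-point climb implies a long run of above-expectation results rather than a few lucky games.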
I actually really think it's a great deal that they're buying Magnus's company. Mhmm. That aside, it's interesting to me that chess.com chose now to investigate cheating allegations from 2020 against Hans Niemann, you know, when he was 16 years old. Not that that excuses anything, but... Well, from 16 to 19, that's 20 percent of his life. So that's the thing. But, you know, I'm just analyzing the data. My role is just to put the data that has been collected out there in a scientifically neutral manner, so that it can be considered and hopefully avert cases of people rushing to judgment and doing silly, uninformed things. And this has been a major dialogue with FIDE, the International Chess Federation, on the whole: that they should take charge of this data so that they have it to hand and can deploy it rapidly with their own people, rather than a university professor who has to prepare lecture notes for a lecture in 90 minutes. Well, and you refer to that quite a bit, so I'm curious: why don't you start a company and simply charge every time you're asked about a cheating allegation? I don't have time for the kind of infrastructure that I would need in order to do that. If you can get people who would like to start a company like that on my behalf, that's fine. But I wear 3 hats, okay? I'm primarily a computational complexity theorist. I do quantum computing; I have a skeptical position on quantum computing. I'm researching a possible algebraic obstacle, which would amount to a new physical law, that may be an impediment to scaling quantum computers. Now, what are the odds that I'm right? You know, maybe 10,000 to 1 against. It's probably just silly Ken Regan, you know, not a super top professor, but with some ideas. Nevertheless, yeah. I mean, I'll show you the idea. You can find it on this blog; you just have to search "grilling."
I've had the idea out there for over a year, so I'm making fun of the fact that there are quantum grills. This is Amlan Chakrabarti, who was recently dean of computing at the University of Calcutta, but this work was started when he was a grad student visiting Buffalo in 2007. By the way, have you noticed a trend of Indian grad students now moving back to India instead of Silicon Valley after they graduate? Yeah, the world is more mobile in general. Yes. And some of my wonderful colleagues, Bharat Jayaraman and Shambhu Upadhyaya, have opened up a liaison with several Indian universities, with the connection over there. So, anyway, this is the sort of algebra I do, and the point is that the Gibbs polynomial of this circuit may be an impediment in quantum complexity the way the notion of geometric degree gives us the only known nonlinear lower bounds in ordinary complexity. So another article covered that. So the good thing there is that, for everybody worried about quantum computing's effect on cryptocurrencies, because it could break the cryptography behind them: don't worry anymore, if Ken is right. Right. Well, you know, here's why. By the way, this is Gil Kalai, a major Israeli mathematician, skeptic of quantum computing, coauthor of the paper that statistically refuted the book The Bible Code. Anyway, he's a friend of mine. But the point is, you know, maybe there's only a 1 in 10,000 chance I'm right, but $10 billion and more is invested every year in quantum computing. So you multiply 1 in 10,000 by 10 billion, and you're still talking about, well, at least a million dollars of potential value to my pursuing that instead of a chess company. So, unfortunately, the chess world needs me, so it gets my time. But there's really stuff I should be doing in quantum computing. So those are the parameters I work under. Okay.
So let me ask you this, and this is about chess. Have you moneyballed chess? In other words, are there statistical anomalies that one could take advantage of to better improve at chess? So, for instance... Well, my program absolutely could be used as a training tool. In particular, it could automate, or improve on, what I think a lot of players generally do, which is prepare moves that are risky but where the chance of the opponent finding the refutation is acceptably small. In other words, gambling moves in the opening; that's my student's term. And... Oh, so by looking for an opening situation where the computer doesn't see anything until ply 15, that could be an effective trap in the opening. But even more so, that's a great way to analyze traps which people could study, but it doesn't necessarily improve chess knowledge. That's true. What about moneyballing in the sense that, and Moneyball of course refers to the statistics used in baseball and Michael Lewis's famous book, but for example, have you studied which openings tend to be better to study for improving players than other openings, or what knowledge? Right. If someone wants to make that leap of improvement, is it better to study tactics, endgames, openings? Yeah. I saw one study you did where people are more likely to make mistakes when they're up a pawn, maybe because they have overconfidence or something like that. So are there other anomalies like that? Yes, and you could use my model to analyze those as well. In particular, I should call up one other relevant fact in this Niemann case. This is the article written by the famous chess trainer Jacob Aagaard, titled "Paranoia and Insanity." Okay. And he says that he can detect obvious big holes in Niemann's game and others.
And Niemann himself, before his following game against Firouzja, said that Firouzja has a weakness when he's being directly attacked. Well, you can try to detect, or at least label, such positions in my program and then compute the player's performance on those positions, and thereby objectively verify that, yes, this player is not so good when being attacked, or not so good with certain pawn structures, or at endgames. And I actually think I was a 2600-level player in endgames. When I co-won the US Junior in 1977, all 5 of my wins were in endgames. Wow. Yeah. So it'd be interesting to see, over an analysis of all games, which ones lead to performance results like that. Yeah. I mean, so I'm an adult improver. After a 25-year absence from tournament play, I am playing in tournaments again. And someone told me at the very beginning of that, your rating is going to instantly fall 200 points, which is what happened. Mhmm. After 25 years, you're rusty. And also, do you think age plays a factor? Do you think people who are older calculate less well and have to rely on other things, like more chess cultural knowledge? Yeah, I'd say that's probably true. You know, I find myself in research mathematics having to rely on my pattern matching more than deep calculations. Like, I've been very tired doing a maximum likelihood calculation to try to add a new feature to my model, and that is, you know, not working out so well. And so with chess, again, you had something like: after you win a pawn, be careful of mistakes. Are there other chess wisdoms that you've developed, that you've seen in your data? Not really. See, I haven't had a chance to play in tournaments.
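Regan mentions fitting his model by maximum likelihood. To illustrate the general idea only (this is a deliberately simplified toy, not his actual model), here is a softmax-style move model with a single "sharpness" parameter fit over a coarse grid: smaller values of the hypothetical parameter s mean sharper, more engine-like play.

```python
import math

def move_probs(evals, s):
    """Toy move model: P(move i) proportional to exp(eval_i / s).
    Smaller s concentrates probability on the engine's top move."""
    ws = [math.exp(e / s) for e in evals]
    total = sum(ws)
    return [w / total for w in ws]

def log_likelihood(positions, s):
    """positions: list of (evals, chosen_index) pairs."""
    return sum(math.log(move_probs(evals, i_evals := s)[i])
               if False else math.log(move_probs(evals, s)[i])
               for evals, i in positions)

def fit_s(positions, grid):
    """Maximum-likelihood fit of s over a coarse grid of candidates."""
    return max(grid, key=lambda s: log_likelihood(positions, s))

# Synthetic data: 20 positions where the top move was chosen, 2 where the
# second-best was. A player this accurate is fit with a sharp (small) s.
positions = [([0.5, 0.2, -0.1], 0)] * 20 + [([0.3, 0.25, 0.0], 1)] * 2
grid = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
print(fit_s(positions, grid))  # -> 0.05
```

A skill-labeled fit like this is also what would let you quantify Aagaard-style claims: label "under direct attack" positions, fit the parameter on just those, and compare.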
And, you know, since I've monitored so many tournaments, it's quite possible that I am the one player who can't really play in them. But I just haven't had time. Have you seen anything come out of the statistics, though, or have you researched anything? Most of your research, obviously, is on the cheating. Right. But I bet you could come up with various... it strikes me as amazing that no one's really studied which openings will give you the most improvement in rating points on average. Yeah. What I was about to say is, you know, AlphaZero is famously credited with showing the unsuspected value of the moves h4 or g4, and similar ones, h5 or g5 by Black, earlier in games. Right. And people copy that. Yeah. Right. Like Carlsen, for instance: after Matthew Sadler's studies on AlphaZero, Carlsen started playing h4 and g4 a lot more. Yeah, that's dead right. So there would be lots of other opportunities for things like that in chess. Well, Ken Regan, professor at the University at Buffalo, thank you so much for this analysis. I don't even know what else to ask. I hope you come on again. Yeah. And I've long admired all of your work, including... Mhmm. ...the one game we played that you don't remember, but that's okay. I was unknown, and you were the famous international master Ken Regan. So thanks once again. Is there anything else you think would be interesting to say that I'm not asking about the Niemann-Carlsen situation or chess cheating in general? Yeah. Well, I'd be very happy to do a separate show on Bayesian reasoning, you know, the doomsday argument, that sort of philosophical thing, and relations to discussions like Nassim Nicholas Taleb's fat tails, because my model intersects with some of that stuff as well. And I would love to talk about that.
So we'll definitely have to schedule that, and that's very much related to trading and investing. And I have a lot of... there's a whole difference between a normal curve with fat tails as opposed to a power-law situation, which he also discusses in his work, and this is very relevant for trading, particularly right now. Hypothetically, and I've talked about this on the podcast before, if China invades Taiwan, we're potentially going to have a 7-sigma event in the stock market, and the different ways to look at that are interesting. Yeah. And this is interesting to everybody if they really know what it means: your P versus NP work is very interesting, so we'd love to discuss that at some point too. And, yeah, all that more sciency stuff. But just to finish on this note: you see, the cheating data is not the unusual information here. The one unusual piece of information about this case is that Niemann had reviewed the bishop e6 move before the game. And I'm not going to say that that was unlikely in itself, but the effect, the distortion of people's thinking; that unusual bit, I think, has an outsized share of the mind space. And I think that's the right way to approach this case. Yeah. I think there's an Occam's razor here, which is that there was enough background suspicion on other things, and then Magnus lost in what was, for him, an unusual fashion, and it all kind of bubbled to the top and things got out of hand, and, you know, we'll see what happens. But then there's the more existential question: let's say someone has a computer chip in their head, you know, if Elon Musk's Neuralink works. Oh, yeah, this is a big thing. Then is chess over? Right. How do you define, you know, implants? How do we define human? Is this, you know, prosthetics? Oh, that's a big area. Alright.
Well, Ken, we'll discuss that another time. Thank you so much for coming on the podcast. I really appreciate it. Okay, thank you for having me. Introducing OptionsCard, Ireland's new multi-brand digital gift card. OptionsCard is perfect for employee and personal gifts, as you can customize it with images and video messages. Send instantly by email or print it out to send later. And best of all, it's completely free of any charges or fees. Redeemable for Ireland's biggest retail brands, OptionsCard lives in your mobile wallet, so you'll always know the current balance and you'll never lose a gift card again. OptionsCard: buy now at optionscard.au.