From: "World Chess Championship", INTERNET:newsletter@mark-weeks.com
Date: 00/02/01, 09:38
Re: Chess History on the Web (2000 no.3)

The next review, following the Chess History bookmarks, is for 'Two million games online' by ChessLab. The home page is at http://www.chesslab.com/, where we learn that the service provides 'Java based interactive search, analysis of 2 million chess games [...] move chess pieces on Java chess board; search and analyse any position for free [...] search games by position, players names, years, etc. over the Internet for free [...] interactive study of chess games, generation of PGN'.

It's not clear if this is a commercial site. My guess is 'no', because there are almost no banner ads & I see no other way that it generates revenue. The bottom of the page says 'Copyright © 1999, GameColony.com', which links to a page offering 'Free Online Board and Card Games'.

The Usenet posts announcing the service started around the end of 1998. Network Solutions 'Web Interface to Whois' says that the record for the site was created on 31-Aug-1998. The site is registered to BPS Software, Inc. & all contacts are listed as Boris Shneyderman.

The Java fun starts on http://www.chesslab.com/PositionSearch.html. This search interface has two frames -- on the left is a chess board with some controls; on the right are four tabs. The default 'Search games' tab controls a database search using parameters & buttons. The other three tabs are 'Analyse', which suggests moves for a given position; 'Magazine 12/99', which opens up the 'ChessLab Interactive Magazine'; and 'Board', which sets some options for the left frame.

On the 'Search games' tab there are five buttons for 'Help', 'Search games', 'Openings', 'Position stats', and 'Download'. The help button opens a new window which explains the parameters and buttons. It links to other help pages, of which the most useful is the index http://www.chesslab.com/Help/h_indexchess.htm.
The 'Search games' button launches a database search based on the search parameters:-

- White Player
- Black Player
- Both colors [checkbox]
- City/Location *
- Game Result ['White won', 'Black won', & 'Draw' checkboxes]
- Date range ['Latest games (1991-present)' & 'Historical archive (1485-1990)']
- Year
- Rating *

The parameters marked '*' are 'Accurate for chess games after 1997'. There is also a link to set some 'Advanced options', which are:-

- Hits per page
- With checking ['Castling' & 'En passant captures']
- Player name scan ['Fast', 'Similar', & 'Full scan']
- Max. moves
- Reverse sort by year

The 'Openings' button allows a selection of ECO codes. When one is selected, the moves are played automatically on the board in the left frame.

The 'Position stats' button searches the database and returns statistics [% won by White, % won by Black, % Draw] for games in the selected database. The initial position of a chess game gives:-

- Historical archive 40%-28%-31%
- Latest games 38%-30%-30%

which means, for example, that White has won 38% of the games in the database played since the beginning of 1991. There is a roundoff problem here -- the numbers don't add up to 100%. I apportioned the missing 1-2% over each of the results using the same ratio, e.g. in the 'Historical archive', the result 'White won' gets 40.40%, 'Black won' gets 28.28%, etc. Then I calculated the expected point value of a game for the two databases, counting a win as 1 point and a draw as 1/2. This gives:-

- Historical archive W-0.561 B-0.439
- Latest games W-0.541 B-0.459

which indicates that White's expected point value is decreasing over time. I did some more tests of this interesting statistics function, which I'll post on the discussion group as a response to this article.

The 'Download' button opens up a new window, which offers to create a file either from the game on the board -or- from the games returned in the latest search. The file is written to the same new window, where it can be saved as a PGN file.
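As an aside on the 'Position stats' figures above, the apportioning and the expected point value can be reproduced with a few lines of Python. This is only a sketch of the arithmetic as described, assuming a win counts as 1 point and a draw as 1/2; the function name expected_scores is made up for the illustration and has nothing to do with ChessLab.

```python
def expected_scores(white_pct, black_pct, draw_pct):
    """Return (White, Black) expected points per game.

    The three percentages are first divided by their sum, which spreads
    any roundoff shortfall over the results in the same ratio.
    """
    total = white_pct + black_pct + draw_pct
    white = white_pct / total
    black = black_pct / total
    draw = draw_pct / total
    # A win is worth 1 point, a draw half a point to each side.
    return white + draw / 2, black + draw / 2


# Percentages as reported by 'Position stats' for the initial position.
databases = [
    ("Historical archive", (40, 28, 31)),
    ("Latest games", (38, 30, 30)),
]

for label, stats in databases:
    w, b = expected_scores(*stats)
    print(f"{label}: W-{w:.3f} B-{b:.3f}")
# prints:
# Historical archive: W-0.561 B-0.439
# Latest games: W-0.541 B-0.459
```

Dividing each percentage by the sum of the three is the same as spreading the missing 1-2% over the results in proportion.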
I initially had some trouble with the download function because the buttons were never reset correctly after a download. I discovered in the help pages that this is because I run my browser without a local cache. The problem disappeared when the cache was reactivated.

Unfortunately, the PGN headers don't respect the PGN standard. The help page 'Please send your chess games' describes the PGN format used by ChessLab...

'In one email (or attached text file) you can send one or multiple games, where the format conforms to the following example:

[Event "8th Amber Blindfold"]
[Site "Monte Carlo MNC"]
[Date "1999.??.??"]
[White "Karpov, A"]
[Black "Ljubojevic, L"]
[Result "1-0"]
[WhiteElo "2710"]
[BlackElo "2571"]
[Round "6"]

1. d4 Nf6 [...] 30. Rd8 1-0

'Please note that standard export PGN format has no comments or diagram indicators. Also, please note that there is one space between the move number and the actual chess move.'

...This is the same format which is produced on a download. The specification for the PGN format is available on the site at http://www.chesslab.com/PGNDescription.txt. This is the standard dated 1994.03.12, by Steven J. Edwards, where the relevant section is...

'8.1.1: Seven Tag Roster

'There is a set of tags defined for mandatory use for archival storage of PGN data. This is the STR (Seven Tag Roster). The interpretation of these tags is fixed as is the order in which they appear. Although the definition and use of additional tag names and semantics is permitted and encouraged when needed, the STR is the common ground that all programs should follow for public data interchange.

'For import format, the order of tag pairs is not important. For export format, the STR tag pairs appear before any other tag pairs. (The STR tag pairs must also appear in order; this order is described below). Also for export format, any additional tag pairs appear in ASCII order by tag name.
'The seven tag names of the STR are (in order):

1) Event (the name of the tournament or match event)
2) Site (the location of the event)
3) Date (the starting date of the game)
4) Round (the playing round ordinal of the game)
5) White (the player of the white pieces)
6) Black (the player of the black pieces)
7) Result (the result of the game)'

... Note that 'Round' should be the fourth tag, but it is the last in ChessLab's header. Fortunately, this is only a minor annoyance, which most software probably copes with. I've never had a real problem with it.

My main interest is in the value of this site to the chess historian. It can also be used to research openings and positions, and it appears that the tool was designed primarily for this reason. I decided to perform some practical tests related to chess history.

---

I) How many Bobby Fischer games are in the 'Historical archive'?

1) A search on 'fischer' returned many games played by Fischers other than Robert James, so I had to narrow the search on name. The PGN standard says, 'If a first name or first initial is available, it is separated from the family name by a comma and a space.'

2) A search on 'fischer, r' returned 'Sorry, no games found for this position with current parameters'. Maybe the name is represented otherwise. A search on 'fischer,r' returned the same message. I decided to locate some Fischer games to see exactly how the name is spelled in the database. The following searches were all limited to 1972.

3) A search on 'fischer' & 'spassky' returned 21 games where Fischer's name is spelled 'Fischer, R', following the PGN standard. A search on 'fischer' [without Spassky] returned 22 games. There was one additional game against 'Parham, F'.

4) A search on 'fischer,' returned 21 games, including the Parham game, so I lost one game somewhere. A search on 'fischer, ' [note the trailing space] also returned 21 games.
5) A search on 'fischer r' [as well as on 'fischer, r' & 'fischer, r '] returned 'Sorry, no games found...'.

At this point, I gave up -- I don't know how to narrow the search on name. Unless I'm overlooking something obvious, the search doesn't handle players with the same family name -- searching for one of the Polgar sisters would be a problem.

Finally, I looked at the results of the searches where 21 games and 22 games were returned. For some reason, in some searches, game 13 of the 1972 match is returned twice with a different number of moves in the two games. The PGN headers seem identical. The extra game is returned on the search for 'fischer' & 'spassky', but not on the search for 'fischer' alone. There is something wrong here.

---

II) How many Morphy games are there?

If I can't get the number of Fischer games in one search, I'll have to go through the database year by year to get all of Fischer's games. I didn't want to do that for this article -- it is, after all, only a test. There should be little trouble with Morphy's name, so I decided to try that instead.

The results of a search are normally returned by descending year & ascending player names. The first game returned for 'Morphy, P' was played in 1898 (Morphy died in 1884, so it must be another Morphy!) against 'NN', the second was played in 1869. I can only get a maximum of 50 games each time. After clicking on 'Next' a few times I arrived at 'games from 251 to 280' & no 'Next' button. The last game in the list was for 'Morphy, E-Ford' in 1840. The next to last was for 'Morphy, P-Morphy, A' in 1847. There were two games in 1848.

I downloaded the 280 Morphy games. I also downloaded the corresponding file from UPITT...

http://www.pitt.edu/~schach/Archives/index2.html
ftp://136.142.185.47/group/student-activities/chess/PGN/Players/morphypg.zip

... and noted that morphypg.zip is a 56 Kbyte file, was created on 'Sun Jul 9 00:00:00 1995', and contains 400 games.
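A comparison of two PGN files by game date, like the one that follows, can be sketched in Python. This is an illustration only, not necessarily how the figures in this review were produced. It assumes that every game carries one [Date "..."] tag pair at the start of a line; the function names and the two tiny sample strings are made up.

```python
import re
from collections import Counter

# Captures the year of a PGN Date tag such as [Date "1858.??.??"].
# An unknown year is written "????" in PGN and is counted under that key.
DATE_TAG = re.compile(r'^\[Date\s+"(\d{4}|\?{4})', re.MULTILINE)


def games_per_year(pgn_text):
    """Count games by year, assuming one Date tag per game."""
    return Counter(DATE_TAG.findall(pgn_text))


def compare_by_year(pgn_a, pgn_b):
    """Return (year, count in a, count in b) rows sorted by year."""
    a = games_per_year(pgn_a)
    b = games_per_year(pgn_b)
    return [(year, a[year], b[year]) for year in sorted(set(a) | set(b))]


# Made-up sample data; real use would read the two downloaded files.
sample_a = (
    '[Event "?"]\n[Date "1858.??.??"]\n\n1. e4 e5 *\n\n'
    '[Event "?"]\n[Date "1858.06.01"]\n\n1. e4 e5 *\n'
)
sample_b = (
    '[Event "?"]\n[Date "1857.10.06"]\n\n1. e4 e5 *\n\n'
    '[Event "?"]\n[Date "1858.??.??"]\n\n1. e4 e5 *\n'
)

for year, count_a, count_b in compare_by_year(sample_a, sample_b):
    print(f"{year} {count_a:5d} {count_b:5d}")
```

Counting on the Date tag alone says nothing about duplicates; two copies of the same game with small differences in the moves are still counted twice.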
After comparing the files by game date, I discovered that most games in both files were played in three consecutive calendar years...

Year  ChessLab  UPITT
1857        57     86
1858       118    131
1859        53     91

...and that UPITT consistently has more games in any one year. I know from experience that the UPITT player files contain many duplicate games, with small variations in the moves of the duplicate games. Can that account for the difference of 120 games between the two files?

I compared the files on Morphy's opponents in 1858. After making a small correction on the UPITT file, where the PGN standard for names is not respected, I discovered many similarities, but also some differences. Here's a sample of both...

ChessLab   Games   UPITT      Games
Barnes, T      8   Barnes,T       8
Bird, H        4   Bird,H         4
Boden, S       9   Boden,S       10

Comparing the Boden games from 1858, I found the game missing in the ChessLab collection. It is not listed under another year in the ChessLab collection & is not duplicated in the UPITT collection. My conclusion is that the ChessLab collection is missing games that are available in other collections.

---

III) Kasparov - Anand

For some time now, I've wanted to add to my own site an overview of all games played between Kasparov and Anand. I've already looked up all games played between the two players since their 1995 match & decided to check this against the ChessLab database.

The 'Latest games' database returned 61 games. The 'Historical archive' returned 'Sorry, no games found...'. The following table gives the breakdown by result, e.g. the first line shows that Kasparov has won nine times with the Black pieces...

White        Black        Result  Games
Anand, V     Kasparov, G  0-1         9
Anand, V     Kasparov, G  1-0         5
Anand, V     Kasparov, G  =-=        11
Kasparov, G  Anand, V     0-1         3
Kasparov, G  Anand, V     1-0        15
Kasparov, G  Anand, V     =-=        18

...while the result over all 61 games is +24-8=29 for Kasparov.

How reliable is this calculation? Looking at the results for the 1995 PCA title match, I saw that games 1 & 7 are duplicated, and that game 5 is missing.
The duplicate game 1 has a version with Kasparov as White, although Anand had White in the odd numbered games. When I find the time to look at this more carefully, I'll post the analysis. All I can say for now is that Kasparov is an overwhelming favorite in any new match, which is really no surprise!

---

IV) Olafsson - Fischer

I often encounter discrepancies between PGN files and the printed literature. A recent example is the game Olafsson - Fischer from round 11 of the 1958 Portoroz Interzonal. The PGN game score I have on file does not match the score given in Fine's 'The World's Great Chess Games'.

The discrepancy starts where the PGN move 37...R8e4 does not match Fine's 37...R(K1)-K4, which is 37...R8e5 in algebraic notation. This type of discrepancy normally disappears after a move or two when the game scores converge to the same line. In these cases it is impossible to tell which move is correct & it usually makes little difference to the result.

What made this case different was that the PGN score & Fine's score fail to converge. The endings of both scores are completely different until Fischer resigns on his 44th move.

When I crosschecked with Wade & O'Connell's 'Bobby Fischer's Chess Games', I found a third variation. Wade's score matches Fine's until it diverges on the 40th move. It converges on the next move, not to Fine's score, but to the PGN score. If Wade's score is correct, then Olafsson's 40th move was a blunder which would have allowed Fischer to gain the advantage.

How many versions of this game does ChessLab's database have? A search on 'Olafsson - Fischer' returns four games, of which one is from the 1958 Interzonal, and which matches the PGN score. The move 37...R8e4 is more logical than 37...R8e5, so it seems that ChessLab has the correct score.

---

While I was finishing this review & double checking my work, I frequently received the error message...
'A network error occurred: unable to connect to server (TCP Error: Broken pipe) The server may be down or unreachable. Try connecting again later.'

...from the ChessLab server. This usually means that the server is having problems. Because of the complex database engine required to drive a service like this, I suppose that Shneyderman operates the server himself. He must be a very busy person -- writing the software, operating the server, & loading the new games. This is a tremendous task.

Let there be no doubt. This database is a great tool & I have already used it to prepare some new research. If a game has been converted to digital format, the chances are good that it has been loaded on ChessLab's database. But, as with many Web-based resources, its results are not completely trustworthy and need to be confirmed against another source.

Judging from his help pages, Shneyderman is very receptive to any additions & corrections which other people might offer. Maybe I'll send him a collection of my own games -- it's the only way they'll ever be published!

Bye for now,
Mark Weeks