From: "World Chess Championship", INTERNET:newsletter@mark-weeks.com
Date: 00/05/15, 14:38
Re: Chess History on the Web (2000 no.10)

The next review, continuing the second pass through the Chess History bookmarks, is for Chess Downloads by Klaus Wrba. The site is at address...

http://home.t-online.de/home/wrba.klaus/download.htm

...and is a directory of sites which offer files of game scores for download.

Many chessplayers like to collect files of chess games, where the game scores are usually in PGN or Chessbase format. I don't really understand why they like to do this, although I'm certainly glad that they do. I collect files which correspond to the chess events documented on my own site, because the events are more meaningful when taken together with the games.

Wrba classifies download sites into seven categories...

A - International Databases
B - Chess sites with games
C - National Databases
D - Tournaments
E - Correspondence Chess International
F - Correspondence Chess National
G - Email Associations

...Each link is decorated with a flag showing the nationality of the owner. The links also have the title of the site, plus a rating and a description in German. For some reason, the domain of the home page is not the same as that of the linked pages, which are in www.games-of-chess.de.

While looking at the linked sites I found the 'Homepage of Lars Balzer' at address...

http://www.rhrk.uni-kl.de/~balzer/index.html

...which has 'links to downloadable chessgames around the internet', and which is very similar to the Wrba site. The Balzer list is organized as a single table, where each link is classified by country, with a brief description of the linked site.

I decided to look at both sites for this article. I considered two different approaches: I could limit the review to just these two sites -or- I could review the most interesting of the linked sites.
I decided to concentrate on the two master lists, to determine whether one has any advantage over the other, and to defer reviews of interesting linked sites to another time.

It is a whale of a job to find all Internet sites with game scores and to classify them. It is an even bigger job to maintain the links. The archive sections of the general chess directories, like Chessopolis, don't even come close to the number of sites linked by the Balzer & Wrba directories.

Having said that, it is straightforward to create a starter set of links. The process is:-

1) Locate lists that other people have assembled.
2) Merge the lists & eliminate duplicates.
3) Publish the new list on the Web.

I decided to try my hand at building a list of links to game scores. It took me about an hour to reduce the Balzer & Wrba lists, to eliminate the duplicates, and to upload the results to...

http://www.mark-weeks.com/sit-0e15.htm

...Although I learned some things about directories while doing the job, my new list is not particularly useful. The links need to be annotated to guide visitors to those sites which are most likely to deliver the sought-after files.

The real added value comes in classifying the linked sites. Both Balzer & Wrba have attempted to classify sites geographically, which may be the best approach. The most popular methods of classifying chess games are by event, by player, & by opening.

I prefer classification by event because it is the cleanest -- it's an easy call whether a game was played in a particular event or not. Once an event is fully documented, it becomes stable.

Geographical classification is one level higher. Events, whether local or international, which were played in Spanish towns & cities can logically be included in a collection of Spanish games. Some problems occur when national boundaries change -- as in the breakups of Yugoslavia and the Soviet Union -- but these are easily handled by convention.
Games played by correspondence, by email, or over the Internet may also need to be classified by convention.

Classification by player is a second reasonable option, but it has at least two disadvantages. The first is that every game has at least two players (some have more than two!), which means that many games end up in two collections. The second is that collections for active players are out of date as soon as the player competes in another event. Even collections for inactive players are subject to change when new game scores from exhibitions are unearthed from forgotten periodicals.

My least favorite system of classification is by opening. Game files constructed around a popular opening are out of date as soon as they are assembled. Some opening sequences are difficult to classify due to transposition. I've often seen the same game in two sources where the annotators classified the game in different sections of ECO. Resources like Chesslab are a more efficient way to classify & research opening theory.

Another question which confronts everyone who collects game scores is, 'Where do I draw the line?' Should any game by any player be collected -or- should the collection be limited to strong players only? If limited to strong players, how should 'strong' be defined? Should games between computers be included? Where one draws the line is largely a matter of personal taste.

I have two overriding general questions for any Web directory --

1) Are the links correct & operational?
2) Are they useful?

While examining the linked sites on my combined Balzer/Wrba list, I encountered some other issues. Different Web addresses frequently point to the same page -- a common example is an 'index.htm' address which serves as a default page. The addresses written with 'index.htm' deliver the same page as those written without. How can these duplicates be eliminated efficiently?

Another issue is which page to indicate when there are several good candidates on the same site.
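
On the first of these issues, the 'index.htm' duplicates, one possible approach is to reduce every address to a canonical form before comparing. The following is a minimal sketch in Python, not the method actually used to build the combined list. The function names, the example.org addresses, and the list of default-page names are all illustrative assumptions; in particular, a server is free to use some other default page, so 'index.htm' and the bare directory are only assumed to be the same page.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default-page names; a real server may be configured differently.
DEFAULT_PAGES = ("index.htm", "index.html")


def normalize(url):
    """Return a canonical form of url, used only for duplicate detection."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = parts.netloc.lower()          # host names are case-insensitive
    path = parts.path or "/"
    # Treat '.../index.htm' as equivalent to the bare directory '.../'.
    head, _, last = path.rpartition("/")
    if last.lower() in DEFAULT_PAGES:
        path = head + "/"
    # Drop the fragment: it names a position within a page, not another page.
    return urlunsplit((scheme, host, path, parts.query, ""))


def merge_links(*lists):
    """Merge several link lists, keeping the first spelling of each page."""
    seen = set()
    merged = []
    for links in lists:
        for url in links:
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged


# Hypothetical addresses standing in for entries from two directories.
list_one = ["http://example.org/a/index.htm", "http://example.org/b/"]
list_two = ["http://EXAMPLE.org/a/", "http://example.org/c/"]
print(merge_links(list_one, list_two))
```

Normalization of this kind only catches addresses that differ in form. It does nothing for the second issue, where two lists deliberately point at different pages of one site.
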
There are many sites where Balzer and Wrba have chosen to link different pages.

New problems arise afterwards when the list of links has to be maintained. One approach is to provide a mechanism where visitors can give feedback on changes. This is haphazard at best. Many people will take the time to suggest a new site, especially when it is their own creation, but not many will take the time to flag a broken link. This means that broken links need to be monitored constantly, which can be a tedious job for even a small set of links.

Fortunately, there are a few Web-based tools which check links on a page or on an entire site. I had previously bookmarked three addresses which perform link checking, so I went back to each in order to take a closer look. The three tools are similar -- given a Web address, they try to access each link on the page to determine if the target is active or not. After checking all links they issue a summary of their findings.

1) Web Site Garage at http://websitegarage.netscape.com/ is a Netscape/AOL service. It is limited to 25 links, which is not a lot for a links page, so I couldn't use it.

2) Site Check at http://siteowner.bcentral.com/sitecheck.cfm is a Microsoft Network (MSN) service. I discovered that it doesn't handle FTP addresses or local links correctly. In fact, it repeatedly reported a large number of active links as 'Fail', but didn't say why. I assumed that it simply didn't wait long enough. Since it also 'times out' on the Balzer page & offers no option to send its report to an email address, I couldn't use it.

3) NetMechanic at http://www.netmechanic.com/ is the best of the trio, although it doesn't handle FTP addresses.

NetMechanic returned the following statistics for the Balzer list...

126 ok
 21 file not found
  4 no such domain name
  1 no response from host

...which indicates that 17.1% of the links have a problem. It returned the following for the Wrba list...

186 ok
 35 file not found
 12 no such domain name
  3 no response from host
  2 access denied for robots
  1 access forbidden

...which is 22.2% with a problem. Although the Balzer page seems to be a little more accurate, I doubt that the difference is significant. It turns out that Wrba doesn't remove inactive links from his pages, which may also account for the higher percentage. I found a few other errors in the Wrba links...

'A' page:-
- the link for the TWIC archive is completely incorrect
- the link for the S. Mayer Chessbase is mistyped
'B' page:-
- the link for Chesscity is missing the address entirely

...which led me to believe that these links have never been properly checked.

On the positive side, Wrba rates each of his links with up to five stars, which makes it easy to determine which sites are the most promising. I made a quick check to determine the number of sites in each category, and came up with the following counts...

34 ***** [five stars]
23 ****
30 ***
66 **
27 *
14 [no stars]

...Some sites are listed on more than one page with a different number of stars, so the rating system is highly subjective. Along with the stars, Wrba classifies sites with a colored ball...

141 Blue : OK
 27 Yellow : No longer updated
 18 Red : No longer active

...The 'C' page also lists 8 sites with a blinking red ball, but I'm not sure what this means.

At first I was mystified as to why he would leave inactive sites on his lists -- most people try to cull these. When I saw that he is offering a set of CDs containing all the files that he has ever discovered on the Web, I understood. The CDs include files from sites which have disappeared.

---

There are many interesting sites linked from the two lists which document a piece of chess history. If you're looking for a specific game or file, you'll probably find it on one of the sites linked from the Balzer & Wrba lists, assuming that it exists somewhere on the Web.
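
As a check on the figures used to compare the two lists, the NetMechanic counts reported earlier can be turned back into percentages. The sketch below does that in Python. It also applies a pooled two-proportion z-test, which is not something the review itself carried out; it is included only as one conventional way to probe whether the gap between the lists is meaningful. The function names and dictionary layout are illustrative, and the test assumes the links behave like independent samples, which is an approximation at best.

```python
from math import sqrt


def problem_rate(counts):
    """Return (problem links, total links, percentage with a problem)."""
    total = sum(counts.values())
    problems = total - counts["ok"]      # everything that is not 'ok'
    return problems, total, round(100 * problems / total, 1)


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for the difference between two proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x2 / n2 - x1 / n1) / std_err


# Counts as returned by NetMechanic for each list.
balzer = {
    "ok": 126,
    "file not found": 21,
    "no such domain name": 4,
    "no response from host": 1,
}
wrba = {
    "ok": 186,
    "file not found": 35,
    "no such domain name": 12,
    "no response from host": 3,
    "access denied for robots": 2,
    "access forbidden": 1,
}

b_bad, b_total, b_pct = problem_rate(balzer)
w_bad, w_total, w_pct = problem_rate(wrba)
print(b_bad, b_total, b_pct)   # 26 152 17.1
print(w_bad, w_total, w_pct)   # 53 239 22.2

z = two_proportion_z(b_bad, b_total, w_bad, w_total)
print(round(z, 2))
```

Under these assumptions the statistic comes out near 1.2, short of the 1.96 usually required at the 5% level, which is in line with the doubt expressed above about the difference being significant.
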
Balzer's site appears to be maintained somewhat better, perhaps because Wrba's objective is to sell CDs rather than to maintain an online directory. For this reason, I'm going to change the Chess History bookmark to point to the Balzer site.

Bye for now,
Mark Weeks