Subject: Re: Kuperberg and Bruner re xxx
Date: Mon, 13 Jul 1998 11:46:51 -0500
From: Clarence Wilkerson

Using PDF formats for mathematical papers

(This was prompted by Bob Bruner's admission that he was not sitting
at a machine with Acrobat installed.)

One of the output options on Hopf is the Adobe PDF file. For unix
systems with a complete tex system, this may offer no advantage. For a
portable mac or pc or a windows based pc, it has the advantage that one
can print or view the document without having the tex fonts installed
or, in fact, having any tex installation at all.

Typical PDF files are 2-3 times the size of the original DVI file or
gzipped .ps or .lj files. This means that a 10 page paper that produced
a 90k DVI file might expand to a 300k PDF file.

Adobe distributes "AcroRead" from their website http://www.adobe.com
as freeware. Obviously Adobe wants to promote PDF use as a standard for
graphics enhanced text. AcroRead is available directly for PCs, Macs,
Linux(intel), Sunos(sparc), Solaris(sparc), and others. Most of the
p.d. BSDs can use either the Linux(intel) or Sunos(sparc) version in
emulation mode on the correct hardware. For example, I use Sunos(sparc)
AcroRead under OpenBSD on a sparc.

Typical system requirements are 10 megs of hard disk storage.

Later versions of ghostscript will interpret PDF also. One X-shell for
this is "xpdf". The version I tried had problems with embedded
postscript fonts (exactly what one wants to use with math).

If you need to produce the PDF files, ghostscript can do this also.
Early versions of this method produced large files. I have not tried
more recent versions. Adobe also sells many graphics tools that will
output PDF.

A cheap (but not free) solution is Adobe DISTILLER, which is packaged
in various ways by Adobe:

1) For pcs and macs, as ACROBAT, which includes some pdf writing and
scanner utilities as well as the distiller.
2) For unix boxes, as ACROBAT, missing some of the extras that the pc
version has.

Academic price for either of these packages is about $50. They are not
needed if you only want to read the papers in pdf format.

The final ingredient is the fonts to be embedded in the PDF file. If
one uses the normal DVIPS output, you get a pdf file that prints fairly
well on a PC, but for which the screen quality is well below
acceptable. On Hopf, this has been improved by using the Bakoma
scalable form of the CM fonts. On a unix workstation this gives screen
quality comparable to XDVI. The cost is a somewhat larger PDF file.

I don't know that PDF will win the standard wars, but it is a
convenient way to transmit math to sites with no tex installation. PDF
versions of all 400 papers on Hopf would certainly fit on a standard
CD.

Clarence Wilkerson
July 13, 1998
_____________________________________________________

Subject: Re: Morava on xxx
In-reply-to: Your message of "Sat, 11 Jul 1998 11:46:06 EDT."
Date: Mon, 13 Jul 1998 11:58:29 -0500
From: Clarence Wilkerson

This is a response to Jack Morava's posting and is intended to be
redistributed on the mailing list.

While I appreciate the tone of Jack's remarks, it's clear that his view
of Hopf is different than mine on at least one point:

In particular, many people send final versions of papers to Hopf, as
well as preliminary versions. I usually remove the earlier version from
public access when the final version arrives.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

In the spirit of full disclosure, Jack and I have known each other
since we were teenagers growing up in small neighboring towns in
southern Texas in the early 60's. Also, I use his name (without
permission) to illustrate the search engine on Hopf.
Best, Clarence
_____________________________________________________

From: Greg Kuperberg
Subject: Re: Kuperberg and Bruner re xxx
Date: Mon, 13 Jul 1998 10:04:40 -0700 (PDT)

Bob Bruner:
> As a result, I predict that in 10 to 20 years, the pre-electronic
> literature will be lost to our attention, not because it is unavailable,
> but just because it is harder to access.

There is a project called JSTOR (http://www.jstor.org/) that is
scanning the pre-electronic scholarly literature. Right now they only
have 8 mathematics journals, and compared to xxx it is all history and
no news. But since the project covers all scholarly disciplines
(especially humanities) and since it scans the journals all the way
back to the beginning, it is big enough to be interesting. They claim
that they have 2.5 million pages, which could be 50% more than xxx.

> Perhaps Krantz' article discusses these issues, but I don't have
> the paper copy of the Notices at hand (see!).

His article is available in the xxx system in PS, PDF, and DVI format,
and the Notices version is in HTML (just before they switched to PDF):

http://front.math.ucdavis.edu/math.HO/9801013
http://www.ams.org/notices/199708/page2.html

Timor Beke:
> As regards (i), dilution of scholarship (not specific to the mathematical
> sciences) began -- according to certain beholders -- with the rapid growth
> of journals and journal pages in the 80's.

According to Plato, it began when the invention of paper undermined the
oral tradition.

> (But writing "ease of submission" instead of "ease of publication", I do
> wonder: how will xxx safeguard itself against bozos painfully compiling
> and submitting 20-page proofs of Fermat's Last Theorem?)

Since xxx has a working, nearly universal physics archive, the right
question is not "how will it?" but "how does it?" The answer is that it
has several partial deterrents that together reduce crackpot
submissions to a tolerable level.
The most important barrier is that the archive is not much fun for
crackpots, because it offers them no evidence that anyone is paying any
attention to them. In addition, it is only available to those crackpots
who can master writing papers and submitting them to the archive;
although this process is as streamlined at xxx as anywhere else, most
crackpots just don't get along with computers at all. Finally, although
xxx is not in a position to reject a bad math or physics paper, it can
reclassify one which is off-topic. Many crackpot submissions can't be
categorized as any particular kind of mathematics or physics, so they
go into "General Physics", or they would go into a hypothetical
"General Mathematics".

I used to be a co-moderator of sci.math.research, and I can tell you
that it doesn't take much to deflate crackpots provided that they
aren't getting any feedback. In sci.math they at least get criticized
and lampooned; filtering out these responses is at least as important
for sci.math.research as filtering out the original submissions.

> Incidentally: is it the case that xxx is among the largest public
> libraries of (formatted) electronic texts? Or can some order-of-magnitude
> comparisons be given with other projects, so as to help with any
> feasibility and cost estimates?

If you count physics, then yes, I think it is. As of today, the whole
system has 78725 e-prints. It's not as big going by total number of
pages as JSTOR, but that could be apples versus oranges. The relevant
comparison is with other mathematics and physics resources such as Math
Reviews, and it is more relevant to compare the "flux" (new submissions
per month) than the total number of holdings. Here is the flux into
various on-line repositories in Math Reviews units:

Collection                      Per month   % of MR
----------                      ---------   -------
Math Reviews                    4000        100
xxx (math+physics)              2000        50
xxx math                        160         4
math speciality archives (*)    80?         2?
all free math papers on web     1000?       25?

(*) The main non-xxx subject-based math archives, other than those that
have merged or will merge with xxx, are these, listed with xxx math for
comparison:

Archive                    Total papers
-------                    ------------
xxx math                   5462+784
mp_arc                     1600
Hopf                       400
Topology Atlas             388
K-theory                   290
Algebraic Number Theory    120
Conservation Laws          106
Other                      250?

I can only estimate the flux from the total number; many of these were
started around the same time as xxx. Also, there are a few "archives"
(such as AMSPPS) that rely on hyperlinks rather than archiving the
papers themselves. I am not counting these because of the inherent
structural problems of such a system (to the point that the number of
papers that they have is not well-defined).

As for cost, xxx's NSF grant is something like one million dollars for
three years, and I don't know what their support from DOE (via Los
Alamos) is. By comparison Math Reviews gets several million dollars of
revenue per year, and JSTOR might be about a million and a half per
year.

Greg