Project Gutenberg

From Nordan Symposia
Revision as of 16:29, 12 August 2007 by Rdavis

Project Gutenberg, abbreviated as PG, is a volunteer effort to digitize, archive, and distribute cultural works. Founded in 1971 by Michael Hart, it is the oldest digital library. Most of the items in its collection are the full texts of public domain books. The project tries to make these as free as possible, in long-lasting, open formats that can be used on almost any computer. As of April 2007, Project Gutenberg claimed over 21,000 items in its collection. Project Gutenberg is affiliated with many independent organizations that share its ideals and have been given permission to use the Project Gutenberg trademark.


History

Project Gutenberg was started by Michael Hart in 1971. Hart, a student at the University of Illinois, obtained access to a Xerox Sigma V mainframe computer in the university's Materials Research Lab. Through friendly operators, he received an account with a virtually unlimited amount of computer time; its value at that time has since been variously estimated at $100,000 or $100,000,000.[1] Hart has said he wanted to "give back" this gift by doing something that could be considered to be of great value.

This particular computer was one of the 15 nodes on the computer network that would become the Internet. Hart believed that computers would one day be accessible to the general public and decided to make works of literature available in electronic form for free. He had a copy of the United States Declaration of Independence in his backpack, and this became the first Project Gutenberg e-text. He named the project after Johannes Gutenberg, the fifteenth-century German printer who propelled the movable type printing press revolution.

By the mid-1990s, Hart was running Project Gutenberg from Illinois Benedictine College. More volunteers had joined the effort. Most text was entered manually until image scanners and optical character recognition software improved and became more widely available, which made book scanning more feasible. Hart later came to an arrangement with Carnegie Mellon University, which agreed to administer Project Gutenberg's finances. As the volume of e-texts increased, volunteers began to take over the project's day-to-day operations that Hart had run.

Pietro Di Miceli, an Italian volunteer, developed and administered the first Project Gutenberg website and started the development of the project's online catalog. In his ten years in this role (1994–2004), the project's web pages won a number of awards and were often featured in "best of the Web" listings, which contributed to the project's popularity.[2]

Recent developments

In 2000, a non-profit corporation, the Project Gutenberg Literary Archive Foundation, Inc., was chartered in Mississippi to handle the project's legal needs. Donations to it are tax-deductible. Long-time Project Gutenberg volunteer Gregory Newby became the foundation's first CEO. Also in 2000, Charles Franks founded Distributed Proofreaders (DP), which allowed the proofreading of scanned texts to be distributed among many volunteers over the Internet. This effort greatly increased the number and variety of texts being added to Project Gutenberg and made it easier for new volunteers to start contributing. As of 2007, the more than 10,000 DP-contributed books made up almost half of the nearly 21,000 books in Project Gutenberg.

Starting in 2004, an improved online catalog made Project Gutenberg content easier to browse, access, and link to. Project Gutenberg is now hosted by ibiblio at the University of North Carolina at Chapel Hill.

Scope of collection

As of April 2007, Project Gutenberg claimed over 21,000 items in its collection, with an average of over fifty new e-books being added each week.[3]
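
The weekly average cited in note 3 can be checked with a quick calculation. This is only an arithmetic sketch using the two figures given in that note; the variable names are illustrative.

```python
# Figures taken from note 3 (gutindex-2006): items posted and weeks elapsed.
items_posted = 1653
weeks_elapsed = 33

# Average number of new items per week over that period.
weekly_average = items_posted / weeks_elapsed
print(f"{weekly_average:.2f} items per week")  # 50.09 items per week
```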

These are primarily works of literature from the Western cultural tradition. In addition to literature such as novels, poetry, short stories, and drama, Project Gutenberg also has cookbooks, reference works and issues of periodicals. The Project Gutenberg collection also has a few non-text items such as audio files and music notation files.

Most releases are in English, but there are also significant numbers in many other languages. As of April 2007, the non-English languages most represented are (in order): French (1025 files), German (431), Finnish (384), Dutch (264), and Spanish (147).[4]

Whenever possible, Gutenberg releases are available in plain text, mainly using US-ASCII character encoding but frequently extended to ISO-8859-1. Other formats may be released as well when submitted by volunteers, with the most common being HTML. Formats which are not easily editable, such as PDF, are generally not considered to fit in with the goals of Project Gutenberg, although a few have been added to the collection. For years, there has been discussion of using some type of XML, although progress on that has been slow.
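
The difference between US-ASCII and ISO-8859-1 can be illustrated with a minimal sketch using Python's built-in codecs. The function name `narrowest_encoding` and the candidate order (with UTF-8 as a last fallback) are assumptions made for this example, not part of any Project Gutenberg tooling or policy.

```python
def narrowest_encoding(text: str) -> str:
    """Return the first encoding from a fixed list that can represent `text`.

    The candidate order (US-ASCII, then ISO-8859-1, then UTF-8 as a fallback)
    is an assumption made for this illustration, not a Project Gutenberg rule.
    """
    for encoding in ("ascii", "iso-8859-1", "utf-8"):
        try:
            text.encode(encoding)
            return encoding
        except UnicodeEncodeError:
            continue  # try the next, wider encoding
    raise ValueError("text cannot be encoded")


print(narrowest_encoding("Call me Ishmael."))  # ascii
print(narrowest_encoding("Les Misérables"))    # iso-8859-1
print(narrowest_encoding("Война и мир"))       # utf-8
```

Plain English text fits in US-ASCII, accented Western European letters need ISO-8859-1, and scripts such as Cyrillic fall outside both.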

Project Gutenberg e-texts have been distributed on CD-ROM.

Ideals

Michael Hart said in 2004, "The mission of Project Gutenberg is simple: 'To encourage the creation and distribution of ebooks.'"[5]

A slogan of the project is "break down the bars of ignorance and illiteracy", because its volunteers aim to continue spreading public literacy and appreciation for the literary heritage just as public libraries began to do in the late 19th century.

Project Gutenberg is intentionally decentralized. For example, there is no selection policy dictating what texts to add. Instead, individual volunteers work on what they are interested in or have available. The Project Gutenberg collection is intended to preserve items for the long term, so that they cannot be lost by any one localized accident. To ensure this, the entire collection is backed up regularly and mirrored on servers in many different locations.

Copyright issues

Project Gutenberg is careful to verify the status of its ebooks according to U.S. copyright law. Material is added to the Project Gutenberg archive only after it has received a copyright clearance, and records of these clearances are saved for future reference.

Unlike some other digital library projects, Project Gutenberg does not claim new copyright on titles it publishes. Instead, it encourages their free reproduction and distribution.

Most books in the Project Gutenberg collection are distributed as public domain under U.S. copyright law. The licensing included with each ebook puts some restrictions on what can be done with the texts (such as distributing them in modified form, or for commercial purposes) as long as the Project Gutenberg trademark is used. If the header is stripped and the trademark not used, then the public domain texts can be reused without any restrictions.

There are also a few copyrighted texts that Project Gutenberg distributes with permission. These are subject to further restrictions as specified by the copyright holder.

Criticism

Project Gutenberg has been criticized for lack of scholarly rigor in its e-texts: for example, there is usually inadequate information about the edition used, and original prefaces are often omitted. A marked improvement can be seen by comparing earlier texts with newer ones; most new e-texts preserve edition information and prefaces. The editions used are also not the most current scholarly editions, because those later editions are not usually in the public domain.

Project Gutenberg requires that all of its e-texts include a version in ASCII plain text where feasible, believing that it is the format most likely to be readable in the extended future. (It does not require an ASCII plain text version for mathematics or for languages that would be hard to represent in ASCII.) Project Gutenberg also includes a variety of generally open formats alongside the ASCII ones and generated from them. Some project members and users have requested more advanced formats, believing them to be much easier to read. ASCII text by definition cannot hold some information, such as bold, italics, superscript, and some non-English characters.
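
What is lost when text is reduced to ASCII can be seen in a small sketch. It uses Python's standard `unicodedata` module and is only one possible lossy conversion; the helper name `to_ascii_lossy` is hypothetical, and nothing here describes how Project Gutenberg volunteers actually prepare their ASCII versions.

```python
import unicodedata


def to_ascii_lossy(text: str) -> str:
    """Reduce `text` to US-ASCII, discarding whatever ASCII cannot hold."""
    # Decompose characters (NFKD), e.g. an accented letter becomes a base
    # letter plus a combining mark, and a superscript digit becomes a digit.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop every remaining character that has no ASCII representation.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii_lossy("Émile Zola, Thérèse Raquin"))  # Emile Zola, Therese Raquin
print(to_ascii_lossy("x² + y²"))                     # x2 + y2
```

The accents and the superscripts disappear from the output, which is the kind of information the criticism above refers to.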

Affiliated projects

All affiliated projects are independent organizations that share the same ideals and have been given permission to use the Project Gutenberg trademark. They often have a particular national or linguistic focus.

Although Projekt Gutenberg-DE was given permission to use the Gutenberg name years ago, not everyone considers it to be an affiliated project, because of philosophical differences. Projekt Gutenberg-DE claims copyright for its product and limits access to browsable web-versions of its texts.

References

  1. Template:Cite web
  2. Template:Cite web
  3. According to gutindex-2006, there were 1,653 new Project Gutenberg items posted in the first 33 weeks of 2006. This averages out to 50.09 per week. This does not include additions to affiliated projects.
  4. Template:Cite web
  5. The Project Gutenberg Mission Statement, Updated October 23 2004

(Project Gutenberg calls its products "ebooks," and that term is used here. The corresponding Wikipedia term is e-texts)

External links

Affiliated with Project Gutenberg

  • Distributed Proofreaders, a worldwide group of volunteer editors that is now the main source of ebooks for Project Gutenberg
  • Quotations Book, which has indexed Project Gutenberg and surrounded classic quotations with page content to show context
  • HTML Writers Guild provides guidance in using XHTML and XML markup for Project Gutenberg
  • Template:Gutenberg author (note that many of these have been renamed to Project Gutenberg for trademark concerns, and are not original with the Project)

Not Affiliated with Project Gutenberg

  • GutenMark: a tool for automatically creating high-quality HTML or LaTeX markup from Project Gutenberg etexts.
  • GutenPy: an open-source text reader and offline catalog browser for Project Gutenberg, written in Python with GTK for Windows and Linux.
  • Flippin: a commercial text reader and offline catalog browser for Project Gutenberg for Windows.
  • Text to iPod notes converter: a program used to transfer Gutenberg files onto an iPod in their entirety.
