Wayback Machine: difference between revisions

{{курзивни наслов}}
{{потребан превод}}
[[File:Internet Archive Wayback Machine logo.png|thumb]]
The '''''Wayback Machine''''' is a digital [[archive]] of the [[World Wide Web]] and other information on the [[Internet]], created by the ''[[Internet Archive]]'', a nonprofit organization based in [[San Francisco]], [[California]]. The archive was founded by [[Brewster Kahle]] and [[Bruce Gilliat]], and is maintained together with content from ''[[Alexa Internet]]'', a California subsidiary of ''[[Amazon.com|Amazon]]'' that collects commercial data on web traffic. The ''Wayback Machine'' service lets users view archived versions of [[web page]]s, that is, how those pages looked on a given date in the past, which the archive itself calls a "three-dimensional index".
The name ''Wayback Machine'' was chosen as a humorous allusion to a plot device in the animated cartoon ''[[The Rocky and Bullwinkle Show]]''. In one of the cartoon's component segments, ''[[Peabody's Improbable History]]'', the lead characters [[Mr. Peabody]] and Sherman routinely used a [[time travel|time machine]] called the ''[[WABAC machine]]'' (pronounced the same as ''wayback'') to witness, participate in, or alter certain well-known events of the past.<ref>{{cite news |first=Heather |last=Green |title=A Library as Big as the World |publisher=BusinessWeek |date=28 February 2002 |url=http://www.businessweek.com/technology/content/feb2002/tc20020228_1080.htm |accessdate=29 July 2007 }}</ref><ref>{{cite news|last=Tong|first=Judy|title=RESPONSIBLE PARTY — BREWSTER KAHLE; A Library Of the Web, On the Web|url=http://www.nytimes.com/2002/09/08/business/responsible-party-brewster-kahle-a-library-of-the-web-on-the-web.html|accessdate=15 August 2011|newspaper=[[New York Times]]|date=8 September 2002}}</ref>
== Origins, growth, and storage capacity ==
In 1996 Brewster Kahle, with Bruce Gilliat, developed software to [[Web crawler|crawl]] and download all publicly accessible World Wide Web pages, the [[Gopher (protocol)|Gopher]] hierarchy, the [[Netnews]] (Usenet) bulletin board system, and downloadable software.<ref name=ArchivingInternet>{{cite web|last=Kahle|first=Brewster|title=Archiving the Internet|url=http://www.uibk.ac.at/voeb/texte/kahle.html|publisher=Scientific American – March 1997 Issue|accessdate=19 August 2011}}</ref> The information collected by these "crawlers" does not include everything available on the Internet, since much of the data is restricted by the publisher or stored in databases that are not accessible. The crawlers also respect the [[robots exclusion standard]] for websites whose owners opt for them not to appear in search results or be [[Web cache|cached]]. To overcome inconsistencies in partially cached websites, the Internet Archive developed Archive-It.org in 2005 as a means of allowing institutions and content creators to voluntarily harvest and preserve collections of digital content and create digital archives.
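As an illustration of how a crawler can honor the robots exclusion standard described above, the following is a minimal sketch using Python's standard <code>urllib.robotparser</code> module. It is not the Internet Archive's own crawler code. The robots.txt contents, the <code>example.com</code> URLs, the helper name <code>may_archive</code>, and the use of <code>ia_archiver</code> as the crawler's user-agent token are all assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only. The "ia_archiver" token is
# an assumed user-agent name, not something stated in this article.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /private/

User-agent: *
Disallow:
"""


def may_archive(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/private/page.html", "/index.html"):
        url = "http://example.com" + path
        print(path, may_archive(ROBOTS_TXT, "ia_archiver", url))
```

A crawler built along these lines would skip any URL for which the check returns <code>False</code>, which is why pages excluded by their owners do not appear in the archive.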
The information was kept on digital tape for five years, with Kahle occasionally allowing researchers and scientists to tap into the clunky database.<ref>{{cite news |last=Cook |first=John |title=Web site takes you way back in Internet history |url=http://www.seattlepi.com/news/article/Web-site-takes-you-way-back-in-Internet-history-1070534.php |accessdate=15 August 2011|newspaper=Seattle Post-Intelligencer |date=November 1, 2001}}</ref> When the archive reached its fifth anniversary, it was unveiled and opened to the public in a ceremony at the [[University of California, Berkeley]].
Snapshots usually become available more than six months after they are archived, and in some cases only after 24 months or longer. The frequency of snapshots is variable, so not all updates to a tracked website are recorded. Intervals between snapshots sometimes run to several weeks or even years.
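The archive links cited in this article's references follow the pattern <code><nowiki>https://web.archive.org/web/<timestamp>/<original URL></nowiki></code>. The sketch below splits such a link into its capture time and original address. Reading the 14-digit timestamp as <code>YYYYMMDDhhmmss</code> is an assumption inferred from those citations (for example, <code>20070403030705</code> is paired with an archive date of 2007-04-03), and the names <code>Snapshot</code> and <code>parse_snapshot_url</code> are hypothetical.

```python
import re
from datetime import datetime
from typing import NamedTuple


class Snapshot(NamedTuple):
    captured_at: datetime  # capture time; no time zone is assumed
    original_url: str      # address of the page that was archived


# Assumed layout: /web/<14 digits>/<original URL>, as seen in the citations.
_SNAPSHOT_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_snapshot_url(archive_url: str) -> Snapshot:
    """Split a snapshot link into its capture time and original URL."""
    match = _SNAPSHOT_RE.match(archive_url)
    if match is None:
        raise ValueError(f"not a recognised snapshot URL: {archive_url!r}")
    timestamp, original = match.groups()
    return Snapshot(datetime.strptime(timestamp, "%Y%m%d%H%M%S"), original)


if __name__ == "__main__":
    snap = parse_snapshot_url(
        "https://web.archive.org/web/20070403030705/"
        "http://news.zdnet.com/2100-9584_22-5808754.html"
    )
    print(snap.captured_at.date(), snap.original_url)
```

Comparing the capture times of two successive snapshots of the same page gives the interval between them, which, as noted above, can range from weeks to years.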
After August 2008 sites had to be listed on the [[Open Directory Project|Open Directory]] in order to be included.<ref>{{cite web|url=https://www.archive.org/about/faqs.php |title=Internet Archive FAQ |publisher=Archive.org |date= |accessdate=2014-04-16}}</ref> According to Jeff Kaplan of the Internet Archive in November 2010, other sites were still being archived,<ref>[https://www.archive.org/post/309042/why-did-archiving-stop-suddenly-about-2-years-ago-for-our-site Archive.org forum thread with response by Jeff Kaplan], last update November 07, 2010</ref> but more recent captures would only become visible after the next major indexing, an infrequent operation.
{{As of|2009}} the Wayback Machine contained approximately three [[petabyte]]s of data and was growing at a rate of 100 [[terabyte]]s each month;<ref>{{cite news | first=Lucas |last=Mearian |title=Internet Archive to unveil massive Wayback Machine data center |publisher=Computerworld.com |date=March 19, 2009 |url=http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=hardware&articleId=9130081&taxonomyId=12&intsrc=kc_top| accessdate=2009-03-22}}</ref> the growth rate reported in 2003 was 12 terabytes/month. The data is stored on [[PetaBox]] rack systems manufactured by [[Capricorn Technologies]].<ref>{{cite news |first=Michael |last=Kanellos |title=Big storage on the cheap |publisher=CNET News.com |date=July 29, 2005 |url=http://news.zdnet.com/2100-9584_22-5808754.html |accessdate=2007-07-29 |archiveurl = https://web.archive.org/web/20070403030705/http://news.zdnet.com/2100-9584_22-5808754.html <!-- Bot retrieved archive --> |archivedate = 2007-04-03}}</ref>
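The figures above can be put side by side with a little arithmetic. This is a rough sketch only: it assumes decimal units (1 petabyte = 1,000 terabytes) and a constant growth rate, neither of which is stated in the cited sources, and the function names are hypothetical.

```python
TB_PER_PB = 1000  # assumption: decimal (SI) units


def growth_ratio(new_rate_tb: float, old_rate_tb: float) -> float:
    """How many times faster the newer monthly growth rate is."""
    return new_rate_tb / old_rate_tb


def months_to_add(size_pb: float, rate_tb_per_month: float) -> float:
    """Months needed to add size_pb petabytes at a constant monthly rate."""
    return size_pb * TB_PER_PB / rate_tb_per_month


if __name__ == "__main__":
    # 100 TB/month (2009) compared with 12 TB/month (2003)
    print(round(growth_ratio(100, 12), 1))  # 8.3
    # Time to add another 3 PB at 100 TB/month
    print(months_to_add(3, 100))  # 30.0
```

Under these assumptions, the 2009 growth rate was roughly eight times the 2003 rate, and the archive would have doubled its 2009 size in about two and a half years.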
In 2009 the Internet Archive migrated its customized storage architecture to [[Sun Open Storage]] and began hosting a new data center in a [[Sun Modular Datacenter]] on [[Sun Microsystems]]' California campus.<ref>{{cite web |title=Internet Archive and Sun Microsystems Create Living History of the Internet |publisher=[[Sun Microsystems]] |date=March 25, 2009|url=http://www.sun.com/aboutsun/pr/2009-03/sunflash.20090325.1.xml |accessdate=2009-03-27}}</ref>
In 2011 a new, improved version of the Wayback Machine, with an updated interface and fresher index of archived content, was made available for public testing.<ref name=WordpressArchive>{{cite web|title=Updated Wayback Machine in Beta Testing|url=http://iawebarchiving.wordpress.com/2011/01/24/updated-wayback-machine-in-beta-testing/|publisher=Archive.org|accessdate=19 August 2011}}</ref>
In March 2011 a post on the Wayback Machine forum stated: "The Beta of the new Wayback Machine has a more complete and up-to-date index of all crawled materials into 2010, and will continue to be updated regularly. The index driving the classic Wayback Machine only has a little bit of material past 2008, and no further index updates are planned, as it will be phased out this year."<ref>{{cite web|url=https://www.archive.org/post/350738/updated-wayback-machine-in-beta-testing |title=Beta Wayback Machine, in forum |publisher=Archive.org |date= |accessdate=2014-04-16}}</ref>
In January 2013 the Internet Archive announced that the Wayback Machine had reached a milestone of 240 billion URLs.<ref>{{cite web|url=http://blog.archive.org/2013/01/09/updated-wayback/ |title=Wayback Machine: Now with 240,000,000,000 URLs &#124; Internet Archive Blogs |publisher=Blog.archive.org |date=2013-01-09 |accessdate=2014-04-16}}</ref>
In October 2013 the Internet Archive announced the "Save a Page" feature<ref name=ia-2013-10>{{Cite web
| url = https://blog.archive.org/2013/10/25/fixing-broken-links/
| title = Fixing Broken Links on the Internet
| last = Rossi
| first = Alexis
| date = 2013-10-25
| website = archive.org
| publisher = Collections Team, the Internet Archive
| location = San Francisco, CA, US
| archive-url = https://web.archive.org/web/20141107193437/http://blog.archive.org/2013/10/25/fixing-broken-links/
| archive-date = 2014-11-07
| dead-url = no
| access-date = 2015-03-25
| quote = We have added the ability to archive a page instantly and get back a permanent URL for that page in the Wayback Machine. This service allows anyone &ndash; wikipedia editors, scholars, legal professionals, students, or home cooks like me &ndash; to create a stable URL to cite, share or bookmark any information they want to still have access to in the future.
}}</ref> which allows any Internet user to archive the contents of a URL. This created a risk of the service being abused for [[Drive-by download|hosting malicious binaries]].<ref name=vt-207-241>{{Cite web
| url = https://www.virustotal.com/en/ip-address/
| title = IP address information
| author = The VirusTotal Team
| date = 2015-03-25
| website = virustotal.com
| publisher = [[VirusTotal]]
| location = Dublin 2, Ireland
| archive-url = https://web.archive.org/web/20140714232311/https://www.virustotal.com/en/ip-address/
| archive-date = 2014-07-14
| dead-url = no
| access-date = 2015-03-25
| quote = 2015-03-25: Latest URLs hosted in this IP address detected by at least one URL scanner or malicious URL dataset. ... 2/62 2015-03-25 16:14:12 [complete URL redacted]/Renegotiating_TLS.pdf ... 1/62 2015-03-25 04:46:34 [complete URL redacted]/CBLightSetup.exe
}}</ref><ref name=goog-sb-ia1>{{Cite web
| url = http://www.google.com/safebrowsing/diagnostic?site=archive.org
| title = Safe Browsing Diagnostic page for archive.org
| author = Advisory provided by Google
| date = 2015-03-25
| website = google.com/safebrowsing
| publisher = [[Google]]
| location = Mountain View, CA, US
<!-- | archive-url = Page cannot be crawled or displayed due to robots.txt.
| archive-date = -->
| dead-url = no
| access-date = 2015-03-25
| quote = 2015-03-25: Part of this site was listed for suspicious activity 138 time(s) over the past 90 days. ... What happened when Google visited this site? ... Of the 42410 pages we tested on the site over the past 90 days, 450 page(s) resulted in malicious software being downloaded and installed without user consent. The last time Google visited this site was on 2015-03-25, and the last time suspicious content was found on this site was on 2015-03-25. ... Malicious software includes 169 trojan(s), 126 virus, 43 backdoor(s).
}}</ref>
As of December 2014 the Wayback Machine contained almost nine [[petabyte]]s of data and was growing at a rate of about 20 [[terabyte]]s each week.<ref>{{cite web |title=Internet Archive Frequently Asked Questions |url=https://archive.org/about/faqs.php |date= |accessdate=2015-01-17}}</ref>
Between October 2013 and March 2015 the website's global Alexa rank changed from 162<!-- Old Infobox data, preserved here: {{DecreasePositive}} 162 ({{as of|2013|10|29|alt=October 2013}}) --><ref name=alexa-2013-10>{{Cite web
| url = http://www.alexa.com/siteinfo/archive.org
| title = Archive.org Site Info
| publisher = [[Alexa Internet]]
| archive-url = https://web.archive.org/web/20131028025923/http://www.alexa.com/siteinfo/archive.org
| archive-date = 2013-10-28
| dead-url = yes
| access-date = 2013-10-29
}}</ref> to 208.<ref name=alexa-2015-03>{{Cite web
| url = http://www.alexa.com/siteinfo/archive.org
| title = Archive.org Site Overview
| publisher = [[Alexa Internet]]
| archive-url = https://web.archive.org/web/20150409101131/http://www.alexa.com/siteinfo/archive.org
| archive-date = 2015-04-09
| dead-url = yes<!-- set to yes, because the alexa page will show a new current rank, not the as-of-date rank, invalidating this reference. -->
| access-date = 2015-04-09
}}</ref>
{| class="wikitable" style="margin: 1em auto 1em auto;"
|+Wayback Machine page growth
|-
! scope="col" |Year
! scope="col" |2005
! scope="col" width="180px" |2006&ndash;08<!-- col width=180px helps ref #s not to wrap -->
! scope="col" width="180px" |2009&ndash;12
! scope="col" |2013
! scope="col" |2014
! scope="col" |2015
|-style="width: 100%"
!Number of pages archived {{break}} (billion)
|40<ref>{{Cite web|url=http://www.archive.org/index.php|archive-url=https://web.archive.org/web/20051231080301/http://www.archive.org/index.php|archive-date=2005-12-31|dead-url=yes|title=Internet Archive Wayback Machine|publisher=Internet Archive|access-date=2015-03-25}}</ref>
|85<ref>{{Cite web|url=http://www.archive.org/index.php|archive-url=https://web.archive.org/web/20061228011056/http://www.archive.org/index.php|archive-date=2006-12-28|dead-url=yes|title=Internet Archive Wayback Machine|publisher=Internet Archive|access-date=2015-03-25
}}</ref><ref>{{Cite web|url=http://www.archive.org/index.php|archive-url=https://web.archive.org/web/20071228170611/http://www.archive.org/index.php|archive-date=2007-12-28|dead-url=yes|title=Internet Archive Wayback Machine|publisher=Internet Archive|access-date=2015-03-25
}}</ref><ref>{{Cite web|url=http://www.archive.org/index.php|archive-url=https://web.archive.org/web/20081224073445/http://www.archive.org/index.php|archive-date=2008-12-24|dead-url=yes|title=Internet Archive Wayback Machine|publisher=Internet Archive|access-date=2015-03-25
}}</ref>
|150<ref>{{Cite web|url=http://www.archive.org/index.php|archive-url=https://web.archive.org/web/20091220201119/http://www.archive.org/index.php|archive-date=2009-12-20|dead-url=yes|title=Internet Archive Wayback Machine|publisher=Internet Archive|access-date=2015-03-25
}}</ref><ref>{{Cite web|url=http://www.archive.org/index.php|archive-url=https://web.archive.org/web/20101230100945/http://www.archive.org/index.php|archive-date=2010-12-30|dead-url=yes|title=Internet Archive Wayback Machine|publisher=Internet Archive|access-date=2015-03-25
}}</ref><ref>{{Cite web|url=http://www.archive.org/index.php|archive-url=https://web.archive.org/web/20110830094047/http://www.archive.org/index.php|archive-date=2011-08-30|dead-url=yes|title=Internet Archive Wayback Machine|publisher=Internet Archive|access-date=2015-03-25
}}</ref><ref>{{Cite web|url=https://www.archive.org/index.php|archive-url=https://web.archive.org/web/20121231094801/https://archive.org/index.php|archive-date=2012-12-31|dead-url=yes|title=Internet Archive Wayback Machine|publisher=Internet Archive|access-date=2015-03-25
}}</ref>
|373<ref>{{Cite web|url=https://www.archive.org/|archive-url=https://web.archive.org/web/20131231032705/https://archive.org/|archive-date=2013-12-31|dead-url=yes|title=Internet Archive Wayback Machine|publisher=Internet Archive|access-date=2015-03-25}}</ref>
|400<ref>{{Cite web|url=https://blog.archive.org/2014/05/09/wayback-machine-hits-400000000000|title=Wayback Machine Hits 400,000,000,000!|author=michelle|publisher=Internet Archive|date=2014-05-09|archive-url=https://web.archive.org/web/20140826191225/http://blog.archive.org/2014/05/09/wayback-machine-hits-400000000000/|archive-date=2014-08-26|dead-url=no|access-date=2015-03-25}}</ref>
|452<ref>{{Cite web|url=https://www.archive.org/|archive-url=https://web.archive.org/web/20150213001303/https://archive.org/|archive-date=2015-02-13|dead-url=yes|title=Internet Archive Wayback Machine|publisher=Internet Archive|access-date=2015-03-25}}<!-- Update me at end of 2015 --></ref>
|}
== References ==