GitHub Buried a Giant Open-Source Archive in an Arctic Vault
Microsoft-owned GitHub has finally moved its snapshot of all active public repositories on the site to a vault in Norway.
GitHub announced the archiving plan last November and followed through on February 20 with a 21-terabyte snapshot written to 186 reels of film.
Because of the coronavirus pandemic, GitHub cancelled plans for a team to “personally escort the world’s open-source code to the Arctic”. It left the job to local partners, who received the boxed films and deposited them in an old coal mine on July 8.
The archive is stored in Svalbard, Norway, a group of islands that is also home to the Global Seed Vault.
“The code landed in Longyearbyen, a town of a few thousand people on Svalbard, where our boxes were met by a local logistics company and taken into intermediate secure storage overnight,” said Julia Metcalf, director of strategic programs at GitHub.
“The next morning, it traveled to the decommissioned coal mine set in the mountain, and then to a chamber deep inside hundreds of meters of permafrost, where the code now resides fulfilling their mission of preserving the world’s open-source code for over 1,000 years.”
The archive includes active public repositories and significant dormant repos. The snapshot consists of the HEAD of the default branch of each repository, minus any binaries larger than 100 kB. Each repository is packaged as a single TAR file, and for efficiency’s sake most of the data is stored as QR codes.
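The packaging rule described above can be sketched in a few lines of Python. This is only an illustration, not GitHub's actual tooling: the function names, the NUL-byte test for "binary", and the reading of 100 kB as 100,000 bytes are all assumptions, and the sketch expects the default branch to be checked out in a local directory already.

```python
import tarfile
from pathlib import Path
from typing import List

# "100 kB" from the article, read here as decimal kilobytes (an assumption).
MAX_BINARY_BYTES = 100 * 1000


def looks_binary(path: Path, probe_bytes: int = 8000) -> bool:
    """Heuristic (an assumption): a NUL byte near the start marks a binary file."""
    with path.open("rb") as fh:
        return b"\x00" in fh.read(probe_bytes)


def snapshot_to_tar(checkout: Path, tar_path: Path) -> List[str]:
    """Pack a checked-out working tree into one uncompressed TAR file.

    Skips the .git directory and any binary file larger than MAX_BINARY_BYTES.
    tar_path should live outside checkout so the archive does not include itself.
    Returns the relative paths that were archived.
    """
    archived = []
    with tarfile.open(tar_path, "w") as tar:
        for path in sorted(checkout.rglob("*")):
            rel = path.relative_to(checkout)
            if ".git" in rel.parts or not path.is_file():
                continue
            if path.stat().st_size > MAX_BINARY_BYTES and looks_binary(path):
                continue  # large binary: left out of the snapshot
            tar.add(path, arcname=rel.as_posix())
            archived.append(rel.as_posix())
    return archived
```

Calling `snapshot_to_tar(Path("my-repo"), Path("my-repo.tar"))` on a hypothetical checkout would produce one TAR per repository. Large text files are kept, since the size limit in the article applies to binaries only.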
A human-readable index and guide will itemize the location of each repository and explain how to recover the data.
The Internet Archive separately kicked off its own archive of GitHub public repositories on April 13. Its Wayback Machine is archiving raw GitHub data as Web ARChive (WARC) files and has archived 55TB of data so far.
Later this month, the Internet Archive will use “git clone” to keep repositories available while also ensuring that repo comments, issues, and other metadata can be accessed on the web.
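As a rough illustration of the “git clone” step, the sketch below wraps the command in Python. The article does not say which options the Internet Archive uses. The `--mirror` flag (a bare copy of every branch and tag) is an assumption chosen because it suits archival copies, and the function names are hypothetical.

```python
import subprocess
from typing import List


def mirror_clone_command(repo_url: str, dest_dir: str) -> List[str]:
    """Build a `git clone --mirror` command (bare repository with all refs)."""
    return ["git", "clone", "--mirror", repo_url, dest_dir]


def mirror_clone(repo_url: str, dest_dir: str) -> None:
    """Run the clone; raises CalledProcessError if git exits with an error."""
    subprocess.run(mirror_clone_command(repo_url, dest_dir), check=True)
```

A clone carries commits, branches, and tags only. Issues and comments are not stored in the Git repository, which is why the article mentions them as separate metadata to be kept accessible on the web.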