A | B | C | D | E | F | G | H | |
---|---|---|---|---|---|---|---|---|
1 | Collection ID | Description | Collection Start date | Collection End date | Total files (Number of URLs collected) | Total seeds (Preserved sites) | Total volume of WARC files (TB) | Collection in Production? (URL, Page and Image indexed) |
2 | AWP1 | 1st Complete crawl of the Portuguese web, mainly from the .PT domain, in 2008. | 2008-02-12 | 2008-03-06 | 56,046,288 | 154,787 | 1.60 | |
3 | AWP2 | 2nd Complete crawl of the Portuguese web, mainly from the .PT domain, in 2008. | 2008-03-11 | 2008-05-30 | 48,718,404 | - | 1.60 | |
4 | AWP3 | 3rd Complete crawl of the Portuguese web, mainly from the .PT domain, in 2008. | 2008-10-21 | 2008-12-10 | 51,863,006 | 193,294 | 2.00 | |
5 | AWP4 | 4th Complete crawl of the Portuguese web, mainly from the .PT domain, in 2009. | 2009-05-01 | 2009-05-31 | 68,776,707 | 366,880 | 2.50 | |
6 | AWP5 | 5th Complete crawl of the Portuguese web, mainly from the .PT domain, in 2009. | 2009-10-01 | 2009-10-31 | 119,135,566 | 373,323 | 3.80 | |
7 | AWP6 | 6th Complete crawl of the Portuguese web, mainly from the .PT domain, in 2009. | 2009-12-01 | 2009-12-31 | 118,810,364 | 340,018 | 3.50 | |
8 | AWP7 | 7th Complete crawl of the Portuguese web, mainly from the .PT domain, in 2010. | 2010-05-01 | 2010-05-31 | 87,988,812 | 389,957 | 2.90 | |
9 | AWP8 | Incremental crawl of the Portuguese web, mainly from the .PT domain, in 2010. The AWP8 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP8 incremental crawl. | 2010-08-01 | 2010-08-31 | 75,771,317 | 411,562 | 1.90 | |
10 | AWP9 | Incremental crawl of the Portuguese web, mainly from the .PT domain, in 2011. The AWP9 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP9 incremental crawl. | 2011-01-20 | 2011-03-22 | 81,114,575 | 473,588 | 2.10 | |
11 | AWP10 | Incremental crawl of the Portuguese web, mainly from the .PT domain, in 2011. The AWP10 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP10 incremental crawl. | 2011-05-17 | 2011-06-17 | 76,710,879 | 704,837 | 2.10 | |
12 | AWP11 | Incremental crawl of the Portuguese web, mainly from the .PT domain, in 2011. The AWP11 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP11 incremental crawl. | 2011-06-30 | 2011-08-05 | 69,790,126 | 509,280 | 2.30 | |
13 | AWP12 | Incremental crawl of the Portuguese web, mainly from the .PT domain, from December of 2011 to February of 2012. The AWP12 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP7 as baseline. Thus, the files that remained unchanged from the AWP7 complete crawl were not archived (duplicated) on the AWP12 incremental crawl. | 2011-12-30 | 2012-02-28 | 90,122,611 | 328,846 | 2.70 | |
14 | AWP15 | Complete crawl of the Portuguese web, mainly from the .PT domain, performed between the 5th of November 2013 and the 13th of January 2014. The AWP15 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2013-11-05 | 2014-01-13 | 139,296,363 | 1,088,962 | 6.00 | |
15 | AWP16 | Incremental crawl of the Portuguese web, mainly from the .PT domain, in 2014. The AWP16 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP15 as baseline. Thus, the files that remained unchanged from the AWP15 complete crawl were not archived (duplicated) on the AWP16 incremental crawl. | 2014-09-23 | 2014-11-24 | 203,407,698 | 609,201 | 8.50 | |
16 | AWP17 | Complete crawl of the Portuguese web, mainly from the .PT domain, in 2015. The AWP17 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2015-04-10 | 2015-06-09 | 243,803,163 | 818,360 | 9.56 | |
17 | AWP18 | Incremental crawl of the Portuguese web, mainly from the .PT domain, in 2015. The AWP18 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP17 as baseline. Thus, the files that remained unchanged from the AWP17 complete crawl were not archived (duplicated) on the AWP18 incremental crawl. | 2015-05-13 | 2015-11-05 | 214,527,044 | 518,848 | 7.82 | |
18 | AWP19 | Incremental crawl of the Portuguese web, mainly from the .PT domain, from November of 2015 to May of 2016. The AWP19 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP18 as baseline. Thus, the files that remained unchanged from the AWP18 complete crawl were not archived (duplicated) on the AWP19 incremental crawl. | 2015-11-12 | 2016-01-05 | 199,209,953 | 658,777 | 7.10 | |
19 | AWP20 | Complete crawl of the Portuguese web, mainly from the .PT domain, in 2016. The AWP20 crawl did NOT use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2016-02-05 | 2016-05-03 | 238,822,615 | 686,668 | 12.00 | |
20 | AWP21 | Incremental crawl of the Portuguese web, mainly from the .PT domain, in 2016. The AWP21 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP17 as baseline. Thus, the files that remained unchanged from the AWP17 complete crawl were not archived (duplicated) on the AWP21 incremental crawl. | 2016-05-30 | 2016-08-03 | 193,212,877 | 660,385 | 7.20 | |
21 | AWP22 | Incremental crawl of the Portuguese web, mainly from the .PT domain, from October of 2016 to January 2017. The AWP22 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/) taking the content of AWP21 as baseline. Thus, the files that remained unchanged from the AWP21 complete crawl were not archived (duplicated) on the AWP22 incremental crawl. | 2016-10-31 | 2017-01-04 | 162,188,798 | 767,310 | 6.50 | |
22 | AWP23 | Complete crawl of the Portuguese web, mainly from the .PT domain, in 2017. The AWP23 crawl did not use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2017-01-01 | 2017-05-07 | 225,221,781 | 925,138 | 13.00 | |
23 | AWP24 | Complete crawl of the Portuguese web, mainly from the .PT domain, in 2017. The AWP24 crawl did not use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2017-07-31 | 2017-09-11 | 165,161,477 | 905,973 | 7.80 | |
24 | AWP25 | Complete crawl of the Portuguese web, mainly from the .PT domain, performed between December of 2017 and January of 2018. The AWP25 crawl did not use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2017-12-07 | 2018-01-18 | 152,505,397 | 705,296 | 6.70 | |
25 | AWP26 | Complete crawl of the Portuguese web, mainly from the .PT domain, in 2018. The AWP26 crawl did not use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2018-04-09 | 2018-07-02 | 233,145,629 | 711,105 | 14.00 | |
26 | AWP27 | Complete crawl of the Portuguese web, mainly from the .PT domain, in 2018. The AWP27 crawl did not use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2018-07-14 | 2018-07-28 | 111,848,303 | 640,912 | 13.00 | |
27 | AWP28 | Complete crawl of the Portuguese web, mainly from the .PT domain, in 2018. The AWP28 crawl did not use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2018-10-10 | 2018-11-02 | 363,393,207 | 710,924 | 21.00 | |
28 | AWP29 | Complete crawl of the Portuguese web, mainly from the .PT domain, in 2019. The AWP29 crawl did not use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2019-04-01 | 2019-04-26 | 386,225,779 | 835,462 | 22.90 | |
29 | AWP30 | Complete crawl of the Portuguese web, mainly from the .PT domain, in 2019. The AWP30 crawl did not use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2019-08-29 | 2019-11-03 | 632,266,742 | 1,038,954 | 38.00 | |
30 | AWP31 | Incremental crawl of the Portuguese web, mainly from the .PT domain, from December of 2019 to January 2020. The AWP31 used DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2019-12-06 | 2020-01-31 | 366,370,692 | 1,587,726 | 24.00 | |
31 | AWP32 | Complete crawl of the Portuguese web, mainly from the .PT domain, in 2020. The AWP32 crawl did not use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2020-03-03 | 2020-05-04 | 836,246,709 | 1,409,215 | 29.00 | |
32 | AWP33 | Complete crawl of the Portuguese web, mainly from the .PT domain, in 2020. The AWP33 crawl did not use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2020-06-01 | 2020-07-15 | 373,880,292 | 366,939 | 27.00 | |
33 | AWP34 | Incremental crawl of the Portuguese web, mainly from the .PT domain, in 2020. The AWP34 used DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2020-09-01 | 2020-10-08 | 115,489,181 | 302,504 | 8.40 | |
34 | AWP35 | Incremental crawl of the Portuguese web, mainly from the .PT domain, performed between December 2020 and January 2021. The AWP35 used DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2020-12-02 | 2021-01-07 | 111,563,667 | 294,671 | 7.70 | |
35 | AWP36 | Complete crawl of the Portuguese web, mainly from the .PT domain, in 2021. The AWP36 crawl did not use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2021-03-01 | 2021-04-07 | 634,166,744 | 264,098 | 17.00 | |
36 | AWP37 | Incremental crawl of the Portuguese web, mainly from the .PT domain, performed between June 2021 and July 2021. The AWP37 used DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2021-06-02 | 2021-07-16 | 568,450,007 | 273,468 | 9.80 | |
37 | AWP38 | Incremental crawl of the Portuguese web, mainly from the .PT domain, performed between October 2021 and November 2021. The AWP38 used DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2021-10-08 | 2021-11-26 | 560,843,698 | 291,518 | 12.00 | |
38 | AWP39 | Incremental crawl of the Portuguese web, mainly from the .PT domain, performed between January 2022 and February 2022. The AWP39 crawl did not use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2022-01-19 | 2022-02-23 | 884,737,568 | 727,039 | 27.00 | |
39 | AWP40 | Incremental crawl of the Portuguese web, mainly from the .PT domain, performed between April 2022 and June 2022. The AWP40 crawl did not use DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2022-04-11 | 2022-06-06 | 1,886,902,750 | 926,745 | 45.00 | |
40 | AWP41 | Incremental crawl of the Portuguese web, mainly from the .PT domain, performed from July 2022. The AWP41 crawl used DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/). | 2022-07-27 | | | 891,452 | | |
41 | BlogsSapo2018 | Special collection of blogs from the Portuguese website blogs.sapo.pt, in 2018. | 2018-08-02 | 2018-08-31 | 2,414,012 | | 0.20 | |
42 | Roteiro | Collection donated to the Arquivo.pt. Pages collected in 1996, from José Magalhães's book "Novo Roteiro Prático da Internet". | 1996-01-01 | 1996-12-31 | 75,174 | | 0.00 | |
43 | IA | Collection acquired from the Internet Archive. Pages of the Portuguese web collected by the Internet Archive between 1996 and 2007. | 1996-01-01 | 2007-12-31 | 123,889,349 | | 2.00 | |
44 | BN | Collection donated to the Arquivo.pt. Pages collected by the Biblioteca Nacional de Portugal and the Instituto de Engenharia de Sistemas e Computadores (INESC) as part of the "Recolha" ("collection") project. This partnership collected Web pages between 2004 and 2005 about the Portuguese legislative elections ("Legislativas") held in February 2005. | 2004-01-01 | 2005-12-31 | 14,373,817 | | 0.17 | |
45 | Tomba | Collection integrated from the Tomba project that includes Web pages collected between 2005 and 2006. The Tomba project was the Portuguese web archive prototype, following the Tumba! project developed by the research group XLDB of the University of Lisbon and supported by FCCN. | 2005-01-01 | 2006-12-31 | 37,000,000 | | 1.30 | |
46 | Dinis | Collection donated to the Arquivo.pt. Web pages collected between 1997 and 2007, courtesy of Dinis Manuel Alves. | 2000-01-01 | 2007-12-31 | 4,000 | | 0.00 | |
47 | Weblog | Special collection of blogs from the hosting platform weblog.com.pt before being closed in 2012. | 2012-01-01 | 2012-12-31 | 563,350 | 7,012 | 0.03 | |
48 | UL | Special collection on the University of Lisbon domain (ul.pt), performed several times by the Arquivo.pt team as tests at the beginning of the service, in 2008. It brings together 6 small collections. | 2008-02-18 | 2008-03-03 | 411,171 | | 0.03 | |
49 | BlocoEsquerda | Special collection of the first website of the political party Bloco de Esquerda, performed as a test for special collections, in 2012. | 2012-10-01 | 2012-10-31 | 36 | 1 | 0.00 | |
50 | DEM-IST | Collection donated to the Arquivo.pt with the website content of the Department of Mechanical Engineering of the Instituto Superior Técnico, Lisboa. Files are dated from 1998 to 2006. | 1998-01-01 | 2006-12-31 | 3,536 | 4 | 0.00 | |
51 | DinisAlves2018 | Collection donated to the Arquivo.pt with contents from two websites: www.portosdeportugal.pt, website of the Associação de Portos de Portugal, dated from August to December 2012, and portofigueiradafoz.pt, website of the Porto da Figueira da Foz, dated from February to December 2013. This collection is courtesy of Dinis Alves in 2018. | 2012-08-01 | 2013-12-18 | 11,349 | 2 | 0.00 | |
52 | NON | Collection donated to the Arquivo.pt, first as local files of the NON magazine website, one of the first Portuguese online magazines, then converted into WARC files and integrated into the Arquivo.pt. Files are dated from 1996 to 1999 as zonanon.com, and from 1999 to 2002 as zonanon.org. Courtesy of Rui Bebiano in 2020. | 1996-01-01 | 2002-12-31 | 8,303 | 2 | 0.00 | |
53 | FAWP1 | 1st block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from March to July 2010. | 2010-03-23 | 2010-07-06 | 57,352,532 | 332 | 2.00 | |
54 | FAWP2 | 2nd block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from July to September 2010. With Deduplicator from the second day. | 2010-07-07 | 2010-09-21 | 33,957,637 | 359 | 0.80 | |
55 | FAWP3 | 3rd block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from September to December 2010. With Deduplicator. | 2010-09-22 | 2010-12-31 | 45,623,908 | 360 | 0.87 | |
56 | FAWP4 | 4th block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from January to March 2011. With Deduplicator. | 2011-01-01 | 2011-03-31 | 42,094,295 | 360 | 1.01 | |
57 | FAWP5 | 5th block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from April to June 2011. With Deduplicator. | 2011-04-01 | 2011-06-30 | 41,941,367 | 360 | 1.30 | |
58 | FAWP6 | 6th block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from July to September 2011. With Deduplicator. | 2011-07-01 | 2011-09-30 | 42,436,564 | 360 | 1.80 | |
59 | FAWP7 | 7th block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from October to December 2011. With Deduplicator. | 2011-10-01 | 2011-12-31 | 43,833,826 | | 1.90 | |
60 | FAWP8 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from January to March 2012. With Deduplicator. | 2012-01-01 | 2012-03-31 | 45,522,178 | | 2.10 | |
61 | FAWP10 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from July to September 2012. With Deduplicator. | 2012-07-01 | 2012-09-30 | 25,254,390 | | 0.90 | |
62 | FAWP11 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from January to December 2012. With Deduplicator. | 2012-01-01 | 2012-12-31 | 8,560,866 | | 0.30 | |
63 | FAWP12 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from January to September 2013. With Deduplicator. | 2013-01-01 | 2013-09-30 | 10,423,663 | | 0.48 | |
64 | FAWP14 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from July to September 2013. With Deduplicator. | 2013-07-01 | 2013-09-30 | 27,686,676 | | 1.20 | |
65 | FAWP15 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from October to December 2013. With Deduplicator. | 2013-10-01 | 2013-12-31 | 16,461,666 | | 0.77 | |
66 | FAWP17 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from April to June 2014. With Deduplicator. | 2014-04-01 | 2014-06-30 | 18,800,556 | | 1.00 | |
67 | FAWP18 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from July to September 2014. With Deduplicator. | 2014-07-01 | 2014-09-30 | 29,436,673 | | 1.60 | |
68 | FAWP19 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from October to December 2014. With Deduplicator. | 2014-10-01 | 2014-12-31 | 39,843,502 | | 2.00 | |
69 | FAWP20 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from January to March 2015. With Deduplicator. | 2015-01-01 | 2015-03-31 | 38,936,485 | 324 | 1.80 | |
70 | FAWP21 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from April to June 2015. With Deduplicator. | 2015-04-01 | 2015-06-30 | 38,636,837 | 327 | 1.80 | |
71 | FAWP22 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from July to September 2015. With Deduplicator. | 2015-07-01 | 2015-09-30 | 44,702,505 | 252 | 2.10 | |
72 | FAWP23 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from October to December 2015. With Deduplicator. | 2015-10-01 | 2015-12-31 | 57,405,014 | 257 | 3.70 | |
73 | FAWP24 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from January to March 2016. With Deduplicator. | 2016-01-01 | 2016-03-31 | 60,725,384 | 291 | 4.50 | |
74 | FAWP25 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from April to June 2016. With Deduplicator. | 2016-04-01 | 2016-06-30 | 63,894,659 | 296 | 4.90 | |
75 | FAWP26 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from July to September 2016. With Deduplicator. | 2016-07-01 | 2016-09-30 | 63,780,872 | 299 | 6.10 | |
76 | FAWP27 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from October to December 2016. With Deduplicator. | 2016-10-01 | 2016-12-31 | 64,083,906 | 308 | 6.90 | |
77 | FAWP28 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from January to March 2017. With Deduplicator. | 2017-01-01 | 2017-03-31 | 62,797,293 | 323 | 7.60 | |
78 | FAWP29 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from April to June 2017. With Deduplicator. | 2017-04-01 | 2017-06-30 | 73,100,203 | 320 | 8.80 | |
79 | FAWP30 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from July to September 2017. With Deduplicator. | 2017-07-01 | 2017-09-30 | 75,259,797 | 326 | 9.40 | |
80 | FAWP31 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from October to December 2017. With Deduplicator. | 2017-10-01 | 2017-12-31 | 73,242,650 | 326 | 8.70 | |
81 | FAWP32 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from January to March 2018. With Deduplicator. | 2018-01-01 | 2018-03-31 | 80,766,956 | 325 | 9.20 | |
82 | FAWP33 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from April to June 2018. With Deduplicator. | 2018-04-01 | 2018-06-30 | 125,860,345 | 325 | 11.00 | |
83 | FAWP34 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from July to September 2018. With Deduplicator. | 2018-07-01 | 2018-09-30 | 126,921,267 | 325 | 13.00 | |
84 | FAWP35 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from October to December 2018. With Deduplicator. | 2018-10-01 | 2018-12-31 | 124,775,974 | 361 | 14.00 | |
85 | FAWP36 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from January to March 2019. With Deduplicator. (3 Heritrix ARCs) | 2019-01-01 | 2019-03-31 | 137,178,831 | 361 | 14.00 | |
86 | FAWP37 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from April to June 2019. With Deduplicator. (3 Heritrix ARCs) | 2019-04-01 | 2019-06-30 | 132,608,317 | 361 | 14.00 | |
87 | FAWP38 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from July to September 2019. With Deduplicator. (3 Heritrix ARCs) | 2019-07-01 | 2019-09-30 | 140,120,566 | 361 | 15.00 | |
88 | FAWP39 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from October to December 2019. With Deduplicator. (3 Heritrix ARCs & WARCs) | 2019-10-01 | 2019-12-31 | 154,710,796 | 361 | 19.00 | |
89 | FAWP40 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from January to March 2020. With Deduplicator. (3 Heritrix WARC) | 2020-01-01 | 2020-03-31 | 142,757,027 | 361 | 21.00 | |
90 | FAWP41 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from April to June 2020. With Deduplicator. (3 Heritrix WARC) | 2020-04-01 | 2020-06-30 | 178,792,290 | 256 | 21.00 | |
91 | FAWP42 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from July to September 2020. With Deduplicator. (3 Heritrix WARC) | 2020-07-01 | 2020-09-30 | 119,671,522 | 256 | 17.00 | |
92 | FAWP43 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from October to December 2020. Possibly without Deduplicator. (3 Heritrix WARC) | 2020-10-01 | 2020-12-31 | 282,254,451 | 156 | 32.00 | |
93 | FAWP44 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from January to March 2021. With Deduplicator. (3 Heritrix WARC) | 2021-01-01 | 2021-03-31 | 85,092,238 | 154 | 15.00 | |
94 | FAWP45 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from April to June 2021. With Deduplicator. (3 Heritrix WARC) | 2021-04-01 | 2021-06-30 | 81,529,382 | 154 | 13.00 | |
95 | FAWP46 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from July to September 2021. With Deduplicator. (3 Heritrix WARC) | 2021-07-01 | 2021-09-30 | 85,239,435 | 140 | 12.00 | |
96 | FAWP47 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from October to December 2021. With Deduplicator. (3 Heritrix WARC) | 2021-10-01 | 2021-12-31 | 111,910,859 | 140 | 19.00 | |
97 | FAWP48 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from January to March 2022. With Deduplicator. (3 Heritrix WARC) | 2022-01-01 | 2022-03-31 | 115,394,656 | 140 | 18.00 | |
98 | FAWP49 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from April to June 2022. With Deduplicator. (3 Heritrix WARC) | 2022-04-01 | 2022-06-30 | 134,942,148 | 140 | 21.00 | |
99 | FAWP50 | Block of frequent collections of Portuguese web, mainly websites of news media, websites with frequent renewal of contents, government and public bodies, from July to September 2022. With Deduplicator. (3 Heritrix WARC) | 2022-07-01 | 2022-09-30 | | 140 | | |
100 | EAWP1 | Special collection about content related to web preservation. This collection contains web pages crawled between May 2011 and February 2012. | 2011-05-20 | 2012-02-05 | 3,087,727 | | 0.05 | |