BS ISO 28500:2017

Current

The latest, up-to-date edition.

Information and documentation. WARC file format

Available format(s)

Hardcopy, PDF

Language(s)

English

Published date

11-09-2017

£272.00
Excluding VAT

Foreword
Introduction
1 Scope
2 Normative references
3 Terms, definitions and abbreviated terms
4 File and record model
5 Named fields
6 WARC record types
7 Record segmentation
8 WARC file name, size and compression
Annex A (informative) - Use cases for writing WARC records
Annex B (informative) - Examples of WARC records
Annex C (informative) - WARC file size and name recommendations
Annex D (informative) - Compression recommendations
Bibliography

Committee
IDT/2
Development Note
Supersedes 08/30167515 DC. (08/2009) Supersedes 16/30345920 DC. (09/2017)
Document Type
Standard
Pages
36
Publisher Name
British Standards Institution
Status
Current
This document specifies the WARC file format:

  • to store both the payload content and control information from mainstream Internet application layer protocols, such as HTTP, DNS, and FTP;

  • to store arbitrary metadata linked to other stored data (e.g. subject classifier, discovered language, encoding);

  • to support data compression and maintain data record integrity;

  • to store all control information from the harvesting protocol (e.g. request headers), not just response information;

  • to store the results of data transformations linked to other stored data;

  • to store a duplicate detection event linked to other stored data (to reduce storage in the presence of identical or substantially similar resources);

  • to be extended without disruption to existing functionality;

  • to support handling of overly long records by truncation or segmentation, where desired.
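
As a rough illustration of the record model and compression goals listed above, the following Python sketch serializes a single WARC record and gzips it. It is not taken from this listing: the field names and layout follow the publicly documented WARC 1.1 format as understood here, and the helper names build_warc_record and compress_record, the example URI, and the payload are hypothetical. Consult the standard itself for the normative rules.

```python
import gzip
import uuid
from datetime import datetime, timezone


def build_warc_record(record_type: str, target_uri: str, payload: bytes,
                      content_type: str = "application/octet-stream") -> bytes:
    """Serialize one record: version line, named fields, blank line,
    content block, then two CRLFs (layout assumed from WARC 1.1)."""
    named_fields = [
        ("WARC-Type", record_type),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", content_type),
        # Content-Length counts the bytes of the content block only.
        ("Content-Length", str(len(payload))),
    ]
    header = "WARC/1.1\r\n"
    header += "".join(f"{name}: {value}\r\n" for name, value in named_fields)
    header += "\r\n"  # blank line separates the named fields from the block
    return header.encode("utf-8") + payload + b"\r\n\r\n"


def compress_record(record: bytes) -> bytes:
    """Compress one record as its own gzip member (an assumed convention
    that lets a reader seek to and decompress a single record)."""
    return gzip.compress(record)


# Hypothetical example: archive a small text resource.
record = build_warc_record("resource", "http://example.com/hello.txt",
                           b"hello", content_type="text/plain")
compressed = compress_record(record)
```

Because each record is compressed separately, several compressed records can be concatenated into one .warc.gz file and still be read back one at a time.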

Standards Relationship
ISO 28500:2017 Identical

ISO 8601:2004 Data elements and interchange formats — Information interchange — Representation of dates and times
