Hi Francesco,
As you correctly mention, XML Schema validation cannot be used for XMP because of ambiguities in the serialization. This is why veraPDF uses a fork of the Adobe XMP library for low-level XMP parsing: https://github.com/veraPDF/veraPDF-library/tree/integration/xmp-core
All further XMP validation rules are based on the PDF/A requirements and on the predefined schemas defined in the Adobe XMP 2004 specification for PDF/A-1, the Adobe XMP 2005 specification for PDF/A-2 and PDF/A-3, and ISO 16684-1 for PDF/A-4.
The detailed list of these rules can be found in https://github.com/veraPDF/veraPDF-validation-profiles/wiki under the numbers that match the Metadata sections in the corresponding PDF/A specifications:
6.7.* for PDF/A-1 and PDF/A-4
6.6.* for PDF/A-2 and PDF/A-3
Best regards,
Boris
-----Original Message-----
From: Users users-bounces@lists.verapdf.org On Behalf Of Francesco Pretto
Sent: Thursday, May 26, 2022 1:47 PM
To: users@lists.verapdf.org
Subject: [veraPDF-users] How veraPDF performs XMP packets validation?
Hello veraPDF users/devs,
I have a question about XMP packet validation in PDF documents: how does veraPDF actually do it? Can you point me to the places in the code where this is performed?
Let me elaborate on the question and describe a viable solution suggested by the ISO standards committee. I have access to both ISO 16684-1:2019 [1] and ISO 16684-2:2014 [2]. The first standard describes the XMP packet data model and properties in general terms and explains the many alternative notations it accepts, which make the packets difficult to validate. The second suggests a method for normalizing the packets into a unique representation of the information they store (this method can mostly be implemented without knowing anything about the actual XMP data model or properties). It then supplies some sample RELAX NG schemas that validate unspecified XMP demo packets. These schemas clearly do not describe the full schema needed to validate the XMP packets of PDF documents, since all PDF-specific properties are missing, as are some generic ones.
If the full schema for validating XMP packets in PDF documents were publicly available, validating those packets would be quite simple, as the normalization algorithm is not very difficult to implement. I asked for it in [3], but I doubt Adobe will release it. It may also not exist at all, since the ISO 16684-2:2014 recommendations may have been developed independently of any actual Adobe product.
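As an illustration of the alternative notations mentioned above, here is a minimal Python sketch. It is a simplification for discussion only: it is not the ISO 16684-2 normalization algorithm and not veraPDF code, it handles only simple text-valued top-level properties, and the sample packets, property values and the helper name `simple_properties` are made up for the example. It shows two serializations that differ as XML trees but carry the same properties once reduced to (namespace, name) pairs.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical sample: properties written as child elements.
ELEMENT_FORM = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:xmp="http://ns.adobe.com/xap/1.0/"
         xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
  <rdf:Description rdf:about="">
    <xmp:CreatorTool>ExampleTool</xmp:CreatorTool>
    <pdf:Producer>ExampleProducer</pdf:Producer>
  </rdf:Description>
</rdf:RDF>
"""

# Hypothetical sample: the same properties written as attributes,
# split over two rdf:Description elements.
ATTRIBUTE_FORM = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:xmp="http://ns.adobe.com/xap/1.0/"
         xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
  <rdf:Description rdf:about="" xmp:CreatorTool="ExampleTool"/>
  <rdf:Description rdf:about="" pdf:Producer="ExampleProducer"/>
</rdf:RDF>
"""


def split_qname(qname):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if qname.startswith("{"):
        namespace, _, local = qname[1:].partition("}")
        return namespace, local
    return "", qname


def simple_properties(rdf_xml):
    """Return {(namespace, local_name): value} for simple top-level properties.

    Arrays, structures, qualifiers and language alternatives are ignored;
    a real normalizer would have to handle all of them.
    """
    props = {}
    root = ET.fromstring(rdf_xml.strip())
    for desc in root.findall(f"{{{RDF}}}Description"):
        # Attribute shorthand: namespaced, non-RDF attributes are properties.
        for qname, value in desc.attrib.items():
            namespace, local = split_qname(qname)
            if namespace and namespace != RDF:
                props[(namespace, local)] = value
        # Element form: leaf child elements with text content are properties.
        for child in desc:
            if len(child) == 0 and child.text is not None:
                props[split_qname(child.tag)] = child.text.strip()
    return props


same = simple_properties(ELEMENT_FORM) == simple_properties(ATTRIBUTE_FORM)
print(same)
```

A grammar written against the first form would reject the second, although both reduce to the same property set here. That is the motivation for normalizing before validating.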
Thank you in advance for any insight.
Regards,
Francesco
[1] https://www.iso.org/standard/75163.html
[2] https://www.iso.org/standard/57422.html
[3] https://github.com/adobe/xmp-docs/issues/20
_______________________________________________
Users mailing list
Users@lists.verapdf.org
http://lists.verapdf.org/listinfo/users