Hello veraPDF users/devs,
I have a question about the validation of XMP packets in PDF documents: how does veraPDF actually do it? Can you point me to the places in the code where this is performed?
Let me elaborate on the question and describe a viable solution based on what the ISO standards suggest. I have access to both ISO 16684-1:2019 [1] and ISO 16684-2:2014 [2]. The first standard describes the XMP data model and properties in general terms and explains the many alternative notations that are accepted, which make the packets difficult to validate. The second suggests a method for normalizing the packets into a unique representation of the information they store (this method can mostly be implemented without knowing anything about the actual XMP data model or properties). It then supplies some sample RELAX NG schemas for validating demo XMP packets. These schemas clearly do not amount to the full schema needed to validate the XMP packets of PDF documents, since all the PDF-specific properties are missing, as are some of the generic ones. If the full schema for validating XMP packets in PDF documents were publicly available, validating those packets would be quite simple, as the normalization algorithm is not very difficult to implement. I asked for it in [3], but I doubt Adobe will release it. It may also not exist at all, since the ISO 16684-2:2014 recommendations may have been developed independently of their use in any actual Adobe product.
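To make the normalization idea concrete, here is a minimal Python sketch of one step of the kind I have in mind: rewriting simple properties written as attributes on rdf:Description into the equivalent child-element form, so that two notations of the same packet become comparable. This is only an illustration and not the ISO 16684-2 algorithm itself; the function names expand_attribute_properties and simple_properties are my own, the sample packets are made up, and a real implementation would have to handle the other alternative notations as well.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"
DESCRIPTION = f"{{{RDF_NS}}}Description"


def expand_attribute_properties(rdf_root):
    """Turn property attributes on rdf:Description into child elements.

    Hypothetical helper: covers only the attribute shorthand for simple
    properties, one of several equivalent notations.
    """
    # Collect the elements first so the tree is not modified while iterating.
    for desc in list(rdf_root.iter(DESCRIPTION)):
        for name in list(desc.attrib):
            # Attributes without a namespace cannot be XMP properties.
            if not name.startswith("{"):
                continue
            # rdf:about and friends, and xml:lang, stay as attributes.
            if name.startswith(f"{{{RDF_NS}}}") or name.startswith(f"{{{XML_NS}}}"):
                continue
            child = ET.SubElement(desc, name)
            child.text = desc.attrib.pop(name)
    return rdf_root


def simple_properties(rdf_root):
    """Return a sorted list of (qualified name, text) for element-form properties."""
    found = []
    for desc in rdf_root.iter(DESCRIPTION):
        for child in desc:
            found.append((child.tag, child.text))
    return sorted(found)


# Two made-up packets carrying the same information in different notations.
COMPACT = (
    f'<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:pdf="http://ns.adobe.com/pdf/1.3/">'
    '<rdf:Description rdf:about="" pdf:Producer="Example"/>'
    "</rdf:RDF>"
)
EXPANDED = (
    f'<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:pdf="http://ns.adobe.com/pdf/1.3/">'
    '<rdf:Description rdf:about=""><pdf:Producer>Example</pdf:Producer></rdf:Description>'
    "</rdf:RDF>"
)

if __name__ == "__main__":
    a = expand_attribute_properties(ET.fromstring(COMPACT))
    b = expand_attribute_properties(ET.fromstring(EXPANDED))
    # After expansion both notations yield the same property list.
    print(simple_properties(a) == simple_properties(b))
```

Once every packet is reduced to a single form like this, a RELAX NG schema only has to describe that one form, which is why the missing full schema is the real obstacle rather than the normalization.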
Thank you in advance for any insight.
Regards, Francesco
[1] https://www.iso.org/standard/75163.html
[2] https://www.iso.org/standard/57422.html
[3] https://github.com/adobe/xmp-docs/issues/20