PDF things

2022-07-05 03:00:00 unreality

Over the past few months I have done a lot of PDF-related work. I am using these two days to write down what I learned.

PDF is short for Portable Document Format. It was created by Adobe in 1992. Its defining feature is independence from the operating system platform: a document renders the same way on any platform.

As for this platform independence, it is more accurate to say that the format is simple, every platform follows the same standard, and the standard is public, so platform independence follows naturally.

History

After its creation, PDF remained a proprietary Adobe format until 2008, when it became an official ISO standard (ISO 32000).

Although PDF is now a standard, Adobe, as its originator, still maintains some private features such as XFA (XML Forms Architecture, Adobe's own PDF forms). XFA is not part of the ISO 32000 PDF standard.

China's domestic format - OFD

Similar to PDF, there is also the OFD (Open Fixed-layout Documents) format, which counts as the "domestic PDF" standard. It was officially released in 2016 by the Standardization Administration of China.

Compared with PDF, OFD is simpler, easier to implement, and supports the Chinese national cryptographic ("guomi", SM) algorithms. However, it is not widely used yet.

Fonts

In common Office formats, fonts are not embedded by default. Not embedding avoids storing the same font over and over; the rendering device only needs to have the font installed. The drawback is obvious: if the font is missing on the reader's device, rendering fails, and falling back to a substitute font changes how the document looks.

PDF handles fonts differently from Office. By default PDF embeds fonts, and it embeds only a subset: just the glyphs for the characters actually used in the file, not the whole font. As a result, even with embedded fonts the file size does not grow much.
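The size argument can be illustrated with a minimal Python sketch. It is only a toy model: `FULL_FONT`, `GLYPH_BYTES`, and the helper functions are made-up names and numbers for illustration, not a real font format or any library's API. A real PDF library subsets the actual TrueType/OpenType tables.

```python
# Toy model of font subsetting: keep only the glyphs a document uses.
# FULL_FONT and GLYPH_BYTES are hypothetical illustration data.

GLYPH_BYTES = 200  # assumed average size of one glyph outline, in bytes

# Pretend font: 20,000 CJK code points plus printable ASCII.
FULL_FONT = {chr(cp): GLYPH_BYTES for cp in range(0x4E00, 0x4E00 + 20000)}
FULL_FONT.update({chr(cp): GLYPH_BYTES for cp in range(0x20, 0x7F)})


def subset_font(font, text):
    """Return a new glyph table holding only the characters used in text."""
    used = set(text)
    return {ch: size for ch, size in font.items() if ch in used}


def font_size(font):
    """Total size of a glyph table in bytes."""
    return sum(font.values())


document_text = "PDF 便携式文档格式 PDF"
subset = subset_font(FULL_FONT, document_text)

print("full font:", len(FULL_FONT), "glyphs,", font_size(FULL_FONT), "bytes")
print("subset:   ", len(subset), "glyphs,", font_size(subset), "bytes")
```

Repeated characters cost nothing extra, because each distinct character needs its glyph only once.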

Font encryption

Some documents are published as PDF for reference, but for data security reasons the publisher does not want readers to copy the text freely.

This is where embedded fonts pay off. With font obfuscation and encryption techniques, the PDF uses an obfuscated, encrypted font. Even when the PDF file is made public, any text a reader copies out of it is scrambled, which protects the data.
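The idea can be modeled in a few lines of Python. This is a simplified sketch under stated assumptions, not how any particular tool does it: the function name, the use of Private Use Area code points, and the sample text are all hypothetical. The page stores scrambled character codes, and only the embedded font knows which glyph each code draws.

```python
# Toy model of font obfuscation. All names here are hypothetical.
import random


def build_obfuscated_font(text, seed=0):
    """Assign each distinct character a shuffled Private Use Area code.

    Returns (encode, glyphs):
      encode maps a real character to the code stored in the page,
      glyphs maps a stored code to the glyph the embedded font draws.
    """
    chars = sorted(set(text))
    codes = [chr(0xE000 + i) for i in range(len(chars))]
    random.Random(seed).shuffle(codes)
    encode = dict(zip(chars, codes))
    glyphs = {code: ch for ch, code in encode.items()}
    return encode, glyphs


text = "Total: 4200 USD"
encode, glyphs = build_obfuscated_font(text)

stored = "".join(encode[ch] for ch in text)           # what copy and paste yields
rendered = "".join(glyphs[code] for code in stored)   # what the reader sees

print(repr(stored))
print(rendered)
```

The rendered page looks normal, while the copied string consists only of private-use code points that mean nothing outside this font.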

However, OCR is now so capable that text rendered with an obfuscated font can still be recovered by OCR. It just takes more effort.

Electronic signature & digital signature

Electronic signature - Electronic Signature

The US Electronic Signatures in Global and National Commerce Act (2000) defines an "electronic signature" as an electronic sound, symbol, or process that is attached to or logically associated with a contract or other record and is executed or adopted by a person with the intent to sign the record.

In practice, an electronic signature is often just an image of a handwritten signature attached to an electronic document, combined with some multi-factor identity checks (PIN, password, email) to complete the signing.

Digital signature - Digital Signature

A digital signature is different from an electronic signature. It relies on a digital certificate issued by a PKI certificate authority (CA). The basic procedure is:

  1. Compute a digest of the content with a hash algorithm (such as MD5 or SHA)
  2. Encrypt the digest with an asymmetric algorithm and the certificate's private key
  3. Attach the encrypted digest and the signing certificate (the public-key part) to the PDF file
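The three steps can be sketched in Python as follows. This is a toy under loud assumptions: it uses textbook RSA without padding and a small fixed key built from two known Mersenne primes, so it is NOT secure; the function names and the dictionary standing in for "the PDF file" are hypothetical. Real PDF signing uses a crypto library, a padded signature scheme, and a CA-issued certificate.

```python
# Hash-then-sign with textbook RSA. Toy key, no padding, NOT secure.
import hashlib

# Assumed toy key pair built from two known Mersenne primes.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest_int(content: bytes) -> int:
    """Step 1: hash the content; reduced mod N only to fit the toy key."""
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % N


def sign(content: bytes) -> int:
    """Step 2: 'encrypt' the digest with the private key."""
    return pow(digest_int(content), D, N)


def attach(content: bytes, signature: int) -> dict:
    """Step 3: bundle content, signature, and the public key.

    The public key stands in for the signing certificate.
    """
    return {"content": content, "signature": signature, "public_key": (N, E)}


body = b"%PDF-1.7 invoice body"
signed = attach(body, sign(body))
print(hex(signed["signature"]))
```

Anyone holding `signed` can raise the signature to the public exponent and compare the result with a freshly computed digest, which is the verification described later.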

As the steps above show, a PDF digital signature is not the same thing as SSL encryption. A PDF digital signature is essentially a "signature" on the document: it establishes the identity of the signer and guarantees that the document has not been tampered with. SSL, by contrast, encrypts the messages.

The following figure shows the difference between encryption and digital signature with an asymmetric algorithm:
[Figure: encryption versus digital signature]
To sum up, asymmetric algorithms have two main uses: public key encryption -> private key decryption, and private key signing -> public key verification.
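Both directions can be shown side by side in a short Python sketch. As an assumption for illustration, it uses textbook RSA with a small fixed key built from two known Mersenne primes and no padding, so it is NOT secure and only demonstrates which key is used where.

```python
# The two directions of an asymmetric key pair, with textbook RSA.
# Toy key, no padding, NOT secure.
P, Q = 2**127 - 1, 2**89 - 1
N, E = P * Q, 65537                     # public key
D = pow(E, -1, (P - 1) * (Q - 1))       # private key (needs Python 3.8+)

message = int.from_bytes(b"hello PDF", "big")   # must be smaller than N

# Encryption: anyone encrypts with the public key, only the owner decrypts.
ciphertext = pow(message, E, N)
decrypted = pow(ciphertext, D, N)

# Signature: only the owner signs with the private key, anyone verifies.
signature = pow(message, D, N)
recovered = pow(signature, E, N)

print(decrypted.to_bytes(9, "big"), recovered.to_bytes(9, "big"))
```

The arithmetic is the same in both cases; what differs is who is allowed to perform the first step.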

PDF digital signatures have one more special feature: the signature information can be "bound" to an image. For example, the stamp image in an electronic invoice can serve as the appearance of a digital signature.
[Figure: stamp image used as a signature appearance]

You can also sign digitally without an appearance. But remember one thing: a signature image does not necessarily carry a digital signature, and a digital signature does not necessarily have a signature image. The two are not the same thing.

PDF is not the only format that can be digitally signed. Microsoft's Office suite also supports digital signatures, but hardly anyone signs Office files, so what you see in practice are digitally signed PDFs.

Signature verification

The principle of PDF signature verification is also simple:

  1. Verify whether the PDF's signing certificate is trusted

    1. Use the client's root certificate store to verify that the signing certificate is trusted (Adobe's PDF reader, for example, uses its own built-in root certificate list, independent of the operating system)
  2. Use the certificate's public key to verify the signed digest.
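The two checks above can be sketched in Python. Several simplifications are assumed and should not be mistaken for a real implementation: textbook RSA with a toy key (no padding, NOT secure), a public key standing in for the certificate, and a fingerprint lookup standing in for validating a certificate chain up to a trusted root. All function and variable names are hypothetical.

```python
# Toy verification of a signed document: trust check, then digest check.
import hashlib

# Signer's toy key pair, built from two known Mersenne primes.
P, Q = 2**127 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent (needs Python 3.8+)


def digest_int(content: bytes, modulus: int) -> int:
    """SHA-256 digest as an integer, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % modulus


def fingerprint(public_key) -> str:
    """Stand-in for a certificate fingerprint."""
    n, e = public_key
    return hashlib.sha256(f"{n}:{e}".encode()).hexdigest()


def verify(content, signature, public_key, trusted) -> bool:
    # Step 1: is the signer in the client's own trust list?
    if fingerprint(public_key) not in trusted:
        return False
    # Step 2: recover the signed digest with the public key and compare.
    n, e = public_key
    return pow(signature, e, n) == digest_int(content, n)


content = b"%PDF-1.7 invoice: total 4200"
signature = pow(digest_int(content, N), D, N)   # done earlier by the signer
trust_list = {fingerprint((N, E))}              # shipped with the reader

ok = verify(content, signature, (N, E), trust_list)
tampered = verify(content.replace(b"4200", b"9200"), signature, (N, E), trust_list)
untrusted = verify(content, signature, (N, E), set())
print(ok, tampered, untrusted)
```

Changing a single byte of the content changes the digest, so the comparison in step 2 fails; an empty trust list fails already at step 1 even though the signature itself is mathematically valid.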

Certificates & signature algorithms

The type of certificate used for PDF digital signatures differs from an SSL certificate. An ordinary SSL certificate only needs to verify the owner of a domain name. The certificate used for PDF digital signatures is generally called an organization certificate: it has no notion of a domain name, but the enterprise's information, such as its business license, is strictly verified.

The mainstream asymmetric algorithms for digital certificates currently include RSA and DSA (the latter specified in DSS, the Digital Signature Standard), with RSA by far the most widely used. With the trend toward localization, however, industries such as finance and insurance are gradually migrating to the Chinese national cryptographic (SM) algorithms.

The specific algorithm is not what matters, though. They are all asymmetric and all based on digital certificates; only the concrete signing, verification, encryption, and decryption algorithms differ.

Form fields - AcroForm

Form fields are PDF forms; the English term is AcroForm. Yes, you read that right: PDF has a form technology similar to HTML's. A form can contain text fields, radio buttons, check boxes, and other elements:

[Figure: PDF form with text fields, radio buttons, and check boxes]

Once the PDF form has been prepared in an editor, the PDF can be filled in with tools or programs.

PDF libraries (Java)

PDF technology is still relatively closed, and open-source libraries can be painful to use. For commercial use in an enterprise, consider buying a commercial SDK: it offers rich functionality and good documentation, and spending money buys back time.

Open source & free

  • iText - versions up to 4.x are free; from version 5 onward the open-source license is AGPL, and commercial use requires a paid license
  • OpenPDF - a fork that continues fixing the iText 2.x code line; in essence it is still iText
  • PDFBox - Apache's open-source PDF library. It is free, but far less capable than iText, so it is not recommended.

There are also some more niche PDF libraries, which are not recommended here. iText is currently the most commonly used. PDFBox is completely open source and free, but its functionality and documentation are far less rich than iText's.

Commercial

Copyright notice
This article was written by [unreality]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/02/202202140829474408.html