
Rich text editor copying pictures in word documents

2022-06-12 11:42:00 【Jioho_】


This post runs a bit long; I always feel wordy when I write. If you don't care about the process, skip straight to the "Hands-on practice" section, which contains the core method.

Copying pictures from a Word document into a rich text editor

The problem: pictures in content copied from a Word document arrive with file:/// URLs. If our page is served over http:// or https://, the browser will not let it load those pictures.

The only exception is a page that is itself opened from a local file, which basically never happens in a real project.
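To make the problem concrete, here is a small sketch that scans pasted HTML for img tags whose src uses the file:// scheme. The helper name `findLocalImageSources` and the sample path are made up for illustration; real Word output carries more attributes and a longer temp path.

```javascript
// Hypothetical helper: list the src of every <img> that points at a local file.
// A regex is enough for a quick diagnostic; it is not a full HTML parser.
function findLocalImageSources(html) {
  const sources = []
  const imgSrcPattern = /<img\b[^>]*?\bsrc\s*=\s*["']?([^"'\s>]+)/gi
  let match
  while ((match = imgSrcPattern.exec(html)) !== null) {
    if (match[1].toLowerCase().startsWith('file://')) {
      sources.push(match[1])
    }
  }
  return sources
}

// Illustrative input, shaped like what Word puts on the clipboard.
const sampleHtml =
  '<p>text</p>' +
  '<img width=100 src="file:///C:/Temp/clip_image001.png">' +
  '<img src="https://example.com/logo.png">'

console.log(findLocalImageSources(sampleHtml))
```

Only the first image is reported; the https one loads normally and needs no special handling.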

CKEditor: I wish we had met sooner

paste-from-word demo

As the demo shows, CKEditor supports this out of the box! Unfortunately, the project already carries too much historical baggage at this point (including the plug-ins I developed later), and it is built on TinyMCE.

It's not that TinyMCE is bad. It's just that, once you use it, you find out... it really is bad (long story; I'll save the complaints for a separate post about TinyMCE).

If you also need an editor and have no historical baggage, go straight to CKEditor.

Getting a preview of the picture

To get at the pictures, start with the clipboard, because that is where our pasted data ultimately comes from.

A few background points first, to make the rest easier to follow.

Why can't a website read the pictures directly? Security:

However powerful CKEditor is, a page served over http/https cannot read a file:/// resource. The reason is simple: if it could, any website would be able to read everything on our disk.

Inside a Word document

Rename a Word document (.docx) so that its extension is .zip and open the archive: the pictures are stored inside it. The word directory also contains a webSettings.xml file, among others, with information about the document. If you are curious, try it on a document of your own.

About the system clipboard and clipboardData in JS

The familiar "copy this passage" feature boils down to one method of the clipboardData object (window.clipboardData in old IE, event.clipboardData in clipboard events elsewhere): setData()

clipboardData.setData(sDataFormat, sData)

sDataFormat: the format of the content to copy;
sData: the content to copy.

Because clipboardData was long marked as an experimental feature, I don't use it much day to day. What comes next is closely tied to sDataFormat.
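As a minimal sketch of how setData is typically used, the helper below writes the same content in two formats. The name `fillClipboard` is hypothetical, and it accepts any object with a setData method so that it can be tried without a browser.

```javascript
// Hypothetical helper: write the same content in two formats.
// `clipboardData` is anything with a setData(format, data) method,
// e.g. event.clipboardData inside a 'copy' event handler.
function fillClipboard(clipboardData, plainText, html) {
  clipboardData.setData('text/plain', plainText)
  clipboardData.setData('text/html', html)
}

// In a browser it would be wired up roughly like this:
// document.addEventListener('copy', function (e) {
//   fillClipboard(e.clipboardData, 'hello', '<b>hello</b>')
//   e.preventDefault() // keep the browser from overwriting our data
// })

// Stand-in object so the sketch can run outside a browser.
const fakeClipboard = {
  store: {},
  setData(format, data) {
    this.store[format] = data
  }
}
fillClipboard(fakeClipboard, 'hello', '<b>hello</b>')
console.log(fakeClipboard.store)
```

The formats written here are the only ones a later getData call can return, which is why sDataFormat matters so much for the rest of the article.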

Getting the contents of the clipboard

Active reading

Drawbacks:
- Only works in a secure context, i.e. under an https domain (see picture 1 below)
- The page must have focus; it does not work while the cursor is in the console (see picture 2 below)
- The user is prompted for permission and can even refuse (see picture 3 below)
Advantages:
- It lets you read the clipboard contents...

navigator.clipboard
  .readText()
  .then(text => {
    console.log('Read clipboard succeeded:', text)
  })
  .catch(err => {
    console.log('Failed to read clipboard:', err)
  })
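Given the drawbacks listed above, it can help to check up front whether readText() is usable at all. The following is only a sketch: `canReadClipboard` is a made-up name, and the window object is passed in as a parameter so the check can be exercised with plain objects.

```javascript
// Hypothetical helper: decide up front whether readText() is worth calling.
// `win` is the global window object (passed in so the check is easy to try out).
function canReadClipboard(win) {
  const clipboard = win.navigator && win.navigator.clipboard
  return Boolean(
    win.isSecureContext && clipboard && typeof clipboard.readText === 'function'
  )
}

// Browser usage (readText() can still reject if the user denies permission
// or the document is not focused):
// if (canReadClipboard(window)) {
//   navigator.clipboard.readText().then(console.log).catch(console.error)
// }

console.log(canReadClipboard({ isSecureContext: false, navigator: {} }))
```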

Passive reading: listen for Ctrl+V / the paste event

Call getData on the event's clipboardData. These are the format arguments I know of so far:

text: get the text
text/html: get the HTML text
text/plain: get plain text, same result as text
text/rtf: get the RTF data (don't ask me what RTF is)

window.addEventListener('paste', function(e) {
  const clipdata = e.clipboardData || window.clipboardData
  const data = clipdata.getData('text/html')
  console.log(data)
})

PS: after copying, just paste anywhere on the page. You don't need to find an input box; pressing Ctrl+V is enough.
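To see at a glance which formats a given paste actually carries, the formats can be collected in one place. This is a sketch with a hypothetical helper name, `readPasteFormats`; it relies on getData returning an empty string for a format that was never set.

```javascript
// Hypothetical helper: collect the formats this article cares about.
// `clipboardData` is e.clipboardData from a 'paste' event.
function readPasteFormats(clipboardData) {
  return {
    types: Array.from(clipboardData.types || []),
    html: clipboardData.getData('text/html'),
    plain: clipboardData.getData('text/plain'),
    rtf: clipboardData.getData('text/rtf')
  }
}

// Stand-in for a paste that carries text and HTML but no RTF.
const fakePaste = {
  types: ['text/plain', 'text/html'],
  getData(format) {
    const data = { 'text/plain': 'hi', 'text/html': '<p>hi</p>' }
    return data[format] || ''
  }
}

console.log(readPasteFormats(fakePaste))
```

In the browser the same function would be called as readPasteFormats(e.clipboardData) inside the paste listener.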

The output is as follows. It is full of messy tags; WPS produces much cleaner markup than Office. This sample was copied from Office.

clipdata.getData('text/html') is the method we use for rich text: it returns the HTML of the pasted content. Note that text/html hides a pitfall, which we'll come back to later.

clipdata.getData('text/rtf') returns something even messier, but it contains our picture data (my document has 2 pictures and the result is 11 MB. Scary).

With this background we can leave the rich text editor aside and first implement what the screenshot at the top of the article shows: pasting and displaying the content of a Word document.

<html>
<body>
  <p>Please press Ctrl+V to paste content</p>
  <div id="preview"></div>
  <script>
    window.addEventListener('paste', function (e) {
      const clipdata = e.clipboardData || window.clipboardData
      document.querySelector('#preview').innerHTML = clipdata.getData('text/html')
    })
  </script>
</body>
</html>

Getting the pictures out of the Word document

I learned the following from the CKEditor source code, which lives here:

GitHub: ckeditor5-paste-from-office

Or download it from npm: @ckeditor/ckeditor5-paste-from-office

Walking through the source:

src/index.js -> src/pastefromoffice.js (the init function calls activeNormalizer.execute) -> src/normalizers/mswordnormalizer.js

There we find a replaceImagesSourceWithBase64 method, which is the core of today's study.

The replaceImagesSourceWithBase64 method

It is located in src/filters/image.js

Inside replaceImagesSourceWithBase64, the picture-related methods are:

findAllImageElementsWithLocalSource: finds all pictures whose src starts with file:///

createRangeIn and new Matcher don't need much attention. Everything that is pasted in is text, so these are presumably CKEditor core helpers for walking that text as DOM-like nodes.
We can take the cruder route of rendering straight into the real DOM and then working on the real DOM nodes.
The startsWith('file://') check is what picks out the nodes whose src begins with file://.

function findAllImageElementsWithLocalSource(documentFragment, writer) {
  const range = writer.createRangeIn(documentFragment)

  const imageElementsMatcher = new Matcher({
    name: 'img'
  })

  const imgs = []

  for (const value of range) {
    if (imageElementsMatcher.match(value.item)) {
      if (value.item.getAttribute('src').startsWith('file://')) {
        imgs.push(value.item)
      }
    }
  }

  return imgs
}
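When working on the real DOM instead, the same filter can be written without the CKEditor helpers. The sketch below uses a made-up name, `filterLocalImages`, and also guards against an img that has no src attribute, a case in which the code as quoted above would throw.

```javascript
// Hypothetical DOM-flavoured counterpart of findAllImageElementsWithLocalSource.
// `elements` is any iterable of objects with getAttribute(), for example
// document.querySelectorAll('#preview img') in the browser.
function filterLocalImages(elements) {
  return Array.from(elements).filter(el => {
    const src = el.getAttribute('src') // null when the attribute is missing
    return typeof src === 'string' && src.startsWith('file://')
  })
}

// Stand-in elements so the sketch can run outside a browser.
const fakeImage = src => ({ getAttribute: name => (name === 'src' ? src : null) })
const fakeImages = [
  fakeImage('file:///C:/Temp/clip_image001.png'),
  fakeImage('https://example.com/logo.png'),
  fakeImage(null)
]

console.log(filterLocalImages(fakeImages).length)
```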

Next comes replaceImagesFileSourceWithInlineRepresentation. Before that, extractImageDataFromRtf is executed.

The extractImageDataFromRtf method

It is also in src/filters/image.js

This code takes the value obtained from the clipboard with getData('text/rtf') and extracts the picture data from it (I admit I did not fully understand what exactly gets extracted; I don't know much about RTF, ha ha).

function extractImageDataFromRtf(rtfData) {
  if (!rtfData) {
    return []
  }

  const regexPictureHeader = /{\\pict[\s\S]+?\\bliptag-?\d+(\\blipupi-?\d+)?({\\\*\\blipuid\s?[\da-fA-F]+)?[\s}]*?/
  const regexPicture = new RegExp('(?:(' + regexPictureHeader.source + '))([\\da-fA-F\\s]+)\\}', 'g')
  const images = rtfData.match(regexPicture)
  const result = []

  if (images) {
    for (const image of images) {
      let imageType = false

      if (image.includes('\\pngblip')) {
        imageType = 'image/png'
      } else if (image.includes('\\jpegblip')) {
        imageType = 'image/jpeg'
      }

      if (imageType) {
        result.push({
          hex: image.replace(regexPictureHeader, '').replace(/[^\da-fA-F]/g, ''),
          type: imageType
        })
      }
    }
  }

  return result
}
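To see what the two regular expressions actually pick out, it helps to run them on a tiny hand-written fragment. In RTF a picture sits in a \pict group: a run of control words (including \pngblip or \jpegblip for the format) followed by the image bytes written as hex text. The sample string below is an assumption made up for illustration and is far shorter than anything Word produces.

```javascript
// Same two regular expressions as in extractImageDataFromRtf above.
const regexPictureHeader = /{\\pict[\s\S]+?\\bliptag-?\d+(\\blipupi-?\d+)?({\\\*\\blipuid\s?[\da-fA-F]+)?[\s}]*?/
const regexPicture = new RegExp('(?:(' + regexPictureHeader.source + '))([\\da-fA-F\\s]+)\\}', 'g')

// Hand-written, heavily shortened picture group (illustrative only):
// control words first, then the image bytes as hex text.
const sampleRtf =
  '{\\rtf1 some text ' +
  '{\\pict\\pngblip\\picw1\\pich1\\bliptag123{\\*\\blipuid 0a0b0c}89504e47}' +
  ' more text}'

const groups = sampleRtf.match(regexPicture) || []
const parsed = groups.map(group => ({
  type: group.includes('\\pngblip') ? 'image/png' : 'unknown',
  // Drop the header, then anything that is not a hex digit.
  hex: group.replace(regexPictureHeader, '').replace(/[^\da-fA-F]/g, '')
}))

console.log(parsed)
```

The header regex swallows everything up to and including the blipuid value, so only the trailing hex payload (here the first four bytes of a PNG file) remains.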

replaceImagesFileSourceWithInlineRepresentation

A method in the same file.

The first parameter is the array of image nodes whose src starts with file://, the second is the array of image data extracted from the RTF, and the third is CKEditor's own writer, which it uses to update its content; we can ignore it.

There is also a _convertHexToBase64 method, which converts hex to base64.

Then a loop builds the base64 data URL for each node and sets it as the src of the corresponding picture node, though here it goes through CKEditor's own writer wrapper.

function replaceImagesFileSourceWithInlineRepresentation(imageElements, imagesHexSources, writer) {
  // Assume there is an equal amount of image elements and images HEX sources so they can be matched accordingly based on existing order.
  if (imageElements.length === imagesHexSources.length) {
    for (let i = 0; i < imageElements.length; i++) {
      const newSrc = `data:${imagesHexSources[i].type};base64,${_convertHexToBase64(imagesHexSources[i].hex)}`
      writer.setAttribute('src', newSrc, imageElements[i])
    }
  }
}

function _convertHexToBase64(hexString) {
  return btoa(
    hexString
      .match(/\w{2}/g)
      .map(char => {
        return String.fromCharCode(parseInt(char, 16))
      })
      .join('')
  )
}
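A short worked example makes the hex-to-base64 step less abstract. The function below is a copy of _convertHexToBase64 under a different name (`hexToBase64`, chosen here only to mark it as a copy): every two hex digits become one byte, the bytes become a binary string, and btoa encodes that string.

```javascript
// Same conversion as _convertHexToBase64 above, renamed to mark it as a copy.
function hexToBase64(hexString) {
  const binary = hexString
    .match(/\w{2}/g)
    .map(pair => String.fromCharCode(parseInt(pair, 16)))
    .join('')
  return btoa(binary)
}

// 48 65 6c 6c 6f are the ASCII codes of "Hello".
console.log(hexToBase64('48656c6c6f')) // → SGVsbG8=

// The first 8 bytes of every PNG file, turned into the start of a data URL.
const pngSignature = hexToBase64('89504e470d0a1a0a')
console.log(`data:image/png;base64,${pngSignature}`)
```

This is why base64 PNG data URLs always begin with the same few characters: the file signature is fixed.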

Hands-on practice: get the picture data and display it

After analysing that part of the CKEditor code, what we actually need is:

findAllImageElementsWithLocalSource
- Modified: it reads the real DOM directly to get the picture nodes
replaceImagesFileSourceWithInlineRepresentation
- The final assignment is changed too: since we hold real DOM nodes, we call .setAttribute('src', newSrc) on them directly
extractImageDataFromRtf
_convertHexToBase64

The tidied-up code is as follows:

<html>
<body>
  <p>Please press Ctrl+V to paste content</p>
  <div id="preview"></div>

  <script>
    window.addEventListener('paste', function (e) {
      const clipdata = e.clipboardData || window.clipboardData
      document.querySelector('#preview').innerHTML = clipdata.getData('text/html')
      const rtf = clipdata.getData('text/rtf')
      const imgs = findAllImageElementsWithLocalSource()
      replaceImagesFileSourceWithInlineRepresentation(imgs, extractImageDataFromRtf(rtf))
    })

    // Collect the pasted <img> nodes whose src points at a local file
    function findAllImageElementsWithLocalSource() {
      return Array.from(document.querySelectorAll('#preview img')).filter(img =>
        (img.getAttribute('src') || '').startsWith('file://')
      )
    }

    // Pull the hex-encoded PNG/JPEG pictures out of the RTF clipboard data
    function extractImageDataFromRtf(rtfData) {
      if (!rtfData) {
        return []
      }
      const regexPictureHeader = /{\\pict[\s\S]+?\\bliptag-?\d+(\\blipupi-?\d+)?({\\\*\\blipuid\s?[\da-fA-F]+)?[\s}]*?/
      const regexPicture = new RegExp('(?:(' + regexPictureHeader.source + '))([\\da-fA-F\\s]+)\\}', 'g')
      const images = rtfData.match(regexPicture)
      const result = []
      if (images) {
        for (const image of images) {
          let imageType = false
          if (image.includes('\\pngblip')) {
            imageType = 'image/png'
          } else if (image.includes('\\jpegblip')) {
            imageType = 'image/jpeg'
          }
          if (imageType) {
            result.push({
              hex: image.replace(regexPictureHeader, '').replace(/[^\da-fA-F]/g, ''),
              type: imageType
            })
          }
        }
      }
      return result
    }

    function _convertHexToBase64(hexString) {
      return btoa(
        hexString
          .match(/\w{2}/g)
          .map(char => String.fromCharCode(parseInt(char, 16)))
          .join('')
      )
    }

    function replaceImagesFileSourceWithInlineRepresentation(imageElements, imagesHexSources) {
      // Assume there is an equal amount of image elements and images HEX sources so they can be matched accordingly based on existing order.
      if (imageElements.length === imagesHexSources.length) {
        for (let i = 0; i < imageElements.length; i++) {
          const newSrc = `data:${imagesHexSources[i].type};base64,${_convertHexToBase64(imagesHexSources[i].hex)}`
          imageElements[i].setAttribute('src', newSrc)
        }
      }
    }
  </script>

</body>

</html>
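One thing to keep in mind: the replacement step does nothing at all when the number of img nodes differs from the number of pictures extracted from the RTF. That can happen, for instance, because extractImageDataFromRtf keeps only \pngblip and \jpegblip pictures. The sketch below, with the hypothetical name `describeImageMatch`, simply puts that situation into words so it can be logged during debugging.

```javascript
// Hypothetical helper: describe what the replacement step is about to do,
// so a count mismatch shows up in the console instead of passing silently.
function describeImageMatch(imageCount, sourceCount) {
  if (imageCount === 0) return 'nothing to replace'
  if (sourceCount === 0) return 'no image data in text/rtf'
  if (imageCount !== sourceCount) {
    return `count mismatch: ${imageCount} <img> vs ${sourceCount} RTF pictures`
  }
  return 'ok'
}

console.log(describeImageMatch(2, 2))
console.log(describeImageMatch(2, 1))
```

In the page above it could be called as describeImageMatch(imgs.length, extractImageDataFromRtf(rtf).length) right before the replacement.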

Icing on the cake: uploading the pictures

After all of the steps above we do have the pictures in base64 format, but the strings are far too long to display, and uploading them as-is would require a new back-end endpoint that accepts base64 pictures...

Converting base64 to a Blob object

Roughly speaking, a Blob is the same kind of object as the File we normally get when picking a picture with an input element (File is a subclass of Blob).

The method is as follows:

/**
 * Convert a base64 string (with or without a data URL header) to a Blob
 * @param {String} base64 base64 string
 */
function convertBase64ToBlob(base64) {
  var base64Arr = base64.split(',')
  var imgtype = ''
  // A bare base64 string (no header) is decoded as-is
  var base64String = base64
  if (base64Arr.length > 1) {
    // A data URL: strip the header and read the MIME type from it
    base64String = base64Arr[1]
    imgtype = base64Arr[0].substring(base64Arr[0].indexOf(':') + 1, base64Arr[0].indexOf(';'))
  }
  // Decode base64 into a binary string
  var bytes = atob(base64String)
  var bytesCode = new ArrayBuffer(bytes.length)
  // View the buffer as a typed array
  var byteArray = new Uint8Array(bytesCode)

  // Copy the character code of each char (one byte each) into the array
  for (var i = 0; i < bytes.length; i++) {
    byteArray[i] = bytes.charCodeAt(i)
  }

  // Create the Blob object
  return new Blob([bytesCode], { type: imgtype })
}
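Once the picture is a Blob, it can go through the same multipart upload that an input-selected File would use. The following is a sketch only: the helper name `buildUploadForm`, the field name, the file name and the '/upload' endpoint are all assumptions, to be replaced by whatever the back end expects.

```javascript
// Hypothetical helper: wrap an image Blob in multipart form data.
function buildUploadForm(blob, fieldName, fileName) {
  const form = new FormData()
  // The third argument gives the Blob a file name on the server side.
  form.append(fieldName, blob, fileName)
  return form
}

// Stand-in for the Blob returned by convertBase64ToBlob.
const pngBlob = new Blob([new Uint8Array([0x89, 0x50, 0x4e, 0x47])], { type: 'image/png' })
const uploadForm = buildUploadForm(pngBlob, 'file', 'word-image-1.png')

// Sending it would look roughly like this ('/upload' is a made-up endpoint):
// fetch('/upload', { method: 'POST', body: uploadForm })

console.log(uploadForm.has('file'))
```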

The effect is as follows

A shorter display URL

The upload problem is solved, but such long base64 strings are still annoying. Fortunately we also have object URLs.

Much cleaner all at once:

let blobFile = convertBase64ToBlob('base64 string')
// Generate a short URL directly with URL.createObjectURL
imageElements[i].setAttribute('src', URL.createObjectURL(blobFile))
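Two properties of object URLs are worth keeping in mind: the URL is only a short handle to the Blob held in memory by the current page, and it should be released with URL.revokeObjectURL once it is no longer needed. Because the handle dies with the page, it has to be swapped for the real uploaded URL before the editor content is saved. A minimal sketch, using a small stand-in Blob:

```javascript
// Stand-in for the Blob returned by convertBase64ToBlob.
const imageBlob = new Blob([new Uint8Array([1, 2, 3])], { type: 'image/png' })

// The URL is a short handle starting with "blob:", not the image data itself.
const objectUrl = URL.createObjectURL(imageBlob)
console.log(objectUrl.startsWith('blob:'))

// In the browser: img.src = objectUrl. Once the image has been uploaded and
// its src replaced by the server URL, release the handle:
URL.revokeObjectURL(objectUrl)
```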

Blob to base64

Since we've come this far, here is the reverse conversion as well:

function readBlobAsDataURL(blob, callback) {
  var reader = new FileReader()
  reader.onload = function(e) {
    callback(e.target.result)
  }
  reader.readAsDataURL(blob)
}

// `blob` is a Blob or File object, e.g. the result of convertBase64ToBlob
readBlobAsDataURL(blob, function(base64) {
  console.log(base64)
})
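For code that prefers promises over callbacks, the same conversion can be sketched without FileReader by reading the Blob's bytes with blob.arrayBuffer(). Both function names below (`bytesToDataUrl`, `blobToDataUrl`) are made up for this example; this is an alternative to the callback version above, not a replacement the original author used.

```javascript
// Synchronous core: bytes + MIME type -> data URL.
function bytesToDataUrl(bytes, mimeType) {
  let binary = ''
  for (const byte of bytes) {
    binary += String.fromCharCode(byte)
  }
  return `data:${mimeType};base64,${btoa(binary)}`
}

// Promise-based wrapper: Blob.prototype.arrayBuffer() resolves with the
// Blob's raw bytes, which are then handed to the synchronous core.
function blobToDataUrl(blob) {
  return blob.arrayBuffer().then(buffer => bytesToDataUrl(new Uint8Array(buffer), blob.type))
}

// 72 101 108 108 111 are the bytes of "Hello".
console.log(bytesToDataUrl(new Uint8Array([72, 101, 108, 108, 111]), 'text/plain'))
// → data:text/plain;base64,SGVsbG8=
```

Usage would be blobToDataUrl(blob).then(dataUrl => console.log(dataUrl)).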

Reading the picture, displaying it, and even converting it to a Blob object are all covered. All that is left is to upload the picture and echo the returned URL back, and everything comes to life.

Summary

That covers the core principle, including a reading of the relevant part of the CKEditor source. Of course many details are left out: tag conversion, tag filtering, style filtering and, most importantly, detecting whether the pasted content comes from a Word document and what to do when no RTF data is available. For those, study the CKEditor code.
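As a starting point for the "is this from Word?" question, here is a heuristic sketch. The two patterns are assumptions based on markers that Word-generated HTML commonly carries (a Generator meta tag and the Word XML namespace); they are not copied from CKEditor, whose own check should be consulted for production use. The function names are hypothetical as well.

```javascript
// Heuristic only: looks for markers commonly found in Word-generated HTML.
function looksLikeWordHtml(html) {
  const generatorMeta = /<meta\s+name="?generator"?\s+content="?microsoft\s+word/i
  const wordNamespace = /xmlns:w="urn:schemas-microsoft-com:office:word"/i
  return generatorMeta.test(html) || wordNamespace.test(html)
}

// Decide what to do with a paste: inline the images only when we have both
// Word-style HTML and RTF data to take the image bytes from.
function choosePasteStrategy(html, rtf) {
  if (!looksLikeWordHtml(html)) return 'plain-html'
  return rtf ? 'word-with-images' : 'word-without-images'
}

const wordLikeHtml =
  '<html xmlns:w="urn:schemas-microsoft-com:office:word">' +
  '<head><meta name=Generator content="Microsoft Word 15"></head></html>'

console.log(choosePasteStrategy(wordLikeHtml, ''))
```

When the strategy comes back as 'word-without-images', the file:// pictures cannot be recovered from the clipboard and the user has to insert them some other way.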

Process summary

- Listen for the paste event and read the clipboard data (both text/html and text/rtf)
- After rendering the HTML, find the img nodes whose src starts with file://, then extract the matching picture data from the RTF, one entry per picture
- Convert each picture's hex to base64, then convert further (Blob, object URL) as needed

Easter egg: next episode trailer

The pitfall mentioned above concerns what getData('text/html') and getData('text/rtf') return.

These two values do not appear out of thin air; someone has to set them (don't assume that everything you copy comes with text/html).

They are put on the clipboard by setData('text/html', ...) on the copying side. You can only get what was set (I fell into this hole in another rich text feature, and the Safari browser has pits of its own!).

The next article will cover these clipboard pitfalls!

There are really few articles about how copying Word documents with pictures works. I hope this one helps you.


Copyright notice
This article was written by [Jioho_]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/03/202203010522466011.html