Tag based augmented reality using OpenCV
2022-06-29 20:01:00 【woshicver】
Prerequisites
Understand what augmented reality (AR), virtual reality (VR), and mixed reality (MR) are, and the difference between marker-based and marker-less AR: https://arshren.medium.com/all-you-want-to-know-about-augmented-reality-1d5a8cd08977
Tag-based augmented reality
Tag-based AR, also known as image-recognition AR, uses an object or fiducial marker as a reference to determine the position and orientation of the camera.
Marker-based AR works by scanning an image such as an ArUco marker. Detecting the marker triggers the augmented experience, positioning an object, text, video, or animation for display on the device.
In this example, we will write a simple program that augments an image onto a video stream with the help of ArUco markers.
ArUco markers
ArUco (Augmented Reality University of Cordoba) markers were developed by S. Garrido-Jurado et al. in their 2014 paper "Automatic generation and detection of highly reliable fiducial markers under occlusion" (https://www.researchgate.net/publication/260251570_Automatic_generation_and_detection_of_highly_reliable_fiducial_markers_under_occlusion).
An ArUco marker is a square fiducial marker used for camera pose estimation. When an ArUco marker is detected in a video, digital content such as an image can be overlaid on the detected marker.

A 6×6 ArUco marker
An ArUco marker is a synthetic square marker with an inner binary matrix enclosed in a wide black border, encoding a unique identifier. In an ArUco marker, black represents 1 and white represents 0.
The marker size determines the size of the inner binary matrix. The odd cells of an ArUco marker carry parity bits, and the even cells carry data bits.
The black border enables fast detection in the image, and the binary matrix allows the marker to be identified.
ArUco markers help the camera estimate angle, height, and depth, and are used in computer vision, robotics, and augmented reality.
ArUco markers come from predefined dictionaries covering a range of dictionary sizes and marker sizes. To generate an ArUco marker, you need to specify:
the dictionary size: the number of markers in the dictionary
the marker size: the number of bits in the binary matrix
The ArUco marker above comes from a dictionary of 100 markers, each a 6×6 binary matrix.
This example captures video from your computer's default camera and uses four ArUco markers from the DICT_6X6_100 dictionary. Once the ArUco markers are detected, an image is overlaid on the region they span.

Read how to read, write, and display videos with OpenCV here: https://arshren.medium.com/read-and-write-videos-using-opencv-7f92548afcba
Import the required libraries
import numpy as np
import cv2
import imutils
Detecting ArUco markers in the image
To detect ArUco markers:
Analyze the image to find square shapes that are marker candidates.
For each candidate, verify its inner code to confirm it is an ArUco marker.
In OpenCV, ArUco marker dictionaries follow the naming convention cv2.aruco.DICT_NXN_M, where N is the size of the binary matrix (the marker size) and M is the number of markers in the dictionary.
To detect ArUco markers, convert the BGR image to grayscale to simplify detection. getattr() is used to look up the dictionary attribute on cv2.aruco by name so the ArUco dictionary can be loaded.
Detection is performed by the detectMarkers() function, whose inputs are the image containing the ArUco markers, the ArUco dictionary object (DICT_6X6_100 in our case), and the detector parameters. detectMarkers() returns the vectors of the four corners of each detected marker, their ids, and any rejected square candidates that did not contain a valid ArUco encoding.
def findArucoMarkers(img, markerSize=6, totalMarkers=250, draw=True):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    key = getattr(cv2.aruco, f'DICT_{markerSize}X{markerSize}_{totalMarkers}')
    # Load the dictionary that was used to generate the markers
    # (note: OpenCV >= 4.7 replaces Dictionary_get / DetectorParameters_create
    # with cv2.aruco.getPredefinedDictionary and cv2.aruco.DetectorParameters)
    arucoDict = cv2.aruco.Dictionary_get(key)
    # Initialize the detector parameters using default values
    arucoParam = cv2.aruco.DetectorParameters_create()
    # Detect the markers
    bboxs, ids, rejected = cv2.aruco.detectMarkers(gray, arucoDict, parameters=arucoParam)
    return bboxs, ids
Applying augmented reality by superimposing the source image onto the video
Start capturing video from your computer's default camera, and read the image to be superimposed on the ArUco markers.
Detect the ArUco markers in each video frame and find the positions of all four corners of each marker. Then compute the homography between the video frame and the image to be superimposed.
A homography is a transformation that maps points in one image to the corresponding points in another image.
OpenCV's findHomography() computes the homography matrix between the image points and the video-frame points, which is used to warp the image to fit the video frame. The warped image is then masked and copied into the video frame.

import numpy as np
import cv2
import imutils

# Function to detect ArUco markers
def findArucoMarkers(img, markerSize=6, totalMarkers=250, draw=True):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    key = getattr(cv2.aruco, f'DICT_{markerSize}X{markerSize}_{totalMarkers}')
    # Load the dictionary that was used to generate the markers
    arucoDict = cv2.aruco.Dictionary_get(key)
    # Initialize the detector parameters using default values
    arucoParam = cv2.aruco.DetectorParameters_create()
    # Detect the markers
    bboxs, ids, rejected = cv2.aruco.detectMarkers(gray, arucoDict, parameters=arucoParam)
    return bboxs, ids

# Superimpose the image on the ArUco markers detected in the video
imgH = 480
imgW = 640
video = cv2.VideoCapture(0)
ret, video_frame = video.read()
image = cv2.imread(r'nature.png')
image = cv2.resize(image, (imgH, imgW))
while video.isOpened():
    if ret == True:
        refPts = []
        # Detect the ArUco markers in the video frame
        arucofound = findArucoMarkers(video_frame, totalMarkers=100)
        h, w = video_frame.shape[:2]
        # If ArUco markers are detected
        if len(arucofound[0]) != 0:
            for Corner, id in zip(arucofound[0], arucofound[1]):
                corners = Corner.reshape((4, 2))
                (topLeft, topRight, bottomRight, bottomLeft) = corners
                topRight = (int(topRight[0]), int(topRight[1]))
                bottomRight = (int(bottomRight[0]), int(bottomRight[1]))
                bottomLeft = (int(bottomLeft[0]), int(bottomLeft[1]))
                topLeft = (int(topLeft[0]), int(topLeft[1]))
                # Draw lines around the marker and display the marker id
                cv2.line(video_frame, topLeft, topRight, (0, 255, 0), 2)
                cv2.line(video_frame, topRight, bottomRight, (0, 255, 0), 2)
                cv2.line(video_frame, bottomRight, bottomLeft, (0, 255, 0), 2)
                cv2.line(video_frame, bottomLeft, topLeft, (0, 255, 0), 2)
                cv2.putText(video_frame, str(id), (topLeft[0], topLeft[1] - 15),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
                corner = np.squeeze(Corner)
                refPts.append(corner)
        # Only when all 4 markers are detected in the frame
        if len(refPts) == 4:
            (refPtBR, refPtTR, refPtBL, refPtTL) = refPts
            video_pt = np.array([refPtTL[3], refPtBL[3], refPtBR[2], refPtTR[3]])
            # Grab the spatial dimensions of the image and define the
            # source points for the transform in
            # top-left, top-right, bottom-right, bottom-left order
            image_pt = np.float32([[0, 0], [h, 0], [h, w], [0, w]])
            # Compute the homography matrix between the image and the video frame
            matrix, _ = cv2.findHomography(image_pt, video_pt)
            # Warp the image onto the video frame based on the homography
            warped = cv2.warpPerspective(image, matrix,
                                         (video_frame.shape[1], video_frame.shape[0]))
            # Create a mask representing the region to
            # copy from the warped image into the video frame
            mask = np.zeros((imgH, imgW), dtype="uint8")
            cv2.fillConvexPoly(mask, video_pt.astype("int32"), (255, 255, 255), cv2.LINE_AA)
            # Apply a dilation so the warped image gets a slight
            # black border when applied to the video frame
            rect = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
            mask = cv2.dilate(mask, rect, iterations=2)
            # Stack the mask depth-wise into a three-channel version,
            # so the warped source image can be blended into the frame
            maskScaled = mask.copy() / 255.0
            maskScaled = np.dstack([maskScaled] * 3)
            # Copy the masked warped image into the video frame by
            # (1) multiplying the warped image with the mask,
            # (2) multiplying the video frame with the inverse mask, and
            # (3) adding the two results
            warpedMultiplied = cv2.multiply(warped.astype("float"), maskScaled)
            imageMultiplied = cv2.multiply(video_frame.astype(float), 1.0 - maskScaled)
            output = cv2.add(warpedMultiplied, imageMultiplied)
            output = output.astype("uint8")
            cv2.imshow("output", output)
    ret, video_frame = video.read()
    key = cv2.waitKey(20)
    # Break if the q key is pressed
    if key == 113:
        break
# Finally release the capture and destroy/close all open windows
video.release()
cv2.destroyAllWindows()
The final output maps the image onto the ArUco markers detected in the video.

Augmented reality using ArUco markers
The full code is available here: https://github.com/arshren/AR_Aruco
References:
https://docs.opencv.org/4.x/d5/dae/tutorial_aruco_detection.html
https://machinelearningknowledge.ai/augmented-reality-using-aruco-marker-detection-with-python-opencv/
https://learnopencv.com/augmented-reality-using-aruco-markers-in-opencv-c-python/