- What is Computer Vision?
- How does a computer read an image?
- What is OpenCV?
- OpenCV Installation
- Read & Save Images
- Basic Operation On images
- OpenCV Resize Image
- OpenCV Image Rotation
- OpenCV Drawing Functions
- OpenCV Blob Detection
What is Computer Vision?
The term Computer Vision (CV) is used and heard very often in artificial intelligence (AI) and deep learning (DL) applications. The term essentially means giving a computer the ability to see the world as we humans do.
Computer Vision is a field of study that enables computers to replicate the human visual system. It is a subset of artificial intelligence that collects information from digital images or videos and processes it to identify attributes. The process involves acquiring, screening, analysing, identifying and extracting information from images. This processing helps computers understand visual content and act on it accordingly.
Computer vision projects translate digital visual content into explicit descriptions to gather multi-dimensional data. This data is then turned into a computer-readable language to aid the decision-making process. The main objective of this branch of artificial intelligence is to teach machines to collect information from pixels.
How does a computer read an image?
How does a human mind apprehend an image? When you look at a photograph, say of a dog, what do you actually see, and how do you decide what is in the image?
You most probably look for different shapes and colours in the image, and that helps you decide that it is an image of a dog. But does a computer see it in the same way? The answer is no.
A digital image is an image composed of picture elements, also known as pixels, each with finite, discrete quantities of numeric representation for its intensity or grey level. So the computer sees an image as numerical values of these pixels and in order to recognise a certain image, it has to recognise the patterns and regularities in this numerical data.
As a hypothetical example of how pixels form an image, consider a grayscale picture stored on a 0-to-1 scale: darker pixels are represented by numbers closer to zero, lighter pixels by numbers approaching one, and every shade in between by a value between 0 and 1.
In practice, a colour image usually has three primary channels (red, green and blue), and the value of each channel ranges from 0 to 255. In simpler terms, a digital colour image is formed by combining these three basic colour channels, whereas a grayscale image has only one channel, whose values also range from 0 to 255.
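To make the pixel representation concrete, here is a minimal NumPy sketch of a tiny grayscale image on the 0-to-1 scale, the same image on the 0-255 scale, and a colour image with three channels. The pixel values are made up for illustration.

```python
import numpy as np

# A 2x3 grayscale image on a 0-to-1 scale: 0.0 is black, 1.0 is white.
gray_float = np.array([[0.0, 0.5, 1.0],
                       [0.25, 0.75, 1.0]])

# The same image on the 0-255 scale, stored as 8-bit unsigned integers.
gray_u8 = np.round(gray_float * 255).astype(np.uint8)

# A colour image adds a third axis that holds one value per channel.
color = np.zeros((2, 3, 3), dtype=np.uint8)
color[0, 0] = (255, 0, 0)  # one pixel with only the first channel at full intensity

print(gray_u8.shape)  # (2, 3): rows, columns
print(color.shape)    # (2, 3, 3): rows, columns, channels
```

The computer only ever sees these arrays of numbers; recognising a dog means finding patterns in them.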
What is OpenCV?
OpenCV (Open Source Computer Vision Library) is an open source software library for computer vision and machine learning. OpenCV was created to provide a shared infrastructure for computer vision applications and to speed up the use of machine perception in consumer products. OpenCV was released under a BSD licence (versions from 4.5.0 onward use the Apache 2 licence), which makes it simple for companies to use and change the code. There are some predefined packages and libraries that make our life simple, and OpenCV is one of them.
Gary Bradski started OpenCV in 1999, and the first release came in 2000. The library is based on optimised C/C++ and supports Java and Python along with C++ through interfaces. It has more than 2500 optimised algorithms, including an extensive collection of computer vision and machine learning algorithms, both classic and state-of-the-art. Using OpenCV, it becomes easy to do complex tasks such as identifying and recognising faces, identifying objects, classifying human actions in videos, tracking camera movements, tracking moving objects, extracting 3D object models, generating 3D point clouds from stereo cameras, stitching images together to generate a high-resolution image of an entire scene, and many more.
Python is a user-friendly language and easy to work with, but this advantage comes at the cost of speed, as Python is slower than languages such as C or C++. So we extend Python with C/C++: computationally intensive code is written in C/C++, and Python wrappers expose it as Python modules. The code is fast because the actual C++ code does the work in the background, and it is still easier to write than C/C++. OpenCV-Python is a Python wrapper for the original OpenCV C++ implementation.
OpenCV installation
There are many ways in which you can install OpenCV on your computer. Here are some:
Install using Anaconda
Anaconda is a conditionally free distribution of the Python and R programming languages for scientific computing that aims to simplify package management and deployment. You can download it from the Anaconda website and install it.
After successfully installing Anaconda, go to the Anaconda prompt and use this command to install OpenCV:
conda install -c conda-forge opencv
After this command has executed successfully, OpenCV will be available on your computer. Now let us see some other ways to install OpenCV.
For Windows
You can use pip to install OpenCV on Windows. Pip is the de facto standard package-management system for installing and managing software packages written in Python, and it usually comes installed with Python. If you do not have Python installed, I would suggest downloading it from the official Python website. Use this command in the command prompt to install OpenCV:
pip install opencv-python
After installing it, check that the installation succeeded. Go to the command prompt, type ‘python’ and hit Enter. You should see a message showing the Python version, followed by the >>> prompt.
If this is not what you see, I suggest reinstalling Python on your system. Next, type import cv2; if there is no error, OpenCV is installed successfully.
For Mac
You can use Homebrew to install OpenCV. It makes installation easy, and you only need this command:
brew install opencv
Now that you have installed OpenCV on your system, let’s see how it works.
Read & Save Images
Now for OpenCV to work on any image, it must be able to read it. Here we will see how to read a file and save it after we are done with it. Let’s see how to do it:
Imread function in OpenCV
We use the imread function to read images. Here is the syntax of this function:
cv2.imread(path, flag)
The path parameter takes a string representing the path of the image to be read. The file should be in the working directory, or we must give the full path to the image. The other parameter is the flag, which specifies how the image should be read. Here are the possible values and what they do:
cv2.IMREAD_COLOR: Loads the image as a 3-channel BGR colour image. Any transparency in the image is ignored. It is the default flag. Alternatively, we can pass the integer value 1 for this flag.
cv2.IMREAD_GRAYSCALE: Loads the image as a single-channel grayscale image. Alternatively, we can pass the integer value 0 for this flag.
cv2.IMREAD_UNCHANGED: Loads the image as is, including the alpha channel. Alternatively, we can pass the integer value -1 for this flag.
Usually imread() returns the image loaded from the specified file. If the image cannot be read (because the file is missing or its format is unsupported or invalid), it returns an empty matrix, which appears as None in Python, rather than raising an error. Here is an example in which we read an image from storage.
# import the OpenCV module
import cv2

# read the image; the flag 1 means "read as a colour image"
img = cv2.imread('dog.jpg', 1)

# display the image
cv2.imshow('image', img)

# waitKey() keeps the window open until a key is pressed;
# without it the window would close immediately
cv2.waitKey()
cv2.destroyAllWindows()
Imwrite function in OpenCV
We can use OpenCV’s imwrite() function to save an image to a storage device. The file extension defines the image format, as shown in the example below. The syntax is the following:
cv2.imwrite(filename, image)
Parameters:
filename: A string representing the file name. The file name must include the image format extension.
image: The image to be saved.
Here is an example in which we use this function:
import cv2

# read the image
img = cv2.imread(r'C:\Users\Mirza\dog.jpeg', 1)

# save the image; imwrite returns True on success
status = cv2.imwrite(r'C:\Users\Mirza\dog.jpeg', img)
print("Image written successfully? :", status)
If the file is written successfully, this function returns True, so it is worth storing its return value. In the example above, we use the ‘status’ variable to find out whether the file was written successfully.
Basic Operation On images
In this section, we are going to discuss some of the basic operations that we can perform on images once we have read them successfully. The operations we are going to cover are:
- Access pixel values and modify them
- Access image properties
- Set a Region of Interest (ROI)
- Split and merge image channels
Access pixel values and modify them
There are two ways to access a pixel value in an image and modify it. First, let us see how to access a particular pixel value of an image.
import numpy as np
import cv2 as cv

img = cv.imread(r'C:\Users\Mirza\dog.jpeg')
px = img[100, 100]
print(px)
Output: [157 166 200]
As you can see, we get an array containing 3 values. OpenCV stores a colour image in BGR order, so the first value is the blue channel of this particular pixel, and the other two are the green and red channels.
We can also access only one of the channels as shown below:
# accessing only the blue value
blue = img[100, 100, 0]
print(blue)
Output: 157
To modify the values, we just need to access the pixel and then overwrite it with a value as shown below:
img[100, 100] = [255, 255, 255]
print(img[100, 100])
Output: [255 255 255]
Accessing and modifying pixels one at a time like this is slow, so for anything beyond a few pixels you should rely on NumPy's array operations, which are optimized for fast array calculations. For accessing individual pixel values, the NumPy array methods array.item() and array.itemset() have traditionally been recommended, as they always return a scalar. Note that array.itemset() was removed in NumPy 2.0; on newer versions, assign through indexing instead, for example img[10, 10, 2] = 100. If you want all of the B, G and R values, you need to call array.item() separately for each one, as shown below:
# accessing the RED value
img.item(10, 10, 2)
>> 59
# modifying the RED value (array.itemset() requires NumPy older than 2.0)
img.itemset((10, 10, 2), 100)
img.item(10, 10, 2)
>> 100
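When many pixels need to change, whole-array operations avoid the per-pixel overhead. The sketch below uses a synthetic image (an assumption, so that it runs without a file) and shows two common patterns: slicing a rectangular block and selecting pixels with a boolean mask.

```python
import numpy as np

# Synthetic BGR image standing in for a loaded photo.
img = np.zeros((100, 100, 3), dtype=np.uint8)
img[20:40, 20:40] = (10, 200, 30)  # a green-ish square

# Slicing: paint a 10x10 block white in one assignment.
img[0:10, 0:10] = (255, 255, 255)

# Boolean mask: find every pixel whose green value exceeds 150 ...
mask = img[:, :, 1] > 150
# ... and set the red channel (index 2 in BGR) of those pixels to 255.
img[mask, 2] = 255

print(int(mask.sum()))  # number of pixels selected by the mask
```

Both assignments touch hundreds of pixels without a Python-level loop.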
Access Image properties
What do we mean by image properties here? Often it is important to know the size (the total number of elements in the image array) and the number of rows, columns and channels. We can access the latter three through the shape attribute and the size through the size attribute, as shown below:
print(img.shape)
>> (342, 548, 3)
print(img.size)
>> 562248
The returned tuple holds three numbers: the number of rows, the number of columns and the number of channels, respectively. If an image is grayscale, the tuple contains only the number of rows and columns.
A large number of errors in OpenCV-Python code are caused by an invalid datatype, so img.dtype, which returns the image datatype, is very important while debugging. Here is an example:
print(img.dtype)
>> uint8
Image ROI(Region of interest)
Often you may come across images where you are only interested in a specific region. Say you want to detect eyes in an image: would you search the entire image? Probably not, as that may not give accurate results. We know that eyes are part of a face, so it is better to detect the face first; here the face is our ROI. You may want to have a look at the article Face detection using Viola-Jones algorithm, where we detect faces and then search for eyes only within the areas where faces were found.
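Since an image is a NumPy array, an ROI can be taken with plain slicing. The sketch below uses a synthetic image and a made-up bounding box (x, y, width, height); in practice the box would come from a detector.

```python
import numpy as np

img = np.zeros((200, 300, 3), dtype=np.uint8)

# Hypothetical bounding box of a region of interest: x, y, width, height.
x, y, w, h = 50, 40, 100, 80
img[y:y + h, x:x + w] = (0, 255, 0)  # mark the region so it is visible

# Rows come first when indexing, so the slice is [y:y+h, x:x+w].
roi = img[y:y + h, x:x + w]
print(roi.shape)  # (80, 100, 3)

# A slice is a view on the original array; copy it for an independent crop.
crop = roi.copy()

# Paste the crop somewhere else in the image.
img[100:100 + h, 180:180 + w] = crop
```

Because roi is a view, writing to it changes img as well, which is convenient for in-place edits but easy to forget.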
Splitting and Merging Image Channels
We can also split an image into its channels and work on each channel separately, and sometimes you may need to merge them back together. Here is how we do it:
b, g, r = cv.split(img)
img = cv.merge((b, g, r))
But cv.split() is painfully slow, so when you only need the individual channels you can use NumPy indexing to do the same. Here is how:
b = img[:, :, 0]
g = img[:, :, 1]
r = img[:, :, 2]
Now suppose you want to just set all the values in the red channel to zero, here is how to do that:
# set all values in the red channel to zero
img[:, :, 2] = 0
OpenCV Resize Image
When working with images, we often need to resize them to meet certain requirements. This is especially common in machine learning and deep learning, where smaller images reduce the training time of a neural network: the more pixels an image has, the more input nodes the model needs, which in turn increases its complexity. We use the built-in resize() method to resize an image.
Syntax: cv2.resize(s, size, fx=0, fy=0, interpolation=cv2.INTER_LINEAR)
Parameters:
s - input image (required).
size - desired size of the output image as (width, height) (required; pass None to have it computed from fx and fy).
fx - scale factor along the horizontal axis (optional).
fy - scale factor along the vertical axis (optional).
interpolation - interpolation method (optional). The flag accepts the following values:
INTER_NEAREST - a nearest-neighbor interpolation
INTER_LINEAR - a bilinear interpolation (used by default)
INTER_AREA - resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
INTER_CUBIC - a bicubic interpolation over a 4×4 pixel neighborhood
INTER_LANCZOS4 - a Lanczos interpolation over an 8×8 pixel neighborhood
Here is an example of how we can use this method:
# import the OpenCV module
import cv2

# read the image; the flag 1 means "read as a colour image"
img = cv2.imread('dog.jpg', 1)
print(img.shape)

# resize to 780 px wide and 540 px high using nearest-neighbour interpolation
img_resized = cv2.resize(img, (780, 540), interpolation=cv2.INTER_NEAREST)

cv2.imshow("Resized", img_resized)
cv2.waitKey(0)
cv2.destroyAllWindows()
OpenCV Image Rotation
We may need to rotate an image in some cases, and OpenCV makes this easy. We use the cv2.rotate() method to rotate a 2D array in multiples of 90 degrees. Here is the syntax:
Syntax: cv2.rotate(src, rotateCode[, dst])
Parameters:
src: The image to be rotated.
rotateCode: An enum that specifies how to rotate the array. The possible values are:
cv2.ROTATE_90_CLOCKWISE
cv2.ROTATE_180
cv2.ROTATE_90_COUNTERCLOCKWISE
Here is an example using this function.
# import the OpenCV module
import cv2

# read the image; the flag 1 means "read as a colour image"
img = cv2.imread('dog.jpg', 1)
print(img.shape)

# rotate by 90 degrees counter-clockwise
image = cv2.rotate(img, cv2.ROTATE_90_COUNTERCLOCKWISE)

cv2.imshow("Rotated", image)
cv2.waitKey()
cv2.destroyAllWindows()
Now, what if we want to rotate the image by an arbitrary angle? We can use another method for that. First, calculate the affine matrix that performs the affine transformation (a linear mapping of pixels) using the getRotationMatrix2D method; next, warp the input image with that matrix using the warpAffine method. Here is the syntax of these functions:
Syntax:
cv2.getRotationMatrix2D(center, angle, scale)
cv2.warpAffine(Img, M, (W, H))
center: the centre of the image (the point about which the rotation happens).
angle: the angle by which the image is rotated, in the anti-clockwise direction.
scale: scales the image by the value provided; 1.0 means the size is preserved.
H: height of the output image.
W: width of the output image.
M: the affine matrix returned by cv2.getRotationMatrix2D.
Img: the image to be rotated.
Here is an example in which we rotate an image by various angles.
# import the OpenCV module
import cv2

# read the image; the flag 1 means "read as a colour image"
img = cv2.imread('dog.jpg', 1)

# get the image height and width
(h, w) = img.shape[:2]

# calculate the centre of the image
center = (w / 2, h / 2)
scale = 1.0

# rotate counter-clockwise about the centre;
# warpAffine takes the output size as (width, height)
# 45 degrees
M = cv2.getRotationMatrix2D(center, 45, scale)
print(M)
rotated45 = cv2.warpAffine(img, M, (w, h))

# 110 degrees
M = cv2.getRotationMatrix2D(center, 110, scale)
rotated110 = cv2.warpAffine(img, M, (w, h))

# 150 degrees
M = cv2.getRotationMatrix2D(center, 150, scale)
rotated150 = cv2.warpAffine(img, M, (w, h))

cv2.imshow('Original Image', img)
cv2.waitKey(0)           # waits until a key is pressed
cv2.destroyAllWindows()  # destroys the window showing the image

cv2.imshow('Image rotated by 45 degrees', rotated45)
cv2.waitKey(0)
cv2.destroyAllWindows()

cv2.imshow('Image rotated by 110 degrees', rotated110)
cv2.waitKey(0)
cv2.destroyAllWindows()

cv2.imshow('Image rotated by 150 degrees', rotated150)
cv2.waitKey(0)
cv2.destroyAllWindows()
OpenCV Drawing Functions
We may need to draw shapes such as circles, rectangles, ellipses and polylines on an image, and we can easily do this using OpenCV. Drawing is often used to highlight an object in the input image; in face detection, for example, we might want to highlight the face with a rectangle. Here we will learn about the drawing functions for circles, rectangles, lines and polylines, and also see how to write text on an image.
Drawing circle:
We use the circle() method to draw a circle on an image. Here are the syntax and parameters:
Syntax: cv2.circle(image, center_coordinates, radius, color, thickness)
Parameters:
image: The input image on which the circle is to be drawn.
center_coordinates: The centre coordinates of the circle, given as a tuple of two values, i.e. (X coordinate value, Y coordinate value).
radius: The radius of the circle.
color: The colour of the circle's border line. We can pass a tuple in BGR order, e.g. (255, 0, 0) for blue.
thickness: The thickness of the circle's border line in px. A thickness of -1 fills the circle with the specified colour.
Return Value: It returns an image.
Here is an example:
import numpy as np
import cv2

img = cv2.imread(r'C:\Users\Mirza\dog.jpeg', 1)

# filled blue circle of radius 55 centred at (80, 80)
cv2.circle(img, (80, 80), 55, (255, 0, 0), -1)

cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Drawing Rectangle
In a similar way, we can draw a rectangle. Here is the syntax for this function:
Syntax: cv2.rectangle(image, start_point, end_point, color, thickness)
Parameters:
image: The input image on which the rectangle is to be drawn.
start_point: The starting coordinates (top-left vertex) of the rectangle, given as a tuple of two values, i.e. (X coordinate value, Y coordinate value).
end_point: The ending coordinates (bottom-right vertex) of the rectangle, given as a tuple of two values, i.e. (X coordinate value, Y coordinate value).
color: The colour of the rectangle's border line. We can pass a tuple in BGR order, e.g. (255, 0, 0) for blue.
thickness: The thickness of the rectangle's border line in px. A thickness of -1 fills the rectangle with the specified colour.
Return Value: It returns an image.
Here is an example of this function:
import numpy as np
import cv2

img = cv2.imread(r'C:\Users\Mirza\dog.jpeg', 1)

# yellow rectangle with a 15 px border from (15, 25) to (200, 150)
cv2.rectangle(img, (15, 25), (200, 150), (0, 255, 255), 15)

cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Drawing Lines
Here is the syntax of the line() method, which we can use to draw lines on an image.
Syntax: cv2.line(image, start_point, end_point, color, thickness)
Parameters:
image: The input image on which the line is to be drawn.
start_point: The starting coordinates of the line, given as a tuple of two values, i.e. (X coordinate value, Y coordinate value).
end_point: The ending coordinates of the line, given as a tuple of two values, i.e. (X coordinate value, Y coordinate value).
color: The colour of the line. We can pass a tuple in BGR order, e.g. (255, 0, 0) for blue.
thickness: The thickness of the line in px.
Return Value: It returns an image.
Here is an example:
import cv2

img = cv2.imread(r'C:\Users\Mirza\dog.jpeg', 1)

# draw a yellow line, 3 px thick, from (100, 50) to (200, 300)
cv2.line(img, (100, 50), (200, 300), (0, 255, 255), 3)

cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Drawing Polylines
We can draw polylines on an image using the polylines() method, which can be used to draw polygonal curves. The syntax is given below:
Syntax: cv2.polylines(image, [arr], is_closed, color, thickness)
Parameters:
image - The image to draw on.
arr - The coordinates of the vertices as an array of shape nx1x2, where n is the number of vertices; it should be of type int32. The function takes a list of such arrays, one per polyline.
is_closed - A flag that indicates whether the drawn polylines are closed or not.
color - The colour of the polylines. We can pass a tuple in BGR order, e.g. (255, 0, 0) for blue.
thickness - The thickness of the polyline's edges.
Here is an example
import numpy as np
import cv2

img = cv2.imread(r'C:\Users\Mirza\dog.jpeg', 1)

# define the points for the polyline
pts = np.array([[100, 50], [200, 300], [700, 200], [500, 100]], np.int32)
# pts = pts.reshape((-1, 1, 2))

# draw a closed yellow polyline, 3 px thick
cv2.polylines(img, [pts], True, (0, 255, 255), 3)

cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Write text on an image
We can write text on an image by using the putText() method. The syntax is given below.
Syntax: cv2.putText(img, text, org, font, fontScale, color, thickness)
Parameters:
img: The input image on which we want to write text.
text: The text we want to write on the image.
org: The bottom-left corner of the text string on the image, so it sets the location of the text.
font: The font of the text, chosen from OpenCV's built-in Hershey fonts such as cv2.FONT_HERSHEY_SIMPLEX.
fontScale: The scale of the font, by which you can increase or decrease its size.
color: The colour of the text. We can pass a tuple in BGR order, e.g. (255, 0, 0) for blue.
thickness: The thickness of the text strokes in px (optional).
Here is an example:
import numpy as np
import cv2

font = cv2.FONT_HERSHEY_SIMPLEX
img = cv2.imread(r'C:\Users\Mirza\dog.jpeg', 1)

# write 'Dog' in white with its bottom-left corner at (10, 500)
cv2.putText(img, 'Dog', (10, 500), font, 1, (255, 255, 255), 2)

# display the image
cv2.imshow("image", img)
cv2.waitKey(0)
OpenCV Blob Detection
Blob is often expanded as Binary Large Object, where the term “Large” indicates that only objects of a certain size are of interest and that smaller binary objects are usually considered noise.
In simpler terms, a blob is a group of connected pixels in an image that share some common property. Picture an image with several coloured connected regions: each region is a blob, and the goal of blob detection is to identify and mark these regions, for example with red circles.
Using OpenCV’s SimpleBlobDetector, we can easily find blobs in our images. But how does it work? Let us see this in detail:
- Thresholding: First the algorithm converts the source image to several binary images by applying thresholds from minThreshold (inclusive) to maxThreshold (exclusive). It starts at minThreshold and increments by thresholdStep, so the first threshold is minThreshold, the second is minThreshold + thresholdStep, and so on.
- Grouping: In each binary image, connected pixels are grouped into binary blobs by extracting their contours, that is, the curves joining all the continuous points along a boundary that have the same colour or intensity.
- Merging: The centers of the binary blobs in the binary images are computed, and blobs located closer to each other than minDistBetweenBlobs (the minimum distance between two blobs) are merged.
- Center & Radius Calculation: The centers and radii of the new merged blobs are computed and returned.
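The thresholding step above can be sketched in a few lines of plain Python. This is an illustration of the idea, not OpenCV's implementation; the step of 10 and the row of grey levels are assumed values.

```python
def threshold_levels(min_threshold, max_threshold, step):
    """Thresholds from min_threshold (inclusive) to max_threshold (exclusive)."""
    levels = []
    t = min_threshold
    while t < max_threshold:
        levels.append(t)
        t += step
    return levels

levels = threshold_levels(10, 200, 10)  # the step of 10 is an assumed value

# Binarise one row of grey levels at each threshold (1 = brighter than the threshold).
row = [0, 40, 80, 120, 160, 200, 240]
binary_rows = [[1 if v > t else 0 for v in row] for t in levels]

print(levels[0], levels[-1], len(levels))
```

Each threshold yields its own binary image, and a blob is only reported if it shows up consistently across several of them.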
This class can filter the returned blobs in several ways; set the corresponding filterBy* parameter to True to turn a filter on. The available filters are as follows:
- By color. The blobColor parameter selects the blobs of the colour we are interested in. Set blobColor to 0 to extract dark blobs and to 255 to extract light blobs. This filter compares the intensity of the binary image at the center of a blob to blobColor and filters accordingly.
- By area. With this filter, the extracted blobs have an area between minArea (inclusive) and maxArea (exclusive).
- By circularity. With this filter, the extracted blobs have a circularity between minCircularity (inclusive) and maxCircularity (exclusive).
- By ratio of the minimum inertia to maximum inertia. With this filter, the extracted blobs have this ratio between minInertiaRatio (inclusive) and maxInertiaRatio (exclusive).
- By convexity. With this filter, the extracted blobs have a convexity (area / area of blob convex hull) between minConvexity (inclusive) and maxConvexity (exclusive).
By default, the values of these parameters are tuned to extract dark circular blobs.
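To give a feel for the circularity filter, the sketch below evaluates the usual circularity measure, 4 * pi * area / perimeter squared, for an ideal circle and an ideal square. The shapes and sizes are made up for illustration.

```python
import math

def circularity(area, perimeter):
    """4 * pi * area / perimeter**2: 1.0 for a perfect circle, smaller otherwise."""
    return 4 * math.pi * area / perimeter ** 2

r = 10.0
circle = circularity(math.pi * r ** 2, 2 * math.pi * r)

side = 10.0
square = circularity(side ** 2, 4 * side)

print(round(circle, 3), round(square, 3))
```

A minCircularity slightly below the square's value would keep squares, while a value close to 1 would keep only near-circular blobs.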
Here is an example of how to use SimpleBlobDetector:
import cv2
import numpy as np

img = cv2.imread(r"pic1.jpeg", cv2.IMREAD_GRAYSCALE)

# Set up the detector with default parameters.
# (OpenCV 3 and later use the SimpleBlobDetector_create() factory function.)
detector = cv2.SimpleBlobDetector_create()

# Detect blobs.
keypoints = detector.detect(img)

# Draw detected blobs as red circles.
# cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS ensures the size of the circle
# corresponds to the size of the blob.
im_with_keypoints = cv2.drawKeypoints(img, keypoints, np.array([]), (0, 0, 255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

# Show keypoints
cv2.imshow("Keypoints", im_with_keypoints)
cv2.waitKey(0)
Now here is an example in which we use the filters mentioned above:
import cv2
import numpy as np

# Read image
im = cv2.imread("blob.jpg")

# Set up SimpleBlobDetector parameters.
params = cv2.SimpleBlobDetector_Params()

# Change thresholds
params.minThreshold = 10
params.maxThreshold = 200

# Filter by area.
params.filterByArea = True
params.minArea = 1500

# Filter by circularity.
params.filterByCircularity = True
params.minCircularity = 0.1

# Filter by convexity.
params.filterByConvexity = True
params.minConvexity = 0.87

# Filter by inertia.
params.filterByInertia = True
params.minInertiaRatio = 0.01

# Create a detector with the parameters.
# (OpenCV 3 and later use the SimpleBlobDetector_create() factory function.)
detector = cv2.SimpleBlobDetector_create(params)

# Detect blobs.
keypoints = detector.detect(im)

# Draw detected blobs as red circles.
# cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS ensures the size of the circle
# corresponds to the size of the blob.
im_with_keypoints = cv2.drawKeypoints(im, keypoints, np.array([]), (0, 0, 255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

# Show blobs
cv2.imshow("Keypoints", im_with_keypoints)
cv2.waitKey(0)
This brings us to the end of this article, in which we learned about OpenCV. You can also take a free course on Computer Vision from Great Learning Academy.