Introducing Caer — A GPU-accelerated Computer Vision Library

This post was originally published by Jason Dsouza at Medium [AI]

A Python library that changes the way you approach Machine Vision

Caer — GPU-accelerated Image & Video Processing. Image by Author.

Since I released Caer back in August of this year, I have received hundreds of emails from researchers and computer vision enthusiasts around the world thanking me for releasing the library. Their good (and bad) feedback pushed and motivated me to take the library to another level.

Today, I’m excited to announce the first-ever stable release of Caer, a lightweight open-source Python library that simplifies the way you approach Computer Vision. It abstracts away unnecessary boilerplate code while still giving you maximum flexibility. By offering powerful image and video processing algorithms, Caer provides both casual and advanced users with an elegant interface for Machine Vision operations.

It leverages the power of libraries like OpenCV and Pillow to speed up your Computer Vision workflow — making it ideal if you want to quickly test out something.

This design philosophy makes Caer ideal for students, researchers, hobbyists and even experts in the fields of Deep Learning and Computer Vision who want to quickly prototype deep learning models or research ideas.


A lightweight Computer Vision library for high-performance AI research. Caer contains powerful image and video processing operations…

What is Caer?

Caer is a GPU-accelerated Computer Vision library in Python that’s designed to help speed up your Computer Vision workflow.

It’s ideal for rapid prototyping, so you can focus more on experimenting than on building. I use this package every single day when working on image and video processing workflows and it saves me tons of time!

Installing Caer

The latest release of Caer can be installed via a simple pip install:

pip install --upgrade caer

Read the complete Installation Guide for more platform-specific download instructions.

Using Caer

I recommend going through the documentation for a look at all the methods in Caer.

1. Standard Test Images

Caer currently ships out of the box with 29 high-quality images from Unsplash. These are extremely handy if you want to test out a feature quickly. Simply call <image>() to get a standard 640×427 image.

Read the documentation for details on all the images you can reference.

To get this image, simply call the corresponding <image>() function. Image by Author.

2. Advanced Resizing

Most libraries today like OpenCV and Pillow perform hard-resizing, meaning that you lose the original aspect ratio of your image. When training Deep Neural Networks, this is not such a big deal, but in other cases, it makes a big difference.

caer.resize() resizes your images to a certain target size (400×400, for instance) while still maintaining the original aspect ratio. Behind the scenes, it uses an advanced cropping mechanism that crops out the most useful part of the image.

We are currently working on a Context-Aware smart image resizer to retain the most useful information in your image without the need for cropping.

# A standard 640x427 image (any of the bundled test images)
>> img = <image>()
# Resizing to 400x400 maintaining aspect ratio
>> resized = caer.resize(img, (400,400), keep_aspect_ratio=True)
>> plt.imshow(resized)
Image by Author
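
To make the difference from hard resizing concrete, here is a minimal sketch of one common way to keep the aspect ratio: scale the image uniformly until it covers the target, then crop the overflow around the centre. This is an assumption for illustration only. The post does not say how caer.resize() picks its crop, and the helper name cover_and_center_crop is hypothetical, not part of Caer.

```python
def cover_and_center_crop(src_size, target_size):
    """Plan an aspect-ratio-preserving resize (hypothetical helper, not Caer's API).

    Sizes are (width, height). Returns the intermediate size after uniform
    scaling and the (left, top, right, bottom) crop box inside it.
    """
    src_w, src_h = src_size
    tgt_w, tgt_h = target_size

    # One scale factor for both axes, large enough to cover the target.
    scale = max(tgt_w / src_w, tgt_h / src_h)
    new_w = max(tgt_w, round(src_w * scale))
    new_h = max(tgt_h, round(src_h * scale))

    # Assumption: crop around the centre; Caer may pick the region differently.
    left = (new_w - tgt_w) // 2
    top = (new_h - tgt_h) // 2
    return (new_w, new_h), (left, top, left + tgt_w, top + tgt_h)


# The 640x427 test image resized to 400x400
scaled, box = cover_and_center_crop((640, 427), (400, 400))
print(scaled, box)
```

A hard resize would instead scale the width by 0.625 and the height by roughly 0.94, which squashes the image horizontally. Using a single scale factor avoids that distortion at the cost of cropping some columns.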

3. Translation and Rotation

Translating an image in Caer is as easy as calling caer.translate(). Behind the scenes, it defines a translation matrix and translates the image.

Rotation follows the same principle: a rotation matrix is defined and the image is rotated.

# Shifts an image 50 pixels to the right and 100 pixels up
>> translated = caer.translate(img, 50, -100)
# Rotates an image around the centre counter-clockwise by 45 degrees
>> rotated = caer.rotate(img, 45, rotPoint=None)
Image by Author
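
For readers curious what the translation and rotation matrices mentioned above look like, here is a small sketch in plain Python. It builds the standard 2x3 affine matrices and applies them to single points rather than whole images. The function names are hypothetical and this is not Caer's code; the rotation matrix follows the same convention as OpenCV's getRotationMatrix2D with a scale of 1, which is an assumption about what the backend does.

```python
import math


def translation_matrix(tx, ty):
    """2x3 affine matrix that shifts points by (tx, ty)."""
    return [[1.0, 0.0, tx],
            [0.0, 1.0, ty]]


def rotation_matrix(center, angle_deg):
    """2x3 affine matrix rotating about `center` by `angle_deg`.

    Uses image coordinates (y grows downwards), so a positive angle looks
    counter-clockwise on screen.
    """
    cx, cy = center
    a = math.cos(math.radians(angle_deg))
    b = math.sin(math.radians(angle_deg))
    return [[a, b, (1 - a) * cx - b * cy],
            [-b, a, b * cx + (1 - a) * cy]]


def apply_affine(matrix, point):
    """Map a single (x, y) point through a 2x3 affine matrix."""
    x, y = point
    (m00, m01, m02), (m10, m11, m12) = matrix
    return (m00 * x + m01 * y + m02, m10 * x + m11 * y + m12)


# 50 pixels right and 100 pixels up (negative y is up in image coordinates)
moved = apply_affine(translation_matrix(50, -100), (10, 200))

# Rotate about the centre of a 640x427 image by 45 degrees
centre = (320.0, 213.5)
fixed = apply_affine(rotation_matrix(centre, 45), centre)  # rotation point stays put
print(moved, fixed)
```

This also explains the -100 in the translate call: image rows are counted from the top, so moving up means decreasing y.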

4. Batch Pre-processing

Got several hundred images and want to quickly compute the mean pixel intensity?

caer.preprocessing.compute_mean_from_dir() iterates over all the images in a directory and returns a tuple of the mean intensity of each channel, which can then be used to perform mean subtraction.

# Computes the per-channel mean over all images in the directory
>> mean = caer.preprocessing.compute_mean_from_dir(path, channels=3, per_channel_subtraction=True)
>> mean
(56.935615485948475, 79.85257611241218, 100.95970799180328)
# Subtracting the mean using these values
>> mp = MeanProcess(mean, channels=3)
>> sub = mp.mean_preprocess(img, channels=3)
>> plt.imshow(sub)
Image by Author
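
As a rough illustration of what per-channel mean subtraction involves, the sketch below averages each channel over every pixel of a few toy "images" and then subtracts those means from one of them. The data is made up, the function names are hypothetical, and weighting every pixel equally is an assumption; Caer's own implementation may aggregate differently.

```python
def channel_means(images):
    """Mean of each channel over every pixel of every image.

    Each image is a list of rows, each row a list of pixel tuples.
    """
    totals = None
    count = 0
    for image in images:
        for row in image:
            for pixel in row:
                if totals is None:
                    totals = [0.0] * len(pixel)
                for c, value in enumerate(pixel):
                    totals[c] += value
                count += 1
    if count == 0:
        raise ValueError("no pixels to average")
    return tuple(total / count for total in totals)


def subtract_mean(image, means):
    """Return a copy of `image` with the channel means subtracted."""
    return [[tuple(v - m for v, m in zip(pixel, means)) for pixel in row]
            for row in image]


# Two tiny 1x2 "images" with three channels each (toy data)
images = [
    [[(10, 20, 30), (30, 40, 50)]],
    [[(50, 60, 70), (70, 80, 90)]],
]
means = channel_means(images)
centred = subtract_mean(images[0], means)
print(means)
print(centred)
```

After subtraction the values are centred around zero per channel, which is why the result can contain negative numbers.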

What’s Next?

Caer is by no means an attempt to reinvent the wheel. In fact, we utilize backend frameworks like OpenCV to ensure maximum flexibility and performance for your Computer Vision models.

We are actively working to improve Caer’s functionality (contributions welcome!). In the coming days, we will be releasing a context-aware image resizer that we’ve been testing out for weeks. If you’d like to request specific functionality, you can do so on our GitHub page or tweet me!

Useful Links

  1. GitHub Repo
  2. Documentation
  3. Contribute to the codebase
  4. Tweet about us!

If you like Caer, give us a ⭐️ on the repo.
