QR Code Scanner with ESP32-CAM and OpenCV

Hello friends! How are you today? Today we're going to discuss a project that is both interesting and useful in everyday life. You see QR codes almost everywhere, right? They are printed on product packaging, leaflets, newspapers, and brochures.

You probably use a QR code scanner on your mobile device quite often. What about making such a program yourself? Yes! That is exactly what we are going to do today. We will make a QR code scanner using the ESP32-CAM. For image processing, we will use the OpenCV library.

If you’ve ever wanted to create a real-time QR code scanner using a low-cost, wireless camera module, you’re in the right place. In this tutorial, we’ll walk through setting up an ESP32-CAM to stream video and using OpenCV to detect and decode QR codes in real time.

Introduction to the ESP32-CAM

The ESP32-CAM is a powerful yet affordable development board that combines the ESP32 microcontroller with an integrated camera module, making it an excellent choice for IoT and vision-based applications. Whether you're building a wireless security camera, a QR code scanner, or an AI-powered image recognition system, the ESP32-CAM provides a compact and cost-effective solution.

One of its standout features is built-in WiFi and Bluetooth connectivity, allowing it to stream video or capture images remotely. Despite its small size, it packs a punch with a dual-core processor, support for microSD card storage, and compatibility with various camera sensors (such as the OV2640). However, since it lacks built-in USB-to-serial functionality, flashing firmware requires an external FTDI adapter.

System Architecture of ESP32-CAM QR Code Scanner

This project consists of two main components:

  1. ESP32-CAM as an Image Server

  2. Python Script for QR Code Detection and Processing

The two components communicate over the local WiFi network using plain HTTP requests.

1. High-Level Overview

The architecture consists of:

  • ESP32-CAM: Captures images and hosts them on a web server.

  • WiFi Network: Enables communication between ESP32-CAM and the computer running the Python script.

  • Python Script on a Computer: Continuously fetches images from ESP32-CAM, processes them, and extracts QR code data.

  • User Interface: Displays the live feed and detected QR codes.


2. Detailed Breakdown of Components

A. ESP32-CAM (Image Server)

  • Hardware: ESP32-CAM module with OV2640 camera.

  • Software: ESP32-CAM uses the esp32cam library to initialize the camera and serve images via an HTTP web server.

  • Functionality:

    • Captures an image when accessed via http://<ESP32-CAM-IP>/cam-hi.jpg.

    • Returns the image in JPEG format to the requesting client.

Workflow:

  1. ESP32-CAM initializes camera settings (resolution: 800x600, JPEG quality: 80).

  2. It connects to a WiFi network.

  3. A web server starts on port 80.

  4. When a client (Python script) accesses /cam-hi.jpg, ESP32-CAM captures an image and sends it.

B. Python QR Code Detection Script (Client)

  • Hardware: A computer (Windows/Linux/Mac).

  • Software: Python, OpenCV, NumPy, urllib.

  • Functionality:

    • Fetches images from ESP32-CAM at regular intervals.

    • Converts them to grayscale for better QR detection.

    • If normal detection fails, applies adaptive thresholding.

    • Detects and decodes QR codes using OpenCV.

    • Displays the live video feed with detected QR code data.

Workflow:

  1. The script continuously requests images from http://<ESP32-CAM-IP>/cam-hi.jpg.

  2. It decodes the image using OpenCV.

  3. Converts the image to grayscale.

  4. Attempts to detect a QR code.

  5. If detection fails, applies image preprocessing (blurring and thresholding).

  6. If a QR code is found, it prints the decoded text and overlays a bounding box.

  7. The processed frame is displayed in a window.

3. Communication & Data Flow

Data Flow Between Components

  1. ESP32-CAM Captures Image

    • Uses esp32cam::capture() to take a snapshot.

    • Hosts the image on an HTTP endpoint (/cam-hi.jpg).

  2. Python Script Requests Image

    • Sends an HTTP GET request using urllib.request.urlopen().

    • Receives the image data in JPEG format.

  3. Image Processing & QR Code Detection

    • OpenCV converts the image to grayscale.

    • Tries decoding the QR code using cv2.QRCodeDetector().detectAndDecode().

    • If unsuccessful, applies adaptive thresholding and retries.

  4. Output Display & User Interaction

    • If a QR code is detected, its content is displayed.

    • Bounding boxes are drawn around detected QR codes.

    • Live video feed is displayed in an OpenCV window.
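The request/response exchange described above can be sketched in a few lines of Python. This is only an illustration, not the script used later in this tutorial: the helper names looks_like_jpeg and fetch_snapshot are made up for this example, and the check relies only on the standard JPEG start (FF D8) and end (FF D9) markers.

```python
import urllib.request

JPEG_START = b"\xff\xd8"  # marker that opens a JPEG file
JPEG_END = b"\xff\xd9"    # marker that closes it


def looks_like_jpeg(data: bytes) -> bool:
    """Return True if the payload starts and ends with the JPEG markers."""
    return len(data) >= 4 and data.startswith(JPEG_START) and data.endswith(JPEG_END)


def fetch_snapshot(url: str, timeout: float = 5.0) -> bytes:
    """Send one HTTP GET to the camera endpoint and return the raw body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()
```

With the board online, looks_like_jpeg(fetch_snapshot("http://192.168.1.101/cam-hi.jpg")) should return True once the IP is replaced with your own. A transfer that was cut short will usually fail the end-marker check.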

List of components

Component | Quantity
ESP32-CAM WiFi + Bluetooth Camera Module | 1
FTDI USB to Serial Converter 3V3-5V | 1
Male-to-female jumper wires | 4
Female-to-female jumper wire | 1
MicroUSB data cable | 1

Circuit diagram

Following is the circuit diagram of this project.

Fig: Circuit diagram

ESP32-CAM pin | FTDI USB to Serial Converter pin (voltage selector in the 5V position)
5V | VCC
GND | GND
U0T | RX
U0R | TX
IO0 | GND (on the FTDI or the ESP32-CAM; needed only while uploading)

Programming

If this is your first project with an ESP32 board, you need to install the ESP32 board support first. You will also need to download and install the esp32cam library. To make the camera functional, the CP210x USB driver and the FTDI driver must be properly installed on your computer. Here is a detailed tutorial that shows how to get started with the ESP32-CAM.

ESP32-CAM code


#include <WebServer.h>

#include <WiFi.h>

#include <esp32cam.h>

const char* WIFI_SSID = "SSID";

const char* WIFI_PASS = "password";

WebServer server(80);

 


static auto hiRes = esp32cam::Resolution::find(800, 600);

void serveJpg()

{

  auto frame = esp32cam::capture();

  if (frame == nullptr) {

    Serial.println("CAPTURE FAIL");

    server.send(503, "", "");

    return;

  }

  Serial.printf("CAPTURE OK %dx%d %db\n", frame->getWidth(), frame->getHeight(),

                static_cast<int>(frame->size()));

 

  server.setContentLength(frame->size());

  server.send(200, "image/jpeg");

  WiFiClient client = server.client();

  frame->writeTo(client);

}

 


 

void handleJpgHi()

{

  if (!esp32cam::Camera.changeResolution(hiRes)) {

    Serial.println("SET-HI-RES FAIL");

  }

  serveJpg();

}

 


 

 

void  setup(){

  Serial.begin(115200);

  Serial.println();

  {

    using namespace esp32cam;

    Config cfg;

    cfg.setPins(pins::AiThinker);

    cfg.setResolution(hiRes);

    cfg.setBufferCount(2);

    cfg.setJpeg(80);

 

    bool ok = Camera.begin(cfg);

    Serial.println(ok ? "CAMERA OK" : "CAMERA FAIL");

  }

  WiFi.persistent(false);

  WiFi.mode(WIFI_STA);

  WiFi.begin(WIFI_SSID, WIFI_PASS);

  while (WiFi.status() != WL_CONNECTED) {

    delay(500);

  }

  Serial.print("http://");

  Serial.println(WiFi.localIP());


  Serial.println("  /cam-hi.jpg");


 

 

  server.on("/cam-hi.jpg", handleJpgHi);


 

  server.begin();

}

 

void loop()

{

  server.handleClient();

}


After uploading the code, disconnect the IO0 pin of the camera from GND. Then press the RST button. The following messages will appear in the Serial Monitor.

Fig: Code successfully uploaded to ESP32-CAM

You have to copy the IP address and paste it into the following part of your Python code.

Fig: Copy-pasting the URL to the Python script

Code breakdown

#include <WebServer.h>

#include <WiFi.h>

#include <esp32cam.h>

  • #include <WebServer.h>: Adds support for creating a lightweight HTTP server.

  • #include <WiFi.h>: Allows the ESP32 to connect to Wi-Fi networks.

  • #include <esp32cam.h>: Provides functions to control the ESP32-CAM module, including camera initialization and capturing images.

const char* WIFI_SSID = "SSID";

const char* WIFI_PASS = "password";

  • WIFI_SSID and WIFI_PASS: Define the SSID and password of the Wi-Fi network that the ESP32 will connect to.

 WebServer server(80);

  • WebServer server(80): Creates an HTTP server instance that listens on port 80 (default HTTP port). 

static auto hiRes = esp32cam::Resolution::find(800, 600);

  • esp32cam::Resolution::find(800, 600): Selects the camera resolution and stores it in hiRes (high resolution, 800x600).

void serveJpg()

{

  auto frame = esp32cam::capture();

  if (frame == nullptr) {

    Serial.println("CAPTURE FAIL");

    server.send(503, "", "");

    return;

  }

  Serial.printf("CAPTURE OK %dx%d %db\n", frame->getWidth(), frame->getHeight(),

                static_cast<int>(frame->size()));

  server.setContentLength(frame->size());

  server.send(200, "image/jpeg");

  WiFiClient client = server.client();

  frame->writeTo(client);

}

  • esp32cam::capture: Captures a frame from the camera.

  • Failure Handling: If no frame is captured, it logs a failure and sends a 503 error response.

  • Logging Success: Prints the resolution and size of the captured image.

  • Serving the Image:

    • Sets the content length and MIME type as image/jpeg.

    • Writes the image data directly to the client.

void handleJpgHi()

{

  if (!esp32cam::Camera.changeResolution(hiRes)) {

    Serial.println("SET-HI-RES FAIL");

  }

  serveJpg();

}

  • handleJpgHi: Switches the camera to high resolution using esp32cam::Camera.changeResolution(hiRes) and calls serveJpg.

  • Error Logging: If the resolution change fails, it logs a failure message to the Serial Monitor.

void  setup(){

  Serial.begin(115200);

  Serial.println();

  {

    using namespace esp32cam;

    Config cfg;

    cfg.setPins(pins::AiThinker);

    cfg.setResolution(hiRes);

    cfg.setBufferCount(2);

    cfg.setJpeg(80);

 

    bool ok = Camera.begin(cfg);

    Serial.println(ok ? "CAMERA OK" : "CAMERA FAIL");

  }

  WiFi.persistent(false);

  WiFi.mode(WIFI_STA);

  WiFi.begin(WIFI_SSID, WIFI_PASS);

  while (WiFi.status() != WL_CONNECTED) {

    delay(500);

  }

  Serial.print("http://");

  Serial.println(WiFi.localIP());

  Serial.println("  /cam-hi.jpg");


 

  server.on("/cam-hi.jpg", handleJpgHi);

 

 

  server.begin();

}


  • Serial Initialization:

    • Initializes the serial port for debugging.

    • Sets baud rate to 115200.

  • Camera Configuration:

    • Sets pins for the AI Thinker ESP32-CAM module.

    • Configures the default resolution, buffer count, and JPEG quality (80).

    • Attempts to initialize the camera and logs the status.

  • Wi-Fi Setup:

    • Connects to the specified Wi-Fi network in station mode.

    • Waits for the connection and logs the device's IP address.

  • Web Server Routes:

    • Maps the URL endpoint /cam-hi.jpg to the handleJpgHi handler.

  • Server Start:

    • Starts the web server.

void loop()

{

  server.handleClient();

}


  • server.handleClient(): Continuously listens for incoming HTTP requests and serves responses based on the defined endpoints.

Summary of Workflow

  1. The ESP32-CAM connects to Wi-Fi and starts a web server.

  2. The URL endpoint (/cam-hi.jpg) lets the client request images at high resolution.

  3. The camera captures an image and serves it to the client as a JPEG.

  4. The system continuously handles new client requests.
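If you want to try the Python side before the board is wired up, the workflow above can be imitated with a small stand-in server. The following is a rough sketch under stated assumptions: FakeCamHandler, start_fake_camera, and the 20-byte FAKE_JPEG payload are invented for this example, and the payload is not a real image, so OpenCV would not decode it. It only mirrors the route, the image/jpeg content type, and the content length that the ESP32-CAM sketch sends.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder payload: JPEG start/end markers around dummy bytes, not a real image.
FAKE_JPEG = b"\xff\xd8" + b"\x00" * 16 + b"\xff\xd9"


class FakeCamHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/cam-hi.jpg":
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "image/jpeg")
        self.send_header("Content-Length", str(len(FAKE_JPEG)))
        self.end_headers()
        self.wfile.write(FAKE_JPEG)

    def log_message(self, *args):
        pass  # keep the console quiet


def start_fake_camera() -> HTTPServer:
    """Start the stand-in server on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), FakeCamHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = start_fake_camera()
url = f"http://127.0.0.1:{server.server_port}/cam-hi.jpg"

# Bypass any system proxy so the request really goes to the local server.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(url, timeout=5) as resp:
    content_type = resp.headers["Content-Type"]
    body = resp.read()

server.shutdown()
server.server_close()
print(content_type, len(body))
```

Swapping FAKE_JPEG for the bytes of a real JPEG file would let the QR script run against this server unchanged, apart from the URL.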

Python code

import cv2

import urllib.request

import numpy as np

import time


url = 'http://192.168.1.101/cam-hi.jpg'


detector = cv2.QRCodeDetector()


scanned_text = None


while True:

    # Fetch frame from the IP camera URL

    img_resp = urllib.request.urlopen(url)

    img_arr = np.array(bytearray(img_resp.read()), dtype=np.uint8)

    frame = cv2.imdecode(img_arr, -1)


    if frame is None:

        continue


    # QR Code detection

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    decoded_text, points, _ = detector.detectAndDecode(gray)


    if not decoded_text:  

        # If normal detection fails, try preprocessing

        enhanced = cv2.GaussianBlur(gray, (5, 5), 0)

        enhanced = cv2.adaptiveThreshold(enhanced, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, 

                                         cv2.THRESH_BINARY, 11, 2)

        decoded_text, points, _ = detector.detectAndDecode(enhanced)


    if points is not None and decoded_text:

        if decoded_text != scanned_text:

            print(f"Decoded: {decoded_text}")

            scanned_text = decoded_text


        # Convert points to integer values and draw the bounding box

        points = points.astype(np.int32)  # cv2.polylines expects 32-bit integer points

        cv2.polylines(frame, [points], isClosed=True, color=(0, 255, 0), thickness=3)


    # Display the frame with QR code detection

    cv2.imshow("QR Scanner", frame)

    

    # Wait for 'q' key to exit the loop

    if cv2.waitKey(1) & 0xFF == ord('q'):

        break


cv2.destroyAllWindows()

Code breakdown


Import Required Libraries

import cv2

import urllib.request

import numpy as np

import time

  • cv2 → OpenCV library for image processing.

  • urllib.request → Fetches the image frame from the ESP32-CAM URL.

  • numpy → Handles image data in arrays.

  • time → (Unused here but often used for timing/debugging).

Define Camera Stream URL

url = 'http://192.168.1.101/cam-hi.jpg'

  • The ESP32-CAM serves a fresh JPEG snapshot at this local address each time it is requested. Replace the IP with the one printed by your board.

  • Ensure that your ESP32-CAM is connected to the same Wi-Fi network.
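Since the IP address is copied by hand from the Serial Monitor, a typo is easy to make. One possible guard, shown here as a small sketch, is to build the URL with a helper that validates the address first. The function name snapshot_url is hypothetical and not part of the original script.

```python
import ipaddress


def snapshot_url(ip: str, path: str = "/cam-hi.jpg") -> str:
    """Build the snapshot URL from the IP printed on the Serial Monitor."""
    address = ipaddress.IPv4Address(ip.strip())  # raises ValueError on a malformed IP
    return f"http://{address}{path}"


url = snapshot_url("192.168.1.101")
print(url)  # http://192.168.1.101/cam-hi.jpg
```

A malformed value such as "192.168.1" raises a ValueError right away instead of failing later inside the fetch loop.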

Initialize the QR Code Detector

detector = cv2.QRCodeDetector()

  • cv2.QRCodeDetector() creates an instance of OpenCV's built-in QR code detector.

Variable to Store Previously Scanned Text

scanned_text = None

  • This stores the last detected QR code text.

  • Used to prevent duplicate prints of the same QR code.


Start the Main Loop

while True:

  • Runs indefinitely to keep fetching frames and detecting QR codes.

Fetch Frame from ESP32-CAM

img_resp = urllib.request.urlopen(url)

img_arr = np.array(bytearray(img_resp.read()), dtype=np.uint8)

frame = cv2.imdecode(img_arr, -1)

  • urllib.request.urlopen(url): Fetches the image as bytes.

  • bytearray(img_resp.read()): Converts the byte stream into an array.

  • np.array(..., dtype=np.uint8): Converts the byte array into a NumPy array (for image processing).

  • cv2.imdecode(img_arr, -1): Decodes the array into an OpenCV image (frame).

Skip Frame If Invalid

if frame is None:

    continue

  • Ensures the loop does not crash if the frame is not properly retrieved.
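Note that this check only covers a response that could not be decoded. If the camera is unreachable, urllib.request.urlopen() raises an exception before the check is reached. A minimal sketch of one way to handle that case follows. The name fetch_bytes and its opener parameter are assumptions made for this example (the parameter exists only so the function can be tried without a camera).

```python
import http.client
import urllib.error
import urllib.request


def fetch_bytes(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return the response body, or None if the camera cannot be reached."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.read()
    except (OSError, http.client.HTTPException) as exc:
        # URLError, timeouts and connection resets are all OSError subclasses.
        print(f"[WARN] fetch failed: {exc}")
        return None
```

In the main loop, a None result could be handled with the same continue statement that is used for an undecodable frame.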

Convert to Grayscale

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

  • Converts the frame to a single-channel grayscale image. QR decoding relies on light/dark contrast rather than color, so nothing needed for detection is lost.

Detect QR Code

decoded_text, points, _ = detector.detectAndDecode(gray)

  • detectAndDecode(gray):

    • Detects QR code in the image.

    • Returns:

      • decoded_text → The text inside the QR code.

      • points → The four corner points of the QR code.

      • _ → The rectified, binarized QR code image (not used here).

If Detection Fails, Try Preprocessing

if not decoded_text:  

    enhanced = cv2.GaussianBlur(gray, (5, 5), 0)

    enhanced = cv2.adaptiveThreshold(enhanced, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, 

                                     cv2.THRESH_BINARY, 11, 2)

    decoded_text, points, _ = detector.detectAndDecode(enhanced)

  • If the first detection attempt fails, the script applies:

    • Gaussian Blur → Reduces noise.

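    • Adaptive Thresholding → Turns the image into pure black and white using a threshold computed for each neighborhood, which helps under uneven lighting.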

  • Then, it retries QR code detection on the enhanced image.

Handle Successful QR Code Detection

if points is not None and decoded_text:

  • If a QR code is successfully detected, process it.

Prevent Repeated Decoding

if decoded_text != scanned_text:

    print(f"Decoded: {decoded_text}")

    scanned_text = decoded_text

  • Ensures the script does not print the same QR code multiple times.
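One side effect of comparing only against the last text is that showing the same code again later prints nothing. A possible variation, sketched below, reports a code again once it has been out of view for a while. The class name RepeatFilter and the 3-second cooldown are assumptions for illustration, not part of the original script.

```python
import time


class RepeatFilter:
    """Report a decoded text if it is new or has been out of view for a while."""

    def __init__(self, cooldown_s=3.0, clock=time.monotonic):
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.last_text = None
        self.last_seen = 0.0

    def should_report(self, text):
        now = self.clock()
        seen_recently = (
            text == self.last_text and now - self.last_seen < self.cooldown_s
        )
        # Every sighting refreshes the timer, so a code held in view stays quiet.
        self.last_text, self.last_seen = text, now
        return not seen_recently
```

In the loop, the comparison with scanned_text would become a call such as "if repeat_filter.should_report(decoded_text): print(...)".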

Draw Bounding Box Around QR Code

points = points.astype(np.int32)  # cv2.polylines expects 32-bit integer points

cv2.polylines(frame, [points], isClosed=True, color=(0, 255, 0), thickness=3)

  • Converts the corner points to 32-bit integers, the type cv2.polylines() expects.

  • Uses cv2.polylines() to draw a green bounding box around the detected QR code.
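If you also want the decoded text to appear on the frame next to the box, you need a pixel position for the label. The helper below is a small sketch that works on plain Python lists so it can be tried without OpenCV. The name label_anchor and the 10-pixel margin are choices made for this example only.

```python
def label_anchor(corners, margin=10):
    """Return an (x, y) position just above the top-left of the QR outline.

    `corners` is a sequence of four (x, y) pairs,
    for example points.reshape(-1, 2).tolist().
    """
    xs = [int(round(x)) for x, _ in corners]
    ys = [int(round(y)) for _, y in corners]
    # Clamp at 0 so the label never lands above the top edge of the frame.
    return min(xs), max(min(ys) - margin, 0)


corners = [(120.4, 80.6), (300.2, 85.1), (295.7, 260.9), (118.3, 255.0)]
print(label_anchor(corners))  # (118, 71)
```

The result can be passed as the position argument of cv2.putText(frame, decoded_text, anchor, cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2).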

Display the Frame

cv2.imshow("QR Scanner", frame)

  • Opens a live OpenCV window displaying the video stream with QR detection.

Quit on 'q' Key Press

if cv2.waitKey(1) & 0xFF == ord('q'):

    break

  • Waits 1 millisecond for a key press.

  • If the user presses 'q', the loop exits.

Cleanup

cv2.destroyAllWindows()

  • Closes all OpenCV windows and frees resources.

Let’s test the setup!

Run the Python code and place your camera in front of a QR code. The QR code will be detected inside a green bounding box. 

Fig: QR code detected


You will see the decoded text printed in the terminal, while the OpenCV window shows the live feed with the bounding box.

Wrapping It Up

And there you have it! We successfully built a real-time QR code scanner using an ESP32-CAM and OpenCV. The script continuously grabs frames from the ESP32-CAM’s live feed, detects QR codes, and even draws a bounding box around them. If the initial detection doesn’t work, it smartly enhances the image to improve accuracy.

This setup can be super handy for things like automated check-ins, inventory tracking, or even smart home projects. But this is just the beginning! You can take it even further by storing scanned QR codes in a database, triggering automated actions based on the scanned data, and expanding it to multiple cameras for larger applications.
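As a starting point for the database idea, here is a minimal sketch that uses SQLite from the Python standard library. The table name scans, its columns, and the helper names open_scan_log and log_scan are all assumptions for this example. The sample URL is a placeholder, not a real scan.

```python
import sqlite3
import time


def open_scan_log(path=":memory:"):
    """Open (or create) a SQLite database with a single `scans` table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scans ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "text TEXT NOT NULL, "
        "scanned_at REAL NOT NULL)"
    )
    return conn


def log_scan(conn, text, scanned_at=None):
    """Insert one decoded QR text together with a Unix timestamp."""
    when = time.time() if scanned_at is None else scanned_at
    conn.execute("INSERT INTO scans (text, scanned_at) VALUES (?, ?)", (text, when))
    conn.commit()


conn = open_scan_log()  # in-memory database; pass a file name to keep the data
log_scan(conn, "https://example.com/item/42")
rows = conn.execute("SELECT text FROM scans").fetchall()
print(rows)
```

In the scanner script, log_scan(conn, decoded_text) could sit right next to the print call that reports a new code.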

With the power of computer vision and the flexibility of the ESP32-CAM, the possibilities are endless. So go ahead, experiment, tweak, and see where you can take it!

Text Recognition from Video Feed using ESP32-CAM

Hello, dear tech enthusiasts! We hope everything is going well for you. Today we're back with another interesting project. Have you ever wondered how useful it would be to have a reader that can pick out text from pictures and videos? Think about a self-driving car that reads road signs accurately and heads in the right direction. Or imagine an AI bot that reads what is written on images uploaded to social media, so that offensive posts can be filtered even when they are in picture format. Or imagine a caregiver robot that reads medicine bottle labels and gives patients their medicines on time. Now you understand how important it is for AI solutions to recognize text, right?

Today, we are going to do the same task in this project. The main component of our project is an ESP32-CAM. We will integrate it with the OpenCV library of Python. The Python code will read text from the video feed and show the text in the output terminal.

Introduction to the ESP32-CAM

The ESP32-CAM is a powerful yet affordable development board that combines the ESP32 microcontroller with an integrated camera module, making it an excellent choice for IoT and vision-based applications. Whether you're building a wireless security camera, a QR code scanner, or an AI-powered image recognition system, the ESP32-CAM provides a compact and cost-effective solution.

One of its standout features is built-in WiFi and Bluetooth connectivity, allowing it to stream video or capture images remotely. Despite its small size, it packs a punch with a dual-core processor, support for microSD card storage, and compatibility with various camera sensors (such as the OV2640). However, since it lacks built-in USB-to-serial functionality, flashing firmware requires an external FTDI adapter.

System Architecture

Overview

This system consists of an ESP32-CAM module capturing images and serving them over a web server. A separate Python-based OpenCV application fetches the images, processes them for Optical Character Recognition (OCR) using EasyOCR, and displays the results.

Components

  1. ESP32-CAM Module

    • Captures images at 800x600 resolution.

    • Hosts a web server on port 80 to serve the images.

    • Connects to a Wi-Fi network as a station.

    • Provides image data when requested via an HTTP GET request.

  2. Python OpenCV & EasyOCR Client

    • Requests images from the ESP32-CAM web server via HTTP GET requests.

    • Decodes the image and preprocesses it (resizing & grayscale conversion).

    • Performs OCR using EasyOCR.

    • Displays the real-time camera feed and extracted text.

Workflow

Step 1: ESP32-CAM Setup & Image Hosting

  1. The ESP32-CAM initializes and configures the camera settings.

  2. It connects to the Wi-Fi network.

  3. It starts an HTTP web server that serves JPEG images via the endpoint http://<ESP32-CAM-IP>/cam-hi.jpg.

  4. When a request is received on /cam-hi.jpg, the ESP32-CAM captures an image and returns it as a response.

Step 2: Image Retrieval and Processing (Python OpenCV)

  1. The Python script continuously fetches images from the ESP32-CAM.

  2. The image is converted from a raw HTTP response into an OpenCV-compatible format.

  3. It is resized to 400x300 for faster processing.

  4. It is converted to grayscale to improve OCR accuracy.

Step 3: OCR and Text Extraction

  1. EasyOCR processes the grayscale image to recognize text.

  2. Detected text is printed to the console.

  3. The processed image feed is displayed using OpenCV.

Step 4: User Interaction

  • The user can view the real-time video feed.

  • The recognized text is displayed in the terminal.

  • The script can be terminated by pressing 'q'.


List of components

Component | Quantity
ESP32-CAM WiFi + Bluetooth Camera Module | 1
FTDI USB to Serial Converter 3V3-5V | 1
Male-to-female jumper wires | 4
Female-to-female jumper wire | 1
MicroUSB data cable | 1

Circuit diagram

The following is the circuit diagram for this project:

Fig: Circuit diagram

ESP32-CAM pin | FTDI USB to Serial Converter pin (voltage selector in the 5V position)
5V | VCC
GND | GND
U0T | RX
U0R | TX
IO0 | GND (on the FTDI or the ESP32-CAM; needed only while uploading)

Programming

If this is your first project with an ESP32 board, you need to install the ESP32 board support first. You will also need to download and install the esp32cam library. To make the camera functional, the CP210x USB driver and the FTDI driver must be properly installed on your computer. Here is a detailed tutorial that shows how to get started with the ESP32-CAM.

ESP32-CAM code

#include <WebServer.h>

#include <WiFi.h>

#include <esp32cam.h>

const char* WIFI_SSID = "SSID";

const char* WIFI_PASS = "password"; 

WebServer server(80);

static auto hiRes = esp32cam::Resolution::find(800, 600);

void serveJpg()

{

  auto frame = esp32cam::capture();

  if (frame == nullptr) {

    Serial.println("CAPTURE FAIL");

    server.send(503, "", "");

    return;

  }

  Serial.printf("CAPTURE OK %dx%d %db\n", frame->getWidth(), frame->getHeight(),

                static_cast<int>(frame->size()));

  server.setContentLength(frame->size());

  server.send(200, "image/jpeg");

  WiFiClient client = server.client();

  frame->writeTo(client);

}

void handleJpgHi()

{

  if (!esp32cam::Camera.changeResolution(hiRes)) {

    Serial.println("SET-HI-RES FAIL");

  }

  serveJpg();

}

void  setup(){

  Serial.begin(115200);

  Serial.println();

  {

    using namespace esp32cam;

    Config cfg;

    cfg.setPins(pins::AiThinker);

    cfg.setResolution(hiRes);

    cfg.setBufferCount(2);

    cfg.setJpeg(80);

    bool ok = Camera.begin(cfg);

    Serial.println(ok ? "CAMERA OK" : "CAMERA FAIL");

  }

  WiFi.persistent(false);

  WiFi.mode(WIFI_STA);

  WiFi.begin(WIFI_SSID, WIFI_PASS);

  while (WiFi.status() != WL_CONNECTED) {

    delay(500);

  }

  Serial.print("http://");

  Serial.println(WiFi.localIP());

  Serial.println("  /cam-hi.jpg");

  server.on("/cam-hi.jpg", handleJpgHi);

  server.begin();

}

void loop()

{

  server.handleClient();

}

After uploading the code, disconnect the IO0 pin of the camera from GND. Then press the RST button. The following messages will appear in the Serial Monitor.

Fig: Code successfully uploaded to ESP32-CAM

You have to copy the IP address and paste it into the following part of your Python code.

Fig: Copy-pasting the URL to the Python script

Code breakdown

#include <WebServer.h>

#include <WiFi.h>

#include <esp32cam.h>

  • #include <WebServer.h>: Adds support for creating a lightweight HTTP server.

  • #include <WiFi.h>: Allows the ESP32 to connect to Wi-Fi networks.

  • #include <esp32cam.h>: Provides functions to control the ESP32-CAM module, including camera initialization and capturing images.

const char* WIFI_SSID = "SSID";

const char* WIFI_PASS = "password";

  • WIFI_SSID and WIFI_PASS: Define the SSID and password of the Wi-Fi network that the ESP32 will connect to.

 WebServer server(80);

  • WebServer server(80): Creates an HTTP server instance that listens on port 80 (default HTTP port). 

static auto hiRes = esp32cam::Resolution::find(800, 600);

  • esp32cam::Resolution::find(800, 600): Selects the camera resolution and stores it in hiRes (high resolution, 800x600).

void serveJpg()

{

  auto frame = esp32cam::capture();

  if (frame == nullptr) {

    Serial.println("CAPTURE FAIL");

    server.send(503, "", "");

    return;

  }

  Serial.printf("CAPTURE OK %dx%d %db\n", frame->getWidth(), frame->getHeight(),

                static_cast<int>(frame->size()));

  server.setContentLength(frame->size());

  server.send(200, "image/jpeg");

  WiFiClient client = server.client();

  frame->writeTo(client);

}

  • esp32cam::capture: Captures a frame from the camera.

  • Failure Handling: If no frame is captured, it logs a failure and sends a 503 error response.

  • Logging Success: Prints the resolution and size of the captured image.

  • Serving the Image:

    • Sets the content length and MIME type as image/jpeg.

    • Writes the image data directly to the client.

void handleJpgHi()

{

  if (!esp32cam::Camera.changeResolution(hiRes)) {

    Serial.println("SET-HI-RES FAIL");

  }

  serveJpg();

}

  • handleJpgHi: Switches the camera to high resolution using esp32cam::Camera.changeResolution(hiRes) and calls serveJpg.

  • Error Logging: If the resolution change fails, it logs a failure message to the Serial Monitor.

void  setup(){

  Serial.begin(115200);

  Serial.println();

  {

    using namespace esp32cam;

    Config cfg;

    cfg.setPins(pins::AiThinker);

    cfg.setResolution(hiRes);

    cfg.setBufferCount(2);

    cfg.setJpeg(80);

 

    bool ok = Camera.begin(cfg);

    Serial.println(ok ? "CAMERA OK" : "CAMERA FAIL");

  }

  WiFi.persistent(false);

  WiFi.mode(WIFI_STA);

  WiFi.begin(WIFI_SSID, WIFI_PASS);

  while (WiFi.status() != WL_CONNECTED) {

    delay(500);

  }

  Serial.print("http://");

  Serial.println(WiFi.localIP());

  Serial.println("  /cam-hi.jpg");

  server.on("/cam-hi.jpg", handleJpgHi);

  server.begin();

}

  • Serial Initialization:

    • Initializes the serial port for debugging.

    • Sets baud rate to 115200.

  • Camera Configuration:

    • Sets pins for the AI Thinker ESP32-CAM module.

    • Configures the default resolution, buffer count, and JPEG quality (80).

    • Attempts to initialize the camera and logs the status.

  • Wi-Fi Setup:

    • Connects to the specified Wi-Fi network in station mode.

    • Waits for the connection and logs the device's IP address.

  • Web Server Routes:

    • Maps the URL endpoint /cam-hi.jpg to the handleJpgHi handler.

  • Server Start:

    • Starts the web server.

void loop()

{

  server.handleClient();

}

  • server.handleClient(): Continuously listens for incoming HTTP requests and serves responses based on the defined endpoints.

Summary of Workflow

  1. The ESP32-CAM connects to Wi-Fi and starts a web server.

  2. The URL endpoint (/cam-hi.jpg) lets the client request images at high resolution.

  3. The camera captures an image and serves it to the client as a JPEG.

  4. The system continuously handles new client requests.


Python code

import cv2

import requests

import numpy as np

import easyocr

import time


# Replace with your ESP32-CAM IP

ESP32_CAM_URL = "http://192.168.1.101/cam-hi.jpg"


# Initialize EasyOCR reader

reader = easyocr.Reader(['en'], gpu=False)


def capture_image():

    """ Captures an image from the ESP32-CAM """

    try:

        start_time = time.time()

        response = requests.get(ESP32_CAM_URL, timeout=2)  # Reduced timeout for faster response

        if response.status_code == 200:

            img_arr = np.frombuffer(response.content, np.uint8)

            img = cv2.imdecode(img_arr, cv2.IMREAD_COLOR)

            print(f"[INFO] Image received in {time.time() - start_time:.2f} seconds")

            return img

        else:

            print("[Error] Failed to get image from ESP32-CAM.")

            return None

    except Exception as e:

        print(f"[Error] {e}")

        return None


print("[INFO] Starting text recognition...")


while True:

    frame = capture_image()

    if frame is None:

        continue  # Skip this iteration if the image wasn't retrieved


    # Resize image for faster processing

    frame_resized = cv2.resize(frame, (400, 300))


    # Convert to grayscale (better OCR accuracy)

    gray = cv2.cvtColor(frame_resized, cv2.COLOR_BGR2GRAY)


    # Process image with EasyOCR

    start_time = time.time()

    results = reader.readtext(gray, detail=0, paragraph=True)

    print(f"[INFO] OCR processed in {time.time() - start_time:.2f} seconds")


    if results:

        detected_text = " ".join(results)

        print(f"[INFO] Recognized Text: {detected_text}")


    # Display the image feed

    cv2.imshow("ESP32-CAM Feed", frame_resized)


    # Press 'q' to exit the loop

    if cv2.waitKey(1) & 0xFF == ord('q'):

        break


# Cleanup

cv2.destroyAllWindows()

Code Breakdown: ESP32-CAM Text Recognition Using EasyOCR

This Python script captures images from an ESP32-CAM, processes them, and extracts text using EasyOCR. Below is a detailed breakdown of each part of the code.


Importing Required Libraries


import cv2         # OpenCV for image processing and display

import requests    # To send HTTP requests to the ESP32-CAM

import numpy as np # NumPy for handling image arrays

import easyocr     # EasyOCR for text recognition

import time        # For measuring performance time

  • cv2 (OpenCV) → Used for decoding, processing, and displaying images.

  • requests → Fetches the image from the ESP32-CAM.

  • numpy → Converts the image data into a format usable by OpenCV.

  • easyocr → Runs Optical Character Recognition (OCR) on the image.

  • time → Measures execution time for optimization.


Define ESP32-CAM IP Address

ESP32_CAM_URL = "http://192.168.1.101/cam-hi.jpg"

  • The ESP32-CAM hosts an image at this URL.

  • Ensure your ESP32-CAM and PC are on the same network.


 Initialize EasyOCR


reader = easyocr.Reader(['en'], gpu=False)


  • EasyOCR is initialized with English ('en') as the recognition language.

  • gpu=False ensures it runs on the CPU (Set gpu=True if using a GPU for faster processing).


Function to Capture Image from ESP32-CAM

def capture_image():

    """ Captures an image from the ESP32-CAM """

    try:

        start_time = time.time()

        response = requests.get(ESP32_CAM_URL, timeout=2)  # Reduced timeout for faster response

  • Sends an HTTP GET request to fetch an image.

  • timeout=2 → Gives up if the camera does not respond within 2 seconds, so a stalled request cannot freeze the loop.

        if response.status_code == 200:

            img_arr = np.frombuffer(response.content, np.uint8)

            img = cv2.imdecode(img_arr, cv2.IMREAD_COLOR)

            print(f"[INFO] Image received in {time.time() - start_time:.2f} seconds")

            return img

  • If HTTP response is successful (200 OK): 

    • Convert raw binary data (response.content) into a NumPy array.

    • Use cv2.imdecode() to convert it into an OpenCV image.

    • Print how long the image retrieval took.

    • Return the image.

        else:

            print("[Error] Failed to get image from ESP32-CAM.")

            return None

  • If the ESP32-CAM answers with any status other than 200 (for example, 503 after a failed capture), it prints an error message and returns None.

    except Exception as e:

        print(f"[Error] {e}")

        return None

  • Handles connection errors (e.g., ESP32-CAM offline, network issues).
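When the camera stays offline, the main loop retries straight away and fills the terminal with error messages. One common remedy is to wait a little longer after each failure in a row. The sketch below shows the delay calculation only. The name retry_delay and the 0.5 s / 8 s values are assumptions chosen for illustration.

```python
def retry_delay(consecutive_failures, base_s=0.5, cap_s=8.0):
    """Seconds to wait before the next request after N failures in a row."""
    if consecutive_failures <= 0:
        return 0.0
    # Double the wait with every extra failure, but never exceed the cap.
    return min(cap_s, base_s * 2 ** (consecutive_failures - 1))


for failures in range(1, 7):
    print(failures, retry_delay(failures))
```

In the loop, you would count failures, call time.sleep(retry_delay(failures)) before the continue statement, and reset the counter to 0 after a successful capture.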


Start Text Recognition

print("[INFO] Starting text recognition...")

  • Logs a message when the program starts.

Main Loop: Capturing & Processing Images

while True:

    frame = capture_image()

    if frame is None:

        continue  # Skip this iteration if the image wasn't retrieved

  • Continuously fetch images from ESP32-CAM.

  • If None (failed to capture), skip processing and retry.
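As written, the loop retries immediately and forever when a capture fails, which can flood the ESP32-CAM with requests while it is rebooting or reconnecting. One possible refinement, not part of the original script, is to wait a little longer after each failure. The sketch below assumes a hypothetical helper fetch_with_backoff that wraps any zero-argument function such as capture_image; the attempt count and delays are arbitrary starting values.

```python
import time


def fetch_with_backoff(fetch, max_attempts=5, base_delay=0.2, sleep=time.sleep):
    """Call fetch() until it returns a non-None value or attempts run out.

    fetch is any zero-argument callable, for example capture_image.
    The pause doubles after each failure: base_delay, 2 * base_delay, ...
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        result = fetch()
        if result is not None:
            return result
        if attempt < max_attempts:
            sleep(delay)  # wait before the next try
            delay *= 2
    return None  # every attempt failed


# Demo with a stand-in for capture_image that fails twice, then succeeds.
attempts = []


def fake_capture():
    attempts.append(1)
    return "frame" if len(attempts) == 3 else None


print(fetch_with_backoff(fake_capture, base_delay=0.01))
```

Inside the main loop, frame = fetch_with_backoff(capture_image) would replace the direct call, and a None result could then end the program instead of spinning.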


Resize & Convert the Image to Grayscale


    # Resize image for faster processing

    frame_resized = cv2.resize(frame, (400, 300))


    # Convert to grayscale (better OCR accuracy)

    gray = cv2.cvtColor(frame_resized, cv2.COLOR_BGR2GRAY)

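  • Resizing to (400, 300) → Speeds up OCR by reducing the number of pixels to process. Very small text may become unreadable at this size, so use a larger size if recognition suffers.

  • Converting to grayscale → Reduces the image to a single channel. OCR relies mostly on contrast rather than colour, so this often helps and rarely hurts.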
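The fixed (400, 300) target keeps a 4:3 aspect ratio. If the camera resolution is ever changed to a different ratio, a fixed size would stretch the text. A small sketch of computing the target size instead is shown below; fit_within is a hypothetical helper, and the 400x300 box is simply the size used in this tutorial.

```python
def fit_within(width, height, max_width=400, max_height=300):
    """Scale (width, height) down to fit the box, keeping the aspect ratio."""
    scale = min(max_width / width, max_height / height, 1.0)  # never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(800, 600))   # (400, 300)
print(fit_within(1280, 720))  # (400, 225)
print(fit_within(320, 240))   # (320, 240), already small enough
```

The returned tuple can be passed straight to cv2.resize(frame, size), where frame.shape[1] is the width and frame.shape[0] is the height.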

Perform OCR (Text Recognition)

    start_time = time.time()

    results = reader.readtext(gray, detail=0, paragraph=True)

    print(f"[INFO] OCR processed in {time.time() - start_time:.2f} seconds")

  • Calls reader.readtext(gray, detail=0, paragraph=True). 

    • detail=0 → Returns only the recognized text.

    • paragraph=True → Merges nearby text boxes into paragraph-level results.

  • Logs how long OCR processing takes.

    if results:

        detected_text = " ".join(results)

        print(f"[INFO] Recognized Text: {detected_text}")

  • If text is detected, print the recognized text.
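With detail=0 the script prints everything EasyOCR finds, including low-quality guesses. If noisy output becomes a problem, one option is to call readtext with its default detail=1 and paragraph=False, which (to the best of our knowledge) yields one (bounding box, text, confidence) entry per detection, and keep only the confident ones. The sketch below works on hand-written sample data of that shape; keep_confident and the 0.5 threshold are illustrative assumptions.

```python
def keep_confident(results, min_confidence=0.5):
    """Return the text of entries whose confidence is at least min_confidence.

    results is assumed to be a list of (bbox, text, confidence) entries,
    the shape EasyOCR returns with detail=1 and paragraph=False.
    """
    return [text for _bbox, text, conf in results if conf >= min_confidence]


# Hand-written sample standing in for reader.readtext(gray) output
sample = [
    ([[0, 0], [50, 0], [50, 20], [0, 20]], "HELLO", 0.93),
    ([[0, 30], [40, 30], [40, 50], [0, 50]], "w0r1d", 0.21),
]
print(" ".join(keep_confident(sample)))  # HELLO
```

Raising the threshold trades missed text for fewer false readings, so it is worth tuning against your own images.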

Display the Image (Optional)

    cv2.imshow("ESP32-CAM Feed", frame_resized)

  • Opens a real-time preview window of the ESP32-CAM feed.

    # Press 'q' to exit the loop

    if cv2.waitKey(1) & 0xFF == ord('q'):

        break

  • Press 'q' to exit the loop and stop the program.

Cleanup


cv2.destroyAllWindows()

  • Closes all OpenCV windows when the program exits.

Setting Up Python Environment

Install Dependencies:

Create a virtual environment:
python -m venv ocr_env  

source ocr_env/bin/activate  # Linux/Mac  

ocr_env\Scripts\activate   # Windows  

Install required libraries:

pip install opencv-python numpy easyocr requests

After setting up the Python environment, run the Python code to capture images from the ESP32-CAM and perform text recognition using EasyOCR.

Let’s test the setup!

Run the Python code and point the camera at some printed text. The script will detect the text.

Fig: Sample

The recognized text is printed in the script's output.

Fig: Detected text shown


Wrapping It Up

Congratulations! You've successfully built a real-time OCR system using ESP32-CAM and Python. With this setup, your ESP32-CAM captures images and serves them over HTTP to your Python script, where OpenCV and EasyOCR extract text from the visuals. Whether you're automating data entry, reading license plates, or enhancing accessibility, this project lays the foundation for countless applications.

Now that you have it running, why not take it a step further? You could improve accuracy with better lighting, add pre-processing filters, or even integrate the results into a database or web dashboard. The possibilities are endless!

If you run into any issues or have ideas for improvements, feel free to experiment, tweak the code, and keep learning. Happy coding!

ESP32-CAM based RGB Color Identifier

Hello friends. We hope you are doing fine. The world is full of colours, isn't it? We humans can see and differentiate colours very easily, but teaching robots and AI apps about colours is a real challenge. With the advancement of computer vision and embedded systems, this task has become easier than before. Today, we are going to make an RGB colour identifier using the ESP32-CAM. This project combines the power of OpenCV with the ESP32-CAM module to create a simple but effective system for detecting and tracking basic colours in real time.

System Architecture 

1. Overview

This system consists of an ESP32-CAM module acting as a live-streaming camera server and a Python-based computer vision application running on a remote computer. The Python application fetches images from the ESP32-CAM, processes them using OpenCV, and detects objects of specific colours (red, green, and blue) based on HSV filtering.

2. System Components

A. Hardware Components

  1. ESP32-CAM (AI Thinker module)

    • Captures images in JPEG format.

    • Streams images over WiFi using a built-in web server.

  2. WiFi Router/Network

    • Connects ESP32-CAM and the processing computer.

  3. Processing Computer (Laptop/Desktop/Raspberry Pi)

    • Runs Python with OpenCV to process images from ESP32-CAM.

    • Performs colour detection and contour analysis.

B. Software Components

  1. ESP32-CAM Firmware (Arduino Code)

    • Uses the esp32cam library for camera control.

    • Uses WiFi.h for network connectivity.

    • Uses WebServer.h to create an HTTP server.

    • Captures and serves images at http://<ESP32-CAM-IP>/cam-hi.jpg.

  2. Python OpenCV Script (Color Detection Algorithm)

    • Fetches images from ESP32-CAM via urllib.request.

    • Converts images to HSV format for color-based segmentation.

    • Detects red, green, and blue objects using defined HSV thresholds.

    • Draws bounding contours and labels detected colours.

    • Displays processed video frames with detected objects.

3. Data Flow

Step 1: ESP32-CAM Initialization

  • ESP32-CAM connects to WiFi.

  • Sets up a web server to serve captured images at http://<ESP32-CAM-IP>/cam-hi.jpg.

Step 2: Image Capture and Streaming

  • The camera captures images in JPEG format (800x600 resolution).

  • Stores and serves the latest frame via an HTTP endpoint.

Step 3: Python Application Fetches Image

  • The Python script sends a request to ESP32-CAM to get the latest image frame.

  • The image is received in JPEG format and decoded using OpenCV.

Step 4: Color Detection Processing

  • Converts the image from BGR to HSV.

  • Applies thresholding masks to detect red, green, and blue objects.

  • Extracts contours of detected objects.

  • Filters out small objects using an area threshold (>2000 pixels).

  • Computes the centroid of detected objects.

  • Draws bounding contours and labels detected objects.

Step 5: Displaying Processed Image

  • Shows the original frame with detected objects and labels.

  • Pressing 'q' stops execution and closes all OpenCV windows.

List of components

Components | Quantity
ESP32-CAM WiFi + Bluetooth Camera Module | 1
FTDI USB to Serial Converter 3V3-5V | 1
Male-to-female jumper wires | 4
Female-to-female jumper wire | 1
MicroUSB data cable | 1

Circuit diagram

The following is the circuit diagram for this project:

Fig: Circuit diagram

ESP32-CAM WiFi + Bluetooth Camera Module | FTDI USB to Serial Converter 3V3-5V (Voltage selection button should be in 5V position)
5V | VCC
GND | GND
U0T | RX
U0R | TX
IO0 | GND (FTDI or ESP32-CAM)

Programming

Board installation

If this is your first project with any board of the ESP32 series, you need to install the board package first. You may also need to install the CP210x USB driver. If ESP32 boards are already installed in your Arduino IDE, you can skip this section. Go to File > Preferences, paste https://dl.espressif.com/dl/package_esp32_index.json into the Additional Boards Manager URLs field, and click OK.

Fig: Board Installation

  • Go to Tools > Board > Boards Manager and install the ESP32 boards.

Fig: Board Installation

Install the ESP32-CAM library.

  • Download the ESP32-CAM library from GitHub (the link is given in the reference section). Then install it via Sketch > Include Library > Add .ZIP Library.

Now browse to the location of the downloaded library, select it, and click Open.

Board selection and code uploading

Connect the camera board to your computer. Some camera boards come with a micro USB connector of their own, so you can connect them directly with a micro USB data cable. If the board has no connector, wire it to the FTDI module and connect the FTDI module to the computer with the data cable. If you have never used an FTDI board on this computer, you will need to install the FTDI driver first.

  • After connecting the camera, go to Tools > Board > esp32 > AI Thinker ESP32-CAM.

Fig: Camera board selection

After selecting the board, select the appropriate COM port and upload the following code:

#include <WebServer.h>

#include <WiFi.h>

#include <esp32cam.h>

const char* WIFI_SSID = "SSID";

const char* WIFI_PASS = "password";

WebServer server(80);

static auto hiRes = esp32cam::Resolution::find(800, 600);

void serveJpg()

{

  auto frame = esp32cam::capture();

  if (frame == nullptr) {

    Serial.println("CAPTURE FAIL");

    server.send(503, "", "");

    return;

  }

  Serial.printf("CAPTURE OK %dx%d %db\n", frame->getWidth(), frame->getHeight(),

                static_cast<int>(frame->size()));

  server.setContentLength(frame->size());

  server.send(200, "image/jpeg");

  WiFiClient client = server.client();

  frame->writeTo(client);

}

void handleJpgHi()

{

  if (!esp32cam::Camera.changeResolution(hiRes)) {

    Serial.println("SET-HI-RES FAIL");

  }

  serveJpg();

}

void  setup(){

  Serial.begin(115200);

  Serial.println();

  {

    using namespace esp32cam;

    Config cfg;

    cfg.setPins(pins::AiThinker);

    cfg.setResolution(hiRes);

    cfg.setBufferCount(2);

    cfg.setJpeg(80);

 

    bool ok = Camera.begin(cfg);

    Serial.println(ok ? "CAMERA OK" : "CAMERA FAIL");

  }

  WiFi.persistent(false);

  WiFi.mode(WIFI_STA);

  WiFi.begin(WIFI_SSID, WIFI_PASS);

  while (WiFi.status() != WL_CONNECTED) {

    delay(500);

  }

  Serial.print("http://");

  Serial.println(WiFi.localIP());

  Serial.println("  /cam-hi.jpg"); 

  server.on("/cam-hi.jpg", handleJpgHi); 

  server.begin();

}

 

void loop()

{

  server.handleClient();

}



After uploading the code, disconnect the IO0 pin of the camera from GND. Then press the RST button. The following messages will appear in the Serial Monitor.

Fig: Code successfully uploaded to ESP32-CAM

You have to copy the IP address and paste it into the following part of your Python code.
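A typo in the pasted address is one of the most common reasons the script shows no image. As a small safeguard, the URL can be built from the IP address and checked before use. The sketch below is only an illustration: build_snapshot_url is a hypothetical helper, and 192.168.1.108 is the example address used later in this tutorial, so replace it with the one printed for your board.

```python
import ipaddress


def build_snapshot_url(ip, path="/cam-hi.jpg"):
    """Build the snapshot URL from the IP printed on the Serial Monitor.

    Raises ValueError if ip is not a valid IPv4 address.
    """
    address = ipaddress.IPv4Address(ip.strip())
    if not path.startswith("/"):
        path = "/" + path
    return f"http://{address}{path}"


print(build_snapshot_url("192.168.1.108"))  # http://192.168.1.108/cam-hi.jpg
```

The result can be assigned to the url variable of the Python script.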

Python code

Main python script 

Copy the following Python code and save it as a .py file.

import cv2

import urllib.request

import numpy as np

def nothing(x):

    pass

url = 'http://192.168.1.108/cam-hi.jpg'

cv2.namedWindow("live transmission", cv2.WINDOW_AUTOSIZE)

# Red, Green, and Blue HSV ranges

red_lower1 = np.array([0, 120, 70])

red_upper1 = np.array([10, 255, 255])

red_lower2 = np.array([170, 120, 70])

red_upper2 = np.array([180, 255, 255])

green_lower = np.array([40, 70, 70])

green_upper = np.array([80, 255, 255])

blue_lower = np.array([90, 70, 70])

blue_upper = np.array([130, 255, 255])

while True:

    img_resp = urllib.request.urlopen(url)

    imgnp = np.array(bytearray(img_resp.read()), dtype=np.uint8)

    frame = cv2.imdecode(imgnp, -1)

    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # Create masks for Red, Green, and Blue

    mask_red1 = cv2.inRange(hsv, red_lower1, red_upper1)

    mask_red2 = cv2.inRange(hsv, red_lower2, red_upper2)

    mask_red = cv2.bitwise_or(mask_red1, mask_red2)

    mask_green = cv2.inRange(hsv, green_lower, green_upper)

    mask_blue = cv2.inRange(hsv, blue_lower, blue_upper)

    # Find contours for each color independently

    for color, mask, lower, upper in [("red", mask_red, red_lower1, red_upper1), 

                                      ("green", mask_green, green_lower, green_upper),

                                      ("blue", mask_blue, blue_lower, blue_upper)]:

        cnts, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

        for c in cnts:

            area = cv2.contourArea(c)

            if area > 2000:  # Only consider large contours

                # Get contour center

                M = cv2.moments(c)

                if M["m00"] == 0:  # Avoid division by zero

                    continue  # Skip degenerate contours that have no centroid

                cx = int(M["m10"] / M["m00"])

                cy = int(M["m01"] / M["m00"])

                # Draw contours and color label

                cv2.drawContours(frame, [c], -1, (255, 0, 0), 3)  # Draw contour in blue

                cv2.circle(frame, (cx, cy), 7, (255, 255, 255), -1)  # Draw center circle

                cv2.putText(frame, color, (cx - 20, cy - 20), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)

    res = cv2.bitwise_and(frame, frame, mask=mask_red)  # Show result with red mask

    cv2.imshow("live transmission", frame)

    cv2.imshow("res", res)

    key = cv2.waitKey(5)

    if key == ord('q'):

        break

cv2.destroyAllWindows()

Setting Up Python Environment

Install Dependencies:

1) Create a virtual environment:
python -m venv venv

source venv/bin/activate  # Linux/Mac

venv\Scripts\activate   # Windows

2) Install required libraries:

pip install opencv-python numpy

urllib.request ships with the Python standard library, so it needs no separate installation (the urllib3 package is a different library and is not used by this script).

After setting up the Python environment, run the Python code.
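Before running the main script, it can save time to confirm that the virtual environment really contains the required packages. The following is a minimal sketch using only the standard library; missing_modules is a hypothetical helper name. Note that the opencv-python package is imported under the name cv2.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# cv2 is provided by opencv-python; urllib is part of the standard library.
missing = missing_modules(["cv2", "numpy", "urllib"])
print("Missing modules:", ", ".join(missing) if missing else "none")
```

If anything is listed as missing, activate the virtual environment again and repeat the pip install step.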

ESP32-CAM code breakdown

#include <WebServer.h>

#include <WiFi.h>

#include <esp32cam.h>


  • #include <WebServer.h>: Adds support for creating a lightweight HTTP server.

  • #include <WiFi.h>: Allows the ESP32 to connect to Wi-Fi networks.

  • #include <esp32cam.h>: Provides functions to control the ESP32-CAM module, including camera initialization and capturing images.

 

const char* WIFI_SSID = "SSID";

const char* WIFI_PASS = "password";

 


  • WIFI_SSID and WIFI_PASS: Define the SSID and password of the Wi-Fi network that the ESP32 will connect to.

 WebServer server(80);


  • WebServer server(80): Creates an HTTP server instance that listens on port 80 (default HTTP port).

 


static auto hiRes = esp32cam::Resolution::find(800, 600);


esp32cam::Resolution::find: Looks up a camera resolution for the requested width and height:

  • hiRes: High resolution (800x600).

void serveJpg()

{

  auto frame = esp32cam::capture();

  if (frame == nullptr) {

    Serial.println("CAPTURE FAIL");

    server.send(503, "", "");

    return;

  }

  Serial.printf("CAPTURE OK %dx%d %db\n", frame->getWidth(), frame->getHeight(),

                static_cast<int>(frame->size()));

 

  server.setContentLength(frame->size());

  server.send(200, "image/jpeg");

  WiFiClient client = server.client();

  frame->writeTo(client);

}

 

 


  • esp32cam::capture: Captures a frame from the camera.

  • Failure Handling: If no frame is captured, it logs a failure and sends a 503 error response.

  • Logging Success: Prints the resolution and size of the captured image.

  • Serving the Image:

    • Sets the content length and MIME type as image/jpeg.

    • Writes the image data directly to the client.

void handleJpgHi()

{

  if (!esp32cam::Camera.changeResolution(hiRes)) {

    Serial.println("SET-HI-RES FAIL");

  }

  serveJpg();

}

 


  • handleJpgHi: Switches the camera to high resolution using esp32cam::Camera.changeResolution(hiRes) and calls serveJpg.

  • Error Logging: If the resolution change fails, it logs a failure message to the Serial Monitor.

void  setup(){

  Serial.begin(115200);

  Serial.println();

  {

    using namespace esp32cam;

    Config cfg;

    cfg.setPins(pins::AiThinker);

    cfg.setResolution(hiRes);

    cfg.setBufferCount(2);

    cfg.setJpeg(80);

 

    bool ok = Camera.begin(cfg);

    Serial.println(ok ? "CAMERA OK" : "CAMERA FAIL");

  }

  WiFi.persistent(false);

  WiFi.mode(WIFI_STA);

  WiFi.begin(WIFI_SSID, WIFI_PASS);

  while (WiFi.status() != WL_CONNECTED) {

    delay(500);

  }

  Serial.print("http://");

  Serial.println(WiFi.localIP());

  Serial.println("  /cam-hi.jpg");


 

  server.on("/cam-hi.jpg", handleJpgHi);

 

 

  server.begin();

}


Serial Initialization:

  • Initializes the serial port for debugging.

  • Sets baud rate to 115200.

Camera Configuration:

  • Sets pins for the AI Thinker ESP32-CAM module.

  • Configures the default resolution, buffer count, and JPEG quality (80%).

  • Attempts to initialize the camera and logs the status.

Wi-Fi Setup:

  • Connects to the specified Wi-Fi network in station mode.

  • Waits for the connection and logs the device's IP address.

Web Server Routes:

  • Maps the URL endpoint /cam-hi.jpg to the handleJpgHi handler.

Server Start:

  • Starts the web server.

void loop()

{

  server.handleClient();

}


  • server.handleClient(): Continuously listens for incoming HTTP requests and serves responses based on the defined endpoints.

Summary of Workflow

  1. The ESP32-CAM connects to Wi-Fi and starts a web server.

  2. The URL endpoint (/cam-hi.jpg) lets the user request images at high resolution.

  3. The camera captures an image and serves it to the client as a JPEG.

  4. The system continuously handles new client requests.
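If the board is not at hand, the request and response cycle summarized above can be imitated on your own computer. The sketch below is a stand-in written with Python's standard library, not the firmware itself: it serves a placeholder body at /cam-hi.jpg with the same image/jpeg content type. FakeCamHandler, start_fake_cam, fetch, and FAKE_JPEG are all hypothetical names, and the four-byte payload only carries the JPEG start and end markers, so it is not a decodable picture.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder payload: JPEG start marker + end marker, not a real picture.
FAKE_JPEG = b"\xff\xd8\xff\xd9"


class FakeCamHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/cam-hi.jpg":
            self.send_response(200)
            self.send_header("Content-Type", "image/jpeg")
            self.send_header("Content-Length", str(len(FAKE_JPEG)))
            self.end_headers()
            self.wfile.write(FAKE_JPEG)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_fake_cam(port=0):
    """Start the stand-in server on localhost; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), FakeCamHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(url, timeout=2):
    """Fetch url directly (ignoring any system proxy settings)."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=timeout) as resp:
        return resp.status, resp.headers.get("Content-Type"), resp.read()
```

Calling server = start_fake_cam() and then fetch(f"http://127.0.0.1:{server.server_address[1]}/cam-hi.jpg") returns the status, content type, and body; server.shutdown() stops the stand-in again.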


Python code breakdown

This code captures images from a live video stream over the network, processes them to detect red, green, and blue regions, and highlights these regions on the video feed.


Imports

cv2 (OpenCV):

  • Used for image and video processing, including reading, decoding, and displaying images.

urllib.request:

  • Handles HTTP requests to fetch the video feed from the given URL.

numpy:

  • Handles array operations, which are used for creating HSV ranges and masks.

Function Definition

nothing(x)

  • Purpose: A placeholder function that does nothing. Typically used for trackbar callbacks in OpenCV.

  • Usage in Code: It's defined but not used in this snippet.


Global Variables

url:

  • Stores the URL of the ESP32-CAM image endpoint (http://192.168.1.108/cam-hi.jpg).

Colour Ranges:

  • Red: Two HSV ranges, because red wraps around both ends of the hue axis (hue 0–10 and 170–180). OpenCV stores 8-bit hue as 0–179, where each unit equals 2 degrees.

  • Green: HSV range for green (hue 40–80).

  • Blue: HSV range for blue (hue 90–130).
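To get a feel for these ranges without a camera, a single RGB value can be converted to OpenCV's 8-bit HSV scale (hue 0–179, saturation and value 0–255) and tested against them. The sketch below uses Python's colorsys module as an approximation of OpenCV's conversion, so results may differ by a unit from cv2.cvtColor; rgb_to_opencv_hsv and classify are hypothetical helpers, while the thresholds mirror the arrays in the script.

```python
import colorsys

# (name, hue_low, hue_high) on OpenCV's 0-179 hue scale, as in the script
HUE_RANGES = [("red", 0, 10), ("red", 170, 180), ("green", 40, 80), ("blue", 90, 130)]
MIN_SAT = {"red": 120, "green": 70, "blue": 70}  # lower saturation bounds
MIN_VAL = 70                                     # lower value bound for all three


def rgb_to_opencv_hsv(r, g, b):
    """Convert 8-bit RGB to approximate OpenCV HSV: H 0-179, S and V 0-255."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 180) % 180, round(s * 255), round(v * 255)


def classify(r, g, b):
    """Return 'red', 'green', 'blue', or None for a single RGB colour."""
    h, s, v = rgb_to_opencv_hsv(r, g, b)
    for name, low, high in HUE_RANGES:
        if low <= h <= high and s >= MIN_SAT[name] and v >= MIN_VAL:
            return name
    return None


print(rgb_to_opencv_hsv(0, 255, 0))  # (60, 255, 255)
print(classify(200, 10, 30))         # red (hue 177, in the upper red range)
print(classify(128, 128, 128))       # None (grey has no saturation)
```

The second example shows why two red ranges are needed: a slightly bluish red lands near hue 177 rather than near 0.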

Window Initialization

cv2.namedWindow

  • Creates a window named "live transmission" for displaying the processed video feed.

  • cv2.WINDOW_AUTOSIZE: Ensures the window size adjusts automatically based on the image size.


Main Loop (while True)

Fetch Image:

img_resp = urllib.request.urlopen(url)

imgnp = np.array(bytearray(img_resp.read()), dtype=np.uint8)

frame = cv2.imdecode(imgnp, -1)


  • urllib.request.urlopen(url): Opens the URL and fetches the image bytes.

  • bytearray(img_resp.read()): Converts the response data to a byte array.

  • np.array(..., dtype=np.uint8): Converts the byte array into a NumPy array.

  • cv2.imdecode(imgnp, -1): Decodes the NumPy array into an image (frame).
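Note that the fetch step above has no timeout and no error handling, so a single dropped request stops the whole script with an exception. A slightly more defensive variant could look like the sketch below; fetch_frame_bytes is a hypothetical helper and the 2-second timeout is an arbitrary choice.

```python
import urllib.request


def fetch_frame_bytes(url, timeout=2.0):
    """Return the raw response body, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except (OSError, ValueError) as exc:
        # URLError and socket timeouts derive from OSError;
        # a malformed URL raises ValueError.
        print(f"[Error] {exc}")
        return None


# A malformed URL is reported and yields None instead of crashing.
print(fetch_frame_bytes("not-a-url"))
```

In the main loop, a None result can simply be skipped with continue before the bytes are handed to np.array() and cv2.imdecode().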

Convert to HSV:

hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)


  • Converts the image from BGR to HSV color space, which makes color detection easier.

Create Color Masks:

mask_red1 = cv2.inRange(hsv, red_lower1, red_upper1)

mask_red2 = cv2.inRange(hsv, red_lower2, red_upper2)

mask_red = cv2.bitwise_or(mask_red1, mask_red2)

mask_green = cv2.inRange(hsv, green_lower, green_upper)

mask_blue = cv2.inRange(hsv, blue_lower, blue_upper)


  • cv2.inRange(hsv, lower, upper): Creates a binary mask where pixels in the HSV range are white (255) and others are black (0).

  • Combines two masks for red (since red spans two HSV ranges).

  • Creates masks for green and blue.

Find and Process Contours:

cnts, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)


  • cv2.findContours:

    • Finds contours (boundaries of white regions) in the binary mask.

    • cv2.RETR_TREE: Retrieves all contours and reconstructs a full hierarchy.

    • cv2.CHAIN_APPROX_SIMPLE: Compresses horizontal, vertical, and diagonal segments to save memory.

Contour Processing:

for c in cnts:

    area = cv2.contourArea(c)

    if area > 2000:  # Only consider large contours

        M = cv2.moments(c)

        if M["m00"] == 0:  # Avoid division by zero

            continue

        cx = int(M["m10"] / M["m00"])

        cy = int(M["m01"] / M["m00"])

        cv2.drawContours(frame, [c], -1, (255, 0, 0), 3)

        cv2.circle(frame, (cx, cy), 7, (255, 255, 255), -1)

        cv2.putText(frame, color, (cx - 20, cy - 20), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)

  • cv2.contourArea(c): Calculates the area of the contour.

  • Threshold: Only processes contours with an area > 2000 to ignore noise.

  • Moments: Used to calculate the centre of the contour (cx, cy).

  • Drawing:

    • cv2.drawContours: It draws the contour in blue.

    • cv2.circle:  It draws a white circle at the center.

    • cv2.putText: Labels the contour with its colour name.
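The area and centroid used above can also be worked out by hand for a simple polygon, which helps when checking whether the 2000-pixel threshold suits your objects. The sketch below applies the shoelace formula in plain Python; it illustrates the same quantities (m00 as the area, m10/m00 and m01/m00 as the centre) and is not OpenCV's own implementation. The function name polygon_area_and_centroid is hypothetical.

```python
def polygon_area_and_centroid(points):
    """Return (area, (cx, cy)) of a simple polygon given as [(x, y), ...]."""
    twice_area = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the polygon
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        return 0.0, None  # degenerate contour: no centroid
    return abs(twice_area) / 2, (cx / (3 * twice_area), cy / (3 * twice_area))


# A 50 x 50 pixel square: area 2500, which passes the > 2000 filter
print(polygon_area_and_centroid([(10, 10), (60, 10), (60, 60), (10, 60)]))
# (2500.0, (35.0, 35.0))
```

A square of about 45 x 45 pixels is therefore the smallest object the script will label with the current threshold.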

Display the Results:

res = cv2.bitwise_and(frame, frame, mask=mask_red)

cv2.imshow("live transmission", frame)

cv2.imshow("res", res)


  • cv2.bitwise_and: Applies the red mask to the original frame, keeping only the red regions visible.

  • cv2.imshow: Displays the processed video feed in two windows:

    • "live transmission" shows the annotated frame.

    • "res" shows only the red regions.

Exit Condition:

key = cv2.waitKey(5)

if key == ord('q'):

    break


  • cv2.waitKey(5): Waits for 5 ms for a key press.

  • Exit Key: If 'q' is pressed, the loop breaks.
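On some platforms cv2.waitKey() can return a key code with extra high bits set, in which case a plain comparison with ord('q') never matches. Masking the value with 0xFF is a common safeguard. The helper below is a small sketch of that check; is_quit_key is a hypothetical name.

```python
def is_quit_key(key, quit_char="q"):
    """Return True if a cv2.waitKey() result corresponds to quit_char."""
    # waitKey returns -1 when no key was pressed within the delay.
    return key != -1 and (key & 0xFF) == ord(quit_char)


print(is_quit_key(ord("q")))             # True
print(is_quit_key(-1))                   # False
print(is_quit_key(0x100000 | ord("q")))  # True, high bits are ignored
```

In the loop this would read: if is_quit_key(cv2.waitKey(5)): break.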


Cleanup

cv2.destroyAllWindows()


  • Closes all OpenCV windows after exiting the loop.


Summary

This script continuously fetches images from a network camera, processes them to detect red, green, and blue regions, and overlays visual markers and labels on the detected regions. It is a real-time colour detection and visualization application with a clear exit mechanism.

Let’s test the setup

  1. Power up the ESP32-CAM and connect it to Wi-Fi.

  2. Run the Python script. Make sure that the ESP32-CAM URL is correctly set.

  3. Test with Red, Green and Blue objects. You have to place the objects in front of the ESP32-CAM.

Fig: Green detected

Fig: Red and blue detected

Fig: Blue detected

Troubleshooting:

  • Guru Meditation Error: Ensure stable power to the ESP32-CAM.

  • No Image Display: You probably entered the wrong IP address! Check the IP address and ensure the ESP32-CAM is accessible from your computer.

  • Library Conflicts: Use a virtual environment to isolate Python dependencies.

  • Dots (.....) while uploading the code: Make sure IO0 is connected to GND, then immediately press the RST button.

  • Multiple failed upload attempts despite pressing the RST button: Restart your computer and try again. 

To wrap up

By integrating ESP32 and OpenCV, we have made a basic RGB colour identifier in this project. We can use this to make apps for colour-blind people. Depending on colours, industrial control systems often need to sort products and raw materials. This project can be integrated with such sorting systems. Colour detection is also important for humanoid robots. Our project can be integrated with humanoid robots to add that feature. The code can be further fine-tuned to identify more colours.

Syed Zain Nasir

I am Syed Zain Nasir, the founder of The Engineering Projects (TEP), https://www.TheEngineeringProjects.com/. I have been programming since 2009; before that, I just searched for things and made small projects. Now I share my knowledge through this platform. I also work as a freelancer and have done many projects related to programming and electrical circuitry. My Google profile: https://plus.google.com/+SyedZainNasir/

Published by
Syed Zain Nasir