Notes on object detection

Roboflow demo, YOLOv8 demo

Related topics
Image classification
Image segmentation

What we want
name (class label)
bounding box or mask
ability to improve performance with new training data (fine-tuning)

IoU (intersection over union) evaluation criterion
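
A minimal sketch of the IoU criterion, assuming axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates. The function name and the box format are illustrative choices, not taken from any particular library.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle; width and height clamp to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


predicted = (0, 0, 2, 2)
ground_truth = (1, 1, 3, 3)
score = iou(predicted, ground_truth)  # overlap area 1, union area 7
```

A detection is commonly counted as correct when its IoU with a ground-truth box exceeds a chosen threshold such as 0.5.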

First Image/Object Recognition
Then Object Detection

Problems with Object Recognition/Detection
1. Illumination conditions change
2. Viewpoint variation, especially unusual views
3. Occlusion
4. Moving camera, moving object, motion blur

5. Cluttered or textured background -> false alarms, spurious objects
6. Many objects
7. Many moving objects

8. Position in image - convolution
9. Image orientation - tricks and rotational convolution
      (polar representations)
10. Scale: very small and very large objects - use pyramids

11. Appearance variation
12. Shape variation, including deformation (people, trees,
       vegetables, liquids, granular materials, grass)
13. Intra-class variation - chair, house, ...

Tool: OpenCV: Python, C++ (options: prebuilt packages or install from source)

CPU ONLY: Google "opencv object detection cpu only"

OpenCV non-GPU classifier: Haar features, cascade classifier, Python tutorial, Wikipedia
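
To make the Haar feature idea concrete, here is a small pure-Python sketch of the integral image trick that makes these features cheap to evaluate. It is not the OpenCV implementation; the function names and the toy image are assumptions made for illustration.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii, x, y, w, h):
    """Left half minus right half: responds strongly to vertical edges."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Toy 4x4 image: bright left half, dark right half.
image = [[9, 9, 1, 1] for _ in range(4)]
ii = integral_image(image)
feature = haar_two_rect_horizontal(ii, 0, 0, 4, 4)
```

Each rectangle sum costs four table lookups regardless of rectangle size, which is why a cascade can test many such features per window on a CPU.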

Histogram of oriented gradients (HOG), more, people detector
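
A simplified sketch of the core HOG step: a magnitude-weighted histogram of gradient orientations for one cell. It assumes unsigned orientations (0 to 180 degrees) and 9 bins, and it omits the bin interpolation and block normalization that full HOG descriptors use. The function name and toy cell are hypothetical.

```python
import math


def cell_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # central differences
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against angle landing exactly on 180.
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# Toy cell with a vertical edge: intensity changes only along x.
cell = [[0, 0, 0, 10, 10, 10] for _ in range(6)]
hist = cell_histogram(cell)
```

For this toy cell all gradient energy falls into the first bin (orientation near 0 degrees), since the intensity changes only horizontally.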

More traditional techniques (which can be used to refine learned detectors)
 - Image segmentation and blob analysis, which uses simple object
   properties such as size, shape, or color
 - Feature-based object detection, which uses feature extraction,
   matching, and RANSAC to estimate the location of an object
 - The Viola-Jones algorithm for human face or upper body detection,
   based on Haar features
 - SVM classification using histogram of oriented gradients (HOG) features
 - Aggregate channel features (ACF)
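
The RANSAC step in feature-based detection can be sketched as follows. For brevity this example estimates only a 2D translation from point matches; real systems usually fit an affine transform or a homography. The match list, tolerance, and function name are assumptions for illustration.

```python
import random


def ransac_translation(matches, n_iters=100, tol=2.0, seed=0):
    """Estimate a 2D translation from point matches, ignoring outliers.

    matches: list of ((x, y), (x2, y2)) pairs from a hypothetical feature matcher.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iters):
        (x, y), (x2, y2) = rng.choice(matches)  # minimal sample: one match
        dx, dy = x2 - x, y2 - y
        inliers = [m for m in matches
                   if abs(m[1][0] - m[0][0] - dx) <= tol
                   and abs(m[1][1] - m[0][1] - dy) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set by averaging its displacements.
    n = len(best_inliers)
    dx = sum(m[1][0] - m[0][0] for m in best_inliers) / n
    dy = sum(m[1][1] - m[0][1] for m in best_inliers) / n
    return (dx, dy), best_inliers


# Four consistent matches (object shifted by about (20, 5)) plus two bad matches.
matches = [((0, 0), (20, 5)), ((10, 0), (30, 5)), ((0, 10), (20, 15)),
           ((10, 10), (31, 16)), ((5, 5), (90, 40)), ((3, 8), (-50, 2))]
shift, inliers = ransac_translation(matches)
```

The two bad matches never gather more than one supporter, so the consensus set is the four consistent matches and the refit shift is their mean displacement.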

Step 1: Produce a pyramidal feature map
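
As a rough illustration of the pyramid idea, the sketch below builds an image pyramid by repeated 2x2 averaging. This is an assumption about the simplest form of the step: a Gaussian pyramid would blur before subsampling, and a pyramidal feature map would hold computed features rather than raw pixels at each level.

```python
def downsample_2x(img):
    """Halve width and height by averaging each 2x2 block of pixels."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1] +
              img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def build_pyramid(img, min_size=2):
    """Return [full resolution, 1/2, 1/4, ...] down to min_size pixels per side."""
    levels = [img]
    while min(len(levels[-1]), len(levels[-1][0])) // 2 >= min_size:
        levels.append(downsample_2x(levels[-1]))
    return levels


image = [[float(x + y) for x in range(16)] for y in range(16)]
pyramid = build_pyramid(image)
sizes = [(len(level), len(level[0])) for level in pyramid]
```

Running a fixed-size detector window over every level lets the same window match both small objects (at full resolution) and large ones (at coarse levels).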

Features:
 - JPEG compression blocks
 - Haar features
 - Keypoints
 - Scale-invariant feature transform (SIFT), SURF
 - Local invariant descriptors
 - Bag-of-visual-words models
 - Histogram of oriented gradients (HOG)
 - Deformable parts models: https://cs.brown.edu/people/pfelzens/papers/lsvm-pami.pdf
 - Exemplar models: https://www.cs.cmu.edu/~efros/exemplarsvm-iccv11.pdf

Deep Learning Approaches

Tool: conda for managing environments and package versions

Two-stage or multi-stage: accurate but slower
 - region proposals
 - classification within regions
One-stage
 - predefined regions/anchor points
 - competition and non-maximum suppression
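
Non-maximum suppression can be sketched as a greedy loop: keep the highest-scoring box, discard every remaining box that overlaps it too much, and repeat. The box format (x1, y1, x2, y2), the 0.5 threshold, and the function names are assumptions for illustration, not any specific detector's implementation.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: returns indices of the boxes to keep, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too strongly.
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Two near-duplicate detections of one object, plus a separate object.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

In practice NMS is usually run per class, so that overlapping objects of different classes do not suppress each other.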


Example multi-stage system
Mask R-CNN

Multiscale
Feature maps  -------------------------- ROI Align ---> Refined proposals (P)
               |                          ^
               ---> Region Proposals -----|

P -> Mask generation -> Masks

P -> BBox generation/ID -> Object detections
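
The ROI Align box in the diagram resamples each proposed region of the feature map to a fixed-size grid using bilinear interpolation, so that proposals of any size feed the same downstream heads. Below is a simplified single-channel sketch that takes one sample per output bin at the bin center; production implementations typically average several samples per bin and handle pixel-center offsets. All names here are hypothetical.

```python
def bilinear(feature, x, y):
    """Bilinearly interpolate a 2D feature map at continuous coordinates (x, y)."""
    h, w = len(feature), len(feature[0])
    x = min(max(x, 0.0), w - 1.0)  # clamp to the map
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = feature[y0][x0] * (1 - fx) + feature[y0][x1] * fx
    bottom = feature[y1][x0] * (1 - fx) + feature[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def roi_align(feature, box, out_size=2):
    """Resample the region box = (x1, y1, x2, y2) to an out_size x out_size grid."""
    x1, y1, x2, y2 = box
    bin_w = (x2 - x1) / out_size
    bin_h = (y2 - y1) / out_size
    return [[bilinear(feature, x1 + (j + 0.5) * bin_w, y1 + (i + 0.5) * bin_h)
             for j in range(out_size)] for i in range(out_size)]


# Feature map whose value equals the x coordinate, so results are easy to check.
feature = [[float(x) for x in range(8)] for _ in range(8)]
pooled = roi_align(feature, (1.0, 1.0, 4.0, 4.0))
```

Because the box coordinates are never rounded to integer cells, the pooled values stay aligned with the proposal, which matters for pixel-accurate masks.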


Need labelled training data!
How could this be done "self-supervised"?

YOLOv8.1: GitHub, blog posts

CPU-based deep learning?
******************************************************************

https://learnopencv.com/mask-r-cnn-instance-segmentation-with-pytorch/
https://catalog.ngc.nvidia.com/orgs/nvidia/resources/mask_r_cnn_for_pytorch
https://blog.roboflow.com/mask-rcnn/ - masks
https://viso.ai/deep-learning/mask-r-cnn/
https://www.analyticsvidhya.com/blog/2019/07/computer-vision-implementing-mask-r-cnn-image-segmentation/
https://www.geeksforgeeks.org/mask-r-cnn-ml/
https://pyimagesearch.com/2020/02/10/opencv-dnn-with-nvidia-gpus-1549-faster-yolo-ssd-and-mask-r-cnn/
https://pyimagesearch.com/2018/11/12/yolo-object-detection-with-opencv/
https://pyimagesearch.com/2018/11/19/mask-r-cnn-with-opencv/
https://pyimagesearch.com/2014/11/10/histogram-oriented-gradients-object-detection/
https://pyimagesearch.com/2015/02/16/faster-non-maximum-suppression-python/
  (points to a useful non-maximum suppression write-up)
https://pyimagesearch.com/2018/05/14/a-gentle-guide-to-deep-learning-object-detection/

https://github.com/tusen-ai/simpledet
https://github.com/facebookresearch/detr
https://github.com/open-mmlab/mmdetection
https://github.com/leaplabthu/rank-detr
Semi-DETR: Semi-Supervised Object Detection with Detection Transformers
https://arxiv.org/abs/2307.08095

https://viso.ai/deep-learning/object-detection/

Intel CPU
https://www.intel.com/content/www/us/en/docs/edge-insights-vision/get-started-guide/2021-3/single-and-multi-object-detection-with-hardware.html

OpenCV CPU for YOLOv3
https://learnopencv.com/deep-learning-based-object-detection-using-yolov3-with-opencv-python-c/

****************************
****************************

What could we do with Object Detection?
  - find objects - logistics, "where are my keys?"
  - track objects
  - manipulate objects
  - get a list of objects in view/room and talk to ChatGPT about it