Related topics
- Image classification
- Image segmentation

What we want
- name
- bounding box or mask
- ability to improve performance with new training data (fine tuning)
- IOU evaluation criterion
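The IOU (intersection over union) criterion scores a predicted box against a ground-truth box as overlap area divided by combined area. A minimal sketch, assuming axis-aligned boxes given as (x1, y1, x2, y2) corner tuples; the function name and box format are choices made for this example, not something the notes prescribe.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle; clamp to zero if disjoint.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / 7.
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A detection is commonly counted as correct when its IoU with a ground-truth box exceeds a chosen threshold such as 0.5; the threshold is a benchmark convention, not a fixed rule.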
First Image/Object Recognition
Then Object Detection

Problems with Object Recognition/Detection
1. Illumination conditions change
2. Viewpoint variation, esp. unusual views
3. Occlusion
4. Moving camera, moving object, motion blur
5. Cluttered or textured background -> false alarms, spurious objects
6. Many objects
7. Many moving objects
8. Position in image - convolution
9. Image orientation - tricks and rotational convolution (polar representations)
10. Scale: very small and very big objects - use pyramids
11. Appearance variation
12. Shape variation, including deformation (people, trees, vegetables, liquids, granular materials, grass)
13. Intra-class variation - chair, house, ...
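For the scale problem (item 10), an image pyramid lets a fixed-size detector see big objects at coarse levels and small objects at fine levels. A minimal sketch in plain Python, assuming a grayscale image stored as a list of rows; it uses 2x2 block averaging as a stand-in for the Gaussian blur that real pyramids usually apply before subsampling, and the names `downsample` and `build_pyramid` are hypothetical.

```python
def downsample(img):
    """Halve a grayscale image (list of rows) by averaging 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def build_pyramid(img, min_size=2):
    """Return [full-res, half-res, ...] until a side would drop below min_size."""
    levels = [img]
    while (len(levels[-1]) // 2 >= min_size
           and len(levels[-1][0]) // 2 >= min_size):
        levels.append(downsample(levels[-1]))
    return levels


if __name__ == "__main__":
    image = [[float(x + y) for x in range(8)] for y in range(8)]
    for level in build_pyramid(image):
        print(len(level), "x", len(level[0]))
```

Running the same detector window over every level is what makes detection roughly scale tolerant, at the cost of extra computation per level.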
Tool: OpenCV: python, C++ (options: packages, install from source)
CPU ONLY: Google "opencv object detection cpu only"
OpenCV non-GPU classifier: Haar features, Cascade classifier, Python tutorial, Wikipedia
Histogram of oriented gradients (HOG), more, people detector
More traditional techniques (which can be used to refine learned detectors)
- Image segmentation and blob analysis, which uses simple object properties such as size, shape, or color
- Feature-based object detection, which uses feature extraction, matching, and RANSAC to estimate the location of an object
- The Viola-Jones algorithm for human face or upper body detection based on Haar features
- SVM classification using histogram of oriented gradients (HOG) features
- Aggregate channel features (ACF)

Step 1: Produce a pyramidal feature map
Features:
- JPEG compression blocks
- Haar
- keypoints, Scale-invariant feature transform (SIFT), SURF
- local invariant descriptors
- bag-of-visual-words models
- Histogram of Oriented Gradients
- deformable parts models: https://cs.brown.edu/people/pfelzens/papers/lsvm-pami.pdf
- Exemplar models: https://www.cs.cmu.edu/~efros/exemplarsvm-iccv11.pdf
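The core of the HOG feature is a per-cell histogram of gradient orientations weighted by gradient magnitude. A minimal sketch of that one step in plain Python, assuming a cell given as a list of rows of gray values, 9 unsigned-orientation bins over 0 to 180 degrees, and central-difference gradients on interior pixels only. Full HOG implementations also interpolate votes between bins and normalize over blocks of cells, which this sketch omits; `cell_histogram` is a hypothetical name.

```python
import math


def cell_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central differences in x and y.
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            mag = math.hypot(gx, gy)
            # Fold direction into [0, 180): opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += mag
    return hist


if __name__ == "__main__":
    vertical_edge = [[0, 0, 1, 1] for _ in range(4)]
    print(cell_histogram(vertical_edge))  # all weight lands in bin 0
```

In an SVM-based detector, the histograms of all cells in a window are concatenated into one feature vector and passed to the classifier.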
2 or multi-stage: accurate but slower
- region proposals
- classification within regions
1 stage
- predefined regions/anchor points
- competition and non-maximum suppression

Example multi-stage system: Mask-RCNN

Multiscale Feature maps --------------------------> ROI Align ---> Refined proposals (P)
        |                                               ^
        ---> Region Proposals --------------------------|

P -> Mask generation -> Masks
P -> BBox generation/ID -> Object detections

Need labelled training data! How could this be done "self-supervised"?
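Non-maximum suppression resolves the competition between overlapping candidate boxes: keep the highest-scoring box, discard any box that overlaps it too much, and repeat. A minimal greedy sketch in plain Python, assuming boxes as (x1, y1, x2, y2) tuples and an IoU threshold of 0.5; the threshold and the function names are assumptions for this example, and production detectors typically use vectorized library versions and run suppression per class.

```python
def iou(a, b):
    """Intersection over union of two boxes (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS; returns indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))  # the second box overlaps the first and is dropped
```

This version is quadratic in the number of boxes, which is fine for a few hundred candidates per image.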
CPU-based deep learning?
******************************************************************
https://learnopencv.com/mask-r-cnn-instance-segmentation-with-pytorch/
https://catalog.ngc.nvidia.com/orgs/nvidia/resources/mask_r_cnn_for_pytorch
https://blog.roboflow.com/mask-rcnn/ - masks
https://viso.ai/deep-learning/mask-r-cnn/
https://www.analyticsvidhya.com/blog/2019/07/computer-vision-implementing-mask-r-cnn-image-segmentation/
https://www.geeksforgeeks.org/mask-r-cnn-ml/
https://pyimagesearch.com/2020/02/10/opencv-dnn-with-nvidia-gpus-1549-faster-yolo-ssd-and-mask-r-cnn/
https://pyimagesearch.com/2018/11/12/yolo-object-detection-with-opencv/
https://pyimagesearch.com/2018/11/19/mask-r-cnn-with-opencv/
https://pyimagesearch.com/2014/11/10/histogram-oriented-gradients-object-detection/
https://pyimagesearch.com/2015/02/16/faster-non-maximum-suppression-python/ points to useful Non maximum suppression
https://pyimagesearch.com/2018/05/14/a-gentle-guide-to-deep-learning-object-detection/
https://github.com/tusen-ai/simpledet
https://github.com/facebookresearch/detr
https://github.com/open-mmlab/mmdetection
https://github.com/leaplabthu/rank-detr
Semi-DETR: Semi-Supervised Object Detection with Detection Transformers https://arxiv.org/abs/2307.08095
https://viso.ai/deep-learning/object-detection/
Intel CPU https://www.intel.com/content/www/us/en/docs/edge-insights-vision/get-started-guide/2021-3/single-and-multi-object-detection-with-hardware.html
OpenCV CPU for YOLOv3 https://learnopencv.com/deep-learning-based-object-detection-using-yolov3-with-opencv-python-c/
****************************
****************************
What could we do with Object Detection?
- find objects - logistics, where are my keys
- track objects
- manipulate objects
- get list of objects in view/room, talk to ChatGPT about it.