Monocular Camera Based Fruit Counting and Mapping with Semantic Data Association

Xu Liu, Steven W. Chen, Chenhao Liu, Shreyas S. Shivakumar, Jnaneshwar Das, Camillo J. Taylor, James Underwood, Vijay Kumar

Research output: Contribution to journal › Article › peer-review

56 Scopus citations


In this letter, we present a cheap, lightweight, and fast fruit counting pipeline. Our pipeline relies only on a monocular camera and, on a mango dataset, achieves counting performance comparable to a state-of-the-art fruit counting system that uses an expensive sensor suite comprising a monocular camera, LiDAR, and GPS/INS. Our pipeline begins with a fruit and tree trunk detection component that uses state-of-the-art convolutional neural networks (CNNs). It then tracks fruits and tree trunks across images, with a Kalman filter fusing measurements from the CNN detectors and an optical flow estimator. Finally, the fruit count and map are estimated by an efficient fruit-as-feature semantic structure-from-motion algorithm that converts two-dimensional (2-D) tracks of fruits and trunks into 3-D landmarks and uses these landmarks to identify double-counting scenarios. Such a low-cost, lightweight fruit counting system has many benefits, including applicability to agriculture in developing countries, where monetary constraints or unstructured environments necessitate cheaper hardware solutions.
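The tracking step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the filter's state model or noise parameters, so everything below is an assumption made for illustration: the state is a fruit's 2-D image center, the optical flow displacement serves as the prediction input, the CNN detection center serves as the measurement, and all covariances are isotropic with hypothetical values. The class name `FruitTrack` and its parameters are not from the paper.

```python
from dataclasses import dataclass


@dataclass
class FruitTrack:
    """Hypothetical Kalman filter over one fruit's 2-D image center.

    Assumes an identity measurement model and isotropic noise, so the
    covariance collapses to a single variance shared by both axes.
    """
    u: float           # horizontal pixel coordinate of the fruit center
    v: float           # vertical pixel coordinate of the fruit center
    var: float = 4.0   # position variance in pixels^2 (assumed value)

    def predict(self, flow_u: float, flow_v: float, flow_var: float = 1.0) -> None:
        # Prediction: shift the center by the optical flow displacement
        # and inflate the uncertainty by the assumed flow noise.
        self.u += flow_u
        self.v += flow_v
        self.var += flow_var

    def update(self, det_u: float, det_v: float, det_var: float = 9.0) -> float:
        # Correction: blend in the CNN detection center, weighted by the
        # Kalman gain (prior variance over total variance).
        gain = self.var / (self.var + det_var)
        self.u += gain * (det_u - self.u)
        self.v += gain * (det_v - self.v)
        self.var *= 1.0 - gain
        return gain


# Usage: one frame-to-frame step for a single tracked fruit.
track = FruitTrack(u=100.0, v=50.0)
track.predict(flow_u=5.0, flow_v=0.0)   # optical flow says "moved 5 px right"
track.update(det_u=107.0, det_v=50.0)   # detector reports a nearby center
```

With these assumed variances the corrected center lands between the flow-based prediction and the detection, closer to the prediction because the detection noise is set higher. A real system would also need to associate detections with tracks before the update, which this sketch leaves out.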

Original language: English (US)
Article number: 8653965
Pages (from-to): 2296-2303
Number of pages: 8
Journal: IEEE Robotics and Automation Letters
Issue number: 3
State: Published - Jul 2019


Keywords

  • Robotics in agriculture and forestry
  • deep learning in robotics and automation
  • mapping
  • object detection
  • segmentation and categorization
  • visual tracking

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Biomedical Engineering
  • Human-Computer Interaction
  • Mechanical Engineering
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
  • Control and Optimization
  • Artificial Intelligence

