As an extension of my table detection technique, I want to develop a model that can identify graphs on printed pages. I’m not looking for specific solutions so much as advice on where to start. Pointers to existing models would also be welcome as a source of inspiration.
To detect graphs on digital pages, use image recognition software, or pair an Optical Character Recognition (OCR) tool with a graph-detection algorithm, since OCR on its own only extracts text. Many modern tools can analyze the visual elements of a page and identify graphs.
Kosmos-2.5 generates bounding boxes around text from an image input. It also appears able to draw boxes around plots and tables; however, there is no fine-tuning code.
To detect graphs on digital pages, use image recognition software, or combine OCR with a machine-learning model. These tools identify graphical elements by analyzing patterns, shapes, and colors, which lets them distinguish graphs from text and other visual content.
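To make "analyzing shapes" a bit more concrete, here is a toy sketch of one hand-written cue: a plot usually has a long horizontal line and a long vertical line (its axes), while a block of text does not. This is only an illustration under my own assumptions, not how any of the tools above work. The names `longest_run` and `looks_like_plot`, the 0/1 grid representation, and the `min_fraction` threshold are all made up for the example.

```python
from typing import List

# A binarized page region: 1 = dark (ink) pixel, 0 = background.
Grid = List[List[int]]


def longest_run(values: List[int]) -> int:
    """Return the length of the longest consecutive run of 1s."""
    best = current = 0
    for v in values:
        current = current + 1 if v else 0
        best = max(best, current)
    return best


def looks_like_plot(region: Grid, min_fraction: float = 0.6) -> bool:
    """Toy heuristic (an assumption, not a real detector).

    Returns True when the region contains at least one horizontal ink run
    and one vertical ink run that each span `min_fraction` of the region,
    which is a rough stand-in for the x- and y-axes of a chart.
    """
    if not region or not region[0]:
        return False
    height, width = len(region), len(region[0])
    has_horizontal = any(
        longest_run(row) >= min_fraction * width for row in region
    )
    # zip(*region) walks the grid column by column.
    has_vertical = any(
        longest_run(list(col)) >= min_fraction * height for col in zip(*region)
    )
    return has_horizontal and has_vertical
```

A cue like this is fragile on real scans (tables and ruled boxes also contain long lines), so it is probably best treated as a cheap pre-filter or a baseline to compare a learned model against.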
Florence-2, SAM 2, and YOLOv8 (in increasing order of speed) can all draw bounding boxes around different regions of the page.
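Whichever detector you pick, you will likely end up with a list of labelled boxes that needs filtering down to just the graphs. A minimal sketch of that post-processing step, assuming the detector's output has already been converted into simple tuples: the `Detection` type, the `"graph"` label, and the thresholds are hypothetical and not tied to any of the models named above.

```python
from typing import List, NamedTuple, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


class Detection(NamedTuple):
    label: str    # hypothetical class name, e.g. "graph" or "table"
    score: float  # detector confidence in [0, 1]
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def keep_graphs(
    detections: List[Detection],
    label: str = "graph",
    min_score: float = 0.5,
    max_iou: float = 0.5,
) -> List[Detection]:
    """Keep confident detections of one class, dropping near-duplicates.

    This is plain greedy non-maximum suppression: visit boxes from the
    highest score down and keep a box only if it does not overlap an
    already-kept box by more than `max_iou`.
    """
    candidates = sorted(
        (d for d in detections if d.label == label and d.score >= min_score),
        key=lambda d: d.score,
        reverse=True,
    )
    kept: List[Detection] = []
    for d in candidates:
        if all(iou(d.box, k.box) <= max_iou for k in kept):
            kept.append(d)
    return kept
```

The same `iou` function is also handy for scoring predictions against hand-labelled boxes when you compare candidate models on your own pages.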