Detect bounding box of characters written by convert

kymillev
Posts: 2
Joined: 2019-08-08T02:42:13-07:00

Detect bounding box of characters written by convert

Post by kymillev » 2019-08-08T03:01:39-07:00

I came across a paper (https://arxiv.org/pdf/1608.04224.pdf) that uses synthetic handwriting data generated with the ImageMagick convert command and many different handwriting fonts, as in the images below.
[Image]
I am new to ImageMagick, and I was wondering whether I can get a tight bounding box for each individual character in each word. I'm not sure how the convert command works internally, but if it writes each letter separately, it should be possible to get the bounding box after each letter is written, right?

snibgo
Posts: 12024
Joined: 2010-01-23T23:01:33-07:00
Location: England, UK

Re: Detect bounding box of characters written by convert

Post by snibgo » 2019-08-08T03:44:57-07:00

The PDF has error 403 "Forbidden". If you give the title and author, perhaps the paper is available elsewhere.

IM calls a delegate to rasterize text strings. It can do this multiple times to build up a string, which could give the bounding box of each added character. It's a bit messy because glyphs can overlap.

For example, it can rasterize "easil" and then "easily". The difference between these gives the bounding box of the final "y".
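A minimal sketch of that prefix-difference idea in Python, not a definitive implementation. It assumes each prefix ("e", "ea", "eas", ...) has already been rendered onto a canvas of the same size with the same text origin and loaded as rows of 0/1 pixels (1 = ink). The `render` callback is hypothetical; it stands in for whatever calls convert and reads the result. Kerning or ligatures can change earlier glyphs when a letter is appended, so the boxes are approximate.

```python
from typing import Callable, List, Optional, Tuple

Mask = List[List[int]]           # rows of 0/1 pixels, 1 = ink
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive


def diff_bbox(prev: Mask, curr: Mask) -> Optional[Box]:
    """Bounding box of pixels that are ink in `curr` but not in `prev`."""
    xs: List[int] = []
    ys: List[int] = []
    for y, (row_prev, row_curr) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(row_prev, row_curr)):
            if c and not p:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # nothing new was drawn, e.g. a space
    return (min(xs), min(ys), max(xs), max(ys))


def char_boxes(
    text: str, render: Callable[[str], Mask]
) -> List[Tuple[str, Optional[Box]]]:
    """Render growing prefixes of `text` and box the ink each one adds.

    `render` is a hypothetical callback: it must return same-sized masks
    with the text starting at the same position for every prefix.
    """
    boxes = []
    prev = render("")  # empty canvas
    for i, ch in enumerate(text, start=1):
        curr = render(text[:i])
        boxes.append((ch, diff_bbox(prev, curr)))
        prev = curr
    return boxes
```

Where a new glyph overlaps ink that is already there, the overlapped pixels are not counted, so the box can come out slightly smaller than the glyph itself.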
snibgo's IM pages: im.snibgo.com

fmw42
Posts: 25414
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Detect bounding box of characters written by convert

Post by fmw42 » 2019-08-08T08:52:44-07:00

If each character is separate (does not touch any other character), then you can use -connected-components to get the bounding boxes of each character. If they touch, then it would get the bounding box of all connected characters and not each one separately. See https://imagemagick.org/script/connected-components.php. Once you have the bounding boxes, you can draw boxes around each separate region.
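To illustrate what connected-component labeling does with the rendered text, here is a rough pure-Python sketch, not ImageMagick's own implementation. It assumes the image has already been thresholded into rows of 0/1 pixels (1 = ink) and uses 8-connectivity; the function name `component_boxes` is made up for this example.

```python
from collections import deque
from typing import List, Tuple

Mask = List[List[int]]           # rows of 0/1 pixels, 1 = ink
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive


def component_boxes(mask: Mask) -> List[Box]:
    """Bounding box of every 8-connected ink region, sorted left to right."""
    height = len(mask)
    width = len(mask[0]) if mask else 0
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Flood-fill one region, tracking its extent as we go.
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            left = right = x0
            top = bottom = y0
            while queue:
                x, y = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (
                            0 <= nx < width
                            and 0 <= ny < height
                            and mask[ny][nx]
                            and not seen[ny][nx]
                        ):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)
```

Note that a letter made of disjoint strokes, such as the dot and stem of "i", comes out as two regions, so the boxes may need merging afterwards.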

kymillev
Posts: 2
Joined: 2019-08-08T02:42:13-07:00

Re: Detect bounding box of characters written by convert

Post by kymillev » 2019-08-13T05:38:45-07:00

Thanks for the advice. I solved the problem by generating each word one letter at a time and using OpenCV to mask out the previously generated letters, which leaves only the new letter to take a bounding box from. The resulting boxes overlap each other a lot, but that should be fine.

Example output:
[Image]
