Crash related to semaphores

Rhialto
Posts: 5
Joined: 2014-04-26T11:21:41-07:00

Crash related to semaphores

Post by Rhialto »

Hi there,

a fellow NetBSD user and I independently came across two crashes in Calibre. What they seem to have in common is that both occur in calls made from the libMagickCore-6.Q16.so.2 library (the version is ImageMagick-6.8.8.5).

The common pattern is that something in ImageMagick calls a "gomp" library function (libgomp ships with gcc and is the GNU implementation of the OpenMP API, a parallel programming toolkit). That function then calls a semaphore function, which crashes, probably because the semaphore is invalid.

Below I reproduce the stack backtraces from the original reports.
The original thread for my report is at http://mail-index.netbsd.org/netbsd-use ... 14494.html.

Any ideas what might be going on here?

Their stack backtrace is this:


#0  0x00007f7ff740656d in sem_wait (sem=0x7f7fd81fda80) at /archive/foreign/src/lib/libpthread/sem.c:260
#1  0x00007f7fd12034e9 in omp_set_lock (lock=0x7f7fd81fda80) at /archive/foreign/src/external/gpl3/gcc/dist/libgomp/config/posix/lock.c:138
#2  0x00007f7fd4f149b5 in DestroyExceptionInfo () from /usr/pkg/lib/libMagickCore-6.Q16.so.2
#3  0x00007f7fd4f49235 in IsEventLogging () from /usr/pkg/lib/libMagickCore-6.Q16.so.2
#4  0x00007f7fd56eb354 in NewPixelWand () from /usr/pkg/lib/libMagickWand-6.Q16.so.2
#5  0x00007f7fd56eb4c2 in NewPixelWands () from /usr/pkg/lib/libMagickWand-6.Q16.so.2
#6  0x00007f7fd56e97fa in NewPixelIterator () from /usr/pkg/lib/libMagickWand-6.Q16.so.2
#7  0x00007f7fd5a03ec3 in magick_Image_has_transparent_pixels () from /usr/pkg/lib/calibre/calibre/plugins/magick.so
#8  0x00007f7ff78cd4c8 in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#9  0x00007f7ff78cde47 in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.7.so.1.0
#10 0x00007f7ff78ccf98 in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#11 0x00007f7ff78cde47 in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.7.so.1.0
#12 0x00007f7ff78ccf98 in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#13 0x00007f7ff78cde47 in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.7.so.1.0
#14 0x00007f7ff78ccf98 in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#15 0x00007f7ff78cde47 in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.7.so.1.0
#16 0x00007f7ff78ccf98 in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#17 0x00007f7ff78cde47 in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.7.so.1.0
#18 0x00007f7ff7866d76 in function_call () from /usr/pkg/lib/libpython2.7.so.1.0
#19 0x00007f7ff7846a98 in PyObject_Call () from /usr/pkg/lib/libpython2.7.so.1.0
#20 0x00007f7ff78cab54 in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#21 0x00007f7ff78cde47 in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.7.so.1.0
#22 0x00007f7ff78ccf98 in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#23 0x00007f7ff78cde47 in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.7.so.1.0
#24 0x00007f7ff78ccf98 in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#25 0x00007f7ff78cd560 in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#26 0x00007f7ff78cde47 in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.7.so.1.0
#27 0x00007f7ff7866ca8 in function_call () from /usr/pkg/lib/libpython2.7.so.1.0
#28 0x00007f7ff7846a98 in PyObject_Call () from /usr/pkg/lib/libpython2.7.so.1.0
#29 0x00007f7ff7853908 in instancemethod_call () from /usr/pkg/lib/libpython2.7.so.1.0
#30 0x00007f7ff7846a98 in PyObject_Call () from /usr/pkg/lib/libpython2.7.so.1.0
#31 0x00007f7ff78c7d40 in PyEval_CallObjectWithKeywords () from /usr/pkg/lib/libpython2.7.so.1.0
#32 0x00007f7feb8122ea in sip_api_invoke_slot () from /usr/pkg/lib/python2.7/site-packages/sip.so
#33 0x00007f7fec3e066f in PyQtProxy::invokeSlot(qpycore_slot const&, void**) () from /usr/pkg/lib/python2.7/site-packages/PyQt4/QtCore.so
#34 0x00007f7fec3e0944 in PyQtProxy::unislot(void**) () from /usr/pkg/lib/python2.7/site-packages/PyQt4/QtCore.so
#35 0x00007f7fec3e1428 in PyQtProxy::qt_metacall(QMetaObject::Call, int, void**) () from /usr/pkg/lib/python2.7/site-packages/PyQt4/QtCore.so
#36 0x00007f7febd9c33e in QMetaObject::activate(QObject*, QMetaObject const*, int, void**) () from /usr/pkg/qt4/lib/libQtCore.so.4
#37 0x00007f7febda31b9 in QSingleShotTimer::timerEvent(QTimerEvent*) () from /usr/pkg/qt4/lib/libQtCore.so.4
#38 0x00007f7febda0038 in QObject::event(QEvent*) () from /usr/pkg/qt4/lib/libQtCore.so.4
#39 0x00007f7fe9effe5c in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /usr/pkg/qt4/lib/libQtGui.so.4
#40 0x00007f7fe9f05df4 in QApplication::notify(QObject*, QEvent*) () from /usr/pkg/qt4/lib/libQtGui.so.4
#41 0x00007f7feb1bfc5a in sipQApplication::notify(QObject*, QEvent*) () from /usr/pkg/lib/python2.7/site-packages/PyQt4/QtGui.so
#42 0x00007f7febd8c5c0 in QCoreApplication::notifyInternal(QObject*, QEvent*) () from /usr/pkg/qt4/lib/libQtCore.so.4
#43 0x00007f7febdb3179 in QTimerInfoList::activateTimers() () from /usr/pkg/qt4/lib/libQtCore.so.4
#44 0x00007f7febdb3935 in QEventDispatcherUNIX::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/pkg/qt4/lib/libQtCore.so.4
#45 0x00007f7fe9f857ab in QEventDispatcherX11::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/pkg/qt4/lib/libQtGui.so.4
and mine is this:


(gdb) bt
#0  0x0000000d00000024 in ?? ()
#1  0x00007f7ff740947a in ?? () from /usr/lib/libpthread.so.1
#2  0x00007f7ff7406b5c in sem_post () from /usr/lib/libpthread.so.1
#3  0x00007f7f8d206007 in gomp_barrier_wait_end () from /usr/lib/libgomp.so.1
#4  0x00007f7f8d20644e in ?? () from /usr/lib/libgomp.so.1
#5  0x00007f7ff740b2ce in ?? () from /usr/lib/libpthread.so.1
#6  0x00007f7ff6875d80 in ___lwp_park50 () from /usr/lib/libc.so.12
Cannot access memory at address 0x7f7f8d000000


(gdb) info thread
[New LWP 3]
[New LWP 2]
  Id   Target Id         Frame
  4    LWP 2             0x00007f7ff6875d6a in ___lwp_park50 () from /usr/lib/libc.so.12
  3    LWP 3             0x00007f7ff683938a in _sys___select50 () from /usr/lib/libc.so.12
* 2    LWP 4             0x0000000d00000024 in ?? ()
  1    LWP 1             0x00007f7ff6875d6a in ___lwp_park50 () from /usr/lib/libc.so.12


(gdb) thread 1
[Switching to thread 1 (LWP 1)]
#0  0x00007f7ff6875d6a in ___lwp_park50 () from /usr/lib/libc.so.12
(gdb) bt
#0  0x00007f7ff6875d6a in ___lwp_park50 () from /usr/lib/libc.so.12
#1  0x00007f7ff7409d50 in pthread_cond_wait () from /usr/lib/libpthread.so.1
#2  0x00007f7ff7406901 in sem_wait () from /usr/lib/libpthread.so.1
#3  0x00007f7f8d208754 in gomp_sem_wait () from /usr/lib/libgomp.so.1
#4  0x00007f7f8d20602f in gomp_barrier_wait_end () from /usr/lib/libgomp.so.1
#5  0x00007f7f8d20688d in gomp_team_start () from /usr/lib/libgomp.so.1
#6  0x00007f7f9028c499 in ClonePixelCacheRepository () from /usr/pkg/lib/libMagickCore-6.Q16.so.2
#7  0x00007f7f9028d3d2 in OpenPixelCache () from /usr/pkg/lib/libMagickCore-6.Q16.so.2
#8  0x00007f7f90270a92 in GetImagePixelCache () from /usr/pkg/lib/libMagickCore-6.Q16.so.2
#9  0x00007f7f9028dfb4 in SyncImagePixelCache () from /usr/pkg/lib/libMagickCore-6.Q16.so.2
#10 0x00007f7f904c0941 in ReadJPEGImage () from /usr/pkg/lib/libMagickCore-6.Q16.so.2
#11 0x00007f7f902b925b in ReadImage () from /usr/pkg/lib/libMagickCore-6.Q16.so.2
#12 0x00007f7f90285c21 in BlobToImage () from /usr/pkg/lib/libMagickCore-6.Q16.so.2
#13 0x00007f7f90aa5e7c in MagickReadImageBlob () from /usr/pkg/lib/libMagickWand-6.Q16.so.2
#14 0x00007f7f90e06ffa in magick_Image_load () from /usr/pkg/lib/calibre/calibre/plugins/magick.so
#15 0x00007f7ff78468af in PyObject_Call () from /usr/pkg/lib/libpython2.7.so.1.0
#16 0x00007f7ff78caa90 in PyEval_CallObjectWithKeywords () from /usr/pkg/lib/libpython2.7.so.1.0
#17 0x00007f7ff7859783 in methoddescr_call () from /usr/pkg/lib/libpython2.7.so.1.0
#18 0x00007f7ff78468af in PyObject_Call () from /usr/pkg/lib/libpython2.7.so.1.0
#19 0x00007f7ff78cf197 in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#20 0x00007f7ff78d0f2a in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#21 0x00007f7ff78d219a in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.7.so.1.0
#22 0x00007f7ff78cfe8f in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#23 0x00007f7ff78d219a in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.7.so.1.0
#24 0x00007f7ff7867aba in function_call () from /usr/pkg/lib/libpython2.7.so.1.0
#25 0x00007f7ff78468af in PyObject_Call () from /usr/pkg/lib/libpython2.7.so.1.0
#26 0x00007f7ff7853b2c in instancemethod_call () from /usr/pkg/lib/libpython2.7.so.1.0
#27 0x00007f7ff78468af in PyObject_Call () from /usr/pkg/lib/libpython2.7.so.1.0
#28 0x00007f7ff78927ed in slot_tp_call () from /usr/pkg/lib/libpython2.7.so.1.0
#29 0x00007f7ff78468af in PyObject_Call () from /usr/pkg/lib/libpython2.7.so.1.0
#30 0x00007f7ff78cf197 in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#31 0x00007f7ff78d219a in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.7.so.1.0
#32 0x00007f7ff78cfe8f in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#33 0x00007f7ff78d0f2a in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#34 0x00007f7ff78d219a in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.7.so.1.0
#35 0x00007f7ff78cfe8f in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0
#36 0x00007f7ff78d219a in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.7.so.1.0
#37 0x00007f7ff78d2200 in PyEval_EvalCode () from /usr/pkg/lib/libpython2.7.so.1.0
#38 0x00007f7ff78ea0a1 in run_mod () from /usr/pkg/lib/libpython2.7.so.1.0
#39 0x00007f7ff78ead7c in PyRun_FileExFlags () from /usr/pkg/lib/libpython2.7.so.1.0
#40 0x00007f7ff78eb7d7 in PyRun_SimpleFileExFlags () from /usr/pkg/lib/libpython2.7.so.1.0
#41 0x00007f7ff78fb80a in Py_Main () from /usr/pkg/lib/libpython2.7.so.1.0
#42 0x0000000000400972 in _start ()

Last edited by Rhialto on 2014-04-26T12:27:10-07:00, edited 1 time in total.
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: Crash related to semaphores

Post by magick »

The bug could be in ImageMagick or in libgomp. Does the bug occur with the latest release, ImageMagick 6.8.9-0? If the problem persists, you can build ImageMagick without OpenMP support: add --disable-openmp to your configure script command line.

Re: Crash related to semaphores

Post by Rhialto »

It happens with version ImageMagick-6.8.8.5. It's not so easy to update it due to the things that use it. But it should be doable to replace it in place with a version configured with --disable-openmp. I'll try that first, to confirm that it avoids the problem.

Re: Crash related to semaphores

Post by Rhialto »

Yes, I notice no problem when configuring with --disable-openmp. So that seems to indicate I'm at least looking in the right area.

Next I'll try to see how doable it is to update ImageMagick. I was going to try that on a laptop which is a bit more expendable than the machine where I discovered the problem. But alas, I found that my problem isn't reproducible on it. (Not so strange with multithreading problems, but still annoying.)

Re: Crash related to semaphores

Post by Rhialto »

sorry to keep replying to myself... it is news as it happens ;-)

I've updated my ImageMagick to ImageMagick-6.8.9.0 and alas, it doesn't help.

The other person reported they were at that version already, I just heard.
They also report:
On Sat, Apr 26, 2014 at 10:32:30PM +0200, Rhialto wrote:
> I just wondered about that. It doesn't make a difference for me, anyway.
> And on my laptop I can't reproduce the issue, so it is somehow
> timing-dependent but not too much (since it happens all the time on my
> main computer).

I just retried; the first time I got a 100% CPU hang instead.
I think there's a slight random element here.

> For me it helped to add --disable-openmp to CONFIGURE_ARGS. Probably for
> you too. So we are not too far off the mark I think.

Hm, I'll give that a try later.

For now, some interesting info from my backtrace:
#0 0x00007f7ff740656d in sem_wait (sem=0x7f7fd8539280) at /archive/foreign/src/lib/libpthread/sem.c:260
#1 0x00007f7fd0e034e9 in omp_set_lock (lock=0x7f7fd8539280) at /archive/foreign/src/external/gpl3/gcc/dist/libgomp/config/posix/lock.c:138
#2 0x00007f7fd4b149b5 in DestroyExceptionInfo () from /usr/pkg/lib/libMagickCore-6.Q16.so.2
...
(gdb) fr 0
#0 0x00007f7ff740656d in sem_wait (sem=0x7f7fd8539280) at /archive/foreign/src/lib/libpthread/sem.c:260
260 return (_ksem_wait((*sem)->ksem_semid));
(gdb) p *sem
$1 = (sem_t) 0x0
(gdb) fr 1
#1 0x00007f7fd0e034e9 in omp_set_lock (lock=0x7f7fd8539280) at /archive/foreign/src/external/gpl3/gcc/dist/libgomp/config/posix/lock.c:138
138 while (sem_wait (lock) != 0)
(gdb) p *lock
$2 = (omp_lock_t) 0x0

So it looks to me like ImageMagick passes in a lock whose contents are NULL: the pointer itself is valid, but *lock is 0x0.
dlemstra
Posts: 1570
Joined: 2013-05-04T15:28:54-07:00

Re: Crash related to semaphores

Post by dlemstra »

Can you apply the following patch: http://trac.imagemagick.org/changeset/15481 and see if that resolves your issue?
.NET + ImageMagick = Magick.NET https://github.com/dlemstra/Magick.NET, @MagickNET, Donate

Re: Crash related to semaphores

Post by Rhialto »

That patch looked promising, at least for the other guy's backtrace.
For me it doesn't seem to make a difference, and he reports:
Yes, it looked promising, but it didn't help. Same backtrace:

#0 0x00007f7ff740656d in sem_wait (sem=0x7f7ff31bbf00) at /archive/foreign/src/lib/libpthread/sem.c:260
#1 0x00007f7fd0e034e9 in omp_set_lock (lock=0x7f7ff31bbf00) at /archive/foreign/src/external/gpl3/gcc/dist/libgomp/config/posix/lock.c:138

sem and lock still NULL

Btw, --disable-openmp fixes the problem for me too.

Re: Crash related to semaphores

Post by dlemstra »

Is the crash still coming from a call to 'DestroyExceptionInfo'?
_tk_
Posts: 6
Joined: 2014-04-26T13:29:09-07:00

Re: Crash related to semaphores

Post by _tk_ »

Hi!
I'm the other guy.
These are the first frames of the backtrace with the patch from trac applied:
(gdb) bt
#0 0x00007f7ff740656d in sem_wait (sem=0x7f7fd95b2200) at /archive/foreign/src/lib/libpthread/sem.c:260
#1 0x00007f7fd0e034e9 in omp_set_lock (lock=0x7f7fd95b2200) at /archive/foreign/src/external/gpl3/gcc/dist/libgomp/config/posix/lock.c:138
#2 0x00007f7fd4b149e5 in DestroyExceptionInfo () from /usr/pkg/lib/libMagickCore-6.Q16.so.2
#3 0x00007f7fd4b492a5 in IsEventLogging () from /usr/pkg/lib/libMagickCore-6.Q16.so.2
#4 0x00007f7fd52eb354 in NewPixelWand () from /usr/pkg/lib/libMagickWand-6.Q16.so.2
#5 0x00007f7fd52eb4c2 in NewPixelWands () from /usr/pkg/lib/libMagickWand-6.Q16.so.2
#6 0x00007f7fd52e97fa in NewPixelIterator () from /usr/pkg/lib/libMagickWand-6.Q16.so.2
#7 0x00007f7fd5603ec3 in magick_Image_has_transparent_pixels () from /usr/pkg/lib/calibre/calibre/plugins/magick.so
...
sem and lock are still 0x0.

Re: Crash related to semaphores

Post by dlemstra »

Can you add a link to 'OEBPS/images/img0007.jpg' from the original thread? Maybe there is a problem with the image?

Re: Crash related to semaphores

Post by _tk_ »

Epub files are zip files, so you can just download
http://waterman.mine.nu/~wood/epub/be_26.epub
and unpack it with unzip.

Re: Crash related to semaphores

Post by magick »

We're running on Fedora 20 with ImageMagick 6.8.9-1 Beta and Python 2.7.5. Your command works:


ebook-convert be_26.epub be_26b.epub
1% Converting input to HTML...
InputFormatPlugin: EPUB Input running
on /data/loco/Downloads/be_26.epub
Found HTML cover OEBPS/text/content001.xhtml
loaded the Generic plugin 
Parsing all content...
34% Running transforms on ebook...
Merging user specified metadata...
Detecting structure...
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Removing fake margins...
Cleaning up manifest...
Trimming unused files from manifest...
Trimming u'OEBPS/images/img0001.jpg' from manifest
Creating EPUB Output...
67% Running EPUB Output plugin
Splitting markup on page breaks and flow limits, if any...
        Looking for large trees in OEBPS/text/content002.xhtml...
        Found large tree #0
        Split into 2 parts
The cover image has an id != "cover". Renaming to work around bug in Nook Color
EPUB output written to /data/loco/Downloads/be_26b.epub
Output saved to   /data/loco/Downloads/be_26b.epub
We're not sure why it's failing for you.

Re: Crash related to semaphores

Post by _tk_ »

I think there are two separate problems: the one Rhialto sees, which I can't reproduce either, and the thread destruction issue, which only I can see, it seems.

As for the thread destruction: NetBSD's threading library is pickier than others, but AFAIK standards-conformant. I don't know enough about ImageMagick to know how to improve it. This is where I had hoped for your help :)

Re: Crash related to semaphores

Post by _tk_ »

I've got a report from a third party that ImageMagick (display) was leaking file descriptors. Disabling OpenMP support resolved this problem as well, so we've disabled OpenMP support in pkgsrc for ImageMagick.