Post by Matthew Dillon
The real culprit here is passing held mutexes to unrelated procedures
in the first place, just so that those procedures can release and
reacquire the mutex because they might have to block.
That's just bad coding in my view. The unrelated procedure has no
clue as to what the mutex is or why it is being held and really has no
business messing with it.
What I did was implement spinlocks with VERY restricted capabilities,
far more restricted than the capabilities of your mutexes. Our
spinlocks are meant only to be used to lock up tiny pieces of code
(like for ref counting or structural or flag-changing operations).
Plus the kernel automatically acts as if it were in a critical section
if it takes an interrupt while the current thread is holding a spinlock.
That way mainline code can just use a spinlock to deal with small bits
of interlocked information without it costing much in the way of
overhead.
Well, this is currently what our spin mutexes do too.
The pair mtx_lock_spin()/mtx_unlock_spin() simply enters/exits a
critical section (disabling interrupts in the meantime and avoiding
preemption). They too are intended only for very small pieces of code.
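As a rough illustration of the "tiny piece of code" case both of us
have in mind, here is a minimal userland sketch in C11. It is not
kernel code: struct spin, struct obj and the helper functions are
hypothetical names made up for this example, and unlike the kernel
primitives it does not disable interrupts or preemption. It only shows
a spin lock wrapped around a reference count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userland stand-in for a kernel spin lock. */
struct spin {
    atomic_bool held;
};

/* Hypothetical object whose reference count is guarded by the lock. */
struct obj {
    struct spin lock;   /* protects refs and nothing else */
    int refs;
};

static void spin_lock(struct spin *s)
{
    /* Busy-wait until we are the one who flipped held to true. */
    while (atomic_exchange_explicit(&s->held, true, memory_order_acquire))
        ;
}

static void spin_unlock(struct spin *s)
{
    atomic_store_explicit(&s->held, false, memory_order_release);
}

static void obj_init(struct obj *o)
{
    atomic_init(&o->lock.held, false);
    o->refs = 1;        /* the creator holds the first reference */
}

static void obj_ref(struct obj *o)
{
    spin_lock(&o->lock);
    o->refs++;          /* the whole critical section is one increment */
    spin_unlock(&o->lock);
}

/* Returns 1 if the caller dropped the last reference, 0 otherwise. */
static int obj_unref(struct obj *o)
{
    int last;

    spin_lock(&o->lock);
    assert(o->refs > 0);
    last = (--o->refs == 0);
    spin_unlock(&o->lock);
    return last;
}
```

Nothing inside the locked region can block or call into unrelated
code, which is the property that keeps such a lock cheap.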
Post by Matthew Dillon
I made the decision that ANYTHING more complex than that would have to
use a real lock, like a lockmgr lock or a token, depending on the
characteristics desired. To make it even more desirable I also
stripped down the lockmgr() lock implementation, removing numerous
bits that were inherited from very old code methodologies that have no
business being in a modern operating system, like LK_DRAIN. And I
removed the passing of an interlocking spinlock to the lockmgr code,
because that methodology was being massively abused in existing code
(and I do mean massively).
Well, if you add a smarter interface, you have *exactly* our sx lock
implementation.
Basically, sx and lockmgr in FreeBSD differ only in lockmgr's stupid
API, in draining, and in the interlock.
Otherwise they are very, very similar*.
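To make "shared/exclusive lock with a simple interface" concrete, here
is a small userland sketch using POSIX read-write locks as an analogue.
This is an assumption made for illustration only: it is neither the sx
nor the lockmgr implementation, and struct table and its helpers are
hypothetical names. The point is just that there is no interlock
argument and no drain operation: only init, shared lock, exclusive
lock and unlock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Hypothetical structure guarded by a shared/exclusive lock. */
struct table {
    pthread_rwlock_t lock;
    int value;
};

static int table_init(struct table *t)
{
    t->value = 0;
    return pthread_rwlock_init(&t->lock, NULL);
}

/* Readers take the lock shared, so they can run concurrently. */
static int table_read(struct table *t)
{
    int rc, v;

    rc = pthread_rwlock_rdlock(&t->lock);
    assert(rc == 0);
    v = t->value;
    pthread_rwlock_unlock(&t->lock);
    return v;
}

/* Writers take the lock exclusive and shut everybody else out. */
static void table_write(struct table *t, int v)
{
    int rc;

    rc = pthread_rwlock_wrlock(&t->lock);
    assert(rc == 0);
    t->value = v;
    pthread_rwlock_unlock(&t->lock);
}
```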
Post by Matthew Dillon
I'm not quite sure what the best way to go is for FreeBSD, because
you guys have made your mutexes just as or even more sophisticated
than your normal locks in many respects, and you have like 50 different
types of locks now (I can't keep track of them all).
Not sure what you mean by 'more sophisticated'... anyway...
The only problem I currently see with our locking primitives is that
they are not very well documented (or part of the documentation is
stale). That can be a problem when there are several locking
primitives, as we have, but it doesn't mean that they are complex.
Really, each primitive is very simple and is designed to be used in its
particular context. The restrictions we place on locks are just a sort
of warning for people developing wrong locking strategies.
For example, there are no technical difficulties in allowing a mutex to
be held while sleeping, but if that really happens, there is probably a
problem in your locking scheme.
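A minimal userland sketch of the alternative, using pthreads rather
than the kernel API (struct work_queue, slow_io() and process_one() are
hypothetical names invented for this example): claim the work under the
mutex, drop the mutex before anything that may sleep, and retake it
only to publish the result. The blocking routine never sees the mutex
at all.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical work queue; mtx protects pending and done. */
struct work_queue {
    pthread_mutex_t mtx;
    int pending;
    int done;
};

/* Hypothetical blocking operation: sleeps for one millisecond. */
static void slow_io(void)
{
    struct timespec ts = { 0, 1000000L };

    nanosleep(&ts, NULL);
}

/* Returns 1 if an item was processed, 0 if the queue was empty. */
static int process_one(struct work_queue *q)
{
    pthread_mutex_lock(&q->mtx);
    assert(q->pending >= 0);
    if (q->pending == 0) {
        pthread_mutex_unlock(&q->mtx);
        return 0;
    }
    q->pending--;                    /* claim the item under the lock */
    pthread_mutex_unlock(&q->mtx);   /* drop it before we may sleep */

    slow_io();                       /* knows nothing about the mutex */

    pthread_mutex_lock(&q->mtx);
    q->done++;                       /* publish the result */
    pthread_mutex_unlock(&q->mtx);
    return 1;
}
```

Because the lock is released across slow_io(), any state read before
the call has to be treated as possibly stale once the lock is retaken.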
Post by Matthew Dillon
If I were to offer advice it would be: Just stop trying to mix water
and hot wax. Stop holding mutexes across potentially blocking procedure
calls. Stop passing mutexes into unrelated bits of code in order for
them to be released and reacquired somewhere deep in that code. Just
doing that will probably solve all of the problems being reported.
I cannot understand which part of the code you are referring to with this...
Thanks,
Attilio
* Another difference concerns upgrading, but I consider FreeBSD's
lockmgr upgrading a really bad design choice, and the world would be a
much better place without it.