sched, lockdep: inline double_unlock_balance()

We have a test case which measures the variation in the amount of time
needed to perform a fixed amount of work on the preempt_rt kernel. We
started seeing deterioration in its performance recently. The test
should never take more than 10 microseconds, but we started seeing a
5-10% failure rate.

By a process of elimination, we traced the problem to commit
1b12bbc747560ea68bcc132c3d05699e52271da0 (lockdep: re-annotate
scheduler runqueues).

When LOCKDEP is disabled, that commit adds only an additional function
call, to double_unlock_balance(). Hence I inlined double_unlock_balance()
and the problem went away. This patch makes that change.
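
For illustration only, here is a minimal userspace sketch of the pattern
described above. It is not the kernel code: struct lock, lock_annotate(),
unlock() and double_unlock_sketch() are hypothetical stand-ins, and
SKETCH_LOCKDEP is an assumed build switch. When the debug annotation
compiles away, the helper body is a single unlock, so an out-of-line
call is mostly overhead and marking it inline lets the compiler drop it.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct lock { int held; };
struct rq { struct lock lock; };

#ifdef SKETCH_LOCKDEP
/* Debugging on: the annotation does real work (here, it counts calls). */
static int annotate_calls;
#define lock_annotate(l) ((void)(l), annotate_calls++)
#else
/* Debugging off: the annotation evaluates to nothing. */
#define lock_annotate(l) ((void)(l))
#endif

/* Toy unlock: releasing a lock that is not held is treated as a bug. */
static void unlock(struct lock *l)
{
	assert(l->held);
	l->held = 0;
}

/*
 * Without SKETCH_LOCKDEP the body reduces to unlock(); "static inline"
 * invites the compiler to expand it at each call site.
 */
static inline void double_unlock_sketch(struct rq *this_rq, struct rq *busiest)
{
	unlock(&busiest->lock);
	lock_annotate(&this_rq->lock);
}
```

Note that "inline" is only a hint to the compiler; whether the call is
actually expanded in place depends on the compiler and its options.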

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
diff --git a/kernel/sched.c b/kernel/sched.c
index e8819bc..ad10d0a 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2825,7 +2825,7 @@
 	return ret;
 }
 
-static void double_unlock_balance(struct rq *this_rq, struct rq *busiest)
+static inline void double_unlock_balance(struct rq *this_rq, struct rq *busiest)
 	__releases(busiest->lock)
 {
 	spin_unlock(&busiest->lock);