Blackfin arch: Faster C implementation of  no-MPU CPLB handler

This is a mixture ofcMichael McTernan's patch and the existing cplb-mpu code.

We ditch the old cplb-nompu implementation, which is a good example of
why a good algorithm in a HLL is preferrable to a bad algorithm written in
assembly.  Rather than try to construct a table of all posible CPLBs and
search it, we just create a (smaller) table of memory regions and
their attributes.  Some of the data structures are now unified for both
the mpu and nompu cases.  A lot of needless complexity in cplbinit.c is
removed.

Further optimizations:
  * compile cplbmgr.c with a lot of -ffixed-reg options, and omit saving
    these registers on the stack when entering a CPLB exception.
  * lose cli/nop/nop/sti sequences for some workarounds - these don't
  * make
    sense in an exception context

Additional code unification should be possible after this.

[Mike Frysinger <vapier.adi@gmail.com>:
 - convert CPP if statements to C if statements
 - remove redundant statements
 - use a do...while loop rather than a for loop to get slightly better
   optimization and to avoid gcc "may be used uninitialized" warnings ...
   we know that the [id]cplb_nr_bounds variables will never be 0, so this
   is OK
 - the no-mpu code was the last user of MAX_MEM_SIZE and with that rewritten,
   we can punt it
 - add some BUG_ON() checks to make sure we dont overflow the small
   cplb_bounds array
 - add i/d cplb entries for the bootrom because there is functions/data in
   there we want to access
 - we do not need a NULL trailing entry as any time we access the bounds
   arrays, we use the nr_bounds variable
]

Signed-off-by: Michael McTernan <mmcternan@airvana.com>
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

diff --git a/arch/blackfin/kernel/setup.c b/arch/blackfin/kernel/setup.c
index d3d37e7..20d04a1 100644
--- a/arch/blackfin/kernel/setup.c
+++ b/arch/blackfin/kernel/setup.c
@@ -88,6 +88,7 @@
 {
 	unsigned int cpu;
 
+	generate_cplb_tables_all();
 	/* Generate per-CPU I&D CPLB tables */
 	for (cpu = 0; cpu < num_possible_cpus(); ++cpu)
 		generate_cplb_tables_cpu(cpu);
@@ -97,19 +98,11 @@
 void __cpuinit bfin_setup_caches(unsigned int cpu)
 {
 #ifdef CONFIG_BFIN_ICACHE
-#ifdef CONFIG_MPU
 	bfin_icache_init(icplb_tbl[cpu]);
-#else
-	bfin_icache_init(icplb_tables[cpu]);
-#endif
 #endif
 
 #ifdef CONFIG_BFIN_DCACHE
-#ifdef CONFIG_MPU
 	bfin_dcache_init(dcplb_tbl[cpu]);
-#else
-	bfin_dcache_init(dcplb_tables[cpu]);
-#endif
 #endif
 
 	/*