libc: Add NEON-optimized memmove

Ported from cm10.1.

bionic-benchmarks on Cortex-A9 (columns: benchmark, iterations, ns/op, throughput)
Generic:
BM_string_memmove/8    50000000         67   119.35 MiB/s
BM_string_memmove/64   10000000        175   364.59 MiB/s
BM_string_memmove/512   1000000       1078   474.86 MiB/s
BM_string_memmove/1K    1000000       2072   494.20 MiB/s
BM_string_memmove/8K     100000      16400   499.50 MiB/s
BM_string_memmove/16K     50000      32293   507.35 MiB/s
BM_string_memmove/32K     50000      66585   492.12 MiB/s
BM_string_memmove/64K     10000     160435   408.49 MiB/s

NEON-optimized:
BM_string_memmove/8   100000000         25   319.06 MiB/s
BM_string_memmove/64   50000000         43  1472.60 MiB/s
BM_string_memmove/512  10000000        247  2069.74 MiB/s
BM_string_memmove/1K    5000000        463  2210.08 MiB/s
BM_string_memmove/8K     500000       3465  2363.69 MiB/s
BM_string_memmove/16K    500000       6894  2376.30 MiB/s
BM_string_memmove/32K    100000      15490  2115.38 MiB/s
BM_string_memmove/64K     50000      42097  1556.75 MiB/s
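
For reading the tables: the throughput column appears to be the buffer size
divided by the ns/op column, in units of 10^6 bytes per second (for example,
1024 bytes / 2072 ns gives about 494.2). The sketch below assumes that
relation. The helper names are hypothetical and are not part of
bionic-benchmarks or of this change.

```c
/* Hypothetical helpers for interpreting the benchmark tables.
 * Assumption: throughput = bytes / ns_per_op, scaled to 10^6 bytes/s. */

/* Throughput in units of 10^6 bytes per second. */
static double throughput_mb_per_s(double bytes, double ns_per_op)
{
    /* bytes per nanosecond equals 10^9 bytes/s; multiply by 1000 for 10^6 bytes/s */
    return bytes / ns_per_op * 1000.0;
}

/* Speedup of the NEON path over the generic path for one buffer size. */
static double speedup(double generic_ns_per_op, double neon_ns_per_op)
{
    return generic_ns_per_op / neon_ns_per_op;
}
```

With the ns/op figures above, speedup(67, 25) is about 2.7 for 8-byte moves
and speedup(160435, 42097) is about 3.8 for 64K moves.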

Change-Id: I89253a01fb811438089e16320ac265177a2ca152
3 files changed