sha1write: don't copy full sized buffers
No need to memcpy() source buffer data when we might just process the
data in place instead of accumulating it into a separate buffer.
This is the case when a whole buffer would have been copied, summed,
written out and then discarded right away.
Also move the CRC32 processing within the loop so the data is more likely
to remain in the L1 CPU cache between the CRC32 sum, SHA1 sum and the
write call.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>