Synchronous access
The simplest way for a program to read (or write) a
chunk of a file is to allocate a buffer for the data,
issue a request to the OS and then sit there and wait
until the request is fulfilled.
Once the data is in (or out), proceed to the next step.
Since the program effectively stops running while
waiting for the request to complete, this is called
synchronous IO.

It is very simple to implement, it keeps the code
nice and tidy, and it is widely used in software
for reading and writing files.
However, if we want to read or write fast,
we can do significantly better.
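
For reference, here's a minimal sketch of a synchronous
read loop in C; the file name is a placeholder:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        static char buf[64 * 1024];    /* the IO buffer */
        int fd = open("data.bin", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        for (;;) {
            /* the program blocks here until the OS is done */
            ssize_t n = read(fd, buf, sizeof buf);
            if (n < 0) { perror("read"); break; }
            if (n == 0) break;         /* end of file */
            /* ... process n bytes of data ... */
        }

        close(fd);
        return 0;
    }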
Asynchronous access
Instead of waiting for a request to complete,
a program can make a note that the request is pending
and move on to other things. Then, it periodically
checks if the request is done and, when it is,
deals with the result.

Since we are no longer blocking on an IO request,
this is called asynchronous IO.
Note that in terms of IO performance this is so far
exactly the same as the synchronous case.
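
One way to do this on a POSIX system is the aio_* API
(on Linux it lives in librt, so link with -lrt); other
platforms have their own mechanisms, such as overlapped
IO on Windows or io_uring on newer Linux kernels.
A minimal sketch, again with a placeholder file name:

    #include <aio.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        static char buf[64 * 1024];
        int fd = open("data.bin", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct aiocb cb;
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = buf;
        cb.aio_nbytes = sizeof buf;
        cb.aio_offset = 0;

        /* submit the request and return immediately */
        if (aio_read(&cb) < 0) { perror("aio_read"); return 1; }

        /* the request is pending; check on it periodically */
        while (aio_error(&cb) == EINPROGRESS) {
            /* ... do other useful work ... */
        }

        ssize_t n = aio_return(&cb);   /* collect the result */
        printf("read %zd bytes\n", n);

        close(fd);
        return 0;
    }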
Asynchronous, multiple buffers
Now that we are free to do something else while our
request is pending, what we can do is submit
another request. And then, perhaps, a few more,
all of which will be pending and queued somewhere
in the guts of the OS.

This ensures that once the OS is done with one request,
it immediately has another one to process.
This eliminates idling when reading/writing data from
the storage device, so we have data flowing through
the file stack continuously.
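
Keeping several requests in flight might look like this
with the same aio_* API; the queue depth of 4 and the
chunk size are arbitrary choices for the example:

    #include <aio.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define DEPTH 4                /* requests kept in flight */
    #define CHUNK (64 * 1024)

    int main(void)
    {
        static char bufs[DEPTH][CHUNK];
        struct aiocb cbs[DEPTH];
        int fd = open("data.bin", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        /* queue DEPTH reads, each for its own slice of the file */
        for (int i = 0; i < DEPTH; i++) {
            memset(&cbs[i], 0, sizeof cbs[i]);
            cbs[i].aio_fildes = fd;
            cbs[i].aio_buf    = bufs[i];
            cbs[i].aio_nbytes = CHUNK;
            cbs[i].aio_offset = (off_t)i * CHUNK;
            if (aio_read(&cbs[i]) < 0) { perror("aio_read"); return 1; }
        }

        /* the OS always has the next request at hand; reap in order */
        for (int i = 0; i < DEPTH; i++) {
            while (aio_error(&cbs[i]) == EINPROGRESS)
                ;                      /* or do something useful */
            printf("chunk %d: %zd bytes\n", i, aio_return(&cbs[i]));
        }

        close(fd);
        return 0;
    }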
Knowing when to stop
It may seem that if we just throw a boatload
of requests at the OS, it should allow us to
go through a file as quickly as possible.
However, there's really no point in having
too many requests in the queue, because past a
certain depth it doesn't make processing any faster.
All we need is to make sure the request queue
is never empty, so if we can achieve that with as
few requests as possible, we'll get the fastest
processing rate at the lowest memory usage.
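
The steady state then looks something like the sketch
below: a small fixed number of requests in flight, with
each slot resubmitted the moment it completes, so the
queue never runs dry. Again, the depth, chunk size and
file name are assumptions made for the example:

    #include <aio.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define DEPTH 4         /* just enough to keep the queue busy */
    #define CHUNK (64 * 1024)

    int main(void)
    {
        static char bufs[DEPTH][CHUNK];
        struct aiocb cbs[DEPTH];
        int retired[DEPTH] = { 0 };
        off_t next = 0;     /* next file offset to request */
        int live = DEPTH;
        int fd = open("data.bin", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        /* prime the queue with DEPTH requests */
        for (int i = 0; i < DEPTH; i++) {
            memset(&cbs[i], 0, sizeof cbs[i]);
            cbs[i].aio_fildes = fd;
            cbs[i].aio_buf    = bufs[i];
            cbs[i].aio_nbytes = CHUNK;
            cbs[i].aio_offset = next;
            next += CHUNK;
            if (aio_read(&cbs[i]) < 0) { perror("aio_read"); return 1; }
        }

        /* steady state: process each completed request and refill
           its slot right away, so the queue never goes empty */
        while (live > 0) {
            for (int i = 0; i < DEPTH; i++) {
                if (retired[i] || aio_error(&cbs[i]) == EINPROGRESS)
                    continue;
                ssize_t n = aio_return(&cbs[i]);
                if (n <= 0) {          /* end of file or error */
                    retired[i] = 1;
                    live--;
                    continue;
                }
                /* ... process n bytes from bufs[i] ... */
                cbs[i].aio_offset = next;
                next += CHUNK;
                if (aio_read(&cbs[i]) < 0) {
                    retired[i] = 1;
                    live--;
                }
            }
        }

        close(fd);
        return 0;
    }

A real reader would block in aio_suspend() instead of
spinning on aio_error(), and would also handle short
reads, but the shape of the loop is the point: the
smallest queue depth that keeps the device busy.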