The support forum

Parallel copying / the tech demo

Alex Pankratov :

Oct 04, 2016



As you might've heard, Bvckup 2 is getting support for copying several files in parallel during a backup - https://bvckup2.com/wip/26052015

Background


When a file is copied, there's a fixed time cost of opening and closing it, and then there's the cost of the actual copying, which is proportional to the file size. For larger files the opening/closing time is dwarfed by the copying time, but for smaller files the two are comparable, so if we can task the OS with opening/closing several files in parallel, we can speed things up.

* This is similar to the robocopy /mt option if you are familiar with the tool.
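
For the curious, here's a minimal sketch of the idea in Python (purely illustrative, nothing to do with the app's actual implementation; paths and thread count are made up): the per-file open/create/close overhead is overlapped across worker threads, while the data transfer itself still shares the same disk or network bandwidth.

    import shutil
    from concurrent.futures import ThreadPoolExecutor
    from pathlib import Path

    def copy_tree_parallel(src_dir: str, dst_dir: str, threads: int = 16) -> None:
        # Illustrative sketch of "N file copies in parallel"
        src, dst = Path(src_dir), Path(dst_dir)
        files = [p for p in src.rglob("*") if p.is_file()]
        # recreate the folder structure up front, then copy the files in parallel
        for f in files:
            (dst / f.relative_to(src)).parent.mkdir(parents=True, exist_ok=True)
        with ThreadPoolExecutor(max_workers=threads) as pool:
            futures = [pool.submit(shutil.copy2, f, dst / f.relative_to(src))
                       for f in files]
            for fut in futures:
                fut.result()  # re-raise any copy errors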

Tech demo


We now have a reasonably stable build that supports parallel copying and you are welcome to play with it if you'd like:

        https://bvckup2.com/files/bvckup2-setup-1.76.7.3.exe

Configuration


The app defaults to running:

    * 16 file copies in parallel if copying to or from a network share
    * 4 copies if copying between two local drives, and
    * 1 copy at a time if backing up within the confines of one drive

It will log the exact thread count in the Processing section of the log:

    2016.10.07 15:09:54.872 (UTC+1) 2 0 Running the backup ...
    2016.10.07 15:09:55.098 (UTC+1) 2 1     Processing ...
    2016.10.07 15:09:55.098 (UTC+1) 3 2         A total of 1602 steps
    2016.10.07 15:09:55.098 (UTC+1) 3 2         Copying threads: 16        <<<

It is possible to override the thread count by changing the following option in the job's settings.ini:

        conf.copying_threads  24

The default is 0, which is the 16-4-1 auto behavior described above.

! Make sure to exit the app before making any INI changes or your changes will be ignored and overwritten by the app.
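
For clarity, the auto behavior can be summarized like this (an illustrative sketch only, not the app's actual code; the real logic, as discussed later in this thread, also needs to treat mapped drives as remote):

    def copying_threads(conf_value: int, src_is_remote: bool,
                        dst_is_remote: bool, same_local_volume: bool) -> int:
        # Sketch of the 16-4-1 default used when conf.copying_threads is 0
        if conf_value != 0:
            return conf_value      # any non-zero value is an explicit override
        if src_is_remote or dst_is_remote:
            return 16              # copying to or from a network share
        if same_local_volume:
            return 1               # source and destination on the same drive
        return 4                   # two different local drives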

Testing


The effects of parallel copying are most pronounced when copying to or from a network share, and especially when going over a fast connection, e.g. a 1 Gbps wired link.

For local copies it should also have a noticeable effect, but again, the faster the drives, the more pronounced the effect should be. That is, don't expect any miracles with external USB 2.0 drives or SD cards.

If you give it a try, please share the results below. I'm very curious to see other people's numbers. Thanks!

Dino :

Oct 05, 2016

Works like a charm. The first run (I use alternating flash drives) I didn't notice any difference after installing the upgrade.

After closing bvckup2 I checked the settings.ini of the jobs involved and it had "conf.copying_threads 1" in there so I changed "1" to "8" manually and ran it again. Now I could see 8 files being processed in parallel and noted the decrease in backup time.

Will keep using it and will let you know if anything strange happens. Very cool, thanks for making it available :-)

MWorthington :

Oct 05, 2016

In reading "Configuration" late last night, I was somewhat confused. I thought Bvckup2 would determine "up to 8 file copies in parallel", while "thread count" pertained to something else (how many threads Bvckup2 uses anyway?). Seeing the "conf.copying_threads  16" convinced me that this setting was not the one to determine how many parallel paths there are.

Have I mis-read this? Is it indeed that variable we should adjust (presumably from 1 to 8, only) to see the effect?

Dino: how did you actually "see 8 files being processed in parallel"?

Sorry if I'm being a bit slow here :(

Ditto, a great addition to Bvckup2!

Alex Pankratov :

Oct 05, 2016

Argh! My bad, gentlemen!

There was an over-zealous configuration file check that "corrected" copying_threads being at 0 by setting it to 1.

I've pushed out an updated build that doesn't do that (i.e. happily accepts 0 from the INI) and it also changes copying_threads from 1 back to 0 if it detects an update from 76.7.1.

  =>  https://bvckup2.com/files/bvckup2-setup-1.76.7.2.exe

Is it indeed that variable we should adjust (presumably from 1 to 8, only) to see the effect?


Yes, it's the one.

MWorthington :

Oct 06, 2016

:)

MWorthington :

Oct 06, 2016

I get a "not found" message when I click that link.

The About box shows the version as 76.7; I guess it would be more informative if it showed it as 76.7.1, etc.

Another "Argh!" maybe .... if the 76.7.2 update changes copying_threads from 1 back to 0 if it detects an update from 76.7.1, what does it do about itself?! :)

Mark

Alex Pankratov :

Oct 06, 2016

Got too excited for an update, forgot to type .exe at the end :)

76.7.2 patches copying_threads only when it sees that the config was saved by an earlier version. That part is fine.

MWorthington :

Oct 06, 2016

Alex,

I need to ask for instructions!

I have a test folder on which I will run v76.7. By default, it will run in single-copy mode.

I have v76.7.2 as a separate install. I have re-created the backup settings, and after deleting the destination, will run the backup. With the default conf.copying_threads=0, I expect the same performance.

What should I change conf.copying_threads to next? Do I change it to 1 and expect "The app will default to running up to 8 file copies in parallel"?

Thanks

Alex Pankratov :

Oct 06, 2016

With the default conf.copying_threads=0, I expect the same performance.


Nope, zero stands for the "auto" mode = use 8 threads for cross-device copying and 1 thread for the same-device case.

Any non-zero value is the unconditional thread count override.

With 76.7.2 the default is zero, so you don't need to change anything and you should see better performance than with stock 76.7.

MWorthington :

Oct 06, 2016

Gotcha! Sorry for being slow ....

In my situation (2 separate installs), would I expect any speed reduction for the unlicensed version?

MWorthington :

Oct 06, 2016

Ignore that question ..... :)

MWorthington :

Oct 06, 2016

Not sure what to make of this .....

Source on local hard drive, destination on external USB drive. PC in active use (other than source folder), backup while working

Backup everything
Re-scan destination
Use delta copying

76.7
1st Backup to empty destination
19 min 54 secs with no errors
Read 15.08 MB, wrote 15.08 MB, throughput 13.16 MBps/ 13799783 bps

2nd Backup to empty destination
26 min 7 secs with no errors
Read 15.08 MB, wrote 15.08 MB, throughput 9.99 MBps/ 10470304 bps

76.7.2
conf.copying_threads   0

1st Backup to empty destination
29 min 23 secs with no errors
Read 15.08 MB, wrote 15.08 MB (no throughput data available)

2nd Backup to empty destination
22 min 32 secs with no errors
Read 15.08 MB, wrote 15.08 MB (no throughput data available)

Alex Pankratov :

Oct 06, 2016

The effects of parallel copying are most noticeable when shoveling around a lot of smaller files. For larger files, running several copies in parallel may actually have the opposite effect because of disk thrashing and the copies fighting for I/O bandwidth.

The second case for parallel copying is the "long fat IO pipe", e.g. gigabit links. Long = extra latency, fat = extra bandwidth. For these, the more operations we can have "in flight" at the same time, the better we'll utilize the bandwidth.
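
A back-of-the-envelope illustration of that point (all numbers made up): if opening plus closing a remote file costs a couple of milliseconds of round trips and a 100 KB file transfers in under a millisecond on a gigabit link, a single thread spends most of its time stalled, and only a handful of in-flight copies are needed to fill the pipe.

    # Toy model only - made-up latency and bandwidth figures
    link_mb_per_s = 110.0    # usable payload bandwidth of a ~1 Gbps link
    open_close_ms = 2.0      # per-file fixed cost (open + close round trips)
    file_kb       = 100.0    # typical small file

    transfer_ms = file_kb / 1024 / link_mb_per_s * 1000   # ~0.9 ms
    per_file_ms = open_close_ms + transfer_ms             # ~2.9 ms

    for threads in (1, 4, 16):
        files_per_s = threads * 1000 / per_file_ms
        mb_per_s = min(files_per_s * file_kb / 1024, link_mb_per_s)
        print(threads, "threads ->", round(mb_per_s, 1), "MB/s")
    # one thread manages ~34 MB/s; a few threads already saturate this toy link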

MWorthington :

Oct 06, 2016

Understood. The data I used consists of:

29706 files in total
23792 files less than 200 KB
3335 files between 200 and 1000 KB
2571 files between 1000 & 100,000 KB
8 files above 100,000 KB

I thought this would actually benefit from parallel copying.

MWorthington :

Oct 06, 2016

Would the results indicate that the 76.7.2 copy had somehow been done as if it were a same-device case?

Is there a way to check what is actually happening during the backup?

Alex Pankratov :

Oct 07, 2016

23792 files less than 200 KB


It really comes down to how fast the file opening/closing operations are in comparison to the time needed to read/write the data. I agree that in your case you should've seen some improvement from using parallel copying, but it looks like the app manages to fully saturate the IO capacity of your drive combination as is, with just one thread.

Is there a way to check what is actually happening during the backup?


When a backup is running, the UI will show a list of all currently executing steps, like this: https://bvckup2.com/support/data/parallel-copying-ui.png

You should see more than one number there if parallel copying is active. If it's just a single-threaded run, then there'll be just one step listed there at any time.

MWorthington :

Oct 07, 2016

Alex,

Firstly, let me say that considering what it's doing, and comparing it to my previous software, Retrospect, I am more than satisfied at how quick, solid and accurate Bvckup is :)

I agree with your first paragraph. But carrying on with the investigation ... I am familiar with the UI & executing steps, and I checked that while doing a previous backup. I saw nothing different ... of course, I should have seen something, but I didn't know that at the time. Repeating the process:

I double checked, settings.ini has conf.copying_threads = 0
Backup to empty destination,
UI shows conventional one step at a time

Changed destination to a network drive
UI shows conventional one step at a time

So that's clear, parallel copying is not active.

What do you advise? It seems I'm the only one feeding back at the mo. I asked Dino how he saw "8 files being processed in parallel", but haven't heard back.

Mark

Alex Pankratov :

Oct 07, 2016

Stand by for an update...

Alex Pankratov :

Oct 07, 2016

Ok, the update's out:

  =>  https://bvckup2.com/files/bvckup2-setup-1.76.7.3.exe

The first change is that it now logs the exact thread count in the Processing section of the log:

    2016.10.07 15:09:54.872 (UTC+1) 2 0 Running the backup ...
    2016.10.07 15:09:55.098 (UTC+1) 2 1     Processing ...
    2016.10.07 15:09:55.098 (UTC+1) 3 2         A total of 1602 steps
    2016.10.07 15:09:55.098 (UTC+1) 3 2         Copying threads: 16        <<<

The second change is that it now correctly treats mapped drives as remote locations and applies the over-the-network defaults.

Third, the defaults have changed - 16 threads for over-the-network backups, 4 threads for backups between two different local drives and 1 if using the same local drive.

It's still, of course, possible to override the count with the copying_threads variable in settings.ini, and I would actually encourage you to experiment with it, especially for network backups. The 16 might be a bit too aggressive in some cases or too conservative in others.

MWorthington :

Oct 07, 2016

Good stuff!

I'm back home now, so will have to recreate the test albeit on different hardware. However, I think the change will be immediately noticeable :)

Dino :

Oct 07, 2016

@MWorthington - sorry for the late reply. I saw it just as shown in the screenshot Alex Pankratov posted above, and in the log window, where you can see file 15 being done before file 10, for example.

MWorthington :

Oct 07, 2016

Thanks Dino! Yes, that was what I was missing; I should have realized that parallel copying would be visible in the GUI ....

MWorthington :

Oct 08, 2016

Alex,

At home, running on an older system, I’ve run some tests. Continuing with the large data set mentioned above:

76.7
Therefore, Copying threads = 1
Backup size:   15.00 GB / 29706 files / 3628 folders

Source on a local SATA hard drive, destination (empty) on a local IDE hard drive.
Duration:      00:27:21.266

Source on a local SATA hard drive, destination (empty) on a flash memory stick
Duration:      00:36:42.000

76.7.3
conf.copying_threads=0
Copying threads = 4

Source on a local SATA hard drive, destination (empty) on a local IDE hard drive.
Duration:      00:32:55.469

Source on a local SATA hard drive, destination (empty) on a flash memory stick
Duration:      00:39:48.062

OK, so moved on to a different scenario, and just used hard drives:

2551 files in total
1722 files less than 200 KB
379 files between 200 and 1000 KB
499 files between 1000 & 100,000 KB
1 file above 100,000 KB

76.7.3
Source on a local SATA hard drive, destination (empty) on a local IDE hard drive.
Backup size:   2.83 GB / 2551 files / 500 folders

conf.copying_threads=1
Copying threads = 1
Duration:      00:02:51.781

conf.copying_threads=0
Copying threads = 4
Duration:      00:04:53.329

conf.copying_threads=8
Copying threads = 8
Duration:      00:04:18.703

Now with a photo & video scenario:

48 files in total
4 files less than 200 KB
5 files between 200 and 1000 KB
48 files between 1000 & 100,000 KB
8 files above 100,000 KB

conf.copying_threads=1
Copying threads = 1
Duration:      00:01:08.063

conf.copying_threads=0
Copying threads = 4
Duration:      00:03:30.875

conf.copying_threads=8
Copying threads = 8
Duration:      00:05:30.359

My conclusion is that for my system and my type of backups, parallel copying is not good :(

What’s more, my backups are generally incremental anyway, ie daily updating a small number of changes, and Bvckup 2 is blindingly fast as it is!

However, I am aware that network backups of large data sets are a different matter entirely.

I have a NAS here, but it's not in daily use yet and won't, I think, add any value to what you already know about network performance. I can check v76.7.3 on my network at work, on Monday, for interest.

It is obviously hardware sensitive, as well as fundamentally dependent on the particular backup one is considering (my daily work folders are quite different to my photo & video folders). Might it be worth adding this option to the settings for each backup job (with the default set to single-copy mode)?

You said “This feature is going to be available under a Pro license only. More on this to follow”. What about Personal Use licences?

Hope all this helps!

Mark

MWorthington :

Oct 08, 2016

By the way, once I've finished experimenting, how can I get back to v76.7?

Alex Pankratov :

Oct 10, 2016

Interesting numbers, thanks, Mark.

Might it be worth adding this option to the settings for each backup job (with the default set to single-copy mode)?


Yep, this is coming shortly.

You said “This feature is going to be available under a Pro license only. More on this to follow”. What about Personal Use licences?


This has to do with upcoming licensing changes. We will be retiring (but still fully supporting) existing Personal and Professional licenses and replacing them with Basic, Pro and Server licenses. Basic will cover the core functions needed for replicating A to B, but not some advanced options; Pro will include everything; and Server will be required for Windows Server installations.

Some details were in the last newsletter that went out and I will post an updated version of the same once we are close to flipping the switch. This should be within a month or two from now.

The bottom line is that existing licenses will continue functioning as they do now, with Personal being functionally equivalent to the new Pro.

By the way, once I've finished experimenting, how can I get back to v76.7?


Just grab the installer from https://bvckup2.com/get and run it.

guybor :

Oct 10, 2016

"Some details were in the last newsletter that went out"

How do I get on this mailing list?  I don't recall seeing one.  (it may be in my spam)

What email address does it come from, so I can make sure it is whitelisted?

Thank you.
Guy

Alex Pankratov :

Oct 10, 2016

How do I get on this mailing list?


With the "Newsletter" link in the footer of this page, or through the "Subscribe to the updates" link in the purchase receipt.

The newsletter was using the support@pipemetrics.com address, but will switch to news@... starting with the next issue.

MWorthington :

Oct 10, 2016

Alex,

Thanks for the feedback and the v76.7 file.

Ref. previous testing at work, 15 GB, ~30,000 files, new backups today:

Source on a local hard drive, destination (empty) on a local USB 2.0 hard drive.
76.7
19 min 1 secs
76.7.3
conf.copying_threads=0
Copying threads = 4
21 min 58 secs

Source on a local hard drive, destination (empty) on a network drive
76.7
42 min 41 secs
76.7.3
conf.copying_threads=0
Copying threads = 16
27 min 46 secs with no errors

Mark

FChwolka :

Oct 13, 2016

Hi all,
as I know robocopy with the /mt option, I used robocopy to back up the partition with a lot of small files for my retro systems. 80% of the files are less than 32 KB, and with 4934250 files at the moment it takes a long time to back up, handling only one file and then the next.

2016.10.12 22:07:10.038 (UTC+1) 2 2         Scanning source ...
2016.10.12 22:10:17.759 (UTC+1) 3 3             Completed in 3 min 7 sec
2016.10.12 22:10:17.759 (UTC+1) 3 3             412389 folders, 4934250 files, 2.74

The backup takes some hours and is currently running, as I have changed the backup server. With 32 concurrent files I see a network load of 60% on the 1 Gbit link, and that's a lot more than in the past.

For my application, the current configuration is almost perfect and I will continue to use it. Changing the INI file is not a problem; you just have to check that the value was actually taken. If the INI change could be made via the user interface, it would of course be easier ... and already there are new wishes ..

Best Regards

Fritz

Alex Pankratov :

Oct 13, 2016


Interesting. Thanks, Fritz.

From what I've seen, there always appears to be a thread count at which the I/O performance flattens out *and* the CPU usage starts to grow at the same time. So it makes sense to do a form of binary search for the optimal count value - try 1, 2, 4, 8, 16, 32. Once the performance stops improving, scale back to the middle of the last range, e.g. 24, and repeat.
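
In pseudo-Python, that search would look roughly like this (a sketch only - measure(n) stands for running an actual test backup with n copying threads and noting the MB/s it achieves):

    def find_best_thread_count(measure, tolerance=0.05):
        # measure(n) -> MB/s of a test backup run with n copying threads
        best_n, best_rate = 1, measure(1)
        n = 2
        # phase 1: keep doubling while throughput still improves noticeably
        while True:
            rate = measure(n)
            if rate <= best_rate * (1 + tolerance):
                break
            best_n, best_rate = n, rate
            n *= 2
        # phase 2: bisect between the last improvement and where it flattened,
        # e.g. between 16 and 32 the first probe lands on 24
        lo, hi = best_n, n
        while hi - lo > 1:
            mid = (lo + hi) // 2
            rate = measure(mid)
            if rate > best_rate:
                best_n, best_rate, lo = mid, rate, mid
            else:
                hi = mid
        return best_n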

Have you tried going above 32 or slightly below it?

Tweedal_D :

Nov 17, 2016

11/17/2016:  Using v.76.12, Personal Use.  Not seeing the conf.copying_threads= variable at all in the settings.ini file.  Can this still be adjusted in this version???

Alex Pankratov :

Nov 17, 2016

Parallel copying is not yet a part of the mainstream release. If you'd like to play with it, you will need to use the 76.7.3 build, which is our internal test build - https://bvckup2.com/files/bvckup2-setup-1.76.7.3.exe

There are a couple of loose ends (one being the UI support for this feature), which is why parallel copying hasn't been formally rolled out yet.

AaronP :

Nov 19, 2016

This is a *huge* performance improvement for backing up to a NAS (tons of mixed file types and sizes, with many small ones). Outstanding work.  Random side plug for the whole-task MB/s transfer average (or sliding window), since it'd make measuring the improvement even easier ;)

jlm111 :

Dec 21, 2016

Hi everyone, is there any indication of when this will be merged with the mainstream release?

Alex Pankratov :

Dec 26, 2016



I've been dragging my feet on merging this into the mainstream, because I don't think running multiple copies in parallel is the right way to improve bulk small-file throughput. There's basically a better option.

More specifically, I've been playing with a little side app that estimates the maximum possible read and write performance for a volume. It goes through a bunch of different IO buffer sizes and counts and also tries direct and buffered access. The output is the max read/write rate and an IO querying "recipe" that achieves it. However, this assumes that one file is read at a time and there's no competing (bulk) disk access.
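
I can't paste the side app itself, but the general shape of such a probe is roughly this (a simplified sketch; the real thing also measures writes and unbuffered access, which needs sector-aligned buffers and is left out here):

    import time

    def read_rate_mb_s(path: str, buf_size: int) -> float:
        # Sequential read throughput of one file with one buffer size
        total, start = 0, time.perf_counter()
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(buf_size)
                if not chunk:
                    break
                total += len(chunk)
        return total / (1024 * 1024) / (time.perf_counter() - start)

    def best_read_recipe(path: str):
        # Try a grid of buffer sizes, return (buffer_kb, MB/s) of the fastest
        rates = {kb: read_rate_mb_s(path, kb * 1024)
                 for kb in (64, 256, 1024, 4096, 16384)}
        best_kb = max(rates, key=rates.get)
        return best_kb, rates[best_kb]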

So point #1 is that, from an IO throughput perspective, it makes sense to copy files back to back rather than in parallel.

---

Secondly, the reason for low throughput when copying smaller files is that the per-file overhead (of opening and closing files) is of the same order of magnitude as the data transfer itself. But opening and closing are synchronous operations and they cause the copying pipeline to stall a little for every file.

Parallel copying (and "robocopy /mt") addresses this by copying each file on its own thread - this neatly gets files opened and closed more quickly, but it also causes the bulk IO to be done in parallel as well.

---

So, a proper "parallel" copying would parallelize the pre- and post-work for each file, but still process the _copying_ of the contents on a file-by-file basis.

In particular, this approach should work well regardless of the exact job composition - be it a lot of smaller files or a lot of really large ones.
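
Here's a rough sketch of that idea (Python pseudo-form, illustration only - the real implementation deals with Win32 handles and per-file attribute/timestamp work, not Python file objects): the opens run ahead on a small worker pool, the closes are handed back to the pool, and the bulk transfer stays strictly one file at a time. The lookahead value plays the same role as the thread count above.

    import shutil
    from concurrent.futures import ThreadPoolExecutor

    def copy_pipelined(pairs, lookahead=8):
        # pairs is a list of (src_path, dst_path); lookahead = files "in flight"
        def open_pair(src, dst):
            return open(src, "rb"), open(dst, "wb")      # the per-file "pre" work

        with ThreadPoolExecutor(max_workers=lookahead) as pool:
            pending = [pool.submit(open_pair, s, d) for s, d in pairs[:lookahead]]
            next_idx, closers = len(pending), []
            for i in range(len(pairs)):
                fsrc, fdst = pending[i].result()         # wait for this file's open
                if next_idx < len(pairs):                # keep the pipeline full
                    pending.append(pool.submit(open_pair, *pairs[next_idx]))
                    next_idx += 1
                shutil.copyfileobj(fsrc, fdst, 1 << 20)  # serial bulk copy
                # defer the "post" work so the next copy isn't stalled on it
                closers.append(pool.submit(lambda a=fsrc, b=fdst: (a.close(), b.close())))
            for c in closers:
                c.result()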

Makes sense?

MWorthington :

Dec 27, 2016

Alex, sounds eminently sensible :)

Let me know if you want me to repeat my tests ....

Happy (up-coming) New Year!

Mark

Dino :

Dec 30, 2016

What @MWorthington said :-)

Would it still be included in Bvckup 2, or would this new method mean it becomes a feature for Bvckup 3?

and Happy New Year to all.

Alex Pankratov :

Jan 04, 2017

Ok, thank you, gentlemen. Will make sure to take you up on your kind offer :)

iantls :

May 22, 2017

Hi Alex. Just trying out your tool and this feature on our backup. I have about 1.2m files, and am backing up to OneDrive via a WebDAV network share. This is basically your ideal case for multi-threading - gigabit speeds, but massive latency.

So far this is working a whole lot better than a bunch of other tools I tried, which just crashed and burned with this volume of files. However, one possible improvement: multi-threading folder creation. So far my backup has been running for 13 hours and it's created about 6400 folders. At that speed it will take roughly 15 days to create 160,000-odd folders, even before it starts to look at the files...

Alex Pankratov :

May 23, 2017

I hear you, iantls. This is already in the works, together with parallel deletion.
