Just updated the script on the first post. Everything seems to be working great. Let me know if there is anything else you want to see in the new log file. /var/tmp/copyscript.log
Dusty
hey Dusty
great work on the script.
if you want a more accurate % monitor and a progress bar then check out gooey gadgets http://sibr.com/blog/?p=104
and use du -ck to get a total of the bytes you want to transfer across
then every 2 seconds do du -ck on your destination (to see how many bytes are in at the moment)
and then make a percentage out of $copy_size/$expected_total_size*100
you will have to do bash math with bc and printf to get nice round numbers, but those values go into gooey gadgets to make your progress bar. cool huh?
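A minimal sketch of the percentage step described above, for illustration only. The function names total_kb and percent_done are made up here, and awk stands in for the bc and printf arithmetic so the sketch needs no extra tools. The gooey gadgets call itself is left out.

```shell
# Hypothetical helpers, not taken from either script in this thread.

# Total kilobytes under a path: the last line of `du -ck` is the grand total.
total_kb() {
    du -ck "$1" | tail -n 1 | awk '{print $1}'
}

# Whole-number percentage of copied vs expected size.
# Prints 0 when the expected size is 0 to avoid dividing by zero.
percent_done() {
    awk -v c="$1" -v t="$2" \
        'BEGIN { if (t <= 0) print 0; else printf "%.0f\n", c / t * 100 }'
}

# Possible polling loop (paths are placeholders):
#   expected=$(total_kb "/Volumes/SOURCE")
#   while copy_is_running; do          # copy_is_running is hypothetical
#       percent_done "$(total_kb "/Volumes/DEST/copy")" "$expected"
#       sleep 2
#   done
```

The number printed by percent_done is what would be handed to the progress bar every 2 seconds.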
i've done all this in my software (alexicc) and it is the first time i've got a progress bar to actually work from a bash script.
gooey gadgets will need to be on the machine you run it on but I make that happen by putting the binaries i need inside the app folder then use the path to those binaries in the shell script app.
it all sounds more complicated than it is !!!
cheers
jamie
Jamie,
Thanks. I will have to look into that. Sounds like it does the same thing I am doing: du on source and destination and some bc math.
The problem with the du approach is different file systems have different sizes for some things. That is why I had to make the script stop counting at 90%. Had a couple times with SxS cards and CF cards where the destination copy never got over 95% of the source and my percent complete loop never stopped.
I am still interested in your script for transcoding Alexa footage with a LUT.
The next thing I want to work on is something that syncs sound and video based on timecode and a script that pulls scene and take metadata out of sound and transcodes dailies with scene and take in the filename.
Dusty
Dustin,
Any chance you can change the order of operations slightly so that it makes the first copy, then checksums the first copy, then makes all other copies and does other checksums? That way when the source footage has been copied and checksummed it can be ejected before any other copies are made.
I just used your script on a 3 week, 3 camera job, and mags start stacking up pretty quickly so it would be nice if you could eject the cards/drives as soon as possible to start the next copy.
Also, I've had about 8 instances of the original script open at once, so I don't know what you changed to make that possible, but it seemed to be working fine for me before.
Also, it would be nice if when you chose a source folder, it would check for checksum text files from previous copy operations that would be in the same parent folder of the source folder, so that you could use that for comparisons instead of checksumming the source again, which would save time for additional copies, for example when post makes copies for visual f/x.
All in all, it already works great, and I'm thankful that you've spent so much time making it even better.
Tim
Tim,
Thanks for the feedback. Did you use the version that tells you percent complete or the old version?
Right now the script does the source checksum and second copy in parallel. I normally have my source media and destination media on different buses, so they both run at max speed. The checksum is usually faster than the second copies, so as soon as it says the source checksums are done you could eject the source drive while it is still doing the extra copies. I could have the script eject the source media as soon as it finishes the source checksum. Would that work for you?
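For anyone following along, here is a rough sketch of that ordering, not the actual script from the first post. It assumes the second copy reads from the first copy rather than from the source, and it uses POSIX cksum as a stand-in for md5. The function name and all arguments are hypothetical.

```shell
# Hypothetical sketch: first copy, then source checksum and second copy in parallel.
# Keep the checksum file ($4) outside the source folder so find does not pick it up.
copy_then_parallel() {
    src=$1; first=$2; second=$3; sums=$4

    mkdir -p "$first" "$second" || return 1

    # Step 1: make the first copy on its own.
    cp -R "$src/." "$first/" || return 1

    # Step 2: checksum the source in the background...
    ( cd "$src" && find . -type f -exec cksum {} + ) > "$sums" &
    sum_pid=$!

    # ...while the second copy is made from the first copy.
    cp -R "$first/." "$second/" &
    copy_pid=$!

    # The source is no longer needed once its checksums are written.
    wait "$sum_pid" || return 1
    echo "source checksums done: source media can be ejected"

    wait "$copy_pid" || return 1
    echo "second copy done"
}
```

Called as copy_then_parallel SOURCE FIRST_DEST SECOND_DEST SUMS_FILE, the first message appears as soon as the source checksums finish, even if the second copy is still running.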
The change to allow multiple instances was because of the percent complete temp files and the new log it is writing. The percent complete temp files all had the same name, so multiple instances would conflict with each other. Also I wanted things to stay organized in the log file so one drive's log is all together and you don't have a copy line from source A mixed with a checksum line from source B. Check out the new log and let me know what you think. /var/tmp/copyscript.log
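One way this kind of conflict can be avoided, sketched under assumptions: each instance gets its own scratch file from mktemp and only appends to the shared log when it is finished. The function names and the copyscript.XXXXXX template are invented for illustration and are not what the script actually uses.

```shell
# Hypothetical sketch of per-instance temp files and a grouped log.

# Create a uniquely named scratch file so parallel instances never collide.
new_scratch_file() {
    mktemp "${TMPDIR:-/tmp}/copyscript.XXXXXX"
}

# Append one instance's buffered lines to the shared log in one go,
# then remove the scratch file. This keeps a drive's lines together,
# but it is not a strict lock.
flush_to_log() {
    cat "$1" >> "$2" && rm -f "$1"
}
```

Each instance would write its copy and checksum lines to its own scratch file and call flush_to_log once at the end.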
I will look into making the script a little smarter about checksums.
p.s. - that was you I ran into at the 3cP booth with Scott Mason right?
Dusty
good call on the du. I'd forgotten about copying to ntfs and other systems! that's me being mr Mac only in my house!
my workaround(s) for that are
ls -l reports the same values on different filesystems as it doesn't bother with block sizes but i think that only works on FILES not folders
so to get the size of all the movs in a folder i do this
find /Users/jamie/Desktop/movz -iname \*.mov -exec ls -l {} \;|awk '{print $5}'| awk '{ sum+=$1} END {print sum}'
find is the command then the folder you're interested in then the name of the things (i've chosen any .mov file here) then the command to EXECcute on each thing found.
all that mess at the end takes the 5th column of ls -l output (which is the file size in bytes on my os) and then adds all the columns to make a total byte size
hopefully irrespective of Block size which varies with filesystem like you said.
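The same idea wrapped in a small function, as a sketch only. total_bytes is a made-up name, the second argument is an optional name pattern, and it assumes the 5th column of ls -l is the size in bytes (as above). Using + instead of \; hands many files to each ls call, and a single awk does the summing.

```shell
# Hypothetical helper: sum the byte sizes of all regular files under a folder.
# $1 = folder, $2 = optional case-insensitive name pattern (default: every file).
total_bytes() {
    find "$1" -type f -iname "${2:-*}" -exec ls -l {} + |
        awk '{ sum += $5 } END { print sum + 0 }'
}

# Example calls (the path is a placeholder):
#   total_bytes /Users/jamie/Desktop/movz '*.mov'
#   total_bytes /Users/jamie/Desktop/movz
```

Because the total comes from file lengths rather than allocated blocks, source and destination should report the same number even on different filesystems.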
the other workaround is to use gnu du (google coreutils for os x) which has some options for ignoring blocksize as well
I think it's gdu -k --apparent-size or something like that
these 2 hacky workarounds get over the weirdo problem of block size overhead. Why is nothing simple any more eh? :-)
the alexicc thing is available as an app thingy from lightillusion.com
it's priced as it is cos steve shaw's icc profile service is NOT free and he wanted to share proceeds with other developers of Spaceman
it does seem to work and Job ter Burg and wouter have done over 1500 ish clips with it in their movie.
it has some issues (ie don't use it with prores422 log C !!!) and is not a pretty looking app but it gets people home quicker and that's why i made my bit of it!
all the best
jamie
Jamie,
I am 100% Mac too, but the media I have to deal with is primarily FAT32, and SxS cards are UDF. They both do things a little differently than HFS+.
I should change the du command to look at files only like I do with the md5 command. Something really simple like:
find -s * -type f
That way find ignores anything that is not a file (-type f). the only problem with that is I have to find all files, du to get their size, then add all the sizes to get the total. It was so much easier to change 99% to 90% and be done.
Dusty
Dustin, checksum and second copy in parallel is great, no need to auto eject. I was using the very first version before the percentage, so that's why I had multiple instances.
And yes, that was me at NAB. Next time you find yourself in LA we should hang out.
I'll look at the logs tomorrow.
I think this now might be the fastest way to copy data to multiple destinations when the destinations are different speeds.
Tim
Have not been on this list in a bit.
md5 does not compare files but uses an algorithm to calculate a unique "fingerprint" for the file. (see: http://en.wikipedia.org/wiki/Md5sum). The diff program just compares the files byte for byte. That is why diff is much faster (just a byte comparison vs the computer actually doing calculation work). If you are using GNU diff, that is highly tested and should be ok.
md5 is usually used when you are transporting data across a network and the user at the destination end will not be able to compare your data directly. In this case, since you have direct access to both copies, you don't necessarily need to take the time to do checksums, although they do provide a nice double verification.
(also have you looked at rsync?)
Richard,
I understand md5 and what diff does for text files, but these are binary files and I can't find anything that really says what diff does with binary files. My concern with diff was it was so fast it could not have read all the data. diff on a file seemed to be faster than the drive could read the file, so it is only reading parts of the file and that is a problem. We have to know that every bit is the same for what we do, so we must have checksums, therefore the diff was kinda pointless. I have access to the source media on set, but the guys I send the drives to do not have that access. Those checksums are how we verify the files they are working with six months from now are the exact same as what came off the camera.
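To make that workflow concrete, here is a rough sketch of writing a checksum list on set and checking a drive against it later. It is not the script from the first post. All three function names are hypothetical, and md5_of assumes either md5sum (GNU) or md5 -q (macOS) is installed.

```shell
# Hypothetical helpers for a checksum manifest; keep the manifest file
# outside the folder being checksummed.

# Print the MD5 of one file using whichever tool is present.
md5_of() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum "$1" | awk '{print $1}'
    else
        md5 -q "$1"
    fi
}

# Write "hash  ./relative/path" lines for every regular file under $1 into $2.
write_manifest() {
    ( cd "$1" && find . -type f | sort | while IFS= read -r f; do
          printf '%s  %s\n' "$(md5_of "$f")" "$f"
      done ) > "$2"
}

# Recompute each hash under $1 and compare with manifest $2.
# Returns non-zero on the first mismatch or missing file.
verify_manifest() {
    ( cd "$1" && while IFS= read -r line; do
          want=${line%%  *}     # text before the first double space
          f=${line#*  }         # text after the first double space
          [ "$(md5_of "$f")" = "$want" ] || { echo "MISMATCH: $f"; exit 1; }
      done ) < "$2"
}
```

The manifest travels with the drive, so whoever receives it can run verify_manifest without access to the camera media. When both copies are mounted at once, cmp -s fileA fileB does a plain byte-for-byte comparison on binary files.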
I did test with rsync, cpio, and every other way to copy files I could find on the internet. cp ended up being the fastest. There were other benefits to other ways to copy the files, but speed was what I wanted.
Dusty