Version 0.2, a new version of par2unpack.sh

Tuesday, 31 October 2006

I’ve now started rewriting the original script, since the original was unable to deal with cases where only .volXX+YY.PAR2 files are available to repair a set of files. For now, this version only repairs the files in a single directory. It only unpacks a file if the filename without its suffix has *changed*, which avoids unpacking the same file over and over again. For example, suppose a directory contains the following files:
file1.vol01+00.PAR2
file1.vol01+08.PAR2
file1.vol02+13.PAR2
file2.par2
file2.vol01+77.PAR2

Without any kind of string comparison, newsunpack would unpack file1 three times, and file2 twice! The script I am writing unpacks file1.vol01+00.PAR2, skips all the other names starting with file1, and then unpacks file2.par2.
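To make the comparison concrete, here is a minimal sketch of just the name handling, run against the five example names instead of a real directory. The helper names par2_stem and first_per_stem are made up for this illustration and are not part of the script or of newsunpack.

```shell
#!/bin/bash

# par2_stem (hypothetical helper): print a par2 file name with its
# ".par2" or ".volXX+YY.par2" suffix removed, ignoring letter case.
par2_stem() {
    local name=$1
    name=${name%.[pP][aA][rR]2}                  # drop the trailing .par2 / .PAR2
    name=${name%.[vV][oO][lL][0-9]*[-+][0-9]*}   # drop .volXX+YY if it is there
    printf '%s\n' "$name"
}

# first_per_stem (hypothetical helper): print only the first name seen
# for each stem. Assumes names with the same stem arrive next to each
# other, as they do in sorted ls output.
first_per_stem() {
    local previous="" stem file
    for file in "$@"; do
        stem=$(par2_stem "$file")
        if [ "$stem" != "$previous" ]; then
            printf '%s\n' "$file"
            previous=$stem
        fi
    done
}

first_per_stem file1.vol01+00.PAR2 file1.vol01+08.PAR2 \
    file1.vol02+13.PAR2 file2.par2 file2.vol01+77.PAR2
# prints:
#   file1.vol01+00.PAR2
#   file2.par2
```

Only the two printed names would be handed to newsunpack, which is the behaviour described above.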
Here is what I have so far:


#!/bin/bash

# Unpack every par2 set in the current directory whose base name differs
# from the previous one. $1 is the base name that was already unpacked.
unpack() {
    oldfile=$1
    ls *[pP][aA][rR]2 2> /dev/null | while read -r parfile ; do
        # Strip ".volXX+YY.par2" or plain ".par2" to get the base name.
        if echo "$parfile" | grep -q '\.vol[0-9]\+[+-][0-9]\+\.[pP][aA][rR]2$'
        then
            newfile=${parfile%.vol*[-+]*.[pP][aA][rR]2}
        else
            newfile=${parfile%.[pP][aA][rR]2}
        fi

        # Same base name as the previous file: already unpacked, so skip it.
        if [ "$newfile" == "$oldfile" ]
        then
            continue
        else
            /usr/local/bin/newsunpack.py "$parfile"
            oldfile=$newfile
        fi
    done
    return 0
}

if [ $# -eq 1 ] && [ -d "$1" ]
then
    pushd "$1" > /dev/null 2>&1
    ls *[pP][aA][rR]2 2> /dev/null | while read -r parfile ; do
        if echo "$parfile" | grep -q '\.vol[0-9]\+[+-][0-9]\+\.[pP][aA][rR]2$'
        then
            firstfile=${parfile%.vol*[-+]*.[pP][aA][rR]2}
        else
            firstfile=${parfile%.[pP][aA][rR]2}
        fi
        /usr/local/bin/newsunpack.py "$parfile"
        unpack "$firstfile"
        break
    done
    # popd runs here, outside the pipeline, so it affects this shell.
    popd > /dev/null 2>&1
    exit 0
fi

Notice the break statement near the end. I’m afraid that is because I have not been able to find a way to read just the first par2 file in a directory. So I am using a while loop, and then simply breaking out of it at the end so it does not do any looping at all. It’s not pretty, but it works!
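One possible way around the while/break trick is to let the shell expand the glob into the positional parameters and take the first one. This is only a sketch: first_par2 is a made-up helper name, and it relies on the glob expanding in the same sorted order that ls would print.

```shell
#!/bin/bash

# first_par2 (hypothetical helper): print the name of the first par2
# file in the given directory, or return 1 if there is none.
first_par2() {
    local dir=$1
    set -- "$dir"/*[pP][aA][rR]2   # positional parameters = matching files
    [ -e "$1" ] || return 1        # an unmatched glob stays literal, so test it
    printf '%s\n' "${1##*/}"       # strip the directory part
}
```

With that in place, something like `firstpar=$(first_par2 "$1")` could replace the loop, followed by a single call to newsunpack.py on "$1/$firstpar" if the command succeeded.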

I apologize if it looks like gobbledygook. I am not using those <pre> tags anymore, as I have found that they make the code run all the way off the page and end up somewhere in China. And I just don’t know of any better way to post code examples in my posts, sorry :(


Version 0.1 of the script for automatic par2 repair and unpacking

Sunday, 29 October 2006

I’ve decided to call it “par2unpack.sh”, a contraction of “par2verify” and “newsunpack”. I’m not going to post the whole script here, just the two methods for normal directory unpacking and recursive unpacking.
If anyone wants to try it in a shell script, be my guest, but you do so at your own risk! If it deletes every single file in your home directory, or causes your smelly socks to knock out small children and Republicans under the age of five, then that is *your* problem ;)

All you need to do is open a text editor, put #!/bin/bash at the top, and copy and paste the code below. You obviously also need newsunpack.py in the directory /usr/local/bin for it to work. Then save the file and make it executable. That would be a quick way to try it. I’m still working on the exception handling side (e.g. what to do when the wrong number or type of arguments is provided). And I’m still learning, so here goes nothing:


# If one argument is passed, and it is a directory, then unpack
# using the .par2/.PAR2 files. TODO: currently does not support
# automatic unpacking if you *only* have parity archives with
# .volXX+YY.par2 extensions. Use manual unpacking instead.
if [ $# -eq 1 ] && [ -d "$1" ]
then
    ls "$1"/*{.par2,.PAR2} 2> /dev/null | while read -r \
    parfile;
    do
        # Grep matches any file ending in .volX+Y.par2 or .volX+Y.PAR2,
        # where X and Y are a series of digits, so those files are
        # skipped by the loop. The pattern stays on one line: a
        # backslash inside single quotes does not continue the line.
        if echo "$parfile" | \
        grep -q '\.vol[0-9]\+[+-][0-9]\+\.[pP][aA][rR]2$'
        then
            continue
        fi

        if [[ -f $parfile ]]
        then
            /usr/local/bin/newsunpack.py -d "$parfile"
        fi
    done
    exit 0
fi

# If the -r option is used, the script recursively unpacks in
# subdirectories too, using find. But as before, it skips the volumes
# with recovery slices (the files with ".volXX+YY" in their
# extensions).
if [ $# -eq 2 ] && [ "$1" == "-r" ] && [ -d "$2" ]
then
    # Read directory names line by line so names with spaces survive.
    find "$2" -type d | while read -r dir
    do
        ls "$dir"/*[pP][aA][rR]2 2> /dev/null | while read -r parfile;
        do
            if echo "$parfile" | \
            grep -q '\.vol[0-9]\+[+-][0-9]\+\.[pP][aA][rR]2$'
            then
                continue
            fi
            if [[ -f $parfile ]]
            then
                /usr/local/bin/newsunpack.py -d "$parfile"
            fi
        done
    done
    exit 0
fi

[EDIT]
It seems I just cannot post code on this blog or something. Using <pre> tags makes it possible for code to extend all the way off the text area, and onto the links and everything, on the right. Which, needless to say, is extremely annoying. I’ll have to do some searching on the web to find a solution for this major inconvenience.
[/EDIT]
[EDIT2]
I’m now using backslashes (“\”) to make bash continue on the next line. For now it seems to be the best way to fit the code on the page, and the code should still work.
[/EDIT2]
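As a possible alternative to the ls and grep combination in the recursive block, find can do the filtering itself. This is a sketch under a few assumptions: list_main_par2 and unpack_tree are made-up names, -iname is a GNU/BSD find extension rather than POSIX, and the loop only echoes the command so that nothing is unpacked or deleted while trying it out.

```shell
#!/bin/bash

# list_main_par2 (hypothetical helper): print every par2 file below the
# given directory, leaving out the .volXX+YY.par2 recovery volumes.
list_main_par2() {
    find "$1" -type f -iname '*.par2' \
        ! -iname '*.vol[0-9]*[+-][0-9]*.par2'
}

# unpack_tree (hypothetical helper): show the command that would be run
# for each file. IFS= and -r keep spaces and backslashes in names
# intact; names containing newlines would still break this.
unpack_tree() {
    list_main_par2 "$1" | while IFS= read -r parfile; do
        # Remove the echo once the printed list looks right.
        echo /usr/local/bin/newsunpack.py -d "$parfile"
    done
}
```

Calling `unpack_tree "$2"` would then stand in for the nested loops, with no separate pass per directory.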


The first part of the script for automatic par2 repair and rar unpacking

Friday, 27 October 2006

Well, I’ve worked my way through some bits of the “Advanced Bash-Scripting Guide”, and written the first part of the script. When writing it, I experienced much difficulty making it match only files ending in .par2 (e.g. “file.par2”) and not files ending in .volXX+YY.par2 (e.g. “file.vol00+13.par2”). So I used grep. My style is probably not very good, and there is undoubtedly both much to improve and much to add. Hopefully I’ll learn from my mistakes.
Anyway, for the interested, here is the most important bit I’ve done so far. It recursively finds all files ending in .par2, and then repairs/unpacks the binary. But it skips files with a .volXX+YY.par2 suffix, which is necessary if you do not invoke newsunpack.py with the -d option.

And as you can see, I am now able to make WordPress respect my indentations by using <pre> tags. Yay!


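if [ $# -eq 2 ] && [ "$1" == "-r" ] && [ -d "$2" ]
then
    # Walk every directory below $2, reading names line by line so that
    # directory names containing spaces survive.
    find "$2" -type d | while read -r dir
    do
        ls "$dir"/*par2 | while read -r parfile
        do
            # Skip the recovery volumes (.volXX+YY.PAR2 / .volXX+YY.par2).
            if echo "$parfile" | grep -q '\.vol[0-9]\+[+][0-9]\+\.PAR2$' || \
               echo "$parfile" | grep -q '\.vol[0-9]\+[+][0-9]\+\.par2$'
            then
                continue
            fi

            /usr/local/bin/newsunpack.py "$parfile"
        done
    done
    exit 0
fi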
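The two grep calls above could probably be replaced by the shell’s own pattern matching. Here is a minimal sketch using a case statement; is_volume_file is a made-up helper name, and the file names in the loop are only examples.

```shell
#!/bin/bash

# is_volume_file (hypothetical helper): succeed if the name ends in
# .volXX+YY.par2 (any letter case for "par2"), fail otherwise.
is_volume_file() {
    case $1 in
        *.vol[0-9]*+[0-9]*.[pP][aA][rR]2) return 0 ;;
        *) return 1 ;;
    esac
}

for name in file.par2 file.vol00+13.par2 FILE.vol03+04.PAR2; do
    if is_volume_file "$name"; then
        echo "skip   $name"
    else
        echo "unpack $name"
    fi
done
# prints:
#   unpack file.par2
#   skip   file.vol00+13.par2
#   skip   FILE.vol03+04.PAR2
```

Inside the loop of the script, `if is_volume_file "$parfile"; then continue; fi` would do the same job as the grep test without starting extra processes.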


Newsunpack.py: the solution to the automatic par2 verifying and rar unpacking problem?

Wednesday, 25 October 2006

Finally, I have found a possible solution! Whilst searching on Google, yet again, for a program that will automatically verify par2 files and extract rar files, I came across a program called “newsunpack”. It is written by someone named “Freddie”, who is clearly a highly competent Python coder; have a look at the website “MadCowDisease”.
His program makes something like this possible, which I have not been able to do with hellanzb:


~$ ls ./*.par2 | while read par ; \
do /usr/local/bin/newsunpack.py -d "$par" ; done

Or something along those lines; this is just an example, and I have not tested this command. But if newsunpack.py is in the specified path and made executable, then I *think* that should verify/fix and extract all rar files with par2 recovery blocks (in the directory in which it is executed) and delete all the rar/par2 files afterwards.
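Since the command is untested, a dry run that only prints what it would do may be a safer first step. This is a sketch, not something taken from newsunpack itself: dry_run_unpack is a made-up helper name, and the path and the -d option are simply copied from the command above.

```shell
#!/bin/bash

# dry_run_unpack (hypothetical helper): print the newsunpack.py command
# for each .par2 file in the given directory instead of running it.
dry_run_unpack() {
    local dir=${1:-.} par
    for par in "$dir"/*.par2; do
        [ -e "$par" ] || continue   # no match: the glob stays literal
        printf '%s -d "%s"\n' /usr/local/bin/newsunpack.py "$par"
    done
}
```

Running `dry_run_unpack .` in a download directory lists one command per .par2 file, and nothing is verified, extracted or deleted until the printed commands are actually run.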

In any case, this forms an excellent basis for a script so simple, even an idiot like myself will probably be able to write it without too many awful mistakes. Perhaps, if I manage to write it, I will post it here, for your entertainment. ;)


The problem with hellanzb and local processing

Wednesday, 25 October 2006

Well, in spite of my attempt to use hellanzb in a script to automate par2 verifying and unrar-ing of my usenet downloads, I have still found an insurmountable problem. Of course, hellanzb was never designed with this particular use in mind, so I suppose I should not be too surprised.

The problem is defined as follows. Whenever I invoke hellanzb on a directory to process, containing many different downloads, it will verify all the par2 recovery blocks just fine, as long as they are complete. If, however, it finds a problem with one of the par2 recovery blocks, then it just stops right there and does not move on. It also does not extract the downloads that it just verified as having complete data integrity. Not good.

It seems that hellanzb, being designed for directories containing only one download at a time, is also treating directories with many downloads as if it is dealing with a single download! So when it finds a broken par2 block, then why indeed move on? That does seem logical. So for the kind of local processing I have in mind, hellanzb would really need a new and improved method.

In the meantime, I am occasionally reading from the “Advanced Bash-Scripting Guide”, as I try to write a shell script that does a better job. And searching Google of course, to make sure I’m not reinventing a wheel or anything. But so far, no luck.