Fooling Around With Ruby

First I must clarify that, fun as the title is for me, Ruby is not a woman. Ruby is a pretty nice language that I am just getting to know, and this post covers a solution using Ruby. There are at least two things you might take away from this post:

1. You can recover hard-deleted MP3 files from an NTFS mount in Linux with ntfsundelete, Banshee, and Ruby.

2. My music taste is a little dated.

On to the story...

I had just ripped my Carpenters CD to MP3 in Windows and had transferred it via Flash Drive to my Linux machine's Jump Drive. I hummed along to "For All We Know" and began deleting files from the Flash Drive that I didn't need now that they were on the Jump Drive (to conserve space, you know). "TheCarpenters" deleted, "NIN" deleted, "Steve Vai" deleted, "01 - HolodeckChoices" deleted, "ELO" deleted. I thought to myself, "HolodeckChoices has hundreds of files in it... why would I have that on a little flash drive?" I opened up the root of the Jump Drive and HolodeckChoices was gone. Imagine the horror, and then have it affirmed by the fact that the ONLY copy of some of these MP3 files was on that Jump Drive. Then compound it with the fact that I was using Shift+Delete... instant vaporization.

I flipped back and forth between the Jump Drive root and the Flash Drive root and realized the only folder lost was the HolodeckChoices folder, which contained a huge collection of tunes from trashed CDs, cassettes, and some downloads... needless to say, it didn't look good for reviving the music from the crypt. I Googled anyway, hoping something would come up that would let me restore deleted files from the NTFS volume on the Jump Drive... but in Linux. I ended up finding a little Linux utility called "ntfsundelete". Reading through the docs, it was exactly what the doctor ordered. As long as my machine had not been shut down and the disk had not been overwritten, ntfsundelete would recover those deleted files with their original paths and everything. AWESOME!!

I ran the utility on a couple of files with good results; they each came back with their original filenames. The tool would work, but it only runs one file at a time... no bulk mode. I've pretty much resolved to use Ruby scripts for my automation tasks these days, since it is my latest learning curve. I opened up my handy dandy gedit app and began wrapping a little Ruby, digging through books and googling the stuff I didn't know... this was going to be a really short script. I was reading through one of my online resources when I heard a loud beep come out of my laptop. I stared blankly at the screen wondering what I had done to get a beep and suddenly realized there was no power cord trailing from the computer... the battery was dead! I jumped up and ran to the cube next door for my power supply, and as I rounded the corner with it I saw the screen go blank. I sat down before the quiet, dark computer and became nearly lost in despair. Then I reminded myself that worse things had happened; I had had an entire disk grenade before. That made me feel better.

From this point forward I'm going to relive the thought processes that I went through to attempt my file recovery... leaving out the truly wasted research and failed experiments along the way...

After giving the ThinkPad juice and firing it back up, I did my ntfsundelete test again. This time, when I passed in the file number, the filenames came back as file numbers... file numbers assigned by the file system. This would be a setback, since the filenames are the only way to know what most of the files contained. (I'm not really vigilant about embedding the media information.) I went back to work on my Ruby script for a bulk recovery. First I needed to get a list of the files that ntfsundelete could give me a 100% recovery on:

rwheadonTPT60:/home/rwheadon # ntfsundelete /dev/sdb1 -f > deletedFiles.txt

This created a list of all the deleted files on my jumpdrive...



Inode    Flags  %age  Date        Size     Filename
---------------------------------------------------
16       F..!   0%    1969-12-31  0
17       FN..   0%    2008-06-10  2682     bootex.log
23       F..!   0%    1969-12-31  0
(...)
135380   FN..   100%  2008-02-23  2617344  <none>
135384   FN..   100%  2008-02-23  2277376  <none>
136089   D...   0%    2009-01-06  0        <none>
136090   FN..   100%  2000-09-29  3467389  <none>
136091   FN..   100%  2000-09-29  4528142  <none>

Now, to get only those with 100% recoverability, I just pasted the list into a spreadsheet and deleted all the lines that weren't 100%. I *should* have just written a Ruby script to go through the file and return the Inode for each line that had "100%" in it... but I took the (arguably) "easy" route. Once I was down to the lines with "100%" in them I just highlighted the first column, copied, and pasted into gedit. I then highlighted the list of numbers and replaced "\n" with ",", and I had the makings of an array. Put it all in a Ruby script...
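For the record, here is roughly what that Ruby script might have looked like. This is only a sketch: the method name recoverable_inodes is made up for illustration, and it assumes the whitespace-separated column layout shown in the listing above (inode, flags, percentage, date, size, filename).

```ruby
# Sketch: pull the inode numbers of fully recoverable files out of the
# listing saved from ntfsundelete. The column layout is an assumption
# based on the sample output; recoverable_inodes is a hypothetical name.
def recoverable_inodes(lines)
  lines.each_with_object([]) do |line, inodes|
    fields = line.split
    next unless fields.size >= 5                # inode, flags, %age, date, size
    next unless fields[0] =~ /\A\d+\z/          # skips header and separator lines
    next unless fields[1].start_with?("F")      # files only, not directories
    inodes << fields[0].to_i if fields[2] == "100%"
  end
end
```

Calling recoverable_inodes(File.readlines("deletedFiles.txt")) would hand back the array directly, with no spreadsheet and no gedit involved.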

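# This Ruby script calls the ntfsundelete command once per inode, along the lines of:
# ntfsundelete /dev/sdb1 -u -i 16-147232 -f -o {procNum}.recovered
# Timestamped log file name (built here, but not used further in this version)
theTime = Time.new.to_s
rtFileLog = theTime.gsub(/[:-]/,"").gsub(/\s/,"") + ".log"
# Inodes of the 100% recoverable files (the middle of the list is elided here)
listOfNumbers = [135380,135384,...,138652]
# system blocks until each ntfsundelete call has finished before starting the next
listOfNumbers.each{ |num| system("ntfsundelete /dev/sdb1 -u -i #{num} -f -o #{num}.mp3") }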

... and we are ready to do some bulk recovery. I must say that, coming from another VBish scripting language that only provides file operations through "shell", I was totally excited about Ruby's system command, which actually waits for a return from the system before moving on. With my handy little script I went ahead and did the recovery, and 273 files were undeleted from the jump drive into a folder on my HDD.

Now I needed to get all those files renamed, so I plugged in my Bose Buds and got ready to sample all those files through Banshee and do some renaming. I dragged the whole folder into my Banshee window and the "File System Queue" loaded up with the music files. Then I noticed something interesting... about 150 of the files had artist - song information!!! I exported the file system queue into a playlist and got the following type of data:
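Since system waits for the command and also reports how it went (true for a zero exit status, false for a non-zero one, nil if the command could not be run at all), the loop could also keep track of which inodes failed. The following is a sketch of that idea, not what I actually ran: recover_all and its runner parameter are names invented here so the loop can be tried without touching a real disk.

```ruby
# Sketch: run one ntfsundelete call per inode and return the inodes whose
# command did not succeed. recover_all and runner are hypothetical names;
# runner defaults to Kernel#system, which blocks until the command exits.
def recover_all(inodes, device, runner = ->(*cmd) { system(*cmd) })
  inodes.reject do |num|
    # Same flags as in the script above, passed as separate arguments so
    # no shell quoting is involved.
    runner.call("ntfsundelete", device, "-u", "-i", num.to_s,
                "-f", "-o", "#{num}.mp3")
  end
end
```

Something like failed = recover_all(listOfNumbers, "/dev/sdb1") would then leave a list of inodes worth a second look.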

[playlist]

File1=/home/rwheadon/recoveredFiles/136149.mp3

Title1=Unknown Artist - Unknown Title

Length1=199

(...)

File5=/home/rwheadon/recoveredFiles/136157.mp3

Title5=Ann Wilson - The Best Man in the World

Length5=204

(...)

Excellent! This is something I could work with. Going back into Banshee, I sorted my list by Title and all those "Unknown" songs got mashed up together, allowing me to work through them easily. I coursed through the volume of music, listening and defining Artist/Title/Album information. As I moved through the list the revised entries moved out of my "Unknown" grouping and life was good. After revising all the "Unknown" selections I exported my playlist again:

[playlist]

File1=/home/rwheadon/recoveredFiles/136149.mp3

Title1=Al Green - Let's Stay Together

Length1=199

(...)

File5=/home/rwheadon/recoveredFiles/136157.mp3

Title5=Ann Wilson - The Best Man in the World

Length5=204

(...)

Now we are cooking!! Next stop we do a replace in gedit again.

1. Replace "\nTitle" with ",Title"

2. Replace "\nLength" with ",Length"

3. Delete the line with "[playlist]" and save.

We now have a file format kinda like this:

File1=136149.mp3,Title1=Al Green - Let's Stay Together,Length1=199

(...)

File5=136157.mp3,Title5=Ann Wilson - The Best Man in the World,Length5=204
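The gedit replacements above could also be skipped entirely by reading the exported playlist straight into Ruby. Here is a rough sketch, assuming the FileN=/TitleN=/LengthN= layout shown in the export; playlist_pairs is a made-up name, and the Length lines are simply ignored.

```ruby
# Sketch: turn an exported playlist into [path, title] pairs without any
# hand editing. Assumes "FileN=..." and "TitleN=..." lines as in the
# sample export; playlist_pairs is a hypothetical helper name.
def playlist_pairs(text)
  files = {}
  titles = {}
  text.each_line do |line|
    case line.chomp
    when /\AFile(\d+)=(.*)\z/  then files[$1.to_i] = $2
    when /\ATitle(\d+)=(.*)\z/ then titles[$1.to_i] = $2
    end
  end
  # Pair each file with the title that carries the same entry number
  files.keys.sort.map { |n| [files[n], titles[n]] }
end
```

Because each pattern only splits on the first "=" after the entry number, a title that happens to contain a comma or an equals sign comes through intact.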

Here's where I get my hand slapped by my local resident rubyist. ("You could have done that in one line! C'mon Rich!")

# This is the expected incoming line format:
# File1=136149.mp3,Title1=Al Green - Let's Stay Together,Length1=199
# 'fileutils' stands in for 'ftools', which is gone from Ruby 1.9 onward
require 'fileutils'

if ARGV.size != 1
    puts "Usage: ruby formatParagraphs.rb <file>\nExample: ruby formatParagraphs.rb myFileWithDelimitedParams.txt"
    exit 1
end

if File.exist?(ARGV[0])
    File.foreach(ARGV[0]) do |line|
        parms = line.chomp.split(',')
        # Keep everything after the first '=' in each field
        origFile = parms[0].split('=', 2)[1]
        newFile = parms[1].split('=', 2)[1]
        if File.exist?(origFile)
            # Copy rather than rename, so the recovered originals stay untouched
            FileUtils.cp(origFile, newFile + File.extname(origFile))
        end
    end
end
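One thing the copy step does not guard against is a title that makes a bad filename. A slash in an artist name would be read as a directory separator, and several other characters are not allowed in NTFS filenames at all. A small sketch of a cleanup step follows; safe_name is a hypothetical helper, and swapping in an underscore is just one possible choice.

```ruby
# Sketch: make a song title safe to use as a filename on Linux and NTFS.
# safe_name is a hypothetical helper; the replaced characters are the
# ones NTFS rejects in filenames: / \ : * ? " < > |
def safe_name(title)
  title.gsub(%r{[/\\:*?"<>|]}, "_").strip
end
```

In the script above, the copy target would then be built from safe_name(newFile) instead of newFile.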

At 1:30 AM, after 5 hours of grinding through trauma and research and wrapping and testing, my mp3 restore was complete and I copied the locally restored files onto the Jump Drive. My wife was a little less than impressed that I had spent most of the night alone at work with Ruby, but I assured her that this is a relationship that will really benefit me in the long run. I'm not sure that even today she would chuckle at my humor on this one.

I would be remiss if I didn't bring true poetic justice to this story by disclosing the fact that I found all 273 files in another folder (on the jump drive) about a week later. Apparently I had begun "flattening" my mp3 collection by beginning to copy mp3 files into a central folder and renaming them so that eventually all of my files would be on the same directory level instead of scattered through directory trees.
