Fooling Around With Ruby
Having a healthy relationship with the language
Earth Date 2009.01.20
First I must clarify that, though it is fun for me to play with in this post, Ruby is not a woman. Ruby is a pretty nice language that I am just getting to know, and this post covers a solution using Ruby. There are at least two things you might derive from this post:
1. You can recover hard-deleted MP3 files from an NTFS mount in Linux with ntfsundelete, Banshee, and Ruby.
2. My music taste is a little dated.
On to the story...
I had just ripped my Carpenters CD to MP3 in Windows and had transferred it via Flash Drive to my Linux machine's Jump Drive. I hummed along to "For All We Know" and began deleting files from the Flash Drive that I didn't need now that they were on the Jump Drive (conserve space, you know). "TheCarpenters" deleted, "NIN" deleted, "Steve Vai" deleted, "01 - HolodeckChoices" deleted, "ELO" deleted. I thought to myself, "HolodeckChoices has hundreds of files in it... why would I have that on a little flash drive?" I opened up the root of the Jump Drive and HolodeckChoices was gone. Imagine the horror, and then have it affirmed by the fact that the ONLY copy of some of these MP3 files was on that Jump Drive. Then compound it with the fact that I was using Shift+Delete... instant vaporization.
I flipped to and from the Jump Drive root and the Flash Drive root and realized the only folder lost was the HolodeckChoices folder, which contained a huge collection of tunes from trashed CDs, cassettes, and some from downloads... needless to say, it didn't look good for reviving the music from the crypt. I Googled anyway, hoping something would come up that would let me restore my deleted files from an NTFS volume on the Jump Drive... but in Linux. I ended up finding a little Linux utility called "ntfsundelete"; reading through the docs, it was exactly what the doctor ordered. As long as my machine had not been shut down and the disk had not been overwritten, ntfsundelete would recover those deleted files with their original paths and everything. AWESOME!!
I ran the utility on a couple of files with good results; they each came back with their original filenames. The tool would work but, the way I was running it, it only handled one file at a time... no bulk mode. I've pretty much resolved to use Ruby scripts for my automation tasks these days, since Ruby is my latest learning curve. I opened up my handy dandy gedit app and began writing a little Ruby, digging through books and googling the stuff I didn't know... this was going to be a really short script. I was reading through one of my online resources when I heard a loud beep come out of my laptop. I stared blankly at the screen wondering what I had done to get a beep and suddenly realized there was no power cord trailing from the computer... the battery was dead! I jumped up and ran to the cube next door for my power supply, and as I rounded the corner with it I saw the screen go blank. I approached and sat before the quiet, dark computer and became nearly lost in despair. Then I reminded myself that worse things had happened: I'd had an entire disk grenade on me before. That made me feel better.
From this point forward I'm going to relive the thought processes that I went through to attempt my file recovery... leaving out the truly wasted research and failed experiments along the way...
After giving the ThinkPad juice and firing it back up I did my ntfsundelete test again. This time, when I passed in the file number, the filenames came back as file numbers... the numbers assigned by the file system. This would be a setback, since the filenames are the only way to know what most of the files contained. (I'm not really vigilant about embedding the media information.) I went back to work on my Ruby script for a bulk recovery. First I needed to get a list of the files that ntfsundelete could give me a 100% recovery on:
rwheadonTPT60:/home/rwheadon # ntfsundelete /dev/sdb1 -f > deletedFiles.txt
This created a list of all the deleted files on my jumpdrive...
Inode Flags %age Date Size Filename
______________________________
16 F..! 0% 1969-12-31 0
17 FN.. 0% 2008-06-10 2682 bootex.log
23 F..! 0% 1969-12-31 0
135380 FN.. 100% 2008-02-23 2617344 <none>
135384 FN.. 100% 2008-02-23 2277376 <none>
136089 D... 0% 2009-01-06 0 <none>
136090 FN.. 100% 2000-09-29 3467389 <none>
136091 FN.. 100% 2000-09-29 4528142 <none>
Now, to get only those with 100% accuracy, I just pasted the list into a spreadsheet and deleted all the lines that weren't 100%. I *should* have just written a Ruby script to go through the file and return the Inode for each line that had "100%" in it... but I took the (arguably) "easy" route. Once I was down to the lines with "100%" in them I just highlighted the first column, copied, and pasted into gedit. I then highlighted the list of numbers and did a replace on "\n" with "," and I had the makings of an array. Put it all in a Ruby script...
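#This Ruby script calls the ntfsundelete command once per inode, in the form:
#ntfsundelete /dev/sdb1 -u -i {inode} -f -o {inode}.mp3
theTime = Time.new.to_s
#timestamped log file name (built here but not used in the lines shown)
rtFileLog = theTime.gsub(/[:-]/,"").gsub(/\s/,"") +".log"
#inode numbers of the 100% recoverable files (list abbreviated here)
listOfNumbers = [135380,135384,...,138652]
#system waits for each ntfsundelete run to finish before starting the next
listOfNumbers.each{ |num| system("ntfsundelete /dev/sdb1 -u -i #{num} -f -o #{num}.mp3") }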
... and we are ready to do some bulk recovery. I must say that coming from another VBish scripting language that only provides file operation through "shell" I was totally excited about Ruby's system command that actually waits for a return from the system before moving on. With my handy little script I went ahead and did the recovery and 273 files were undeleted from the jump drive and into a folder on my HDD.
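The spreadsheet step from earlier (keep only the "100%" rows, then collect the first column) could also have been done in Ruby. What follows is only a minimal sketch, assuming the listing looks like the sample shown above: whitespace-separated columns with the inode first and the percentage third. The method name recoverable_inodes and the inline sample text are made up for illustration; they are not part of ntfsundelete.

```ruby
# Sketch: pull the inode numbers of fully recoverable files out of the
# ntfsundelete scan listing (the text that was saved to deletedFiles.txt).
# Assumption: each data row looks like
#   135380 FN.. 100% 2008-02-23 2617344 <none>

# Hypothetical helper; returns an array of Integer inode numbers.
def recoverable_inodes(listing)
  listing.each_line
         .map(&:split)                                  # break each row into columns
         .select { |cols| cols[0].to_s.match?(/\A\d+\z/) && cols[2] == "100%" }
         .map { |cols| cols[0].to_i }                   # keep just the inode
end

# A few rows copied from the listing above, used as sample input.
sample = <<~LISTING
  Inode Flags %age Date Size Filename
  ______________________________
  17 FN.. 0% 2008-06-10 2682 bootex.log
  135380 FN.. 100% 2008-02-23 2617344 <none>
  136089 D... 0% 2009-01-06 0 <none>
  136090 FN.. 100% 2000-09-29 3467389 <none>
LISTING

p recoverable_inodes(sample)   # prints [135380, 136090]
```

With the real file, something like recoverable_inodes(File.read("deletedFiles.txt")) would presumably give the same list of numbers that was pasted into the script by hand.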
Now I needed to get all those files renamed, so I plugged in my Bose Buds and got ready to sample them through Banshee and do some renaming. I dragged the whole folder into my Banshee window and the "File System Queue" loaded up with the music files. Then I noticed something interesting... about 150 of the files had artist - song information!!! I exported the file system queue into a playlist and got the following type of data:
[playlist]
File1=/home/rwheadon/recoveredFiles/136149.mp3
Title1=Unknown Artist - Unknown Title
Length1=199
(...)
File5=/home/rwheadon/recoveredFiles/136157.mp3
Title5=Ann Wilson - The Best Man in the World
Length5=204
(...)
Excellent! This is something I could work with. Going back into Banshee, I sorted my list by Title and all those "Unknown" songs got mashed together, allowing me to work through them easily. I coursed through the volume of music, listening and defining Artist/Title/Album information. As I moved through the list the revised entries moved out of my "Unknown" grouping and life was good. After revising all the "Unknown" selections I exported my playlist again:
[playlist]
File1=/home/rwheadon/recoveredFiles/136149.mp3
Title1=Al Green - Let's Stay Together
Length1=199
(...)
File5=/home/rwheadon/recoveredFiles/136157.mp3
Title5=Ann Wilson - The Best Man in the World
Length5=204
(...)
Now we are cooking!! Next stop we do a replace in gedit again.
1. Replace "\nTitle" with ",Title"
2. Replace "\nLength" with ",Length"
3. Delete the line with "[playlist]" and save.
We now have a file format kinda like this:
File1=136149.mp3,Title1=Al Green - Let's Stay Together,Length1=199
(...)
File5=136157.mp3,Title5=Ann Wilson - The Best Man in the World,Length5=204
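The gedit replacements above can also be sketched in Ruby. This is a rough illustration, not the script from this post: it assumes the exported playlist uses the FileN=/TitleN=/LengthN= keys shown earlier, and it assumes the directory part of each path should be dropped, as in the target format. The method name flatten_playlist and the sample text are hypothetical.

```ruby
# Sketch: collapse a playlist export into one comma-joined line per track,
# mirroring the manual search-and-replace steps.

# Hypothetical helper; returns an array of strings, one per track.
def flatten_playlist(pls_text)
  tracks = Hash.new { |hash, index| hash[index] = {} }
  pls_text.each_line do |line|
    match = line.strip.match(/\A(File|Title|Length)(\d+)=(.*)\z/)
    next unless match                      # skips "[playlist]" and blank lines
    kind, index, value = match.captures
    # Assumption: only the file name is wanted, not the full path.
    value = File.basename(value) if kind == "File"
    tracks[index.to_i][kind] = "#{kind}#{index}=#{value}"
  end
  tracks.sort_by { |index, _parts| index }
        .map { |_index, parts| parts.values_at("File", "Title", "Length").compact.join(",") }
end

# Two entries copied from the export above, used as sample input.
sample_pls = <<~PLS
  [playlist]
  File1=/home/rwheadon/recoveredFiles/136149.mp3
  Title1=Al Green - Let's Stay Together
  Length1=199
  File5=/home/rwheadon/recoveredFiles/136157.mp3
  Title5=Ann Wilson - The Best Man in the World
  Length5=204
PLS

puts flatten_playlist(sample_pls)
# File1=136149.mp3,Title1=Al Green - Let's Stay Together,Length1=199
# File5=136157.mp3,Title5=Ann Wilson - The Best Man in the World,Length5=204
```

Grouping by the number in each key, rather than by line position, means a track with a missing Title or Length line should not shift the entries that follow it.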
Here's where I get my hand slapped by my local resident rubyist. ("You could have done that in one line! C'mon Rich!")
#This is the expected incoming line format
#File1=136149.mp3,Title1=Al Green - Let's Stay Together,Length1=199