How Did We Restore the Blog Without a Backup

17th September was a tragic day for me. The site was down, and we had lost everything: articles, their images, themes, and design. The VPS node was corrupt, and the server’s hard disk had crashed. It was 2 a.m., and I was in a state of shock. Manav and I had no idea what we were going to do next. If you are wondering why the host never took a backup, they did, but the program they used to take backups never worked.

Call it dramatic, but just then the Newsgator feed reader, which I use to read feeds, started popping up feed updates. That triggered a thought, and I rushed to check my own blog feed there. It held 80 posts, which made me a little happy, and then I thought of Google Reader. A quick check turned up some posts there, so I kept scrolling. Guess what? All 1080 posts were there. Wow! That was one major stroke of luck.

Google Reader containing Posts and their date

Google Reader had stored all my posts from the day I started. That was a real surprise; I would never have expected it. Now we were sure that the articles were there. The next step was to plan the recovery:

Our Strategy:

  • Repost every post at its exact original URL.
  • Bring back most of the comments (if possible).
  • Find the changes I had made to posts over time, as Google Reader doesn’t keep later edits.
  • Find when the articles were written so we can maintain the chronology.

Hurdles:

  • Maintaining the traffic.
  • Finding all the popular posts.
  • Recovering the changes made to posts over time.
  • Recovering lost images.

Tools we used to get the posts back:

  1. Google Reader: for approximate dates and post content.
  2. Google Cache: for the changes made to articles after they were published.
  3. Google Analytics and WordPress Stats: for finding the popular posts.

Step 1: Pulling out Popular Posts that changed with time

I started pulling all the popular posts one by one. The best way was to search Google for the exact title, as seen in Google Reader, which brought up my post as the first result. Next, I opened the cached copy of the article to get the latest updates, if any.

Using Google Cache to get latest updates

We had to be quick on this, as Google Cache changes every week. I went through every post I knew I had changed and reposted it with its updates. Doing this for 250+ posts slowly started to bring back a good number of hits.

The hurdle of getting the exact URL was now behind us: we could pull the correct URL from both Feedburner and Google Cache. We could not find the exact dates, so we used the dates shown in Google Reader as approximations.
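
For anyone in the same situation who can still save their feed as a file, the URL-and-date part of this step can be scripted. The sketch below is not what we did (we copied posts by hand). It assumes the feed is available as Atom XML; the function name `extract_posts` and the sample feed are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace prefix for tag lookups


def extract_posts(atom_xml):
    """Return a list of dicts (title, path, published, content), oldest first."""
    root = ET.fromstring(atom_xml)
    posts = []
    for entry in root.iter(ATOM + "entry"):
        link = entry.find(ATOM + "link")
        href = link.get("href", "") if link is not None else ""
        posts.append({
            "title": entry.findtext(ATOM + "title", default=""),
            # Keep only the path so the post can be recreated at the same URL.
            "path": urlparse(href).path,
            "published": entry.findtext(ATOM + "published", default=""),
            "content": entry.findtext(ATOM + "content", default=""),
        })
    # ISO 8601 timestamps in the same timezone sort correctly as plain strings.
    posts.sort(key=lambda p: p["published"])
    return posts


# Made-up sample feed, only to show the shape of the input.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Second post</title>
    <link href="http://example.com/second-post/"/>
    <published>2008-02-01T10:00:00Z</published>
    <content type="html">&lt;p&gt;World&lt;/p&gt;</content>
  </entry>
  <entry>
    <title>First post</title>
    <link href="http://example.com/first-post/"/>
    <published>2008-01-15T09:30:00Z</published>
    <content type="html">&lt;p&gt;Hello&lt;/p&gt;</content>
  </entry>
</feed>"""

if __name__ == "__main__":
    for post in extract_posts(SAMPLE):
        print(post["published"], post["path"], post["title"])
```

The dates a feed reader shows may be fetch dates rather than publish dates, so treat them as approximate, just as we did.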

Step 2: Checking for Not Found Pages

It is impossible to find all the posts that would get hit, especially the smaller ones. So I spent 2-3 hours tracking the not-found pages.

  • I used Alex King’s 404 plugin, which sends an email or a feed item for every page not found on a WordPress blog.
  • I used Woopra for live tracking. This was very handy, as I could see in real time which articles returned a 404. I noted each one down immediately so I could restore it later.
Woopra for Live tracking for 404

This gave me around 200 articles that used to get hits. Remember, I was watching during peak traffic hours.
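
If you have access to the raw web server log, the same list can be built without a plugin. This is a different technique from the one I used (Woopra and the 404 plugin). A minimal sketch, assuming the log is in the common/combined log format; the function name `count_404s` and the sample lines are invented for illustration.

```python
import re
from collections import Counter

# Matches the request path and status code of a common/combined log format line.
LOG_LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def count_404s(lines):
    """Count how often each requested path returned a 404."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            hits[match.group("path")] += 1
    return hits


# Made-up log lines; a real run would iterate over the log file instead.
SAMPLE_LOG = [
    '192.0.2.1 - - [20/Sep/2008:10:00:00 +0000] "GET /old-post/ HTTP/1.1" 404 512 "-" "Mozilla"',
    '192.0.2.2 - - [20/Sep/2008:10:00:05 +0000] "GET /restored-post/ HTTP/1.1" 200 9000 "-" "Mozilla"',
    '192.0.2.3 - - [20/Sep/2008:10:00:09 +0000] "GET /old-post/ HTTP/1.1" 404 512 "-" "Mozilla"',
    '192.0.2.4 - - [20/Sep/2008:10:00:12 +0000] "GET /another-old-post/ HTTP/1.1" 404 512 "-" "Mozilla"',
]

if __name__ == "__main__":
    # Most requested missing pages first: these are the ones to restore first.
    for path, count in count_404s(SAMPLE_LOG).most_common():
        print(count, path)
```

Sorting by hit count puts the posts that visitors are actually looking for at the top of the to-do list.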

Step 3: Pulling Articles by Month:

By now, I already had more than 400 articles up and running. However, traffic was still down. Since Google could no longer find the remaining articles, I guess traffic was slowly going down.

The 404 errors had reduced, and the cache had already begun to change. So we had no choice but to pull the remaining posts from Google Reader, month by month, and put them back in one at a time. Trust me; it was so painful!
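
Working month by month is easier with a checklist. A small sketch of building one, assuming you have already collected (date, title) pairs from the feed reader; the function name `group_by_month` and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def group_by_month(posts):
    """Map 'YYYY-MM' to a list of titles, with months and posts in date order."""
    buckets = defaultdict(list)
    # ISO dates sort chronologically, so sorting the pairs orders the posts.
    for published, title in sorted(posts):
        month = datetime.strptime(published, "%Y-%m-%d").strftime("%Y-%m")
        buckets[month].append(title)
    return dict(buckets)


# Made-up sample data.
SAMPLE_POSTS = [
    ("2008-02-03", "Third post"),
    ("2008-01-20", "Second post"),
    ("2008-01-05", "First post"),
]

if __name__ == "__main__":
    for month, titles in group_by_month(SAMPLE_POSTS).items():
        print(month, len(titles), "posts to restore")
```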

Step 4: Image Recovery (This was Major Luck)

Well, there was no way we could have recovered the images from the server, but when I started blogging, I did not host my images here. Many of them were hosted on Photobucket. This was a little relief. I decided to start re-making the rest of the images, article by article, the next day.

I was in the office when it struck me that I had backups of some of my images on my old laptop. I got back home and found them. Wow, I was so happy I had not reformatted it after I got my new laptop.

So now most of the problems were solved. I had the images, articles, and dates. The problem was 80% solved.

Image Recovery

Step 5: Check and Recheck Everything

I am doing this now as I write this post. Since I copied everything by hand, I have to check whether anything is missing or whether I lost formatting and links. In this phase, I am also putting back the images. This should be over by tonight or by morning at the latest.
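
Part of this check can be automated. The sketch below lists the images a restored post refers to that are not among the recovered files. It is only an illustration, not the process I followed: it compares file names only, and `missing_images` and the sample HTML are made-up names and data.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a piece of HTML."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def missing_images(post_html, recovered_filenames):
    """Return the image URLs whose file name is not in recovered_filenames."""
    collector = ImageCollector()
    collector.feed(post_html)
    missing = []
    for src in collector.sources:
        name = posixpath.basename(urlparse(src).path)
        if name not in recovered_filenames:
            missing.append(src)
    return missing


# Made-up post body and recovered file list.
SAMPLE_HTML = (
    '<p><img src="http://example.com/wp-content/uploads/a.png" alt="A">'
    '<img src="/images/b.jpg" alt="B"/></p>'
)

if __name__ == "__main__":
    print(missing_images(SAMPLE_HTML, {"a.png"}))
```

Running something like this over every restored post gives a list of images that still need to be found or re-made.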

We have recovered 85% and should be okay in a week or so as the traffic starts flowing back. So this is how Technospot.Net is back now. We have a new theme, which we are still tweaking and adding utilities to.

Summary:

  • We found all our posts in Google Reader. Since I publish full feeds, we had everything, thanks to Darren Rowse, one of whose posts inspired me to switch from a partial feed to a full feed.
  • Google Cache helped us recover the modified articles.
  • Woopra and Alex King’s 404 plugin helped me quickly find the not-found pages.
  • Hosting images on other servers is a good idea, but it depends.

What did I learn from this:

The major mistake I made was relying on the host’s backup. I had never had any issues with my hosting except this.

  • Even though I pay $10 for the backup, they have given me direct access to those backups.
  • I should always take a backup of my Images, Database, and Theme on my side, which I have started doing again and will continue to do.
  • One critical thing I gained was a chance to revisit all my posts and fix many things. Even the images I am adding have their alt tags available. It would be interesting to see if this affects you.
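
A backup on your own side does not need much. Here is a minimal sketch, assuming the theme and image folders are reachable on disk; the function name `backup_directory` and the paths in the usage comment are hypothetical. It covers files only, so the database would still need its own dump.

```python
import tarfile
import time
from pathlib import Path


def backup_directory(source_dir, backup_dir):
    """Write a timestamped .tar.gz of source_dir into backup_dir and return its path."""
    source = Path(source_dir)
    dest = Path(backup_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{source.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the folder name.
        tar.add(source, arcname=source.name)
    return archive


# Example usage (hypothetical paths, adjust to your own setup):
#   backup_directory("wp-content/uploads", "backups")
#   backup_directory("wp-content/themes", "backups")
```

Whatever tool you use, open the archive once in a while and check that the files are really there. A backup nobody has tested is exactly what failed me.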

The two painful weeks are coming to an end now. I have been through a lot, but it was a learning experience. I wonder what would have happened if this had struck next year and I had still never checked my backup. I thought of sharing this with you all so that, if somebody loses their data as I did, they know this is one good way to get it back.

In the end, I would also like to thank some people, especially the people on Twitter who kept asking me how it was going. My wife has seen me sitting at the computer all the time since last week, with no time even to talk to her. Thanks to Sampat, who helped me find things in various places, and, of course, Manav, who has given me so much moral support.

That’s the story, and we are back!

30 COMMENTS

  1. That would have been a really challenging work! But still, good that everything is back now.

    Backing up is really essential, and i seem to be not taking much care about that.

    2 Weeks working on just the recreation process is just hard.

  2. Pretty tough to do all this tasks in few days! Kudos to you man.. Thanks for alerting us to make frequent backups..

  3. one thing you never need to do is updating from the cache…you know what? the feedbot would have fetched the updates as well…

    As an example you make a post check the feed, change the content(you do some updates) and then check the feed again..it would have changed…

    you could have confirmed me via twitter that you were doing restores from google reader and i would have helped you there…

    anyway you are done now and congrats for getting back on track…

  4. Rajesh, It would have updated only when there was a feed availble. By the time I discovered my site was gone and there was no feed. If I would have fetched again I would have lost everything right there.

  5. Iam still confused how were you able to get back all the comments if you lost all the data, ie commentors IP, other comment stats etc

  6. Hard work paid it all and Technospot is back with a bang, congrats both of you to bring it back.

    BTW whats your host doing now for this damage? I think most of the VPS have RAID protection for hard drives, then how come they dont have it?

  7. I’m glad that you were able to pull it off in the end. It’s understandable what kind of effort it must have required and certainly a lesson learned for you and for everyone who was following this – backup regularly at your end.

  8. Thank goodness for Google its Cache and its other web services. I think it would next to impossible to recover without all of it.

    Btw, will your host do something about their failed backup systems? I mean like a refund for that or something?

  9. @Jhay: Yes my host refunded around 160$ which includes mone for downtime of this month and money which I had been giving them for backup.

    @Abhijeet : Yes people should learn from this. Its very necessary to take backup.

    @Nirmal : I will ask my host about it. Not sure about it.

    @Amit : We did it manually like the post but very few.

    @Naryanan: he he we will plan something

    @rajesh,rockstar, chetan and amdhur. Thanks guys

  10. Google reader had all the content right..Google reader stores all your feed data in their database.If any updates had happened, i guess google reader would also have updated the post/s in their database.

    Comments is something amazing.How did you recover the comments?

  11. Excellent article buddy, and I can understand and see the pain you have gone through to get your blog back on its legs. But that said kudos to you on such a huge feat.

    Also please remember to backup your blog database regularly now :-).

  12. Congrats man. This is one of the biggest post recoveries I have seen till now. Its kinda cool though it was sad and tiring. And as Narayanan said we want the party. 😛

  13. Great post. When in need, you thought quickly and restored your blog. Thanks for sharing it with all of us. As a blogger, I can understand the pain of losing all that you have put together over the years.

    This also highlights the downside to this. Internet call have a long memory of things you say, so, make sure you say the right things!

  14. I have been through this when my hosting deleted all my data in mid 2007, at that time i also don’t have the backup.

    The only thing i was left with that time was some days html image of my blog on my local system, which helped me to reproduce some post, even then i lost most of my most of my data.

    I can understand the pain, and really commendable job on your part for making everything up.

    Great job ashish :), feel good to see technospot.net coming back with a bang

  15. None of your readers will become victim for the same pain which you faced… Anyway, a really good way to recover and you did everything in time… Its now better that you are back completely….

  16. Ashish, hats off to you for taking so much pain and recovering all of your posts.the grit and determination with which you pulled this off is amazing and inspiring. I remember reading your tweets early morning and late nights on you starting your work on recovering the data. And each time you posted a tweet on x articles updated, I would get all very happy. Also remember the way you explained how you were recovering it during the tweet up. Awesome stuff dude. Old Monk party soon!! 😛
    PS: I recommed WP-DB backup plugin: it creates your database backup on demand, as well as emails you the backup at specific times:every hour, every day, everyweek etc. Must have plugin IMO.
    Cheers mate!

  17. That was a great recovery! Congrats. Its really nice to see how much effort you put into this!

    And very thanx for sharing what you did, will come in handy for sure!

  18. Hats of to You guys, if some thing like this happen to my blog i am sure i will quit blogging .

    Amount of work required to recover 1000+ posts manually is unimaginable but you guys done it.

    Hope you guys are not suffering from blog crash nightmare any more. 😛

    Can any one have wordpress plugin which can automatically backup wordpress db and upload it to multiple private FTP server (Keith !!!) ??
