Disaster strikes

Posted by David N. Welton Mon, 30 Jan 2006 11:31:00 GMT

Where I'm working, they wanted to try "netoffice", a PHP collaboration/time tracking/productivity thing. I downloaded the "dwins" version of it and ran the install script after reading the instructions. It being a bit late, I hit the return key one too many times, and noticed some funny error messages about problems deleting files in /dev. Oh shit...

Turns out the script has the following line:

rm -rf ${INSTALL_DIR}/*

Simply brilliant. Thanks a lot, guys... So at this point I'm sitting there with two shells open via ssh, no /bin/, no /etc/ and nothing much in /dev. For reasons I won't discuss here, we don't have backups - suffice it to say that it wasn't my decision.
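For anyone wondering why that one line is so destructive: if INSTALL_DIR ends up empty or unset, the shell expands it to rm -rf /*. Below is a minimal sketch of a guarded version. The function name clean_install_dir is hypothetical and is not part of the netoffice script; it only illustrates refusing an empty value and using the ${var:?} expansion as a second line of defence.

```shell
#!/bin/sh
# Hypothetical helper, NOT from the netoffice installer: empties a directory
# only if the argument naming it is non-empty, is not "/", and exists.
clean_install_dir() {
    dir="$1"
    # Refuse an empty value or the root directory outright.
    if [ -z "$dir" ] || [ "$dir" = "/" ]; then
        echo "refusing to clean '$dir'" >&2
        return 1
    fi
    # Refuse anything that is not an existing directory.
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # ${dir:?} aborts the expansion if dir is empty or unset, so this can
    # never turn into "rm -rf /*" even if the checks above are edited away.
    rm -rf -- "${dir:?}"/*
}
```

Called as clean_install_dir "$INSTALL_DIR", this returns a non-zero status instead of deleting anything when the variable is empty. Like the original line, it leaves dotfiles in the directory alone.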

Not good. ssh didn't work any more, either in or out, because it needs files in /etc/ and /dev, and so I couldn't just copy new stuff in. Luckily, I had a fully stocked /usr/bin, and I did the first thing that came to my head: a small server in Tcl that copied its input into the specified file. Using that and netcat on the other side to send the files, I recreated enough of /bin to get MAKEDEV working in /dev. Phew... a little bit better, but /etc/ was still gone. I brought in a few more bits and pieces like /etc/passwd and /etc/group using my Tcl server, which was enough to get scp working from the machine. At that point, I brought in the rest of /etc. I had been starting to think about going home when I ran the fateful script, so by this point it was pretty late and I was starving; the rest will have to wait for tomorrow.

The biggest problem, I think, is how to restore the files in /etc at least to the pristine installed state required by the packages that own them. The other problem is that the UIDs in the password file are now out of sync with the file system, which is causing problems here and there. Unless I think of a brilliant way to fix those two issues, I'll probably consider myself satisfied that I got things running to a point where I can get the important data off the system, set up some temporary services on other machines, and reinstall the whole thing.
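To get a feel for how bad the UID mismatch is, one option is to list files whose numeric owner has no entry in the password file. The following is only a sketch under a few assumptions: the function names uid_known and report_orphans are made up for illustration, stat -c is the GNU coreutils form (which is what an Ubuntu box has), and paths containing newlines are not handled.

```shell
#!/bin/sh
# uid_known PASSWD_FILE UID
# Succeeds if UID appears in the third (numeric uid) field of PASSWD_FILE.
uid_known() {
    awk -F: -v uid="$2" '$3 == uid { found = 1 } END { exit !found }' "$1"
}

# report_orphans PASSWD_FILE DIR
# Prints "uid path" for every file under DIR whose owner uid is missing
# from PASSWD_FILE. Assumes GNU stat; -xdev keeps find on one file system.
report_orphans() {
    find "$2" -xdev -exec stat -c '%u %n' {} + |
    while read -r uid path; do
        uid_known "$1" "$uid" || printf '%s %s\n' "$uid" "$path"
    done
}

# Against the live password database, plain find can do the same check:
#   find / -xdev -nouser
```

For example, report_orphans /etc/passwd /home would print the files under /home whose owner no longer exists in the restored password file. This only detects the mismatch; deciding which UID each file should really belong to is still a manual job.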

This is all on Ubuntu systems, by the way.

Comments

  1. Dave brondsema
    about 3 hours later:
    Ouch. I'd recommend trying to at least store configuration files in SVN/CVS, even if you can't get good full system backups.
  2. madduck
    about 3 hours later:
    first, check /var/backups for old passwd files. then, since you still have the dpkg database (right?), reinstall all packages and use --force-confmiss as dpkg option (not sure how to tell APT to actually pass it on to dpkg [0]) to make it restore the configuration files. please let me know if you figure out the latter. 0. http://lists.debian.org/debian-user/2006/01/msg00795.html
  3. Jon Dowland
    about 16 hours later:
    PLEASE file a bug report on this netoffice so nobody else suffers the same fate :)
  4. David Welton
    about 16 hours later:
    Thanks for the suggestion, "madduck", but I decided that I wanted to be sure and just reinstalled it. Kind of anticlimactic after my hacking yesterday, but at least I managed to get what I could off the machine. Netoffice, to be clear, is not a Debian or Ubuntu package. This was something I downloaded 'cause the boss wanted it :-/ I decided to wait to file a bug report so as not to be as offensive as I might have liked...
  5. Stephen Touset
    about 20 hours later:
    Ouch. Someone at p.d.o also had a post regarding buggy shell scripts with gotchas like this, although they thankfully weren't hit by it. Sorry to hear about your wiped system.