Friday, December 4, 2009

Firebird database-corruption and what you can do about it

There are usually only two reasons why a database can be corrupted in Firebird: 1) You copied the fdb-file while the Server was accessing it and 2) You don't have forced writes enabled and the server shut down unexpectedly (e.g. power-outage).


In these two cases you often get error-messages like "Internal gds software-consistency check", "Database file appears corrupt", "Wrong record length" or similar when trying to access certain tables or datasets. This is quite bad and means that your database is corrupted and that you have to try to recover it. Because recovery may fail and other steps like using IB Surgeon may be neccessary, it's a good advice to first stop the Firebird server, copy the fdb-file and start the server again. This copy can be used to start over using a different approach or send to a database-repair-service.


Database recovery is done in four steps. Here's how you do it:


Validating the database


The first step for recovery is letting gfix validate the database structure. This is done with the parameters -v -f. A call to gfix could look like this:

gfix -v -f -user sysdba -pass masterkey server:/path/to/database.fdb

gfix may then report errors in the database-structure. When it does not, your database is fine and recovery is not necessary.


Preparing for backup


The second step is preparing the database for a backup, using gfix again. The parameter is -mend in this case:

gfix -mend -user sysdba -pass masterkey server:/path/to/database.fdb


Creating a backup


The third step is a critical one. If this step fails, recovery using tools shipped with Firebird may not be possible. In this step we will create a backup of the fdb-file using gbak. gbak may fail to create the backup because unrecoverable errors exist in the database. Creating the backup is the same as always:

gbak -b -v -user sysdba -pass masterkey server:/path/to/database.fdb /path/to/backup.fbk

Note that paths of the database are remote (that means the path has to exist on the server), while the path to the backup-file is local (and has to exist on the machine you are executing gbak at). When you are lucky, gbak finishes this task without errors. You can then be certain that your data has been saved!


Restoring the backup


The fourth and last step is to restore the backup as usual, again with gbak:

gbak -c -r -v -user sysdba -pass masterkey /path/to/backup.fbk server:/path/to/database.fdb


Finished!


Now everything should be fine. But how can we prevent this from happening again? I initially mentioned two reasons for database-corruption to happen, which both can easily be avoided.


Never copy a database-file when the Firebird-server is accessing it.


Or better yet, never touch the file when the Firebird-server is running! That's the safest way to ensure that your database-file is copied uncorrupted and your data stays safe.


Enable forced writes unless you have very good reasons not to do so


Forced writes are a mechanism by the Firebird-server to tell the operating system to disable any disk-caching mechanisms for this file. When forced writes are not enabled, Firebird does not know what parts of the file are physically stored to the medium. That's one reason why having databases on a network-share is not possible and/or highly discouraged. Because disk-caches are not invented without a reason, this comes with a drawback. Writing will be a tad slower with forced writes disabled, so there might be a reason you don't want to enable it. When you disable forced writes, you have to ensure that the server is connected to an UPS and will always properly shut down. This means hitting the reset-button is a big no-no with forced writes disabled!


When following these rules, it should never be neccessary to recover a database again.

5 comments:

  1. Thanks for the link to us, but I need to comment two things:

    1. if you try to make backup of the corrupted database, always use -g option. Because without this option server will initiate garbage collection, and backup can fail, if there are broken version chains.

    2. never use -c -r, because it is oxymoron, and -r option will kill you original database, since in the example commands broken database and restored database have the same name (database.fdb). Also, -r option will not work for FB 2.x, because now it is -replace. So, use -c and restore DB to the name like tmp.fdb.

    ReplyDelete
  2. How do I enable on-waith transactions to avoid Dead Lock?

    I had a bad experience last week with a deadlock that it made the DB slower and I had to backup and restore the DB to fix it, the backup took to long in the locked table. Firebird 2.5

    ReplyDelete
  3. For settling out troubles with problem mdf files you can use sql database repair. It uses contemporary ways of restoring .mdf files, the application starts under all PC software configuration and Windows OS. It demonstrates results of working with bad .mdf.

    ReplyDelete
  4. Use SQL Database Recovery utility repairs all your SQL server database tables, views, triggers, stored procedures, user defined functions, rules and indexes from the corrupted MS SQL database. You can try this tool from here: http://www.recoverydeletedfiles.com/sql-database-recovery-software.html


    ReplyDelete