There are several things to consider when setting up a Netezza backup approach, along with a matching restore process for one of the available Disaster Recovery (DR) options. Netezza typically hosts rather large databases, so backing them up requires a hefty amount of storage, and writing to that storage can take some time as well. Note that all data stored on Netezza is compressed with a proprietary hybrid compression algorithm that typically achieves compression ratios of around 3-4x. Netezza backups are also stored in a compressed format, although using a different algorithm.

One common method of creating a warm DR system is backup shipping: taking a backup on one Netezza system, shipping it to another system, and restoring it into that environment. This is commonly done with the UAT environment acting as DR in case the Production system fails.

[Figure: Netezza backup]

The picture simplifies the architecture, which typically involves a storage device in between; a common case is the backup being written to an EMC device in one datacenter and replicated to another EMC device in the other datacenter. Scripts check for replication completion, then restore the full and incremental database cuts as they become available.
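The replication check described above can be sketched roughly as follows. This is a minimal illustration, not a production script: the marker file name `REPLICATION_COMPLETE`, the `<sequence>_<full|incr>` directory naming, and the `ready_cuts` helper are all hypothetical, and the actual restore step (in practice a wrapper around `nzrestore`) is left to the caller. The key idea is that cuts are restored in order and processing stops at the first cut that has not finished replicating, since later incrementals depend on it.

```python
from pathlib import Path

# Hypothetical marker file dropped into a cut's directory once that cut
# has been fully replicated to the DR datacenter.
MARKER = "REPLICATION_COMPLETE"


def ready_cuts(root, already_restored):
    """Return the cut directories that can be restored now, in order.

    Assumes (hypothetically) one sub-directory per backup cut, named
    '<sequence>_<full|incr>', e.g. '0001_full', '0002_incr'.
    """
    ready = []
    cut_dirs = sorted(p for p in Path(root).iterdir() if p.is_dir())
    for cut_dir in cut_dirs:
        if cut_dir.name in already_restored:
            continue  # this cut was applied on an earlier run
        if not (cut_dir / MARKER).exists():
            break  # still replicating; later cuts must wait for this one
        ready.append(cut_dir)
    return ready


def restore_all(root, already_restored, restore):
    """Apply every ready cut using the caller-supplied restore function."""
    for cut_dir in ready_cuts(root, already_restored):
        restore(cut_dir)  # e.g. a function that shells out to nzrestore
        already_restored.add(cut_dir.name)
```

A scheduler such as cron could call `restore_all` every few minutes on the DR side, persisting `already_restored` between runs so that each cut is applied exactly once.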

Netezza benefits from running multiple backup threads in parallel; the number is configurable, with 4 being the most common choice. A separate consideration is where these threads write to, since the way EMC storage containers are used can affect file replication times or the prioritization of some databases over others. In the ideal scenario, or for larger database instances, the backup is done over a network entirely separate from the one used for typical database ETL and querying. That network is then ideally set up for large data throughput, with things such as jumbo frames configured.

Netezza write speed to a storage device also depends on both the physical characteristics of the device and its configuration; for example, EMC DataDomain devices recommend multiple IPs to force multiple NFS v3 TCP connections, although in practice many organizations set up their devices to present a single IP. Although these devices are commonly used, much of their benefit lies in their ability to dedupe data, which is not going to happen for Netezza data because it has already been compressed and is in a proprietary binary format. Other methods of fast storage should therefore be considered as well, without giving any weight to dedupe functionality, since it does little to aid a Netezza database.
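Returning to the question of where the parallel backup threads write: one simple way to reason about it is to spread databases across the available storage targets so that no single target carries most of the data. The sketch below uses a greedy "largest first, onto the least-loaded target" assignment. The database names, sizes, and mount paths are made up for illustration, and `assign_targets` is a hypothetical helper, not part of any Netezza tooling.

```python
import heapq


def assign_targets(db_sizes_gb, targets):
    """Greedily assign databases to backup targets, balancing total size.

    db_sizes_gb: mapping of database name -> approximate backup size in GB
    targets:     list of storage targets (e.g. mount points)
    """
    # Min-heap of (current load in GB, target) so the least-loaded
    # target is always popped first.
    heap = [(0, t) for t in targets]
    heapq.heapify(heap)
    plan = {t: [] for t in targets}

    # Place the largest databases first; ties are broken by name so the
    # result is deterministic.
    for db, size in sorted(db_sizes_gb.items(), key=lambda kv: (-kv[1], kv[0])):
        load, target = heapq.heappop(heap)
        plan[target].append(db)
        heapq.heappush(heap, (load + size, target))
    return plan


if __name__ == "__main__":
    sizes = {"SALES": 800, "FINANCE": 500, "HR": 300, "LOGS": 200}
    print(assign_targets(sizes, ["/backup/a", "/backup/b"]))
```

A real plan would also have to account for priorities (which databases must reach the DR site first) and for how the storage device replicates each container, which this size-only heuristic ignores.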