Wednesday, February 14, 2007

Using DB2 Load to Batch Upload Data (DRAFT)

UPDATE 1: I've since found out that although DB2 LOAD is quick, it's more complicated than the documentation implies. It also gets you into an inconsistent state nightmare really easily. I'm investigating DB2 IMPORT as an (slower but more reliable) alternative.

UPDATE 2: A DBA friend has just given me a command to use to list all the tables in DB2 which are in an inconsistent state. It is as follows:

select substr(tabname,1,30),status from syscat.tables where tabschema='[your schema owner]' and status='C'

On our current project we have the need to load large volumes of data into DB2. We also dont want to sit around for years while we do it with SQL files. We used DB2 Loader to speed things up.

Firstly we needed to create a csv file for each table where we were to load our data. Luckily we had a set of java classes which we'd used to generate our SQL files so we simply added some additional code to write the same information comma seperated. I'll not provide any more detail. It's kind of obvious what this means.

Next, we scripted our calls to the "db2 load" command so that all our table data could be loaded with a single command. An example script follows:

echo "Connecting to the database..."
db2 "connect to [your_database] user [db_user] using [db_password]
LOG="load.log"

echo "Loading data into mytable..."
db2 "load from [your_data.csv] of del replace into [mytable] (c1, c2, c3)" > $LOG
db2 "set integrity for mytable immediate checked" > $LOG

NOTE: It assumes that we are loading into a table called mytable as follows:
CREATE TABLE mytable (c1 VARCHAR(30), c2 INT, c3 CHAR(1))
and that the csv file looks like this:
Hello world, 1, Y
Hi again, 2, N

No comments: