Tuesday, February 27, 2007

DB2 Import Commands Failing With "transaction log for db is full" Errors

Trying to upload test data with the DB2 Import Utility. I have a .CSV file with 100,000 records in it. It used to work; then I added another column and now it fails with "transaction log for database is full" errors.

:-(

It turns out that it wasn't committing enough. In fact I didn;t know when I was committing at all. I discovered that the "sensible" defaults can sometimes not be so sensible. In order to explicitly set a commit count you need to add the following to your import command (highlighted in red):
db2 "import from [your_csv_file_name] of del commitcount 1000 insert into [table_name]([column A], [column B], [column C], ...)"

Wednesday, February 14, 2007

Using DB2 Load to Batch Upload Data (DRAFT)

UPDATE 1: I've since found out that although DB2 LOAD is quick, it's more complicated than the documentation implies. It also gets you into an inconsistent state nightmare really easily. I'm investigating DB2 IMPORT as an (slower but more reliable) alternative.

UPDATE 2: A DBA friend has just given me a command to use to list all the tables in DB2 which are in an inconsistent state. It is as follows:

select substr(tabname,1,30),status from syscat.tables where tabschema='[your schema owner]' and status='C'

On our current project we have the need to load large volumes of data into DB2. We also dont want to sit around for years while we do it with SQL files. We used DB2 Loader to speed things up.

Firstly we needed to create a csv file for each table where we were to load our data. Luckily we had a set of java classes which we'd used to generate our SQL files so we simply added some additional code to write the same information comma seperated. I'll not provide any more detail. It's kind of obvious what this means.

Next, we scripted our calls to the "db2 load" command so that all our table data could be loaded with a single command. An example script follows:

echo "Connecting to the database..."
db2 "connect to [your_database] user [db_user] using [db_password]
LOG="load.log"

echo "Loading data into mytable..."
db2 "load from [your_data.csv] of del replace into [mytable] (c1, c2, c3)" > $LOG
db2 "set integrity for mytable immediate checked" > $LOG

NOTE: It assumes that we are loading into a table called mytable as follows:
CREATE TABLE mytable (c1 VARCHAR(30), c2 INT, c3 CHAR(1))
and that the csv file looks like this:
Hello world, 1, Y
Hi again, 2, N

DB2 the Hard Way...

I'm having to get to grips with DB2. Here is a list of things I've learned the hard way so far...
  1. You don't put semicolons at the end of DB2 statements
  2. When doing lots of INSERTs put a COMMIT every 1000 or so. (As pointed out here)
  3. If you're uploading a lot of data consider using the DB2 LOAD utility. [More info on how to use it when I get it working]
  4. To get a nice graphical view of your data use something like SQuirreL or DBVisualizer (Update: don't try and use IBM's DB2 Connect, it leaves locks all over the place and locks in DB2 are pretty pervasive...)
  5. Don't try and do bulk uploads with triggers switched on
  6. Remember to create the statistics for your database tables (RUNSTATS SQL command)
  7. Find out what an error code means with db2= ? SQL-[xxx] NOTE: The angle brackets are part of the command
  8. Don't delete a load of rows all at once using "DELETE FROM..." - it'll go very slowly. Instead do in bunches of 10,000 rows. E.g. db2= DELETE FROM things WHERE things_id 900000; DELETE FROM things WHERE things_id 890000; etc; etc
  9. Or, if you can bear it, drop everything and start again from scratch...

Blog Entries I Would Write, Had I the Time...

I keep thinking of blogs entries I would like to write. Get things out of my brain and down on a page (of sorts). If I ever get round to it, my current output would include:

  • What I'd do with an Open Source Second Life Universe - There are a million uses I can think of for a virtual Second Life. None of which I'd ever get the chance to try out myself. Maybe I could write about them instead...
  • Tips for a First-Time-Java-Project - If I knew then what I do now, I wouldn't have made a bunch of mistakes.

  • Building JRuby with Netbeans and Testing it against Oracle - I have barely fiddled with it, let along built and ran test suites against it but then this is a wish list...
  • A Java Mobile version of my Expenses Tracker app (and get a decent phone to run it on)

Problem is, I never seem to have the time... Perhaps I should cut out all the dodgy Italian gore films I keep getting sent from LoveFilm?...





powered by performancing firefox

Saturday, February 03, 2007

My First Maven Plugin: A Unit Test Packager

I've just written my first Maven 1.0.x plugin. It's a monumentally simple affair but does what it needs to do quite elegantly; that is to take the Unit tests for your project and zip them up (along with your project's dependencies, the required Junit jars and some auto generated shell scripts) so you can run them anywhere. I don't have any personal hosted space to provide the actual plugin jar but here's the jelly code as a starter:


There's only one configurable (and that's because I ran out of time). You need to set the following property: suite.class.name= [the name of your test suite class]

[BTW, If anyone out there wants to offer me some place to host this I'd be dead greatful. Its a tiny litte jar. Tiny.]

EJB 3.0 Outside the Container, Inside the JVM - Part 3: Wrapping the Model

Welcome to part three of my blog on using EJB3.0 / JPA and Derby inside the JVM. The previous parts can be found here and here.

We're almost ready to go. However, we still need to wrap our domain model with a facade in order to manage interactions with it and prevent any JPA specific code creeping outside the model boundaries. We can do this by placing a session bean in front of our Claim POJO. (Why a session bean when we're not really going to use this within a Java EE container? Aha! Maybe something will come along in a later posting...)

We'll start by creating a new POJO class called ExpensesTrackerService. To make it a Stateless Session bean we simply the @Stateless annotation. We also need to add the following dependency to the project so that we can get the imports needed:

Now we need to provide a means to address and manage the Claim entity. In JPA this is via the EntityManager which we obtain via an EntityManagerFactory


Now that we have an EntityManager we can use it within our CRUD methods. Here:

Here:

And here:

Now all we have to do is build our maven project (run the jar:install maven goal) which will also copy the resulting jar containing our wrapped model to the repository. In the next entry we'll use it from within our Swing client

EJB 3.0 Outside the Container, Inside the JVM - Part 2: Configuring JPA Declaratively

Welcome to part two of this blog entry introducing using EJB3.0 inside the JVM. Click here to view the first part.

As we've seen, JPA can use our annotated POJO model to create a database for us, but first we need to give it some information on how to do it. We provide this information in a file called persistence.xml. Let's create a blank file with this name in {PROJECT_HOME_DIR}/expenses-tracker-model/
resources/META-INF/.
We also need to let maven know where this is so we add the following to our project.xml file:


This entry ensures that this file will be included on the build classpath.

Now lets add the content to persistence.xml. We'll start at the top:


The persistence-unit tag tells JPA about our (future) database. A single persistence.xml file can have mulitple persistence units but we will keep things simple and just stick to one. Attributes to this tag provide the persistence unit name (which we'll use to refer to it later in code) and the transaction-type. (We have the default). We'll only need the former as we'll see later on.

Enclosed within this top level is the tag which tells JPA which implementation will provide the actual persistence functionality - the "provider" tag. Our example uses the Toplink ORM tool but we could swap this to something like Hibernate simply by changing the contents of this tag.

Finally there is the tag where we declare our new entity class - "class". This tells JPA that we want it to be considered as part of this persistence unit. (Note: if you find JPA doing things you don't expect, check that you have listed all the classes you wish it to be aware of here. It can't work with what it has no idea about!)

Ignore the commented out tags. If you really want to know what they do then just google or yahoo! them.

This is all well and good but clearly my file is a little more complicated. Toplink will need some additional information in order to do its job and we can put this in this file as well. These go within the "properties" tags. You can see below that we use these to tell Toplink the driver class to use, the url for the database connection and well get to the third piece of magic later.