LOAD vs. IMPORT


Hi.

Can any of you explain the major differences between LOAD and IMPORT in
layman's terms?

I've read the DB2 docs: "[IMPORT] Inserts data from an external file
with a supported file format into a table, hierarchy, or view. A faster
alternative is LOAD; however, the load utility does not support loading
data at the hierarchy level."

What does "loading data at the hierarchy level" imply? I have 5 tables,
no (enforced) referential constraints. The biggest table has 500,000
rows, and if possible I would like to avoid locking the table for long
(I'd rather return an empty result set than have the user wait).

Given my situation, what are the pros and cons of the two?

Thanks.

Morten

Nov 12 '05 #1
13 replies, 17,773 views
use...@kikobu.com wrote:
Hi.

Can any of you explain the major differences between LOAD and IMPORT in
layman's terms?


The biggest difference is that IMPORT is a logged operation - similar
to doing your own inserts - while LOAD bypasses the logs and writes
the data directly into the table. I think the note about hierarchies
refers to the implementation of sub-types (typed tables) in DB2.
Doesn't sound like this is what you're doing.

Pros and cons of each for your situation would probably look like:

import
- pros: best solution for concurrency and recoverability.
- cons: the slower of the two, and queries that run while it is in
progress can see partial data (and thus return inconsistent results),
because the utility commits in batches.

load
- pros: the fastest solution.
- cons: poor recoverability in your situation, and concurrency
limitations (though when used with ALLOW READ ACCESS, concurrency is
usually fine).
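For concreteness, the two invocations might look something like this
from the CLP (file, schema, and table names here are made up; check
the exact option order against the Command Reference for your version):

-- logged inserts, committing every 1000 rows
import from mydata.del of del commitcount 1000 insert into mort.big_table

-- direct append; concurrent readers keep seeing the pre-load data
load from mydata.del of del insert into mort.big_table allow read access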

Another question might be whether your concurrent queries will return
erroneous results if you load or import just a single table at a
time....
Ken

Nov 12 '05 #2
us****@kikobu.com wrote:
What does "loading data at the hierarchy level" imply?

The "hierarchy level" refers to typed table hierarchies. Nothing you
have to worry about.

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #3
"kenfar" <ke****@gmail.com> wrote in message
news:11**********************@g44g2000cwa.googlegr oups.com...
The biggest difference is that IMPORT is a logged operation - similar
to doing your own inserts - while LOAD bypasses the logs and writes
the data directly into the table.


Keep in mind that since IMPORT does inserts, it can fire triggers
defined on the tables. LOAD will not fire any triggers.

If you are importing a large amount of data, use the COMMITCOUNT
parameter to commit every 1,000 rows or so. This keeps the active
logs from filling up.
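For example (names are made up; MESSAGES just captures any warnings
to a file):

-- commit every 1000 rows so the active log never holds the whole job
import from mydata.del of del commitcount 1000 messages import.msg insert into mort.big_table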
Nov 12 '05 #4
In article <11**********************@g44g2000cwa.googlegroups.com>, "kenfar" <ke****@gmail.com> writes:
load
- pros: the fastest solution.
- cons: poor recoverability in your situation, and concurrency
limitations (though when used with ALLOW READ ACCESS, concurrency is
usually fine).


A big con for LOAD that we ran into was:
- LOAD would leave the tablespace in 'backup pending' state, thus
requiring a backup before we could continue to use the table
- you could use the NONRECOVERABLE option to avoid this, but then you
couldn't roll forward through the logs past the load
- LOAD also needed to acquire a super-exclusive lock on the table

Once we went into production with our warehouse, we had to convert all
of our LOAD scripts to use IMPORT.

This is LUW 8.1.6; I don't know if 8.2 has the same restrictions...

Doug

Nov 12 '05 #5
Ian
Doug Crowson wrote:

- LOAD would leave the tablespace in 'backup pending' state, thus
requiring a backup before we could continue to use the table
- you could use the NONRECOVERABLE option to avoid this, but then you
couldn't roll forward through the logs past the load

You need to use COPY YES in order to perform a recoverable load and
avoid placing the tablespace into backup pending state.

- LOAD also needed to acquire a super-exclusive lock on the table

Yes, but if you are doing an online load, the Z-lock on the table is
only held for a short time.
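A sketch combining both options (the copy path and names are
placeholders):

-- recoverable online load: the loaded data is also written to a copy
-- image, so the tablespace is not left in backup pending state
load from mydata.del of del insert into mort.big_table copy yes to /db2/copies allow read access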

Nov 12 '05 #6
You need to use COPY YES in order to perform a recoverable load and
avoid placing the tablespace into backup pending state.

Yeah, I try very hard to avoid using LOAD on a transactional database.
In the case of my warehouse, it is non-transactional - and the load
files are the backups: if a recovery is required, we move the
compressed files from archive to input, and the loader takes care of
it. That simplifies most things.

- LOAD also needed to acquire a super-exclusive lock on the table

Yes, but if you are doing an online load, the Z-lock on the table is
only held for a short time.

Yeah, I've only really had problems with LOAD ... ALLOW READ ACCESS
when the server was getting hit by a massive barrage of some 60,000
queries driving canned reports. Other than that, a long lock wait time
of 120 seconds or so (assuming an average query duration of 5 seconds)
has worked fine.

One other thing I forgot to mention: the INSERT_UPDATE import option
is really handy, and I'm now often using it for smaller-volume
(< 100,000 row) table ETL operations, especially when concurrency is
tricky.
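For example (hypothetical names; INSERT_UPDATE requires the target
table to have a primary key):

-- rows whose primary key already exists are updated, the rest are inserted
import from delta.del of del commitcount 500 insert_update into mort.small_table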

Nov 12 '05 #7
Ian
kenfar wrote:

One other thing I forgot to mention: the INSERT_UPDATE import option
is really handy, and I'm now often using it for smaller-volume
(< 100,000 row) table ETL operations, especially when concurrency is
tricky.


Obviously you've been around the block a few times, but from a
concurrency perspective, you do realize that IMPORT (by default)
takes an exclusive lock on the table it is writing to, right?
This was a MAJOR concurrency issue; I had to write a replacement
for the import utility to avoid it in V7.2.

One of the V8 fixpacks (finally!) let you work around this
requirement, and FP9 added the ALLOW WRITE ACCESS option to IMPORT.
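On FP9 or later that might look like this (names are placeholders; as
I recall, ALLOW WRITE ACCESS must be combined with COMMITCOUNT):

-- row locks instead of an exclusive table lock
import from mydata.del of del allow write access commitcount 1000 insert into mort.big_table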

Nov 12 '05 #8
Thanks for all the input; it has been really insightful. I've decided
to go with LOAD first (COPY YES) and see how long the job takes. There
will be hardly any users on the system at the time I do the load, so
if I can do it in under 1 minute, it's okay.

If this does not work, I'll wipe the table first (an empty LOAD) and
then do an IMPORT. As mentioned, responsiveness is more of an issue
than the user seeing wrong/missing data for a few seconds.
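For the "empty LOAD" wipe I'm assuming the usual /dev/null trick on a
Unix-like system (table name made up):

-- replace the table contents with nothing, truncating it almost instantly
load from /dev/null of del replace into mort.big_table nonrecoverable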

This job is the only one that ever writes to the tables.

Thanks.

Morten

Nov 12 '05 #9

"... IMPORT (by default)
takes an exclusive lock on the table it is writing to" - if I do an
IMPORT with REPLACE option, and someone tries to SELECT from the table
while I'm importing, their SELECT will wait until..? Next commit from
the IMPORT or until the job is done?

Thanks.

Morten

Nov 12 '05 #10
us****@kikobu.com wrote:
"... IMPORT (by default)
takes an exclusive lock on the table it is writing to" - if I do an
IMPORT with REPLACE option, and someone tries to SELECT from the table
while I'm importing, their SELECT will wait until..? Next commit from
the IMPORT or until the job is done?


Not positive on this one - but I don't think this will work for you: I
assume that the data previously in the table is completely wiped out
by the time of the first commit. So you would likely want to make this
operation all or nothing, either via LOAD or by committing only once.

An alternative in this scenario might be the INSERT_UPDATE option,
which will only take row locks if you use ALLOW WRITE ACCESS (thanks,
Ian, for the reminder). Of course, that won't delete data.

ken

Nov 12 '05 #11
Thanks for the input. If the REPLACE option is used, IMPORT will
truncate the table when it starts (as far as I can tell from the
docs). The question now is whether IMPORT locks the table for the
entire process, or allows pending read requests to run in between
commits. I need to delete the existing data (hence the REPLACE
option), so I'm afraid an "all or nothing" scenario is very hard to
achieve using LOAD/IMPORT.

Nov 12 '05 #12
Welp.. IMPORT does not allow me to read in between commits. But it
surprises me that it is so extremely slow. I have written a Java
client that uses JDBC and is very much faster (20-50 times), which
really surprises me. The Java client uses batching (100 inserts sent
over the wire at a time, then a commit) and stored procedures, while
the IMPORT job commits after every 500 rows. I guess the IMPORT job
sends one insert at a time across the wire.

Nov 12 '05 #13
Ian
us****@kikobu.com wrote:
I guess the IMPORT job sends one insert at a time across the wire.


Yes, by default IMPORT does one row per INSERT statement. You can say:

import ... of del MODIFIED BY COMPOUND=x ...

(I think 0 < x <= 100)

This allows multiple rows to be sent in each INSERT statement and can
often result in better performance.
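For example (file and table names made up):

-- send up to 100 rows per insert statement instead of one at a time
import from mydata.del of del modified by compound=100 insert into mort.big_table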
Nov 12 '05 #14
