Friday, August 28, 2015

Importing/Loading data from CSV files into SAP Hana

SAP Hana and loading of data from CSV files

Use case: There is one machine with CSV file and another machine with SAP Hana database. You want to import data from the CSV file into the SAP Hana table. That's it.

There are at least 4 options how to import CSV files into SAP Hana.

  1. Import using SAP Hana Studio
  2. Import using control file and import statement
  3. Parse the file on your own and insert values using JDBC driver
  4. Import using Smart Data Integration
Lets compare all listed methods.

SAP Hana Studio

Import using SAP Hana Studio is manual approach of loading CSV file into SAP Hana. It is very easy and straight forward. The file is on your local machine, the studio will transfer the data into Hana table. The biggest disadvantages are the speed of the process and the disability of triggering the load automatically.

Import statement and control files

This method seems to be the fastest and can be automatized but... your file has to reside on Hana server. That means that you have to transfer your data first. If you deal with big amount of data then your data file might have 20 GB or even 100+ times more. You can use FTP to transfer the data but you will waste disk resources on Hana server. Other option is to share file system and this option might not be the ideal for you.

JDBC Import

So you think that all previous options are bad and you want to import your data using JDBC which is standard and easy to use. You can read your CSV file and push/insert the data using multiple threads. You will use batches, that means you will commit after few thousands of inserts. You can disable persistent merge on the target table and the import will be much faster. There is always one problem with this approach and it is wasting of database resources - connections, transactions. Wasting of such resources might cause some problems for your enterprise application.

Smart Data Integration

Smart Data Integration is probably very good solution for loading any data into SAP Hana. Your source data can be stored in different types of databases or even in CSV file but.. You will have to run SDI Agent on the machine which will read your source CSV data. Your account has to be connected with license that is eligible to download the client:) as it is mentioned here in this post https://scn.sap.com/thread/3707101


Thursday, August 13, 2015

SAP Hana is missing MERGE INTO so let's use UPSERT

MERGE INTO is great. SAP HANA SPS 10 has no MERGE INTO... So, let's use UPSERT - it might be fine for certain use cases, especially when your target table has no PRIMARY KEY. You can create PRIMARY KEY and UPSERT will use it to determine if the row should be inserted or updated. This example will show you how to do it:
create table a (x int, y int, primary key (x));

create table b (x int, y int);

insert into a (x, y) values (1,1);
insert into a (x, y) values (2,2);
insert into a (x, y) values (3,3);
insert into a (x, y) values (4,4);
insert into a (x, y) values (5,5);

insert into b (x, y) values (1, 10);
insert into b (x, y) values (2, 20);
insert into b (x, y) values (3, 30);
insert into b (x, y) values (40, 40);
insert into b (x, y) values (50, 50);

upsert a select * from b;

select * from a;
The last select will result in:
X;Y
1;10
2;20
3;30
4;4
5;5
40;40
50;50
So records were updated if the key has been already there and new records were inserted.