There are a lot of situations where you will need to move data from one database to another, or from one data format to another. For example, if you have a CSV file containing an address list, it will most likely not fit the format/style of your existing address table. Another situation is converting data from one e-store database to another -- most e-commerce software has similar data formats but different fieldnames and possibly different data structures. This article will show a few techniques for moving data accurately. The techniques should work with any good database administrative interface that has the capability to import data and to write/execute SQL statements.

Importing

The first thing I do when I need to move data into an already existing database is to bring the new tables into the database temporarily. It is faster and easier to work within the same database rather than using a text file or working across different databases. To keep the tables somewhat separated from the others, prepend the table with TEMP_ or something similar. This allows you to write SQL statements to import the data into your existing tables and make your data transformations.

Planning and Moving Data

Next, you will need to plan the migration process. Does the data map correctly field to field? If so, the process is easy and not much planning is required. The data can usually be simply appended to a table with a simple INSERT statement. You can write a SQL statement like this to move the data:

INSERT Products
(productName,
productPrice,
productCategory)

SELECT somefield AS productName,
someOtherField AS productPrice,
yetAnotherField AS productCategory)
FROM TEMP_products

Is there an auto-increment or identity column? If so, there could be some additional problems. If you are inserting data without the primary key (simply adding new data to the table without worrying about keeping an existing primary key) the data can be imported directly exactly as shown in the previous example, but if you need to keep your existing key field you could have some difficulty. Using a basic e-commerce situation as an example, let's say your existing table has productID values of 1-200, but the table you want to import has productID values of 90-300. Obviously there is an overlap in key values from 90-200. This might not be a problem if you can simply assign new productIDs, but if you need to keep your existing productID values you will need to make some adjustments. For example, if the data you are importing has associated orders (a customer has bought X items of productID "93"), then you want to keep this existing imported productID, but still give the product a new productID to keep it consistent with the database. In this case, you can still do a fairly simple insert, but you will need to plan these other steps along the way:

  1. Add a temporary field to the table that you are importing TO: old_productID
  2. Import the data as shown above, but move the existing productID field into the new old_productID field
  3. After importing your Orders or OrderDetails table, you will need to create an UPDATE statement to update the productID field to match the NEW productID:

UPDATE Orders
SET Orders.productID = Products.productID
FROM Orders, Products
WHERE Orders.productID = Products.old_productID

If there is no overlap in primary key data, you can simply import your data as described in the first situation, however you will need to turn off the identity insert. The following works for SQL Server:

SET IDENTITY_INSERT Products ON

INSERT Products
(productID,
productName,
productPrice,
productCategory)

SELECT productId,
somefield AS productName,
someOtherField AS productPrice,
yetAnotherField AS productCategory)
FROM TEMP_products

SET IDENTITY_INSERT Products OFF

If you are using a database other than SQL Server, you can temporarily set the properties of the primary key to be a regular INT data type, then set it back to an auto-incrementing field after the data is inserted. Note that this only works when there are no overlapping key fields.

Is the data normalized? What if your Products table does not have a place for a product category -- product categories are stored in a many-to-many table -- but the table you are importing has categories stored with the product. Figure 1 shows the data relationship for your old table, and Figure 2 shows the new data relationships.

Figure 1: Old data relationships
Figure 1: Old data relationships

Figure 2: New Data Relationships
Figure 2: New data relationships

Assuming you have a Products table, Categories table, and ProductsCategories table, you'll need to follow these steps:

  1. Add a temporary field to the Categories table called old_categoryID
  2. Import categories using a statement like this:

INSERT Categories (CategoryName, old_categoryID)

SELECT somefield AS CategoryName,
idfield AS categoryID
FROM TEMP_categories

  1. Import your products as the previous example, making sure your old productID field is stored in the new Products table as old_productID.
  2. Insert category relation data into the ProductsCategories table

INSERT ProductsCategories

SELECT c.categoryID, p.ProductID
FROM Products p
INNER JOIN TEMP_products t
ON t.productID = p.old_productID
INNER JOIN Categories c
ON c.old_categoryID = t.categoryID

After performing these steps, your old products are now in the new Products table, assigned new ids, and linked to the new Categories table.

Conclusion

With a few simple steps, you can usually accomplish a database import whether your data matches completely or not. The key to performing the operation is to maintain your old primary keys while giving the data new primary keys. This allows you to do inserts and updates easily.