banner.jpg
Web Development and Tools Powered by WebSellerTM

 

Home

Tips
  Umlauts/UTF-8
  Normalization
  OemToAnsi
  MySQL Import
  Site Search
  City Desk
  DynDNS
  MXLookup
  SPAM

SDT

Support
  Downloads
  Contact Us

Impressum

German

RSS Feed

Site search:

Web Search

Made with CityDesk


MySQL Data Import

By Antje Binas-Holz
Last Update: Thursday, February 16, 2006

More information about phpMyAdmin and other tools can be found at PhpMyAdmin's Homepage.

Do you know the problem? You are about to migrate an existing project to MySQL while looking for an appropriate conversion tool. Utility ImpSql can be the solution for this problem. It supports DBF and CSV file import as well as all usual memo file formats like FPT (FoxPro, Comix), DBT (Clipper) and DBV (FlexFile).

ImpSql is a "waste product" of  SqlDBU dedicated to data file import for two different modes: Stand alone and Connection based.

Stand alone mode

When using ImpSql in stand alone (connectionless) mode data are converted without a connection to the target MySQL database. Instead of importing data to the appropriate table a text file containing MySQL import statements will be created. The resulting file can be imported into MySQL using a third party tool like  PhpMyAdmin or SqlDBU. Another interesting option of stand alone mode is using it as CGI application. Just type in the name of the file(s) you want to upload to the import utility and you will receive the resulting file as web download in return:

DBF or CSV file:
Memo file(.dbt,.fpt,.dbv):
Oem/Ansi Conversion:
Strings as UTF-8:
Suppress creation of primary key field RECNO:
Include deleted records
MySQL Version:

Notes: Please don't load up files bigger then 2 MByte. Don't use .xls files. This format is not supported.Your data will be converted but not stored at the server in any way. All temporary files necessary for conversion will be immediately deleted after finishing the request. Please set argument MySQL Version to 5 if MySQL version 5 or above has been installed at target sytem (see also Command line arguments).

Connection based mode

When using the utility along with a valid MySQL connection data will be straight imported into the appropriate table. It could be demonstrated here as well but how should you receive the resulting (already imported) MySQL table? You better download the project archive file impsql-10.zip or Impsql-1.0.tar.gz in order to unpack and run ImpSql at your machine. Please

Notes:

  • Free version of ImpSQL is limited to 50 entries. Please request a license file in order to abolish this limitation.
  • ImpSql can be used as CGI script or command line tool. The import file should be defined like follows (Windows):

    c:\> impsql.exe file=<DBF/CSV source file>

    Linux version:

    lx:/temp # impsql.cgi file=<DBF/CSV source file>

  • SQL connection data can be configured by setup  file impsql.ini. Type in your connection data like Host, User, password etc. before starting the Tool.

CSV Data Import Features

Please make sure to provide structure information along with you CSV data. T he first line of the .Csv file should contain the list of field names in order to deliver necessary database structure information. Here are two samples of valid CSV data:

"FIRST";"LAST";"LASTCALL"
"Wiley";"Morse";"2003-03-22"
"Edward";"Hyatt";"2003-08-11"
...

Another valid format looks like follows:

FIRST;LAST;LASTCALL
Wiley;Morse;2003-03-22
Edward;Hyatt;2003-08-11
...

But even without structure information ImpSql does the job for you with following built in features:

  • If field names are missing, ImpSql creates columns like "FIELD1", "Field2" until "FIELDn".
  • Column values are evaluated during import in order to recognize correct structure information like column size and type.
  • Invalid column names like empty strings or reserved words will be cured in order to prevent MySQL import errors.
  • Strings containing characters above 127 (non printable) will be imported as hex strings (e.g.  "Häberle" encoded as 0x48E46265726C65) in order to prevent changes in strings based on different character sets by MySQL escape function.
  • Optional Oem/Ansi translation if your CSV file contains Oem character set based strings. 
  • Optional conversion to UTF-8 for character and text fields in order to support UNICODE. 
        

DBF Import Features

The original dBASE II based DBF format has been extended into different directions. Development tools like CA-Clipper, CA-Visual Objects, MS-FoxPro or dBASE are using different techniques and formats for storing memo or blob values into DBF tables.

Utility Impsql is trying to support as much as possible DBF flavours including following features:

  • Full support of .DBT, .DBF and .FPT memo files .
  • Strings containing characters above 127 (non printable) will be imported as hex strings (e.g.  "Häberle" encoded as 0x48E46265726C65) in order to prevent changes in strings based on different character sets by MySQL escape function.
  • BLOB (Binary Large OBjects) will be translated to hex strings too for appropriate INSERT commands in order to preserve their exact content.
  • Optional Oem/Ansi translation for DBF file containing Oem (Windows/Dos) based strings.
  • Optional conversion to UTF-8 for character and memo fields in order to support UNICODE. 
     

Command line arguments

Passing command line options depends on the environment impsql is used within. When calling impsql within a command shell arguments are separated by blanks (" "), and within a CGI context separator is ampersand ("&"). Options itself are the same for CGI and command shell environment:

  • file=file_name (Input file name - required)
  • table=table_name (Name of the resulting MySQL table - optional)
  • oem2ansi=oem_file_name (File containing the Oem/Ansi translation codes - optional)
  • delim=delimiter_char (Delimiter character for CSV files - optional)
  • childtables=column_list (List of column names which has to be stored to separate tables - See MySQL Import and Normalization for more information)  
  • charset=utf8 (Strings and text fields will be stored as UTF-8 ).
  • primkey=off (Suppress primary key creation - field RECNO ).
  • deleted=[on|off] (Include deleted records while importing DBF files - on, or ignore deleted records - off. Default setting is off).
  • updatekey=<column_name> (Update existing rows and insert nonexisting ones. Use column <column_name> in order to identify which rows have to updated.
  • sqlversion=<version_number> Major MySQL Version at target system (optional - default version number is 4). Not necessary for connection based import since major version number will be determined internally. Important to specify for stand alone mode, if major MySQL version of target system is 5 or above. Avoiding argument sqlversion=5 for stand alone mode can cause error message "Data too long for column" during import of file created by ImpSQL into MySQL 5 database. For more info see Guaranteeing Data Integrity with MySQL 5.0.

Sample 1:
Calling impsql.exe running on top of Windows (W9x, W2K, XP) in order to import DBF file customer.dbf containing German Oem strings into table CUST:

c:\temp> impsql.exe file=customer.dbf oem2ansi=german.oem table=CUST

Sample 2:
Calling the Linux version in order to import CSV file cust.csv into table CUST using hash character "#" as delimiter:

lx:/tmp # impsql.cgi file=cust.csv table=CUST delim=#

Sample 3:
Calling impsql as CGI script (Linux version) in order to import CSV file cust.csv into table CUST:

http://www.mydomain.com/cgi-bin/impsql.cgi?file=cust.csv&table=CUST

Note: File cust.csv must have been uploaded by a POST request or via FTP onto the server.

Sample 4:
Calling impsql.exe running on top of Windows (W9x, W2K, XP) in order to import CSV file test.csv into the first normal form (see MySQL Import and Normalization for more information):

c:\temp> impsql.exe file=test.csv childtables=Group1,Group2,OS

Executables and other files

Directory bin of project archive file contains following executables and configuration files necessary to run MySQL import utility:

  • Impsql.cgi (Executable - Linux version)
  • Impsql.ini (Platform independing configuration file)
  • *.oem (Oem to Ansi translation codes for different languages)
  • Reserved.cfg (Configuration file containing all tokens reserved by MySQL)
  • Oemtab.exe (Win32 tool for creating missing Oem to Ansi translation codes)

Oem/Ansi Translation

One of ImpSql's optional features is Oem to Ansi translation for files created by Windows/Dos applications. This kind of translation is language depending and a constant source of problems. The best way to solve problems like this is to make the solution fully configurable. To do so the current translation code table can be set by configuration key OEM2ANSI (see setup file impsql.ini, section [ENVIRONMENT]). By default value OEM2ANSI points to another setup file containing translation codes for German language:

[ENVIRONMENT]
...
Oem2Ansi=german.oem

This default setting can be changed or overwritten by command line optioin oem2ansi=mylanguage.oem, which assumes that mylanguage.oem is available. If the .oem file for the language you are looking for is not part of the current archive it can be easily created. See article From Oem to Ansi in order to see how.