How to learn XLS-file structure?
- Eugene Lutsenko
- Posts: 1649
- Joined: Sat Feb 04, 2012 2:23 am
- Location: Russia, Southern federal district, city of Krasnodar
- Contact:
How to learn XLS-file structure?
How to learn XLS-file structure, including for MS Excel-2010? It is necessary to load the XLS-file into the DBF-table as previously before loading this table has to be created (empty). Maybe someone has a ready function which returns structure in the form of the array for the DBF-table if this function as parameter to set the XLS-file? Or there can be someone knows XLS to DBF converter which could be started from a command line or the program on Alaska with parameters: "The entrance XLS-file", "the output DBF-file". Certainly it isn't difficult to make it by means of the dialogue converter or of MS Excel (2003) or OpenOffice, but there is no wish to load with it users, so for some of them it is a problem. And just such users don't have these programs.
-
- Posts: 484
- Joined: Wed Jan 27, 2010 10:25 pm
- Location: Berlin Germany
Re: How to learn XLS-file structure?
Hi Eugene,
perhaps you can start with the following:
Now you have the hole excelfile in aValues. Then you can analyse the header:
here you have to check the length of colname<=10, delete all non usable characters from the name and so on.
Then you can analyse the values in each column from 2 to len(aValues) to get the the max length for "C"/"M" fields or the max value for "N" fields.
Add colname, Valtype() of column, maxlen to an array and build a database from that array.
Merry christmas
perhaps you can start with the following:
Code: Select all
v_file:="C:\exceltest\inp_data5.xls"
oExcel:= CreateObject( "EXCEL.APPLICATION" )
IF !( oExcel == NIL )
oWorkBook := oExcel:workbooks:open(v_file)
aValues := oWorkBook:workSheets(1):usedRange:value
oWorkBook:close()
oExcel:quit()
oExcel:destroy()
Code: Select all
aHeaderNames:=aValues[1]
ll:=len(aHeaderNames)
aStruct:={}
for i:=1 to ll
colname:=aValues[1][i]
....
Then you can analyse the values in each column from 2 to len(aValues) to get the the max length for "C"/"M" fields or the max value for "N" fields.
Add colname, Valtype() of column, maxlen to an array and build a database from that array.
Merry christmas
_______________________
Best Regards
Wolfgang
Best Regards
Wolfgang
- Eugene Lutsenko
- Posts: 1649
- Joined: Sat Feb 04, 2012 2:23 am
- Location: Russia, Southern federal district, city of Krasnodar
- Contact:
Re: How to learn XLS-file structure?
I thank, Wolfgang! I will try now. So cheerful Christmas is provided to me: ).
And to you too I wish to have well Christmas holidays, health and successes in New Year!
And to you too I wish to have well Christmas holidays, health and successes in New Year!
- Eugene Lutsenko
- Posts: 1649
- Joined: Sat Feb 04, 2012 2:23 am
- Location: Russia, Southern federal district, city of Krasnodar
- Contact:
Re: How to learn XLS-file structure?
Thanks Wolfgang and Roger!
Everything understood.
Everything understood.
Code: Select all
FUNCTION DC_Excel2WorkArea( cExcelFile )
LOCAL lStatus := .f., oExcel, cPath, oSheet, oBook, aValues, i, j, ;
aStru, xValue
#if XPPVER > 1900000
// Create the "Excel.Application" object
oExcel := CreateObject("Excel.Application")
IF Empty( oExcel )
DC_WinAlert( "Excel is not installed" )
RETURN .f.
ENDIF
#else
DC_WinAlert('This feature is available in Xbase++ 1.9 and later only!')
RETURN .f.
#endif
oExcel:Visible := .f.
// Load a Workbook from an .XLS file
// Get path from ini file
IF !File(cExcelFile)
DC_WinAlert( 'File does not exist:' + Chr(13) + cExcelFile )
RETURN lStatus
ENDIF
oBook := oExcel:Workbooks:Open(cExcelFile)
oSheet := oBook:activeSheet
aValues := oBook:workSheets(1):usedRange:value
aStru := dbStruct()
FOR i := 3 TO Len(aValues)
dbAppend()
FOR j := 1 TO Len(aValues[i])
xValue := aValues[i,j]
IF Valtype(xValue) == 'N' .AND. aStru[j,2] == 'C'
xValue := DC_XtoC(xValue)
ELSEIF aStru[j,2] == 'D' .AND. Valtype(xValue) == 'N'
xValue := Str(xValue)
xValue := StoD(xValue)
ELSEIF aStru[j,2] == 'D' .AND. Valtype(xValue) == 'C'
xValue := Ctod(xValue)
ELSEIF aStru[j,2] == 'C' .AND. Valtype(xValue) == 'D'
xValue := DtoS(xValue)
ENDIF
FieldPut(j,xValue)
NEXT
NEXT
oBook:close()
oBook:destroy()
// Quit Excel
oExcel:Quit()
oExcel:Destroy()
RETURN lStatus
Last edited by Eugene Lutsenko on Sun Dec 23, 2012 8:47 am, edited 1 time in total.
- Eugene Lutsenko
- Posts: 1649
- Joined: Sat Feb 04, 2012 2:23 am
- Location: Russia, Southern federal district, city of Krasnodar
- Contact:
Re: How to learn XLS-file structure?
Thanks Wolfgang and Roger! Everything understood.
Last edited by Eugene Lutsenko on Sun Dec 23, 2012 8:47 am, edited 1 time in total.
- Eugene Lutsenko
- Posts: 1649
- Joined: Sat Feb 04, 2012 2:23 am
- Location: Russia, Southern federal district, city of Krasnodar
- Contact:
Re: How to learn XLS-file structure?
While here that turns out:
To me it is unclear why numerical values of the xls-file are loaded always with two signs after a comma whereas them there can be a different quantity.
Can be to eat possibility of access to information on formats and types of data of the xls-file not to learn these parameters by the analysis of directly values. It would be very good since values can be much and it can take noticeable time.
Still I didn't lick into shape definition of formats of numerical values with fractional part.
In the future I want to make option of this function which can be started from a command line with parameter - the name xls or the xlsx-file.
Code: Select all
/*
This test program will export Excel-Sheet to Work Area DBF
*/
#INCLUDE "inkey.ch"
#INCLUDE "dcdir.ch"
#INCLUDE "appevent.ch"
#INCLUDE "xbp.ch"
#INCLUDE "dcprint.ch"
#INCLUDE "dcmsg.ch"
#INCLUDE "dll.ch"
#include "dccursor.ch"
#Include "thread.ch"
#INCLUDE "class.CH"
#INCLUDE "dmlb.CH"
#INCLUDE "fileio.CH"
#INCLUDE "dctree.CH"
#INCLUDE "dcicon.CH"
#INCLUDE "dcdialog.CH"
#INCLUDE "outlook.CH"
#pragma Library("ascom10.lib")
#pragma library( "dclip1.lib" )
#pragma library( "dclip2.lib" )
#pragma library( "dclipx.lib" )
FUNCTION Main()
PUBLIC cExcelFile := 'Inp_data'
PUBLIC cDbaseFile := cExcelFile
IF LEN(ALLTRIM(cExcelFile)) = 0
MsgBox('Задайте в качестве параметра имя конвертируемого Excel-файла без расширения: "XLS-DBF FileName"')
RETURN NIL
ENDIF
***** В будущем сделать определение по расширению какой Excel: 2003 или 2010
***** Проверить, как будет работать для 2010
CLOSE ALL
cExcelFile = DC_CurPath() + '\' + ALLTRIM(cExcelFile) + '.xls'
DC_Excel2WorkArea( cExcelFile )
RETURN nil
* ---------------
PROC appsys ; RETURN
* -------------
FUNCTION DC_Excel2WorkArea( cExcelFile )
LOCAL lStatus := .f., oExcel, cPath, oSheet, oBook, aValues, i, j, ;
aStru, xValue
#if XPPVER > 1900000
// Create the "Excel.Application" object
oExcel := CreateObject("Excel.Application")
IF Empty( oExcel )
DC_WinAlert( "Excel is not installed" )
RETURN .f.
ENDIF
#else
DC_WinAlert('This feature is available in Xbase++ 1.9 and later only!')
RETURN .f.
#endif
oExcel:Visible := .f.
// Load a Workbook from an .XLS file
// Get path from ini file
IF !File(cExcelFile)
DC_WinAlert( 'File does not exist:' + Chr(13) + cExcelFile )
RETURN lStatus
ENDIF
oBook := oExcel:Workbooks:Open(cExcelFile)
oSheet := oBook:activeSheet
aValues := oBook:workSheets(1):usedRange:value // Загрузка 1-го листа Excel-книги
aFieldName := aValues[1] // Имена колонок 1-го листа Excel-книги
DC_DebugQout( { aValues[1], aValues[2], aValues[3] } )
*DC_DebugQout( { aValues } )
MsgBox('Stop')
N_Rec = LEN(aValues) // Количество строк с данными в xls-файле
N_Col = LEN(aFieldName) // Количество колонок с данными в xls-файле
*MsgBox(STR(N_rec,15))
// Создание пустой dbf-таблицы для записи в нее данных из 1-го листа xls-файла ******************
// Определение структуры для dbf-таблицы по значениям всех строк (тип и формат данных в колонках)
// На основе просмотра ВСЕГО xls-файла сделать массивы описания полей и для создания структуры брать MAX из каждого массива
// В dbf-таблицу переносить данные, начиная со 2-й строки xls-листа
// 1-ю строку с наименованиями полей использовать вместо Inp_name.txt или чтобы автоматически его делать
aStructure := {} // Массив структуры dbf-таблицы
PRIVATE aFieldType[N_Col] // Тип значений в j-й колонке (если есть хотя бы одно текстовое, то текстовое)
PRIVATE aFieldSize[N_Col] // Максимальный размер значения в j-й колонке в символах
PRIVATE aFieldInt [N_Col] // Максимальное количество значащих цифр до десятичной точки в значениях j-й колонки
PRIVATE aFieldDeci[N_Col] // Максимальное количество значащих цифр после десятичной точки в значениях j-й колонки
PRIVATE aFlag_neg[N_Col] // Флаг наличия отрицательных значений в j-й колонке
AFILL(aFieldType, "X" )
AFILL(aFieldSize, 1 )
AFILL(aFieldInt , 1 )
AFILL(aFieldDeci, 0 )
AFILL(aFlag_neg , .F. )
FOR j=1 TO N_Col // Цикл по колонкам
FOR i=2 TO N_Rec // Цикл по строкам
mFieldType = VALTYPE(aValues[i,j])
DO CASE
CASE mFieldType = "C" // **************************************************************************
aFieldType[j] = "C" // Тип значений в j-й колонке (если есть хотя бы одно текстовое, то текстовое)
aFieldSize[j] = MAX(aFieldSize[j], LEN(ALLTRIM(aValues[i,j])))
aFieldDeci[j] = 0
CASE mFieldType = "N" // **************************************************************************
IF aValues[i,j] < 0
aFlag_neg[j] = .T.
ENDIF
IF aFieldType[j] <> "C"
aFieldType[j] = mFieldType
ENDIF
mCel = INT(aValues[i,j]) // Целая часть
FOR mNznCel = 1 TO 22 // Определить число значащих цифр ДО запятой без ведущих нулей
IF mCel < 10 ^ mNznCel
aFieldInt[j] = MAX(aFieldInt[j], mNznCel)
EXIT
ENDIF
NEXT
IF aValues[i,j] = mCel // Целое
aFieldDeci[j] = 0
ELSE // С дробной частью
mDrob = aValues[i,j] - mCel // Дробная часть
FOR mFieldDeci = 1 TO 22 // Определить число значащих цифр ПОСЛЕ запятой (найти первый 0 и выйти)
mV = mDrob * 10 ^ mFieldDeci
IF mV = INT(mV)
aFieldDeci[j] = MAX(aFieldDeci[j], mFieldDeci)
EXIT
ENDIF
NEXT
ENDIF
// Если с дробной частью, то плюс символ на точку
// Если отрицательное, то символ на знак
IF aFieldDeci[j] = 0
aFieldSize[j] = MAX(aFieldSize[j], aFieldInt[j] + IF(aFlag_neg[j],1,0))
ELSE
aFieldSize[j] = MAX(aFieldSize[j], aFieldInt[j] + aFieldDeci[j] + 1 + IF(aFlag_neg[j],1,0))
ENDIF
CASE mFieldType = "D" // **************************************************************************
aFieldType[j] = "C"
aFieldSize[j] = 8
aFieldDeci[j] = 0
ENDCASE
NEXT
AADD(aStructure, { aFieldName[j] , aFieldType[j], aFieldSize[j], aFieldDeci[j] }) // Добавить строку в структуру dbf-таблицы
NEXT
DC_DebugQout( { aStructure } )
MsgBox('Stop')
cDbaseName = cDbaseFile
cDbaseFile = cDbaseFile + '.dbf'
DbCreate( cDbaseFile , aStructure )
***** Цикл переноса данных из xls-листа в dbf-таблицу
CLOSE ALL
cDbaseName = SUBSTR(cDbaseFile, 1, AT('.',cDbaseFile)-1)
USE (cDbaseName) EXCLUSIVE NEW;ZAP
SELECT (cDbaseName)
FOR i=2 TO N_Rec
DBAPPEND()
FOR j=1 TO N_Col
// Преобразовывать значения с типами числа и даты в текст, если тип поля текст
DO CASE
CASE VALTYPE(aValues[i,j]) = 'C' .AND. aFieldType[j] = 'C'
FIELDPUT(j, aValues[i,j]) // Просто занести
CASE VALTYPE(aValues[i,j]) = 'N' .AND. aFieldType[j] = 'N'
FIELDPUT(j, aValues[i,j]) // Просто занести
CASE VALTYPE(aValues[i,j]) = 'N' .AND. aFieldType[j] = 'C'
FIELDPUT(j, STR(aValues[i,j], aFieldSize[j], FieldDeci[j] )) // Преобразовать число к текстовому виду
CASE VALTYPE(aValues[i,j]) = 'D' .AND. aFieldType[j] = 'C'
FIELDPUT(j, DTOC(aValues[i,j])) // Преобразовать дату к текстовому виду
ENDCASE
NEXT
NEXT
* IF Valtype(xValue) == 'N' .AND. aStru[j,2] == 'C'
* xValue := DC_XtoC(xValue)
* ELSEIF aStru[j,2] == 'D' .AND. Valtype(xValue) == 'N'
* xValue := Str(xValue)
* xValue := StoD(xValue)
* ELSEIF aStru[j,2] == 'D' .AND. Valtype(xValue) == 'C'
* xValue := Ctod(xValue)
* ELSEIF aStru[j,2] == 'C' .AND. Valtype(xValue) == 'D'
* xValue := DtoS(xValue)
*ENDIF
oBook:close()
oBook:destroy()
// Quit Excel
oExcel:Quit()
oExcel:Destroy()
RETURN lStatus
* -------------
Can be to eat possibility of access to information on formats and types of data of the xls-file not to learn these parameters by the analysis of directly values. It would be very good since values can be much and it can take noticeable time.
Still I didn't lick into shape definition of formats of numerical values with fractional part.
In the future I want to make option of this function which can be started from a command line with parameter - the name xls or the xlsx-file.