Eugene,
I have no time right now to search for leads on encoding ASCII files, although I tested many programs and utilities when I worked on encoding, decoding and testing the code pages of text files in my app.
One utility I do use is xcode.exe, attached. I could not find the web address it comes from, but I think it is a Russian product :) so you should be able to find it.
This utility can detect the code page of a file and convert between code pages, including CP866 and Windows-1251.
The syntax is: xcode -c -e %1 %1x
where -e switches the messages to English; if you leave it out, they will be in Russian.
When run as
xcode -c -e zp807231oem.red zp807231oem.redx
the program writes the code page used in the file zp807231oem.red to the file zp807231oem.redx
(the .red file is just a text file).
This is the result: cp866: zp807231ansi.rec
The full syntax for xcode is:
Usage: xcode -E -[hH?] -[wkaim1234567890] +[wkaim1234567890] [-q] [in [out]]
-E -h in English (don't forget to add -h or -H switch!)
-v print version information
-H manual, list of 14 encodings supported, and view YO-ware license
-d double recoding (try if simple 'xcode' failed)
-q quoted-printable decoding (useful for decoding MIME-files)
-l decode html Unicoded text (like &#1044;&#1080;&#1084;&#1072; for Дима)
-c determine encoding and print it to the output (see details by -H)
-t do unix2dos transformation (convert LF to CR/LF) in DOS/Win only
-p pipe mode (applies to DOS/Win environment only)
-s silent mode (no information on encodings displayed)
If input/output files are not specified, the standard input/output is used.
-a to set cp866 output (default)
-w to set cp1251 output
-k to set koi8-r output
-i to set iso8859-5 output
-m to set mac output
+a to force cp866 input
+w to force cp1251 input
+k to force koi8-r input
+i to force iso8859-5 input
+m to force mac input
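As a rough illustration of what converting between these code pages means, here is a minimal Python sketch. It is not how xcode works internally; the `recode` helper and the `XCODE_LETTERS` table are hypothetical names, and mapping "mac" to Python's `mac_cyrillic` codec is an assumption.

```python
# Illustrative only: the letter-to-codec table is an assumption that mirrors
# the switch list above; "mac" is assumed to mean Mac Cyrillic.
XCODE_LETTERS = {
    "a": "cp866",
    "w": "cp1251",
    "k": "koi8_r",
    "i": "iso8859_5",
    "m": "mac_cyrillic",
}

def recode(data: bytes, src: str, dst: str) -> bytes:
    """Decode data as code page src and re-encode it as code page dst."""
    return data.decode(src).encode(dst)

# The same word as DOS (cp866) bytes, then converted to Windows (cp1251) bytes.
oem = "Дима".encode(XCODE_LETTERS["a"])
ansi = recode(oem, XCODE_LETTERS["a"], XCODE_LETTERS["w"])
print(oem.hex(), "->", ansi.hex())
```

The text stays the same; only the byte values change, which is why a file saved in one code page shows wrong letters when read in another.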
Another utility is the free converter from PokludaCZ,
http://www.pokluda.cz
It can also be run from the command line:
czkonverze /00 /20 "zp807231oem.red" >vystup1.log
czkonverze /20 /00 "zp807231ansi.red" >vystup2.log
but this utility supports only Windows-1250, not Windows-1251.
In Alaska I have written only a code page detector. It counts how often certain characters occur and then uses those statistics to decide which code page the text is closest to.
Here is some of the source; the input parameter is one line of the text:
***************************
* TEXT CODE PAGE DETECTOR *
***************************
****************************
FUNCTION DETEKTORCP(riadok)
****************************
* define the variables and the array of characters used for detection
Local pocet[7]
/*
Local detect := ;
{ "č‡č‹Ćc", ;
"ų©żųŽŅr", ;
"Øē¹äÓs", ;
"˛‘§¾ģŚz", ;
"ó¢¢ó—Ļo", ;
"į į‡Įa", ;
"é‚‚éˇ×e", ;
"ś££śÕu", ;
"ķķ’Éi" ;
}
*/
Local detect := ;
{ "č‡č‹Ćc", ;
"ų©żųŽŅr", ;
"Øē¹äÓs", ;
"˛‘§¾ģŚz", ;
"ó¢¢ó—Ļo", ;
"į į‡Įa", ;
"é‚‚éˇ×e", ;
"ś££śÕu", ;
"ķķ’Éi", ;
"Čķ’Éi", ;
"¼ķ’Éi", ;
"ķ’Éi", ;
"¨ķ’Éi", ;
"ˇķ’Éi", ;
"Ļķ’Éi", ;
"żķ’Éi" ;
}
* reset the counters
for k=1 to 7
   pocet[k]:=0
next
* loop to read and test every character of the line
for i=1 to len(riadok)
   * test only characters above CHR(127)
   if riadok[i]>chr(127)
      * scan the character sets (9 originally, 16 now)
      * for j=1 to 9
      for j=1 to 16
         * test every character of the set; each set has 7 characters
         for k=1 to 7
            if riadok[i]==detect[j][k]
               pocet[k]++
            endif
         next
      next
   endif
next
* here evaluate which characters occur most often according to pocet[k]
*ladenie("pocet[1]"+str(pocet[1]))
*ladenie("pocet[2]"+str(pocet[2]))
*ladenie("pocet[3]"+str(pocet[3]))
*ladenie("pocet[4]"+str(pocet[4]))
*ladenie("pocet[5]"+str(pocet[5]))
*ladenie("pocet[6]"+str(pocet[6]))
*ladenie("pocet[7]"+str(pocet[7]))
/*
k=1
pompocet=pocet[k]
if pocet[2]>pompocet
pompocet=pocet[2]
k=2
endif
if pocet[3]>pompocet
pompocet=pocet[3]
k=3
endif
if pocet[4]>pompocet
pompocet=pocet[4]
k=4
endif
if pocet[5]>pompocet
pompocet=pocet[5]
k=5
endif
if pocet[6]>pompocet
pompocet=pocet[6]
k=6
endif
if pocet[7]>pompocet
pompocet=pocet[7]
k=7
endif
*/
* for now a simpler evaluation, because the complete one does not give correct results for CP850/Win1250
if pocet[1]>0
   kodstr=1250
else
   kodstr=852
endif
ladenie("code page "+str(kodstr))
RETURN kodstr
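The counting idea behind the detector can also be sketched in Python. This is a minimal sketch under assumptions, not a port of the function above: instead of the per-letter tables it scores each candidate code page by how many bytes above 127 decode to a Slovak or Czech accented letter. The names `ACCENTED`, `CANDIDATES` and `detect_code_page` are made up for the example.

```python
# Sketch of a frequency-based detector. The candidate list and the scoring
# alphabet are assumptions, not the tables from the Xbase++ source above.
ACCENTED = set("áäčďéíĺľňóôŕšťúýžÁÄČĎÉÍĹĽŇÓÔŔŠŤÚÝŽěřůĚŘŮ")
CANDIDATES = ("cp852", "cp1250")

def detect_code_page(line: bytes) -> str:
    """Return the candidate under which most high bytes decode to accented letters."""
    scores = {}
    for cp in CANDIDATES:
        score = 0
        for b in line:
            if b > 127:  # bytes up to 127 are identical in both code pages
                ch = bytes([b]).decode(cp, errors="replace")
                if ch in ACCENTED:
                    score += 1
        scores[cp] = score
    # On a tie (for example pure ASCII) max() keeps the first candidate, cp852.
    return max(CANDIDATES, key=lambda cp: scores[cp])

sample = "kódová stránka"
print(detect_code_page(sample.encode("cp852")))
print(detect_code_page(sample.encode("cp1250")))
```

With this sample the two calls should report cp852 and cp1250 respectively. Falling back to cp852 when nothing is found mirrors the default in the simplified evaluation above.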
Maybe this gives you some inspiration.