This sample performs batch conversion of text files from various code pages (Unicode, UTF-8, windows-1250 and others) to one selected code page. The algorithm includes simple detection of the source file's code page using the BOM (byte order mark). For files without a BOM, it falls back to the charset declared in the HTML meta tag.
You can choose any destination charset. See also "ByteArray - save unicode data (string) as utf-8 with BOM" to save files with a BOM (Unicode little/big endian, UTF-8).
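The BOM check can be looked at in isolation. The following is a minimal sketch in plain VBScript, not part of the original sample: `CharSetFromBom` and its `BomLength` output parameter are hypothetical names, and the input is assumed to be the first bytes of a file already rendered as a hex string. The charset names are the ones the sample itself uses.

```vbscript
' Hypothetical helper (not part of ScriptUtils): maps the leading bytes of
' a file, given as a hex string, to a charset name used by the sample.
' BomLength receives the number of BOM bytes to skip (0 when no BOM is found).
Function CharSetFromBom(ByVal HexPrefix, ByRef BomLength)
  HexPrefix = UCase(HexPrefix)
  If Left(HexPrefix, 6) = "EFBBBF" Then
    'UTF-8 BOM is three bytes long
    CharSetFromBom = "utf-8"
    BomLength = 3
  ElseIf Left(HexPrefix, 4) = "FEFF" Then
    'unicode big endian
    CharSetFromBom = "unicodebig"
    BomLength = 2
  ElseIf Left(HexPrefix, 4) = "FFFE" Then
    'unicode little endian
    CharSetFromBom = "unicodelittle"
    BomLength = 2
  Else
    'no known BOM: the caller has to detect the charset another way
    CharSetFromBom = ""
    BomLength = 0
  End If
End Function

Dim SkipBytes
WScript.Echo CharSetFromBom("EFBBBF3C68", SkipBytes) & ", skip " & SkipBytes
' prints "utf-8, skip 3"
```

An empty result means no BOM was recognized, which is the case where a fallback such as the meta tag lookup is needed.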
Batch file conversion - character set and BOM detection of html files

    Const DestCharSet = "utf-8"
    'Const DestCharSet = "ascii"

    Dim FS
    Set FS = CreateObject("Scripting.FileSystemObject")

    ConvertFolder "f:\", "f:\1"

    'Converts all .htm files in InputPath and its subfolders.
    'All converted files are written directly to OutputPath.
    Function ConvertFolder(ByVal InputPath, OutputPath)
      Dim InputFolder, File
      Set InputFolder = FS.GetFolder(InputPath)
      For Each File In InputFolder.Files
        If LCase(Right(File.Name, 4)) = ".htm" Then
          WScript.Echo File.Path
          'WScript.Echo OutputPath & "\" & Replace(File.Path, ":", "")
          ConvertFile File.Path, OutputPath & "\" & File.Name, DestCharSet
        End If
      Next

      Dim FilesFolder
      For Each FilesFolder In InputFolder.SubFolders
        ConvertFolder FilesFolder.Path, OutputPath
      Next
    End Function

    Sub ConvertFile(SourceFileName, DestFileName, DestCharSet)
      'Read the source file contents
      Dim FileContents
      Set FileContents = ReadOneFile(SourceFileName)

      'Convert to the destination charset
      Set FileContents = FileContents.CharSetConvert(DestCharSet)

      'Save to a destination file
      FileContents.SaveAs DestFileName
    End Sub

    Function ReadOneFile(FileName)
      Dim ByteArray
      Set ByteArray = CreateObject("ScriptUtils.ByteArray")

      'Read the first two bytes from the file
      ByteArray.ReadFrom FileName,,2

      Select Case ByteArray.HexString
        Case "FEFF"
          'unicode big endian
          ByteArray.CharSet = "unicodebig"
          'Read the file from the 3rd byte to the end.
          ByteArray.ReadFrom FileName,3
        Case "FFFE"
          'unicode little endian
          ByteArray.CharSet = "unicodelittle"
          'Read the file from the 3rd byte to the end.
          ByteArray.ReadFrom FileName,3
        Case Else
          'Read the first three bytes from the file
          ByteArray.ReadFrom FileName,,3
          If ByteArray.HexString = "EFBBBF" Then 'unicode utf-8
            'Read the file contents behind the BOM header
            ByteArray.ReadFrom FileName,4
            ByteArray.CharSet = "utf-8"
          Else
            'Read the whole contents of the file in other cases
            ByteArray.ReadFrom FileName
            On Error Resume Next
            'Try to detect the charset from the source data
            ByteArray.CharSet = DetectCharSet(ByteArray.String)

            'Set some default charset (default is OEM)
            'If Err <> 0 Then ByteArray.CharSet = "windows-1250"
          End If
      End Select
      Set ReadOneFile = ByteArray
    End Function

    'The function detects the charset from the source string data.
    Function DetectCharSet(Data)
      On Error Resume Next
      Dim charset
      'The charset tag usually looks like
      '<meta http-equiv="Content-Type" content="text/html; charset=windows-1250">
      charset = Split(Data, "charset=", 2, vbTextCompare)(1)
      If Len(charset) > 0 Then
        charset = Split(charset, """", 2, vbTextCompare)(0)
      End If
      DetectCharSet = charset
    End Function
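The `Split`-based lookup in `DetectCharSet` assumes the charset value is followed by a double quote, as in the `http-equiv` form of the meta tag. As a hedged alternative, the sketch below uses the built-in VBScript `RegExp` class so that it also accepts quoted values such as `<meta charset="utf-8">` and single quotes. The name `DetectCharSetRegExp` is hypothetical, the function is not part of the original sample, and the pattern is only an assumption about what typical meta tags look like.

```vbscript
' Hypothetical variant of DetectCharSet based on the built-in RegExp class.
' Returns the lowercased charset name, or "" when no charset is declared.
Function DetectCharSetRegExp(Data)
  Dim re, matches
  Set re = New RegExp
  re.IgnoreCase = True
  re.Global = False
  'charset, "=", an optional quote, then the charset token itself
  re.Pattern = "charset\s*=\s*[""']?\s*([A-Za-z0-9_.:\-]+)"
  Set matches = re.Execute(Data)
  If matches.Count > 0 Then
    DetectCharSetRegExp = LCase(matches(0).SubMatches(0))
  Else
    DetectCharSetRegExp = ""
  End If
End Function

WScript.Echo DetectCharSetRegExp("<meta http-equiv=""Content-Type"" content=""text/html; charset=windows-1250"">")
' prints "windows-1250"
WScript.Echo DetectCharSetRegExp("<meta charset='UTF-8'>")
' prints "utf-8"
```

Because the function returns an empty string instead of raising an error, the caller can test the result and apply a default charset explicitly rather than relying on `On Error Resume Next`.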
ByteArray works with safearray binary data: it saves and restores binary data to and from a disk, converts it to a string or hex string, and performs code page/charset conversions, Base64 conversion, etc.
ByteArray is a COM class specially designed to work with Microsoft Windows Scripting engines - VBScript and JScript in Active Server Pages or WSH, and in CHM or HTA applications. It also works with VB.NET, Visual Basic (VBA - VB 5, VB 6, Word, Excel, Access, …), C#, J#, C++, ASP, ASP.NET, Delphi and with T-SQL OLE functions - see the "Use ByteArray object" article. You can also use the object in other programming environments with COM support, such as PowerBuilder.
Source code for ByteArray is available under the distribution license; please see the License page for ASP file upload and ScriptUtilities.
Huge ASP upload is an easy-to-use, high-performance ASP file upload component with a progress bar indicator. The component lets you upload multiple files of up to 4 GB to a disk or a database along with other form fields. Huge ASP file upload is a feature-rich upload component with a competitive price and great performance. There is also a free version of ASP upload with progress, called Pure ASP upload, written in plain VBScript without components, so you do not need to install anything on the server. The installation package also contains the ScriptUtilities library. ScriptUtilities lets you create high-performance log files, work with binary data, download multiple files with zip/arj compression, work with INI files and much more.
© 1996 - 2011 Antonin Foller, Motobit Software | e-mail: info@pstruh.cz