UTF8 is a sort of packed representation of a series of Unicode characters, in which a character takes between one and four bytes; the accented Latin letters in our sample take two.

At this point, we can guess that our old friend Notepad (on Windows 10, en_US version in my setup) probably stored our text file using a UTF8 encoding, which VBA is not aware of.

Let’s take a look at the bytes in the file:

We see at lines 0 and 30 that our accented “é” characters are encoded as the two bytes C3 and A9, so this is a UTF8 file.

Then, at some point, we’ll have to convert a UTF8 representation of a string to a UTF16, VBA-friendly one.

Converting from UTF8 to UTF16 with the Win32 API

Unfortunately, VBA cannot help here, so let’s take a detour to our trustworthy Win32 API.

You’ll find all the code in the demo database of my reading_text_files GitHub repository.
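As a rough illustration of the UTF8 to UTF16 conversion through the Win32 API, here is a minimal VBA sketch built on MultiByteToWideChar. It is not the code from the repository: the names Utf8BytesToString and ReadUtf8File are hypothetical, and the declaration assumes VBA7 (Office 2010 or later), where PtrSafe and LongPtr are available.

```vb
Option Explicit

' Win32 API declaration. Assumes VBA7 (PtrSafe / LongPtr available).
Private Declare PtrSafe Function MultiByteToWideChar Lib "kernel32" ( _
    ByVal CodePage As Long, _
    ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As LongPtr, _
    ByVal cbMultiByte As Long, _
    ByVal lpWideCharStr As LongPtr, _
    ByVal cchWideChar As Long) As Long

Private Const CP_UTF8 As Long = 65001

' Hypothetical helper: converts UTF8 bytes to a VBA (UTF16) String.
' The array must be allocated and non-empty.
Public Function Utf8BytesToString(ByRef utf8() As Byte) As String
    Dim byteCount As Long
    Dim charCount As Long
    Dim result As String

    byteCount = UBound(utf8) - LBound(utf8) + 1

    ' First call: ask how many UTF16 characters the conversion needs.
    charCount = MultiByteToWideChar(CP_UTF8, 0, _
        VarPtr(utf8(LBound(utf8))), byteCount, 0, 0)
    If charCount = 0 Then Exit Function

    ' Second call: convert straight into a pre-sized string buffer.
    result = String$(charCount, vbNullChar)
    MultiByteToWideChar CP_UTF8, 0, _
        VarPtr(utf8(LBound(utf8))), byteCount, StrPtr(result), charCount

    Utf8BytesToString = result
End Function

' Hypothetical helper: reads a whole file as raw bytes, then converts it.
Public Function ReadUtf8File(ByVal filePath As String) As String
    Dim fileNo As Integer
    Dim bytes() As Byte
    Dim size As Long

    fileNo = FreeFile
    Open filePath For Binary Access Read As #fileNo
    size = LOF(fileNo)
    If size > 0 Then
        ReDim bytes(0 To size - 1)
        Get #fileNo, , bytes      ' reads exactly "size" bytes
    End If
    Close #fileNo

    ' An empty file yields an empty string.
    If size > 0 Then ReadUtf8File = Utf8BytesToString(bytes)
End Function
```

Calling Debug.Print ReadUtf8File("c:\temp\textfiles\notepad_text.txt") in the Immediate window should then show the accented characters as typed. If the file starts with a UTF8 byte order mark, this sketch does not strip it, so the resulting string begins with the U+FEFF character.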
Unicode variants like UTF8 and UTF16, and how they impact your Office VBA development, are not so straightforward. This post will guide you through the experience of reading a text file with VBA and explain some of the pitfalls you may encounter along the way when dealing with different text encodings and file formats. We’ll shed some light on essential Unicode concepts you’ve preferred to leave aside until now, because, let’s face it, who wants to spend hours reading Wikipedia or MSDN just to read a text file or understand the many rules and APIs for converting between encodings? No .NET or undecipherable C++ code complications here. Just immediately actionable, simple and humble VBA code with one function to rule them all, and a 10 to 15 minute read to understand it all.

Open Windows Notepad and copy/paste (or type) this text:

Fancy a café ? Or a piña colada ? – Oh, that’s so cliché!
You hide your true motive behind a friendly façade.

(Just my lame try to compose words with diacritics, in English.)

Save the file, let’s say in c:\temp\textfiles\notepad_text.txt.

Now we’ll try to read it and display it in Visual Basic, line by line, as usual:

Executing the “Test_ReadTextFileByLine” Sub (in the debug window) from this simple code snippet should do it…

The accented characters don’t display correctly.

Let’s state some facts before banging our heads on that:

- Unicode is a big character set which is meant to be able to represent the character glyphs of different languages. Two forms of Unicode will be of interest here: UTF8 and UTF16.
- UTF16 encodes a character with two bytes, at least for the characters we deal with here (a “wide” character, and by extension “wide” strings). (Note: UCS2 is history, assume UCS2 (or UCS-2) is UTF16.)
- The representation of a character in Unicode is also called a code point.
- UTF8: not all the characters in the Unicode character set really need two bytes of encoding.

The TextFieldParser object provides a way to easily and efficiently parse structured text files, such as logs. The TextFieldType property defines whether it is a delimited file or one with fixed-width fields of text.

To parse a comma delimited text file

1. Create a new TextFieldParser. The following code creates the TextFieldParser named MyReader and opens the file test.txt.

   Using MyReader As New Microsoft.VisualBasic.

2. Define the TextField type and delimiter. The following code defines the TextFieldType property as Delimited and the delimiter as ",".

3. If any lines are corrupt, report an error and continue parsing. The following code loops through the file, displaying each field in turn and reporting any fields that are formatted incorrectly.

4. Close the While and Using blocks with End While and End Using.

This example reads from the file test.txt.

Using MyReader As New Microsoft.VisualBasic.

The following conditions may cause an exception:

- A row cannot be parsed using the specified format (MalformedLineException). The exception message specifies the line causing the exception, while the ErrorLine property is assigned the text contained in the line.
- The specified file does not exist (FileNotFoundException).
- A partial-trust situation in which the user does not have sufficient permissions to access the file.
- The path is too long (PathTooLongException).
- The user does not have sufficient permissions to access the file (UnauthorizedAccessException).
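The steps for parsing a comma delimited file can be put together as the following VB.NET sketch. It assumes a console application, so fields are written with Console.WriteLine, and a relative path "test.txt"; both are choices made for this illustration rather than something the text prescribes. The module name ParseDemo is hypothetical.

```vb
Imports System
Imports Microsoft.VisualBasic.FileIO

Module ParseDemo
    Sub Main()
        ' Assumption: test.txt sits in the working directory; adjust the path as needed.
        Using MyReader As New TextFieldParser("test.txt")
            ' Treat the file as delimited text with "," as the delimiter.
            MyReader.TextFieldType = FieldType.Delimited
            MyReader.SetDelimiters(",")

            While Not MyReader.EndOfData
                Try
                    ' ReadFields parses the next line into its fields.
                    Dim currentRow As String() = MyReader.ReadFields()
                    For Each currentField As String In currentRow
                        Console.WriteLine(currentField)
                    Next
                Catch ex As MalformedLineException
                    ' Report the corrupt line, then carry on with the next one.
                    Console.WriteLine("Line " & ex.LineNumber & _
                        " is not valid and will be skipped: " & MyReader.ErrorLine)
                End Try
            End While
        End Using
    End Sub
End Module
```

Catching MalformedLineException inside the loop is what lets parsing continue after a bad row; the other exceptions listed in the text (for example FileNotFoundException) are raised when the file is opened and would need a Try block around the Using statement.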