The Magic File Encoding lib helps to load and transform simple and closed-scope character set text files.

Jan5366x/MagicFileEncoding

Magic File Encoding

The Magic File Encoding Library helps you load and transform text files that use simple, closed-scope character sets. It targets txt, XML, JSON, EDIFACT, and similar text formats, and handles encoding detection and conversion for them.

NuGet Package

MagicFileEncoding at nuget.org

.NET Version

  • .NET 8: Magic File Encoding 3.0.0 and newer
  • .NET 6: Magic File Encoding 2.0.1

Transformation Considerations

When converting between encodings, characters can be lost if the target encoding has a smaller character set than the source encoding. Characters or language-specific symbols that the target encoding cannot represent may be substituted or not preserved at all.

Thorough testing and validation are recommended to ensure the desired outcome during the transformation process.
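The following is a minimal sketch of such a loss, using only the .NET base class library rather than this library's API. It encodes a string that contains a euro sign into ISO-8859-1, which has no euro sign, and then shows a strict encoding instance that throws instead of substituting. The sample string and variable names are illustrative assumptions.

```csharp
using System;
using System.Text;

// Sample text: "ü" and "ß" exist in ISO-8859-1, the euro sign does not.
string source = "Grüße, 10 €";

// The default ISO-8859-1 encoder substitutes characters it cannot represent,
// so the round trip no longer matches the source string.
byte[] latin1Bytes = Encoding.Latin1.GetBytes(source);
string roundTripped = Encoding.Latin1.GetString(latin1Bytes);
Console.WriteLine($"Round trip preserved the text: {roundTripped == source}");

// A strict instance throws instead of silently substituting characters.
Encoding strictLatin1 = Encoding.GetEncoding(
    "ISO-8859-1",
    EncoderFallback.ExceptionFallback,
    DecoderFallback.ExceptionFallback);

bool lossless;
try
{
    strictLatin1.GetBytes(source);
    lossless = true;
}
catch (EncoderFallbackException)
{
    lossless = false;
}
Console.WriteLine($"Lossless in ISO-8859-1: {lossless}");
```

A strict check like this can serve as a validation step before writing transformed text to its final destination.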

Fallback Encoding

The Magic File Encoding Library includes a fallback encoding system. By default, it uses ISO-8859-1 (Latin-1) as the fallback encoding. This default was chosen to suit the encoding requirements of the German cultural space.

You can override the fallback encoding through an optional method argument, for example to fall back to UTF-8 instead.
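To illustrate the general idea of a fallback encoding, here is a minimal sketch built only on the .NET base class library. It is not the library's implementation, and `DecodeWithFallback` is a hypothetical helper: it tries strict UTF-8 first and uses the caller-supplied fallback only when the bytes are not valid UTF-8.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of the library): decode as strict UTF-8 and
// switch to the fallback encoding only if the bytes are not valid UTF-8.
static string DecodeWithFallback(byte[] bytes, Encoding fallback)
{
    var strictUtf8 = new UTF8Encoding(
        encoderShouldEmitUTF8Identifier: false,
        throwOnInvalidBytes: true);
    try
    {
        return strictUtf8.GetString(bytes);
    }
    catch (DecoderFallbackException)
    {
        return fallback.GetString(bytes);
    }
}

// "Grüße" stored as ISO-8859-1: the byte 0xFC is not valid UTF-8.
byte[] latin1Bytes = { 0x47, 0x72, 0xFC, 0xDF, 0x65 };
// The same text stored as UTF-8.
byte[] utf8Bytes = Encoding.UTF8.GetBytes("Grüße");

string fromLatin1 = DecodeWithFallback(latin1Bytes, Encoding.Latin1);
string fromUtf8 = DecodeWithFallback(utf8Bytes, Encoding.Latin1);
Console.WriteLine(fromLatin1 == fromUtf8); // both decode to the same text
```

ISO-8859-1 maps every possible byte value to a character, so a Latin-1 fallback never fails to decode. That also means it cannot signal that the guess was wrong.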

Usage

The following examples demonstrate how to use the library:

File System

Example 1: Getting the acceptable encoding of a file

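// Assumes `using System.Text;` and the library's namespace are in scope.
// .NET does not expand "~", so use a relative or absolute path.
string filePath = "example.txt";
Encoding fallbackEncoding = Encoding.UTF8;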

Encoding acceptableEncoding = FileEncoding.GetAcceptableEncoding(filePath, fallbackEncoding);

Console.WriteLine("Acceptable encoding: " + acceptableEncoding.EncodingName);
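One common ingredient of encoding detection is checking for a byte order mark (BOM) at the start of the data. The sketch below shows that single step using only the .NET base class library. `DetectBom` is a hypothetical helper and is not claimed to match how `GetAcceptableEncoding` works internally.

```csharp
#nullable enable
using System;
using System.Text;

// Hypothetical helper: returns the encoding indicated by a byte order mark,
// or null when the data does not start with a known BOM.
static Encoding? DetectBom(ReadOnlySpan<byte> d)
{
    // UTF-32 LE (FF FE 00 00) must be tested before UTF-16 LE (FF FE),
    // because the UTF-16 LE BOM is a prefix of the UTF-32 LE BOM.
    if (d.Length >= 4 && d[0] == 0xFF && d[1] == 0xFE && d[2] == 0x00 && d[3] == 0x00)
        return new UTF32Encoding(bigEndian: false, byteOrderMark: true);
    if (d.Length >= 4 && d[0] == 0x00 && d[1] == 0x00 && d[2] == 0xFE && d[3] == 0xFF)
        return new UTF32Encoding(bigEndian: true, byteOrderMark: true);
    if (d.Length >= 3 && d[0] == 0xEF && d[1] == 0xBB && d[2] == 0xBF)
        return Encoding.UTF8;
    if (d.Length >= 2 && d[0] == 0xFF && d[1] == 0xFE)
        return Encoding.Unicode;          // UTF-16 LE
    if (d.Length >= 2 && d[0] == 0xFE && d[1] == 0xFF)
        return Encoding.BigEndianUnicode; // UTF-16 BE
    return null;
}

byte[] sample = { 0xEF, 0xBB, 0xBF, (byte)'h', (byte)'i' };
Encoding? detected = DetectBom(sample);
Console.WriteLine(detected?.WebName ?? "no BOM found");
```

Files without a BOM need content-based heuristics or a fallback encoding, which is where a fallback argument becomes relevant.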

Example 2: Reading all text from a file using automatic encoding detection

string filePath = "example.txt";

string text = FileEncoding.ReadAllText(filePath);
Console.WriteLine("Text: " + text);

Example 3: Reading all text from a file and transforming it into a target encoding

string filePath = "example.txt";
Encoding targetEncoding = Encoding.UTF8;
Encoding fallbackEncoding = Encoding.GetEncoding("ISO-8859-1");

string text = FileEncoding.ReadAllText(filePath, targetEncoding, fallbackEncoding);
Console.WriteLine("Text: " + text);

Example 4: Writing text to a file in a specific encoding

string filePath = "output.txt";
Encoding targetEncoding = Encoding.Unicode;
string text = "\u2387 Hello, world!";

FileEncoding.WriteAllText(filePath, targetEncoding, text);
Console.WriteLine("Text written to file.");

Example 5: Providing writer access to a file in a specific encoding

string filePath = "output.txt";
Encoding targetEncoding = Encoding.UTF8;

FileEncoding.Write(filePath, targetEncoding, writer =>
{
    writer.WriteLine("Line 1");
    writer.WriteLine("Line 2");
    writer.WriteLine("Line 3");
});

Console.WriteLine("Text written to file.");
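For readers curious how a callback-based writer API of this shape can be built, here is a minimal sketch on top of `StreamWriter` from the .NET base class library. `WriteWith` and the temporary file name are hypothetical, and this is not the library's implementation.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: opens a StreamWriter with the requested encoding,
// hands it to the callback, and disposes it when the callback returns.
static void WriteWith(string path, Encoding encoding, Action<StreamWriter> write)
{
    using var writer = new StreamWriter(path, append: false, encoding);
    write(writer);
}

string sketchPath = Path.Combine(Path.GetTempPath(), "magic-file-encoding-sketch.txt");

WriteWith(sketchPath, Encoding.UTF8, writer =>
{
    writer.WriteLine("Line 1");
    writer.WriteLine("Line 2");
});

string[] writtenLines = File.ReadAllLines(sketchPath);
Console.WriteLine($"Wrote {writtenLines.Length} lines.");
```

Passing the writer to a callback keeps the open and dispose logic in one place, so callers cannot forget to flush or close the file.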

Byte Array

Example 6: Getting the acceptable encoding of a byte array

string filePath = "example.txt";
byte[] bytes = File.ReadAllBytes(filePath);
Encoding fallbackEncoding = Encoding.UTF8;

Encoding acceptableEncoding = FileEncoding.GetAcceptableEncoding(bytes, fallbackEncoding);
Console.WriteLine("Acceptable encoding: " + acceptableEncoding.EncodingName);

Example 7: Reading all text from a byte array using automatic encoding detection

string filePath = "example.txt";
byte[] bytes = File.ReadAllBytes(filePath);

string text = FileEncoding.ReadAllBytes(bytes);
Console.WriteLine("Text: " + text);

Example 8: Reading all text from a byte array and transforming it into a target encoding

string filePath = "example.txt";
byte[] bytes = File.ReadAllBytes(filePath);
Encoding targetEncoding = Encoding.UTF8;
Encoding fallbackEncoding = Encoding.GetEncoding("ISO-8859-1");

string text = FileEncoding.ReadAllBytes(bytes, targetEncoding, fallbackEncoding);
Console.WriteLine("Text: " + text);

Versioning & Breaking Changes

Major.Minor.Patch-Suffix

  • Major: Breaking changes
  • Minor: New features, but backwards compatible
  • Patch: Backwards compatible bug fixes only
  • -Suffix (optional): a hyphen followed by a string denoting a pre-release version

See: https://docs.microsoft.com/en-us/nuget/concepts/package-versioning

Credits

This work is heavily based on the following Stack Overflow and web articles:
determine-a-strings-encoding-in-c-sharp
check-for-invalid-utf8
how-to-detect-utf-8-in-plain-c
strip-byte-order-mark-from-string-in-c-sharp
what-is-the-most-common-encoding-of-each-language
utf-bom4

Contributions and Support

Contributions to the Magic File Encoding Library are welcome! If you encounter any issues, have suggestions for improvements, or would like to contribute to its development, please visit our GitHub repository.
