utf8rewind
utf8rewind is a C library designed to extend default string handling functions in order to add support for UTF-8 encoded text. Besides providing functions to deal with UTF-8 encoded text, it also provides functions for converting to and from UTF-16 encoded text, the default on Windows.
For a full summary of the interface, please refer to the library interface.
For detailed examples showing how to use the library, please refer to the examples page.
UTF-8 encoded Unicode accounts for over 60 percent of the web. And with good reason! Because UTF-8 is completely backwards-compatible with ASCII, developers only need to change code dealing with codepoints. UTF-8 can encode the full range of Unicode codepoints in a maximum of four bytes per codepoint. However, because most text tends to be the Latin alphabet mixed with special characters, the common case is strings not much longer than pure ASCII.
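To make the byte counts concrete, here is a minimal sketch of how a single codepoint maps to UTF-8 bytes. The helpers `utf8_encoded_length` and `utf8_encode` are hypothetical names made up for this illustration; they are not part of the utf8rewind interface, which should be preferred in real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of utf8rewind: returns the number of bytes
   needed to encode `codepoint` as UTF-8, or 0 if it cannot be encoded. */
size_t utf8_encoded_length(uint32_t codepoint)
{
    if (codepoint <= 0x7F) return 1;    /* ASCII range, byte is unchanged */
    if (codepoint <= 0x7FF) return 2;
    if (codepoint >= 0xD800 && codepoint <= 0xDFFF) return 0; /* surrogates */
    if (codepoint <= 0xFFFF) return 3;
    if (codepoint <= 0x10FFFF) return 4; /* upper limit of Unicode */
    return 0;
}

/* Hypothetical helper, not part of utf8rewind: writes the UTF-8 encoding of
   `codepoint` to `out` and returns the number of bytes written (0 on error). */
size_t utf8_encode(uint32_t codepoint, unsigned char out[4])
{
    size_t length = utf8_encoded_length(codepoint);
    assert(out != NULL);

    switch (length) {
    case 1:
        out[0] = (unsigned char)codepoint;
        break;
    case 2:
        out[0] = (unsigned char)(0xC0 | (codepoint >> 6));
        out[1] = (unsigned char)(0x80 | (codepoint & 0x3F));
        break;
    case 3:
        out[0] = (unsigned char)(0xE0 | (codepoint >> 12));
        out[1] = (unsigned char)(0x80 | ((codepoint >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (codepoint & 0x3F));
        break;
    case 4:
        out[0] = (unsigned char)(0xF0 | (codepoint >> 18));
        out[1] = (unsigned char)(0x80 | ((codepoint >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((codepoint >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (codepoint & 0x3F));
        break;
    default:
        break; /* not encodable, nothing written */
    }
    return length;
}
```

With this sketch, `'A'` (U+0041) takes one byte, identical to its ASCII value, while the euro sign (U+20AC) takes three bytes: `E2 82 AC`.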
UTF-16 encoding solves the same problems as UTF-8, but in a different way. UTF-16 is not backwards-compatible with ASCII: code that treats a UTF-16 string as ASCII will encounter invalid codepoints. As a result, all code dealing with strings must be changed in order to handle these new strings. This can be seen in the changes made in the C strings API:
Description | ASCII | UTF-16 |
---|---|---|
Get the length of a string | strlen | wcslen |
Copy a string to another | strcpy | wcscpy |
Append to a string | strcat | wcscat |
Convert to lowercase | tolower | towlower |
Converting a project to use UTF-16 after the fact is a serious endeavour that touches all code dealing with strings. On the other hand, changing existing code to use UTF-8 only deals with codepoint processing.
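As an illustration of what "codepoint processing" means here, the sketch below counts codepoints in a UTF-8 string. `strlen` keeps working unchanged because it counts bytes up to the terminating zero; only the codepoint count needs new code. The helper `count_codepoints` is a hypothetical name for this example, not a utf8rewind function, and it assumes the input is well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of utf8rewind: counts codepoints in a
   well-formed, zero-terminated UTF-8 string. Continuation bytes have the
   bit pattern 10xxxxxx, so every other byte starts a new codepoint. */
size_t count_codepoints(const char* text)
{
    size_t count = 0;
    assert(text != NULL);

    for (; *text != '\0'; ++text) {
        if (((unsigned char)*text & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the string `"caf\xC3\xA9"` ("café"), `strlen` reports 5 bytes while `count_codepoints` reports 4 codepoints. For pure ASCII text the two values are equal, which is why existing byte-oriented code keeps working.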
This project is licensed under the MIT license, a full copy of which should have been provided with the project.
All supported platforms use GYP to generate a solution. This generated solution can be used to compile the project and its dependencies.
You will need to have Visual Studio 2010 or above installed.
Open a command window at the project's root.
Execute the following to generate a solution:
tools\gyp\gyp --depth --format=msvs utf8rewind.gyp
Open the solution in Visual Studio and you can build the library and tests.
Open a command window at the project's root.
First, make sure you have all dependencies installed using your favorite package manager.
sudo apt-get install gyp gcc g++
Next, execute the following command to generate a makefile:
gyp --depth=./ --format=make utf8rewind.gyp
Now you can build the project:
make
For a release build, specify the build type:
make BUILDTYPE=Release
Open a command window at the project's root.
Execute the following to generate a solution:
tools/gyp/gyp --depth --format=xcode utf8rewind.gyp
Open the solution in Xcode and you can build the library and tests.
Copy 'include/utf8rewind/utf8rewind.h' and 'source/utf8rewind.c' directly into your existing solution. Make sure you specify that the source file should be compiled as C code (/TC in Visual Studio). Include the header from your source and start using it.
After generating a solution, build and run the "tests-rewind" project. Verify that all tests pass on your system before continuing.
As a user, you can help the project in a number of ways, in order of difficulty:
Use it - Designers of a public interface often have very different ideas about usability than those actually using it. By using the library, you help the project spread, and your use case can be taken into consideration when we design the API.
Spread the word - If you find utf8rewind useful, recommend it to your friends and coworkers.
Complain - No library is perfect and utf8rewind is no exception. If you find a fault but lack the means (time, resources, etc.) to fix it, sending complaints to the proper channels can help the project out a lot.
Write a failing test - If a feature is not working as intended, you can prove it by writing a failing test. By sending the test to us, we can make the adjustments necessary for it to pass.
Write a patch - A patch is a code change that helps tests to pass. A patch must always include a set of tests that fail without the patch. All patches will be reviewed and possibly cleaned up before being accepted.
For inquiries, complaints and patches, please contact {quinten}{lansu} {at} {gmail}.{com}. Remove the braces to get a valid e-mail address.