scan_utf8(3) FreeBSD Library Functions Manual scan_utf8(3)

scan_utf8 - decode an unsigned integer from UTF-8 encoding

#include <libowfat/scan.h>

size_t scan_utf8(const char *src,size_t len,uint32_t *dest);

size_t scan_utf8_sem(const char *src,size_t len,uint32_t *dest);

scan_utf8 decodes an unsigned integer in UTF-8 encoding from a memory area holding binary data. It writes the decoded value to dest and returns the number of bytes it read from src.

scan_utf8 never reads more than len bytes from src. If the encoded sequence would extend past len bytes, or the memory area does not begin with a valid sequence, scan_utf8 returns 0 and does not touch dest.
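The return-value contract described above can be illustrated with a small stand-in decoder. This is only a sketch under stated assumptions, not libowfat's implementation: the names sketch_scan_utf8 and sketch_count_values are hypothetical, and treating overlong (non-shortest) encodings as invalid is an assumption of the sketch rather than something this page states.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in that follows the documented contract:
 * returns the number of bytes consumed, or 0 (leaving *dest alone)
 * when the sequence is truncated or invalid.
 * NOT libowfat's code. */
size_t sketch_scan_utf8(const char *src, size_t len, uint32_t *dest) {
    const unsigned char *s = (const unsigned char *)src;
    /* Smallest value that genuinely needs n bytes (index = n). */
    static const uint32_t min_value[7] =
        {0, 0, 0x80, 0x800, 0x10000, 0x200000, 0x4000000};
    uint32_t value;
    size_t n, i;

    if (len == 0) return 0;
    if (s[0] < 0x80)                { n = 1; value = s[0]; }
    else if ((s[0] & 0xe0) == 0xc0) { n = 2; value = s[0] & 0x1f; }
    else if ((s[0] & 0xf0) == 0xe0) { n = 3; value = s[0] & 0x0f; }
    else if ((s[0] & 0xf8) == 0xf0) { n = 4; value = s[0] & 0x07; }
    else if ((s[0] & 0xfc) == 0xf8) { n = 5; value = s[0] & 0x03; }
    else if ((s[0] & 0xfe) == 0xfc) { n = 6; value = s[0] & 0x01; }
    else return 0;            /* continuation byte, 0xfe or 0xff as lead */

    if (n > len) return 0;    /* sequence would run past the buffer */
    for (i = 1; i < n; ++i) {
        if ((s[i] & 0xc0) != 0x80) return 0;   /* not a continuation byte */
        value = (value << 6) | (uint32_t)(s[i] & 0x3f);
    }
    /* Assumption of this sketch: overlong encodings are rejected. */
    if (n > 1 && value < min_value[n]) return 0;

    *dest = value;            /* written only on success */
    return n;
}

/* Walks a buffer one sequence at a time. Returns how many values were
 * decoded, or (size_t)-1 at the first invalid or truncated sequence. */
size_t sketch_count_values(const char *buf, size_t len) {
    size_t count = 0;
    while (len > 0) {
        uint32_t value;
        size_t used = sketch_scan_utf8(buf, len, &value);
        if (used == 0) return (size_t)-1;
        buf += used;
        len -= used;
        ++count;
    }
    return count;
}
```

A caller using the real scan_utf8 would follow the same loop shape as sketch_count_values: advance by the returned byte count, and treat a return of 0 as an error because dest holds no new value in that case.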

The length of the longest valid UTF-8 sequence is 6.

scan_utf8 rejects syntactically invalid encodings, but not semantically invalid values such as surrogates. scan_utf8_sem additionally rejects surrogates.

fmt_utf8 and scan_utf8 implement the UTF-8 encoding scheme, but are meant to be able to store arbitrary integers, not just Unicode code points. Values above 0x10ffff are not valid UTF-8. If you are using this function to parse UTF-8, you need to reject them yourself (see RFC 3629).
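The check that the caller is asked to perform can be sketched as a small predicate applied to the decoded value. The name is_unicode_scalar is hypothetical and not part of libowfat. The surrogate test is included because plain scan_utf8 does not reject surrogates; it should be redundant after scan_utf8_sem.

```c
#include <stdint.h>

/* Hypothetical helper, not part of libowfat: returns 1 if v is
 * acceptable as a code point under RFC 3629, else 0. */
int is_unicode_scalar(uint32_t v) {
    if (v > 0x10ffff) return 0;                /* beyond the Unicode range */
    if (v >= 0xd800 && v <= 0xdfff) return 0;  /* UTF-16 surrogate range */
    return 1;
}
```

A parser would call this on the value written to dest after a non-zero return, and treat a 0 result the same way as a decoding failure.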

fmt_utf8(3), scan_utf8_sem(3)
