less out of bounds read access - TFPA 002/2014

Posted by Hanno Böck on Sunday, November 30. 2014

An out-of-bounds read in the UTF-8 decoding of the tool less can be triggered with a malformed file. The read happens in the function is_utf8_well_formed (charset.c, line 534) due to a truncated multibyte character in the sample file. It affects less version 470, the latest upstream release. The bug does not crash less; it only becomes visible when less is run under valgrind or compiled with Address Sanitizer. The security impact is likely minor, as it is only an invalid read access.
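
To illustrate why a truncated multibyte character leads to such a read, here is a minimal sketch in C. It is not the less source: the names utf8_expected_len and utf8_overread are hypothetical, and the helper only handles sequences of up to four bytes. It computes how many bytes beyond the buffer a decoder would touch if it trusted the lead byte without checking the remaining length.

```c
#include <assert.h>
#include <stddef.h>

/* Sequence length implied by a UTF-8 lead byte. Returns 1 for ASCII and
 * for bytes that cannot start a multibyte sequence.
 * Hypothetical helper for illustration, not less's utf_len(). */
size_t utf8_expected_len(unsigned char lead)
{
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 1;
}

/* Number of bytes past the end of buf that a decoder would read if it
 * trusted the lead byte at buf[pos] without checking the buffer length. */
size_t utf8_overread(const unsigned char *buf, size_t len, size_t pos)
{
    assert(buf != NULL && pos < len);
    size_t need = utf8_expected_len(buf[pos]);
    size_t avail = len - pos;
    return need > avail ? need - avail : 0;
}
```

For a buffer ending in the two bytes 0xE2 0x82, the lead byte announces a three-byte character, so an unchecked decoder would read one byte past the end. A read like this usually lands in mapped memory, which is why nothing crashes and only a tool such as valgrind or Address Sanitizer reports it.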

This issue has been found with the help of Address Sanitizer.

The upstream developers were informed about this issue on 4 November 2014; no fix is available yet. The less webpage has no bug tracker, no open mailing list, and no other way to publicly report and document bugs.

Conclusion

Even tools that only do very minor file parsing can expose bugs due to charset encoding, especially in multibyte characters. Please note that the bigger security threat in less comes from the use of lesspipe.

It is unsettling that the upstream project of an important tool like less is completely unresponsive to bugs and has no public way to discuss them.

less out of bounds read sample with gif header
simpler sample with no header, only works when LESSOPEN is not set
OSVDB 115007 : less GIF File Handling Out-of-bounds Read Issue
Discussion of lesspipe security issues on oss-security
CVE-2014-9488

Update 2014-12-15: A new version of less, 471, has been released. This issue is not fixed in it.

Update 2015-03-10: Version 475 of less contains a fix for this issue. I never received any reply from the developers.

less doesn't have public release announcements or a repository, so it's hard to track its changes. The file version.c contains an entry mentioning this issue (without any credit):
v475 3/2/15 Fix possible buffer overrun with invalid UTF-8

The fix is in the file charset.c. Here is a patch.
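
The general shape of such a fix can be sketched as follows. This is a simplified illustration under my own assumptions, not the code from less: seq_len, well_formed_bounded and count_invalid_bytes are hypothetical names, and the check ignores overlong encodings and surrogates. The point is that the well-formedness check receives the number of remaining bytes and compares it with the sequence length before it reads any continuation byte.

```c
#include <assert.h>
#include <stddef.h>

/* Length implied by the lead byte, or 0 if the byte cannot start a sequence. */
size_t seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

/* Returns 1 only if a complete sequence starts at s and fits in slen bytes. */
int well_formed_bounded(const unsigned char *s, size_t slen)
{
    assert(s != NULL);
    if (slen == 0)
        return 0;
    size_t len = seq_len(s[0]);
    if (len == 0 || len > slen)   /* length check before any s[i] read */
        return 0;
    for (size_t i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80) /* continuation bytes are 10xxxxxx */
            return 0;
    return 1;
}

/* Count bytes that are not part of a well-formed sequence, always passing
 * the number of bytes that remain in the buffer to the check. */
size_t count_invalid_bytes(const unsigned char *data, size_t len)
{
    size_t bad = 0;
    while (len > 0) {
        if (well_formed_bounded(data, len)) {
            size_t clen = seq_len(data[0]);
            data += clen;
            len -= clen;
        } else {
            bad++;
            data++;
            len--;
        }
    }
    return bad;
}
```

With the input 'o', 'k', 0xE2, 0x82 the loop counts the two trailing bytes as invalid and never reads beyond the fourth byte.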

Update 2015-06-03: It has been pointed out in the comments that the patch I provided was wrong (it was an unrelated fix for another issue). I have now replaced it with the correct patch. For clarification: the latest "stable" version according to the less webpage, 458, is not affected.


Comments


stsp on Wednesday, June 3. 2015:

The bug was in the utf_bin_count() function in charset.c, not line.c.

The patch you've linked shows another caller of is_utf8_well_formed that was updated to pass a new parameter but is otherwise unrelated to the problem.

--- less-470/charset.c Sun Oct 5 23:03:31 2014
+++ less-475/charset.c Sat May 2 22:08:38 2015
@@ -506,8 +506,9 @@ utf_len(ch)
  * Does the parameter point to the lead byte of a well-formed UTF-8 character?
  */
     public int
-is_utf8_well_formed(s)
+is_utf8_well_formed(s, slen)
     unsigned char *s;
+    int slen;
 {
     int i;
     int len;
@@ -516,6 +517,8 @@ is_utf8_well_formed(s)
         return (0);

     len = utf_len((char) s[0]);
+    if (len > slen)
+        return (0);
     if (len == 1)
         return (1);
     if (len == 2)
@@ -547,7 +550,7 @@ utf_bin_count(data, len)
     int bin_count = 0;
     while (len > 0)
     {
-        if (is_utf8_well_formed(data))
+        if (is_utf8_well_formed(data, len))
         {
             int clen = utf_len(*data);
             data += clen;

Hanno on Wednesday, June 3. 2015:

Thanks for noting that, you are right. I've replaced the patch with the correct one now and added an update notice.
