Deriving from std::streambuf

Using the C++ standard library is nice, elegant and effective. Fewer dependencies on third-party libraries mean less mess and less risk in your project. For input/output, the C++ standard library offers the “iostreams” framework. Unfortunately, its design is not intuitive when it comes to extending the standard functionality.

Question: how do we provide a standard input stream (an object derived from std::istream, so that code written against std::istream can use it) for reading data from a byte array? And how do we do it efficiently?

OK, we’ve got std::istringstream (or std::stringstream), which lets us read from the bytes contained in a std::string object. But what if our byte array lives outside of a std::string? One solution is to copy the data from the array into a std::string; a std::string can hold any binary string, not only text. That copy is exactly what we want to avoid, as it’s inefficient. Example:

boost::shared_array<char> data(/* ... */);   // points to len bytes
const std::string s(data.get(), len);        // copy!
std::istringstream is(s);                    // now we can read...
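
To make the cost of this approach concrete, here is a minimal sketch (the helper name read_via_string_copy is hypothetical, not from any library). It shows that std::string keeps arbitrary bytes, including embedded NULs, and that the data ends up duplicated: once in the std::string and once more inside the std::istringstream, whose constructor copies the string it is given.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: copies len bytes into a std::string and reads them
// back through std::istringstream, one character at a time.
inline std::string read_via_string_copy(const char *data, std::size_t len) {
    const std::string copy(data, len);   // first duplicate of the data
    std::istringstream is(copy);         // the stream stores its own copy too
    std::string out;
    char c;
    while (is.get(c)) {                  // get() does not skip whitespace
        out.push_back(c);
    }
    return out;
}

// Self-check: binary data with an embedded NUL survives the round trip.
inline void check_string_copy() {
    const char raw[] = { 'a', '\0', '\xff', 'b' };
    assert(read_via_string_copy(raw, sizeof raw) == std::string(raw, sizeof raw));
}
```

The result is correct, but the bytes now live in memory several times over, which is what the rest of this article avoids.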

But this solution duplicates the data in memory (the code above copies it), which can hurt; that’s a situation I’ve faced. If you want to avoid copying (to cut down your program’s memory usage) and still take advantage of std::istream functionality, the solution is to write a dedicated class derived from std::streambuf. Note that deriving your own class from std::istream just to change where the data comes from is bad design: std::istream and std::ostream are designed to be fixed, not extended. They are prepared to work with any stream buffer, and the derived standard stream classes only supply the proper stream buffer. So a class derived from std::istream should only initialize its base class with some std::streambuf derived object.

All C++ input streams are derived from std::istream (for simplicity let’s stick to the template instantiations for the ‘char’ type). But std::istream is not an abstract class. We can create plain std::istream objects, as long as we pass a pointer to a std::streambuf object to the std::istream constructor.
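
As a small illustration of that last point, the sketch below attaches a plain std::istream to a stream buffer supplied by the caller. The helper name first_token is an assumption made for this example; std::stringbuf is used only because it is a ready-made std::streambuf from the standard library.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper: reads the first whitespace-delimited token through a
// plain std::istream attached to whatever stream buffer the caller supplies.
inline std::string first_token(std::streambuf &buf) {
    std::istream is(&buf);   // std::istream itself, no derived stream class
    std::string token;
    is >> token;
    return token;
}

// Self-check with a standard stream buffer.
inline void check_first_token() {
    std::stringbuf sb("hello world");
    assert(first_token(sb) == "hello");
}
```

The stream does not own the buffer, so the buffer object has to outlive the std::istream that uses it.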

Thus we need a custom class derived from std::streambuf; later I’ll show that we don’t even have to write one ourselves (Boost…). And that’s where the problem arises. Deriving properly from std::streambuf is neither easy nor intuitive, because its interface is complicated. It provides many public methods, but some of them rely on protected virtual methods that must or may be overridden in a derived class.

The only valuable example of deriving from std::streambuf I found was the article “A beginner’s guide to writing a custom stream buffer (std::streambuf)”. It’s excellent, but its example 2 (“Example 2: reading from an array of bytes in memory”) contains a very subtle bug which took me several hours to detect. The author forgot to use the traits_type::to_int_type method (that is, std::char_traits<char>::to_int_type). As a result, some input data caused invalid program behaviour in the form of unexpected exceptions being thrown.
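
Why does the missing to_int_type call matter? A minimal sketch, assuming 8-bit chars: where char is a signed type, the byte 0xFF promotes to the int value -1, which is typically what eof() returns, while to_int_type converts through unsigned char and yields 255. The names naive_convert and safe_convert are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

typedef std::char_traits<char> traits;

// Naive conversion: plain char -> int promotion. Where char is signed,
// the byte 0xFF becomes -1 and can be mistaken for end-of-file.
inline traits::int_type naive_convert(char c) {
    return c;
}

// Conversion through the traits class: the byte is mapped via unsigned char,
// so no valid character value compares equal to eof().
inline traits::int_type safe_convert(char c) {
    return traits::to_int_type(c);
}

// Self-check for the problematic byte.
inline void check_byte_0xff() {
    const char byte = '\xff';
    assert(safe_convert(byte) != traits::eof());
    // naive_convert(byte) is deliberately not asserted on: its value depends
    // on whether plain char is signed on the target platform.
}
```

Whether the naive version actually fails depends on the platform, which is part of what makes the bug so hard to spot.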

The correct form of a read-only std::streambuf derived class for reading from a byte array is (I put only the most important things here):

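#include <streambuf>

class char_array_buffer : public std::streambuf {
public:
    char_array_buffer(const char *data, unsigned int len);

private:
    // std::streambuf virtual functions overridden for read-only access
    int_type underflow();               // peek at the current byte
    int_type uflow();                   // read the current byte and advance
    int_type pbackfail(int_type ch);    // step back by one byte
    std::streamsize showmanyc();        // bytes still available

    const char * const begin_;          // first byte of the array
    const char * const end_;            // one past the last byte
    const char * current_;              // next byte to be read
};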

char_array_buffer::char_array_buffer(const char *data, unsigned int len)
: begin_(data), end_(data + len), current_(data) { }

char_array_buffer::int_type char_array_buffer::underflow() {
    if (current_ == end_) {
        return traits_type::eof();
    }
    return traits_type::to_int_type(*current_);     // HERE!
}

char_array_buffer::int_type char_array_buffer::uflow() {
    if (current_ == end_) {
        return traits_type::eof();
    }
    return traits_type::to_int_type(*current_++);   // HERE!
}

char_array_buffer::int_type char_array_buffer::pbackfail(int_type ch) {
    // compare as int_type: a raw char may be negative and then never match ch
    if (current_ == begin_ || (ch != traits_type::eof()
            && ch != traits_type::to_int_type(current_[-1]))) {
        return traits_type::eof();
    }
    return traits_type::to_int_type(*--current_);   // HERE!
}

std::streamsize char_array_buffer::showmanyc() {
    return end_ - current_;
}

Now it’s ready to be used with our array:

boost::shared_array<char> data(/* ... */);  // points to len bytes
char_array_buffer buf(data.get(), len);     // no copy here!!!
std::istream is(&buf);                      // now we can read

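And voilà! Without these calls to the to_int_type method (marked with the “// HERE!” comment) the class does not work correctly on some input data: it behaves as if the data stream ended too early, because a data value is treated as EOF. This typically happens for the byte 0xFF on platforms where char is signed: it converts to -1, which usually equals traits_type::eof().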
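
To check this behaviour end to end, here is a self-contained sketch: a compact restatement of the class above (member functions defined inline, std::size_t used for the length) plus a hypothetical helper, read_all, that drains a stream byte by byte. In this version the put-back comparison also goes through to_int_type. The test data deliberately contains the bytes 0xFF and 0x00.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>

// Compact restatement of the read-only buffer from this article.
class char_array_buffer : public std::streambuf {
public:
    char_array_buffer(const char *data, std::size_t len)
        : begin_(data), end_(data + len), current_(data) {}

private:
    int_type underflow() {      // peek at the current byte
        return current_ == end_ ? traits_type::eof()
                                : traits_type::to_int_type(*current_);
    }
    int_type uflow() {          // read the current byte and advance
        return current_ == end_ ? traits_type::eof()
                                : traits_type::to_int_type(*current_++);
    }
    int_type pbackfail(int_type ch) {   // step back by one byte
        if (current_ == begin_ ||
            (!traits_type::eq_int_type(ch, traits_type::eof()) &&
             !traits_type::eq_int_type(
                 ch, traits_type::to_int_type(current_[-1])))) {
            return traits_type::eof();
        }
        return traits_type::to_int_type(*--current_);
    }
    std::streamsize showmanyc() { return end_ - current_; }

    const char *const begin_;
    const char *const end_;
    const char *current_;
};

// Hypothetical helper: drains a stream one character at a time.
inline std::string read_all(std::istream &is) {
    std::string out;
    char c;
    while (is.get(c)) {
        out.push_back(c);
    }
    return out;
}

// Self-check: every byte, including 0xFF and 0x00, must come back.
inline void check_round_trip() {
    const char raw[] = { 'a', '\xff', '\0', 'b' };
    char_array_buffer buf(raw, sizeof raw);
    std::istream is(&buf);
    assert(read_all(is) == std::string(raw, sizeof raw));
    assert(is.eof());
}
```

The array itself is never copied; the buffer only keeps three pointers into it, so the array has to stay alive for as long as the stream is in use.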

Next time I’ll describe how to avoid all this by simply using the Boost library (but this article is still valuable if you don’t want to depend on Boost).

About krzysztoftomaszewski

I have an M.Sc. in software engineering. I graduated in 2005 from the Institute of Computer Science, Warsaw University of Technology, Faculty of Electronics and Information Technology. I have been working on computer software design and engineering continuously since 2004.
This entry was posted in C++, C++ stdlib.

8 Responses to Deriving from std::streambuf

  1. Thank you very much for the solution !!! 🙂

  2. Mikhail Zhigun says:

    There’s a bug in showmanyc(): it should return -1 instead of zero when the end is reached (i.e. when (m_end - m_current) <= 0), or std::istream::eof() does not return true when the end is reached. Apart from that, many thanks for the solution.

  3. Torkel Bjørnson-Langen says:

    I don’t think you have to define uflow(). The base class version of the function calls underflow() and increments gptr().

  4. rharish says:

    Hey,

    Thanks for this solution. I’m just wondering though, did you ever get down to writing how boost library could be used to avoid these ? I’d find it quite useful for sure. I’d also appreciate if you could maybe point me to something along these lines, in case you haven’t written about it in your blog.

    Thanks.
