Extensibility – A Reason For Using Streams in C++

Copyright © Mark Radford, December 2000

Preamble

Should C++ streams be preferred as a method of i/o over the C-style <cstdio> functions? The short answer is a resounding yes, but many people, in particular those coming from a C background, are slow to move to them. In my experience, it is common for people to pick up on the STL (and use it to the point where they couldn’t manage without it) and yet still not see the advantages of streams.

This article will attempt to show, by means of an ongoing example, the reasons for the resounding yes stated above. First, a simple example of a C-style output statement will be presented, followed by the equivalent using C++ streams. In passing, we will note that the streams library defines how the user can implement custom streams, and that the run-time polymorphism granted by the design of streams gives us flexibility not afforded by the <cstdio> functions. The streams approach will then be developed to show how the extensibility of streams can be used to produce code which is simpler and more self-documenting.

Example and Discussion

Recently on the accu-general mailing list, Hubert Matthews posted an example very similar to the following [HM]:

First, C-style code like this:

 

std::sprintf(buf, "%04d/%02d/%02d %02d:%02d\n",
            (current_tm.tm_year + 1900),
            (current_tm.tm_mon + 1),
            current_tm.tm_mday,
            current_tm.tm_hour,
            current_tm.tm_min);

 

Then, the C++ streams equivalent for comparison:

 

os      // os is an instance of std::ostream
    << setw(4) << setfill('0') << (time_value.tm_year + 1900) << '/'
    << setw(2) << setfill('0') << (time_value.tm_mon + 1)     << '/'
    << setw(2) << setfill('0') << time_value.tm_mday          << ' '
    << setw(2) << setfill('0') << time_value.tm_hour          << ':'
    << setw(2) << setfill('0') << time_value.tm_min;

 

Hubert asked in his posting: “The old stdio way seems much more efficient in terms of readability, typing, run-time speed and space.  Are type-safety and the avoidance of possible buffer overflows (which are extremely unlikely in this case) worth it?”.

The compile time type safety of the latter (C++ streams) version is certainly a big plus (indeed this is probably the most often quoted reason for preferring C++ streams over sticking to the C-style <cstdio> functions). Further, it should be noted that some of the most likely sources of error have been transferred from run time to compile time. It is true that memory allocation and deletion are typically an overhead with streams. Having said that, input and output typically are not fast operations, so any such saving in this area should be considered an optimisation – that is, only carried out when an unacceptable lack of performance has been measured.
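As a rough side-by-side sketch of the two versions above, each can be wrapped in a function returning a std::string. The names format_stdio and format_stream are made up for this illustration, the trailing newline is left out, and std::snprintf (available from C++11) stands in for sprintf so that the buffer size is at least checked.

```cpp
#include <cstdio>
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// C-style: the format string and the arguments have to agree, but the
// type system does not check this; a mismatch is at best a compiler
// warning and at worst a run-time failure.
std::string format_stdio(std::tm const& t)
{
    char buf[64];
    std::snprintf(buf, sizeof buf, "%04d/%02d/%02d %02d:%02d",
                  t.tm_year + 1900, t.tm_mon + 1, t.tm_mday,
                  t.tm_hour, t.tm_min);
    return buf;
}

// Streams: each operator<< is chosen from the type of its argument at
// compile time, and the destination string grows as needed.
std::string format_stream(std::tm const& t)
{
    std::ostringstream os;
    os << std::setfill('0')                      // fill persists
       << std::setw(4) << (t.tm_year + 1900) << '/'
       << std::setw(2) << (t.tm_mon + 1)     << '/'
       << std::setw(2) << t.tm_mday          << ' '
       << std::setw(2) << t.tm_hour          << ':'
       << std::setw(2) << t.tm_min;              // width is per field
    return os.str();
}
```

Both functions produce the same text for the same tm; the difference lies in when a mistake in the argument list would be detected.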

Run-Time Polymorphism

Moving on, consider how we could change the destination of the output. The C-style example above uses sprintf(), which writes into a buffer in memory. Had the output been intended for stdout we would have used printf(), which is equivalent to fprintf() with the output directed to stdout. It is therefore not a problem to write code whose output can be directed either to stdout or to some other file created by the program. Indeed, by using fprintf() it is even possible to write code in which the binding of the output destination (be it a file on disk or stdout) happens at run-time, because stdout is just a built-in FILE stream. However, it is often desirable for the possible output destinations to include a buffer in memory, as well as a disk file and stdout. Here the <cstdio> functions are more limiting; in fact, there is no obvious way to achieve it. The only <cstdio> function facilitating this is sprintf(), and using it necessitates changing the code, whereas streams are interchangeable at run-time. Certainly, it should be possible to have an instance of FILE which binds its destination to a memory buffer, but no such thing exists in the <cstdio> library. Further, it appears that there is no way for the user to implement it, as this would mean interacting with the internals of FILE, that is, poking around in the implementation details.

The C++ streams library includes string streams. For example, std::basic_ostringstream<> binds the destination to a standard library string. Further, the streams library defines how to implement user-defined streams. As output streams are all derived from the common base class std::basic_ostream<>, a function of the form

 

template <typename T> void f(std::basic_ostream<T>& os)
{
    // ... write to os ...
}

 

will work for any output stream including those implemented by the user.
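To make this concrete, here is a minimal sketch in which one function, written against std::ostream, is sent first to a string stream and then to a user-defined destination. All the names (write_report, counting_buf, report_to_string, report_length) are invented for the illustration, and the user-defined stream is deliberately trivial: its buffer throws the characters away and only counts them.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Written once against the base class; the caller picks the
// destination at run time.
void write_report(std::ostream& os, int value)
{
    os << "value=" << value;
}

// A user-defined destination: a stream buffer that discards its
// characters and merely counts them.
class counting_buf : public std::streambuf
{
public:
    counting_buf() : count_(0) {}
    std::size_t count() const { return count_; }

protected:
    // No put area is set up, so every character arrives here.
    virtual int_type overflow(int_type ch)
    {
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
            ++count_;
        return traits_type::not_eof(ch);
    }

private:
    std::size_t count_;
};

// Destination 1: a buffer in memory.
std::string report_to_string(int value)
{
    std::ostringstream os;
    write_report(os, value);
    return os.str();
}

// Destination 2: the user-defined stream buffer.
std::size_t report_length(int value)
{
    counting_buf buf;
    std::ostream os(&buf);
    write_report(os, value);
    return buf.count();
}
```

Note that write_report() is unchanged between the two uses, which is exactly what sprintf() and fprintf() cannot offer.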

User-Defined Extensions

Now back to the code shown above which writes an object of type tm to an output stream. What we would really like to write is code of this form:

 

std::cout << format_date_time(4, 2, 2, 2, 2, '0') << time_value;

 

If we can write the code like this, we will have achieved a great step forward in readability, not particularly because the code is so much shorter, but because it is self documenting – it says exactly what it does! This is the starting point of the development of our own extensions which will make it possible to write the code this way. The reason for doing this is to present an example of how such an extension can be developed; although this involves some work on the part of the developer, the result can be used over and over again. For the purposes of this article we will, to keep things simple, take some liberties. We will:

- Develop for streams of ordinary chars, so we can use things like std::ostream, ignoring the longer template forms.

- Use our own formatting, ignoring the locales in the C++ standard library.

- Hardwire delimiters, such as ‘/’ for delimiting days and months, and ‘:’ for delimiting hours and minutes.

A Simple User-Defined Output Operator

To start with, let us leave out the formatting, and content ourselves with being able to write

std::cout << time_value

This is easy. We just write our own output operator, placing its declaration in a header file:

// io_time.h

#ifndef IO_TIME_H
#define IO_TIME_H

#include <ctime>    // std::tm
#include <iosfwd>   // forward declaration of std::ostream

namespace io_time
{
    std::ostream& operator<<(
        std::ostream& os, std::tm const& time_value);
}

#endif // IO_TIME_H

 

and its definition in an implementation file (this is possible because we are sticking to plain char):

 

// io_time.cpp

#include "io_time.h"
#include <ostream>

namespace io_time
{
    std::ostream& operator<<(
        std::ostream& os, std::tm const& time_value)
    {
        os
            << (time_value.tm_year + 1900) << '/'
            << (time_value.tm_mon + 1) << '/'
            << time_value.tm_mday  << ' '
            << time_value.tm_hour << ':'
            << time_value.tm_min;

        return os;
    }
}

 

Now we can write the simplified code (at this point we’re still not concerning ourselves with formatting).
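A small self-contained sketch of the operator in use follows. The helper to_text() is made up for the illustration, and a string stream is used rather than std::cout so that the result can be inspected. One detail is worth pointing out: tm is not declared in namespace io_time, so argument-dependent lookup will not find the operator, and it has to be made visible at the point of use.

```cpp
#include <ctime>
#include <ostream>
#include <sstream>
#include <string>

namespace io_time
{
    // Unformatted output of a tm, as in the listing above.
    std::ostream& operator<<(std::ostream& os, std::tm const& time_value)
    {
        os << (time_value.tm_year + 1900) << '/'
           << (time_value.tm_mon + 1) << '/'
           << time_value.tm_mday << ' '
           << time_value.tm_hour << ':'
           << time_value.tm_min;
        return os;
    }
}

// Hypothetical helper: writes a tm to a string.
std::string to_text(std::tm const& time_value)
{
    // tm does not live in io_time, so the operator must be named here.
    using io_time::operator<<;

    std::ostringstream os;
    os << time_value;
    return os.str();
}
```

With no formatting applied, single-digit fields are written without padding.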

A User-Defined Manipulator – Header File

The next step is to provide the formatting mechanism – the manipulator format_date_time.

The standard library provides manipulators such as setfill() and setw(), which in the case of these examples alter (respectively) the fill character and the width of the output field. These are specified as functions returning objects which are inserted into the stream causing the stream to be manipulated, and we can follow the same pattern. We need a header file:

 

// io_time.h

#ifndef IO_TIME_H
#define IO_TIME_H

#include <ostream>
#include <iomanip>
#include <ctime>

namespace io_time
{

in which we define the formatting object class:

    class date_time_formatter
    {
    public:
        date_time_formatter(
                unsigned int in_year_width,
                unsigned int in_month_width,
                unsigned int in_day_width,
                unsigned int in_hour_width,
                unsigned int in_minute_width,
                char         in_fill_char);

        date_time_formatter(date_time_formatter const& original);

        void set_ostream_info(std::ostream& os) const;

    private:
        unsigned int const  year_width;
        unsigned int const  month_width;
        unsigned int const  day_width;
        unsigned int const  hour_width;
        unsigned int const  minute_width;
        char         const  fill_char;

        date_time_formatter& operator=(date_time_formatter const&);
    };

 

The contents contain no surprises: this class just holds some formatting information. Note, however, the set_ostream_info() member function, which takes as its parameter an object of type std::ostream: when called, it will apply the format information it contains to this stream. Also, assignment is disabled. As the mode of usage does not require assignment, this is (probably) best, to avoid any possibility of complications.

Next we need the actual manipulator function:

 

    // inline, because the definition lives in a header file
    inline date_time_formatter format_date_time(
                unsigned int year_w,
                unsigned int month_w,
                unsigned int day_w,
                unsigned int hour_w,
                unsigned int min_w,
                char         fill_c)
    {
        return date_time_formatter(
                year_w, month_w, day_w, hour_w, min_w, fill_c);
    }

 

an output operator for objects of this class:

 

    std::ostream& operator<<(
        std::ostream& os, date_time_formatter const& formatter);

 

as before, an output operator which outputs an object of type tm:

 

    std::ostream& operator<<(
        std::ostream& os, std::tm const& time_value);

 

and this completes the header file:

 

} // end namespace io_time

#endif // IO_TIME_H

A User-Defined Manipulator – Implementation File

Now to construct the implementation file.

// io_time.cpp

#include "io_time.h"
#include <cctype>

 

In doing this we face the problem of where to store the formatting information from date_time_formatter. In the standard library the base class std::ios_base contains storage for the state set by the likes of setfill() and setw(), but we have no power to extend this.

To accommodate this need, the streams library provides a mechanism for the developer to provide custom information storage: here we will be using the ios_base member functions iword() and (the static) xalloc(). The iword() member function has return type long& and takes an int parameter. This parameter, which should be obtained from xalloc(), identifies a particular user-specified (and zero-initialised) flag (for further explanation the reader is referred to a standard library text such as the one by Nicolai Josuttis [NJ]).
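The mechanism can be tried in isolation with a very small sketch. The names indent_index, set_indent and get_indent are made up for the purpose and are not part of the io_time code; the point is only that each stream carries its own slot, and that a slot which has never been written reads as zero.

```cpp
#include <ios>
#include <sstream>

// One slot index, allocated once for the whole program.
int const indent_index = std::ios_base::xalloc();

// Store a value in the stream's own slot.
void set_indent(std::ios_base& s, long width)
{
    s.iword(indent_index) = width;
}

// Read it back; a slot that was never set yields 0.
long get_indent(std::ios_base& s)
{
    return s.iword(indent_index);
}
```

Because the functions take std::ios_base&, they work for any stream, input or output.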

Next, we need to reserve some flags, and provide some helper functions which make their use less long-winded. The best place for all this is in the anonymous namespace. First, the constants naming the flags:

 

namespace
{
    int const formatted_flag_index = std::ios_base::xalloc();
    int const year_width_index     = std::ios_base::xalloc();
    int const month_width_index    = std::ios_base::xalloc();
    int const day_width_index      = std::ios_base::xalloc();
    int const hour_width_index     = std::ios_base::xalloc();
    int const minute_width_index   = std::ios_base::xalloc();
    int const fill_char_index      = std::ios_base::xalloc();

and then the helpers:

    bool is_formatted(std::ostream& os)
    { return (0 != os.iword(formatted_flag_index)); }

    // iword() yields a long&, whereas setw() takes an int: hence the casts
    int year_width(std::ostream& os)
    { return static_cast<int>(os.iword(year_width_index)); }

    int month_width(std::ostream& os)
    { return static_cast<int>(os.iword(month_width_index)); }

    int day_width(std::ostream& os)
    { return static_cast<int>(os.iword(day_width_index)); }

    int hour_width(std::ostream& os)
    { return static_cast<int>(os.iword(hour_width_index)); }

    int minute_width(std::ostream& os)
    { return static_cast<int>(os.iword(minute_width_index)); }

    void set_fill_char(std::ostream& os)
    {
        // isprint() requires a value representable as unsigned char
        unsigned char const fill_char =
            static_cast<unsigned char>(os.iword(fill_char_index));
        if (std::isprint(fill_char)) os.fill(static_cast<char>(fill_char));
    }
} // end anonymous namespace

 

Note that setting the fill character is delegated to the helper function set_fill_char(): this is to accommodate the possibility that the user may have set the fill character to a non-printing character – if this happens, the fill character is ignored.

Moving on:

namespace io_time

{

    void date_time_formatter::set_ostream_info(std::ostream& os) const
    {
        os.iword(year_width_index)     = year_width;
        os.iword(month_width_index)    = month_width;
        os.iword(day_width_index)      = day_width;
        os.iword(hour_width_index)     = hour_width;
        os.iword(minute_width_index)   = minute_width;
        os.iword(fill_char_index)      = fill_char;
        os.iword(formatted_flag_index) = true;
    }

    std::ostream& operator<<(
        std::ostream& os, date_time_formatter const& formatter)
    {
        formatter.set_ostream_info(os);
        return os;
    }

 

Now it can be seen how the manipulation of the stream works and why date_time_formatter has the member function set_ostream_info(): the actual work is delegated to set_ostream_info(). After all, it’s the formatter object which has the information needed, so let it pass the information on to where it’s needed. All that the output operator must do now is call this function on the formatter object, passing the stream object as the argument to the call.

There remains only to define the output operator for tm type objects, and the constructors for date_time_formatter (which will not be listed here for the sake of simplicity – all they have to do is initialise the object’s state).

 

    std::ostream& operator<<(
        std::ostream& os, std::tm const& time_value)
    {
        using std::setw;

        if (is_formatted(os))
        {
            // the fill character persists until it is changed, so it is
            // set once; the width has to be set again for every field
            set_fill_char(os);

            os  << setw(year_width(os))
                << (time_value.tm_year + 1900) << '/';

            os  << setw(month_width(os))
                << (time_value.tm_mon + 1) << '/';

            os  << setw(day_width(os))
                << time_value.tm_mday  << ' ';

            os  << setw(hour_width(os))
                << time_value.tm_hour << ':';

            os  << setw(minute_width(os))
                << time_value.tm_min;
        }
        else
        {
            os
                << (time_value.tm_year + 1900) << '/'
                << (time_value.tm_mon + 1) << '/'
                << time_value.tm_mday  << ' '
                << time_value.tm_hour << ':'
                << time_value.tm_min;
        }

        // the format applies to one tm object only
        os.iword(formatted_flag_index) = false;

        return os;
    }

} // end namespace io_time

Finally

That’s it – we’ve arrived! We can now write:

 

std::cout << format_date_time(4, 2, 2, 2, 2, '0') << time_value;

 

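as we wanted to, provided the names in namespace io_time have been made visible at the point of use (for example with a using directive). This is necessary because format_date_time() takes only built-in types and tm is not declared in io_time, so argument-dependent lookup would find neither the manipulator nor the output operator for tm. Further, we can use the io_time extension whenever it suits us, so we can keep on writing expressions like this.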
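For readers who want to try the whole thing in one file, the following is a condensed sketch of the same design rather than a copy of the listings: the widths are held in arrays to save space, everything is defined in place, and the helper stamp() and the enumerator names are made up for the illustration.

```cpp
#include <cctype>
#include <ctime>
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

namespace io_time
{
    enum field { f_year, f_month, f_day, f_hour, f_minute, field_count };

    // iword() slots: one per field width, then fill character and flag.
    int const width_index[field_count] = {
        std::ios_base::xalloc(), std::ios_base::xalloc(),
        std::ios_base::xalloc(), std::ios_base::xalloc(),
        std::ios_base::xalloc() };
    int const fill_index = std::ios_base::xalloc();
    int const flag_index = std::ios_base::xalloc();

    class date_time_formatter
    {
    public:
        date_time_formatter(int y, int mo, int d, int h, int mi, char f)
            : fill(f)
        {
            width[f_year] = y;  width[f_month] = mo;  width[f_day] = d;
            width[f_hour] = h;  width[f_minute] = mi;
        }

        // Copy the format into the stream's user-defined storage.
        void set_ostream_info(std::ostream& os) const
        {
            for (int i = 0; i != field_count; ++i)
                os.iword(width_index[i]) = width[i];
            os.iword(fill_index) = static_cast<unsigned char>(fill);
            os.iword(flag_index) = 1;
        }

    private:
        int  width[field_count];
        char fill;
    };

    inline date_time_formatter format_date_time(
        int y, int mo, int d, int h, int mi, char f)
    { return date_time_formatter(y, mo, d, h, mi, f); }

    inline std::ostream& operator<<(
        std::ostream& os, date_time_formatter const& formatter)
    {
        formatter.set_ostream_info(os);
        return os;
    }

    inline std::ostream& operator<<(std::ostream& os, std::tm const& t)
    {
        int const value[field_count] = {
            t.tm_year + 1900, t.tm_mon + 1, t.tm_mday, t.tm_hour, t.tm_min };
        char const separator[field_count] = { '/', '/', ' ', ':', '\0' };

        bool const formatted = os.iword(flag_index) != 0;
        int const fill = static_cast<int>(os.iword(fill_index));
        if (formatted && std::isprint(fill))
            os.fill(static_cast<char>(fill));

        for (int i = 0; i != field_count; ++i)
        {
            if (formatted)   // the width is reset after every field
                os << std::setw(static_cast<int>(os.iword(width_index[i])));
            os << value[i];
            if (separator[i] != '\0') os << separator[i];
        }
        os.iword(flag_index) = 0;   // the format applies to one tm only
        return os;
    }
}

// Hypothetical helper: formats a tm into a string.
std::string stamp(std::tm const& t)
{
    using namespace io_time;   // makes the manipulator and operators visible
    std::ostringstream os;
    os << format_date_time(4, 2, 2, 2, 2, '0') << t;
    return os.str();
}
```

A string stream is used instead of std::cout only so that the result can be checked; the expression inside stamp() is the one the article set out to write.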

There’s much more development which could be done. A couple of examples are …

- The typing of the parameters to format_date_time() could be strengthened

- Modification could be made to the formatting to use standard locale information

… but there’s only so much which can usefully be fitted into one article.
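As a pointer for the second of these, later versions of the standard library (C++11 onwards) provide the manipulator std::put_time, which formats a tm through the time_put facet of the stream's locale. The sketch below is not part of io_time, and the function name stamp_with_locale is made up; it merely shows what the locale-aware route might look like.

```cpp
#include <ctime>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Formats a tm via the time_put facet of the given locale.
std::string stamp_with_locale(std::tm const& t, std::locale const& loc)
{
    std::ostringstream os;
    os.imbue(loc);
    os << std::put_time(&t, "%Y/%m/%d %H:%M");
    return os.str();
}
```

The conversion specifiers used here are numeric and so give the same result in most locales; specifiers such as %x or %c are where the locale makes a visible difference.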

The C++ streams library may not be perfect, but it is still very useful. Hopefully this article has demonstrated where at least some of that usefulness lies.

References and Acknowledgements

[HM]. Hubert Matthews posting on the accu-general email list on the thread “String streams v. sprintf” on 3/11/2000. Thanks are due to Hubert for his help during an email discussion prior to the writing of this article (I was unaware of the mechanism for adding user-defined formatting information to streams until Hubert came across the explanation in [NJ] and pointed me to it).

[NJ]. Nicolai Josuttis, “The C++ Standard Library: A Tutorial and Reference”, published by Addison-Wesley.