std::istream_iterator<> with copy_n() and friends

The snippet below reads three integers from std::cin; it writes two into numbers and discards the third:

std::vector<int> numbers(2);
copy_n(std::istream_iterator<int>(std::cin), 2, numbers.begin());

I'd expect the code to read exactly two integers from std::cin, but it turns out that reading three is correct, standard-conforming behaviour. Is this an oversight in the standard? What is the rationale for this behaviour?


From 24.5.1/1 in the C++03 standard:

After it is constructed, and every time ++ is used, the iterator reads and stores a value of T.

So in the code above, the stream iterator has already read one integer by the time copy_n is called. From that point onward, every read the iterator performs inside the algorithm is a read-ahead, and dereferencing yields the value cached by the previous read.
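To make the read-ahead visible, here is a minimal sketch. The helper name next_after_construction is hypothetical, and the result assumes a library that extracts eagerly in the constructor (libstdc++ does, as far as I know); an implementation that defers the read until the first dereference would return the first value instead.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: builds an istream_iterator<int> on the given text,
// never dereferences or increments it, and then returns the next integer
// extracted directly from the stream.
int next_after_construction(const std::string& text)
{
    std::istringstream in(text);
    std::istream_iterator<int> it(in); // an eager implementation extracts here
    (void)it;                          // the iterator itself is never used

    int next = 0;
    in >> next;                        // whatever the iterator left behind
    return next;
}
```

With the input "1 2 3", an eager implementation returns 2: the 1 was consumed by the constructor even though nothing ever looked at it.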

The latest draft of the next standard, n3225, doesn't seem to change anything here (24.6.1/1).

On a related note, 24.5.1.1/2 of the current standard, in reference to the istream_iterator(istream_type& s) constructor, reads:

Effects: Initializes in_stream with s. value may be initialized during construction or the first time it is referenced.

Note the wording "value may be initialized ..." as opposed to "shall be initialized". This seems to contradict 24.5.1/1, but maybe that deserves a question of its own.

Answers


Unfortunately, the implementer of copy_n failed to account for the read-ahead in the copy loop. The Visual C++ implementation works as you expect on both stringstream and std::cin. I also checked the case from the original example, where the istream_iterator is constructed inline.

Here is the key piece of code from the STL implementation.

template<class _InIt,
    class _Diff,
    class _OutIt> inline
    _OutIt _Copy_n(_InIt _First, _Diff _Count,
        _OutIt _Dest, input_iterator_tag)
    {   // copy [_First, _First + _Count) to [_Dest, ...), arbitrary input
    *_Dest = *_First;   // 0 < _Count has been guaranteed
    while (0 < --_Count)
        *++_Dest = *++_First;
    return (++_Dest);
    }
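The point of the loop above is that the input iterator is incremented only _Count - 1 times. A portable sketch of the same shape follows; copy_n_no_overread and next_after_copying_two are hypothetical names, and this is not taken from any library's source.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical name. Copies count elements while incrementing the input
// iterator only count - 1 times, so no extra element is pulled from a stream.
template <class InputIt, class Size, class OutputIt>
OutputIt copy_n_no_overread(InputIt first, Size count, OutputIt dest)
{
    if (count <= 0)
        return dest;

    *dest = *first;        // first element: no increment needed yet
    ++dest;
    while (--count > 0) {
        ++first;           // advance only when another element is required
        *dest = *first;
        ++dest;
    }
    return dest;
}

// Copies two ints from the text into out, then returns the next int that is
// still waiting in the stream.
int next_after_copying_two(const std::string& text, std::vector<int>& out)
{
    std::istringstream in(text);
    out.resize(2);
    copy_n_no_overread(std::istream_iterator<int>(in), 2, out.begin());

    int next = 0;
    in >> next;
    return next;
}
```

For the input "1 2 3 4" this stores 1 and 2 and leaves 3 as the next value in the stream, which matches the behaviour the question expected.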

Here is the test code:

#include <algorithm>  // std::copy_n
#include <iostream>
#include <istream>
#include <sstream>
#include <vector>
#include <iterator>

int main()
{
    std::stringstream ss;
    ss << 1 << ' ' << 2 << ' ' << 3 << ' ' << 4 << std::endl;
    ss.seekg(0);
    std::vector<int> numbers(2);
    std::istream_iterator<int> ii(ss);
    std::cout << *ii << std::endl;  // shows that read ahead happened.
    std::copy_n(ii, 2, numbers.begin());
    int i = 0;
    ss >> i;
    std::cout << numbers[0] << ' ' << numbers[1] << ' ' << i << std::endl;

    std::istream_iterator<int> ii2(std::cin);
    std::cout << *ii2 << std::endl;  // shows that read ahead happened.
    std::copy_n(ii2, 2, numbers.begin());
    std::cin >> i;
    std::cout << numbers[0] << ' ' << numbers[1] << ' ' << i << std::endl;

    return 0;
}


/* Output
1
1 2 3
4 5 6
4
4 5 6
*/

Today I encountered a very similar problem. Here is the example:

#include <iostream>
#include <sstream>
#include <algorithm>
#include <iterator>
#include <string>

struct A
{
    float a[3];
    unsigned short int b[6];
};

void ParseLine( const std::string & line, A & a )
{
    std::stringstream ss( line );

    std::copy_n( std::istream_iterator<float>( ss ), 3, a.a );
    std::copy_n( std::istream_iterator<unsigned short int>( ss ), 6, a.b );
}

void PrintValues( const A & a )
{
    for ( int i = 0; i < 3; ++i )
    {
        std::cout << a.a[i] << std::endl;
    }
    for ( int i = 0; i < 6; ++i )
    {
        std::cout << a.b[i] << std::endl;
    }
}

int main()
{
    A a;

    const std::string line( "1.1 2.2 3.3  8 7 6 3 2 1" );

    ParseLine( line, a );

    PrintValues( a );
}

Compiling the above example with g++ 4.6.3 produces one result:

1.1 2.2 3.3 7 6 3 2 1 1

and compiling with g++ 4.7.2 produces another:

1.1 2.2 3.3 8 7 6 3 2 1

The C++11 standard says this about copy_n:

template<class InputIterator, class Size, class OutputIterator>
OutputIterator copy_n(InputIterator first, Size n, OutputIterator result);

Effects: For each non-negative integer i < n, performs *(result + i) = *(first + i).
Returns: result + n.
Complexity: Exactly n assignments.

As you can see, it does not specify how many times the input iterator is incremented, which makes the number of values consumed from the stream implementation-dependent.

In my opinion, your example should not read the third value, so the standard's failure to specify this behavior is a small flaw.
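Until that is pinned down, one way to sidestep the ambiguity is to extract with operator>> directly instead of building two istream_iterators on the same stream. This is only a sketch: read_exactly is a hypothetical helper, and ParseLine here returns bool, unlike the version in the example above.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

struct A
{
    float a[3];
    unsigned short int b[6];
};

// Hypothetical helper: extracts exactly N values with operator>>, so the
// stream position afterwards does not depend on any iterator read-ahead.
template <class T, std::size_t N>
bool read_exactly(std::istream& in, T (&out)[N])
{
    for (std::size_t i = 0; i < N; ++i)
        if (!(in >> out[i]))
            return false;   // stop at the first failed extraction
    return true;
}

// Returns false if the line holds fewer than nine parsable values.
bool ParseLine(const std::string& line, A& a)
{
    std::istringstream ss(line);
    return read_exactly(ss, a.a) && read_exactly(ss, a.b);
}
```

With the line "1.1 2.2 3.3  8 7 6 3 2 1" this fills a.b with 8 7 6 3 2 1 regardless of which copy_n the compiler ships, because no iterator ever holds a cached value between the two reads.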


I don't know the exact rationale, but as the iterator also has to support operator*(), it will have to cache the values it reads. Allowing the iterator to cache the first value at construction simplifies this. It also helps in detecting end-of-stream when the stream is initially empty.
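The end-of-stream point can be checked with a small sketch. The helper name is_end_immediately is hypothetical, and the result for an empty stream assumes an implementation that reads in the constructor.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: reports whether an istream_iterator<int> built on the
// given text compares equal to the end-of-stream iterator straight away,
// before any dereference or increment.
bool is_end_immediately(const std::string& text)
{
    std::istringstream in(text);
    std::istream_iterator<int> it(in);  // eager read fails on empty input
    std::istream_iterator<int> end;     // default-constructed: end-of-stream
    return it == end;
}
```

Because the first extraction already happened, a loop of the form for (; it != end; ++it) never runs its body on an empty stream and never dereferences an iterator that has no value.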

Perhaps your use case is one the committee didn't consider?

