• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 271
  • Last Modified:

2nd argument to copy algorithm in STL c++

I have this simple basic code which inserts array of strings in to vector of strings. This was written for purpose of test program.

#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
#include <iterator>

using namespace std;

int main()
{
    vector<string> strVec;

    string arr[4] = {"Foo","Bar","Baz","Bee"};
 
    copy(&arr[0], &arr[4], back_inserter(strVec));
    copy(strVec.begin(),strVec.end(),ostream_iterator<string>(cout, "\n"));
    return 0;
}

Open in new window


In the first use of "copy" above, why STL copy algorithm takes 2nd argument as an address of 5th element of array "arr" when it doesn't exists and how it actually treats it legally internally? Why such an "illegal" (it appears that way) arrangement is allowed?? Without that it copies one element less in to the vector. I could not find any explanation anywhere so would appreciate. Why cant it take address of last element i.e. &arr[3]??
0
James Bond
Asked:
James Bond
  • 5
  • 5
3 Solutions
 
rapeepakCommented:
C++ isn't that handy to double check if it is out of range and using the last element instead.
0
 
James BondSoftware ProfessionalAuthor Commented:
Pardon me.  What do you mean by that??
0
 
jkrCommented:
It makes sense when you look how 'std::copy()' is declared and implemented. The declaration is

template<class InputIterator, class OutputIterator>
   OutputIterator copy(
      InputIterator _First, 
      InputIterator _Last, 
      OutputIterator _DestBeg
   );

Open in new window


'copy()' uses iterators (http://en.cppreference.com/w/cpp/iterator) to advance through the container it operates on, and iterators behave syntactically like pointers. The container could be a list, a vector or any other STL container. Your array is just a simple and special case. Now, let's look at the implementation (in this case MS' STL implementation, snippet taken from 'xutility'):

		// TEMPLATE FUNCTION copy
template<class _InIt, class _OutIt, class _InOutItCat>
inline
	_OutIt __CLRCALL_OR_CDECL _Copy_opt(_InIt _First, _InIt _Last, _OutIt _Dest,
		_InOutItCat, _Nonscalar_ptr_iterator_tag, _Range_checked_iterator_tag)
	{	// copy [_First, _Last) to [_Dest, ...), arbitrary iterators
	_DEBUG_RANGE(_First, _Last);
	for (; _First != _Last; ++_Dest, ++_First)
		*_Dest = *_First;
	return (_Dest);
	}

Open in new window


So, the scoop is the line 'for (; _First != _Last; ++_Dest, ++_First)'. In the case of an STL container, the stop condition '_First != _Last' fires when 'container.end()' is reached (you'd usually call 'copy()' like 'copy(cont.begin(), cont.end(), ...);').

In your case, the address passed to 'copy()' does not represent a valid array element, but still a valid memory address that fulfils the above stop condition. That's why it works that way.
0
What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

 
James BondSoftware ProfessionalAuthor Commented:
Thanks jkr. I understood why copy works that way, However I have few more (related) questions -
1. If arr[4] works as _Last then arr[5], arr[6],... and so on right?? Since they are also valid memory addresses. I now understand there is no reliable way for checking array bounds as pointed by someone above.
2. How does iterator.end() knows its the end of the container especially if one is storing address of something in the container and in cases where resizes or automatic reallocation happens and we dont know how much legal memory is allocated to our container? How does it know that it is now pointing to one element beyond the last legal element when no fixed size is given? When there is no reliable way of checking bounds.
3. If we can use &arr[4] and so on as valid memory addresses then why cant we dereference iterator returned by end() as valid address?? (I understand we never do that purposely).  Internally these iterators must have been implemented as pointers, i think so.
0
 
jkrCommented:
Careful, you can't read from these addresses, that would mold likely cause an exception.

>> How does iterator.end() knows its the end of the container

It is implemented that way. The return value is a pseudo-iterator that always indicates the end of a sequence. Consequently, trying to 'read' from 'end()' also causes an exception.

>> in cases where resizes or automatic reallocation happens and we dont know how much legal memory is allocated to our
>> container?

In such cases, you need to assume all iterators you had from that container as invalid (with the notable exception of 'list', though). That applies to both additions and removals from elements to or from any container.

>>why cant we dereference iterator returned by end() as valid address?

For the same reasdon, see above ;o)
0
 
James BondSoftware ProfessionalAuthor Commented:
Sorry. Still need to make one point clear. You said that these addresses can't be read and may throw exceptions (which they do after I tested. arr[5] onwards does not work!!). Then how come this line works in STL implementations -

_First != _Last

Isn't _Last points to &arr[4] which is also not readable? Or is it just one more memory address is allowed and no more?? What exactly is _Last? So we are allowed to compare with any memory address even though we are not allowed to "access" or "read" or "write" to/from that memory address??
0
 
jkrCommented:
No reason to be sorry ;o)

Think of pointers - while you cannot read from them, you can still compare them. That's what this very expression does. '&arr[4]' is just a pointer value, or in other words equal to 'BYTE* p = (BYTE*) a + 4 * sizeof(a[0]);'. Again, you cannot read from that memory area without causing a runtime error, but comparing the pointer values is always possible.
0
 
James BondSoftware ProfessionalAuthor Commented:
Seems like, STL maintains some kind of a counter which, iterator uses to figure out what is valid and invalid memory address for a given range. And then end() always points to that same 'invalid' memory location. So in a for loop like -
for(i = v.begin(); i != v.end(); ++i)

for every loop iteration, end() gets called but returns same memory address every time no matter how many inserts or deletes we do in each iteration. So that makes end() return value invalid because even though it is called again and again, it always return same pointer. It doesn't check counter value and updates itself every time. I hope this is correct.
0
 
jkrCommented:
In case of 'end()' it is not that it points to an invalid memory location - it acts like one (keep in mind, 'end()' returns a pseudo iterator). While iterators can be used like pointers, they are no pointers in the C sense, they are an abstract C++ construct, see the link above (http://en.cppreference.com/w/cpp/iterator).

And yes, you could indeed use that counter analogy, yet it does not represent the 'true' picture that is hidden in the implementing code.
0
 
James BondSoftware ProfessionalAuthor Commented:
Where can we get details of true picture. The link gives high level overview of what iterator end is etc.
0
 
jkrCommented:
You can get the details in the STL's source code. Since it is template code, it comes with every C++ compiler. Yet that's quite a bit to chew on, I have to admit...
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

  • 5
  • 5
Tackle projects and never again get stuck behind a technical roadblock.
Join Now