C plus plus:Modern C plus plus:Vectors

From GPWiki
Jump to: navigation, search

Modern C++ : Going Beyond "C with Classes"

Introduction

std::vector is perhaps the most widely known and used template ( except maybe for std::string ) in the STL-inspired part of the Standard C++ Library, and for good reason: It's basically an easy-to-use, safe, and fast dynamic array.

Easy To Use

std::vector includes most of the member functions you'd expect, and with intuitive names.

#include <iostream>
#include <vector>
#include <stdexcept>
 
int main() {
    // declare a new vector of ints
    std::vector<int> v;
 
    // std::vector keeps track of its size for you
    // initially there's nothing in it
    std::cout << "v contains " << v.size() << " elements" << std::endl;
 
    // this line does what you'd expect
    v.resize( 4 );
 
    // we resized to 4, so now size is 4
    std::cout << "v contains " << v.size() << " elements" << std::endl;
 
    // and you can specify what you want to fill the new elements with, if you want
    v.resize( 8, 42 );
    // only the four new elements are 42 -- resize doesn't change existing elements
 
    // thanks to an overloaded operator[], you can use it just like an array
    for ( std::vector<int>::size_type i = 0; i < v.size(); ++i ) {
        std::cout << i << "th element is " << v[i] << '\n';
    }
    std::cout << std::flush;
 
    // for speed, operator[] doesn't check bounds
    // for that, use the at member function which can throw
    try {
        v.at( 85693 ) = 68;
    } catch ( std::out_of_range &e ) {
        std::cerr << "Index out of range: " << e.what() << std::endl;
    }
 
    // std::vectors are very good at adding to the end,
    // so there's a special function that guarantees
    // it'll be done in amortized constant time
    v.push_back( 127 );
 
    // inserting is obviously not as fast, but one line is sure
    // more convenient that moving and reallocating everything yourself
    v.insert( v.begin()+1, 2 ); // insert a two before v[1]
 
    // you can erase single elements
    v.erase( v.begin()+2 ); // erase v[2]
    // or whole ranges
    v.erase( v.begin()+3, v.end()-2 ); // erase the middle,
                                       // leaving only the first 3 and
                                       // the last 2
 
    for ( std::vector<int>::size_type i = 0; i < v.size(); ++i ) {
        std::cout << i << "th element is " << v[i] << '\n';
    }
    std::cout << std::flush;
}

You end up with the following output:

v contains 0 elements
v contains 4 elements
0th element is 0
1th element is 0
2th element is 0
3th element is 0
4th element is 42
5th element is 42
6th element is 42
7th element is 42
Index out of range: vector::_M_range_check
0th element is 0
1th element is 2
2th element is 0
3th element is 42
4th element is 127

For a full list of what std::vector can do, check out Roguewave's std::vector documentation.

Safe & Fast

Much of the time, people new to the Standard C++ Library don't want to use it because they think it's slow and end up using their own solution instead. ( A fairly common form of NIH Syndrome. ) There are a whole bunch of counterarguments for this.

  • std::vector works and is complete. A custom solution needs to be debugged.
  • std::vector is written by people that know the compiler very well and often includes optimization tricks that of which most people would never have thought.
  • std::vector is standard, so many people are already familiar with it. It's also consistant with the rest of the Standard C++ Library.
  • std::vector is safe. There are well-defined exception-safety guarantees for all of its member functions. If it holds elements that act like they're supposed to, which mostly means obvious things like no exceptions from destructors and copies are equivalent, then there is never a case where an exception can put a std::vector into an unusable state.

To illustrate these, I wrote up a fairly simple example. Both of these programs do basically the same thing: add numbers to a dynamic array, one at a time.

The first uses a fairly basic append_to_array that you could reasonably expect to find in a custom solution. It works by allocating a new, larger array; copying the elements over; and copying the new element to the end.

#include <new>
#include <iostream>
 
void append_to_array(long *&array, long &array_size, long what) {
        long *newarray = new long[array_size+1];
        for ( long i = 0; i < array_size; ++i ) newarray[i] = array[i];
        newarray[array_size] = what;
        delete[] array;
        array = newarray;
        ++array_size; 
}
 
int main() {
 
    long *v = 0;
    long v_size = 0;
 
    for ( long i = 0; i < 0x10000; ++i ) {
        append_to_array( v, v_size, i );
    }
 
    std::cout << "Array size: " << v_size << std::endl;
}

This second version uses std::vector's push_back member function.

#include <vector>
#include <iostream>
 
int main() {
 
    std::vector<long> v;
 
    for ( long i = 0; i < 0x1000000; ++i ) {
        v.push_back(i);   
    }
 
    std::cout << "Vector size: " << v.size() << std::endl;
 
}

The second version is easier to read and the author got it right on the first try. For the first one, he initially forgot to pass array_size by reference, forgot to delete the old array, and forgot to initialise v and v_size to 0. Also, append_to_array is only fine for non-throwing types -- it's untemplated it because it'll leak memory if a copy assignment fails.

The most effective point, however, is the speed one:

$ time ./dynarray_new
Array size: 65536

real    0m25.149s
user    0m16.129s
sys     0m7.784s
$ time ./dynarray_vector
Vector size: 16777216

real    0m0.421s
user    0m0.288s
sys     0m0.128s

The std::vector version builds its array 256 times larger, and yet is roughly 60 times faster.

( If you're wondering how it manages this, look in the next section. )

More Nice Things

One fact of life in C++ is that, unfortunately, you can't always use C++ libraries. Most libraries have a C interface, which might make you think that you can't use std::vector -- it's a templated class in a namespace and C has neither templates, classes, nor namespaces. Luckily, the elements in a std::vector are guaranteed to be stored contiguously in memory. In other words, given std::vector<T> v; and std::vector<T>::size_type i, then it's guaranteed that for all i < v.size(), &v[i] == &v[0]+i.

If you're confused by the math-talk, the important things about that is that you can treat the address of the first element of your array like a pointer to the first element of a normal array:

#include <iostream>
#include <vector>
#include <cstring>
 
int main() {
    std::vector<char> v(256, '\0');
    std::strncpy( &v[0], "Hello World!", v.size()-1 );
    std::cout << &v[0] << std::endl;
}

This trick is extremely useful for storing the pixels in an image, for example, since you can then pass it to something like glTexImage2D.

Of course, please prefer std::string to std::vector<char>, this is just a nice example since it gives readable output. This trick is, however, still useful because unlike std::string's c_str() member function, you are allowed to write to the std::vector through the pointer.

In Closing

If you remember nothing else from this series, remember std::vector. You can use it in almost any program and it'll make your code shorter, easier to understand, and quite possibly faster.

What's Next

Having a class that can clean up after itself automatically turns out to be an idiom with much wider applicability than just memory. RAII can be used to handle all resources in a safe, consistent manner.