1 of 19

Introduction to Programming in C++

string_view. Resources.

Georgii Zhulikov

Apr 17, 2023

Lecture #27

2 of 19

std::string and C-strings

C-strings

  1. Inconvenient (no methods)
  2. “Get length” is an O(n) operation
  3. Termination with \0 means you can’t store that character in the string
  4. Fixed buffer length (resize manually)

Read-only access means passing around a pointer to const data (const char*).

Read-only substrings can be defined as a pointer and size pair (manually handled).

std::string

  1. Convenient (methods, algorithms, iterators, stores length)
  2. Can store \0
    • Be careful: code that converts to a C-string treats the first \0 as the end of the string
  3. Resizable

Read-only access through const references.

Read-only substrings lead to copies.
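The difference between the two kinds of read-only substring can be sketched as follows. This is a minimal illustration, not a recommended design: CharSpan, spanOf and substrIsCopy are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper type: a manually handled read-only substring.
struct CharSpan
{
    const char* ptr;
    std::size_t size;
};

// Returns a (pointer, size) pair that refers into text without copying.
// The caller must keep text alive and unmodified while the span is used.
CharSpan spanOf(const std::string& text, std::size_t pos, std::size_t count)
{
    assert(pos + count <= text.size());
    return CharSpan{text.data() + pos, count};
}

// Returns true if std::string::substr produced characters stored
// somewhere other than inside the original string, i.e. a copy.
bool substrIsCopy(const std::string& text, std::size_t pos, std::size_t count)
{
    std::string sub = text.substr(pos, count);   // copies the characters into a new string
    return sub.data() != text.data() + pos;
}
```

The span costs one pointer and one size regardless of the substring length, while substr copies every character.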


3 of 19

Read-only substrings

Many string-processing tasks only require reading the strings and computing statistics, but not modifying anything. Many modifications can be made in advance.

Result: a large buffer of text which is split, analyzed, counted, but not modified.

  • std::string makes a copy for each substring
  • C-string + size avoids the copies, but doesn’t have a nice interface

What if we combine these two approaches?

    std::string textData = getTextData(filePath);

    // make a list of all words
    std::vector<std::string> words = extractWords(textData);

    // make a list of all sentences
    std::vector<std::string> sentences = extractSentences(textData);

    // make a word-to-sentences map:
    // "in which sentences does this word appear?"
    using WordMap = std::map<std::string, std::vector<std::string>>;
    WordMap wordMap = makeWordSentenceMap(words, sentences);

    // do complex processing with the map

// do complex processing with the map

4 of 19

Why const std::string& doesn’t solve the problem

Passing objects by reference is supposed to solve exactly this problem: unnecessary copies.

However, there are still:

  1. Copies because of implicit conversion
  2. Copies because of substrings

    // copy: the argument is passed by value
    std::string getLongestWord(std::string sentence)
    {
        ...
    }

    // no copy (reference): the argument is passed by const reference
    std::string getLongestWord(const std::string& sentence)
    {
        ...
    }

5 of 19

Why const std::string& doesn’t solve the problem

The steps happening here are:

  1. Create a C-string “A small sentence”
    • variable data
  2. Create a new std::string “A small sentence”
    • implicit conversion
    • copies data
    • maybe to the heap! (slow)
  3. Create a reference to the std::string
    • no copying!
  4. Extract the longest word
  5. Create a new std::string for the word
    • copies data
  6. Return it.

    std::string getLongestWord(const std::string& sentence)
    {
        ...
    }

    // call with a string literal
    std::string word;
    word = getLongestWord("A small sentence");

    // the same call with the C-string stored in a variable
    const char* data = "A small sentence";
    std::string word;
    word = getLongestWord(data);

6 of 19

Read access to a single buffer

  • Create a C-string “A very long string”
  • Create a new const pointer
    • no conversion
    • data not copied
  • Extract the longest word
  • Save the position of the word and its length as a new pointer-size pair
    • data not copied
  • Return it.

C-strings can solve this without unnecessary copying, but they need to store the size.

    "A very long string"

    std::pair<const char*, size_t> getLongestWord(const char* sentence, size_t size)
    {
        ...
    }
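One possible body for the pointer-size version is sketched below. It assumes that words are separated by single space characters and that the first of several equally long words wins; neither rule is stated on the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Returns the position and length of the longest space-separated word.
// No characters are copied: the result points into the caller's buffer.
std::pair<const char*, std::size_t> getLongestWord(const char* sentence, std::size_t size)
{
    assert(sentence != nullptr || size == 0);

    const char* bestPtr = sentence;
    std::size_t bestSize = 0;
    std::size_t start = 0;   // index where the current word begins

    for (std::size_t i = 0; i <= size; ++i)
    {
        // A word ends at a space or at the end of the buffer.
        if (i == size || sentence[i] == ' ')
        {
            if (i - start > bestSize)
            {
                bestPtr = sentence + start;
                bestSize = i - start;
            }
            start = i + 1;
        }
    }
    return {bestPtr, bestSize};
}
```

The caller has to pass the size explicitly and interpret the returned pair by hand, which is the inconvenience that std::string_view removes.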

7 of 19

string_view

std::string_view provides exactly this solution.

In addition, it provides an interface of read-only operations such as find and substr.

    "A very long string"

    std::string_view (whole string): ptr = str,      size = 18
    std::string_view (longest word): ptr = str + 12, size = 6

8 of 19

string_view

  • Provides read-only access to an existing character sequence
  • Can be created from both C-strings and std::strings
  • Copying is very fast (can be passed by value into functions)
  • The only two fields are:
    • Data pointer (const char*)
    • Size
  • Added in C++17

    std::string_view getLongestWord(std::string_view sentence)
    {
        ...
    }
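A minimal sketch of how the body of this function might look, using only std::string_view::find and std::string_view::substr. It assumes space-separated words and keeps the first of several equally long words; longestWordExample is a hypothetical helper added to show usage.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

std::string_view getLongestWord(std::string_view sentence)
{
    std::string_view best;   // empty view until a word is found
    std::size_t start = 0;

    while (start < sentence.size())
    {
        std::size_t end = sentence.find(' ', start);
        if (end == std::string_view::npos)
            end = sentence.size();

        if (end - start > best.size())
            best = sentence.substr(start, end - start);   // new view, same buffer

        start = end + 1;
    }
    return best;
}

// Usage sketch: the result is a view into the caller's buffer, not a new string.
void longestWordExample()
{
    const char* data = "A very long string";
    std::string_view word = getLongestWord(data);
    assert(word == "string");
    assert(word.data() == data + 12);
}
```

Every substr call here only adjusts a pointer and a size, so the loop performs no allocation.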

9 of 19

string_view

The steps happening here are:

  • Create a C-string “A small sentence”
    • variable data
  • Create a new std::string_view “A small sentence”
    • implicit conversion
    • no data copying
  • Copy std::string_view
    • fast and cheap
  • Extract the longest word
  • Create a new std::string_view for the word
    • no data copying
  • Return it (and make a copy because the outside code uses std::string).

    std::string_view getLongestWord(std::string_view sentence)
    {
        ...
    }

    // call with a string literal
    std::string word;
    word = getLongestWord("A small sentence");

    // the same call with the C-string stored in a variable
    const char* data = "A small sentence";
    std::string word;
    word = getLongestWord(data);
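The same approach extends to splitting a whole text buffer into words without copying any characters. The sketch below is an illustration under simple assumptions (words are separated by spaces only); splitWords and splitWordsExample are hypothetical names, not functions from the lecture.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Splits text into words. Each element is a view into the original buffer.
std::vector<std::string_view> splitWords(std::string_view text)
{
    std::vector<std::string_view> words;
    std::size_t start = 0;

    while (start < text.size())
    {
        std::size_t end = text.find(' ', start);
        if (end == std::string_view::npos)
            end = text.size();

        if (end > start)   // skip empty pieces between repeated spaces
            words.push_back(text.substr(start, end - start));

        start = end + 1;
    }
    return words;
}

// Usage sketch: the views point into the original text.
void splitWordsExample()
{
    std::string_view text = "A small sentence";
    std::vector<std::string_view> words = splitWords(text);
    assert(words.size() == 3);
    assert(words[1].data() == text.data() + 2);
}
```

The vector stores only pointer-size pairs, so the buffer that text refers to must stay alive for as long as the words are used.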

10 of 19

Resources and Owners

  • std::string_view uses a completely different resource model
    • It is a “view” of a resource; some other object manages the resource itself
  • Other objects manage their own resources
    • std::string has an internal buffer where it stores characters
      • destroy the variable = free the buffer
    • std::vector, std::map, std::set all store copies of objects
      • unless you store pointers, as in std::vector<Data*>
  • Usually, to get a non-owning view like this you must use pointers manually

Diagram: std::string owns its string data; std::vector owns its elements; std::string_view holds only a pointer to data owned elsewhere.
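The difference between owning a copy and viewing someone else's data can be observed directly. The following is a small sketch; the function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// A container owns copies: changing the original does not affect the element.
bool vectorStoresCopy()
{
    std::string original = "text";
    std::vector<std::string> owned;
    owned.push_back(original);    // the vector stores its own copy
    assert(owned.size() == 1);

    original[0] = 'T';            // modify the original in place
    return owned[0] == "text";    // the stored copy is unchanged
}

// A view owns nothing: it sees whatever is in the owner's buffer.
bool viewObservesOwner()
{
    std::string owner = "text";
    std::string_view view = owner;   // no copy, the view points into owner's buffer

    owner[0] = 'T';                  // in-place change, the buffer itself stays where it is
    return view == "Text";           // the view reflects the change
}
```

Both functions return true. Operations that may reallocate the owner's buffer (for example appending to the string) would invalidate the view, which is why the second function only modifies a character in place.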

11 of 19

Resources and Owners

Manually managed data must be treated carefully.

  • Lose a pointer = can’t free memory
  • Lose track of a pointer = unexpected modifications
  • Lose track of the object = pointer tries to access broken data

STL containers solve this by providing a non-pointer interface.

    Data* createData()
    {
        Data* data = new Data();
        initialize(data);
        preprocess(data);
        //...
        return data;
    }

    Data* demoData = createData();
    ModificationStatus status = modifyData(demoData);
    AnalysisResult analysis = analyzeData(demoData);

Diagram: a single DATA object is reached through several pointers: data, demoData, the modifyData() input, and the analyzeData() input.

12 of 19

Resource Management with std::string_view

std::string_view doesn’t keep track of the original object except by its pointer. It is therefore susceptible to the following problem:

  • Lose track of the object = pointer tries to access broken data

Solution? Keep track of this manually.

The example below is wrong. Why?

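    std::string_view getFirstWord(std::string text)
    {
        std::size_t endIndex = text.size();
        for (std::size_t i = 0; i < text.size(); ++i)
        {
            if (text[i] == ' ')
            {
                endIndex = i;
                break;   // stop at the first space
            }
        }
        // take a view of text's own buffer
        // (text.substr() would create a separate temporary std::string)
        std::string_view word = std::string_view(text).substr(0, endIndex);
        return word;
    }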

13 of 19

Resource Management with std::string_view

std::string is passed by value, so the variable text exists only within the function.

std::string_view “attaches” to this local variable.

When the variable is destroyed (at the end of the function), the std::string_view is left pointing to freed memory.

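    // text must refer to a string that outlives the returned view
    std::string_view getFirstWord(const std::string& text)
    {
        std::size_t endIndex = text.size();
        for (std::size_t i = 0; i < text.size(); ++i)
        {
            if (text[i] == ' ')
            {
                endIndex = i;
                break;   // stop at the first space
            }
        }
        // take a view of the caller's buffer
        // (text.substr() would create a temporary std::string and a dangling view)
        std::string_view word = std::string_view(text).substr(0, endIndex);
        return word;
    }

Diagram: two std::string objects, each with its own string data, and the std::string_view objects attached to them.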
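A common alternative is to accept a std::string_view parameter, which makes it explicit that the function only looks at data owned by the caller. The sketch below shows one way this might look; firstWordPointsIntoOwner is a hypothetical helper that demonstrates the intended usage.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Returns the part of text before the first space (or all of it if there is none).
// The result refers to the same buffer as text.
std::string_view getFirstWord(std::string_view text)
{
    std::size_t end = text.find(' ');
    // substr clamps the count to the remaining length,
    // so npos here simply means "up to the end".
    return text.substr(0, end);
}

// Usage sketch: the caller owns the buffer and keeps it alive
// for as long as the returned view is used.
bool firstWordPointsIntoOwner()
{
    std::string owner = "hello world";
    std::string_view word = getFirstWord(owner);
    assert(word == "hello");
    return word.data() == owner.data();
}
```

The lifetime rule does not go away: calling getFirstWord on a temporary std::string and storing the result would still leave a dangling view.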

14 of 19

Implementing string_view

An example of how some basic parts of a string_view class might work.

    #include <cstddef>
    #include <string>

    class MyStringView
    {
    public:
        // view over the characters of an existing std::string
        MyStringView(const std::string& orig)
            : _ptr(orig.c_str()),
              _size(orig.size()) {}

        // view over an arbitrary (pointer, size) range
        MyStringView(const char* ptr, size_t size)
            : _ptr(ptr),
              _size(size) {}

        // copying a view copies only the pointer and the size
        MyStringView(const MyStringView& orig)
            : _ptr(orig._ptr),
              _size(orig._size) {}

        size_t getSize() const { return _size; }

        MyStringView getSubString(size_t pos, size_t size) const
        {
            MyStringView result(_ptr + pos, size);
            return result;
        }

        char operator[](int index) const
        {
            return _ptr[index];
        }

    protected:
        const char* _ptr;
        size_t _size;
    };

15 of 19

Resource Management with Smart Pointers

  • As your program grows, memory management issues become more complex.
    • Free memory when the object isn’t used anymore
    • Do not free memory while the object is still in use
    • Keep track of which functions/objects can access the memory (especially for writing)
  • C++ has smart pointers to help solve these problems
    • std::unique_ptr<> is the sole owner of the corresponding memory. It frees the memory when the pointer is destroyed
    • std::shared_ptr<> keeps count of how many shared pointers own the memory. When the count drops to 0, the memory is freed
    • std::weak_ptr<> observes an object owned by std::shared_ptr<> without owning it (for example, to break reference cycles)
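The three smart pointer types can be compared in a short sketch. Data is a hypothetical placeholder type and the function names are made up for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Data { int value = 0; };   // hypothetical payload type

// unique_ptr: exactly one owner. Ownership can be moved, but not copied.
bool uniqueOwnershipMoves()
{
    std::unique_ptr<Data> first = std::make_unique<Data>();
    std::unique_ptr<Data> second = std::move(first);   // first gives up ownership
    return first == nullptr && second != nullptr;
}

// shared_ptr: several owners share one object and a reference count.
long sharedOwnerCount()
{
    std::shared_ptr<Data> a = std::make_shared<Data>();
    std::shared_ptr<Data> b = a;   // second owner of the same object
    return a.use_count();          // number of owners while both exist
}

// weak_ptr: observes a shared object without keeping it alive.
bool weakPtrExpiresWithLastOwner()
{
    std::weak_ptr<Data> observer;
    {
        std::shared_ptr<Data> owner = std::make_shared<Data>();
        observer = owner;
        assert(!observer.expired());   // the object is alive while owner exists
    }                                  // the last owner is destroyed here
    return observer.expired();
}
```

In each case the memory is released by a destructor, so no explicit delete appears anywhere.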


16 of 19

Resource Management with Smart Pointers

A simple example is using unique_ptr to automatically free memory in a function.

If a regular pointer is used, you must keep track of every way the function can exit.

With unique_ptr memory is freed automatically.

    // regular pointer: every exit path must delete the memory
    void my_func()
    {
        int* valuePtr = new int(15);
        int x = 45;
        // ...
        if (x == 45)
            return; // here we have a memory leak,
                    // valuePtr is not deleted
        // ...
        delete valuePtr;
    }

    // std::unique_ptr: the memory is freed on every exit path
    #include <memory>

    void my_func()
    {
        std::unique_ptr<int> valuePtr(new int(15));
        int x = 45;
        // ...
        if (x == 45)
            return; // no memory leak anymore!
        // ...
    }
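The automatic cleanup can be made visible by counting live objects. In the sketch below, Tracked and leaveEarly are hypothetical names used only for the demonstration, and std::make_unique (available since C++14) is used in place of a raw new expression.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that counts live instances so the cleanup can be observed.
struct Tracked
{
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

void leaveEarly(bool earlyExit)
{
    std::unique_ptr<Tracked> resource = std::make_unique<Tracked>();
    assert(Tracked::alive == 1);

    if (earlyExit)
        return;   // resource is destroyed here, which deletes the Tracked object

    // ... more work with resource ...
}                 // on the normal path the object is deleted here
```

After either call, leaveEarly(true) or leaveEarly(false), Tracked::alive is back to 0.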

17 of 19

Qt and Smart Pointers

  • In this course the largest application will be the Qt project.
  • It will use the Qt Framework for most operations.
  • Qt has its own memory management model - it uses inheritance and parent-child relationships to prevent memory leaks.
  • Smart pointers are not needed with the Qt project. They are an extra topic.

Instructions for downloading and setting up Qt will be released soon. Quick links:


18 of 19

Recap. Questions.


19 of 19

Next time

Qt
