@rkenmi - Python String Tricks - Performance considerations

# Python String Tricks - Performance considerations

### Python String Tricks - Performance considerations

##### Updated on September 21, 2017
• If a string is changed to become bigger, consider trying to find the maximum size of the final string early on in a quick pass (strive for $$O(n)$$)

• Consider working from the tail of the string and iterating backwards if, say, 1 character needs to be replaced with 2.
• Traditional deletion of 1 character in a string is a $$O(n)$$ process because elements to the right of the deleted index has to be shifted to the left, and this has a worst case scenario of iterating through ~$$n$$ elements.

• If deleting entries in a string, consider swapping indices to make it seem like entries are being deleted.
• Do note that you cannot do index assignments with strings in Python. This is because strings are immutable.
• Consider the consequences of += and = operators

• This is naturally ~$$O(n)$$ in time complexity because you are allocating a new string and copying over \n\ elements, where \n\ represents the amount of characters in the new string.
• This entails that using += with strings in a single for-loop would result in $$O(n^2)$$ performance, which isn't good!
• But... the standard Python interpreter (CPython) does use a bit of magic (using bytearray) to try to reduce this down to $$O(n)$$. However this performance reduction depends on some edge cases, and in general, we shouldn't rely on implementation details. See here.
• String slicing does copy

• A statement like new_s = s[3:7] does cost $$O(n)$$.
• bytearray: a way to work with mutable strings

• We can use bytearray to manipulate strings in a fashion similar to lists.
• String index assignment can be done with bytearray. i.e. s[0] = ord('A') ord returns an integer representing the Unicode code point of that character.
• bytearray can easily be decoded to a Python 3 string and encoded back to a bytearray. For example: b = bytearray("hell0", "utf-8") encodes the unicode string into a bytearray. b.decode() will output 'hell0' with type 'str'.
• Usage of bytearray is uncommon because majority of the time when we work with strings, we either compare or format them instead of modifying them.