About
Text - String in Python
A string is a sequence data type; a string literal is the notation that creates one in source code.
Strings in Python are:
- sequences with characters as elements
Each character in a string has a subscript or offset (its index). The numbering starts at 0 for the leftmost character and increases by one as you move character by character to the right.
The string PYTHON has 6 characters, numbered 0 to 5:
+---+---+---+---+---+---+
| P | Y | T | H | O | N |
+---+---+---+---+---+---+
  0   1   2   3   4   5
To get the letter P, use offset 0:
letter_P = "PYTHON"[0]
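As a small illustration of the offsets described above, the following sketch (Python 3; the variable names are only illustrative) pairs each character of PYTHON with its offset and reads the first and last characters.

```python
# Pair every character of the word with its offset (index).
word = "PYTHON"
offsets = [(index, character) for index, character in enumerate(word)]

for index, character in offsets:
    print(index, character)

# The first character is at offset 0, the last one at len(word) - 1.
first_letter = word[0]
last_letter = word[len(word) - 1]
```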
Initialization
Code
A string is created by writing its text between quotation marks (' ' or " "). The escape character is the backslash (\).
my_string = "I'm a string!"
my_other_string = 'and me too'
my_string_with_apostrophe = 'It\'s great!'
string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
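The quoting rules can be checked with a short sketch (Python 3; the names below are illustrative, not part of the original examples). It assumes nothing beyond the built-in str type and shows that both quote styles give the same string, plus a few common escape sequences.

```python
# Both quote styles create the same kind of object.
double_quoted = "I'm a string!"
single_quoted = 'I\'m a string!'   # the backslash escapes the apostrophe

same_text = double_quoted == single_quoted

# Common escape sequences: \n is a newline, \\ is a literal backslash.
two_lines = "first line\nsecond line"
path_like = "C:\\temp"
print(two_lines)
print(path_like)
```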
File
with open("file.txt", "r") as fh:
    my_description = fh.read()
str
str is the class that creates string objects.
Character Set
Unicode
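In Python 2, string literals prefixed with the letter “u” are unicode strings (in Python 3, every str is already Unicode and the prefix is optional). The examples in this section use Python 2 syntax. For example:
s = u"This is a unicode string"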
print type(s)
<type 'unicode'>
In Python 2, file I/O is byte based: readline() returns bytes, and decode() turns them into unicode.
s = file.readline() # bytes
print type(s)
s = file.readline().decode('utf-8') # unicode
print type(s)
s = u'Hello world!'
print type(s), repr(s)
s = 'Hello world!'
print type(s), repr(s)
If printing a unicode string raises an error, you can use the encode method to convert the international characters to bytes before printing, like this:
unicode_string = u"aaaàçççñññ"
encoded_string = unicode_string.encode('utf-8')
print encoded_string
“decode” converts from bytes to unicode. “encode” converts from unicode to bytes.
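The same round trip can be sketched for Python 3, where str holds Unicode text and bytes holds raw bytes. This is a minimal sketch assuming a Python 3 interpreter; the variable names are illustrative.

```python
# Python 3: str holds Unicode text, bytes holds raw bytes.
text = "aaaàçççñññ"
encoded = text.encode("utf-8")      # str -> bytes
decoded = encoded.decode("utf-8")   # bytes -> str

print(type(text), type(encoded))
print(decoded == text)
```

Non-ASCII characters take more than one byte in UTF-8, so the encoded value is longer than the text.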
Loop
For
string = "Nico!"
for character in string:
    print(character)
N
i
c
o
!
Operator
in
if 'a' in 'Nicolas':
    print('a is a letter of Nicolas')
if 'z' not in 'Nicolas':
    print('z is not a letter of Nicolas')
if 'Nico' in 'Nicolas':
    print('Nico is in Nicolas')
+ (Concat)
The + operator between two strings concatenates them:
print("gerard" + " " + "nico")
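Concatenation always builds a new string from its operands. The sketch below (Python 3, illustrative names) shows the + operator next to str.join, which gives the same result when several pieces have to be assembled.

```python
first_name = "gerard"
last_name = "nico"

# + builds a new string from its two operands.
full_name = first_name + " " + last_name

# For many pieces, str.join produces the same result in one call.
joined_name = " ".join([first_name, last_name])
print(full_name)
```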
Function
Split
The split() string method splits a sentence into a list of words.
text = "How do you do?"
for word in text.split():
    print(word)
How
do
you
do?
split() returns a list:
>>> type(text.split())
<class 'list'>
split() can be called directly on a unicode or str object. For example, in Python 2:
>>> u'split,me'.split(',')
[u'split', u'me']
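The two calling styles of split() can be compared in a short sketch (Python 3; the sample strings are illustrative). It also shows the optional second argument, which limits the number of cuts.

```python
text = "How do you do?"

# Without arguments, split() cuts on runs of whitespace.
words = text.split()

# With a separator, split() cuts on that exact string.
fields = "split,me,please".split(",")

# The optional second argument limits the number of cuts.
head_and_rest = "split,me,please".split(",", 1)
print(words, fields, head_and_rest)
```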
length
len() returns the length of a string, i.e. its number of characters.
my_string = "Nico Gerard"
print(len(my_string))
# Dot notation works only for string-specific methods (i.e. methods that don't work on anything else).
# my_string.len() therefore does not exist: len() is a built-in function because it works on many kinds of objects.
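The point made in the comments above can be sketched as follows (Python 3, illustrative names): the same built-in len() is applied to a string and to a list.

```python
my_string = "Nico Gerard"

# len() is a built-in function: it accepts any object that has a length.
string_length = len(my_string)          # characters, the space included
list_length = len(my_string.split())    # items in the list of words
print(string_length, list_length)
```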
Lower
s.lower()
Upper
s.upper()
(Cast|Str)
str() makes strings out of non-strings (explicit string conversion).
str(2)
Isalphanumeric
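isalpha() returns True only when every character is a letter, so a string that contains digits fails the test (the alphanumeric test is isalnum()): "J123".isalpha() == False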
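The character-class tests can be compared side by side. This is a small sketch in Python 3 using only built-in str methods; the variable names are illustrative.

```python
code = "J123"

only_letters = code.isalpha()        # False: the digits are not letters
only_digits = code.isdigit()         # False: "J" is not a digit
letters_or_digits = code.isalnum()   # True: every character is a letter or a digit
print(only_letters, only_digits, letters_or_digits)
```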
Slicing
Slicing of a substring: string[i:j] gives the characters from position i up to, but not including, position j.
>>> 'foo'[0:2]
'fo'
>>> 'foo'[2:]
'o'
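The slice bounds can be illustrated with a short sketch (Python 3, illustrative names). Because the end offset is excluded, a string cut at any offset can be rebuilt from its two halves.

```python
word = "PYTHON"

first_two = word[0:2]    # offsets 0 and 1; offset 2 is excluded
from_two = word[2:]      # offset 2 up to the end
up_to_two = word[:2]     # same as word[0:2]

# The two halves rebuild the original word.
rebuilt = word[:2] + word[2:]
print(first_two, from_two, rebuilt)
```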
Replace
Syntax:
str.replace(old, new[, count])
Example:
- Replace boum with hop
'Youplaboum'.replace('boum', 'hop')
'Youplahop'
- Replace only the first boum with hop
'Youplaboumboum'.replace('boum', 'hop',1)
'Youplahopboum'
Strip
strip() removes leading and trailing whitespace.
s.strip()
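A minimal sketch of strip() and its one-sided variants lstrip() and rstrip() (Python 3; the sample value is illustrative):

```python
raw = "  \tNico Gerard \n"

both_sides = raw.strip()     # whitespace removed from both ends
left_only = raw.lstrip()     # leading whitespace only
right_only = raw.rstrip()    # trailing whitespace only
print(repr(both_sides), repr(left_only), repr(right_only))
```

Whitespace here includes tabs and newlines, not just spaces; the space inside the text is kept.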
Encoding
When a string is encoded to bytes, its characters are typically represented using a variable-length encoding scheme called UTF-8.
Each character is represented by one to four bytes.
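The variable length can be observed by encoding single characters and counting the bytes. This is a sketch in Python 3; the sample characters (a, é, € and an emoji, written as escapes) are chosen for illustration.

```python
# Number of UTF-8 bytes needed for characters of increasing code point.
samples = ["a", "\u00e9", "\u20ac", "\U0001f600"]
byte_counts = {character: len(character.encode("utf-8")) for character in samples}

for character, count in byte_counts.items():
    print(repr(character), count)
```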
Ord
You can find the numeric value of a character c using ord(c).
Example of numeric values of the characters 'a', 'A' and space:
>>> ord('a')
97
>>> ord('A')
65
>>> ord(' ')
32
Chr
You can obtain the character from a numerical value using chr(i).
To see the string of characters numbered 0 through 9, you can use the following:
s = ' - '.join([chr(i) for i in range(10)])
'\x00 - \x01 - \x02 - \x03 - \x04 - \x05 - \x06 - \x07 - \x08 - \t'
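ord() and chr() are inverses of each other, which the following sketch (Python 3, illustrative names) makes explicit. It reuses the values 97 and 65 shown above for 'a' and 'A'.

```python
letter = "a"
code_point = ord(letter)          # numeric value of the character
same_letter = chr(code_point)     # back to the character

# Shifting the numeric value moves through the alphabet.
next_letter = chr(ord(letter) + 1)

# 'a' (97) and 'A' (65) are 32 apart.
upper_from_lower = chr(ord(letter) - 32)
print(code_point, same_letter, next_letter, upper_from_lower)
```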
to
toInt
int('24')
toByte
myString = "hello"
myStringInByte = myString.encode()
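The conversions above can be chained in both directions. A minimal sketch, assuming Python 3 (where encode() and decode() default to UTF-8); the names are illustrative.

```python
number = int("24")          # str -> int
text = str(number + 1)      # int -> str
raw = "hello".encode()      # str -> bytes (UTF-8 by default)
back = raw.decode()         # bytes -> str
print(number, text, raw, back)
```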
Properties
immutable
Strings in Python are immutable.
The Python FAQ entry “Why are Python strings immutable?” lists several advantages:
- One is performance: knowing that a string is immutable means we can allocate space for it at creation time, and the storage requirements are fixed and unchanging. This is also one of the reasons for the distinction between tuples and lists.
- Another advantage is that strings in Python are considered as “elemental” as numbers. No amount of activity will change the value 8 to anything else, and in Python, no amount of activity will change the string “eight” to anything else.
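Immutability can be observed directly. In the sketch below (Python 3, illustrative names), assigning to a character raises a TypeError, while a method such as replace() returns a new string and leaves the original untouched.

```python
eight = "eight"

try:
    eight[0] = "E"            # item assignment is not allowed on str
    assignment_failed = False
except TypeError:
    assignment_failed = True

# replace() returns a new string; the original is unchanged.
capitalized = eight.replace("e", "E", 1)
print(assignment_failed, eight, capitalized)
```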