Python is a great language for data analysis, primarily because of its ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier. Pandas builds on packages like NumPy and matplotlib to give you a single, convenient place to do most of your data analysis and visualization work.
A Series is a one-dimensional labeled array capable of holding data of any type (integer, string, float, Python objects, etc.). The axis labels are collectively called the index.
pandas.Series
A pandas Series can be created using the following constructor −
pandas.Series( data, index, dtype, copy)
The parameters of the constructor are as follows −
S.No | Parameter & Description |
1 | data : data takes various forms such as an ndarray, list, dict, or scalar constant |
2 | index : Index values must be hashable, and the index must have the same length as data; labels need not be unique. Defaults to a RangeIndex (0, 1, 2, ..., n-1) if no index is passed. |
3 | dtype : dtype is for data type. If None, data type will be inferred |
4 | copy : Copy data. Default False |
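The four constructor parameters can be seen together in a short sketch. The array contents and the labels "x", "y", "z" are made up for illustration.

```python
import numpy as np
import pandas as pd

values = np.array([1.0, 2.0, 3.0])

# data, index, dtype and copy all passed explicitly
s = pd.Series(data=values, index=["x", "y", "z"], dtype="float64", copy=True)
print(s)

# Because copy=True, changing the source array afterwards
# does not change the Series.
values[0] = 99.0
print(s["x"])
```

With copy=True the Series owns its own data, so the later assignment to values[0] leaves s["x"] at its original value.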
A series can be created by passing inputs like −
- Array
- Dict
- Scalar value or constant
Create an Empty Series
The simplest Series that can be created is an empty Series.
Example
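import pandas as pd  # import the pandas library under the alias pd
s = pd.Series(dtype="float64")  # dtype given explicitly; recent pandas versions otherwise create an empty Series with dtype object
print(s)
Its output is as follows −
Series([], dtype: float64)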
Create a Series from ndarray
If data is an ndarray, then the index passed must be of the same length. If no index is passed, the default index is range(n), where n is the array length, i.e., [0, 1, 2, ..., len(array) - 1].
Example 1
import pandas as pd  # import the pandas library under the alias pd
import numpy as np
data = np.array(['i', 'n', 'd', 'i', 'a'])
s = pd.Series(data)
print(s)
Its output is as follows −
0 i
1 n
2 d
3 i
4 a
dtype: object
- Here 0 to 4 form the index and i, n, d, i, a form the data.
Example 2
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = np.array(['i', 'n', 'd', 'i', 'a'])
s = pd.Series(data, index=[10, 11, 12, 13, 14])  # custom index labels instead of 0 to 4
print(s)
Its output is as follows −
10 i
11 n
12 d
13 i
14 a
dtype: object
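The requirement that the index match the data in length can be checked with a small sketch. The three-label index below is deliberately too short, and the variable name raised is only there for illustration.

```python
import numpy as np
import pandas as pd

data = np.array(["i", "n", "d", "i", "a"])

raised = False
try:
    # five values but only three labels: the lengths do not match
    pd.Series(data, index=[10, 11, 12])
except ValueError as err:
    raised = True
    print("ValueError:", err)
```

pandas rejects the mismatch with a ValueError rather than padding or truncating either side.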
Create a Series from dictionary
A dict can be passed as input. If no index is specified, the dictionary keys are used to construct the index in insertion order (pandas versions before 0.23, or Python versions before 3.6, sorted the keys instead). If an index is passed, the values in data corresponding to the labels in the index are pulled out.
Example 1
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = {'a': 0., 'e': 1., 'i': 2., 'o': 3., 'u': 4.}  # dictionary keys are used to construct the index
s = pd.Series(data)
print (s)
Its output is as follows −
a 0.0
e 1.0
i 2.0
o 3.0
u 4.0
dtype: float64
Example 2
import pandas as pd
import numpy as np
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd.Series(data,index=['b','c','d','a'])
print (s)
output −
b 1.0
c 2.0
d NaN
a 0.0
dtype: float64
Note − The index label 'd' has no matching key in the dictionary, so its value is filled with NaN (Not a Number).
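Once a Series contains NaN, the missing entries can be located or handled. The sketch below reuses the dictionary from Example 2; replacing NaN with 0.0 is an arbitrary choice for illustration.

```python
import pandas as pd

data = {"a": 0.0, "b": 1.0, "c": 2.0}
s = pd.Series(data, index=["b", "c", "d", "a"])

missing = s.isna()       # boolean Series, True where the value is NaN
present = s.dropna()     # new Series without the NaN entry
filled = s.fillna(0.0)   # new Series with NaN replaced by 0.0

print(missing)
print(present)
print(filled)
```

None of these calls modifies s; each returns a new Series.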
Create a Series from Scalar Value
If data is a scalar value, an index must be provided. The value is repeated to match the length of the index.
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
s = pd.Series('its empty', index=[0, 1, 2, 3])
print(s)
Its output is as follows −
0 its empty
1 its empty
2 its empty
3 its empty
dtype: object
Accessing Data from Series with Position
Data in the Series can be accessed by position, much like the elements of an ndarray.
Example 1
Retrieve the first element. As we already know, the counting starts from zero for the array, which means the first element is stored at zeroth position and so on.
import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
print(s.iloc[0])  # access the first element by position; recent pandas versions deprecate s[0] for positional access on a labeled index
output −
1
Example 2
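Retrieve the first three elements in the Series. If a : is placed before a position, all items up to (but not including) that position are extracted; if it is placed after a position, all items from that position onwards are extracted. If two positions are given with a : between them, the items between the two positions (not including the stop position) are extracted.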
import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
# retrieve the first three elements
print (s[:3])
Its output is as follows −
a 1
b 2
c 3
dtype: int64
SLICING (Extracting some of the elements)
- (Retrieve the last three elements)
import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
print(s[-3:])  # retrieve the last three elements
output −
c 3
d 4
e 5
dtype: int64
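The slicing rules above (start only, stop only, and both) can also be written with the explicit positional indexer .iloc. This is a minimal sketch on the same Series; the variable names are chosen for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

first_three = s.iloc[:3]   # positions 0, 1, 2
from_third = s.iloc[2:]    # position 2 through the end
middle = s.iloc[1:4]       # positions 1, 2, 3 (stop position excluded)
last_three = s.iloc[-3:]   # negative positions count from the end

print(first_three)
print(from_third)
print(middle)
print(last_three)
```

Using .iloc makes it unambiguous that the numbers are positions rather than index labels, which matters once the index itself contains integers.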
Retrieve Data Using Label (Index)
A Series is like a fixed-size dict in that you can get and set values by index label.
Example 1 Retrieve a single element using index label value.
import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
print(s['a'])  # retrieve a single element
Its output −
1
Example 2 Retrieve multiple elements using a list of index values.
import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
#retrieve multiple elements
print (s[['a','c','d']])
Its output is as follows −
a 1
c 3
d 4
dtype: int64
Example 3
If a label is not contained in the index, an exception (KeyError) is raised.
import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
print(s['f'])  # trying to access a label that does not exist
Its output is as follows −
…
KeyError: 'f'
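The dict-like behavior also covers setting values and checking for a label before reading it. The following sketch uses the same Series; the new value 10 and the default 0 are arbitrary.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

s["a"] = 10                     # set a value by its index label
has_f = "f" in s                # membership tests the labels, not the values
maybe_f = s.get("f")            # returns None instead of raising KeyError
f_or_default = s.get("f", 0)    # returns the supplied default when the label is absent

print(s["a"], has_f, maybe_f, f_or_default)
```

Series.get is a convenient alternative to wrapping s['f'] in a try/except block when a missing label is an expected case.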
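- head() and tail() functions : head(n) returns the first n elements of a Series and tail(n) returns the last n; n defaults to 5.
import pandas as pd
import numpy as np
# head() and tail() are Series methods, so the ndarray is wrapped in a Series first
data = pd.Series(np.array(['i','n','d','i','a','I','s','g','r','e','a','t']))
print(data.head())   # returns the first 5 values
print(data.tail())   # returns the last 5 values
print(data.head(7))  # returns the first 7 values
print(data.tail(6))  # returns the last 6 values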
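- Arithmetic with Series : Arithmetic operations (+, -, *, /) can be performed on two Series objects. Values are paired by index label and the operation is applied element by element. If a label is present in only one of the two Series, the result for that label is NaN. The session below shows the element-wise results on plain NumPy arrays; two Series with identical indexes produce the same values.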
>>> import pandas as pd
>>> import numpy as np
>>> D1=np.array([1,2,3,4])
>>> D2=np.array([2,1,3,2])
>>> D1*D2
Output array([2, 2, 9, 8])
>>> D1-D2
Output array([-1, 1, 0, 2])
>>> D1+D2
Output array([3, 3, 6, 6])
>>> D1/D2
Output array([0.5, 2. , 1. , 2. ])
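The NaN behavior for non-matching labels only appears with Series objects, so a small sketch may help. The labels here are invented for illustration: s1 has a label 'd' that s2 lacks, and s2 has a label 'e' that s1 lacks.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"])
s2 = pd.Series([2, 1, 3, 2], index=["a", "b", "c", "e"])

# Labels are aligned first; 'd' and 'e' exist on one side only, so they become NaN
total = s1 + s2
print(total)

# add() with fill_value treats a label missing on one side as 0 instead
filled = s1.add(s2, fill_value=0)
print(filled)
```

The result index is the union of both indexes, and the result dtype becomes float64 because NaN cannot be stored in an integer Series.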
- Vector Operations on Series : If a function or operation is applied to a Series object, it is applied to each item of that object. The session below shows this on a NumPy array; a Series behaves the same way, element by element.
>>> D1=np.array([10,12,23,24])
>>> D1*2
Output array([20, 24, 46, 48])
>>> D1-5
Output array([ 5, 7, 18, 19])
>>> D1+5
Output array([15, 17, 28, 29])
>>> D1>6
Output array([ True, True, True, True])
>>> D1<16
Output array([ True, True, False, False])
>>> D1**2
Output array([100, 144, 529, 576])
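The same operations can be applied directly to a Series, which keeps its index labels in the result. This is a minimal sketch with the values from the session above; the labels are made up for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([10, 12, 23, 24], index=["a", "b", "c", "d"])

doubled = s * 2       # every element multiplied by 2
mask = s < 16         # boolean Series, one True/False per element
small = s[mask]       # keep only the elements where the mask is True
roots = np.sqrt(s)    # NumPy functions also work element-wise and return a Series

print(doubled)
print(mask)
print(small)
print(roots)
```

A comparison such as s < 16 returns a boolean Series, which can then be used to filter the original Series as shown with small.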