Numpy arrays are one of the most efficient data structures for handling numerical data. You can perform different mathematical operations on numpy arrays using built-in functions. In this article, we will discuss how to split a numpy array in Python using different functions.
The Numpy module provides us with various functions to split a numpy array into different sub-arrays. Let us discuss them one by one.
Split a numpy array using the split() function
The split() function can be used to split a numpy array into equal parts as well as on the basis of indices. It has the following syntax.
numpy.split(myArr, index_array_or_parts, axis)
Here,
- myArr is the array that we have to split.
- The index_array_or_parts parameter determines how the array is split into subarrays. If it is an integer N, the array is split into N equal parts. If it is a list or array of indices, the array is split at those indices.
- The parameter axis determines the axis along which the array is split. By default, it has the value 0. You can use this parameter to split a 2-D array.
To understand the working of the split() function, consider the following example.
import numpy as np
myArr=np.arange(9)
print("The array is:")
print(myArr)
arr=np.split(myArr,3)
print("The split array is:")
print(arr)
Output:
The array is:
[0 1 2 3 4 5 6 7 8]
The split array is:
[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]
Here, we have split myArr into 3 equal parts. You can observe that the split() function returns a Python list containing the sub-arrays of the original array.
The sub-arrays returned by the split() function are views of the original array. Hence, any changes made to the sub-arrays returned by the split() function are reflected in myArr.
import numpy as np
myArr=np.arange(9)
print("The array is:")
print(myArr)
arr=np.split(myArr,3)
print("The split array is:")
print(arr)
arr[0][1]=999
print("The original array is:")
print(myArr)
Output:
The array is:
[0 1 2 3 4 5 6 7 8]
The split array is:
[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]
The original array is:
[ 0 999 2 3 4 5 6 7 8]
In the above example, you can observe that we have made changes to one of the sub-arrays returned by the split() function, and the change is reflected in the original array. This shows that the sub-arrays are only views of the original numpy array.
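The view behaviour described above can also be checked directly. The following is a minimal sketch, assuming you want independent sub-arrays; the variable names original, parts, and independent_parts are illustrative only.

```python
import numpy as np

original = np.arange(9)
parts = np.split(original, 3)

# Each part shares memory with the original array, so it is a view.
print(np.shares_memory(parts[0], original))  # True

# Copy each part when independent sub-arrays are needed.
independent_parts = [part.copy() for part in parts]
independent_parts[0][1] = 999

# The original array is unchanged because only a copy was modified.
print(original)
```

Copying costs extra memory, so it is probably only worth doing when the sub-arrays will be modified.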
While splitting an array into equal parts, you need to make sure that the numerical value index_array_or_parts is a factor of the length of the array. Otherwise, the program will run into a ValueError exception with the message “array split does not result in an equal division”. You can observe this in the following example.
import numpy as np
myArr=np.arange(9)
print("The array is:")
print(myArr)
arr=np.split(myArr,4)
print("The split array is:")
print(arr)
Output:
ValueError: array split does not result in an equal division
In the above example, we have tried to split an array containing 9 elements into 4 parts. As 4 is not a factor of 9, the program runs into a ValueError exception.
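If the number of parts is not known in advance, you can guard the call. The following is a small sketch of two possible approaches, a divisibility check and a try-except block; the names my_arr, n_parts, parts, and raised are illustrative.

```python
import numpy as np

my_arr = np.arange(9)
n_parts = 4

# np.split() needs the length along the split axis to be divisible by n_parts.
if my_arr.shape[0] % n_parts == 0:
    parts = np.split(my_arr, n_parts)
else:
    parts = None
    print(f"Cannot split {my_arr.shape[0]} elements into {n_parts} equal parts")

# Alternatively, catch the exception raised by np.split().
try:
    np.split(my_arr, n_parts)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```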
You can also split an array at its indices. For this, you need to pass an array of indices to the split() function as shown in the following syntax.
numpy.split(myArr, [index1,index2,index3,….., indexN])
If you use the above syntax, myArr is split into different sub-arrays.
- The first sub-array consists of elements from index 0 to index index1-1.
- The second sub-array consists of elements from index index1 to index index2-1.
- The third sub-array consists of elements from index index2 to index index3-1.
- The last sub-array consists of elements from index indexN till the last element, provided that indexN is less than the length of the array.
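One way to read these rules is that each sub-array is an ordinary slice between two consecutive indices, with the last sub-array starting at indexN. The sketch below compares the result of split() with hand-written slices; bounds and sliced_parts are helper names introduced only for this illustration.

```python
import numpy as np

my_arr = np.arange(12)
indices = [1, 4, 7]

split_parts = np.split(my_arr, indices)

# Build the same pieces with plain slices: [0:1], [1:4], [4:7], [7:12].
bounds = [0] + indices + [len(my_arr)]
sliced_parts = [my_arr[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]

for from_split, from_slice in zip(split_parts, sliced_parts):
    print(from_split, from_slice)
```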
You can observe this in the following example.
import numpy as np
myArr=np.arange(12)
print("The array is:")
print(myArr)
arr=np.split(myArr,[1,4,7])
print("The split array is:")
print(arr)
Output:
The array is:
[ 0 1 2 3 4 5 6 7 8 9 10 11]
The split array is:
[array([0]), array([1, 2, 3]), array([4, 5, 6]), array([ 7, 8, 9, 10, 11])]
In this example, we have split the input array at the indices 1, 4, and 7. Due to this, the first sub-array contains the element at index 0, the second sub-array contains elements from index 1 to 3, the third sub-array contains elements from indices 4 to 6 and the last sub-array contains elements from index 7 till the last element of the original array.
If an index in the list is greater than or equal to the length of the array, the sub-arrays that start at or after that index are empty numpy arrays. You can observe this in the following example.
import numpy as np
myArr=np.arange(12)
print("The array is:")
print(myArr)
arr=np.split(myArr,[1,4,11,14,17,20])
print("The split array is:")
print(arr)
Output:
The array is:
[ 0 1 2 3 4 5 6 7 8 9 10 11]
The split array is:
[array([0]), array([1, 2, 3]), array([ 4, 5, 6, 7, 8, 9, 10]), array([11]), array([], dtype=int64), array([], dtype=int64), array([], dtype=int64)]
In this example, we have tried to split the input array at indices 1,4,11,14,17, and 20. As the array contains only 12 elements, the last three sub-arrays returned by the split() method are empty numpy arrays.
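If the empty sub-arrays are not wanted, they can be dropped after the split. This is a minimal sketch that filters on the size attribute; the name non_empty is illustrative.

```python
import numpy as np

my_arr = np.arange(12)
parts = np.split(my_arr, [1, 4, 11, 14, 17, 20])

# An empty sub-array has size 0, so it can be filtered out.
non_empty = [part for part in parts if part.size > 0]

print(len(parts))      # 7
print(len(non_empty))  # 4
```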
Split 2-D numpy arrays vertically and horizontally into equal parts
You can also split a 2-D numpy array vertically using the split() function. When we pass a 2-D array to the split() function, the rows of the array are grouped into sub-arrays.
For instance, you can split a 2-D numpy array into different sub-arrays of equal number of rows as shown below.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.split(myArr,3)
print("The split array is:")
print(arr)
Output:
The array is:
[[ 0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26]
[27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44]
[45 46 47 48 49 50 51 52 53]
[54 55 56 57 58 59 60 61 62]
[63 64 65 66 67 68 69 70 71]
[72 73 74 75 76 77 78 79 80]]
The split array is:
[array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23, 24, 25, 26]]), array([[27, 28, 29, 30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41, 42, 43, 44],
[45, 46, 47, 48, 49, 50, 51, 52, 53]]), array([[54, 55, 56, 57, 58, 59, 60, 61, 62],
[63, 64, 65, 66, 67, 68, 69, 70, 71],
[72, 73, 74, 75, 76, 77, 78, 79, 80]])]
In this example, we have split a 2-D numpy array of shape 9×9 vertically into 3 sub-arrays. Thus, the split() function returns a list containing 3 arrays of shape 3×9.
Here, the required number of sub-arrays should be a factor of the number of rows in the original array. Otherwise, the split() function won’t be able to split the original array and the program runs into a ValueError exception. You can observe this in the following example.
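To confirm what split() returns for a 2-D array, you can inspect the type of the result and the shape of each sub-array. The following is a short sketch; the name parts is illustrative.

```python
import numpy as np

my_arr = np.arange(81).reshape((9, 9))
parts = np.split(my_arr, 3)

# The result is a Python list of numpy arrays.
print(type(parts))                     # <class 'list'>
print([part.shape for part in parts])  # [(3, 9), (3, 9), (3, 9)]
```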
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.split(myArr,4)
print("The split array is:")
print(arr)
Output:
ValueError: array split does not result in an equal division
Here, we have tried to split an array having 9 rows into 4 sub-arrays. As a result, the program runs into a ValueError exception.
If you want to split a 2-dimensional array horizontally i.e. along the columns, you can use the parameter axis with value 1 in the split() method as shown below.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.split(myArr,3, axis=1)
print("The split array is:")
print(arr)
Output:
The array is:
[[ 0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26]
[27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44]
[45 46 47 48 49 50 51 52 53]
[54 55 56 57 58 59 60 61 62]
[63 64 65 66 67 68 69 70 71]
[72 73 74 75 76 77 78 79 80]]
The split array is:
[array([[ 0, 1, 2],
[ 9, 10, 11],
[18, 19, 20],
[27, 28, 29],
[36, 37, 38],
[45, 46, 47],
[54, 55, 56],
[63, 64, 65],
[72, 73, 74]]), array([[ 3, 4, 5],
[12, 13, 14],
[21, 22, 23],
[30, 31, 32],
[39, 40, 41],
[48, 49, 50],
[57, 58, 59],
[66, 67, 68],
[75, 76, 77]]), array([[ 6, 7, 8],
[15, 16, 17],
[24, 25, 26],
[33, 34, 35],
[42, 43, 44],
[51, 52, 53],
[60, 61, 62],
[69, 70, 71],
[78, 79, 80]])]
In this example, we have split an array having 9 columns into sub-arrays having three columns each. Here, the number of sub-arrays required should be a factor of the number of columns. Otherwise, the program runs into a ValueError exception, as shown in the following example.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.split(myArr,4,axis=1)
print("The split array is:")
print(arr)
Output:
ValueError: array split does not result in an equal division
Here, we have tried to split an array having 9 columns into 4 sub-arrays. As a result, the program runs into a ValueError exception.
Split 2-D numpy arrays using row and column indices
You can also split the numpy arrays vertically based on their row index using the following syntax.
numpy.split(myArr, [rowindex1,rowindex2,rowindex3,….., rowindexN])
If you use the above syntax, myArr is split vertically into different subarrays.
- The first subarray consists of rows from row index 0 to row index rowindex1-1.
- The second subarray consists of rows from row index rowindex1 to row index rowindex2-1.
- The third subarray consists of rows from row index rowindex2 to row index rowindex3-1.
- The last subarray consists of rows from row index rowindexN till the last row, provided that rowindexN is less than the number of rows in the array.
You can observe this in the following example.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.split(myArr,[2,5])
print("The split array is:")
print(arr)
Output:
The array is:
[[ 0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26]
[27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44]
[45 46 47 48 49 50 51 52 53]
[54 55 56 57 58 59 60 61 62]
[63 64 65 66 67 68 69 70 71]
[72 73 74 75 76 77 78 79 80]]
The split array is:
[array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16, 17]]), array([[18, 19, 20, 21, 22, 23, 24, 25, 26],
[27, 28, 29, 30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41, 42, 43, 44]]), array([[45, 46, 47, 48, 49, 50, 51, 52, 53],
[54, 55, 56, 57, 58, 59, 60, 61, 62],
[63, 64, 65, 66, 67, 68, 69, 70, 71],
[72, 73, 74, 75, 76, 77, 78, 79, 80]])]
In the above example, we have split the input array at the row indices 2 and 5. Thus, we get 3 sub-arrays. The first sub-array contains rows from the start till index 1. The second sub-array contains rows from index 2 to index 4. The last sub-array contains rows from index 5 till the last row.
If a row index in the list is greater than or equal to the number of rows in the array, the sub-arrays that start at or after that index are empty numpy arrays.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.split(myArr,[2,5,10,12,15])
print("The split array is:")
print(arr)
Output:
The array is:
[[ 0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26]
[27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44]
[45 46 47 48 49 50 51 52 53]
[54 55 56 57 58 59 60 61 62]
[63 64 65 66 67 68 69 70 71]
[72 73 74 75 76 77 78 79 80]]
The split array is:
[array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16, 17]]), array([[18, 19, 20, 21, 22, 23, 24, 25, 26],
[27, 28, 29, 30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41, 42, 43, 44]]), array([[45, 46, 47, 48, 49, 50, 51, 52, 53],
[54, 55, 56, 57, 58, 59, 60, 61, 62],
[63, 64, 65, 66, 67, 68, 69, 70, 71],
[72, 73, 74, 75, 76, 77, 78, 79, 80]]), array([], shape=(0, 9), dtype=int64), array([], shape=(0, 9), dtype=int64), array([], shape=(0, 9), dtype=int64)]
In the above example, we have tried to split the original array at the row indices 2, 5, 10, 12, and 15. However, the input array has only 9 rows. Hence, the output list contains 3 empty numpy arrays.
To horizontally split a numpy array i.e. to group the columns based on their column index, you can use the following syntax.
numpy.split(myArr, [columnindex1,columnindex2,columnindex3,….., columnindexN], axis=1)
If you use the above syntax, myArr is split into different sub-arrays.
- The first sub-array consists of columns from column index 0 to column index columnindex1-1.
- The second sub-array consists of columns from column index columnindex1 to column index columnindex2-1.
- The third sub-array consists of columns from column index columnindex2 to column index columnindex3-1.
- The last sub-array consists of columns from column index columnindexN till the last column, provided that columnindexN is less than the number of columns in the array.
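The column rules above can be read as slices along the second axis. The sketch below compares the output of split() with axis=1 against hand-written column slices; the name expected is introduced only for this illustration.

```python
import numpy as np

my_arr = np.arange(81).reshape((9, 9))
parts = np.split(my_arr, [2, 5], axis=1)

# The same column groups written as slices along the second axis.
expected = [my_arr[:, :2], my_arr[:, 2:5], my_arr[:, 5:]]

print([part.shape for part in parts])  # [(9, 2), (9, 3), (9, 4)]
print(all(np.array_equal(p, e) for p, e in zip(parts, expected)))  # True
```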
You can observe this in the following example.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.split(myArr,[2,5],axis=1)
print("The split array is:")
print(arr)
Output:
The array is:
[[ 0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26]
[27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44]
[45 46 47 48 49 50 51 52 53]
[54 55 56 57 58 59 60 61 62]
[63 64 65 66 67 68 69 70 71]
[72 73 74 75 76 77 78 79 80]]
The split array is:
[array([[ 0, 1],
[ 9, 10],
[18, 19],
[27, 28],
[36, 37],
[45, 46],
[54, 55],
[63, 64],
[72, 73]]), array([[ 2, 3, 4],
[11, 12, 13],
[20, 21, 22],
[29, 30, 31],
[38, 39, 40],
[47, 48, 49],
[56, 57, 58],
[65, 66, 67],
[74, 75, 76]]), array([[ 5, 6, 7, 8],
[14, 15, 16, 17],
[23, 24, 25, 26],
[32, 33, 34, 35],
[41, 42, 43, 44],
[50, 51, 52, 53],
[59, 60, 61, 62],
[68, 69, 70, 71],
[77, 78, 79, 80]])]
In the above example, we have split the input array at column index 2 and 5. Hence, the output array contains three sub-arrays. The first sub-array contains columns at index 0 and 1. The second sub-array contains columns from index 2 to 4. The last sub-array contains columns from index 5 to the last column.
If a column index in the list is greater than or equal to the number of columns in the array, the sub-arrays that start at or after that index are empty numpy arrays.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.split(myArr,[2,5,10,12,15],axis=1)
print("The split array is:")
print(arr)
Output:
The array is:
[[ 0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26]
[27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44]
[45 46 47 48 49 50 51 52 53]
[54 55 56 57 58 59 60 61 62]
[63 64 65 66 67 68 69 70 71]
[72 73 74 75 76 77 78 79 80]]
The split array is:
[array([[ 0, 1],
[ 9, 10],
[18, 19],
[27, 28],
[36, 37],
[45, 46],
[54, 55],
[63, 64],
[72, 73]]), array([[ 2, 3, 4],
[11, 12, 13],
[20, 21, 22],
[29, 30, 31],
[38, 39, 40],
[47, 48, 49],
[56, 57, 58],
[65, 66, 67],
[74, 75, 76]]), array([[ 5, 6, 7, 8],
[14, 15, 16, 17],
[23, 24, 25, 26],
[32, 33, 34, 35],
[41, 42, 43, 44],
[50, 51, 52, 53],
[59, 60, 61, 62],
[68, 69, 70, 71],
[77, 78, 79, 80]]), array([], shape=(9, 0), dtype=int64), array([], shape=(9, 0), dtype=int64), array([], shape=(9, 0), dtype=int64)]
In the above example, we have tried to split the original array at the column indices 2, 5, 10, 12, and 15. However, the input array has only 9 columns. Hence, the output list contains 3 empty numpy arrays.
The split() function has a drawback. It raises a ValueError exception when it cannot split the array into subarrays of equal length. To avoid running into the ValueError exception, you can use the array_split() function.
Split a Numpy array using the array_split() function
The array_split() function works in a similar manner to the split() function. The only difference is that it doesn’t raise a ValueError exception when it cannot split an array into equal parts. Instead, for an array of length l that should be split into n parts, it returns l % n sub-arrays of size l//n + 1 and the rest of size l//n.
You can observe this in the following example.
import numpy as np
myArr=np.arange(12)
print("The array is:")
print(myArr)
arr=np.array_split(myArr,5)
print("The split array is:")
print(arr)
Output:
The array is:
[ 0 1 2 3 4 5 6 7 8 9 10 11]
The split array is:
[array([0, 1, 2]), array([3, 4, 5]), array([6, 7]), array([8, 9]), array([10, 11])]
In the above example, the array_split() function cannot split the input array into 5 equal parts. However, it doesn’t raise the ValueError exception. Instead, the array_split() function returns 12%5 i.e. 2 sub-arrays of size (12//5)+1 i.e. 3. The rest of the sub-arrays are of size 12//5 i.e. 2.
In the output array, you can observe that there are 2 sub-arrays with 3 elements each and 3 sub-arrays having 2 elements each.
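The size rule can be checked with a few lines of arithmetic. The following sketch computes the expected sizes from l % n and l//n and compares them with what array_split() returns; the names length, extra, and expected_sizes are illustrative.

```python
import numpy as np

my_arr = np.arange(12)
n_parts = 5

parts = np.array_split(my_arr, n_parts)
actual_sizes = [len(part) for part in parts]

# Expected sizes: the first l % n parts get one extra element.
length = len(my_arr)
extra = length % n_parts
expected_sizes = [length // n_parts + 1] * extra + [length // n_parts] * (n_parts - extra)

print(actual_sizes)    # [3, 3, 2, 2, 2]
print(expected_sizes)  # [3, 3, 2, 2, 2]
```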
You can also split a 2-D array along rows or columns using the array_split() function as shown below.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.array_split(myArr,4)
print("The split array is:")
print(arr)
Output:
The array is:
[[ 0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26]
[27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44]
[45 46 47 48 49 50 51 52 53]
[54 55 56 57 58 59 60 61 62]
[63 64 65 66 67 68 69 70 71]
[72 73 74 75 76 77 78 79 80]]
The split array is:
[array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23, 24, 25, 26]]), array([[27, 28, 29, 30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41, 42, 43, 44]]), array([[45, 46, 47, 48, 49, 50, 51, 52, 53],
[54, 55, 56, 57, 58, 59, 60, 61, 62]]), array([[63, 64, 65, 66, 67, 68, 69, 70, 71],
[72, 73, 74, 75, 76, 77, 78, 79, 80]])]
In this example, we have tried to vertically split a numpy array with 9 rows into 4 parts. Due to this, we get 9%4 i.e. 1 sub-array of size (9//4)+1 i.e. 3. The rest of the sub-arrays are of size 9//4 i.e. 2.
In the output list, you can observe that one sub-array contains three rows and the rest of the sub-arrays contain two rows each.
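The same call also works along the columns when you pass axis=1. The following is a brief sketch that prints only the shapes of the resulting sub-arrays; the name parts is illustrative.

```python
import numpy as np

my_arr = np.arange(81).reshape((9, 9))

# Split the 9 columns into 4 groups; the first group gets the extra column.
parts = np.array_split(my_arr, 4, axis=1)

print([part.shape for part in parts])  # [(9, 3), (9, 2), (9, 2), (9, 2)]
```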
Split a Numpy Array Using the hsplit() Function
The hsplit() function is used to split a 2-D numpy array horizontally i.e. along the columns. To split a 2-D array into sub-arrays having an equal number of columns, you can pass the original array and the number of required sub-arrays to the hsplit() function. After execution, it returns a list containing all the sub-arrays as shown below.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.hsplit(myArr,3)
print("The split array is:")
print(arr)
Output:
The array is:
[[ 0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26]
[27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44]
[45 46 47 48 49 50 51 52 53]
[54 55 56 57 58 59 60 61 62]
[63 64 65 66 67 68 69 70 71]
[72 73 74 75 76 77 78 79 80]]
The split array is:
[array([[ 0, 1, 2],
[ 9, 10, 11],
[18, 19, 20],
[27, 28, 29],
[36, 37, 38],
[45, 46, 47],
[54, 55, 56],
[63, 64, 65],
[72, 73, 74]]), array([[ 3, 4, 5],
[12, 13, 14],
[21, 22, 23],
[30, 31, 32],
[39, 40, 41],
[48, 49, 50],
[57, 58, 59],
[66, 67, 68],
[75, 76, 77]]), array([[ 6, 7, 8],
[15, 16, 17],
[24, 25, 26],
[33, 34, 35],
[42, 43, 44],
[51, 52, 53],
[60, 61, 62],
[69, 70, 71],
[78, 79, 80]])]
In this example, we have split a 9×9 array into three sub-arrays of 9×3 shape using the hsplit() function.
If the number of columns in the original array is not a multiple of the required number of subarrays, the hsplit() function won’t be able to divide the original array equally into subarrays. In such a case, the program runs into a ValueError exception. You can observe this in the following example.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.hsplit(myArr,4)
print("The split array is:")
print(arr)
Output:
ValueError: array split does not result in an equal division
In the above example, we have tried to horizontally split an array with 9 columns into four sub-arrays. As 4 is not a factor of 9, the hsplit() function cannot divide the array equally and the program runs into a ValueError exception.
To horizontally split a numpy array using the column indices, you can use the following syntax.
numpy.hsplit(myArr, [columnindex1,columnindex2,columnindex3,….., columnindexN])
If you use the above syntax, myArr is split horizontally into different sub-arrays.
- The first sub-array consists of columns from column index 0 to column index columnindex1-1.
- The second sub-array consists of columns from column index columnindex1 to column index columnindex2-1.
- The third sub-array consists of columns from column index columnindex2 to column index columnindex3-1.
- The last sub-array consists of columns from column index columnindexN till the last column, provided that columnindexN is less than the number of columns in the array.
You can observe this in the following example.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.hsplit(myArr,[2,5,8])
print("The split array is:")
print(arr)
Output:
The array is:
[[ 0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26]
[27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44]
[45 46 47 48 49 50 51 52 53]
[54 55 56 57 58 59 60 61 62]
[63 64 65 66 67 68 69 70 71]
[72 73 74 75 76 77 78 79 80]]
The split array is:
[array([[ 0, 1],
[ 9, 10],
[18, 19],
[27, 28],
[36, 37],
[45, 46],
[54, 55],
[63, 64],
[72, 73]]), array([[ 2, 3, 4],
[11, 12, 13],
[20, 21, 22],
[29, 30, 31],
[38, 39, 40],
[47, 48, 49],
[56, 57, 58],
[65, 66, 67],
[74, 75, 76]]), array([[ 5, 6, 7],
[14, 15, 16],
[23, 24, 25],
[32, 33, 34],
[41, 42, 43],
[50, 51, 52],
[59, 60, 61],
[68, 69, 70],
[77, 78, 79]]), array([[ 8],
[17],
[26],
[35],
[44],
[53],
[62],
[71],
[80]])]
In this example, we have split an array with 9 columns at column index 2, 5, and 8. Hence, the array is split into 4 sub-arrays. The first sub-array contains columns from index 0 to 1, the second sub-array contains columns from index 2 to 4, the third sub-array contains columns from index 5 to 7 and the fourth sub-array contains the column at index 8 in the original array.
If a column index in the list is greater than or equal to the number of columns in the array, the sub-arrays that start at or after that index are empty numpy arrays.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.hsplit(myArr,[2,5,8,12,20])
print("The split array is:")
print(arr)
Output:
The array is:
[[ 0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26]
[27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44]
[45 46 47 48 49 50 51 52 53]
[54 55 56 57 58 59 60 61 62]
[63 64 65 66 67 68 69 70 71]
[72 73 74 75 76 77 78 79 80]]
The split array is:
[array([[ 0, 1],
[ 9, 10],
[18, 19],
[27, 28],
[36, 37],
[45, 46],
[54, 55],
[63, 64],
[72, 73]]), array([[ 2, 3, 4],
[11, 12, 13],
[20, 21, 22],
[29, 30, 31],
[38, 39, 40],
[47, 48, 49],
[56, 57, 58],
[65, 66, 67],
[74, 75, 76]]), array([[ 5, 6, 7],
[14, 15, 16],
[23, 24, 25],
[32, 33, 34],
[41, 42, 43],
[50, 51, 52],
[59, 60, 61],
[68, 69, 70],
[77, 78, 79]]), array([[ 8],
[17],
[26],
[35],
[44],
[53],
[62],
[71],
[80]]), array([], shape=(9, 0), dtype=int64), array([], shape=(9, 0), dtype=int64)]
In the above example, we have split the original array at indices 2, 5, 8, 12, and 20. As the input array contains only 9 columns, the output list contains two empty sub-arrays.
In essence, the hsplit() function works exactly like the split() function with the parameter axis=1 for arrays with two or more dimensions. For a 1-D array, it splits along axis 0 instead.
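The relationship between hsplit() and split() can be checked with a short comparison. The sketch below also shows the 1-D case, where hsplit() splits along axis 0; the names h_parts, s_parts, same_2d, and flat are illustrative.

```python
import numpy as np

my_arr = np.arange(81).reshape((9, 9))

h_parts = np.hsplit(my_arr, 3)
s_parts = np.split(my_arr, 3, axis=1)

# For a 2-D array, both calls produce the same sub-arrays.
same_2d = all(np.array_equal(h, s) for h, s in zip(h_parts, s_parts))
print(same_2d)  # True

# For a 1-D array, hsplit() splits along axis 0 instead.
flat = np.arange(6)
flat_parts = np.hsplit(flat, 2)
print(flat_parts)
```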
Split arrays using the vsplit() function
You can split a numpy array vertically i.e. along the rows using the vsplit() function. To split a 2-D array into sub-arrays having an equal number of rows, you can pass the original array and the number of required sub-arrays to the vsplit() function. After execution, it returns a list containing all the sub-arrays as shown below.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.vsplit(myArr,3)
print("The split array is:")
print(arr)
Output:
The array is:
[[ 0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26]
[27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44]
[45 46 47 48 49 50 51 52 53]
[54 55 56 57 58 59 60 61 62]
[63 64 65 66 67 68 69 70 71]
[72 73 74 75 76 77 78 79 80]]
The split array is:
[array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23, 24, 25, 26]]), array([[27, 28, 29, 30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41, 42, 43, 44],
[45, 46, 47, 48, 49, 50, 51, 52, 53]]), array([[54, 55, 56, 57, 58, 59, 60, 61, 62],
[63, 64, 65, 66, 67, 68, 69, 70, 71],
[72, 73, 74, 75, 76, 77, 78, 79, 80]])]
In this example, we have vertically split a 9×9 array into three sub-arrays of size 3×9 using the vsplit() function.
If the number of rows in the original array is not a multiple of the required number of subarrays, the vsplit() function won’t be able to divide the original array equally into subarrays. In such a case, the program runs into a ValueError exception. You can observe this in the following example.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.vsplit(myArr,4)
print("The split array is:")
print(arr)
Output:
ValueError: array split does not result in an equal division
In this example, you can observe that we have tried to vertically split a numpy array with 9 rows into 4 sub-arrays. As 4 is not a factor of 9, the vsplit() function cannot split the original array and the program runs into a ValueError exception.
To split a 2-D numpy array vertically based on the row indices, you can use the following syntax.
numpy.vsplit(myArr, [rowindex1,rowindex2,rowindex3,….., rowindexN])
If you use the above syntax, myArr is split vertically into different sub-arrays.
- The first sub-array consists of rows from row index 0 to row index rowindex1-1.
- The second sub-array consists of rows from row index rowindex1 to row index rowindex2-1.
- The third sub-array consists of rows from row index rowindex2 to row index rowindex3-1.
- The last sub-array consists of rows from row index rowindexN till the last row, provided that rowindexN is less than the number of rows in the array.
You can observe this in the following example.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.vsplit(myArr,[2,5,8])
print("The split array is:")
print(arr)
Output:
The array is:
[[ 0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26]
[27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44]
[45 46 47 48 49 50 51 52 53]
[54 55 56 57 58 59 60 61 62]
[63 64 65 66 67 68 69 70 71]
[72 73 74 75 76 77 78 79 80]]
The split array is:
[array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16, 17]]), array([[18, 19, 20, 21, 22, 23, 24, 25, 26],
[27, 28, 29, 30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41, 42, 43, 44]]), array([[45, 46, 47, 48, 49, 50, 51, 52, 53],
[54, 55, 56, 57, 58, 59, 60, 61, 62],
[63, 64, 65, 66, 67, 68, 69, 70, 71]]), array([[72, 73, 74, 75, 76, 77, 78, 79, 80]])]
In this example, we have vertically split an array with 9 rows at row index 2, 5, and 8. Hence, the array is split into 4 sub-arrays. The first sub-array contains rows from index 0 to 1, the second sub-array contains rows from index 2 to 4, the third sub-array contains rows from index 5 to 7 and the fourth sub-array contains row at index 8 from the original array.
If a row index in the list is greater than or equal to the number of rows in the array, the sub-arrays that start at or after that index are empty numpy arrays.
import numpy as np
myArr=np.arange(81).reshape((9,9))
print("The array is:")
print(myArr)
arr=np.vsplit(myArr,[2,5,8,12,20])
print("The split array is:")
print(arr)
Output:
The array is:
[[ 0 1 2 3 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 17]
[18 19 20 21 22 23 24 25 26]
[27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44]
[45 46 47 48 49 50 51 52 53]
[54 55 56 57 58 59 60 61 62]
[63 64 65 66 67 68 69 70 71]
[72 73 74 75 76 77 78 79 80]]
The split array is:
[array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16, 17]]), array([[18, 19, 20, 21, 22, 23, 24, 25, 26],
[27, 28, 29, 30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41, 42, 43, 44]]), array([[45, 46, 47, 48, 49, 50, 51, 52, 53],
[54, 55, 56, 57, 58, 59, 60, 61, 62],
[63, 64, 65, 66, 67, 68, 69, 70, 71]]), array([[72, 73, 74, 75, 76, 77, 78, 79, 80]]), array([], shape=(0, 9), dtype=int64), array([], shape=(0, 9), dtype=int64)]
In the above example, we have vertically split the original array at indices 2, 5, 8, 12, and 20. As the input array contains only 9 rows, the output list contains two empty sub-arrays.
The vsplit() function is equivalent to the split() function with the parameter axis=0, except that it only works on arrays with two or more dimensions.
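The relationship between vsplit() and split() can be verified in the same way. The following sketch compares the two calls and also shows that vsplit() rejects a 1-D array; the names v_parts, s_parts, same, and rejected_1d are illustrative.

```python
import numpy as np

my_arr = np.arange(81).reshape((9, 9))

v_parts = np.vsplit(my_arr, [2, 5, 8])
s_parts = np.split(my_arr, [2, 5, 8], axis=0)

# Both calls return the same number of sub-arrays with the same contents.
same = len(v_parts) == len(s_parts) and all(
    np.array_equal(v, s) for v, s in zip(v_parts, s_parts)
)
print(same)  # True

# vsplit() needs at least two dimensions, so a 1-D input raises ValueError.
try:
    np.vsplit(np.arange(6), 2)
    rejected_1d = False
except ValueError:
    rejected_1d = True
print(rejected_1d)  # True
```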
Split a 3-D array across depth in Python
If you have a 3-D numpy array, you can use the split() function, the array_split() function, or the dsplit() function to split the array across depth.
Split a 3-D array across depth using the split() function
To split a 3-D array into sub-arrays having equal depth, you can pass the original array and the number of required sub-arrays to the split() or array_split() function with the parameter axis=2. After execution, the functions return a list containing all the sub-arrays as shown below.
import numpy as np
myArr=np.arange(64).reshape((4,4,4))
print("The array is:")
print(myArr)
arr=np.split(myArr,2, axis=2)
print("The split array is:")
print(arr)
Output:
The array is:
[[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]]
[[16 17 18 19]
[20 21 22 23]
[24 25 26 27]
[28 29 30 31]]
[[32 33 34 35]
[36 37 38 39]
[40 41 42 43]
[44 45 46 47]]
[[48 49 50 51]
[52 53 54 55]
[56 57 58 59]
[60 61 62 63]]]
The split array is:
[array([[[ 0, 1],
[ 4, 5],
[ 8, 9],
[12, 13]],
[[16, 17],
[20, 21],
[24, 25],
[28, 29]],
[[32, 33],
[36, 37],
[40, 41],
[44, 45]],
[[48, 49],
[52, 53],
[56, 57],
[60, 61]]]), array([[[ 2, 3],
[ 6, 7],
[10, 11],
[14, 15]],
[[18, 19],
[22, 23],
[26, 27],
[30, 31]],
[[34, 35],
[38, 39],
[42, 43],
[46, 47]],
[[50, 51],
[54, 55],
[58, 59],
[62, 63]]])]
In the above example, we have split a 4x4x4 3-D numpy array across depth into two arrays of shape 4x4x2.
If the depth of the original array is not a multiple of the required number of sub-arrays, the split() function won’t be able to divide the original array equally into sub-arrays. In such a case, the program runs into a ValueError exception. You can observe this in the following example.
import numpy as np
myArr=np.arange(64).reshape((4,4,4))
print("The array is:")
print(myArr)
arr=np.split(myArr,3, axis=2)
print("The split array is:")
print(arr)
Output:
ValueError: array split does not result in an equal division
In the above example, we have tried to split a 4x4x4 array into 3 parts across depth. As 3 is not a factor of 4, the program runs into a ValueError exception.
Split a 3-D array across depth using the array_split() function
When the number of required sub-arrays is not a factor of the depth of the array, the array_split() function won’t raise any error. Instead, it returns depth % number_of_sub-arrays sub-arrays of depth (depth // number_of_sub-arrays) + 1, and the rest of the sub-arrays have a depth of depth // number_of_sub-arrays. For instance, consider the following example.
import numpy as np
myArr=np.arange(64).reshape((4,4,4))
print("The array is:")
print(myArr)
arr=np.array_split(myArr,3, axis=2)
print("The split array is:")
print(arr)
Output:
The array is:
[[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]]
[[16 17 18 19]
[20 21 22 23]
[24 25 26 27]
[28 29 30 31]]
[[32 33 34 35]
[36 37 38 39]
[40 41 42 43]
[44 45 46 47]]
[[48 49 50 51]
[52 53 54 55]
[56 57 58 59]
[60 61 62 63]]]
The split array is:
[array([[[ 0, 1],
[ 4, 5],
[ 8, 9],
[12, 13]],
[[16, 17],
[20, 21],
[24, 25],
[28, 29]],
[[32, 33],
[36, 37],
[40, 41],
[44, 45]],
[[48, 49],
[52, 53],
[56, 57],
[60, 61]]]), array([[[ 2],
[ 6],
[10],
[14]],
[[18],
[22],
[26],
[30]],
[[34],
[38],
[42],
[46]],
[[50],
[54],
[58],
[62]]]), array([[[ 3],
[ 7],
[11],
[15]],
[[19],
[23],
[27],
[31]],
[[35],
[39],
[43],
[47]],
[[51],
[55],
[59],
[63]]])]
Here, we have split a 4x4x4 array across depth into 3 parts. Hence, the output list contains 4%3 i.e. 1 sub-array of depth (4//3)+1 i.e. 2, and the rest of the sub-arrays have depth 4//3 i.e. 1.
In the output, you can observe that we have one sub-array of shape 4x4x2 and 2 sub-arrays of shape 4x4x1.
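The arithmetic behind these depths can be checked with a short sketch. Here, the expected depths are computed with Python's divmod() function and compared with what array_split() actually returns. The names quotient, remainder, and expected_depths are used only for illustration.

```python
import numpy as np

myArr = np.arange(64).reshape((4, 4, 4))
sections = 3
depth = myArr.shape[2]

# quotient is the base depth of each sub-array;
# remainder is the number of sub-arrays that get one extra element.
quotient, remainder = divmod(depth, sections)
expected_depths = [quotient + 1] * remainder + [quotient] * (sections - remainder)

arr = np.array_split(myArr, sections, axis=2)
actual_depths = [sub.shape[2] for sub in arr]

print("Expected depths:", expected_depths)
print("Actual depths:", actual_depths)
```

Both lists contain the depths 2, 1, and 1, which agrees with the output shown above.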
Split a 3-D array across depth using the dsplit() function
Instead of the split() function, you can also use the dsplit() function to split a numpy array across its depth. For this, you just need to pass the original array and the number of required sub-arrays to the dsplit() function. After execution, the dsplit() function returns a list containing all the sub-arrays. You can observe this in the following example.
import numpy as np
myArr=np.arange(64).reshape((4,4,4))
print("The array is:")
print(myArr)
arr=np.dsplit(myArr,2)
print("The split array is:")
print(arr)
Output:
The array is:
[[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]]
[[16 17 18 19]
[20 21 22 23]
[24 25 26 27]
[28 29 30 31]]
[[32 33 34 35]
[36 37 38 39]
[40 41 42 43]
[44 45 46 47]]
[[48 49 50 51]
[52 53 54 55]
[56 57 58 59]
[60 61 62 63]]]
The split array is:
[array([[[ 0, 1],
[ 4, 5],
[ 8, 9],
[12, 13]],
[[16, 17],
[20, 21],
[24, 25],
[28, 29]],
[[32, 33],
[36, 37],
[40, 41],
[44, 45]],
[[48, 49],
[52, 53],
[56, 57],
[60, 61]]]), array([[[ 2, 3],
[ 6, 7],
[10, 11],
[14, 15]],
[[18, 19],
[22, 23],
[26, 27],
[30, 31]],
[[34, 35],
[38, 39],
[42, 43],
[46, 47]],
[[50, 51],
[54, 55],
[58, 59],
[62, 63]]])]
In the above example, we have split a 4x4x4 3-D numpy array across depth into two arrays of shape 4x4x2 using the dsplit() function.
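For a 3-D array, the dsplit() function gives the same result as the split() function with axis=2. The following sketch compares the two calls. The variable names with_dsplit, with_split, and same are chosen here for illustration.

```python
import numpy as np

myArr = np.arange(64).reshape((4, 4, 4))

with_dsplit = np.dsplit(myArr, 2)
with_split = np.split(myArr, 2, axis=2)

# Compare the sub-arrays pairwise.
same = all(np.array_equal(a, b) for a, b in zip(with_dsplit, with_split))
shapes = [sub.shape for sub in with_dsplit]

print("Same result:", same)
print("Shapes:", shapes)
```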
If the depth of the original array is not a multiple of the required number of sub-arrays, the dsplit() function won't be able to divide the original array equally into sub-arrays. In such a case, the program runs into a ValueError exception. You can observe this in the following example.
import numpy as np
myArr=np.arange(64).reshape((4,4,4))
print("The array is:")
print(myArr)
arr=np.dsplit(myArr,3)
print("The split array is:")
print(arr)
Output:
ValueError: array split does not result in an equal division
In the above example, we have tried to split a 4x4x4 array across depth into 3 parts using the dsplit() function. As 3 is not a factor of 4, the program runs into a ValueError exception.
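If you would rather get unequal sub-arrays than an exception, one possible approach is to catch the ValueError and fall back to the array_split() function with axis=2. This is only a sketch of that approach, not the only way to handle the error.

```python
import numpy as np

myArr = np.arange(64).reshape((4, 4, 4))

try:
    arr = np.dsplit(myArr, 3)
except ValueError as err:
    # dsplit() needs an equal division, so fall back to array_split().
    print("dsplit() failed:", err)
    arr = np.array_split(myArr, 3, axis=2)

shapes = [sub.shape for sub in arr]
print("Shapes:", shapes)
```

Here, the fallback produces one sub-array of shape (4, 4, 2) followed by two sub-arrays of shape (4, 4, 1).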
To split a 3-D numpy array across depth using indices, you can use the following syntax.
numpy.dsplit(myArr, [depthindex1, depthindex2, depthindex3, ..., depthindexN])
If you use the above syntax, myArr is split across depth into different subarrays.
- The first sub-array consists of elements from depth index 0 to depth index depthindex1-1.
- The second sub-array consists of elements from depth index depthindex1 to depth index depthindex2-1.
- The third sub-array consists of elements from depth index depthindex2 to depth index depthindex3-1.
If depthindexN is less than the number of elements across depth in the array, the last sub-array consists of elements from depth index depthindexN till the last depth index.
You can observe this in the following example.
import numpy as np
myArr=np.arange(64).reshape((4,4,4))
print("The array is:")
print(myArr)
arr=np.dsplit(myArr,[1,3])
print("The split array is:")
print(arr)
Output:
The array is:
[[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]]
[[16 17 18 19]
[20 21 22 23]
[24 25 26 27]
[28 29 30 31]]
[[32 33 34 35]
[36 37 38 39]
[40 41 42 43]
[44 45 46 47]]
[[48 49 50 51]
[52 53 54 55]
[56 57 58 59]
[60 61 62 63]]]
The split array is:
[array([[[ 0],
[ 4],
[ 8],
[12]],
[[16],
[20],
[24],
[28]],
[[32],
[36],
[40],
[44]],
[[48],
[52],
[56],
[60]]]), array([[[ 1, 2],
[ 5, 6],
[ 9, 10],
[13, 14]],
[[17, 18],
[21, 22],
[25, 26],
[29, 30]],
[[33, 34],
[37, 38],
[41, 42],
[45, 46]],
[[49, 50],
[53, 54],
[57, 58],
[61, 62]]]), array([[[ 3],
[ 7],
[11],
[15]],
[[19],
[23],
[27],
[31]],
[[35],
[39],
[43],
[47]],
[[51],
[55],
[59],
[63]]])]
In the above example, we have split the input array at depth indices 1 and 3 using the dsplit() function. Hence, the first sub-array in the output contains the elements at depth index 0, and the second sub-array contains the elements from depth index 1 till depth index 2. The third sub-array contains the elements at depth index 3.
You can verify that the output sub-arrays are of the shape (4, 4, 1), (4, 4, 2), and (4, 4, 1).
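A short sketch of this check is given below. It prints the shape of each sub-array and also compares the sub-arrays with plain slices taken along the last axis. The names shapes, slices, and matches are used only for illustration.

```python
import numpy as np

myArr = np.arange(64).reshape((4, 4, 4))
arr = np.dsplit(myArr, [1, 3])

shapes = [sub.shape for sub in arr]
print("Shapes:", shapes)

# The same pieces written as slices along the depth axis.
slices = [myArr[:, :, :1], myArr[:, :, 1:3], myArr[:, :, 3:]]
matches = all(np.array_equal(a, b) for a, b in zip(arr, slices))
print("Matches slicing:", matches)
```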
If one or more of the indices are greater than the number of elements across depth, the corresponding sub-arrays at the end of the output will be empty numpy arrays, as shown below.
import numpy as np
myArr=np.arange(64).reshape((4,4,4))
print("The array is:")
print(myArr)
arr=np.dsplit(myArr,[1,3,5,8])
print("The split array is:")
print(arr)
Output:
The array is:
[[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]]
[[16 17 18 19]
[20 21 22 23]
[24 25 26 27]
[28 29 30 31]]
[[32 33 34 35]
[36 37 38 39]
[40 41 42 43]
[44 45 46 47]]
[[48 49 50 51]
[52 53 54 55]
[56 57 58 59]
[60 61 62 63]]]
The split array is:
[array([[[ 0],
[ 4],
[ 8],
[12]],
[[16],
[20],
[24],
[28]],
[[32],
[36],
[40],
[44]],
[[48],
[52],
[56],
[60]]]), array([[[ 1, 2],
[ 5, 6],
[ 9, 10],
[13, 14]],
[[17, 18],
[21, 22],
[25, 26],
[29, 30]],
[[33, 34],
[37, 38],
[41, 42],
[45, 46]],
[[49, 50],
[53, 54],
[57, 58],
[61, 62]]]), array([[[ 3],
[ 7],
[11],
[15]],
[[19],
[23],
[27],
[31]],
[[35],
[39],
[43],
[47]],
[[51],
[55],
[59],
[63]]]), array([], shape=(4, 4, 0), dtype=int64), array([], shape=(4, 4, 0), dtype=int64)]
In this example, we have tried to split the input array at indices 1, 3, 5, and 8. As the input array is of depth 4, the indices 5 and 8 lie beyond the last depth index. Hence, the output list contains two empty numpy arrays of shape (4, 4, 0).
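If the empty sub-arrays are not needed, they can be dropped by checking the size attribute of each sub-array. The following is a minimal sketch. The name non_empty is chosen here for illustration.

```python
import numpy as np

myArr = np.arange(64).reshape((4, 4, 4))
arr = np.dsplit(myArr, [1, 3, 5, 8])

shapes = [sub.shape for sub in arr]
print("All shapes:", shapes)

# Keep only the sub-arrays that contain at least one element.
non_empty = [sub for sub in arr if sub.size > 0]
print("Number of non-empty sub-arrays:", len(non_empty))
```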
Conclusion
In this article, we have discussed how to split a numpy array in Python. To learn more about dataframes and numpy arrays, you can read this article on pandas dataframe index. You might also like this article on text analysis in Python.